Sample records for interpreting gene expression

  1. Combining Shapley value and statistics to the analysis of gene expression data in children exposed to air pollution

    PubMed Central

    Moretti, Stefano; van Leeuwen, Danitsja; Gmuender, Hans; Bonassi, Stefano; van Delft, Joost; Kleinjans, Jos; Patrone, Fioravante; Merlo, Domenico Franco

    2008-01-01

    Background In gene expression analysis, statistical tests for differential gene expression provide lists of candidate genes having, individually, a sufficiently low p-value. However, the interpretation of each single p-value within complex systems involving several interacting genes is problematic. In parallel, in the last sixty years, game theory has been applied to political and social problems to assess the power of interacting agents in forcing a decision and, more recently, to represent the relevance of genes in response to certain conditions. Results In this paper we introduce a Bootstrap procedure to test the null hypothesis that each gene has the same relevance between two conditions, where the relevance is represented by the Shapley value of a particular coalitional game defined on a microarray data-set. This method, which is called Comparative Analysis of Shapley value (shortly, CASh), is applied to data concerning the gene expression in children differentially exposed to air pollution. The results provided by CASh are compared with the results from a parametric statistical test for testing differential gene expression. Both lists of genes provided by CASh and t-test are informative enough to discriminate exposed subjects on the basis of their gene expression profiles. While many genes are selected in common by CASh and the parametric test, it turns out that the biological interpretation of the differences between these two selections is more interesting, suggesting a different interpretation of the main biological pathways in gene expression regulation for exposed individuals. A simulation study suggests that CASh offers more power than t-test for the detection of differential gene expression variability. Conclusion CASh is successfully applied to gene expression analysis of a data-set where the joint expression behavior of genes may be critical to characterize the expression response to air pollution. 
    We demonstrate a synergistic effect between coalitional games and statistics that resulted in a selection of genes with a potential impact in the regulation of complex pathways. PMID:18764936
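
The Shapley value that CASh builds on can be illustrated with a small worked example. The sketch below is not the authors' implementation: it assumes a toy coalitional game in which the worth of a gene set is the fraction of samples whose abnormally expressed genes all fall inside that set, and it leaves out the bootstrap comparison between conditions entirely. The gene names, the `SUPPORTS` data and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical toy data: for each sample, the set of genes flagged as
# abnormally expressed (names and memberships are illustrative only).
SUPPORTS = [
    {"g1", "g2"},
    {"g1"},
    {"g2", "g3"},
    {"g1", "g2", "g3"},
]
GENES = sorted(set().union(*SUPPORTS))


def worth(coalition):
    """Fraction of samples whose flagged genes all lie inside the coalition."""
    hits = sum(1 for s in SUPPORTS if s and s <= coalition)
    return Fraction(hits, len(SUPPORTS))


def shapley_values(players, v):
    """Exact Shapley value: average marginal contribution over all orderings."""
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = set()
        for p in order:
            before = v(coalition)
            coalition.add(p)
            totals[p] += v(coalition) - before
    return {p: t / len(orderings) for p, t in totals.items()}


phi = shapley_values(GENES, worth)
for gene in GENES:
    print(gene, phi[gene])
```

Enumerating every ordering is only feasible for a handful of genes; a realistic gene count would call for a closed form or a sampling approximation.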

  2. Estimating intrinsic and extrinsic noise from single-cell gene expression measurements

    PubMed Central

    Fu, Audrey Qiuyan; Pachter, Lior

    2017-01-01

    Gene expression is stochastic and displays variation (“noise”) both within and between cells. Intracellular (intrinsic) variance can be distinguished from extracellular (extrinsic) variance by applying the law of total variance to data from two-reporter assays that probe expression of identically regulated gene pairs in single cells. We examine established formulas [Elowitz, M. B., A. J. Levine, E. D. Siggia and P. S. Swain (2002): “Stochastic gene expression in a single cell,” Science, 297, 1183–1186.] for the estimation of intrinsic and extrinsic noise and provide interpretations of them in terms of a hierarchical model. This allows us to derive alternative estimators that minimize bias or mean squared error. We provide a geometric interpretation of these results that clarifies the interpretation in [Elowitz, M. B., A. J. Levine, E. D. Siggia and P. S. Swain (2002): “Stochastic gene expression in a single cell,” Science, 297, 1183–1186.]. We also demonstrate through simulation and re-analysis of published data that the distribution assumptions underlying the hierarchical model have to be satisfied for the estimators to produce sensible results, which highlights the importance of normalization. PMID:27875323
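
As a rough illustration of the decomposition this abstract refers to, the sketch below computes intrinsic, extrinsic and total squared noise from paired reporter measurements, assuming the estimators in the form commonly attributed to Elowitz et al. (2002). It does not implement the alternative estimators derived in the paper. The function name and the reporter intensities are made up for the example.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def noise_decomposition(c, y):
    """Return (intrinsic, extrinsic, total) squared noise for paired reporters.

    c[i] and y[i] are the two reporter intensities measured in cell i.
    """
    if len(c) != len(y) or not c:
        raise ValueError("c and y must be non-empty and of equal length")
    norm = mean(c) * mean(y)
    pairs = list(zip(c, y))
    intrinsic = mean([(ci - yi) ** 2 for ci, yi in pairs]) / (2 * norm)
    extrinsic = (mean([ci * yi for ci, yi in pairs]) - norm) / norm
    total = (mean([ci ** 2 + yi ** 2 for ci, yi in pairs]) - 2 * norm) / (2 * norm)
    return intrinsic, extrinsic, total


# Made-up reporter intensities for five cells (illustrative only).
cfp = [10.0, 12.0, 9.0, 15.0, 11.0]
yfp = [11.0, 13.0, 8.0, 14.0, 12.0]
eta_int2, eta_ext2, eta_tot2 = noise_decomposition(cfp, yfp)
print(eta_int2, eta_ext2, eta_tot2)
```

With these definitions the total is the sum of the intrinsic and extrinsic terms, which mirrors the law of total variance mentioned in the abstract.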

  3. Use of keyword hierarchies to interpret gene expression patterns.

    PubMed

    Masys, D R; Welsh, J B; Lynn Fink, J; Gribskov, M; Klacansky, I; Corbeil, J

    2001-04-01

    High-density microarray technology permits the quantitative and simultaneous monitoring of thousands of genes. The interpretation challenge is to extract relevant information from this large amount of data. A growing variety of statistical analysis approaches are available to identify clusters of genes that share common expression characteristics, but provide no information regarding the biological similarities of genes within clusters. The published literature provides a potential source of information to assist in interpretation of clustering results. We describe a data mining method that uses indexing terms ('keywords') from the published literature linked to specific genes to present a view of the conceptual similarity of genes within a cluster or group of interest. The method takes advantage of the hierarchical nature of Medical Subject Headings used to index citations in the MEDLINE database, and the registry numbers applied to enzymes.

  4. Integrated pathway-based transcription regulation network mining and visualization based on gene expression profiles.

    PubMed

    Kibinge, Nelson; Ono, Naoaki; Horie, Masafumi; Sato, Tetsuo; Sugiura, Tadao; Altaf-Ul-Amin, Md; Saito, Akira; Kanaya, Shigehiko

    2016-06-01

    Conventionally, workflows examining transcription regulation networks from gene expression data involve distinct analytical steps. There is a need for pipelines that unify data mining and inference deduction into a single framework to enhance interpretation and hypothesis generation. We propose a workflow that merges network construction with gene expression data mining, focusing on regulation processes in the context of transcription factor-driven gene regulation. The pipeline implements pathway-based modularization of expression profiles into functional units to improve biological interpretation. The integrated workflow was implemented as a web application (TransReguloNet) with functions that enable pathway visualization and comparison of transcription factor activity between sample conditions defined in the experimental design. The pipeline merges differential expression, network construction, pathway-based abstraction, clustering and visualization. The framework was applied to the analysis of actual expression datasets related to lung, breast and prostate cancer.

  5. FARO server: Meta-analysis of gene expression by matching gene expression signatures to a compendium of public gene expression data.

    PubMed

    Manijak, Mieszko P; Nielsen, Henrik B

    2011-06-11

    Although systematic analysis of gene annotation is a powerful tool for interpreting gene expression data, it is sometimes blurred by incomplete gene annotation, missing expression responses of key genes and secondary gene expression responses. These shortcomings may be partially circumvented by instead matching gene expression signatures to signatures of other experiments. To facilitate this, we present the Functional Association Response by Overlap (FARO) server, which matches input signatures to a compendium of 242 gene expression signatures extracted from more than 1700 Arabidopsis microarray experiments. The result is a publicly available tool for robust characterization of Arabidopsis gene expression experiments that can point to similar experimental factors in other experiments. The server is available at http://www.cbs.dtu.dk/services/faro/.

  6. Clustering change patterns using Fourier transformation with time-course gene expression data.

    PubMed

    Kim, Jaehee

    2011-01-01

    To understand the behavior of genes, it is important to explore how the patterns of gene expression change over a period of time, because biologically related gene groups can share the same change patterns. In this study, the problem of finding similar change patterns is reduced to clustering with the derivative Fourier coefficients. This work is aimed at discovering gene groups with similar change patterns which share similar biological properties. We developed a statistical model using derivative Fourier coefficients to identify similar change patterns of gene expression. We used a model-based method to cluster the Fourier series estimation of derivatives. We applied our model to cluster change patterns of yeast cell cycle microarray expression data with alpha-factor synchronization. It showed that, as the method clusters with the probability-neighboring data, the model-based clustering with our proposed model yielded biologically interpretable results. We expect that our proposed Fourier analysis with suitably chosen smoothing parameters could serve as a useful tool in classifying genes and interpreting possible biological change patterns.
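
The idea of describing a change pattern by the Fourier coefficients of its derivative can be sketched as follows. This is only an illustration under simplifying assumptions (uniformly spaced samples over exactly one period, no smoothing, no model-based clustering) and is not the statistical model of the paper; the function names are hypothetical. It relies on the standard identity that differentiating a_k cos(wt) + b_k sin(wt) gives (w b_k) cos(wt) + (-w a_k) sin(wt), with w = 2*pi*k/period.

```python
import math


def fourier_coefficients(samples, n_harmonics):
    """Return [(a_k, b_k)] for k = 1..n_harmonics.

    Assumes the samples are uniformly spaced over exactly one period and
    that n_harmonics is below half the number of samples.
    """
    n = len(samples)
    coeffs = []
    for k in range(1, n_harmonics + 1):
        a_k = 2 / n * sum(x * math.cos(2 * math.pi * k * j / n)
                          for j, x in enumerate(samples))
        b_k = 2 / n * sum(x * math.sin(2 * math.pi * k * j / n)
                          for j, x in enumerate(samples))
        coeffs.append((a_k, b_k))
    return coeffs


def derivative_coefficients(coeffs, period):
    """Map the (a_k, b_k) of a series to the coefficients of its derivative."""
    out = []
    for k, (a_k, b_k) in enumerate(coeffs, start=1):
        w = 2 * math.pi * k / period
        out.append((w * b_k, -w * a_k))
    return out


# Toy expression profile: one sine cycle sampled at 16 time points.
n_points = 16
profile = [math.sin(2 * math.pi * j / n_points) for j in range(n_points)]
coeffs = fourier_coefficients(profile, n_harmonics=3)
deriv = derivative_coefficients(coeffs, period=1.0)
print(deriv[0])
```

Each gene's list of derivative coefficients could then serve as the feature vector passed to whatever clustering method is chosen.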

  7. Unifying measures of gene function and evolution.

    PubMed

    Wolf, Yuri I; Carmel, Liran; Koonin, Eugene V

    2006-06-22

    Recent genome analyses revealed intriguing correlations between variables characterizing the functioning of a gene, such as expression level (EL), connectivity of genetic and protein-protein interaction networks, and knockout effect, and variables describing gene evolution, such as sequence evolution rate (ER) and propensity for gene loss. Typically, variables within each of these classes are positively correlated, e.g. products of highly expressed genes also have a propensity to be involved in many protein-protein interactions, whereas variables between classes are negatively correlated, e.g. highly expressed genes, on average, evolve slower than weakly expressed genes. Here, we describe principal component (PC) analysis of seven genome-related variables and propose biological interpretations for the first three PCs. The first PC reflects a gene's 'importance', or the 'status' of a gene in the genomic community, with positive contributions from knockout lethality, EL, number of protein-protein interaction partners and the number of paralogues, and negative contributions from sequence ER and gene loss propensity. The next two PCs define a plane that seems to reflect the functional and evolutionary plasticity of a gene. Specifically, PC2 can be interpreted as a gene's 'adaptability' whereby genes with high adaptability readily duplicate, have many genetic interaction partners and tend to be non-essential. PC3 also might reflect the role of a gene in organismal adaptation albeit with a negative rather than a positive contribution of genetic interactions; we provisionally designate this PC 'reactivity'. The interpretation of PC2 and PC3 as measures of a gene's plasticity is compatible with the observation that genes with high values of these PCs tend to be expressed in a condition- or tissue-specific manner. 
    Functional classes of genes substantially vary in status, adaptability and reactivity, with the highest status characteristic of the translation system and cytoskeletal proteins, highest adaptability seen in cellular processes and signalling genes, and top reactivity characteristic of metabolic enzymes.

  8. SoxB1-driven transcriptional network underlies neural-specific interpretation of morphogen signals.

    PubMed

    Oosterveen, Tony; Kurdija, Sanja; Ensterö, Mats; Uhde, Christopher W; Bergsland, Maria; Sandberg, Magnus; Sandberg, Rickard; Muhr, Jonas; Ericson, Johan

    2013-04-30

    The reiterative deployment of a small cadre of morphogen signals underlies patterning and growth of most tissues during embryogenesis, but how such inductive events result in tissue-specific responses remains poorly understood. By characterizing cis-regulatory modules (CRMs) associated with genes regulated by Sonic hedgehog (Shh), retinoids, or bone morphogenetic proteins in the CNS, we provide evidence that the neural-specific interpretation of morphogen signaling reflects a direct integration of these pathways with SoxB1 proteins at the CRM level. Moreover, expression of SoxB1 proteins in the limb bud confers on mesodermal cells the potential to activate neural-specific target genes upon Shh, retinoid, or bone morphogenetic protein signaling, and the collocation of binding sites for SoxB1 and morphogen-mediatory transcription factors in CRMs faithfully predicts neural-specific gene activity. Thus, an unexpectedly simple transcriptional paradigm appears to conceptually explain the neural-specific interpretation of pleiotropic signaling during vertebrate development. Importantly, genes induced in a SoxB1-dependent manner appear to constitute repressive gene regulatory networks that are directly interlinked at the CRM level to constrain the regional expression of patterning genes. Accordingly, not only does the topology of SoxB1-driven gene regulatory networks provide a tissue-specific mode of gene activation, but it also determines the spatial expression pattern of target genes within the developing neural tube.

  9. BioCichlid: central dogma-based 3D visualization system of time-course microarray data on a hierarchical biological network.

    PubMed

    Ishiwata, Ryosuke R; Morioka, Masaki S; Ogishima, Soichi; Tanaka, Hiroshi

    2009-02-15

    BioCichlid is a 3D visualization system for time-course microarray data on molecular networks, aimed at interpreting gene expression data through transcriptional relationships based on the central dogma, together with physical and genetic interactions. BioCichlid visualizes both physical (protein) and genetic (regulatory) network layers, and provides animation of time-course gene expression data on the genetic network layer. Transcriptional regulations are represented to bridge the physical network (transcription factors) and genetic network (regulated genes) layers, thus integrating promoter analysis into the pathway mapping. BioCichlid enhances the interpretation of microarray data and helps reveal the underlying mechanisms causing differential gene expression. BioCichlid is freely available and can be accessed at http://newton.tmd.ac.jp/. Source code for both the BioCichlid server and client is also available.

  10. TANDEM: a two-stage approach to maximize interpretability of drug response models based on multiple molecular data types.

    PubMed

    Aben, Nanne; Vis, Daniel J; Michaut, Magali; Wessels, Lodewyk F A

    2016-09-01

    Clinical response to anti-cancer drugs varies between patients. A large portion of this variation can be explained by differences in molecular features, such as mutation status, copy number alterations, methylation and gene expression profiles. We show that the classic approach for combining these molecular features (Elastic Net regression on all molecular features simultaneously) results in models that are almost exclusively based on gene expression. The gene expression features selected by the classic approach are difficult to interpret as they often represent poorly studied combinations of genes, activated by aberrations in upstream signaling pathways. To utilize all data types in a more balanced way, we developed TANDEM, a two-stage approach in which the first stage explains response using upstream features (mutations, copy number, methylation and cancer type) and the second stage explains the remainder using downstream features (gene expression). Applying TANDEM to 934 cell lines profiled across 265 drugs (GDSC1000), we show that the resulting models are more interpretable, while retaining the same predictive performance as the classic approach. Using the more balanced contributions per data type as determined with TANDEM, we find that response to MAPK pathway inhibitors is largely predicted by mutation data, while predicting response to DNA damaging agents requires gene expression data, in particular SLFN11 expression. TANDEM is available as an R package on CRAN (for more information, see http://ccb.nki.nl/software/tandem).
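
The two-stage idea can be sketched in a few lines. This is not the TANDEM package or its API: for brevity the sketch swaps Elastic Net for plain one-variable least squares, uses a single upstream feature and a single downstream feature, and runs on made-up numbers. All function and variable names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y ~ a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def two_stage_fit(upstream, downstream, response):
    """Stage 1 fits the response on the upstream feature; stage 2 fits the
    stage-1 residuals on the downstream feature."""
    a1, b1 = fit_line(upstream, response)
    residual = [r - (a1 + b1 * u) for u, r in zip(upstream, response)]
    a2, b2 = fit_line(downstream, residual)
    return (a1, b1), (a2, b2)


def predict(model, u, d):
    """Prediction is the sum of the two stage contributions."""
    (a1, b1), (a2, b2) = model
    return (a1 + b1 * u) + (a2 + b2 * d)


# Made-up data for six cell lines (illustrative only).
mutation = [0, 0, 1, 1, 0, 1]                # upstream feature
expression = [1.0, 2.0, 1.5, 3.0, 2.0, 0.5]  # downstream feature
response = [1.0, 1.4, 3.1, 3.9, 1.2, 2.6]    # drug response

model = two_stage_fit(mutation, expression, response)
print(model)
print(predict(model, u=1, d=2.0))
```

Because the downstream feature only ever sees what the upstream stage left unexplained, variation that both feature types could account for is credited to the upstream feature, which is the balance the abstract describes.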

  11. Mining microarray data at NCBI's Gene Expression Omnibus (GEO)*.

    PubMed

    Barrett, Tanya; Edgar, Ron

    2006-01-01

    The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) has emerged as the leading fully public repository for gene expression data. This chapter describes how to use Web-based interfaces, applications, and graphics to effectively explore, visualize, and interpret the hundreds of microarray studies and millions of gene expression patterns stored in GEO. Data can be examined from both experiment-centric and gene-centric perspectives using user-friendly tools that do not require specialized expertise in microarray analysis or time-consuming download of massive data sets. The GEO database is publicly accessible through the World Wide Web at http://www.ncbi.nlm.nih.gov/geo.

  12. Molluscan engrailed expression, serial organization, and shell evolution

    NASA Technical Reports Server (NTRS)

    Jacobs, D. K.; Wray, C. G.; Wedeen, C. J.; Kostriken, R.; DeSalle, R.; Staton, J. L.; Gates, R. D.; Lindberg, D. R.

    2000-01-01

    Whether the serial features found in some molluscs are ancestral or derived is considered controversial. Here, in situ hybridization and antibody studies show iterated engrailed-gene expression in transverse rows of ectodermal cells bounding plate field development and spicule formation in the chiton, Lepidochitona caverna, as well as in cells surrounding the valves and in the early development of the shell hinge in the clam, Transennella tantilla. Ectodermal expression of engrailed is associated with skeletogenesis across a range of bilaterian phyla, suggesting a single evolutionary origin of invertebrate skeletons. The shared ancestry of bilaterian-invertebrate skeletons may help explain the sudden appearance of shelly fossils in the Cambrian. Our interpretation departs from the consideration of canonical metameres or segments as units of evolutionary analysis. In this interpretation, the shared ancestry of engrailed-gene function in the terminal/posterior addition of serially repeated elements during development explains the iterative expression of engrailed genes in a range of metazoan body plans.

  13. Analysis of bHLH coding genes using gene co-expression network approach.

    PubMed

    Srivastava, Swati; Sanchita; Singh, Garima; Singh, Noopur; Srivastava, Gaurava; Sharma, Ashok

    2016-07-01

    Network analysis provides a powerful framework for the interpretation of data. It uses novel reference network-based metrics for module evolution. These can be used to identify modules of highly connected genes showing variation in a co-expression network. In this study, a co-expression network-based approach was used to analyze genes from microarray data. Our approach consists of a simple but robust rank-based network construction. Publicly available gene expression data of Solanum tuberosum under cold and heat stress were used to create and analyze a gene co-expression network. The analysis provides a highly co-expressed module of bHLH coding genes based on correlation values. Our approach was to analyze the variation in gene expression according to the time period of stress through the co-expression network. As a result, seed genes showing multiple connections with other genes in the same cluster were identified. Seed genes were found to vary across the different time periods of stress. These seed genes may be used further as marker genes for developing stress-tolerant plant species.

  14. Prediction of gene expression with cis-SNPs using mixed models and regularization methods.

    PubMed

    Zeng, Ping; Zhou, Xiang; Huang, Shuiping

    2017-05-11

    It has been shown that gene expression in human tissues is heritable, thus predicting gene expression using only SNPs becomes possible. The prediction of gene expression can offer important implications on the genetic architecture of individual functional associated SNPs and further interpretations of the molecular basis underlying human diseases. We compared three types of methods for predicting gene expression using only cis-SNPs, including the polygenic model, i.e. linear mixed model (LMM), two sparse models, i.e. Lasso and elastic net (ENET), and the hybrid of LMM and sparse model, i.e. Bayesian sparse linear mixed model (BSLMM). The three kinds of prediction methods have very different assumptions of underlying genetic architectures. These methods were evaluated using simulations under various scenarios, and were applied to the Geuvadis gene expression data. The simulations showed that these four prediction methods (i.e. Lasso, ENET, LMM and BSLMM) behaved best when their respective modeling assumptions were satisfied, but BSLMM had a robust performance across a range of scenarios. According to R² of these models in the Geuvadis data, the four methods performed quite similarly. We did not observe any clustering or enrichment of predictive genes (defined as genes with R² ≥ 0.05) across the chromosomes, and also did not see any clear relationship between the proportion of the predictive genes and the proportion of genes in each chromosome. However, an interesting finding in the Geuvadis data was that highly predictive genes (e.g. R² ≥ 0.30) may have sparse genetic architectures since Lasso, ENET and BSLMM outperformed LMM for these genes; and this observation was validated in another gene expression data set. We further showed that the predictive genes were enriched in approximately independent LD blocks. Gene expression can be predicted with only cis-SNPs using well-developed prediction models and these predictive genes were enriched in some approximately independent LD blocks. The prediction of gene expression can shed some light on the functional interpretation for identified SNPs in GWASs.

  15. Identification and resolution of artifacts in the interpretation of imprinted gene expression.

    PubMed

    Proudhon, Charlotte; Bourc'his, Déborah

    2010-12-01

    Genomic imprinting refers to genes that are epigenetically programmed in the germline to express exclusively or preferentially one allele in a parent-of-origin manner. Expression-based genome-wide screening for the identification of imprinted genes has failed to uncover a significant number of new imprinted genes, probably because of the high tissue- and developmental-stage specificity of imprinted gene expression. A very large number of technical and biological artifacts can also lead to the erroneous evidence of imprinted gene expression. In this article, we focus on three common sources of potential confounding effects: (i) random monoallelic expression in monoclonal cell populations, (ii) genetically determined monoallelic expression and (iii) contamination or infiltration of embryonic tissues with maternal material. This last situation specifically applies to genes that occur as maternally expressed in the placenta. Beside the use of reciprocal crosses that are instrumental to confirm the parental specificity of expression, we provide additional methods for the detection and elimination of these situations that can be misinterpreted as cases of imprinted expression.

  16. Exploratory Visual Analysis of Statistical Results from Microarray Experiments Comparing High and Low Grade Glioma

    PubMed Central

    Reif, David M.; Israel, Mark A.; Moore, Jason H.

    2007-01-01

    The biological interpretation of gene expression microarray results is a daunting challenge. For complex diseases such as cancer, wherein the body of published research is extensive, the incorporation of expert knowledge provides a useful analytical framework. We have previously developed the Exploratory Visual Analysis (EVA) software for exploring data analysis results in the context of annotation information about each gene, as well as biologically relevant groups of genes. We present EVA as a flexible combination of statistics and biological annotation that provides a straightforward visual interface for the interpretation of microarray analyses of gene expression in the most commonly occurring class of brain tumors, glioma. We demonstrate the utility of EVA for the biological interpretation of statistical results by analyzing publicly available gene expression profiles of two important glial tumors. The results of a statistical comparison between 21 malignant, high-grade glioblastoma multiforme (GBM) tumors and 19 indolent, low-grade pilocytic astrocytomas were analyzed using EVA. By using EVA to examine the results of a relatively simple statistical analysis, we were able to identify tumor class-specific gene expression patterns having both statistical and biological significance. Our interactive analysis highlighted the potential importance of genes involved in cell cycle progression, proliferation, signaling, adhesion, migration, motility, and structure, as well as candidate gene loci on a region of Chromosome 7 that has been implicated in glioma. Because EVA does not require statistical or computational expertise and has the flexibility to accommodate any type of statistical analysis, we anticipate EVA will prove a useful addition to the repertoire of computational methods used for microarray data analysis. EVA is available at no charge to academic users and can be found at http://www.epistasis.org. PMID:19390666

  17. Genome Expression Pathway Analysis Tool – Analysis and visualization of microarray gene expression data under genomic, proteomic and metabolic context

    PubMed Central

    Weniger, Markus; Engelmann, Julia C; Schultz, Jörg

    2007-01-01

    Background Regulation of gene expression is relevant to many areas of biology and medicine, in the study of treatments, diseases, and developmental stages. Microarrays can be used to measure the expression level of thousands of mRNAs at the same time, allowing insight into or comparison of different cellular conditions. The data derived out of microarray experiments is highly dimensional and often noisy, and interpretation of the results can get intricate. Although programs for the statistical analysis of microarray data exist, most of them lack an integration of analysis results and biological interpretation. Results We have developed GEPAT, Genome Expression Pathway Analysis Tool, offering an analysis of gene expression data under genomic, proteomic and metabolic context. We provide an integration of statistical methods for data import and data analysis together with a biological interpretation for subsets of probes or single probes on the chip. GEPAT imports various types of oligonucleotide and cDNA array data formats. Different normalization methods can be applied to the data, afterwards data annotation is performed. After import, GEPAT offers various statistical data analysis methods, as hierarchical, k-means and PCA clustering, a linear model based t-test or chromosomal profile comparison. The results of the analysis can be interpreted by enrichment of biological terms, pathway analysis or interaction networks. Different biological databases are included, to give various information for each probe on the chip. GEPAT offers no linear work flow, but allows the usage of any subset of probes and samples as a start for a new data analysis. GEPAT relies on established data analysis packages, offers a modular approach for an easy extension, and can be run on a computer grid to allow a large number of users. It is freely available under the LGPL open source license for academic and commercial users at . 
    Conclusion GEPAT is a modular, scalable and professional-grade software integrating analysis and interpretation of microarray gene expression data. An installation available for academic users can be found at . PMID:17543125

  18. Stability-driven nonnegative matrix factorization to interpret spatial gene expression and build local gene networks.

    PubMed

    Wu, Siqi; Joseph, Antony; Hammonds, Ann S; Celniker, Susan E; Yu, Bin; Frise, Erwin

    2016-04-19

    Spatial gene expression patterns enable the detection of local covariability and are extremely useful for identifying local gene interactions during normal development. The abundance of spatial expression data in recent years has led to the modeling and analysis of regulatory networks. The inherent complexity of such data makes it a challenge to extract biological information. We developed staNMF, a method that combines a scalable implementation of nonnegative matrix factorization (NMF) with a new stability-driven model selection criterion. When applied to a set of Drosophila early embryonic spatial gene expression images, one of the largest datasets of its kind, staNMF identified 21 principal patterns (PP). Providing a compact yet biologically interpretable representation of Drosophila expression patterns, PP are comparable to a fate map generated experimentally by laser ablation and show exceptional promise as a data-driven alternative to manual annotations. Our analysis mapped genes to cell-fate programs and assigned putative biological roles to uncharacterized genes. Finally, we used the PP to generate local transcription factor regulatory networks. Spatially local correlation networks were constructed for six PP that span along the embryonic anterior-posterior axis. Using a two-tail 5% cutoff on correlation, we reproduced 10 of the 11 links in the well-studied gap gene network. The performance of PP with the Drosophila data suggests that staNMF provides informative decompositions and constitutes a useful computational lens through which to extract biological insight from complex and often noisy gene expression data.

  19. Stability-driven nonnegative matrix factorization to interpret spatial gene expression and build local gene networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, Siqi; Joseph, Antony; Hammonds, Ann S.

    Spatial gene expression patterns enable the detection of local covariability and are extremely useful for identifying local gene interactions during normal development. The abundance of spatial expression data in recent years has led to the modeling and analysis of regulatory networks. The inherent complexity of such data makes it a challenge to extract biological information. We developed staNMF, a method that combines a scalable implementation of nonnegative matrix factorization (NMF) with a new stability-driven model selection criterion. When applied to a set of Drosophila early embryonic spatial gene expression images, one of the largest datasets of its kind, staNMF identified 21 principal patterns (PP). Providing a compact yet biologically interpretable representation of Drosophila expression patterns, PP are comparable to a fate map generated experimentally by laser ablation and show exceptional promise as a data-driven alternative to manual annotations. Our analysis mapped genes to cell-fate programs and assigned putative biological roles to uncharacterized genes. Finally, we used the PP to generate local transcription factor regulatory networks. Spatially local correlation networks were constructed for six PP that span along the embryonic anterior-posterior axis. Using a two-tail 5% cutoff on correlation, we reproduced 10 of the 11 links in the well-studied gap gene network. In conclusion, the performance of PP with the Drosophila data suggests that staNMF provides informative decompositions and constitutes a useful computational lens through which to extract biological insight from complex and often noisy gene expression data.

  20. Stability-driven nonnegative matrix factorization to interpret spatial gene expression and build local gene networks

    DOE PAGES

    Wu, Siqi; Joseph, Antony; Hammonds, Ann S.; ...

    2016-04-06

    Spatial gene expression patterns enable the detection of local covariability and are extremely useful for identifying local gene interactions during normal development. The abundance of spatial expression data in recent years has led to the modeling and analysis of regulatory networks. The inherent complexity of such data makes it a challenge to extract biological information. We developed staNMF, a method that combines a scalable implementation of nonnegative matrix factorization (NMF) with a new stability-driven model selection criterion. When applied to a set of Drosophila early embryonic spatial gene expression images, one of the largest datasets of its kind, staNMF identified 21 principal patterns (PP). Providing a compact yet biologically interpretable representation of Drosophila expression patterns, PP are comparable to a fate map generated experimentally by laser ablation and show exceptional promise as a data-driven alternative to manual annotations. Our analysis mapped genes to cell-fate programs and assigned putative biological roles to uncharacterized genes. Finally, we used the PP to generate local transcription factor regulatory networks. Spatially local correlation networks were constructed for six PP that span along the embryonic anterior-posterior axis. Using a two-tail 5% cutoff on correlation, we reproduced 10 of the 11 links in the well-studied gap gene network. In conclusion, the performance of PP with the Drosophila data suggests that staNMF provides informative decompositions and constitutes a useful computational lens through which to extract biological insight from complex and often noisy gene expression data.
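
    The idea of choosing the number of NMF components by stability can be illustrated with a small sketch. This is not the staNMF implementation: the function names (`nmf`, `dictionary_dissimilarity`, `instability`), the multiplicative-update solver, and the cosine-based dissimilarity between dictionaries are all assumptions made for illustration. The sketch fits NMF from several random starts for a given K and reports how much the learned dictionaries disagree on average.

```python
import numpy as np

def nmf(X, k, n_iter=200, seed=0):
    """Plain multiplicative-update NMF (X ~ W @ H); illustrative solver only."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + 1e-3
    H = rng.random((k, m)) + 1e-3
    eps = 1e-9  # guards against division by zero
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def dictionary_dissimilarity(W1, W2):
    """Dissimilarity in [0, 1] between two dictionaries, invariant to
    column order and scale (an assumed, cosine-based measure)."""
    A = W1 / (np.linalg.norm(W1, axis=0, keepdims=True) + 1e-12)
    B = W2 / (np.linalg.norm(W2, axis=0, keepdims=True) + 1e-12)
    C = A.T @ B  # cosine similarity between every pair of columns
    k = C.shape[0]
    return (2 * k - C.max(axis=0).sum() - C.max(axis=1).sum()) / (2 * k)

def instability(X, k, n_runs=5):
    """Mean pairwise dissimilarity of dictionaries across random restarts."""
    Ws = [nmf(X, k, seed=s)[0] for s in range(n_runs)]
    pairs = [dictionary_dissimilarity(Ws[i], Ws[j])
             for i in range(n_runs) for j in range(i + 1, n_runs)]
    return float(np.mean(pairs))

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X = rng.random((30, 4)) @ rng.random((4, 25))  # synthetic rank-4 data
    for k in (2, 3, 4, 5, 6):
        print(k, round(instability(X, k), 4))
```

    Under this scheme one would scan a range of K values and prefer a K with low instability; on real expression images the data would first need to be arranged as a nonnegative matrix (for example, pixels by genes), which is outside the scope of this sketch.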

  1. MEXPRESS: visualizing expression, DNA methylation and clinical TCGA data.

    PubMed

    Koch, Alexander; De Meyer, Tim; Jeschke, Jana; Van Criekinge, Wim

    2015-08-26

    In recent years, increasing amounts of genomic and clinical cancer data have become publicly available through large-scale collaborative projects such as The Cancer Genome Atlas (TCGA). However, as long as these datasets are difficult to access and interpret, they are essentially useless for a major part of the research community and their scientific potential will not be fully realized. To address these issues we developed MEXPRESS, a straightforward and easy-to-use web tool for the integration and visualization of the expression, DNA methylation and clinical TCGA data on a single-gene level ( http://mexpress.be ). In comparison to existing tools, MEXPRESS allows researchers to quickly visualize and interpret the different TCGA datasets and their relationships for a single gene, as demonstrated for GSTP1 in prostate adenocarcinoma. We also used MEXPRESS to reveal the differences in the DNA methylation status of the PAM50 marker gene MLPH between the breast cancer subtypes and how these differences were linked to the expression of MLPH. We have created a user-friendly tool for the visualization and interpretation of TCGA data, offering clinical researchers a simple way to evaluate the TCGA data for their genes or candidate biomarkers of interest.

  2. Ethanol modifies the effect of handling stress on gene expression: problems in the analysis of two-way gene expression studies in mouse brain.

    PubMed

    Rulten, Stuart L; Ripley, Tamzin L; Manerakis, Ektor; Stephens, David N; Mayne, Lynne V

    2006-08-02

    Studies analysing the effects of acute treatments on animal behaviour and brain biochemistry frequently use pairwise comparisons between sham-treated and -untreated animals. In this study, we analyse expression of tPA, Grik2, Smarca2 and the transcription factor, Sp1, in mouse cerebellum following acute ethanol treatment. Expression is compared to saline-injected and -untreated control animals. We demonstrate that acute i.p. injection of saline may alter gene expression in a gene-specific manner and that ethanol may modify the effects of sham treatment on gene expression, as well as inducing specific effects independent of any handling related stress. In addition to demonstrating the complexity of gene expression in response to physical and environmental stress, this work raises questions on the interpretation and validity of studies relying on pairwise comparisons.

  3. Mining Microarray Data at NCBI’s Gene Expression Omnibus (GEO)*

    PubMed Central

    Barrett, Tanya; Edgar, Ron

    2006-01-01

    Summary The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) has emerged as the leading fully public repository for gene expression data. This chapter describes how to use Web-based interfaces, applications, and graphics to effectively explore, visualize, and interpret the hundreds of microarray studies and millions of gene expression patterns stored in GEO. Data can be examined from both experiment-centric and gene-centric perspectives using user-friendly tools that do not require specialized expertise in microarray analysis or time-consuming download of massive data sets. The GEO database is publicly accessible through the World Wide Web at http://www.ncbi.nlm.nih.gov/geo. PMID:16888359

  4. Selection and validation of reference genes for quantitative gene expression analyses in various tissues and seeds at different developmental stages in Bixa orellana L.

    PubMed

    Moreira, Viviane S; Soares, Virgínia L F; Silva, Raner J S; Sousa, Aurizangela O; Otoni, Wagner C; Costa, Marcio G C

    2018-05-01

    Bixa orellana L., popularly known as annatto, produces several secondary metabolites of pharmaceutical and industrial interest, including bixin, whose molecular basis of biosynthesis remains to be determined. Gene expression analysis by quantitative real-time PCR (qPCR) is an important tool to advance such knowledge. However, correct interpretation of qPCR data requires the use of suitable reference genes in order to reduce experimental variations. In the present study, we have selected four different candidates for reference genes in B. orellana, coding for 40S ribosomal protein S9 (RPS9), histone H4 (H4), 60S ribosomal protein L38 (RPL38) and 18S ribosomal RNA (18SrRNA). Their expression stabilities in different tissues (e.g. flower buds, flowers, leaves and seeds at different developmental stages) were analyzed using five statistical tools (NormFinder, geNorm, BestKeeper, ΔCt method and RefFinder). The results indicated that RPL38 is the most stable gene in different tissues and stages of seed development and 18SrRNA is the most unstable among the analyzed genes. In order to validate the candidate reference genes, we have analyzed the relative expression of a target gene coding for carotenoid cleavage dioxygenase 1 (CCD1) using the stable RPL38 and the least stable gene, 18SrRNA, for normalization of the qPCR data. The results demonstrated significant differences in the interpretation of the CCD1 gene expression data, depending on the reference gene used, reinforcing the importance of the correct selection of reference genes for normalization.
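
    Why the reference gene matters can be seen in a small worked example. The sketch below assumes the common 2^(-ΔΔCt) calculation with roughly 100% amplification efficiency; the abstract does not state which formula the authors used, and every Ct value here is hypothetical, chosen only to show the effect.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calib, ct_ref_calib):
    """Fold change of a target gene by the 2^(-ddCt) method.

    Assumes ~100% amplification efficiency for target and reference.
    """
    d_ct_sample = ct_target_sample - ct_ref_sample  # dCt in the sample
    d_ct_calib = ct_target_calib - ct_ref_calib     # dCt in the calibrator
    return 2.0 ** -(d_ct_sample - d_ct_calib)

# Hypothetical Ct values for a target gene in two tissues.
target_sample, target_calib = 24.0, 26.0

# A stable reference has the same Ct in both tissues.
stable = relative_expression(target_sample, 20.0, target_calib, 20.0)

# An unstable reference shifts by 2 cycles between tissues.
unstable = relative_expression(target_sample, 18.0, target_calib, 20.0)

print(stable)    # 4.0 -> target appears 4-fold up-regulated
print(unstable)  # 1.0 -> the same target appears unchanged
```

    With identical target measurements, the stable reference reports a four-fold increase while the drifting reference reports no change, which is the kind of discrepancy in interpretation the abstract describes.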

  5. DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data

    PubMed Central

    Glez-Peña, Daniel; Álvarez, Rodrigo; Díaz, Fernando; Fdez-Riverola, Florentino

    2009-01-01

    Background Expression profiling assays done by using DNA microarray technology generate enormous data sets that are not amenable to simple analysis. The greatest challenge in maximizing the use of this huge amount of data is to develop algorithms to interpret and interconnect results from different genes under different conditions. In this context, fuzzy logic can provide a systematic and unbiased way to both (i) find biologically significant insights relating to meaningful genes, thereby removing the need for expert knowledge in preliminary steps of microarray data analyses and (ii) reduce the cost and complexity of later applied machine learning techniques being able to achieve interpretable models. Results DFP is a new Bioconductor R package that implements a method for discretizing and selecting differentially expressed genes based on the application of fuzzy logic. DFP takes advantage of fuzzy membership functions to assign linguistic labels to gene expression levels. The technique builds a reduced set of relevant genes (FP, Fuzzy Pattern) able to summarize and represent each underlying class (pathology). A last step constructs a biased set of genes (DFP, Discriminant Fuzzy Pattern) by intersecting existing fuzzy patterns in order to detect discriminative elements. In addition, the software provides new functions and visualisation tools that summarize achieved results and aid in the interpretation of differentially expressed genes from multiple microarray experiments. Conclusion DFP integrates with other packages of the Bioconductor project, uses common data structures and is accompanied by ample documentation. It has the advantage that its parameters are highly configurable, facilitating the discovery of biologically relevant connections between sets of genes belonging to different pathologies. This information makes it possible to automatically filter irrelevant genes thereby reducing the large volume of data supplied by microarray experiments. Based on these contributions GENECBR, a successful tool for cancer diagnosis using microarray datasets, has recently been released. PMID:19178723
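
    Assigning linguistic labels to expression levels through fuzzy membership functions can be sketched as follows. This is not the DFP package's API (DFP is an R package); the function names, the piecewise-linear membership shapes, and the three breakpoints are hypothetical choices for illustration only.

```python
def memberships(x, low, mid, high):
    """Fuzzy membership of expression value x in 'Low', 'Medium', 'High'.

    Uses a left shoulder, a triangle and a right shoulder with
    breakpoints low < mid < high (assumed, piecewise-linear shapes).
    """
    if x <= low:
        m_low, m_med, m_high = 1.0, 0.0, 0.0
    elif x < mid:
        m_med = (x - low) / (mid - low)
        m_low, m_high = 1.0 - m_med, 0.0
    elif x < high:
        m_high = (x - mid) / (high - mid)
        m_low, m_med = 0.0, 1.0 - m_high
    else:
        m_low, m_med, m_high = 0.0, 0.0, 1.0
    return {"Low": m_low, "Medium": m_med, "High": m_high}

def linguistic_label(x, low, mid, high):
    """Label with the largest membership for expression value x."""
    m = memberships(x, low, mid, high)
    return max(m, key=m.get)

# Hypothetical log2 expression values and breakpoints.
for value in (5.0, 8.0, 13.0):
    print(value, memberships(value, 4.0, 8.0, 12.0),
          linguistic_label(value, 4.0, 8.0, 12.0))
```

    A gene whose samples within one class mostly share the same label could then contribute to that class's pattern; how DFP actually derives its membership functions and thresholds is described in the package documentation rather than here.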

  6. DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data.

    PubMed

    Glez-Peña, Daniel; Alvarez, Rodrigo; Díaz, Fernando; Fdez-Riverola, Florentino

    2009-01-29

    Expression profiling assays done by using DNA microarray technology generate enormous data sets that are not amenable to simple analysis. The greatest challenge in maximizing the use of this huge amount of data is to develop algorithms to interpret and interconnect results from different genes under different conditions. In this context, fuzzy logic can provide a systematic and unbiased way to both (i) find biologically significant insights relating to meaningful genes, thereby removing the need for expert knowledge in preliminary steps of microarray data analyses and (ii) reduce the cost and complexity of later applied machine learning techniques being able to achieve interpretable models. DFP is a new Bioconductor R package that implements a method for discretizing and selecting differentially expressed genes based on the application of fuzzy logic. DFP takes advantage of fuzzy membership functions to assign linguistic labels to gene expression levels. The technique builds a reduced set of relevant genes (FP, Fuzzy Pattern) able to summarize and represent each underlying class (pathology). A last step constructs a biased set of genes (DFP, Discriminant Fuzzy Pattern) by intersecting existing fuzzy patterns in order to detect discriminative elements. In addition, the software provides new functions and visualisation tools that summarize achieved results and aid in the interpretation of differentially expressed genes from multiple microarray experiments. DFP integrates with other packages of the Bioconductor project, uses common data structures and is accompanied by ample documentation. It has the advantage that its parameters are highly configurable, facilitating the discovery of biologically relevant connections between sets of genes belonging to different pathologies. This information makes it possible to automatically filter irrelevant genes thereby reducing the large volume of data supplied by microarray experiments. Based on these contributions GENECBR, a successful tool for cancer diagnosis using microarray datasets, has recently been released.

  7. Molecular profiles to biology and pathways: a systems biology approach.

    PubMed

    Van Laere, Steven; Dirix, Luc; Vermeulen, Peter

    2016-06-16

    Interpreting molecular profiles in a biological context requires specialized analysis strategies. Initially, lists of relevant genes were screened to identify enriched concepts associated with pathways or specific molecular processes. However, the shortcoming of interpreting gene lists by using predefined sets of genes has resulted in the development of novel methods that heavily rely on network-based concepts. These algorithms have the advantage that they allow a more holistic view of the signaling properties of the condition under study as well as that they are suitable for integrating different data types like gene expression, gene mutation, and even histological parameters.

  8. Identification and resolution of artifacts in the interpretation of imprinted gene expression

    PubMed Central

    Proudhon, Charlotte

    2010-01-01

    Genomic imprinting refers to genes that are epigenetically programmed in the germline to express exclusively or preferentially one allele in a parent-of-origin manner. Expression-based genome-wide screening for the identification of imprinted genes has failed to uncover a significant number of new imprinted genes, probably because of the high tissue- and developmental-stage specificity of imprinted gene expression. A very large number of technical and biological artifacts can also lead to the erroneous evidence of imprinted gene expression. In this article, we focus on three common sources of potential confounding effects: (i) random monoallelic expression in monoclonal cell populations, (ii) genetically determined monoallelic expression and (iii) contamination or infiltration of embryonic tissues with maternal material. This last situation specifically applies to genes that occur as maternally expressed in the placenta. Beside the use of reciprocal crosses that are instrumental to confirm the parental specificity of expression, we provide additional methods for the detection and elimination of these situations that can be misinterpreted as cases of imprinted expression. PMID:20829207

  9. Application of dynamic topic models to toxicogenomics data.

    PubMed

    Lee, Mikyung; Liu, Zhichao; Huang, Ruili; Tong, Weida

    2016-10-06

    All biological processes are inherently dynamic. Biological systems evolve transiently or sustainably according to sequential time points after perturbation by environmental insults, drugs and chemicals. Investigating the temporal behavior of molecular events has been an important subject to understand the underlying mechanisms governing the biological system in response to perturbations such as drug treatment. The intrinsic complexity of time series data requires appropriate computational algorithms for data interpretation. In this study, we propose, for the first time, the application of dynamic topic models (DTM) for analyzing time-series gene expression data. A large time-series toxicogenomics dataset was studied. It contains over 3144 microarrays of gene expression data corresponding to rat livers treated with 131 compounds (most are drugs) at two doses (control and high dose) in a repeated schedule containing four separate time points (4-, 8-, 15- and 29-day). We analyzed, with DTM, the topics (consisting of a set of genes) and their biological interpretations over these four time points. We identified hidden patterns embedded in these time-series gene expression profiles. From the topic distribution for compound-time condition, a number of drugs were successfully clustered by their shared mode-of-action such as PPARα agonists and COX inhibitors. The biological meaning underlying each topic was interpreted using diverse sources of information such as functional analysis of the pathways and therapeutic uses of the drugs. Additionally, we found that sample clusters produced by DTM are much more coherent in terms of functional categories when compared to traditional clustering algorithms. We demonstrated that DTM, a text mining technique, can be a powerful computational approach for clustering time-series gene expression profiles with the probabilistic representation of their dynamic features along sequential time frames. The method offers an alternative way for uncovering hidden patterns embedded in time series gene expression profiles to gain enhanced understanding of dynamic behavior of gene regulation in the biological system.

  10. Gene expression profile differences in left and right liver lobes from mid-gestation fetal baboons: a cautionary tale

    PubMed Central

    Cox, Laura A; Schlabritz-Loutsevitch, Natalia; Hubbard, Gene B; Nijland, Mark J; McDonald, Thomas J; Nathanielsz, Peter W

    2006-01-01

    Interpretation of gene array data presents many potential pitfalls in adult tissues. Gene array techniques applied to fetal tissues present additional confounding pitfalls. The left lobe of the fetal liver is supplied with blood containing more oxygen than the right lobe. Since synthetic activity and cell function are oxygen dependent, we hypothesized major differences in mRNA expression between the fetal right and left liver lobes. Our aim was to demonstrate the need to evaluate RNA samples from both lobes. We performed whole genome expression profiling on left and right liver lobe RNA from six 90-day gestation baboon fetuses (term 180 days). Comparing right with left, we found 875 differentially expressed genes – 312 genes were up-regulated and 563 down-regulated. Pathways for damaged DNA binding, endonuclease activity, interleukin binding and receptor activity were up-regulated in right lobe; ontological pathways related to cell signalling, cell organization, cell biogenesis, development, intracellular transport, phospholipid metabolism, protein biosynthesis, protein localization, protein metabolism, translational regulation and vesicle mediated transport were down-regulated in right lobe. Molecular pathway analysis showed down-regulation of pathways related to heat shock protein binding, ion channel and transporter activities, oxygen binding and transporter activities, translation initiation and translation regulator activities. Genes involved in amino acid biosynthesis, lipid biosynthesis and oxygen transport were also differentially expressed. This is the first demonstration of RNA differences between the two lobes of the fetal liver. The data support the argument that a complete interpretation of gene expression in the developing liver requires data from both lobes. PMID:16484296

  11. Comparison of gene expression microarray data with count-based RNA measurements informs microarray interpretation.

    PubMed

    Richard, Arianne C; Lyons, Paul A; Peters, James E; Biasci, Daniele; Flint, Shaun M; Lee, James C; McKinney, Eoin F; Siegel, Richard M; Smith, Kenneth G C

    2014-08-04

    Although numerous investigations have compared gene expression microarray platforms, preprocessing methods and batch correction algorithms using constructed spike-in or dilution datasets, there remains a paucity of studies examining the properties of microarray data using diverse biological samples. Most microarray experiments seek to identify subtle differences between samples with variable background noise, a scenario poorly represented by constructed datasets. Thus, microarray users lack important information regarding the complexities introduced in real-world experimental settings. The recent development of a multiplexed, digital technology for nucleic acid measurement enables counting of individual RNA molecules without amplification and, for the first time, permits such a study. Using a set of human leukocyte subset RNA samples, we compared previously acquired microarray expression values with RNA molecule counts determined by the nCounter Analysis System (NanoString Technologies) in selected genes. We found that gene measurements across samples correlated well between the two platforms, particularly for high-variance genes, while genes deemed unexpressed by the nCounter generally had both low expression and low variance on the microarray. Confirming previous findings from spike-in and dilution datasets, this "gold-standard" comparison demonstrated signal compression that varied dramatically by expression level and, to a lesser extent, by dataset. Most importantly, examination of three different cell types revealed that noise levels differed across tissues. Microarray measurements generally correlate with relative RNA molecule counts within optimal ranges but suffer from expression-dependent accuracy bias and precision that varies across datasets. We urge microarray users to consider expression-level effects in signal interpretation and to evaluate noise properties in each dataset independently.

  12. RefEx, a reference gene expression dataset as a web tool for the functional analysis of genes.

    PubMed

    Ono, Hiromasa; Ogasawara, Osamu; Okubo, Kosaku; Bono, Hidemasa

    2017-08-29

    Gene expression data are exponentially accumulating; thus, the functional annotation of such sequence data from metadata is urgently required. However, life scientists have difficulty utilizing the available data due to its sheer magnitude and complicated access. We have developed a web tool for browsing reference gene expression pattern of mammalian tissues and cell lines measured using different methods, which should facilitate the reuse of the precious data archived in several public databases. The web tool is called Reference Expression dataset (RefEx), and RefEx allows users to search by the gene name, various types of IDs, chromosomal regions in genetic maps, gene family based on InterPro, gene expression patterns, or biological categories based on Gene Ontology. RefEx also provides information about genes with tissue-specific expression, and the relative gene expression values are shown as choropleth maps on 3D human body images from BodyParts3D. Combined with the newly incorporated Functional Annotation of Mammals (FANTOM) dataset, RefEx provides insight regarding the functional interpretation of unfamiliar genes. RefEx is publicly available at http://refex.dbcls.jp/.

  13. RefEx, a reference gene expression dataset as a web tool for the functional analysis of genes

    PubMed Central

    Ono, Hiromasa; Ogasawara, Osamu; Okubo, Kosaku; Bono, Hidemasa

    2017-01-01

    Gene expression data are exponentially accumulating; thus, the functional annotation of such sequence data from metadata is urgently required. However, life scientists have difficulty utilizing the available data due to its sheer magnitude and complicated access. We have developed a web tool for browsing reference gene expression pattern of mammalian tissues and cell lines measured using different methods, which should facilitate the reuse of the precious data archived in several public databases. The web tool is called Reference Expression dataset (RefEx), and RefEx allows users to search by the gene name, various types of IDs, chromosomal regions in genetic maps, gene family based on InterPro, gene expression patterns, or biological categories based on Gene Ontology. RefEx also provides information about genes with tissue-specific expression, and the relative gene expression values are shown as choropleth maps on 3D human body images from BodyParts3D. Combined with the newly incorporated Functional Annotation of Mammals (FANTOM) dataset, RefEx provides insight regarding the functional interpretation of unfamiliar genes. RefEx is publicly available at http://refex.dbcls.jp/. PMID:28850115

  14. Genetic effects on gene expression across human tissues

    PubMed Central

    2017-01-01

    Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease. PMID:29022597

  15. Genetic effects on gene expression across human tissues.

    PubMed

    Battle, Alexis; Brown, Christopher D; Engelhardt, Barbara E; Montgomery, Stephen B

    2017-10-11

    Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.

  16. Revealing complex function, process and pathway interactions with high-throughput expression and biological annotation data.

    PubMed

    Singh, Nitesh Kumar; Ernst, Mathias; Liebscher, Volkmar; Fuellen, Georg; Taher, Leila

    2016-10-20

    The biological relationships both between and within the functions, processes and pathways that operate within complex biological systems are only poorly characterized, making the interpretation of large scale gene expression datasets extremely challenging. Here, we present an approach that integrates gene expression and biological annotation data to identify and describe the interactions between biological functions, processes and pathways that govern a phenotype of interest. The product is a global, interconnected network, not of genes but of functions, processes and pathways, that represents the biological relationships within the system. We validated our approach on two high-throughput expression datasets describing organismal and organ development. Our findings are well supported by the available literature, confirming that developmental processes and apoptosis play key roles in cell differentiation. Furthermore, our results suggest that processes related to pluripotency and lineage commitment, which are known to be critical for development, interact mainly indirectly, through genes implicated in more general biological processes. Moreover, we provide evidence that supports the relevance of cell spatial organization in the developing liver for proper liver function. Our strategy can be viewed as an abstraction that is useful to interpret high-throughput data and devise further experiments.

  17. Graphite Web: web tool for gene set analysis exploiting pathway topology

    PubMed Central

    Sales, Gabriele; Calura, Enrica; Martini, Paolo; Romualdi, Chiara

    2013-01-01

    Graphite web is a novel web tool for pathway analyses and network visualization for gene expression data of both microarray and RNA-seq experiments. Several pathway analyses have been proposed either in the univariate or in the global and multivariate context to tackle the complexity and the interpretation of expression results. These methods can be further divided into ‘topological’ and ‘non-topological’ methods according to their ability to gain power from pathway topology. Biological pathways are, in fact, not only gene lists but can be represented through a network where genes and connections are, respectively, nodes and edges. To this day, the most used approaches are non-topological and univariate although they miss the relationship among genes. On the contrary, topological and multivariate approaches are more powerful, but difficult to be used by researchers without bioinformatic skills. Here we present Graphite web, the first public web server for pathway analysis on gene expression data that combines topological and multivariate pathway analyses with an efficient system of interactive network visualizations for easy results interpretation. Specifically, Graphite web implements five different gene set analyses on three model organisms and two pathway databases. Graphite Web is freely available at http://graphiteweb.bio.unipd.it/. PMID:23666626

  18. Gene Expression Elucidates Functional Impact of Polygenic Risk for Schizophrenia

    PubMed Central

    Fromer, Menachem; Roussos, Panos; Sieberts, Solveig K; Johnson, Jessica S; Kavanagh, David H; Perumal, Thanneer M; Ruderfer, Douglas M; Oh, Edwin C; Topol, Aaron; Shah, Hardik R; Klei, Lambertus L; Kramer, Robin; Pinto, Dalila; Gümüş, Zeynep H; Cicek, A. Ercument; Dang, Kristen K; Browne, Andrew; Lu, Cong; Xie, Lu; Readhead, Ben; Stahl, Eli A; Parvizi, Mahsa; Hamamsy, Tymor; Fullard, John F; Wang, Ying-Chih; Mahajan, Milind C; Derry, Jonathan M J; Dudley, Joel; Hemby, Scott E; Logsdon, Benjamin A; Talbot, Konrad; Raj, Towfique; Bennett, David A; De Jager, Philip L; Zhu, Jun; Zhang, Bin; Sullivan, Patrick F; Chess, Andrew; Purcell, Shaun M; Shinobu, Leslie A; Mangravite, Lara M; Toyoshiba, Hiroyoshi; Gur, Raquel E; Hahn, Chang-Gyu; Lewis, David A; Haroutunian, Vahram; Peters, Mette A; Lipska, Barbara K; Buxbaum, Joseph D; Schadt, Eric E; Hirai, Keisuke; Roeder, Kathryn; Brennand, Kristen J; Katsanis, Nicholas; Domenici, Enrico; Devlin, Bernie; Sklar, Pamela

    2016-01-01

    Over 100 genetic loci harbor schizophrenia associated variants, yet how these variants confer liability is uncertain. The CommonMind Consortium sequenced RNA from dorsolateral prefrontal cortex of schizophrenia cases (N = 258) and control subjects (N = 279), creating a resource of gene expression and its genetic regulation. Using this resource, ~20% of schizophrenia loci have variants that could contribute to altered gene expression and liability. In five loci, only a single gene was involved: FURIN, TSNARE1, CNTN4, CLCN3, or SNAP91. Altering expression of FURIN, TSNARE1, or CNTN4 changes neurodevelopment in zebrafish; knockdown of FURIN in human neural progenitor cells yields abnormal migration. Of 693 genes showing significant case/control differential expression, their fold changes are ≤ 1.33, and an independent cohort yields similar results. Gene co-expression implicates a network relevant for schizophrenia. Our findings show schizophrenia is polygenic and highlight the utility of this resource for mechanistic interpretations of genetic liability for brain diseases. PMID:27668389

  19. Gene expression elucidates functional impact of polygenic risk for schizophrenia.

    PubMed

    Fromer, Menachem; Roussos, Panos; Sieberts, Solveig K; Johnson, Jessica S; Kavanagh, David H; Perumal, Thanneer M; Ruderfer, Douglas M; Oh, Edwin C; Topol, Aaron; Shah, Hardik R; Klei, Lambertus L; Kramer, Robin; Pinto, Dalila; Gümüş, Zeynep H; Cicek, A Ercument; Dang, Kristen K; Browne, Andrew; Lu, Cong; Xie, Lu; Readhead, Ben; Stahl, Eli A; Xiao, Jianqiu; Parvizi, Mahsa; Hamamsy, Tymor; Fullard, John F; Wang, Ying-Chih; Mahajan, Milind C; Derry, Jonathan M J; Dudley, Joel T; Hemby, Scott E; Logsdon, Benjamin A; Talbot, Konrad; Raj, Towfique; Bennett, David A; De Jager, Philip L; Zhu, Jun; Zhang, Bin; Sullivan, Patrick F; Chess, Andrew; Purcell, Shaun M; Shinobu, Leslie A; Mangravite, Lara M; Toyoshiba, Hiroyoshi; Gur, Raquel E; Hahn, Chang-Gyu; Lewis, David A; Haroutunian, Vahram; Peters, Mette A; Lipska, Barbara K; Buxbaum, Joseph D; Schadt, Eric E; Hirai, Keisuke; Roeder, Kathryn; Brennand, Kristen J; Katsanis, Nicholas; Domenici, Enrico; Devlin, Bernie; Sklar, Pamela

    2016-11-01

    Over 100 genetic loci harbor schizophrenia-associated variants, yet how these variants confer liability is uncertain. The CommonMind Consortium sequenced RNA from dorsolateral prefrontal cortex of people with schizophrenia (N = 258) and control subjects (N = 279), creating a resource of gene expression and its genetic regulation. Using this resource, ∼20% of schizophrenia loci have variants that could contribute to altered gene expression and liability. In five loci, only a single gene was involved: FURIN, TSNARE1, CNTN4, CLCN3 or SNAP91. Altering expression of FURIN, TSNARE1 or CNTN4 changed neurodevelopment in zebrafish; knockdown of FURIN in human neural progenitor cells yielded abnormal migration. Of 693 genes showing significant case-versus-control differential expression, their fold changes were ≤ 1.33, and an independent cohort yielded similar results. Gene co-expression implicates a network relevant for schizophrenia. Our findings show that schizophrenia is polygenic and highlight the utility of this resource for mechanistic interpretations of genetic liability for brain diseases.

  20. Pathway results from the chicken data set using GOTM, Pathway Studio and Ingenuity softwares

    PubMed Central

    Bonnet, Agnès; Lagarrigue, Sandrine; Liaubet, Laurence; Robert-Granié, Christèle; SanCristobal, Magali; Tosser-Klopp, Gwenola

    2009-01-01

    Background As presented in the introduction paper, three sets of differentially regulated genes were found after the analysis of the chicken infection data set from EADGENE. Different methods were used to interpret these results. Results The GOTM, Pathway Studio and Ingenuity software packages were used to investigate the three lists of genes. All three packages allowed the analysis of the data and highlighted different networks. However, only one set of genes, showing differential expression between primary and secondary response, gave a significant biological interpretation. Conclusion Combining these databases, which were developed independently from different annotation sources, supplies a useful tool for a global biological interpretation of microarray data, even if they may contain some imperfections (e.g. genes that are missing or poorly annotated). PMID:19615111

  1. Analyzing gene expression data in mice with the Neuro Behavior Ontology.

    PubMed

    Hoehndorf, Robert; Hancock, John M; Hardy, Nigel W; Mallon, Ann-Marie; Schofield, Paul N; Gkoutos, Georgios V

    2014-02-01

    We have applied the Neuro Behavior Ontology (NBO), an ontology for the annotation of behavioral gene functions and behavioral phenotypes, to the annotation of more than 1,000 genes in the mouse that are known to play a role in behavior. These annotations can be explored by researchers interested in genes involved in particular behaviors and used computationally to provide insights into the behavioral phenotypes resulting from differences in gene expression. We developed the OntoFUNC tool and have applied it to enrichment analyses over the NBO to provide high-level behavioral interpretations of gene expression datasets. The resulting increase in the number of gene annotations facilitates the identification of behavioral or neurologic processes by assisting the formulation of hypotheses about the relationships between genes, processes, and phenotypic manifestations resulting from behavioral observations.

  2. OncoBinder facilitates interpretation of proteomic interaction data by capturing coactivation pairs in cancer.

    PubMed

    Van Coillie, Samya; Liang, Lunxi; Zhang, Yao; Wang, Huanbin; Fang, Jing-Yuan; Xu, Jie

    2016-04-05

    High-throughput methods such as co-immunoprecipitation-mass spectrometry (coIP-MS) and yeast two-hybrid screening (Y2H) have suggested a broad range of unannotated protein-protein interactions (PPIs), and interpretation of these PPIs remains a challenging task. Advances in cancer genomics research allow for the inference of "coactivation pairs" in cancer, which may facilitate the identification of PPIs involved in cancer. Here we present OncoBinder as a tool for the assessment of proteomic interaction data based on the functional synergy of oncoproteins in cancer. This decision tree-based method combines gene mutation, copy number and mRNA expression information to infer the functional status of protein-coding genes. We applied OncoBinder to evaluate the potential binders of EGFR and ERK2 proteins based on the gastric cancer dataset of The Cancer Genome Atlas (TCGA). As a result, OncoBinder identified high-confidence interactions (annotated by Kyoto Encyclopedia of Genes and Genomes (KEGG) or validated by low-throughput assays) more efficiently than a co-expression-based method. Taken together, our results suggest that evaluation of gene functional synergy in cancer may facilitate the interpretation of proteomic interaction data. The OncoBinder toolbox for Matlab is freely accessible online.

  3. Clustering Algorithms: Their Application to Gene Expression Data

    PubMed Central

    Oyelade, Jelili; Isewon, Itunuoluwa; Oladipupo, Funke; Aromolaran, Olufemi; Uwoghiren, Efosa; Ameh, Faridah; Achas, Moses; Adebiyi, Ezekiel

    2016-01-01

    Gene expression data hide vital information required to understand the biological process that takes place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data offers a substantial opportunity to strengthen the understanding of functional genomics. The complexity of biological networks and the volume of genes present increase the challenges of comprehending and interpreting the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges, which is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. The clustering of gene expression data has been proven to be useful in making known the natural structure inherent in gene expression data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. The other benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and a high degree of accuracy in its analysis procedure. PMID:27932867

  4. Gene expression in cerebral ischemia: a new approach for neuroprotection.

    PubMed

    Millán, Mónica; Arenillas, Juan

    2006-01-01

    Cerebral ischemia is one of the strongest stimuli for gene induction in the brain. Hundreds of genes have been found to be induced by brain ischemia. Many genes are involved in neurodestructive functions such as excitotoxicity, inflammatory response and neuronal apoptosis. However, cerebral ischemia is also a powerful reformatting and reprogramming stimulus for the brain through neuroprotective gene expression. Several genes may participate in both cellular responses. Thus, isolation of candidate genes for neuroprotection strategies and interpretation of expression changes have proven difficult. Nevertheless, many studies are being carried out to improve the knowledge of gene activation and protein expression following ischemic stroke, as well as to develop new therapies that modify the biochemical, molecular and genetic changes underlying cerebral ischemia. Owing to the complexity of the process involving numerous critical genes expressed differentially in time, space and concentration, ongoing therapeutic efforts should be based on multiple interventions at different levels. By modification of the acute gene expression induced by ischemia or the apoptotic gene program, gene therapy is a promising treatment but is still in a very experimental phase. Some hurdles will have to be overcome before these therapies can be introduced into human clinical stroke trials. Copyright 2006 S. Karger AG, Basel.

  5. Comparison of gene co-networks reveals the molecular mechanisms of the rice (Oryza sativa L.) response to Rhizoctonia solani AG1 IA infection.

    PubMed

    Zhang, Jinfeng; Zhao, Wenjuan; Fu, Rong; Fu, Chenglin; Wang, Lingxia; Liu, Huainian; Li, Shuangcheng; Deng, Qiming; Wang, Shiquan; Zhu, Jun; Liang, Yueyang; Li, Ping; Zheng, Aiping

    2018-05-05

    Rhizoctonia solani causes rice sheath blight, an important disease affecting the growth of rice (Oryza sativa L.). Attempts to control the disease have met with little success. Based on transcriptional profiling, we previously identified more than 11,947 common differentially expressed genes (TPM > 10) between the rice genotypes TeQing and Lemont. In the current study, we extended these findings by focusing on an analysis of gene co-expression in response to R. solani AG1 IA and identified gene modules within the networks through weighted gene co-expression network analysis (WGCNA). We compared the different genes assigned to each module and the biological interpretations of gene co-expression networks at early and later modules in the two rice genotypes to reveal differential responses to AG1 IA. Our results show that different changes occurred in the two rice genotypes and that the modules in the two groups contain a number of candidate genes possibly involved in pathogenesis, such as the VQ protein. Furthermore, these gene co-expression networks provide comprehensive transcriptional information regarding gene expression in rice in response to AG1 IA. The co-expression networks derived from our data offer ideas for follow-up experimentation that will help advance our understanding of the translational regulation of rice gene expression changes in response to AG1 IA.
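
    As an illustration of the network-construction step named in this abstract, the following is a minimal sketch of how an unsigned WGCNA-style adjacency is commonly defined: the absolute Pearson correlation between two gene profiles raised to a soft-thresholding power beta. The gene names, expression values and the choice beta = 6 are hypothetical and are not taken from the study, which does not report its parameters here.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (undefined for constant profiles)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta for every gene pair."""
    genes = sorted(profiles)
    return {
        (g, h): abs(pearson(profiles[g], profiles[h])) ** beta
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }

# Hypothetical expression profiles across four samples (illustrative only).
example_profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [1.0, 3.0, 2.0, 4.0],
}
example_adjacency = soft_threshold_adjacency(example_profiles)
```

    Raising the correlation to a power suppresses weak correlations while keeping strong ones close to 1, which is why modules are usually detected on this adjacency (or a quantity derived from it) rather than on the raw correlation.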

  6. The complex genetics of human insulin-like growth factor 2 are not reflected in public databases.

    PubMed

    Rotwein, Peter

    2018-03-23

    Recent advances in genetics present unique opportunities for enhancing knowledge about human physiology and disease susceptibility. Understanding this information at the individual gene level is challenging and requires extracting, collating, and interpreting data from a variety of public gene repositories. Here, I illustrate this challenge by analyzing the gene for human insulin-like growth factor 2 (IGF2) through the lens of several databases. IGF2, a 67-amino acid secreted peptide, is essential for normal prenatal growth and is involved in other physiological and pathophysiological processes in humans. Surprisingly, none of the genetic databases accurately described or completely delineated human IGF2 gene structure or transcript expression, even though all relevant information could be found in the published literature. Although IGF2 shares multiple features with the mouse Igf2 gene, it has several unique properties, including transcription from five promoters. Both genes undergo parental imprinting, with IGF2/Igf2 being expressed primarily from the paternal chromosome and the adjacent H19 gene from the maternal chromosome. Unlike mouse Igf2, whose expression declines after birth, human IGF2 remains active throughout life. This characteristic has been attributed to a unique human gene promoter that escapes imprinting, but as shown here, it involves several different promoters with distinct tissue-specific expression patterns. Because new testable hypotheses could lead to critical insights into IGF2 actions in human physiology and disease, it is incumbent that our fundamental understanding is accurate. Similar challenges affecting knowledge of other human genes should promote attempts to critically evaluate, interpret, and correct human genetic data in publicly available databases. © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.

  7. Improved Annotation of 3′ Untranslated Regions and Complex Loci by Combination of Strand-Specific Direct RNA Sequencing, RNA-Seq and ESTs

    PubMed Central

    Song, Junfang; Duc, Céline; Storey, Kate G.; McLean, W. H. Irwin; Brown, Sara J.; Simpson, Gordon G.; Barton, Geoffrey J.

    2014-01-01

    The reference annotations made for a genome sequence provide the framework for all subsequent analyses of the genome. Correct and complete annotation in addition to the underlying genomic sequence is particularly important when interpreting the results of RNA-seq experiments where short sequence reads are mapped against the genome and assigned to genes according to the annotation. Inconsistencies in annotations between the reference and the experimental system can lead to incorrect interpretation of the effect on RNA expression of an experimental treatment or mutation in the system under study. Until recently, the genome-wide annotation of 3′ untranslated regions received less attention than coding regions and the delineation of intron/exon boundaries. In this paper, data produced for samples in Human, Chicken and A. thaliana by the novel single-molecule, strand-specific, Direct RNA Sequencing technology from Helicos Biosciences, which locates 3′ polyadenylation sites to within ±2 nt, were combined with archival EST and RNA-Seq data. Nine examples are illustrated where this combination of data allowed: (1) gene and 3′ UTR re-annotation (including extension of one 3′ UTR by 5.9 kb); (2) disentangling of gene expression in complex regions; (3) clearer interpretation of small RNA expression and (4) identification of novel genes. While the specific examples displayed here may become obsolete as genome sequences and their annotations are refined, the principles laid out in this paper will be of general use both to those annotating genomes and those seeking to interpret existing publicly available annotations in the context of their own experimental data. PMID:24722185

  8. A deep auto-encoder model for gene expression prediction.

    PubMed

    Xie, Rui; Wen, Jia; Quitadamo, Andrew; Cheng, Jianlin; Shi, Xinghua

    2017-11-17

    Gene expression is a key intermediate level through which genotypes lead to a particular trait. Gene expression is affected by various factors including genotypes of genetic variants. With the aim of delineating the genetic impact on gene expression, we build a deep auto-encoder model to assess how well genetic variants contribute to gene expression changes. This new deep learning model is a regression-based predictive model based on the MultiLayer Perceptron and Stacked Denoising Auto-encoder (MLP-SAE). The model is trained using a stacked denoising auto-encoder for feature selection and a multilayer perceptron framework for backpropagation. We further improve the model by introducing dropout to prevent overfitting and improve performance. To demonstrate the usage of this model, we apply MLP-SAE to a real genomic dataset with genotypes and gene expression profiles measured in yeast. Our results show that the MLP-SAE model with dropout outperforms other models including Lasso, Random Forests and the MLP-SAE model without dropout. Using the MLP-SAE model with dropout, we show that gene expression quantifications predicted by the model solely based on genotypes align well with true gene expression patterns. We provide a deep auto-encoder model for predicting gene expression from SNP genotypes. This study demonstrates that deep learning is appropriate for tackling another genomic problem, i.e., building predictive models to understand genotypes' contribution to gene expression. With the emerging availability of richer genomic data, we anticipate that deep learning models will play a bigger role in modeling and interpreting genomics.

  9. Biological interpretation of genome-wide association studies using predicted gene functions.

    PubMed

    Pers, Tune H; Karjalainen, Juha M; Chan, Yingleong; Westra, Harm-Jan; Wood, Andrew R; Yang, Jian; Lui, Julian C; Vedantam, Sailaja; Gustafsson, Stefan; Esko, Tonu; Frayling, Tim; Speliotes, Elizabeth K; Boehnke, Michael; Raychaudhuri, Soumya; Fehrmann, Rudolf S N; Hirschhorn, Joel N; Franke, Lude

    2015-01-19

    The main challenge for gaining biological insights from genetic associations is identifying which genes and pathways explain the associations. Here we present DEPICT, an integrative tool that employs predicted gene functions to systematically prioritize the most likely causal genes at associated loci, highlight enriched pathways and identify tissues/cell types where genes from associated loci are highly expressed. DEPICT is not limited to genes with established functions and prioritizes relevant gene sets for many phenotypes.

  10. Selection of reference genes for gene expression studies related to intramuscular fat deposition in Capra hircus skeletal muscle.

    PubMed

    Zhu, Wuzheng; Lin, Yaqiu; Liao, Honghai; Wang, Yong

    2015-01-01

    The identification of suitable reference genes is critical for obtaining reliable results from gene expression studies using quantitative real-time PCR (qPCR) because the expression of reference genes may vary considerably under different experimental conditions. In most cases, however, commonly used reference genes are employed in data normalization without proper validation, which may lead to incorrect data interpretation. Here, we aim to select a set of optimal reference genes for the accurate normalization of gene expression associated with intramuscular fat (IMF) deposition during development. In the present study, eight reference genes (PPIB, HMBS, RPLP0, B2M, YWHAZ, 18S, GAPDH and ACTB) were evaluated by three different algorithms (geNorm, NormFinder and BestKeeper) in two types of muscle tissues (longissimus dorsi muscle and biceps femoris muscle) across different developmental stages. All three algorithms gave similar results. PPIB and HMBS were identified as the most stable reference genes, while the commonly used reference genes 18S and GAPDH were the most variably expressed, with expression varying dramatically across different developmental stages. Furthermore, to reveal the crucial role of appropriate reference genes in obtaining a reliable result, analysis of PPARG expression was performed by normalization to the most and the least stable reference genes. The relative expression levels of PPARG normalized to the most stable reference genes greatly differed from those normalized to the least stable one. Therefore, evaluation of reference genes must be performed for a given experimental condition before the reference genes are used. PPIB and HMBS are the optimal reference genes for analysis of gene expression associated with IMF deposition in skeletal muscle during development.
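
    To make the effect of reference-gene choice concrete, here is a minimal sketch using the widely used 2^-ddCt relative quantification. This formula is an assumption for illustration: the abstract does not state which quantification formula the authors applied, and all Ct values below are invented. The sketch also assumes roughly 100% amplification efficiency.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method (assumes ~100% PCR efficiency)."""
    d_ct_sample = ct_target_sample - ct_ref_sample      # normalize sample to reference gene
    d_ct_control = ct_target_control - ct_ref_control   # normalize control to reference gene
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)

# Hypothetical Ct values for one target gene at two developmental stages.
# Stable reference: its Ct is 20.0 at both stages.
fold_stable_ref = relative_expression(24.0, 20.0, 26.0, 20.0)

# Unstable reference: its Ct drifts from 20.0 to 18.0 between stages,
# which cancels out the change in the target gene.
fold_unstable_ref = relative_expression(24.0, 18.0, 26.0, 20.0)
```

    With these invented numbers the same target measurements give a four-fold change against the stable reference and no change against the drifting one, which is the kind of discrepancy the abstract reports for PPARG.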

  11. Models of stochastic gene expression

    NASA Astrophysics Data System (ADS)

    Paulsson, Johan

    2005-06-01

    Gene expression is an inherently stochastic process: Genes are activated and inactivated by random association and dissociation events, transcription is typically rare, and many proteins are present in low numbers per cell. The last few years have seen an explosion in the stochastic modeling of these processes, predicting protein fluctuations in terms of the frequencies of the probabilistic events. Here I discuss commonalities between theoretical descriptions, focusing on a gene-mRNA-protein model that includes most published studies as special cases. I also show how expression bursts can be explained as simplistic time-averaging, and how generic approximations can allow for concrete interpretations without requiring concrete assumptions. Measures and nomenclature are discussed to some extent and the modeling literature is briefly reviewed.
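
    The kind of model discussed in this abstract can be illustrated with a deliberately reduced sketch: a birth-death process for mRNA only (constant transcription rate, first-order degradation), simulated with the Gillespie stochastic simulation algorithm. This is a simplification chosen for illustration, not the gene-mRNA-protein model of the paper, and the rate values are hypothetical. For this reduced process the stationary copy-number distribution is Poisson with mean equal to the transcription rate divided by the degradation rate.

```python
import random

def simulate_mrna(k_tx, k_deg, t_end, seed=0):
    """Gillespie simulation of mRNA birth (rate k_tx) and death (rate k_deg * n)."""
    rng = random.Random(seed)
    t, n = 0.0, 0
    times, counts = [0.0], [0]
    while True:
        rate_total = k_tx + k_deg * n
        t += rng.expovariate(rate_total)       # waiting time to the next event
        if t >= t_end:
            break
        if rng.random() * rate_total < k_tx:   # choose event in proportion to its rate
            n += 1                             # transcription
        else:
            n -= 1                             # degradation
        times.append(t)
        counts.append(n)
    return times, counts

def time_average(times, counts, t_end):
    """Time-weighted mean copy number over [0, t_end]."""
    total = 0.0
    for i, n in enumerate(counts):
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        total += n * (t_next - times[i])
    return total / t_end

# Hypothetical rates: 10 transcripts per unit time, degradation rate 1 per molecule.
example_times, example_counts = simulate_mrna(10.0, 1.0, 2000.0, seed=1)
example_mean = time_average(example_times, example_counts, 2000.0)
```

    Over a long run the time-averaged copy number should settle near k_tx / k_deg (10 with the rates above), while individual trajectories fluctuate around that value.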

  12. A short treatise concerning a musical approach for the interpretation of gene expression data

    PubMed Central

    Staege, Martin S.

    2015-01-01

    Recent technical developments allow the genome-wide and near-complete analysis of gene expression in a given sample, e.g. by usage of high-density DNA microarrays or next generation sequencing. The generated data structure is usually multi-dimensional and requires extensive processing not only for analysis but also for presentation of the results. Today, such data are usually presented graphically, e.g. in the form of heat maps. In the present paper, we propose an alternative form of analysis and presentation which is based on the transformation of gene expression data into sounds that are characterized by their frequency (pitch) and tone duration. Using DNA microarray data from a panel of neuroblastoma and Ewing sarcoma cell lines as well as from Hodgkin’s lymphoma cell lines and normal B cells, we demonstrate that this Gene Expression Music Algorithm (GEMusicA) can be used for discrimination between samples with different biology and for the characterization of differentially expressed genes. PMID:26472273
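
    The idea of turning expression values into tones can be sketched as follows. The mapping shown here is purely hypothetical and is not the published GEMusicA mapping, which the abstract does not specify: one value per gene is scaled linearly onto a range of MIDI note numbers (converted to hertz with the standard equal-temperament formula, A4 = 440 Hz), and a second value per gene is scaled onto a tone duration in seconds.

```python
def midi_to_hz(note):
    """Equal-temperament frequency of a MIDI note number (note 69 = A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)

def scale(value, lo, hi, out_lo, out_hi):
    """Linearly rescale value from [lo, hi] to [out_lo, out_hi], clamped to the range."""
    if hi == lo:
        return (out_lo + out_hi) / 2.0
    frac = min(1.0, max(0.0, (value - lo) / (hi - lo)))
    return out_lo + frac * (out_hi - out_lo)

def expression_to_tones(pitch_values, duration_values,
                        note_range=(48, 84), dur_range=(0.1, 0.8)):
    """Return one (frequency_hz, duration_s) pair per gene; ranges are arbitrary choices."""
    p_lo, p_hi = min(pitch_values), max(pitch_values)
    d_lo, d_hi = min(duration_values), max(duration_values)
    tones = []
    for p, d in zip(pitch_values, duration_values):
        note = round(scale(p, p_lo, p_hi, *note_range))
        seconds = scale(d, d_lo, d_hi, *dur_range)
        tones.append((midi_to_hz(note), seconds))
    return tones

# Hypothetical log2 expression values for three genes in two samples.
example_tones = expression_to_tones([4.0, 8.0, 12.0], [1.0, 2.0, 3.0])
```

    In this sketch higher values produce higher and longer tones, so two samples with different expression profiles would yield audibly different sequences.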

  13. Data-Driven Asthma Endotypes Defined from Blood Biomarker and Gene Expression Data

    PubMed Central

    George, Barbara Jane; Reif, David M.; Gallagher, Jane E.; Williams-DeVane, ClarLynda R.; Heidenfelder, Brooke L.; Hudgens, Edward E.; Jones, Wendell; Neas, Lucas; Hubal, Elaine A. Cohen; Edwards, Stephen W.

    2015-01-01

    The diagnosis and treatment of childhood asthma is complicated by its mechanistically distinct subtypes (endotypes) driven by genetic susceptibility and modulating environmental factors. Clinical biomarkers and blood gene expression were collected from a stratified, cross-sectional study of asthmatic and non-asthmatic children from Detroit, MI. This study describes four distinct asthma endotypes identified via a purely data-driven method. Our method was specifically designed to integrate blood gene expression and clinical biomarkers in a way that provides new mechanistic insights regarding the different asthma endotypes. For example, we describe metabolic syndrome-induced systemic inflammation as an associated factor in three of the four asthma endotypes. Context provided by the clinical biomarker data was essential in interpreting gene expression patterns and identifying putative endotypes, which emphasizes the importance of integrated approaches when studying complex disease etiologies. These synthesized patterns of gene expression and clinical markers from our research may lead to development of novel serum-based biomarker panels. PMID:25643280

  14. Are Hox genes ancestrally involved in axial patterning? Evidence from the hydrozoan Clytia hemisphaerica (Cnidaria).

    PubMed

    Chiori, Roxane; Jager, Muriel; Denker, Elsa; Wincker, Patrick; Da Silva, Corinne; Le Guyader, Hervé; Manuel, Michaël; Quéinnec, Eric

    2009-01-01

    The early evolution and diversification of Hox-related genes in eumetazoans has been the subject of conflicting hypotheses concerning the evolutionary conservation of their role in axial patterning and the pre-bilaterian origin of the Hox and ParaHox clusters. The diversification of Hox/ParaHox genes clearly predates the origin of bilaterians. However, the existence of a "Hox code" predating the cnidarian-bilaterian ancestor and supporting the deep homology of axes is more controversial. This assumption was mainly based on the interpretation of Hox expression data from the sea anemone, but growing evidence from other cnidarian taxa calls this hypothesis into question. Hox, ParaHox and Hox-related genes have been investigated here by phylogenetic analysis and in situ hybridisation in Clytia hemisphaerica, a hydrozoan species with medusa and polyp stages alternating in the life cycle. Our phylogenetic analyses do not support an origin of ParaHox and Hox genes by duplication of an ancestral ProtoHox cluster, and reveal a diversification of the cnidarian HOX9-14 genes into three groups called A, B, C. Among the 7 examined genes, only those belonging to the HOX9-14 and the CDX groups exhibit a restricted expression along the oral-aboral axis during development and in the planula larva, while the others are expressed in very specialised areas at the medusa stage. Cross-species comparison reveals a strong variability of gene expression along the oral-aboral axis and during the life cycle among cnidarian lineages. The most parsimonious interpretation is that the Hox code, collinearity and conservative role along the antero-posterior axis are bilaterian innovations.

  15. Are Hox Genes Ancestrally Involved in Axial Patterning? Evidence from the Hydrozoan Clytia hemisphaerica (Cnidaria)

    PubMed Central

    Chiori, Roxane; Jager, Muriel; Denker, Elsa; Wincker, Patrick; Da Silva, Corinne; Le Guyader, Hervé; Manuel, Michaël; Quéinnec, Eric

    2009-01-01

    Background The early evolution and diversification of Hox-related genes in eumetazoans has been the subject of conflicting hypotheses concerning the evolutionary conservation of their role in axial patterning and the pre-bilaterian origin of the Hox and ParaHox clusters. The diversification of Hox/ParaHox genes clearly predates the origin of bilaterians. However, the existence of a “Hox code” predating the cnidarian-bilaterian ancestor and supporting the deep homology of axes is more controversial. This assumption was mainly based on the interpretation of Hox expression data from the sea anemone, but growing evidence from other cnidarian taxa calls this hypothesis into question. Methodology/Principal Findings Hox, ParaHox and Hox-related genes have been investigated here by phylogenetic analysis and in situ hybridisation in Clytia hemisphaerica, a hydrozoan species with medusa and polyp stages alternating in the life cycle. Our phylogenetic analyses do not support an origin of ParaHox and Hox genes by duplication of an ancestral ProtoHox cluster, and reveal a diversification of the cnidarian HOX9-14 genes into three groups called A, B, C. Among the 7 examined genes, only those belonging to the HOX9-14 and the CDX groups exhibit a restricted expression along the oral-aboral axis during development and in the planula larva, while the others are expressed in very specialised areas at the medusa stage. Conclusions/Significance Cross-species comparison reveals a strong variability of gene expression along the oral-aboral axis and during the life cycle among cnidarian lineages. The most parsimonious interpretation is that the Hox code, collinearity and conservative role along the antero-posterior axis are bilaterian innovations. PMID:19156208

  16. RNA-seq reveals more consistent reference genes for gene expression studies in human non-melanoma skin cancers

    PubMed Central

    Tan, Jean-Marie; Payne, Elizabeth J.; Lin, Lynlee L.; Sinnya, Sudipta; Raphael, Anthony P.; Lambie, Duncan; Frazer, Ian H.; Dinger, Marcel E.; Soyer, H. Peter

    2017-01-01

    Identification of appropriate reference genes (RGs) is critical to accurate data interpretation in quantitative real-time PCR (qPCR) experiments. In this study, we have utilised next generation RNA sequencing (RNA-seq) to analyse the transcriptome of a panel of non-melanoma skin cancer lesions, identifying genes that are consistently expressed across all samples. Genes encoding ribosomal proteins were amongst the most stable in this dataset. Validation of this RNA-seq data was examined using qPCR to confirm the suitability of a set of highly stable genes for use as qPCR RGs. These genes will provide a valuable resource for the normalisation of qPCR data for the analysis of non-melanoma skin cancer. PMID:28852586

  17. Analysis of lamprey clustered Fox genes: insight into Fox gene evolution and expression in vertebrates.

    PubMed

    Wotton, Karl R; Shimeld, Sebastian M

    2011-12-01

    In the human genome, members of the FoxC, FoxF, FoxL1, and FoxQ1 gene families are found in two paralogous clusters. One cluster contains the genes FOXQ1, FOXF2, FOXC1 and the second consists of FOXF1, FOXC2, and FOXL1. In jawed vertebrates these genes are known to be expressed in different pharyngeal tissues and all, except FoxQ1, are involved in patterning the early embryonic mesoderm. We have previously traced the evolution of this cluster in the bony vertebrates, and the gene content is identical in the dogfish, a member of the most basally branching lineage of the jawed vertebrates. Here we extend these analyses to jawless vertebrates. Using genomic searches and molecular approaches we have identified homologues of these genes from lampreys. We identify two FoxC genes, two FoxF genes, two FoxQ1 genes and a single FoxL1 gene. We examine the embryonic expression of one predominantly mesodermally expressed gene family, FoxC, and the endodermally expressed member of the cluster, FoxQ1. We identified FoxQ1 transcripts in the pharyngeal endoderm, while the two FoxC genes are differentially expressed in the pharyngeal mesenchyme and ectoderm. Furthermore, we identify conserved expression of lamprey FoxC genes in the paraxial and intermediate mesoderms. We interpret our results through a chordate-wide comparison of expression patterns and discuss gene content in the context of theories on the evolution of the vertebrate genome. 2011 Elsevier B.V. All rights reserved.

  18. Evaluation of Reference Genes for Normalization of Gene Expression Using Quantitative RT-PCR under Aluminum, Cadmium, and Heat Stresses in Soybean.

    PubMed

    Gao, Mengmeng; Liu, Yaping; Ma, Xiao; Shuai, Qin; Gai, Junyi; Li, Yan

    2017-01-01

    Quantitative reverse transcription polymerase chain reaction (qRT-PCR) is widely used to analyze relative gene expression levels; however, the accuracy of qRT-PCR is greatly affected by the stability of reference genes, which is tissue- and environment-dependent. Therefore, choosing the most stable reference gene in a specific tissue and environment is critical for interpreting gene expression patterns. Aluminum (Al), cadmium (Cd), and heat stresses are three important abiotic factors limiting soybean (Glycine max) production in southern China. To identify suitable reference genes for normalizing the expression levels of target genes by qRT-PCR in the soybean response to Al, Cd and heat stresses, we studied the expression stability of ten commonly used housekeeping genes in soybean roots and leaves under these three abiotic stresses, using five approaches: BestKeeper, Delta Ct, geNorm, NormFinder and RefFinder. We found that TUA4 is the most stable reference gene in soybean root tips under Al stress. Under Cd stress, Fbox and UKN2 are the most stable reference genes in roots and leaves, respectively, while 60S is the most suitable reference gene when analyzing both roots and leaves together. For heat stress, TUA4 and UKN2 are the most stable housekeeping genes in roots and leaves, respectively, and UKN2 is the best reference gene for analysis of roots and leaves together. To validate the reference genes, we quantified the relative expression levels of six target genes that were involved in the soybean response to Al, Cd or heat stresses, respectively. The expression patterns of these target genes differed between using the most and least stable reference genes, suggesting that the selection of a suitable reference gene is critical for gene expression studies.

  19. Biological interpretation of genome-wide association studies using predicted gene functions

    PubMed Central

    Pers, Tune H.; Karjalainen, Juha M.; Chan, Yingleong; Westra, Harm-Jan; Wood, Andrew R.; Yang, Jian; Lui, Julian C.; Vedantam, Sailaja; Gustafsson, Stefan; Esko, Tonu; Frayling, Tim; Speliotes, Elizabeth K.; Boehnke, Michael; Raychaudhuri, Soumya; Fehrmann, Rudolf S.N.; Hirschhorn, Joel N.; Franke, Lude

    2015-01-01

    The main challenge for gaining biological insights from genetic associations is identifying which genes and pathways explain the associations. Here we present DEPICT, an integrative tool that employs predicted gene functions to systematically prioritize the most likely causal genes at associated loci, highlight enriched pathways and identify tissues/cell types where genes from associated loci are highly expressed. DEPICT is not limited to genes with established functions and prioritizes relevant gene sets for many phenotypes. PMID:25597830

  20. Quantitative comparison of microarray experiments with published leukemia related gene expression signatures.

    PubMed

    Klein, Hans-Ulrich; Ruckert, Christian; Kohlmann, Alexander; Bullinger, Lars; Thiede, Christian; Haferlach, Torsten; Dugas, Martin

    2009-12-15

    Multiple gene expression signatures derived from microarray experiments have been published in the field of leukemia research. A comparison of these signatures with results from new experiments is useful for verification as well as for interpretation of the results obtained. Currently, the percentage of overlapping genes is frequently used to compare published gene signatures against a signature derived from a new experiment. However, it has been shown that the percentage of overlapping genes is of limited use for comparing two experiments due to the variability of gene signatures caused by different array platforms or assay-specific influencing parameters. Here, we present a robust approach for a systematic and quantitative comparison of published gene expression signatures with an exemplary query dataset. A database storing 138 leukemia-related published gene signatures was designed. Each gene signature was manually annotated with terms according to a leukemia-specific taxonomy. Two analysis steps are implemented to compare a new microarray dataset with the results from previous experiments stored and curated in the database. First, the global test method is applied to assess gene signatures and to constitute a ranking among them. In a subsequent analysis step, the focus is shifted from single gene signatures to chromosomal aberrations or molecular mutations as modeled in the taxonomy. Potentially interesting disease characteristics are detected based on the ranking of gene signatures associated with these aberrations stored in the database. Two example analyses are presented. An implementation of the approach is freely available as a web-based application. The presented approach helps researchers to systematically integrate the knowledge derived from numerous microarray experiments into the analysis of a new dataset. By means of example leukemia datasets we demonstrate that this approach detects related experiments as well as related molecular mutations and may help to interpret new microarray data.

  1. Identification of reference genes for quantitative expression analysis using large-scale RNA-seq data of Arabidopsis thaliana and model crop plants.

    PubMed

    Kudo, Toru; Sasaki, Yohei; Terashima, Shin; Matsuda-Imai, Noriko; Takano, Tomoyuki; Saito, Misa; Kanno, Maasa; Ozaki, Soichi; Suwabe, Keita; Suzuki, Go; Watanabe, Masao; Matsuoka, Makoto; Takayama, Seiji; Yano, Kentaro

    2016-10-13

    In quantitative gene expression analysis, normalization using a reference gene as an internal control is frequently performed for appropriate interpretation of the results. Efforts have been devoted to exploring superior novel reference genes using microarray transcriptomic data and to evaluating commonly used reference genes by targeting analysis. However, because the number of specifically detectable genes is totally dependent on probe design in the microarray analysis, exploration using microarray data may miss some of the best choices for the reference genes. Recently emerging RNA sequencing (RNA-seq) provides an ideal resource for comprehensive exploration of reference genes since this method is capable of detecting all expressed genes, in principle including even unknown genes. We report the results of a comprehensive exploration of reference genes using public RNA-seq data from plants such as Arabidopsis thaliana (Arabidopsis), Glycine max (soybean), Solanum lycopersicum (tomato) and Oryza sativa (rice). To select reference genes suitable for the broadest experimental conditions possible, candidates were surveyed by the following four steps: (1) evaluation of the basal expression level of each gene in each experiment; (2) evaluation of the expression stability of each gene in each experiment; (3) evaluation of the expression stability of each gene across the experiments; and (4) selection of top-ranked genes, after ranking according to the number of experiments in which the gene was expressed stably. Employing this procedure, 13, 10, 12 and 21 top candidates for reference genes were proposed in Arabidopsis, soybean, tomato and rice, respectively. Microarray expression data confirmed that the expression of the proposed reference genes under broad experimental conditions was more stable than that of commonly used reference genes. These novel reference genes will be useful for analyzing gene expression profiles across experiments carried out under various experimental conditions.
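
    A simplified version of the survey procedure described above can be sketched in a few lines. The sketch covers only steps (1), (2) and (4); the thresholds (`min_mean`, `max_cv`), the use of the coefficient of variation as the stability measure, and all gene and experiment names are assumptions made for illustration, not values taken from the study.

```python
from statistics import mean, pstdev


def count_stable_experiments(expr, min_mean=10.0, max_cv=0.2):
    """For each gene, count the experiments in which it is both sufficiently
    expressed (mean >= min_mean) and stable (coefficient of variation <= max_cv)."""
    counts = {}
    for gene, experiments in expr.items():
        n_stable = 0
        for values in experiments.values():
            m = mean(values)
            if m >= min_mean and pstdev(values) / m <= max_cv:
                n_stable += 1
        counts[gene] = n_stable
    return counts


# Hypothetical expression values: gene -> experiment -> samples
expr = {
    "geneA": {"exp1": [50, 52, 49], "exp2": [48, 51, 50]},
    "geneB": {"exp1": [50, 90, 20], "exp2": [47, 50, 49]},
    "geneC": {"exp1": [1, 2, 1], "exp2": [2, 1, 1]},
}

counts = count_stable_experiments(expr)
# Rank candidates by the number of experiments with stable expression
ranking = sorted(counts, key=counts.get, reverse=True)
print(counts, ranking)
```

    Here geneA passes in both experiments, geneB only in the second, and geneC fails the basal expression filter in both.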

  2. GO-PCA: An Unsupervised Method to Explore Gene Expression Data Using Prior Knowledge

    PubMed Central

    Wagner, Florian

    2015-01-01

    Method Genome-wide expression profiling is a widely used approach for characterizing heterogeneous populations of cells, tissues, biopsies, or other biological specimen. The exploratory analysis of such data typically relies on generic unsupervised methods, e.g. principal component analysis (PCA) or hierarchical clustering. However, generic methods fail to exploit prior knowledge about the molecular functions of genes. Here, I introduce GO-PCA, an unsupervised method that combines PCA with nonparametric GO enrichment analysis, in order to systematically search for sets of genes that are both strongly correlated and closely functionally related. These gene sets are then used to automatically generate expression signatures with functional labels, which collectively aim to provide a readily interpretable representation of biologically relevant similarities and differences. The robustness of the results obtained can be assessed by bootstrapping. Results I first applied GO-PCA to datasets containing diverse hematopoietic cell types from human and mouse, respectively. In both cases, GO-PCA generated a small number of signatures that represented the majority of lineages present, and whose labels reflected their respective biological characteristics. I then applied GO-PCA to human glioblastoma (GBM) data, and recovered signatures associated with four out of five previously defined GBM subtypes. My results demonstrate that GO-PCA is a powerful and versatile exploratory method that reduces an expression matrix containing thousands of genes to a much smaller set of interpretable signatures. In this way, GO-PCA aims to facilitate hypothesis generation, design of further analyses, and functional comparisons across datasets. PMID:26575370

  3. GO-PCA: An Unsupervised Method to Explore Gene Expression Data Using Prior Knowledge.

    PubMed

    Wagner, Florian

    2015-01-01

    Genome-wide expression profiling is a widely used approach for characterizing heterogeneous populations of cells, tissues, biopsies, or other biological specimen. The exploratory analysis of such data typically relies on generic unsupervised methods, e.g. principal component analysis (PCA) or hierarchical clustering. However, generic methods fail to exploit prior knowledge about the molecular functions of genes. Here, I introduce GO-PCA, an unsupervised method that combines PCA with nonparametric GO enrichment analysis, in order to systematically search for sets of genes that are both strongly correlated and closely functionally related. These gene sets are then used to automatically generate expression signatures with functional labels, which collectively aim to provide a readily interpretable representation of biologically relevant similarities and differences. The robustness of the results obtained can be assessed by bootstrapping. I first applied GO-PCA to datasets containing diverse hematopoietic cell types from human and mouse, respectively. In both cases, GO-PCA generated a small number of signatures that represented the majority of lineages present, and whose labels reflected their respective biological characteristics. I then applied GO-PCA to human glioblastoma (GBM) data, and recovered signatures associated with four out of five previously defined GBM subtypes. My results demonstrate that GO-PCA is a powerful and versatile exploratory method that reduces an expression matrix containing thousands of genes to a much smaller set of interpretable signatures. In this way, GO-PCA aims to facilitate hypothesis generation, design of further analyses, and functional comparisons across datasets.

  4. ArraySolver: an algorithm for colour-coded graphical display and Wilcoxon signed-rank statistics for comparing microarray gene expression data.

    PubMed

    Khan, Haseeb Ahmad

    2004-01-01

    The massive surge in the production of microarray data poses a great challenge for proper analysis and interpretation. In recent years numerous computational tools have been developed to extract meaningful interpretation of microarray gene expression data. However, a convenient tool for two-groups comparison of microarray data is still lacking and users have to rely on commercial statistical packages that might be costly and require special skills, in addition to extra time and effort for transferring data from one platform to another. Various statistical methods, including the t-test, analysis of variance, Pearson test and Mann-Whitney U test, have been reported for comparing microarray data, whereas the utilization of the Wilcoxon signed-rank test, which is an appropriate test for two-groups comparison of gene expression data, has largely been neglected in microarray studies. The aim of this investigation was to build an integrated tool, ArraySolver, for colour-coded graphical display and comparison of gene expression data using the Wilcoxon signed-rank test. The results of software validation showed similar outputs with ArraySolver and SPSS for large datasets, whereas the former program appeared to be more accurate for 25 or fewer pairs (n ≤ 25), suggesting its potential application in analysing molecular signatures that usually contain small numbers of genes. The main advantages of ArraySolver are easy data selection, convenient report format, accurate statistics and the familiar Excel platform.
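
    For readers unfamiliar with the Wilcoxon signed-rank test, the statistic and an exact two-sided p-value for a small number of pairs can be sketched as below. This is a generic textbook-style illustration, not ArraySolver's code: the function names and the paired values are hypothetical, zero differences are dropped, ties receive average ranks, and the exact p-value is obtained by enumerating all sign assignments, which is only practical for small n.

```python
from itertools import product


def signed_rank_statistic(x, y):
    """Return (W_plus, W_minus, ranks) for paired samples.
    Zero differences are dropped; tied absolute differences get average ranks."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, ranks


def exact_two_sided_p(w_plus, w_minus, ranks):
    """Exact two-sided p-value by enumerating all 2**n sign assignments."""
    observed = min(w_plus, w_minus)
    total = sum(ranks)
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            hits += 1
    return hits / 2 ** len(ranks)


# Hypothetical paired expression values for six genes in two conditions
before = [12, 15, 11, 18, 14, 16]
after = [14, 19, 12, 25, 13, 22]

w_plus, w_minus, ranks = signed_rank_statistic(before, after)
p_value = exact_two_sided_p(w_plus, w_minus, ranks)
print(w_plus, w_minus, p_value)
```

    For larger n a normal approximation is normally used instead of enumeration; that variant is omitted here.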

  5. ArraySolver: An Algorithm for Colour-Coded Graphical Display and Wilcoxon Signed-Rank Statistics for Comparing Microarray Gene Expression Data

    PubMed Central

    2004-01-01

    The massive surge in the production of microarray data poses a great challenge for proper analysis and interpretation. In recent years numerous computational tools have been developed to extract meaningful interpretation of microarray gene expression data. However, a convenient tool for two-groups comparison of microarray data is still lacking and users have to rely on commercial statistical packages that might be costly and require special skills, in addition to extra time and effort for transferring data from one platform to another. Various statistical methods, including the t-test, analysis of variance, Pearson test and Mann–Whitney U test, have been reported for comparing microarray data, whereas the utilization of the Wilcoxon signed-rank test, which is an appropriate test for two-groups comparison of gene expression data, has largely been neglected in microarray studies. The aim of this investigation was to build an integrated tool, ArraySolver, for colour-coded graphical display and comparison of gene expression data using the Wilcoxon signed-rank test. The results of software validation showed similar outputs with ArraySolver and SPSS for large datasets, whereas the former program appeared to be more accurate for 25 or fewer pairs (n ≤ 25), suggesting its potential application in analysing molecular signatures that usually contain small numbers of genes. The main advantages of ArraySolver are easy data selection, convenient report format, accurate statistics and the familiar Excel platform. PMID:18629036

  6. Clustering approaches to identifying gene expression patterns from DNA microarray data.

    PubMed

    Do, Jin Hwan; Choi, Dong-Kug

    2008-04-30

    The analysis of microarray data is essential for making sense of large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.
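
    As a concrete instance of the crisp clustering algorithms mentioned above, a bare-bones K-means on expression profiles might look like the following. It is a sketch under simplifying assumptions: squared Euclidean distance, a fixed number of iterations, and a deterministic initialization from the first k profiles (chosen only to keep the example reproducible). The profiles are hypothetical.

```python
def kmeans(points, k, iterations=20):
    """Plain K-means with squared Euclidean distance.
    Centroids are initialized from the first k points (an assumption made
    for reproducibility; real analyses usually use random restarts)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each profile to its nearest centroid
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical expression profiles of four genes across three conditions
profiles = [
    [1.0, 1.1, 0.9],
    [5.0, 5.2, 4.9],
    [1.2, 0.9, 1.0],
    [4.8, 5.1, 5.0],
]

labels, centroids = kmeans(profiles, k=2)
print(labels)
```

    Changing `k` or the initial centroids can change the partition, which is the sensitivity to user-selectable parameters that the review warns about.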

  7. Approximate geodesic distances reveal biologically relevant structures in microarray data.

    PubMed

    Nilsson, Jens; Fioretos, Thoas; Höglund, Mattias; Fontes, Magnus

    2004-04-12

    Genome-wide gene expression measurements, as currently determined by the microarray technology, can be represented mathematically as points in a high-dimensional gene expression space. Genes interact with each other in regulatory networks, restricting the cellular gene expression profiles to a certain manifold, or surface, in gene expression space. To obtain knowledge about this manifold, various dimensionality reduction methods and distance metrics are used. For data points distributed on curved manifolds, a sensible distance measure would be the geodesic distance along the manifold. In this work, we examine whether an approximate geodesic distance measure captures biological similarities better than the traditionally used Euclidean distance. We computed approximate geodesic distances, determined by the Isomap algorithm, for one set of lymphoma and one set of lung cancer microarray samples. Compared with the ordinary Euclidean distance metric, this distance measure produced more instructive, biologically relevant, visualizations when applying multidimensional scaling. This suggests the Isomap algorithm as a promising tool for the interpretation of microarray data. Furthermore, the results demonstrate the benefit and importance of taking nonlinearities in gene expression data into account.
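
    The difference between Euclidean and approximate geodesic distance can be illustrated with a short sketch in the spirit of the first two Isomap steps: build a k-nearest-neighbour graph, then take shortest-path lengths through it. The function names, the neighbourhood size and the toy data (points on a half circle) are assumptions for illustration; the embedding step (multidimensional scaling) is omitted.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def approximate_geodesic(points, n_neighbors):
    """Return (Euclidean distance matrix, shortest-path distance matrix
    through a symmetric k-nearest-neighbour graph)."""
    n = len(points)
    d = [[euclidean(p, q) for q in points] for p in points]
    g = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        g[i][i] = 0.0
        # Skip position 0 of the sorted list, which is the point itself
        nearest = sorted(range(n), key=lambda j: d[i][j])[1:n_neighbors + 1]
        for j in nearest:
            g[i][j] = g[j][i] = d[i][j]
    # Floyd-Warshall all-pairs shortest paths
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if g[i][k] + g[k][j] < g[i][j]:
                    g[i][j] = g[i][k] + g[k][j]
    return d, g


# Toy manifold: seven points on a half circle of radius 1
points = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
          for a in range(0, 181, 30)]

d, g = approximate_geodesic(points, n_neighbors=2)
print(d[0][-1], g[0][-1])
```

    The two end points are about 2 apart in a straight line, but the path through the neighbourhood graph follows the curve and is noticeably longer, approaching the arc length of pi.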

  8. Identification of suitable qPCR reference genes in leaves of Brassica oleracea under abiotic stresses.

    PubMed

    Brulle, Franck; Bernard, Fabien; Vandenbulcke, Franck; Cuny, Damien; Dumez, Sylvain

    2014-04-01

    Real-time quantitative PCR is nowadays a standard method to study gene expression variations in various samples and experimental conditions. However, to interpret results accurately, data normalization with appropriate reference genes appears to be crucial. The present study describes the identification and the validation of suitable reference genes in Brassica oleracea leaves. Expression stability of eight candidates was tested following drought and cold abiotic stresses by using three different software tools (BestKeeper, NormFinder and geNorm). Four genes (BolC.TUB6, BolC.SAND1, BolC.UBQ2 and BolC.TBP1) emerged as the most stable across the tested conditions. Further gene expression analysis of a drought- and a cold-responsive gene (BolC.DREB2A and BolC.ELIP, respectively) confirmed the stability and the reliability of the identified reference genes when used for normalization in the leaves of B. oleracea. These four genes were finally tested upon benzene exposure and all appeared to be useful reference genes under this toxicological condition. These results provide a good starting point for future studies involving gene expression measurement on leaves of B. oleracea exposed to environmental modifications.

  9. Functional network analysis of genes differentially expressed during xylogenesis in soc1ful woody Arabidopsis plants.

    PubMed

    Davin, Nicolas; Edger, Patrick P; Hefer, Charles A; Mizrachi, Eshchar; Schuetz, Mathias; Smets, Erik; Myburg, Alexander A; Douglas, Carl J; Schranz, Michael E; Lens, Frederic

    2016-06-01

    Many plant genes are known to be involved in the development of cambium and wood, but how the expression and functional interaction of these genes determine the unique biology of wood remains largely unknown. We used the soc1ful loss of function mutant - the woodiest genotype known in the otherwise herbaceous model plant Arabidopsis - to investigate the expression and interactions of genes involved in secondary growth (wood formation). Detailed anatomical observations of the stem in combination with mRNA sequencing were used to assess transcriptome remodeling during xylogenesis in wild-type and woody soc1ful plants. To interpret the transcriptome changes, we constructed functional gene association networks of differentially expressed genes using the STRING database. This analysis revealed functionally enriched gene association hubs that are differentially expressed in herbaceous and woody tissues. In particular, we observed the differential expression of genes related to mechanical stress and jasmonate biosynthesis/signaling during wood formation in soc1ful plants that may be an effect of greater tension within woody tissues. Our results suggest that habit shifts from herbaceous to woody life forms observed in many angiosperm lineages could have evolved convergently by genetic changes that modulate the gene expression and interaction network, and thereby redeploy the conserved wood developmental program. © 2016 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.

  10. Expression Profile of Drug and Nutrient Absorption Related Genes in Madin-Darby Canine Kidney (MDCK) Cells Grown under Differentiation Conditions.

    PubMed

    Quan, Yong; Jin, Yisheng; Faria, Teresa N; Tilford, Charles A; He, Aiqing; Wall, Doris A; Smith, Ronald L; Vig, Balvinder S

    2012-06-18

    The expression levels of genes involved in drug and nutrient absorption were evaluated in the Madin-Darby Canine Kidney (MDCK) in vitro drug absorption model. MDCK cells were grown on plastic surfaces (for 3 days) or on Transwell® membranes (for 3, 5, 7, and 9 days). The expression profile of genes including ABC transporters, SLC transporters, and cytochrome P450 (CYP) enzymes was determined using the Affymetrix® Canine GeneChip®. Expression of genes whose probe sets passed a stringent confirmation process was examined. Expression of a few transporter (MDR1, PEPT1 and PEPT2) genes in MDCK cells was confirmed by RT-PCR. The overall gene expression profile was strongly influenced by the type of support the cells were grown on. After 3 days of growth, expression of 28% of the genes was statistically different (1.5-fold cutoff, p < 0.05) between the cells grown on plastic and Transwell® membranes. When cells were differentiated on Transwell® membranes, large changes in gene expression profile were observed during the early stages, which then stabilized after 5-7 days. Only a small number of genes encoding drug absorption related SLC, ABC, and CYP were detected in MDCK cells, and most of them exhibited low hybridization signals. Results from this study provide valuable reference information on endogenous gene expression in MDCK cells that could assist in design of drug-transporter and/or drug-enzyme interaction studies, and help interpret the contributions of various transporters and metabolic enzymes in studies with MDCK cells.
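
    The selection criterion quoted above (1.5-fold cutoff, p < 0.05) amounts to a simple two-condition filter, sketched here. The record layout, gene names, mean signals and p-values are all hypothetical, and the p-values are assumed to come from a separate statistical test that is not shown.

```python
def differentially_expressed(records, fold_cutoff=1.5, alpha=0.05):
    """Select genes whose mean signal differs by at least fold_cutoff in
    either direction and whose p-value is below alpha.
    Assumes strictly positive mean signals."""
    selected = []
    for gene, mean_a, mean_b, p_value in records:
        fold = max(mean_a, mean_b) / min(mean_a, mean_b)
        if fold >= fold_cutoff and p_value < alpha:
            selected.append(gene)
    return selected


# Hypothetical (gene, mean on plastic, mean on membrane, p-value) records
records = [
    ("gene1", 100.0, 160.0, 0.01),   # 1.6-fold, significant
    ("gene2", 100.0, 120.0, 0.001),  # significant but only 1.2-fold
    ("gene3", 300.0, 100.0, 0.20),   # 3-fold but not significant
    ("gene4", 90.0, 30.0, 0.03),     # 3-fold, significant
]

selected = differentially_expressed(records)
fraction = len(selected) / len(records)
print(selected, fraction)
```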

  11. Expression Profile of Drug and Nutrient Absorption Related Genes in Madin-Darby Canine Kidney (MDCK) Cells Grown under Differentiation Conditions

    PubMed Central

    Quan, Yong; Jin, Yisheng; Faria, Teresa N.; Tilford, Charles A.; He, Aiqing; Wall, Doris A.; Smith, Ronald L.; Vig, Balvinder S.

    2012-01-01

    The expression levels of genes involved in drug and nutrient absorption were evaluated in the Madin-Darby Canine Kidney (MDCK) in vitro drug absorption model. MDCK cells were grown on plastic surfaces (for 3 days) or on Transwell® membranes (for 3, 5, 7, and 9 days). The expression profile of genes including ABC transporters, SLC transporters, and cytochrome P450 (CYP) enzymes was determined using the Affymetrix® Canine GeneChip®. Expression of genes whose probe sets passed a stringent confirmation process was examined. Expression of a few transporter (MDR1, PEPT1 and PEPT2) genes in MDCK cells was confirmed by RT-PCR. The overall gene expression profile was strongly influenced by the type of support the cells were grown on. After 3 days of growth, expression of 28% of the genes was statistically different (1.5-fold cutoff, p < 0.05) between the cells grown on plastic and Transwell® membranes. When cells were differentiated on Transwell® membranes, large changes in gene expression profile were observed during the early stages, which then stabilized after 5–7 days. Only a small number of genes encoding drug absorption related SLC, ABC, and CYP were detected in MDCK cells, and most of them exhibited low hybridization signals. Results from this study provide valuable reference information on endogenous gene expression in MDCK cells that could assist in design of drug-transporter and/or drug-enzyme interaction studies, and help interpret the contributions of various transporters and metabolic enzymes in studies with MDCK cells. PMID:24300234

  12. Mimosa: Mixture Model of Co-expression to Detect Modulators of Regulatory Interaction

    NASA Astrophysics Data System (ADS)

    Hansen, Matthew; Everett, Logan; Singh, Larry; Hannenhalli, Sridhar

    Functionally related genes tend to be correlated in their expression patterns across multiple conditions and/or tissue-types. Thus co-expression networks are often used to investigate functional groups of genes. In particular, when one of the genes is a transcription factor (TF), the co-expression-based interaction is interpreted, with caution, as a direct regulatory interaction. However, any particular TF, and more importantly, any particular regulatory interaction, is likely to be active only in a subset of experimental conditions. Moreover, the subset of expression samples where the regulatory interaction holds may be marked by presence or absence of a modifier gene, such as an enzyme that post-translationally modifies the TF. Such subtlety of regulatory interactions is overlooked when one computes an overall expression correlation. Here we present a novel mixture modeling approach where a TF-Gene pair is presumed to be significantly correlated (with unknown coefficient) in a (unknown) subset of expression samples. The parameters of the model are estimated using a Maximum Likelihood approach. The estimated mixture of expression samples is then mined to identify genes potentially modulating the TF-Gene interaction. We have validated our approach using synthetic data and on three biological cases in cow and in yeast. While limited in some ways, as discussed, the work represents a novel approach to mine expression data and detect potential modulators of regulatory interactions.

  13. Adaptation of muscle gene expression to changes in contractile activity

    NASA Technical Reports Server (NTRS)

    Booth, F. W.; Babij, P.; Thomason, D. B.; Wong, T. S.; Morrison, P. R.

    1987-01-01

    A review of the existing literature regarding the effects of different types of physical activities on the gene expression of adult skeletal muscles leads us to conclude that each type of exercise training program has, as a result, a different phenotype, which means that there are multiple mechanisms, each producing a unique phenotype. A portion of the facts which support this position is presented and interpreted here. [Abstract translated from the original French by NASA].

  14. Unstable Expression of Commonly Used Reference Genes in Rat Pancreatic Islets Early after Isolation Affects Results of Gene Expression Studies.

    PubMed

    Kosinová, Lucie; Cahová, Monika; Fábryová, Eva; Týcová, Irena; Koblas, Tomáš; Leontovyč, Ivan; Saudek, František; Kříž, Jan

    2016-01-01

    The use of RT-qPCR provides a powerful tool for gene expression studies; however, the proper interpretation of the obtained data is crucially dependent on accurate normalization based on stable reference genes. Recently, strong evidence has been shown indicating that the expression of many commonly used reference genes may vary significantly due to diverse experimental conditions. The isolation of pancreatic islets is a complicated procedure which creates severe mechanical and metabolic stress leading possibly to cellular damage and alteration of gene expression. Despite this, freshly isolated islets frequently serve as a control in various gene expression and intervention studies. The aim of our study was to determine expression of 16 candidate reference genes and one gene of interest (F3) in isolated rat pancreatic islets during short-term cultivation in order to find a suitable endogenous control for gene expression studies. We compared the expression stability of the most commonly used reference genes and evaluated the reliability of relative and absolute quantification using RT-qPCR during 0-120 hrs after isolation. In freshly isolated islets, the expression of all tested genes was markedly depressed and it increased several times throughout the first 48 hrs of cultivation. We observed significant variability among samples at 0 and 24 hrs but substantial stabilization from 48 hrs onwards. During the first 48 hrs, relative quantification failed to reflect the real changes in respective mRNA concentrations while in the interval 48-120 hrs, the relative expression generally paralleled the results determined by absolute quantification. Thus, our data call into question the suitability of relative quantification for gene expression analysis in pancreatic islets during the first 48 hrs of cultivation, as the results may be significantly affected by unstable expression of reference genes. However, this method could provide reliable information from 48 hrs onwards.
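
    Why an unstable reference gene distorts relative quantification can be seen from the widely used 2^-ΔΔCt calculation, sketched below. This is a generic illustration that assumes 100% amplification efficiency; the function name and the Ct values are hypothetical and are not data from the study.

```python
def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^-ddCt calculation
    (assumes perfect doubling per PCR cycle)."""
    delta_sample = ct_target - ct_ref
    delta_control = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(delta_sample - delta_control)


# Target gene unchanged (Ct 25 in both conditions), reference gene stable
stable = relative_expression(25, 20, 25, 20)

# Same target, but the reference gene itself drops (Ct 20 -> 22) in the sample
unstable = relative_expression(25, 22, 25, 20)

print(stable, unstable)
```

    With a stable reference the unchanged target gives a ratio of 1, while a two-cycle shift in the reference alone produces an apparent four-fold induction of the target.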

  15. Network-Induced Classification Kernels for Gene Expression Profile Analysis

    PubMed Central

    Dror, Gideon; Shamir, Ron

    2012-01-01

    Abstract Computational classification of gene expression profiles into distinct disease phenotypes has been highly successful to date. Still, robustness, accuracy, and biological interpretation of the results have been limited, and it was suggested that use of protein interaction information jointly with the expression profiles can improve the results. Here, we study three aspects of this problem. First, we show that interactions are indeed relevant by showing that co-expressed genes tend to be closer in the network of interactions. Second, we show that the improved performance of one extant method utilizing expression and interactions is not really due to the biological information in the network, while in another method this is not the case. Finally, we develop a new kernel method—called NICK—that integrates network and expression data for SVM classification, and demonstrate that overall it achieves better results than extant methods while running two orders of magnitude faster. PMID:22697242

  16. A transversal approach to predict gene product networks from ontology-based similarity

    PubMed Central

    Chabalier, Julie; Mosser, Jean; Burgun, Anita

    2007-01-01

    Background Interpretation of transcriptomic data is usually made through a "standard" approach which consists in clustering the genes according to their expression patterns and exploiting Gene Ontology (GO) annotations within each expression cluster. This approach makes it difficult to underline functional relationships between gene products that belong to different expression clusters. To address this issue, we propose a transversal analysis that aims to predict functional networks based on a combination of GO processes and data expression. Results The transversal approach presented in this paper consists in computing the semantic similarity between gene products in a Vector Space Model. Through a weighting scheme over the annotations, we take into account the representativity of the terms that annotate a gene product. Comparing annotation vectors results in a matrix of gene product similarities. Combined with expression data, the matrix is displayed as a set of functional gene networks. The transversal approach was applied to 186 genes related to the enterocyte differentiation stages. This approach resulted in 18 functional networks proved to be biologically relevant. These results were compared with those obtained through a standard approach and with an approach based on information content similarity. Conclusion Complementary to the standard approach, the transversal approach offers new insight into the cellular mechanisms and reveals new research hypotheses by combining gene product networks based on semantic similarity, and data expression. PMID:17605807
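
    The comparison of annotation vectors in a Vector Space Model can be sketched as a weighted cosine similarity. The weighting used below (a smoothed inverse document frequency) is an assumption standing in for the weighting scheme of the paper, which the abstract does not specify, and the gene names and annotation labels are placeholders rather than real GO identifiers.

```python
import math


def idf_weights(annotations):
    """Weight each annotation term by a smoothed inverse document frequency,
    so that rare (more specific) terms count more. This particular formula
    is an illustrative assumption."""
    n = len(annotations)
    df = {}
    for terms in annotations.values():
        for t in set(terms):
            df[t] = df.get(t, 0) + 1
    return {t: math.log(n / c) + 1.0 for t, c in df.items()}


def cosine_similarity(terms_a, terms_b, weights):
    """Cosine of the angle between two weighted annotation vectors."""
    a, b = set(terms_a), set(terms_b)
    dot = sum(weights[t] ** 2 for t in a & b)
    norm_a = math.sqrt(sum(weights[t] ** 2 for t in a))
    norm_b = math.sqrt(sum(weights[t] ** 2 for t in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Placeholder annotations: gene product -> annotation terms
annotations = {
    "g1": ["term_transport", "term_membrane"],
    "g2": ["term_transport", "term_membrane", "term_binding"],
    "g3": ["term_binding", "term_nucleus"],
}

weights = idf_weights(annotations)
sim_12 = cosine_similarity(annotations["g1"], annotations["g2"], weights)
sim_13 = cosine_similarity(annotations["g1"], annotations["g3"], weights)
sim_23 = cosine_similarity(annotations["g2"], annotations["g3"], weights)
print(sim_12, sim_13, sim_23)
```

    Computing this for every pair of gene products yields the similarity matrix that the abstract describes, which can then be thresholded to form networks.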

  17. g:Profiler-a web server for functional interpretation of gene lists (2016 update).

    PubMed

    Reimand, Jüri; Arak, Tambet; Adler, Priit; Kolberg, Liis; Reisberg, Sulev; Peterson, Hedi; Vilo, Jaak

    2016-07-08

    Functional enrichment analysis is a key step in interpreting gene lists discovered in diverse high-throughput experiments. g:Profiler studies flat and ranked gene lists and finds statistically significant Gene Ontology terms, pathways and other gene function related terms. Translation of hundreds of gene identifiers is another core feature of g:Profiler. Since its first publication in 2007, our web server has become a popular tool of choice among basic and translational researchers. Timeliness is a major advantage of g:Profiler as genome and pathway information is synchronized with the Ensembl database in quarterly updates. g:Profiler supports 213 species including mammals and other vertebrates, plants, insects and fungi. The 2016 update of g:Profiler introduces several novel features. We have added further functional datasets to interpret gene lists, including transcription factor binding site predictions, Mendelian disease annotations, information about protein expression and complexes and gene mappings of human genetic polymorphisms. Besides the interactive web interface, g:Profiler can be accessed in computational pipelines using our R package, Python interface and BioJS component. g:Profiler is freely available at http://biit.cs.ut.ee/gprofiler/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
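
    The core calculation behind most functional enrichment analyses of a flat gene list is an over-representation test, which can be sketched with the hypergeometric distribution. This is a generic illustration and not a description of g:Profiler's internals or of any of its interfaces; the function name and the counts are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_query, n_overlap):
    """Probability of observing at least n_overlap annotated genes when
    n_query genes are drawn without replacement from a background of
    n_background genes, n_annotated of which carry the term."""
    upper = min(n_annotated, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 annotated with a term,
# a query list of 4 genes of which 3 carry the term
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
print(p_value)
```

    In practice one such p-value is computed per term and the results are corrected for the large number of terms tested.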

  18. Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.

    PubMed

    Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi

    2013-01-01

    The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces fuzzy support vector machine which is a learning algorithm based on combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that fuzzy support vector machine applied in combination with filter or wrapper feature selection methods develops a robust model with higher accuracy than the conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule-base inferred from fuzzy support vector machine helps extracting biological knowledge from microarray data. Fuzzy support vector machine as a new classification model with high generalization power, robustness, and good interpretability seems to be a promising tool for gene expression microarray classification.

  19. Normal gene expression in male F344 rat nasal transitional and respiratory epithelium.

    PubMed

    Hester, Susan D; Benavides, Gina B; Sartor, Maureen; Yoon, Lawrence; Wolf, Douglas C; Morgan, Kevin T

    2002-02-20

    The nasal epithelium is an important target site for chemically-induced toxicity and carcinogenicity in rodents. Gene expression profiles were determined in order to provide normal baseline data for nasal transitional/respiratory epithelium from healthy rats. Cells lining the rat nasal passages were collected and gene expression analysis was performed using Clontech cDNA Rat Atlas 1.2 arrays (1185 genes). The percentages of genes within specific average expression ranges were 4.2% at 45,000-1000, 14.8% at 1000-200, 25.0% at 200-68, and 56.0% below 68. Nine out of a subset of ten genes were confirmed for relative signal intensity using quantitative real-time RT-PCR. The most highly expressed genes included those involved in phase I (e.g. cytochrome P450s) and phase II (e.g. glutathione S-transferases) xenobiotic metabolism, bioenergetics (e.g. cytochrome oxidase), osmotic balance (e.g. Na(+)/K(+) ATPase) and epithelial ionic homeostasis (e.g. ion channels). Such baseline data will contribute to further understanding the normal physiology of these cells and facilitate the interpretation of responses by the nasal epithelial cells to xenobiotic treatment or disease.
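
    The binning reported above can be sketched as follows. The cutoffs (1000, 200, 68) come from the abstract; how values that fall exactly on a cutoff were assigned is not stated, so the "greater than or equal" rule here is an assumption, and the sample intensities are invented for illustration.

```python
def percent_per_range(intensities, cutoffs=(1000, 200, 68)):
    """Return the percentage of values in each range, highest range first.

    With the default cutoffs the ranges are:
    >= 1000, 200 to < 1000, 68 to < 200, and < 68.
    """
    if not intensities:
        raise ValueError("at least one intensity is required")
    counts = [0] * (len(cutoffs) + 1)
    for value in intensities:
        for index, cutoff in enumerate(cutoffs):
            if value >= cutoff:
                counts[index] += 1
                break
        else:
            # The value is below every cutoff.
            counts[-1] += 1
    return [100.0 * count / len(intensities) for count in counts]


# Percentages reported in the abstract, listed in the same order.
reported = [4.2, 14.8, 25.0, 56.0]

if __name__ == "__main__":
    print(percent_per_range([5000, 500, 100, 10]))
    print(sum(reported))
```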

  20. Toward an understanding of the pathophysiology of clear cell carcinoma of the ovary (Review)

    PubMed Central

    UEKURI, CHIHARU; SHIGETOMI, HIROSHI; ONO, SUMIRE; SASAKI, YOSHIKAZU; MATSUURA, MIYUKI; KOBAYASHI, HIROSHI

    2013-01-01

    Endometriosis-associated ovarian cancers demonstrate substantial morphological and genetic diversity. The transcription factor, hepatocyte nuclear factor (HNF)-1β, may be one of several key genes involved in the identity of ovarian clear cell carcinoma (CCC). The present study reviews a considerably expanded set of HNF-1β-associated genes and proteins that determine the pathophysiology of CCC. The current literature was reviewed by searching MEDLINE/PubMed. Functional interpretations of gene expression profiling in CCC are provided. Several important CCC-related genes overlap with those known to be regulated by the upregulation of HNF-1β expression, along with a lack of estrogen receptor (ER) expression. Furthermore, the genetic expression pattern in CCC resembles that of the Arias-Stella reaction, decidualization and placentation. HNF-1β regulates a subset of progesterone target genes. HNF-1β may also act as a modulator of female reproduction, playing a role in endometrial regeneration, differentiation, decidualization, glycogen synthesis, detoxification, cell cycle regulation, implantation, uterine receptivity and a successful pregnancy. In conclusion, the present study focused on reviewing the aberrant expression of CCC-specific genes and provided an update on the pathological implications and molecular functions of well-characterized CCC-specific genes. PMID:24179489

  1. Novelty and Fear Conditioning Induced Gene Expression in High and Low States of Anxiety

    ERIC Educational Resources Information Center

    Donley, Melanie P.; Rosen, Jeffrey B.

    2017-01-01

    Emotional states influence how stimuli are interpreted. High anxiety states in humans lead to more negative, threatening interpretations of novel information, typically accompanied by activation of the amygdala. We developed a handling protocol that induces long-lasting high and low anxiety-like states in rats to explore the role of state anxiety…

  2. Microarray-based cancer prediction using soft computing approach.

    PubMed

    Wang, Xiaosheng; Gotoh, Osamu

    2009-05-26

    One of the difficulties in using gene expression profiles to predict cancer is how to effectively select a few informative genes to construct accurate prediction models from thousands or tens of thousands of genes. We screen highly discriminative genes and gene pairs to create simple prediction models involving single genes or gene pairs on the basis of a soft computing approach and rough set theory. Accurate cancer prediction is obtained when we apply the simple prediction models to four cancer gene expression datasets: CNS tumor, colon tumor, lung cancer and DLBCL. Some genes closely correlated with the pathogenesis of specific or general cancers are identified. In contrast with other models, our models are simple, effective and robust. Moreover, our models are interpretable because they are based on decision rules. Our results demonstrate that very simple models may perform well on molecular cancer prediction and that important gene markers of cancer can be detected if the gene selection approach is chosen reasonably.

  3. MiRNA and TF co-regulatory network analysis for the pathology and recurrence of myocardial infarction.

    PubMed

    Lin, Ying; Sibanda, Vusumuzi Leroy; Zhang, Hong-Mei; Hu, Hui; Liu, Hui; Guo, An-Yuan

    2015-04-13

    Myocardial infarction (MI) is a leading cause of death in the world and many genes are involved in it. Transcription factors (TFs) and microRNAs (miRNAs) are key regulators of gene expression. We hypothesized that miRNAs and TFs might play combinatory regulatory roles in MI. After collecting MI candidate genes and miRNAs from various resources, we constructed a comprehensive MI-specific miRNA-TF co-regulatory network by integrating predicted and experimentally validated TF and miRNA targets. We found that some hub nodes (e.g. miR-16 and miR-26) in this network are important regulators, and that the network can serve as a bridge to interpret the associations of previous results, as shown by the case of miR-29 in this study. We also constructed a regulatory network for MI recurrence and found several important genes (e.g. DAB2, BMP6, miR-320 and miR-103), the abnormal expression of which may be a potential regulatory mechanism and marker of MI recurrence. Finally, we proposed a cellular model to discuss major TF and miRNA regulators with signaling pathways in MI. This study provides more details on gene expression regulation and regulators involved in MI progression and recurrence. It also links and interprets many previous results.

  4. Principal Angle Enrichment Analysis (PAEA): Dimensionally Reduced Multivariate Gene Set Enrichment Analysis Tool

    PubMed Central

    Clark, Neil R.; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D.; Jones, Matthew R.; Ma’ayan, Avi

    2016-01-01

    Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed well in benchmarking tests and compared favorably to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed, nor has it been implemented as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs well. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community. PMID:26848405
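
    To give a feel for the geometric idea of a principal angle, the sketch below computes the angle between a single differential-expression direction and the coordinate subspace spanned by the genes of one gene set. This is a simplified one-dimensional illustration under stated assumptions, not the PAEA method itself, whose details are not given in the abstract; the function name and inputs are hypothetical.

```python
import math


def principal_angle_to_gene_set(direction, gene_set_indices):
    """Angle (radians) between a direction and a gene-set coordinate subspace.

    direction:        one number per gene, e.g. a differential-expression vector
    gene_set_indices: positions of the gene-set members in that vector

    A smaller angle means more of the direction lies within the gene set.
    """
    norm = math.sqrt(sum(x * x for x in direction))
    if norm == 0:
        raise ValueError("direction must not be the zero vector")
    # Projecting onto a coordinate subspace keeps only the member components.
    inside = math.sqrt(sum(direction[i] ** 2 for i in set(gene_set_indices)))
    cosine = min(1.0, inside / norm)
    return math.acos(cosine)


if __name__ == "__main__":
    print(principal_angle_to_gene_set([3.0, 4.0, 0.0], [0]))
```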

  5. Principal Angle Enrichment Analysis (PAEA): Dimensionally Reduced Multivariate Gene Set Enrichment Analysis Tool.

    PubMed

    Clark, Neil R; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D; Jones, Matthew R; Ma'ayan, Avi

    2015-11-01

    Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed well in benchmarking tests and compared favorably to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed, nor has it been implemented as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs well. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community.

  6. Passenger mutations and aberrant gene expression in congenic tissue plasminogen activator-deficient mouse strains.

    PubMed

    Szabo, R; Samson, A L; Lawrence, D A; Medcalf, R L; Bugge, T H

    2016-08-01

    Essentials C57BL/6J-tissue plasminogen activator (tPA)-deficient mice are widely used to study tPA function. Congenic C57BL/6J-tPA-deficient mice harbor large 129-derived chromosomal segments. The 129-derived chromosomal segments contain gene mutations that may confound data interpretation. Passenger mutation-free isogenic tPA-deficient mice were generated for study of tPA function. Background The ability to generate defined null mutations in mice revolutionized the analysis of gene function in mammals. However, gene-deficient mice generated by using 129-derived embryonic stem cells may carry large segments of 129 DNA, even when extensively backcrossed to reference strains, such as C57BL/6J, and this may confound interpretation of experiments performed in these mice. Tissue plasminogen activator (tPA), encoded by the PLAT gene, is a fibrinolytic serine protease that is widely expressed in the brain. A number of neurological abnormalities have been reported in tPA-deficient mice. Objectives To study genetic contamination of tPA-deficient mice. Materials and methods Whole genome expression array analysis, RNAseq expression profiling, low- and high-density single nucleotide polymorphism (SNP) analysis, bioinformatics and genome editing were used to analyze gene expression in tPA-deficient mouse brains. Results and conclusions Genes differentially expressed in the brain of Plat(-/-) mice from two independent colonies highly backcrossed onto the C57BL/6J strain clustered near Plat on chromosome 8. SNP analysis attributed this anomaly to about 20 Mbp of DNA flanking Plat being of 129 origin in both strains. Bioinformatic analysis of these 129-derived chromosomal segments identified a significant number of mutations in genes co-segregating with the targeted Plat allele, including several potential null mutations. 
    Using zinc finger nuclease technology, we generated novel 'passenger mutation'-free isogenic C57BL/6J-Plat(-/-) and FVB/NJ-Plat(-/-) mouse strains by introducing an 11 bp deletion into the exon encoding the signal peptide. These novel mouse strains will be a useful community resource for further exploration of tPA function in physiological and pathological processes. © 2016 International Society on Thrombosis and Haemostasis.

  7. Homo sapiens exhibit a distinct pattern of CNV genes regulation: an important role of miRNAs and SNPs in expression plasticity.

    PubMed

    Dweep, Harsh; Kubikova, Nada; Gretz, Norbert; Voskarides, Konstantinos; Felekkis, Kyriacos

    2015-07-16

    Gene expression regulation is a complex and highly organized process involving a variety of genomic factors. It is widely accepted that differences in gene expression can contribute to the phenotypic variability between species, and that their interpretation can aid in the understanding of the physiologic variability. CNVs and miRNAs are two major players in the regulation of expression plasticity and may be responsible for the unique phenotypic characteristics observed in different lineages. We have previously demonstrated that a close interaction between these two genomic elements may have contributed to the regulation of gene expression during evolution. This work presents the molecular interactions between CNV and non-CNV genes with miRNAs and other genomic elements in eight different species. A comprehensive analysis of these interactions indicates a unique nature of human CNV gene regulation as compared to other species. By using genes with short 3' UTR that abolish the "canonical" miRNA-dependent regulation as a model, we demonstrate a distinct and tight regulation of human genes that might explain some of the unique features of human physiology. In addition, comparison of gene expression regulation between species indicated that there is a significant difference between humans and mice, possibly questioning the effectiveness of the latter as experimental models of human diseases.

  8. Homo sapiens exhibit a distinct pattern of CNV genes regulation: an important role of miRNAs and SNPs in expression plasticity

    PubMed Central

    Dweep, Harsh; Kubikova, Nada; Gretz, Norbert; Voskarides, Konstantinos; Felekkis, Kyriacos

    2015-01-01

    Gene expression regulation is a complex and highly organized process involving a variety of genomic factors. It is widely accepted that differences in gene expression can contribute to the phenotypic variability between species, and that their interpretation can aid in the understanding of the physiologic variability. CNVs and miRNAs are two major players in the regulation of expression plasticity and may be responsible for the unique phenotypic characteristics observed in different lineages. We have previously demonstrated that a close interaction between these two genomic elements may have contributed to the regulation of gene expression during evolution. This work presents the molecular interactions between CNV and non-CNV genes with miRNAs and other genomic elements in eight different species. A comprehensive analysis of these interactions indicates a unique nature of human CNV gene regulation as compared to other species. By using genes with short 3′ UTR that abolish the “canonical” miRNA-dependent regulation as a model, we demonstrate a distinct and tight regulation of human genes that might explain some of the unique features of human physiology. In addition, comparison of gene expression regulation between species indicated that there is a significant difference between humans and mice, possibly questioning the effectiveness of the latter as experimental models of human diseases. PMID:26178010

  9. Housekeeping gene expression during fetal brain development in the rat-validation by semi-quantitative RT-PCR.

    PubMed

    Al-Bader, Maie Dawoud; Al-Sarraf, Hameed Ali

    2005-04-21

    Mammalian gene expression is usually studied at the level of mRNA, where the amount of the mRNA of interest is measured under different conditions such as growth and development. It is therefore important to use a "housekeeping gene" that does not change in relative abundance under the experimental conditions as a standard or internal control. However, recent data suggest that expression of some housekeeping genes may vary with the extent of cell proliferation and differentiation and under various experimental conditions. In this study, the expression of various housekeeping genes (18S rRNA [18S], glyceraldehyde-3-phosphate dehydrogenase [G3PDH], beta-glucuronidase [BGLU], histone H4 [HH4], ribosomal protein L19 [RPL19] and cyclophilin [CY]) was investigated during fetal rat brain development using semi-quantitative RT-PCR at 16, 19 and 21 days gestation. It was found that all genes studied, with the exception of G3PDH, did not show any change in their expression levels during development. G3PDH, on the other hand, showed increased expression with development. These results suggest that the choice of a housekeeping gene is critical to the interpretation of experimental results and should be modified according to the nature of the study.

  10. Selection of reference genes for quantitative real-time PCR normalization in Panax ginseng at different stages of growth and in different organs.

    PubMed

    Liu, Jing; Wang, Qun; Sun, Minying; Zhu, Linlin; Yang, Michael; Zhao, Yu

    2014-01-01

    Quantitative real-time reverse transcription PCR (qRT-PCR) has become a widely used method for gene expression analysis; however, its data interpretation largely depends on the stability of reference genes. The transcriptomics of Panax ginseng, one of the most popular and traditional ingredients used in Chinese medicines, is increasingly being studied. Furthermore, it is vital to establish a series of reliable reference genes when qRT-PCR is used to assess the gene expression profile of ginseng. In this study, we screened candidate reference genes for ginseng using gene expression data generated by a high-throughput sequencing platform. Based on the statistical tests, 20 reference genes (10 traditional housekeeping genes and 10 novel genes) were selected. These genes were tested for the normalization of expression levels in five growth stages and three distinct plant organs of ginseng by qPCR. These genes were subsequently ranked and compared according to the stability of their expression using the geNorm, NormFinder, and BestKeeper computational programs. Although the best reference genes were found to vary across different samples, CYP and EF-1α were the most stable genes amongst all samples. GAPDH/30S RPS20, CYP/60S RPL13 and CYP/QCR were the optimum pairs of reference genes in the roots, stems, and leaves, respectively. CYP/60S RPL13, CYP/eIF-5A, aTUB/V-ATP, eIF-5A/SAR1, and aTUB/pol IIa were the most stably expressed combinations in each of the five developmental stages. Our study serves as a foundation for developing an accurate method of qRT-PCR and will benefit future studies on gene expression profiles of Panax ginseng.
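
    Ranking reference genes by expression stability can be sketched with a pairwise-variation score in the spirit of the geNorm M value as it is commonly described: for each gene, the average standard deviation of its log2 expression ratios against every other candidate. The abstract does not describe the exact computations used, so treat this as an assumption-laden sketch; the gene names and quantities below are made up.

```python
import math
import statistics


def stability_m_values(expression):
    """Return a stability score per gene; lower means more stable.

    expression: dict mapping gene name to a list of positive relative
                quantities (linear scale), one per sample, in the same
                sample order for every gene. Needs at least two genes
                and at least two samples.
    """
    genes = list(expression)
    scores = {}
    for gene in genes:
        spreads = []
        for other in genes:
            if other == gene:
                continue
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene], expression[other])
            ]
            # Variation of the ratio across samples for this gene pair.
            spreads.append(statistics.stdev(log_ratios))
        scores[gene] = sum(spreads) / len(spreads)
    return scores


if __name__ == "__main__":
    # Hypothetical relative quantities for three candidate genes.
    example = {
        "refA": [1.0, 2.0, 4.0],
        "refB": [2.0, 4.0, 8.0],
        "refC": [1.0, 1.0, 1.0],
    }
    print(stability_m_values(example))
```

    In this toy input, refA and refB keep a constant ratio across samples, so they score as more stable than refC.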

  11. Transcriptome architecture across tissues in the pig

    PubMed Central

    Ferraz, André LJ; Ojeda, Ana; López-Béjar, Manel; Fernandes, Lana T; Castelló, Anna; Folch, Josep M; Pérez-Enciso, Miguel

    2008-01-01

    Background Artificial selection has resulted in animal breeds with extreme phenotypes. As an organism is made up of many different tissues and organs, each with its own genetic programme, it is pertinent to ask: How relevant is tissue in terms of total transcriptome variability? Which are the genes most distinctly expressed between tissues? Does breed or sex equally affect the transcriptome across tissues? Results In order to gain insight into these issues, we conducted microarray expression profiling of 16 different tissues from four animals of two extreme pig breeds, Large White and Iberian, two males and two females. Mixed model analysis and neighbor-joining trees showed that tissues with similar developmental origin clustered more closely than those with different embryonic origins. Often a sound biological interpretation was possible for overrepresented gene ontology categories within differentially expressed genes between groups of tissues. For instance, an excess of nervous system or muscle development genes was found among tissues of ectoderm or mesoderm origins, respectively. Tissue accounted for ~11 times more variability than sex or breed. Nevertheless, we were able to confidently identify genes with differential expression across tissues between breeds (33 genes) and between sexes (19 genes). The genes primarily affected by sex were overall different from those affected by breed or tissue. Interaction with tissue can be important for differentially expressed genes between breeds but not so much for genes whose expression differs between sexes. Conclusion Embryonic development leaves an enduring footprint on the transcriptome. The interaction in gene × tissue for differentially expressed genes between breeds suggests that animal breeding has targeted differentially each tissue's transcriptome. PMID:18416811

  12. Patterns of gene expression in a scleractinian coral undergoing natural bleaching.

    PubMed

    Seneca, Francois O; Forêt, Sylvain; Ball, Eldon E; Smith-Keune, Carolyn; Miller, David J; van Oppen, Madeleine J H

    2010-10-01

    Coral bleaching is a major threat to coral reefs worldwide and is predicted to intensify with increasing global temperature. This study represents the first investigation of gene expression in an Indo-Pacific coral species undergoing natural bleaching which involved the loss of algal symbionts. Quantitative real-time polymerase chain reaction experiments were conducted to select and evaluate coral internal control genes (ICGs), and to investigate selected coral genes of interest (GOIs) for changes in gene expression in nine colonies of the scleractinian coral Acropora millepora undergoing bleaching at Magnetic Island, Great Barrier Reef, Australia. Among the six ICGs tested, glyceraldehyde 3-phosphate dehydrogenase and the ribosomal protein genes S7 and L9 exhibited the most constant expression levels between samples from healthy-looking colonies and samples from the same colonies when severely bleached a year later. These ICGs were therefore utilised for normalisation of expression data for seven selected GOIs. Of the seven GOIs, homologues of catalase, C-type lectin and chromoprotein genes were significantly up-regulated as a result of bleaching by factors of 1.81, 1.46 and 1.61 (linear mixed models analysis of variance, P < 0.05), respectively. We present these genes as potential coral bleaching response genes. In contrast, three genes, including one putative ICG, showed highly variable levels of expression between coral colonies. Potential variation in microhabitat, gene function unrelated to the stress response and individualised stress responses may influence such differences between colonies and need to be better understood when designing and interpreting future studies of gene expression in natural coral populations.
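
    Normalising a gene of interest against several internal control genes can be sketched with the widely used 2^(-ΔΔCt) calculation. The abstract does not say that this formula was used (it reports a linear mixed models analysis), so this is only an illustration; it assumes perfect amplification efficiency, and all Ct values below are invented. Averaging the control-gene Ct values corresponds to taking the geometric mean of their quantities.

```python
def fold_change(target_ct_before, target_ct_after,
                control_cts_before, control_cts_after):
    """Relative expression (after vs. before) by the 2^(-ddCt) approach.

    target_ct_*:  Ct of the gene of interest in each condition
    control_cts_*: Ct values of the internal control genes in each condition
    """
    # Mean control Ct per condition (geometric mean on the quantity scale).
    control_before = sum(control_cts_before) / len(control_cts_before)
    control_after = sum(control_cts_after) / len(control_cts_after)
    delta_before = target_ct_before - control_before
    delta_after = target_ct_after - control_after
    return 2 ** -(delta_after - delta_before)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold one cycle earlier
    # after bleaching while the three control genes are unchanged.
    print(fold_change(25.0, 24.0, [20.0, 21.0, 22.0], [20.0, 21.0, 22.0]))
```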

  13. Prostate cancer-associated gene expression alterations determined from needle biopsies.

    PubMed

    Qian, David Z; Huang, Chung-Ying; O'Brien, Catherine A; Coleman, Ilsa M; Garzotto, Mark; True, Lawrence D; Higano, Celestia S; Vessella, Robert; Lange, Paul H; Nelson, Peter S; Beer, Tomasz M

    2009-05-01

    To accurately identify gene expression alterations that differentiate neoplastic from normal prostate epithelium using an approach that avoids contamination by unwanted cellular components and is not compromised by acute gene expression changes associated with tumor devascularization and resulting ischemia. Approximately 3,000 neoplastic and benign prostate epithelial cells were isolated using laser capture microdissection from snap-frozen prostate biopsy specimens provided by 31 patients who subsequently participated in a clinical trial of preoperative chemotherapy. cDNA synthesized from amplified total RNA was hybridized to custom-made microarrays composed of 6,200 clones derived from the Prostate Expression Database. Expression differences for selected genes were verified using quantitative reverse transcription-PCR. Comparative analyses identified 954 transcript alterations associated with cancer (q < 0.01%), including 149 differentially expressed genes with no known functional roles. Gene expression changes associated with ischemia and surgical removal of the prostate gland were absent. Genes up-regulated in prostate cancer were statistically enriched in categories related to cellular metabolism, energy use, signal transduction, and molecular transport. Genes down-regulated in prostate cancers were enriched in categories related to immune response, cellular responses to pathogens, and apoptosis. A heterogeneous pattern of androgen receptor expression changes was noted. In exploratory analyses, androgen receptor down-regulation was associated with a lower probability of cancer relapse after neoadjuvant chemotherapy followed by radical prostatectomy. Assessments of tumor phenotypes based on gene expression for treatment stratification and drug targeting of oncogenic alterations may best be ascertained using biopsy-based analyses where the effects of ischemia do not complicate interpretation.

  14. Prostate Cancer-Associated Gene Expression Alterations Determined from Needle Biopsies

    PubMed Central

    Qian, David Z.; Huang, Chung-Ying; O'Brien, Catherine A.; Coleman, Ilsa M.; Garzotto, Mark; True, Lawrence D.; Higano, Celestia S.; Vessella, Robert; Lange, Paul H.; Nelson, Peter S.; Beer, Tomasz M.

    2010-01-01

    Purpose To accurately identify gene expression alterations that differentiate neoplastic from normal prostate epithelium using an approach that avoids contamination by unwanted cellular components and is not compromised by acute gene expression changes associated with tumor devascularization and resulting ischemia. Experimental Design Approximately 3,000 neoplastic and benign prostate epithelial cells were isolated using laser capture microdissection from snap-frozen prostate biopsy specimens provided by 31 patients who subsequently participated in a clinical trial of preoperative chemotherapy. cDNA synthesized from amplified total RNA was hybridized to custom-made microarrays comprised of 6200 clones derived from the Prostate Expression Database. Expression differences for selected genes were verified using quantitative RT-PCR. Results Comparative analyses identified 954 transcript alterations associated with cancer (q value <0.01%) including 149 differentially expressed genes with no known functional roles. Gene expression changes associated with ischemia and surgical removal of the prostate gland were absent. Genes up-regulated in prostate cancer were statistically enriched in categories related to cellular metabolism, energy utilization, signal transduction, and molecular transport. Genes down-regulated in prostate cancers were enriched in categories related to immune response, cellular responses to pathogens, and apoptosis. A heterogeneous pattern of AR expression changes was noted. In exploratory analyses, AR down regulation was associated with a lower probability of cancer relapse after neoadjuvant chemotherapy followed by radical prostatectomy. Conclusions Assessments of tumor phenotypes based on gene expression for treatment stratification and drug targeting of oncogenic alterations may best be ascertained using biopsy-based analyses where the effects of ischemia do not complicate interpretation. PMID:19366833

  15. Genes under weaker stabilizing selection increase network evolvability and rapid regulatory adaptation to an environmental shift.

    PubMed

    Laarits, T; Bordalo, P; Lemos, B

    2016-08-01

    Regulatory networks play a central role in the modulation of gene expression, the control of cellular differentiation, and the emergence of complex phenotypes. Regulatory networks could constrain or facilitate evolutionary adaptation in gene expression levels. Here, we model the adaptation of regulatory networks and gene expression levels to a shift in the environment that alters the optimal expression level of a single gene. Our analyses show signatures of natural selection on regulatory networks that both constrain and facilitate rapid evolution of gene expression level towards new optima. The analyses are interpreted from the standpoint of neutral expectations and illustrate the challenge of making inferences about network adaptation. Furthermore, we examine the consequence of variable stabilizing selection across genes on the strength and direction of interactions in regulatory networks and in their subsequent adaptation. We observe that directional selection on a highly constrained gene previously under strong stabilizing selection was more efficient when the gene was embedded within a network of partners under relaxed stabilizing selection pressure. The observation leads to the expectation that evolutionarily resilient regulatory networks will contain optimal ratios of genes whose expression is under weak and strong stabilizing selection. Altogether, our results suggest that the variable strengths of stabilizing selection across genes within regulatory networks might themselves contribute to the long-term adaptation of complex phenotypes. Journal of Evolutionary Biology © 2016 European Society For Evolutionary Biology.

  16. Gene expression profiling in multipotent DFAT cells derived from mature adipocytes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ono, Hiromasa; Database Center for Life Science; Oki, Yoshinao

    2011-04-15

    Highlights: Adipocyte dedifferentiation is evident in a significant decrease in typical genes. Cell proliferation is strongly related to adipocyte dedifferentiation. Dedifferentiated adipocytes express several lineage-specific genes. Comparative analyses using publicly available datasets boost the interpretation. Abstract: Cellular dedifferentiation signifies the withdrawal of cells from a specific differentiated state to a stem cell-like undifferentiated state. However, the mechanism of dedifferentiation remains obscure. Here we performed comparative transcriptome analyses during dedifferentiation in mature adipocytes (MAs) to identify the transcriptional signatures of multipotent dedifferentiated fat (DFAT) cells derived from MAs. Using microarray systems, we explored similarly expressed as well as significantly differentially expressed genes in MAs during dedifferentiation. This analysis revealed significant changes in gene expression during this process, including a significant reduction in expression of genes for lipid metabolism concomitantly with a significant increase in expression of genes for cell movement, cell migration, tissue developmental processes, cell growth, cell proliferation, cell morphogenesis, altered cell shape, and cell differentiation. Our observations indicate that the transcriptional signatures of DFAT cells derived from MAs are summarized in terms of a significant decrease in functional phenotype-related genes and a parallel increase in cell proliferation, altered cell morphology, and regulation of the differentiation of related genes. A better understanding of the mechanisms involved in dedifferentiation may enable scientists to control and possibly alter the plasticity of the differentiated state, which may lead to benefits not only in stem cell research but also in regenerative medicine.

  17. Identification of Genes Involved in Breast Cancer Metastasis by Integrating Protein-Protein Interaction Information with Expression Data.

    PubMed

    Tian, Xin; Xin, Mingyuan; Luo, Jian; Liu, Mingyao; Jiang, Zhenran

    2017-02-01

    The selection of relevant genes for breast cancer metastasis is critical for the treatment and prognosis of cancer patients. Although much effort has been devoted to gene selection procedures using different statistical analysis methods or computational techniques, the interpretation of the variables in the resulting survival models has been limited so far. This article proposes a new Random Forest (RF)-based algorithm to identify important variables highly related to breast cancer metastasis, which is based on the importance scores of two variable selection algorithms: the mean decrease Gini (MDG) criterion of Random Forest and the GeneRank algorithm with protein-protein interaction (PPI) information. The new gene selection algorithm is called PPIRF. The improved prediction accuracy illustrates the reliability and high interpretability of the gene list selected by the PPIRF approach.
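
    The network half of this idea can be sketched with a GeneRank-style iteration, a PageRank-like scheme in which each gene's score blends its own expression change with the scores of its interaction partners. The update rule below follows the commonly cited formulation and is an assumption here, as are the damping value and the toy network. How PPIRF combines these scores with the Random Forest MDG scores is not specified in the abstract, so that step is not shown.

```python
def gene_rank(adjacency, expression_change, damping=0.5, iterations=100):
    """Iterate r_j = (1 - d) * ex_j + d * sum_i w_ij * r_i / deg_i.

    adjacency:         symmetric 0/1 matrix (list of lists) of interactions
    expression_change: one value per gene; absolute values are normalised
                       to sum to 1 before use
    damping:           weight d given to the network term (0 <= d < 1)
    """
    n = len(expression_change)
    total = sum(abs(x) for x in expression_change)
    if total == 0:
        raise ValueError("expression_change must contain a non-zero value")
    ex = [abs(x) / total for x in expression_change]
    degree = [sum(row) for row in adjacency]
    rank = ex[:]
    for _ in range(iterations):
        updated = []
        for j in range(n):
            # Each neighbour shares its score equally among its partners.
            incoming = sum(
                adjacency[i][j] * rank[i] / degree[i]
                for i in range(n)
                if degree[i] > 0
            )
            updated.append((1 - damping) * ex[j] + damping * incoming)
        rank = updated
    return rank


if __name__ == "__main__":
    # Hypothetical chain network gene0 - gene1 - gene2; only gene0 changes.
    chain = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    print(gene_rank(chain, [1.0, 0.0, 0.0]))
```

    In the toy chain, gene1 and gene2 receive non-zero scores purely through their connection to gene0, which is the behaviour that makes network information useful for prioritising genes with weak individual signals.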

  18. Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

    DOE PAGES

    Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.; ...

    2016-11-24

    Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. A multitude of technologies, abstractions, and interpretive frameworks have emerged to answer the challenges presented by genome function and regulatory network inference. Here, we propose a new approach for producing biologically meaningful clusters of coexpressed genes, called Atomic Regulons (ARs), based on expression data, gene context, and functional relationships. We demonstrate this new approach by computing ARs for Escherichia coli, which we compare with the coexpressed gene clusters predicted by two prevalent existing methods: hierarchical clustering and k-means clustering. We test the consistency of ARs predicted by all methods against expected interactions predicted by the Context Likelihood of Relatedness (CLR) mutual information based method, finding that the ARs produced by our approach show better agreement with CLR interactions. We then apply our method to compute ARs for four other genomes: Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus. We compare the AR clusters from all genomes to study the similarity of coexpression among a phylogenetically diverse set of species, identifying subsystems that show remarkable similarity over wide phylogenetic distances. We also study the sensitivity of our method for computing ARs to the expression data used in the computation, showing that our new approach requires less data than competing approaches to converge to a near final configuration of ARs. We go on to use our sensitivity analysis to identify the specific experiments that lead most rapidly to the final set of ARs for E. coli. As a result, this analysis produces insights into improving the design of gene expression experiments.

  19. Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.

    Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. A multitude of technologies, abstractions, and interpretive frameworks have emerged to answer the challenges presented by genome function and regulatory network inference. Here, we propose a new approach for producing biologically meaningful clusters of coexpressed genes, called Atomic Regulons (ARs), based on expression data, gene context, and functional relationships. We demonstrate this new approach by computing ARs for Escherichia coli, which we compare with the coexpressed gene clusters predicted by two prevalent existing methods: hierarchical clustering and k-means clustering. We test the consistency of ARs predicted by all methods against expected interactions predicted by the Context Likelihood of Relatedness (CLR) mutual information based method, finding that the ARs produced by our approach show better agreement with CLR interactions. We then apply our method to compute ARs for four other genomes: Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus. We compare the AR clusters from all genomes to study the similarity of coexpression among a phylogenetically diverse set of species, identifying subsystems that show remarkable similarity over wide phylogenetic distances. We also study the sensitivity of our method for computing ARs to the expression data used in the computation, showing that our new approach requires less data than competing approaches to converge to a near final configuration of ARs. We go on to use our sensitivity analysis to identify the specific experiments that lead most rapidly to the final set of ARs for E. coli. As a result, this analysis produces insights into improving the design of gene expression experiments.

  20. Maternal-Effect Lethal Mutations on Linkage Group II of Caenorhabditis Elegans

    PubMed Central

    Kemphues, K. J.; Kusch, M.; Wolf, N.

    1988-01-01

    We have analyzed a set of linkage group (LG) II maternal-effect lethal mutations in Caenorhabditis elegans isolated by a new screening procedure. Screens of 12,455 F1 progeny from mutagenized adults resulted in the recovery of 54 maternal-effect lethal mutations identifying 29 genes. Of the 54 mutations, 39 are strict maternal-effect mutations defining 17 genes. These 17 genes fall into two classes distinguished by frequency of mutation to strict maternal-effect lethality. The smaller class, comprised of four genes, mutated to strict maternal-effect lethality at a frequency close to 5 × 10^-4, a rate typical of essential genes in C. elegans. Two of these genes are expressed during oogenesis and required exclusively for embryogenesis (pure maternal genes), one appears to be required specifically for meiosis, and the fourth has a more complex pattern of expression. The other 13 genes were represented by only one or two strict maternal alleles each. Two of these are identical genes previously identified by nonmaternal embryonic lethal mutations. We interpret our results to mean that although many C. elegans genes can mutate to strict maternal-effect lethality, most genes mutate to that phenotype rarely. Pure maternal genes, however, are among a smaller class of genes that mutate to maternal-effect lethality at typical rates. If our interpretation is correct, we are near saturation for pure maternal genes in the region of LG II balanced by mnC1. We conclude that the number of pure maternal genes in C. elegans is small, being probably not much higher than 12. PMID:3224814

  1. TRACING CO-REGULATORY NETWORK DYNAMICS IN NOISY, SINGLE-CELL TRANSCRIPTOME TRAJECTORIES.

    PubMed

    Cordero, Pablo; Stuart, Joshua M

    2017-01-01

    The availability of gene expression data at the single cell level makes it possible to probe the molecular underpinnings of complex biological processes such as differentiation and oncogenesis. Promising new methods have emerged for reconstructing a progression 'trajectory' from static single-cell transcriptome measurements. However, it remains unclear how to adequately model the appreciable level of noise in these data to elucidate gene regulatory network rewiring. Here, we present a framework called Single Cell Inference of MorphIng Trajectories and their Associated Regulation (SCIMITAR) that infers progressions from static single-cell transcriptomes by employing a continuous parametrization of Gaussian mixtures in high-dimensional curves. SCIMITAR yields rich models from the data that highlight genes with expression and co-expression patterns that are associated with the inferred progression. Further, SCIMITAR extracts regulatory states from the implicated trajectory-evolving co-expression networks. We benchmark the method on simulated data to show that it yields accurate cell ordering and gene network inferences. Applied to the interpretation of a single-cell human fetal neuron dataset, SCIMITAR finds progression-associated genes in cornerstone neural differentiation pathways missed by standard differential expression tests. Finally, by leveraging the rewiring of gene-gene co-expression relations across the progression, the method reveals the rise and fall of co-regulatory states and trajectory-dependent gene modules. These analyses implicate new transcription factors in neural differentiation including putative co-factors for the multi-functional NFAT pathway.

  2. Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas.

    PubMed

    Mathelier, Anthony; Lefebvre, Calvin; Zhang, Allen W; Arenillas, David J; Ding, Jiarui; Wasserman, Wyeth W; Shah, Sohrab P

    2015-04-23

    With the rapid increase of whole-genome sequencing of human cancers, an important opportunity to analyze and characterize somatic mutations lying within cis-regulatory regions has emerged. A focus on protein-coding regions to identify nonsense or missense mutations disruptive to protein structure and/or function has led to important insights; however, the impact on gene expression of mutations lying within cis-regulatory regions remains under-explored. We analyzed somatic mutations from 84 matched tumor-normal whole genomes from B-cell lymphomas with accompanying gene expression measurements to elucidate the extent to which these cancers are disrupted by cis-regulatory mutations. We characterize mutations overlapping a high quality set of well-annotated transcription factor binding sites (TFBSs), covering a similar portion of the genome as protein-coding exons. Our results indicate that cis-regulatory mutations overlapping predicted TFBSs are enriched in promoter regions of genes involved in apoptosis or growth/proliferation. By integrating gene expression data with mutation data, our computational approach culminates with identification of cis-regulatory mutations most likely to participate in dysregulation of the gene expression program. The impact can be measured along with protein-coding mutations to highlight key mutations disrupting gene expression and pathways in cancer. Our study yields specific genes with disrupted expression triggered by genomic mutations in either the coding or the regulatory space. It implies that mutated regulatory components of the genome contribute substantially to cancer pathways. Our analyses demonstrate that identifying genomically altered cis-regulatory elements coupled with analysis of gene expression data will augment biological interpretation of mutational landscapes of cancers.

  3. How well do you know your mutation? Complex effects of genetic background on expressivity, complementation, and ordering of allelic effects

    PubMed Central

    Choi, Lin; DeNieu, Michael; Sonnenschein, Anne; Hummel, Kristen; Marier, Christian; Victory, Andrew; Porter, Cody; Mammel, Anna; Holms, Julie; Sivaratnam, Gayatri

    2017-01-01

    For a given gene, different mutations influence organismal phenotypes to varying degrees. However, the expressivity of these variants not only depends on the DNA lesion associated with the mutation, but also on factors including the genetic background and rearing environment. The degree to which these factors influence related alleles, genes, or pathways similarly, and whether similar developmental mechanisms underlie variation in the expressivity of a single allele across conditions and among alleles is poorly understood. Besides their fundamental biological significance, these questions have important implications for the interpretation of functional genetic analyses, for example, if these factors alter the ordering of allelic series or patterns of complementation. We examined the impact of genetic background and rearing environment for a series of mutations spanning the range of phenotypic effects for both the scalloped and vestigial genes, which influence wing development in Drosophila melanogaster. Genetic background and rearing environment influenced the phenotypic outcome of mutations, including intra-genic interactions, particularly for mutations of moderate expressivity. We examined whether cellular correlates (such as cell proliferation during development) of these phenotypic effects matched the observed phenotypic outcome. While cell proliferation decreased with mutations of increasingly severe effects, surprisingly it did not co-vary strongly with the degree of background dependence. We discuss these findings and propose a phenomenological model to aid in understanding the biology of genes, and how this influences our interpretation of allelic effects in genetic analysis. PMID:29166655

  4. MinePath: Mining for Phenotype Differential Sub-paths in Molecular Pathways

    PubMed Central

    Koumakis, Lefteris; Kartsaki, Evgenia; Chatzimina, Maria; Zervakis, Michalis; Vassou, Despoina; Marias, Kostas; Moustakis, Vassilis; Potamias, George

    2016-01-01

    Pathway analysis methodologies couple traditional gene expression analysis with knowledge encoded in established molecular pathway networks, offering a promising approach towards the biological interpretation of phenotype differentiating genes. Early pathway analysis methodologies, known as gene set analysis (GSA), view pathways just as plain lists of genes without taking into account either the underlying pathway network topology or the involved gene regulatory relations. These approaches, even if they achieve computational efficiency and simplicity, consider pathways that involve the same genes as equivalent in terms of their gene enrichment characteristics. Most recent pathway analysis approaches take into account the underlying gene regulatory relations by examining their consistency with gene expression profiles and computing a score for each profile. Even with this approach, assessing and scoring single-relations limits the ability to reveal key gene regulation mechanisms hidden in longer pathway sub-paths. We introduce MinePath, a pathway analysis methodology that addresses and overcomes the aforementioned problems. MinePath facilitates the decomposition of pathways into their constituent sub-paths. Decomposition leads to the transformation of single-relations to complex regulation sub-paths. Regulation sub-paths are then matched with gene expression sample profiles in order to evaluate their functional status and to assess phenotype differential power. Assessment of differential power supports the identification of the most discriminant profiles. In addition, MinePath assesses the significance of the pathways as a whole, ranking them by their p-values. Comparison results with state-of-the-art pathway analysis systems are indicative of the soundness and reliability of the MinePath approach. 
In contrast with many pathway analysis tools, MinePath is a web-based system (www.minepath.org) offering dynamic and rich pathway visualization functionality, with the unique characteristic to color regulatory relations between genes and reveal their phenotype inclination. This unique characteristic makes MinePath a valuable tool for in silico molecular biology experimentation as it serves the biomedical researchers’ exploratory needs to reveal and interpret the regulatory mechanisms that underlie and putatively govern the expression of target phenotypes. PMID:27832067

  5. MinePath: Mining for Phenotype Differential Sub-paths in Molecular Pathways.

    PubMed

    Koumakis, Lefteris; Kanterakis, Alexandros; Kartsaki, Evgenia; Chatzimina, Maria; Zervakis, Michalis; Tsiknakis, Manolis; Vassou, Despoina; Kafetzopoulos, Dimitris; Marias, Kostas; Moustakis, Vassilis; Potamias, George

    2016-11-01

    Pathway analysis methodologies couple traditional gene expression analysis with knowledge encoded in established molecular pathway networks, offering a promising approach towards the biological interpretation of phenotype differentiating genes. Early pathway analysis methodologies, known as gene set analysis (GSA), view pathways just as plain lists of genes without taking into account either the underlying pathway network topology or the involved gene regulatory relations. These approaches, even if they achieve computational efficiency and simplicity, consider pathways that involve the same genes as equivalent in terms of their gene enrichment characteristics. Most recent pathway analysis approaches take into account the underlying gene regulatory relations by examining their consistency with gene expression profiles and computing a score for each profile. Even with this approach, assessing and scoring single-relations limits the ability to reveal key gene regulation mechanisms hidden in longer pathway sub-paths. We introduce MinePath, a pathway analysis methodology that addresses and overcomes the aforementioned problems. MinePath facilitates the decomposition of pathways into their constituent sub-paths. Decomposition leads to the transformation of single-relations to complex regulation sub-paths. Regulation sub-paths are then matched with gene expression sample profiles in order to evaluate their functional status and to assess phenotype differential power. Assessment of differential power supports the identification of the most discriminant profiles. In addition, MinePath assesses the significance of the pathways as a whole, ranking them by their p-values. Comparison results with state-of-the-art pathway analysis systems are indicative of the soundness and reliability of the MinePath approach. 
In contrast with many pathway analysis tools, MinePath is a web-based system (www.minepath.org) offering dynamic and rich pathway visualization functionality, with the unique characteristic to color regulatory relations between genes and reveal their phenotype inclination. This unique characteristic makes MinePath a valuable tool for in silico molecular biology experimentation as it serves the biomedical researchers' exploratory needs to reveal and interpret the regulatory mechanisms that underlie and putatively govern the expression of target phenotypes.

  6. Technical guide for applications of gene expression profiling in human health risk assessment of environmental chemicals.

    PubMed

    Bourdon-Lacombe, Julie A; Moffat, Ivy D; Deveau, Michelle; Husain, Mainul; Auerbach, Scott; Krewski, Daniel; Thomas, Russell S; Bushel, Pierre R; Williams, Andrew; Yauk, Carole L

    2015-07-01

    Toxicogenomics promises to be an important part of future human health risk assessment of environmental chemicals. The application of gene expression profiles (e.g., for hazard identification, chemical prioritization, chemical grouping, mode of action discovery, and quantitative analysis of response) is growing in the literature, but their use in formal risk assessment by regulatory agencies is relatively infrequent. Although additional validations for specific applications are required, gene expression data can be of immediate use for increasing confidence in chemical evaluations. We believe that a primary reason for the current lack of integration is the limited practical guidance available for risk assessment specialists with limited experience in genomics. The present manuscript provides basic information on gene expression profiling, along with guidance on evaluating the quality of genomic experiments and data, and interpretation of results presented in the form of heat maps, pathway analyses and other common approaches. Moreover, potential ways to integrate information from gene expression experiments into current risk assessment are presented using published studies as examples. The primary objective of this work is to facilitate integration of gene expression data into human health risk assessments of environmental chemicals. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.

  7. Correlated mRNAs and miRNAs from co-expression and regulatory networks affect porcine muscle and finally meat properties.

    PubMed

    Ponsuksili, Siriluck; Du, Yang; Hadlich, Frieder; Siengdee, Puntita; Murani, Eduard; Schwerin, Manfred; Wimmers, Klaus

    2013-08-05

    Physiological processes aiding the conversion of muscle to meat involve many genes associated with muscle structure and metabolic processes. MicroRNAs regulate networks of genes to orchestrate cellular functions, in turn regulating phenotypes. We applied weighted gene co-expression network analysis to identify co-expression modules that correlated to meat quality phenotypes and were highly enriched for genes involved in glucose metabolism, response to wounding, mitochondrial ribosome, mitochondrion, and extracellular matrix. Negative correlation of miRNA with mRNA and target prediction were used to select transcripts out of the modules of trait-associated mRNAs to further identify those genes that are correlated with post mortem traits. Porcine muscle co-expression transcript networks that correlated to post mortem traits were identified. The integration of miRNA and mRNA expression analyses, as well as network analysis, enabled us to interpret the differentially-regulated genes from a systems perspective. Linking co-expression networks of transcripts and hierarchically organized pairs of miRNAs and mRNAs to meat properties yields new insight into several biological pathways underlying phenotype differences. These pathways may also be diagnostic for many myopathies, which are accompanied by deficient nutrient and oxygen supply of muscle fibers.
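
The weighted gene co-expression network analysis named in this abstract starts from a soft-thresholded correlation matrix. The following is a minimal sketch of that first step only, assuming the common unsigned form a_ij = |cor(x_i, x_j)|^beta. The function names, the default `beta=6`, and the toy matrix are illustrative assumptions and are not taken from the study, which would normally use the WGCNA R package.

```python
import numpy as np

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency from a samples x genes expression matrix.

    Each entry is |Pearson correlation| raised to the power beta, which
    suppresses weak correlations while keeping the network fully weighted.
    Genes with zero variance produce NaN correlations and should be
    filtered out beforehand.
    """
    cor = np.corrcoef(np.asarray(expr, dtype=float), rowvar=False)
    adj = np.abs(cor) ** beta
    np.fill_diagonal(adj, 0.0)   # convenience choice: ignore self-connections
    return adj

def connectivity(adj):
    """Sum of connection weights per gene (a simple hub score)."""
    return adj.sum(axis=1)

# Toy matrix: 4 samples x 3 genes; gene 2 is perfectly anti-correlated
# with genes 0 and 1, which the unsigned network treats as a strong link.
toy_expr = np.array([[1.0, 2.0, 4.0],
                     [2.0, 4.0, 3.0],
                     [3.0, 6.0, 2.0],
                     [4.0, 8.0, 1.0]])
toy_adj = soft_threshold_adjacency(toy_expr)
toy_k = connectivity(toy_adj)
```

Module detection and correlation of modules with meat quality traits would follow from this adjacency, typically via a topological overlap matrix and hierarchical clustering; those steps are omitted here.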

  8. Mitochondrial-related gene expression changes are sensitive to agonal-pH state: implications for brain disorders

    PubMed Central

    Vawter, MP; Tomita, H; Meng, F; Bolstad, B; Li, J; Evans, S; Choudary, P; Atz, M; Shao, L; Neal, C; Walsh, DM; Burmeister, M; Speed, T; Myers, R; Jones, EG; Watson, SJ; Akil, H; Bunney, WE

    2010-01-01

    Mitochondrial defects in gene expression have been implicated in the pathophysiology of bipolar disorder and schizophrenia. We have now contrasted control brains with low pH versus high pH and showed that 28% of genes in mitochondrial-related pathways meet criteria for differential expression. A majority of genes in the mitochondrial, chaperone and proteasome pathways of nuclear DNA-encoded gene expression were decreased with decreased brain pH, whereas a majority of genes in the apoptotic and reactive oxygen stress pathways showed an increased gene expression with a decreased brain pH. There was a significant increase in mitochondrial DNA copy number and mitochondrial DNA gene expression with increased agonal duration. To minimize effects of agonal-pH state on mood disorder comparisons, two classic approaches were used, removing all subjects with low pH and agonal factors from analysis, or grouping low and high pH as a separate variable. Three groups of potential candidate genes emerged that may be mood disorder related: (a) genes that showed no sensitivity to pH but were differentially expressed in bipolar disorder or major depressive disorder; (b) genes that were altered by agonal-pH in one direction but altered in mood disorder in the opposite direction to agonal-pH and (c) genes with agonal-pH sensitivity that displayed the same direction of changes in mood disorder. Genes from these categories such as NR4A1 and HSPA2 were confirmed with Q-PCR. The interpretation of postmortem brain studies involving broad mitochondrial gene expression and related pathway alterations must be monitored against the strong effect of agonal-pH state. Genes with the least sensitivity to agonal-pH could present a starting point for candidate gene search in neuropsychiatric disorders. PMID:16636682

  9. BASiCS: Bayesian Analysis of Single-Cell Sequencing Data

    PubMed Central

    Vallejos, Catalina A.; Marioni, John C.; Richardson, Sylvia

    2015-01-01

    Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of unexplained technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model where: (i) cell-specific normalisation constants are estimated as part of the model parameters, (ii) technical variability is quantified based on spike-in genes that are artificially introduced to each analysed cell’s lysate and (iii) the total variability of the expression counts is decomposed into technical and biological components. BASiCS also provides an intuitive detection criterion for highly (or lowly) variable genes within the population of cells under study. This is formalised by means of tail posterior probabilities associated to high (or low) biological cell-to-cell variance contributions, quantities that can be easily interpreted by users. We demonstrate our method using gene expression measurements from mouse Embryonic Stem Cells. Cross-validation and meaningful enrichment of gene ontology categories within genes classified as highly (or lowly) variable supports the efficacy of our approach. PMID:26107944

  10. BASiCS: Bayesian Analysis of Single-Cell Sequencing Data.

    PubMed

    Vallejos, Catalina A; Marioni, John C; Richardson, Sylvia

    2015-06-01

    Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of unexplained technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model where: (i) cell-specific normalisation constants are estimated as part of the model parameters, (ii) technical variability is quantified based on spike-in genes that are artificially introduced to each analysed cell's lysate and (iii) the total variability of the expression counts is decomposed into technical and biological components. BASiCS also provides an intuitive detection criterion for highly (or lowly) variable genes within the population of cells under study. This is formalised by means of tail posterior probabilities associated to high (or low) biological cell-to-cell variance contributions, quantities that can be easily interpreted by users. We demonstrate our method using gene expression measurements from mouse Embryonic Stem Cells. Cross-validation and meaningful enrichment of gene ontology categories within genes classified as highly (or lowly) variable supports the efficacy of our approach.

  11. CARSVM: a class association rule-based classification framework and its application to gene expression data.

    PubMed

    Kianmehr, Keivan; Alhajj, Reda

    2008-09-01

    In this study, we aim to build a classification framework, namely the CARSVM model, which integrates association rule mining and the support vector machine (SVM). The goal is to benefit from the advantages of both, the discriminative knowledge represented by class association rules and the classification power of the SVM algorithm, to construct an efficient and accurate classifier model that addresses the interpretability problem of SVM as a traditional machine learning technique and overcomes the efficiency issues of associative classification algorithms. In our proposed framework, instead of using the original training set, a set of rule-based feature vectors, which are generated based on the discriminative ability of class association rules over the training samples, is presented to the learning component of the SVM algorithm. We show that rule-based feature vectors present a high-quality source of discrimination knowledge that can substantially impact the prediction power of SVM and associative classification techniques. They are also easier for users to understand and interpret. We have used four datasets from the UCI ML repository to evaluate the performance of the developed system in comparison with five well-known existing classification methods. Because of the importance and popularity of gene expression analysis as a real-world application of the classification model, we present an extension of CARSVM combined with feature selection to be applied to gene expression data. Then, we describe how this combination will provide biologists with an efficient and understandable classifier model. The reported test results and their biological interpretation demonstrate the applicability, efficiency and effectiveness of the proposed model. 
From the results, it can be concluded that a considerable increase in classification accuracy can be obtained when the rule-based feature vectors are integrated in the learning process of the SVM algorithm. In the context of applicability, according to the results obtained from gene expression analysis, we can conclude that the CARSVM system can be utilized in a variety of real world applications with some adjustments.

  12. Tobacco exposure-related alterations in DNA methylation and gene expression in human monocytes: the Multi-Ethnic Study of Atherosclerosis (MESA)

    PubMed Central

    Reynolds, Lindsay M.; Lohman, Kurt; Pittman, Gary S.; Barr, R. Graham; Chi, Gloria C.; Kaufman, Joel; Wan, Ma; Bell, Douglas A.; Blaha, Michael J.; Rodriguez, Carlos J.; Liu, Yongmei

    2017-01-01

    Alterations in DNA methylation and gene expression in blood leukocytes are potential biomarkers of harm and mediators of the deleterious effects of tobacco exposure. However, methodological issues, including the use of self-reported smoking status and mixed cell types have made previously identified alterations in DNA methylation and gene expression difficult to interpret. In this study, we examined associations of tobacco exposure with DNA methylation and gene expression, utilizing a biomarker of tobacco exposure (urine cotinine) and CD14+ purified monocyte samples from 934 participants of the community-based Multi-Ethnic Study of Atherosclerosis (MESA). Urine cotinine levels were measured using an immunoassay. DNA methylation and gene expression were measured with microarrays. Multivariate linear regression was used to test for associations adjusting for age, sex, race/ethnicity, education, and study site. Urine cotinine levels were associated with methylation of 176 CpGs [false discovery rate (FDR)<0.01]. Four CpGs not previously identified by studies of non-purified blood samples nominally replicated (P value<0.05) with plasma cotinine-associated methylation in 128 independent monocyte samples. Urine cotinine levels were associated with expression of 12 genes (FDR<0.01), including increased expression of P2RY6 (Beta ± standard error = 0.078 ± 0.008, P = 1.99 × 10^-22), a gene previously identified to be involved in the release of pro-inflammatory cytokines. No cotinine-associated (FDR<0.01) methylation profiles significantly (FDR<0.01) correlated with cotinine-associated (FDR<0.01) gene expression profiles. In conclusion, our findings i) identify potential monocyte-specific smoking-associated methylation patterns and ii) suggest that alterations in methylation may not be a main mechanism regulating gene expression in monocytes in response to cigarette smoking. PMID:29166816
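
The FDR thresholds quoted in this abstract can be made concrete with a short sketch of how per-test p-values are converted to FDR-adjusted values. The abstract does not name the procedure used; the Benjamini-Hochberg step-up adjustment below is an assumption chosen because it is the most common one, and the function name and the toy p-values are hypothetical.

```python
import numpy as np

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # scale the i-th smallest p-value by n / i
    scaled = p[order] * n / np.arange(1, n + 1)
    # enforce monotonicity, working down from the largest p-value
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

# Toy p-values for four hypothetical CpG sites.
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_q = benjamini_hochberg(toy_p)   # array([0.02, 0.04, 0.04, 0.02])
significant = toy_q < 0.05
```

A site would be reported at "FDR<0.01" when its adjusted value falls below 0.01; with genome-wide microarrays the same calculation runs over hundreds of thousands of tests.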

  13. Avoiding false positives and optimizing identification of true ...

    EPA Pesticide Factsheets

    The potential for chemicals to affect endocrine signaling is commonly evaluated via in vitro receptor binding and gene activation, but these assays, especially antagonism assays, have potential artifacts that must be addressed for accurate interpretation. Results are presented from screening 94 chemicals from 54 chemical groups for estrogen receptor (ER) activation in a competitive rainbow trout ER (rtER) binding assay and a trout liver slice vitellogenin mRNA expression assay. Results from true competitive agonists and antagonists, and inactive chemicals with little or no indication of ER binding or gene activation were easily interpreted. However, results for numerous industrial chemicals were more challenging to interpret, including chemicals with: (1) apparent competitive binding curves but no gene activation, (2) apparent binding and gene inhibition with evidence of either cytotoxicity or changes in assay media pH, (3) apparent binding but non-competitive gene inhibition of unknown cause, or (4) no rtER binding and gene inhibition not due to competitive ER interaction but due to toxicity, pH change, or some unknown cause. The use of endpoints such as toxicity, pH, precipitate formation, and determination of inhibitor dissociation constants (Ki) for interpreting the results of antagonism and binding assays for diverse chemicals is presented. Of the 94 chemicals tested for antagonism only two, tamoxifen and ICI-182,780, were found to be true competitive

  14. Shared control of gene expression in bacteria by transcription factors and global physiology of the cell

    PubMed Central

    Berthoumieux, Sara; de Jong, Hidde; Baptist, Guillaume; Pinel, Corinne; Ranquet, Caroline; Ropers, Delphine; Geiselmann, Johannes

    2013-01-01

    Gene expression is controlled by the joint effect of (i) the global physiological state of the cell, in particular the activity of the gene expression machinery, and (ii) DNA-binding transcription factors and other specific regulators. We present a model-based approach to distinguish between these two effects using time-resolved measurements of promoter activities. We demonstrate the strength of the approach by analyzing a circuit involved in the regulation of carbon metabolism in E. coli. Our results show that the transcriptional response of the network is controlled by the physiological state of the cell and the signaling metabolite cyclic AMP (cAMP). The absence of a strong regulatory effect of transcription factors suggests that they are not the main coordinators of gene expression changes during growth transitions, but rather that they complement the effect of global physiological control mechanisms. This change of perspective has important consequences for the interpretation of transcriptome data and the design of biological networks in biotechnology and synthetic biology. PMID:23340840

  15. Whole Genome Gene Expression Meta-Analysis of Inflammatory Bowel Disease Colon Mucosa Demonstrates Lack of Major Differences between Crohn's Disease and Ulcerative Colitis

    PubMed Central

    Østvik, Ann E.; Drozdov, Ignat; Gustafsson, Bjørn I.; Kidd, Mark; Beisvag, Vidar; Torp, Sverre H.; Waldum, Helge L.; Martinsen, Tom Christian; Damås, Jan Kristian; Espevik, Terje; Sandvik, Arne K.

    2013-01-01

    Background In inflammatory bowel disease (IBD), genetic susceptibility together with environmental factors disturbs gut homeostasis producing chronic inflammation. The two main IBD subtypes are Ulcerative colitis (UC) and Crohn’s disease (CD). We present the to-date largest microarray gene expression study on IBD encompassing both inflamed and un-inflamed colonic tissue. A meta-analysis including all available, comparable data was used to explore important aspects of IBD inflammation, thereby validating consistent gene expression patterns. Methods Colon pinch biopsies from IBD patients were analysed using Illumina whole genome gene expression technology. Differential expression (DE) was identified using LIMMA linear model in the R statistical computing environment. Results were enriched for gene ontology (GO) categories. Sets of genes encoding antimicrobial proteins (AMP) and proteins involved in T helper (Th) cell differentiation were used in the interpretation of the results. All available data sets were analysed using the same methods, and results were compared on a global and focused level as t-scores. Results Gene expression in inflamed mucosa from UC and CD are remarkably similar. The meta-analysis confirmed this. The patterns of AMP and Th cell-related gene expression were also very similar, except for IL23A which was consistently higher expressed in UC than in CD. Un-inflamed tissue from patients demonstrated minimal differences from healthy controls. Conclusions There is no difference in the Th subgroup involvement between UC and CD. Th1/Th17 related expression, with little Th2 differentiation, dominated both diseases. The different IL23A expression between UC and CD suggests an IBD subtype specific role. AMPs, previously little studied, are strongly overexpressed in IBD. The presented meta-analysis provides a sound background for further research on IBD pathobiology. PMID:23468882

  16. Whole genome gene expression meta-analysis of inflammatory bowel disease colon mucosa demonstrates lack of major differences between Crohn's disease and ulcerative colitis.

    PubMed

    Granlund, Atle van Beelen; Flatberg, Arnar; Østvik, Ann E; Drozdov, Ignat; Gustafsson, Bjørn I; Kidd, Mark; Beisvag, Vidar; Torp, Sverre H; Waldum, Helge L; Martinsen, Tom Christian; Damås, Jan Kristian; Espevik, Terje; Sandvik, Arne K

    2013-01-01

    In inflammatory bowel disease (IBD), genetic susceptibility together with environmental factors disturbs gut homeostasis producing chronic inflammation. The two main IBD subtypes are Ulcerative colitis (UC) and Crohn's disease (CD). We present the to-date largest microarray gene expression study on IBD encompassing both inflamed and un-inflamed colonic tissue. A meta-analysis including all available, comparable data was used to explore important aspects of IBD inflammation, thereby validating consistent gene expression patterns. Colon pinch biopsies from IBD patients were analysed using Illumina whole genome gene expression technology. Differential expression (DE) was identified using LIMMA linear model in the R statistical computing environment. Results were enriched for gene ontology (GO) categories. Sets of genes encoding antimicrobial proteins (AMP) and proteins involved in T helper (Th) cell differentiation were used in the interpretation of the results. All available data sets were analysed using the same methods, and results were compared on a global and focused level as t-scores. Gene expression in inflamed mucosa from UC and CD are remarkably similar. The meta-analysis confirmed this. The patterns of AMP and Th cell-related gene expression were also very similar, except for IL23A which was consistently higher expressed in UC than in CD. Un-inflamed tissue from patients demonstrated minimal differences from healthy controls. There is no difference in the Th subgroup involvement between UC and CD. Th1/Th17 related expression, with little Th2 differentiation, dominated both diseases. The different IL23A expression between UC and CD suggests an IBD subtype specific role. AMPs, previously little studied, are strongly overexpressed in IBD. The presented meta-analysis provides a sound background for further research on IBD pathobiology.

  17. Analysis of expressed sequence tags from a single wheat cultivar facilitates interpretation of tandem mass spectrometry data and discrimination of gamma gliadin proteins that may play different functional roles in flour

    USDA-ARS's Scientific Manuscript database

    The complement of gamma gliadin genes expressed in the wheat cultivar Butte 86 was evaluated by analyzing publicly available expressed sequence tag (EST) data. Eleven contigs were assembled from 153 Butte 86 ESTs. Nine of the contigs encoded full-length proteins and four of the proteins contained an...

  18. Geographical, environmental and pathophysiological influences on the human blood transcriptome.

    PubMed

    Tabassum, Rubina; Nath, Artika; Preininger, Marcela; Gibson, Greg

    2013-12-01

    Gene expression variation provides a read-out of both genetic and environmental influences on gene activity. Geographical, genomic and sociogenomic studies have highlighted how life circumstances of an individual modify the expression of hundreds and in some cases thousands of genes in a co-ordinated manner. This review places such results in the context of a conserved set of 90 transcripts known as Blood Informative Transcripts (BIT) that capture the major conserved components of variation in the peripheral blood transcriptome. Pathophysiological states are also shown to associate with the perturbation of transcript abundance along the major axes. Discussion of false negative rates leads us to argue that simple significance thresholds provide a biased perspective on assessment of differential expression that may cloud the interpretation of studies with small sample sizes.

  19. Geographical, environmental and pathophysiological influences on the human blood transcriptome

    PubMed Central

    Tabassum, Rubina; Nath, Artika; Preininger, Marcela; Gibson, Greg

    2013-01-01

    Gene expression variation provides a read-out of both genetic and environmental influences on gene activity. Geographical, genomic and sociogenomic studies have highlighted how life circumstances of an individual modify the expression of hundreds and in some cases thousands of genes in a co-ordinated manner. This review places such results in the context of a conserved set of 90 transcripts known as Blood Informative Transcripts (BIT) that capture the major conserved components of variation in the peripheral blood transcriptome. Pathophysiological states are also shown to associate with the perturbation of transcript abundance along the major axes. Discussion of false negative rates leads us to argue that simple significance thresholds provide a biased perspective on assessment of differential expression that may cloud the interpretation of studies with small sample sizes. PMID:25830076

  20. qPCR for second year undergraduates: A short, structured inquiry to illustrate differential gene expression.

    PubMed

    McCauslin, Christine Seitz; Gunn, Kathryn Elaine; Pirone, Dana; Staiger, Jennifer

    2015-01-01

    We describe a structured inquiry laboratory exercise that examines transcriptional regulation of the NOS2 gene under conditions that simulate the inflammatory response in macrophages. Using quantitative PCR and the comparative CT method, students are able to determine whether transcriptional activation of NOS2 occurs and to what degree. The exercise is aimed at second year undergraduates who possess basic knowledge of gene expression events. It requires only 4-5 hr of dedicated laboratory time and focuses on use of the primary literature, data analysis, and interpretation. Importantly, this exercise provides a mechanism to introduce the concept of differential gene expression and provides a starting point for development of more complex guided or open inquiry projects for students moving into upper level molecular biology, immunology, and biochemistry course work. © 2015 The International Union of Biochemistry and Molecular Biology.
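
    The comparative CT method mentioned in this record is usually written as a fold change of 2^(-ΔΔCT). The sketch below illustrates that calculation, assuming roughly 100% amplification efficiency for both primer pairs. The function name, the choice of reference gene, and the CT values are hypothetical and do not come from the exercise itself.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the comparative CT (2^-ddCT) method.

    Assumption: both primer pairs amplify with ~100% efficiency, so one
    cycle corresponds to a two-fold change in template.
    """
    # Normalize the target gene to the reference gene in each condition.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample.
    ddct = delta_treated - delta_control
    return 2.0 ** (-ddct)


# Hypothetical CT values: the target crosses threshold 5 cycles earlier
# in stimulated cells, while the reference gene is unchanged.
fold = fold_change_ddct(
    ct_target_treated=22.0, ct_ref_treated=18.0,
    ct_target_control=27.0, ct_ref_control=18.0,
)
```

    A fold value above 1 indicates higher target expression in the treated sample relative to the control, and a value below 1 indicates lower expression.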

  1. Cardiogenic Genes Expressed in Cardiac Fibroblasts Contribute to Heart Development and Repair

    PubMed Central

    Furtado, Milena B.; Costa, Mauro W.; Pranoto, Edward Adi; Salimova, Ekaterina; Pinto, Alex; Lam, Nicholas T.; Park, Anthony; Snider, Paige; Chandran, Anjana; Harvey, Richard P.; Boyd, Richard; Conway, Simon J.; Pearson, James; Kaye, David M.; Rosenthal, Nadia A.

    2014-01-01

    Rationale Cardiac fibroblasts are critical to proper heart function through multiple interactions with the myocardial compartment but appreciation of their contribution has suffered from incomplete characterization and lack of cell-specific markers. Objective To generate an unbiased comparative gene expression profile of the cardiac fibroblast pool, identify and characterize the role of key genes in cardiac fibroblast function, and determine their contribution to myocardial development and regeneration. Methods and Results High-throughput cell surface and intracellular profiling of cardiac and tail fibroblasts identified canonical MSC and a surprising number of cardiogenic genes, some expressed at higher levels than in whole heart. Whilst genetically marked fibroblasts contributed heterogeneously to interstitial but not cardiomyocyte compartments in infarcted hearts, fibroblast-restricted depletion of one highly expressed cardiogenic marker, Tbx20, caused marked myocardial dysmorphology and perturbations in scar formation upon myocardial infarction. Conclusions The surprising transcriptional identity of cardiac fibroblasts, the adoption of cardiogenic gene programs and direct contribution to cardiac development and repair provokes alternative interpretations for studies on more specialized cardiac progenitors, offering a novel perspective for reinterpreting cardiac regenerative therapies. PMID:24650916

  2. iCOSSY: An Online Tool for Context-Specific Subnetwork Discovery from Gene Expression Data

    PubMed Central

    Saha, Ashis; Jeon, Minji; Tan, Aik Choon; Kang, Jaewoo

    2015-01-01

    Pathway analyses help reveal underlying molecular mechanisms of complex biological phenotypes. Biologists tend to perform multiple pathway analyses on the same dataset, as there is no single answer. It is often inefficient for them to implement and/or install all the algorithms by themselves. Online tools can help the community in this regard. Here we present an online gene expression analytical tool called iCOSSY which implements a novel pathway-based COntext-specific Subnetwork discoverY (COSSY) algorithm. iCOSSY also includes a few modifications of COSSY to increase its reliability and interpretability. Users can upload their gene expression datasets, and discover important subnetworks of closely interacting molecules to differentiate between two phenotypes (context). They can also interactively visualize the resulting subnetworks. iCOSSY is a web server that finds subnetworks that are differentially expressed in two phenotypes. Users can visualize the subnetworks to understand the biology of the difference. PMID:26147457

  3. Novel gene sets improve set-level classification of prokaryotic gene expression data.

    PubMed

    Holec, Matěj; Kuželka, Ondřej; Železný, Filip

    2015-10-28

    Set-level classification of gene expression data has received significant attention recently. In this setting, high-dimensional vectors of features corresponding to genes are converted into lower-dimensional vectors of features corresponding to biologically interpretable gene sets. The dimensionality reduction brings the promise of a decreased risk of overfitting, potentially resulting in improved accuracy of the learned classifiers. However, recent empirical research has not confirmed this expectation. Here we hypothesize that the reported unfavorable classification results in the set-level framework were due to the adoption of unsuitable gene sets defined typically on the basis of the Gene ontology and the KEGG database of metabolic networks. We explore an alternative approach to defining gene sets, based on regulatory interactions, which we expect to collect genes with more correlated expression. We hypothesize that such more correlated gene sets will enable more accurate classifiers to be learned. We define two families of gene sets using information on regulatory interactions, and evaluate them on phenotype-classification tasks using public prokaryotic gene expression data sets. From each of the two gene-set families, we first select the best-performing subtype. The two selected subtypes are then evaluated on independent (testing) data sets against state-of-the-art gene sets and against the conventional gene-level approach. The novel gene sets are indeed more correlated than the conventional ones, and lead to significantly more accurate classifiers. Novel gene sets defined on the basis of regulatory interactions improve set-level classification of gene expression data. The experimental scripts and other material needed to reproduce the experiments are available at http://ida.felk.cvut.cz/novelgenesets.tar.gz.

  4. Massive-scale gene co-expression network construction and robustness testing using random matrix theory.

    PubMed

    Gibson, Scott M; Ficklin, Stephen P; Isaacson, Sven; Luo, Feng; Feltus, Frank A; Smith, Melissa C

    2013-01-01

    The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built using microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While it has been shown that co-expression networks are biologically relevant, it has not been determined to what extent any given network is functionally robust given perturbations in the input sample set. For such a test, hundreds of networks are needed and hence a tool to rapidly construct these networks. To examine functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested functional robustness of human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae). We demonstrate dramatic decrease in network construction time and computational requirements and show that despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust.
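
    The network construction described in this record starts from pairwise similarity between expression profiles and keeps only the pairs above a threshold. The sketch below illustrates that general idea with Pearson correlation and a fixed, hand-picked cutoff. It does not implement Random Matrix Theory thresholding, which selects the cutoff from the data. The gene names, profiles, and cutoff value are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_edges(profiles, cutoff):
    """Return gene pairs whose absolute correlation meets the cutoff.

    Assumption: a fixed cutoff stands in for the RMT-derived threshold
    described in the abstract.
    """
    return sorted(
        (g1, g2)
        for g1, g2 in combinations(sorted(profiles), 2)
        if abs(pearson(profiles[g1], profiles[g2])) >= cutoff
    )


# Hypothetical expression profiles across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [4.0, 1.0, 3.5, 2.0],
}
edges = coexpression_edges(profiles, cutoff=0.9)
```

    Repeating this construction on resampled subsets of the input samples is one way to picture the robustness test described above, since each subset yields its own set of edges to compare.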

  5. The Genomic and Transcriptomic Landscape of a HeLa Cell Line

    PubMed Central

    Landry, Jonathan J. M.; Pyl, Paul Theodor; Rausch, Tobias; Zichner, Thomas; Tekkedil, Manu M.; Stütz, Adrian M.; Jauch, Anna; Aiyar, Raeka S.; Pau, Gregoire; Delhomme, Nicolas; Gagneur, Julien; Korbel, Jan O.; Huber, Wolfgang; Steinmetz, Lars M.

    2013-01-01

    HeLa is the most widely used model cell line for studying human cellular and molecular biology. To date, no genomic reference for this cell line has been released, and experiments have relied on the human reference genome. Effective design and interpretation of molecular genetic studies performed using HeLa cells require accurate genomic information. Here we present a detailed genomic and transcriptomic characterization of a HeLa cell line. We performed DNA and RNA sequencing of a HeLa Kyoto cell line and analyzed its mutational portfolio and gene expression profile. Segmentation of the genome according to copy number revealed a remarkably high level of aneuploidy and numerous large structural variants at unprecedented resolution. Some of the extensive genomic rearrangements are indicative of catastrophic chromosome shattering, known as chromothripsis. Our analysis of the HeLa gene expression profile revealed that several pathways, including cell cycle and DNA repair, exhibit significantly different expression patterns from those in normal human tissues. Our results provide the first detailed account of genomic variants in the HeLa genome, yielding insight into their impact on gene expression and cellular function as well as their origins. This study underscores the importance of accounting for the strikingly aberrant characteristics of HeLa cells when designing and interpreting experiments, and has implications for the use of HeLa as a model of human biology. PMID:23550136

  6. Identification of internal control genes for quantitative expression analysis by real-time PCR in bovine peripheral lymphocytes.

    PubMed

    Spalenza, Veronica; Girolami, Flavia; Bevilacqua, Claudia; Riondato, Fulvio; Rasero, Roberto; Nebbia, Carlo; Sacchi, Paola; Martin, Patrice

    2011-09-01

    Gene expression studies in blood cells, particularly lymphocytes, are useful for monitoring potential exposure to toxicants or environmental pollutants in humans and livestock species. Quantitative PCR is the method of choice for obtaining accurate quantification of mRNA transcripts although variations in the amount of starting material, enzymatic efficiency, and the presence of inhibitors can lead to evaluation errors. As a result, normalization of data is of crucial importance. The most common approach is the use of endogenous reference genes as an internal control, whose expression should ideally not vary among individuals and under different experimental conditions. The accurate selection of reference genes is therefore an important step in interpreting quantitative PCR studies. Since no systematic investigation in bovine lymphocytes has been performed, the aim of the present study was to assess the expression stability of seven candidate reference genes in circulating lymphocytes collected from 15 dairy cows. Following the characterization by flow cytometric analysis of the cell populations obtained from blood through a density gradient procedure, three popular software packages were used to evaluate the gene expression data. The results showed that two genes are sufficient for normalization of quantitative PCR studies in cattle lymphocytes and that YWAHZ, S24 and PPIA are the most stable genes. Copyright © 2010 Elsevier Ltd. All rights reserved.

  7. Large Sex Differences in Chicken Behavior and Brain Gene Expression Coincide with Few Differences in Promoter DNA-Methylation

    PubMed Central

    Nätt, Daniel; Agnvall, Beatrix; Jensen, Per

    2014-01-01

    While behavioral sex differences have repeatedly been reported across taxa, the underlying epigenetic mechanisms in the brain are mostly lacking. Birds have previously been shown to have only limited dosage compensation, leading to high sex bias of Z-chromosome gene expression. In chickens, a male hyper-methylated region (MHM) on the Z-chromosome has been associated with a local type of dosage compensation, but a more detailed characterization of the avian methylome is limiting our interpretations. Here we report an analysis of genome wide sex differences in promoter DNA-methylation and gene expression in the brain of three weeks old chickens, and associated sex differences in behavior of Red Junglefowl (ancestor of domestic chickens). Combining DNA-methylation tiling arrays with gene expression microarrays we show that a specific locus of the MHM region, together with the promoter for the zinc finger RNA binding protein (ZFR) gene on chromosome 1, is strongly associated with sex dimorphism in gene expression. Except for this, we found few differences in promoter DNA-methylation, even though hundreds of genes were robustly differentially expressed across distantly related breeds. Several of the differentially expressed genes are known to affect behavior, and as suggested from their functional annotation, we found that female Red Junglefowl are more explorative and fearful in a range of tests performed throughout their lives. This paper identifies new sites and, with increased resolution, confirms known sites where DNA-methylation seems to affect sexually dimorphic gene expression, but the general lack of this association is noticeable and strengthens the view that birds do not have dosage compensation. PMID:24782041

  8. A Pathway Based Classification Method for Analyzing Gene Expression for Alzheimer's Disease Diagnosis.

    PubMed

    Voyle, Nicola; Keohane, Aoife; Newhouse, Stephen; Lunnon, Katie; Johnston, Caroline; Soininen, Hilkka; Kloszewska, Iwona; Mecocci, Patrizia; Tsolaki, Magda; Vellas, Bruno; Lovestone, Simon; Hodges, Angela; Kiddle, Steven; Dobson, Richard Jb

    2016-01-01

    Recent studies indicate that gene expression levels in blood may be able to differentiate subjects with Alzheimer's disease (AD) from normal elderly controls and mild cognitively impaired (MCI) subjects. However, there is limited replicability at the single marker level. A pathway-based interpretation of gene expression may prove more robust. This study aimed to investigate whether a case/control classification model built on pathway level data was more robust than a gene level model and may consequently perform better in test data. The study used two batches of gene expression data from the AddNeuroMed (ANM) and Dementia Case Registry (DCR) cohorts. Our study used Illumina Human HT-12 Expression BeadChips to collect gene expression from blood samples. Random forest modeling with recursive feature elimination was used to predict case/control status. Age and APOE ɛ4 status were used as covariates for all analysis. Gene and pathway level models performed similarly to each other and to a model based on demographic information only. Any potential increase in concordance from the novel pathway level approach used here has not led to a greater predictive ability in these datasets. However, we have only tested one method for creating pathway level scores. Further, we have been able to benchmark pathways against genes in datasets that had been extensively harmonized. Further work should focus on the use of alternative methods for creating pathway level scores, in particular those that incorporate pathway topology, and the use of an endophenotype based approach.

  9. Transcriptional Profiling of Caulobacter crescentus during Growth on Complex and Minimal Media

    PubMed Central

    Hottes, Alison K.; Meewan, Maliwan; Yang, Desiree; Arana, Naomi; Romero, Pedro; McAdams, Harley H.; Stephens, Craig

    2004-01-01

    Microarray analysis was used to examine gene expression in the freshwater oligotrophic bacterium Caulobacter crescentus during growth on three standard laboratory media, including peptone-yeast extract medium (PYE) and minimal salts medium with glucose or xylose as the carbon source. Nearly 400 genes (approximately 10% of the genome) varied significantly in expression between at least two of these media. The differentially expressed genes included many encoding transport systems, most notably diverse TonB-dependent outer membrane channels of unknown substrate specificity. Amino acid degradation pathways constituted the largest class of genes induced in PYE. In contrast, many of the genes upregulated in minimal media encoded enzymes for synthesis of amino acids, including incorporation of ammonia and sulfate into glutamate and cysteine. Glucose availability induced expression of genes encoding enzymes of the Entner-Doudoroff pathway, which was demonstrated here through mutational analysis to be essential in C. crescentus for growth on glucose. Xylose induced expression of genes encoding several hydrolytic exoenzymes as well as an operon that may encode a novel pathway for xylose catabolism. A conserved DNA motif upstream of many xylose-induced genes was identified and shown to confer xylose-specific expression. Xylose is an abundant component of xylan in plant cell walls, and the microarray data suggest that in addition to serving as a carbon source for growth of C. crescentus, this pentose may be interpreted as a signal to produce enzymes associated with plant polymer degradation. PMID:14973021

  10. Clustering gene expression data based on predicted differential effects of GV interaction.

    PubMed

    Pan, Hai-Yan; Zhu, Jun; Han, Dan-Fu

    2005-02-01

    Microarray has become a popular biotechnology in biological and medical research. However, systematic and stochastic variabilities in microarray data are expected and unavoidable, resulting in the problem that the raw measurements have inherent "noise" within microarray experiments. Currently, logarithmic ratios are usually analyzed by various clustering methods directly, which may introduce bias interpretation in identifying groups of genes or samples. In this paper, a statistical method based on mixed model approaches was proposed for microarray data cluster analysis. The underlying rationale of this method is to partition the observed total gene expression level into various variations caused by different factors using an ANOVA model, and to predict the differential effects of GV (gene by variety) interaction using the adjusted unbiased prediction (AUP) method. The predicted GV interaction effects can then be used as the inputs of cluster analysis. We illustrated the application of our method with a gene expression dataset and elucidated the utility of our approach using an external validation.

  11. RNA interference can target pre-mRNA: consequences for gene expression in a Caenorhabditis elegans operon.

    PubMed Central

    Bosher, J M; Dufourcq, P; Sookhareea, S; Labouesse, M

    1999-01-01

    In nematodes, flies, trypanosomes, and planarians, introduction of double-stranded RNA results in sequence-specific inactivation of gene function, a process termed RNA interference (RNAi). We demonstrate that RNAi against the Caenorhabditis elegans gene lir-1, which is part of the lir-1/lin-26 operon, induced phenotypes very different from a newly isolated lir-1 null mutation. Specifically, lir-1(RNAi) induced embryonic lethality reminiscent of moderately strong lin-26 alleles, whereas the lir-1 null mutant was viable. We show that the lir-1(RNAi) phenotypes resulted from a severe loss of lin-26 gene expression. In addition, we found that RNAi directed against lir-1 or lin-26 introns induced similar phenotypes, so we conclude that lir-1(RNAi) targets the lir-1/lin-26 pre-mRNA. This provides direct evidence that RNA interference can prevent gene expression by targeting nuclear transcripts. Our results highlight that caution may be necessary when interpreting RNA interference without the benefit of mutant alleles. PMID:10545456

  12. Sub-cellular mRNA localization modulates the regulation of gene expression by small RNAs in bacteria

    NASA Astrophysics Data System (ADS)

    Teimouri, Hamid; Korkmazhan, Elgin; Stavans, Joel; Levine, Erel

    2017-10-01

    Small non-coding RNAs can exert significant regulatory activity on gene expression in bacteria. In recent years, substantial progress has been made in understanding bacterial gene expression by sRNAs. However, recent findings that demonstrate that families of mRNAs show non-trivial sub-cellular distributions raise the question of how localization may affect the regulatory activity of sRNAs. Here we address this question within a simple mathematical model. We show that the non-uniform spatial distributions of mRNA can alter the threshold-linear response that characterizes sRNAs that act stoichiometrically, and modulate the hierarchy among targets co-regulated by the same sRNA. We also identify conditions where the sub-cellular organization of cofactors in the sRNA pathway can induce spatial heterogeneity on sRNA targets. Our results suggest that under certain conditions, interpretation and modeling of natural and synthetic gene regulatory circuits need to take into account the spatial organization of the transcripts of participating genes.

  13. Identification of differentially expressed genes in childhood asthma.

    PubMed

    Zhang, Nian-Zhen; Chen, Xiu-Juan; Mu, Yu-Hua; Wang, Hewen

    2018-05-01

    Asthma has been the most common chronic disease in children that places a major burden on affected people and their families. An integrated analysis of microarray studies was performed to identify differentially expressed genes (DEGs) in childhood asthma compared with normal control. We also obtained the differentially methylated genes (DMGs) in childhood asthma according to GEO. The genes that were both differentially expressed and differentially methylated were identified. Functional annotation and protein-protein interaction network construction were performed to interpret biological functions of DEGs. We performed q-RT-PCR to verify the expression of selected DEGs. One DNA methylation and 3 gene expression datasets were obtained. Four hundred forty-one DEGs and 1209 DMGs in childhood asthma were identified. Among these, 16 genes were both differentially expressed and differentially methylated in childhood asthma. Natural killer cell mediated cytotoxicity pathway, Jak-STAT signaling pathway, and Wnt signaling pathway were 3 significantly enriched pathways in childhood asthma according to our KEGG enrichment analysis. The PPI network of top 20 up- and downregulated DEGs consisted of 822 nodes and 904 edges and 2 hub proteins (UBQLN4 and MID2) were identified. The expression of 8 DEGs (GZMB, FGFBP2, CLC, TBX21, ALOX15, IL12RB2, UBQLN4) was verified by qRT-PCR and only the expression of GZMB and FGFBP2 was inconsistent with our integrated analysis. Our finding was helpful to elucidate the underlying mechanism of childhood asthma and develop new potential diagnostic biomarker and provide clues for drug design.

  14. Structural and physiological studies of the Escherichia coli histidine operon inserted into plasmid vectors.

    PubMed Central

    Bruni, C B; Musti, A M; Frunzio, R; Blasi, F

    1980-01-01

    A fragment of deoxyribonucleic acid 5,300 base pairs long, containing the promoter-proximal portion of the histidine operon of Escherichia coli K-12, has been cloned in plasmid pBR313 (plasmids pCB2 and pCB3). Restriction mapping, partial nucleotide sequencing, and studies on functional expression in vivo and on protein synthesis in minicells have shown that the fragment contains the regulatory region of the operon, the hisG and hisD genes, and part of the hisC gene. Another plasmid (pCB5) contained the hisG gene and part of the hisD gene. Expression of the hisG gene in the latter plasmid was under control of the tetracycline promoter of the pBR313 plasmid. The in vivo expression of the two groups of plasmids described above, as well as their effect on the expression of the histidine genes not carried by the plasmids but present on the host chromosome, has been studied. The presence of multiple copies of pCB2 or pCB3, but not of pCB5, prevented derepression of the chromosomal histidine operon. Possible interpretations of this phenomenon are discussed. PMID:6246067

  15. Metabolic modeling helps interpret transcriptomic changes during malaria.

    PubMed

    Tang, Yan; Gupta, Anuj; Garimalla, Swetha; Galinski, Mary R; Styczynski, Mark P; Fonseca, Luis L; Voit, Eberhard O

    2018-06-01

    Disease represents a specific case of malfunctioning within a complex system. Whereas it is often feasible to observe and possibly treat the symptoms of a disease, it is much more challenging to identify and characterize its molecular root causes. Even in infectious diseases that are caused by a known parasite, it is often impossible to pinpoint exactly which molecular profiles of components or processes are directly or indirectly altered. However, a deep understanding of such profiles is a prerequisite for rational, efficacious treatments. Modern omics methodologies are permitting large-scale scans of some molecular profiles, but these scans often yield results that are not intuitive and difficult to interpret. For instance, the comparison of healthy and diseased transcriptome profiles may point to certain sets of involved genes, but a host of post-transcriptional processes and regulatory mechanisms renders predictions regarding metabolic or physiological consequences of the observed changes in gene expression unreliable. Here we present proof of concept that dynamic models of metabolic pathway systems may offer a tool for interpreting transcriptomic profiles measured during disease. We illustrate this strategy with the interpretation of expression data of genes coding for enzymes associated with purine metabolism. These data were obtained during infections of rhesus macaques (Macaca mulatta) with the malaria parasite Plasmodium cynomolgi or P. coatneyi. The model-based interpretation reveals clear patterns of flux redistribution within the purine pathway that are consistent between the two malaria pathogens and are even reflected in data from humans infected with P. falciparum. This article is part of a Special Issue entitled: Accelerating Precision Medicine through Genetic and Genomic Big Data Analysis edited by Yudong Cai & Tao Huang.

  16. A structured sparse regression method for estimating isoform expression level from multi-sample RNA-seq data.

    PubMed

    Zhang, L; Liu, X J

    2016-06-03

    With the rapid development of next-generation high-throughput sequencing technology, RNA-seq has become a standard and important technique for transcriptome analysis. For multi-sample RNA-seq data, the existing expression estimation methods usually deal with each RNA-seq sample separately, and ignore the fact that read distributions are consistent across multiple samples. In the current study, we propose a structured sparse regression method, SSRSeq, to estimate isoform expression using multi-sample RNA-seq data. SSRSeq uses a non-parametric model to capture the general tendency of the non-uniform read distribution for all genes across multiple samples. Additionally, our method adds a structured sparse regularization, which not only incorporates the sparse specificity between a gene and its corresponding isoform expression levels, but also reduces the effects of noisy reads, especially for lowly expressed genes and isoforms. Four real datasets were used to evaluate our method on isoform expression estimation. Compared with other popular methods, SSRSeq reduced the variance between multiple samples and produced more accurate isoform expression estimates, and thus more meaningful biological interpretations.

  17. Positional signaling mediated by a receptor-like kinase in Arabidopsis.

    PubMed

    Kwak, Su-Hwan; Shen, Ronglai; Schiefelbein, John

    2005-02-18

    The position-dependent specification of root epidermal cells in Arabidopsis provides an elegant paradigm for cell patterning during development. Here, we describe a new gene, SCRAMBLED (SCM), required for cells to appropriately interpret their location within the developing root epidermis. SCM encodes a receptor-like kinase protein with a predicted extracellular domain of six leucine-rich repeats and an intracellular serine-threonine kinase domain. SCM regulates the expression of the GLABRA2, CAPRICE, WEREWOLF, and ENHANCER OF GLABRA3 transcription factor genes that define the cell fates. Further, the SCM gene is expressed throughout the developing root. Therefore, SCM likely enables developing epidermal cells to detect positional cues and establish an appropriate cell-type pattern.

  18. An integrated method for cancer classification and rule extraction from microarray data

    PubMed Central

    Huang, Liang-Tsung

    2009-01-01

    Different microarray techniques have recently been used successfully to investigate useful information for cancer diagnosis at the gene expression level due to their ability to measure thousands of gene expression levels in a massively parallel way. One important issue is to improve classification performance of microarray data. However, it would be ideal that influential genes and even interpretable rules can be explored at the same time to offer biological insight. Introducing the concepts of system design in software engineering, this paper has presented an integrated and effective method (named X-AI) for accurate cancer classification and the acquisition of knowledge from DNA microarray data. This method included a feature selector to systematically extract the relatively important genes so as to reduce the dimension and retain as much as possible of the class discriminatory information. Next, diagonal quadratic discriminant analysis (DQDA) was combined to classify tumors, and generalized rule induction (GRI) was integrated to establish association rules which can give an understanding of the relationships between cancer classes and related genes. Two non-redundant datasets of acute leukemia were used to validate the proposed X-AI, showing significantly high accuracy for discriminating different classes. On the other hand, I have presented the abilities of X-AI to extract relevant genes, as well as to develop interpretable rules. Further, a web server has been established for cancer classification and it is freely available at . PMID:19272192
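
    The DQDA classification step named in this abstract can be illustrated with a minimal sketch. It assumes the textbook form of diagonal quadratic discriminant analysis (class-specific means and per-gene variances, no covariances); the function names and toy data are hypothetical, and this is not the X-AI implementation.

```python
import math

def fit_dqda(samples, labels):
    """Estimate per-class means, per-gene variances, and priors.

    Assumes at least two training samples per class.
    """
    model = {}
    n_total = len(samples)
    for cls in sorted(set(labels)):
        rows = [x for x, y in zip(samples, labels) if y == cls]
        n = len(rows)
        n_genes = len(rows[0])
        means = [sum(r[j] for r in rows) / n for j in range(n_genes)]
        # Unbiased per-gene variance; a small floor avoids division by zero.
        variances = [
            max(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1), 1e-8)
            for j in range(n_genes)
        ]
        model[cls] = (means, variances, n / n_total)
    return model

def predict_dqda(model, x):
    """Return the class with the smallest diagonal quadratic score."""
    def score(params):
        means, variances, prior = params
        quad = sum(
            (xj - m) ** 2 / v + math.log(v)
            for xj, m, v in zip(x, means, variances)
        )
        return quad - 2.0 * math.log(prior)
    return min(model, key=lambda cls: score(model[cls]))

# Toy expression profiles (two genes, two hypothetical classes "A" and "B").
train = [[1.0, 5.0], [1.2, 5.5], [0.8, 4.5], [4.0, 1.0], [4.5, 1.2], [3.5, 0.8]]
labels = ["A", "A", "A", "B", "B", "B"]
model = fit_dqda(train, labels)
```

    Because only diagonal variances are estimated, the number of parameters grows linearly with the number of genes, which is one reason this family of classifiers is often used when genes far outnumber samples.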

  19. Lotus Base: An integrated information portal for the model legume Lotus japonicus

    PubMed Central

    Mun, Terry; Bachmann, Asger; Gupta, Vikas; Stougaard, Jens; Andersen, Stig U.

    2016-01-01

    Lotus japonicus is a well-characterized model legume widely used in the study of plant-microbe interactions. However, datasets from various Lotus studies are poorly integrated and lack interoperability. We recognize the need for a comprehensive repository that allows comprehensive and dynamic exploration of Lotus genomic and transcriptomic data. Equally important are user-friendly in-browser tools designed for data visualization and interpretation. Here, we present Lotus Base, which opens to the research community a large, established LORE1 insertion mutant population containing an excess of 120,000 lines, and serves the end-user tightly integrated data from Lotus, such as the reference genome, annotated proteins, and expression profiling data. We report the integration of expression data from the L. japonicus gene expression atlas project, and the development of tools to cluster and export such data, allowing users to construct, visualize, and annotate co-expression gene networks. Lotus Base takes advantage of modern advances in browser technology to deliver powerful data interpretation for biologists. Its modular construction and publicly available application programming interface enable developers to tap into the wealth of integrated Lotus data. Lotus Base is freely accessible at: https://lotus.au.dk. PMID:28008948

  20. Prediction of Microbial Infection of Cultured Cells Using DNA Microarray Gene-Expression Profiles of Host Responses

    PubMed Central

    Park, Yu Rang; Chung, Tae Su; Lee, Young Joo; Song, Yeong Wook; Lee, Eun Young; Sohn, Yeo Won; Song, Sukgil; Park, Woong Yang

    2012-01-01

    Infection by microorganisms may cause fatally erroneous interpretations in biological research based on cell culture. Contamination of cell cultures by microorganisms is quite frequent (5% to 35%). However, current approaches to identify the presence of contamination have many limitations, such as high cost of time and labor, and difficulty in interpreting the result. In this paper, we propose a model to predict cell infection, using a microarray technique which gives an overview of the whole genome profile. By analysis of 62 microarray expression profiles under various experimental conditions altering cell type, source of infection and collection time, we discovered 5 marker genes, NM_005298, NM_016408, NM_014588, S76389, and NM_001853. In addition, we discovered that two of these genes, S76389 and NM_001853, are involved in a Mycoplasma-specific infection process. We also suggest models to predict the source of infection, cell type or time after infection. We implemented a web-based prediction tool for microarray data, named Prediction of Microbial Infection (http://www.snubi.org/software/PMI). PMID:23091307

  1. A hybrid approach of gene sets and single genes for the prediction of survival risks with gene expression data.

    PubMed

    Seok, Junhee; Davis, Ronald W; Xiao, Wenzhong

    2015-01-01

    Accumulated biological knowledge is often encoded as gene sets, collections of genes associated with similar biological functions or pathways. The use of gene sets in the analyses of high-throughput gene expression data has been intensively studied and applied in clinical research. However, the main interest remains in finding modules of biological knowledge, or corresponding gene sets, significantly associated with disease conditions. Risk prediction from censored survival times using gene sets hasn't been well studied. In this work, we propose a hybrid method that uses both single gene and gene set information together to predict patient survival risks from gene expression profiles. In the proposed method, gene sets provide context-level information that is poorly reflected by single genes. Complementarily, single genes help to supplement incomplete information of gene sets due to our imperfect biomedical knowledge. Through the tests over multiple data sets of cancer and trauma injury, the proposed method showed robust and improved performance compared with the conventional approaches with only single genes or gene sets solely. Additionally, we examined the prediction result in the trauma injury data, and showed that the modules of biological knowledge used in the prediction by the proposed method were highly interpretable in biology. A wide range of survival prediction problems in clinical genomics is expected to benefit from the use of biological knowledge.

  2. A Hybrid Approach of Gene Sets and Single Genes for the Prediction of Survival Risks with Gene Expression Data

    PubMed Central

    Seok, Junhee; Davis, Ronald W.; Xiao, Wenzhong

    2015-01-01

    Accumulated biological knowledge is often encoded as gene sets, collections of genes associated with similar biological functions or pathways. The use of gene sets in the analyses of high-throughput gene expression data has been intensively studied and applied in clinical research. However, the main interest remains in finding modules of biological knowledge, or corresponding gene sets, significantly associated with disease conditions. Risk prediction from censored survival times using gene sets hasn’t been well studied. In this work, we propose a hybrid method that uses both single gene and gene set information together to predict patient survival risks from gene expression profiles. In the proposed method, gene sets provide context-level information that is poorly reflected by single genes. Complementarily, single genes help to supplement incomplete information of gene sets due to our imperfect biomedical knowledge. Through the tests over multiple data sets of cancer and trauma injury, the proposed method showed robust and improved performance compared with the conventional approaches with only single genes or gene sets solely. Additionally, we examined the prediction result in the trauma injury data, and showed that the modules of biological knowledge used in the prediction by the proposed method were highly interpretable in biology. A wide range of survival prediction problems in clinical genomics is expected to benefit from the use of biological knowledge. PMID:25933378

  3. Amniotic fluid RNA gene expression profiling provides insights into the phenotype of Turner syndrome.

    PubMed

    Massingham, Lauren J; Johnson, Kirby L; Scholl, Thomas M; Slonim, Donna K; Wick, Heather C; Bianchi, Diana W

    2014-09-01

    Turner syndrome is a sex chromosome aneuploidy with characteristic malformations. Amniotic fluid, a complex biological material, could contribute to the understanding of Turner syndrome pathogenesis. In this pilot study, global gene expression analysis of cell-free RNA in amniotic fluid supernatant was utilized to identify specific genes/organ systems that may play a role in Turner syndrome pathophysiology. Cell-free RNA from amniotic fluid of five mid-trimester Turner syndrome fetuses and five euploid female fetuses matched for gestational age was extracted, amplified, and hybridized onto Affymetrix® U133 Plus 2.0 arrays. Significantly differentially regulated genes were identified using paired t tests. Biological interpretation was performed using Ingenuity Pathway Analysis and BioGPS gene expression atlas. There were 470 statistically significantly differentially expressed genes identified. They were widely distributed across the genome. XIST was significantly down-regulated (p < 0.0001); SHOX was not differentially expressed. One of the most highly represented organ systems was the hematologic/immune system, distinguishing the Turner syndrome transcriptome from other aneuploidies we previously studied. Manual curation of the differentially expressed gene list identified genes of possible pathologic significance, including NFATC3, IGFBP5, and LDLR. Transcriptomic differences in the amniotic fluid of Turner syndrome fetuses are due to genome-wide dysregulation. The hematologic/immune system differences may play a role in early-onset autoimmune dysfunction. Other genes identified with possible pathologic significance are associated with cardiac and skeletal systems, which are known to be affected in females with Turner syndrome. The discovery-driven approach described here may be useful in elucidating novel mechanisms of disease in Turner syndrome.
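
    The paired t test used in this abstract for matched case and control samples can be sketched as follows for a single gene. The function name and the expression values are hypothetical, and the sketch returns only the t statistic and degrees of freedom; obtaining a p-value would additionally require the t distribution, which the Python standard library does not provide.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(case, control):
    """Return (t, degrees_of_freedom) for matched pairs of measurements."""
    if len(case) != len(control) or len(case) < 2:
        raise ValueError("need at least two matched pairs")
    # The test operates on within-pair differences, not on the raw groups.
    diffs = [a - b for a, b in zip(case, control)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1

# Hypothetical log-expression values for one gene in five matched pairs.
case = [5.0, 6.0, 7.0, 8.0, 9.0]
control = [4.0, 5.5, 5.0, 7.5, 7.0]
t_value, dof = paired_t_statistic(case, control)
```

    With five matched pairs, as in the study design described above, the statistic has four degrees of freedom.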

  4. Detection of changes in gene regulatory patterns, elicited by perturbations of the Hsp90 molecular chaperone complex, by visualizing multiple experiments with an animation

    PubMed Central

    2011-01-01

    Background To make sense out of gene expression profiles, such analyses must be pushed beyond the mere listing of affected genes. For example, if a group of genes persistently display similar changes in expression levels under particular experimental conditions, and the proteins encoded by these genes interact and function in the same cellular compartments, this could be taken as very strong indicators for co-regulated protein complexes. One of the key requirements is having appropriate tools to detect such regulatory patterns. Results We have analyzed the global adaptations in gene expression patterns in the budding yeast when the Hsp90 molecular chaperone complex is perturbed either pharmacologically or genetically. We integrated these results with publicly accessible expression, protein-protein interaction and intracellular localization data. But most importantly, all experimental conditions were simultaneously and dynamically visualized with an animation. This critically facilitated the detection of patterns of gene expression changes that suggested underlying regulatory networks that a standard analysis by pairwise comparison and clustering could not have revealed. Conclusions The results of the animation-assisted detection of changes in gene regulatory patterns make predictions about the potential roles of Hsp90 and its co-chaperone p23 in regulating whole sets of genes. The simultaneous dynamic visualization of microarray experiments, represented in networks built by integrating one's own experimental with publicly accessible data, represents a powerful discovery tool that allows the generation of new interpretations and hypotheses. PMID:21672238

  5. Pregnancy-induced gene expression changes in vivo among women with rheumatoid arthritis: a pilot study.

    PubMed

    Goin, Dana E; Smed, Mette Kiel; Pachter, Lior; Purdom, Elizabeth; Nelson, J Lee; Kjærgaard, Hanne; Olsen, Jørn; Hetland, Merete Lund; Zoffmann, Vibeke; Ottesen, Bent; Jawaheer, Damini

    2017-05-25

    Little is known about gene expression changes induced by pregnancy in women with rheumatoid arthritis (RA) and healthy women because the few studies previously conducted did not have pre-pregnancy samples available as baseline. We have established a cohort of women with RA and healthy women followed prospectively from a pre-pregnancy baseline. In this study, we tested the hypothesis that pregnancy-induced changes in gene expression among women with RA who improve during pregnancy (pregDAS improved) overlap substantially with changes observed among healthy women and differ from changes observed among women with RA who worsen during pregnancy (pregDAS worse). Global gene expression profiles were generated by RNA sequencing (RNA-seq) from 11 women with RA and 5 healthy women before pregnancy (T0) and at the third trimester (T3). Among the women with RA, eight showed an improvement in disease activity by T3, whereas three worsened. Differential expression analysis was used to identify genes demonstrating significant changes in expression within each of the RA and healthy groups (T3 vs T0), as well as between the groups at each time point. Gene set enrichment was assessed in terms of Gene Ontology processes and protein networks. A total of 1296 genes were differentially expressed between T3 and T0 among the 8 pregDAS improved women, with 161 genes showing at least two-fold change (FC) in expression by T3. The majority (108 of 161 genes) were also differentially expressed among healthy women (q<0.05, FC≥2). Additionally, a small cluster of genes demonstrated contrasting changes in expression between the pregDAS improved and pregDAS worse groups, all of which were inducible by type I interferon (IFN). These IFN-inducible genes were over-expressed at T3 compared to the T0 baseline among the pregDAS improved women. In our pilot RNA-seq dataset, increased pregnancy-induced expression of type I IFN-inducible genes was observed among women with RA who improved during pregnancy, but not among women who worsened. These findings warrant further investigation into expression of these genes in RA pregnancy and their potential role in modulation of disease activity. These results are nevertheless preliminary and should be interpreted with caution until replicated in a larger sample.
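
    The selection criteria quoted in this abstract (q<0.05, FC≥2) can be illustrated with a short sketch. It assumes that q-values come from the Benjamini-Hochberg procedure and that a two-fold change counts in either direction on a T3/T0 ratio scale; neither detail is stated in the abstract, and the gene names and numbers are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

def select_genes(genes, p_values, fold_changes, q_cut=0.05, fc_cut=2.0):
    """Keep genes with q < q_cut and at least fc_cut-fold change, up or down."""
    q = benjamini_hochberg(p_values)
    return [
        g for g, qv, fc in zip(genes, q, fold_changes)
        if qv < q_cut and (fc >= fc_cut or fc <= 1.0 / fc_cut)
    ]

# Hypothetical genes with raw p-values and T3/T0 expression ratios.
genes = ["g1", "g2", "g3", "g4"]
p_values = [0.001, 0.01, 0.03, 0.5]
fold_changes = [3.0, 1.5, 0.4, 4.0]
selected = select_genes(genes, p_values, fold_changes)
```

    Here g2 passes the q-value cutoff but not the fold-change cutoff, and g4 shows a large fold change without statistical support, so neither is selected.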

  6. Expression of interest: transcriptomics and the designation of conservation units.

    PubMed

    Hansen, Michael M

    2010-05-01

    An important task within conservation genetics consists in defining intraspecific conservation units. Most conceptual frameworks involve two steps: (i) identifying demographically independent units, and (ii) evaluating their degree of adaptive divergence. Whereas a plethora of methods are available for delineating genetic population structure, assessment of functional genetic divergence remains a challenge. In this issue, Tymchuk et al. (2010) study Atlantic salmon (Salmo salar) populations using both microsatellite markers and analysis of global gene expression. They show that important gene expression differences exist that can be interpreted in the context of different ecological conditions experienced by the populations, along with the populations' histories. This demonstrates an important potential role of transcriptomics for designating conservation units.

  7. Hierarchical Dirichlet process model for gene expression clustering

    PubMed Central

    2013-01-01

    Clustering is an important data processing tool for interpreting microarray data and genomic network inference. In this article, we propose a clustering algorithm based on the hierarchical Dirichlet processes (HDP). The HDP clustering introduces a hierarchical structure in the statistical model which captures the hierarchical features prevalent in biological data such as gene expression data. We develop a Gibbs sampling algorithm based on the Chinese restaurant metaphor for the HDP clustering. We apply the proposed HDP algorithm to both regulatory network segmentation and gene expression clustering. The HDP algorithm is shown to outperform several popular clustering algorithms by revealing the underlying hierarchical structure of the data. For the yeast cell cycle data, we compare the HDP result to the standard result and show that the HDP algorithm provides more information and reduces the unnecessary clustering fragments. PMID:23587447
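
    The Chinese restaurant metaphor mentioned in this abstract can be illustrated by sampling cluster assignments from a single-level Chinese restaurant process. This is a deliberately simplified sketch: it draws from the prior only, with no data likelihood, no hierarchy, and no Gibbs updates, so it is not the HDP algorithm of the article. The function name and the concentration parameter alpha are illustrative.

```python
import random

def chinese_restaurant_process(n_customers, alpha, rng):
    """Sample table (cluster) assignments from a CRP prior with alpha > 0."""
    assignments = []
    table_sizes = []
    for _ in range(n_customers):
        # An existing table is chosen in proportion to its size,
        # and a new table in proportion to alpha.
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)  # open a new table
        else:
            table_sizes[table] += 1
        assignments.append(table)
    return assignments, table_sizes

# Seeded generator so the draw is reproducible.
assignments, table_sizes = chinese_restaurant_process(50, 1.0, random.Random(0))
```

    The number of occupied tables is not fixed in advance, which is the property that lets Dirichlet process models infer the number of clusters from the data.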

  8. Molecular Dissection of a Major Gene Effect on a Quantitative Trait: The Level of Alcohol Dehydrogenase Expression in Drosophila Melanogaster

    PubMed Central

    Stam, L. F.; Laurie, C. C.

    1996-01-01

    A molecular mapping experiment shows that a major gene effect on a quantitative trait, the level of alcohol dehydrogenase expression in Drosophila melanogaster, is due to multiple polymorphisms within the Adh gene. These polymorphisms are located in an intron, the coding sequence, and the 3' untranslated region. Because of nonrandom associations among polymorphisms at different sites, the individual effects combine (in some cases epistatically) to produce "superalleles" with large effect. These results have implications for the interpretation of major gene effects detected by quantitative trait locus mapping methods. They show that large effects due to a single locus may be due to multiple associated polymorphisms (or sequential fixations in isolated populations) rather than individual mutations of large effect. PMID:8978044

  9. Identification of potential transcriptomic markers in developing pediatric sepsis: a weighted gene co-expression network analysis and a case-control validation study.

    PubMed

    Li, Yiping; Li, Yanhong; Bai, Zhenjiang; Pan, Jian; Wang, Jian; Fang, Fang

    2017-12-13

    Sepsis represents a complex disease with a dysregulated inflammatory response and a high mortality rate. The goal of this study was to identify potential transcriptomic markers in developing pediatric sepsis by a co-expression module analysis of the transcriptomic dataset. Using the R software and Bioconductor packages, we performed a weighted gene co-expression network analysis to identify co-expression modules significantly associated with pediatric sepsis. Functional interpretation (gene ontology and pathway analysis) and enrichment analysis with known transcription factors and microRNAs of the identified candidate modules were then performed. In modules significantly associated with sepsis, the intramodular analysis was further performed and "hub genes" were identified and validated by quantitative real-time PCR (qPCR) in this study. In total, 15 co-expression modules were detected, and four modules ("midnight blue", "cyan", "brown", and "tan") were most significantly associated with pediatric sepsis and suggested as potential sepsis-associated modules. Gene ontology analysis and pathway analysis revealed that these four modules were strongly associated with immune response. Three of the four sepsis-associated modules were also enriched with known transcription factors (false discovery rate-adjusted P < 0.05). Hub genes were identified in each of the four modules. Four of the identified hub genes (MYB proto-oncogene like 1, killer cell lectin like receptor G1, stomatin, and membrane spanning 4-domains A4A) were further validated to be differentially expressed between septic children and controls by qPCR. Four pediatric sepsis-associated co-expression modules were identified in this study. qPCR results suggest that hub genes in these modules are potential transcriptomic markers for pediatric sepsis diagnosis. These results provide novel insights into the pathogenesis of pediatric sepsis and promote the generation of diagnostic gene sets.
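
    The core of a weighted gene co-expression network, and the connectivity measure behind "hub genes", can be sketched in a few lines. The study itself used R and Bioconductor packages; the Python sketch below assumes an unsigned network in which adjacency is the absolute Pearson correlation raised to a soft-threshold power beta. The value beta=6, the function names, and the expression values are illustrative assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant expression profile")
    return sxy / math.sqrt(sxx * syy)

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: |cor(i, j)| ** beta, with 1 on the diagonal."""
    genes = list(expr)
    return {
        a: {b: 1.0 if a == b else abs(pearson(expr[a], expr[b])) ** beta
            for b in genes}
        for a in genes
    }

def connectivity(adj):
    """Sum of a gene's adjacencies to all other genes; hubs rank highest."""
    return {g: sum(v for h, v in row.items() if h != g) for g, row in adj.items()}

# Hypothetical expression of three genes across four samples.
expr = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 1, 3, 2]}
adj = soft_threshold_adjacency(expr)
k = connectivity(adj)
```

    Raising correlations to a power suppresses weak associations without imposing a hard cutoff, so g1 and g2 (perfectly correlated) remain strongly connected while the weak link to g3 shrinks toward zero.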

  10. Hox11 paralogous genes are essential for metanephric kidney induction

    PubMed Central

    Wellik, Deneen M.; Hawkes, Patrick J.; Capecchi, Mario R.

    2002-01-01

    The mammalian Hox complex is divided into four linkage groups containing 13 sets of paralogous genes. These paralogous genes have retained functional redundancy during evolution. For this reason, loss of only one or two Hox genes within a paralogous group often results in incompletely penetrant phenotypes which are difficult to interpret by molecular analysis. For example, mice individually mutant for Hoxa11 or Hoxd11 show no discernible kidney abnormalities. Hoxa11/Hoxd11 double mutants, however, demonstrate hypoplasia of the kidneys. As described in this study, removal of the last Hox11 paralogous member, Hoxc11, results in the complete loss of metanephric kidney induction. In these triple mutants, the metanephric blastema condenses, and expression of early patterning genes, Pax2 and Wt1, is unperturbed. Eya1 expression is also intact. Six2 expression, however, is absent, as is expression of the inducing growth factor, Gdnf. In the absence of Gdnf, ureteric bud formation is not initiated. Molecular analysis of this phenotype demonstrates that Hox11 control of early metanephric induction is accomplished by the interaction of Hox11 genes with the pax-eya-six regulatory cascade, a pathway that may be used by Hox genes more generally for the induction of multiple structures along the anteroposterior axis. PMID:12050119

  11. Hox11 paralogous genes are essential for metanephric kidney induction.

    PubMed

    Wellik, Deneen M; Hawkes, Patrick J; Capecchi, Mario R

    2002-06-01

    The mammalian Hox complex is divided into four linkage groups containing 13 sets of paralogous genes. These paralogous genes have retained functional redundancy during evolution. For this reason, loss of only one or two Hox genes within a paralogous group often results in incompletely penetrant phenotypes which are difficult to interpret by molecular analysis. For example, mice individually mutant for Hoxa11 or Hoxd11 show no discernible kidney abnormalities. Hoxa11/Hoxd11 double mutants, however, demonstrate hypoplasia of the kidneys. As described in this study, removal of the last Hox11 paralogous member, Hoxc11, results in the complete loss of metanephric kidney induction. In these triple mutants, the metanephric blastema condenses, and expression of early patterning genes, Pax2 and Wt1, is unperturbed. Eya1 expression is also intact. Six2 expression, however, is absent, as is expression of the inducing growth factor, Gdnf. In the absence of Gdnf, ureteric bud formation is not initiated. Molecular analysis of this phenotype demonstrates that Hox11 control of early metanephric induction is accomplished by the interaction of Hox11 genes with the pax-eya-six regulatory cascade, a pathway that may be used by Hox genes more generally for the induction of multiple structures along the anteroposterior axis.

  12. The impact of rare variation on gene expression across tissues.

    PubMed

    Li, Xin; Kim, Yungil; Tsang, Emily K; Davis, Joe R; Damani, Farhan N; Chiang, Colby; Hess, Gaelen T; Zappala, Zachary; Strober, Benjamin J; Scott, Alexandra J; Li, Amy; Ganna, Andrea; Bassik, Michael C; Merker, Jason D; Hall, Ira M; Battle, Alexis; Montgomery, Stephen B

    2017-10-11

    Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles, but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues, but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.
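
    The notion of a multi-tissue expression outlier described in this abstract can be illustrated with a simple sketch for a single gene. It assumes that expression is standardized within each tissue, that each individual is summarized by the median Z-score across tissues, and that a threshold of 2 marks an outlier; these choices, the function names, and all values are illustrative and are not taken from the abstract.

```python
from statistics import mean, median, stdev

def z_scores(values_by_individual):
    """Standardize one gene's expression across individuals in one tissue."""
    vals = list(values_by_individual.values())
    mu, sd = mean(vals), stdev(vals)
    return {ind: (v - mu) / sd for ind, v in values_by_individual.items()}

def multi_tissue_outliers(expr_by_tissue, threshold=2.0):
    """Return {individual: median Z} where |median Z| >= threshold."""
    per_individual = {}
    for tissue_values in expr_by_tissue.values():
        for ind, z in z_scores(tissue_values).items():
            per_individual.setdefault(ind, []).append(z)
    summary = {ind: median(zs) for ind, zs in per_individual.items()}
    return {ind: z for ind, z in summary.items() if abs(z) >= threshold}

# Hypothetical expression of one gene in two tissues for eight individuals.
ids = ["i1", "i2", "i3", "i4", "i5", "i6", "i7", "i8"]
expr_by_tissue = {
    "tissue_a": dict(zip(ids, [10, 10.1, 9.9, 10, 10.2, 9.8, 10, 20])),
    "tissue_b": dict(zip(ids, [5, 5.2, 4.8, 5, 5.1, 4.9, 5, 12])),
}
outliers = multi_tissue_outliers(expr_by_tissue)
```

    Summarizing across tissues with a median rather than a maximum means that an individual must show extreme expression in most tissues to be flagged, which reduces the influence of a single noisy measurement.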

  13. Evaluation of the testicular toxicity of prenatal exposure to bisphenol A based on microarray analysis combined with MeSH annotation.

    PubMed

    Tainaka, Hitoshi; Takahashi, Hikari; Umezawa, Masakazu; Tanaka, Hiromitsu; Nishimune, Yoshitake; Oshio, Shigeru; Takeda, Ken

    2012-01-01

    Bisphenol A (BPA) is known to be an endocrine disruptor that affects the development of the reproductive system. The aim of the present study was to investigate a group of testicular genes dysregulated by prenatal exposure to BPA. Pregnant ICR mice were treated with BPA by subcutaneous administration on days 7 and 14 of pregnancy. Tissue and blood samples were collected from 6-week-old male offspring. Testes were subjected to gene expression analysis using a testis-specific microarray (Testis2), consisting of 2,482 mouse cDNA clones annotated with Medical Subject Headings (MeSH) terms indicative of testicular components and functions. To interpret the microarray data, we used the MeSH terms significantly associated with the altered genes. As a result, MeSH terms related to androgens and Sertoli cells were extracted in BPA-treated groups. Among the genes related to Sertoli cells, downregulation of Msi1h, Ncoa1, Nid1, Hspb2, and Gata6 was detected in the testis of mice treated with BPA (twice administered 50 mg/kg). The MeSH terms associated with this group of genes may provide a useful means to interpret the testicular toxicity of BPA. This article concludes that prenatal BPA exposure downregulates expression of genes associated with Sertoli cell function and affects the reproductive function of male offspring. Additionally, a method using MeSH to extract a group of genes was useful for predicting the testicular and reproductive toxicity of prenatal BPA exposure.

  14. Characteristics of genomic signatures derived using univariate methods and mechanistically anchored functional descriptors for predicting drug- and xenobiotic-induced nephrotoxicity.

    PubMed

    Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J

    2008-01-01

    The ideal toxicity biomarker is composed of the properties of prediction (is detected prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and mechanistic relationships to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses (t-statistics, fold-change, B-statistics, and RankProd) and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, the Hotelling T-square test, and, finally, out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps (sets of genes coordinately involved in key biological processes) with classification power. Differentially expressed genes identified by the different statistical univariate analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or Hotelling T-square had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity. Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of predictive genomic investigations.

  15. Eureka-DMA: an easy-to-operate graphical user interface for fast comprehensive investigation and analysis of DNA microarray data.

    PubMed

    Abelson, Sagi

    2014-02-24

    In the past decade, the field of molecular biology has become increasingly quantitative; rapid development of new technologies enables researchers to investigate and address fundamental issues quickly and efficiently in ways that were once impossible. Among these technologies, DNA microarray provides methodology for many applications such as gene discovery, disease diagnosis, drug development and toxicological research, and it has been used increasingly since it first emerged. Multiple tools have been developed to interpret the high-throughput data produced by microarrays. However, less consideration has often been given to the fact that an extensive and effective interpretation requires close interplay between the bioinformaticians who analyze the data and the biologists who generate it. To bridge this gap and to simplify the usability of such tools we developed Eureka-DMA, an easy-to-operate graphical user interface that allows bioinformaticians and bench-biologists alike to initiate analyses as well as to investigate the data produced by DNA microarrays. In this paper, we describe Eureka-DMA, user-friendly software that comprises a set of methods for the interpretation of gene expression arrays. Eureka-DMA includes methods for the identification of genes with differential expression between conditions; it searches for enriched pathways and gene ontology terms and combines them with other relevant features. It thus enables a full understanding of the data for subsequent testing as well as for generating new hypotheses. Here we show two analyses, demonstrating examples of how Eureka-DMA can be used and its capability to produce relevant and reliable results. We have integrated several elementary expression analysis tools to provide a unified interface for their implementation. Eureka-DMA's simple graphical user interface provides an effective and efficient framework in which the investigator has the full set of tools for the visualization and interpretation of the data, with the option of exporting the analysis results for later use in other platforms. Eureka-DMA is freely available for academic users and can be downloaded at http://blue-meduza.org/Eureka-DMA.

  16. Estimation of gene induction enables a relevance-based ranking of gene sets.

    PubMed

    Bartholomé, Kilian; Kreutz, Clemens; Timmer, Jens

    2009-07-01

    In order to handle and interpret the vast amounts of data produced by microarray experiments, the analysis of sets of genes with a common biological functionality has been shown to be advantageous compared to single gene analyses. Some statistical methods have been proposed to analyse the differential gene expression of gene sets in microarray experiments. However, most of these methods either require threshold values to be chosen for the analysis, or they need some reference set for the determination of significance. We present a method that estimates the number of differentially expressed genes in a gene set without requiring a threshold value for significance of genes. The method is self-contained (i.e., it does not require a reference set for comparison). In contrast to other methods which are focused on significance, our approach emphasizes the relevance of the regulation of gene sets. The presented method measures the degree of regulation of a gene set and is a useful tool to compare the induction of different gene sets and place the results of microarray experiments into the biological context. An R-package is available.

  17. Computational deconvolution of genome wide expression data from Parkinson's and Huntington's disease brain tissues using population-specific expression analysis

    PubMed Central

    Capurro, Alberto; Bodea, Liviu-Gabriel; Schaefer, Patrick; Luthi-Carter, Ruth; Perreau, Victoria M.

    2015-01-01

    The characterization of molecular changes in diseased tissues gives insight into pathophysiological mechanisms and is important for therapeutic development. Genome-wide gene expression analysis has proven valuable for identifying biological processes in neurodegenerative diseases using post mortem human brain tissue and numerous datasets are publicly available. However, many studies utilize heterogeneous tissue samples consisting of multiple cell types, all of which contribute to global gene expression values, confounding biological interpretation of the data. In particular, changes in numbers of neuronal and glial cells occurring in neurodegeneration confound transcriptomic analyses, particularly in human brain tissues where sample availability and controls are limited. To identify cell specific gene expression changes in neurodegenerative disease, we have applied our recently published computational deconvolution method, population specific expression analysis (PSEA). PSEA estimates cell-type-specific expression values using reference expression measures, which in the case of brain tissue comprise mRNAs with cell-type-specific expression in neurons, astrocytes, oligodendrocytes and microglia. As an exercise in PSEA implementation and hypothesis development regarding neurodegenerative diseases, we applied PSEA to Parkinson's and Huntington's disease (PD, HD) datasets. Genes identified as differentially expressed in substantia nigra pars compacta neurons by PSEA were validated using external laser capture microdissection data. Network analysis and Annotation Clustering (DAVID) identified molecular processes implicated by differential gene expression in specific cell types. The results of these analyses provided new insights into the implementation of PSEA in brain tissues and additional refinement of molecular signatures in human HD and PD. PMID:25620908
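
    The idea of estimating cell-type-specific expression from reference measures can be illustrated with a small regression. The sketch below is a simplified stand-in, not the published PSEA model: it assumes only two populations, no intercept and no disease-interaction term, and fits the measured expression of one gene as a weighted sum of two reference signals by ordinary least squares. The function name `fit_two_populations` and the numbers are hypothetical.

```python
def fit_two_populations(expression, neuron_ref, glia_ref):
    """Least-squares fit of expression ~ a * neuron_ref + b * glia_ref.

    Each list holds one value per tissue sample. The returned (a, b) are
    the estimated contributions of the two populations to this gene.
    """
    snn = sum(n * n for n in neuron_ref)
    sgg = sum(g * g for g in glia_ref)
    sng = sum(n * g for n, g in zip(neuron_ref, glia_ref))
    sny = sum(n * y for n, y in zip(neuron_ref, expression))
    sgy = sum(g * y for g, y in zip(glia_ref, expression))
    det = snn * sgg - sng * sng
    if det == 0:
        raise ValueError("reference signals are collinear")
    # Closed-form solution of the 2x2 normal equations.
    a = (sny * sgg - sgy * sng) / det
    b = (sgy * snn - sny * sng) / det
    return a, b


# Hypothetical reference signals (e.g. averaged marker genes) in 4 samples.
neuron_ref = [1.0, 0.8, 0.6, 0.4]
glia_ref = [0.2, 0.4, 0.6, 0.8]
# Simulated gene expressed at 3.0 in neurons and 0.5 in glia.
mixed = [3.0 * n + 0.5 * g for n, g in zip(neuron_ref, glia_ref)]

neuron_expr, glia_expr = fit_two_populations(mixed, neuron_ref, glia_ref)
print(neuron_expr, glia_expr)
```

    Because the simulated data contain no noise, the fit recovers the two coefficients used to build `mixed`. With real data the residuals and the sample-to-sample variation in cell numbers would determine how reliable each coefficient is.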

  18. Detection of biomarkers for Hepatocellular Carcinoma using a hybrid univariate gene selection methods

    PubMed Central

    2012-01-01

    Background Discovering new biomarkers has a great role in improving early diagnosis of Hepatocellular carcinoma (HCC). The experimental determination of biomarkers needs a lot of time and money. This motivates the use of in-silico prediction of biomarkers to reduce the number of experiments required for detecting new ones. This is achieved by extracting the most representative genes in microarrays of HCC. Results In this work, we provide a method for extracting the differentially expressed genes, specifically the up-regulated ones, that can be considered candidate biomarkers in high throughput microarrays of HCC. We examine the power of several gene selection methods (such as Pearson’s correlation coefficient, Cosine coefficient, Euclidean distance, Mutual information and Entropy with different estimators) in selecting informative genes. A biological interpretation of the highly ranked genes is done using KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways, ENTREZ and DAVID (Database for Annotation, Visualization, and Integrated Discovery) databases. The top ten genes selected using Pearson’s correlation coefficient and Cosine coefficient contained six genes that have been implicated in cancer (often multiple cancers) genesis in previous studies. Fewer genes were obtained by the other methods (4 genes using Mutual information, 3 genes using Euclidean distance and only one gene using Entropy). A better result was obtained by the utilization of a hybrid approach based on intersecting the highly ranked genes in the output of all investigated methods. This hybrid combination yielded seven genes (2 genes for HCC and 5 genes in different types of cancer) in the top ten genes of the list of intersected genes. Conclusions To strengthen the effectiveness of the univariate selection methods, we propose a hybrid approach by intersecting several of these methods in a cascaded manner. This approach surpasses all of the univariate selection methods when used individually according to biological interpretation and the examination of gene expression signal profiles. PMID:22867264
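
    The hybrid idea of intersecting the top-ranked genes from several univariate scores can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: only two of the listed scores (Pearson's correlation and the cosine coefficient against a 0/1 class label) are used, and the gene names, expression values, and the helper names `top_k` and `hybrid_selection` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cosine(x, y):
    """Cosine coefficient (uncentered) between two sequences."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny) if nx and ny else 0.0


def top_k(expr, labels, score, k):
    """Names of the k genes whose profile scores highest against labels."""
    ranked = sorted(expr, key=lambda g: score(expr[g], labels), reverse=True)
    return ranked[:k]


def hybrid_selection(expr, labels, scorers, k):
    """Cascade: keep only genes that are in the top k for every scorer."""
    selected = None
    for score in scorers:
        top = set(top_k(expr, labels, score, k))
        selected = top if selected is None else selected & top
    return selected


# Hypothetical toy data: 3 normal samples (label 0), 3 tumour samples (label 1).
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "geneA": [1.0, 1.2, 0.9, 5.0, 5.5, 4.8],  # up-regulated in tumour
    "geneB": [3.0, 3.1, 2.9, 3.0, 3.2, 2.8],  # unchanged
    "geneC": [0.5, 0.4, 0.6, 2.5, 2.7, 2.4],  # up-regulated in tumour
    "geneD": [4.0, 4.2, 3.9, 1.0, 1.1, 0.9],  # down-regulated in tumour
}

candidates = hybrid_selection(expr, labels, [pearson, cosine], k=2)
print(sorted(candidates))
```

    Scoring against the class label with a high-is-better ranking favours up-regulated genes, which matches the focus of the abstract. Adding further scorers to the list only shrinks the intersection, which is the trade-off a cascaded design makes: fewer candidates, each supported by every method.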

  19. Random forests-based differential analysis of gene sets for gene expression data.

    PubMed

    Hsueh, Huey-Miin; Zhou, Da-Wei; Tsai, Chen-An

    2013-04-10

    In DNA microarray studies, gene-set analysis (GSA) has become the focus of gene expression data analysis. GSA utilizes the gene expression profiles of functionally related gene sets in Gene Ontology (GO) categories or a priori defined biological classes to assess the significance of gene sets associated with clinical outcomes or phenotypes. Many statistical approaches have been proposed to determine whether such functionally related gene sets express differentially (enrichment and/or deletion) in variations of phenotypes. However, little attention has been given to the discriminatory power of gene sets and classification of patients. In this study, we propose a method of gene set analysis, in which gene sets are used to develop classifications of patients based on the Random Forest (RF) algorithm. The corresponding empirical p-value of an observed out-of-bag (OOB) error rate of the classifier is introduced to identify differentially expressed gene sets using an adequate resampling method. In addition, we discuss the impacts and correlations of genes within each gene set based on the measures of variable importance in the RF algorithm. Significant classifications are reported and visualized together with the underlying gene sets and their contribution to the phenotypes of interest. Numerical studies using both synthesized data and a series of publicly available gene expression data sets are conducted to evaluate the performance of the proposed methods. Compared with other hypothesis testing approaches, our proposed methods are reliable and successful in identifying enriched gene sets and in discovering the contributions of genes within a gene set. The classification results of identified gene sets can provide a valuable alternative to gene set testing to reveal the unknown, biologically relevant classes of samples or patients. In summary, our proposed method allows one to simultaneously assess the discriminatory ability of gene sets and the importance of genes for interpretation of data in complex biological systems. The classifications of biologically defined gene sets can reveal the underlying interactions of gene sets associated with the phenotypes, and provide an insightful complement to conventional gene set analyses. Copyright © 2012 Elsevier B.V. All rights reserved.
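
    The empirical p-value of a classifier's error rate can be illustrated with a label-permutation test. To keep the sketch short and dependency-free, a nearest-centroid classifier with leave-one-out error is swapped in for the Random Forest and its out-of-bag error, so this shows the resampling logic only and is not the method of the paper. The sample values and the function names `loo_error` and `empirical_p` are hypothetical.

```python
import random


def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_error(samples, labels):
    """Leave-one-out error rate of a nearest-centroid classifier."""
    errors = 0
    for i, (x, y) in enumerate(zip(samples, labels)):
        train = [(s, l) for j, (s, l) in enumerate(zip(samples, labels)) if j != i]
        cents = {
            cls: centroid([s for s, l in train if l == cls])
            for cls in {l for _, l in train}
        }
        pred = min(cents, key=lambda c: sqdist(x, cents[c]))
        errors += pred != y
    return errors / len(samples)


def empirical_p(samples, labels, n_perm=200, seed=0):
    """Share of label permutations whose error is at most the observed one."""
    rng = random.Random(seed)
    observed = loo_error(samples, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if loo_error(samples, shuffled) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical gene set of two genes measured in ten patients.
samples = [
    [1.0, 2.0], [1.2, 2.1], [0.9, 1.8], [1.1, 2.2], [1.0, 1.9],
    [3.0, 4.0], [3.2, 4.1], [2.9, 3.8], [3.1, 4.2], [3.0, 3.9],
]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

observed_error, p_value = empirical_p(samples, labels)
print(observed_error, p_value)
```

    A gene set whose observed error is rarely matched under shuffled labels is one that separates the phenotypes better than chance. The same loop would apply to an out-of-bag error from a Random Forest; only the error function changes.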

  20. EMAP and EMAGE: a framework for understanding spatially organized data.

    PubMed

    Baldock, Richard A; Bard, Jonathan B L; Burger, Albert; Burton, Nicolas; Christiansen, Jeff; Feng, Guanjie; Hill, Bill; Houghton, Derek; Kaufman, Matthew; Rao, Jianguo; Sharpe, James; Ross, Allyson; Stevenson, Peter; Venkataraman, Shanmugasundaram; Waterhouse, Andrew; Yang, Yiya; Davidson, Duncan R

    2003-01-01

    The Edinburgh Mouse Atlas Project (EMAP) is a time-series of mouse-embryo volumetric models. The models provide a context-free spatial framework onto which structural interpretations and experimental data can be mapped. This enables collation, comparison, and query of complex spatial patterns with respect to each other and with respect to known or hypothesized structure. The atlas also includes a time-dependent anatomical ontology and mapping between the ontology and the spatial models in the form of delineated anatomical regions or tissues. The models provide a natural, graphical context for browsing and visualizing complex data. The Edinburgh Mouse Atlas Gene-Expression Database (EMAGE) is one of the first applications of the EMAP framework and provides a spatially mapped gene-expression database with associated tools for data mapping, submission, and query. In this article, we describe the underlying principles of the Atlas and the gene-expression database, and provide a practical introduction to the use of the EMAP and EMAGE tools, including use of new techniques for whole body gene-expression data capture and mapping.

  1. Integration of multi-omics data for integrative gene regulatory network inference.

    PubMed

    Zarayeneh, Neda; Ko, Euiseong; Oh, Jung Hun; Suh, Sang; Liu, Chunyu; Gao, Jean; Kim, Donghyun; Kang, Mingon

    2017-01-01

    Gene regulatory networks provide comprehensive insights and in-depth understanding of complex biological processes. In most research, the molecular interactions of gene regulatory networks are inferred from a single type of genomic data, e.g., gene expression data. However, gene expression is a product of sequential interactions of multiple biological processes, such as DNA sequence variations, copy number variations, histone modifications, transcription factors, and DNA methylations. The recent rapid advances of high-throughput omics technologies enable one to measure multiple types of omics data, called 'multi-omics data', that represent the various biological processes. In this paper, we propose an Integrative Gene Regulatory Network inference method (iGRN) that incorporates multi-omics data and their interactions in gene regulatory networks. In addition to gene expressions, copy number variations and DNA methylations were considered for multi-omics data in this paper. Intensive experiments were carried out with simulation data, in which iGRN's ability to infer the integrative gene regulatory network was assessed. In these experiments, iGRN shows better performance in model representation and interpretation than other integrative methods for gene regulatory network inference. iGRN was also applied to a human brain dataset of psychiatric disorders, and the biological network of psychiatric disorders was analysed.

  2. Integration of multi-omics data for integrative gene regulatory network inference

    PubMed Central

    Zarayeneh, Neda; Ko, Euiseong; Oh, Jung Hun; Suh, Sang; Liu, Chunyu; Gao, Jean; Kim, Donghyun

    2017-01-01

    Gene regulatory networks provide comprehensive insights and in-depth understanding of complex biological processes. In most research, the molecular interactions of gene regulatory networks are inferred from a single type of genomic data, e.g., gene expression data. However, gene expression is a product of sequential interactions of multiple biological processes, such as DNA sequence variations, copy number variations, histone modifications, transcription factors, and DNA methylations. The recent rapid advances of high-throughput omics technologies enable one to measure multiple types of omics data, called ‘multi-omics data’, that represent the various biological processes. In this paper, we propose an Integrative Gene Regulatory Network inference method (iGRN) that incorporates multi-omics data and their interactions in gene regulatory networks. In addition to gene expressions, copy number variations and DNA methylations were considered for multi-omics data in this paper. Intensive experiments were carried out with simulation data, in which iGRN’s ability to infer the integrative gene regulatory network was assessed. In these experiments, iGRN shows better performance in model representation and interpretation than other integrative methods for gene regulatory network inference. iGRN was also applied to a human brain dataset of psychiatric disorders, and the biological network of psychiatric disorders was analysed. PMID:29354189

  3. Neo-Darwinism, the Modern Synthesis and selfish genes: are they of use in physiology?

    PubMed Central

    Noble, Denis

    2011-01-01

    This article argues that the gene-centric interpretations of evolution, and more particularly the selfish gene expression of those interpretations, form barriers to the integration of physiological science with evolutionary theory. A gene-centred approach analyses the relationships between genotypes and phenotypes in terms of differences (change the genotype and observe changes in phenotype). We now know that, most frequently, this does not correctly reveal the relationships because of extensive buffering by robust networks of interactions. By contrast, understanding biological function through physiological analysis requires an integrative approach in which the activity of the proteins and RNAs formed from each DNA template is analysed in networks of interactions. These networks also include components that are not specified by nuclear DNA. Inheritance is not through DNA sequences alone. The selfish gene idea is not useful in the physiological sciences, since selfishness cannot be defined as an intrinsic property of nucleotide sequences independently of gene frequency, i.e. the ‘success’ in the gene pool that is supposed to be attributable to the ‘selfish’ property. It is not a physiologically testable hypothesis. PMID:21135048

  4. Neo-Darwinism, the modern synthesis and selfish genes: are they of use in physiology?

    PubMed

    Noble, Denis

    2011-03-01

    This article argues that the gene-centric interpretations of evolution, and more particularly the selfish gene expression of those interpretations, form barriers to the integration of physiological science with evolutionary theory. A gene-centred approach analyses the relationships between genotypes and phenotypes in terms of differences (change the genotype and observe changes in phenotype). We now know that, most frequently, this does not correctly reveal the relationships because of extensive buffering by robust networks of interactions. By contrast, understanding biological function through physiological analysis requires an integrative approach in which the activity of the proteins and RNAs formed from each DNA template is analysed in networks of interactions. These networks also include components that are not specified by nuclear DNA. Inheritance is not through DNA sequences alone. The selfish gene idea is not useful in the physiological sciences, since selfishness cannot be defined as an intrinsic property of nucleotide sequences independently of gene frequency, i.e. the 'success' in the gene pool that is supposed to be attributable to the 'selfish' property. It is not a physiologically testable hypothesis.

  5. A probabilistic framework for microarray data analysis: fundamental probability models and statistical inference.

    PubMed

    Ogunnaike, Babatunde A; Gelmi, Claudio A; Edwards, Jeremy S

    2010-05-21

    Gene expression studies generate large quantities of data with the defining characteristic that the number of genes (whose expression profiles are to be determined) exceeds the number of available replicates by several orders of magnitude. Standard spot-by-spot analysis still seeks to extract useful information for each gene on the basis of the number of available replicates, and thus plays to the weakness of microarrays. On the other hand, because of the data volume, treating the entire data set as an ensemble, and developing theoretical distributions for these ensembles provides a framework that plays instead to the strength of microarrays. We present theoretical results that under reasonable assumptions, the distribution of microarray intensities follows the Gamma model, with the biological interpretations of the model parameters emerging naturally. We subsequently establish that for each microarray data set, the fractional intensities can be represented as a mixture of Beta densities, and develop a procedure for using these results to draw statistical inference regarding differential gene expression. We illustrate the results with experimental data from gene expression studies on Deinococcus radiodurans following DNA damage using cDNA microarrays. Copyright (c) 2010 Elsevier Ltd. All rights reserved.
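
    The step from Gamma-distributed intensities to Beta-distributed fractional intensities rests on a standard probability result, written out below. The abstract does not give the paper's parameterization, so the symbols here are assumptions for illustration: X_1 and X_2 are taken to be the two channel intensities of a spot, \alpha_1 and \alpha_2 their shape parameters, \theta a common scale, and "fractional intensity" is read as one channel's share of the total. The mixture weights w_k are likewise hypothetical notation.

```latex
% Two independent Gamma intensities with a common scale
X_1 \sim \mathrm{Gamma}(\alpha_1, \theta), \qquad
X_2 \sim \mathrm{Gamma}(\alpha_2, \theta), \qquad
X_1 \perp X_2

% Their fractional intensity is Beta distributed, free of the scale \theta
F = \frac{X_1}{X_1 + X_2} \sim \mathrm{Beta}(\alpha_1, \alpha_2),
\qquad
f_F(x) = \frac{\Gamma(\alpha_1 + \alpha_2)}{\Gamma(\alpha_1)\,\Gamma(\alpha_2)}
         \, x^{\alpha_1 - 1} (1 - x)^{\alpha_2 - 1}, \quad 0 < x < 1

% Pooling K groups of genes with different shapes gives a Beta mixture
f(x) = \sum_{k=1}^{K} w_k \,
       \frac{\Gamma(a_k + b_k)}{\Gamma(a_k)\,\Gamma(b_k)}
       \, x^{a_k - 1} (1 - x)^{b_k - 1},
\qquad w_k \ge 0, \quad \sum_{k=1}^{K} w_k = 1
```

    Under this reading, a gene with equal shape parameters in both channels has a fractional intensity centred on 1/2, and departures from 1/2 indicate differential expression.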

  6. Massive-Scale Gene Co-Expression Network Construction and Robustness Testing Using Random Matrix Theory

    PubMed Central

    Isaacson, Sven; Luo, Feng; Feltus, Frank A.; Smith, Melissa C.

    2013-01-01

    The study of gene relationships and their effect on biological function and phenotype is a focal point in systems biology. Gene co-expression networks built using microarray expression profiles are one technique for discovering and interpreting gene relationships. A knowledge-independent thresholding technique, such as Random Matrix Theory (RMT), is useful for identifying meaningful relationships. Highly connected genes in the thresholded network are then grouped into modules that provide insight into their collective functionality. While it has been shown that co-expression networks are biologically relevant, it has not been determined to what extent any given network is functionally robust given perturbations in the input sample set. For such a test, hundreds of networks are needed and hence a tool to rapidly construct these networks. To examine functional robustness of networks with varying input, we enhanced an existing RMT implementation for improved scalability and tested functional robustness of human (Homo sapiens), rice (Oryza sativa) and budding yeast (Saccharomyces cerevisiae). We demonstrate a dramatic decrease in network construction time and computational requirements and show that despite some variation in global properties between networks, functional similarity remains high. Moreover, the biological function captured by co-expression networks thresholded by RMT is highly robust. PMID:23409071
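
    The pipeline of correlating profiles, thresholding, and grouping connected genes into modules can be sketched as follows. This is a minimal illustration under an explicit simplification: the correlation cutoff is fixed by hand, whereas the work described above selects it with Random Matrix Theory, which is not implemented here. Modules are taken to be connected components of the thresholded graph, and the gene names, profiles, and the function name `coexpression_modules` are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def coexpression_modules(expr, cutoff):
    """Link genes with |r| >= cutoff and return connected components."""
    genes = list(expr)
    adjacency = {g: set() for g in genes}
    for a, b in combinations(genes, 2):
        if abs(pearson(expr[a], expr[b])) >= cutoff:
            adjacency[a].add(b)
            adjacency[b].add(a)

    modules, seen = [], set()
    for gene in genes:
        if gene in seen:
            continue
        # Depth-first walk collects every gene reachable from `gene`.
        stack, component = [gene], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        modules.append(component)
    return modules


# Hypothetical expression profiles over five samples.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.1, 3.9, 6.2, 8.0, 9.9],   # tracks g1
    "g3": [3.0, 1.0, 4.0, 1.0, 5.0],
    "g4": [6.0, 2.0, 8.0, 2.1, 10.0],  # tracks g3
}

modules = coexpression_modules(expr, cutoff=0.9)
print([sorted(m) for m in modules])
```

    Rebuilding the network after dropping or resampling input samples and comparing the resulting modules is one simple way to probe the kind of robustness the abstract discusses.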

  7. Analysis of baseline gene expression levels from ...

    EPA Pesticide Factsheets

    The use of gene expression profiling to predict chemical mode of action would be enhanced by better characterization of variance due to individual, environmental, and technical factors. Meta-analysis of microarray data from untreated or vehicle-treated animals within the control arm of toxicogenomics studies has yielded useful information on baseline fluctuations in gene expression. A dataset of control animal microarray expression data was assembled by a working group of the Health and Environmental Sciences Institute's Technical Committee on the Application of Genomics in Mechanism Based Risk Assessment in order to provide a public resource for assessments of variability in baseline gene expression. Data from over 500 Affymetrix microarrays from control rat liver and kidney were collected from 16 different institutions. Thirty-five biological and technical factors were obtained for each animal, describing a wide range of study characteristics, and a subset were evaluated in detail for their contribution to total variability using multivariate statistical and graphical techniques. The study factors that emerged as key sources of variability included gender, organ section, strain, and fasting state. These and other study factors were identified as key descriptors that should be included in the minimal information about a toxicogenomics study needed for interpretation of results by an independent source. Genes that are the most and least variable, gender-selectiv

  8. Low-grade and high-grade mammary carcinomas in WAP-T transgenic mice are independent entities distinguished by Met expression.

    PubMed

    Otto, Benjamin; Gruner, Katharina; Heinlein, Christina; Wegwitz, Florian; Nollau, Peter; Ylstra, Bauke; Pantel, Klaus; Schumacher, Udo; Baumbusch, Lars O; Martin-Subero, José Ignacio; Siebert, Reiner; Wagener, Christoph; Streichert, Thomas; Deppert, Wolfgang; Tolstonog, Genrich V

    2013-03-15

    Mammary carcinomas developing in SV40 transgenic WAP-T mice arise in two distinct histological phenotypes: as differentiated low-grade and undifferentiated high-grade tumors. We integrated different types of information such as histological grading, analysis of aCGH-based gene copy number and gene expression profiling to provide a comprehensive molecular description of mammary tumors in WAP-T mice. Applying a novel procedure for the correlation of gene copy number with gene expression on a global scale, we observed in tumor samples a global coherence between genotype and transcription. This coherence can be interpreted as a matched transcriptional regulation inherited from the cells of tumor origin and determined by the activity of cancer driver genes. Despite common recurrent genomic aberrations, e.g. gain of chr. 15 in most WAP-T tumors, loss of chr. 19 frequently occurs only in low-grade tumors. These tumors show features of "basal-like" epithelial differentiation, particularly expression of keratin 14. The high-grade tumors are clearly separated from the low-grade tumors by strong expression of the Met gene and by coexpression of epithelial (e.g. keratin 18) and mesenchymal (e.g. vimentin) markers. In high-grade tumors, the expression of the nonmutated Met protein is associated with Met-locus amplification and Met activity. The role of Met as a cancer driver gene is supported by the contribution of active Met signaling to motility and growth of mammary tumor-derived cells. Finally, we discuss the independent origin of low- and high-grade tumors from distinct cells of tumor origin, possibly luminal progenitors, distinguished by Met gene expression and Met signaling. Copyright © 2012 UICC.

  9. Gene Delivery to Postnatal Rat Brain by Non-ventricular Plasmid Injection and Electroporation

    PubMed Central

    Molotkov, Dmitry A.; Yukin, Alexey Y.; Afzalov, Ramil A.; Khiroug, Leonard S.

    2010-01-01

    Creation of transgenic animals is a standard approach in studying functions of a gene of interest in vivo. However, many knockout or transgenic animals are not viable in those cases where the modified gene is expressed or deleted in the whole organism. Moreover, a variety of compensatory mechanisms often make it difficult to interpret the results. The compensatory effects can be alleviated by either timing the gene expression or limiting the number of transfected cells. The method of postnatal non-ventricular microinjection and in vivo electroporation allows targeted delivery of genes, siRNA or dye molecules directly to a small region of interest in the newborn rodent brain. In contrast to the conventional ventricular injection technique, this method allows transfection of non-migratory cell types. Animals transfected by means of the method described here can be used, for example, for two-photon in vivo imaging or in electrophysiological experiments on acute brain slices. PMID:20972387

  10. Photoperiod regulates multiple gene expression in the suprachiasmatic nuclei and pars tuberalis of the Siberian hamster (Phodopus sungorus).

    PubMed

    Johnston, Jonathan D; Ebling, Francis J P; Hazlerigg, David G

    2005-06-01

    Photoperiod regulates the seasonal physiology of many mammals living in temperate latitudes. Photoperiodic information is decoded by the master circadian clock in the suprachiasmatic nuclei (SCN) of the hypothalamus and then transduced via pineal melatonin secretion. This neurochemical signal is interpreted by tissues expressing melatonin receptors (e.g. the pituitary pars tuberalis, PT) to drive physiological changes. In this study we analysed the photoperiodic regulation of the circadian clockwork in the SCN and PT of the Siberian hamster. Female hamsters were exposed to either long or short photoperiod for 8 weeks and sampled at 2-h intervals across the 24-h cycle. In the SCN, rhythmic expression of the clock genes Per1, Per2, Cry1, Rev-erbalpha, and the clock-controlled genes arginine vasopressin (AVP) and d-element binding protein (DBP) was modulated by photoperiod. All of these E-box-containing genes tracked dawn, with earlier peak mRNA expression in long, compared to short, photoperiod. This response occurred irrespective of the presence of additional regulatory cis-elements, suggesting photoperiodic regulation of SCN gene expression through a common E-box-related mechanism. In long photoperiod, expression of Cry1 and Per1 in the PT tracked the onset and offset of melatonin secretion, respectively. However, whereas Cry1 tracked melatonin onset in short photoperiod, Per1 expression was not detectably rhythmic. We therefore propose that, in the SCN, photoperiodic regulation of clock gene expression primarily occurs via E-boxes, whereas melatonin-driven signal transduction drives the phasing of a subset of clock genes in the PT, independently of the E-box.

  11. The Role of Inducible Hsp70, and Other Heat Shock Proteins, in Adaptive Complex of Cold Tolerance of the Fruit Fly (Drosophila melanogaster)

    PubMed Central

    Štětina, Tomáš; Koštál, Vladimír; Korbelová, Jaroslava

    2015-01-01

    Background The ubiquitous occurrence of inducible Heat Shock Proteins (Hsps) up-regulation in response to cold-acclimation and/or to cold shock, including massive increase of Hsp70 mRNA levels, often led to hasty interpretations of its role in the repair of cold injury expressed as protein denaturation or misfolding. So far, direct functional analyses in Drosophila melanogaster and other insects brought either limited or no support for such interpretations. In this paper, we analyze the cold tolerance and the expression levels of 24 different mRNA transcripts of the Hsps complex and related genes in response to cold in two strains of D. melanogaster: the wild-type and the Hsp70- null mutant lacking all six copies of Hsp70 gene. Principal Findings We found that larvae of both strains show similar patterns of Hsps complex gene expression in response to long-term cold-acclimation and during recovery from chronic cold exposures or acute cold shocks. No transcriptional compensation for missing Hsp70 gene was seen in Hsp70- strain. The cold-induced Hsps gene expression is most probably regulated by alternative splice variants C and D of the Heat Shock Factor. The cold tolerance in Hsp70- null mutants was clearly impaired only when the larvae were exposed to severe acute cold shock. No differences in mortality were found between two strains when the larvae were exposed to relatively mild doses of cold, either chronic exposures to 0°C or acute cold shocks at temperatures down to -4°C. Conclusions The up-regulated expression of a complex of inducible Hsps genes, and Hsp70 mRNA in particular, is tightly associated with cold-acclimation and cold exposure in D. melanogaster. Genetic elimination of Hsp70 up-regulation response has no effect on survival of chronic exposures to 0°C or mild acute cold shocks, while it negatively affects survival after severe acute cold shocks at temperatures below -8°C. PMID:26034990

  12. The Role of Inducible Hsp70, and Other Heat Shock Proteins, in Adaptive Complex of Cold Tolerance of the Fruit Fly (Drosophila melanogaster).

    PubMed

    Štětina, Tomáš; Koštál, Vladimír; Korbelová, Jaroslava

    2015-01-01

    The ubiquitous occurrence of inducible Heat Shock Proteins (Hsps) up-regulation in response to cold-acclimation and/or to cold shock, including massive increase of Hsp70 mRNA levels, often led to hasty interpretations of its role in the repair of cold injury expressed as protein denaturation or misfolding. So far, direct functional analyses in Drosophila melanogaster and other insects brought either limited or no support for such interpretations. In this paper, we analyze the cold tolerance and the expression levels of 24 different mRNA transcripts of the Hsps complex and related genes in response to cold in two strains of D. melanogaster: the wild-type and the Hsp70- null mutant lacking all six copies of Hsp70 gene. We found that larvae of both strains show similar patterns of Hsps complex gene expression in response to long-term cold-acclimation and during recovery from chronic cold exposures or acute cold shocks. No transcriptional compensation for missing Hsp70 gene was seen in Hsp70- strain. The cold-induced Hsps gene expression is most probably regulated by alternative splice variants C and D of the Heat Shock Factor. The cold tolerance in Hsp70- null mutants was clearly impaired only when the larvae were exposed to severe acute cold shock. No differences in mortality were found between two strains when the larvae were exposed to relatively mild doses of cold, either chronic exposures to 0°C or acute cold shocks at temperatures down to -4°C. The up-regulated expression of a complex of inducible Hsps genes, and Hsp70 mRNA in particular, is tightly associated with cold-acclimation and cold exposure in D. melanogaster. Genetic elimination of Hsp70 up-regulation response has no effect on survival of chronic exposures to 0°C or mild acute cold shocks, while it negatively affects survival after severe acute cold shocks at temperatures below -8°C.

  13. The Renilla luciferase gene as a reference gene for normalization of gene expression in transiently transfected cells.

    PubMed

    Jiwaji, Meesbah; Daly, Rónán; Pansare, Kshama; McLean, Pauline; Yang, Jingli; Kolch, Walter; Pitt, Andrew R

    2010-12-31

    The importance of appropriate normalization controls in quantitative real-time polymerase chain reaction (qPCR) experiments has become more apparent as the number of biological studies using this methodology has increased. In developing a system to study gene expression from transiently transfected plasmids, it became clear that normalization using chromosomally encoded genes is not ideal, as it does not take into account the transfection efficiency and the significantly lower expression levels of the plasmids. We have developed and validated a normalization method for qPCR using a co-transfected plasmid. The best chromosomal gene for normalization in the presence of the transcriptional activators used in this study, cadmium, dexamethasone, forskolin and phorbol-12-myristate 13-acetate was first identified. qPCR data was analyzed using geNorm, NormFinder and BestKeeper. Each software application was found to rank the normalization controls differently with no clear correlation. Including a co-transfected plasmid encoding the Renilla luciferase gene (Rluc) in this analysis showed that its calculated stability was not as good as the optimised chromosomal genes, most likely as a result of the lower expression levels and transfection variability. Finally, we validated these analyses by testing two chromosomal genes (B2M and ActB) and a co-transfected gene (Rluc) under biological conditions. When analyzing co-transfected plasmids, Rluc normalization gave the smallest errors compared to the chromosomal reference genes. Our data demonstrates that transfected Rluc is the most appropriate normalization reference gene for transient transfection qPCR analysis; it significantly reduces the standard deviation within biological experiments as it takes into account the transfection efficiencies and has easily controllable expression levels. This improves reproducibility, data validity and most importantly, enables accurate interpretation of qPCR data.
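
    As an illustration of what normalization to a reference gene means in practice, the sketch below applies the widely used 2^-ΔΔCt calculation. The abstract does not state which formula the authors used, so this is an assumption; the function name, the Ct values, and the assumption of roughly 100% amplification efficiency are all hypothetical.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    Assumes both amplicons amplify with ~100% efficiency (a doubling per
    cycle); this is an illustrative assumption, not taken from the abstract.
    """
    # Normalize the target to the reference gene within each condition.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the normalized values between conditions.
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the reference here stands in for a co-transfected
# gene such as Rluc, whose Ct shifts together with transfection efficiency.
well_a = fold_change_ddct(22.0, 18.0, 24.0, 18.0)
# Same biology, but this well received less plasmid: both Cts rise by 1.
well_b = fold_change_ddct(23.0, 19.0, 24.0, 18.0)
print(well_a, well_b)
```

    Because a drop in transfection efficiency raises the Ct of the target and of a co-transfected reference together, the difference cancels out; a chromosomal reference would not track that shift.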

  14. Transcriptional noise in intact and TGF-beta treated human kidney cells; the importance of time-series designs.

    PubMed

    Rabieian, Reyhaneh; Moein, Shiva; Khanahmad, Hossein; Mortazavi, Mojgan; Gheisari, Yousof

    2018-05-26

    The transforming growth factor (TGF)-β signaling pathway plays a key role in various cellular processes. However, insufficient knowledge about the complex and sometimes paradoxical functions of this pathway hinders its therapeutic targeting. In this study, the transcriptional profile of seven mediators and downstream elements of the TGF-β pathway were assessed in TGF-β treated and untreated human kidney derived cells for 2 weeks in a time course manner. As expected the up-regulation of ACTA2 and COL1A2 was evident in the treated cells. However, we observed remarkable fluctuations in gene expression, even in the supposedly steady states. The magnitude of noise was diverse in the examined genes. Our findings underscore the significance of time-course designs for gene expression analyses and clearly show that misleading data can be obtained in single point measurements. Furthermore, we propose specific considerations in the interpretation of time-course data in the context of noisy gene expression. © 2018 International Federation for Cell Biology.

  15. AUCTSP: an improved biomarker gene pair class predictor.

    PubMed

    Kagaris, Dimitri; Khamesipour, Alireza; Yiannoutsos, Constantin T

    2018-06-26

    The Top Scoring Pair (TSP) classifier, based on the concept of relative ranking reversals in the expressions of pairs of genes, has been proposed as a simple, accurate, and easily interpretable decision rule for classification and class prediction of gene expression profiles. The idea that differences in gene expression ranking are associated with presence or absence of disease is compelling and has strong biological plausibility. Nevertheless, the TSP formulation ignores significant available information which can improve classification accuracy and is vulnerable to selecting genes which do not have differential expression in the two conditions ("pivot" genes). We introduce the AUCTSP classifier as an alternative rank-based estimator of the magnitude of the ranking reversals involved in the original TSP. The proposed estimator is based on the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) and as such, takes into account the separation of the entire distribution of gene expression levels in gene pairs under the conditions considered, as opposed to comparing gene rankings within individual subjects as in the original TSP formulation. Through extensive simulations and case studies involving classification in ovarian, leukemia, colon, breast and prostate cancers and diffuse large b-cell lymphoma, we show the superiority of the proposed approach in terms of improving classification accuracy, avoiding overfitting and being less prone to selecting non-informative (pivot) genes. The proposed AUCTSP is a simple yet reliable and robust rank-based classifier for gene expression classification. While the AUCTSP works by the same principle as TSP, its ability to determine the top scoring gene pair based on the relative rankings of two marker genes across all subjects as opposed to each individual subject results in significant performance gains in classification accuracy. 
In addition, the proposed method tends to avoid selection of non-informative (pivot) genes as members of the top-scoring pair.
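
    For readers unfamiliar with the original TSP rule that this record builds on, a minimal sketch of its pair score follows. It implements only the classic within-subject ranking-reversal score, not the AUC-based estimator proposed in the abstract, whose exact form is not given here. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def tsp_score(samples, labels, i, j):
    """|P(x_i < x_j | class 0) - P(x_i < x_j | class 1)| for genes i and j."""
    freq = {}
    for cls in (0, 1):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        # Fraction of subjects in this class where gene i ranks below gene j.
        freq[cls] = sum(1 for s in rows if s[i] < s[j]) / len(rows)
    return abs(freq[0] - freq[1])


def top_scoring_pair(samples, labels):
    """Return the gene index pair with the largest ranking-reversal score."""
    n_genes = len(samples[0])
    return max(combinations(range(n_genes), 2),
               key=lambda pair: tsp_score(samples, labels, *pair))


# Toy data: rows are subjects, columns are genes 0..2 (made-up values).
samples = [[1, 5, 4], [2, 6, 1], [7, 2, 3], [8, 1, 0]]
labels = [0, 0, 1, 1]
print(top_scoring_pair(samples, labels))
```

    The score depends only on the ordering of two genes within each subject, which is the property the abstract contrasts with comparing whole distributions across subjects.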

  16. Transcriptome analysis of cattle muscle identifies potential markers for skeletal muscle growth rate and major cell types.

    PubMed

    Guo, Bing; Greenwood, Paul L; Cafe, Linda M; Zhou, Guanghong; Zhang, Wangang; Dalrymple, Brian P

    2015-03-13

    This study aimed to identify markers for muscle growth rate and the different cellular contributors to cattle muscle and to link the muscle growth rate markers to specific cell types. The expression of two groups of genes in the longissimus muscle (LM) of 48 Brahman steers of similar age, significantly enriched for "cell cycle" and "ECM (extracellular matrix) organization" Gene Ontology (GO) terms was correlated with average daily gain/kg liveweight (ADG/kg) of the animals. However, expression of the same genes was only partly related to growth rate across a time course of postnatal LM development in two cattle genotypes, Piedmontese x Hereford (high muscling) and Wagyu x Hereford (high marbling). The deposition of intramuscular fat (IMF) altered the relationship between the expression of these genes and growth rate. K-means clustering across the development time course with a large set of genes (5,596) with similar expression profiles to the ECM genes was undertaken. The locations in the clusters of published markers of different cell types in muscle were identified and used to link clusters of genes to the cell type most likely to be expressing them. Overall correspondence between published cell type expression of markers and predicted major cell types of expression in cattle LM was high. However, some exceptions were identified: expression of SOX8 previously attributed to muscle satellite cells was correlated with angiogenesis. Analysis of the clusters and cell types suggested that the "cell cycle" and "ECM" signals were from the fibro/adipogenic lineage. Significant contributions to these signals from the muscle satellite cells, angiogenic cells and adipocytes themselves were not as strongly supported. Based on the clusters and cell type markers, sets of five genes predicted to be representative of fibro/adipogenic precursors (FAPs) and endothelial cells, and/or ECM remodelling and angiogenesis were identified. 
Gene sets and gene markers for the analysis of many of the major processes/cell populations contributing to muscle composition and growth have been proposed, enabling a consistent interpretation of gene expression datasets from cattle LM. The same gene sets are likely to be applicable in other cattle muscles and in other species.

  17. Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

    PubMed Central

    Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.; Taylor, Ronald C.; Weisenhorn, Pamela; Olson, Robert D.; Stevens, Rick L.; Rocha, Miguel; Rocha, Isabel; Best, Aaron A.; DeJongh, Matthew; Tintle, Nathan L.; Parrello, Bruce; Overbeek, Ross; Henry, Christopher S.

    2016-01-01

    Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. An important step toward meeting the challenge of understanding gene function and regulation is the identification of sets of genes that are always co-expressed. These gene sets, Atomic Regulons (ARs), represent fundamental units of function within a cell and could be used to associate genes of unknown function with cellular processes and to enable rational genetic engineering of cellular systems. Here, we describe an approach for inferring ARs that leverages large-scale expression data sets, gene context, and functional relationships among genes. We computed ARs for Escherichia coli based on 907 gene expression experiments and compared our results with gene clusters produced by two prevalent data-driven methods: Hierarchical clustering and k-means clustering. We compared ARs and purely data-driven gene clusters to the curated set of regulatory interactions for E. coli found in RegulonDB, showing that ARs are more consistent with gold standard regulons than are data-driven gene clusters. We further examined the consistency of ARs and data-driven gene clusters in the context of gene interactions predicted by Context Likelihood of Relatedness (CLR) analysis, finding that the ARs show better agreement with CLR predicted interactions. We determined the impact of increasing amounts of expression data on AR construction and find that while more data improve ARs, it is not necessary to use the full set of gene expression experiments available for E. coli to produce high quality ARs. In order to explore the conservation of co-regulated gene sets across different organisms, we computed ARs for Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus, each of which represents increasing degrees of phylogenetic distance from E. coli. 
Comparison of the organism-specific ARs showed that the consistency of AR gene membership correlates with phylogenetic distance, but there is clear variability in the regulatory networks of closely related organisms. As large scale expression data sets become increasingly common for model and non-model organisms, comparative analyses of atomic regulons will provide valuable insights into fundamental regulatory modules used across the bacterial domain. PMID:27933038

  18. A high-quality annotated transcriptome of swine peripheral blood

    USDA-ARS's Scientific Manuscript database

    Background: High throughput gene expression profiling assays of peripheral blood are widely used in biomedicine, as well as in animal genetics and physiology research. Accurate, comprehensive, and precise interpretation of such high throughput assays relies on well-characterized reference genomes an...

  19. Transgenic mouse models in the study of reproduction: insights into GATA protein function.

    PubMed

    Tevosian, Sergei G

    2014-07-01

    For the past 2 decades, transgenic technology in mice has allowed for an unprecedented insight into the transcriptional control of reproductive development and function. The key factor among the mouse genetic tools that made this rapid advance possible is a conditional transgenic approach, a particularly versatile method of creating gene deletions and substitutions in the mouse genome. A centerpiece of this strategy is an enzyme, Cre recombinase, which is expressed from defined DNA regulatory elements that are active in the tissue of choice. The regulatory DNA element (either genetically engineered or natural) assures Cre expression only in predetermined cell types, leading to the guided deletion of genetically modified (flanked by loxP or 'floxed' by loxP) gene loci. This review summarizes and compares the studies in which genes encoding GATA family transcription factors were targeted either globally or by Cre recombinases active in the somatic cells of ovaries and testes. The conditional gene loss experiments require detailed knowledge of the spatial and temporal expression of Cre activity, and the challenges in interpreting the outcomes are highlighted. These studies also expose the complexity of GATA-dependent regulation of gonadal gene expression and suggest that gene function is highly context dependent. © 2014 Society for Reproduction and Fertility.

  20. Epigenetics and depression: return of the repressed.

    PubMed

    Dalton, Victoria S; Kolshus, Erik; McLoughlin, Declan M

    2014-02-01

    Epigenetics has recently emerged as a potential mechanism by which adverse environmental stimuli can result in persistent changes in gene expression. Epigenetic mechanisms function alongside the DNA sequence to modulate gene expression and ultimately influence protein production. The current review provides an introduction and overview of epigenetics with a particular focus on preclinical and clinical studies relevant to major depressive disorder (MDD). PubMed and Web of Science databases were interrogated from January 1995 up to December 2012 using combinations of search terms, including "epigenetic", "microRNA" and "DNA methylation" cross referenced with "depression", "early life stress" and "antidepressant". There is an association between adverse environmental stimuli, such as early life stress, and epigenetic modification of gene expression. Epigenetic changes have been reported in humans with MDD and may serve as biomarkers to improve diagnosis. Antidepressant treatments appear to reverse or initiate compensatory epigenetic alterations that may be relevant to their mechanism of action. As a narrative review, the current report was interpretive and qualitative in nature. Epigenetic modification of gene expression provides a mechanism for understanding the link between long-term effects of adverse life events and the changes in gene expression that are associated with depression. Although still a developing field, in the future, epigenetic modifications of gene expression may provide novel biomarkers to predict future susceptibility and/or onset of MDD, improve diagnosis, and aid in the development of epigenetics-based therapies for depression. © 2013 Published by Elsevier B.V.

  1. Integrative Functional Genomics for Systems Genetics in GeneWeaver.org.

    PubMed

    Bubier, Jason A; Langston, Michael A; Baker, Erich J; Chesler, Elissa J

    2017-01-01

    The abundance of existing functional genomics studies permits an integrative approach to interpreting and resolving the results of diverse systems genetics studies. However, a major challenge lies in assembling and harmonizing heterogeneous data sets across species for facile comparison to the positional candidate genes and coexpression networks that come from systems genetic studies. GeneWeaver is an online database and suite of tools at www.geneweaver.org that allows for fast aggregation and analysis of gene set-centric data. GeneWeaver contains curated experimental data together with resource-level data such as GO annotations, MP annotations, and KEGG pathways, along with persistent stores of user entered data sets. These can be entered directly into GeneWeaver or transferred from widely used resources such as GeneNetwork.org. Data are analyzed using statistical tools and advanced graph algorithms to discover new relations, prioritize candidate genes, and generate function hypotheses. Here we use GeneWeaver to find genes common to multiple gene sets, prioritize candidate genes from a quantitative trait locus, and characterize a set of differentially expressed genes. Coupling a large multispecies repository curated and empirical functional genomics data to fast computational tools allows for the rapid integrative analysis of heterogeneous data for interpreting and extrapolating systems genetics results.

  2. Hen uterine gene expression profiling during eggshell formation reveals putative proteins involved in the supply of minerals or in the shell mineralization process

    PubMed Central

    2014-01-01

    Background The chicken eggshell is a natural mechanical barrier to protect egg components from physical damage and microbial penetration. Its integrity and strength is critical for the development of the embryo or to ensure for consumers a table egg free of pathogens. This study compared global gene expression in laying hen uterus in the presence or absence of shell calcification in order to characterize gene products involved in the supply of minerals and / or the shell biomineralization process. Results Microarrays were used to identify a repertoire of 302 over-expressed genes during shell calcification. GO terms enrichment was performed to provide a global interpretation of the functions of the over-expressed genes, and revealed that the most over-represented proteins are related to reproductive functions. Our analysis identified 16 gene products encoding proteins involved in mineral supply, and allowed updating of the general model describing uterine ion transporters during eggshell calcification. A list of 57 proteins potentially secreted into the uterine fluid to be active in the mineralization process was also established. They were classified according to their potential functions (biomineralization, proteoglycans, molecular chaperone, antimicrobials and proteases/antiproteases). Conclusions Our study provides detailed descriptions of genes and corresponding proteins over-expressed when the shell is mineralizing. Some of these proteins involved in the supply of minerals and influencing the shell fabric to protect the egg contents are potentially useful biological markers for the genetic improvement of eggshell quality. PMID:24649854

  3. Lex-SVM: exploring the potential of exon expression profiling for disease classification.

    PubMed

    Yuan, Xiongying; Zhao, Yi; Liu, Changning; Bu, Dongbo

    2011-04-01

    Exon expression profiling technologies, including exon arrays and RNA-Seq, measure the abundance of every exon in a gene. Compared with gene expression profiling technologies like 3' array, exon expression profiling technologies could detect alterations in both transcription and alternative splicing, therefore they are expected to be more sensitive in diagnosis. However, exon expression profiling also brings higher dimension, more redundancy, and significant correlation among features. Ignoring the correlation structure among exons of a gene, a popular classification method like L1-SVM selects exons individually from each gene and thus is vulnerable to noise. To overcome this limitation, we present in this paper a new variant of SVM named Lex-SVM to incorporate correlation structure among exons and known splicing patterns to promote classification performance. Specifically, we construct a new norm, ex-norm, including our prior knowledge on exon correlation structure to regularize the coefficients of a linear SVM. Lex-SVM can be solved efficiently using standard linear programming techniques. The advantage of Lex-SVM is that it can select features group-wise, force features in a subgroup to take equal weights and exclude the features that contradict the majority in the subgroup. Experimental results suggest that on exon expression profile, Lex-SVM is more accurate than existing methods. Lex-SVM also generates a more compact model and selects genes more consistently in cross-validation. Unlike L1-SVM, which selects only one exon in a gene, Lex-SVM assigns equal weights to as many exons in a gene as possible, which makes the resulting model easier to interpret.

  4. Analysis of gene expression in a developmental context emphasizes distinct biological leitmotifs in human cancers

    PubMed Central

    Naxerova, Kamila; Bult, Carol J; Peaston, Anne; Fancher, Karen; Knowles, Barbara B; Kasif, Simon; Kohane, Isaac S

    2008-01-01

    Background In recent years, the molecular underpinnings of the long-observed resemblance between neoplastic and immature tissue have begun to emerge. Genome-wide transcriptional profiling has revealed similar gene expression signatures in several tumor types and early developmental stages of their tissue of origin. However, it remains unclear whether such a relationship is a universal feature of malignancy, whether heterogeneities exist in the developmental component of different tumor types and to which degree the resemblance between cancer and development is a tissue-specific phenomenon. Results We defined a developmental landscape by summarizing the main features of ten developmental time courses and projected gene expression from a variety of human tumor types onto this landscape. This comparison demonstrates a clear imprint of developmental gene expression in a wide range of tumors and with respect to different, even non-cognate developmental backgrounds. Our analysis reveals three classes of cancers with developmentally distinct transcriptional patterns. We characterize the biological processes dominating these classes and validate the class distinction with respect to a new time series of murine embryonic lung development. Finally, we identify a set of genes that are upregulated in most cancers and we show that this signature is active in early development. Conclusion This systematic and quantitative overview of the relationship between the neoplastic and developmental transcriptome spanning dozens of tissues provides a reliable outline of global trends in cancer gene expression, reveals potentially clinically relevant differences in the gene expression of different cancer types and represents a reference framework for interpretation of smaller-scale functional studies. PMID:18611264

  5. Semantic integration of gene expression analysis tools and data sources using software connectors

    PubMed Central

    2013-01-01

    Background The study and analysis of gene expression measurements is the primary focus of functional genomics. Once expression data is available, biologists are faced with the task of extracting (new) knowledge associated to the underlying biological phenomenon. Most often, in order to perform this task, biologists execute a number of analysis activities on the available gene expression dataset rather than a single analysis activity. The integration of heterogeneous tools and data sources to create an integrated analysis environment represents a challenging and error-prone task. Semantic integration enables the assignment of unambiguous meanings to data shared among different applications in an integrated environment, allowing the exchange of data in a semantically consistent and meaningful way. This work aims at developing an ontology-based methodology for the semantic integration of gene expression analysis tools and data sources. The proposed methodology relies on software connectors to support not only the access to heterogeneous data sources but also the definition of transformation rules on exchanged data. Results We have studied the different challenges involved in the integration of computer systems and the role software connectors play in this task. We have also studied a number of gene expression technologies, analysis tools and related ontologies in order to devise basic integration scenarios and propose a reference ontology for the gene expression domain. Then, we have defined a number of activities and associated guidelines to prescribe how the development of connectors should be carried out. Finally, we have applied the proposed methodology in the construction of three different integration scenarios involving the use of different tools for the analysis of different types of gene expression data. 
Conclusions The proposed methodology facilitates the development of connectors capable of semantically integrating different gene expression analysis tools and data sources. The methodology can be used in the development of connectors supporting both simple and nontrivial processing requirements, thus assuring accurate data exchange and information interpretation from exchanged data. PMID:24341380

  6. Semantic integration of gene expression analysis tools and data sources using software connectors.

    PubMed

    Miyazaki, Flávia A; Guardia, Gabriela D A; Vêncio, Ricardo Z N; de Farias, Cléver R G

    2013-10-25

    The study and analysis of gene expression measurements is the primary focus of functional genomics. Once expression data is available, biologists are faced with the task of extracting (new) knowledge associated to the underlying biological phenomenon. Most often, in order to perform this task, biologists execute a number of analysis activities on the available gene expression dataset rather than a single analysis activity. The integration of heterogeneous tools and data sources to create an integrated analysis environment represents a challenging and error-prone task. Semantic integration enables the assignment of unambiguous meanings to data shared among different applications in an integrated environment, allowing the exchange of data in a semantically consistent and meaningful way. This work aims at developing an ontology-based methodology for the semantic integration of gene expression analysis tools and data sources. The proposed methodology relies on software connectors to support not only the access to heterogeneous data sources but also the definition of transformation rules on exchanged data. We have studied the different challenges involved in the integration of computer systems and the role software connectors play in this task. We have also studied a number of gene expression technologies, analysis tools and related ontologies in order to devise basic integration scenarios and propose a reference ontology for the gene expression domain. Then, we have defined a number of activities and associated guidelines to prescribe how the development of connectors should be carried out. Finally, we have applied the proposed methodology in the construction of three different integration scenarios involving the use of different tools for the analysis of different types of gene expression data. The proposed methodology facilitates the development of connectors capable of semantically integrating different gene expression analysis tools and data sources. 
The methodology can be used in the development of connectors supporting both simple and nontrivial processing requirements, thus assuring accurate data exchange and information interpretation from exchanged data.

  7. Limitations of commonly used internal controls for real-time RT-PCR analysis of renal epithelial-mesenchymal cell transition.

    PubMed

    Elberg, Gerard; Elberg, Dorit; Logan, Charlotte J; Chen, Lijuan; Turman, Martin A

    2006-01-01

    Progressive renal fibrotic disease is accompanied by the massive accumulation of myofibroblasts as defined by alpha smooth muscle actin (alphaSMA) expression. We quantitated gene expression using real-time RT-PCR analysis during conversion of primary cultured human renal tubular cells (RTC) to myofibroblasts after treatment with transforming growth factor-beta1 (TGF-beta1). We report herein the limitations of commonly used reference genes for mRNA quantitation. We determined the expression of alphaSMA and megakaryoblastic leukemia-1 (MKL1), a transcriptional regulator of alphaSMA, by quantitative real-time PCR using three common internal controls, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), cyclophilin A and 18S rRNA. Expression of GAPDH mRNA and cyclophilin A mRNA, and to a lesser extent, 18S rRNA levels varied over time in culture and with exposure to TGF-beta1. Thus, depending on which reference gene was used, TGF-beta1 appeared to have different effects on expression of MKL1 and alphaSMA. RTC converting to myofibroblasts in primary culture is a valuable system to study renal fibrosis in humans. However, variability in expression of reference genes with TGF-beta1 treatment illustrates the need to validate mRNA quantitation with multiple reference genes to provide accurate interpretation of fibrosis studies in the absence of a universal internal standard for mRNA expression. 2006 S. Karger AG, Basel.

  8. paraGSEA: a scalable approach for large-scale gene expression profiling

    PubMed Central

    Peng, Shaoliang; Yang, Shunyun

    2017-01-01

    More studies have been conducted using gene expression similarity to identify functional connections among genes, diseases and drugs. Gene Set Enrichment Analysis (GSEA) is a powerful analytical method for interpreting gene expression data. However, due to its enormous computational overhead in the estimation of significance level step and multiple hypothesis testing step, the computation scalability and efficiency are poor on large-scale datasets. We proposed paraGSEA for efficient large-scale transcriptome data analysis. By optimization, the overall time complexity of paraGSEA is reduced from O(mn) to O(m+n), where m is the length of the gene sets and n is the length of the gene expression profiles, which contributes more than 100-fold increase in performance compared with other popular GSEA implementations such as GSEA-P, SAM-GS and GSEA2. By further parallelization, a near-linear speed-up is gained on both workstations and clusters in an efficient manner with high scalability and performance on large-scale datasets. The analysis time of the whole LINCS phase I dataset (GSE92742) was reduced to nearly half an hour on a 1000 node cluster on Tianhe-2, or within 120 hours on a 96-core workstation. The source code of paraGSEA is licensed under the GPLv3 and available at http://github.com/ysycloud/paraGSEA. PMID:28973463
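
    To make the O(m+n) figure concrete, the sketch below computes an unweighted, Kolmogorov-Smirnov-style enrichment score in one pass over a ranked gene list, after hashing the gene set once. It is a generic illustration of a GSEA running sum under the assumption that each gene appears once in the ranking; it is not paraGSEA's actual implementation, and the function name and data are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score of gene_set in ranked_genes.

    Building the membership set costs O(m) and the single pass over the
    ranking costs O(n), so the whole computation is O(m + n).
    Assumes ranked_genes contains no duplicates.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranking only partially")
    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        # Step up on a set member, down otherwise; the walk ends at zero.
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


ranking = ["a", "b", "c", "d", "e", "f"]   # most up-regulated first
print(enrichment_score(ranking, {"a", "b"}))
print(enrichment_score(ranking, {"e", "f"}))
```

    A set concentrated at the top of the ranking yields a positive score and one concentrated at the bottom a negative score; significance would still require permutation testing, which is the step the abstract identifies as the computational bottleneck.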

  9. Selection of reference genes for expression analysis of Kumamoto and Portuguese oysters and their hybrid

    NASA Astrophysics Data System (ADS)

    Yan, Lulu; Su, Jiaqi; Wang, Zhaoping; Yan, Xiwu; Yu, Ruihai

    2017-12-01

    Quantitative real-time polymerase chain reaction (qRT-PCR) is a rapid and reliable technique which has been widely used to quantify gene transcripts (expression analysis). It is also employed for studying heterosis, hybridization breeding and hybrid tolerability of oysters, an ecologically and economically important taxonomic group. For these studies, selection of a suitable set of housekeeping genes as references is crucial for correct interpretation of qRT-PCR data. To identify suitable reference genes for oysters during low temperature and low salinity stresses, we analyzed twelve genes from the gill tissue of Crassostrea sikamea (SS), Crassostrea angulata (AA) and their hybrid (SA), which included three ribosomal genes, 28S ribosomal protein S5 (RPS5), ribosomal protein L35 (RPL35), and 60S ribosomal protein L29 (RPL29); three structural genes, tubulin gamma (TUBγ), annexin A6 and A7 (AA6 and AA7); three metabolic pathway genes, ornithine decarboxylase (OD), glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and glutathione S-transferase P1 (GSP); two transcription factors, elongation factor 1 alpha and beta (EF1α and EF1β); and one protein synthesis gene, ubiquitin (UBQ). Primers specific for these genes were successfully developed for the three groups of oysters. Three different algorithms, geNorm, NormFinder and BestKeeper, were used to evaluate the expression stability of these candidate genes. The BestKeeper program was found to be the most reliable. Based on our analysis, we found that the expression of RPL35 and EF1α was stable under low salinity stress, and the expression of OD, GAPDH and EF1α was stable under low temperature stress in hybrid (SA) oyster; the expression of RPS5 and GAPDH was stable under low salinity stress, and the expression of RPS5, UBQ, GAPDH was stable under low temperature stress in SS oyster; the expression of RPS5, GAPDH, EF1β and AA7 was stable under low salinity stress, and the expression of RPL35, EF1α, GAPDH and EF1β was stable under low temperature stress in AA oyster. Furthermore, to evaluate their suitability, the reference genes were used to quantify six target genes. In conclusion, we have successfully developed primers appropriate for the expression analysis in SS, SA and AA.
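    As a rough guide to what a stability ranking measures, the geNorm-style M value of a candidate gene is the average, over all other candidates, of the standard deviation of their pairwise log2 expression ratios across samples; lower M means more stable. The sketch below is a minimal reimplementation under that definition, not the geNorm software itself, and the gene labels and quantities are invented.

```python
import math
from statistics import stdev


def genorm_m(quantities):
    """Pairwise-variation stability measure (lower is more stable).

    quantities: dict mapping gene name -> list of relative expression
    values, one per sample, in the same sample order for every gene.
    """
    genes = list(quantities)
    m_values = {}
    for j in genes:
        sds = []
        for k in genes:
            if k == j:
                continue
            log_ratios = [math.log2(a / b)
                          for a, b in zip(quantities[j], quantities[k])]
            sds.append(stdev(log_ratios))
        m_values[j] = sum(sds) / len(sds)
    return m_values


# Hypothetical relative quantities for three candidates in three samples.
example = {
    "refA": [1.0, 2.0, 4.0],
    "refB": [2.0, 4.0, 8.0],  # constant ratio to refA
    "refC": [1.0, 4.0, 1.0],  # varies independently
}
m = genorm_m(example)
print(sorted(m, key=m.get))
```

    Here the two candidates that move together receive the same low M, while the independently varying one ranks last.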

  10. Reference gene selection for quantitative gene expression studies during biological invasions: A test on multiple genes and tissues in a model ascidian Ciona savignyi.

    PubMed

    Huang, Xuena; Gao, Yangchun; Jiang, Bei; Zhou, Zunchun; Zhan, Aibin

    2016-01-15

    As invasive species have successfully colonized a wide range of dramatically different local environments, they offer a good opportunity to study interactions between species and rapidly changing environments. Gene expression represents one of the primary and crucial mechanisms for rapid adaptation to local environments. Here, we aim to select reference genes for quantitative gene expression analysis based on quantitative Real-Time PCR (qRT-PCR) for a model invasive ascidian, Ciona savignyi. We analyzed the stability of ten candidate reference genes in three tissues (siphon, pharynx and intestine) under two key environmental stresses (temperature and salinity) in the marine realm based on three programs (geNorm, NormFinder and delta Ct method). Our results demonstrated only minor differences in stability rankings among the three methods. The use of different single reference genes might influence the data interpretation, while multiple reference genes could minimize possible errors. Therefore, reference gene combinations were recommended for different tissues - the optimal reference gene combination for siphon was RPS15 and RPL17 under temperature stress, and RPL17, UBQ and TubA under salinity treatment; for pharynx, TubB, TubA and RPL17 were the most stable genes under temperature stress, while TubB, TubA and UBQ were the best under salinity stress; for intestine, UBQ, RPS15 and RPL17 were the most reliable reference genes under both treatments. Our results suggest the necessity of selecting and testing reference genes for different tissues under varying environmental stresses. The results obtained here are expected to reveal mechanisms of gene expression-mediated invasion success using C. savignyi as a model species. Copyright © 2015 Elsevier B.V. All rights reserved.

  11. Selection of housekeeping genes for gene expression studies in the adult rat submandibular gland under normal, inflamed, atrophic and regenerative states

    PubMed Central

    Silver, Nicholas; Cotroneo, Emanuele; Proctor, Gordon; Osailan, Samira; Paterson, Katherine L; Carpenter, Guy H

    2008-01-01

    Background Real-time PCR is a reliable tool with which to measure mRNA transcripts, and provides valuable information on gene expression profiles. Endogenous controls such as housekeeping genes are used to normalise mRNA levels between samples for sensitive comparisons of mRNA transcription. Selection of the most stable control gene(s) is therefore critical for the reliable interpretation of gene expression data. For the purpose of this study, 7 commonly used housekeeping genes were investigated in salivary submandibular glands under normal, inflamed, atrophic and regenerative states. Results The program NormFinder identified the suitability of HPRT to use as a single gene for normalisation within the normal, inflamed and regenerative states, and GAPDH in the atrophic state. For normalisation to multiple housekeeping genes, for each individual state, the optimal number of housekeeping genes as given by geNorm was: ACTB/UBC in the normal, ACTB/YWHAZ in the inflamed, ACTB/HPRT in the atrophic and ACTB/GAPDH in the regenerative state. The most stable housekeeping gene identified between states (compared to normal) was UBC. However, ACTB, identified as one of the most stably expressed genes within states, was found to be one of the most variable between states. Furthermore we demonstrated that normalising between states to ACTB, rather than UBC, introduced an approximately 3 fold magnitude of error. Conclusion Using NormFinder, our studies demonstrated the suitability of HPRT to use as a single gene for normalisation within the normal, inflamed and regenerative groups and GAPDH in the atrophic group. However, if normalising to multiple housekeeping genes, we recommend normalising to those identified by geNorm. For normalisation across the physiological states, we recommend the use of UBC. PMID:18637167
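    Normalising to multiple housekeeping genes is commonly done by dividing each sample's target quantity by the geometric mean of its reference-gene quantities. The sketch below shows that calculation under this assumption; the function names and all values are hypothetical and are not taken from the study.

```python
import math


def normalization_factors(reference_quantities):
    """Per-sample geometric mean across reference genes.

    reference_quantities: dict gene -> list of relative quantities,
    one per sample, in the same sample order for every gene.
    """
    per_sample = zip(*reference_quantities.values())
    return [math.exp(sum(math.log(q) for q in sample) / len(sample))
            for sample in per_sample]


def normalize(target_quantities, factors):
    """Divide each sample's target quantity by its normalization factor."""
    return [t / f for t, f in zip(target_quantities, factors)]


# Hypothetical quantities for two reference genes in two samples.
refs = {"ref1": [1.0, 4.0], "ref2": [4.0, 1.0]}
factors = normalization_factors(refs)
normalized = normalize([2.0, 6.0], factors)
print(factors, normalized)
```

    The geometric mean is preferred over the arithmetic mean here because it damps the influence of a single highly expressed or outlying reference gene.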

  12. Temporal patterns in the transcriptomic response of rainbow trout, Oncorhynchus mykiss, to crude oil.

    PubMed

    Hook, Sharon E; Lampi, Mark A; Febbo, Eric J; Ward, Jeff A; Parkerton, Thomas F

    2010-09-01

    Time is often not characterized as a variable in ecotoxicogenomic studies. In this study, temporal changes in gene expression were determined during exposure to crude oil and a subsequent recovery period. Juvenile rainbow trout, Oncorhynchus mykiss, were exposed for 96 h to the water accommodated fractions of 0.4, 2 or 10 mg/l crude oil loadings. Following 96 h of exposure, fish were transferred to recovery tanks. Gill and liver samples were collected after 24 and 96 h of exposure, and after 96 h of recovery for RNA extraction and microarray analysis. Fluorescently labeled cDNA was hybridized against matched controls, using salmonid cDNA arrays. Each exposure scenario generated unique patterns of altered gene expression. More genes responded to crude oil in the gill than in the liver. In the gill, 1137 genes had altered expression at 24 h, 2003 genes had altered expression levels at 96 h of exposure, yet by 96 h of recovery, no genes were significantly altered in expression. In the liver at 10 mg/l, only five genes were changed at 24 h, yet 192 genes had altered expression after 96 h recovery. At 2 mg/l in the liver, many genes had altered regulation at all three time points. The 0.4 mg/l loading also showed 289 genes upregulated at 24 h after exposure. The Gene Ontology terms associated with altered expression in the liver suggested that the processes of protein synthesis, xenobiotic metabolism, and oxidoreductase activity were altered. The concentration-responsive expression profile of cytochrome P450 1A, a biomarker for oil exposure, did not predict the majority of gene expression profiles in any tissue or dose, since direct relationships with dose were not observed for most genes. While the genes and their associated functions agree with known modes of toxic action for crude oil, the gene lists obtained do not match our previously published work, presumably due to array analysis procedures. These results demonstrate that changes in gene expression with time and dose may be complicated, and should be characterized in controlled laboratory settings before attempts are made to interpret responses in field-collected organisms. Further, processes for analyzing microarray data need to be developed such that standardized gene lists are developed, or that analysis does not rely on lists of significantly altered genes before arrays can be further evaluated as a monitoring tool. Crown Copyright 2010. Published by Elsevier B.V. All rights reserved.

  13. Identification of key microRNAs and genes in preeclampsia by bioinformatics analysis

    PubMed Central

    Luo, Shouling; Cao, Nannan; Tang, Yao; Gu, Weirong

    2017-01-01

    Preeclampsia is a leading cause of perinatal maternal–foetal mortality and morbidity. The aim of this study is to identify the key microRNAs and genes in preeclampsia and uncover their potential functions. We downloaded the miRNA expression profile of GSE84260 and the gene expression profile of GSE73374 from the Gene Expression Omnibus database. Differentially expressed miRNAs and genes were identified and compared to miRNA-target information from MiRWalk 2.0, and a total of 65 differentially expressed miRNAs (DEMIs), including 32 up-regulated miRNAs and 33 down-regulated miRNAs, and 91 differentially expressed genes (DEGs), including 83 up-regulated genes and 8 down-regulated genes, were identified. The pathway enrichment analyses of the DEMIs showed that the up-regulated DEMIs were enriched in the Hippo signalling pathway and MAPK signalling pathway, and the down-regulated DEMIs were enriched in HTLV-I infection and miRNAs in cancers. The gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes pathway (KEGG) enrichment analyses of the DEGs were performed using Multifaceted Analysis Tool for Human Transcriptome. The up-regulated DEGs were enriched in biological processes (BPs), including the response to cAMP, response to hydrogen peroxide and cell-cell adhesion mediated by integrin; no enrichment of down-regulated DEGs was identified. KEGG analysis showed that the up-regulated DEGs were enriched in the Hippo signalling pathway and pathways in cancer. A PPI network of the DEGs was constructed by using Cytoscape software, and FOS, STAT1, MMP14, ITGB1, VCAN, DUSP1, LDHA, MCL1, MET, and ZFP36 were identified as the hub genes. The current study illustrates a characteristic microRNA profile and gene profile in preeclampsia, which may contribute to the interpretation of the progression of preeclampsia and provide novel biomarkers and therapeutic targets for preeclampsia. PMID:28594854

  14. Bioinformatics approaches for cross-species liver cancer analysis based on microarray gene expression profiling

    PubMed Central

    Fang, H; Tong, W; Perkins, R; Shi, L; Hong, H; Cao, X; Xie, Q; Yim, SH; Ward, JM; Pitot, HC; Dragan, YP

    2005-01-01

    Background The completion of the sequencing of human, mouse and rat genomes and knowledge of cross-species gene homologies enables studies of differential gene expression in animal models. These types of studies have the potential to greatly enhance our understanding of diseases such as liver cancer in humans. Genes co-expressed across multiple species are most likely to have conserved functions. We have used various bioinformatics approaches to examine microarray expression profiles from liver neoplasms that arise in albumin-SV40 transgenic rats to elucidate genes, chromosome aberrations and pathways that might be associated with human liver cancer. Results In this study, we first identified 2223 differentially expressed genes by comparing gene expression profiles for two control, two adenoma and two carcinoma samples using an F-test. These genes were subsequently mapped to the rat chromosomes using a novel visualization tool, the Chromosome Plot. Using the same plot, we further mapped the significant genes to orthologous chromosomal locations in human and mouse. Many genes expressed in rat 1q that are amplified in rat liver cancer map to the human chromosomes 10, 11 and 19 and to the mouse chromosomes 7, 17 and 19, which have been implicated in studies of human and mouse liver cancer. Using Comparative Genomics Microarray Analysis (CGMA), we identified regions of potential aberrations in human. Lastly, a pathway analysis was conducted to predict altered human pathways based on statistical analysis and extrapolation from the rat data. All of the identified pathways have been known to be important in the etiology of human liver cancer, including cell cycle control, cell growth and differentiation, apoptosis, transcriptional regulation, and protein metabolism. 
Conclusion The study demonstrates that the hepatic gene expression profiles from the albumin-SV40 transgenic rat model revealed genes, pathways and chromosome alterations consistent with experimental and clinical research in human liver cancer. The bioinformatics tools presented in this paper are essential for cross species extrapolation and mapping of microarray data, its analysis and interpretation. PMID:16026603

  15. Finding gene regulatory network candidates using the gene expression knowledge base.

    PubMed

    Venkatesan, Aravind; Tripathi, Sushil; Sanz de Galdeano, Alejandro; Blondé, Ward; Lægreid, Astrid; Mironov, Vladimir; Kuiper, Martin

    2014-12-10

    Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of 'omics' data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis. We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions. Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.

  16. Exploratory factor analysis of pathway copy number data with an application towards the integration with gene expression data.

    PubMed

    van Wieringen, Wessel N; van de Wiel, Mark A

    2011-05-01

    Realizing that genes often operate together, studies into the molecular biology of cancer shift focus from individual genes to pathways. In order to understand the regulatory mechanisms of a pathway, one must study its genes at all molecular levels. To facilitate such study at the genomic level, we developed exploratory factor analysis for the characterization of the variability of a pathway's copy number data. A latent variable model that describes the call probability data of a pathway is introduced and fitted with an EM algorithm. In two breast cancer data sets, it is shown that the first two latent variables of GO nodes, which inherit a clear interpretation from the call probabilities, are often related to the proportion of aberrations and a contrast of the probabilities of a loss and of a gain. Linking the latent variables to the node's gene expression data suggests that they capture the "global" effect of genomic aberrations on these transcript levels. In all, the proposed method provides a possibly insightful characterization of pathway copy number data, which may be fruitfully exploited to study the interaction between the pathway's DNA copy number aberrations and data from other molecular levels like gene expression.

  17. Predicting gene regulatory networks of soybean nodulation from RNA-Seq transcriptome data.

    PubMed

    Zhu, Mingzhu; Dahmen, Jeremy L; Stacey, Gary; Cheng, Jianlin

    2013-09-22

    High-throughput RNA sequencing (RNA-Seq) is a revolutionary technique to study the transcriptome of a cell under various conditions at a systems level. Despite the wide application of RNA-Seq techniques to generate experimental data in the last few years, few computational methods are available to analyze this huge amount of transcription data. The computational methods for constructing gene regulatory networks from RNA-Seq expression data of hundreds or even thousands of genes are particularly lacking and urgently needed. We developed an automated bioinformatics method to predict gene regulatory networks from the quantitative expression values of differentially expressed genes based on RNA-Seq transcriptome data of a cell in different stages and conditions, integrating transcriptional, genomic and gene function data. We applied the method to the RNA-Seq transcriptome data generated for soybean root hair cells in three different development stages of nodulation after rhizobium infection. The method predicted a soybean nodulation-related gene regulatory network consisting of 10 regulatory modules common for all three stages, and 24, 49 and 70 modules separately for the first, second and third stage, each containing both a group of co-expressed genes and several transcription factors collaboratively controlling their expression under different conditions. 8 of 10 common regulatory modules were validated by at least two kinds of validations, such as independent DNA binding motif analysis, gene function enrichment test, and previous experimental data in the literature. We developed a computational method to reliably reconstruct gene regulatory networks from RNA-Seq transcriptome data. The method can generate valuable hypotheses for interpreting biological data and designing biological experiments such as ChIP-Seq, RNA interference, and yeast two hybrid experiments.

  18. Discovering biclusters in gene expression data based on high-dimensional linear geometries

    PubMed Central

    Gan, Xiangchao; Liew, Alan Wee-Chung; Yan, Hong

    2008-01-01

    Background In DNA microarray experiments, discovering groups of genes that share similar transcriptional characteristics is instrumental in functional annotation, tissue classification and motif identification. However, in many situations a subset of genes only exhibits consistent pattern over a subset of conditions. Conventional clustering algorithms that deal with the entire row or column in an expression matrix would therefore fail to detect these useful patterns in the data. Recently, biclustering has been proposed to detect a subset of genes exhibiting consistent pattern over a subset of conditions. However, most existing biclustering algorithms are based on searching for sub-matrices within a data matrix by optimizing certain heuristically defined merit functions. Moreover, most of these algorithms can only detect a restricted set of bicluster patterns. Results In this paper, we present a novel geometric perspective for the biclustering problem. The biclustering process is interpreted as the detection of linear geometries in a high dimensional data space. Such a new perspective views biclusters with different patterns as hyperplanes in a high dimensional space, and allows us to handle different types of linear patterns simultaneously by matching a specific set of linear geometries. This geometric viewpoint also inspires us to propose a generic bicluster pattern, i.e. the linear coherent model that unifies the seemingly incompatible additive and multiplicative bicluster models. As a particular realization of our framework, we have implemented a Hough transform-based hyperplane detection algorithm. The experimental results on human lymphoma gene expression dataset show that our algorithm can find biologically significant subsets of genes. Conclusion We have proposed a novel geometric interpretation of the biclustering problem. 
We have shown that many common types of bicluster are just different spatial arrangements of hyperplanes in a high dimensional data space. An implementation of the geometric framework using the Fast Hough transform for hyperplane detection can be used to discover biologically significant subsets of genes under subsets of conditions for microarray data analysis. PMID:18433477
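    One way to see why a single linear model can cover both classical bicluster types is to compare any column of a bicluster with a reference column. The notation below (entries a_ij, row effects alpha_i, column effects beta_j, reference column 1, slope k_j, offset c_j) is assumed here for illustration and is not taken from the paper.

```latex
\begin{align*}
\text{additive:}\quad & a_{ij} = \mu + \alpha_i + \beta_j
  &&\Rightarrow\; a_{ij} = a_{i1} + (\beta_j - \beta_1), \\
\text{multiplicative:}\quad & a_{ij} = \mu\,\alpha_i\,\beta_j
  &&\Rightarrow\; a_{ij} = \frac{\beta_j}{\beta_1}\,a_{i1} \quad (\beta_1 \neq 0), \\
\text{linear coherent:}\quad & a_{ij} = k_j\,a_{i1} + c_j .
\end{align*}
```

    The additive case corresponds to k_j = 1 with c_j = beta_j - beta_1, and the multiplicative case to k_j = beta_j / beta_1 with c_j = 0. In both cases the rows of the bicluster, viewed as points in the space of conditions, satisfy linear equations, which is why detecting them can be posed as a hyperplane detection problem.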

  19. Linking Genes to Cardiovascular Diseases: Gene Action and Gene–Environment Interactions

    PubMed Central

    2016-01-01

    A unique myocardial characteristic is its ability to grow/remodel in order to adapt; this is determined partly by genes and partly by the environment and the milieu intérieur. In the “post-genomic” era, a need is emerging to elucidate the physiologic functions of myocardial genes, as well as potential adaptive and maladaptive modulations induced by environmental/epigenetic factors. Genome sequencing and analysis advances have become exponential lately, with escalation of our knowledge concerning sometimes controversial genetic underpinnings of cardiovascular diseases. Current technologies can identify candidate genes variously involved in diverse normal/abnormal morphomechanical phenotypes, and offer insights into multiple genetic factors implicated in complex cardiovascular syndromes. The expression profiles of thousands of genes are regularly ascertained under diverse conditions. Global analyses of gene expression levels are useful for cataloging genes and correlated phenotypes, and for elucidating the role of genes in maladies. Comparative expression of gene networks coupled to complex disorders can contribute insights as to how “modifier genes” influence the expressed phenotypes. Increasingly, a more comprehensive and detailed systematic understanding of genetic abnormalities underlying, for example, various genetic cardiomyopathies is emerging. Implementing genomic findings in cardiology practice may well lead directly to better diagnosing and therapeutics. There is currently evolving a strong appreciation for the value of studying gene anomalies, and doing so in a non-disjointed, cohesive manner. However, it is challenging for many—practitioners and investigators—to comprehend, interpret, and utilize the clinically increasingly accessible and affordable cardiovascular genomics studies. This survey addresses the need for fundamental understanding in this vital area. PMID:26545598

  20. The stable traits of melanoma genetics: an alternate approach to target discovery

    PubMed Central

    2012-01-01

    Background The weight that gene copy number plays in transcription remains controversial; although in specific cases gene expression correlates with copy number, the relationship cannot be inferred at the global level. We hypothesized that genes steadily expressed by 15 melanoma cell lines (CMs) and their parental tissues (TMs) should be critical for oncogenesis and their expression most frequently influenced by their respective copy number. Results Functional interpretation of 3,030 transcripts concordantly expressed (Pearson's correlation coefficient p-value < 0.05) by CMs and TMs confirmed an enrichment of functions crucial to oncogenesis. Among them, 968 were expressed according to the transcriptional efficiency predicted by copy number analysis (Pearson's correlation coefficient p-value < 0.05). We named these genes, "genomic delegates" as they represent at the transcriptional level the genetic footprint of individual cancers. We then tested whether the genes could categorize 112 melanoma metastases. Two divergent phenotypes were observed: one with prevalent expression of cancer testis antigens, enhanced cyclin activity, WNT signaling, and a Th17 immune phenotype (Class A). This phenotype expressed, therefore, transcripts previously associated to more aggressive cancer. The second class (B) prevalently expressed genes associated with melanoma signaling including MITF, melanoma differentiation antigens, and displayed a Th1 immune phenotype associated with better prognosis and likelihood to respond to immunotherapy. An intermediate third class (C) was further identified. The three phenotypes were confirmed by unsupervised principal component analysis. Conclusions This study suggests that clinically relevant phenotypes of melanoma can be retraced to stable oncogenic properties of cancer cells linked to their genetic back bone, and offers a roadmap for uncovering novel targets for tailored anti-cancer therapy. PMID:22537248

  1. Purification of cardiac myocytes from human heart biopsies for gene expression analysis.

    PubMed

    Kosloski, L M; Bales, I K; Allen, K B; Walker, B L; Borkon, A M; Stuart, R S; Pak, A F; Wacker, M J

    2009-09-01

    The collection of gene expression data from human heart biopsies is important for understanding the cellular mechanisms of arrhythmias and diseases such as cardiac hypertrophy and heart failure. Many clinical and basic research laboratories conduct gene expression analysis using RNA from whole cardiac biopsies. This allows for the analysis of global changes in gene expression in areas of the heart, while eliminating the need for more complex and technically difficult single-cell isolation procedures (such as flow cytometry, laser capture microdissection, etc.) that require expensive equipment and specialized training. The abundance of fibroblasts and other cell types in whole biopsies, however, can complicate gene expression analysis and the interpretation of results. Therefore, we have designed a technique to quickly and easily purify cardiac myocytes from whole cardiac biopsies for RNA extraction. Human heart tissue samples were collected, and our purification method was compared with the standard nonpurification method. Cell imaging using acridine orange staining of the purified sample demonstrated that >98% of total RNA was contained within identifiable cardiac myocytes. Real-time RT-PCR was performed comparing nonpurified and purified samples for the expression of troponin T (myocyte marker), vimentin (fibroblast marker), and alpha-smooth muscle actin (smooth muscle marker). Troponin T expression was significantly increased, and vimentin and alpha-smooth muscle actin were significantly decreased in the purified sample (n = 8; P < 0.05). Extracted RNA was analyzed during each step of the purification, and no significant degradation occurred. These results demonstrate that this isolation method yields a more purified cardiac myocyte RNA sample suitable for downstream applications, such as real-time RT-PCR, and allows for more accurate gene expression changes in cardiac myocytes from heart biopsies.

  2. A Functional Genomic Meta-Analysis of Clinical Trials in Systemic Sclerosis: Toward Precision Medicine and Combination Therapy.

    PubMed

    Taroni, Jaclyn N; Martyanov, Viktor; Mahoney, J Matthew; Whitfield, Michael L

    2017-05-01

    Systemic sclerosis is an orphan, systemic autoimmune disease with no FDA-approved treatments. Its heterogeneity and rarity often result in underpowered clinical trials making the analysis and interpretation of associated molecular data challenging. We performed a meta-analysis of gene expression data from skin biopsies of patients with systemic sclerosis treated with five therapies: mycophenolate mofetil, rituximab, abatacept, nilotinib, and fresolimumab. A common clinical improvement criterion of -20% or -5 modified Rodnan skin score was applied to each study. We applied a machine learning approach that captured features beyond differential expression and was better at identifying targets of therapies than the differential expression alone. Regardless of treatment mechanism, abrogation of inflammatory pathways accompanied clinical improvement in multiple studies suggesting that high expression of immune-related genes indicates active and targetable disease. Our framework allowed us to compare different trials and ask if patients who failed one therapy would likely improve on a different therapy, based on changes in gene expression. Genes with high expression at baseline in fresolimumab nonimprovers were downregulated in mycophenolate mofetil improvers, suggesting that immunomodulatory or combination therapy may have benefitted these patients. This approach can be broadly applied to increase tissue specificity and sensitivity of differential expression results. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  3. The Global Error Assessment (GEA) model for the selection of differentially expressed genes in microarray data.

    PubMed

    Mansourian, Robert; Mutch, David M; Antille, Nicolas; Aubert, Jerome; Fogel, Paul; Le Goff, Jean-Marc; Moulin, Julie; Petrov, Anton; Rytz, Andreas; Voegel, Johannes J; Roberts, Matthew-Alan

    2004-11-01

    Microarray technology has become a powerful research tool in many fields of study; however, the cost of microarrays often results in the use of a low number of replicates (k). Under circumstances where k is low, it becomes difficult to perform standard statistical tests to extract the most biologically significant experimental results. Other more advanced statistical tests have been developed; however, their use and interpretation often remain difficult to implement in routine biological research. The present work outlines a method that achieves sufficient statistical power for selecting differentially expressed genes under conditions of low k, while remaining as an intuitive and computationally efficient procedure. The present study describes a Global Error Assessment (GEA) methodology to select differentially expressed genes in microarray datasets, and was developed using an in vitro experiment that compared control and interferon-gamma treated skin cells. In this experiment, up to nine replicates were used to confidently estimate error, thereby enabling methods of different statistical power to be compared. Gene expression results of a similar absolute expression are binned, so as to enable a highly accurate local estimate of the mean squared error within conditions. The model then relates variability of gene expression in each bin to absolute expression levels and uses this in a test derived from the classical ANOVA. The GEA selection method is compared with both the classical and permutational ANOVA tests, and demonstrates an increased stability, robustness and confidence in gene selection. A subset of the selected genes were validated by real-time reverse transcription-polymerase chain reaction (RT-PCR). All these results suggest that GEA methodology is (i) suitable for selection of differentially expressed genes in microarray data, (ii) intuitive and computationally efficient and (iii) especially advantageous under conditions of low k. 
The GEA code for R is freely available on request from the authors.
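
    The binning idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' R code: the equal-size binning rule, the number of bins, the function names, and the toy values are all assumptions. Genes are ranked by overall mean expression, the within-condition mean squared error is averaged over each bin, and that bin-level error replaces the per-gene error in an ANOVA-style F ratio.

```python
from statistics import mean


def within_mse(groups):
    """Pooled within-condition mean squared error for one gene.

    Requires at least two replicates in at least one condition.
    """
    ss = 0.0
    df = 0
    for g in groups:
        m = mean(g)
        ss += sum((x - m) ** 2 for x in g)
        df += len(g) - 1
    return ss / df


def between_ms(groups):
    """Between-condition mean square for one gene (classical one-way ANOVA)."""
    grand = mean(x for g in groups for x in g)
    ss = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    return ss / (len(groups) - 1)


def binned_f_statistics(data, n_bins):
    """data maps gene -> list of replicate lists, one list per condition.

    Returns gene -> F-like ratio whose denominator is the bin-level error.
    """
    # Rank genes by overall mean expression so that each bin groups
    # genes of similar absolute expression.
    ranked = sorted(data, key=lambda gene: mean(x for g in data[gene] for x in g))
    size = -(-len(ranked) // n_bins)  # ceiling division
    stats = {}
    for start in range(0, len(ranked), size):
        bin_genes = ranked[start:start + size]
        # Local error estimate shared by every gene in the bin.
        bin_mse = mean(within_mse(data[g]) for g in bin_genes)
        for g in bin_genes:
            stats[g] = between_ms(data[g]) / bin_mse
    return stats


# Hypothetical toy data: two conditions, two replicates each.
toy = {
    "geneA": [[1.0, 1.2], [1.1, 0.9]],
    "geneB": [[1.0, 1.4], [3.0, 3.4]],
    "geneC": [[8.0, 8.2], [8.1, 8.3]],
    "geneD": [[9.0, 9.4], [9.2, 9.0]],
}
stats = binned_f_statistics(toy, n_bins=2)
```

    Pooling the error over a bin is what makes the estimate usable when k is low, since each gene alone contributes only a few degrees of freedom. Converting the ratio to a p-value would need the appropriate reference distribution, which this sketch leaves out.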

  4. Selection of reference genes for microRNA analysis associated to early stress response to handling and confinement in Salmo salar.

    PubMed

    Zavala, Eduardo; Reyes, Daniela; Deerenberg, Robert; Vidal, Rodrigo

    2017-05-11

    MicroRNAs are key non-coding RNA molecules that play a relevant role in the regulation of gene expression through translational repression and/or transcript cleavage during normal development and physiological adaptation processes like stress. Quantitative reverse transcription polymerase chain reaction (RT-qPCR) has become the approach normally used to determine the levels of microRNAs. However, this approach requires an endogenous reference. An improper selection of endogenous references can result in confusing interpretation of data. The aim of this study was to identify and validate appropriate endogenous reference miRNA genes for normalizing RT-qPCR surveys of miRNA expression in four different tissues of Atlantic salmon, under handling and confinement stress conditions associated with early or primary stress response. Nine candidate reference normalizers, including microRNAs and nuclear genes, normally used in vertebrate microRNA expression studies were selected from literature, validated by RT-qPCR and analyzed by the algorithms geNorm and NormFinder. The results revealed that the ssa-miR-99-5p gene was the most stable overall and that ssa-miR-99-5p and ssa-miR-23a-5p genes were the best combination. Moreover, the suitability of ssa-miR-99-5p and ssa-miR-23a-5p as endogenous reference genes was demonstrated by the expression analysis of ssa-miR-193-5p gene.
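
    For readers unfamiliar with geNorm, its stability measure M can be sketched as below. This is a simplified illustration and not the geNorm software itself: it assumes an amplification efficiency of 2, so that a difference of Cq values stands in for a log2 expression ratio, and the gene names and Cq values are hypothetical. For each candidate, M is the average, over all other candidates, of the standard deviation of the pairwise log ratio across samples; a lower M indicates a more stable candidate.

```python
from statistics import stdev


def genorm_m(cq):
    """cq maps candidate gene -> list of Cq values, one per sample.

    Returns gene -> M, the mean pairwise variation against all other candidates.
    """
    genes = list(cq)
    m_values = {}
    for j in genes:
        pairwise = []
        for k in genes:
            if k == j:
                continue
            # Assuming efficiency 2, Cq_j - Cq_k is a log2 expression
            # ratio up to sign, and the sign does not affect the spread.
            diffs = [a - b for a, b in zip(cq[j], cq[k])]
            pairwise.append(stdev(diffs))
        m_values[j] = sum(pairwise) / len(pairwise)
    return m_values


# Hypothetical Cq values for three candidates across four samples.
toy_cq = {
    "ref_a": [20.0, 21.0, 22.0, 23.0],
    "ref_b": [25.0, 26.0, 27.0, 28.0],  # tracks ref_a with a constant offset
    "ref_c": [30.0, 28.0, 31.0, 27.0],  # varies independently
}
m = genorm_m(toy_cq)
least_stable = max(m, key=m.get)
```

    In this toy set ref_a and ref_b move together, so they share the lowest M, while ref_c stands out as the least stable. The full procedure also removes the worst candidate stepwise and recomputes M, which is omitted here.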

  5. Immunologic applications of conditional gene modification technology in the mouse.

    PubMed

    Sharma, Suveena; Zhu, Jinfang

    2014-04-02

    Since the success of homologous recombination in altering mouse genome and the discovery of Cre-loxP system, the combination of these two breakthroughs has created important applications for studying the immune system in the mouse. Here, we briefly summarize the general principles of this technology and its applications in studying immune cell development and responses; such implications include conditional gene knockout and inducible and/or tissue-specific gene over-expression, as well as lineage fate mapping. We then discuss the pros and cons of a few commonly used Cre-expressing mouse lines for studying lymphocyte development and functions. We also raise several general issues, such as efficiency of gene deletion, leaky activity of Cre, and Cre toxicity, all of which may have profound impacts on data interpretation. Finally, we selectively list some useful links to the Web sites as valuable mouse resources. Copyright © 2014 John Wiley & Sons, Inc.

  6. Transfection of isolated rainbow trout, Oncorhynchus mykiss, granulosa cells through chemical transfection and electroporation at 12°C.

    PubMed

    Marivin, E; Mourot, B; Loyer, P; Rime, H; Bobe, J; Fostier, A

    2015-09-15

    Over-expression or inhibition of gene expression can be efficiently used to analyse the functions and/or regulation of target genes. Modulation of gene expression can be achieved through transfection of exogenous nucleic acids into target cells. Such techniques require the development of specific protocols to transfect cell cultures with nucleic acids. The aim of this study was to develop a method of transfection suitable for rainbow trout granulosa cells in primary culture. After the isolation of rainbow trout granulosa cells, chemical transfection of cells with a fluorescent morpholino oligonucleotide (MO) was tested using FuGENE HD at 12 °C. Electroporation was also employed to transfect these cells with either a plasmid or MO. Transfection was more efficient using electroporation (with the following settings: 1200 V/40 ms/1p) than chemical transfection, but electroporation by itself was deleterious, resulting in a decrease of the steroidogenic capacity of the cells, measured via estradiol production from its androgenic substrate. The disturbance of cell biology induced by the transfection method per se should be taken into account in data interpretation when investigating the effects of under- or over-expression of candidate genes. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. BioVLAB-mCpG-SNP-EXPRESS: A system for multi-level and multi-perspective analysis and exploration of DNA methylation, sequence variation (SNPs), and gene expression from multi-omics data.

    PubMed

    Chae, Heejoon; Lee, Sangseon; Seo, Seokjun; Jung, Daekyoung; Chang, Hyeonsook; Nephew, Kenneth P; Kim, Sun

    2016-12-01

    Measuring gene expression, DNA sequence variation, and DNA methylation status is routinely done using high-throughput sequencing technologies. To analyze such multi-omics data and explore relationships, reliable bioinformatics systems are needed. Existing systems are either for exploring curated data or for processing omics data in the form of a library such as R. Thus scientists have much difficulty in investigating relationships among gene expression, DNA sequence variation, and DNA methylation using multi-omics data. In this study, we report a system called BioVLAB-mCpG-SNP-EXPRESS for the integrated analysis of DNA methylation, sequence variation (SNPs), and gene expression for distinguishing cellular phenotypes at the pairwise and multiple phenotype levels. The system can be deployed on either the Amazon cloud or a publicly available high-performance computing node, and the data analysis and exploration of the analysis result can be conveniently done using a web-based interface. In order to alleviate analysis complexity, all processes are fully automated, and a graphical workflow system is integrated to represent real-time analysis progression. The BioVLAB-mCpG-SNP-EXPRESS system works in three stages. First, it processes and analyzes multi-omics data as input in the form of the raw data, i.e., FastQ files. Second, various integrated analyses such as methylation vs. gene expression and mutation vs. methylation are performed. Finally, the analysis result can be explored in a number of ways through a web interface for multi-level, multi-perspective exploration. Multi-level interpretation can be done at the gene, gene set, pathway, or network level, and multi-perspective exploration can be carried out from the perspective of gene expression, DNA methylation, sequence variation, or their relationships. The utility of the system is demonstrated by analyzing a data set of 30 phenotypically distinct breast cancer cell lines.
BioVLAB-mCpG-SNP-EXPRESS is available at http://biohealth.snu.ac.kr/software/biovlab_mcpg_snp_express/. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. Evidence of inflammatory immune signaling in chronic fatigue syndrome: A pilot study of gene expression in peripheral blood.

    PubMed

    Aspler, Anne L; Bolshin, Carly; Vernon, Suzanne D; Broderick, Gordon

    2008-09-26

    Genomic profiling of peripheral blood reveals altered immunity in chronic fatigue syndrome (CFS); however, interpretation remains challenging without immune demographic context. The objective of this work is to identify modulation of specific immune functional components and restructuring of co-expression networks characteristic of CFS using the quantitative genomics of peripheral blood. Gene sets were constructed a priori for CD4+ T cells, CD8+ T cells, CD19+ B cells, CD14+ monocytes and CD16+ neutrophils from published data. A group of 111 women were classified using empiric case definition (U.S. Centers for Disease Control and Prevention) and unsupervised latent cluster analysis (LCA). Microarray profiles of peripheral blood were analyzed for expression of leukocyte-specific gene sets and characteristic changes in co-expression identified from topological evaluation of linear correlation networks. Median expression for a set of 6 genes preferentially up-regulated in CD19+ B cells was significantly lower in CFS (p = 0.01) due mainly to PTPRK and TSPAN3 expression. Although no other gene set was differentially expressed at p < 0.05, patterns of co-expression in each group differed markedly. Significant co-expression of CD14+ monocyte with CD16+ neutrophil (p = 0.01) and CD19+ B cell sets (p = 0.00) characterized CFS and fatigue phenotype groups. Also in CFS was a significant negative correlation between CD8+ and both CD19+ up-regulated (p = 0.02) and NK gene sets (p = 0.08). These patterns were absent in controls. Dissection of blood microarray profiles points to B cell dysfunction with coordinated immune activation supporting persistent inflammation and antibody-mediated NK cell modulation of T cell activity. This has clinical implications as the CD19+ genes identified could provide robust and biologically meaningful basis for the early detection and unambiguous phenotyping of CFS.

  9. Activation of Wnt/β-Catenin Pathway in Monocytes Derived from Chronic Kidney Disease Patients

    PubMed Central

    Al-Chaqmaqchi, Heevy Abdulkareem Musa; Moshfegh, Ali; Dadfar, Elham; Paulsson, Josefin; Hassan, Moustapha; Jacobson, Stefan H.; Lundahl, Joachim

    2013-01-01

    Patients with chronic kidney disease (CKD) have significantly increased morbidity and mortality resulting from infections and cardiovascular diseases. Since monocytes play an essential role in host immunity, this study was directed to explore the gene expression profile in order to identify differences in activated pathways in monocytes relevant to the pathophysiology of atherosclerosis and increased susceptibility to infections. Monocytes from CKD patients (stages 4 and 5, estimated GFR <20 ml/min/1.73 m2) and healthy donors were collected from peripheral blood. Microarray gene expression profile was performed and data were interpreted by GeneSpring software and by PANTHER tool. Western blot was done to validate the pathway members. The results demonstrated that 600 and 272 genes were differentially up- and down regulated respectively in the patient group. Pathways involved in the inflammatory response were highly expressed and the Wnt/β-catenin signaling pathway was the most significant pathway expressed in the patient group. Since this pathway has been attributed to a variety of inflammatory manifestations, the current findings may contribute to dysfunctional monocytes in CKD patients. Strategies to interfere with this pathway may improve host immunity and prevent cardiovascular complications in CKD patients. PMID:23935909

  10. Making sense of snapshot data: ergodic principle for clonal cell populations

    PubMed Central

    2017-01-01

    Population growth is often ignored when quantifying gene expression levels across clonal cell populations. We develop a framework for obtaining the molecule number distributions in an exponentially growing cell population taking into account its age structure. In the presence of generation time variability, the average acquired across a population snapshot does not obey the average of a dividing cell over time, apparently contradicting ergodicity between single cells and the population. Instead, we show that the variation observed across snapshots with known cell age is captured by cell histories, a single-cell measure obtained from tracking an arbitrary cell of the population back to the ancestor from which it originated. The correspondence between cells of known age in a population with their histories represents an ergodic principle that provides a new interpretation of population snapshot data. We illustrate the principle using analytical solutions of stochastic gene expression models in cell populations with arbitrary generation time distributions. We further elucidate that the principle breaks down for biochemical reactions that are under selection, such as the expression of genes conveying antibiotic resistance, which gives rise to an experimental criterion with which to probe selection on gene expression fluctuations. PMID:29187636

  11. Making sense of snapshot data: ergodic principle for clonal cell populations.

    PubMed

    Thomas, Philipp

    2017-11-01

    Population growth is often ignored when quantifying gene expression levels across clonal cell populations. We develop a framework for obtaining the molecule number distributions in an exponentially growing cell population taking into account its age structure. In the presence of generation time variability, the average acquired across a population snapshot does not obey the average of a dividing cell over time, apparently contradicting ergodicity between single cells and the population. Instead, we show that the variation observed across snapshots with known cell age is captured by cell histories, a single-cell measure obtained from tracking an arbitrary cell of the population back to the ancestor from which it originated. The correspondence between cells of known age in a population with their histories represents an ergodic principle that provides a new interpretation of population snapshot data. We illustrate the principle using analytical solutions of stochastic gene expression models in cell populations with arbitrary generation time distributions. We further elucidate that the principle breaks down for biochemical reactions that are under selection, such as the expression of genes conveying antibiotic resistance, which gives rise to an experimental criterion with which to probe selection on gene expression fluctuations. © 2017 The Author(s).

  12. Co-expression networks reveal the tissue-specific regulation of transcription and splicing

    PubMed Central

    Saha, Ashis; Kim, Yungil; Gewirtz, Ariel D.H.; Jo, Brian; Gao, Chuan; McDowell, Ian C.; Engelhardt, Barbara E.

    2017-01-01

    Gene co-expression networks capture biologically important patterns in gene expression data, enabling functional analyses of genes, discovery of biomarkers, and interpretation of genetic variants. Most network analyses to date have been limited to assessing correlation between total gene expression levels in a single tissue or small sets of tissues. Here, we built networks that additionally capture the regulation of relative isoform abundance and splicing, along with tissue-specific connections unique to each of a diverse set of tissues. We used the Genotype-Tissue Expression (GTEx) project v6 RNA sequencing data across 50 tissues and 449 individuals. First, we developed a framework called Transcriptome-Wide Networks (TWNs) for combining total expression and relative isoform levels into a single sparse network, capturing the interplay between the regulation of splicing and transcription. We built TWNs for 16 tissues and found that hubs in these networks were strongly enriched for splicing and RNA binding genes, demonstrating their utility in unraveling regulation of splicing in the human transcriptome. Next, we used a Bayesian biclustering model that identifies network edges unique to a single tissue to reconstruct Tissue-Specific Networks (TSNs) for 26 distinct tissues and 10 groups of related tissues. Finally, we found genetic variants associated with pairs of adjacent nodes in our networks, supporting the estimated network structures and identifying 20 genetic variants with distant regulatory impact on transcription and splicing. Our networks provide an improved understanding of the complex relationships of the human transcriptome across tissues. PMID:29021288

  13. Knowledge Driven Variable Selection (KDVS) – a new approach to enrichment analysis of gene signatures obtained from high–throughput data

    PubMed Central

    2013-01-01

    Background High–throughput (HT) technologies provide huge amount of gene expression data that can be used to identify biomarkers useful in the clinical practice. The most frequently used approaches first select a set of genes (i.e. gene signature) able to characterize differences between two or more phenotypical conditions, and then provide a functional assessment of the selected genes with an a posteriori enrichment analysis, based on biological knowledge. However, this approach comes with some drawbacks. First, gene selection procedure often requires tunable parameters that affect the outcome, typically producing many false hits. Second, a posteriori enrichment analysis is based on mapping between biological concepts and gene expression measurements, which is hard to compute because of constant changes in biological knowledge and genome analysis. Third, such mapping is typically used in the assessment of the coverage of gene signature by biological concepts, that is either score–based or requires tunable parameters as well, limiting its power. Results We present Knowledge Driven Variable Selection (KDVS), a framework that uses a priori biological knowledge in HT data analysis. The expression data matrix is transformed, according to prior knowledge, into smaller matrices, easier to analyze and to interpret from both computational and biological viewpoints. Therefore KDVS, unlike most approaches, does not exclude a priori any function or process potentially relevant for the biological question under investigation. Differently from the standard approach where gene selection and functional assessment are applied independently, KDVS embeds these two steps into a unified statistical framework, decreasing the variability derived from the threshold–dependent selection, the mapping to the biological concepts, and the signature coverage. We present three case studies to assess the usefulness of the method. 
Conclusions We showed that KDVS not only enables the selection of known biological functionalities with accuracy, but also identification of new ones. An efficient implementation of KDVS was devised to obtain results in a fast and robust way. Computing time is drastically reduced by the effective use of distributed resources. Finally, integrated visualization techniques immediately increase the interpretability of results. Overall, KDVS approach can be considered as a viable alternative to enrichment–based approaches. PMID:23302187

  14. Transcription of key genes regulating gonadal steroidogenesis in control and ketoconazole- or vinclozolin-exposed fathead minnows

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Villeneuve, Daniel L.; Blake, Lindsey S.; Brodin, Jeffrey

    2007-08-01

    This study evaluated changes in the expression of steroidogenesis-related genes in male fathead minnows exposed to ketoconazole (KTC) or vinclozolin (VZ) for 21 days. The aim was to evaluate links between molecular changes and higher level outcomes after exposure to endocrine-active chemicals (EACs) with different modes of action. To aid our analysis and interpretation of EAC-related effects, we first examined variation in the relative abundance of steroidogenesis-related gene transcripts in the gonads of male and female fathead minnows as a function of age, gonad development, and spawning status, independent of EAC exposure. Gonadal expression of several genes varied with age and/or gonadal somatic index in either males or females. However, with the exception of aromatase, steroidogenesis-related gene expression did not vary with spawning status. Following the baseline experiments, expression of the selected genes in male fathead minnows exposed to KTC or VZ was evaluated in the context of effects observed at higher levels of organization. Exposure to KTC elicited changes in gene transcription that were consistent with an apparent compensatory response to the chemical's anticipated direct inhibition of steroidogenic enzyme activity. Exposure to VZ, an antiandrogen expected to indirectly impact steroidogenesis, increased pituitary expression of follicle-stimulating hormone beta-subunit as well as testis expression of 20beta-hydroxysteroid dehydrogenase and luteinizing hormone receptor transcripts. Results of this study contribute to ongoing research aimed at understanding responses of the teleost hypothalamic-pituitary-gonadal axis to different types of EACs and how changes in molecular endpoints translate into apical outcomes reflective of either adverse effect or compensation.

  15. A transcriptional dynamic network during Arabidopsis thaliana pollen development.

    PubMed

    Wang, Jigang; Qiu, Xiaojie; Li, Yuhua; Deng, Youping; Shi, Tieliu

    2011-01-01

    To understand transcriptional regulatory networks (TRNs), especially the coordinated dynamic regulation between transcription factors (TFs) and their corresponding target genes during development, computational approaches would represent significant advances in the genome-wide expression analysis. The major challenges for the experiments include monitoring the time-specific TFs' activities and identifying the dynamic regulatory relationships between TFs and their target genes, both of which are currently not yet available at the large scale. However, various methods have been proposed to computationally estimate those activities and regulations. During the past decade, significant progresses have been made towards understanding pollen development at each development stage under the molecular level, yet the regulatory mechanisms that control the dynamic pollen development processes remain largely unknown. Here, we adopt Networks Component Analysis (NCA) to identify TF activities over time course, and infer their regulatory relationships based on the coexpression of TFs and their target genes during pollen development. We carried out meta-analysis by integrating several sets of gene expression data related to Arabidopsis thaliana pollen development (stages range from UNM, BCP, TCP, HP to 0.5 hr pollen tube and 4 hr pollen tube). We constructed a regulatory network, including 19 TFs, 101 target genes and 319 regulatory interactions. The computationally estimated TF activities were well correlated to their coordinated genes' expressions during the development process. We clustered the expression of their target genes in the context of regulatory influences, and inferred new regulatory relationships between those TFs and their target genes, such as transcription factor WRKY34, which was identified that specifically expressed in pollen, and regulated several new target genes. 
Our finding facilitates the interpretation of the expression patterns with more biological relevancy, since the clusters corresponding to the activity of specific TF or the combination of TFs suggest the coordinated regulation of TFs to their target genes. Through integrating different resources, we constructed a dynamic regulatory network of Arabidopsis thaliana during pollen development with gene coexpression and NCA. The network illustrated the relationships between the TFs' activities and their target genes' expression, as well as the interactions between TFs, which provide new insight into the molecular mechanisms that control the pollen development.

  16. Utilization of next generation sequencing for analyzing transgenic insertions in plum

    USDA-ARS's Scientific Manuscript database

    When utilizing transgenic plants, it is useful to know how many copies of the genes were inserted and the locations of these insertions in the genome. This information can provide important insights for the interpretation of transgene expression and the resulting phenotype. Traditionally, these qu...

  17. Multi-membership gene regulation in pathway based microarray analysis

    PubMed Central

    2011-01-01

    Background Gene expression analysis has been intensively researched for more than a decade. Recently, there has been elevated interest in the integration of microarray data analysis with other types of biological knowledge in a holistic analytical approach. We propose a methodology that can be facilitated for pathway based microarray data analysis, based on the observation that a substantial proportion of genes present in biochemical pathway databases are members of a number of distinct pathways. Our methodology aims towards establishing the state of individual pathways, by identifying those truly affected by the experimental conditions based on the behaviour of such genes. For that purpose it considers all the pathways in which a gene participates and the general census of gene expression per pathway. Results We utilise hill climbing, simulated annealing and a genetic algorithm to analyse the consistency of the produced results, through the application of fuzzy adjusted rand indexes and hamming distance. All algorithms produce highly consistent genes to pathways allocations, revealing the contribution of genes to pathway functionality, in agreement with current pathway state visualisation techniques, with the simulated annealing search proving slightly superior in terms of efficiency. Conclusions We show that the expression values of genes, which are members of a number of biochemical pathways or modules, are the net effect of the contribution of each gene to these biochemical processes. We show that by manipulating the pathway and module contribution of such genes to follow underlying trends we can interpret microarray results centred on the behaviour of these genes. PMID:21939531

  18. Multi-membership gene regulation in pathway based microarray analysis.

    PubMed

    Pavlidis, Stelios P; Payne, Annette M; Swift, Stephen M

    2011-09-22

    Gene expression analysis has been intensively researched for more than a decade. Recently, there has been elevated interest in the integration of microarray data analysis with other types of biological knowledge in a holistic analytical approach. We propose a methodology that can be facilitated for pathway based microarray data analysis, based on the observation that a substantial proportion of genes present in biochemical pathway databases are members of a number of distinct pathways. Our methodology aims towards establishing the state of individual pathways, by identifying those truly affected by the experimental conditions based on the behaviour of such genes. For that purpose it considers all the pathways in which a gene participates and the general census of gene expression per pathway. We utilise hill climbing, simulated annealing and a genetic algorithm to analyse the consistency of the produced results, through the application of fuzzy adjusted rand indexes and hamming distance. All algorithms produce highly consistent genes to pathways allocations, revealing the contribution of genes to pathway functionality, in agreement with current pathway state visualisation techniques, with the simulated annealing search proving slightly superior in terms of efficiency. We show that the expression values of genes, which are members of a number of biochemical pathways or modules, are the net effect of the contribution of each gene to these biochemical processes. We show that by manipulating the pathway and module contribution of such genes to follow underlying trends we can interpret microarray results centred on the behaviour of these genes.

  19. Metabolomic analysis of the selection response of Drosophila melanogaster to environmental stress: are there links to gene expression and phenotypic traits?

    NASA Astrophysics Data System (ADS)

    Malmendal, Anders; Sørensen, Jesper Givskov; Overgaard, Johannes; Holmstrup, Martin; Nielsen, Niels Chr.; Loeschcke, Volker

    2013-05-01

    We investigated the global metabolite response to artificial selection for tolerance to stressful conditions such as cold, heat, starvation, and desiccation, and for longevity in Drosophila melanogaster. Our findings were compared to data from other levels of biological organization, including gene expression, physiological traits, and organismal stress tolerance phenotype. Overall, we found that selection for environmental stress tolerance changes the metabolomic 1H NMR fingerprint largely in a similar manner independent of the trait selected for, indicating that experimental evolution led to a general stress selection response at the metabolomic level. Integrative analyses across data sets showed little similarity when general correlations between selection effects at the level of the metabolome and gene expression were compared. This is likely due to the fact that the changes caused by these selection regimes were rather mild and/or that the dominating determinants for gene expression and metabolite levels were different. However, expression of a number of genes was correlated with the metabolite data. Many of the identified genes were general stress response genes that are down-regulated in response to selection for some of the stresses in this study. Overall, the results illustrate that selection markedly alters the metabolite profile and that the coupling between different levels of biological organization indeed is present though not very strong for stress selection at this level. The results highlight the extreme complexity of environmental stress adaptation and the difficulty of extrapolating and interpreting responses across levels of biological organization.

  20. Proteogenomic characterization of human colon and rectal cancer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Bing; Wang, Jing; Wang, Xiaojing

    2014-09-18

    We analyzed proteomes of colon and rectal tumors previously characterized by the Cancer Genome Atlas (TCGA) and performed integrated proteogenomic analyses. Protein sequence variants encoded by somatic genomic variations displayed reduced expression compared to protein variants encoded by germline variations. mRNA transcript abundance did not reliably predict protein expression differences between tumors. Proteomics identified five protein expression subtypes, two of which were associated with the TCGA "MSI/CIMP" transcriptional subtype, but had distinct mutation and methylation patterns and associated with different clinical outcomes. Although CNAs showed strong cis- and trans-effects on mRNA expression, relatively few of these extend to the protein level. Thus, proteomics data enabled prioritization of candidate driver genes. Our analyses identified HNF4A, a novel candidate driver gene in tumors with chromosome 20q amplifications. Integrated proteogenomic analysis provides functional context to interpret genomic abnormalities and affords novel insights into cancer biology.

  1. A Protocol for Using Gene Set Enrichment Analysis to Identify the Appropriate Animal Model for Translational Research.

    PubMed

    Weidner, Christopher; Steinfath, Matthias; Wistorf, Elisa; Oelgeschläger, Michael; Schneider, Marlon R; Schönfelder, Gilbert

    2017-08-16

    Recent studies that compared transcriptomic datasets of human diseases with datasets from mouse models using traditional gene-to-gene comparison techniques resulted in contradictory conclusions regarding the relevance of animal models for translational research. A major reason for the discrepancies between different gene expression analyses is the arbitrary filtering of differentially expressed genes. Furthermore, the comparison of single genes between different species and platforms often is limited by technical variance, leading to misinterpretation of the con/discordance between data from human and animal models. Thus, standardized approaches for systematic data analysis are needed. To overcome subjective gene filtering and ineffective gene-to-gene comparisons, we recently demonstrated that gene set enrichment analysis (GSEA) has the potential to avoid these problems. Therefore, we developed a standardized protocol for the use of GSEA to distinguish between appropriate and inappropriate animal models for translational research. This protocol is not suitable to predict how to design new model systems a-priori, as it requires existing experimental omics data. However, the protocol describes how to interpret existing data in a standardized manner in order to select the most suitable animal model, thus avoiding unnecessary animal experiments and misleading translational studies.
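
    The core of GSEA, the running-sum enrichment score, can be sketched in a few lines. This is a simplified, unweighted variant for illustration only and not the protocol described in the abstract: published GSEA implementations usually weight hits by the ranking metric and assess significance by permutation, both of which are omitted. The gene identifiers are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    ranked_genes: genes ordered from most up-regulated to most down-regulated.
    gene_set: the set of genes being tested.
    Returns the running-sum value with the largest absolute deviation from zero.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        # Step up at set members, step down otherwise, so the walk ends at zero.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list and gene sets.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})     # set concentrated at the top
es_bottom = enrichment_score(ranked, {"g5", "g6"})  # set concentrated at the bottom
```

    Because the score depends only on where a set's genes fall in the ranking, no cutoff for differentially expressed genes is needed. Computing it for the same gene set in a human ranking and in an animal-model ranking gives two scores whose agreement in sign and size can then be compared.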

  2. LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights

    PubMed Central

    Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

    2016-01-01

    Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher’s exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO’s usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher. PMID:26750448
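
    For context, the baseline that LEGO is compared against can be written compactly. The sketch below computes the one-sided Fisher's exact (hypergeometric upper-tail) p-value used in conventional ORA, where every gene counts equally. It does not implement LEGO's network-based weights, and the function name and the numbers in the example are hypothetical.

```python
from math import comb


def ora_p_value(n_universe, n_in_set, n_selected, n_overlap):
    """One-sided over-representation p-value, P(X >= n_overlap).

    n_universe: number of genes in the background
    n_in_set:   number of background genes annotated to the pathway
    n_selected: number of genes in the query list
    n_overlap:  number of query genes that belong to the pathway
    """
    total = comb(n_universe, n_selected)
    upper = min(n_in_set, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 10 background genes, 4 in the pathway,
# 3 query genes, all 3 of which fall in the pathway.
p = ora_p_value(10, 4, 3, 3)
```

    In this formulation a pathway is only a list of genes, which is the limitation the abstract describes: gene-gene interactions and the unequal roles of genes within a pathway do not enter the calculation.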

  3. LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights.

    PubMed

    Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

    2016-01-11

    Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.

  4. Use of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells.

    PubMed

    Xin, Yurong; Kim, Jinrang; Ni, Min; Wei, Yi; Okamoto, Haruka; Lee, Joseph; Adler, Christina; Cavino, Katie; Murphy, Andrew J; Yancopoulos, George D; Lin, Hsin Chieh; Gromada, Jesper

    2016-03-22

    This study provides an assessment of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells. The system combines microfluidic technology and nanoliter-scale reactions. We sequenced 622 cells, allowing identification of 341 islet cells with high-quality gene expression profiles. The cells clustered into populations of α-cells (5%), β-cells (92%), δ-cells (1%), and pancreatic polypeptide cells (2%). We identified cell-type-specific transcription factors and pathways primarily involved in nutrient sensing and oxidation and cell signaling. Unexpectedly, 281 cells had to be removed from the analysis due to low viability, low sequencing quality, or contamination resulting in the detection of more than one islet hormone. Collectively, we provide a resource for identification of high-quality gene expression datasets to help expand insights into genes and pathways characterizing islet cell types. We reveal limitations in the C1 Fluidigm cell capture process resulting in contaminated cells with altered gene expression patterns. This calls for caution when interpreting single-cell transcriptomics data using the C1 Fluidigm system.

  5. Time-series RNA-seq analysis package (TRAP) and its application to the analysis of rice, Oryza sativa L. ssp. Japonica, upon drought stress.

    PubMed

    Jo, Kyuri; Kwon, Hawk-Bin; Kim, Sun

    2014-06-01

    Measuring expression levels of genes at the whole genome level can be useful for many purposes, especially for revealing biological pathways underlying specific phenotype conditions. When gene expression is measured over a time period, we have opportunities to understand how organisms react to stress conditions over time. Thus many biologists routinely measure whole genome level gene expression at multiple time points. However, there are several technical difficulties in analyzing such whole genome expression data. In addition, gene expression is now often measured using RNA-sequencing rather than microarray technologies, which makes the analysis much more complicated, since the process must start with mapping short reads and end with differentially activated pathways and possibly interactions among pathways. Moreover, many useful tools for analyzing microarray gene expression data are not applicable to RNA-seq data. Thus a comprehensive package for analyzing time series transcriptome data is much needed. In this article, we present a comprehensive package, Time-series RNA-seq Analysis Package (TRAP), integrating all necessary tasks such as mapping short reads, measuring gene expression levels, finding differentially expressed genes (DEGs), clustering and pathway analysis for time-series data in a single environment. In addition to implementing useful algorithms that are not available for RNA-seq data, we extended existing pathway analysis methods, ORA and SPIA, for time series analysis and estimate statistical values for the combined dataset by an advanced metric. TRAP also produces a visual summary of pathway interactions. Gene expression change labeling, a practical clustering method used in TRAP, enables more accurate interpretation of the data when combined with pathway analysis. We applied our methods to a real dataset for the analysis of rice (Oryza sativa L. Japonica nipponbare) upon drought stress. The results showed that TRAP was able to detect pathways more accurately than several existing methods. TRAP is available at http://biohealth.snu.ac.kr/software/TRAP/. Copyright © 2014 Elsevier Inc. All rights reserved.

  6. Transcriptional Profiling of Ileocecal Valve of Holstein Dairy Cows Infected with Mycobacterium avium subsp. Paratuberculosis

    PubMed Central

    Hempel, Randy J.; Bannantine, John P.

    2016-01-01

    Johne’s disease is a chronic infection of the small intestine caused by Mycobacterium avium subspecies paratuberculosis (MAP), an intracellular bacterium. The events of pathogen survival within the host cell(s), chronic inflammation, and the progression from an asymptomatic subclinical stage to an advanced clinical stage of infection are poorly understood. This study examines gene expression in the ileocecal valve (ICV) of Holstein dairy cows at different stages of MAP infection. The ICV is known to be a primary site of MAP colonization and provides an ideal location to identify genes that are relevant to the progression of this disease. RNA was prepared from ICV tissues and RNA-Seq was used to compare gene transcription between clinical, subclinical, and uninfected control animals. Interpretation of the gene expression data was performed using pathway analysis and gene ontology categories containing multiple differentially expressed genes. Results demonstrated that many of the pathways that had strong differential gene expression between uninfected control and clinical cows were related to the immune system, such as the T- and B-cell receptor signaling, apoptosis, NOD-like receptor signaling, and leukocyte transendothelial migration pathways. In contrast, the comparison of gene transcription between control and subclinical cows identified pathways that were primarily involved in metabolism. The results from the comparison between clinical and subclinical animals indicate recruitment of neutrophils, upregulation of lysosomal peptidases, an increase in immune cell transendothelial migration, and modifications of the extracellular matrix. This study provides important insight into how cattle respond to a natural MAP infection at the gene transcription level within a key target tissue for infection. PMID:27093613

  7. Genexpi: a toolset for identifying regulons and validating gene regulatory networks using time-course expression data.

    PubMed

    Modrák, Martin; Vohradský, Jiří

    2018-04-13

    Identifying regulons of sigma factors is a vital subtask of gene network inference. Integrating multiple sources of data is essential for correct identification of regulons and complete gene regulatory networks. Time series of expression data measured with microarrays or RNA-seq combined with static binding experiments (e.g., ChIP-seq) or literature mining may be used for inference of sigma factor regulatory networks. We introduce Genexpi: a tool to identify sigma factors by combining candidates obtained from ChIP experiments or literature mining with time-course gene expression data. While Genexpi can be used to infer other types of regulatory interactions, it was designed and validated on real biological data from bacterial regulons. In this paper, we put primary focus on CyGenexpi: a plugin integrating Genexpi with the Cytoscape software for ease of use. As a part of this effort, a plugin for handling time series data in Cytoscape called CyDataseries has been developed and made available. Genexpi is also available as a standalone command line tool and an R package. Genexpi is a useful part of the gene network inference toolbox. It provides meaningful information about the composition of regulons and delivers biologically interpretable results.

  8. RNA-Seq reveals 10 novel promising candidate genes affecting milk protein concentration in the Chinese Holstein population.

    PubMed

    Li, Cong; Cai, Wentao; Zhou, Chenghao; Yin, Hongwei; Zhang, Ziqi; Loor, Juan J; Sun, Dongxiao; Zhang, Qin; Liu, Jianfeng; Zhang, Shengli

    2016-06-02

    Paired-end RNA sequencing (RNA-Seq) was used to explore the bovine transcriptome from the mammary tissue of 12 Chinese Holstein cows with 6 extremely high and 6 low phenotypic values for milk protein percentage. We defined the differentially expressed transcripts between the two comparison groups, extremely high and low milk protein percentage during the peak lactation (HP vs LP) and during the non-lactating period (HD vs LD), respectively. Within the differentially expressed genes (DEGs), we detected 157 at peak lactation and 497 in the non-lactating period with a highly significant correlation with milk protein concentration. Integrated interpretation of differential gene expression indicated that SERPINA1, CLU, CNTFR, ERBB2, NEDD4L, ANG, GALE, HSPA8, LPAR6 and CD14 are the most promising candidate genes affecting milk protein concentration. Similarly, LTF, FCGR3A, MEGF10, RRM2 and UBE2C are the most promising candidates that in the non-lactating period could help the mammary tissue prevent issues with inflammation and udder disorders. Putative genes will be valuable resources for designing better breeding strategies to optimize the content of milk protein and also to provide new insights into regulation of lactogenesis.

  9. A sociogenomic perspective on neuroscience in organizational behavior.

    PubMed

    Spain, Seth M; Harms, P D

    2014-01-01

    We critically examine the current biological models of individual organizational behavior, with particular emphasis on the roles of genetics and the brain. We demonstrate how approaches to biology in the organizational sciences assume that biological systems are simultaneously causal and essentially static; that genotypes exert constant effects. In contrast, we present a sociogenomic approach to organizational research, which could provide a meta-theoretical framework for understanding organizational behavior. Sociogenomics is an interactionist approach that derives power from its ability to explain how genes and environment operate. The key insight is that both genes and the environment operate by modifying gene expression. This leads to a conception of genetic and environmental effects that is fundamentally dynamic, rather than the static view of classical biometric approaches. We review biometric research within organizational behavior, and contrast these interpretations with a sociogenomic view. We provide a review of gene expression mechanisms that help explain the dynamism observed in individual organizational behavior, particularly factors associated with gene expression in the brain. Finally, we discuss the ethics of genomic and neuroscientific findings for practicing managers and discuss whether it is possible to practically apply these findings in management.

  10. A sociogenomic perspective on neuroscience in organizational behavior

    PubMed Central

    Spain, Seth M.; Harms, P. D.

    2014-01-01

    We critically examine the current biological models of individual organizational behavior, with particular emphasis on the roles of genetics and the brain. We demonstrate how approaches to biology in the organizational sciences assume that biological systems are simultaneously causal and essentially static; that genotypes exert constant effects. In contrast, we present a sociogenomic approach to organizational research, which could provide a meta-theoretical framework for understanding organizational behavior. Sociogenomics is an interactionist approach that derives power from its ability to explain how genes and environment operate. The key insight is that both genes and the environment operate by modifying gene expression. This leads to a conception of genetic and environmental effects that is fundamentally dynamic, rather than the static view of classical biometric approaches. We review biometric research within organizational behavior, and contrast these interpretations with a sociogenomic view. We provide a review of gene expression mechanisms that help explain the dynamism observed in individual organizational behavior, particularly factors associated with gene expression in the brain. Finally, we discuss the ethics of genomic and neuroscientific findings for practicing managers and discuss whether it is possible to practically apply these findings in management. PMID:24616682

  11. Fyn-Dependent Gene Networks in Acute Ethanol Sensitivity

    PubMed Central

    Farris, Sean P.; Miles, Michael F.

    2013-01-01

    Studies in humans and animal models document that acute behavioral responses to ethanol are a predisposing factor for the risk of long-term drinking behavior. Prior microarray data from our laboratory document strain- and brain region-specific variation in gene expression profile responses to acute ethanol that may be underlying regulators of ethanol behavioral phenotypes. The non-receptor tyrosine kinase Fyn has previously been mechanistically implicated in the sedative-hypnotic response to acute ethanol. To further understand how Fyn may modulate ethanol behaviors, we used whole-genome expression profiling. We characterized basal and acute ethanol-evoked (3 g/kg) gene expression patterns in nucleus accumbens (NAC), prefrontal cortex (PFC), and ventral midbrain (VMB) of control and Fyn knockout mice. Bioinformatics analysis identified a set of Fyn-related gene networks differently regulated by acute ethanol across the three brain regions. In particular, our analysis suggested a coordinate basal decrease in myelin-associated gene expression within NAC and PFC as an underlying factor in sensitivity of Fyn null animals to ethanol sedation. An in silico analysis across the BXD recombinant inbred (RI) strains of mice identified a significant correlation between Fyn expression and a previously published ethanol loss-of-righting-reflex (LORR) phenotype. By combining PFC gene expression correlates to Fyn and LORR across multiple genomic datasets, we identified robust Fyn-centric gene networks related to LORR. Our results thus suggest that multiple system-wide changes exist within specific brain regions of Fyn knockout mice, and that distinct Fyn-dependent expression networks within PFC may be important determinants of the LORR due to acute ethanol. These results add to the interpretation of acute ethanol behavioral sensitivity in Fyn kinase null animals, and identify Fyn-centric gene networks influencing variance in ethanol LORR. Such networks may also inform future design of pharmacotherapies for the treatment and prevention of alcohol use disorders. PMID:24312422

  12. Use of Attribute Driven Incremental Discretization and Logic Learning Machine to build a prognostic classifier for neuroblastoma patients.

    PubMed

    Cangelosi, Davide; Muselli, Marco; Parodi, Stefano; Blengio, Fabiola; Becherini, Pamela; Versteeg, Rogier; Conte, Massimo; Varesio, Luigi

    2014-01-01

    A cancer patient's outcome is written, in part, in the gene expression profile of the tumor. We previously identified a 62-probe-set signature (NB-hypo) to identify tissue hypoxia in neuroblastoma tumors and showed that NB-hypo stratified neuroblastoma patients into good and poor outcome groups. It was important to develop a prognostic classifier to cluster patients into risk groups benefiting from defined therapeutic approaches. Novel classification and data discretization approaches can be instrumental for the generation of accurate predictors and robust tools for clinical decision support. We explored the application to gene expression data of Rulex, a novel software suite including the Attribute Driven Incremental Discretization technique for transforming continuous variables into simplified discrete ones and the Logic Learning Machine (LLM) model for intelligible rule generation. We applied Rulex components to the problem of predicting the outcome of neuroblastoma patients on the basis of the 62-probe-set NB-hypo gene expression signature. The resulting classifier consisted of 9 rules utilizing mainly two conditions on the relative expression of 11 probe sets. These rules were very effective predictors, as shown in an independent validation set, demonstrating the validity of the LLM algorithm applied to microarray data and patients' classification. The LLM performed as efficiently as Prediction Analysis of Microarray and Support Vector Machine, and outperformed other learning algorithms such as C4.5. Rulex carried out a feature selection by selecting a new signature (NB-hypo-II) of 11 probe sets that turned out to be the most relevant in predicting outcome among the 62 of the NB-hypo signature. Rules are easily interpretable as they involve only a few conditions. Our findings provided evidence that the application of Rulex to the expression values of the NB-hypo signature created a set of accurate, high quality, consistent and interpretable rules for the prediction of neuroblastoma patients' outcome. We identified the Rulex weighted classification as a flexible tool that can support clinical decisions. For these reasons, we consider Rulex to be a useful tool for cancer classification from microarray gene expression data.

  13. Identification of genes associated with renal cell carcinoma using gene expression profiling analysis.

    PubMed

    Yao, Ting; Wang, Qinfu; Zhang, Wenyong; Bian, Aihong; Zhang, Jinping

    2016-07-01

    Renal cell carcinoma (RCC) is the most common type of kidney cancer in adults and accounts for ~80% of all kidney cancer cases. However, the pathogenesis of RCC has not yet been fully elucidated. To interpret the pathogenesis of RCC at the molecular level, gene expression data and bioinformatics methods were used to identify RCC-associated genes. Gene expression data were downloaded from the Gene Expression Omnibus (GEO) database and used to identify differentially coexpressed genes (DCGs) and dysfunctional pathways in RCC patients compared with controls. In addition, a regulatory network was constructed using the known regulatory data between transcription factors (TFs) and target genes in the University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) and the regulatory impact factor of each TF was calculated. A total of 258,0427 pairs of DCGs were identified. The regulatory network contained 1,525 pairs of regulatory associations between 126 TFs and 1,259 target genes and these genes were mainly enriched in cancer pathways, ErbB and MAPK. In the regulatory network, the 10 most strongly associated TFs were FOXC1, GATA3, ESR1, FOXL1, PATZ1, MYB, STAT5A, EGR2, EGR3 and PELP1. GATA3, ERG and MYB serve important roles in RCC while FOXC1, ESR1, FOXL1, PATZ1, STAT5A and PELP1 may be potential genes associated with RCC. In conclusion, the present study constructed a regulatory network and screened out several TFs that may be used as molecular biomarkers of RCC. However, future studies are needed to confirm the findings of the present study.

  14. Identification of genes associated with renal cell carcinoma using gene expression profiling analysis

    PubMed Central

    YAO, TING; WANG, QINFU; ZHANG, WENYONG; BIAN, AIHONG; ZHANG, JINPING

    2016-01-01

    Renal cell carcinoma (RCC) is the most common type of kidney cancer in adults and accounts for ~80% of all kidney cancer cases. However, the pathogenesis of RCC has not yet been fully elucidated. To interpret the pathogenesis of RCC at the molecular level, gene expression data and bioinformatics methods were used to identify RCC-associated genes. Gene expression data were downloaded from the Gene Expression Omnibus (GEO) database and used to identify differentially coexpressed genes (DCGs) and dysfunctional pathways in RCC patients compared with controls. In addition, a regulatory network was constructed using the known regulatory data between transcription factors (TFs) and target genes in the University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) and the regulatory impact factor of each TF was calculated. A total of 258,0427 pairs of DCGs were identified. The regulatory network contained 1,525 pairs of regulatory associations between 126 TFs and 1,259 target genes and these genes were mainly enriched in cancer pathways, ErbB and MAPK. In the regulatory network, the 10 most strongly associated TFs were FOXC1, GATA3, ESR1, FOXL1, PATZ1, MYB, STAT5A, EGR2, EGR3 and PELP1. GATA3, ERG and MYB serve important roles in RCC while FOXC1, ESR1, FOXL1, PATZ1, STAT5A and PELP1 may be potential genes associated with RCC. In conclusion, the present study constructed a regulatory network and screened out several TFs that may be used as molecular biomarkers of RCC. However, future studies are needed to confirm the findings of the present study. PMID:27347102

  15. Analysis and interpretation of transcriptomic data obtained from extended Warburg effect genes in patients with clear cell renal cell carcinoma

    PubMed Central

    Sanders, Edward; Diehl, Svenja

    2015-01-01

    Background Many cancers adopt a metabolism that is characterized by the well-known Warburg effect (aerobic glycolysis). Recently, numerous attempts have been made to treat cancer by targeting one or more gene products involved in this pathway without notable success. This work outlines a transcriptomic approach to identify genes that are highly perturbed in clear cell renal cell carcinoma (CCRCC). Methods We developed a model of the extended Warburg effect and outlined the model using Cytoscape. Following this, gene expression fold changes (FCs) for tumor and adjacent normal tissue from patients with CCRCC (GSE6344) were mapped on to the network. Genes with expression FCs of greater than two were considered as potential targets for treatment of CCRCC. Results The Cytoscape network includes glycolysis, gluconeogenesis, the pentose phosphate pathway (PPP), the TCA cycle, the serine/glycine pathway, and partial glutaminolysis and fatty acid synthesis pathways. Gene expression FCs for nine of the 10 CCRCC patients in the GSE6344 data set were consistent with a shift to aerobic glycolysis. Genes involved in glycolysis and the synthesis and transport of lactate were over-expressed, as was the gene that codes for the kinase that inhibits the conversion of pyruvate to acetyl-CoA. Interestingly, genes that code for unique proteins involved in gluconeogenesis were strongly under-expressed, as was also the case for the serine/glycine pathway. These latter two results suggest that the role attributed to the M2 isoform of pyruvate kinase (PKM2), frequently the principal isoform of PK present in cancer, i.e. causing a buildup of glucose metabolites that are shunted into branch pathways for synthesis of key biomolecules, may not be operative in CCRCC. The fact that there was no increase in the expression FC of any gene in the PPP is consistent with this hypothesis. Literature protein data generally support the transcriptomic findings. Conclusions A number of key genes have been identified that could serve as valid targets for anti-cancer pharmaceutical agents. Genes that are highly over-expressed include ENO2, HK2, PFKP, SLC2A3, PDK1, and SLC16A1. Genes that are highly under-expressed include ALDOB, PKLR, PFKFB2, G6PC, PCK1, FBP1, PC, and SUCLG1. PMID:25859558
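
    The fold-change screen described in the Methods can be illustrated with a short Python sketch. It assumes expression values on a linear scale and treats the threshold symmetrically (FC above 2 as over-expressed, FC below 1/2 as under-expressed), which is an assumption rather than something the abstract states. The function name, gene labels, and values are hypothetical placeholders, not data from GSE6344.

```python
def classify_fold_changes(tumor, normal, threshold=2.0):
    """Label each gene by its tumor/normal fold change.

    tumor, normal: dicts mapping gene label -> linear-scale expression value.
    threshold: fold change above which a gene is called over-expressed;
               the reciprocal is used for under-expression (an assumption).
    """
    calls = {}
    for gene, tumor_value in tumor.items():
        normal_value = normal[gene]
        if normal_value <= 0:
            raise ValueError(f"non-positive normal expression for {gene}")
        fold_change = tumor_value / normal_value
        if fold_change > threshold:
            calls[gene] = "over"
        elif fold_change < 1.0 / threshold:
            calls[gene] = "under"
        else:
            calls[gene] = "unchanged"
    return calls


# Placeholder values for illustration only.
tumor = {"GENE_A": 80.0, "GENE_B": 10.0, "GENE_C": 30.0}
normal = {"GENE_A": 20.0, "GENE_B": 50.0, "GENE_C": 25.0}
calls = classify_fold_changes(tumor, normal)
```

    With log2-transformed data the same screen becomes a difference of at least 1 in absolute value, which is often the more convenient form for microarray output.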

  16. RNA-seq Data: Challenges in and Recommendations for Experimental Design and Analysis.

    PubMed

    Williams, Alexander G; Thomas, Sean; Wyman, Stacia K; Holloway, Alisha K

    2014-10-01

    RNA-seq is widely used to determine differential expression of genes or transcripts as well as identify novel transcripts, identify allele-specific expression, and precisely measure translation of transcripts. Thoughtful experimental design and choice of analysis tools are critical to ensure high-quality data and interpretable results. Important considerations for experimental design include number of replicates, whether to collect paired-end or single-end reads, sequence length, and sequencing depth. Common analysis steps in all RNA-seq experiments include quality control, read alignment, assigning reads to genes or transcripts, and estimating gene or transcript abundance. Our aims are two-fold: to make recommendations for common components of experimental design and assess tool capabilities for each of these steps. We also test tools designed to detect differential expression, since this is the most widespread application of RNA-seq. We hope that these analyses will help guide those who are new to RNA-seq and will generate discussion about remaining needs for tool improvement and development. Copyright © 2014 John Wiley & Sons, Inc.

  17. Novelty and fear conditioning induced gene expression in high and low states of anxiety.

    PubMed

    Donley, Melanie P; Rosen, Jeffrey B

    2017-09-01

    Emotional states influence how stimuli are interpreted. High anxiety states in humans lead to more negative, threatening interpretations of novel information, typically accompanied by activation of the amygdala. We developed a handling protocol that induces long-lasting high and low anxiety-like states in rats to explore the role of state anxiety on brain activation during exposure to a novel environment and fear conditioning. In situ hybridization of the inducible transcription factor Egr-1 found increased gene expression in the lateral nucleus of the amygdala (LA) following exposure to a novel environment and contextual fear conditioning in high anxiety-like rats. In contrast, low state anxiety-like rats did not generate Egr-1 increases in LA when placed in a novel chamber. Egr-1 expression was also examined in the dorsal hippocampus and prefrontal cortex. In CA1 of the hippocampus and medial prefrontal cortex (mPFC), Egr-1 expression increased in response to novel context exposure and fear conditioning, independent of state anxiety level. Furthermore, in mPFC, Egr-1 in low anxiety-like rats was increased more with fear conditioning than novel exposure. The current series of experiments show that brain areas involved in fear and anxiety-like states do not respond uniformly to novelty during high and low states of anxiety. © 2017 Donley and Rosen; Published by Cold Spring Harbor Laboratory Press.

  18. A statistical approach to identify, monitor, and manage incomplete curated data sets.

    PubMed

    Howe, Douglas G

    2018-04-02

    Many biological knowledge bases gather data through expert curation of published literature. High data volume, selective partial curation, delays in access, and publication of data prior to the ability to curate it can result in incomplete curation of published data. Knowing which data sets are incomplete and how incomplete they are remains a challenge. Awareness that a data set may be incomplete is important for proper interpretation and for avoiding flawed hypothesis generation, and can justify further exploration of published literature for additional relevant data. Computational methods to assess data set completeness are needed. One such method is presented here. In this work, a multivariate linear regression model was used to identify genes in the Zebrafish Information Network (ZFIN) Database having incomplete curated gene expression data sets. Starting with 36,655 gene records from ZFIN, data aggregation, cleansing, and filtering reduced the set to 9870 gene records suitable for training and testing the model to predict the number of expression experiments per gene. Feature engineering and selection identified the following predictive variables: the number of journal publications; the number of journal publications already attributed for gene expression annotation; the percent of journal publications already attributed for expression data; the gene symbol; and the number of transgenic constructs associated with each gene. Twenty-five percent of the gene records (2483 genes) were used to train the model. The remaining 7387 genes were used to test the model. One hundred and twenty-two and 165 of the 7387 tested genes were identified as missing expression annotations based on their residuals being outside the model's lower or upper 95% confidence interval, respectively. The model had precision of 0.97 and recall of 0.71 at the negative 95% confidence interval and precision of 0.76 and recall of 0.73 at the positive 95% confidence interval. This method can be used to identify data sets that are incompletely curated, as demonstrated using the gene expression data set from ZFIN. This information can help both database resources and data consumers gauge when it may be useful to look further for published data to augment the existing expertly curated information.
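
    The residual-based flagging and its evaluation can be sketched as follows. This Python example is a simplification under stated assumptions: it takes model predictions as given rather than fitting the regression, and it approximates the 95% band as the mean residual plus or minus 1.96 population standard deviations, which may differ from how the author derived the interval. All names and numbers are hypothetical.

```python
from statistics import mean, pstdev


def flag_outlier_residuals(observed, predicted, z=1.96):
    """Return (low, high): genes whose residual falls below or above the band.

    observed, predicted: dicts mapping gene label -> number of expression
    experiments (curated count and model prediction, respectively).
    """
    residuals = {gene: observed[gene] - predicted[gene] for gene in observed}
    values = list(residuals.values())
    center, spread = mean(values), pstdev(values)
    low = {g for g, r in residuals.items() if r < center - z * spread}
    high = {g for g, r in residuals.items() if r > center + z * spread}
    return low, high


def precision_recall(flagged, truly_incomplete):
    """Precision and recall of a flagged set against a reference set."""
    true_positives = len(flagged & truly_incomplete)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(truly_incomplete) if truly_incomplete else 0.0
    return precision, recall


# Hypothetical data: ten genes predicted at 10 experiments each,
# one of which has no curated experiments at all.
predicted = {f"g{i}": 10 for i in range(10)}
observed = dict(predicted, g9=0)
low, high = flag_outlier_residuals(observed, predicted)
```

    Genes in the low set have fewer curated experiments than the model expects, which is the pattern one would inspect first when looking for missing annotations.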

  19. STARNET 2: a web-based tool for accelerating discovery of gene regulatory networks using microarray co-expression data

    PubMed Central

    Jupiter, Daniel; Chen, Hailin; VanBuren, Vincent

    2009-01-01

    Background Although expression microarrays have become a standard tool used by biologists, analysis of data produced by microarray experiments may still present challenges. Comparison of data from different platforms, organisms, and labs may involve complicated data processing, and inferring relationships between genes remains difficult. Results STARNET 2 is a new web-based tool that allows post hoc visual analysis of correlations that are derived from expression microarray data. STARNET 2 facilitates user discovery of putative gene regulatory networks in a variety of species (human, rat, mouse, chicken, zebrafish, Drosophila, C. elegans, S. cerevisiae, Arabidopsis and rice) by graphing networks of genes that are closely co-expressed across a large heterogeneous set of preselected microarray experiments. For each of the represented organisms, raw microarray data were retrieved from NCBI's Gene Expression Omnibus for a selected Affymetrix platform. All pairwise Pearson correlation coefficients were computed for expression profiles measured on each platform, respectively. These precompiled results were stored in a MySQL database, and supplemented by additional data retrieved from NCBI. A web-based tool allows user-specified queries of the database, centered at a gene of interest. The result of a query includes graphs of correlation networks, graphs of known interactions involving genes and gene products that are present in the correlation networks, and initial statistical analyses. Two analyses may be performed in parallel to compare networks, which is facilitated by the new HEATSEEKER module. Conclusion STARNET 2 is a useful tool for developing new hypotheses about regulatory relationships between genes and gene products, and has coverage for 10 species. 
Interpretation of the correlation networks is supported with a database of previously documented interactions, a test for enrichment of Gene Ontology terms, and heat maps of correlation distances that may be used to compare two networks. The list of genes in a STARNET network may be useful in developing a list of candidate genes to use for the inference of causal networks. The tool is freely available at , and does not require user registration. PMID:19828039
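
    The core idea of the record above (pairwise Pearson correlations, then a network centred on a gene of interest) can be illustrated with a minimal sketch. This is not STARNET 2's code: the gene names, expression values, the 0.9 cutoff, and the helper names `pearson` and `neighbors` are all assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def neighbors(profiles, seed, threshold=0.9):
    """Genes whose |r| with the seed gene meets the threshold, strongest first."""
    hits = [(g, pearson(profiles[seed], p)) for g, p in profiles.items() if g != seed]
    hits = [(g, r) for g, r in hits if abs(r) >= threshold]
    return sorted(hits, key=lambda gr: -abs(gr[1]))

# Toy expression profiles (hypothetical values, one number per experiment).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
    "geneD": [1.0, 3.0, 2.0, 1.0],  # only weakly related to geneA
}

result = neighbors(profiles, "geneA")
for gene, r in result:
    print(gene, round(r, 3))
```

    Thresholding on the absolute value keeps strongly anti-correlated genes as well; whether a real tool does so is a design choice not stated in the abstract.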

  20. Radiogenomics of hepatocellular carcinoma: multiregion analysis-based identification of prognostic imaging biomarkers by integrating gene data—a preliminary study

    NASA Astrophysics Data System (ADS)

    Xia, Wei; Chen, Ying; Zhang, Rui; Yan, Zhuangzhi; Zhou, Xiaobo; Zhang, Bo; Gao, Xin

    2018-02-01

    Our objective was to identify prognostic imaging biomarkers for hepatocellular carcinoma in contrast-enhanced computed tomography (CECT) with biological interpretations by associating imaging features and gene modules. We retrospectively analyzed 371 patients who had gene expression profiles. For the 38 patients with CECT imaging data, automatic intra-tumor partitioning was performed, resulting in three spatially distinct subregions. We extracted a total of 37 quantitative imaging features describing intensity, geometry, and texture from each subregion. Imaging features were selected after robustness and redundancy analysis. Gene modules acquired from clustering were chosen for their prognostic significance. By constructing an association map between imaging features and gene modules with Spearman rank correlations, the imaging features that significantly correlated with gene modules were obtained. These features were evaluated with Cox’s proportional hazard models and Kaplan-Meier estimates to determine their prognostic capabilities for overall survival (OS). Eight imaging features were significantly correlated with prognostic gene modules, and two of them were associated with OS. Among these, the geometry feature volume fraction of the subregion, which was significantly correlated with all prognostic gene modules representing cancer-related interpretation, was predictive of OS (Cox p = 0.022, hazard ratio = 0.24). The texture feature cluster prominence in the subregion, which was correlated with the prognostic gene module representing lipid metabolism and complement activation, also had the ability to predict OS (Cox p = 0.021, hazard ratio = 0.17). Imaging features depicting the volume fraction and textural heterogeneity in subregions have the potential to be predictors of OS with interpretable biological meaning.
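
    The association step described above relies on Spearman rank correlation, which is the Pearson correlation of the ranks. A minimal sketch follows, with tied values given their average rank. The feature and module values are invented for illustration, and the function names are hypothetical rather than taken from the study.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rho computed as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical per-patient values: one imaging feature, one gene-module score.
volume_fraction = [0.1, 0.4, 0.2, 0.9]
module_score = [1.0, 8.0, 3.0, 100.0]  # monotone in the feature, but not linear

rho = spearman(volume_fraction, module_score)
print(round(rho, 3))
```

    Because only ranks enter the calculation, a monotone but non-linear relationship still yields a perfect correlation, which is one reason rank correlation suits features measured on very different scales.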

  1. Discovering semantic features in the literature: a foundation for building functional associations

    PubMed Central

    Chagoyen, Monica; Carmona-Saez, Pedro; Shatkay, Hagit; Carazo, Jose M; Pascual-Montano, Alberto

    2006-01-01

    Background Experimental techniques such as DNA microarray, serial analysis of gene expression (SAGE) and mass spectrometry proteomics, among others, are generating large amounts of data related to genes and proteins at different levels. As in any other experimental approach, it is necessary to analyze these data in the context of previously known information about the biological entities under study. The literature is a particularly valuable source of information for experiment validation and interpretation. Therefore, the development of automated text mining tools to assist in such interpretation is one of the main challenges in current bioinformatics research. Results We present a method to create literature profiles for large sets of genes or proteins based on common semantic features extracted from a corpus of relevant documents. These profiles can be used to establish pair-wise similarities among genes, utilized in gene/protein classification or can be even combined with experimental measurements. Semantic features can be used by researchers to facilitate the understanding of the commonalities indicated by experimental results. Our approach is based on non-negative matrix factorization (NMF), a machine-learning algorithm for data analysis, capable of identifying local patterns that characterize a subset of the data. The literature is thus used to establish putative relationships among subsets of genes or proteins and to provide coherent justification for this clustering into subsets. We demonstrate the utility of the method by applying it to two independent and vastly different sets of genes. Conclusion The presented method can create literature profiles from documents relevant to sets of genes. 
The representation of genes as additive linear combinations of semantic features allows for the exploration of functional associations as well as for clustering, suggesting a valuable methodology for the validation and interpretation of high-throughput experimental data. PMID:16438716

  2. A Novel Strategy for Selection and Validation of Reference Genes in Dynamic Multidimensional Experimental Design in Yeast

    PubMed Central

    Cankorur-Cetinkaya, Ayca; Dereli, Elif; Eraslan, Serpil; Karabekmez, Erkan; Dikicioglu, Duygu; Kirdar, Betul

    2012-01-01

    Background Understanding the dynamic mechanism behind the transcriptional organization of genes in response to varying environmental conditions requires time-dependent data. The dynamic transcriptional response obtained by real-time RT-qPCR experiments could only be correctly interpreted if suitable reference genes are used in the analysis. The lack of available studies on the identification of candidate reference genes in dynamic gene expression studies necessitates the identification and the verification of a suitable gene set for the analysis of transient gene expression response. Principal Findings In this study, a candidate reference gene set for RT-qPCR analysis of dynamic transcriptional changes in Saccharomyces cerevisiae was determined using 31 different publicly available time series transcriptome datasets. Ten of the twelve candidates (TPI1, FBA1, CCW12, CDC19, ADH1, PGK1, GCN4, PDC1, RPS26A and ARF1) we identified were not previously reported as potential reference genes. Our method also identified the commonly used reference genes ACT1 and TDH3. The most stable reference genes from this pool were determined as TPI1, FBA1, CDC19 and ACT1 in response to a perturbation in the amount of available glucose and as FBA1, TDH3, CCW12 and ACT1 in response to a perturbation in the amount of available ammonium. The use of these newly proposed gene sets outperformed the use of common reference genes in the determination of dynamic transcriptional response of the target genes, HAP4 and MEP2, in response to relaxation from glucose and ammonium limitations, respectively. Conclusions A candidate reference gene set to be used in dynamic real-time RT-qPCR expression profiling in yeast was proposed for the first time in the present study. Suitable pools of stable reference genes to be used under different experimental conditions could be selected from this candidate set in order to successfully determine the expression profiles for the genes of interest. PMID:22675547

  3. Sex Difference in Daily Rhythms of Clock Gene Expression in the Aged Human Cerebral Cortex

    PubMed Central

    Lim, Andrew S.P.; Myers, Amanda J.; Yu, Lei; Buchman, Aron S.; Duffy, Jeanne F.; De Jager, Philip L.; Bennett, David A.

    2013-01-01

    Background Studies using self-report and physiological markers of circadian rhythmicity have demonstrated sex differences in a number of circadian attributes including morningness-eveningness, entrained phase, and intrinsic period. However, these sex differences have not been examined at the level of the molecular clock, and not in human cerebral cortex. We tested the hypothesis that there are detectable daily rhythms of clock gene expression in human cerebral cortex, and that there are significant sex differences in the timing of these rhythms. Methods We quantified the expression levels of three clock genes – PER2, PER3, and ARNTL1 in samples of dorsolateral prefrontal cortex from 490 deceased individuals in two cohort studies of older individuals, the Religious Orders Study and the Rush Memory and Aging Project, using mRNA microarray data. We parameterized clock gene expression at death as a function of time of death using cosine curves, and examined for sex differences in the phase of these curves. Findings Significant daily variation was seen in the expression of PER2 (p=0.004), PER3 (p=0.003) and ARNTL1 (p=0.0005). PER2/3 expression peaked at 10:38 [95%CI 9:20–11:56] and 10:44 [95%CI 9:29–11:59] respectively, and ARNTL1 expression peaked in antiphase to this at 21:23 [95%CI 20:16–22:30]. The timing of the expression of all three genes was significantly earlier in women than in men (PER2 6.8 hours p=0.002; PER3 5.5 hours p=0.001; ARNTL1 4.7 hours p=0.007). Interpretation Daily rhythms of clock gene expression are present in human cerebral cortex and can be inferred from postmortem samples. Moreover, these rhythms are relatively delayed in men compared to women. PMID:23606611
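
    Fitting a cosine curve to expression as a function of time of death can be done by ordinary least squares once the model is linearised: y = M + A·cos(ω(t − φ)) equals M + b·cos(ωt) + c·sin(ωt) with A = √(b² + c²) and ωφ = atan2(c, b). The sketch below assumes a fixed 24-hour period and uses synthetic, noise-free data. The function names and values are hypothetical and this is not the authors' analysis code.

```python
from math import cos, sin, atan2, hypot, pi

def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_cosinor(times_h, values, period_h=24.0):
    """Least-squares fit of y = mesor + amp * cos(w * (t - peak)); returns (mesor, amp, peak hour)."""
    w = 2 * pi / period_h
    rows = [(1.0, cos(w * t), sin(w * t)) for t in times_h]
    # Normal equations (X^T X) beta = X^T y for the linearised model.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mesor, b_cos, b_sin = solve(xtx, xty)
    amp = hypot(b_cos, b_sin)
    peak = (atan2(b_sin, b_cos) / w) % period_h
    return mesor, amp, peak

# Synthetic samples: irregular "times of death" (hours) and a rhythm peaking at 10.5 h.
times = [0.5, 3.0, 7.0, 9.5, 13.0, 16.0, 20.0, 22.5]
values = [5.0 + 2.0 * cos(2 * pi * (t - 10.5) / 24.0) for t in times]

mesor, amp, peak = fit_cosinor(times, values)
print(round(mesor, 3), round(amp, 3), round(peak, 3))
```

    A sex difference in timing, as tested in the record above, would correspond to a difference in the fitted peak hour between two groups; the statistical test for that difference is beyond this sketch.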

  4. Systems Biology Analysis of Gene Expression during In Vivo Mycobacterium avium paratuberculosis Enteric Colonization Reveals Role for Immune Tolerance

    PubMed Central

    Khare, Sangeeta; Lawhon, Sara D.; Drake, Kenneth L.; Nunes, Jairo E. S.; Figueiredo, Josely F.; Rossetti, Carlos A.; Gull, Tamara; Everts, Robin E.; Lewin, Harris A.; Galindo, Cristi L.; Garner, Harold R.; Adams, Leslie Garry

    2012-01-01

    Survival and persistence of Mycobacterium avium subsp. paratuberculosis (MAP) in the intestinal mucosa is associated with host immune tolerance. However, the initial events during MAP interaction with its host that lead to pathogen survival, granulomatous inflammation, and clinical disease progression are poorly defined. We hypothesize that immune tolerance is initiated upon initial contact of MAP with the intestinal Peyer's patch. To test our hypothesis, ligated ileal loops in neonatal calves were infected with MAP. Intestinal tissue RNAs were collected (0.5, 1, 2, 4, 8 and 12 hrs post-infection), processed, and hybridized to bovine gene expression microarrays. By comparing the gene transcription responses of calves infected with the MAP, informative complex patterns of expression were clearly visible. To interpret these complex data, changes in the gene expression were further analyzed by dynamic Bayesian analysis, and genes were grouped into the specific pathways and gene ontology categories to create a holistic model. This model revealed three different phases of responses: i) early (30 min and 1 hr post-infection), ii) intermediate (2, 4 and 8 hrs post-infection), and iii) late (12 hrs post-infection). We describe here the data that include expression profiles for perturbed pathways, as well as, mechanistic genes (genes predicted to have regulatory influence) that are associated with immune tolerance. In the Early Phase of MAP infection, multiple pathways were initiated in response to MAP invasion via receptor mediated endocytosis and changes in intestinal permeability. During the Intermediate Phase, perturbed pathways involved the inflammatory responses, cytokine-cytokine receptor interaction, and cell-cell signaling. During the Late Phase of infection, gene responses associated with immune tolerance were initiated at the level of T-cell signaling. 
Our study provides evidence that MAP infection resulted in differentially regulated genes, perturbed pathways and specifically modified mechanistic genes contributing to the colonization of Peyer's patch. PMID:22912686

  5. Building gene co-expression networks using transcriptomics data for systems biology investigations: Comparison of methods using microarray data

    PubMed Central

    Kadarmideen, Haja N; Watson-Haigh, Nathan S

    2012-01-01

    Gene co-expression networks (GCN), built using high-throughput gene expression data, are a fundamental aspect of systems biology. The main aim of this study was to compare two popular approaches to building and analysing GCN. We use real ovine microarray transcriptomics datasets representing four different treatments with Metyrapone, an inhibitor of cortisol biosynthesis. We conducted several microarray quality control checks before applying GCN methods to filtered datasets. Then we compared the outputs of the two methods using connectivity as a criterion, as it measures how well a node (gene) is connected within a network. The two GCN construction methods used were Weighted Gene Co-expression Network Analysis (WGCNA) and Partial Correlation and Information Theory (PCIT). Nodes were ranked based on their connectivity measures in each of the four different networks created by WGCNA and PCIT, and node ranks in the two methods were compared to identify those nodes which are highly differentially ranked (HDR). A total of 1,017 HDR nodes were identified across one or more of the four networks. We investigated HDR nodes by gene enrichment analyses in relation to their biological relevance to phenotypes. We observed that, in contrast to the WGCNA method, the PCIT algorithm removes many of the edges of the most highly interconnected nodes. Removal of edges of the most highly connected nodes, or hub genes, will have consequences for downstream analyses and biological interpretations. In general, for large GCN construction (with > 20000 genes), access to large computer clusters, particularly those with larger amounts of shared memory, is recommended. PMID:23144540
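
    The comparison described above (rank nodes by connectivity in each network, then look for nodes whose rank differs between methods) can be sketched on toy data. The two adjacency dictionaries below are hypothetical stand-ins for a dense and an edge-pruned network; they are not WGCNA or PCIT output, and the abstract does not give the cutoff that defines a highly differentially ranked node.

```python
def connectivity(adj):
    """Sum of edge weights attached to each node of a symmetric adjacency dict."""
    return {g: sum(w for h, w in nbrs.items() if h != g) for g, nbrs in adj.items()}

def ranks(scores):
    """Rank 1 = most connected node; ties are broken by name for determinism."""
    ordered = sorted(scores, key=lambda g: (-scores[g], g))
    return {g: i + 1 for i, g in enumerate(ordered)}

# Toy dense network: "hub" is connected to every other gene.
dense = {
    "hub": {"g1": 0.9, "g2": 0.8, "g3": 0.7},
    "g1": {"hub": 0.9, "g2": 0.2},
    "g2": {"hub": 0.8, "g1": 0.2, "g3": 0.3},
    "g3": {"hub": 0.7, "g2": 0.3},
}
# Toy pruned network: two of the hub's three edges have been removed.
pruned = {
    "hub": {"g1": 0.9},
    "g1": {"hub": 0.9, "g2": 0.2},
    "g2": {"g1": 0.2, "g3": 0.3},
    "g3": {"g2": 0.3},
}

rank_dense = ranks(connectivity(dense))
rank_pruned = ranks(connectivity(pruned))
shift = {g: abs(rank_dense[g] - rank_pruned[g]) for g in dense}
print(shift)
```

    In this toy case the hub loses its top rank once its edges are pruned, which mirrors the abstract's observation that removing hub edges changes downstream rankings.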

  6. A HYPOTHESIS ACCOUNTING FOR THE PARADOXICAL EXPRESSION OF THE D GENE SEGMENT IN THE BCR AND THE TCR

    PubMed Central

    Cohn, Melvin

    2009-01-01

    The D gene segment expressed in both the TCR and BCR has a challenging behavior that begs interpretation. It is incorporated in three reading frames in the rearranged transcription unit but is expressed in antigen-selected cells in a preferred frame. Why was it so important to waste 2/3 of newborn cells? The hypothesis is presented that the D region is framework playing a role in both the TCR and the BCR by determining whether a signal is transmitted to the cell upon interaction with a cognate ligand. This assumption operates in determining haplotype exclusion for the BCR and in regulating the signaling orientation for the TCR. Relevant data, as well as a definitive experiment challenging the validity of this hypothesis, are discussed. PMID:18546143

  7. Interpretable Early Classification of Multivariate Time Series

    ERIC Educational Resources Information Center

    Ghalwash, Mohamed F.

    2013-01-01

    Recent advances in technology have led to an explosion in data collection over time rather than in a single snapshot. For example, microarray technology allows us to measure gene expression levels in different conditions over time. Such temporal data grants the opportunity for data miners to develop algorithms to address domain-related problems,…

  8. Understanding disease mechanisms with models of signaling pathway activities.

    PubMed

    Sebastian-Leon, Patricia; Vidal, Enrique; Minguez, Pablo; Conesa, Ana; Tarazona, Sonia; Amadoz, Alicia; Armero, Carmen; Salavert, Francisco; Vidal-Puig, Antonio; Montaner, David; Dopazo, Joaquín

    2014-10-25

    Understanding the aspects of cell functionality that account for disease or drug action mechanisms is one of the main challenges in the analysis of genomic data and is the basis for the future implementation of precision medicine. Here we propose a simple probabilistic model in which signaling pathways are separated into elementary sub-pathways or signal transmission circuits (which ultimately trigger cell functions) and gene expression measurements are then transformed into probabilities of activation of such signal transmission circuits. Using this model, differential activation of such circuits between biological conditions can be estimated. Thus, circuit activation statuses can be interpreted as biomarkers that discriminate among the compared conditions. This type of mechanism-based biomarker accounts for cell functional activities and can easily be associated with disease or drug action mechanisms. The accuracy of the proposed model is demonstrated with simulations and real datasets. The proposed model provides detailed information that enables the interpretation of disease mechanisms as a consequence of the complex combinations of altered gene expression values. Moreover, it offers a framework for suggesting possible ways of therapeutic intervention in a pathologically perturbed system.

  9. GC-Content Normalization for RNA-Seq Data

    PubMed Central

    2011-01-01

    Background Transcriptome sequencing (RNA-Seq) has become the assay of choice for high-throughput studies of gene expression. However, as is the case with microarrays, major technology-related artifacts and biases affect the resulting expression measures. Normalization is therefore essential to ensure accurate inference of expression levels and subsequent analyses thereof. Results We focus on biases related to GC-content and demonstrate the existence of strong sample-specific GC-content effects on RNA-Seq read counts, which can substantially bias differential expression analysis. We propose three simple within-lane gene-level GC-content normalization approaches and assess their performance on two different RNA-Seq datasets, involving different species and experimental designs. Our methods are compared to state-of-the-art normalization procedures in terms of bias and mean squared error for expression fold-change estimation and in terms of Type I error and p-value distributions for tests of differential expression. The exploratory data analysis and normalization methods proposed in this article are implemented in the open-source Bioconductor R package EDASeq. Conclusions Our within-lane normalization procedures, followed by between-lane normalization, reduce GC-content bias and lead to more accurate estimates of expression fold-changes and tests of differential expression. Such results are crucial for the biological interpretation of RNA-Seq experiments, where downstream analyses can be sensitive to the supplied lists of genes. PMID:22177264

  10. Co-expression networks reveal the tissue-specific regulation of transcription and splicing.

    PubMed

    Saha, Ashis; Kim, Yungil; Gewirtz, Ariel D H; Jo, Brian; Gao, Chuan; McDowell, Ian C; Engelhardt, Barbara E; Battle, Alexis

    2017-11-01

    Gene co-expression networks capture biologically important patterns in gene expression data, enabling functional analyses of genes, discovery of biomarkers, and interpretation of genetic variants. Most network analyses to date have been limited to assessing correlation between total gene expression levels in a single tissue or small sets of tissues. Here, we built networks that additionally capture the regulation of relative isoform abundance and splicing, along with tissue-specific connections unique to each of a diverse set of tissues. We used the Genotype-Tissue Expression (GTEx) project v6 RNA sequencing data across 50 tissues and 449 individuals. First, we developed a framework called Transcriptome-Wide Networks (TWNs) for combining total expression and relative isoform levels into a single sparse network, capturing the interplay between the regulation of splicing and transcription. We built TWNs for 16 tissues and found that hubs in these networks were strongly enriched for splicing and RNA binding genes, demonstrating their utility in unraveling regulation of splicing in the human transcriptome. Next, we used a Bayesian biclustering model that identifies network edges unique to a single tissue to reconstruct Tissue-Specific Networks (TSNs) for 26 distinct tissues and 10 groups of related tissues. Finally, we found genetic variants associated with pairs of adjacent nodes in our networks, supporting the estimated network structures and identifying 20 genetic variants with distant regulatory impact on transcription and splicing. Our networks provide an improved understanding of the complex relationships of the human transcriptome across tissues. © 2017 Saha et al.; Published by Cold Spring Harbor Laboratory Press.

  11. An Approximation to the Temporal Order in Endogenous Circadian Rhythms of Genes Implicated in Human Adipose Tissue Metabolism

    PubMed Central

    GARAULET, MARTA; ORDOVÁS, JOSÉ M.; GÓMEZ-ABELLÁN, PURIFICACIÓN; MARTÍNEZ, JOSE A.; MADRID, JUAN A.

    2015-01-01

    Although it is well established that human adipose tissue (AT) shows circadian rhythmicity, published studies have been discussed as if tissues or systems showed only one or few circadian rhythms at a time. Our objective was to provide an overall view of the internal temporal order of circadian rhythms in human AT, including genes implicated in metabolic processes such as energy intake and expenditure, insulin resistance, adipocyte differentiation, dyslipidemia, and body fat distribution. Visceral and subcutaneous abdominal AT biopsies (n = 6) were obtained from morbidly obese women (BMI ≥ 40 kg/m²). To investigate rhythmic expression patterns, AT explants were cultured for 24 h and gene expression was analyzed at the following times: 08:00, 14:00, 20:00, and 02:00 h using quantitative real-time PCR. Clock genes, glucocorticoid metabolism-related genes, leptin, adiponectin and their receptors were studied. Significant differences were found both in acrophases and relative amplitude among genes (P < 0.05). The amplitude of most gene rhythms was high (>30%). When interpreting the phase map of gene expression in both depots, data indicated that circadian rhythmicity of the genes studied followed a predictable physiological pattern, particularly for subcutaneous AT. Of particular interest are the relationships between the circadian profiles of adiponectin, leptin, and glucocorticoid metabolism-related genes. Their metabolic significance is discussed. Visceral AT behaved differently from subcutaneous AT for most of the genes studied. For every gene, protein mRNA levels fluctuated during the day in synchrony with its receptors. We have provided an overall view of the internal temporal order of circadian rhythms in human adipose tissue. PMID:21520059

  12. GeneAnalytics: An Integrative Gene Set Analysis Tool for Next Generation Sequencing, RNAseq and Microarray Data.

    PubMed

    Ben-Ari Fuchs, Shani; Lieder, Iris; Stelzer, Gil; Mazor, Yaron; Buzhor, Ella; Kaplan, Sergey; Bogoch, Yoel; Plaschkes, Inbar; Shitrit, Alina; Rappaport, Noa; Kohn, Asher; Edgar, Ron; Shenhav, Liraz; Safran, Marilyn; Lancet, Doron; Guan-Golan, Yaron; Warshawsky, David; Shtrichman, Ronit

    2016-03-01

    Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from "data-to-knowledge-to-innovation," a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ ( geneanalytics.genecards.org ), a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs the LifeMap Science's GeneCards suite, including GeneCards®, the human gene database; MalaCards, the human diseases database; and PathCards, the biological pathways database. Expression-based analysis in GeneAnalytics relies on LifeMap Discovery®, the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling an advanced matching algorithm for gene-tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell "cards" in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items. 
Such attributes make GeneAnalytics a broadly applicable postgenomics data analyses and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics, pharmacogenomics, vaccinomics, and others yet to emerge on the postgenomics horizon.

  13. Ras GTPases Modulate Morphogenesis, Sporulation and Cellulase Gene Expression in the Cellulolytic Fungus Trichoderma reesei

    PubMed Central

    Zhang, Jiwei; Zhang, Yanmei; Zhong, Yaohua; Qu, Yinbo; Wang, Tianhong

    2012-01-01

    Background The model cellulolytic fungus Trichoderma reesei (teleomorph Hypocrea jecorina) is capable of responding to environmental cues to compete for nutrients in its natural saprophytic habitat even though its genome encodes fewer degradative enzymes. Efficient signalling pathways in perception and interpretation of environmental signals are indispensable in this process. Ras GTPases are critical signalling proteins involved in signal transduction and regulation of gene expression. In T. reesei the genome contains two Ras subfamily small GTPases TrRas1 and TrRas2 homologous to Ras1 and Ras2 from S. cerevisiae, but their functions remain unknown. Methodology/Principal Findings Here, we have investigated the roles of GTPases TrRas1 and TrRas2 during fungal morphogenesis and cellulase gene expression. We show that both TrRas1 and TrRas2 play important roles in some cellular processes such as polarized apical growth, hyphal branch formation, sporulation and cAMP level adjustment, while TrRas1 is more dominant in these processes. Strikingly, we find that TrRas2 is involved in modulation of cellulase gene expression. Deletion of TrRas2 results in considerably decreased transcription of cellulolytic genes upon growth on cellulose. Although the strain carrying a constitutively activated TrRas2G16V allele exhibits increased cellulase gene transcription, the cbh1 and cbh2 expression in this mutant still strictly depends on cellulose, indicating TrRas2 does not directly mediate the transmission of the cellulose signal. In addition, our data suggest that the effect of TrRas2 on cellulase gene is exerted through regulation of transcript abundance of cellulase transcription factors such as Xyr1, but the influence is independent of cAMP signalling pathway. Conclusions/Significance Together, these findings elucidate the functions for Ras signalling of T. reesei in cellular morphogenesis, especially in cellulase gene expression, which contribute to deciphering the powerful competitive ability of plant cell wall degrading fungi in nature. PMID:23152805

  14. Pathway activity inference for multiclass disease classification through a mathematical programming optimisation framework.

    PubMed

    Yang, Lingjian; Ainali, Chrysanthi; Tsoka, Sophia; Papageorgiou, Lazaros G

    2014-12-05

    Applying machine learning methods to microarray gene expression profiles for disease classification problems is a popular method to derive biomarkers, i.e. sets of genes that can predict disease state or outcome. Traditional approaches in which the expression of each gene was treated independently suffer from low prediction accuracy and difficulty of biological interpretation. Current research efforts focus on integrating information on protein interactions through biochemical pathway datasets with expression profiles to propose pathway-based classifiers that can enhance disease diagnosis and prognosis. As most of the pathway activity inference methods in the literature are either unsupervised or applied to two-class datasets, there is good scope to address such limitations by proposing novel methodologies. A supervised multiclass pathway activity inference method using optimisation techniques is reported. For each pathway expression dataset, patterns of its constituent genes are summarised into one composite feature, termed pathway activity, and a novel mathematical programming model is proposed to infer this feature as a weighted linear summation of expression of its constituent genes. Gene weights are determined by the optimisation model, in a way that the resulting pathway activity has the optimal discriminative power with regard to disease phenotypes. Classification is then performed on the resulting low-dimensional pathway activity profile. The model was evaluated through a variety of published gene expression profiles that cover different types of disease. We show that not only does it improve classification accuracy, but it can also perform well in multiclass disease datasets, a limitation of other approaches from the literature. Desirable features of the model include the ability to control the maximum number of genes that may participate in determining pathway activity, which may be pre-specified by the user. 
Overall, this work highlights the potential of building pathway-based multi-phenotype classifiers for accurate disease diagnosis and prognosis problems.
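
    The composite feature described above is a weighted linear sum of the expression values of a pathway's member genes. The sketch below shows only that summation. The weights are fixed, hypothetical numbers, whereas in the reported method they come from the optimisation model, which is not reproduced here.

```python
def pathway_activity(expression, weights):
    """Weighted linear sum of expression over the genes that carry a weight."""
    return sum(w * expression[gene] for gene, w in weights.items())

# Hypothetical weights for a three-gene pathway (a gene left out has weight zero).
weights = {"g1": 0.5, "g2": -1.0, "g3": 0.25}

# Hypothetical expression values for two samples.
samples = {
    "sample1": {"g1": 2.0, "g2": 1.0, "g3": 4.0, "g4": 7.0},
    "sample2": {"g1": 4.0, "g2": 0.0, "g3": 8.0, "g4": 1.0},
}

activity = {name: pathway_activity(expr, weights) for name, expr in samples.items()}
print(activity)
```

    Each sample is reduced to one number per pathway, which is the low-dimensional profile on which classification is then performed. Limiting the number of non-zero weights corresponds to the cap on participating genes mentioned in the abstract.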

  15. Step-by-Step Construction of Gene Co-expression Networks from High-Throughput Arabidopsis RNA Sequencing Data.

    PubMed

    Contreras-López, Orlando; Moyano, Tomás C; Soto, Daniela C; Gutiérrez, Rodrigo A

    2018-01-01

    The rapid increase in the availability of transcriptomics data generated by RNA sequencing represents both a challenge and an opportunity for biologists without bioinformatics training. The challenge is handling, integrating, and interpreting these data sets. The opportunity is to use this information to generate testable hypotheses to understand molecular mechanisms controlling gene expression and biological processes (Fig. 1). A successful strategy to generate tractable hypotheses from transcriptomics data has been to build undirected network graphs based on patterns of gene co-expression. Many examples of new hypotheses derived from network analyses can be found in the literature, spanning different organisms including plants and specific fields such as root developmental biology. In order to make the process of constructing a gene co-expression network more accessible to biologists, here we provide step-by-step instructions using published RNA-seq experimental data obtained from a public database. Similar strategies have been used in previous studies to advance root developmental biology. This guide includes basic instructions for the operation of widely used open source platforms such as Bio-Linux, R, and Cytoscape. Even though the data we used in this example were obtained from Arabidopsis thaliana, the workflow developed in this guide can be easily adapted to work with RNA-seq data from any organism.

  16. The consequences of chromosomal aneuploidy on the transcriptome of cancer cells

    PubMed Central

    Ried, Thomas; Hu, Yue; Difilippantonio, Michael J.; Ghadimi, B. Michael; Grade, Marian; Camps, Jordi

    2016-01-01

    Chromosomal aneuploidies are a defining feature of carcinomas, i.e., tumors of epithelial origin. Such aneuploidies result in tumor specific genomic copy number alterations. The patterns of genomic imbalances are tumor specific, and to a certain extent specific for defined stages of tumor development. Genomic imbalances occur already in premalignant precursor lesions, i.e., before the transition to invasive disease, and their distribution is maintained in metastases, and in cell lines derived from primary tumors. These observations are consistent with the interpretation that tumor specific genomic imbalances are drivers of malignant transformation. Naturally, this precipitates the question of how such imbalances influence the expression of resident genes. A number of laboratories have systematically integrated copy number alterations with gene expression changes in primary tumors and metastases, cell lines, and experimental models of aneuploidy to address the question as to whether genomic imbalances deregulate the expression of one or few key genes, or rather affect the cancer transcriptome more globally. The majority of these studies showed that gene expression levels follow genomic copy number. Therefore, gross genomic copy number changes, including aneuploidies of entire chromosome arms and chromosomes, result in a massive deregulation of the transcriptome of cancer cells. This article is part of a Special Issue entitled: Chromatin in time and space. PMID:22426433

  17. Causal Reasoning on Biological Networks: Interpreting Transcriptional Changes

    NASA Astrophysics Data System (ADS)

    Chindelevitch, Leonid; Ziemek, Daniel; Enayetallah, Ahmed; Randhawa, Ranjit; Sidders, Ben; Brockel, Christoph; Huang, Enoch

    Over the past decade gene expression data sets have been generated at an increasing pace. In addition to ever increasing data generation, the biomedical literature is growing exponentially. The PubMed database (Sayers et al., 2010) comprises more than 20 million citations as of October 2010. The goal of our method is the prediction of putative upstream regulators of observed expression changes based on a set of over 400,000 causal relationships. The resulting putative regulators constitute directly testable hypotheses for follow-up.

  18. Decoding transcriptional enhancers: Evolving from annotation to functional interpretation

    PubMed Central

    Engel, Krysta L.; Mackiewicz, Mark; Hardigan, Andrew A.; Myers, Richard M.; Savic, Daniel

    2016-01-01

    Deciphering the intricate molecular processes that orchestrate the spatial and temporal regulation of genes has become an increasingly major focus of biological research. The differential expression of genes by diverse cell types with a common genome is a hallmark of complex cellular functions, as well as the basis for multicellular life. Importantly, a more coherent understanding of gene regulation is critical for defining developmental processes, evolutionary principles and disease etiologies. Here we present our current understanding of gene regulation by focusing on the role of enhancer elements in these complex processes. Although functional genomic methods have provided considerable advances to our understanding of gene regulation, these assays, which are usually performed on a genome-wide scale, typically provide correlative observations that lack functional interpretation. Recent innovations in genome editing technologies have placed gene regulatory studies at an exciting crossroads, as systematic, functional evaluation of enhancers and other transcriptional regulatory elements can now be performed in a coordinated, high-throughput manner across the entire genome. This review provides insights on transcriptional enhancer function, their role in development and disease, and catalogues experimental tools commonly used to study these elements. Additionally, we discuss the crucial role of novel techniques in deciphering the complex gene regulatory landscape and how these studies will shape future research. PMID:27224938

  19. Decoding transcriptional enhancers: Evolving from annotation to functional interpretation.

    PubMed

    Engel, Krysta L; Mackiewicz, Mark; Hardigan, Andrew A; Myers, Richard M; Savic, Daniel

    2016-09-01

    Deciphering the intricate molecular processes that orchestrate the spatial and temporal regulation of genes has become an increasingly major focus of biological research. The differential expression of genes by diverse cell types with a common genome is a hallmark of complex cellular functions, as well as the basis for multicellular life. Importantly, a more coherent understanding of gene regulation is critical for defining developmental processes, evolutionary principles and disease etiologies. Here we present our current understanding of gene regulation by focusing on the role of enhancer elements in these complex processes. Although functional genomic methods have provided considerable advances to our understanding of gene regulation, these assays, which are usually performed on a genome-wide scale, typically provide correlative observations that lack functional interpretation. Recent innovations in genome editing technologies have placed gene regulatory studies at an exciting crossroads, as systematic, functional evaluation of enhancers and other transcriptional regulatory elements can now be performed in a coordinated, high-throughput manner across the entire genome. This review provides insights on transcriptional enhancer function, their role in development and disease, and catalogues experimental tools commonly used to study these elements. Additionally, we discuss the crucial role of novel techniques in deciphering the complex gene regulatory landscape and how these studies will shape future research. Copyright © 2016 Elsevier Ltd. All rights reserved.

  20. An ensemble of SVM classifiers based on gene pairs.

    PubMed

    Tong, Muchenxuan; Liu, Kun-Hong; Xu, Chungui; Ju, Wenbin

    2013-07-01

    In this paper, a genetic algorithm (GA) based ensemble support vector machine (SVM) classifier built on gene pairs (GA-ESP) is proposed. The SVMs (base classifiers of the ensemble system) are trained on different informative gene pairs. These gene pairs are selected by the top scoring pair (TSP) criterion. Each of these pairs projects the original microarray expression onto a 2-D space. Extensive permutation of gene pairs may reveal more useful information and potentially lead to an ensemble classifier with satisfactory accuracy and interpretability. GA is further applied to select an optimized combination of base classifiers. The effectiveness of the GA-ESP classifier is evaluated on both binary-class and multi-class datasets. Copyright © 2013 Elsevier Ltd. All rights reserved.
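
    The top scoring pair (TSP) criterion mentioned above can be sketched as follows, assuming the usual two-class form in which a gene pair (i, j) is scored by the absolute difference between classes in the frequency of the ordering x_i < x_j. This is only a minimal sketch of the pair-selection step, not the GA-ESP classifier; the function names and the toy data are hypothetical.

```python
from itertools import combinations


def tsp_score(expr, labels, i, j):
    """Score gene pair (i, j): |P(x_i < x_j | class a) - P(x_i < x_j | class b)|.

    expr is a list of samples, each a list of gene expression values;
    labels holds one class label per sample (exactly two classes assumed).
    """
    def ordering_frequency(cls):
        samples = [s for s, y in zip(expr, labels) if y == cls]
        return sum(s[i] < s[j] for s in samples) / len(samples)

    class_a, class_b = sorted(set(labels))
    return abs(ordering_frequency(class_a) - ordering_frequency(class_b))


def top_scoring_pair(expr, labels):
    """Return the gene index pair with the highest TSP score."""
    n_genes = len(expr[0])
    return max(combinations(range(n_genes), 2),
               key=lambda pair: tsp_score(expr, labels, *pair))


# Hypothetical toy data: 4 samples x 3 genes, two classes.
toy_expr = [
    [1.0, 5.0, 10.0],
    [2.0, 6.0, 10.0],
    [5.0, 1.0, 10.0],
    [6.0, 2.0, 10.0],
]
toy_labels = [0, 0, 1, 1]
best = top_scoring_pair(toy_expr, toy_labels)
```

    In the toy data the ordering of genes 0 and 1 flips between the classes, so that pair gets the maximal score, while pairs involving the constant gene 2 score zero.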

  1. snpGeneSets: An R Package for Genome-Wide Study Annotation

    PubMed Central

    Mei, Hao; Li, Lianna; Jiang, Fan; Simino, Jeannette; Griswold, Michael; Mosley, Thomas; Liu, Shijian

    2016-01-01

    Genome-wide studies (GWS) of SNP associations and differential gene expressions have generated abundant results; next-generation sequencing technology has further boosted the number of variants and genes identified. Effective interpretation requires massive annotation and downstream analysis of these genome-wide results, a computationally challenging task. We developed the snpGeneSets package to simplify annotation and analysis of GWS results. Our package integrates local copies of knowledge bases for SNPs, genes, and gene sets, and implements wrapper functions in the R language to enable transparent access to low-level databases for efficient annotation of large genomic data. The package contains functions that execute three types of annotations: (1) genomic mapping annotation for SNPs and genes and functional annotation for gene sets; (2) bidirectional mapping between SNPs and genes, and genes and gene sets; and (3) calculation of gene effect measures from SNP associations and performance of gene set enrichment analyses to identify functional pathways. We applied snpGeneSets to type 2 diabetes (T2D) results from the NHGRI genome-wide association study (GWAS) catalog, a Finnish GWAS, and a genome-wide expression study (GWES). These studies demonstrate the usefulness of snpGeneSets for annotating and performing enrichment analysis of GWS results. The package is open-source, free, and can be downloaded at: https://www.umc.edu/biostats_software/. PMID:27807048

  2. Clustering of change patterns using Fourier coefficients.

    PubMed

    Kim, Jaehee; Kim, Haseong

    2008-01-15

    To understand the behavior of genes, it is important to explore how the patterns of gene expression change over a time period because biologically related gene groups can share the same change patterns. Many clustering algorithms have been proposed to group observation data. However, because of the complexity of the underlying functions there have not been many studies on grouping data based on change patterns. In this study, the problem of finding similar change patterns is reduced to clustering with the derivative Fourier coefficients. The sample Fourier coefficients not only provide information about the underlying functions, but also reduce the dimension. In addition, as their limiting distribution is a multivariate normal, a model-based clustering method incorporating statistical properties would be appropriate. This work is aimed at discovering gene groups with similar change patterns that share similar biological properties. We developed a statistical model using derivative Fourier coefficients to identify similar change patterns of gene expression. We used a model-based method to cluster the Fourier series estimation of derivatives. The model-based method is advantageous over other methods in our proposed model because the sample Fourier coefficients asymptotically follow the multivariate normal distribution. Change patterns are automatically estimated with the Fourier representation in our model. Our model was tested in simulations and on real gene data sets. The simulation results showed that the model-based clustering method with the sample Fourier coefficients has a lower clustering error rate than K-means clustering. Even when the number of repeated time points was small, the same results were obtained. We also applied our model to cluster change patterns of yeast cell cycle microarray expression data with alpha-factor synchronization. It showed that, as the method clusters with the probability-neighboring data, the model-based clustering with our proposed model yielded biologically interpretable results. We expect that our proposed Fourier analysis with suitably chosen smoothing parameters could serve as a useful tool in classifying genes and interpreting possible biological change patterns. The R program is available upon request.

  3. Enhanced identification and biological validation of differential gene expression via Illumina whole-genome expression arrays through the use of the model-based background correction methodology

    PubMed Central

    Ding, Liang-Hao; Xie, Yang; Park, Seongmi; Xiao, Guanghua; Story, Michael D.

    2008-01-01

    Despite the tremendous growth of microarray usage in scientific studies, there is a lack of standards for background correction methodologies, especially in single-color microarray platforms. Traditional background subtraction methods often generate negative signals and thus cause large amounts of data loss. Hence, some researchers prefer to avoid background corrections, which typically result in the underestimation of differential expression. Here, by utilizing nonspecific negative control features integrated into Illumina whole genome expression arrays, we have developed a method of model-based background correction for BeadArrays (MBCB). We compared the MBCB with a method adapted from the Affymetrix robust multi-array analysis algorithm and with no background subtraction, using a mouse acute myeloid leukemia (AML) dataset. We demonstrated that differential expression ratios obtained by using the MBCB had the best correlation with quantitative RT–PCR. MBCB also achieved better sensitivity in detecting differentially expressed genes with biological significance. For example, we demonstrated that the differential regulation of Tnfr2, Ikk and NF-kappaB, the death receptor pathway, in the AML samples, could only be detected by using data after MBCB implementation. We conclude that MBCB is a robust background correction method that will lead to more precise determination of gene expression and better biological interpretation of Illumina BeadArray data. PMID:18450815

  4. NFκB pathway analysis: An approach to analyze gene co-expression networks employing feedback cycles.

    PubMed

    Dillenburg, Fabiane Cristine; Zanotto-Filho, Alfeu; Fonseca Moreira, José Cláudio; Ribeiro, Leila; Carro, Luigi

    2018-02-01

    The genes of the NFκB pathway are involved in the control of a plethora of biological processes ranging from inhibition of apoptosis to metastasis in cancer. It has been described that Glioblastoma multiforme (GBM) patients carry aberrant NFκB activation, but the molecular mechanisms are not completely understood. Here, we present a NFκB pathway analysis in tumor specimens of GBM compared to non-neoplastic brain tissues, based on the different kinds of cycles found among genes of a gene co-expression network constructed using quantized data obtained from the microarrays. A cycle is a closed walk with all vertices distinct (except the first and last). Thanks to this way of finding relations among genes, a more robust interpretation of gene correlations is possible, because the cycles are associated with feedback mechanisms that are very common in biological networks. In GBM samples, we could conclude that the stoichiometric relationship between genes involved in NFκB pathway regulation is unbalanced. This can be measured and explained by the identification of a cycle. This conclusion helps to understand more about the biology of this type of tumor. Copyright © 2017 Elsevier Ltd. All rights reserved.
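
    The definition of a cycle given above (a closed walk whose vertices are all distinct except the first and last) can be made concrete with a small enumeration routine. The sketch below is a generic depth-first search over an undirected graph and is not the authors' method; the function name, the node labels, and the toy edges are hypothetical, and the exhaustive search is only practical for small graphs.

```python
def simple_cycles(adj):
    """Enumerate simple cycles of an undirected graph.

    adj maps each node to the set of its neighbours. Each cycle is reported
    once, as a tuple starting at its smallest node, in a single orientation.
    """
    cycles = []

    def extend(path, start):
        last = path[-1]
        for nxt in sorted(adj[last]):
            if nxt == start and len(path) >= 3:
                # Closing the walk; keep only one of the two orientations.
                if path[1] < path[-1]:
                    cycles.append(tuple(path))
            elif nxt > start and nxt not in path:
                # Only visit nodes larger than the start, so every cycle is
                # found from its smallest node and never repeated.
                extend(path + [nxt], start)

    for start in sorted(adj):
        extend([start], start)
    return cycles


# Hypothetical toy network: a triangle g1-g2-g3 with a pendant node g4.
toy_adj = {
    "g1": {"g2", "g3"},
    "g2": {"g1", "g3"},
    "g3": {"g1", "g2", "g4"},
    "g4": {"g3"},
}
found = simple_cycles(toy_adj)
```

    Here the only cycle is the triangle; the pendant node g4 cannot take part in any closed walk with distinct vertices.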

  5. GO-Bayes: Gene Ontology-based overrepresentation analysis using a Bayesian approach.

    PubMed

    Zhang, Song; Cao, Jing; Kong, Y Megan; Scheuermann, Richard H

    2010-04-01

    A typical approach for the interpretation of high-throughput experiments, such as gene expression microarrays, is to produce groups of genes based on certain criteria (e.g. genes that are differentially expressed). To gain more mechanistic insights into the underlying biology, overrepresentation analysis (ORA) is often conducted to investigate whether gene sets associated with particular biological functions, for example, as represented by Gene Ontology (GO) annotations, are statistically overrepresented in the identified gene groups. However, the standard ORA, which is based on the hypergeometric test, analyzes each GO term in isolation and does not take into account the dependence structure of the GO-term hierarchy. We have developed a Bayesian approach (GO-Bayes) to measure overrepresentation of GO terms that incorporates the GO dependence structure by taking into account evidence not only from individual GO terms, but also from their related terms (i.e. parents, children, siblings, etc.). The Bayesian framework borrows information across related GO terms to strengthen the detection of overrepresentation signals. As a result, this method tends to identify sets of closely related GO terms rather than individual isolated GO terms. The advantage of the GO-Bayes approach is demonstrated with a simulation study and an application example.
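
    For reference, the standard hypergeometric ORA that the abstract contrasts with GO-Bayes can be sketched in a few lines: each GO term is tested in isolation by asking how likely it is to see at least k annotated genes in a selected list of size n. This is a sketch of the baseline test only, not of the Bayesian method; the function name and the numbers in the example are assumptions for illustration.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, drawn, observed):
    """P(X >= observed) for X ~ Hypergeometric.

    population: number of genes in the background
    annotated:  number of background genes carrying the GO term
    drawn:      size of the selected gene list
    observed:   number of selected genes carrying the GO term
    """
    total = comb(population, drawn)
    upper = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(observed, upper + 1)
    )
    return favourable / total


# Hypothetical example: 10 background genes, 4 annotated to a term,
# 3 genes selected, 2 of which carry the term.
p_value = hypergeom_upper_tail(10, 4, 3, 2)
```

    Because each term is scored on its own, parent and child terms in the GO hierarchy are tested as if they were unrelated, which is the limitation the abstract points to.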

  6. Reproducibility-optimized test statistic for ranking genes in microarray studies.

    PubMed

    Elo, Laura L; Filén, Sanna; Lahesmaa, Riitta; Aittokallio, Tero

    2008-01-01

    A principal goal of microarray studies is to identify the genes showing differential expression under distinct conditions. In such studies, the selection of an optimal test statistic is a crucial challenge, which depends on the type and amount of data under analysis. While previous studies on simulated or spike-in datasets do not provide practical guidance on how to choose the best method for a given real dataset, we introduce an enhanced reproducibility-optimization procedure, which enables the selection of a suitable gene-ranking statistic directly from the data. In comparison with existing ranking methods, the reproducibility-optimized statistic shows good performance consistently under various simulated conditions and on an Affymetrix spike-in dataset. Further, the feasibility of the novel statistic is confirmed in a practical research setting using data from an in-house cDNA microarray study of asthma-related gene expression changes. These results suggest that the procedure facilitates the selection of an appropriate test statistic for a given dataset without relying on a priori assumptions, which may bias the findings and their interpretation. Moreover, the general reproducibility-optimization procedure is not limited to detecting differential expression only but could be extended to a wide range of other applications as well.

  7. Mutual information estimation reveals global associations between stimuli and biological processes

    PubMed Central

    Suzuki, Taiji; Sugiyama, Masashi; Kanamori, Takafumi; Sese, Jun

    2009-01-01

    Background Although microarray gene expression analysis has become popular, it remains difficult to interpret the biological changes caused by stimuli or variation of conditions. Clustering of genes and associating each group with biological functions are often used methods. However, such methods only detect partial changes within cell processes. Herein, we propose a method for discovering global changes within a cell by associating observed conditions of gene expression with gene functions. Results To elucidate the association, we introduce a novel feature selection method called Least-Squares Mutual Information (LSMI), which computes mutual information without density estimation, and therefore LSMI can detect nonlinear associations within a cell. We demonstrate the effectiveness of LSMI through comparison with existing methods. The results of the application to yeast microarray datasets reveal that non-natural stimuli affect various biological processes, whereas others have no significant relation to specific cell processes. Furthermore, we discover that biological processes can be categorized into four types according to the responses of various stimuli: DNA/RNA metabolism, gene expression, protein metabolism, and protein localization. Conclusion We proposed a novel feature selection method called LSMI, and applied LSMI to mining the association between conditions of yeast and biological processes through microarray datasets. In fact, LSMI allows us to elucidate the global organization of cellular process control. PMID:19208155

  8. RNA-Seq reveals 10 novel promising candidate genes affecting milk protein concentration in the Chinese Holstein population

    PubMed Central

    Li, Cong; Cai, Wentao; Zhou, Chenghao; Yin, Hongwei; Zhang, Ziqi; Loor, Juan J.; Sun, Dongxiao; Zhang, Qin; Liu, Jianfeng; Zhang, Shengli

    2016-01-01

    Paired-end RNA sequencing (RNA-Seq) was used to explore the bovine transcriptome from the mammary tissue of 12 Chinese Holstein cows with 6 extremely high and 6 low phenotypic values for milk protein percentage. We defined the differentially expressed transcripts between the two comparison groups, extremely high and low milk protein percentage during the peak lactation (HP vs LP) and during the non-lactating period (HD vs LD), respectively. Within the differentially expressed genes (DEGs), we detected 157 at peak lactation and 497 in the non-lactating period with a highly significant correlation with milk protein concentration. Integrated interpretation of differential gene expression indicated that SERPINA1, CLU, CNTFR, ERBB2, NEDD4L, ANG, GALE, HSPA8, LPAR6 and CD14 are the most promising candidate genes affecting milk protein concentration. Similarly, LTF, FCGR3A, MEGF10, RRM2 and UBE2C are the most promising candidates that in the non-lactating period could help the mammary tissue prevent issues with inflammation and udder disorders. Putative genes will be valuable resources for designing better breeding strategies to optimize the content of milk protein and also to provide new insights into regulation of lactogenesis. PMID:27254118

  9. Exploring Plant Co-Expression and Gene-Gene Interactions with CORNET 3.0.

    PubMed

    Van Bel, Michiel; Coppens, Frederik

    2017-01-01

    Selecting and filtering a reference expression and interaction dataset when studying specific pathways and regulatory interactions can be a very time-consuming and error-prone task. In order to reduce the duplicated efforts required to amass such datasets, we have created the CORNET (CORrelation NETworks) platform which allows for easy access to a wide variety of data types: coexpression data, protein-protein interactions, regulatory interactions, and functional annotations. The CORNET platform outputs its results in either text format or through the Cytoscape framework, which is automatically launched by the CORNET website. CORNET 3.0 is the third iteration of the web platform designed for the user exploration of the coexpression space of plant genomes, with a focus on the model species Arabidopsis thaliana. Here we describe the platform: the tools, data, and best practices when using the platform. We indicate how the platform can be used to infer networks from a set of input genes, such as upregulated genes from an expression experiment. By exploring the network, new target and regulator genes can be discovered, allowing for follow-up experiments and more in-depth study. We also indicate how to avoid common pitfalls when evaluating the networks and how to avoid overinterpretation of the results. All CORNET versions are available at http://bioinformatics.psb.ugent.be/cornet/.

  10. A Systems Level, Functional Genomics Analysis of Chronic Epilepsy

    PubMed Central

    Bragin, Anatol; Kudo, Lili C.; Gehman, Lauren; Ruidera, Josephine; Geschwind, Daniel H.; Engel, Jerome

    2011-01-01

    Neither the molecular basis of the pathologic tendency of neuronal circuits to generate spontaneous seizures (epileptogenicity) nor anti-epileptogenic mechanisms that maintain a seizure-free state are well understood. Here, we performed transcriptomic analysis in the intrahippocampal kainate model of temporal lobe epilepsy in rats using both Agilent and Codelink microarray platforms to characterize the epileptic processes. The experimental design allowed subtraction of the confounding effects of the lesion, identification of expression changes associated with epileptogenicity, and genes upregulated by seizures with potential homeostatic anti-epileptogenic effects. Using differential expression analysis, we identified several hundred expression changes in chronic epilepsy, including candidate genes associated with epileptogenicity such as Bdnf and Kcnj13. To analyze these data from a systems perspective, we applied weighted gene co-expression network analysis (WGCNA) to identify groups of co-expressed genes (modules) and their central (hub) genes. One such module contained genes upregulated in the epileptogenic region, including multiple epileptogenicity candidate genes, and was found to be involved in the protection of glial cells against oxidative stress, implicating glial oxidative stress in epileptogenicity. Another distinct module corresponded to the effects of chronic seizures and represented changes in neuronal synaptic vesicle trafficking. We found that the network structure and connectivity of one hub gene, Sv2a, showed significant changes between normal and epileptogenic tissue, becoming more highly connected in epileptic brain. Since Sv2a is a target of the antiepileptic levetiracetam, this module may be important in controlling seizure activity. Bioinformatic analysis of this module also revealed a potential mechanism for the observed transcriptional changes via generation of longer alternatively polyadenylated transcripts through the upregulation of the RNA binding protein HuD. In summary, combining conventional statistical methods and network analysis allowed us to interpret the differentially regulated genes from a systems perspective, yielding new insight into several biological pathways underlying homeostatic anti-epileptogenic effects and epileptogenicity. PMID:21695113

  11. Comprehensive discovery of subsample gene expression components by information explanation: therapeutic implications in cancer.

    PubMed

    Pepke, Shirley; Ver Steeg, Greg

    2017-03-15

    De novo inference of clinically relevant gene function relationships from tumor RNA-seq remains a challenging task. Current methods typically either partition patient samples into a few subtypes or rely upon analysis of pairwise gene correlations that will miss some groups in noisy data. Leveraging higher dimensional information can be expected to increase the power to discern targetable pathways, but this is commonly thought to be an intractable computational problem. In this work we adapt a recently developed machine learning algorithm for sensitive detection of complex gene relationships. The algorithm, CorEx, efficiently optimizes over multivariate mutual information and can be iteratively applied to generate a hierarchy of relatively independent latent factors. The learned latent factors are used to stratify patients for survival analysis with respect to both single factors and combinations. These analyses are performed and interpreted in the context of biological function annotations and protein network interactions that might be utilized to match patients to multiple therapies. Analysis of ovarian tumor RNA-seq samples demonstrates the algorithm's power to infer well over one hundred biologically interpretable gene cohorts, several times more than standard methods such as hierarchical clustering and k-means. The CorEx factor hierarchy is also informative, with related but distinct gene clusters grouped by upper nodes. Some latent factors correlate with patient survival, including one for a pathway connected with the epithelial-mesenchymal transition in breast cancer that is regulated by a microRNA that modulates epigenetics. Further, combinations of factors lead to a synergistic survival advantage in some cases. In contrast to studies that attempt to partition patients into a small number of subtypes (typically 4 or fewer) for treatment purposes, our approach utilizes subgroup information for combinatoric transcriptional phenotyping. Considering only the 66 gene expression groups that are found to both have significant Gene Ontology enrichment and are small enough to indicate specific drug targets implies a computational phenotype for ovarian cancer that allows for 3^66 possible patient profiles, enabling truly personalized treatment. The findings here demonstrate a new technique that sheds light on the complexity of gene expression dependencies in tumors and could eventually enable the use of patient RNA-seq profiles for selection of personalized and effective cancer treatments.

  12. In silico method for modelling metabolism and gene product expression at genome scale

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lerman, Joshua A.; Hyduke, Daniel R.; Latif, Haythem

    2012-07-03

    Transcription and translation use raw materials and energy generated metabolically to create the macromolecular machinery responsible for all cellular functions, including metabolism. A biochemically accurate model of molecular biology and metabolism will facilitate comprehensive and quantitative computations of an organism's molecular constitution as a function of genetic and environmental parameters. Here we formulate a model of metabolism and macromolecular expression. Prototyping it using the simple microorganism Thermotoga maritima, we show our model accurately simulates variations in cellular composition and gene expression. Moreover, through in silico comparative transcriptomics, the model allows the discovery of new regulons and improving the genome and transcription unit annotations. Our method presents a framework for investigating molecular biology and cellular physiology in silico and may allow quantitative interpretation of multi-omics data sets in the context of an integrated biochemical description of an organism.

  13. Tuning the chemosensory window

    PubMed Central

    Zhou, Shanshan; Mackay, Trudy FC

    2010-01-01

    Accurate perception of chemical signals from the environment is critical for the fitness of most animals. Drosophila melanogaster experiences its chemical environment through families of chemoreceptors that include olfactory receptors, gustatory receptors and odorant binding proteins. Its chemical environment, however, changes during its life cycle and the interpretation of chemical signals is dependent on dynamic social and physical surroundings. Phenotypic plasticity of gene expression of the chemoreceptor repertoire allows flies to adjust the chemosensory window through which they “view” their world and to modify the ensemble of expressed chemoreceptor proteins in line with their developmental and physiological state and according to their needs to locate food and oviposition sites under different social and physical environmental conditions. Furthermore, males and females differ in their expression profiles of chemoreceptor genes. Thus, each sex experiences its chemical environment via combinatorial activation of distinct chemoreceptor ensembles. The remarkable phenotypic plasticity of the chemoreceptor repertoire raises several fundamental questions. What are the mechanisms that translate environmental cues into regulation of chemoreceptor gene expression? How are gustatory and olfactory cues integrated perceptually? What is the relationship between ensembles of odorant binding proteins and odorant receptors? And, what is the significance of co-regulated chemoreceptor transcriptional networks? PMID:20305396

  14. Coordinated transcriptional regulation patterns associated with infertility phenotypes in men

    PubMed Central

    Ellis, Peter J I; Furlong, Robert A; Conner, Sarah J; Kirkman‐Brown, Jackson; Afnan, Masoud; Barratt, Christopher; Griffin, Darren K; Affara, Nabeel A

    2007-01-01

    Introduction Microarray gene‐expression profiling is a powerful tool for global analysis of the transcriptional consequences of disease phenotypes. Understanding the genetic correlates of particular pathological states is important for more accurate diagnosis and screening of patients, and thus for suggesting appropriate avenues of treatment. As yet, there has been little research describing gene‐expression profiling of infertile and subfertile men, and thus the underlying transcriptional events involved in loss of spermatogenesis remain unclear. Here we present the results of an initial screen of 33 patients with differing spermatogenic phenotypes. Methods Oligonucleotide array expression profiling was performed on testis biopsies for 33 patients presenting for testicular sperm extraction. Significantly regulated genes were selected using a mixed model analysis of variance. Principal components analysis and hierarchical clustering were used to interpret the resulting dataset with reference to the patient history, clinical findings and histological composition of the biopsies. Results Striking patterns of coordinated gene expression were found. The most significant contains multiple germ cell‐specific genes and corresponds to the degree of successful spermatogenesis in each patient, whereas a second pattern corresponds to inflammatory activity within the testis. Smaller‐scale patterns were also observed, relating to unique features of the individual biopsies. PMID:17496197

  15. SZGR 2.0: a one-stop shop of schizophrenia candidate genes

    PubMed Central

    Jia, Peilin; Han, Guangchun; Zhao, Junfei; Lu, Pinyi; Zhao, Zhongming

    2017-01-01

    SZGR 2.0 is a comprehensive resource of candidate variants and genes for schizophrenia, covering genetic, epigenetic, transcriptomic, translational and many other types of evidence. By systematic review and curation of multiple lines of evidence, we included almost all variants and genes that have ever been reported to be associated with schizophrenia. In particular, we collected ∼4200 common variants reported in genome-wide association studies, ∼1000 de novo mutations discovered by large-scale sequencing of family samples, 215 genes spanning rare and replication copy number variations, 99 genes overlapping with linkage regions, 240 differentially expressed genes, 4651 differentially methylated genes and 49 genes as antipsychotic drug targets. To facilitate interpretation, we included various functional annotation data, especially brain eQTL, methylation QTL, brain expression featured in deep categorization of brain areas and developmental stages and brain-specific promoter and enhancer annotations. Furthermore, we conducted cross-study, cross-data type and integrative analyses of the multidimensional data deposited in SZGR 2.0, and made the data and results available through a user-friendly interface. In summary, SZGR 2.0 provides a one-stop shop of schizophrenia variants and genes and their function and regulation, providing an important resource in the schizophrenia and other mental disease community. SZGR 2.0 is available at https://bioinfo.uth.edu/SZGR/. PMID:27733502

  16. Dynamic changes of RNA-sequencing expression for precision medicine: N-of-1-pathways Mahalanobis distance within pathways of single subjects predicts breast cancer survival

    PubMed Central

    Piegorsch, Walter W.; Lussier, Yves A.

    2015-01-01

    Motivation: The conventional approach to personalized medicine relies on molecular data analytics across multiple patients. The path to precision medicine lies with molecular data analytics that can discover interpretable single-subject signals (N-of-1). We developed a global framework, N-of-1-pathways, for a mechanistic-anchored approach to single-subject gene expression data analysis. We previously employed a metric that could prioritize the statistical significance of a deregulated pathway in single subjects; however, it lacked quantitative interpretability (e.g. the equivalent of a gene expression fold-change). Results: In this study, we extend our previous approach with the application of statistical Mahalanobis distance (MD) to quantify personal pathway-level deregulation. We demonstrate that this approach, N-of-1-pathways Paired Samples MD (N-OF-1-PATHWAYS-MD), detects deregulated pathways (empirical simulations), while not inflating the false-positive rate using a study with biological replicates. Finally, we establish that N-OF-1-PATHWAYS-MD scores are biologically significant, clinically relevant and predictive of breast cancer survival (P < 0.05, n = 80 invasive carcinoma; TCGA RNA-sequences). Conclusion: N-of-1-pathways MD provides a practical approach towards precision medicine. The method generates the magnitude and the biological significance of personal deregulated pathways results derived solely from the patient’s transcriptome. These pathways offer the opportunities for deriving clinically actionable decisions that have the potential to complement the clinical interpretability of personal polymorphisms obtained from DNA acquired or inherited polymorphisms and mutations. In addition, it offers an opportunity for applicability to diseases in which DNA changes may not be relevant, and thus expand the interpretable ‘omics’ of single subjects (e.g. personalome). 
Availability and implementation: http://www.lussierlab.net/publications/N-of-1-pathways. Contact: yves@email.arizona.edu or piegorsch@math.arizona.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26072495
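
    The abstract above names the Mahalanobis distance (MD) but does not spell out how it is applied to paired samples. As a minimal sketch of the generic distance only, and not of the authors' N-OF-1-PATHWAYS-MD procedure, the following standard-library Python treats each gene in a pathway as a two-dimensional point (baseline expression, case expression). The function names and the toy numbers are illustrative assumptions, not values from the study.

```python
import math


def mean_and_cov(points):
    """Sample mean and 2x2 sample covariance of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


def mahalanobis_2d(point, mean, cov):
    """sqrt((p - mean)^T * inverse(cov) * (p - mean)) for a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Closed-form 2x2 inverse: (1/det) * [[d, -b], [-c, a]]
    d2 = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(d2)


# Hypothetical covariance with positive correlation between the two axes.
cov = ((2.0, 1.0), (1.0, 2.0))
along = mahalanobis_2d((1.0, 1.0), (0.0, 0.0), cov)     # follows the correlation
against = mahalanobis_2d((1.0, -1.0), (0.0, 0.0), cov)  # goes against it
```

    Both toy points lie at the same Euclidean distance from the mean, yet the point that runs against the correlation receives the larger MD. That sensitivity to the covariance structure is the general property that makes the distance a candidate for a magnitude-like score.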

  17. Robust Gaussian Graphical Modeling via l1 Penalization

    PubMed Central

    Sun, Hokeun; Li, Hongzhe

    2012-01-01

    Summary Gaussian graphical models have been widely used as an effective method for studying the conditional independency structure among genes and for constructing genetic networks. However, gene expression data typically have heavier tails or more outlying observations than the standard Gaussian distribution. Such outliers in gene expression data can lead to incorrect inference on the dependency structure among the genes. We propose an l1-penalized estimation procedure for sparse Gaussian graphical models that is robustified against possible outliers. The likelihood function is weighted according to how far each observation deviates, where the deviation is measured by the observation's own likelihood. An efficient computational algorithm based on the coordinate gradient descent method is developed to obtain the minimizer of the negative penalized robustified-likelihood, where nonzero elements of the concentration matrix represent the graphical links among the genes. After the graphical structure is obtained, we re-estimate the positive definite concentration matrix using an iterative proportional fitting algorithm. Through simulations, we demonstrate that the proposed robust method performs much better than the graphical Lasso for the Gaussian graphical models in terms of both graph structure selection and estimation when outliers are present. We apply the robust estimation procedure to an analysis of yeast gene expression data and show that the resulting graph has better biological interpretation than that obtained from the graphical Lasso. PMID:23020775

  18. GermOnline 4.0 is a genomics gateway for germline development, meiosis and the mitotic cell cycle.

    PubMed

    Lardenois, Aurélie; Gattiker, Alexandre; Collin, Olivier; Chalmel, Frédéric; Primig, Michael

    2010-01-01

    GermOnline 4.0 is a cross-species database portal focusing on high-throughput expression data relevant for germline development, the meiotic cell cycle and mitosis in healthy versus malignant cells. It is thus a source of information for life scientists as well as clinicians who are interested in gene expression and regulatory networks. The GermOnline gateway provides unlimited access to information produced with high-density oligonucleotide microarrays (3'-UTR GeneChips), genome-wide protein-DNA binding assays and protein-protein interaction studies in the context of Ensembl genome annotation. Samples used to produce high-throughput expression data and to carry out genome-wide in vivo DNA binding assays are annotated via the MIAME-compliant Multiomics Information Management and Annotation System (MIMAS 3.0). Furthermore, the Saccharomyces Genomics Viewer (SGV) was developed and integrated into the gateway. SGV is a visualization tool that outputs genome annotation and DNA-strand specific expression data produced with high-density oligonucleotide tiling microarrays (Sc_tlg GeneChips) which cover the complete budding yeast genome on both DNA strands. It facilitates the interpretation of expression levels and transcript structures determined for various cell types cultured under different growth and differentiation conditions. Database URL: www.germonline.org/

  19. GermOnline 4.0 is a genomics gateway for germline development, meiosis and the mitotic cell cycle

    PubMed Central

    Lardenois, Aurélie; Gattiker, Alexandre; Collin, Olivier; Chalmel, Frédéric; Primig, Michael

    2010-01-01

    GermOnline 4.0 is a cross-species database portal focusing on high-throughput expression data relevant for germline development, the meiotic cell cycle and mitosis in healthy versus malignant cells. It is thus a source of information for life scientists as well as clinicians who are interested in gene expression and regulatory networks. The GermOnline gateway provides unlimited access to information produced with high-density oligonucleotide microarrays (3′-UTR GeneChips), genome-wide protein–DNA binding assays and protein–protein interaction studies in the context of Ensembl genome annotation. Samples used to produce high-throughput expression data and to carry out genome-wide in vivo DNA binding assays are annotated via the MIAME-compliant Multiomics Information Management and Annotation System (MIMAS 3.0). Furthermore, the Saccharomyces Genomics Viewer (SGV) was developed and integrated into the gateway. SGV is a visualization tool that outputs genome annotation and DNA-strand specific expression data produced with high-density oligonucleotide tiling microarrays (Sc_tlg GeneChips) which cover the complete budding yeast genome on both DNA strands. It facilitates the interpretation of expression levels and transcript structures determined for various cell types cultured under different growth and differentiation conditions. Database URL: www.germonline.org/ PMID:21149299

  20. Archetypal analysis of diverse Pseudomonas aeruginosa transcriptomes reveals adaptation in cystic fibrosis airways

    PubMed Central

    2013-01-01

    Background Analysis of global gene expression by DNA microarrays is widely used in experimental molecular biology. However, the complexity of such high-dimensional data sets makes it difficult to fully understand the underlying biological features present in the data. The aim of this study is to introduce a method for DNA microarray analysis that provides an intuitive interpretation of data through dimension reduction and pattern recognition. We present the first “Archetypal Analysis” of global gene expression. The analysis is based on microarray data from five integrated studies of Pseudomonas aeruginosa isolated from the airways of cystic fibrosis patients. Results Our analysis clustered samples into distinct groups with comprehensible characteristics since the archetypes representing the individual groups are closely related to samples present in the data set. Significant changes in gene expression between different groups identified adaptive changes of the bacteria residing in the cystic fibrosis lung. The analysis suggests a similar gene expression pattern between isolates with a high mutation rate (hypermutators) despite accumulation of different mutations for these isolates. This suggests positive selection in the cystic fibrosis lung environment, and changes in gene expression for these isolates are therefore most likely related to adaptation of the bacteria. Conclusions Archetypal analysis succeeded in identifying adaptive changes of P. aeruginosa. The combination of clustering and matrix factorization made it possible to reveal minor similarities among different groups of data, which other analytical methods failed to identify. We suggest that this analysis could be used to supplement current methods used to analyze DNA microarray data. PMID:24059747

  1. The metabolic trinity, glucose-glycogen-lactate, links astrocytes and neurons in brain energetics, signaling, memory, and gene expression.

    PubMed

    Dienel, Gerald A

    2017-01-10

    Glucose, glycogen, and lactate are traditionally identified with brain energetics, ATP turnover, and pathophysiology. However, recent studies extend their roles to include involvement in astrocytic signaling, memory consolidation, and gene expression. Emerging roles for these brain fuels and a readily-diffusible by-product are linked to differential fluxes in glycolytic and oxidative pathways, astrocytic glycogen dynamics, redox shifts, neuron-astrocyte interactions, and regulation of astrocytic activities by noradrenaline released from the locus coeruleus. Disproportionate utilization of carbohydrate compared with oxygen during brain activation is influenced by catecholamines, but its physiological basis is not understood and its magnitude may be affected by technical aspects of metabolite assays. Memory consolidation and gene expression are impaired by glycogenolysis blockade, and prevention of these deficits by injection of abnormally-high concentrations of lactate was interpreted as a requirement for astrocyte-to-neuron lactate shuttling in memory and gene expression. However, lactate transport was not measured and evidence for presumed shuttling is not compelling. In fact, high levels of lactate used to preserve memory consolidation and induce gene expression are sufficient to shut down neuronal firing via the HCAR1 receptor. In contrast, low lactate levels activate a receptor in locus coeruleus that stimulates noradrenaline release that may activate astrocytes throughout brain. Physiological relevance of exogenous concentrations of lactate used to mimic and evaluate metabolic, molecular, and behavioral effects of lactate requires close correspondence with the normal lactate levels, the biochemical and cellular sources and sinks, and specificity of lactate delivery to target cells. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  2. Age gene expression and coexpression progressive signatures in peripheral blood leukocytes.

    PubMed

    Irizar, Haritz; Goñi, Joaquín; Alzualde, Ainhoa; Castillo-Triviño, Tamara; Olascoaga, Javier; Lopez de Munain, Adolfo; Otaegui, David

    2015-12-01

    Both cellular senescence and organismic aging are known to be dynamic processes that start early in life and progress constantly during the whole life of the individual. In this work, with the objective of identifying signatures of age-related progressive change at the transcriptomic level, we have performed a whole-genome gene expression analysis of peripheral blood leukocytes in a group of healthy individuals with ages ranging from 14 to 93 years. A set of genes with progressively changing gene expression (either increase or decrease with age) has been identified and contextualized in a coexpression network. A modularity analysis has been performed on this network and biological-term and pathway enrichment analyses have been used for biological interpretation of each module. In summary, the results of the present work reveal the existence of a transcriptomic component that shows progressive expression changes associated with age in peripheral blood leukocytes, highlighting both the dynamic nature of the process and the need to complement young vs. elderly studies with longitudinal studies that include middle-aged individuals. From the transcriptional point of view, immunosenescence seems to be occurring from a relatively early age, at least from the late 20s/early 30s, and the 49-56 year old age-range appears to be critical. In general, the genes that, according to our results, show progressive expression changes with aging are involved in pathogenic/cellular processes that have classically been linked to aging in humans: cancer, immune processes and cellular growth vs. maintenance. Copyright © 2015 Elsevier Inc. All rights reserved.

  3. GeneTopics - interpretation of gene sets via literature-driven topic models

    PubMed Central

    2013-01-01

    Background Annotation of a set of genes is often accomplished through comparison to a library of labelled gene sets such as biological processes or canonical pathways. However, this approach might fail if the employed libraries are not up to date with the latest research, don't capture relevant biological themes or are curated at a different level of granularity than is required to appropriately analyze the input gene set. At the same time, the vast biomedical literature offers an unstructured repository of the latest research findings that can be tapped to provide thematic sub-groupings for any input gene set. Methods Our proposed method relies on a gene-specific text corpus and extracts commonalities between documents in an unsupervised manner using a topic model approach. We automatically determine the number of topics summarizing the corpus and calculate a gene relevancy score for each topic allowing us to eliminate non-specific topics. As a result we obtain a set of literature topics in which each topic is associated with a subset of the input genes providing directly interpretable keywords and corresponding documents for literature research. Results We validate our method based on labelled gene sets from the KEGG metabolic pathway collection and the genetic association database (GAD) and show that the approach is able to detect topics consistent with the labelled annotation. Furthermore, we discuss the results on three different types of experimentally derived gene sets, (1) differentially expressed genes from a cardiac hypertrophy experiment in mice, (2) altered transcript abundance in human pancreatic beta cells, and (3) genes implicated by GWA studies to be associated with metabolite levels in a healthy population. In all three cases, we are able to replicate findings from the original papers in a quick and semi-automated manner. 
Conclusions Our approach provides a novel way of automatically generating meaningful annotations for gene sets that are directly tied to relevant articles in the literature. Extending a general topic model method, the approach introduced here establishes a workflow for the interpretation of gene sets generated from diverse experimental scenarios that can complement the classical approach of comparison to reference gene sets. PMID:24564875

  4. Identification of mechanosensitive genes during skeletal development: alteration of genes associated with cytoskeletal rearrangement and cell signalling pathways.

    PubMed

    Rolfe, Rebecca A; Nowlan, Niamh C; Kenny, Elaine M; Cormican, Paul; Morris, Derek W; Prendergast, Patrick J; Kelly, Daniel; Murphy, Paula

    2014-01-20

    Mechanical stimulation is necessary for regulating correct formation of the skeleton. Here we test the hypothesis that mechanical stimulation of the embryonic skeletal system impacts expression levels of genes implicated in developmentally important signalling pathways in a genome wide approach. We use a mutant mouse model with altered mechanical stimulation due to the absence of limb skeletal muscle (Splotch-delayed) where muscle-less embryos show specific defects in skeletal elements including delayed ossification, changes in the size and shape of cartilage rudiments and joint fusion. We used Microarray and RNA sequencing analysis tools to identify differentially expressed genes between muscle-less and control embryonic (TS23) humerus tissue. We found that 680 independent genes were down-regulated and 452 genes up-regulated in humeri from muscle-less Spd embryos compared to littermate controls (at least 2-fold; corrected p-value ≤0.05). We analysed the resulting differentially expressed gene sets using Gene Ontology annotations to identify significant enrichment of genes associated with particular biological processes, showing that removal of mechanical stimuli from muscle contractions affected genes associated with development and differentiation, cytoskeletal architecture and cell signalling. Among cell signalling pathways, the most strongly disturbed was Wnt signalling, with 34 genes including 19 pathway target genes affected. Spatial gene expression analysis showed that both a Wnt ligand encoding gene (Wnt4) and a pathway antagonist (Sfrp2) are up-regulated specifically in the developing joint line, while the expression of a Wnt target gene, Cd44, is no longer detectable in muscle-less embryos. 
The identification of 84 genes associated with the cytoskeleton that are down-regulated in the absence of muscle indicates a number of candidate genes that are both mechanoresponsive and potentially involved in mechanotransduction, converting a mechanical stimulus into a transcriptional response. This work identifies key developmental regulatory genes impacted by altered mechanical stimulation, sheds light on the molecular mechanisms that interpret mechanical stimulation during skeletal development and provides valuable resources for further investigation of the mechanistic basis of mechanoregulation. In particular it highlights the Wnt signalling pathway as a potential point of integration of mechanical and molecular signalling and cytoskeletal components as mediators of the response.

  5. Identification of mechanosensitive genes during skeletal development: alteration of genes associated with cytoskeletal rearrangement and cell signalling pathways

    PubMed Central

    2014-01-01

    Background Mechanical stimulation is necessary for regulating correct formation of the skeleton. Here we test the hypothesis that mechanical stimulation of the embryonic skeletal system impacts expression levels of genes implicated in developmentally important signalling pathways in a genome wide approach. We use a mutant mouse model with altered mechanical stimulation due to the absence of limb skeletal muscle (Splotch-delayed) where muscle-less embryos show specific defects in skeletal elements including delayed ossification, changes in the size and shape of cartilage rudiments and joint fusion. We used Microarray and RNA sequencing analysis tools to identify differentially expressed genes between muscle-less and control embryonic (TS23) humerus tissue. Results We found that 680 independent genes were down-regulated and 452 genes up-regulated in humeri from muscle-less Spd embryos compared to littermate controls (at least 2-fold; corrected p-value ≤0.05). We analysed the resulting differentially expressed gene sets using Gene Ontology annotations to identify significant enrichment of genes associated with particular biological processes, showing that removal of mechanical stimuli from muscle contractions affected genes associated with development and differentiation, cytoskeletal architecture and cell signalling. Among cell signalling pathways, the most strongly disturbed was Wnt signalling, with 34 genes including 19 pathway target genes affected. Spatial gene expression analysis showed that both a Wnt ligand encoding gene (Wnt4) and a pathway antagonist (Sfrp2) are up-regulated specifically in the developing joint line, while the expression of a Wnt target gene, Cd44, is no longer detectable in muscle-less embryos. 
The identification of 84 genes associated with the cytoskeleton that are down-regulated in the absence of muscle indicates a number of candidate genes that are both mechanoresponsive and potentially involved in mechanotransduction, converting a mechanical stimulus into a transcriptional response. Conclusions This work identifies key developmental regulatory genes impacted by altered mechanical stimulation, sheds light on the molecular mechanisms that interpret mechanical stimulation during skeletal development and provides valuable resources for further investigation of the mechanistic basis of mechanoregulation. In particular it highlights the Wnt signalling pathway as a potential point of integration of mechanical and molecular signalling and cytoskeletal components as mediators of the response. PMID:24443808
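
    The selection rule quoted in the abstract above (at least 2-fold change; corrected p-value ≤0.05) can be sketched in a few lines. The abstract does not say which multiple-testing correction was applied, so the Benjamini-Hochberg adjustment used here is an assumption, as are the function names and the placeholder gene labels; this is an illustration of the filter, not the authors' pipeline.

```python
import math


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_de_genes(records, min_fold=2.0, alpha=0.05):
    """Split (gene, log2_fold_change, raw_p) records into up/down lists."""
    adjusted = benjamini_hochberg([r[2] for r in records])
    threshold = math.log2(min_fold)  # 2-fold corresponds to |log2 FC| >= 1
    up, down = [], []
    for (gene, log2_fc, _), q in zip(records, adjusted):
        if q <= alpha and abs(log2_fc) >= threshold:
            (up if log2_fc > 0 else down).append(gene)
    return up, down


# Placeholder records: (label, log2 fold change, unadjusted p-value).
records = [
    ("geneA", 1.5, 0.005),
    ("geneB", -2.0, 0.01),
    ("geneC", 0.3, 0.03),   # significant but below the fold-change cutoff
    ("geneD", -1.2, 0.04),
]
up, down = select_de_genes(records)
```

    Applying both criteria together is what keeps a gene such as the placeholder geneC out of the lists: its adjusted p-value passes, but its change is well under 2-fold.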

  6. Study of gene expression alteration in male androgenetic alopecia: evidence of predominant molecular signalling pathways.

    PubMed

    Michel, L; Reygagne, P; Benech, P; Jean-Louis, F; Scalvino, S; Ly Ka So, S; Hamidou, Z; Bianovici, S; Pouch, J; Ducos, B; Bonnet, M; Bensussan, A; Patatian, A; Lati, E; Wdzieczak-Bakala, J; Choulot, J-C; Loing, E; Hocquaux, M

    2017-11-01

    Male androgenetic alopecia (AGA) is the most common form of hair loss in men. It is characterized by a distinct pattern of progressive hair loss starting from the frontal area and the vertex of the scalp. Although several genetic risk loci have been identified, relevant genes for AGA remain to be defined. This study aimed to identify biomarkers associated with AGA. Molecular biomarkers associated with premature AGA were identified through gene expression analysis using cDNA generated from scalp vertex biopsies of hairless or bald men with premature AGA, and healthy volunteers. This monocentric study reveals that genes encoding mast cell granule enzymes, inflammatory mediators and immunoglobulin-associated immune mediators were significantly overexpressed in AGA. In contrast, underexpressed genes appear to be associated with the Wnt/β-catenin and bone morphogenic protein/transforming growth factor-β signalling pathways. Although involvement of these pathways in hair follicle regeneration is well described, functional interpretation of the transcriptomic data highlights different events that account for their inhibition. In particular, one of these events depends on the dysregulated expression of proopiomelanocortin, as confirmed by polymerase chain reaction and immunohistochemistry. In addition, lower expression of CYP27B1 in patients with AGA supports the notion that changes in vitamin D metabolism contribute to hair loss. This study provides compelling evidence for distinct molecular events contributing to alopecia that may pave the way for new therapeutic approaches. © 2017 British Association of Dermatologists.

  7. Transduction of skeletal muscles with common reporter genes can promote muscle fiber degeneration and inflammation.

    PubMed

    Winbanks, Catherine E; Beyer, Claudia; Qian, Hongwei; Gregorevic, Paul

    2012-01-01

    Recombinant adeno-associated viral vectors (rAAV vectors) are promising tools for delivering transgenes to skeletal muscle, in order to study the mechanisms that control the muscle phenotype, and to ameliorate diseases that perturb muscle homeostasis. Many studies have employed rAAV vectors carrying reporter genes encoding for β-galactosidase (β-gal), human placental alkaline phosphatase (hPLAP), and green fluorescent protein (GFP) as experimental controls when studying the effects of manipulating other genes. However, it is not clear to what extent these reporter genes can influence signaling and gene expression signatures in skeletal muscle, which may confound the interpretation of results obtained in experimentally manipulated muscles. Herein, we report a strong pro-inflammatory effect of expressing reporter genes in skeletal muscle. Specifically, we show that the administration of rAAV6:hPLAP vectors to the hind limb muscles of mice is associated with dose- and time-dependent macrophage recruitment, and skeletal muscle damage. Dose-dependent expression of hPLAP also led to marked activity of established pro-inflammatory IL-6/Stat3, TNFα, IKKβ and JNK signaling in lysates obtained from homogenized muscles. These effects were independent of promoter type, as expression cassettes featuring hPLAP under the control of constitutive CMV and muscle-specific CK6 promoters both drove cellular responses when matched for vector dose. Importantly, the administration of rAAV6:GFP vectors did not induce muscle damage or inflammation except at the highest doses we examined, and administration of a transgene-null vector (rAAV6:MCS) did not cause damage or inflammation at any of the doses tested, demonstrating that GFP-expressing, or transgene-null vectors may be more suitable as experimental controls. The studies highlight the importance of considering the potential effects of reporter genes when designing experiments that examine gene manipulation in vivo.

  8. Transcriptomic meta-analysis identifies gene expression characteristics in various samples of HIV-infected patients with nonprogressive disease.

    PubMed

    Zhang, Le-Le; Zhang, Zi-Ning; Wu, Xian; Jiang, Yong-Jun; Fu, Ya-Jing; Shang, Hong

    2017-09-12

    A small proportion of HIV-infected patients remain clinically and/or immunologically stable for years, including elite controllers (ECs) who have undetectable viremia (<50 copies/ml) and long-term nonprogressors (LTNPs) who maintain normal CD4+ T cell counts for prolonged periods (>10 years). However, the mechanism of nonprogression needs to be further resolved. In this study, a transcriptome meta-analysis was performed on nonprogressor and progressor microarray data to identify differential transcriptome pathways and potential biomarkers. Using the INMEX (integrative meta-analysis of expression data) program, we performed the meta-analysis to identify consistently differentially expressed genes (DEGs) in nonprogressors and further performed functional interpretation (gene ontology analysis and pathway analysis) of the DEGs identified in the meta-analysis. Five microarray datasets (81 cases and 98 controls in total), including whole blood, CD4+ and CD8+ T cells, were collected for meta-analysis. We determined that nonprogressors have reduced expression of important interferon-stimulated genes (ISGs), CD38 and lymphocyte activation gene 3 (LAG-3) in whole blood, CD4+ and CD8+ T cells. Gene ontology (GO) analysis showed a significant enrichment in DEGs that function in the type I interferon signaling pathway. Upregulated pathways, including the PI3K-Akt signaling pathway in whole blood, cytokine-cytokine receptor interaction in CD4+ T cells and the MAPK signaling pathway in CD8+ T cells, were identified in nonprogressors compared with progressors. In each metabolic functional category, the number of downregulated DEGs was greater than the number of upregulated DEGs, and almost all genes were downregulated DEGs in the oxidative phosphorylation (OXPHOS) and tricarboxylic acid (TCA) cycle in the three types of samples. 
Our transcriptomic meta-analysis provides a comprehensive evaluation of the gene expression profiles in major blood types of nonprogressors, providing new insights in the understanding of HIV pathogenesis and developing strategies to delay HIV disease progression.

  9. PAGER 2.0: an update to the pathway, annotated-list and gene-signature electronic repository for Human Network Biology

    PubMed Central

    Yue, Zongliang; Zheng, Qi; Neylon, Michael T; Yoo, Minjae; Shin, Jimin; Zhao, Zhiying; Tan, Aik Choon

    2018-01-01

    Abstract Integrative Gene-set, Network and Pathway Analysis (GNPA) is a powerful data analysis approach developed to help interpret high-throughput omics data. In PAGER 1.0, we demonstrated that researchers can gain unbiased and reproducible biological insights with the introduction of PAGs (Pathways, Annotated-lists and Gene-signatures) as the basic data representation elements. In PAGER 2.0, we improve the utility of integrative GNPA by significantly expanding the coverage of PAGs and PAG-to-PAG relationships in the database, defining a new metric to quantify PAG data qualities, and developing new software features to simplify online integrative GNPA. Specifically, we included 84 282 PAGs spanning 24 different data sources that cover human diseases, published gene-expression signatures, drug–gene, miRNA–gene interactions, pathways and tissue-specific gene expressions. We introduced a new normalized Cohesion Coefficient (nCoCo) score to assess the biological relevance of genes inside a PAG, and RP-score to rank genes and assign gene-specific weights inside a PAG. The companion web interface contains numerous features to help users query and navigate the database content. The database content can be freely downloaded and is compatible with third-party Gene Set Enrichment Analysis tools. We expect PAGER 2.0 to become a major resource in integrative GNPA. PAGER 2.0 is available at http://discovery.informatics.uab.edu/PAGER/. PMID:29126216

  10. The top skin-associated genes: a comparative analysis of human and mouse skin transcriptomes.

    PubMed

    Gerber, Peter Arne; Buhren, Bettina Alexandra; Schrumpf, Holger; Homey, Bernhard; Zlotnik, Albert; Hevezi, Peter

    2014-06-01

    The mouse represents a key model system for the study of the physiology and biochemistry of skin. Comparison of skin between mouse and human is critical for interpretation and application of data from mouse experiments to human disease. Here, we review the current knowledge on structure and immunology of mouse and human skin. Moreover, we present a systematic comparison of human and mouse skin transcriptomes. To this end, we have recently used a genome-wide database of human gene expression to identify genes highly expressed in skin, with no, or limited expression elsewhere - human skin-associated genes (hSAGs). Analysis of our set of hSAGs allowed us to generate a comprehensive molecular characterization of healthy human skin. Here, we used a similar database to generate a list of mouse skin-associated genes (mSAGs). A comparative analysis between the top human (n=666) and mouse (n=873) skin-associated genes (SAGs) revealed a total of only 30.2% identity between the two lists. The majority of shared genes encode proteins that participate in structural and barrier functions. Analysis of the top functional annotation terms revealed an overlap for morphogenesis, cell adhesion, structure, and signal transduction. The results of this analysis, discussed in the context of published data, illustrate the diversity between the molecular makeup of the skin of the two species and offer a probable explanation of why results generated in murine in vivo models often fail to translate to humans.
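
    The abstract above reports a percent identity between two gene lists without stating which denominator was used. As a hedged illustration only, the sketch below computes the overlap of two lists under three common conventions, assuming the identifiers have already been mapped to a shared ortholog namespace. The function name and the placeholder labels are hypothetical and do not come from the study.

```python
def overlap_summary(list_a, list_b):
    """Shared-item count plus three ways of expressing it as a fraction."""
    a, b = set(list_a), set(list_b)
    shared = a & b
    union = a | b
    return {
        "shared": len(shared),
        "fraction_of_a": len(shared) / len(a) if a else 0.0,
        "fraction_of_b": len(shared) / len(b) if b else 0.0,
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder identifiers standing in for ortholog-mapped gene symbols.
human_like = ["g1", "g2", "g3", "g4"]
mouse_like = ["g3", "g4", "g5", "g6", "g7"]
summary = overlap_summary(human_like, mouse_like)
```

    With lists of unequal size, such as the 666 and 873 genes compared in the abstract, the three fractions can differ noticeably, so a reported percentage is only interpretable once the denominator is known.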

  11. Grouping patients for masseter muscle genotype-phenotype studies.

    PubMed

    Moawad, Hadwah Abdelmatloub; Sinanan, Andrea C M; Lewis, Mark P; Hunt, Nigel P

    2012-03-01

    To use various facial classifications, including either/both vertical and horizontal facial criteria, to assess their effects on the interpretation of masseter muscle (MM) gene expression. Fresh MM biopsies were obtained from 29 patients (age, 16-36 years) with various facial phenotypes. Based on clinical and cephalometric analysis, patients were grouped using three different classifications: (1) basic vertical, (2) basic horizontal, and (3) combined vertical and horizontal. Gene expression levels of the myosin heavy chain genes MYH1, MYH2, MYH3, MYH6, MYH7, and MYH8 were recorded using quantitative reverse transcriptase polymerase chain reaction (RT-PCR) and were related to the various classifications. The significance level for statistical analysis was set at P ≤ .05. Using classification 1, none of the MYH genes were found to be significantly different between long face (LF) patients and the average vertical group. Using classification 2, MYH3, MYH6, and MYH7 genes were found to be significantly upregulated in retrognathic patients compared with prognathic and average horizontal groups. Using classification 3, only the MYH7 gene was found to be significantly upregulated in retrognathic LF compared with prognathic LF, prognathic average vertical faces, and average vertical and horizontal groups. The use of basic vertical or basic horizontal facial classifications may not be sufficient for genetics-based studies of facial phenotypes. Prognathic and retrognathic facial phenotypes have different MM gene expressions; therefore, it is not recommended to combine them into one single group, even though they may have a similar vertical facial phenotype.

  12. Whole genome mRNA transcriptomics analysis reveals different modes of action of the diarrheic shellfish poisons okadaic acid and dinophysis toxin-1 versus azaspiracid-1 in Caco-2 cells.

    PubMed

    Bodero, Marcia; Hoogenboom, Ron L A P; Bovee, Toine F H; Portier, Liza; de Haan, Laura; Peijnenburg, Ad; Hendriksen, Peter J M

    2018-02-01

    A study with DNA microarrays was performed to investigate the effects of two diarrhetic and one azaspiracid shellfish poison, okadaic acid (OA), dinophysistoxin-1 (DTX-1) and azaspiracid-1 (AZA-1) respectively, on the whole-genome mRNA expression of undifferentiated intestinal Caco-2 cells. Previously, the most responding genes were used to develop a dedicated array tube test to screen shellfish samples for the presence of these toxins. In the present study the whole genome mRNA expression was analyzed in order to reveal modes of action and obtain hints on potential biomarkers suitable to be used in alternative bioassays. Effects on key genes in the most affected pathways and processes were confirmed by qPCR. OA and DTX-1 induced almost identical effects on mRNA expression, which strongly indicates that OA and DTX-1 induce similar toxic effects. Biological interpretation of the microarray data indicates that both compounds induce hypoxia related pathways/processes, the unfolded protein response (UPR) and endoplasmic reticulum (ER) stress. The gene expression profile of AZA-1 is different and shows increased mRNA expression of genes involved in cholesterol synthesis and glycolysis, suggesting a different mode of action for this toxin. Future studies should reveal whether identified pathways provide suitable biomarkers for rapid detection of DSPs in shellfish. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.

  13. Improvement of experimental testing and network training conditions with genome-wide microarrays for more accurate predictions of drug gene targets

    PubMed Central

    2014-01-01

    Background Genome-wide microarrays have been useful for predicting chemical-genetic interactions at the gene level. However, interpreting genome-wide microarray results can be overwhelming due to the vast output of gene expression data combined with off-target transcriptional responses many times induced by a drug treatment. This study demonstrates how experimental and computational methods can interact with each other, to arrive at more accurate predictions of drug-induced perturbations. We present a two-stage strategy that links microarray experimental testing and network training conditions to predict gene perturbations for a drug with a known mechanism of action in a well-studied organism. Results S. cerevisiae cells were treated with the antifungal, fluconazole, and expression profiling was conducted under different biological conditions using Affymetrix genome-wide microarrays. Transcripts were filtered with a formal network-based method, sparse simultaneous equation models and Lasso regression (SSEM-Lasso), under different network training conditions. Gene expression results were evaluated using both gene set and single gene target analyses, and the drug’s transcriptional effects were narrowed first by pathway and then by individual genes. Variables included: (i) Testing conditions – exposure time and concentration and (ii) Network training conditions – training compendium modifications. Two analyses of SSEM-Lasso output – gene set and single gene – were conducted to gain a better understanding of how SSEM-Lasso predicts perturbation targets. Conclusions This study demonstrates that genome-wide microarrays can be optimized using a two-stage strategy for a more in-depth understanding of how a cell manifests biological reactions to a drug treatment at the transcription level. Additionally, a more detailed understanding of how the statistical model, SSEM-Lasso, propagates perturbations through a network of gene regulatory interactions is achieved. 
PMID:24444313

  14. Exploiting the full power of temporal gene expression profiling through a new statistical test: application to the analysis of muscular dystrophy data.

    PubMed

    Vinciotti, Veronica; Liu, Xiaohui; Turk, Rolf; de Meijer, Emile J; 't Hoen, Peter A C

    2006-04-03

    The identification of biologically interesting genes in a temporal expression profiling dataset is challenging and complicated by high levels of experimental noise. Most statistical methods used in the literature do not fully exploit the temporal ordering in the dataset and are not suited to the case where temporal profiles are measured for a number of different biological conditions. We present a statistical test that makes explicit use of the temporal order in the data by fitting polynomial functions to the temporal profile of each gene and for each biological condition. A Hotelling T2-statistic is derived to detect the genes for which the parameters of these polynomials are significantly different from each other. We validate the temporal Hotelling T2-test on muscular gene expression data from four mouse strains which were profiled at different ages: dystrophin-, beta-sarcoglycan and gamma-sarcoglycan deficient mice, and wild-type mice. The first three are animal models for different muscular dystrophies. Extensive biological validation shows that the method is capable of finding genes with temporal profiles significantly different across the four strains, as well as identifying potential biomarkers for each form of the disease. The added value of the temporal test compared to an identical test which does not make use of temporal ordering is demonstrated via a simulation study, and through confirmation of the expression profiles from selected genes by quantitative PCR experiments. The proposed method maximises the detection of the biologically interesting genes, whilst minimising false detections. The temporal Hotelling T2-test is capable of finding relatively small and robust sets of genes that display different temporal profiles between the conditions of interest. 
The test is simple, it can be used on gene expression data generated from any experimental design and for any number of conditions, and it allows fast interpretation of the temporal behaviour of genes. The R code is available from V.V. The microarray data have been submitted to GEO under series GSE1574 and GSE3523.
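
    The idea described in this abstract (fit a polynomial to each temporal profile, then compare the fitted coefficients between conditions with a Hotelling T2-statistic) can be illustrated with a minimal sketch. It is not the authors' R code, and the abstract does not give their exact derivation: the sketch assumes replicate time courses per condition, fits one polynomial per replicate, and applies the standard two-sample Hotelling T2 formula with a pooled covariance. The function names `poly_coeffs` and `hotelling_t2`, the simulated data, and the choice of degree 1 are all hypothetical.

```python
import numpy as np

def poly_coeffs(times, profiles, degree):
    """Fit one polynomial per replicate; returns an (n_replicates, degree + 1) array."""
    y = np.asarray(profiles, dtype=float).T  # polyfit expects shape (n_times, n_replicates)
    return np.polyfit(times, y, degree).T

def hotelling_t2(a, b):
    """Standard two-sample Hotelling T2 for coefficient matrices a (n1 x p) and b (n2 x p)."""
    n1, n2 = len(a), len(b)
    diff = a.mean(axis=0) - b.mean(axis=0)
    # Pooled covariance of the coefficient vectors (assumes n1 + n2 - 2 >= p).
    pooled = ((n1 - 1) * np.cov(a, rowvar=False)
              + (n2 - 1) * np.cov(b, rowvar=False)) / (n1 + n2 - 2)
    return float(n1 * n2 / (n1 + n2) * diff @ np.linalg.solve(pooled, diff))

# Simulated single-gene example (assumed data, not from the study).
rng = np.random.default_rng(0)
times = np.array([0.0, 1.0, 2.0, 3.0])

def simulate(slope, n_rep=5):
    """Linear trend plus small noise, one row per replicate."""
    return slope * times + rng.normal(scale=0.1, size=(n_rep, times.size))

wild_type = poly_coeffs(times, simulate(1.0), degree=1)
same_trend = poly_coeffs(times, simulate(1.0), degree=1)
steeper = poly_coeffs(times, simulate(3.0), degree=1)

t2_same = hotelling_t2(wild_type, same_trend)  # conditions share a trend
t2_diff = hotelling_t2(wild_type, steeper)     # conditions differ in slope
print(t2_same, t2_diff)
```

    In this toy setting the statistic is much larger for the pair of conditions whose temporal trends differ. Turning the statistic into a p-value, and handling more than two conditions as the paper does, would require the authors' own derivation.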

  15. Exploiting the full power of temporal gene expression profiling through a new statistical test: Application to the analysis of muscular dystrophy data

    PubMed Central

    Vinciotti, Veronica; Liu, Xiaohui; Turk, Rolf; de Meijer, Emile J; 't Hoen, Peter AC

    2006-01-01

    Background The identification of biologically interesting genes in a temporal expression profiling dataset is challenging and complicated by high levels of experimental noise. Most statistical methods used in the literature do not fully exploit the temporal ordering in the dataset and are not suited to the case where temporal profiles are measured for a number of different biological conditions. We present a statistical test that makes explicit use of the temporal order in the data by fitting polynomial functions to the temporal profile of each gene and for each biological condition. A Hotelling T2-statistic is derived to detect the genes for which the parameters of these polynomials are significantly different from each other. Results We validate the temporal Hotelling T2-test on muscular gene expression data from four mouse strains which were profiled at different ages: dystrophin-, beta-sarcoglycan and gamma-sarcoglycan deficient mice, and wild-type mice. The first three are animal models for different muscular dystrophies. Extensive biological validation shows that the method is capable of finding genes with temporal profiles significantly different across the four strains, as well as identifying potential biomarkers for each form of the disease. The added value of the temporal test compared to an identical test which does not make use of temporal ordering is demonstrated via a simulation study, and through confirmation of the expression profiles from selected genes by quantitative PCR experiments. The proposed method maximises the detection of the biologically interesting genes, whilst minimising false detections. Conclusion The temporal Hotelling T2-test is capable of finding relatively small and robust sets of genes that display different temporal profiles between the conditions of interest. 
The test is simple, it can be used on gene expression data generated from any experimental design and for any number of conditions, and it allows fast interpretation of the temporal behaviour of genes. The R code is available from V.V. The microarray data have been submitted to GEO under series GSE1574 and GSE3523. PMID:16584545

  16. Development of a versatile enrichment analysis tool reveals associations between the maternal brain and mental health disorders, including autism

    PubMed Central

    2013-01-01

    Background A recent study of lateral septum (LS) suggested a large number of autism-related genes with altered expression in the postpartum state. However, formally testing the findings for enrichment of autism-associated genes proved to be problematic with existing software. Many gene-disease association databases have been curated which are not currently incorporated in popular, full-featured enrichment tools, and the use of custom gene lists in these programs can be difficult to perform and interpret. As a simple alternative, we have developed the Modular Single-set Enrichment Test (MSET), a minimal tool that enables one to easily evaluate expression data for enrichment of any conceivable gene list of interest. Results The MSET approach was validated by testing several publicly available expression data sets for expected enrichment in areas of autism, attention deficit hyperactivity disorder (ADHD), and arthritis. Using nine independent, unique autism gene lists extracted from association databases and two recent publications, a striking consensus of enrichment was detected within gene expression changes in LS of postpartum mice. A network of 160 autism-related genes was identified, representing developmental processes such as synaptic plasticity, neuronal morphogenesis, and differentiation. Additionally, maternal LS displayed enrichment for genes associated with bipolar disorder, schizophrenia, ADHD, and depression. Conclusions The transition to motherhood includes the most fundamental social bonding event in mammals and features naturally occurring changes in sociability. Some individuals with autism, schizophrenia, or other mental health disorders exhibit impaired social traits. Genes involved in these deficits may also contribute to elevated sociability in the maternal brain. To date, this is the first study to show a significant, quantitative link between the maternal brain and mental health disorders using large scale gene expression data. 
Thus, the postpartum brain may provide a novel and promising platform for understanding the complex genetics of improved sociability that may have direct relevance for multiple psychiatric illnesses. This study also provides an important new tool that fills a critical analysis gap and makes evaluation of enrichment using any database of interest possible with an emphasis on ease of use and methodological transparency. PMID:24245670

  17. Development of a versatile enrichment analysis tool reveals associations between the maternal brain and mental health disorders, including autism.

    PubMed

    Eisinger, Brian E; Saul, Michael C; Driessen, Terri M; Gammie, Stephen C

    2013-11-19

    A recent study of lateral septum (LS) suggested a large number of autism-related genes with altered expression in the postpartum state. However, formally testing the findings for enrichment of autism-associated genes proved to be problematic with existing software. Many gene-disease association databases have been curated which are not currently incorporated in popular, full-featured enrichment tools, and the use of custom gene lists in these programs can be difficult to perform and interpret. As a simple alternative, we have developed the Modular Single-set Enrichment Test (MSET), a minimal tool that enables one to easily evaluate expression data for enrichment of any conceivable gene list of interest. The MSET approach was validated by testing several publicly available expression data sets for expected enrichment in areas of autism, attention deficit hyperactivity disorder (ADHD), and arthritis. Using nine independent, unique autism gene lists extracted from association databases and two recent publications, a striking consensus of enrichment was detected within gene expression changes in LS of postpartum mice. A network of 160 autism-related genes was identified, representing developmental processes such as synaptic plasticity, neuronal morphogenesis, and differentiation. Additionally, maternal LS displayed enrichment for genes associated with bipolar disorder, schizophrenia, ADHD, and depression. The transition to motherhood includes the most fundamental social bonding event in mammals and features naturally occurring changes in sociability. Some individuals with autism, schizophrenia, or other mental health disorders exhibit impaired social traits. Genes involved in these deficits may also contribute to elevated sociability in the maternal brain. To date, this is the first study to show a significant, quantitative link between the maternal brain and mental health disorders using large scale gene expression data. 
Thus, the postpartum brain may provide a novel and promising platform for understanding the complex genetics of improved sociability that may have direct relevance for multiple psychiatric illnesses. This study also provides an important new tool that fills a critical analysis gap and makes evaluation of enrichment using any database of interest possible with an emphasis on ease of use and methodological transparency.
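
    The abstract does not spell out how MSET computes enrichment, so the following is only a generic randomization-style enrichment test of the kind such a tool could use, not the MSET implementation. It assumes three inputs: a set of top expression results, the background of all measured genes, and a custom gene list of interest. The function name `enrichment_p_value`, the gene identifiers, and all parameters are hypothetical.

```python
import random

def enrichment_p_value(top_genes, background, gene_list, n_sim=2000, seed=0):
    """Estimate how often random draws match the gene list at least as well as top_genes."""
    rng = random.Random(seed)
    gene_list = set(gene_list)
    observed = len(gene_list & set(top_genes))
    pool = sorted(background)          # fixed order keeps the simulation reproducible
    k = len(set(top_genes))
    # Count simulated draws whose overlap reaches the observed overlap.
    hits = sum(
        len(gene_list.intersection(rng.sample(pool, k))) >= observed
        for _ in range(n_sim)
    )
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_sim + 1)

# Toy example with made-up gene identifiers.
background = [f"g{i}" for i in range(1000)]
disease_list = [f"g{i}" for i in range(20)]
enriched_top = [f"g{i}" for i in range(10)] + [f"g{i}" for i in range(500, 540)]
null_top = ["g0"] + [f"g{i}" for i in range(500, 549)]

obs_enriched, p_enriched = enrichment_p_value(enriched_top, background, disease_list)
obs_null, p_null = enrichment_p_value(null_top, background, disease_list)
print(obs_enriched, p_enriched, obs_null, p_null)
```

    Because the test only needs plain lists of gene identifiers, any custom gene list can be evaluated against the same expression results, which is the kind of flexibility the abstract emphasizes.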

  18. Transcriptomic response of the mycoparasitic fungus Trichoderma atroviride to the presence of a fungal prey.

    PubMed

    Seidl, Verena; Song, Lifu; Lindquist, Erika; Gruber, Sabine; Koptchinskiy, Alexeji; Zeilinger, Susanne; Schmoll, Monika; Martínez, Pedro; Sun, Jibin; Grigoriev, Igor; Herrera-Estrella, Alfredo; Baker, Scott E; Kubicek, Christian P

    2009-11-30

    Combating the action of plant pathogenic microorganisms by mycoparasitic fungi has been promoted for two decades as an attractive biological alternative to the use of chemical fungicides. The fungal genus Trichoderma includes a high number of taxa which are able to recognize, combat and finally besiege and kill their prey. Only fragments of the biochemical processes related to this ability have been uncovered so far, however. We analyzed genome-wide gene expression changes at the onset of physical contact between Trichoderma atroviride and two plant pathogens Botrytis cinerea and Rhizoctonia solani, and compared them with gene expression patterns of mycelial and conidiating cultures, respectively. About 3000 ESTs, representing about 900 genes, were obtained from each of these three growth conditions. 66 genes, represented by 442 ESTs, were specifically and significantly overexpressed during onset of mycoparasitism, and the expression of a subset thereof was verified by expression analysis. The upregulated genes comprised 18 KOG groups, but were most abundant from the groups representing posttranslational processing, and amino acid metabolism, and included components of the stress response, reaction to nitrogen shortage, signal transduction and lipid catabolism. Metabolic network analysis confirmed the upregulation of the genes for amino acid biosynthesis and of those involved in the catabolism of lipids and aminosugars. The analysis of the genes overexpressed during the onset of mycoparasitism in T. atroviride has revealed that the fungus reacts to this condition with several previously undetected physiological reactions. These data enable a new and more comprehensive interpretation of the physiology of mycoparasitism, and will aid in the selection of traits for improvement of biocontrol strains by recombinant techniques.

  19. Transcriptomic response of the mycoparasitic fungus Trichoderma atroviride to the presence of a fungal prey

    PubMed Central

    2009-01-01

    Background Combating the action of plant pathogenic microorganisms by mycoparasitic fungi has been promoted for two decades as an attractive biological alternative to the use of chemical fungicides. The fungal genus Trichoderma includes a high number of taxa which are able to recognize, combat and finally besiege and kill their prey. Only fragments of the biochemical processes related to this ability have been uncovered so far, however. Results We analyzed genome-wide gene expression changes at the onset of physical contact between Trichoderma atroviride and two plant pathogens Botrytis cinerea and Rhizoctonia solani, and compared them with gene expression patterns of mycelial and conidiating cultures, respectively. About 3000 ESTs, representing about 900 genes, were obtained from each of these three growth conditions. 66 genes, represented by 442 ESTs, were specifically and significantly overexpressed during onset of mycoparasitism, and the expression of a subset thereof was verified by expression analysis. The upregulated genes comprised 18 KOG groups, but were most abundant from the groups representing posttranslational processing, and amino acid metabolism, and included components of the stress response, reaction to nitrogen shortage, signal transduction and lipid catabolism. Metabolic network analysis confirmed the upregulation of the genes for amino acid biosynthesis and of those involved in the catabolism of lipids and aminosugars. Conclusion The analysis of the genes overexpressed during the onset of mycoparasitism in T. atroviride has revealed that the fungus reacts to this condition with several previously undetected physiological reactions. These data enable a new and more comprehensive interpretation of the physiology of mycoparasitism, and will aid in the selection of traits for improvement of biocontrol strains by recombinant techniques. PMID:19948043

  20. Transcriptomic response of the mycoparasitic fungus Trichoderma atroviride to the presence of a fungal prey

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seidl, Verena; Song, Lifu; Lindquist, Erika

    BACKGROUND: Combating the action of plant pathogenic microorganisms by mycoparasitic fungi has been promoted for two decades as an attractive biological alternative to the use of chemical fungicides. The fungal genus Trichoderma includes a high number of taxa which are able to recognize, combat and finally besiege and kill their prey. Only fragments of the biochemical processes related to this ability have been uncovered so far, however. RESULTS: We analyzed genome-wide gene expression changes at the onset of physical contact between Trichoderma atroviride and two plant pathogens Botrytis cinerea and Rhizoctonia solani, and compared them with gene expression patterns of mycelial and conidiating cultures, respectively. About 3000 ESTs, representing about 900 genes, were obtained from each of these three growth conditions. 66 genes, represented by 442 ESTs, were specifically and significantly overexpressed during onset of mycoparasitism, and the expression of a subset thereof was verified by expression analysis. The upregulated genes comprised 18 KOG groups, but were most abundant from the groups representing posttranslational processing, and amino acid metabolism, and included components of the stress response, reaction to nitrogen shortage, signal transduction and lipid catabolism. Metabolic network analysis confirmed the upregulation of the genes for amino acid biosynthesis and of those involved in the catabolism of lipids and aminosugars. CONCLUSION: The analysis of the genes overexpressed during the onset of mycoparasitism in T. atroviride has revealed that the fungus reacts to this condition with several previously undetected physiological reactions. These data enable a new and more comprehensive interpretation of the physiology of mycoparasitism, and will aid in the selection of traits for improvement of biocontrol strains by recombinant techniques.

  1. CoPub: a literature-based keyword enrichment tool for microarray data analysis.

    PubMed

    Frijters, Raoul; Heupers, Bart; van Beek, Pieter; Bouwhuis, Maurice; van Schaik, René; de Vlieg, Jacob; Polman, Jan; Alkema, Wynand

    2008-07-01

    Medline is a rich information source, from which links between genes and keywords describing biological processes, pathways, drugs, pathologies and diseases can be extracted. We developed a publicly available tool called CoPub that uses the information in the Medline database for the biological interpretation of microarray data. CoPub allows batch input of multiple human, mouse or rat genes and produces lists of keywords from several biomedical thesauri that are significantly correlated with the set of input genes. These lists link to Medline abstracts in which the co-occurring input genes and correlated keywords are highlighted. Furthermore, CoPub can graphically visualize differentially expressed genes and over-represented keywords in a network, providing detailed insight in the relationships between genes and keywords, and revealing the most influential genes as highly connected hubs. CoPub is freely accessible at http://services.nbic.nl/cgi-bin/copub/CoPub.pl.

  2. The PluriNetWork: An Electronic Representation of the Network Underlying Pluripotency in Mouse, and Its Applications

    PubMed Central

    Greber, Boris; Siatkowski, Marcin; Paudel, Yogesh; Warsow, Gregor; Cap, Clemens; Schöler, Hans; Fuellen, Georg

    2010-01-01

    Background Analysis of the mechanisms underlying pluripotency and reprogramming would benefit substantially from easy access to an electronic network of genes, proteins and mechanisms. Moreover, interpreting gene expression data needs to move beyond just the identification of the up-/downregulation of key genes and of overrepresented processes and pathways, towards clarifying the essential effects of the experiment in molecular terms. Methodology/Principal Findings We have assembled a network of 574 molecular interactions, stimulations and inhibitions, based on a collection of research data from 177 publications until June 2010, involving 274 mouse genes/proteins, all in a standard electronic format, enabling analyses by readily available software such as Cytoscape and its plugins. The network includes the core circuit of Oct4 (Pou5f1), Sox2 and Nanog, its periphery (such as Stat3, Klf4, Esrrb, and c-Myc), connections to upstream signaling pathways (such as Activin, WNT, FGF, BMP, Insulin, Notch and LIF), and epigenetic regulators as well as some other relevant genes/proteins, such as proteins involved in nuclear import/export. We describe the general properties of the network, as well as a Gene Ontology analysis of the genes included. We use several expression data sets to condense the network to a set of network links that are affected in the course of an experiment, yielding hypotheses about the underlying mechanisms. Conclusions/Significance We have initiated an electronic data repository that will be useful to understand pluripotency and to facilitate the interpretation of high-throughput data. To keep up with the growth of knowledge on the fundamental processes of pluripotency and reprogramming, we suggest combining Wiki and social networking software into a community curation system that is easy to use and flexible, and tailored to provide a benefit for the scientist, and to improve communication and exchange of research results. 
A PluriNetWork tutorial is available at http://www.ibima.med.uni-rostock.de/IBIMA/PluriNetWork/. PMID:21179244

  3. Fast gene ontology based clustering for microarray experiments.

    PubMed

    Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa

    2008-11-21

    Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  4. The level of BMP4 signaling is critical for the regulation of distinct T-box gene expression domains and growth along the dorso-ventral axis of the optic cup

    PubMed Central

    Behesti, Hourinaz; Holt, James KL; Sowden, Jane C

    2006-01-01

    Background Polarised gene expression is thought to lead to the graded distribution of signaling molecules providing a patterning mechanism across the embryonic eye. Bone morphogenetic protein 4 (Bmp4) is expressed in the dorsal optic vesicle as it transforms into the optic cup. Bmp4 deletions in human and mouse result in failure of eye development, but little attempt has been made to investigate mammalian targets of BMP4 signaling. In chick, retroviral gene overexpression studies indicate that Bmp4 activates the dorsally expressed Tbx5 gene, which represses ventrally expressed cVax. It is not known whether the Tbx5 related genes, Tbx2 and Tbx3, are BMP4 targets in the mammalian retina and whether BMP4 acts at a distance from its site of expression. Although it is established that Drosophila Dpp (homologue of vertebrate Bmp4) acts as a morphogen, there is little evidence that BMP4 gradients are interpreted to create domains of BMP4 target gene expression in the mouse. Results Our data show that the level of BMP4 signaling is critical for the regulation of distinct Tbx2, Tbx3, Tbx5 and Vax2 gene expression domains along the dorso-ventral axis of the mouse optic cup. BMP4 signaling gradients were manipulated in whole mouse embryo cultures during optic cup development, by implantation of beads soaked in BMP4, or the BMP antagonist Noggin, to provide a local signaling source. Tbx2, Tbx3 and Tbx5, showed a differential response to alterations in the level of BMP4 along the entire dorso-ventral axis of the optic cup, suggesting that BMP4 acts across a distance. Increased levels of BMP4 caused expansion of Tbx2 and Tbx3, but not Tbx5, into the ventral retina and repression of the ventral marker Vax2. Conversely, Noggin abolished Tbx5 expression but only shifted Tbx2 expression dorsally. Increased levels of BMP4 signaling caused decreased proliferation, reduced retinal volume and altered the shape of the optic cup. 
Conclusion Our findings suggest the existence of a dorsal-high, ventral-low BMP4 signaling gradient across which distinct domains of Tbx2, Tbx3, Tbx5 and Vax2 transcription factor gene expression are set up. Furthermore we show that the correct level of BMP4 signaling is critical for normal growth of the mammalian embryonic eye. PMID:17173667

  5. Dynamic Maternal Gradients Control Timing and Shift-Rates for Drosophila Gap Gene Expression

    PubMed Central

    Verd, Berta; Crombach, Anton

    2017-01-01

    Pattern formation during development is a highly dynamic process. In spite of this, few experimental and modelling approaches take into account the explicit time-dependence of the rules governing regulatory systems. We address this problem by studying dynamic morphogen interpretation by the gap gene network in Drosophila melanogaster. Gap genes are involved in segment determination during early embryogenesis. They are activated by maternal morphogen gradients encoded by bicoid (bcd) and caudal (cad). These gradients decay at the same time-scale as the establishment of the antero-posterior gap gene pattern. We use a reverse-engineering approach, based on data-driven regulatory models called gene circuits, to isolate and characterise the explicitly time-dependent effects of changing morphogen concentrations on gap gene regulation. To achieve this, we simulate the system in the presence and absence of dynamic gradient decay. Comparison between these simulations reveals that maternal morphogen decay controls the timing and limits the rate of gap gene expression. In the anterior of the embryo, it affects peak expression and leads to the establishment of smooth spatial boundaries between gap domains. In the posterior of the embryo, it causes a progressive slow-down in the rate of gap domain shifts, which is necessary to correctly position domain boundaries and to stabilise the spatial gap gene expression pattern. We use a newly developed method for the analysis of transient dynamics in non-autonomous (time-variable) systems to understand the regulatory causes of these effects. By providing a rigorous mechanistic explanation for the role of maternal gradient decay in gap gene regulation, our study demonstrates that such analyses are feasible and reveal important aspects of dynamic gene regulation which would have been missed by a traditional steady-state approach. 
More generally, it highlights the importance of transient dynamics for understanding complex regulatory processes in development. PMID:28158178

  6. Dynamic Maternal Gradients Control Timing and Shift-Rates for Drosophila Gap Gene Expression.

    PubMed

    Verd, Berta; Crombach, Anton; Jaeger, Johannes

    2017-02-01

    Pattern formation during development is a highly dynamic process. In spite of this, few experimental and modelling approaches take into account the explicit time-dependence of the rules governing regulatory systems. We address this problem by studying dynamic morphogen interpretation by the gap gene network in Drosophila melanogaster. Gap genes are involved in segment determination during early embryogenesis. They are activated by maternal morphogen gradients encoded by bicoid (bcd) and caudal (cad). These gradients decay at the same time-scale as the establishment of the antero-posterior gap gene pattern. We use a reverse-engineering approach, based on data-driven regulatory models called gene circuits, to isolate and characterise the explicitly time-dependent effects of changing morphogen concentrations on gap gene regulation. To achieve this, we simulate the system in the presence and absence of dynamic gradient decay. Comparison between these simulations reveals that maternal morphogen decay controls the timing and limits the rate of gap gene expression. In the anterior of the embryo, it affects peak expression and leads to the establishment of smooth spatial boundaries between gap domains. In the posterior of the embryo, it causes a progressive slow-down in the rate of gap domain shifts, which is necessary to correctly position domain boundaries and to stabilise the spatial gap gene expression pattern. We use a newly developed method for the analysis of transient dynamics in non-autonomous (time-variable) systems to understand the regulatory causes of these effects. By providing a rigorous mechanistic explanation for the role of maternal gradient decay in gap gene regulation, our study demonstrates that such analyses are feasible and reveal important aspects of dynamic gene regulation which would have been missed by a traditional steady-state approach. 
More generally, it highlights the importance of transient dynamics for understanding complex regulatory processes in development.

  7. Polymorphism at Expressed DQ and DR Loci in Five Common Equine MHC Haplotypes

    PubMed Central

    Miller, Donald; Tallmadge, Rebecca L.; Binns, Matthew; Zhu, Baoli; Mohamoud, Yasmin Ali; Ahmed, Ayeda; Brooks, Samantha A.; Antczak, Douglas F.

    2016-01-01

    The polymorphism of Major Histocompatibility Complex (MHC) class II DQ and DR genes in five common Equine Leukocyte Antigen (ELA) haplotypes was determined through sequencing of mRNA transcripts isolated from lymphocytes of eight ELA homozygous horses. Ten expressed MHC class II genes were detected in horses of the ELA-A3 haplotype carried by the donor horses of the equine Bacterial Artificial Chromosome (BAC) library and the reference genome sequence: four DR genes and six DQ genes. The other four ELA haplotypes contained at least eight expressed polymorphic MHC class II loci. Next Generation Sequencing (NGS) of genomic DNA of these four MHC haplotypes revealed stop codons in the DQA3 gene in the ELA-A2, ELA-A5, and ELA-A9 haplotypes. Few NGS reads were obtained for the other MHC class II genes that were not amplified in these horses. The amino acid sequences across haplotypes contained locus-specific residues, and the locus clusters produced by phylogenetic analysis were well supported. The MHC class II alleles within the five tested haplotypes were largely non-overlapping between haplotypes. The complement of equine MHC class II DQ and DR genes appears to be well conserved between haplotypes, in contrast to the recently described variation in class I gene loci between equine MHC haplotypes. The identification of allelic series of equine MHC class II loci will aid comparative studies of mammalian MHC conservation and evolution and may also help to interpret associations between the equine MHC class II region and diseases of the horse. PMID:27889800

  8. High natural gene expression variation in the reef-building coral Acropora millepora: potential for acclimative and adaptive plasticity.

    PubMed

    Granados-Cifuentes, Camila; Bellantuono, Anthony J; Ridgway, Tyrone; Hoegh-Guldberg, Ove; Rodriguez-Lanetty, Mauricio

    2013-04-08

    Ecosystems worldwide are suffering the consequences of anthropogenic impact. The diverse ecosystem of coral reefs, for example, is globally threatened by increases in sea surface temperatures due to global warming. Studies to date have focused on determining genetic diversity, the sequence variability of genes in a species, as a proxy to estimate and predict the potential adaptive response of coral populations to environmental changes linked to climate change. However, the examination of natural gene expression variation has received less attention. This variation has been implicated as an important factor in evolutionary processes, upon which natural selection can act. We acclimatized coral nubbins from six colonies of the reef-building coral Acropora millepora to a common garden in Heron Island (Great Barrier Reef, GBR) for a period of four weeks to remove any site-specific environmental effects on the physiology of the coral nubbins. By using a cDNA microarray platform, we detected a high level of gene expression variation, with 17% (488) of the unigenes differentially expressed across coral nubbins of the six colonies (jsFDR-corrected, p < 0.01). Among the main categories of biological processes found differentially expressed were transport, translation, response to stimulus, oxidation-reduction processes, and apoptosis. We found that the transcriptional profiles did not correspond to the genotype of the colony characterized using either an intron of the carbonic anhydrase gene or microsatellite loci markers. Our results provide evidence of high inter-colony variation at the transcriptomic level in A. millepora grown under a common garden, without a correspondence with genotypic identity. This finding brings to our attention the importance of taking into account natural variation between reef corals when assessing experimental gene expression differences. The high transcriptional variation detected in this study is interpreted and discussed within the context of adaptive potential and phenotypic plasticity of reef corals. Whether this variation will allow coral reefs to survive current challenges remains unknown.

  9. Understanding how long‐acting β2‐adrenoceptor agonists enhance the clinical efficacy of inhaled corticosteroids in asthma – an update

    PubMed Central

    Giembycz, Mark A

    2016-01-01

    In moderate‐to‐severe asthma, adding an inhaled long‐acting β2‐adrenoceptor agonist (LABA) to an inhaled corticosteroid (ICS) provides better disease control than simply increasing the dose of ICS. Acting on the glucocorticoid receptor (GR, gene NR3C1), ICSs promote anti‐inflammatory/anti‐asthma gene expression. In vitro, LABAs synergistically enhance the maximal expression of many glucocorticoid‐induced genes. Other genes, including dual‐specificity phosphatase 1 (DUSP1) in human airways smooth muscle (ASM) and epithelial cells, are up‐regulated additively by both drug classes. Synergy may also occur for LABA‐induced genes, as illustrated by the bronchoprotective gene, regulator of G‐protein signalling 2 (RGS2) in ASM. Such effects cannot be produced by either drug alone and may explain the therapeutic efficacy of ICS/LABA combination therapies. While the molecular basis of synergy remains unclear, mechanistic interpretations must accommodate gene‐specific regulation. We explore the concept that each glucocorticoid‐induced gene is an independent signal transducer optimally activated by a specific, ligand‐directed, GR conformation. In addition to explaining partial agonism, this realization provides opportunities to identify novel GR ligands that exhibit gene expression bias. Translating this into improved therapeutic ratios requires consideration of GR density in target tissues and further understanding of gene function. Similarly, the ability of a LABA to interact with a glucocorticoid may be suboptimal due to low β2‐adrenoceptor density or biased β2‐adrenoceptor signalling. Strategies to overcome these limitations include adding‐on a phosphodiesterase inhibitor and using agonists of other Gs‐coupled receptors. In all cases, the rational design of ICS/LABA, and derivative, combination therapies requires functional knowledge of induced (and repressed) genes for therapeutic benefit to be maximized. PMID:27646470

  10. Liverome: a curated database of liver cancer-related gene signatures with self-contained context information.

    PubMed

    Lee, Langho; Wang, Kai; Li, Gang; Xie, Zhi; Wang, Yuli; Xu, Jiangchun; Sun, Shaoxian; Pocalyko, David; Bhak, Jong; Kim, Chulhong; Lee, Kee-Ho; Jang, Ye Jin; Yeom, Young Il; Yoo, Hyang-Sook; Hwang, Seungwoo

    2011-11-30

    Hepatocellular carcinoma (HCC) is the fifth most common cancer worldwide. A number of molecular profiling studies have investigated the changes in gene and protein expression that are associated with various clinicopathological characteristics of HCC and generated a wealth of scattered information, usually in the form of gene signature tables. A database of the published HCC gene signatures would be useful to liver cancer researchers seeking to retrieve existing differential expression information on a candidate gene and to make comparisons between signatures for prioritization of common genes. A challenge in constructing such database is that a direct import of the signatures as appeared in articles would lead to a loss or ambiguity of their context information that is essential for a correct biological interpretation of a gene's expression change. This challenge arises because designation of compared sample groups is most often abbreviated, ad hoc, or even missing from published signature tables. Without manual curation, the context information becomes lost, leading to uninformative database contents. Although several databases of gene signatures are available, none of them contains informative form of signatures nor shows comprehensive coverage on liver cancer. Thus we constructed Liverome, a curated database of liver cancer-related gene signatures with self-contained context information. Liverome's data coverage is more than three times larger than any other signature database, consisting of 143 signatures taken from 98 HCC studies, mostly microarray and proteome, and involving 6,927 genes. The signatures were post-processed into an informative and uniform representation and annotated with an itemized summary so that all context information is unambiguously self-contained within the database. The signatures were further informatively named and meaningfully organized according to ten functional categories for guided browsing. 
Its web interface enables a straightforward retrieval of known differential expression information on a query gene and a comparison of signatures to prioritize common genes. The utility of Liverome-collected data is shown by case studies in which useful biological insights on HCC are produced. Liverome database provides a comprehensive collection of well-curated HCC gene signatures and straightforward interfaces for gene search and signature comparison as well. Liverome is available at http://liverome.kobic.re.kr.

  11. Cellular heterogeneity contributes to subtype-specific expression of ZEB1 in human glioblastoma.

    PubMed

    Euskirchen, Philipp; Radke, Josefine; Schmidt, Marc Sören; Schulze Heuling, Eva; Kadikowski, Eric; Maricos, Meron; Knab, Felix; Grittner, Ulrike; Zerbe, Norman; Czabanka, Marcus; Dieterich, Christoph; Miletic, Hrvoje; Mørk, Sverre; Koch, Arend; Endres, Matthias; Harms, Christoph

    2017-01-01

    The transcription factor ZEB1 has gained attention in tumor biology of epithelial cancers because of its function in epithelial-mesenchymal transition, DNA repair, stem cell biology and tumor-induced immunosuppression, but its role in gliomas with respect to invasion and prognostic value is controversial. We characterized ZEB1 expression at single cell level in 266 primary brain tumors and present a comprehensive dataset of high grade gliomas with Ki67, p53, IDH1, and EGFR immunohistochemistry, as well as EGFR FISH. ZEB1 protein expression in glioma stem cell lines was compared to their parental tumors with respect to gene expression subtypes based on RNA-seq transcriptomic profiles. ZEB1 is widely expressed in glial tumors, but in a highly variable fraction of cells. In glioblastoma, ZEB1 labeling index is higher in tumors with EGFR amplification or IDH1 mutation. Co-labeling studies showed that tumor cells and reactive astroglia, but not immune cells contribute to the ZEB1 positive population. In contrast, glioma cell lines constitutively express ZEB1 irrespective of gene expression subtype. In conclusion, our data indicate that immune infiltration likely contributes to differential labelling of ZEB1 and confounds interpretation of bulk ZEB1 expression data.

  12. The expression of genes involved in myometrial contractility changes during ex situ culture of pregnant human uterine smooth muscle tissue.

    PubMed

    Ilicic, Marina; Butler, Trent; Zakar, Tamas; Paul, Jonathan W

    2017-01-01

    Ex situ analysis of human myometrial tissue has been used to investigate the regulation of uterine quiescence and transition to a contractile phenotype. Following concerns about the validity of cultured primary cells, we examined whether myometrial tissue undergoes culture-induced changes ex situ that may affect the validity of in vitro models. We aimed to determine whether human myometrial tissue undergoes culture-induced changes ex situ in Estrogen receptor 1 (ESR1), Prostaglandin-endoperoxide synthase 2 (PTGS2) and Oxytocin receptor (OXTR) expression, and additionally whether culture conditions approaching the in vivo environment influence the expression of these key genes. Term non-laboring human myometrial tissues were cultured in the presence of specific treatments, including serum supplementation, progesterone and estrogen, cAMP, PMA, stretch or NF-κB inhibitors. ESR1, PTGS2 and OXTR mRNA abundance after 48 h culture was determined using quantitative RT-PCR. Myometrial tissue in culture exhibited culture-induced up-regulation of ESR1 and PTGS2 and down-regulation of OXTR mRNA expression. Progesterone prevented the culture-induced increase in ESR1 expression. Estrogen further up-regulated PTGS2 expression. Stretch had no direct effect, but blocked the effects of progesterone and estrogen on ESR1 and PTGS2 expression. cAMP had no effect whereas PMA further up-regulated PTGS2 expression and prevented the decline of OXTR expression. Human myometrial tissue in culture undergoes culture-induced gene expression changes consistent with transition toward a laboring phenotype. Changes in ESR1, PTGS2 and OXTR expression could not be controlled simultaneously. Until optimal culture conditions are determined, results of in vitro experiments with myometrial tissues should be interpreted with caution.

  13. SZGR 2.0: a one-stop shop of schizophrenia candidate genes.

    PubMed

    Jia, Peilin; Han, Guangchun; Zhao, Junfei; Lu, Pinyi; Zhao, Zhongming

    2017-01-04

    SZGR 2.0 is a comprehensive resource of candidate variants and genes for schizophrenia, covering genetic, epigenetic, transcriptomic, translational and many other types of evidence. By systematic review and curation of multiple lines of evidence, we included almost all variants and genes that have ever been reported to be associated with schizophrenia. In particular, we collected ∼4200 common variants reported in genome-wide association studies, ∼1000 de novo mutations discovered by large-scale sequencing of family samples, 215 genes spanning rare and replication copy number variations, 99 genes overlapping with linkage regions, 240 differentially expressed genes, 4651 differentially methylated genes and 49 genes as antipsychotic drug targets. To facilitate interpretation, we included various functional annotation data, especially brain eQTL, methylation QTL, brain expression featured in deep categorization of brain areas and developmental stages and brain-specific promoter and enhancer annotations. Furthermore, we conducted cross-study, cross-data type and integrative analyses of the multidimensional data deposited in SZGR 2.0, and made the data and results available through a user-friendly interface. In summary, SZGR 2.0 provides a one-stop shop of schizophrenia variants and genes and their function and regulation, providing an important resource in the schizophrenia and other mental disease community. SZGR 2.0 is available at https://bioinfo.uth.edu/SZGR/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. Global survey of mRNA levels and decay rates of Chlamydia trachomatis trachoma and lymphogranuloma venereum biovars.

    PubMed

    Ferreira, Rita; Borges, Vítor; Borrego, Maria José; Gomes, João Paulo

    2017-07-01

    Interpreting the intricate bacterial transcriptomics implies understanding the dynamic relationship established between de novo transcription and the degradation of transcripts. Here, we performed a comparative overview of gene expression levels and mRNA decay rates for different-biovar (trachoma and lymphogranuloma venereum) strains of the obligate intracellular bacterium Chlamydia trachomatis. By using RNA-sequencing to measure gene expression levels at mid developmental stage and mRNA decay rates upon rifampicin-based transcription blockage, we observed that: i) 60-70% of the top-50 expressed genes encode proteins with unknown function and proteins involved in "Translation, ribosomal structure and biogenesis" for all strains; ii) the expression ranking by genes' functional categories was in general concordant among different-biovar strains; iii) the median half-life (t1/2) values of transcripts were 15-17 min, indicating that the degree of transcripts' stability seems to correlate with the bacterial intracellular life-style, as these values are considerably higher than the ones observed in other studies for facultative intracellular and free-living bacteria; iv) transcript decay rates were highly heterogeneous within each C. trachomatis strain and did not correlate with steady-state expression levels; v) only in very few instances (essentially at gene functional category level) was it possible to unveil dissimilarities potentially underlying phenotypic differences between biovars. In summary, the unveiled transcriptomic scenario, marked by a general lack of correlation between transcript production and degradation and a huge inter-transcript heterogeneity in decay rates, likely reflects the challenges underlying the unique biphasic developmental cycle of C. trachomatis and its intricate interactions with the human host, which probably exacerbate the complexity of the bacterial transcription regulation.
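
    The relationship between a decay rate and a half-life mentioned in this record can be illustrated with a minimal sketch. It assumes simple first-order decay after transcription is blocked, which the abstract does not state explicitly; the function name, time points and abundances are hypothetical and are not data from the study.

```python
import math

def estimate_half_life(times_min, abundances):
    """Estimate a first-order decay constant k and half-life from a time course.

    Assumes abundance(t) = A0 * exp(-k * t), so ln(abundance) is linear in t.
    The slope is fitted by ordinary least squares; t_half = ln(2) / k.
    """
    logs = [math.log(a) for a in abundances]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    k = -sxy / sxx  # decay constant per minute
    return k, math.log(2) / k

# Hypothetical transcript whose abundance halves every 16 minutes.
times = [0, 8, 16, 32]
levels = [1000 * 0.5 ** (t / 16) for t in times]
k, t_half = estimate_half_life(times, levels)
```

    With real measurements the fit would be noisy, and transcripts that do not follow first-order kinetics would need a different model.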

  15. Optimized Probe Masking for Comparative Transcriptomics of Closely Related Species

    PubMed Central

    Poeschl, Yvonne; Delker, Carolin; Trenner, Jana; Ullrich, Kristian Karsten; Quint, Marcel; Grosse, Ivo

    2013-01-01

    Microarrays are commonly applied to study the transcriptome of specific species. However, many available microarrays are restricted to model organisms, and the design of custom microarrays for other species is often not feasible. Hence, transcriptomics approaches of non-model organisms as well as comparative transcriptomics studies among two or more species often rely on cost-intensive RNAseq studies or, alternatively, on hybridizing transcripts of a query species to a microarray of a closely related species. When analyzing these cross-species microarray expression data, differences in the transcriptome of the query species can cause problems, such as the following: (i) lower hybridization accuracy of probes due to mismatches or deletions, (ii) probes binding multiple transcripts of different genes, and (iii) probes binding transcripts of non-orthologous genes. So far, methods for (i) exist, but these neglect (ii) and (iii). Here, we propose an approach for comparative transcriptomics addressing problems (i) to (iii), which retains only transcript-specific probes binding transcripts of orthologous genes. We apply this approach to an Arabidopsis lyrata expression data set measured on a microarray designed for Arabidopsis thaliana, and compare it to two alternative approaches, a sequence-based approach and a genomic DNA hybridization-based approach. We investigate the number of retained probe sets, and we validate the resulting expression responses by qRT-PCR. We find that the proposed approach combines the benefit of sequence-based stringency and accuracy while allowing the expression analysis of many more genes than the alternative sequence-based approach. As an added benefit, the proposed approach requires probes to detect transcripts of orthologous genes only, which provides a superior base for biological interpretation of the measured expression responses. PMID:24260119
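
    The filtering idea behind problems (i) to (iii) can be sketched as follows. This is only an illustration of the retention rule described in the abstract, not the authors' implementation; the function name, the input dictionaries and all probe and gene identifiers are hypothetical.

```python
def mask_probes(probe_target, probe_hits, orthologs):
    """Keep only probes that match exactly one query-species gene,
    and only if that gene is the ortholog of the probe's reference gene.

    probe_target: probe id -> reference-species gene the probe was designed for
    probe_hits:   probe id -> set of query-species genes whose transcripts it matches
    orthologs:    reference-species gene -> orthologous query-species gene
    Returns: reference gene -> sorted list of retained probe ids.
    """
    retained = {}
    for probe, ref_gene in probe_target.items():
        hits = probe_hits.get(probe, set())
        if len(hits) != 1:
            continue  # no usable match (i) or matches several genes (ii)
        (hit,) = hits
        if orthologs.get(ref_gene) != hit:
            continue  # matches a non-orthologous gene (iii)
        retained.setdefault(ref_gene, []).append(probe)
    return {gene: sorted(probes) for gene, probes in retained.items()}

# Hypothetical identifiers, for illustration only.
probe_target = {"p1": "AT1", "p2": "AT1", "p3": "AT2", "p4": "AT3"}
probe_hits = {"p1": {"AL1"}, "p2": {"AL1", "AL9"}, "p3": {"AL7"}, "p4": set()}
orthologs = {"AT1": "AL1", "AT2": "AL2", "AT3": "AL3"}
masked = mask_probes(probe_target, probe_hits, orthologs)
```

    Here only probe p1 survives, so the probe set for AT1 shrinks to a single probe and the probe sets for AT2 and AT3 are dropped entirely.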

  16. Discovering functional modules by topic modeling RNA-Seq based toxicogenomic data.

    PubMed

    Yu, Ke; Gong, Binsheng; Lee, Mikyung; Liu, Zhichao; Xu, Joshua; Perkins, Roger; Tong, Weida

    2014-09-15

    Toxicogenomics (TGx) endeavors to elucidate the underlying molecular mechanisms through exploring gene expression profiles in response to toxic substances. Recently, RNA-Seq is increasingly regarded as a more powerful alternative to microarrays in TGx studies. However, realizing RNA-Seq's full potential requires novel approaches to extracting information from the complex TGx data. Considering read counts as the number of times a word occurs in a document, gene expression profiles from RNA-Seq are analogous to a word by document matrix used in text mining. Topic modeling, which aims to discover the latent structures in text corpora, would be helpful to explore RNA-Seq based TGx data. In this study, topic modeling was applied to a typical RNA-Seq based TGx data set to discover hidden functional modules. The RNA-Seq based gene expression profiles were transformed into "documents", on which latent Dirichlet allocation (LDA) was used to build a topic model. We found samples treated by the compounds with the same modes of actions (MoAs) could be clustered based on topic similarities. The topic most relevant to each cluster was identified as a "marker" topic, which was interpreted by gene enrichment analysis with MoAs then confirmed by compound and pathways associations mined from literature. To further validate the "marker" topics, we tested topic transferability from RNA-Seq to microarrays. The RNA-Seq based gene expression profile of a topic specifically associated with peroxisome proliferator-activated receptors (PPAR) signaling pathway was used to query samples with similar expression profiles in two different microarray data sets, yielding accuracy of about 85%. This proof-of-concept study demonstrates the applicability of topic modeling to discover functional modules in RNA-Seq data and suggests a valuable computational tool for leveraging information within TGx data in the RNA-Seq era.
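
    The analogy between read counts and word counts can be made concrete with a small sketch. It shows only one plausible way to turn a sample's counts into a "document"; the abstract does not describe the exact transformation used, the function name and scaling parameter are assumptions, and the gene names and counts are illustrative values rather than data from the study. Fitting the LDA model itself would require a topic-modeling library and is not shown.

```python
from collections import Counter

def counts_to_document(gene_counts, scale=1.0):
    """Turn one sample's read counts into a bag-of-words "document".

    Each gene id is treated as a word and repeated round(count * scale) times,
    so that word frequency in the document mirrors (scaled) read count.
    Genes are emitted in sorted order to make the output deterministic.
    """
    document = []
    for gene, count in sorted(gene_counts.items()):
        document.extend([gene] * int(round(count * scale)))
    return document

# Illustrative sample: gene id -> read count (values are made up).
sample = {"Cyp4a1": 3, "Acox1": 2, "Gapdh": 0}
doc = counts_to_document(sample)
term_frequencies = Counter(doc)
```

    Real read counts span several orders of magnitude, so a scale factor below 1 or a log transform would probably be needed to keep documents at a manageable length.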

  17. Differential gene expression in dentate granule cells in mesial temporal lobe epilepsy with and without hippocampal sclerosis.

    PubMed

    Griffin, Nicole G; Wang, Yu; Hulette, Christine M; Halvorsen, Matt; Cronin, Kenneth D; Walley, Nicole M; Haglund, Michael M; Radtke, Rodney A; Skene, J H Pate; Sinha, Saurabh R; Heinzen, Erin L

    2016-03-01

    Hippocampal sclerosis is the most common neuropathologic finding in cases of medically intractable mesial temporal lobe epilepsy. In this study, we analyzed the gene expression profiles of dentate granule cells of patients with mesial temporal lobe epilepsy with and without hippocampal sclerosis to show that next-generation sequencing methods can produce interpretable genomic data from RNA collected from small homogenous cell populations, and to shed light on the transcriptional changes associated with hippocampal sclerosis. RNA was extracted, and complementary DNA (cDNA) was prepared and amplified from dentate granule cells that had been harvested by laser capture microdissection from surgically resected hippocampi from patients with mesial temporal lobe epilepsy with and without hippocampal sclerosis. Sequencing libraries were sequenced, and the resulting sequencing reads were aligned to the reference genome. Differential expression analysis was used to ascertain expression differences between patients with and without hippocampal sclerosis. Greater than 90% of the RNA-Seq reads aligned to the reference. There was high concordance between transcriptional profiles obtained for duplicate samples. Principal component analysis revealed that the presence or absence of hippocampal sclerosis was the main determinant of the variance within the data. Among the genes up-regulated in the hippocampal sclerosis samples, there was significant enrichment for genes involved in oxidative phosphorylation. By analyzing the gene expression profiles of dentate granule cells from surgically resected hippocampal specimens from patients with mesial temporal lobe epilepsy with and without hippocampal sclerosis, we have demonstrated the utility of next-generation sequencing methods for producing biologically relevant results from small populations of homogeneous cells, and have provided insight on the transcriptional changes associated with this pathology. Wiley Periodicals, Inc. 
© 2016 International League Against Epilepsy.

  18. Setting the pace: host rhythmic behaviour and gene expression patterns in the facultatively symbiotic cnidarian Aiptasia are determined largely by Symbiodinium.

    PubMed

    Sorek, Michal; Schnytzer, Yisrael; Ben-Asher, Hiba Waldman; Caspi, Vered Chalifa; Chen, Chii-Shiarng; Miller, David J; Levy, Oren

    2018-05-09

    All organisms employ biological clocks to anticipate physical changes in the environment; however, the integration of biological clocks in symbiotic systems has received limited attention. In corals, the interpretation of rhythmic behaviours is complicated by the daily oscillations in tissue oxygen tension resulting from the photosynthetic and respiratory activities of the associated algal endosymbiont Symbiodinium. In order to better understand the integration of biological clocks in cnidarian hosts of Symbiodinium, daily rhythms of behaviour and gene expression were studied in symbiotic and aposymbiotic morphs of the sea-anemone Aiptasia diaphana. The results showed that whereas circatidal (approx. 12-h) cycles of activity and gene expression predominated in aposymbiotic morphs, circadian (approx. 24-h) patterns were the more common in symbiotic morphs, where the expression of a significant number of genes shifted from a 12- to 24-h rhythm. The behavioural experiments on symbiotic A. diaphana displayed diel (24-h) rhythmicity in body and tentacle contraction under the light/dark cycles, whereas aposymbiotic morphs showed approximately 12-h (circatidal) rhythmicity. Reinfection experiments represent an important step in understanding the hierarchy of endogenous clocks in symbiotic associations, where the aposymbiotic Aiptasia morphs returned to a 24-h behavioural rhythm after repopulation with algae. Whilst some modification of host metabolism is to be expected, the extent to which the presence of the algae modified host endogenous behavioural and transcriptional rhythms implies that it is the symbionts that influence the pace. Our results clearly demonstrate the importance of the endosymbiotic algae in determining the timing and the duration of the extension and contraction of the body and tentacles and temporal gene expression.

  19. Epigenomic regulation of oncogenesis by chromatin remodeling.

    PubMed

    Kumar, R; Li, D-Q; Müller, S; Knapp, S

    2016-08-25

    Disruption of the intricate gene expression program represents one of the major driving factors for the development, progression and maintenance of human cancer, and is often associated with acquired therapeutic resistance. At the molecular level, cancerous phenotypes are the outcome of cellular functions of critical genes, regulatory interactions of histones and chromatin remodeling complexes in response to dynamic and persistent upstream signals. A large body of genetic and biochemical evidence suggests that the chromatin remodelers integrate the extracellular and cytoplasmic signals to control gene activity. Consequently, widespread dysregulation of chromatin remodelers and the resulting inappropriate expression of regulatory genes, together, lead to oncogenesis. We summarize the recent developments and current state of the dysregulation of the chromatin remodeling components as the driving mechanism underlying the growth and progression of human tumors. Because chromatin remodelers, modifying enzymes and protein-protein interactions participate in interpreting the epigenetic code, selective chromatin remodelers and bromodomains have emerged as new frontiers for pharmacological intervention to develop future anti-cancer strategies to be used either as single agents or in combination therapies with chemotherapeutics or radiotherapy.

  20. A Robust Unified Approach to Analyzing Methylation and Gene Expression Data

    PubMed Central

    Khalili, Abbas; Huang, Tim; Lin, Shili

    2009-01-01

    Microarray technology has made it possible to investigate expression levels, and more recently methylation signatures, of thousands of genes simultaneously, in a biological sample. Since more and more data from different biological systems or technological platforms are being generated at an incredible rate, there is an increasing need to develop statistical methods that are applicable to multiple data types and platforms. Motivated by such a need, a flexible finite mixture model that is applicable to methylation, gene expression, and potentially data from other biological systems, is proposed. Two major thrusts of this approach are to allow for a variable number of components in the mixture to capture non-biological variation and small biases, and to use a robust procedure for parameter estimation and probe classification. The method was applied to the analysis of methylation signatures of three breast cancer cell lines. It was also tested on three sets of expression microarray data to study its power and type I error rates. Comparison with a number of existing methods in the literature yielded very encouraging results; lower type I error rates and comparable/better power were achieved based on the limited study. Furthermore, the method also leads to more biologically interpretable results for the three breast cancer cell lines. PMID:20161265

  1. The shortest path is not the one you know: application of biological network resources in precision oncology research.

    PubMed

    Kuperstein, Inna; Grieco, Luca; Cohen, David P A; Thieffry, Denis; Zinovyev, Andrei; Barillot, Emmanuel

    2015-03-01

    Several decades of molecular biology research have delivered a wealth of detailed descriptions of molecular interactions in normal and tumour cells. This knowledge has been functionally organised and assembled into dedicated biological pathway resources that serve as an invaluable tool, not only for structuring the information about molecular interactions but also for making it available for biological, clinical and computational studies. With the advent of high-throughput molecular profiling of tumours, close to complete molecular catalogues of mutations, gene expression and epigenetic modifications are available and require adequate interpretation. Taking into account the information about biological signalling machinery in cells may help to better interpret molecular profiles of tumours. Making sense out of these descriptions requires biological pathway resources for functional interpretation of the data. In this review, we describe the available biological pathway resources, their characteristics in terms of construction mode, focus, aims and paradigms of biological knowledge representation. We present a new resource that is focused on cancer-related signalling, the Atlas of Cancer Signalling Networks. We briefly discuss current approaches for data integration, visualisation and analysis, using biological networks, such as pathway scoring, guilt-by-association and network propagation. Finally, we illustrate with several examples the added value of data interpretation in the context of biological networks and demonstrate that it may help in analysis of high-throughput data like mutation, gene expression or small interfering RNA screening and can guide in patients stratification. Finally, we discuss perspectives for improving precision medicine using biological network resources and tools. 
Taking into account the information about biological signalling machinery in cells may help to better interpret molecular patterns of tumours and enable precision oncology to be put into general clinical practice. © The Author 2015. Published by Oxford University Press on behalf of the UK Environmental Mutagen Society.

  2. Seasonal variations of gene expression biomarkers in Mytilus galloprovincialis cultured populations: temperature, oxidative stress and reproductive cycle as major modulators.

    PubMed

    Jarque, Sergio; Prats, Eva; Olivares, Alba; Casado, Marta; Ramón, Montserrat; Piña, Benjamin

    2014-11-15

    The blue mussel Mytilus galloprovincialis has been used as a monitoring organism in many biomonitoring programs because of its broad distribution in South European sea waters and its physiological characteristics. Different pollution-stress biomarkers, including gene expression biomarkers, have been developed to determine its physiological response to the presence of different pollutants. However, the existing information about basal expression profiles is very limited, as very few biomarker-based studies were designed to reflect the natural seasonal variations. In the present study, we analyzed the natural expression patterns of several genes commonly used in biomonitoring, namely ferritin, metallothionein, cytochrome P450, glutathione S-transferase, heat shock protein and the kinase responsive to stress KRS, during an annual life cycle. Analysis of mantle-gonad samples of cultured populations of M. galloprovincialis from the Delta del Ebro (North East Spain) showed natural seasonal variability of these biomarkers, pointing to temperature and oxidative stress as major abiotic modulators. In turn, the reproductive cycle, a process that can be tracked by VCLM7 expression, and known to be influenced by temperature, seems to be the major biotic factor involved in seasonality. Our results illustrate the influence of environmental factors on the physiology of mussels through their annual cycle, crucial information for the correct interpretation of responses under stress conditions. Copyright © 2014 Elsevier B.V. All rights reserved.

  3. Interpretation of the FGF8 morphogen gradient is regulated by endocytic trafficking.

    PubMed

    Nowak, Matthias; Machate, Anja; Yu, Shuizi Rachel; Gupta, Mansi; Brand, Michael

    2011-02-01

    Forty years ago, it was proposed that during embryonic development and organogenesis, morphogen gradients provide positional information to the individual cells within a tissue leading to specific fate decisions. Recently, much insight has been gained into how such morphogen gradients are formed and maintained; however, which cellular mechanisms govern their interpretation within target tissues remains debated. Here we used in vivo fluorescence correlation spectroscopy and automated image analysis to assess the role of endocytic sorting dynamics on fibroblast growth factor 8 (Fgf8) morphogen gradient interpretation. By interfering with the function of the ubiquitin ligase Cbl, we found an expanded range of Fgf target gene expression and a delay of Fgf8 lysosomal transport. However, the extracellular Fgf8 morphogen gradient remained unchanged, indicating that the observed signalling changes are due to altered gradient interpretation. We propose that regulation of morphogen signalling activity through endocytic sorting allows fast feedback-induced changes in gradient interpretation during the establishment of complex patterns.

  4. Noncoding copy-number variations are associated with congenital limb malformation.

    PubMed

    Flöttmann, Ricarda; Kragesteen, Bjørt K; Geuer, Sinje; Socha, Magdalena; Allou, Lila; Sowińska-Seidler, Anna; Bosquillon de Jarcy, Laure; Wagner, Johannes; Jamsheer, Aleksander; Oehl-Jaschkowitz, Barbara; Wittler, Lars; de Silva, Deepthi; Kurth, Ingo; Maya, Idit; Santos-Simarro, Fernando; Hülsemann, Wiebke; Klopocki, Eva; Mountford, Roger; Fryer, Alan; Borck, Guntram; Horn, Denise; Lapunzina, Pablo; Wilson, Meredith; Mascrez, Bénédicte; Duboule, Denis; Mundlos, Stefan; Spielmann, Malte

    2017-10-12

    Purpose Copy-number variants (CNVs) are generally interpreted by linking the effects of gene dosage with phenotypes. The clinical interpretation of noncoding CNVs remains challenging. We investigated the percentage of disease-associated CNVs in patients with congenital limb malformations that affect noncoding cis-regulatory sequences versus genes sensitive to gene dosage effects. Methods We applied high-resolution copy-number analysis to 340 unrelated individuals with isolated limb malformation. To investigate novel candidate CNVs, we re-engineered human CNVs in mice using clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing. Results Of the individuals studied, 10% harbored CNVs segregating with the phenotype in the affected families. We identified 31 CNVs previously associated with congenital limb malformations and four novel candidate CNVs. Most of the disease-associated CNVs (57%) affected the noncoding cis-regulatory genome, while only 43% included a known disease gene and were likely to result from gene dosage effects. In transgenic mice harboring four novel candidate CNVs, we observed altered gene expression in all cases, indicating that the CNVs had a regulatory effect either by changing the enhancer dosage or altering the topological associating domain architecture of the genome. Conclusion Our findings suggest that CNVs affecting noncoding regulatory elements are a major cause of congenital limb malformations. Genetics in Medicine advance online publication, 12 October 2017; doi:10.1038/gim.2017.154.

  5. Accurate and sensitive quantification of protein-DNA binding affinity.

    PubMed

    Rastogi, Chaitanya; Rube, H Tomas; Kribelbauer, Judith F; Crocker, Justin; Loker, Ryan E; Martini, Gabriella D; Laptenko, Oleg; Freed-Pastor, William A; Prives, Carol; Stern, David L; Mann, Richard S; Bussemaker, Harmen J

    2018-04-17

    Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. Copyright © 2018 the Author(s). Published by PNAS.

  6. Accurate and sensitive quantification of protein-DNA binding affinity

    PubMed Central

    Rastogi, Chaitanya; Rube, H. Tomas; Kribelbauer, Judith F.; Crocker, Justin; Loker, Ryan E.; Martini, Gabriella D.; Laptenko, Oleg; Freed-Pastor, William A.; Prives, Carol; Stern, David L.; Mann, Richard S.; Bussemaker, Harmen J.

    2018-01-01

    Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. PMID:29610332

  7. SVGMap: configurable image browser for experimental data.

    PubMed

    Rafael-Palou, Xavier; Schroeder, Michael P; Lopez-Bigas, Nuria

    2012-01-01

    Spatial data visualization is very useful to represent biological data and quickly interpret the results. For instance, to show the expression pattern of a gene in different tissues of a fly, an intuitive approach is to draw the fly with the corresponding tissues and color the expression of the gene in each of them. However, the creation of these visual representations may be a burdensome task. Here we present SVGMap, a Java application that automates the generation of high-quality graphics for single data items (e.g. genes) and biological conditions. SVGMap contains a browser that allows the user to navigate the different images created and can be used as a web-based results publishing tool. SVGMap is freely available as a precompiled Java package as well as source code at http://bg.upf.edu/svgmap. It requires Java 6 and any recent web browser with JavaScript enabled. The software can be run on Linux, Mac OS X and Windows systems. Contact: nuria.lopez@upf.edu

  8. Systems Biology Approach to the Dissection of the Complexity of Regulatory Networks in the S. scrofa Cardiocirculatory System

    PubMed Central

    Martini, Paolo; Sales, Gabriele; Calura, Enrica; Brugiolo, Mattia; Lanfranchi, Gerolamo; Romualdi, Chiara; Cagnin, Stefano

    2013-01-01

    Genome-wide experiments are routinely used to increase the understanding of the biological processes involved in the development and maintenance of a variety of pathologies. Although the technical feasibility of this type of experiment has improved in recent years, data analysis remains challenging. In this context, gene set analysis has emerged as a fundamental tool for the interpretation of the results. Here, we review strategies used in the gene set approach, and using datasets for the pig cardiocirculatory system as a case study, we demonstrate how the use of a combination of these strategies can enhance the interpretation of results. Gene set analyses are able to distinguish vessels from the heart and arteries from veins in a manner that is consistent with the different cellular composition of smooth muscle cells. By integrating microRNA elements in the regulatory circuits identified, we find that vessel specificity is maintained through specific miRNAs, such as miR-133a and miR-143, which show anti-correlated expression with their mRNA targets. PMID:24284405
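
    One common strategy in gene set analysis is over-representation testing with a hypergeometric test. The sketch below is illustrative only: the abstract does not state which tests the authors used, and the function name and toy numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(universe: int, set_size: int, selected: int, overlap: int) -> float:
    """Upper-tail hypergeometric p-value for gene set over-representation.

    universe: number of genes measured (N)
    set_size: genes in the universe that belong to the gene set (K)
    selected: genes called differentially expressed (n)
    overlap:  selected genes that belong to the gene set (x)

    Returns P(X >= overlap) when `selected` genes are drawn at random
    without replacement from the universe.
    """
    upper = min(set_size, selected)
    # Sum the integer counts first so the final division is done only once.
    favourable = sum(
        comb(set_size, k) * comb(universe - set_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(universe, selected)


# Hypothetical toy numbers: 10 genes measured, 5 in the set, 5 selected,
# and all 5 selected genes fall inside the set.
p_all_in_set = enrichment_p_value(universe=10, set_size=5, selected=5, overlap=5)
```

    A small p-value suggests that the gene set is over-represented among the selected genes. In practice the p-values for many gene sets would also need correction for multiple testing.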

  9. Metatranscriptome Analysis of Aquifer Samples Reveals Unexpected Metabolic Lifestyles Relevant to Active Biogeochemical Cycling

    NASA Astrophysics Data System (ADS)

    Beller, H. R.; Jewell, T. N. M.; Karaoz, U.; Banfield, J. F.; Brodie, E.; Williams, K. H.

    2015-12-01

    Modern molecular ecology techniques are revealing the metabolic potential of uncultivated microorganisms, but there is still much to be learned about the actual biogeochemical roles of microbes that have cultivated relatives. Here, we present metatranscriptomic and metagenomic data from a field study that provides evidence of coupled redox processes that have not been documented in cultivated relatives and, indeed, represent strains with metabolic traits that are novel with respect to closely related isolates. The data come from omics analysis of groundwater samples collected during an experiment in which nitrate (a native electron acceptor) was injected into a perennially suboxic aquifer in Rifle (CO). Transcriptional data indicated that just two groups of chemolithoautotrophic bacteria accounted for a very large portion (~80%) of overall community gene expression: (1) members of the Fe(II)-oxidizing Gallionellaceae family and (2) strains of the S-oxidizing species, Sulfurimonas denitrificans. Metabolic lifestyles for Gallionellaceae strains that were novel compared to cultivated representatives included nitrate-dependent Fe(II) oxidation and S oxidation. Evidence for these metabolisms included highly correlated temporal expression in binned data of nitrate reductase (e.g., narGHI) genes (which have never been reported in Gallionellaceae genomes) and Fe(II) oxidation genes (e.g., mtoA) or S oxidation genes (e.g., dsrE, aprA). Of the two most active strains of S. denitrificans, only one showed strong expression of S oxidation genes, whereas the other was apparently using an unexpected (as-yet unidentified) primary electron donor. 
Transcriptional data added considerable interpretive value to this study, as (1) metagenomic data would not have highlighted these organisms, which had a disproportionately large role in community metabolism relative to their populations, and (2) co-expression of coupled pathway genes could not be predicted based solely on metagenomic data.

  10. Evaluating methods of inferring gene regulatory networks highlights their lack of performance for single cell gene expression data.

    PubMed

    Chen, Shuonan; Mar, Jessica C

    2018-06-19

    A fundamental fact in biology states that genes do not operate in isolation, and yet, methods that infer regulatory networks for single cell gene expression data have been slow to emerge. With single cell sequencing methods now becoming accessible, general network inference algorithms that were initially developed for data collected from bulk samples may not be suitable for single cells. Meanwhile, although methods that are specific for single cell data are now emerging, whether they have improved performance over general methods is unknown. In this study, we evaluate the applicability of five general methods and three single cell methods for inferring gene regulatory networks from both experimental single cell gene expression data and in silico simulated data. Standard evaluation metrics using ROC curves and Precision-Recall curves against reference sets sourced from the literature demonstrated that most of the methods performed poorly when they were applied to either experimental single cell data, or simulated single cell data, which demonstrates their lack of performance for this task. Using default settings, network methods were applied to the same datasets. Comparisons of the learned networks highlighted the uniqueness of some predicted edges for each method. The fact that different methods infer networks that vary substantially reflects the underlying mathematical rationale and assumptions that distinguish network methods from each other. This study provides a comprehensive evaluation of network modeling algorithms applied to experimental single cell gene expression data and in silico simulated datasets where the network structure is known. Comparisons demonstrate that most of these assessed network methods are not able to predict network structures from single cell expression data accurately, even if they are specifically developed for single cell methods. 
Also, single cell methods, which usually depend on more elaborate algorithms, in general have less similarity to each other in the sets of edges detected. The results from this study emphasize the importance of developing more accurate, optimized network modeling methods that are compatible with single cell data. Newly developed single cell methods may uniquely capture particular features of potential gene-gene relationships, and caution should be taken when we interpret these results.
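
    The kind of comparison described above (scoring predicted edges against a reference set, and measuring how similar the edge sets from two methods are) can be sketched as follows. This is a minimal illustration, not the authors' evaluation pipeline: edges are assumed to be undirected, and the gene names and edge lists are hypothetical.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]


def normalize(edges: Iterable[Edge]) -> Set[frozenset]:
    """Treat edges as undirected and drop self-loops."""
    return {frozenset(edge) for edge in edges if edge[0] != edge[1]}


def precision_recall(predicted: Iterable[Edge], reference: Iterable[Edge]) -> Tuple[float, float]:
    """Precision and recall of a predicted edge set against a reference set."""
    pred, ref = normalize(predicted), normalize(reference)
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    return precision, recall


def jaccard(edges_a: Iterable[Edge], edges_b: Iterable[Edge]) -> float:
    """Jaccard similarity between the edge sets of two inferred networks."""
    a, b = normalize(edges_a), normalize(edges_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical toy networks over genes G1..G4.
reference = [("G1", "G2"), ("G2", "G3"), ("G3", "G4")]
method_a = [("G2", "G1"), ("G1", "G4")]
method_b = [("G2", "G3"), ("G3", "G4"), ("G1", "G3"), ("G4", "G1")]

pr_a = precision_recall(method_a, reference)
pr_b = precision_recall(method_b, reference)
similarity = jaccard(method_a, method_b)
```

    ROC and Precision-Recall curves would be obtained by repeating the precision and recall calculation over a range of edge-score thresholds, which requires per-edge scores that this toy example does not include.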

  11. Over-expression of XIST, the Master Gene for X Chromosome Inactivation, in Females With Major Affective Disorders

    PubMed Central

    Ji, Baohu; Higa, Kerin K.; Kelsoe, John R.; Zhou, Xianjin

    2015-01-01

    Background Psychiatric disorders are common mental disorders without a pathological biomarker. Classic genetic studies found that an extra X chromosome frequently causes psychiatric symptoms in patients with either Klinefelter syndrome (XXY) or Triple X syndrome (XXX). Over-dosage of some X-linked escapee genes was suggested to cause psychiatric disorders. However, relevance of these rare genetic diseases to the pathogenesis of psychiatric disorders in the general population of psychiatric patients is unknown. Methods XIST and several X-linked genes were studied in 36 lymphoblastoid cell lines from healthy females and 60 lymphoblastoid cell lines from female patients with either bipolar disorder or recurrent major depression. XIST and KDM5C expression was also quantified in 48 RNA samples from postmortem human brains of healthy female controls and female psychiatric patients. Findings We found that the XIST gene, a master in control of X chromosome inactivation (XCI), is significantly over-expressed (p = 1 × 10^-7, corrected after multiple comparisons) in the lymphoblastoid cells of female patients with either bipolar disorder or major depression. The X-linked escapee gene KDM5C also displays significant up-regulation (p = 5.3 × 10^-7, corrected after multiple comparisons) in the patients' cells. Expression of XIST and KDM5C is highly correlated (Pearson's coefficient, r = 0.78, p = 1.3 × 10^-13). Studies on human postmortem brains supported over-expression of the XIST gene in female psychiatric patients. Interpretations We propose that over-expression of XIST may cause or result from subtle alteration of XCI, which up-regulates the expression of some X-linked escapee genes including KDM5C. Over-expression of X-linked genes could be a common mechanism for the development of psychiatric disorders between patients with those rare genetic diseases and the general population of female psychiatric patients with XIST over-expression. 
Our studies suggest that XIST and KDM5C expression could be used as a biological marker for diagnosis of psychiatric disorders in a significantly large subset of female patients. Research in context Due to lack of biological markers, diagnosis and treatment of psychiatric disorders are subjective. There is utmost urgency to identify biomarkers for clinics, research, and drug development. We found that XIST and KDM5C gene expression may be used as a biological marker for diagnosis of major affective disorders in a significantly large subset of female patients from the general population. Our studies show that over-expression of XIST and some X-linked escapee genes may be a common mechanism for development of psychiatric disorders between the patients with rare genetic diseases (XXY or XXX) and the general population of female psychiatric patients. PMID:26425698

  12. Computational biology for ageing

    PubMed Central

    Wieser, Daniela; Papatheodorou, Irene; Ziehm, Matthias; Thornton, Janet M.

    2011-01-01

    High-throughput genomic and proteomic technologies have generated a wealth of publicly available data on ageing. Easy access to these data, and their computational analysis, is of great importance in order to pinpoint the causes and effects of ageing. Here, we provide a description of the existing databases and computational tools on ageing that are available for researchers. We also describe the computational approaches to data interpretation in the field of ageing including gene expression, comparative and pathway analyses, and highlight the challenges for future developments. We review recent biological insights gained from applying bioinformatics methods to analyse and interpret ageing data in different organisms, tissues and conditions. PMID:21115530

  13. Analysis of temporal transcription expression profiles reveal links between protein function and developmental stages of Drosophila melanogaster.

    PubMed

    Wan, Cen; Lees, Jonathan G; Minneci, Federico; Orengo, Christine A; Jones, David T

    2017-10-01

    Accurate gene or protein function prediction is a key challenge in the post-genome era. Most current methods perform well on molecular function prediction, but struggle to provide useful annotations relating to biological process functions due to the limited power of sequence-based features in that functional domain. In this work, we systematically evaluate the predictive power of temporal transcription expression profiles for protein function prediction in Drosophila melanogaster. Our results show significantly better performance on predicting protein function when transcription expression profile-based features are integrated with sequence-derived features, compared with the sequence-derived features alone. We also observe that the combination of expression-based and sequence-based features leads to further improvement of accuracy on predicting all three domains of gene function. Based on the optimal feature combinations, we then propose a novel multi-classifier-based function prediction method for Drosophila melanogaster proteins, FFPred-fly+. Interpreting our machine learning models also allows us to identify some of the underlying links between biological processes and developmental stages of Drosophila melanogaster.

  14. microRNA Profiling of Amniotic Fluid: Evidence of Synergy of microRNAs in Fetal Development.

    PubMed

    Sun, Tingting; Li, Weiyun; Li, Tianpeng; Ling, Shucai

    2016-01-01

    Amniotic fluid (AF) continuously exchanges molecules with the fetus, playing critical roles in fetal development especially via its complex components. Among these components, microRNAs are thought to be transferred between cells loaded in microvesicles. However, the functions of AF microRNAs remain unknown. To date, few studies have examined microRNAs in amniotic fluid. In this study, we employed miRCURY Locked Nucleic Acid arrays to profile the dynamic expression of microRNAs in AF from mice on embryonic days E13, E15, and E17. At these times, 233 microRNAs were differentially expressed (p < 0.01), accounting for 23% of the total Mus musculus microRNAs. These differentially expressed microRNAs were divided into two distinct groups based on their expression patterns. Gene ontology analysis showed that the intersectional target genes of these differentially expressed microRNAs were mainly distributed in synapse, synaptosome, cell projection, and cytoskeleton. Pathway analysis revealed that the target genes of the two groups of microRNAs were synergistically enriched in axon guidance, focal adhesion, and MAPK signaling pathways. MicroRNA-mRNA network analysis and gene mapping showed that these microRNAs synergistically regulated cell motility, cell proliferation and differentiation, and especially the axon guidance process. Cancer pathways associated with growth and proliferation were also enriched in AF. Taken together, the results of this study are the first to show the functions of microRNAs in AF during fetal development, providing novel insights into interpreting the roles of AF microRNAs in fetal development.

  15. Update on Aire and thymic negative selection.

    PubMed

    Passos, Geraldo A; Speck-Hernandez, Cesar A; Assis, Amanda F; Mendes-da-Cruz, Daniella A

    2018-01-01

    Twenty years ago, the autoimmune regulator (Aire) gene was associated with autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy, and was cloned and sequenced. Its importance goes beyond its abstract link with human autoimmune disease. Aire identification opened new perspectives to better understand the molecular basis of central tolerance and self-non-self distinction, the main properties of the immune system. Since 1997, a growing number of immunologists and molecular geneticists have made important discoveries about the function of Aire, which is essentially a pleiotropic gene. Aire is one of the functional markers in medullary thymic epithelial cells (mTECs), controlling their differentiation and expression of peripheral tissue antigens (PTAs), mTEC-thymocyte adhesion and the expression of microRNAs, among other functions. With Aire, the immunological tolerance became even more apparent from the molecular genetics point of view. Currently, mTECs represent the most unusual cells because they express almost the entire functional genome but still maintain their identity. Due to the enormous diversity of PTAs, this uncommon gene expression pattern was termed promiscuous gene expression, the interpretation of which is essentially immunological - i.e. it is related to self-representation in the thymus. Therefore, this knowledge is strongly linked to the negative selection of autoreactive thymocytes. In this update, we focus on the most relevant results of Aire as a transcriptional and post-transcriptional controller of PTAs in mTECs, its mechanism of action, and its influence on the negative selection of autoreactive thymocytes as the bases of the induction of central tolerance and prevention of autoimmune diseases. © 2017 John Wiley & Sons Ltd.

  16. Gene protein detection platform--a comparison of a new human epidermal growth factor receptor 2 assay with conventional immunohistochemistry and fluorescence in situ hybridization platforms.

    PubMed

    Stålhammar, Gustav; Farrajota, Pedro; Olsson, Ann; Silva, Cristina; Hartman, Johan; Elmberger, Göran

    2015-08-01

    Human epidermal growth factor receptor 2 (HER2) immunohistochemistry (IHC) and fluorescence in situ hybridization (FISH) are widely used semiquantitative assays for selecting breast cancer patients for HER2 antibody therapy. However, both techniques have been shown to have disadvantages. Our aim was to test a recent automated technique combining IHC and brightfield dual in situ hybridization (DISH), the gene protein detection platform (GPDP), in breast cancer HER2 protein, gene, and chromosome 17 centromere status evaluations, comparing the results in accordance with the American Society of Clinical Oncology/College of American Pathologists recommendations for HER2 testing in breast cancer from both 2007 and 2013. The GPDP technique performance was evaluated on 52 consecutive whole slide invasive breast cancer cases with HER2 IHC 2/3+ scoring results. Applying in turn the American Society of Clinical Oncology/College of American Pathologists recommendations for HER2 testing in breast cancer from 2007 and 2013 to both FISH and GPDP DISH assays, the HER2 gene amplification results showed 100% concordance among amplified/nonamplified cases, but there was a shift in 4 cases toward positive from equivocal results and toward equivocal from negative results. This might be related to the emphasis on the average HER2 copy number in the 2013 criteria. HER2 expression by IVD market IHC kit (Pathway®) has a strong correlation with GPDP HER2 protein, including a full concordance for all cases scored as 3+ and a reduction from 2+ to 1+ in 7 cases corresponding to nonamplified cases. Gene protein detection platform HER2 protein "solo" could have spared the need for 7 FISH studies. In addition, the platform offered advantages in interpretation reassurance, including selecting areas for counting gene signals paralleled with protein IHC expression, and in heterogeneity detection, interpretation time, technical time, and tissue expense. Copyright © 2015 Elsevier Inc. All rights reserved.

  17. Differentially expressed regulatory genes in honey bee caste development

    NASA Astrophysics Data System (ADS)

    Hepperle, C.; Hartfelder, K.

    2001-03-01

    In the honey bee, an eminently fertile queen with up to 200 ovarioles per ovary monopolizes colony-level reproduction. In contrast, worker bees have only a few ovarioles and are essentially sterile. This phenotype divergence is a result of caste-specifically modulated juvenile hormone and ecdysteroid titers in larval development. In this study we employed a differential-display reverse transcription (DDRT)-PCR protocol to detect ecdysteroid-regulated gene expression during a critical phase of caste development. We identified a Ftz-F1 homolog and a Cut-like transcript. Ftz-F1 could be a putative element of the metamorphic ecdysone response cascade of bees, whereas Cut-like proteins are described as transcription factors involved in maintaining cellular differentiation states. The downregulation of both factors can be interpreted as steps in the metamorphic degradation of ovarioles in worker-bee ovaries.

  18. Deciphering the transcriptional cis-regulatory code.

    PubMed

    Yáñez-Cuna, J Omar; Kvon, Evgeny Z; Stark, Alexander

    2013-01-01

    Information about developmental gene expression resides in defined regulatory elements, called enhancers, in the non-coding part of the genome. Although cells reliably utilize enhancers to orchestrate gene expression, a cis-regulatory code that would allow their interpretation has remained one of the greatest challenges of modern biology. In this review, we summarize studies from the past three decades that describe progress towards revealing the properties of enhancers and discuss how recent approaches are providing unprecedented insights into regulatory elements in animal genomes. Over the next years, we believe that the functional characterization of regulatory sequences in entire genomes, combined with recent computational methods, will provide a comprehensive view of genomic regulatory elements and their building blocks and will enable researchers to begin to understand the sequence basis of the cis-regulatory code. Copyright © 2012 Elsevier Ltd. All rights reserved.

  19. SpliceCenter: A suite of web-based bioinformatic applications for evaluating the impact of alternative splicing on RT-PCR, RNAi, microarray, and peptide-based studies

    PubMed Central

    Ryan, Michael C; Zeeberg, Barry R; Caplen, Natasha J; Cleland, James A; Kahn, Ari B; Liu, Hongfang; Weinstein, John N

    2008-01-01

    Background Over 60% of protein-coding genes in vertebrates express mRNAs that undergo alternative splicing. The resulting collection of transcript isoforms poses significant challenges for contemporary biological assays. For example, RT-PCR validation of gene expression microarray results may be unsuccessful if the two technologies target different splice variants. Effective use of sequence-based technologies requires knowledge of the specific splice variant(s) that are targeted. In addition, the critical roles of alternative splice forms in biological function and in disease suggest that assay results may be more informative if analyzed in the context of the targeted splice variant. Results A number of contemporary technologies are used for analyzing transcripts or proteins. To enable investigation of the impact of splice variation on the interpretation of data derived from those technologies, we have developed SpliceCenter. SpliceCenter is a suite of user-friendly, web-based applications that includes programs for analysis of RT-PCR primer/probe sets, effectors of RNAi, microarrays, and protein-targeting technologies. Both interactive and high-throughput implementations of the tools are provided. The interactive versions of SpliceCenter tools provide visualizations of a gene's alternative transcripts and probe target positions, enabling the user to identify which splice variants are or are not targeted. The high-throughput batch versions accept user query files and provide results in tabular form. When, for example, we used SpliceCenter's batch siRNA-Check to process the Cancer Genome Anatomy Project's large-scale shRNA library, we found that only 59% of the 50,766 shRNAs in the library target all known splice variants of the target gene, 32% target some but not all, and 9% do not target any currently annotated transcript. 
Conclusion SpliceCenter provides unique, user-friendly applications for assessing the impact of transcript variation on the design and interpretation of RT-PCR, RNAi, gene expression microarrays, antibody-based detection, and mass spectrometry proteomics. The tools are intended for use by bench biologists as well as bioinformaticists. PMID:18638396
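
    The all/some/none breakdown reported for the shRNA library can be illustrated with a small classification sketch. This is not SpliceCenter's implementation or API: the function, the transcript identifiers and the reagent names below are hypothetical, and the sketch assumes that the set of transcripts each reagent matches has already been computed.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def coverage_class(targeted: Iterable[str], annotated: Iterable[str]) -> str:
    """Classify a reagent as targeting 'all', 'some', or 'none' of a gene's
    annotated splice variants."""
    annotated_set: Set[str] = set(annotated)
    hits = set(targeted) & annotated_set
    if not hits:
        return "none"
    if hits == annotated_set:
        return "all"
    return "some"


# Hypothetical annotation: one gene with three annotated transcripts.
annotated_transcripts = {"GENE1": {"tx1", "tx2", "tx3"}}

# Hypothetical shRNAs and the transcripts each one matches.
shrna_hits: Dict[str, Set[str]] = {
    "shRNA_A": {"tx1", "tx2", "tx3"},  # hits every annotated variant
    "shRNA_B": {"tx2"},                # hits one variant only
    "shRNA_C": set(),                  # hits no annotated transcript
}

classes = {
    name: coverage_class(hits, annotated_transcripts["GENE1"])
    for name, hits in shrna_hits.items()
}
summary = Counter(classes.values())
```

    Dividing each count in `summary` by the library size would give percentages comparable in form to the 59%, 32% and 9% figures quoted in the abstract.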

  20. Second-generation inhibitors demonstrate the involvement of p38 mitogen-activated protein kinase in post-transcriptional modulation of inflammatory mediator production in human and rodent airways.

    PubMed

    Birrell, Mark A; Wong, Sissie; McCluskie, Kerryn; Catley, Matthew C; Hardaker, Elizabeth L; Haj-Yahia, Saleem; Belvisi, Maria G

    2006-03-01

    The exact role of p38 mitogen-activated protein kinase (MAPK) in the expression of inflammatory cytokines is not clear; it may regulate transcriptionally, post-transcriptionally, translationally, or post-translationally. The involvement of one or more of these mechanisms has been suggested to depend on the particular cytokine, the cell type studied, and the specific stimulus used. Interpretation of some of the published data is further complicated by the use of inhibitors such as 4-(4-fluorophenyl)-2-(4-methylsulfinylphenyl)-5-(4-pyridyl)-1H-imidazole (SB 203580) used at single, high concentrations. The aim of this study was to determine the impact of two second-generation p38 MAPK inhibitors on the expression of a range of inflammatory cytokines at the gene and protein levels in human cultured cells. Similar assessment of the impact of these compounds on inflammatory cytokine expression in a preclinical in vivo model of airway inflammation was performed. The results in THP-1 cells and primary airway macrophages clearly show that protein expression is inhibited at much lower concentrations of inhibitor than are needed to impact on gene expression. In the rodent model, both compounds, at doses that cause maximal inhibition of cellular recruitment, inhibit tumor necrosis factor alpha (TNFalpha) protein production without impacting on nuclear factor kappaB pathway activation or TNFalpha gene expression. In summary, the data shown here demonstrate that, although at high compound concentrations there is some level of transcriptional regulation, the predominant role of p38 MAPK in cytokine production is at the translational level. These data question whether the effect of p38 inhibitors on gene transcription is related to their potential therapeutic role as anti-inflammatory compounds.

  1. Transcriptional Regulation in Ebola Virus: Effects of Gene Border Structure and Regulatory Elements on Gene Expression and Polymerase Scanning Behavior

    PubMed Central

    Brauburger, Kristina; Boehmann, Yannik; Krähling, Verena

    2015-01-01

    ABSTRACT The highly pathogenic Ebola virus (EBOV) has a nonsegmented negative-strand (NNS) RNA genome containing seven genes. The viral genes either are separated by intergenic regions (IRs) of variable length or overlap. The structure of the EBOV gene overlaps is conserved throughout all filovirus genomes and is distinct from that of the overlaps found in other NNS RNA viruses. Here, we analyzed how diverse gene borders and noncoding regions surrounding the gene borders influence transcript levels and govern polymerase behavior during viral transcription. Transcription of overlapping genes in EBOV bicistronic minigenomes followed the stop-start mechanism, similar to that followed by IR-containing gene borders. When the gene overlaps were extended, the EBOV polymerase was able to scan the template in an upstream direction. This polymerase feature seems to be generally conserved among NNS RNA virus polymerases. Analysis of IR-containing gene borders showed that the IR sequence plays only a minor role in transcription regulation. Changes in IR length were generally well tolerated, but specific IR lengths led to a strong decrease in downstream gene expression. Correlation analysis revealed that these effects were largely independent of the surrounding gene borders. Each EBOV gene contains exceptionally long untranslated regions (UTRs) flanking the open reading frame. Our data suggest that the UTRs adjacent to the gene borders are the main regulators of transcript levels. A highly complex interplay between the different cis-acting elements to modulate transcription was revealed for specific combinations of IRs and UTRs, emphasizing the importance of the noncoding regions in EBOV gene expression control. IMPORTANCE Our data extend those from previous analyses investigating the implication of noncoding regions at the EBOV gene borders for gene expression control. 
We show that EBOV transcription is regulated in a highly complex yet not easily predictable manner by a set of interacting cis-active elements. These findings are important not only for the design of recombinant filoviruses but also for the design of other replicon systems widely used as surrogate systems to study the filovirus replication cycle under low biosafety levels. Insights into the complex regulation of EBOV transcription conveyed by noncoding sequences will also help to interpret the importance of mutations that have been detected within these regions, including in isolates of the current outbreak. PMID:26656691

  2. Transcriptional Regulation in Ebola Virus: Effects of Gene Border Structure and Regulatory Elements on Gene Expression and Polymerase Scanning Behavior.

    PubMed

    Brauburger, Kristina; Boehmann, Yannik; Krähling, Verena; Mühlberger, Elke

    2016-02-15

    The highly pathogenic Ebola virus (EBOV) has a nonsegmented negative-strand (NNS) RNA genome containing seven genes. The viral genes either are separated by intergenic regions (IRs) of variable length or overlap. The structure of the EBOV gene overlaps is conserved throughout all filovirus genomes and is distinct from that of the overlaps found in other NNS RNA viruses. Here, we analyzed how diverse gene borders and noncoding regions surrounding the gene borders influence transcript levels and govern polymerase behavior during viral transcription. Transcription of overlapping genes in EBOV bicistronic minigenomes followed the stop-start mechanism, similar to that followed by IR-containing gene borders. When the gene overlaps were extended, the EBOV polymerase was able to scan the template in an upstream direction. This polymerase feature seems to be generally conserved among NNS RNA virus polymerases. Analysis of IR-containing gene borders showed that the IR sequence plays only a minor role in transcription regulation. Changes in IR length were generally well tolerated, but specific IR lengths led to a strong decrease in downstream gene expression. Correlation analysis revealed that these effects were largely independent of the surrounding gene borders. Each EBOV gene contains exceptionally long untranslated regions (UTRs) flanking the open reading frame. Our data suggest that the UTRs adjacent to the gene borders are the main regulators of transcript levels. A highly complex interplay between the different cis-acting elements to modulate transcription was revealed for specific combinations of IRs and UTRs, emphasizing the importance of the noncoding regions in EBOV gene expression control. Our data extend those from previous analyses investigating the implication of noncoding regions at the EBOV gene borders for gene expression control. 
We show that EBOV transcription is regulated in a highly complex yet not easily predictable manner by a set of interacting cis-active elements. These findings are important not only for the design of recombinant filoviruses but also for the design of other replicon systems widely used as surrogate systems to study the filovirus replication cycle under low biosafety levels. Insights into the complex regulation of EBOV transcription conveyed by noncoding sequences will also help to interpret the importance of mutations that have been detected within these regions, including in isolates of the current outbreak. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  3. Considerations for optimal use of postmortem human brains for molecular psychiatry: lessons from schizophrenia.

    PubMed

    Weickert, Cynthia Shannon; Rothmond, Debora A; Purves-Tyson, Tertia D

    2018-01-01

    Schizophrenia is a disabling disease impacting millions of people around the world, for which there is no known cure. Current antipsychotic treatments for schizophrenia mainly target psychotic symptoms, do little to ameliorate social or cognitive deficits, have side-effects that cause weight gain and diabetes, and 30% of people do not respond. Thus, better therapeutics for schizophrenia aimed at the root biologic changes are needed, and discovering the underlying neurobiology is key to this quest. Postmortem brain studies provide the most direct and detailed way to determine the pathophysiology of schizophrenia. This chapter outlines steps that can be taken to ensure the best-quality molecular data from postmortem brain tissue are obtained. In this chapter, we also discuss targeted and high-throughput methods for examining gene and protein expression and some of the strengths and limitations of each method. We briefly consider why gene and protein expression changes may not always concur within brain tissue. We conclude that postmortem brain research that investigates gene and protein expression in well-characterized and matched brain cohorts provides an important foundation to be considered when interpreting data obtained from studies of living schizophrenia patients. Copyright © 2018 Elsevier B.V. All rights reserved.

  4. Integrative sparse principal component analysis of gene expression data.

    PubMed

    Liu, Mengque; Fan, Xinyan; Fang, Kuangnan; Zhang, Qingzhao; Ma, Shuangge

    2017-12-01

    In the analysis of gene expression data, dimension reduction techniques have been extensively adopted. The most popular one is perhaps the PCA (principal component analysis). To generate more reliable and more interpretable results, the SPCA (sparse PCA) technique has been developed. With the "small sample size, high dimensionality" characteristic of gene expression data, the analysis results generated from a single dataset are often unsatisfactory. Under contexts other than dimension reduction, integrative analysis techniques, which jointly analyze the raw data of multiple independent datasets, have been developed and shown to outperform "classic" meta-analysis and other multidatasets techniques and single-dataset analysis. In this study, we conduct integrative analysis by developing the iSPCA (integrative SPCA) method. iSPCA achieves the selection and estimation of sparse loadings using a group penalty. To take advantage of the similarity across datasets and generate more accurate results, we further impose contrasted penalties. Different penalties are proposed to accommodate different data conditions. Extensive simulations show that iSPCA outperforms the alternatives under a wide spectrum of settings. The analysis of breast cancer and pancreatic cancer data further shows iSPCA's satisfactory performance. © 2017 WILEY PERIODICALS, INC.

  5. Using Genome-Wide Expression Profiling to Define Gene Networks Relevant to the Study of Complex Traits: From RNA Integrity to Network Topology

    PubMed Central

    O'Brien, M.A.; Costin, B.N.; Miles, M.F.

    2014-01-01

    Postgenomic studies of the function of genes and their role in disease have now become an area of intense study since efforts to define the raw sequence material of the genome have largely been completed. The use of whole-genome approaches such as microarray expression profiling and, more recently, RNA-sequence analysis of transcript abundance has allowed an unprecedented look at the workings of the genome. However, the accurate derivation of such high-throughput data and their analysis in terms of biological function has been critical to truly leveraging the postgenomic revolution. This chapter will describe an approach that focuses on the use of gene networks to both organize and interpret genomic expression data. Such networks, derived from statistical analysis of large genomic datasets and the application of multiple bioinformatics data resources, potentially allow the identification of key control elements for networks associated with human disease, and thus may lead to derivation of novel therapeutic approaches. However, as discussed in this chapter, the leveraging of such networks cannot occur without a thorough understanding of the technical and statistical factors influencing the derivation of genomic expression data. Thus, while the catch phrase may be “it's the network … stupid,” the understanding of factors extending from RNA isolation to genomic profiling technique, multivariate statistics, and bioinformatics are all critical to defining fully useful gene networks for study of complex biology. PMID:23195313

  6. The cure: design and evaluation of a crowdsourcing game for gene selection for breast cancer survival prediction.

    PubMed

    Good, Benjamin M; Loguercio, Salvatore; Griffith, Obi L; Nanis, Max; Wu, Chunlei; Su, Andrew I

    2014-07-29

    Molecular signatures for predicting breast cancer prognosis could greatly improve care through personalization of treatment. Computational analyses of genome-wide expression datasets have identified such signatures, but these signatures leave much to be desired in terms of accuracy, reproducibility, and biological interpretability. Methods that take advantage of structured prior knowledge (eg, protein interaction networks) show promise in helping to define better signatures, but most knowledge remains unstructured. Crowdsourcing via scientific discovery games is an emerging methodology that has the potential to tap into human intelligence at scales and in modes unheard of before. The main objective of this study was to test the hypothesis that knowledge linking expression patterns of specific genes to breast cancer outcomes could be captured from players of an open, Web-based game. We envisioned capturing knowledge both from the player's prior experience and from their ability to interpret text related to candidate genes presented to them in the context of the game. We developed and evaluated an online game called The Cure that captured information from players regarding genes for use as predictors of breast cancer survival. Information gathered from game play was aggregated using a voting approach, and used to create rankings of genes. The top genes from these rankings were evaluated using annotation enrichment analysis, comparison to prior predictor gene sets, and by using them to train and test machine learning systems for predicting 10 year survival. Between its launch in September 2012 and September 2013, The Cure attracted more than 1000 registered players, who collectively played nearly 10,000 games. Gene sets assembled through aggregation of the collected data showed significant enrichment for genes known to be related to key concepts such as cancer, disease progression, and recurrence. 
In terms of the predictive accuracy of models trained using this information, these gene sets provided comparable performance to gene sets generated using other methods, including those used in commercial tests. The Cure is available on the Internet. The principal contribution of this work is to show that crowdsourcing games can be developed as a means to address problems involving domain knowledge. While most prior work on scientific discovery games and crowdsourcing in general takes as a premise that contributors have little or no expertise, here we demonstrated a crowdsourcing system that succeeded in capturing expert knowledge.
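
The voting step described in this abstract can be pictured with a short sketch. It is only an illustration under stated assumptions: the game records, the gene names, and the function `rank_genes_by_votes` are hypothetical, and the abstract does not specify how The Cure actually weighted or aggregated votes.

```python
from collections import Counter


def rank_genes_by_votes(games):
    """Rank genes by the number of games in which they were selected.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    votes = Counter()
    for selected in games:
        # A gene counts at most once per game, even if listed twice.
        votes.update(set(selected))
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical game records: each inner list holds the genes one player picked.
games = [
    ["BRCA1", "TP53", "ESR1"],
    ["TP53", "ERBB2"],
    ["TP53", "BRCA1"],
]
ranking = rank_genes_by_votes(games)
top_genes = [gene for gene, _ in ranking[:2]]
```

Under this simple scheme the genes with the most votes would then be passed on to enrichment analysis or used as features for a survival classifier.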

  7. Defining the limits of flowers: the challenge of distinguishing between the evolutionary products of simple versus compound strobili

    PubMed Central

    Rudall, Paula J.; Bateman, Richard M.

    2010-01-01

    Recent phylogenetic reconstructions suggest that axially condensed flower-like structures evolved iteratively in seed plants from either simple or compound strobili. The simple-strobilus model of flower evolution, widely applied to the angiosperm flower, interprets the inflorescence as a compound strobilus. The conifer cone and the gnetalean ‘flower’ are commonly interpreted as having evolved from a compound strobilus by extreme condensation and (at least in the case of male conifer cones) elimination of some structures present in the presumed ancestral compound strobilus. These two hypotheses have profoundly different implications for reconstructing the evolution of developmental genetic mechanisms in seed plants. If different flower-like structures evolved independently, there should intuitively be little commonality of patterning genes. However, reproductive units of some early-divergent angiosperms, including the extant genus Trithuria (Hydatellaceae) and the extinct genus Archaefructus (Archaefructaceae), apparently combine features considered typical of flowers and inflorescences. We re-evaluate several disparate strands of comparative data to explore whether flower-like structures could have arisen by co-option of flower-expressed patterning genes into independently evolved condensed inflorescences, or vice versa. We discuss the evolution of the inflorescence in both gymnosperms and angiosperms, emphasising the roles of heterotopy in dictating gender expression and heterochrony in permitting internodal compression. PMID:20047867

  8. Defining the limits of flowers: the challenge of distinguishing between the evolutionary products of simple versus compound strobili.

    PubMed

    Rudall, Paula J; Bateman, Richard M

    2010-02-12

    Recent phylogenetic reconstructions suggest that axially condensed flower-like structures evolved iteratively in seed plants from either simple or compound strobili. The simple-strobilus model of flower evolution, widely applied to the angiosperm flower, interprets the inflorescence as a compound strobilus. The conifer cone and the gnetalean 'flower' are commonly interpreted as having evolved from a compound strobilus by extreme condensation and (at least in the case of male conifer cones) elimination of some structures present in the presumed ancestral compound strobilus. These two hypotheses have profoundly different implications for reconstructing the evolution of developmental genetic mechanisms in seed plants. If different flower-like structures evolved independently, there should intuitively be little commonality of patterning genes. However, reproductive units of some early-divergent angiosperms, including the extant genus Trithuria (Hydatellaceae) and the extinct genus Archaefructus (Archaefructaceae), apparently combine features considered typical of flowers and inflorescences. We re-evaluate several disparate strands of comparative data to explore whether flower-like structures could have arisen by co-option of flower-expressed patterning genes into independently evolved condensed inflorescences, or vice versa. We discuss the evolution of the inflorescence in both gymnosperms and angiosperms, emphasising the roles of heterotopy in dictating gender expression and heterochrony in permitting internodal compression.

  9. Wnt, Ptk7, and FGFRL expression gradients control trunk positional identity in planarian regeneration.

    PubMed

    Lander, Rachel; Petersen, Christian P

    2016-04-13

    Mechanisms enabling positional identity re-establishment are likely critical for tissue regeneration. Planarians use Wnt/beta-catenin signaling to polarize the termini of their anteroposterior axis, but little is known about how regeneration signaling restores regionalization along body or organ axes. We identify three genes expressed constitutively in overlapping body-wide transcriptional gradients that control trunk-tail positional identity in regeneration. ptk7 encodes a trunk-expressed kinase-dead Wnt co-receptor, wntP-2 encodes a posterior-expressed Wnt ligand, and ndl-3 encodes an anterior-expressed homolog of conserved FGFRL/nou-darake decoy receptors. ptk7 and wntP-2 maintain and allow appropriate regeneration of trunk tissue position independently of canonical Wnt signaling and with suppression of ndl-3 expression in the posterior. These results suggest that restoration of regional identity in regeneration involves the interpretation and re-establishment of axis-wide transcriptional gradients of signaling molecules.

  10. Classification of a large microarray data set: Algorithm comparison and analysis of drug signatures

    PubMed Central

    Natsoulis, Georges; El Ghaoui, Laurent; Lanckriet, Gert R.G.; Tolley, Alexander M.; Leroy, Fabrice; Dunlea, Shane; Eynon, Barrett P.; Pearson, Cecelia I.; Tugendreich, Stuart; Jarnagin, Kurt

    2005-01-01

    A large gene expression database has been produced that characterizes the gene expression and physiological effects of hundreds of approved and withdrawn drugs, toxicants, and biochemical standards in various organs of live rats. In order to derive useful biological knowledge from this large database, a variety of supervised classification algorithms were compared using a 597-microarray subset of the data. Our studies show that several types of linear classifiers based on Support Vector Machines (SVMs) and Logistic Regression can be used to derive readily interpretable drug signatures with high classification performance. Both methods can be tuned to produce classifiers of drug treatments in the form of short, weighted gene lists which upon analysis reveal that some of the signature genes have a positive contribution (act as “rewards” for the class-of-interest) while others have a negative contribution (act as “penalties”) to the classification decision. The combination of reward and penalty genes enhances performance by keeping the number of false positive treatments low. The results of these algorithms are combined with feature selection techniques that further reduce the length of the drug signatures, an important step towards the development of useful diagnostic biomarkers and low-cost assays. Multiple signatures with no genes in common can be generated for the same classification end-point. Comparison of these gene lists identifies biological processes characteristic of a given class. PMID:15867433
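
The idea of a short, weighted gene list with "reward" and "penalty" genes can be sketched as a linear decision function. The weights, bias, gene names, and expression values below are made up for illustration and are not taken from the study; a real signature would come from a trained linear SVM or logistic regression model.

```python
def signature_score(weights, bias, expression):
    """Linear decision value: bias plus the weighted sum over signature genes."""
    return bias + sum(w * expression.get(gene, 0.0) for gene, w in weights.items())


def split_rewards_penalties(weights):
    """Separate genes with positive weights (rewards) from negative ones (penalties)."""
    rewards = sorted(g for g, w in weights.items() if w > 0)
    penalties = sorted(g for g, w in weights.items() if w < 0)
    return rewards, penalties


# Hypothetical signature and one hypothetical sample (e.g. log-ratio expression).
weights = {"geneA": 1.5, "geneB": 0.5, "geneC": -2.0}
bias = -0.5
sample = {"geneA": 1.0, "geneB": 2.0, "geneC": 0.5}

score = signature_score(weights, bias, sample)
in_class = score > 0  # a positive score assigns the sample to the class of interest
rewards, penalties = split_rewards_penalties(weights)
```

In this sketch a highly expressed penalty gene pulls the score below zero, which is one way such genes could help keep false positives low.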

  11. XGR software for enhanced interpretation of genomic summary data, illustrated by application to immunological traits.

    PubMed

    Fang, Hai; Knezevic, Bogdan; Burnham, Katie L; Knight, Julian C

    2016-12-13

    Biological interpretation of genomic summary data such as those resulting from genome-wide association studies (GWAS) and expression quantitative trait loci (eQTL) studies is one of the major bottlenecks in medical genomics research, calling for efficient and integrative tools to resolve this problem. We introduce eXploring Genomic Relations (XGR), an open source tool designed for enhanced interpretation of genomic summary data enabling downstream knowledge discovery. Targeting users of varying computational skills, XGR utilises prior biological knowledge and relationships in a highly integrated but easily accessible way to make user-input genomic summary datasets more interpretable. We show how by incorporating ontology, annotation, and systems biology network-driven approaches, XGR generates more informative results than conventional analyses. We apply XGR to GWAS and eQTL summary data to explore the genomic landscape of the activated innate immune response and common immunological diseases. We provide genomic evidence for a disease taxonomy supporting the concept of a disease spectrum from autoimmune to autoinflammatory disorders. We also show how XGR can define SNP-modulated gene networks and pathways that are shared and distinct between diseases, how it achieves functional, phenotypic and epigenomic annotations of genes and variants, and how it enables exploring annotation-based relationships between genetic variants. XGR provides a single integrated solution to enhance interpretation of genomic summary data for downstream biological discovery. XGR is released as both an R package and a web-app, freely available at http://galahad.well.ox.ac.uk/XGR .

  12. Upregulation of inflammatory gene transcripts in periosteum of chronic migraineurs: implications to extracranial origin of headache

    PubMed Central

    Perry, Carlton; Blake, Pamela; Buettner, Catherine; Papavassiliou, Efstathios; Schain, Aaron; Bhasin, Manoj; Burstein, Rami

    2016-01-01

    Objective Chronic migraine (CM) is often associated with chronic tenderness of pericranial muscles. In fact, a distinct increase in muscle tenderness prior to onset of occipital headache that eventually progresses into a full blown migraine attack is common. This experience raises the possibility that some CM attacks originate outside the cranium. The objective of this study was to determine whether there are extracranial pathophysiologies in these headaches. Methods We biopsied and measured the expression of gene transcripts (mRNA) encoding proteins that play roles in immune and inflammatory responses in affected (i.e., where the head hurts) calvarial periosteum of (a) patients whose CMs are associated with muscle tenderness and (b) patients with no history of headache. Results Expression of proinflammatory genes (e.g., CCL8, TLR2) in the calvarial periosteum significantly increases in CM patients attesting to muscle tenderness, whereas expression of genes that suppress inflammation and immune cell differentiation (e.g., IL10RA, CSF1R) decreased. Interpretation Because the up-regulated genes were linked to activation of white blood cells, production of cytokines, and inhibition of NFKB, and the down-regulated genes linked to prevention of macrophage activation and cell lysis, we suggest that the molecular environment surrounding periosteal pain fibers is inflamed and in turn activates trigeminovascular nociceptors that reach the affected periosteum through suture branches of intracranial meningeal nociceptors and/or somatic branches of the occipital nerve. This study provides the first set of evidence for localized extracranial pathophysiology in chronic migraine. PMID:27091721

  13. Identification of Chemosensory Genes Based on the Transcriptomic Analysis of Six Different Chemosensory Organs in Spodoptera exigua.

    PubMed

    Zhang, Ya-Nan; Qian, Jia-Li; Xu, Ji-Wei; Zhu, Xiu-Yun; Li, Meng-Ya; Xu, Xiao-Xue; Liu, Chun-Xiang; Xue, Tao; Sun, Liang

    2018-01-01

    Insects have a complex chemosensory system that accurately perceives external chemicals and plays a pivotal role in many insect life activities. Thus, the study of the chemosensory mechanism has become an important research topic in entomology. Spodoptera exigua Hübner (Lepidoptera: Noctuidae) is a major agricultural polyphagous pest that causes significant agricultural economic losses worldwide. However, except for a few genes that have been discovered, its olfactory and gustatory mechanisms remain uncertain. In the present study, we acquired 144,479 unigenes of S. exigua by assembling 65.81 giga base reads from 6 chemosensory organs (female and male antennae, female and male proboscises, and female and male labial palps), and identified many differentially expressed genes in the gustatory and olfactory organs. Analysis of the transcriptome data obtained 159 putative chemosensory genes, including 24 odorant binding proteins (OBPs; 3 were new), 19 chemosensory proteins (4 were new), 64 odorant receptors (57 were new), 22 ionotropic receptors (16 were new), and 30 new gustatory receptors. Phylogenetic analyses of all genes and SexiGRs expression patterns using quantitative real-time polymerase chain reactions were investigated. Our results found that several of these genes had differential expression features in the olfactory organs compared to the gustatory organs that might play crucial roles in the chemosensory system of S. exigua , and could be utilized as targets for future functional studies to assist in the interpretation of the molecular mechanism of the system. They could also be used for developing novel behavioral disturbance agents to control the population of the moths in the future.

  14. Identification of Chemosensory Genes Based on the Transcriptomic Analysis of Six Different Chemosensory Organs in Spodoptera exigua

    PubMed Central

    Zhang, Ya-Nan; Qian, Jia-Li; Xu, Ji-Wei; Zhu, Xiu-Yun; Li, Meng-Ya; Xu, Xiao-Xue; Liu, Chun-Xiang; Xue, Tao; Sun, Liang

    2018-01-01

    Insects have a complex chemosensory system that accurately perceives external chemicals and plays a pivotal role in many insect life activities. Thus, the study of the chemosensory mechanism has become an important research topic in entomology. Spodoptera exigua Hübner (Lepidoptera: Noctuidae) is a major agricultural polyphagous pest that causes significant agricultural economic losses worldwide. However, except for a few genes that have been discovered, its olfactory and gustatory mechanisms remain uncertain. In the present study, we acquired 144,479 unigenes of S. exigua by assembling 65.81 giga base reads from 6 chemosensory organs (female and male antennae, female and male proboscises, and female and male labial palps), and identified many differentially expressed genes in the gustatory and olfactory organs. Analysis of the transcriptome data obtained 159 putative chemosensory genes, including 24 odorant binding proteins (OBPs; 3 were new), 19 chemosensory proteins (4 were new), 64 odorant receptors (57 were new), 22 ionotropic receptors (16 were new), and 30 new gustatory receptors. Phylogenetic analyses of all genes and SexiGRs expression patterns using quantitative real-time polymerase chain reactions were investigated. Our results found that several of these genes had differential expression features in the olfactory organs compared to the gustatory organs that might play crucial roles in the chemosensory system of S. exigua, and could be utilized as targets for future functional studies to assist in the interpretation of the molecular mechanism of the system. They could also be used for developing novel behavioral disturbance agents to control the population of the moths in the future. PMID:29740343

  15. MAVTgsa: An R Package for Gene Set (Enrichment) Analysis

    DOE PAGES

    Chien, Chih-Yi; Chang, Ching-Wei; Tsai, Chen-An; ...

    2014-01-01

    Gene set analysis methods aim to determine whether an a priori defined set of genes shows statistically significant difference in expression on either categorical or continuous outcomes. Although many methods for gene set analysis have been proposed, a systematic analysis tool for identification of different types of gene set significance modules has not been developed previously. This work presents an R package, called MAVTgsa, which includes three different methods for integrated gene set enrichment analysis. (1) The one-sided OLS (ordinary least squares) test detects coordinated changes of genes in a gene set in one direction, either up- or downregulation. (2) The two-sided MANOVA (multivariate analysis of variance) detects changes in both up- and downregulation for studying two or more experimental conditions. (3) A random forests-based procedure identifies gene sets that can accurately predict samples from different experimental conditions or are associated with the continuous phenotypes. MAVTgsa computes the P values and FDR (false discovery rate) q-values for all gene sets in the study. Furthermore, MAVTgsa provides several visualization outputs to support and interpret the enrichment results. This package is available online.
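
To make the notion of FDR-adjusted values for a collection of gene sets concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment. This is an assumption for illustration: the abstract does not say which FDR procedure MAVTgsa uses, the function `bh_adjust` is hypothetical, and the p-values are invented.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four gene sets.
pvals = [0.01, 0.04, 0.03, 0.20]
adjusted = bh_adjust(pvals)
significant = [i for i, q in enumerate(adjusted) if q <= 0.05]
```

With these made-up inputs only the first gene set would pass a 0.05 threshold after adjustment, even though three raw p-values are below 0.05.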

  16. A Scalable Approach for Discovering Conserved Active Subnetworks across Species

    PubMed Central

    Verfaillie, Catherine M.; Hu, Wei-Shou; Myers, Chad L.

    2010-01-01

    Overlaying differential changes in gene expression on protein interaction networks has proven to be a useful approach to interpreting the cell's dynamic response to a changing environment. Despite successes in finding active subnetworks in the context of a single species, the idea of overlaying lists of differentially expressed genes on networks has not yet been extended to support the analysis of multiple species' interaction networks. To address this problem, we designed a scalable, cross-species network search algorithm, neXus (Network - cross(X)-species - Search), that discovers conserved, active subnetworks based on parallel differential expression studies in multiple species. Our approach leverages functional linkage networks, which provide more comprehensive coverage of functional relationships than physical interaction networks by combining heterogeneous types of genomic data. We applied our cross-species approach to identify conserved modules that are differentially active in stem cells relative to differentiated cells based on parallel gene expression studies and functional linkage networks from mouse and human. We find hundreds of conserved active subnetworks enriched for stem cell-associated functions such as cell cycle, DNA repair, and chromatin modification processes. Using a variation of this approach, we also find a number of species-specific networks, which likely reflect mechanisms of stem cell function that have diverged between mouse and human. We assess the statistical significance of the subnetworks by comparing them with subnetworks discovered on random permutations of the differential expression data. We also describe several case examples that illustrate the utility of comparative analysis of active subnetworks. PMID:21170309
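
The permutation-based significance check mentioned above can be sketched in simplified form. Note the assumptions: this version scores a fixed, hypothetical subnetwork by its mean differential-expression score and shuffles scores across genes, whereas the abstract describes rediscovering subnetworks on each permuted dataset. The function name, gene names, and scores are illustrative only.

```python
import random


def permutation_pvalue(scores, subnetwork, n_perm=1000, seed=0):
    """Empirical p-value for the mean score of a subnetwork.

    Scores are shuffled across genes; the p-value is the fraction of
    permutations whose subnetwork mean is at least the observed mean,
    with the usual +1 correction so it is never exactly zero.
    """
    rng = random.Random(seed)
    genes = list(scores)
    values = [scores[g] for g in genes]
    observed = sum(scores[g] for g in subnetwork) / len(subnetwork)
    hits = 0
    for _ in range(n_perm):
        shuffled = values[:]
        rng.shuffle(shuffled)
        permuted = dict(zip(genes, shuffled))
        stat = sum(permuted[g] for g in subnetwork) / len(subnetwork)
        if stat >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical differential-expression scores for ten genes.
scores = {f"g{i}": float(i) for i in range(10)}
p_active = permutation_pvalue(scores, ["g8", "g9"])    # highest-scoring pair
p_inactive = permutation_pvalue(scores, ["g0", "g1"])  # lowest-scoring pair
```

A subnetwork made of the highest-scoring genes should receive a small empirical p-value, while one made of the lowest-scoring genes cannot beat any permutation and gets a p-value of 1.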

  17. Gene transcription ontogeny of hypothalamic-pituitary-thyroid axis development in early-life stage fathead minnow and zebrafish.

    PubMed

    Vergauwen, Lucia; Cavallin, Jenna E; Ankley, Gerald T; Bars, Chloé; Gabriëls, Isabelle J; Michiels, Ellen D G; Fitzpatrick, Krysta R; Periz-Stanacev, Jelena; Randolph, Eric C; Robinson, Serina L; Saari, Travis W; Schroeder, Anthony L; Stinckens, Evelyn; Swintek, Joe; Van Cruchten, Steven J; Verbueken, Evy; Villeneuve, Daniel L; Knapen, Dries

    2018-05-04

    The hypothalamic-pituitary-thyroid (HPT) axis is known to play a crucial role in the development of teleost fish. However, knowledge of endogenous transcription profiles of thyroid-related genes in developing teleosts remains fragmented. We selected two model teleost species, the fathead minnow (Pimephales promelas) and the zebrafish (Danio rerio), to compare the gene transcription ontogeny of the HPT axis. Control organisms were sampled at several time points during embryonic and larval development until 33 days post-fertilization. Total RNA was extracted from pooled, whole fish, and thyroid-related mRNA expression was evaluated using quantitative polymerase chain reaction. Gene transcripts examined included: thyrotropin-releasing hormone receptor (trhr), thyroid-stimulating hormone receptor (tshr), sodium-iodide symporter (nis), thyroid peroxidase (tpo), thyroglobulin (tg), transthyretin (ttr), deiodinases 1, 2, 3a, and 3b (dio1, dio2, dio3a and 3b), and thyroid hormone receptors alpha and beta (thrα and β). A loess regression method was successful in identifying maxima and minima of transcriptional expression during early development of both species. Overall, we observed great similarities between the species, including maternal transfer, at least to some extent, of almost all transcripts (confirmed in unfertilized eggs), increasing expression of most transcripts during hatching and embryo-larval transition, and indications of a fully functional HPT axis in larvae. These data will aid in the development of hypotheses on the role of certain genes and pathways during development. Furthermore, this provides a background reference dataset for designing and interpreting targeted transcriptional expression studies both for fundamental research and for applications such as toxicology. Copyright © 2018 Elsevier Inc. All rights reserved.

  18. Glutathione S-transferase omega genes in Alzheimer and Parkinson disease risk, age-at-diagnosis and brain gene expression: an association study with mechanistic implications.

    PubMed

    Allen, Mariet; Zou, Fanggeng; Chai, High Seng; Younkin, Curtis S; Miles, Richard; Nair, Asha A; Crook, Julia E; Pankratz, V Shane; Carrasquillo, Minerva M; Rowley, Christopher N; Nguyen, Thuy; Ma, Li; Malphrus, Kimberly G; Bisceglio, Gina; Ortolaza, Alexandra I; Palusak, Ryan; Middha, Sumit; Maharjan, Sooraj; Georgescu, Constantin; Schultz, Debra; Rakhshan, Fariborz; Kolbert, Christopher P; Jen, Jin; Sando, Sigrid B; Aasly, Jan O; Barcikowska, Maria; Uitti, Ryan J; Wszolek, Zbigniew K; Ross, Owen A; Petersen, Ronald C; Graff-Radford, Neill R; Dickson, Dennis W; Younkin, Steven G; Ertekin-Taner, Nilüfer

    2012-04-11

    Glutathione S-transferase omega-1 and 2 genes (GSTO1, GSTO2), residing within an Alzheimer and Parkinson disease (AD and PD) linkage region, have diverse functions including mitigation of oxidative stress and may underlie the pathophysiology of both diseases. GSTO polymorphisms were previously reported to associate with risk and age-at-onset of these diseases, although inconsistent follow-up study designs make interpretation of results difficult. We assessed two previously reported SNPs, GSTO1 rs4925 and GSTO2 rs156697, in AD (3,493 ADs vs. 4,617 controls) and PD (678 PDs vs. 712 controls) for association with disease risk (case-controls), age-at-diagnosis (cases) and brain gene expression levels (autopsied subjects). We found that rs156697 minor allele associates with significantly increased risk (odds ratio = 1.14, p = 0.038) in the older ADs with age-at-diagnosis > 80 years. The minor allele of GSTO1 rs4925 associates with decreased risk in familial PD (odds ratio = 0.78, p = 0.034). There was no other association with disease risk or age-at-diagnosis. The minor alleles of both GSTO SNPs associate with lower brain levels of GSTO2 (p = 4.7 × 10^-11 to 1.9 × 10^-27), but not GSTO1. Pathway analysis of significant genes in our brain expression GWAS identified significant enrichment for glutathione metabolism genes (p = 0.003). These results suggest that GSTO locus variants may lower brain GSTO2 levels and consequently confer AD risk in older age. Other glutathione metabolism genes should be assessed for their effects on AD and other chronic, neurologic diseases.
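    For readers unfamiliar with the odds ratios quoted above, the sketch below shows how an odds ratio and an approximate 95% confidence interval can be computed from a 2x2 allele-count table. The abstract does not state how its estimates were obtained (they may come from adjusted regression models), so this is an illustration only; the function name and the counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio(case_minor, case_major, ctrl_minor, ctrl_major, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval.

    All four arguments are allele counts and must be positive.
    Returns (odds_ratio, lower_bound, upper_bound).
    """
    or_value = (case_minor * ctrl_major) / (case_major * ctrl_minor)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / case_minor + 1 / case_major + 1 / ctrl_minor + 1 / ctrl_major)
    lower = math.exp(math.log(or_value) - z * se)
    upper = math.exp(math.log(or_value) + z * se)
    return or_value, lower, upper


# Hypothetical counts: minor/major alleles in cases, then in controls.
or_value, lower, upper = odds_ratio(20, 80, 10, 90)
```

    An odds ratio above 1 (such as the 1.14 reported for rs156697) indicates higher odds of disease with the minor allele; a value below 1 (such as 0.78 for rs4925) indicates lower odds.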

  19. A CRISPR-Cas9 Generated MDCK Cell Line Expressing Human MDR1 Without Endogenous Canine MDR1 (cABCB1): An Improved Tool for Drug Efflux Studies.

    PubMed

    Karlgren, Maria; Simoff, Ivailo; Backlund, Maria; Wegler, Christine; Keiser, Markus; Handin, Niklas; Müller, Janett; Lundquist, Patrik; Jareborg, Anne-Christine; Oswald, Stefan; Artursson, Per

    2017-09-01

    Madin-Darby canine kidney (MDCK) II cells stably transfected with transport proteins are commonly used models for drug transport studies. However, endogenous expression of especially canine MDR1 (cMDR1) confounds the interpretation of such studies. Here we have established an MDCK cell line stably overexpressing the human MDR1 transporter (hMDR1; P-glycoprotein), and used CRISPR-Cas9 gene editing to knock out the endogenous cMDR1. Genomic screening revealed the generation of a clonal cell line homozygous for a 4-nucleotide deletion in the canine ABCB1 gene leading to a frameshift and a premature stop codon. Knockout of cMDR1 expression was verified by quantitative protein analysis and functional studies showing retained activity of the human MDR1 transporter. Application of this cell line allowed unbiased reclassification of drugs previously defined as both substrates and non-substrates in different studies using commonly used MDCK-MDR1 clones. Our new MDCK-hMDR1 cell line, together with a previously developed control cell line, both with identical deletions in the canine ABCB1 gene and lack of cMDR1 expression, represent excellent in vitro tools for use in drug discovery. Copyright © 2017 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.

  20. Examination of the Abscission-Associated Transcriptomes for Soybean, Tomato, and Arabidopsis Highlights the Conserved Biosynthesis of an Extensible Extracellular Matrix and Boundary Layer.

    PubMed

    Kim, Joonyup; Sundaresan, Srivignesh; Philosoph-Hadas, Sonia; Yang, Ronghui; Meir, Shimon; Tucker, Mark L

    2015-01-01

    Abscission zone (AZ) development and the progression of abscission (detachment of plant organs) have been roughly separated into four stages: first, AZ differentiation; second, competence to respond to abscission signals; third, activation of abscission; and fourth, formation of a protective layer and post-abscission trans-differentiation. Stage three, activation of abscission, is when changes in the cell wall and extracellular matrix occur to support successful organ separation. Most abscission research has focused on gene expression for enzymes that disassemble the cell wall within the AZ and changes in phytohormones and other signaling events that regulate their expression. Here, transcriptome data for soybean, tomato and Arabidopsis were examined and compared with a focus not only on genes associated with disassembly of the cell wall but also on gene expression linked to the biosynthesis of a new extracellular matrix. AZ-specific up-regulation of genes associated with cell wall disassembly including cellulases (beta-1,4-endoglucanases, CELs), polygalacturonases (PGs), and expansins (EXPs) were much as expected; however, curiously, changes in expression of xyloglucan endotransglucosylase/hydrolases (XTHs) were not AZ-specific in soybean. Unexpectedly, we identified an early increase in the expression of genes underlying the synthesis of a waxy-like cuticle. Based on the expression data, we propose that the early up-regulation of an abundance of small pathogenesis-related (PR) genes is more closely linked to structural changes in the extracellular matrix of separating cells than an enzymatic role in pathogen resistance. Furthermore, these observations led us to propose that, in addition to cell wall loosening enzymes, abscission requires (or is enhanced by) biosynthesis and secretion of small proteins (15-25 kDa) and waxes that form an extensible extracellular matrix and boundary layer on the surface of separating cells. 
    The synthesis of the boundary layer precedes what is typically associated with the post-abscission synthesis of a protective scar over the fracture plane. This modification in the abscission model is discussed in regard to how it influences our interpretation of the role of multiple abscission signals.

  1. High natural gene expression variation in the reef-building coral Acropora millepora: potential for acclimative and adaptive plasticity

    PubMed Central

    2013-01-01

    Background Ecosystems worldwide are suffering the consequences of anthropogenic impact. The diverse ecosystem of coral reefs, for example, are globally threatened by increases in sea surface temperatures due to global warming. Studies to date have focused on determining genetic diversity, the sequence variability of genes in a species, as a proxy to estimate and predict the potential adaptive response of coral populations to environmental changes linked to climate changes. However, the examination of natural gene expression variation has received less attention. This variation has been implicated as an important factor in evolutionary processes, upon which natural selection can act. Results We acclimatized coral nubbins from six colonies of the reef-building coral Acropora millepora to a common garden in Heron Island (Great Barrier Reef, GBR) for a period of four weeks to remove any site-specific environmental effects on the physiology of the coral nubbins. By using a cDNA microarray platform, we detected a high level of gene expression variation, with 17% (488) of the unigenes differentially expressed across coral nubbins of the six colonies (jsFDR-corrected, p < 0.01). Among the main categories of biological processes found differentially expressed were transport, translation, response to stimulus, oxidation-reduction processes, and apoptosis. We found that the transcriptional profiles did not correspond to the genotype of the colony characterized using either an intron of the carbonic anhydrase gene or microsatellite loci markers. Conclusion Our results provide evidence of the high inter-colony variation in A. millepora at the transcriptomic level grown under a common garden and without a correspondence with genotypic identity. This finding brings to our attention the importance of taking into account natural variation between reef corals when assessing experimental gene expression differences. 
    The high transcriptional variation detected in this study is interpreted and discussed within the context of adaptive potential and phenotypic plasticity of reef corals. Whether this variation will allow coral reefs to survive to current challenges remains unknown. PMID:23565725
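    The abstract above reports differential expression at an FDR-corrected threshold of p < 0.01 without describing the correction in detail. As a generic illustration of false discovery rate control, the sketch below implements the standard Benjamini-Hochberg step-up adjustment. This is a swapped-in textbook procedure, not necessarily the "jsFDR" correction the authors used, and the p-values are hypothetical.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values from a differential expression test.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [i for i, p in enumerate(adj_p) if p < 0.03]
```

    Genes whose adjusted p-value falls below the chosen threshold are called differentially expressed; the expected share of false positives among them is bounded by that threshold.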

  2. MetNet: Software to Build and Model the Biogenetic Lattice of Arabidopsis

    DOE PAGES

    Wurtele, Eve Syrkin; Li, Jie; Diao, Lixia; ...

    2003-01-01

    MetNet (http://www.botany.iastate.edu/∼mash/metnetex/metabolicnetex.html) is publicly available software in development for analysis of genome-wide RNA, protein and metabolite profiling data. The software is designed to enable the biologist to visualize, statistically analyse and model a metabolic and regulatory network map of Arabidopsis, combined with gene expression profiling data. It contains a JAVA interface to an interactions database (MetNetDB) containing information on regulatory and metabolic interactions derived from a combination of web databases (TAIR, KEGG, BRENDA) and input from biologists in their area of expertise. FCModeler captures input from MetNetDB in a graphical form. Sub-networks can be identified and interpreted using simple fuzzy cognitive maps. FCModeler is intended to develop and evaluate hypotheses, and provide a modelling framework for assessing the large amounts of data captured by high-throughput gene expression experiments. FCModeler and MetNetDB are currently being extended to three-dimensional virtual reality display. The MetNet map, together with gene expression data, can be viewed using multivariate graphics tools in GGobi linked with the data analytic tools in R. Users can highlight different parts of the metabolic network and see the relevant expression data highlighted in other data plots. Multi-dimensional expression data can be rotated through different dimensions. Statistical analysis can be computed alongside the visual. MetNet is designed to provide a framework for the formulation of testable hypotheses regarding the function of specific genes, and in the long term provide the basis for identification of metabolic and regulatory networks that control plant composition and development.

  3. Informatics approaches in the Biological Characterization of ...

    EPA Pesticide Factsheets

    Adverse Outcome Pathways (AOPs) are a conceptual framework to characterize toxicity pathways by a series of mechanistic steps from a molecular initiating event to population outcomes. This framework helps to direct risk assessment research, for example by aiding in computational prioritization of chemicals, genes, and tissues relevant to an adverse health outcome. We have designed and implemented a computational workflow to access a wealth of public data relating genes, chemicals, diseases, pathways, and species, to provide a biological context for putative AOPs. We selected three AOP case studies: ER/Aromatase Antagonism Leading to Reproductive Dysfunction, AHR1 Activation Leading to Cardiotoxicity, and AChE Inhibition Leading to Acute Mortality, and deduced a taxonomic range of applicability for each AOP. We developed computational tools to automatically access and analyze the pathway activity of AOP-relevant protein orthologs, finding broad similarity among vertebrate species for the ER/Aromatase and AHR1 AOPs, and similarity extending to invertebrate animal species for AChE inhibition. Additionally, we used public gene expression data to find groups of highly co-expressed genes, and compared those groups across organisms. To interpret these findings at a higher level of biological organization, we created the AOPdb, a relational database that mines results from sources including NCBI, KEGG, Reactome, CTD, and OMIM. This multi-source database connects genes,

  4. Application of the laser capture microdissection technique for molecular definition of skeletal cell differentiation in vivo.

    PubMed

    Benayahu, Dafna; Socher, Rina; Shur, Irena

    2008-01-01

    Laser capture microdissection (LCM) method allows selection of individual or clustered cells from intact tissues. This technology enables one to pick cells from tissues that are difficult to study individually, sort the anatomical complexity of these tissues, and make the cells available for molecular analyses. Following the cells' extraction, the nucleic acids and proteins can be isolated and used for multiple applications that provide an opportunity to uncover the molecular control of cellular fate in the natural microenvironment. Utilization of LCM for the molecular analysis of cells from skeletal tissues will enable one to study differential patterns of gene expression in the native intact skeletal tissue with reliable interpretation of function for known genes as well as to discover novel genes. Variability between samples may be caused either by differences in the tissue samples (different areas isolated from the same section) or some variances in sample handling. LCM is a multi-task technology that combines histology, microscopy work, and dedicated molecular biology. The LCM application will provide results that will pave the way toward high throughput profiling of tissue-specific gene expression using Gene Chip arrays. Detailed description of in vivo molecular pathways will make it possible to elaborate on control systems to apply for the repair of genetic or metabolic diseases of skeletal tissues.

  5. Evidence for a Common Toolbox Based on Necrotrophy in a Fungal Lineage Spanning Necrotrophs, Biotrophs, Endophytes, Host Generalists and Specialists

    PubMed Central

    Andrew, Marion; Barua, Reeta; Short, Steven M.; Kohn, Linda M.

    2012-01-01

    The Sclerotiniaceae (Ascomycotina, Leotiomycetes) is a relatively recently evolved lineage of necrotrophic host generalists, and necrotrophic or biotrophic host specialists, some latent or symptomless. We hypothesized that they inherited a basic toolbox of genes for plant symbiosis from their common ancestor. Maintenance and evolutionary diversification of symbiosis could require selection on toolbox genes or on timing and magnitude of gene expression. The genes studied were chosen because their products have been previously investigated as pathogenicity factors in the Sclerotiniaceae. They encode proteins associated with cell wall degradation: acid protease 1 (acp1), aspartyl protease (asps), and polygalacturonases (pg1, pg3, pg5, pg6), and the oxalic acid (OA) pathway: a zinc finger transcription factor (pac1), and oxaloacetate acetylhydrolase (oah), catalyst in OA production, essential for full symptom production in Sclerotinia sclerotiorum. Site-specific likelihood analyses provided evidence for purifying selection in all 8 pathogenicity-related genes. Consistent with an evolutionary arms race model, positive selection was detected in 5 of 8 genes. Only generalists produced large, proliferating disease lesions on excised Arabidopsis thaliana leaves and oxalic acid by 72 hours in vitro. In planta expression of oah was 10–300 times greater among the necrotrophic host generalists than necrotrophic and biotrophic host specialists; pac1 was not differentially expressed. Ability to amplify 6/8 pathogenicity related genes and produce oxalic acid in all genera are consistent with the common toolbox hypothesis for this gene sample. That our data did not distinguish biotrophs from necrotrophs is consistent with 1) a common toolbox based on necrotrophy and 2) the most conservative interpretation of the 3-locus housekeeping gene phylogeny – a baseline of necrotrophy from which forms of biotrophy emerged at least twice. 
    Early oah overexpression likely expands the host range of necrotrophic generalists in the Sclerotiniaceae, while specialists and biotrophs deploy oah, or other as-yet-unknown toolbox genes, differently. PMID:22253834

  6. Multiclass classification for skin cancer profiling based on the integration of heterogeneous gene expression series.

    PubMed

    Gálvez, Juan Manuel; Castillo, Daniel; Herrera, Luis Javier; San Román, Belén; Valenzuela, Olga; Ortuño, Francisco Manuel; Rojas, Ignacio

    2018-01-01

    Most of the research studies developed applying microarray technology to the characterization of different pathological states of any disease may fail in reaching statistically significant results. This is largely due to the small repertoire of analysed samples, and to the limitation in the number of states or pathologies usually addressed. Moreover, the influence of potential deviations on the gene expression quantification is usually disregarded. In spite of the continuous changes in omic sciences, reflected for instance in the emergence of new Next-Generation Sequencing-related technologies, the existing availability of a vast amount of gene expression microarray datasets should be properly exploited. Therefore, this work proposes a novel methodological approach involving the integration of several heterogeneous skin cancer series, and a later multiclass classifier design. This approach is thus a way to provide the clinicians with an intelligent diagnosis support tool based on the use of a robust set of selected biomarkers, which simultaneously distinguishes among different cancer-related skin states. To achieve this, a multi-platform combination of microarray datasets from Affymetrix and Illumina manufacturers was carried out. This integration is expected to strengthen the statistical robustness of the study as well as the finding of highly-reliable skin cancer biomarkers. Specifically, the designed operation pipeline has allowed the identification of a small subset of 17 differentially expressed genes (DEGs) from which to distinguish among 7 involved skin states. These genes were obtained from the assessment of a number of potential batch effects on the gene expression data. The biological interpretation of these genes was inspected in the specific literature to understand their underlying information in relation to skin cancer. 
    Finally, in order to assess their possible effectiveness in cancer diagnosis, a cross-validation Support Vector Machines (SVM)-based classification including feature ranking was performed. The accuracy attained exceeded 92% in overall recognition of the 7 different cancer-related skin states. The proposed integration scheme is expected to allow the co-integration with other state-of-the-art technologies such as RNA-seq.

  7. Procedural Semantics as a Theory of Meaning.

    DTIC Science & Technology

    1981-03-01

    Aaron Sloman (none of whom can be held responsible, of course, for the opinions expressed herein). Special thanks are also due to John Lyons for valuable...Meanings 12 7 Parametric Ambiguity 14 8 The Economic Necessity of Ambiguity 16 9 Semantic Interpretation 19 10 Semantics of the Internal Language 21 11...sufficiently low order organisms, the behavioral characteristics of that organism in response to stimuli are essentially "wired in" by their genes

  8. Fundamental limits on dynamic inference from single-cell snapshots

    PubMed Central

    Weinreb, Caleb; Tusi, Betsabeh K.; Socolovsky, Merav

    2018-01-01

    Single-cell expression profiling reveals the molecular states of individual cells with unprecedented detail. Because these methods destroy cells in the process of analysis, they cannot measure how gene expression changes over time. However, some information on dynamics is present in the data: the continuum of molecular states in the population can reflect the trajectory of a typical cell. Many methods for extracting single-cell dynamics from population data have been proposed. However, all such attempts face a common limitation: for any measured distribution of cell states, there are multiple dynamics that could give rise to it, and by extension, multiple possibilities for underlying mechanisms of gene regulation. Here, we describe the aspects of gene expression dynamics that cannot be inferred from a static snapshot alone and identify assumptions necessary to constrain a unique solution for cell dynamics from static snapshots. We translate these constraints into a practical algorithmic approach, population balance analysis (PBA), which makes use of a method from spectral graph theory to solve a class of high-dimensional differential equations. We use simulations to show the strengths and limitations of PBA, and then apply it to single-cell profiles of hematopoietic progenitor cells (HPCs). Cell state predictions from this analysis agree with HPC fate assays reported in several papers over the past two decades. By highlighting the fundamental limits on dynamic inference faced by any method, our framework provides a rigorous basis for dynamic interpretation of a gene expression continuum and clarifies best experimental designs for trajectory reconstruction from static snapshot measurements. PMID:29463712

  9. TransAtlasDB: an integrated database connecting expression data, metadata and variants

    PubMed Central

    Adetunji, Modupeore O; Lamont, Susan J; Schmidt, Carl J

    2018-01-01

    Abstract High-throughput transcriptome sequencing (RNAseq) is the universally applied method for target-free transcript identification and gene expression quantification, generating huge amounts of data. The constraint of accessing such data and interpreting results can be a major impediment in postulating suitable hypotheses, thus an innovative storage solution that addresses these limitations, such as hard disk storage requirements, efficiency and reproducibility, is paramount. By offering a uniform data storage and retrieval mechanism, various data can be compared and easily investigated. We present a sophisticated system, TransAtlasDB, which incorporates a hybrid architecture of both relational and NoSQL databases for fast and efficient data storage, processing and querying of large datasets from transcript expression analysis with corresponding metadata, as well as gene-associated variants (such as SNPs) and their predicted gene effects. TransAtlasDB provides the data model of accurate storage of the large amount of data derived from RNAseq analysis and also methods of interacting with the database, either via the command-line data management workflows, written in Perl, with useful functionalities that simplify the complexity of data storage and possibly manipulation of the massive amounts of data generated from RNAseq analysis, or through the web interface. The database application is currently modeled to handle analysis data from agricultural species, and will be expanded to include more species groups. Overall TransAtlasDB aims to serve as an accessible repository for the large complex results data files derived from RNAseq gene expression profiling and variant analysis. Database URL: https://modupeore.github.io/TransAtlasDB/ PMID:29688361

  10. Altered cortical expression of GABA-related genes in schizophrenia: illness progression vs developmental disturbance.

    PubMed

    Hoftman, Gil D; Volk, David W; Bazmi, H Holly; Li, Siyu; Sampson, Allan R; Lewis, David A

    2015-01-01

    Schizophrenia is a neurodevelopmental disorder with altered expression of GABA-related genes in the prefrontal cortex (PFC). However, whether these gene expression abnormalities reflect disturbances in postnatal developmental processes before clinical onset or arise as a consequence of clinical illness remains unclear. Expression levels for 7 GABA-related transcripts (vesicular GABA transporter [vGAT], GABA membrane transporter [GAT1], GABAA receptor subunit α1 [GABRA1] [novel in human and monkey cohorts], glutamic acid decarboxylase 67 [GAD67], parvalbumin, calretinin, and somatostatin [previously reported in human cohort, but not in monkey cohort]) were quantified in the PFC from 42 matched pairs of schizophrenia and comparison subjects and from 49 rhesus monkeys ranging in age from 1 week postnatal to adulthood. Levels of vGAT and GABRA1, but not of GAT1, messenger RNAs (mRNAs) were lower in the PFC of the schizophrenia subjects. As previously reported, levels of GAD67, parvalbumin, and somatostatin, but not of calretinin, mRNAs were also lower in these subjects. Neither illness duration nor age accounted for the levels of the transcripts with altered expression in schizophrenia. In monkey PFC, developmental changes in expression levels of many of these transcripts were in the opposite direction of the changes observed in schizophrenia. For example, mRNA levels for vGAT, GABRA1, GAD67, and parvalbumin all increased with age. Together with published reports, these findings support the interpretation that the altered expression of GABA-related transcripts in schizophrenia reflects a blunting of normal postnatal development changes, but they cannot exclude a decline during the early stages of clinical illness. © The Author 2013. Published by Oxford University Press on behalf of the Maryland Psychiatric Research Center. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  11. Expression of P53 protein after exposure to ionizing radiation

    NASA Astrophysics Data System (ADS)

    Salazar, A. M.; Salvador, C.; Ruiz-Trejo, C.; Ostrosky, P.; Brandan, M. E.

    2001-10-01

    One of the most important tumor suppressor genes is p53 gene, which is involved in apoptotic cell death, cell differentiation and cell cycle arrest. The expression of p53 gene can be evaluated by determining the presence of P53 protein in cells using Western Blot assay with a chemiluminescent method. This technique has shown variabilities that are due to biological factors. Film developing process can influence the quality of the p53 bands obtained. We irradiated tumor cell lines and human peripheral lymphocytes with 137Cs and 60Co gamma rays to standardize irradiation conditions, to compare ionizing radiation with actinomycin D and to reduce the observed variability of P53 protein induction levels. We found that increasing radiation doses increase P53 protein induction while it decreases viability. We also conclude that ionizing radiation could serve as a positive control for Western Blot analysis of protein P53. In addition, our results show that the developing process may play an important role in the quality of P53 protein bands and data interpretation.

  12. Predicting Viral Infection From High-Dimensional Biomarker Trajectories

    PubMed Central

    Chen, Minhua; Zaas, Aimee; Woods, Christopher; Ginsburg, Geoffrey S.; Lucas, Joseph; Dunson, David; Carin, Lawrence

    2013-01-01

    There is often interest in predicting an individual’s latent health status based on high-dimensional biomarkers that vary over time. Motivated by time-course gene expression array data that we have collected in two influenza challenge studies performed with healthy human volunteers, we develop a novel time-aligned Bayesian dynamic factor analysis methodology. The time course trajectories in the gene expressions are related to a relatively low-dimensional vector of latent factors, which vary dynamically starting at the latent initiation time of infection. Using a nonparametric cure rate model for the latent initiation times, we allow selection of the genes in the viral response pathway, variability among individuals in infection times, and a subset of individuals who are not infected. As we demonstrate using held-out data, this statistical framework allows accurate predictions of infected individuals in advance of the development of clinical symptoms, without labeled data and even when the number of biomarkers vastly exceeds the number of individuals under study. Biological interpretation of several of the inferred pathways (factors) is provided. PMID:23704802

  13. Cellular GFP Toxicity and Immunogenicity: Potential Confounders in in Vivo Cell Tracking Experiments.

    PubMed

    Ansari, Amir Mehdi; Ahmed, A Karim; Matsangos, Aerielle E; Lay, Frank; Born, Louis J; Marti, Guy; Harmon, John W; Sun, Zhaoli

    2016-10-01

    Green fluorescent protein (GFP), used as a cellular tag, provides researchers with a valuable method of measuring gene expression and cell tracking. However, there is evidence to suggest that the immunogenicity and cytotoxicity of GFP potentially confound the interpretation of in vivo experimental data. Studies have shown that GFP expression can deteriorate over time as GFP-tagged cells are prone to death. Therefore, the cells that were originally marked with GFP do not survive and cannot be accurately traced over time. This review will present current evidence for the immunogenicity and cytotoxicity of GFP in in vivo studies by characterizing these responses.

  14. Molecular mechanisms underlying variations in lung function: a systems genetics analysis

    PubMed Central

    Obeidat, Ma’en; Hao, Ke; Bossé, Yohan; Nickle, David C; Nie, Yunlong; Postma, Dirkje S; Laviolette, Michel; Sandford, Andrew J; Daley, Denise D; Hogg, James C; Elliott, W Mark; Fishbane, Nick; Timens, Wim; Hysi, Pirro G; Kaprio, Jaakko; Wilson, James F; Hui, Jennie; Rawal, Rajesh; Schulz, Holger; Stubbe, Beate; Hayward, Caroline; Polasek, Ozren; Järvelin, Marjo-Riitta; Zhao, Jing Hua; Jarvis, Deborah; Kähönen, Mika; Franceschini, Nora; North, Kari E; Loth, Daan W; Brusselle, Guy G; Smith, Albert Vernon; Gudnason, Vilmundur; Bartz, Traci M; Wilk, Jemma B; O’Connor, George T; Cassano, Patricia A; Tang, Wenbo; Wain, Louise V; Artigas, María Soler; Gharib, Sina A; Strachan, David P; Sin, Don D; Tobin, Martin D; London, Stephanie J; Hall, Ian P; Paré, Peter D

    2016-01-01

    Summary Background Lung function measures reflect the physiological state of the lung, and are essential to the diagnosis of chronic obstructive pulmonary disease (COPD). The SpiroMeta-CHARGE consortium undertook the largest genome-wide association study (GWAS) so far (n=48 201) for forced expiratory volume in 1 s (FEV1) and the ratio of FEV1 to forced vital capacity (FEV1/FVC) in the general population. The lung expression quantitative trait loci (eQTLs) study mapped the genetic architecture of gene expression in lung tissue from 1111 individuals. We used a systems genetics approach to identify single nucleotide polymorphisms (SNPs) associated with lung function that act as eQTLs and change the level of expression of their target genes in lung tissue; termed eSNPs. Methods The SpiroMeta-CHARGE GWAS results were integrated with lung eQTLs to map eSNPs and the genes and pathways underlying the associations in lung tissue. For comparison, a similar analysis was done in peripheral blood. The lung mRNA expression levels of the eSNP-regulated genes were tested for associations with lung function measures in 727 individuals. Additional analyses identified the pleiotropic effects of eSNPs from the published GWAS catalogue, and mapped enrichment in regulatory regions from the ENCODE project. Finally, the Connectivity Map database was used to identify potential therapeutics in silico that could reverse the COPD lung tissue gene signature. Findings SNPs associated with lung function measures were more likely to be eQTLs and vice versa. The integration mapped the specific genes underlying the GWAS signals in lung tissue. The eSNP-regulated genes were enriched for developmental and inflammatory pathways; by comparison, SNPs associated with lung function that were eQTLs in blood, but not in lung, were only involved in inflammatory pathways. 
Lung function eSNPs were enriched for regulatory elements and were over-represented among genes showing differential expression during fetal lung development. An mRNA gene expression signature for COPD was identified in lung tissue and compared with the Connectivity Map. This in-silico drug repurposing approach suggested several compounds that reverse the COPD gene expression signature, including a nicotine receptor antagonist. These findings represent novel therapeutic pathways for COPD. Interpretation The systems genetics approach identified lung tissue genes driving the variation in lung function and susceptibility to COPD. The identification of these genes and the pathways in which they are enriched is essential to understand the pathophysiology of airway obstruction and to identify novel therapeutic targets and biomarkers for COPD, including drugs that reverse the COPD gene signature in silico. Funding The research reported in this article was not specifically funded by any agency. See Acknowledgments for a full list of funders of the lung eQTL study and the SpiroMeta-CHARGE GWAS. PMID:26404118

  15. Identification of Differentially Expressed Genes through Integrated Study of Alzheimer's Disease Affected Brain Regions.

    PubMed

    Puthiyedth, Nisha; Riveros, Carlos; Berretta, Regina; Moscato, Pablo

    2016-01-01

    Alzheimer's disease (AD) is the most common form of dementia in older adults that damages the brain and results in impaired memory, thinking and behaviour. The identification of differentially expressed genes and related pathways among affected brain regions can provide more information on the mechanisms of AD. In the past decade, several studies have reported many genes that are associated with AD. This wealth of information has become difficult to follow and interpret as most of the results are conflicting. In that case, it is worth doing an integrated study of multiple datasets that helps to increase the total number of samples and the statistical power in detecting biomarkers. In this study, we present an integrated analysis of five different brain region datasets and introduce new genes that warrant further investigation. The aim of our study is to apply a novel combinatorial optimisation based meta-analysis approach to identify differentially expressed genes that are associated with AD across brain regions. In this study, microarray gene expression data from 161 samples (74 non-demented controls, 87 AD) from the Entorhinal Cortex (EC), Hippocampus (HIP), Middle temporal gyrus (MTG), Posterior cingulate cortex (PC), Superior frontal gyrus (SFG) and visual cortex (VCX) brain regions were integrated and analysed using our method. The results are then compared to two popular meta-analysis methods, RankProd and GeneMeta, and to what can be obtained by analysing the individual datasets. We find genes related to AD that are consistent with existing studies, and new candidate genes not previously related to AD. Our study confirms the up-regulation of INFAR2 and PTMA along with the down-regulation of GPHN, RAB2A, PSMD14 and FGF. Novel genes PSMB2, WNK1, RPL15, SEMA4C, RWDD2A and LARGE are found to be differentially expressed across all brain regions. Further investigation on these genes may provide new insights into the development of AD. 
In addition, we identified the presence of 23 non-coding features, including four miRNA precursors (miR-7, miR570, miR-1229 and miR-6821), dysregulated across the brain regions. Furthermore, we compared our results with two popular meta-analysis methods RankProd and GeneMeta to validate our findings and performed a sensitivity analysis by removing one dataset at a time to assess the robustness of our results. These new findings may provide new insights into the disease mechanisms and thus make a significant contribution in the near future towards understanding, prevention and cure of AD.
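
The leave-one-dataset-out sensitivity analysis mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' combinatorial optimisation method: the consensus rule here is a simple vote, and the function names, the `min_fraction` threshold, and the toy gene labels are all assumptions made for illustration.

```python
from collections import Counter


def consensus(hits, min_fraction=0.6):
    """Genes reported in at least min_fraction of the datasets.

    A simple voting rule used as a stand-in; not the paper's method.
    """
    counts = Counter(g for genes in hits.values() for g in genes)
    needed = min_fraction * len(hits)
    return {g for g, c in counts.items() if c >= needed}


def leave_one_out(hits, min_fraction=0.6):
    """Recompute the consensus with each dataset removed in turn."""
    full = consensus(hits, min_fraction)
    report = {}
    for name in hits:
        rest = {k: v for k, v in hits.items() if k != name}
        reduced = consensus(rest, min_fraction)
        # Genes that drop out or newly appear when this dataset is removed.
        report[name] = {"lost": full - reduced, "gained": reduced - full}
    return full, report


# Hypothetical per-dataset hit lists (placeholder labels, not real results).
toy = {
    "region_A": {"g1", "g2", "g3"},
    "region_B": {"g1", "g2"},
    "region_C": {"g1", "g3", "g4"},
    "region_D": {"g1", "g2", "g4"},
}
full, report = leave_one_out(toy)
```

A dataset whose removal leaves both `lost` and `gained` empty does not drive the consensus; large changes flag a dataset that the result depends on.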

  16. The synovial microenvironment of osteoarthritic joints alters RNA-seq expression profiles of human primary articular chondrocytes

    PubMed Central

    Lewallen, Eric A.; Bonin, Carolina A.; Li, Xin; Smith, Jay; Karperien, Marcel; Larson, A. Noelle; Lewallen, David G.; Cool, Simon M.; Westendorf, Jennifer J.; Krych, Aaron J.; Leontovich, Alexey A.; Im, Hee-Jeong; van Wijnen, Andre J.

    2018-01-01

    Osteoarthritis (OA) is a disabling degenerative joint disease that prompts pain with limited treatment options. To permit early diagnosis and treatment of OA, a high resolution mechanistic understanding of human chondrocytes in normal and diseased states is necessary. In this study, we assessed the biological effects of OA-related changes in the synovial microenvironment on chondrocytes embedded within anatomically intact cartilage from joints with different pathological grades by next generation RNA-sequencing (RNA-seq). We determined the transcriptome of primary articular chondrocytes derived from pristine knees and ankles, as well as from joints affected by OA. The GALAXY bioinformatics platform was used to facilitate biological interpretations. Comparisons of patient samples by k-means, hierarchical clustering and principal component analysis reveal that primary chondrocytes exhibit OA grade-related differences in gene expression, including genes involved in cell-adhesion, ECM production and immune response. We conclude that diseased synovial microenvironments in joints with different histopathological OA grades directly alter gene expression in chondrocytes. One ramification of this finding is that sampling anatomically intact cartilage from OA joints is not an ideal source of healthy chondrocytes, nor should they be used to generate a normal baseline for the molecular characterization of diseased joints. PMID:27378743

  17. Coral Carbonic Anhydrases: Regulation by Ocean Acidification.

    PubMed

    Zoccola, Didier; Innocenti, Alessio; Bertucci, Anthony; Tambutté, Eric; Supuran, Claudiu T; Tambutté, Sylvie

    2016-06-03

    Global change is a major threat to the oceans, as it implies temperature increase and acidification. Ocean acidification (OA) involving decreasing pH and changes in seawater carbonate chemistry challenges the capacity of corals to form their skeletons. Despite the large number of studies that have investigated how rates of calcification respond to ocean acidification scenarios, comparatively few studies tackle how ocean acidification impacts the physiological mechanisms that drive calcification itself. The aim of our paper was to determine how the carbonic anhydrases, which play a major role in calcification, are potentially regulated by ocean acidification. For this we measured the effect of pH on enzyme activity of two carbonic anhydrase isoforms that have been previously characterized in the scleractinian coral Stylophora pistillata. In addition we looked at gene expression of these enzymes in vivo. For both isoforms, our results show (1) a change in gene expression under OA and (2) an effect of OA and temperature on carbonic anhydrase activity. We suggest that temperature increase could counterbalance the effect of OA on enzyme activity. Finally, we point out that caution must therefore be taken when interpreting transcriptomic data on carbonic anhydrases in ocean acidification and temperature stress experiments, as the effect of these stressors on the physiological function of CA will depend on both gene expression and enzyme activity.

  18. A curated transcriptomic dataset collection relevant to embryonic development associated with in vitro fertilization in healthy individuals and patients with polycystic ovary syndrome.

    PubMed

    Mackeh, Rafah; Boughorbel, Sabri; Chaussabel, Damien; Kino, Tomoshige

    2017-01-01

    The collection of large-scale datasets available in public repositories is rapidly growing and providing opportunities to identify and fill gaps in different fields of biomedical research. However, users of these datasets should be able to selectively browse datasets related to their field of interest. Here we make available a collection of transcriptome datasets related to human follicular cells from normal individuals or patients with polycystic ovary syndrome, in the process of their development, during in vitro fertilization. After RNA-seq dataset exclusion and careful selection based on study description and sample information, 12 datasets, encompassing a total of 85 unique transcriptome profiles, were identified in NCBI Gene Expression Omnibus and uploaded to the Gene Expression Browser (GXB), a web application specifically designed for interactive query and visualization of integrated large-scale data. Once the datasets were annotated in GXB, multiple sample groupings were made in order to create rank lists to allow easy data interpretation and comparison. The GXB tool also allows users to browse a single gene across multiple projects to evaluate its expression profiles in multiple biological systems/conditions in web-based customized graphical views. The curated dataset is accessible at the following link: http://ivf.gxbsidra.org/dm3/landing.gsp.

  19. A curated transcriptomic dataset collection relevant to embryonic development associated with in vitro fertilization in healthy individuals and patients with polycystic ovary syndrome

    PubMed Central

    Mackeh, Rafah; Boughorbel, Sabri; Chaussabel, Damien; Kino, Tomoshige

    2017-01-01

    The collection of large-scale datasets available in public repositories is rapidly growing and providing opportunities to identify and fill gaps in different fields of biomedical research. However, users of these datasets should be able to selectively browse datasets related to their field of interest. Here we make available a collection of transcriptome datasets related to human follicular cells from normal individuals or patients with polycystic ovary syndrome, in the process of their development, during in vitro fertilization. After RNA-seq dataset exclusion and careful selection based on study description and sample information, 12 datasets, encompassing a total of 85 unique transcriptome profiles, were identified in NCBI Gene Expression Omnibus and uploaded to the Gene Expression Browser (GXB), a web application specifically designed for interactive query and visualization of integrated large-scale data. Once the datasets were annotated in GXB, multiple sample groupings were made in order to create rank lists to allow easy data interpretation and comparison. The GXB tool also allows users to browse a single gene across multiple projects to evaluate its expression profiles in multiple biological systems/conditions in web-based customized graphical views. The curated dataset is accessible at the following link: http://ivf.gxbsidra.org/dm3/landing.gsp. PMID:28413616

  20. Shifts in Host Mucosal Innate Immune Function Are Associated with Ruminal Microbial Succession in Supplemental Feeding and Grazing Goats at Different Ages

    PubMed Central

    Jiao, Jinzhen; Zhou, Chuanshe; Guan, L. L.; McSweeney, C. S.; Tang, Shaoxun; Wang, Min; Tan, Zhiliang

    2017-01-01

    Gastrointestinal microbiota may play an important role in regulating host mucosal innate immune function. This study was conducted to test the hypothesis that age (non-rumination, transition and rumination) and feeding type [Supplemental feeding (S) vs. Grazing (G)] could alter ruminal microbial diversity and maturation of host mucosal innate immune system in goat kids. MiSeq sequencing was applied to investigate ruminal microbial composition and diversity, and RT-PCR was used to test expression of immune-related genes in ruminal mucosa. Results showed that higher (P < 0.05) relative abundances of Prevotella, Butyrivibrio, Pseudobutyrivibrio, Methanobrevibacter.gottschalkii, Neocallimastix, Anoplodinium–Diplodinium, and Polyplastron, and lower relative abundance of Methanosphaera (P = 0.042) were detected in the rumen of S kids when compared to those in G kids. The expression of genes encoding TLRs, IL1α, IL1β and TICAM2 was down-regulated (P < 0.01), while expression of genes encoding tight junction proteins was up-regulated (P < 0.05) in the ruminal mucosa of S kids when compared to that in G kids. Moreover, irrespective of feeding type, relative abundances of ruminal Prevotella, Fibrobacter, Ruminococcus, Butyrivibrio, Methanobrevibacter, Neocallimastix, and Entodinium increased with age. The expression of most genes encoding TLRs and cytokines increased (P < 0.05) from day 0 to 7, while expression of genes encoding tight junction proteins declined with age (P < 0.05). This study revealed that the composition of each microbial domain changed as animals grew, and these changes might be associated with variations in host mucosal innate immune function. Moreover, supplementing goat kids with concentrate could modulate ruminal microbial composition, enhance barrier function and decrease local inflammation. 
The findings provide useful information in interpreting microbiota and host interactions, and developing nutritional strategies to improve the productivity and health of rumen during early life. PMID:28912767

  1. Integrating multi-omic features exploiting Chromosome Conformation Capture data.

    PubMed

    Merelli, Ivan; Tordini, Fabio; Drocco, Maurizio; Aldinucci, Marco; Liò, Pietro; Milanesi, Luciano

    2015-01-01

    The representation, integration, and interpretation of omic data is a complex task, in particular considering the huge amount of information that is produced daily in molecular biology laboratories all around the world. The reason is that sequencing data regarding expression profiles, methylation patterns, and chromatin domains is difficult to harmonize in a systems biology view, since genome browsers only allow coordinate-based representations, discarding functional clusters created by the spatial conformation of the DNA in the nucleus. In this context, recent advances in high throughput molecular biology techniques and bioinformatics have provided insights into chromatin interactions on a larger scale and offer formidable support for the interpretation of multi-omic data. In particular, a novel sequencing technique called Chromosome Conformation Capture allows the analysis of the chromosome organization in the cell's natural state. When performed genome-wide, this technique is usually called Hi-C. Inspired by service applications such as Google Maps, we developed NuChart, an R package that integrates Hi-C data to describe the chromosomal neighborhood starting from the information about gene positions, with the possibility of mapping on the achieved graphs genomic features such as methylation patterns and histone modifications, along with expression profiles. In this paper we show the importance of the NuChart application for the integration of multi-omic data in a systems biology fashion, with particular interest in cytogenetic applications of these techniques. Moreover, we demonstrate how the integration of multi-omic data can provide useful information in understanding why genes are in certain specific positions inside the nucleus and how epigenetic patterns correlate with their expression.

  2. contamDE: differential expression analysis of RNA-seq data for contaminated tumor samples.

    PubMed

    Shen, Qi; Hu, Jiyuan; Jiang, Ning; Hu, Xiaohua; Luo, Zewei; Zhang, Hong

    2016-03-01

    Accurate detection of differentially expressed genes between tumor and normal samples is a primary approach of cancer-related biomarker identification. Due to the infiltration of tumor surrounding normal cells, the expression data derived from tumor samples would always be contaminated with normal cells. Ignoring such cellular contamination would deflate the power of detecting DE genes and further confound the biological interpretation of the analysis results. For the time being, there does not exist any differential expression analysis approach for RNA-seq data in the literature that can properly account for the contamination of tumor samples. Without appealing to any extra information, we develop a new method 'contamDE' based on a novel statistical model that associates RNA-seq expression levels with cell types. It is demonstrated through simulation studies that contamDE could be much more powerful than the existing methods that ignore the contamination. In the application to two cancer studies, contamDE uniquely found several potential therapy and prognostic biomarkers of prostate cancer and non-small cell lung cancer. An R package contamDE is freely available at http://homepage.fudan.edu.cn/zhangh/softwares/. Contact: zhanghfd@fudan.edu.cn. Supplementary data are available at Bioinformatics online.

  3. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data.

    PubMed

    Hettne, Kristina M; Boorsma, André; van Dartel, Dorien A M; Goeman, Jelle J; de Jong, Esther; Piersma, Aldert H; Stierum, Rob H; Kleinjans, Jos C; Kors, Jan A

    2013-01-29

    Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles. We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 (human) and 588 (mouse) gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE) (false discovery rate -corrected p-values < 0.05) of the next-gen TM-derived gene sets and the CTD-derived gene sets in gene expression (GE) data sets of five chemicals (from experimental models). We tested for SDE of gene sets for six fibrates in a peroxisome proliferator-activated receptor alpha (PPARA) knock-out GE dataset and compared to results from the Connectivity Map. We tested for SDE of 319 next-gen TM-derived gene sets for environmental toxicants in three GE data sets of triazoles, and tested for SDE of 442 gene sets associated with embryonic structures. We compared the gene sets to triazole effects seen in the Whole Embryo Culture (WEC), and used principal component analysis (PCA) to discriminate triazoles from other chemicals. Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in cMap scored significant against the PPARA GE signature. 33 environmental toxicant gene sets were significantly altered in the triazole GE data sets. 
21 of these toxicants had a similar toxicity pattern as the triazoles. We confirmed embryotoxic effects, and discriminated triazoles from other chemicals. Gene set analysis with next-gen TM-derived chemical response-specific gene sets is a scalable method for identifying similarities in gene responses to other chemicals, from which one may infer potential mode of action and/or toxic effect.
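
The abstract reports significance at false discovery rate-corrected p-values < 0.05 but does not name the correction procedure. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up procedure, the adjustment could look like this in Python (the p-values are made-up placeholders, not data from the study):

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Placeholder p-values for five hypothetical gene sets.
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
adj = benjamini_hochberg(pvals)
significant = [q < 0.05 for q in adj]
```

Note that two raw p-values below 0.05 (0.039 and 0.041) no longer pass after adjustment, which is the point of correcting when many gene sets are tested at once.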

  4. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    PubMed Central

    2013-01-01

    Background Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles. Methods We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 (human) and 588 (mouse) gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE) (false discovery rate -corrected p-values < 0.05) of the next-gen TM-derived gene sets and the CTD-derived gene sets in gene expression (GE) data sets of five chemicals (from experimental models). We tested for SDE of gene sets for six fibrates in a peroxisome proliferator-activated receptor alpha (PPARA) knock-out GE dataset and compared to results from the Connectivity Map. We tested for SDE of 319 next-gen TM-derived gene sets for environmental toxicants in three GE data sets of triazoles, and tested for SDE of 442 gene sets associated with embryonic structures. We compared the gene sets to triazole effects seen in the Whole Embryo Culture (WEC), and used principal component analysis (PCA) to discriminate triazoles from other chemicals. Results Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in cMap scored significant against the PPARA GE signature. 33 environmental toxicant gene sets were significantly altered in the triazole GE data sets. 
21 of these toxicants had a similar toxicity pattern as the triazoles. We confirmed embryotoxic effects, and discriminated triazoles from other chemicals. Conclusions Gene set analysis with next-gen TM-derived chemical response-specific gene sets is a scalable method for identifying similarities in gene responses to other chemicals, from which one may infer potential mode of action and/or toxic effect. PMID:23356878

  5. Strabismus genetics across a spectrum of eye misalignment disorders

    PubMed Central

    Ye, XC; Pegado, V; Patel, MS; Wasserman, WW

    2014-01-01

    Eye misalignment, called strabismus, is amongst the most common phenotypes observed, occurring in up to 5% of individuals in a studied population. While misalignment is frequently observed in rare complex syndromes, the majority of strabismus cases are non-syndromic. Over the past decade, genes and pathways associated with syndromic forms of strabismus have emerged, but the genes contributing to non-syndromic strabismus remain elusive. Genetic testing for strabismus risk may allow for earlier diagnosis and treatment, as well as decreased frequency of surgery. We review human and model organism literature describing non-syndromic strabismus, including family, twin, linkage, and gene expression studies. Recent advances in the genetics of Duane retraction syndrome are considered, as relatives of those impacted show elevated familial rates of non-syndromic strabismus. As whole genome sequencing efforts are advancing for the discovery of the elusive strabismus genes, this overview is intended to support the interpretation of the new findings. PMID:24579652

  6. Preserved dopaminergic homeostasis and dopamine-related behaviour in hemizygous TH-Cre mice.

    PubMed

    Runegaard, Annika H; Jensen, Kathrine L; Fitzpatrick, Ciarán M; Dencker, Ditte; Weikop, Pia; Gether, Ulrik; Rickhag, Mattias

    2017-01-01

    Cre-driver mouse lines have been extensively used as genetic tools to target and manipulate genetically defined neuronal populations by expression of Cre recombinase under selected gene promoters. This approach has greatly advanced neuroscience but interpretations are hampered by the fact that most Cre-driver lines have not been thoroughly characterized. Thus, a phenotypic characterization is of major importance to reveal potential aberrant phenotypes prior to implementation and usage to selectively inactivate or induce transgene expression. Here, we present a biochemical and behavioural assessment of the dopaminergic system in hemizygous tyrosine hydroxylase (TH)-Cre mice in comparison to wild-type (WT) controls. Our data show that TH-Cre mice display preserved dopaminergic homeostasis with unaltered levels of TH and dopamine as well as unaffected dopamine turnover in striatum. TH-Cre mice also show preserved dopamine transporter expression and function supporting sustained dopaminergic transmission. In addition, TH-Cre mice demonstrate normal responses in basic behavioural paradigms related to dopaminergic signalling including locomotor activity, reward preference and anxiolytic behaviour. Our results suggest that TH-Cre mice represent a valid tool to study the dopamine system, though careful characterization must always be performed to prevent false interpretations following Cre-dependent transgene expression and manipulation of selected neuronal pathways.

  7. Genetic interaction networks: better understand to better predict

    PubMed Central

    Boucher, Benjamin; Jenna, Sarah

    2013-01-01

    A genetic interaction (GI) between two genes generally indicates that the phenotype of a double mutant differs from what is expected from each individual mutant. In the last decade, genome scale studies of quantitative GIs were completed using mainly synthetic genetic array technology and RNA interference in yeast and Caenorhabditis elegans. These studies raised questions regarding the functional interpretation of GIs, the relationship of genetic and molecular interaction networks, the usefulness of GI networks to infer gene function and co-functionality, the evolutionary conservation of GI, etc. While GIs have been used for decades to dissect signaling pathways in genetic models, their functional interpretations are still not trivial. The existence of a GI between two genes does not necessarily imply that these two genes code for interacting proteins or that the two genes are even expressed in the same cell. In fact, a GI only implies that the two genes share a functional relationship. These two genes may be involved in the same biological process or pathway; or they may also be involved in compensatory pathways with unrelated apparent function. Considering the powerful opportunity to better understand gene function, genetic relationship, robustness and evolution, provided by a genome-wide mapping of GIs, several in silico approaches have been employed to predict GIs in unicellular and multicellular organisms. Most of these methods used weighted data integration. In this article, we will review the latest knowledge acquired on GI networks in metazoans by looking more closely into their relationship with pathways, biological processes and molecular complexes but also into their modularity and organization. We will also review the different in silico methods developed to predict GIs and will discuss how the knowledge acquired on GI networks can be used to design predictive tools with higher performance. PMID:24381582

  8. Integrated analysis of microRNA and gene expression profiles reveals a functional regulatory module associated with liver fibrosis.

    PubMed

    Chen, Wei; Zhao, Wenshan; Yang, Aiting; Xu, Anjian; Wang, Huan; Cong, Min; Liu, Tianhui; Wang, Ping; You, Hong

    2017-12-15

    Liver fibrosis, characterized by the excessive accumulation of extracellular matrix (ECM) proteins, represents the final common pathway of chronic liver inflammation. Ever-increasing evidence indicates that microRNA (miRNA) dysregulation has important implications in the different stages of liver fibrosis. However, our knowledge of the details of miRNA-gene regulation in this disease remains limited. The publicly available Gene Expression Omnibus (GEO) datasets of patients suffering from cirrhosis were extracted for integrated analysis. Differentially expressed miRNAs (DEMs) and genes (DEGs) were identified using the GEO2R web tool. Putative target gene prediction of DEMs was carried out using the intersection of five major algorithms: DIANA-microT, TargetScan, miRanda, PICTAR5 and miRWalk. A functional miRNA-gene regulatory network (FMGRN) was constructed based on the computational target predictions at the sequence level and the inverse expression relationships between DEMs and DEGs. The DAVID web server was selected to perform KEGG pathway enrichment analysis. A functional miRNA-gene regulatory module was generated based on the biological interpretation. Internal connections among genes in the liver fibrosis-related module were determined using the String database. MiRNA-gene regulatory modules related to liver fibrosis were experimentally verified in recombinant human TGFβ1 stimulated and specific miRNA inhibitor treated LX-2 cells. In total, we identified 85 dysregulated miRNAs and 923 dysregulated genes in liver cirrhosis biopsy samples compared to their normal controls. All evident miRNA-gene pairs were identified and assembled into the FMGRN, which consisted of 990 regulations between 51 miRNAs and 275 genes, forming two big sub-networks that were defined as down-network and up-network, respectively. 
KEGG pathway enrichment analysis revealed that the up-network was prominently involved in several KEGG pathways, of which "Focal adhesion", "PI3K-Akt signaling pathway" and "ECM-receptor interaction" were significantly enriched (adjusted p<0.001). Genes enriched in these pathways coupled with their regulatory miRNAs formed a functional miRNA-gene regulatory module that contains 7 miRNAs, 22 genes and 42 miRNA-gene connections. Gene interaction analysis based on the String database revealed that 8 out of 22 genes were highly clustered. Finally, we experimentally confirmed a functional regulatory module containing 5 miRNAs (miR-130b-3p, miR-148a-3p, miR-345-5p, miR-378a-3p, and miR-422a) and 6 genes (COL6A1, COL6A2, COL6A3, PIK3R3, COL1A1, CCND2) associated with liver fibrosis. Our integrated analysis of miRNA and gene expression profiles highlighted a functional miRNA-gene regulatory module associated with liver fibrosis, which, to some extent, may provide important clues to better understand the underlying pathogenesis of liver fibrosis.
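
The network construction described in this abstract (keep only miRNA-gene pairs predicted by every algorithm, then keep only pairs whose expression changes run in opposite directions) can be sketched in Python. This is a toy under stated assumptions, not the authors' pipeline: the function name, the dictionary layout, and every miRNA and gene label below are hypothetical, and the tool names are used only as dictionary keys rather than as calls to the real programs.

```python
def candidate_pairs(predictions, mirna_change, gene_change):
    """Select miRNA-gene pairs supported by all tools and inversely regulated.

    predictions: dict mapping tool name -> set of (mirna, gene) pairs
    mirna_change / gene_change: dict mapping name -> log fold change
    """
    # Intersection: a pair must be predicted by every tool.
    shared = set.intersection(*predictions.values())
    # Inverse relationship: the fold changes must have opposite signs.
    return {
        (m, g)
        for m, g in shared
        if m in mirna_change
        and g in gene_change
        and mirna_change[m] * gene_change[g] < 0
    }


# Hypothetical predictions from three of the tools (placeholder labels).
predictions = {
    "TargetScan": {("miR-a", "GENE1"), ("miR-a", "GENE2"), ("miR-b", "GENE3")},
    "miRanda": {("miR-a", "GENE1"), ("miR-a", "GENE2"), ("miR-b", "GENE3")},
    "miRWalk": {("miR-a", "GENE1"), ("miR-a", "GENE2")},
}
mirna_change = {"miR-a": -1.5, "miR-b": 2.0}  # miR-a down, miR-b up
gene_change = {"GENE1": 2.1, "GENE2": -0.8, "GENE3": -1.2}

pairs = candidate_pairs(predictions, mirna_change, gene_change)
```

Requiring agreement across all tools trades sensitivity for specificity: `("miR-b", "GENE3")` is inversely regulated but is dropped because one tool did not predict it.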

  9. Knowledge-driven genomic interactions: an application in ovarian cancer.

    PubMed

    Kim, Dokyoon; Li, Ruowang; Dudek, Scott M; Frase, Alex T; Pendergrass, Sarah A; Ritchie, Marylyn D

    2014-01-01

    Effective cancer clinical outcome prediction for understanding of the mechanism of various types of cancer has been pursued using molecular-based data such as gene expression profiles, an approach that has promise for providing better diagnostics and supporting further therapies. However, clinical outcome prediction based on gene expression profiles varies between independent data sets. Further, single-gene expression outcome prediction is limited for cancer evaluation since genes do not act in isolation, but rather interact with other genes in complex signaling or regulatory networks. In addition, since pathways are likely to cooperate, it would be desirable to incorporate expert knowledge to combine pathways in a useful and informative manner. Thus, we propose a novel approach for identifying knowledge-driven genomic interactions and applying it to discover models associated with cancer clinical phenotypes using grammatical evolution neural networks (GENN). In order to demonstrate the utility of the proposed approach, ovarian cancer data from The Cancer Genome Atlas (TCGA) were used for predicting clinical stage as a pilot project. We identified knowledge-driven genomic interactions associated with cancer stage not only from single knowledge bases such as sources of pathway-pathway interaction, but also across different sets of knowledge bases, such as pathway-protein family interactions, by integrating different types of information. Notably, an integration model from different sources of biological knowledge achieved 78.82% balanced accuracy and outperformed the top models with gene expression or single knowledge-based data types alone. Furthermore, the results from the models are more interpretable because they are framed in the context of specific biological pathways or other expert knowledge. 
The success of the pilot study we have presented herein will allow us to pursue further identification of models predictive of clinical cancer survival and recurrence. Understanding the underlying tumorigenesis and progression in ovarian cancer through the global view of interactions within/between different biological knowledge sources has the potential for providing more effective screening strategies and therapeutic targets for many types of cancer.
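
    The balanced accuracy quoted above is, in its usual definition, the mean of the per-class recalls, which keeps a majority class from dominating the score. The following is a minimal sketch of that metric only; the stage labels and predictions are made up for illustration and are not taken from the TCGA analysis.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (usual definition of balanced accuracy)."""
    recalls = []
    for cls in sorted(set(y_true)):
        # Indices of the samples that truly belong to this class.
        idx = [i for i, t in enumerate(y_true) if t == cls]
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical labels: 2 "early" and 4 "late" stage samples.
y_true = ["early", "early", "late", "late", "late", "late"]
y_pred = ["early", "late", "late", "late", "late", "early"]
score = balanced_accuracy(y_true, y_pred)  # (1/2 + 3/4) / 2
```

    Plain accuracy on the same toy data would be 4/6; the balanced version weights both stages equally regardless of their sample counts.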

  10. Frequency domain analysis of noise in simple gene circuits

    NASA Astrophysics Data System (ADS)

    Cox, Chris D.; McCollum, James M.; Austin, Derek W.; Allen, Michael S.; Dar, Roy D.; Simpson, Michael L.

    2006-06-01

    Recent advances in single cell methods have spurred progress in quantifying and analyzing stochastic fluctuations, or noise, in genetic networks. Many of these studies have focused on identifying the sources of noise and quantifying its magnitude, while paying less attention to the frequency content of the noise. We have developed a frequency domain approach to extract the information contained in the frequency content of the noise. In this article we review our work in this area and extend it to explicitly consider sources of extrinsic and intrinsic noise. First we review applications of the frequency domain approach to several simple circuits, including a constitutively expressed gene, a gene regulated by transitions in its operator state, and a negatively autoregulated gene. We then review our recent experimental study, in which time-lapse microscopy was used to measure noise in the expression of green fluorescent protein in individual cells. The results demonstrate how changes in rate constants within the gene circuit are reflected in the spectral content of the noise in a manner consistent with the predictions derived through frequency domain analysis. The experimental results confirm our earlier theoretical prediction that negative autoregulation not only reduces the magnitude of the noise but shifts its content out to higher frequency. Finally, we develop a frequency domain model of gene expression that explicitly accounts for extrinsic noise at the transcriptional and translational levels. We apply the model to interpret a shift in the autocorrelation function of green fluorescent protein induced by perturbations of the translational process as a shift in the frequency spectrum of extrinsic noise and a decrease in its weighting relative to intrinsic noise.
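
    The claim about negative autoregulation can be illustrated with a generic linearized (Langevin) sketch. This is not the authors' model: the symbols $x$, $\gamma$, $D$ and $\xi$ are assumptions introduced here, with $x$ the deviation of protein level from its mean, $\gamma$ the effective decay rate, and $\xi$ white noise of intensity $D$.

```latex
% Linearized fluctuation dynamics (illustrative assumption):
\frac{dx}{dt} = -\gamma\, x + \xi(t), \qquad
\langle \xi(t)\,\xi(t') \rangle = 2D\,\delta(t - t')

% Stationary autocorrelation and, by the Wiener-Khinchin theorem, its spectrum:
R(\tau) = \frac{D}{\gamma}\, e^{-\gamma |\tau|}, \qquad
S(\omega) = \int_{-\infty}^{\infty} R(\tau)\, e^{-i\omega\tau}\, d\tau
          = \frac{2D}{\gamma^{2} + \omega^{2}}
```

    In this sketch the variance is $R(0) = D/\gamma$ and the half-power (corner) frequency is $\omega = \gamma$. If negative feedback is modeled as an increase in the effective $\gamma$ at fixed $D$, the variance falls while the corner frequency rises, which matches the qualitative statement that the noise is reduced and shifted toward higher frequency.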

  11. Pilot study of small bowel mucosal gene expression in patients with irritable bowel syndrome with diarrhea.

    PubMed

    Camilleri, Michael; Carlson, Paula; Valentin, Nelson; Acosta, Andres; O'Neill, Jessica; Eckert, Deborah; Dyer, Roy; Na, Jie; Klee, Eric W; Murray, Joseph A

    2016-09-01

    Prior studies in patients with irritable bowel syndrome with diarrhea (IBS-D) showed immune activation, secretion, and barrier dysfunction in jejunal or colorectal mucosa. We measured mRNA expression by RT-PCR of 91 genes reflecting tight junction proteins, chemokines, innate immunity, ion channels, transmitters, housekeeping genes, and controls for DNA contamination and PCR efficiency in small intestinal mucosa from 15 IBS-D and 7 controls (biopsies negative for celiac disease). Fold change was calculated using the 2^(-ΔΔCT) formula. Nominal P values (P < 0.05) were interpreted with false detection rate (FDR) correction (q value). Cluster analysis with Lens for Enrichment and Network Studies (LENS) explored connectivity of mechanisms. Upregulated genes (uncorrected P < 0.05) were related to ion transport (INADL, MAGI1, and SONS1), barrier (TJP1, 2, and 3 and CLDN) or immune functions (TLR3, IL15, and MAPKAPK5), or histamine metabolism (HNMT); downregulated genes were related to immune function (IL-1β, TGF-β1, and CCL20) or antigen detection (TLR1 and 8). The following genes were significantly upregulated (q < 0.05) in IBS-D: INADL, MAGI1, PPP2R5C, MAPKAPK5, TLR3, and IL-15. Among the 14 nominally upregulated genes, there was clustering of barrier and PDZ domains (TJP1, TJP2, TJP3, CLDN4, INADL, and MAGI1) and clustering of downregulated genes (CCL20, TLR1, IL1B, and TLR8). Protein expression of PPP2R5C in nuclear lysates was greater in patients with IBS-D and controls. There was increase in INADL protein (median 9.4 ng/ml) in patients with IBS-D relative to controls (median 5.8 ng/ml, P > 0.05). In conclusion, altered transcriptome (and to lesser extent protein) expression of ion transport, barrier, immune, and mast cell mechanisms in small bowel may reflect different alterations in function and deserves further study in IBS-D. Copyright © 2016 the American Physiological Society.
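
    The two calculations named in this abstract can be sketched as follows. The fold-change function implements the standard 2^(-ΔΔCT) arithmetic, which assumes roughly 100% PCR efficiency. The q values use the Benjamini-Hochberg step-up procedure, which is an assumption here because the abstract does not say which FDR method was applied. All CT values and P values below are invented for illustration.

```python
def fold_change_ddct(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^(-ddCT) method (assumes ~100% PCR efficiency)."""
    d_case = ct_target_case - ct_ref_case   # dCT in the patient sample
    d_ctrl = ct_target_ctrl - ct_ref_ctrl   # dCT in the control sample
    return 2.0 ** (-(d_case - d_ctrl))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values (q values), in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical CT values: target gene vs. a housekeeping reference gene.
fc = fold_change_ddct(ct_target_case=24.0, ct_ref_case=18.0,
                      ct_target_ctrl=26.0, ct_ref_ctrl=18.0)

# Hypothetical nominal P values for four genes.
qvals = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

    A fold change above 1 indicates higher expression in the patient sample than in the control; a gene can pass the nominal P < 0.05 threshold and still fail q < 0.05 once the number of genes tested is taken into account.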

  12. Integrating mitosis, toxicity, and transgene expression in a telecommunications packet-switched network model of lipoplex-mediated gene delivery.

    PubMed

    Martin, Timothy M; Wysocki, Beata J; Beyersdorf, Jared P; Wysocki, Tadeusz A; Pannier, Angela K

    2014-08-01

    Gene delivery systems transport exogenous genetic information to cells or biological systems with the potential to directly alter endogenous gene expression and behavior with applications in functional genomics, tissue engineering, medical devices, and gene therapy. Nonviral systems offer advantages over viral systems because of their low immunogenicity, inexpensive synthesis, and easy modification but suffer from lower transfection levels. The representation of gene transfer using models offers perspective and interpretation of complex cellular mechanisms, including nonviral gene delivery where exact mechanisms are unknown. Here, we introduce a novel telecommunications model of the nonviral gene delivery process in which the delivery of the gene to a cell is synonymous with delivery of a packet of information to a destination computer within a packet-switched computer network. Such a model uses nodes and layers to simplify the complexity of modeling the transfection process and to overcome several challenges of existing models. These challenges include a limited scope and limited time frame, which often does not incorporate biological effects known to affect transfection. The telecommunication model was constructed in MATLAB to model lipoplex delivery of the gene encoding the green fluorescent protein to HeLa cells. Mitosis and toxicity events were included in the model resulting in simulation outputs of nuclear internalization and transfection efficiency that correlated with experimental data. A priori predictions based on model sensitivity analysis suggest that increasing endosomal escape and decreasing lysosomal degradation, protein degradation, and GFP-induced toxicity can improve transfection efficiency by three-fold. Application of the telecommunications model to nonviral gene delivery offers insight into the development of new gene delivery systems with therapeutically relevant transfection levels.

  13. Detection of rearrangements and transcriptional up-regulation of ALK in FFPE lung cancer specimens using a novel, sensitive, quantitative reverse transcription polymerase chain reaction assay.

    PubMed

    Gruber, Kim; Horn, Heike; Kalla, Jörg; Fritz, Peter; Rosenwald, Andreas; Kohlhäufl, Martin; Friedel, Godehard; Schwab, Matthias; Ott, German; Kalla, Claudia

    2014-03-01

    The approved dual-color fluorescence in situ hybridization (FISH) test for the detection of anaplastic lymphoma receptor tyrosine kinase (ALK) gene rearrangements in non-small-cell lung cancer (NSCLC) is complex and represents a low-throughput assay difficult to use in daily diagnostic practice. We devised a sensitive and robust routine diagnostic test for the detection of rearrangements and transcriptional up-regulation of ALK. We developed a quantitative reverse transcription polymerase chain reaction (qRT-PCR) assay adapted to RNA isolated from routine formalin-fixed, paraffin-embedded material and applied it to 652 NSCLC specimens. The reliability of this technique to detect ALK dysregulation was shown by comparison with FISH and immunohistochemistry. qRT-PCR analysis detected unbalanced ALK expression indicative of a gene rearrangement in 24 (4.6%) and full-length ALK transcript expression in six (1.1%) of 523 interpretable tumors. Among 182 tumors simultaneously analyzed by FISH and qRT-PCR, the latter accurately typed 97% of 19 rearranged and 158 nonrearranged tumors and identified ALK deregulation in two cases with insufficient FISH. Six tumors expressing full-length ALK transcripts did not show rearrangements of the gene. Immunohistochemistry detected ALK protein overexpression in tumors with gene fusions and transcriptional up-regulation, but did not distinguish between the two. One case with full-length ALK expression carried a heterozygous point mutation (S1220Y) within the kinase domain potentially interfering with kinase activity and/or inhibitor binding. Our qRT-PCR assay reliably identifies and distinguishes ALK rearrangements and full-length transcript expression in formalin-fixed, paraffin-embedded material. It is an easy-to-perform, cost-effective, and high-throughput tool for the diagnosis of ALK activation. The expression of full-length ALK transcripts may be relevant for ALK inhibitor therapy in NSCLC.

  14. Systems Analysis of Early Host Gene Expression Provides Clues for Transient Mycobacterium avium ssp avium vs. Persistent Mycobacterium avium ssp paratuberculosis Intestinal Infections

    PubMed Central

    Khare, Sangeeta; Drake, Kenneth L.; Lawhon, Sara D.; Nunes, Jairo E. S.; Figueiredo, Josely F.; Rossetti, Carlos A.; Gull, Tamara; Everts, Robin E.; Lewin, Harris. A.; Adams, Leslie Garry

    2016-01-01

    It has long been a quest in ruminants to understand how two very similar mycobacterial species, Mycobacterium avium ssp. paratuberculosis (MAP) and Mycobacterium avium ssp. avium (MAA) lead to either a chronic persistent infection or a rapid-transient infection, respectively. Here, we hypothesized that when the host immune response is activated by MAP or MAA, the outcome of the infection depends on the early activation of signaling molecules and host temporal gene expression. To test our hypothesis, ligated jejuno-ileal loops including Peyer’s patches in neonatal calves were inoculated with PBS, MAP, or MAA. A temporal analysis of the host transcriptome profile was conducted at several times post-infection (0.5, 1, 2, 4, 8 and 12 hours). When comparing the transcriptional responses of calves infected with the MAA versus MAP, discordant patterns of mucosal expression were clearly evident, and the numbers of unique transcripts altered were moderately less for MAA-infected tissue than were mucosal tissues infected with the MAP. To interpret these complex data, changes in the gene expression were further analyzed by dynamic Bayesian analysis. Bayesian network modeling identified mechanistic genes, gene-to-gene relationships, pathways and Gene Ontologies (GO) biological processes that are involved in specific cell activation during infection. MAP and MAA had significantly different pathway perturbation at 0.5 and 12 hours post inoculation. Inverse processes were observed between MAP and MAA response for epithelial cell proliferation, negative regulation of chemotaxis, cell-cell adhesion mediated by integrin and regulation of cytokine-mediated signaling. MAP inoculated tissue had significantly lower expression of phagocytosis receptors such as mannose receptor and complement receptors. 
This study reveals that perturbation of genes and cellular pathways during MAP infection resulted in host evasion by mucosal membrane barrier weakening to access entry in the ileum, inhibition of Ca signaling associated with decreased phagosome-lysosome fusion as well as phagocytosis inhibition, bias toward Th2 cell immune response accompanied by cell recruitment, cell proliferation and cell differentiation; leading to persistent infection. Contrarily, MAA infection was related to cellular responses associated with activation of molecular pathways that release chemicals and cytokines involved with containment of infection and a strong bias toward Th1 immune response, resulting in a transient infection. PMID:27653506

  15. Systems Analysis of Early Host Gene Expression Provides Clues for Transient Mycobacterium avium ssp avium vs. Persistent Mycobacterium avium ssp paratuberculosis Intestinal Infections.

    PubMed

    Khare, Sangeeta; Drake, Kenneth L; Lawhon, Sara D; Nunes, Jairo E S; Figueiredo, Josely F; Rossetti, Carlos A; Gull, Tamara; Everts, Robin E; Lewin, Harris A; Adams, Leslie Garry

    It has long been a quest in ruminants to understand how two very similar mycobacterial species, Mycobacterium avium ssp. paratuberculosis (MAP) and Mycobacterium avium ssp. avium (MAA) lead to either a chronic persistent infection or a rapid-transient infection, respectively. Here, we hypothesized that when the host immune response is activated by MAP or MAA, the outcome of the infection depends on the early activation of signaling molecules and host temporal gene expression. To test our hypothesis, ligated jejuno-ileal loops including Peyer's patches in neonatal calves were inoculated with PBS, MAP, or MAA. A temporal analysis of the host transcriptome profile was conducted at several times post-infection (0.5, 1, 2, 4, 8 and 12 hours). When comparing the transcriptional responses of calves infected with the MAA versus MAP, discordant patterns of mucosal expression were clearly evident, and the numbers of unique transcripts altered were moderately less for MAA-infected tissue than were mucosal tissues infected with the MAP. To interpret these complex data, changes in the gene expression were further analyzed by dynamic Bayesian analysis. Bayesian network modeling identified mechanistic genes, gene-to-gene relationships, pathways and Gene Ontologies (GO) biological processes that are involved in specific cell activation during infection. MAP and MAA had significantly different pathway perturbation at 0.5 and 12 hours post inoculation. Inverse processes were observed between MAP and MAA response for epithelial cell proliferation, negative regulation of chemotaxis, cell-cell adhesion mediated by integrin and regulation of cytokine-mediated signaling. MAP inoculated tissue had significantly lower expression of phagocytosis receptors such as mannose receptor and complement receptors. 
This study reveals that perturbation of genes and cellular pathways during MAP infection resulted in host evasion by mucosal membrane barrier weakening to access entry in the ileum, inhibition of Ca signaling associated with decreased phagosome-lysosome fusion as well as phagocytosis inhibition, bias toward Th2 cell immune response accompanied by cell recruitment, cell proliferation and cell differentiation; leading to persistent infection. Contrarily, MAA infection was related to cellular responses associated with activation of molecular pathways that release chemicals and cytokines involved with containment of infection and a strong bias toward Th1 immune response, resulting in a transient infection.

  16. The Cure: Design and Evaluation of a Crowdsourcing Game for Gene Selection for Breast Cancer Survival Prediction

    PubMed Central

    Loguercio, Salvatore; Griffith, Obi L; Nanis, Max; Wu, Chunlei; Su, Andrew I

    2014-01-01

    Background Molecular signatures for predicting breast cancer prognosis could greatly improve care through personalization of treatment. Computational analyses of genome-wide expression datasets have identified such signatures, but these signatures leave much to be desired in terms of accuracy, reproducibility, and biological interpretability. Methods that take advantage of structured prior knowledge (eg, protein interaction networks) show promise in helping to define better signatures, but most knowledge remains unstructured. Crowdsourcing via scientific discovery games is an emerging methodology that has the potential to tap into human intelligence at scales and in modes unheard of before. Objective The main objective of this study was to test the hypothesis that knowledge linking expression patterns of specific genes to breast cancer outcomes could be captured from players of an open, Web-based game. We envisioned capturing knowledge both from the player’s prior experience and from their ability to interpret text related to candidate genes presented to them in the context of the game. Methods We developed and evaluated an online game called The Cure that captured information from players regarding genes for use as predictors of breast cancer survival. Information gathered from game play was aggregated using a voting approach, and used to create rankings of genes. The top genes from these rankings were evaluated using annotation enrichment analysis, comparison to prior predictor gene sets, and by using them to train and test machine learning systems for predicting 10 year survival. Results Between its launch in September 2012 and September 2013, The Cure attracted more than 1000 registered players, who collectively played nearly 10,000 games. Gene sets assembled through aggregation of the collected data showed significant enrichment for genes known to be related to key concepts such as cancer, disease progression, and recurrence. 
In terms of the predictive accuracy of models trained using this information, these gene sets provided comparable performance to gene sets generated using other methods, including those used in commercial tests. The Cure is available on the Internet. Conclusions The principal contribution of this work is to show that crowdsourcing games can be developed as a means to address problems involving domain knowledge. While most prior work on scientific discovery games and crowdsourcing in general takes as a premise that contributors have little or no expertise, here we demonstrated a crowdsourcing system that succeeded in capturing expert knowledge. PMID:25654473
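
    The voting step described in the Methods can be sketched as a simple tally. The abstract does not specify the actual voting scheme, so this assumes the simplest one: each game casts one vote for every gene the player selected, and genes are ranked by total votes. The function name and the example game records are hypothetical.

```python
from collections import Counter


def rank_genes_by_votes(games):
    """Rank genes by how many games selected them (ties broken alphabetically)."""
    votes = Counter()
    for picked in games:
        votes.update(set(picked))  # at most one vote per gene per game
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical game records: the genes chosen by the player in each game.
games = [
    ["ESR1", "BRCA1"],
    ["ESR1", "TP53"],
    ["TP53", "ESR1", "ERBB2"],
]
ranking = rank_genes_by_votes(games)
top_genes = [gene for gene, _ in ranking[:2]]
```

    The top of such a ranking is what would then be passed to enrichment analysis or used as the feature set for a survival classifier.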

  17. Bioinformatics/biostatistics: microarray analysis.

    PubMed

    Eichler, Gabriel S

    2012-01-01

    The quantity and complexity of the molecular-level data generated in both research and clinical settings require the use of sophisticated, powerful computational interpretation techniques. It is for this reason that bioinformatic analysis of complex molecular profiling data has become a fundamental technology in the development of personalized medicine. This chapter provides a high-level overview of the field of bioinformatics and outlines several classic bioinformatic approaches. The highlighted approaches can be aptly applied to nearly any sort of high-dimensional genomic, proteomic, or metabolomic experiments. Reviewed technologies in this chapter include traditional clustering analysis, the Gene Expression Dynamics Inspector (GEDI), GoMiner, Gene Set Enrichment Analysis (GSEA), and the Learner of Functional Enrichment (LeFE).

  18. RuleGO: a logical rules-based tool for description of gene groups by means of Gene Ontology

    PubMed Central

    Gruca, Aleksandra; Sikora, Marek; Polanski, Andrzej

    2011-01-01

    Genome-wide expression profiles obtained with the use of DNA microarray technology provide an abundance of experimental data on biological and molecular processes. Such an amount of data needs to be further analyzed and interpreted in order to obtain biological conclusions on the basis of experimental results. The analysis requires a lot of experience and is usually a time-consuming process. Thus, various annotation databases are frequently used to improve the whole process of analysis. Here, we present RuleGO, a web-based application that allows the user to describe gene groups on the basis of logical rules that include Gene Ontology (GO) terms in their premises. The presented application allows obtaining rules that reflect coappearance of GO terms describing genes supported by the rules. The ontology level and number of coappearing GO terms are adjusted automatically. The user only limits the space of possible solutions. The RuleGO application is freely available at http://rulego.polsl.pl/. PMID:21715384

  19. Progestins alter photo-transduction cascade and circadian rhythm network in eyes of zebrafish (Danio rerio)

    NASA Astrophysics Data System (ADS)

    Zhao, Yanbin; Fent, Karl

    2016-02-01

    Environmental progestins are implicated in endocrine disruption in vertebrates. Additional targets that may be affected in organisms are poorly known. Here we report that progesterone (P4) and drospirenone (DRS) interfere with the photo-transduction cascade and circadian rhythm network in the eyes of zebrafish. Breeding pairs of adult zebrafish were exposed to P4 and DRS for 21 days with different measured concentrations of 7-742 ng/L and 99-13,650 ng/L, respectively. Of the 10 key photo-transduction cascade genes analyzed, transcriptional levels of most were significantly up-regulated, or normal down-regulation was attenuated. Similarly, dose-dependent transcriptional alterations were observed for some of the 33 circadian rhythm genes analyzed. Significant alterations occurred even at the environmentally relevant level of 7 ng/L P4. Different patterns were observed for these transcriptional alterations, of which the nfil3 family displayed the most significant changes. Furthermore, we demonstrate the importance of sampling time for the determination and interpretation of gene expression data, and put forward recommendations for sampling strategies to avoid false interpretations. Our results suggest that photo-transduction signals and circadian rhythm are potential targets for progestins. Further studies are required to assess alterations on the protein level, on physiology and behavior, as well as on implications in mammals.

  20. α-lipoic acid inhibits high glucose-induced apoptosis in HIT-T15 cells.

    PubMed

    Yang, Yi; Wang, Weiping; Liu, Yinan; Guo, Ting; Chen, Ping; Ma, Kangtao; Zhou, Chunyan

    2012-06-01

    High blood glucose plays an important role in the pathogenesis of diabetes. α-lipoic acid (LA) has been used to prevent and treat diabetes, and is thought to act by increasing insulin sensitivity in many tissues. However, whether LA also has a cytoprotective effect on pancreatic islet beta cells remains unclear. In this study, we assessed whether LA could inhibit apoptosis in beta cells exposed to high glucose concentrations. HIT-T15 pancreatic beta cells were treated with 30 mmol/L glucose in the presence or absence of 0.5 mmol/L LA for 8 days. LA significantly reduced the numbers of apoptotic HIT-T15 cells and inhibited the cell overgrowth normally induced by high glucose treatment. Additionally, LA inhibited insulin expression and secretion in HIT-T15 cells induced by high glucose. Further study demonstrated that LA upregulated Pdx1 and Bcl2 gene expression, reduced Bax gene expression, and promoted phosphorylation of Akt in HIT-T15 cells treated with high glucose. Intriguingly, knockdown of Pdx1 expression partially offset the anti-apoptotic effect of LA. However, inhibition of Akt by PI3K/AKT antagonist LY294002 only slightly reversed the anti-apoptosis effect of LA and mildly decreased the gene expression level of Pdx1 (P > 0.05). Moreover, LA only slightly attenuated reactive oxygen species (ROS) production and augmented mitochondrial membrane potential. Therefore, our data suggest that α-lipoic acid can effectively attenuate high glucose-induced HIT-T15 cell apoptosis probably by increasing Pdx1 expression. These findings provide a new interpretation on the role of LA in the treatment of diabetes. © 2012 The Authors Development, Growth & Differentiation © 2012 Japanese Society of Developmental Biologists.

  1. In vitro evaluation of natural thermal mineral waters in human keratinocyte cells: a preliminary study

    NASA Astrophysics Data System (ADS)

    Karagülle, Müfit Zeki; Karagülle, Mine; Kılıç, Songül; Sevinç, Hakan; Dündar, Cihat; Türkoğlu, Murat

    2018-06-01

    We aimed to test the anti-inflammatory and angiogenic properties of two different thermal waters at the cellular level in human keratinocyte cells in the present study. Two different thermal waters, thermo-mineral BJ1 (Bursa, Turkey) and oligomineral BG (Bolu, Turkey), were tested in human keratinocyte (HaCaT) cell line. HaCaT cells were incubated for 3 days with thermal waters; RNA isolation was carried out in the treated and untreated cells. The gene expressions of TNFα, IL-1α, and VEGF were measured using the RT-qPCR. The tested thermal waters significantly decreased the expression of IL-1α (BJ1 93% p = 0.0024 and BG 38% p = 0.0303). BJ1 and BG thermal waters downregulated the expression of TNFα (59% p = 0.0001 and 23% p = 0.0238 respectively). Furthermore, BJ1 and BG significantly downregulated the gene expression of VEGF (98% p = 0.0430 and 15% p = 0.0120). The observed decrease in the gene expression of TNFα and IL1α could be interpreted as an anti-inflammatory effect of mineral waters on HaCaT cells. Moreover, the suppressed VEGF expression might be an indicator of the antiangiogenic effect on human keratinocytes. Therefore, we hypothesized that depending on their specific chemical composition such as silica (128 mg/L) in BJ1 and hydrogen sulfide (1.2 mg/L) in BG, thermal waters suppress pro-inflammatory cytokines and angiogenic growth factor. These preliminary findings might give insight on the underlying mechanisms of the therapeutic benefits observed in some skin diseases such as rosacea and psoriasis.

  2. Wnt, Ptk7, and FGFRL expression gradients control trunk positional identity in planarian regeneration

    PubMed Central

    Lander, Rachel; Petersen, Christian P

    2016-01-01

    Mechanisms enabling positional identity re-establishment are likely critical for tissue regeneration. Planarians use Wnt/beta-catenin signaling to polarize the termini of their anteroposterior axis, but little is known about how regeneration signaling restores regionalization along body or organ axes. We identify three genes expressed constitutively in overlapping body-wide transcriptional gradients that control trunk-tail positional identity in regeneration. ptk7 encodes a trunk-expressed kinase-dead Wnt co-receptor, wntP-2 encodes a posterior-expressed Wnt ligand, and ndl-3 encodes an anterior-expressed homolog of conserved FGFRL/nou-darake decoy receptors. ptk7 and wntP-2 maintain and allow appropriate regeneration of trunk tissue position independently of canonical Wnt signaling and with suppression of ndl-3 expression in the posterior. These results suggest that restoration of regional identity in regeneration involves the interpretation and re-establishment of axis-wide transcriptional gradients of signaling molecules. DOI: http://dx.doi.org/10.7554/eLife.12850.001 PMID:27074666

  3. Just how happy is the happy puppet? An emotion signaling and kinship theory perspective on the behavioral phenotype of children with Angelman syndrome.

    PubMed

    Brown, William M; Consedine, Nathan S

    2004-01-01

    The favored level of parental investment in a child may differ for genes of maternal and paternal origin in the child. This conflict can be expressed in the phenomenon of genomic imprinting that refers to situations in which the same gene is differentially expressed depending on its parent of origin. Two disorders that show the effects of genomic imprinting--both at 15q11-q13--are Angelman Syndrome (AS) which is due to the absence of expression of maternally-inherited genes and Prader-Willi syndrome (PWS) which is due to the absence of expression of paternally-inherited genes. However, although both disorders can arise from the deletion of the same genetic region, the gustatory, behavioral, and affective characteristics of AS and PWS children are remarkably distinct. Recent research inspired by kinship theory has suggested the origins of these phenotypic differences may lie in the differential investment of each parent's genome in the AS or PWS child. Specifically, it is thought that each set of parental genes has different 'ideas' regarding how the child should behave towards the mother and how much investment they should look to extract. In normal cases, the trade-off between the competing parental genomes produces a behavioral equilibrium in the child. However, in pathological instances, particularly where gene expression is one-sided, the evolved behavioral strategies favored by the contributing genome will dominate the child's behavior. To date, research in the area of genomic conflict in AS and PWS children has primarily focused on differences in post-natal nutrition-related behaviors. The current paper extends this framework by offering an emotion and evolutionary signaling interpretation of the affective characteristics of AS children. 
A review of the affective characteristics of the two syndromes (PWS and AS) is presented before kinship and emotions theory are used to examine the functions that differential affect expression may serve in altering maternal investment. We expected that because the ultimate goal of paternal genes is to increase the child rearing burden of mothers, the Angelman behavioral phenotype should exhibit the emotion signaling characteristics that elicit levels of investment more consistent with paternal genetic interests. AS children display more positive, relative to negative, affect expressions (i.e. AS children laugh and smile more frequently than PWS children). In affect signaling theories, positive affect signals (i.e., smiling, laughing) have evolved to manipulate the sensory systems of receivers to increase social resources. In contrast, because the expression of some negative affects may indicate to the mother that the infant is not viable, negative affect expression is characteristically low among AS children. However, AS children may nonetheless have high levels of non-expressed anxiety because of its role in assisting the child (and its paternal genome) to maintain vigilance for changes in investment on the part of the mother. Overall, our kinship and emotion signaling analysis of AS children suggests that their global pattern of affect signaling represents one manifestation of an array of possible evolved strategies within the parental genome. Specifically, because AS exhibits the effects of paternally-inherited genes unhindered by the expression of maternally-inherited genes, the AS infant manifests a pattern of expression and non-expression that maximize maternal investment and thus paternal fitness. This theory is a significant departure from the standard but erroneous conjecture that a mother and child's inclusive fitness interests are one and the same. Copyright 2004 Elsevier Ltd.

  4. Identification and differential induction of the expression of aquaporins by salinity in broccoli plants.

    PubMed

    Muries, Beatriz; Faize, Mohamed; Carvajal, Micaela; Martínez-Ballesta, María Del Carmen

    2011-04-01

    Plant aquaporins belong to a large superfamily of conserved proteins called the major intrinsic proteins (MIPs). There is limited information about the diversity of MIPs and their water transport capacity in broccoli (Brassica oleracea) plants. In this study, the cDNAs of isoforms of Plasma Membrane Intrinsic Proteins (PIPs), a class of aquaporins, from broccoli roots have been partially sequenced. Thus, sequencing experiments led to the identification of eight PIP1 and three PIP2 genes encoding PIPs in B. oleracea plants. The occurrence of different gene products encoding PIPs suggests that they may play different roles in plants. The screening of their expression as well as the expression of two specific PIP2 isoforms (BoPIP2;2 and BoPIP2;3), in different organs and under different salt-stress conditions in two varieties, has helped to unravel the function and the regulation of PIPs in plants. Thus, a high degree of BoPIP2;3 expression in mature leaves suggests that this BoPIP2;3 isoform plays important roles, not only in root water relations but also in the physiology and development of leaves. In addition, differences between gene and protein patterns led us to consider that mRNA synthesis is inhibited by the accumulation of the corresponding encoded protein. Therefore, transcript levels, protein abundance determination and the integrated hydraulic architecture of the roots must be considered in order to interpret the response of broccoli to salinity.

  5. Translating standards into practice - one Semantic Web API for Gene Expression.

    PubMed

    Deus, Helena F; Prud'hommeaux, Eric; Miller, Michael; Zhao, Jun; Malone, James; Adamusiak, Tomasz; McCusker, Jim; Das, Sudeshna; Rocca Serra, Philippe; Fox, Ronan; Marshall, M Scott

    2012-08-01

    Sharing and describing experimental results unambiguously with sufficient detail to enable replication of results is a fundamental tenet of scientific research. In today's cluttered world of "-omics" sciences, data standards and standardized use of terminologies and ontologies for biomedical informatics play an important role in reporting high-throughput experiment results in formats that can be interpreted by both researchers and analytical tools. Increasing adoption of Semantic Web and Linked Data technologies for the integration of heterogeneous and distributed health care and life sciences (HCLSs) datasets has made the reuse of standards even more pressing; dynamic semantic query federation can be used for integrative bioinformatics when ontologies and identifiers are reused across data instances. We present here a methodology to integrate the results and experimental context of three different representations of microarray-based transcriptomic experiments: the Gene Expression Atlas, the W3C BioRDF task force approach to reporting Provenance of Microarray Experiments, and the HSCI blood genomics project. Our approach does not attempt to improve the expressivity of existing standards for genomics but, instead, to enable integration of existing datasets published from microarray-based transcriptomic experiments. SPARQL Construct is used to create a posteriori mappings of concepts and properties and linking rules that match entities based on query constraints. We discuss how our integrative approach can encourage reuse of the Experimental Factor Ontology (EFO) and the Ontology for Biomedical Investigations (OBIs) for the reporting of experimental context and results of gene expression studies. Copyright © 2012 Elsevier Inc. All rights reserved.

  6. Glucose Alters Per2 Rhythmicity Independent of AMPK, Whereas AMPK Inhibitor Compound C Causes Profound Repression of Clock Genes and AgRP in mHypoE-37 Hypothalamic Neurons.

    PubMed

    Oosterman, Johanneke E; Belsham, Denise D

    2016-01-01

    Specific neurons in the hypothalamus are regulated by peripheral hormones and nutrients to maintain proper metabolic control. It is unclear if nutrients can directly control clock gene expression. We have therefore utilized the immortalized, hypothalamic cell line mHypoE-37, which exhibits robust circadian rhythms of core clock genes. mHypoE-37 neurons were exposed to 0.5 or 5.5 mM glucose, comparable to physiological levels in the brain. Per2 and Bmal1 mRNAs were assessed every 3 hours over 36 hours. Incubation with 5.5 mM glucose significantly shortened the period and delayed the phase of Per2 mRNA levels, but had no effect on Bmal1. Glucose had no significant effect on phospho-GSK3β, whereas AMPK phosphorylation was altered. Thus, the AMPK inhibitor Compound C was utilized, and mRNA levels of Per2, Bmal1, Cryptochrome1 (Cry1), agouti-related peptide (AgRP), carnitine palmitoyltransferase 1C (Cpt1c), and O-linked N-acetylglucosamine transferase (Ogt) were measured. Remarkably, Compound C dramatically reduced transcript levels of Per2, Bmal1, Cry1, and AgRP, but not Cpt1c or Ogt. Because AMPK was not inhibited at the same time or concentrations as the clock genes, we suggest that the effect of Compound C on gene expression occurs through an AMPK-independent mechanism. The consequences of inhibition of the rhythmic expression of clock genes, and in turn downstream metabolic mediators, such as AgRP, could have detrimental effects on overall metabolic processes. Importantly, the effects of the most commonly used AMPK inhibitor Compound C should be interpreted with caution, considering its role in AMPK-independent repression of specific genes, and especially clock gene rhythm dysregulation.

  7. Glucose Alters Per2 Rhythmicity Independent of AMPK, Whereas AMPK Inhibitor Compound C Causes Profound Repression of Clock Genes and AgRP in mHypoE-37 Hypothalamic Neurons

    PubMed Central

    Oosterman, Johanneke E.; Belsham, Denise D.

    2016-01-01

    Specific neurons in the hypothalamus are regulated by peripheral hormones and nutrients to maintain proper metabolic control. It is unclear if nutrients can directly control clock gene expression. We have therefore utilized the immortalized, hypothalamic cell line mHypoE-37, which exhibits robust circadian rhythms of core clock genes. mHypoE-37 neurons were exposed to 0.5 or 5.5 mM glucose, comparable to physiological levels in the brain. Per2 and Bmal1 mRNAs were assessed every 3 hours over 36 hours. Incubation with 5.5 mM glucose significantly shortened the period and delayed the phase of Per2 mRNA levels, but had no effect on Bmal1. Glucose had no significant effect on phospho-GSK3β, whereas AMPK phosphorylation was altered. Thus, the AMPK inhibitor Compound C was utilized, and mRNA levels of Per2, Bmal1, Cryptochrome1 (Cry1), agouti-related peptide (AgRP), carnitine palmitoyltransferase 1C (Cpt1c), and O-linked N-acetylglucosamine transferase (Ogt) were measured. Remarkably, Compound C dramatically reduced transcript levels of Per2, Bmal1, Cry1, and AgRP, but not Cpt1c or Ogt. Because AMPK was not inhibited at the same time or concentrations as the clock genes, we suggest that the effect of Compound C on gene expression occurs through an AMPK-independent mechanism. The consequences of inhibition of the rhythmic expression of clock genes, and in turn downstream metabolic mediators, such as AgRP, could have detrimental effects on overall metabolic processes. Importantly, the effects of the most commonly used AMPK inhibitor Compound C should be interpreted with caution, considering its role in AMPK-independent repression of specific genes, and especially clock gene rhythm dysregulation. PMID:26784927

  8. GO Explorer: A gene-ontology tool to aid in the interpretation of shotgun proteomics data.

    PubMed

    Carvalho, Paulo C; Fischer, Juliana Sg; Chen, Emily I; Domont, Gilberto B; Carvalho, Maria Gc; Degrave, Wim M; Yates, John R; Barbosa, Valmir C

    2009-02-24

    Spectral counting is a shotgun proteomics approach comprising the identification and relative quantitation of thousands of proteins in complex mixtures. However, this strategy generates bewildering amounts of data whose biological interpretation is a challenge. Here we present a new algorithm, termed GO Explorer (GOEx), that leverages the gene ontology (GO) to aid in the interpretation of proteomic data. GOEx stands out because it combines data from protein fold changes with GO over-representation statistics to help draw conclusions. Moreover, it is tightly integrated within the PatternLab for Proteomics project and, thus, lies within a complete computational environment that provides parsers and pattern recognition tools designed for spectral counting. GOEx offers three independent methods to query data: an interactive directed acyclic graph, a specialist mode where key words can be searched, and an automatic search. Its usefulness is demonstrated by applying it to help interpret the effects of perillyl alcohol, a natural chemotherapeutic agent, on glioblastoma multiforme cell lines (A172). We used a new multi-surfactant shotgun proteomic strategy and identified more than 2600 proteins; GOEx pinpointed key sets of differentially expressed proteins related to cell cycle, alcohol catabolism, the Ras pathway, apoptosis, and stress response, to name a few. GOEx facilitates organism-specific studies by leveraging GO and providing a rich graphical user interface. It is a simple-to-use tool, specialized for biologists who wish to analyze spectral counting data from shotgun proteomics. GOEx is available at http://pcarvalho.com/patternlab.
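
    As a rough illustration of how fold changes and GO over-representation statistics can be combined, the sketch below selects proteins whose absolute log2 fold change passes a cutoff and then scores each GO term with a one-sided hypergeometric test. It is not GOEx's actual algorithm; the function names, the cutoff of 1.0 and the toy protein data are all assumptions made for the example.

```python
from math import comb


def upper_tail(population, annotated, sample, hits):
    """P(X >= hits) for a hypergeometric draw of `sample` proteins from
    `population`, of which `annotated` carry the GO term."""
    total = comb(population, sample)
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return favourable / total


def enriched_terms(log2_fc, term_members, threshold=1.0):
    """Map each GO term to an over-representation p-value.

    log2_fc      -- hypothetical dict: protein id -> log2 fold change
    term_members -- hypothetical dict: GO term -> set of protein ids
    threshold    -- assumed cutoff on |log2 fold change|
    """
    changed = {p for p, fc in log2_fc.items() if abs(fc) >= threshold}
    results = {}
    for term, members in term_members.items():
        in_data = members & set(log2_fc)   # ignore proteins never quantified
        hits = len(in_data & changed)
        results[term] = upper_tail(len(log2_fc), len(in_data), len(changed), hits)
    return results


# Toy data, invented for illustration only.
log2_fc = {"P1": 2.1, "P2": -1.5, "P3": 0.2, "P4": 0.1, "P5": 1.8, "P6": -0.3}
term_members = {
    "cell cycle": {"P1", "P2", "P5"},
    "stress response": {"P3", "P4", "P6"},
}
results = enriched_terms(log2_fc, term_members)
```

    In this toy set all three changed proteins belong to "cell cycle", so that term receives the smallest p-value, whereas "stress response" contains no changed protein and scores 1.0. A real analysis would also correct the p-values for the number of terms tested.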

  9. Asian Citrus Psyllid Expression Profiles Suggest Candidatus Liberibacter Asiaticus-Mediated Alteration of Adult Nutrition and Metabolism, and of Nymphal Development and Immunity

    PubMed Central

    He, Ruifeng; Nelson, William; Yin, Guohua; Cicero, Joseph M.; Willer, Mark; Kim, Ryan; Kramer, Robin; May, Greg A.; Crow, John A.; Soderlund, Carol A.; Gang, David R.; Brown, Judith K.

    2015-01-01

    The Asian citrus psyllid (ACP) Diaphorina citri Kuwayama (Hemiptera: Psyllidae) is the insect vector of the fastidious bacterium Candidatus Liberibacter asiaticus (CLas), the causal agent of citrus greening disease, or Huanglongbing (HLB). The widespread invasiveness of the psyllid vector and HLB in citrus trees worldwide has underscored the need for non-traditional approaches to manage the disease. One tenable solution is through the deployment of RNA interference technology to silence protein-protein interactions essential for ACP-mediated CLas invasion and transmission. To identify psyllid interactor-bacterial effector combinations associated with psyllid-CLas interactions, cDNA libraries were constructed from CLas-infected and CLas-free ACP adults and nymphs, and analyzed for differential expression. Library assemblies comprised 24,039,255 reads and yielded 45,976 consensus contigs. They were annotated (UniProt), classified using Gene Ontology, and subjected to in silico expression analyses using the Transcriptome Computational Workbench (TCW) (http://www.sohomoptera.org/ACPPoP/). Functional-biological pathway interpretations were carried out using the Kyoto Encyclopedia of Genes and Genomes databases. Differentially expressed contigs in adults and/or nymphs represented genes and/or metabolic/pathogenesis pathways involved in adhesion, biofilm formation, development-related, immunity, nutrition, stress, and virulence. Notably, contigs involved in gene silencing and transposon-related responses were documented in a psyllid for the first time. This is the first comparative transcriptomic analysis of ACP adults and nymphs infected and uninfected with CLas. The results provide key initial insights into host-parasite interactions involving CLas effectors that contribute to invasion-virulence, and to host nutritional exploitation and immune-related responses that appear to be essential for successful ACP-mediated circulative, propagative CLas transmission. PMID:26091106

  10. Immediate-early gene response to repeated immobilization: Fos protein and arc mRNA levels appear to be less sensitive than c-fos mRNA to adaptation.

    PubMed

    Ons, Sheila; Rotllant, David; Marín-Blasco, Ignacio J; Armario, Antonio

    2010-06-01

    Stress exposure resulted in brain induction of immediate-early genes (IEGs), considered as markers of neuronal activation. Upon repeated exposure to the same stressor, reduction of IEG response (adaptation) has been often observed, but there are important discrepancies in literature that may be in part related to the particular IEG and methodology used. We studied the differential pattern of adaptation of the IEGs c-fos and arc (activity-regulated cytoskeleton-associated protein) after repeated exposure to a severe stressor: immobilization on wooden boards (IMO). Rats repeatedly exposed to IMO showed reduced c-fos mRNA levels in response to acute IMO in most brain areas studied: the medial prefrontal cortex (mPFC), lateral septum (LS), medial amygdala (MeA), paraventricular nucleus of the hypothalamus (PVN) and locus coeruleus. In contrast, the number of neurons showing Fos-like immunoreactivity was only reduced in the MeA and the various subregions of the PVN. IMO-induced increases in arc gene expression were restricted to telencephalic regions and reduced by repeated IMO only in the mPFC. Double-labelling in the LS of IMO-exposed rats revealed that arc was expressed in only one-third of Fos+ neurons, suggesting two populations of Fos+ neurons. These data suggest that c-fos mRNA levels are more affected by repeated IMO than corresponding protein, and that arc gene expression does not reflect adaptation in most brain regions, which may be related to its constitutive expression. Therefore, the choice of a particular IEG and the method of measurement are important for proper interpretation of the impact of chronic repeated stress on brain activation.

  11. The candidate tumor suppressor gene BLU, located at the commonly deleted region 3p21.3, is an E2F-regulated, stress-responsive gene and inactivated by both epigenetic and genetic mechanisms in nasopharyngeal carcinoma.

    PubMed

    Qiu, Guo-Hua; Tan, Luke K S; Loh, Kwok Seng; Lim, Chai Yen; Srivastava, Gopesh; Tsai, Sen-Tien; Tsao, Sai Wah; Tao, Qian

    2004-06-10

    Loss of heterozygosity at 3p21 is common in various cancers including nasopharyngeal carcinoma (NPC). BLU is one of the candidate tumor suppressor genes (TSGs) in this region. Ectopic expression of BLU results in the inhibition of colony formation of cancer cells, suggesting that BLU is a tumor suppressor. We have identified a functional BLU promoter and found that it can be activated by environmental stresses such as heat shock, and is regulated by E2F. The promoter and first exon are located within a CpG island. BLU is highly expressed in testis and normal upper respiratory tract tissues including nasopharynx. However, in all seven NPC cell lines examined, BLU expression was downregulated and inversely correlated with promoter hypermethylation. Biallelic epigenetic inactivation of BLU was also observed in three cell lines. Hypermethylation was further detected in 19/29 (66%) of primary NPC tumors, but not in normal nasopharyngeal tissues. Treatment of NPC cell lines with 5-aza-2'-deoxycytidine activated BLU expression along with promoter demethylation. Although hypermethylation of RASSF1A, another TSG located immediately downstream of BLU, was detected in 20/27 (74%) of NPC tumors, no correlation between the hypermethylation of these two TSGs was observed (P=0.6334). In addition to methylation, homozygous deletion of BLU was found in 7/29 (24%) of tumors. Therefore, BLU is a stress-responsive gene, being disrupted in 83% (24/29) of NPC tumors by either epigenetic or genetic mechanisms. Our data are consistent with the interpretation that BLU is a TSG for NPC.
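
    The tumour counts quoted above can be checked with a few lines of arithmetic. The sketch below recomputes the reported percentages and, assuming the 24 disrupted tumours are the union of the hypermethylated and homozygously deleted ones (an assumption; the abstract does not say so explicitly), derives how many tumours would carry both lesions. The variable and function names are illustrative.

```python
def percent(part, whole):
    """Percentage rounded to the nearest integer, as reported in the abstract."""
    return round(100 * part / whole)


tumours = 29
blu_methylated = 19   # BLU promoter hypermethylation
blu_deleted = 7       # homozygous deletion of BLU
blu_disrupted = 24    # disrupted by either mechanism

reported = {
    "BLU hypermethylated": percent(blu_methylated, tumours),   # 66
    "BLU deleted": percent(blu_deleted, tumours),              # 24
    "BLU disrupted": percent(blu_disrupted, tumours),          # 83
    "RASSF1A hypermethylated": percent(20, 27),                # 74
}

# Inclusion-exclusion, valid only under the union assumption stated above.
both_lesions = blu_methylated + blu_deleted - blu_disrupted    # 2
```

    Under that assumption two tumours would show both hypermethylation and deletion, which is why 19 + 7 exceeds the 24 tumours reported as disrupted.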

  12. Pathways and genes differentially expressed in the motor cortex of patients with sporadic amyotrophic lateral sclerosis.

    PubMed

    Lederer, Carsten W; Torrisi, Antonietta; Pantelidou, Maria; Santama, Niovi; Cavallaro, Sebastiano

    2007-01-23

    Amyotrophic lateral sclerosis (ALS) is a fatal disorder caused by the progressive degeneration of motoneurons in brain and spinal cord. Despite identification of disease-linked mutations, the diversity of processes involved and the ambiguity of their relative importance in ALS pathogenesis still represent a major impediment to disease models as a basis for effective therapies. Moreover, the human motor cortex, although critical to ALS pathology and physiologically altered in most forms of the disease, has not been screened systematically for therapeutic targets. By whole-genome expression profiling and stringent significance tests we identify genes and gene groups de-regulated in the motor cortex of patients with sporadic ALS, and interpret the role of individual candidate genes in a framework of differentially expressed pathways. Our findings emphasize the importance of defense responses and cytoskeletal, mitochondrial and proteasomal dysfunction, reflect reduced neuronal maintenance and vesicle trafficking, and implicate impaired ion homeostasis and glycolysis in ALS pathogenesis. Additionally, we compared our dataset with publicly available data for the SALS spinal cord, and show a high correlation of changes linked to the diseased state in the SALS motor cortex. In an analogous comparison with data for the Alzheimer's disease hippocampus we demonstrate a low correlation of global changes and a moderate correlation for changes specifically linked to the SALS diseased state. Gene and sample numbers investigated allow pathway- and gene-based analyses by established error-correction methods, drawing a molecular portrait of the ALS motor cortex that faithfully represents many known disease features and uncovers several novel aspects of ALS pathology. Contrary to expectations for a tissue under oxidative stress, nuclear-encoded mitochondrial genes are uniformly down-regulated. Moreover, the down-regulation of mitochondrial and glycolytic genes implies a combined reduction of mitochondrial and cytoplasmic energy supply, with a possible role in the death of ALS motoneurons. Identifying candidate genes exclusively expressed in non-neuronal cells, we also highlight the importance of these cells in disease development in the motor cortex. Notably, some pathways and candidate genes identified by this study are direct or indirect targets of medication already applied to unrelated illnesses and point the way towards the rapid development of effective symptomatic ALS therapies.

  13. BubbleGUM: automatic extraction of phenotype molecular signatures and comprehensive visualization of multiple Gene Set Enrichment Analyses.

    PubMed

    Spinelli, Lionel; Carpentier, Sabrina; Montañana Sanchis, Frédéric; Dalod, Marc; Vu Manh, Thien-Phong

    2015-10-19

    Recent advances in the analysis of high-throughput expression data have led to the development of tools that scaled up their focus from the single-gene to the gene-set level. For example, the popular Gene Set Enrichment Analysis (GSEA) algorithm can detect moderate but coordinated expression changes of groups of presumably related genes between pairs of experimental conditions. This considerably improves extraction of information from high-throughput gene expression data. However, although many gene sets covering a large panel of biological fields are available in public databases, the ability to generate home-made gene sets relevant to one's biological question is crucial but remains a substantial challenge to most biologists lacking statistical or bioinformatic expertise. This is all the more the case when attempting to define a gene set specific to one condition compared to many others. Thus, there is a crucial need for easy-to-use software for the generation of relevant home-made gene sets from complex datasets, their use in GSEA, and the correction of the results when applied to multiple comparisons of many experimental conditions. We developed BubbleGUM (GSEA Unlimited Map), a tool that automatically extracts molecular signatures from transcriptomic data and performs exhaustive GSEA with multiple testing correction. One original feature of BubbleGUM notably resides in its capacity to integrate and compare numerous GSEA results into an easy-to-grasp graphical representation. We applied our method to generate transcriptomic fingerprints for murine cell types and to assess their enrichments in human cell types. This analysis allowed us to confirm homologies between mouse and human immunocytes. BubbleGUM is open-source software that automatically generates molecular signatures out of complex expression datasets and directly assesses their enrichment by GSEA on independent datasets. Enrichments are displayed in a graphical output that helps in interpreting the results. This innovative methodology has recently been used to answer important questions in functional genomics, such as the degree of similarity between microarray datasets from different laboratories or with different experimental models or clinical cohorts. BubbleGUM is executable through an intuitive interface so that both bioinformaticians and biologists can use it. It is available at http://www.ciml.univ-mrs.fr/applications/BubbleGUM/index.html .
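
    To make the idea of detecting "moderate but coordinated" changes concrete, the sketch below computes an unweighted running-sum enrichment score of the kind GSEA is built on: genes are walked in ranked order, the score rises at each gene-set member and falls at each non-member, and the largest deviation from zero is reported. This is a simplified Kolmogorov-Smirnov-like variant without the correlation weighting or permutation testing of the full method, and it is not BubbleGUM's code; the function name and the toy gene identifiers are assumptions.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    ranked_genes -- gene ids ordered from most up- to most down-regulated
    gene_set     -- set of gene ids forming the signature to test
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    running = 0.0
    extreme = 0.0
    for is_hit in hits:
        # Step up on a member, down on a non-member; steps sum to zero overall.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(extreme):
            extreme = running
    return extreme


# Toy ranking, invented for illustration only.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})      # members at the top
es_bottom = enrichment_score(ranked, {"g5", "g6"})   # members at the bottom
```

    A signature concentrated at the top of the ranking yields a score near +1, and one concentrated at the bottom yields a score near -1; signatures scattered through the list stay close to zero.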

  14. INfORM: Inference of NetwOrk Response Modules.

    PubMed

    Marwah, Veer Singh; Kinaret, Pia Anneli Sofia; Serra, Angela; Scala, Giovanni; Lauerma, Antti; Fortino, Vittorio; Greco, Dario

    2018-06-15

    Detecting and interpreting responsive modules from gene expression data by using network-based approaches is a common but laborious task. It often requires the application of several computational methods implemented in different software packages, forcing biologists to compile complex analytical pipelines. Here we introduce INfORM (Inference of NetwOrk Response Modules), an R shiny application that enables non-expert users to detect, evaluate and select gene modules with high statistical and biological significance. INfORM is a comprehensive tool for the identification of biologically meaningful response modules from consensus gene networks inferred by using multiple algorithms. It is accessible through an intuitive graphical user interface allowing for a level of abstraction from the computational steps. INfORM is freely available for academic use at https://github.com/Greco-Lab/INfORM. Supplementary data are available at Bioinformatics online.

  15. Effect of Various Diets on the Expression of Phase-I Drug Metabolizing Enzymes in Livers of Mice

    PubMed Central

    Guo, Ying; Cui, Julia Yue; Lu, Hong; Klaassen, Curtis D.

    2017-01-01

    Previous studies have shown that diets can alter the metabolism of drugs; however, it is difficult to compare the effects of multiple diets on drug metabolism among different experimental settings. Phase-I related genes play a major role in the biotransformation of pro-drugs and drugs. In the current study, effects of nine diets on the mRNA expression of phase-I drug-metabolizing enzymes in livers of mice were simultaneously investigated. Compared to the AIN-93M purified diet (control), 73 of the 132 critical phase-I drug metabolizing genes were differentially regulated by at least one diet. Diet restriction produced the largest number of changed genes (51), followed by the atherogenic diet (27), high-fat diet (25), standard rodent chow (21), western diet (20), high-fructose diet (5), EFA deficient diet (3), and low n-3 FA diet (1). The mRNAs of the Fmo family changed most, followed by Cyp2b and 4a subfamilies, as well as Por (from 1121- to 21-fold increase of these mRNAs). There were 59 genes not altered by any of these diets. The present results may improve the interpretation of studies with mice and aid in determining effective and safe doses for individuals with different nutritional diets. PMID:25733028

  16. Analysis of Msx1; Msx2 double mutants reveals multiple roles for Msx genes in limb development.

    PubMed

    Lallemand, Yvan; Nicola, Marie-Anne; Ramos, Casto; Bach, Antoine; Cloment, Cécile Saint; Robert, Benoît

    2005-07-01

    The homeobox-containing genes Msx1 and Msx2 are highly expressed in the limb field from the earliest stages of limb formation and, subsequently, in both the apical ectodermal ridge and underlying mesenchyme. However, mice homozygous for a null mutation in either Msx1 or Msx2 do not display abnormalities in limb development. By contrast, Msx1; Msx2 double mutants exhibit a severe limb phenotype. Our analysis indicates that these genes play a role in crucial processes during limb morphogenesis along all three axes. Double mutant limbs are shorter and lack anterior skeletal elements (radius/tibia, thumb/hallux). Gene expression analysis confirms that there is no formation of regions with anterior identity. This correlates with the absence of dorsoventral boundary specification in the anterior ectoderm, which precludes apical ectodermal ridge formation anteriorly. As a result, anterior mesenchyme is not maintained, leading to oligodactyly. Paradoxically, polydactyly is also frequent and appears to be associated with extended Fgf activity in the apical ectodermal ridge, which is maintained up to 14.5 dpc. This results in a major outgrowth of the mesenchyme anteriorly, which nevertheless maintains a posterior identity, and leads to formation of extra digits. These defects are interpreted in the context of an impairment of Bmp signalling.

  17. Using yeast to determine the functional consequences of mutations in the human p53 tumor suppressor gene: An introductory course-based undergraduate research experience in molecular and cell biology.

    PubMed

    Hekmat-Scafe, Daria S; Brownell, Sara E; Seawell, Patricia Chandler; Malladi, Shyamala; Imam, Jamie F Conklin; Singla, Veena; Bradon, Nicole; Cyert, Martha S; Stearns, Tim

    2017-03-04

    The opportunity to engage in scientific research is an important, but often neglected, component of undergraduate training in biology. We describe the curriculum for an innovative, course-based undergraduate research experience (CURE) appropriate for a large, introductory cell and molecular biology laboratory class that leverages students' high level of interest in cancer. The course is highly collaborative and emphasizes the analysis and interpretation of original scientific data. During the course, students work in teams to characterize a collection of mutations in the human p53 tumor suppressor gene via expression and analysis in yeast. Initially, student pairs use both qualitative and quantitative assays to assess the ability of their p53 mutant to activate expression of reporter genes, and they localize their mutation within the p53 structure. Through facilitated discussion, students suggest possible molecular explanations for the transactivation defects displayed by their p53 mutants and propose experiments to test these hypotheses that they execute during the second part of the course. They use a western blot to determine whether mutant p53 levels are reduced, a DNA-binding assay to test whether recognition of any of three p53 target sequences is compromised, and fluorescence microscopy to assay nuclear localization. Students studying the same p53 mutant periodically convene to discuss and interpret their combined data. The course culminates in a poster session during which students present their findings to peers, instructors, and the greater biosciences community. Based on our experience, we provide recommendations for the development of similar large introductory lab courses. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(2):161-178, 2017.

  18. Using yeast to determine the functional consequences of mutations in the human p53 tumor suppressor gene: An introductory course‐based undergraduate research experience in molecular and cell biology

    PubMed Central

    Brownell, Sara E.; Seawell, Patricia Chandler; Malladi, Shyamala; Imam, Jamie F. Conklin; Singla, Veena; Bradon, Nicole; Cyert, Martha S.; Stearns, Tim

    2016-01-01

    Abstract The opportunity to engage in scientific research is an important, but often neglected, component of undergraduate training in biology. We describe the curriculum for an innovative, course‐based undergraduate research experience (CURE) appropriate for a large, introductory cell and molecular biology laboratory class that leverages students′ high level of interest in cancer. The course is highly collaborative and emphasizes the analysis and interpretation of original scientific data. During the course, students work in teams to characterize a collection of mutations in the human p53 tumor suppressor gene via expression and analysis in yeast. Initially, student pairs use both qualitative and quantitative assays to assess the ability of their p53 mutant to activate expression of reporter genes, and they localize their mutation within the p53 structure. Through facilitated discussion, students suggest possible molecular explanations for the transactivation defects displayed by their p53 mutants and propose experiments to test these hypotheses that they execute during the second part of the course. They use a western blot to determine whether mutant p53 levels are reduced, a DNA‐binding assay to test whether recognition of any of three p53 target sequences is compromised, and fluorescence microscopy to assay nuclear localization. Students studying the same p53 mutant periodically convene to discuss and interpret their combined data. The course culminates in a poster session during which students present their findings to peers, instructors, and the greater biosciences community. Based on our experience, we provide recommendations for the development of similar large introductory lab courses. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(2):161–178, 2017. PMID:27873457

  19. Threshold-free high-power methods for the ontological analysis of genome-wide gene-expression studies

    PubMed Central

    Nilsson, Björn; Håkansson, Petra; Johansson, Mikael; Nelander, Sven; Fioretos, Thoas

    2007-01-01

    Ontological analysis facilitates the interpretation of microarray data. Here we describe new ontological analysis methods which, unlike existing approaches, are threshold-free and statistically powerful. We perform extensive evaluations and introduce a new concept, detection spectra, to characterize methods. We show that different ontological analysis methods exhibit distinct detection spectra, and that it is critical to account for this diversity. Our results argue strongly against the continued use of existing methods, and provide directions towards an enhanced approach. PMID:17488501

  20. Functional characterization of EZH2β reveals the increased complexity of EZH2 isoforms involved in the regulation of mammalian gene expression

    PubMed Central

    2013-01-01

    Background Histone methyltransferase enhancer of zeste homologue 2 (EZH2) forms an obligate repressive complex with suppressor of zeste 12 and embryonic ectoderm development, which is thought, along with EZH1, to be primarily responsible for mediating Polycomb-dependent gene silencing. Polycomb-mediated repression influences gene expression across the entire gamut of biological processes, including development, differentiation and cellular proliferation. Deregulation of EZH2 expression is implicated in numerous complex human diseases. To date, most EZH2-mediated function has been primarily ascribed to a single protein product of the EZH2 locus. Results We report that the EZH2 locus undergoes alternative splicing to yield at least two structurally and functionally distinct EZH2 methyltransferases. The longest protein encoded by this locus is the conventional enzyme, which we refer to as EZH2α, whereas EZH2β, characterized here, represents a novel isoform. We find that EZH2β localizes to the cell nucleus, complexes with embryonic ectoderm development and suppressor of zeste 12, trimethylates histone 3 at lysine 27, and mediates silencing of target promoters. At the cell biological level, we find that increased EZH2β induces cell proliferation, demonstrating that this protein is functional in the regulation of processes previously attributed to EZH2α. Biochemically, through the use of genome-wide expression profiling, we demonstrate that EZH2β governs a pattern of gene repression that is often ontologically redundant from that of EZH2α, but also divergent for a wide variety of specific target genes. Conclusions Combined, these results demonstrate that an expanded repertoire of EZH2 writers can modulate histone code instruction during histone 3 lysine 27-mediated gene silencing. These data support the notion that the regulation of EZH2-mediated gene silencing is more complex than previously anticipated and should guide the design and interpretation of future studies aimed at understanding the biochemical and biological roles of this important family of epigenomic regulators. PMID:23448518

  1. Computational synchronization of microarray data with application to Plasmodium falciparum.

    PubMed

    Zhao, Wei; Dauwels, Justin; Niles, Jacquin C; Cao, Jianshu

    2012-06-21

    Microarrays are widely used to investigate the blood stage of Plasmodium falciparum infection. Starting with synchronized cells, gene expression levels are continually measured over the 48-hour intra-erythrocytic cycle (IDC). However, the cell population gradually loses synchrony during the experiment. As a result, the microarray measurements are blurred. In this paper, we propose a generalized deconvolution approach to reconstruct the intrinsic expression pattern, and apply it to P. falciparum IDC microarray data. We develop a statistical model for the decay of synchrony among cells, and reconstruct the expression pattern through statistical inference. The proposed method can handle microarray measurements with noise and missing data. The original gene expression patterns become more apparent in the reconstructed profiles, making it easier to analyze and interpret the data. We hypothesize that reconstructed gene expression patterns represent better temporally resolved expression profiles that can be probabilistically modeled to match changes in expression level to IDC transitions. In particular, we identify transcriptionally regulated protein kinases putatively involved in regulating the P. falciparum IDC. By analyzing publicly available microarray data sets for the P. falciparum IDC, protein kinases are ranked in terms of their likelihood to be involved in regulating transitions between the ring, trophozoite and schizont developmental stages of the P. falciparum IDC. In our theoretical framework, a few protein kinases have high probability rankings, and could potentially be involved in regulating these developmental transitions. This study proposes a new methodology for extracting intrinsic expression patterns from microarray data. By applying this method to P. falciparum microarray data, several protein kinases are predicted to play a significant role in the P. falciparum IDC. Earlier experiments have indeed confirmed that several of these kinases are involved in this process. Overall, these results indicate that further functional analysis of these additional putative protein kinases may reveal new insights into how the P. falciparum IDC is regulated.
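
    The blurring described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each measurement is a Gaussian-weighted average of the intrinsic profile over neighbouring cell ages, with a spread that grows during the experiment, and it sharpens the profile with a plain Landweber iteration rather than the statistical inference the authors describe. The function names, the kernel shape and every parameter value are hypothetical.

```python
import math

PERIOD = 48.0  # hours in one cycle

def blur_matrix(n, base_sd=1.0, sd_growth=0.08):
    """Row i holds the weights with which cells of every age contribute to the
    measurement at time point i; the spread widens with time (assumed model)."""
    step = PERIOD / n
    rows = []
    for i in range(n):
        sd = base_sd + sd_growth * i * step
        weights = []
        for j in range(n):
            d = abs(i - j) * step
            d = min(d, PERIOD - d)  # cell ages wrap around the cycle
            weights.append(math.exp(-0.5 * (d / sd) ** 2))
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows

def matvec(M, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def landweber(K, y, steps=300, tau=0.5):
    """Iterate x <- x + tau * K^T (y - K x), starting from the blurred data."""
    Kt = [list(col) for col in zip(*K)]
    x = list(y)
    for _ in range(steps):
        residual = [a - b for a, b in zip(y, matvec(K, x))]
        x = [xi + tau * g for xi, g in zip(x, matvec(Kt, residual))]
    return x

def rms_error(a, b):
    """Root-mean-square difference between two profiles."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

n = 48
K = blur_matrix(n)
truth = [1.0 + math.sin(2 * math.pi * i / n) for i in range(n)]  # toy profile
observed = matvec(K, truth)          # what a desynchronizing culture would show
recovered = landweber(K, observed)   # sharpened estimate of the profile
```

    Comparing `rms_error(observed, truth)` with `rms_error(recovered, truth)` shows how much of the blurring the iteration undoes on this noise-free toy profile; noisy or missing data would call for regularization or a probabilistic model.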

  2. Modelling formulations using gene expression programming--a comparative analysis with artificial neural networks.

    PubMed

    Colbourn, E A; Roskilly, S J; Rowe, R C; York, P

    2011-10-09

    This study has investigated the utility and potential advantages of gene expression programming (GEP)--a new development in evolutionary computing for modelling data and automatically generating equations that describe the cause-and-effect relationships in a system--to four types of pharmaceutical formulation and compared the models with those generated by neural networks, a technique now widely used in the formulation development. Both methods were capable of discovering subtle and non-linear relationships within the data, with no requirement from the user to specify the functional forms that should be used. Although the neural networks rapidly developed models with higher values for the ANOVA R² these were black box and provided little insight into the key relationships. However, GEP, although significantly slower at developing models, generated relatively simple equations describing the relationships that could be interpreted directly. The results indicate that GEP can be considered an effective and efficient modelling technique for formulation data. Copyright © 2011 Elsevier B.V. All rights reserved.

  3. Does CTCF mediate between nuclear organization and gene expression?

    PubMed

    Ohlsson, Rolf; Lobanenkov, Victor; Klenova, Elena

    2010-01-01

    The multifunctional zinc-finger protein CCCTC-binding factor (CTCF) is a very strong candidate for the role of coordinating the expression level of coding sequences with their three-dimensional position in the nucleus, apparently responding to a "code" in the DNA itself. Dynamic interactions between chromatin fibers in the context of nuclear architecture have been implicated in various aspects of genome functions. However, the molecular basis of these interactions still remains elusive and is a subject of intense debate. Here we discuss the nature of CTCF-DNA interactions, the CTCF-binding specificity to its binding sites and the relationship between CTCF and chromatin, and we examine data linking CTCF with gene regulation in the three-dimensional nuclear space. We discuss why these features render CTCF a very strong candidate for the role and propose a unifying model, the "CTCF code," explaining the mechanistic basis of how the information encrypted in DNA may be interpreted by CTCF into diverse nuclear functions.

  4. TSVdb: a web-tool for TCGA splicing variants analysis.

    PubMed

    Sun, Wenjie; Duan, Ting; Ye, Panmeng; Chen, Kelie; Zhang, Guanling; Lai, Maode; Zhang, Honghe

    2018-05-29

    Collaborative projects such as The Cancer Genome Atlas (TCGA) have generated various -omics and clinical data on cancer. Many computational tools have been developed to facilitate the study of the molecular characterization of tumors using data from the TCGA. Alternative splicing of a gene produces splicing variants, and accumulating evidence has revealed its essential role in cancer-related processes, implying the urgent need to discover tumor-specific isoforms and uncover their potential functions in tumorigenesis. We developed TSVdb, a web-based tool, to explore alternative splicing based on TCGA samples with 30 clinical variables from 33 tumors. TSVdb has an integrated and well-proportioned interface for visualization of the clinical data, gene expression, usage of exons/junctions and splicing patterns. Researchers can interpret the isoform expression variations between or across clinical subgroups and estimate the relationships between isoforms and patient prognosis. TSVdb is available at http://www.tsvdb.com , and the source code is available at https://github.com/wenjie1991/TSVdb . TSVdb will inspire oncologists and accelerate isoform-level advances in cancer research.

  5. An Autosomal Gene That Affects X Chromosome Expression and Sex Determination in Caenorhabditis elegans

    PubMed Central

    Meneely, Philip M.; Wood, William B.

    1984-01-01

    Recessive mutant alleles at the autosomal dpy-21 locus of C. elegans cause a dumpy phenotype in XX animals but not in XO animals. This dumpy phenotype is characteristic of X chromosome aneuploids with higher than normal X to autosome ratios and is proposed to result from overexpression of X-linked genes. We have isolated a new dpy-21 allele that also causes partial hermaphroditization of XO males, without causing the dumpy phenotype. All dpy-21 alleles show hermaphroditization effects in XO males that carry a duplication of part of the X chromosome and also partially suppress a transformer (tra-1) mutation that converts XX animals into males. Experiments with a set of X chromosome duplications show that the defects of dpy-21 mutants can result from interaction with several different regions of the X chromosome. We propose that dpy-21 regulates X chromosome expression and may be involved in interpreting X chromosome dose for the developmental decisions of both sex determination and dosage compensation. PMID:6537930

  6. New strategies in drug discovery.

    PubMed

    Ohlstein, Eliot H; Johnson, Anthony G; Elliott, John D; Romanic, Anne M

    2006-01-01

    Gene identification followed by determination of the expression of genes in a given disease and understanding of the function of the gene products is central to the drug discovery process. The ability to associate a specific gene with a disease can be attributed primarily to the extraordinary progress that has been made in the areas of gene sequencing and information technologies. Selection and validation of novel molecular targets have become of great importance in light of the abundance of new potential therapeutic drug targets that have emerged from human gene sequencing. In response to this revolution within the pharmaceutical industry, the development of high-throughput methods in both biology and chemistry has been necessitated. Further, the successful translation of basic scientific discoveries into clinical experimental medicine and novel therapeutics is an increasing challenge. As such, a new paradigm for drug discovery has emerged. This process involves the integration of clinical, genetic, genomic, and molecular phenotype data partnered with cheminformatics. Central to this process, the data generated are managed, collated, and interpreted with the use of informatics. This review addresses the use of new technologies that have arisen to deal with this new paradigm.

  7. Identification of Differentially Expressed Genes through Integrated Study of Alzheimer’s Disease Affected Brain Regions

    PubMed Central

    Berretta, Regina; Moscato, Pablo

    2016-01-01

    Background Alzheimer’s disease (AD) is the most common form of dementia in older adults that damages the brain and results in impaired memory, thinking and behaviour. The identification of differentially expressed genes and related pathways among affected brain regions can provide more information on the mechanisms of AD. In the past decade, several studies have reported many genes that are associated with AD. This wealth of information has become difficult to follow and interpret as most of the results are conflicting. In that case, it is worth doing an integrated study of multiple datasets that helps to increase the total number of samples and the statistical power in detecting biomarkers. In this study, we present an integrated analysis of five different brain region datasets and introduce new genes that warrant further investigation. Methods The aim of our study is to apply a novel combinatorial optimisation based meta-analysis approach to identify differentially expressed genes that are associated with AD across brain regions. In this study, microarray gene expression data from 161 samples (74 non-demented controls, 87 AD) from the Entorhinal Cortex (EC), Hippocampus (HIP), Middle temporal gyrus (MTG), Posterior cingulate cortex (PC), Superior frontal gyrus (SFG) and visual cortex (VCX) brain regions were integrated and analysed using our method. The results are then compared to two popular meta-analysis methods, RankProd and GeneMeta, and to what can be obtained by analysing the individual datasets. Results We find genes related with AD that are consistent with existing studies, and new candidate genes not previously related with AD. Our study confirms the up-regulation of INFAR2 and PTMA along with the down-regulation of GPHN, RAB2A, PSMD14 and FGF. Novel genes PSMB2, WNK1, RPL15, SEMA4C, RWDD2A and LARGE are found to be differentially expressed across all brain regions. Further investigation on these genes may provide new insights into the development of AD. In addition, we identified the presence of 23 non-coding features, including four miRNA precursors (miR-7, miR570, miR-1229 and miR-6821), dysregulated across the brain regions. Furthermore, we compared our results with two popular meta-analysis methods RankProd and GeneMeta to validate our findings and performed a sensitivity analysis by removing one dataset at a time to assess the robustness of our results. These new findings may provide new insights into the disease mechanisms and thus make a significant contribution in the near future towards understanding, prevention and cure of AD. PMID:27050411

  8. Leveraging lung tissue transcriptome to uncover candidate causal genes in COPD genetic associations.

    PubMed

    Lamontagne, Maxime; Bérubé, Jean-Christophe; Obeidat, Ma'en; Cho, Michael H; Hobbs, Brian D; Sakornsakolpat, Phuwanat; de Jong, Kim; Boezen, H Marike; Nickle, David; Hao, Ke; Timens, Wim; van den Berge, Maarten; Joubert, Philippe; Laviolette, Michel; Sin, Don D; Paré, Peter D; Bossé, Yohan

    2018-05-15

    Causal genes of chronic obstructive pulmonary disease (COPD) remain elusive. The current study aims at integrating genome-wide association studies (GWAS) and lung expression quantitative trait loci (eQTL) data to map COPD candidate causal genes and gain biological insights into the recently discovered COPD susceptibility loci. Two complementary genomic datasets on COPD were studied. First, the lung eQTL dataset which included whole-genome gene expression and genotyping data from 1038 individuals. Second, the largest COPD GWAS to date from the International COPD Genetics Consortium (ICGC) with 13 710 cases and 38 062 controls. Methods that integrated GWAS with eQTL signals including transcriptome-wide association study (TWAS), colocalization and Mendelian randomization-based (SMR) approaches were used to map causality genes, i.e. genes with the strongest evidence of being the functional effector at specific loci. These methods were applied at the genome-wide level and at COPD risk loci derived from the GWAS literature. Replication was performed using lung data from GTEx. We collated 129 non-overlapping risk loci for COPD from the GWAS literature. At the genome-wide scale, 12 new COPD candidate genes/loci were revealed and six replicated in GTEx including CAMK2A, DMPK, MYO15A, TNFRSF10A, BTN3A2 and TRBV30. In addition, we mapped candidate causal genes for 60 out of the 129 GWAS-nominated loci and 23 of them were replicated in GTEx. Mapping candidate causal genes in lung tissue represents an important contribution to the genetics of COPD, enriches our biological interpretation of GWAS findings, and brings us closer to clinical translation of genetic associations.

  9. iCanPlot: Visual Exploration of High-Throughput Omics Data Using Interactive Canvas Plotting

    PubMed Central

    Sinha, Amit U.; Armstrong, Scott A.

    2012-01-01

    Increasing use of high throughput genomic scale assays requires effective visualization and analysis techniques to facilitate data interpretation. Moreover, existing tools often require programming skills, which discourages bench scientists from examining their own data. We have created iCanPlot, a compelling platform for visual data exploration based on the latest technologies. Using the recently adopted HTML5 Canvas element, we have developed a highly interactive tool to visualize tabular data and identify interesting patterns in an intuitive fashion without the need of any specialized computing skills. A module for geneset overlap analysis has been implemented on the Google App Engine platform: when the user selects a region of interest in the plot, the genes in the region are analyzed on the fly. The visualization and analysis are amalgamated for a seamless experience. Further, users can easily upload their data for analysis—which also makes it simple to share the analysis with collaborators. We illustrate the power of iCanPlot by showing an example of how it can be used to interpret histone modifications in the context of gene expression. PMID:22393367

  10. Independent AMP and NAD signaling regulates C2C12 differentiation and metabolic adaptation.

    PubMed

    Hsu, Chia George; Burkholder, Thomas J

    2016-12-01

    The balance of ATP production and consumption is reflected in adenosine monophosphate (AMP) and nicotinamide adenine dinucleotide (NAD) content and has been associated with phenotypic plasticity in striated muscle. Some studies have suggested that AMPK-dependent plasticity may be an indirect consequence of increased NAD synthesis and SIRT1 activity. The primary goal of this study was to assess the interaction of AMP- and NAD-dependent signaling in adaptation of C2C12 myotubes. Changes in myotube developmental and metabolic gene expression were compared following incubation with 5-aminoimidazole-4-carboxamide ribonucleotide (AICAR) and nicotinamide mononucleotide (NMN) to activate AMPK- and NAD-related signaling. AICAR showed no effect on NAD pool or nampt expression but significantly reduced histone H3 acetylation and GLUT1, cytochrome C oxidase subunit 2 (COX2), and MYH3 expression. In contrast, NMN supplementation for 24 h increased NAD pool by 45 % but did not reduce histone H3 acetylation nor promote mitochondrial gene expression. The combination of AMP and NAD signaling did not induce further metabolic adaptation, but NMN ameliorated AICAR-induced myotube reduction. We interpret these results as indication that AMP and NAD contribute to C2C12 differentiation and metabolic adaptation independently.

  11. Evaluation of Parkinson Disease Risk Variants as Expression-QTLs

    PubMed Central

    Latourelle, Jeanne C.; Dumitriu, Alexandra; Hadzi, Tiffany C.; Beach, Thomas G.; Myers, Richard H.

    2012-01-01

    The recent Parkinson Disease GWAS Consortium meta-analysis and replication study reports association at several previously confirmed risk loci SNCA, MAPT, GAK/DGKQ, and HLA and identified a novel risk locus at RIT2. To further explore functional consequences of these associations, we investigated modification of gene expression in prefrontal cortex brain samples of pathologically confirmed PD cases (N = 26) and controls (N = 24) by 67 associated SNPs in these 5 loci. Association between the eSNPs and expression was evaluated using a 2-degrees of freedom test of both association and difference in association between cases and controls, adjusted for relevant covariates. SNPs at each of the 5 loci were tested for cis-acting effects on all probes within 250 kb of each locus. Trans-effects of the SNPs on the 39,122 probes passing all QC on the microarray were also examined. From the analysis of cis-acting SNP effects, several SNPs in the MAPT region show significant association to multiple nearby probes, including two strongly correlated probes targeting the gene LOC644246 and the duplicated genes LRRC37A and LRRC37A2, and a third uncorrelated probe targeting the gene DCAKD. Significant cis-associations were also observed between SNPs and two probes targeting genes in the HLA region on chromosome 6. Expanding the association study to examine trans effects revealed an additional 23 SNP-probe associations reaching statistical significance (p < 2.8×10⁻⁸) including SNPs from the SNCA, MAPT and RIT2 regions. These findings provide additional context for the interpretation of PD associated SNPs identified in recent GWAS as well as potential insight into the mechanisms underlying the observed SNP associations. PMID:23071545

  12. An expression database for roots of the model legume Medicago truncatula under salt stress

    PubMed Central

    2009-01-01

    Background Medicago truncatula is a model legume whose genome is currently being sequenced by an international consortium. Abiotic stresses such as salt stress limit plant growth and crop productivity, including those of legumes. We anticipate that studies on M. truncatula will shed light on other economically important legumes across the world. Here, we report the development of a database called MtED that contains gene expression profiles of the roots of M. truncatula based on time-course salt stress experiments using the Affymetrix Medicago GeneChip. Our hope is that MtED will provide information to assist in improving abiotic stress resistance in legumes. Description The results of our microarray experiment with roots of M. truncatula under 180 mM sodium chloride were deposited in the MtED database. Additionally, sequence and annotation information regarding microarray probe sets were included. MtED provides functional category analysis based on Gene and GeneBins Ontology, and other Web-based tools for querying and retrieving query results, browsing pathways and transcription factor families, showing metabolic maps, and comparing and visualizing expression profiles. Utilities like mapping probe sets to genome of M. truncatula and In-Silico PCR were implemented by BLAT software suite, which were also available through MtED database. Conclusion MtED was built in the PHP script language and as a MySQL relational database system on a Linux server. It has an integrated Web interface, which facilitates ready examination and interpretation of the results of microarray experiments. It is intended to help in selecting gene markers to improve abiotic stress resistance in legumes. MtED is available at http://bioinformatics.cau.edu.cn/MtED/. PMID:19906315

  13. An expression database for roots of the model legume Medicago truncatula under salt stress.

    PubMed

    Li, Daofeng; Su, Zhen; Dong, Jiangli; Wang, Tao

    2009-11-11

    Medicago truncatula is a model legume whose genome is currently being sequenced by an international consortium. Abiotic stresses such as salt stress limit plant growth and crop productivity, including those of legumes. We anticipate that studies on M. truncatula will shed light on other economically important legumes across the world. Here, we report the development of a database called MtED that contains gene expression profiles of the roots of M. truncatula based on time-course salt stress experiments using the Affymetrix Medicago GeneChip. Our hope is that MtED will provide information to assist in improving abiotic stress resistance in legumes. The results of our microarray experiment with roots of M. truncatula under 180 mM sodium chloride were deposited in the MtED database. Additionally, sequence and annotation information regarding microarray probe sets were included. MtED provides functional category analysis based on Gene and GeneBins Ontology, and other Web-based tools for querying and retrieving query results, browsing pathways and transcription factor families, showing metabolic maps, and comparing and visualizing expression profiles. Utilities like mapping probe sets to genome of M. truncatula and In-Silico PCR were implemented by BLAT software suite, which were also available through MtED database. MtED was built in the PHP script language and as a MySQL relational database system on a Linux server. It has an integrated Web interface, which facilitates ready examination and interpretation of the results of microarray experiments. It is intended to help in selecting gene markers to improve abiotic stress resistance in legumes. MtED is available at http://bioinformatics.cau.edu.cn/MtED/.

  14. Filaggrin-stratified transcriptomic analysis of pediatric skin identifies mechanistic pathways in patients with atopic dermatitis.

    PubMed

    Cole, Christian; Kroboth, Karin; Schurch, Nicholas J; Sandilands, Aileen; Sherstnev, Alexander; O'Regan, Grainne M; Watson, Rosemarie M; McLean, W H Irwin; Barton, Geoffrey J; Irvine, Alan D; Brown, Sara J

    2014-07-01

    Atopic dermatitis (AD; eczema) is characterized by a widespread abnormality in cutaneous barrier function and propensity to inflammation. Filaggrin is a multifunctional protein and plays a key role in skin barrier formation. Loss-of-function mutations in the gene encoding filaggrin (FLG) are a highly significant risk factor for atopic disease, but the molecular mechanisms leading to dermatitis remain unclear. We sought to interrogate tissue-specific variations in the expressed genome in the skin of children with AD and to investigate underlying pathomechanisms in atopic skin. We applied single-molecule direct RNA sequencing to analyze the whole transcriptome using minimal tissue samples. Uninvolved skin biopsy specimens from 26 pediatric patients with AD were compared with site-matched samples from 10 nonatopic teenage control subjects. Cases and control subjects were screened for FLG genotype to stratify the data set. Two thousand four hundred thirty differentially expressed genes (false discovery rate, P < .05) were identified, of which 211 were significantly upregulated and 490 downregulated by greater than 2-fold. Gene ontology terms for "extracellular space" and "defense response" were enriched, whereas "lipid metabolic processes" were downregulated. The subset of FLG wild-type cases showed dysregulation of genes involved with lipid metabolism, whereas filaggrin haploinsufficiency affected global gene expression and was characterized by a type 1 interferon-mediated stress response. These analyses demonstrate the importance of extracellular space and lipid metabolism in atopic skin pathology independent of FLG genotype, whereas an aberrant defense response is seen in subjects with FLG mutations. Genotype stratification of the large data set has facilitated functional interpretation and might guide future therapy development. Copyright © 2014 The Authors. Published by Mosby, Inc. All rights reserved.
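
    The gene-calling step summarized in this abstract (a false discovery rate threshold of .05 combined with a greater than 2-fold change) can be sketched as follows. The abstract does not name the FDR procedure, so the Benjamini-Hochberg adjustment used here is an assumption, and the gene names and numbers are invented toy values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def call_genes(genes, alpha=0.05, min_abs_log2fc=1.0):
    """genes: list of (name, p_value, log2_fold_change) tuples (toy input).
    A log2 fold change beyond +/-1 corresponds to a change greater than 2-fold."""
    adj = benjamini_hochberg([p for _, p, _ in genes])
    calls = {"up": [], "down": [], "significant_only": []}
    for (name, _, lfc), q in zip(genes, adj):
        if q >= alpha:
            continue  # not significant after FDR adjustment
        if lfc > min_abs_log2fc:
            calls["up"].append(name)
        elif lfc < -min_abs_log2fc:
            calls["down"].append(name)
        else:
            calls["significant_only"].append(name)
    return calls

toy = [("geneA", 0.001, 2.3), ("geneB", 0.004, -1.6),
       ("geneC", 0.020, 0.4), ("geneD", 0.600, 3.0)]
result = call_genes(toy)
```

    In this toy input geneC passes the FDR threshold without reaching a 2-fold change, which mirrors how the abstract reports 2430 differentially expressed genes but only 211 up-regulated and 490 down-regulated beyond 2-fold.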

  15. DNA microarray analyses reveal a post-irradiation differential time-dependent gene expression profile in yeast cells exposed to X-rays and gamma-rays.

    PubMed

    Kimura, Shinzo; Ishidou, Emi; Kurita, Sakiko; Suzuki, Yoshiteru; Shibato, Junko; Rakwal, Randeep; Iwahashi, Hitoshi

    2006-07-21

    Ionizing radiation (IR) is the most enigmatic of genotoxic stress inducers in our environment that has been around from the eons of time. IR is generally considered harmful, and has been the subject of numerous studies, mostly looking at the DNA damaging effects in cells and the repair mechanisms therein. Moreover, few studies have focused on large-scale identification of cellular responses to IR, and to this end, we describe here an initial study on the transcriptional responses of the unicellular genome model, yeast (Saccharomyces cerevisiae strain S288C), by cDNA microarray. The effect of two different IR, X-rays, and gamma (γ)-rays, was investigated by irradiating the yeast cells cultured in YPD medium with 50 Gy doses of X- and gamma-rays, followed by resuspension of the cells in YPD for time-course experiments. The samples were collected for microarray analysis at 20, 40, and 80 min after irradiation. Microarray analysis revealed a time-course transcriptional profile of changed gene expressions. Up-regulated genes belonged to the functional categories mainly related to cell cycle and DNA processing, cell rescue defense and virulence, protein and cell fate, and metabolism (X- and gamma-rays). Similarly, for X- and gamma-rays, the down-regulated genes belonged to mostly transcription and protein synthesis, cell cycle and DNA processing, control of cellular organization, cell fate, and C-compound and carbohydrate metabolism categories, respectively. This study provides for the first time a snapshot of the genome-wide mRNA expression profiles in X- and gamma-ray post-irradiated yeast cells and comparatively interprets/discusses the changed gene functional categories as effects of these two radiations vis-à-vis their energy levels.

  16. Petri net modelling of gene regulation of the Duchenne muscular dystrophy.

    PubMed

    Grunwald, Stefanie; Speer, Astrid; Ackermann, Jörg; Koch, Ina

    2008-05-01

    In the search for therapeutic strategies for Duchenne muscular dystrophy, it is of great interest to fully understand the molecular pathways downstream of dystrophin. For this reason we have performed real-time PCR experiments to compare mRNA expression levels of relevant genes in tissues of affected patients and controls. To put experimental data in the context of the underlying pathway, theoretical models are needed. Modelling of biological processes in the cell at higher description levels is still an open problem in the field of systems biology. In this paper, a new application of Petri net theory is presented to model gene regulatory processes of Duchenne muscular dystrophy. We have developed a Petri net model, which is based mainly on our own experimental data and on literature data. We distinguish between up- and down-regulated states of gene expression. The analysis of the model comprises the computation of structural and dynamic properties with focus on a thorough T-invariant analysis, including clustering techniques and the decomposition of the network into maximal common transition sets (MCT-sets), which can be interpreted as functionally related building blocks. All possible pathways, which reflect the complex net behaviour in dependence of different gene expression patterns, are discussed. We introduce Mauritius maps of T-invariants, which enable, for example, theoretical knockout analysis. The resulting model serves as a basis for a better understanding of pathological processes, and thereby for planning next experimental steps in searching for new therapeutic possibilities. The Petri net editor and animator Snoopy and the clustering tool PInA are freely available via http://www-dssz.informatik.tu-cottbus.de/~wwwdssz/. The Petri net models used can be accessed via http://www.tfh-berlin.de/bi/duchenne/.
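
    The T-invariant analysis mentioned in this abstract rests on a simple condition: a non-zero vector of firing counts x is a T-invariant when C·x = 0, where C is the incidence matrix of the net (places by transitions). The sketch below checks that condition on a hypothetical four-transition toy net; it is not the authors' Duchenne model, and the brute-force search is only practical for very small nets.

```python
from itertools import product

# Hypothetical toy net: rows are places, columns are transitions.
TRANSITIONS = ["produce_A", "convert_A_to_B", "remove_B", "remove_A"]
INCIDENCE = [
    [1, -1, 0, -1],  # place A: tokens gained or lost per firing
    [0, 1, -1, 0],   # place B
]

def is_t_invariant(incidence, counts):
    """True if firing each transition counts[j] times restores every place."""
    if not any(counts):
        return False  # the all-zero vector is trivial and excluded
    return all(sum(c * x for c, x in zip(row, counts)) == 0 for row in incidence)

def small_t_invariants(incidence, max_count=1):
    """Brute-force search over firing counts 0..max_count (toy sizes only)."""
    n = len(incidence[0])
    return [v for v in product(range(max_count + 1), repeat=n)
            if is_t_invariant(incidence, v)]

invariants = small_t_invariants(INCIDENCE)
# Translate each invariant into the list of transitions it uses.
pathways = [[t for t, x in zip(TRANSITIONS, v) if x] for v in invariants]
```

    Each invariant corresponds to a pathway that returns the net to its starting state. A theoretical knockout in this toy setting amounts to deleting one transition and seeing which invariants survive: without `remove_A`, only the pathway through `convert_A_to_B` and `remove_B` remains.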

  17. Identification of Glutathione S-Transferase (GST) Genes from a Dark Septate Endophytic Fungus (Exophiala pisciphila) and Their Expression Patterns under Varied Metals Stress

    PubMed Central

    Qiao, Qin; Liu, Lei; Wang, Jun-Ling; Cao, Guan-Hua; Li, Tao; Zhao, Zhi-Wei

    2015-01-01

    Glutathione S-transferases (GSTs) compose a family of multifunctional enzymes that play important roles in the detoxification of xenobiotics and the oxidative stress response. In the present study, twenty-four GST genes from the transcriptome of a metal-tolerant dark septate endophyte (DSE), Exophiala pisciphila, were identified based on sequence homology, and their responses to various heavy metal exposures were also analyzed. Phylogenetic analysis showed that the 24 GST genes from E. pisciphila (EpGSTs) were divided into eight distinct classes, including seven cytosolic classes and one mitochondrial metaxin 1-like class. Moreover, the variable expression patterns of these EpGSTs were observed under different heavy metal stresses at their effective concentrations for inhibiting growth by 50% (EC50). Lead (Pb) exposure caused the up-regulation of all EpGSTs, while cadmium (Cd), copper (Cu) and zinc (Zn) treatments led to the significant up-regulation of most of the EpGSTs (p < 0.05 to p < 0.001). Furthermore, although heavy metal-specific differences in performance were observed under various heavy metals in Escherichia coli BL21 (DE3) transformed with EpGSTN-31, the over-expression of this gene was able to enhance the heavy metal tolerance of the host cells. These results indicate that E. pisciphila harbors a diverse set of GST genes and that the up-regulated EpGSTs are closely related to the heavy metal tolerance of E. pisciphila. The study represents the first investigation of the GST family in E. pisciphila and provides a primary interpretation of heavy metal detoxification for E. pisciphila. PMID:25884726

  18. A possible role for flowering locus T-encoding genes in interpreting environmental and internal cues affecting olive (Olea europaea L.) flower induction.

    PubMed

    Haberman, Amnon; Bakhshian, Ortal; Cerezo-Medina, Sergio; Paltiel, Judith; Adler, Chen; Ben-Ari, Giora; Mercado, Jose Angel; Pliego-Alfaro, Fernando; Lavee, Shimon; Samach, Alon

    2017-08-01

    Olive (Olea europaea L.) inflorescences, formed in lateral buds, flower in spring. However, there is some debate regarding time of flower induction and inflorescence initiation. Olive juvenility and seasonality of flowering were altered by overexpressing genes encoding flowering locus T (FT). OeFT1 and OeFT2 caused early flowering under short days when expressed in Arabidopsis. Expression of OeFT1/2 in olive leaves and OeFT2 in buds increased in winter, while initiation of inflorescences occurred in late winter. Trees exposed to an artificial warm winter expressed low levels of OeFT1/2 in leaves and did not flower. Olive flower induction thus seems to be mediated by an increase in FT levels in response to cold winters. Olive flowering is dependent on additional internal factors. It was severely reduced in trees that carried a heavy fruit load the previous season (harvested in November) and in trees without fruit to which cold temperatures were artificially applied in summer. Expression analysis suggested that these internal factors work either by reducing the increase in OeFT1/2 expression or through putative flowering repressors such as TFL1. With expected warmer winters, future consumption of olive oil, as part of a healthy Mediterranean diet, should benefit from better understanding these factors. © 2017 John Wiley & Sons Ltd.

  19. Influence maximization in time bounded network identifies transcription factors regulating perturbed pathways

    PubMed Central

    Jo, Kyuri; Jung, Inuk; Moon, Ji Hwan; Kim, Sun

    2016-01-01

    Motivation: To understand the dynamic nature of the biological process, it is crucial to identify perturbed pathways in an altered environment and also to infer regulators that trigger the response. Current time-series analysis methods, however, are not powerful enough to identify perturbed pathways and regulators simultaneously. Widely used methods include methods to determine gene sets such as differentially expressed genes or gene clusters and these genes sets need to be further interpreted in terms of biological pathways using other tools. Most pathway analysis methods are not designed for time series data and they do not consider gene-gene influence on the time dimension. Results: In this article, we propose a novel time-series analysis method TimeTP for determining transcription factors (TFs) regulating pathway perturbation, which narrows the focus to perturbed sub-pathways and utilizes the gene regulatory network and protein–protein interaction network to locate TFs triggering the perturbation. TimeTP first identifies perturbed sub-pathways that propagate the expression changes along the time. Starting points of the perturbed sub-pathways are mapped into the network and the most influential TFs are determined by influence maximization technique. The analysis result is visually summarized in TF-Pathway map in time clock. TimeTP was applied to PIK3CA knock-in dataset and found significant sub-pathways and their regulators relevant to the PIP3 signaling pathway. Availability and Implementation: TimeTP is implemented in Python and available at http://biohealth.snu.ac.kr/software/TimeTP/. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: sunkim.bioinfo@snu.ac.kr PMID:27307609
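
    The idea of ranking transcription factors by how far their influence spreads within a bounded time can be sketched with a deliberately simplified stand-in: influence is taken to be the set of nodes reachable within a fixed number of hops, and seeds are chosen greedily by marginal coverage. This is not the TimeTP implementation; the graph, node names and hop limit are hypothetical.

```python
from collections import deque

def reachable_within(graph, start, max_hops):
    """Nodes reachable from start in at most max_hops directed steps,
    including start itself (breadth-first search)."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # the hop budget is used up on this branch
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen

def greedy_seeds(graph, candidates, k, max_hops):
    """Pick k candidates that together cover the most nodes, one at a time,
    always taking the candidate with the largest marginal gain."""
    covered, chosen = set(), []
    for _ in range(k):
        best = max((c for c in candidates if c not in chosen),
                   key=lambda c: len(reachable_within(graph, c, max_hops) - covered))
        chosen.append(best)
        covered |= reachable_within(graph, best, max_hops)
    return chosen, covered

# Toy regulatory network: edges point from regulator to target.
toy_graph = {
    "TF1": ["g1", "g2"], "TF2": ["g2"], "TF3": ["g5"],
    "g1": ["g3"], "g2": ["g4"], "g3": ["g6"],
}
seeds, covered = greedy_seeds(toy_graph, ["TF1", "TF2", "TF3"], k=2, max_hops=2)
```

    The greedy choice prefers TF3 over TF2 as the second seed because everything TF2 reaches is already covered by TF1; the hop limit keeps `g6`, three steps from TF1, out of the covered set.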

  20. Gene set analysis approaches for RNA-seq data: performance evaluation and application guideline

    PubMed Central

    Rahmatallah, Yasir; Emmert-Streib, Frank

    2016-01-01

    Transcriptome sequencing (RNA-seq) is gradually replacing microarrays for high-throughput studies of gene expression. The main challenge of analyzing microarray data is not in finding differentially expressed genes, but in gaining insights into the biological processes underlying phenotypic differences. To interpret experimental results from microarrays, gene set analysis (GSA) has become the method of choice, in particular because it incorporates pre-existing biological knowledge (in the form of functionally related gene sets) into the analysis. Here we provide a brief review of several statistically different GSA approaches (competitive and self-contained) that can be adapted from microarray practice as well as those specifically designed for RNA-seq. We evaluate their performance (in terms of Type I error rate, power, robustness to the sample size and heterogeneity, as well as the sensitivity to different types of selection biases) on simulated and real RNA-seq data. Not surprisingly, the performance of various GSA approaches depends only on the statistical hypothesis they test and does not depend on whether the test was developed for microarrays or RNA-seq data. Interestingly, we found that competitive methods have lower power as well as lower robustness to sample heterogeneity than self-contained methods, leading to poor reproducibility of results. We also found that the power of unsupervised competitive methods depends on the balance between up- and down-regulated genes in tested gene sets. These properties of competitive methods have been overlooked before. Our evaluation provides a concise guideline for selecting GSA approaches, best performing under particular experimental settings in the context of RNA-seq. PMID:26342128
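
    A self-contained test asks whether the genes in a set are associated with the phenotype, without reference to genes outside the set, whereas a competitive test compares the set against the remaining genes. The sketch below illustrates the self-contained idea with an exact sample-label permutation test. It is not one of the methods evaluated in the paper: the statistic (sum of absolute mean differences), the function names and the expression values are all assumptions chosen for illustration.

```python
from itertools import combinations


def set_statistic(expr, group_a, group_b):
    """Sum over genes of |mean(group_a) - mean(group_b)| for one gene set."""
    total = 0.0
    for values in expr.values():
        mean_a = sum(values[i] for i in group_a) / len(group_a)
        mean_b = sum(values[i] for i in group_b) / len(group_b)
        total += abs(mean_a - mean_b)
    return total


def self_contained_pvalue(expr, group_a, group_b):
    """Exact permutation p-value: relabel the samples in every possible way."""
    samples = sorted(group_a + group_b)
    observed = set_statistic(expr, group_a, group_b)
    hits = total = 0
    for perm_a in combinations(samples, len(group_a)):
        perm_b = [s for s in samples if s not in perm_a]
        total += 1
        # small tolerance guards against floating-point ties
        if set_statistic(expr, list(perm_a), perm_b) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical normalised expression for a three-gene set; samples 0-2 are
# controls and samples 3-5 are treated.
gene_set_expr = {
    "geneA": [1.0, 1.2, 0.9, 3.1, 2.9, 3.3],
    "geneB": [2.0, 2.1, 1.9, 4.0, 4.2, 3.9],
    "geneC": [0.5, 0.4, 0.6, 0.5, 0.6, 0.4],
}
p_value = self_contained_pvalue(gene_set_expr, [0, 1, 2], [3, 4, 5])
```

    With three samples per group there are only 20 relabelings, and the symmetric statistic scores the true split and its mirror image identically, so 0.1 is the smallest p-value this toy design can produce. That limit is one concrete way to see why sample size matters for such tests.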

  1. Plant Reactome: a resource for plant pathways and comparative analysis

    PubMed Central

    Naithani, Sushma; Preece, Justin; D'Eustachio, Peter; Gupta, Parul; Amarasinghe, Vindhya; Dharmawardhana, Palitha D.; Wu, Guanming; Fabregat, Antonio; Elser, Justin L.; Weiser, Joel; Keays, Maria; Fuentes, Alfonso Munoz-Pomer; Petryszak, Robert; Stein, Lincoln D.; Ware, Doreen; Jaiswal, Pankaj

    2017-01-01

    Plant Reactome (http://plantreactome.gramene.org/) is a free, open-source, curated plant pathway database portal, provided as part of the Gramene project. The database provides intuitive bioinformatics tools for the visualization, analysis and interpretation of pathway knowledge to support genome annotation, genome analysis, modeling, systems biology, basic research and education. Plant Reactome employs the structural framework of a plant cell to show metabolic, transport, genetic, developmental and signaling pathways. We manually curate molecular details of pathways in these domains for reference species Oryza sativa (rice) supported by published literature and annotation of well-characterized genes. Two hundred twenty-two rice pathways, 1025 reactions associated with 1173 proteins, 907 small molecules and 256 literature references have been curated to date. These reference annotations were used to project pathways for 62 model, crop and evolutionarily significant plant species based on gene homology. Database users can search and browse various components of the database, visualize curated baseline expression of pathway-associated genes provided by the Expression Atlas and upload and analyze their Omics datasets. The database also offers data access via Application Programming Interfaces (APIs) and in various standardized pathway formats, such as SBML and BioPAX. PMID:27799469

  2. Asynchronous oscillations of two zebrafish CLOCK partners reveal differential clock control and function

    PubMed Central

    Cermakian, Nicolas; Whitmore, David; Foulkes, Nicholas S.; Sassone-Corsi, Paolo

    2000-01-01

    Most clock genes encode transcription factors that interact to elicit cooperative control of clock function. Using a two-hybrid system approach, we have isolated two different partners of zebrafish (zf) CLOCK, which are similar to the mammalian BMAL1 (brain and muscle arylhydrocarbon receptor nuclear translocator-like protein 1). The two homologs, zfBMAL1 and zfBMAL2, contain conserved basic helix–loop–helix-PAS (Period-Arylhydrocarbon receptor-Singleminded) domains but diverge in the carboxyl termini, thus bearing different transcriptional activation potential. As for zfClock, the expression of both zfBmals oscillates in most tissues in the animal. However, in many tissues, the peak, levels, and kinetics of expression are different between the two genes and for the same gene from tissue to tissue. These results support the existence of independent peripheral oscillators and suggest that zfBMAL1 and zfBMAL2 may exert distinct circadian functions, interacting differentially with zfCLOCK at various times in different tissues. Our findings also indicate that multiple controls may be exerted by the central clock and/or that peripheral oscillators can differentially interpret central clock signals. PMID:10760301

  3. An overview of technical considerations when using quantitative real-time PCR analysis of gene expression in human exercise research

    PubMed Central

    Yan, Xu; Bishop, David J.

    2018-01-01

    Gene expression analysis by quantitative PCR in skeletal muscle is routine in exercise studies. The reproducibility and reliability of the data fundamentally depend on how the experiments are performed and interpreted. Despite the popularity of the assay, there is considerable variation in experimental protocols and data analyses from different laboratories, and there is a lack of consistency of proper quality control steps throughout the assay. In this study, we present a number of experiments on various steps of quantitative PCR workflow, and demonstrate how to perform a quantitative PCR experiment with human skeletal muscle samples in an exercise study. We also tested some common mistakes in performing qPCR. Interestingly, we found that mishandling of muscle for a short time span (10 min) before RNA extraction did not affect RNA quality, and isolated total RNA was preserved for up to one week at room temperature. As demonstrated by our data, use of unstable reference genes leads to substantial differences in the final results. Alternatively, cDNA content can be used for data normalisation; however, complete removal of RNA from cDNA samples is essential for obtaining accurate cDNA content. PMID:29746477
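
    The sensitivity to reference-gene stability can be seen in the widely used 2^-ΔΔCt calculation, sketched below. The abstract does not state which quantification method the authors used, so this is an assumption, as are the function name and the Ct values; the formula also assumes roughly 100% amplification efficiency for both assays.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    Assumes roughly 100% amplification efficiency for both assays.
    """
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values for one target gene, control vs. post-exercise.
# Case 1: the reference gene is stable (Ct 18.0 in both conditions).
stable = relative_expression(22.0, 18.0, 24.0, 18.0)
# Case 2: the reference gene itself shifts by one cycle after exercise.
unstable = relative_expression(22.0, 17.0, 24.0, 18.0)
```

    In this made-up example a one-cycle drift in the reference gene halves the apparent fold change of the target (4.0 with the stable reference, 2.0 with the drifting one), even though the target Ct values are identical.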

  4. The application of artificial intelligence to microarray data: identification of a novel gene signature to identify bladder cancer progression.

    PubMed

    Catto, James W F; Abbod, Maysam F; Wild, Peter J; Linkens, Derek A; Pilarsky, Christian; Rehman, Ishtiaq; Rosario, Derek J; Denzinger, Stefan; Burger, Maximilian; Stoehr, Robert; Knuechel, Ruth; Hartmann, Arndt; Hamdy, Freddie C

    2010-03-01

    New methods for identifying bladder cancer (BCa) progression are required. Gene expression microarrays can reveal insights into disease biology and identify novel biomarkers. However, these experiments produce large datasets that are difficult to interpret. To develop a novel method of microarray analysis combining two forms of artificial intelligence (AI): neurofuzzy modelling (NFM) and artificial neural networks (ANN) and validate it in a BCa cohort. We used AI and statistical analyses to identify progression-related genes in a microarray dataset (n=66 tumours, n=2800 genes). The AI-selected genes were then investigated in a second cohort (n=262 tumours) using immunohistochemistry. We compared the accuracy of AI and statistical approaches to identify tumour progression. AI identified 11 progression-associated genes (odds ratio [OR]: 0.70; 95% confidence interval [CI], 0.56-0.87; p=0.0004), and these were more discriminate than genes chosen using statistical analyses (OR: 1.24; 95% CI, 0.96-1.60; p=0.09). The expression of six AI-selected genes (LIG3, FAS, KRT18, ICAM1, DSG2, and BRCA2) was determined using commercial antibodies and successfully identified tumour progression (concordance index: 0.66; log-rank test: p=0.01). AI-selected genes were more discriminate than pathologic criteria at determining progression (Cox multivariate analysis: p=0.01). Limitations include the use of statistical correlation to identify 200 genes for AI analysis and that we did not compare regression identified genes with immunohistochemistry. AI and statistical analyses use different techniques of inference to determine gene-phenotype associations and identify distinct prognostic gene signatures that are equally valid. We have identified a prognostic gene signature whose members reflect a variety of carcinogenic pathways that could identify progression in non-muscle-invasive BCa. 2009 European Association of Urology. Published by Elsevier B.V. All rights reserved.

  5. A safer, urea-based in situ hybridization method improves detection of gene expression in diverse animal species.

    PubMed

    Sinigaglia, Chiara; Thiel, Daniel; Hejnol, Andreas; Houliston, Evelyn; Leclère, Lucas

    2018-02-01

    In situ hybridization is a widely employed technique allowing spatial visualization of gene expression in fixed specimens. It has greatly advanced our understanding of biological processes, including developmental regulation. In situ protocols are today routinely followed in numerous laboratories, and although details might change, they all include a hybridization step, where specific antisense RNA or DNA probes anneal to the target nucleic acid sequence. This step is generally carried out at high temperatures and in a denaturing solution, called hybridization buffer, commonly containing 50% (v/v) formamide - a hazardous chemical. When applied to the soft-bodied hydrozoan medusa Clytia hemisphaerica, we found that this traditional hybridization approach was not fully satisfactory, causing extensive deterioration of morphology and tissue texture which compromised our observation and interpretation of results. We thus tested alternative solutions for in situ detection of gene expression and, inspired by optimized protocols for Northern and Southern blot analysis, we substituted the 50% formamide with an equal volume of 8M urea solution in the hybridization buffer. Our new protocol not only yielded better morphologies and tissue consistency, but also notably improved the resolution of the signal, allowing more precise localization of gene expression and reducing aspecific staining associated with problematic areas. Given the improved results and reduced manipulation risks, we tested the urea protocol on other metazoans, two brachiopod species (Novocrania anomala and Terebratalia transversa) and the priapulid worm Priapulus caudatus, obtaining a similar reduction of aspecific probe binding. Overall, substitution of formamide by urea during in situ hybridization offers a safer alternative, potentially of widespread use in research, medical and teaching contexts. We encourage other workers to test this approach on their study organisms, and hope that they will also obtain better sample preservation, more precise expression patterns and fewer problems due to aspecific staining, as we report here for Clytia medusae and Novocrania and Terebratalia developing larvae. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. Metabolic syndrome alters expression of insulin signaling-related genes in swine mesenchymal stem cells.

    PubMed

    Conley, Sabena M; Zhu, Xiang-Yang; Eirin, Alfonso; Tang, Hui; Lerman, Amir; van Wijnen, Andre J; Lerman, Lilach O

    2018-02-20

    Metabolic syndrome (MetS) is associated with insulin resistance (IR) and impaired glucose metabolism in muscle, fat, and other cells, and may induce inflammation and vascular remodeling. Endogenous reparative systems, including adipose tissue-derived mesenchymal stem/stromal cells (MSC), are responsible for repair of damaged tissue. MSC have also been proposed as an exogenous therapeutic intervention in patients with cardiovascular and chronic kidney disease (CKD). The feasibility of using autologous cells depends on their integrity, but whether in MetS IR involves adipose tissue-derived MSC remains unknown. The aim of this study was to examine the expression of mRNA involved in insulin signaling in MSC from subjects with MetS. Domestic pigs consumed a lean or obese diet (n=6 each) for 16 weeks. MSC were collected from subcutaneous abdominal fat and analyzed using high-throughput RNA-sequencing for expression of genes involved in insulin signaling. Expression profiles for enriched (fold change > 1.4, p < 0.05) and suppressed (fold change < 0.7, p < 0.05) mRNAs in MetS pigs were functionally interpreted by gene ontology analysis. The most prominently upregulated and downregulated mRNAs were further probed. We identified in MetS-MSC 168 up-regulated and 51 down-regulated mRNAs related to insulin signaling. Enriched mRNAs were implicated in biological pathways including hepatic glucose metabolism, adipocyte differentiation, and transcription regulation, and down-regulated mRNAs in intracellular calcium signaling and cleaving peptides. Functional analysis suggested that overall these alterations could increase IR. MetS alters mRNA expression related to insulin signaling in adipose tissue-derived MSC. These observations mandate caution during administration of autologous MSC in subjects with MetS. Copyright © 2017. Published by Elsevier B.V.
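
    The selection rule quoted in the abstract (fold change above 1.4 or below 0.7, with p below 0.05) can be written as a small filter. This is a minimal sketch: only the three cutoffs come from the abstract, while the function name, the label strings and the example values are hypothetical.

```python
def classify_mrna(fold_change, p_value,
                  up_cutoff=1.4, down_cutoff=0.7, alpha=0.05):
    """Label an mRNA using the thresholds quoted in the abstract."""
    if p_value >= alpha:
        return "unchanged"  # not significant, whatever the fold change
    if fold_change > up_cutoff:
        return "enriched"
    if fold_change < down_cutoff:
        return "suppressed"
    return "unchanged"  # significant but within the fold-change band


# Hypothetical (fold change, p-value) pairs for illustration only.
examples = {
    "mrna_1": (2.1, 0.01),
    "mrna_2": (0.5, 0.03),
    "mrna_3": (1.8, 0.20),
    "mrna_4": (1.1, 0.001),
}
labels = {name: classify_mrna(fc, p) for name, (fc, p) in examples.items()}
```

    Note that the band is not symmetric on a log scale (1.4 is about +0.49 log2 units, 0.7 about -0.51), so the two cutoffs are close to, but not exactly, mirror images of each other.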

  7. Power enhancement via multivariate outlier testing with gene expression arrays.

    PubMed

    Asare, Adam L; Gao, Zhong; Carey, Vincent J; Wang, Richard; Seyfert-Margolis, Vicki

    2009-01-01

    As the use of microarrays in human studies continues to increase, stringent quality assurance is necessary to ensure accurate experimental interpretation. We present a formal approach for microarray quality assessment that is based on dimension reduction of established measures of signal and noise components of expression followed by parametric multivariate outlier testing. We applied our approach to several data resources. First, as a negative control, we found that the Affymetrix and Illumina contributions to MAQC data were free from outliers at a nominal outlier flagging rate of alpha=0.01. Second, we created a tunable framework for artificially corrupting intensity data from the Affymetrix Latin Square spike-in experiment to allow investigation of sensitivity and specificity of quality assurance (QA) criteria. Third, we applied the procedure to 507 Affymetrix microarray GeneChips processed with RNA from human peripheral blood samples. We show that exclusion of arrays by this approach substantially increases inferential power, or the ability to detect differential expression, in large clinical studies. http://bioconductor.org/packages/2.3/bioc/html/arrayMvout.html and http://bioconductor.org/packages/2.3/bioc/html/affyContam.html affyContam (credentials: readonly/readonly)

  8. Linear Regression Links Transcriptomic Data and Cellular Raman Spectra.

    PubMed

    Kobayashi-Kirschvink, Koseki J; Nakaoka, Hidenori; Oda, Arisa; Kamei, Ken-Ichiro F; Nosho, Kazuki; Fukushima, Hiroko; Kanesaki, Yu; Yajima, Shunsuke; Masaki, Haruhiko; Ohta, Kunihiro; Wakamoto, Yuichi

    2018-06-08

    Raman microscopy is an imaging technique that has been applied to assess molecular compositions of living cells to characterize cell types and states. However, owing to the diverse molecular species in cells and challenges of assigning peaks to specific molecules, it has not been clear how to interpret cellular Raman spectra. Here, we provide firm evidence that cellular Raman spectra and transcriptomic profiles of Schizosaccharomyces pombe and Escherichia coli can be computationally connected and thus interpreted. We find that the dimensions of high-dimensional Raman spectra and transcriptomes measured by RNA sequencing can be reduced and connected linearly through a shared low-dimensional subspace. Accordingly, we were able to predict global gene expression profiles by applying the calculated transformation matrix to Raman spectra, and vice versa. Highly expressed non-coding RNAs contributed to the Raman-transcriptome linear correspondence more significantly than mRNAs in S. pombe. This demonstration of correspondence between cellular Raman spectra and transcriptomes is a promising step toward establishing spectroscopic live-cell omics studies. Copyright © 2018 Elsevier Inc. All rights reserved.

  9. Identification of functional enolase genes of the silkworm Bombyx mori from public databases with a combination of dry and wet bench processes.

    PubMed

    Kikuchi, Akira; Nakazato, Takeru; Ito, Katsuhiko; Nojima, Yosui; Yokoyama, Takeshi; Iwabuchi, Kikuo; Bono, Hidemasa; Toyoda, Atsushi; Fujiyama, Asao; Sato, Ryoichi; Tabunoki, Hiroko

    2017-01-13

    Various insect species have been added to genomic databases over the years. Thus, researchers can easily obtain online genomic information on invertebrates and insects. However, many incorrectly annotated genes are included in these databases, which can prevent the correct interpretation of subsequent functional analyses. To address this problem, we used a combination of dry and wet bench processes to select functional genes from public databases. Enolase is an important glycolytic enzyme in all organisms. We used a combination of dry and wet bench processes to identify functional enolases in the silkworm Bombyx mori (BmEno). First, we detected five annotated enolases from public databases using a Hidden Markov Model (HMM) search, and then through cDNA cloning, Northern blotting, and RNA-seq analysis, we revealed three functional enolases in B. mori: BmEno1, BmEno2, and BmEnoC. BmEno1 contained a conserved key amino acid residue for metal binding and substrate binding in other species. However, BmEno2 and BmEnoC showed a change in this key amino acid. Phylogenetic analysis showed that BmEno2 and BmEnoC were distinct from BmEno1 and other enolases, and were distributed only in lepidopteran clusters. BmEno1 was expressed in all of the tissues used in our study. In contrast, BmEno2 was mainly expressed in the testis with some expression in the ovary and suboesophageal ganglion. BmEnoC was weakly expressed in the testis. Quantitative RT-PCR showed that the mRNA expression of BmEno2 and BmEnoC correlated with testis development; thus, BmEno2 and BmEnoC may be related to lepidopteran-specific spermiogenesis. We identified and characterized three functional enolases from public databases with a combination of dry and wet bench processes in the silkworm B. mori. In addition, we determined that BmEno2 and BmEnoC had species-specific functions. Our strategy could be helpful for the detection of minor genes and functional genes in non-model organisms from public databases.

  10. In vitro investigation of the effect of matrix molecules on the behavior of colon cancer cells under the effect of geldanamycin derivative.

    PubMed

    Vural, Kamil; Kosova, Funda; Kurt, Feyzan Özdal; Tuğlu, İbrahim

    2017-10-01

    The chaperone-binding drug, 17-allylamino-17-demethoxygeldanamycin, has recently come into clinical use. It is a derivative of geldanamycin, an ansamycin benzoquinone antibiotic with anti-carcinogenic effect. Understanding the effect of this drug on the cancer cells and their niche is important for treatment. We applied 17-allylamino-17-demethoxygeldanamycin to colon cancer cell line (Colo 205) on matrix molecules to investigate the relationship of apoptosis with terminal deoxynucleotidyl transferase dUTP nick end labeling immunocytochemistry and related gene expression. We used laminin and collagen I for matrix molecules and vascular endothelial growth factor for angiogenic structure. We also examined apoptosis-related signaling pathway including mitochondrial proteins, cytochrome c, Bcl-2, caspase-9, Apaf-1 expression using real-time polymerase chain reaction. There was a clear effect of 17-allylamino-17-demethoxygeldanamycin that killed more cells on tissue culture plastic compared to matrix molecules. The IC50 value was 0.58 µg/mL for tissue culture plastic compared with 0.64 µg/mL for laminin and 0.75 µg/mL for collagen I. The analyses showed that more cells on matrix molecules underwent apoptosis compared to that on tissue culture plastic. Apoptosis-related gene expression was similar in which Bcl-2 expression decreased and proapoptotic gene expression of the cells on matrix molecules increased compared to that on tissue culture plastic. However, the application of 17-allylamino-17-demethoxygeldanamycin was more effective for the cells on collagen I compared to the cells on laminin. There was also a decrease in angiogenesis as shown by the vascular endothelial growth factor staining. This was more pronounced by coating of the tissue culture plastic with matrix molecules. Our results supported the anti-cancer effect of 17-allylamino-17-demethoxygeldanamycin, and this effect depended on matrix molecules. This effect occurs through apoptosis, and related genes were also altered. All these genes may serve as novel targets under the effect of matrix substrate. However, correct interpretation of the results requires further studies.

  11. An Integrated Encyclopedia of DNA Elements in the Human Genome

    PubMed Central

    2012-01-01

    Summary The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure, and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall the project provides new insights into the organization and regulation of our genes and genome, and an expansive resource of functional annotations for biomedical research. PMID:22955616

  12. The Transcriptome of the Zebrafish Embryo After Chemical Exposure: A Meta-Analysis.

    PubMed

    Schüttler, Andreas; Reiche, Kristin; Altenburger, Rolf; Busch, Wibke

    2017-06-01

    Numerous studies have been published in the past years investigating the transcriptome of the zebrafish embryo (ZFE) upon being subjected to chemical stress. Aiming at a more mechanistic understanding of the results of such studies, knowledge about commonalities of transcript regulation in response to chemical stress is needed. Thus, our goal in this study was to identify and interpret genes and gene sets constituting a general response to chemical exposure. Therefore, we aggregated and reanalyzed published toxicogenomics data obtained with the ZFE. We found that overlap of differentially transcribed genes in response to chemical stress across independent studies is generally low and the most commonly differentially transcribed genes appear in less than 50% of all treatments across studies. However, effect size analysis revealed several genes showing a common trend of differential expression, among which genes related to calcium homeostasis emerged as key, especially in exposure settings up to 24 h post-fertilization. Additionally, we found that these and other downregulated genes are often linked to anatomical regions developing during the respective exposure period. Genes showing a trend of increased expression were, among others, linked to signaling pathways (e.g., Wnt, Fgf) as well as lysosomal structures and apoptosis. The findings of this study help to increase the understanding of chemical stress responses in the developing zebrafish embryo and provide a starting point to improve experimental designs for this model system. In future, improved time- and concentration-resolved experiments should offer better understanding of stress response patterns and access to mechanistic information. © The Author 2017. Published by Oxford University Press on behalf of the Society of Toxicology.

  13. Interpretable dimensionality reduction of single cell transcriptome data with deep generative models.

    PubMed

    Ding, Jiarui; Condon, Anne; Shah, Sohrab P

    2018-05-21

    Single-cell RNA-sequencing has great potential to discover cell types, identify cell states, trace development lineages, and reconstruct the spatial organization of cells. However, dimension reduction to interpret structure in single-cell sequencing data remains a challenge. Existing algorithms are either not able to uncover the clustering structures in the data or lose global information such as groups of clusters that are close to each other. We present a robust statistical model, scvis, to capture and visualize the low-dimensional structures in single-cell gene expression data. Simulation results demonstrate that low-dimensional representations learned by scvis preserve both the local and global neighbor structures in the data. In addition, scvis is robust to the number of data points and learns a probabilistic parametric mapping function to add new data points to an existing embedding. We then use scvis to analyze four single-cell RNA-sequencing datasets, exemplifying interpretable two-dimensional representations of the high-dimensional single-cell RNA-sequencing data.

  14. Analysis of the pattern of expression of the Fanconi anemia group C (Facc) gene during murine development

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Krasnoshtein, F.; Buchwald, M.

    1994-09-01

    Fanconi anemia (FA) is an autosomal recessive disorder characterized by a variety of congenital and skeletal malformations, progressive pancytopenia and predisposition to malignancies. FA cells display chromosomal instability and hypersensitivity to DNA-damaging agents. Both the human and the corresponding murine cDNAs have been cloned in our lab. Here we describe the expression of Facc during mouse development, using mRNA in situ hybridization. Our aim is to obtain clues on the possible function of the Facc gene product during development that may help elucidate basic defect(s) in FA. In addition, knowledge of the exact pattern of Facc expression will assist in interpreting the phenotypes of mutant mice, currently being developed. In embryos the gene is diffusely expressed over the entire embryo, with higher hybridization levels in the mesenchyme and in both upper and lower extremities. Specific expression of Facc is seen in the perichondrium and marrow of long bones of hind limbs/hip; long bones of front limbs/shoulder region; developing digits of front and hind paws; and ribs. The signal is also detected in the following regions: cranial/frontal; facial/periorbital and maxillary/mandibular, hair follicles, diaphragm and lung. In addition, generalized Facc expression is seen during these embryonic stages. The pattern of Facc expression is consistent with the known skeletal abnormalities in FA patients, which include radial ray deformities, metacarpal hypoplasia, and abnormalities of lower limbs, ribs, head and face. The signal in the lung is consistent with the lung lobe absence and abnormal pulmonary drainage that have been detected in some FA patients. The sloped forehead and microcephaly in FA patients may have some association with the signal seen in the frontal region of the mouse cranium. Taken together, our results suggest that Facc is directly involved in the development of various embryonic tissues, particularly bone.

  15. Predicted Arabidopsis Interactome Resource and Gene Set Linkage Analysis: A Transcriptomic Analysis Resource.

    PubMed

    Yao, Heng; Wang, Xiaoxuan; Chen, Pengcheng; Hai, Ling; Jin, Kang; Yao, Lixia; Mao, Chuanzao; Chen, Xin

    2018-05-01

    An advanced functional understanding of omics data is important for elucidating the design logic of physiological processes in plants and effectively controlling desired traits in plants. We present the latest versions of the Predicted Arabidopsis Interactome Resource (PAIR) and of the gene set linkage analysis (GSLA) tool, which enable the interpretation of an observed transcriptomic change (differentially expressed genes [DEGs]) in Arabidopsis (Arabidopsis thaliana) with respect to its functional impact for biological processes. PAIR version 5.0 integrates functional association data between genes in multiple forms and infers 335,301 putative functional interactions. GSLA relies on this high-confidence inferred functional association network to expand our perception of the functional impacts of an observed transcriptomic change. GSLA then interprets the biological significance of the observed DEGs using established biological concepts (annotation terms), describing not only the DEGs themselves but also their potential functional impacts. This unique analytical capability can help researchers gain deeper insights into their experimental results and highlight prospective directions for further investigation. We demonstrate the utility of GSLA with two case studies in which GSLA uncovered how molecular events may have caused physiological changes through their collective functional influence on biological processes. Furthermore, we showed that typical annotation-enrichment tools were unable to produce similar insights to PAIR/GSLA. The PAIR version 5.0-inferred interactome and GSLA Web tool both can be accessed at http://public.synergylab.cn/pair/. © 2018 American Society of Plant Biologists. All Rights Reserved.

  16. A novel FY*A allele with the 265T and 298A SNPs formerly associated exclusively with the FY*B allele and weak Fy(b) antigen expression: implication for genotyping interpretative algorithms.

    PubMed

    Lopez, G H; Condon, J A; Wilson, B; Martin, J R; Liew, Y-W; Flower, R L; Hyland, C A

    2015-01-01

    An Australian Caucasian blood donor consistently presented a serology profile for the Duffy blood group as Fy(a+b+) with Fy(a) antigen expression weaker than other examples of Fy(a+b+) red cells. Molecular typing studies were performed to investigate the reason for the observed serology profile. Blood group genotyping was performed using a commercial SNP microarray platform. Sanger sequencing was performed using primer sets to amplify across exons 1 and 2 of the FY gene and using allele-specific primers. The propositus was genotyped as FY*A/B, FY*X heterozygote that predicted the Fy(a+b+(w)) phenotype. Sequencing identified the 265T and 298A variants on the FY*A allele. This link between FY*A allele and 265T was confirmed by allele-specific PCR. The reduced Fy(a) antigen reactivity is attributed to an FY*A allele carrying the 265T and 298A variants, previously defined in combination only with the FY*B allele and associated with weak Fy(b) antigen expression. This novel allele should be considered in genotyping interpretative algorithms for generating a predicted phenotype. © 2014 International Society of Blood Transfusion.

  17. Recurrent 15q11.2 BP1-BP2 microdeletions and microduplications in the etiology of neurodevelopmental disorders.

    PubMed

    Picinelli, Chiara; Lintas, Carla; Piras, Ignazio Stefano; Gabriele, Stefano; Sacco, Roberto; Brogna, Claudia; Persico, Antonio Maria

    2016-12-01

    Rare and common CNVs can contribute to the etiology of neurodevelopmental disorders. One of the recurrent genomic aberrations associated with these phenotypes and proposed as a susceptibility locus is the 15q11.2 BP1-BP2 CNV encompassing TUBGCP5, CYFIP1, NIPA2, and NIPA1. Characterizing by array-CGH a cohort of 243 families with various neurodevelopmental disorders, we identified five patients carrying the 15q11.2 duplication and one carrying the deletion. All CNVs were confirmed by qPCR and were inherited, except for one duplication where parents were not available. The phenotypic spectrum of CNV carriers was broad but mainly neurodevelopmental, in line with all four genes being implicated in axonal growth and neural connectivity. Phenotypically normal and mildly affected carriers complicate the interpretation of this aberration. This variability may be due to reduced penetrance or altered gene dosage on a particular genetic background. We evaluated the expression levels of the four genes in peripheral blood RNA and found the expected reduction in the deleted case, while duplicated carriers displayed high interindividual variability. These data suggest that differential expression of these genes could partially account for differences in clinical phenotypes, especially among duplication carriers. Furthermore, urinary Mg2+ levels appear negatively correlated with NIPA2 gene copy number, suggesting they could potentially represent a useful biomarker, whose reliability will need replication in larger samples. © 2016 Wiley Periodicals, Inc.

  18. Rational confederation of genes and diseases: NGS interpretation via GeneCards, MalaCards and VarElect.

    PubMed

    Rappaport, Noa; Fishilevich, Simon; Nudel, Ron; Twik, Michal; Belinky, Frida; Plaschkes, Inbar; Stein, Tsippi Iny; Cohen, Dana; Oz-Levi, Danit; Safran, Marilyn; Lancet, Doron

    2017-08-18

    A key challenge in the realm of human disease research is next generation sequencing (NGS) interpretation, whereby identified filtered variant-harboring genes are associated with a patient's disease phenotypes. This necessitates bioinformatics tools linked to comprehensive knowledgebases. The GeneCards suite databases, which include GeneCards (human genes), MalaCards (human diseases) and PathCards (human pathways) together with additional tools, are presented with the focus on MalaCards utility for NGS interpretation as well as for large scale bioinformatic analyses. VarElect, our NGS interpretation tool, leverages the broad information in the GeneCards suite databases. MalaCards algorithms unify disease-related terms and annotations from 69 sources. Further, MalaCards defines hierarchical relatedness: aliases, disease families, a related diseases network, categories and ontological classifications. GeneCards and MalaCards delineate and share a multi-tiered, scored gene-disease network, with stringency levels, including the definition of elite status: high-quality gene-disease pairs, coming from manually curated trustworthy sources, that include 4500 genes for 8000 diseases. This unique resource is key to NGS interpretation by VarElect. VarElect, a comprehensive search tool that helps infer both direct and indirect links between genes and user-supplied disease/phenotype terms, is robustly strengthened by the information found in MalaCards. The indirect mode benefits from GeneCards' diverse gene-to-gene relationships, including SuperPaths, integrated biological pathways from 12 information sources. We are currently adding an important information layer in the form of "disease SuperPaths", generated from the gene-disease matrix by an algorithm similar to that previously employed for biological pathway unification. This allows the discovery of novel gene-disease and disease-disease relationships. 
The advent of whole genome sequencing necessitates capacities to go beyond protein coding genes. GeneCards is highly useful in this respect, as it also addresses 101,976 non-protein-coding RNA genes. In a more recent development, we are currently adding an inclusive map of regulatory elements and their inferred target genes, generated by integration from 4 resources. MalaCards provides a rich big-data scaffold for in silico biomedical discovery within the gene-disease universe. VarElect, which depends significantly on both GeneCards and MalaCards power, is a potent tool for supporting the interpretation of wet-lab experiments, notably NGS analyses of disease. The GeneCards suite has thus transcended its 2-decade role in biomedical research, maturing into a key player in clinical investigation.

  19. New insights into the Saccharomyces cerevisiae fermentation switch: Dynamic transcriptional response to anaerobicity and glucose-excess

    PubMed Central

    van den Brink, Joost; Daran-Lapujade, Pascale; Pronk, Jack T; de Winde, Johannes H

    2008-01-01

    Background The capacity of respiring cultures of Saccharomyces cerevisiae to immediately switch to fast alcoholic fermentation upon a transfer to anaerobic sugar-excess conditions is a key characteristic of Saccharomyces cerevisiae in many of its industrial applications. This transition was studied by exposing aerobic glucose-limited chemostat cultures grown at a low specific growth rate to two simultaneous perturbations: oxygen depletion and relief of glucose limitation. Results The shift towards fully fermentative conditions caused a massive transcriptional reprogramming, where one third of all genes within the genome were transcribed differentially. The changes in transcript levels were mostly driven by relief from glucose-limitation. After an initial strong response to the addition of glucose, the expression profile of most transcriptionally regulated genes displayed a clear switch at 30 minutes. In this respect, a striking difference was observed between the transcript profiles of genes encoding ribosomal proteins and those encoding ribosomal biogenesis components. Not all regulated genes responded with this binary profile. A group of 87 genes showed a delayed and steady increase in expression that specifically responded to anaerobiosis. Conclusion Our study demonstrated that, despite the complexity of this multiple-input perturbation, the transcriptional responses could be categorized and biologically interpreted. By comparing this study with public datasets representing dynamic and steady conditions, 14 up-regulated and 11 down-regulated genes were determined to be anaerobic specific. Therefore, these can be seen as true "signature" transcripts for anaerobicity under dynamic as well as under steady state conditions. PMID:18304306

  20. Transcription Factor FoxO1 Is Essential for Enamel Biomineralization

    PubMed Central

    Poché, Ross A.; Sharma, Ramaswamy; Garcia, Monica D.; Wada, Aya M.; Nolte, Mark J.; Udan, Ryan S.; Paik, Ji-Hye; DePinho, Ronald A.; Bartlett, John D.; Dickinson, Mary E.

    2012-01-01

    The Transforming growth factor β (Tgf-β) pathway, by signaling via the activation of Smad transcription factors, induces the expression of many diverse downstream target genes, thereby regulating a vast array of cellular events essential for proper development and homeostasis. In order for a specific cell type to properly interpret the Tgf-β signal and elicit a specific cellular response, cell-specific transcriptional co-factors often cooperate with the Smads to activate a discrete set of genes in the appropriate temporal and spatial manner. Here, via a conditional knockout approach, we show that mice mutant for Forkhead Box O transcription factor FoxO1 exhibit an enamel hypomaturation defect which phenocopies that of the Smad3 mutant mice. Furthermore, we determined that both the FoxO1 and Smad3 mutant teeth exhibit changes in the expression of a similar cohort of genes encoding enamel matrix proteins required for proper enamel development. These data raise the possibility that FoxO1 and Smad3 act in concert to regulate a common repertoire of genes necessary for complete enamel maturation. This study is the first to define an essential role for the FoxO family of transcription factors in tooth development and provides a new molecular entry point which will allow researchers to delineate novel genetic pathways regulating the process of biomineralization, which may also have significance for studies of human tooth diseases such as amelogenesis imperfecta. PMID:22291941

  1. Physical Interactions and Expression Quantitative Traits Loci Identify Regulatory Connections for Obesity and Type 2 Diabetes Associated SNPs

    PubMed Central

    Fadason, Tayaza; Ekblad, Cameron; Ingram, John R.; Schierding, William S.; O'Sullivan, Justin M.

    2017-01-01

    The mechanisms that underlie the association between obesity and type 2 diabetes are not fully understood. Here, we investigated the role of the 3D genome organization in the pathogeneses of obesity and type-2 diabetes. We interpreted the combined and differential impacts of 196 diabetes and 390 obesity associated single nucleotide polymorphisms (SNPs) by integrating data on the genes with which they physically interact (as captured by Hi-C) and the functional [i.e., expression quantitative trait loci (eQTL)] outcomes associated with these interactions. We identified 861 spatially regulated genes (e.g., AP3S2, ELP5, SVIP, IRS1, FADS2, WFS1, RBM6, HORMAD1, PYROXD2), which are enriched in tissues (e.g., adipose, skeletal muscle, pancreas) and biological processes and canonical pathways (e.g., lipid metabolism, leptin, and glucose-insulin signaling pathways) that are important for the pathogenesis of type 2 diabetes and obesity. Our discovery-based approach also identifies enrichment for eQTL SNP-gene interactions in tissues that are not classically associated with diabetes or obesity. We propose that the combinatorial action of active obesity and diabetes spatial eQTL SNPs on their gene pairs within different tissues reduces the ability of these tissues to contribute to the maintenance of a healthy energy metabolism. PMID:29081791

  2. Functional characterisation of osteosarcoma cell lines and identification of mRNAs and miRNAs associated with aggressive cancer phenotypes

    PubMed Central

    Lauvrak, S U; Munthe, E; Kresse, S H; Stratford, E W; Namløs, H M; Meza-Zepeda, L A; Myklebost, O

    2013-01-01

    Background: Osteosarcoma is the most common primary malignant bone tumour, predominantly affecting children and adolescents. Cancer cell line models are required to understand the underlying mechanisms of tumour progression and for preclinical investigations. Methods: To identify cell lines that are well suited for studies of critical cancer-related phenotypes, such as tumour initiation, growth and metastasis, we have evaluated 22 osteosarcoma cell lines for in vivo tumorigenicity, in vitro colony-forming ability, invasive/migratory potential and proliferation capacity. Importantly, we have also identified mRNA and microRNA (miRNA) gene expression patterns associated with these phenotypes by expression profiling. Results: The cell lines exhibited a wide range of cancer-related phenotypes, from rather indolent to very aggressive. Several mRNAs were differentially expressed in highly aggressive osteosarcoma cell lines compared with non-aggressive cell lines, including RUNX2, several S100 genes, collagen genes and genes encoding proteins involved in growth factor binding, cell adhesion and extracellular matrix remodelling. Most notably, four genes (COL1A2, KYNU, ACTG2 and NPPB) were differentially expressed in high and non-aggressive cell lines for all the cancer-related phenotypes investigated, suggesting that they might have important roles in the process of osteosarcoma tumorigenesis. At the miRNA level, miR-199b-5p and miR-100-3p were downregulated in the highly aggressive cell lines, whereas miR-155-5p, miR-135b-5p and miR-146a-5p were upregulated. miR-135b-5p and miR-146a-5p were further predicted to be linked to the metastatic capacity of the disease. Interpretation: The detailed characterisation of cell line phenotypes will support the selection of models to use for specific preclinical investigations. The differentially expressed mRNAs and miRNAs identified in this study may represent good candidates for future therapeutic targets. 
To our knowledge, this is the first time that expression profiles are associated with functional characteristics of osteosarcoma cell lines. PMID:24064976

  3. Suppression of Beneficial Mutations in Dynamic Microbial Populations

    NASA Astrophysics Data System (ADS)

    Bittihn, Philip; Hasty, Jeff; Tsimring, Lev S.

    2017-01-01

    Quantitative predictions for the spread of mutations in bacterial populations are essential to interpret evolution experiments and to improve the stability of synthetic gene circuits. We derive analytical expressions for the suppression factor for beneficial mutations in populations that undergo periodic dilutions, covering arbitrary population sizes, dilution factors, and growth advantages in a single stochastic model. We find that the suppression factor grows with the dilution factor and depends nontrivially on the growth advantage, resulting in the preferential elimination of mutations with certain growth advantages. We confirm our results by extensive numerical simulations.
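
    The effect of a dilution bottleneck on a new beneficial mutant can be illustrated with a toy calculation. This is a sketch under simplifying assumptions and not the stochastic model derived in the paper: the mutant lineage is assumed to grow deterministically, the wild type is assumed to grow exactly D-fold per cycle, and each cell is assumed to be kept independently with probability 1/D at dilution. All function and parameter names are hypothetical.

```python
def lineage_size_at_dilution(dilution_factor, advantage, arrival_fraction):
    """Deterministic size of a lineage founded by one mutant cell.

    Assumes the wild type grows exactly `dilution_factor`-fold per cycle,
    the mutant grows (1 + advantage) times faster, and the mutant appears
    after `arrival_fraction` (0..1) of the cycle has elapsed.
    """
    if dilution_factor < 1:
        raise ValueError("dilution_factor must be >= 1")
    if not 0.0 <= arrival_fraction <= 1.0:
        raise ValueError("arrival_fraction must lie in [0, 1]")
    return dilution_factor ** ((1.0 + advantage) * (1.0 - arrival_fraction))


def bottleneck_survival(lineage_size, dilution_factor):
    """Probability that at least one cell of the lineage survives dilution,
    with each cell kept independently with probability 1/dilution_factor."""
    if lineage_size < 0 or dilution_factor < 1:
        raise ValueError("need lineage_size >= 0 and dilution_factor >= 1")
    return 1.0 - (1.0 - 1.0 / dilution_factor) ** lineage_size


def survival_probability(dilution_factor, advantage, arrival_fraction):
    """Chance that a mutant lineage passes its first dilution step."""
    size = lineage_size_at_dilution(dilution_factor, advantage, arrival_fraction)
    return bottleneck_survival(size, dilution_factor)


if __name__ == "__main__":
    # Compare a neutral and an advantaged mutant arising at the start of a cycle.
    for s in (0.0, 0.5):
        print(s, survival_probability(100, s, 0.0))
```

    In this toy picture a mutant that arises late in a cycle has had little time to grow and is likely to be lost at dilution regardless of its advantage; the paper's analytical suppression factor additionally accounts for stochastic growth, which this sketch ignores.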

  4. Synergistic Effects of Toxic Elements on Heat Shock Proteins

    PubMed Central

    Mahmood, Khalid; Mahmood, Qaisar; Irshad, Muhammad; Hussain, Jamshaid

    2014-01-01

    Heat shock proteins show remarkable variations in their expression levels under a variety of toxic conditions. Research spanning more than five decades has revealed their molecular characterization, gene regulation, expression patterns, vast similarity in diverse groups, and broad range of functional capabilities. Their functions include protection and tolerance against cytotoxic conditions through their molecular chaperoning activity, maintaining cytoskeleton stability, and assisting in cell signaling. However, their role as biomarkers in environmental risk assessment is controversial due to a number of conflicting, validating, and nonvalidating reports. The current knowledge regarding the interpretation of HSP expression levels is discussed in the present review. The candidature of heat shock proteins as biomarkers of toxicity is thus far unreliable due to synergistic effects of toxicants and other environmental factors. The adoption of heat shock proteins as a “suite of biomarkers in a set of organisms” requires further investigation. PMID:25136596

  5. Cardiomyocyte-specific deletion of the G protein-coupled estrogen receptor (GPER) leads to left ventricular dysfunction and adverse remodeling: A sex-specific gene profiling analysis.

    PubMed

    Wang, Hao; Sun, Xuming; Chou, Jeff; Lin, Marina; Ferrario, Carlos M; Zapata-Sudo, Gisele; Groban, Leanne

    2017-08-01

    Activation of G protein-coupled estrogen receptor (GPER) by its agonist, G1, protects the heart from stressors such as pressure-overload, ischemia, a high-salt diet, estrogen loss, and aging, in various male and female animal models. Due to nonspecific effects of G1, the exact functions of cardiac GPER cannot be concluded from studies using systemic G1 administration. Moreover, global knockdown of GPER affects glucose homeostasis, blood pressure, and many other cardiovascular-related systems, thereby confounding interpretation of its direct cardiac actions. We generated a cardiomyocyte-specific GPER knockout (KO) mouse model to specifically investigate the functions of GPER in cardiomyocytes. Compared to wild type mice, cardiomyocyte-specific GPER KO mice exhibited adverse alterations in cardiac structure and impaired systolic and diastolic function, as measured by echocardiography. Gene deletion effects on left ventricular dimensions were more profound in male KO mice compared to female KO mice. Analysis of DNA microarray data from isolated cardiomyocytes of wild type and KO mice revealed sex-based differences in gene expression profiles affecting multiple transcriptional networks. Gene Set Enrichment Analysis (GSEA) revealed that mitochondrial genes are enriched in GPER KO females, whereas inflammatory response genes are enriched in GPER KO males, compared to their wild type counterparts of the same sex. The cardiomyocyte-specific GPER KO mouse model provides us with a powerful tool to study the functions of GPER in cardiomyocytes. The gene expression profiles of the GPER KO mice provide foundational information for further study of the mechanisms underlying sex-specific cardioprotection by GPER. Copyright © 2016 Elsevier B.V. All rights reserved.
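
    The enrichment statistic behind a GSEA-style analysis can be sketched as follows. This is a minimal, unweighted running-sum score written for illustration, not the GSEA software used in the study: the function name `enrichment_score` and the gene labels are hypothetical, the ranked list is assumed to contain each gene once, and permutation-based significance testing is omitted.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (Kolmogorov-Smirnov-like).

    `ranked_genes` is ordered from most up-regulated to most down-regulated.
    Walking down the list, the running sum rises at each gene in `gene_set`
    and falls otherwise; the score is the largest deviation from zero.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranking without covering it")

    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        if gene in members:
            running += 1.0 / n_hit
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


if __name__ == "__main__":
    # Placeholder gene labels, ranked by KO-versus-wild-type expression change.
    ranking = ["g1", "g2", "g3", "g4", "g5", "g6"]
    print(enrichment_score(ranking, {"g1", "g2"}))  # set concentrated at the top
    print(enrichment_score(ranking, {"g5", "g6"}))  # set concentrated at the bottom
```

    A positive score indicates that the set is concentrated among the up-regulated genes and a negative score that it is concentrated among the down-regulated genes; a real analysis would weight steps by the ranking metric and assess significance against permuted data.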

  6. Normal tubular regeneration and differentiation of the post-ischemic kidney in mice lacking vimentin.

    PubMed Central

    Terzi, F.; Maunoury, R.; Colucci-Guyon, E.; Babinet, C.; Federici, P.; Briand, P.; Friedlander, G.

    1997-01-01

    Proliferation and dedifferentiation of tubular cells are the hallmark of early regeneration after renal ischemic injury. Vimentin, a class III intermediate filament expressed only in mesenchymal cells of mature mammals, was shown to be transiently expressed in post-ischemic renal tubular epithelial cells. Vimentin re-expression was interpreted as a marker of cellular dedifferentiation, but its role in tubular regeneration after renal ischemia has also been hypothesized. This role was evaluated in mice bearing a null mutation of the vimentin gene. Expression of vimentin, proliferating cell nuclear antigen (a marker of cellular proliferation), and villin (a marker of differentiated brush-border membranes) was studied in wild-type (Vim+/+), heterozygous (Vim+/-), and homozygous (Vim-/-) mice subjected to transient ischemia of the left kidney. As expected, vimentin was detected by immunohistochemistry at the basal pole of proximal tubular cells from post-ischemic kidney in Vim+/+ and Vim+/- mice from day 2 to day 28. The expression of the reporter gene beta-galactosidase in Vim+/- and Vim-/- mice confirmed the tubular origin of vimentin. No compensatory expression of keratin could be demonstrated in Vim-/- mice. The intensity of proliferating cell nuclear antigen labeling and the pattern of villin expression were comparable in Vim-/-, Vim+/- and Vim+/+ mice at any time of the study. After 60 days, the structure of post-ischemic kidneys in Vim-/- mice was indistinguishable from that of normal non-operated kidneys in Vim+/+ mice. In conclusion, 1) the pattern of post-ischemic proximal tubular cell proliferation, differentiation, and tubular organization was not impaired in mice lacking vimentin and 2) these results suggest that the transient tubular expression of vimentin is not instrumental in tubular regeneration after renal ischemic injury. PMID:9094992

  7. The landscape of viral proteomics and its potential to impact human health

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oxford, Kristie L.; Wendler, Jason P.; McDermott, Jason E.

    2016-05-06

    Translating the intimate discourse between viruses and their host cells during infection is a challenging but critical task for development of antiviral interventions and diagnostics. Viruses commandeer cellular processes at every step of their life cycle, altering expression of genes and proteins. Advances in mass spectrometry-based proteomic technologies are enhancing studies of viral pathogenesis by identifying virus-induced changes in the protein repertoire of infected cells or extracellular fluids. Interpretation of proteomics results using knowledge of cellular pathways and networks leads to identification of proteins that influence a range of infection processes, thereby focusing efforts for clinical diagnoses and therapeutics development. Herein we discuss applications of global proteomic studies of viral infections with the goal of providing a basis for improved studies that will benefit community-wide data integration and interpretation.

  8. Brain Growth Across the Life Span in Autism: Age-Specific Changes in Anatomical Pathology

    PubMed Central

    Courchesne, Eric; Campbell, Kathleen; Solso, Stephanie

    2014-01-01

    Autism is marked by overgrowth of the brain at the earliest ages but not at older ages, when decreases in structural volumes and neuron numbers are observed instead. This has led to the theory of age-specific anatomic abnormalities in autism. Here we report age-related changes in brain size in autistic and typical subjects from 12 months to 50 years of age based on analyses of 586 longitudinal and cross-sectional MRI scans. This dataset is several times larger than the largest autism study to date. Results demonstrate early brain overgrowth during infancy and the toddler years in autistic boys and girls, followed by an accelerated rate of decline in size and perhaps degeneration from adolescence to late middle age in this disorder. We theorize that underlying these age-specific changes in anatomic abnormalities in autism there may also be age-specific changes in gene expression, molecular, synaptic, cellular and circuit abnormalities. A peak age for detecting and studying the earliest fundamental biological underpinnings of autism is prenatal life and the first three postnatal years. Studies of the older autistic brain may not address original causes but are essential to discovering how best to help the older aging autistic person. Lastly, the theory of age-specific anatomic abnormalities in autism has broad implications for a wide range of work on the disorder including the design, validation and interpretation of animal model, lymphocyte gene expression, brain gene expression, and genotype/CNV-anatomic phenotype studies. PMID:20920490

  9. Protein interaction networks from literature mining

    NASA Astrophysics Data System (ADS)

    Ihara, Sigeo

    2005-03-01

    The ability to accurately predict and understand physiological changes in the biological network system in response to disease or drug therapeutics is of crucial importance in life science. The extensive amount of gene expression data generated from even a single microarray experiment is often difficult to interpret fully, and its biological significance is hard to comprehend. An increasing knowledge of protein interactions stored in the PubMed database, as well as the advancement of natural language processing, however, makes it possible to construct protein interaction networks from the gene expression information that are essential for understanding the biological meaning. Using the in-house literature mining system we have developed, we constructed the protein interaction network for humans. By analysis based on the graph-theoretical characterization of the total interaction network in literature, we found that the network is scale-free and that semantic long-ranged interactions (i.e. inhibit, induce) between proteins dominate in the total interaction network, reducing the degree exponent. Interaction networks generated from scientific text in which the interaction event is ambiguously described result in disconnected networks. In contrast, interaction networks based on text in which the interaction events are clearly stated result in strongly connected networks. The results of protein-protein interaction networks obtained in real applications from microarray experiments are discussed; for example, comparisons of the gene expression data indicative of either a good or a poor prognosis for acute lymphoblastic leukemia with MLL rearrangements, using our system, showed newly discovered signaling cross-talk.
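
    The degree exponent mentioned above can be estimated from an edge list roughly as follows. This is a minimal sketch and not the authors' analysis: it uses a common approximate maximum-likelihood estimator for a discrete power-law tail P(k) ~ k^(-gamma), and the protein labels, function names and the choice of `k_min` are assumptions made for illustration.

```python
import math
from collections import Counter


def degrees(edges):
    """Count how many interactions each protein takes part in."""
    counts = Counter()
    for source, target in edges:
        counts[source] += 1
        counts[target] += 1
    return dict(counts)


def degree_exponent(degree_values, k_min=1):
    """Approximate maximum-likelihood estimate of gamma in P(k) ~ k^(-gamma),
    using only degrees k >= k_min (discrete-data approximation)."""
    if k_min < 1:
        raise ValueError("k_min must be at least 1")
    tail = [k for k in degree_values if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum


if __name__ == "__main__":
    # Hypothetical literature-mined interactions (protein labels are placeholders).
    edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4")]
    deg = degrees(edges)
    print(deg)
    print(degree_exponent(deg.values()))
```

    A smaller exponent corresponds to a heavier-tailed degree distribution, that is, relatively more highly connected hub proteins; a real analysis would use many more nodes and check the goodness of fit before calling a network scale-free.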

  10. Divergence of Iron Metabolism in Wild Malaysian Yeast

    PubMed Central

    Lee, Hana N.; Mostovoy, Yulia; Hsu, Tiffany Y.; Chang, Amanda H.; Brem, Rachel B.

    2013-01-01

    Comparative genomic studies have reported widespread variation in levels of gene expression within and between species. Using these data to infer organism-level trait divergence has proven to be a key challenge in the field. We have used a wild Malaysian population of S. cerevisiae as a test bed in the search to predict and validate trait differences based on observations of regulatory variation. Malaysian yeast, when cultured in standard medium, activated regulatory programs that protect cells from the toxic effects of high iron. Malaysian yeast also showed a hyperactive regulatory response during culture in the presence of excess iron and had a unique growth defect in conditions of high iron. Molecular validation experiments pinpointed the iron metabolism factors AFT1, CCC1, and YAP5 as contributors to these molecular and cellular phenotypes; in genome-scale sequence analyses, a suite of iron toxicity response genes showed evidence for rapid protein evolution in Malaysian yeast. Our findings support a model in which iron metabolism has diverged in Malaysian yeast as a consequence of a change in selective pressure, with Malaysian alleles shifting the dynamic range of iron response to low-iron concentrations and weakening resistance to extreme iron toxicity. By dissecting the iron scarcity specialist behavior of Malaysian yeast, our work highlights the power of expression divergence as a signpost for biologically and evolutionarily relevant variation at the organismal level. Interpreting the phenotypic relevance of gene expression variation is one of the primary challenges of modern genomics. PMID:24142925

  11. Divergence of iron metabolism in wild Malaysian yeast.

    PubMed

    Lee, Hana N; Mostovoy, Yulia; Hsu, Tiffany Y; Chang, Amanda H; Brem, Rachel B

    2013-12-09

    Comparative genomic studies have reported widespread variation in levels of gene expression within and between species. Using these data to infer organism-level trait divergence has proven to be a key challenge in the field. We have used a wild Malaysian population of S. cerevisiae as a test bed in the search to predict and validate trait differences based on observations of regulatory variation. Malaysian yeast, when cultured in standard medium, activated regulatory programs that protect cells from the toxic effects of high iron. Malaysian yeast also showed a hyperactive regulatory response during culture in the presence of excess iron and had a unique growth defect in conditions of high iron. Molecular validation experiments pinpointed the iron metabolism factors AFT1, CCC1, and YAP5 as contributors to these molecular and cellular phenotypes; in genome-scale sequence analyses, a suite of iron toxicity response genes showed evidence for rapid protein evolution in Malaysian yeast. Our findings support a model in which iron metabolism has diverged in Malaysian yeast as a consequence of a change in selective pressure, with Malaysian alleles shifting the dynamic range of iron response to low-iron concentrations and weakening resistance to extreme iron toxicity. By dissecting the iron scarcity specialist behavior of Malaysian yeast, our work highlights the power of expression divergence as a signpost for biologically and evolutionarily relevant variation at the organismal level. Interpreting the phenotypic relevance of gene expression variation is one of the primary challenges of modern genomics.

  12. Appropriate suppression of Notch signaling by Mesp factors is essential for stripe pattern formation leading to segment boundary formation.

    PubMed

    Takahashi, Yu; Yasuhiko, Yukuto; Kitajima, Satoshi; Kanno, Jun; Saga, Yumiko

    2007-04-15

    Mesp1 and Mesp2 are homologous transcription factors that are co-expressed in the anterior presomitic mesoderm (PSM) during mouse somitogenesis. The loss of Mesp2 alone in our conventional Mesp2-null mice results in the complete disruption of somitogenesis, including segment border formation, rostro-caudal patterning and epithelialization of somitic mesoderm. This has led us to interpret that Mesp2 is solely responsible for somitogenesis. Our novel Mesp2 knock-in alleles, however, exhibit a remarkable upregulation of Mesp1. Removal of the pgk-neo cassette from the new allele leads to localization of Mesp1 and several gene expression, and somite formation in the tail region. Moreover, a reduction in the gene dosage of Mesp1 by one copy disrupts somite formation, confirming the involvement of Mesp1 in the rescue events. Furthermore, we find that activated Notch1 knock-in significantly upregulates Mesp1 expression, even in the absence of a Notch signal mediator, Psen1. This indicates that the Psen1-independent effects of activated Notch1 are mostly attributable to the induction of Mesp1. However, we have also confirmed that Mesp2 enhances the expression of the Notch1 receptor in the anterior PSM. The activation and subsequent suppression of Notch signaling might thus be a crucial event for both stripe pattern formation and boundary formation.

  13. Molecular fingerprinting of principal neurons in the rodent hippocampus: A neuroinformatics approach.

    PubMed

    Hamilton, D J; White, C M; Rees, C L; Wheeler, D W; Ascoli, G A

    2017-09-10

    Neurons are often classified by their morphological and molecular properties. The online knowledge base Hippocampome.org primarily defines neuron types from the rodent hippocampal formation based on their main neurotransmitter (glutamate or GABA) and the spatial distributions of their axons and dendrites. For each neuron type, this open-access resource reports any and all published information regarding the presence or absence of known molecular markers, including calcium-binding proteins, neuropeptides, receptors, channels, transcription factors, and other molecules of biomedical relevance. The resulting chemical profile is relatively sparse: even for the best studied neuron types, the expression or lack thereof of fewer than 70 molecules has been firmly established to date. The mouse genome-wide in situ hybridization mapping of the Allen Brain Atlas provides a wealth of data that, when appropriately analyzed, can substantially augment the molecular marker knowledge in Hippocampome.org. Here we focus on the principal cell layers of dentate gyrus (DG), CA3, CA2, and CA1, which together contain approximately 90% of hippocampal neurons. These four anatomical parcels are densely packed with somata of mostly excitatory projection neurons. Thus, gene expression data for those layers can be justifiably linked to the respective principal neuron types: granule cells in DG and pyramidal cells in CA3, CA2, and CA1. In order to enable consistent interpretation across genes and regions, we screened the whole-genome dataset against known molecular markers of those neuron types. The resulting threshold values allow over 6000 very-high confidence (>99.5%) expressed/not-expressed assignments, expanding the biochemical information content of Hippocampome.org more than five-fold. Many of these newly identified molecular markers are potential pharmacological targets for major neurological and psychiatric conditions. 
Furthermore, our approach yields reasonable expression/non-expression estimates for every single gene in each of these four neuron types with >90% average confidence, providing a considerably complete genetic characterization of hippocampal principal neurons. Copyright © 2017 Elsevier B.V. All rights reserved.

  14. CEBS object model for systems biology data, SysBio-OM.

    PubMed

    Xirasagar, Sandhya; Gustafson, Scott; Merrick, B Alex; Tomer, Kenneth B; Stasiewicz, Stanley; Chan, Denny D; Yost, Kenneth J; Yates, John R; Sumner, Susan; Xiao, Nianqing; Waters, Michael D

    2004-09-01

    To promote a systems biology approach to understanding the biological effects of environmental stressors, the Chemical Effects in Biological Systems (CEBS) knowledge base is being developed to house data from multiple complex data streams in a systems-friendly manner that will accommodate extensive querying from users. Unified data representation via a single object model will greatly aid in integrating data storage and management, and facilitate reuse of software to analyze and display data resulting from diverse differential expression or differential profile technologies. Data streams include, but are not limited to, gene expression analysis (transcriptomics), protein expression and protein-protein interaction analysis (proteomics) and changes in low molecular weight metabolite levels (metabolomics). To enable the integration of microarray gene expression, proteomics and metabolomics data in the CEBS system, we designed an object model, Systems Biology Object Model (SysBio-OM). The model is comprehensive and leverages other open source efforts, namely the MicroArray Gene Expression Object Model (MAGE-OM) and the Proteomics Experiment Data Repository (PEDRo) object model. SysBio-OM is designed by extending MAGE-OM to represent protein expression data elements (including those from PEDRo), protein-protein interaction and metabolomics data. SysBio-OM promotes the standardization of data representation and data quality by facilitating the capture of the minimum annotation required for an experiment. Such standardization refines the accuracy of data mining and interpretation. The open source SysBio-OM model, which can be implemented on varied computing platforms, is presented here. A Unified Modeling Language (UML) depiction of the entire SysBio-OM is available at http://cebs.niehs.nih.gov/SysBioOM/. 
The Rational Rose object model package is distributed under an open source license that permits unrestricted academic and commercial use and is available at http://cebs.niehs.nih.gov/cebsdownloads. The database and interface are being built to implement the model and will be available for public use at http://cebs.niehs.nih.gov.

  15. GEM-TREND: a web tool for gene expression data mining toward relevant network discovery

    PubMed Central

    Feng, Chunlai; Araki, Michihiro; Kunimoto, Ryo; Tamon, Akiko; Makiguchi, Hiroki; Niijima, Satoshi; Tsujimoto, Gozoh; Okuno, Yasushi

    2009-01-01

    Background DNA microarray technology provides us with a first step toward the goal of uncovering gene functions on a genomic scale. In recent years, vast amounts of gene expression data have been collected, much of which are available in public databases, such as the Gene Expression Omnibus (GEO). To date, most researchers have been manually retrieving data from databases through web browsers using accession numbers (IDs) or keywords, but gene-expression patterns are not considered when retrieving such data. The Connectivity Map was recently introduced to compare gene expression data by introducing gene-expression signatures (represented by a set of genes with up- or down-regulated labels according to their biological states) and is available as a web tool for detecting similar gene-expression signatures from a limited data set (approximately 7,000 expression profiles representing 1,309 compounds). In order to support researchers to utilize the public gene expression data more effectively, we developed a web tool for finding similar gene expression data and generating its co-expression networks from a publicly available database. Results GEM-TREND, a web tool for searching gene expression data, allows users to search data from GEO using gene-expression signatures or gene expression ratio data as a query and retrieve gene expression data by comparing gene-expression pattern between the query and GEO gene expression data. The comparison methods are based on the nonparametric, rank-based pattern matching approach of Lamb et al. (Science 2006) with the additional calculation of statistical significance. The web tool was tested using gene expression ratio data randomly extracted from the GEO and with in-house microarray data, respectively. The results validated the ability of GEM-TREND to retrieve gene expression entries biologically related to a query from GEO. 
For further analysis, a network visualization interface is also provided, whereby genes and gene annotations are dynamically linked to external data repositories. Conclusion GEM-TREND was developed to retrieve gene expression data by comparing a query gene-expression pattern with those of GEO gene expression data. It could be a very useful resource for finding similar gene expression profiles and constructing their gene co-expression networks from a publicly available database. GEM-TREND was designed to be user-friendly and is expected to support knowledge discovery. GEM-TREND is freely available at http://cgs.pharm.kyoto-u.ac.jp/services/network. PMID:19728865
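
    The rank-based pattern matching that the abstract attributes to Lamb et al. can be illustrated with a small sketch. The code below is not GEM-TREND's implementation; it assumes the Kolmogorov-Smirnov-style connectivity score described for the Connectivity Map, uses hypothetical function names (`ks_score`, `connectivity`), and omits the additional significance calculation that GEM-TREND adds.

```python
from typing import Dict, List


def ks_score(tags: List[str], ranks: Dict[str, int], n: int) -> float:
    """Signed Kolmogorov-Smirnov-style statistic for one tag list.

    ranks maps each gene to its 1-based position in the reference profile
    (1 = most up-regulated); n is the number of ranked genes.
    """
    positions = sorted(ranks[g] for g in tags if g in ranks)
    t = len(positions)
    if t == 0:
        return 0.0
    # a: largest excess of tags near the top; b: largest excess near the bottom
    a = max((j + 1) / t - v / n for j, v in enumerate(positions))
    b = max(v / n - j / t for j, v in enumerate(positions))
    return a if a > b else -b


def connectivity(up: List[str], down: List[str], profile: List[str]) -> float:
    """Raw connectivity score between a signature and a ranked profile.

    profile lists genes from most up-regulated to most down-regulated.
    The score is 0 when the up and down tag lists point the same way.
    """
    n = len(profile)
    ranks = {gene: i + 1 for i, gene in enumerate(profile)}
    ks_up = ks_score(up, ranks, n)
    ks_down = ks_score(down, ranks, n)
    if ks_up * ks_down > 0:
        return 0.0
    return ks_up - ks_down
```

    A positive score suggests that the query signature and the stored profile change in the same direction, a negative score suggests opposite changes, and a score near zero suggests no consistent relationship.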

  16. GEM-TREND: a web tool for gene expression data mining toward relevant network discovery.

    PubMed

    Feng, Chunlai; Araki, Michihiro; Kunimoto, Ryo; Tamon, Akiko; Makiguchi, Hiroki; Niijima, Satoshi; Tsujimoto, Gozoh; Okuno, Yasushi

    2009-09-03

    DNA microarray technology provides us with a first step toward the goal of uncovering gene functions on a genomic scale. In recent years, vast amounts of gene expression data have been collected, much of which are available in public databases, such as the Gene Expression Omnibus (GEO). To date, most researchers have been manually retrieving data from databases through web browsers using accession numbers (IDs) or keywords, but gene-expression patterns are not considered when retrieving such data. The Connectivity Map was recently introduced to compare gene expression data by introducing gene-expression signatures (represented by a set of genes with up- or down-regulated labels according to their biological states) and is available as a web tool for detecting similar gene-expression signatures from a limited data set (approximately 7,000 expression profiles representing 1,309 compounds). In order to support researchers to utilize the public gene expression data more effectively, we developed a web tool for finding similar gene expression data and generating its co-expression networks from a publicly available database. GEM-TREND, a web tool for searching gene expression data, allows users to search data from GEO using gene-expression signatures or gene expression ratio data as a query and retrieve gene expression data by comparing gene-expression pattern between the query and GEO gene expression data. The comparison methods are based on the nonparametric, rank-based pattern matching approach of Lamb et al. (Science 2006) with the additional calculation of statistical significance. The web tool was tested using gene expression ratio data randomly extracted from the GEO and with in-house microarray data, respectively. The results validated the ability of GEM-TREND to retrieve gene expression entries biologically related to a query from GEO. 
For further analysis, a network visualization interface is also provided, whereby genes and gene annotations are dynamically linked to external data repositories. GEM-TREND was developed to retrieve gene expression data by comparing a query gene-expression pattern with those of GEO gene expression data. It could be a very useful resource for finding similar gene expression profiles and constructing their gene co-expression networks from a publicly available database. GEM-TREND was designed to be user-friendly and is expected to support knowledge discovery. GEM-TREND is freely available at http://cgs.pharm.kyoto-u.ac.jp/services/network.

  17. Optimal Threshold Determination for Interpreting Semantic Similarity and Particularity: Application to the Comparison of Gene Sets and Metabolic Pathways Using GO and ChEBI

    PubMed Central

    Bettembourg, Charles; Diot, Christian; Dameron, Olivier

    2015-01-01

    Background The analysis of gene annotations referencing back to Gene Ontology plays an important role in the interpretation of high-throughput experiments results. This analysis typically involves semantic similarity and particularity measures that quantify the importance of the Gene Ontology annotations. However, there is currently no sound method supporting the interpretation of the similarity and particularity values in order to determine whether two genes are similar or whether one gene has some significant particular function. Interpretation is frequently based either on an implicit threshold, or an arbitrary one (typically 0.5). Here we investigate a method for determining thresholds supporting the interpretation of the results of a semantic comparison. Results We propose a method for determining the optimal similarity threshold by minimizing the proportions of false-positive and false-negative similarity matches. We compared the distributions of the similarity values of pairs of similar genes and pairs of non-similar genes. These comparisons were performed separately for all three branches of the Gene Ontology. In all situations, we found overlap between the similar and the non-similar distributions, indicating that some similar genes had a similarity value lower than the similarity value of some non-similar genes. We then extend this method to the semantic particularity measure and to a similarity measure applied to the ChEBI ontology. Thresholds were evaluated over the whole HomoloGene database. For each group of homologous genes, we computed all the similarity and particularity values between pairs of genes. Finally, we focused on the PPAR multigene family to show that the similarity and particularity patterns obtained with our thresholds were better at discriminating orthologs and paralogs than those obtained using default thresholds. Conclusion We developed a method for determining optimal semantic similarity and particularity thresholds. 
We applied this method on the GO and ChEBI ontologies. Qualitative analysis using the thresholds on the PPAR multigene family yielded biologically-relevant patterns. PMID:26230274
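
    The threshold search described in the abstract can be sketched as follows. This is an illustrative sketch only: it assumes that the quantity minimized is the sum of the false-negative and false-positive proportions, and the function name `optimal_threshold` and the toy scores in the usage note are hypothetical rather than taken from the paper.

```python
from typing import List, Tuple


def optimal_threshold(similar: List[float],
                      non_similar: List[float]) -> Tuple[float, float]:
    """Return (threshold, error) minimizing FN proportion + FP proportion.

    similar holds similarity values of gene pairs known to be similar,
    non_similar those of pairs known not to be. A pair is called similar
    when its value is >= threshold. Ties keep the lowest threshold.
    """
    candidates = sorted(set(similar) | set(non_similar))
    best_t, best_err = candidates[0], float("inf")
    for t in candidates:
        # similar pairs that fall below the threshold are false negatives
        fn = sum(s < t for s in similar) / len(similar)
        # non-similar pairs at or above the threshold are false positives
        fp = sum(s >= t for s in non_similar) / len(non_similar)
        if fn + fp < best_err:
            best_t, best_err = t, fn + fp
    return best_t, best_err
```

    For example, with similar-pair scores [0.9, 0.8, 0.7, 0.4] and non-similar-pair scores [0.1, 0.2, 0.3, 0.5], the two distributions overlap, so no threshold separates them perfectly; the sketch picks 0.4, which misclassifies only the non-similar pair scored 0.5.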

  18. Neighboring Genes Show Correlated Evolution in Gene Expression

    PubMed Central

    Ghanbarian, Avazeh T.; Hurst, Laurence D.

    2015-01-01

    When considering the evolution of a gene’s expression profile, we commonly assume that this is unaffected by its genomic neighborhood. This is, however, in contrast to what we know about the lack of autonomy between neighboring genes in gene expression profiles in extant taxa. Indeed, in all eukaryotic genomes genes of similar expression-profile tend to cluster, reflecting chromatin level dynamics. Does it follow that if a gene increases expression in a particular lineage then the genomic neighbors will also increase in their expression or is gene expression evolution autonomous? To address this here we consider evolution of human gene expression since the human-chimp common ancestor, allowing for both variation in estimation of current expression level and error in Bayesian estimation of the ancestral state. We find that in all tissues and both sexes, the change in gene expression of a focal gene on average predicts the change in gene expression of neighbors. The effect is highly pronounced in the immediate vicinity (<100 kb) but extends much further. Sex-specific expression change is also genomically clustered. As genes increasing their expression in humans tend to avoid nuclear lamina domains and be enriched for the gene activator 5-hydroxymethylcytosine, we conclude that, most probably owing to chromatin level control of gene expression, a change in gene expression of one gene likely affects the expression evolution of neighbors, what we term expression piggybacking, an analog of hitchhiking. PMID:25743543

  19. RNA-Seq based transcriptome of whole blood from immunocompetent pigs (Sus scrofa) experimentally infected with Mycoplasma suis strain Illinois.

    PubMed

    do Nascimento, Naíla C; Guimaraes, Ana M S; Dos Santos, Andrea P; Chu, Yuefeng; Marques, Lucas M; Messick, Joanne B

    2018-06-18

    Pigs are popular animal models in biomedical research. RNA-Seq is becoming the predominant tool to investigate transcriptional changes of the pig's response to infection. The high sensitivity of this tool requires a strict control of the study design beginning with the selection of healthy animals to provide accurate interpretation of research data. Pigs chronically infected with Mycoplasma suis often show no obvious clinical signs; however, the infection may affect the validity of animal research. The goal of this study was to investigate whether or not this silent infection is also silent at the host transcriptional level. Therefore, immunocompetent pigs were experimentally infected with M. suis and transcriptional profiles of whole blood, generated by RNA-Seq, were analyzed and compared to non-infected animals. RNA-Seq showed 55 differentially expressed (DE) genes in the M. suis infected pigs. Down-regulation of genes related to innate immunity (tlr8, chemokines, chemokine receptors) and genes containing IFN gamma-activated sequence (gbp1, gbp2, il15, cxcl10, casp1, cd274) suggests a general suppression of the immune response in the infected animals. Sixteen (29.09%) of the DE genes were involved in two protein interaction networks: one involving chemokines, chemokine receptors and interleukin-15 and another involving the complement cascade. Genes related to vascular permeability, blood coagulation, and endothelium integrity were also DE in infected pigs. These findings suggest that M. suis subclinical infection causes significant alterations in blood mRNA levels, which could impact data interpretation of research using pigs. Screening of pigs for M. suis infection before initiating animal studies is strongly recommended.

  20. Integration of copy number and transcriptomics provides risk stratification in prostate cancer: A discovery and validation cohort study

    PubMed Central

    Ross-Adams, H.; Lamb, A.D.; Dunning, M.J.; Halim, S.; Lindberg, J.; Massie, C.M.; Egevad, L.A.; Russell, R.; Ramos-Montoya, A.; Vowler, S.L.; Sharma, N.L.; Kay, J.; Whitaker, H.; Clark, J.; Hurst, R.; Gnanapragasam, V.J.; Shah, N.C.; Warren, A.Y.; Cooper, C.S.; Lynch, A.G.; Stark, R.; Mills, I.G.; Grönberg, H.; Neal, D.E.

    2015-01-01

    Background Understanding the heterogeneous genotypes and phenotypes of prostate cancer is fundamental to improving the way we treat this disease. As yet, there are no validated descriptions of prostate cancer subgroups derived from integrated genomics linked with clinical outcome. Methods In a study of 482 tumour, benign and germline samples from 259 men with primary prostate cancer, we used integrative analysis of copy number alterations (CNA) and array transcriptomics to identify genomic loci that affect expression levels of mRNA in an expression quantitative trait loci (eQTL) approach, to stratify patients into subgroups that we then associated with future clinical behaviour, and compared with either CNA or transcriptomics alone. Findings We identified five separate patient subgroups with distinct genomic alterations and expression profiles based on 100 discriminating genes in our separate discovery and validation sets of 125 and 103 men. These subgroups were able to consistently predict biochemical relapse (p = 0.0017 and p = 0.016 respectively) and were further validated in a third cohort with long-term follow-up (p = 0.027). We show the relative contributions of gene expression and copy number data on phenotype, and demonstrate the improved power gained from integrative analyses. We confirm alterations in six genes previously associated with prostate cancer (MAP3K7, MELK, RCBTB2, ELAC2, TPD52, ZBTB4), and also identify 94 genes not previously linked to prostate cancer progression that would not have been detected using either transcript or copy number data alone. We confirm a number of previously published molecular changes associated with high risk disease, including MYC amplification, and NKX3-1, RB1 and PTEN deletions, as well as over-expression of PCA3 and AMACR, and loss of MSMB in tumour tissue. 
A subset of the 100 genes outperforms established clinical predictors of poor prognosis (PSA, Gleason score), as well as previously published gene signatures (p = 0.0001). We further show how our molecular profiles can be used for the early detection of aggressive cases in a clinical setting, and inform treatment decisions. Interpretation For the first time in prostate cancer this study demonstrates the importance of integrated genomic analyses incorporating both benign and tumour tissue data in identifying molecular alterations leading to the generation of robust gene sets that are predictive of clinical outcome in independent patient cohorts. PMID:26501111

  1. Interactive knowledge discovery and data mining on genomic expression data with numeric formal concept analysis.

    PubMed

    González-Calabozo, Jose M; Valverde-Albacete, Francisco J; Peláez-Moreno, Carmen

    2016-09-15

    Gene Expression Data (GED) analysis poses a great challenge to the scientific community that can be framed into the Knowledge Discovery in Databases (KDD) and Data Mining (DM) paradigm. Biclustering has emerged as the machine learning method of choice to solve this task, but its unsupervised nature makes result assessment problematic. This is often addressed by means of Gene Set Enrichment Analysis (GSEA). We put forward a framework in which GED analysis is understood as an Exploratory Data Analysis (EDA) process where we provide support for continuous human interaction with data aiming at improving the step of hypothesis abduction and assessment. We focus on the adaptation to human cognition of data interpretation and visualization of the output of EDA. First, we give a proper theoretical background to bi-clustering using Lattice Theory and provide a set of analysis tools revolving around [Formula: see text]-Formal Concept Analysis ([Formula: see text]-FCA), a lattice-theoretic unsupervised learning technique for real-valued matrices. By using different kinds of cost structures to quantify expression we obtain different sequences of hierarchical bi-clusterings for gene under- and over-expression using thresholds. Consequently, we provide a method with interleaved analysis steps and visualization devices so that the sequences of lattices for a particular experiment summarize the researcher's vision of the data. This also allows us to define measures of persistence and robustness of biclusters to assess them. Second, the resulting biclusters are used to index external omics databases (for instance, Gene Ontology (GO)), thus offering a new way of accessing publicly available resources. This provides different flavors of gene set enrichment against which to assess the biclusters, by obtaining their p-values according to the terminology of those resources. We illustrate the exploration procedure on a real data example confirming results previously published. 
The GED analysis problem gets transformed into the exploration of a sequence of lattices enabling the visualization of the hierarchical structure of the biclusters with a certain degree of granularity. The ability of FCA-based bi-clustering methods to index external databases such as GO allows us to obtain a quality measure of the biclusters, to observe the evolution of a gene throughout the different biclusters it appears in, to look for relevant biclusters (by observing their genes and what their persistence is) to infer, for instance, hypotheses on their function.
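
    The idea of obtaining biclusters as lattice elements at a given expression threshold can be illustrated with ordinary binary Formal Concept Analysis. The sketch below is a simplification and not the authors' method: it assumes a plain cut-off `phi` applied to a real-valued matrix instead of the cost structures mentioned in the abstract, enumerates concepts by brute force (suitable only for small matrices), and uses a hypothetical function name.

```python
from itertools import combinations
from typing import FrozenSet, List, Set, Tuple

Concept = Tuple[FrozenSet[int], FrozenSet[int]]


def concepts(matrix: List[List[float]], phi: float) -> Set[Concept]:
    """Enumerate formal concepts (gene set, condition set) of matrix >= phi.

    Rows are genes and columns are conditions. Each concept is a maximal
    bicluster: every listed gene reaches phi in every listed condition,
    and neither set can be enlarged without breaking that property.
    """
    n_genes, n_conds = len(matrix), len(matrix[0])
    found: Set[Concept] = set()
    for k in range(n_conds + 1):
        for conds in combinations(range(n_conds), k):
            # extent: genes over-expressed in all chosen conditions
            extent = frozenset(
                g for g in range(n_genes)
                if all(matrix[g][c] >= phi for c in conds)
            )
            # intent: all conditions shared by those genes (closure step)
            intent = frozenset(
                c for c in range(n_conds)
                if all(matrix[g][c] >= phi for g in extent)
            )
            found.add((extent, intent))
    return found
```

    Calling the function with a sequence of increasing `phi` values yields a sequence of concept sets, which gives a rough picture of how biclusters persist or vanish as the threshold changes.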

  2. Identification of Human HK Genes and Gene Expression Regulation Study in Cancer from Transcriptomics Data Analysis

    PubMed Central

    Zhang, Zhang; Liu, Jingxing; Wu, Jiayan; Yu, Jun

    2013-01-01

    The regulation of gene expression is essential for eukaryotes, as it drives the processes of cellular differentiation and morphogenesis, leading to the creation of different cell types in multicellular organisms. RNA-Sequencing (RNA-Seq) provides researchers with a powerful toolbox for characterization and quantification of the transcriptome. Many different human tissue/cell transcriptome datasets coming from RNA-Seq technology are available in public data resources. The fundamental issue here is how to develop an effective analysis method to estimate expression pattern similarities between different tumor tissues and their corresponding normal tissues. We define the gene expression pattern from three directions: 1) expression breadth, which reflects gene expression on/off status, and mainly concerns ubiquitously expressed genes; 2) low/high or constant/variable expression genes, based on gene expression level and variation; and 3) the regulation of gene expression at the gene structure level. The cluster analysis indicates that gene expression pattern is more closely related to physiological condition than to tissue spatial distance. Two sets of human housekeeping (HK) genes are defined according to cell/tissue types, respectively. To characterize the gene expression pattern in gene expression level and variation, we first apply an improved K-means algorithm and a gene expression variance model. We find that cancer-associated HK genes (HK genes that are specific to the cancer group and not to the normal group) are expressed at higher and more variable levels in the cancer condition than in the normal condition. Cancer-associated HK genes tend to be AT-rich, and they are enriched in cell cycle regulation related functions and constitute some cancer signatures. The expression of large genes is also avoided in the cancer group. These studies will help us understand which cell type-specific patterns of gene expression differ among different cell types, and particularly for cancer. PMID:23382867
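
    The notions of expression breadth and cancer-associated HK genes can be made concrete with a short sketch. The code below is an assumption-laden illustration rather than the authors' pipeline: it treats a gene as "on" in a sample when its expression exceeds a hypothetical `cutoff`, calls a gene housekeeping when it is on in every sample of a group, and uses invented function names.

```python
from typing import Dict, List, Set


def expression_breadth(values: List[float], cutoff: float = 1.0) -> int:
    """Number of samples in which a gene is 'on' (expression > cutoff)."""
    return sum(v > cutoff for v in values)


def housekeeping(expr: Dict[str, List[float]], cutoff: float = 1.0) -> Set[str]:
    """Genes that are on in every sample of the group (full breadth)."""
    return {
        gene for gene, values in expr.items()
        if expression_breadth(values, cutoff) == len(values)
    }


def cancer_associated_hk(cancer: Dict[str, List[float]],
                         normal: Dict[str, List[float]],
                         cutoff: float = 1.0) -> Set[str]:
    """HK genes of the cancer group that are not HK in the normal group."""
    return housekeeping(cancer, cutoff) - housekeeping(normal, cutoff)
```

    The set difference mirrors the abstract's definition of a cancer-associated HK gene; the appropriate cutoff and any normalization of expression values would depend on the dataset.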

  3. Cultivar Variation in Hormonal Balance Is a Significant Determinant of Disease Susceptibility to Xanthomonas campestris pv. campestris in Brassica napus.

    PubMed

    Islam, Md Tabibul; Lee, Bok-Rye; Park, Sang-Hyun; La, Van Hien; Bae, Dong-Won; Kim, Tae-Hwan

    2017-01-01

    This study aimed to directly elucidate cultivar variation in disease susceptibility and disease responses in relation to hormonal status in the interaction of Brassica napus cultivars and Xanthomonas campestris pv. campestris (Xcc), the causal agent of black rot disease. Fully expanded leaves of six B. napus cultivars (cvs. Capitol, Youngsan, Saturnin, Colosse, Tamra, and Mosa) were inoculated with Xcc. At 14 days post-inoculation with Xcc, cultivar variation in susceptibility or resistance was interpreted with defense responses as estimated by redox status, defensive metabolites, and expression of phenylpropanoid synthesis-related genes in relation to endogenous hormonal status. Disease susceptibility of six cultivars was distinguished by necrotic lesions in the Xcc-inoculated leaves and characterized concurrently based on the higher increase in reactive oxygen species and lipid peroxidation. Among these cultivars, as the susceptibility was higher, the ratios of abscisic acid (ABA)/jasmonic acid (JA) and salicylic acid (SA)/JA tended to increase with enhanced expression of SA signaling regulatory gene NPR1 and transcriptional factor TGA1 and antagonistic suppression of JA-regulated gene PDF1.2. In the resistant cultivar (cv. Capitol), accumulation of defensive metabolites with enhanced expression of genes involved in flavonoids (chalcone synthase), proanthocyanidins (anthocyanidin reductase), and hydroxycinnamic acids (ferulate-5-hydroxylase) biosynthesis and higher redox status were observed, whereas the opposite results were obtained for susceptible cultivars (cvs. Mosa and Tamra). These results clearly indicate that cultivar variation in susceptibility to infection by Xcc was determined by enhanced alteration of the SA/JA ratio, as a negative regulator of redox status and phenylpropanoid synthesis in the Brassica napus-Xcc pathosystem.

  4. A novel member of the SAF (scaffold attachment factor)-box protein family inhibits gene expression and induces apoptosis

    PubMed Central

    Chan, Ching Wan; Lee, Youn-Bok; Uney, James; Flynn, Andrea; Tobias, Jonathan H.; Norman, Michael

    2007-01-01

    The SLTM [SAF (scaffold attachment factor)-like transcription modulator] protein contains a SAF-box DNA-binding motif and an RNA-binding domain, and shares an overall identity of 34% with SAFB1 {scaffold attachment factor-B1; also known as SAF-B (scaffold attachment factor B), HET [heat-shock protein 27 ERE (oestrogen response element) and TATA-box-binding protein] or HAP (heterogeneous nuclear ribonucleoprotein A1-interacting protein)}. Here, we show that SLTM is localized to the cell nucleus, but excluded from nucleoli, and to a large extent it co-localizes with SAFB1. In the nucleus, SLTM has a punctate distribution and it does not co-localize with SR (serine/arginine) proteins. Overexpression of SAFB1 has been shown to exert a number of inhibitory effects, including suppression of oestrogen signalling. Although SLTM also suppressed the ability of oestrogen to activate a reporter gene in MCF-7 breast-cancer cells, inhibition of a constitutively active β-galactosidase gene suggested that this was primarily the consequence of a generalized inhibitory effect on transcription. Measurement of RNA synthesis, which showed a particularly marked inhibition of [3H]uridine incorporation into mRNA, supported this conclusion. In addition, analysis of cell-cycle parameters, chromatin condensation and cytochrome c release showed that SLTM induced apoptosis in a range of cultured cell lines. Thus the inhibitory effects of SLTM on gene expression appear to result from generalized down-regulation of mRNA synthesis and initiation of apoptosis consequent upon overexpressing the protein. While indicating a crucial role for SLTM in cellular function, these results also emphasize the need for caution when interpreting phenotypic changes associated with manipulation of protein expression levels. PMID:17630952

  5. Expression of lignocellulolytic enzymes in Pichia pastoris

    PubMed Central

    2012-01-01

    Background Sustainable utilization of plant biomass as renewable source for fuels and chemical building blocks requires a complex mixture of diverse enzymes, including hydrolases which comprise the largest class of lignocellulolytic enzymes. These enzymes need to be available in large amounts at a low price to allow sustainable and economic biotechnological processes. Over the past years Pichia pastoris has become an attractive host for the cost-efficient production and engineering of heterologous (eukaryotic) proteins due to several advantages. Results In this paper codon optimized genes and synthetic alcohol oxidase 1 promoter variants were used to generate Pichia pastoris strains which individually expressed cellobiohydrolase 1, cellobiohydrolase 2 and beta-mannanase from Trichoderma reesei and xylanase A from Thermomyces lanuginosus. For three of these enzymes we could develop strains capable of secreting gram quantities of enzyme per liter in fed-batch cultivations. Additionally, we compared our achieved yields of secreted enzymes and the corresponding activities to literature data. Conclusion In our experiments we could clearly show the importance of gene optimization and strain characterization for successfully improving secretion levels. We also present a basic guideline how to correctly interpret the interplay of promoter strength and gene dosage for a successful improvement of the secretory production of lignocellulolytic enzymes in Pichia pastoris. PMID:22583625

  6. Transcriptome analysis in non-model species: a new method for the analysis of heterologous hybridization on microarrays

    PubMed Central

    2010-01-01

    Background Recent developments in high-throughput methods of analyzing transcriptomic profiles are promising for many areas of biology, including ecophysiology. However, although commercial microarrays are available for most common laboratory models, transcriptome analysis in non-traditional model species still remains a challenge. Indeed, the signal resulting from heterologous hybridization is low and difficult to interpret because of the weak complementarity between probe and target sequences, especially when no microarray dedicated to a genetically close species is available. Results We show here that transcriptome analysis in a species genetically distant from laboratory models is made possible by using MAXRS, a new method of analyzing heterologous hybridization on microarrays. This method takes advantage of the design of several commercial microarrays, with different probes targeting the same transcript. To illustrate and test this method, we analyzed the transcriptome of king penguin pectoralis muscle hybridized to Affymetrix chicken microarrays, two organisms separated by an evolutionary distance of approximately 100 million years. The differential gene expression observed between different physiological situations computed by MAXRS was confirmed by real-time PCR on 10 genes out of 11 tested. Conclusions MAXRS appears to be an appropriate method for gene expression analysis under heterologous hybridization conditions. PMID:20509979

  7. Plant Reactome: a resource for plant pathways and comparative analysis.

    PubMed

    Naithani, Sushma; Preece, Justin; D'Eustachio, Peter; Gupta, Parul; Amarasinghe, Vindhya; Dharmawardhana, Palitha D; Wu, Guanming; Fabregat, Antonio; Elser, Justin L; Weiser, Joel; Keays, Maria; Fuentes, Alfonso Munoz-Pomer; Petryszak, Robert; Stein, Lincoln D; Ware, Doreen; Jaiswal, Pankaj

    2017-01-04

    Plant Reactome (http://plantreactome.gramene.org/) is a free, open-source, curated plant pathway database portal, provided as part of the Gramene project. The database provides intuitive bioinformatics tools for the visualization, analysis and interpretation of pathway knowledge to support genome annotation, genome analysis, modeling, systems biology, basic research and education. Plant Reactome employs the structural framework of a plant cell to show metabolic, transport, genetic, developmental and signaling pathways. We manually curate molecular details of pathways in these domains for reference species Oryza sativa (rice) supported by published literature and annotation of well-characterized genes. Two hundred twenty-two rice pathways, 1025 reactions associated with 1173 proteins, 907 small molecules and 256 literature references have been curated to date. These reference annotations were used to project pathways for 62 model, crop and evolutionarily significant plant species based on gene homology. Database users can search and browse various components of the database, visualize curated baseline expression of pathway-associated genes provided by the Expression Atlas and upload and analyze their Omics datasets. The database also offers data access via Application Programming Interfaces (APIs) and in various standardized pathway formats, such as SBML and BioPAX. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  8. Fundamentals of nutrigenetics and nutrigenomics.

    PubMed

    Bouchard, Claude; Ordovas, Jose M

    2012-01-01

    This volume of Progress in Molecular Biology and Translational Science is devoted to the exciting and promising field of nutrigenetics and nutrigenomics. The introductory chapter defines the basic concepts necessary for the interpretation of the material covered in the remainder of the volume. Emphasis is on the concept of personalized nutrition and its likely role in public health and disease prevention, as well as in therapeutics. Nutrigenetics refers to the role of DNA sequence variation in the responses to nutrients, whereas nutrigenomics is the study of the role of nutrients in gene expression. This research is predicated on the assumption that there are individual differences in responsiveness to acute or repeated exposures to a given nutrient or combination of nutrients. Throughout human history, diet has affected the expression of genes, resulting in phenotypes that are able to successfully respond to environmental challenges and that allow better exploitation of food resources. These adaptations have been key to human growth and development. Technological advances have made it possible to investigate not only specific genes but also to explore in unbiased designs the whole genome-wide complement of DNA sequence variants or transcriptome. These advances provide an opportunity to establish the foundation for incorporating biological individuality into dietary recommendations, with significant therapeutic potential. Copyright © 2012 Elsevier Inc. All rights reserved.

  9. An integrated clinical and genomic information system for cancer precision medicine.

    PubMed

    Jang, Yeongjun; Choi, Taekjin; Kim, Jongho; Park, Jisub; Seo, Jihae; Kim, Sangok; Kwon, Yeajee; Lee, Seungjae; Lee, Sanghyuk

    2018-04-20

    Increasing affordability of next-generation sequencing (NGS) has created an opportunity for realizing genomically-informed personalized cancer therapy as a path to precision oncology. However, the complex nature of genomic information presents a huge challenge for clinicians in interpreting the patient's genomic alterations and selecting the optimum approved or investigational therapy. An elaborate and practical information system is urgently needed to support clinical decision as well as to test clinical hypotheses quickly. Here, we present an integrated clinical and genomic information system (CGIS) based on NGS data analyses. Major components include modules for handling clinical data, NGS data processing, variant annotation and prioritization, drug-target-pathway analysis, and population cohort explorer. We built a comprehensive knowledgebase of genes, variants, drugs by collecting annotated information from public and in-house resources. Structured reports for molecular pathology are generated using standardized terminology in order to help clinicians interpret genomic variants and utilize them for targeted cancer therapy. We also implemented many features useful for testing hypotheses to develop prognostic markers from mutation and gene expression data. Our CGIS software is an attempt to provide useful information for both clinicians and scientists who want to explore genomic information for precision oncology.

  10. Neighboring Genes Show Correlated Evolution in Gene Expression.

    PubMed

    Ghanbarian, Avazeh T; Hurst, Laurence D

    2015-07-01

    When considering the evolution of a gene's expression profile, we commonly assume that this is unaffected by its genomic neighborhood. This is, however, in contrast to what we know about the lack of autonomy between neighboring genes in gene expression profiles in extant taxa. Indeed, in all eukaryotic genomes genes of similar expression-profile tend to cluster, reflecting chromatin level dynamics. Does it follow that if a gene increases expression in a particular lineage then the genomic neighbors will also increase in their expression or is gene expression evolution autonomous? To address this here we consider evolution of human gene expression since the human-chimp common ancestor, allowing for both variation in estimation of current expression level and error in Bayesian estimation of the ancestral state. We find that in all tissues and both sexes, the change in gene expression of a focal gene on average predicts the change in gene expression of neighbors. The effect is highly pronounced in the immediate vicinity (<100 kb) but extends much further. Sex-specific expression change is also genomically clustered. As genes increasing their expression in humans tend to avoid nuclear lamina domains and be enriched for the gene activator 5-hydroxymethylcytosine, we conclude that, most probably owing to chromatin level control of gene expression, a change in gene expression of one gene likely affects the expression evolution of neighbors, what we term expression piggybacking, an analog of hitchhiking. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  11. SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics

    PubMed Central

    2013-01-01

    Background Alternative splicing is an important and widespread mechanism for generating protein diversity and regulating protein expression. High-throughput identification and analysis of alternative splicing at the protein level has more advantages than at the mRNA level. The combination of an alternative splicing database and tandem mass spectrometry provides a powerful technique for identification, analysis and characterization of potential novel alternative splicing protein isoforms from proteomics. Therefore, based on the peptidomic database of human protein isoforms for proteomics experiments, our objective is to design a new alternative splicing database to 1) provide more coverage of genes, transcripts and alternative splicing, 2) exclusively focus on the alternative splicing, and 3) perform context-specific alternative splicing analysis. Results We used a three-step pipeline to create a synthetic alternative splicing database (SASD) to identify novel alternative splicing isoforms and interpret them in the context of pathway, disease, drug and organ specificity or custom gene set with maximum coverage and exclusive focus on alternative splicing. First, we extracted information on gene structures of all genes in the Ensembl Genes 71 database and incorporated the Integrated Pathway Analysis Database. Then, we compiled artificial splicing transcripts. Lastly, we translated the artificial transcripts into alternative splicing peptides. The SASD is a comprehensive database containing 56,630 genes (Ensembl gene IDs), 95,260 transcripts (Ensembl transcript IDs), and 11,919,779 Alternative Splicing peptides, and also covering about 1,956 pathways, 6,704 diseases, 5,615 drugs, and 52 organs. The database has a web-based user interface that allows users to search, display and download a single gene/transcript/protein, custom gene set, pathway, disease, drug, organ related alternative splicing. Moreover, the quality of the database was validated with comparison to other known databases and two case studies: 1) in liver cancer and 2) in breast cancer. Conclusions The SASD provides the scientific community with an efficient means to identify, analyze, and characterize novel Exon Skipping and Intron Retention protein isoforms from mass spectrometry and interpret them in the context of pathway, disease, drug and organ specificity or custom gene set with maximum coverage and exclusive focus on alternative splicing. PMID:24267658

  12. Individuality in harpsichord performance: disentangling performer- and piece-specific influences on interpretive choices

    PubMed Central

    Gingras, Bruno; Asselin, Pierre-Yves; McAdams, Stephen

    2013-01-01

    Although a growing body of research has examined issues related to individuality in music performance, few studies have attempted to quantify markers of individuality that transcend pieces and musical styles. This study aims to identify such meta-markers by discriminating between influences linked to specific pieces or interpretive goals and performer-specific playing styles, using two complementary statistical approaches: linear mixed models (LMMs) to estimate fixed (piece and interpretation) and random (performer) effects, and similarity analyses to compare expressive profiles on a note-by-note basis across pieces and expressive parameters. Twelve professional harpsichordists recorded three pieces representative of the Baroque harpsichord repertoire, including three interpretations of one of these pieces, each emphasizing a different melodic line, on an instrument equipped with a MIDI console. Four expressive parameters were analyzed: articulation, note onset asynchrony, timing, and velocity. LMMs showed that piece-specific influences were much larger for articulation than for other parameters, for which performer-specific effects were predominant, and that piece-specific influences were generally larger than effects associated with interpretive goals. Some performers consistently deviated from the mean values for articulation and velocity across pieces and interpretations, suggesting that global measures of expressivity may in some cases constitute valid markers of artistic individuality. Similarity analyses detected significant associations among the magnitudes of the correlations between the expressive profiles of different performers. These associations were found both when comparing across parameters and within the same piece or interpretation, or on the same parameter and across pieces or interpretations. These findings suggest the existence of expressive meta-strategies that can manifest themselves across pieces, interpretive goals, or expressive devices. PMID:24348446

  13. Down-weighting overlapping genes improves gene set analysis

    PubMed Central

    2012-01-01

    Background The identification of gene sets that are significantly impacted in a given condition based on microarray data is a crucial step in current life science research. Most gene set analysis methods treat genes equally, regardless of how specific they are to a given gene set. Results In this work we propose a new gene set analysis method that computes a gene set score as the mean of absolute values of weighted moderated gene t-scores. The gene weights are designed to emphasize the genes appearing in few gene sets, versus genes that appear in many gene sets. We demonstrate the usefulness of the method when analyzing gene sets that correspond to the KEGG pathways, and hence we called our method Pathway Analysis with Down-weighting of Overlapping Genes (PADOG). Unlike most gene set analysis methods which are validated through the analysis of 2-3 data sets followed by a human interpretation of the results, the validation employed here uses 24 different data sets and a completely objective assessment scheme that makes minimal assumptions and eliminates the need for possibly biased human assessments of the analysis results. Conclusions PADOG significantly improves gene set ranking and boosts sensitivity of analysis using information already available in the gene expression profiles and the collection of gene sets to be analyzed. The advantages of PADOG over other existing approaches are shown to be stable to changes in the database of gene sets to be analyzed. PADOG was implemented as an R package available at: http://bioinformaticsprb.med.wayne.edu/PADOG/ or http://www.bioconductor.org. PMID:22713124
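
    The scoring idea in the abstract above (a gene set score computed as the mean of absolute weighted gene t-scores, with larger weights for genes that appear in few gene sets) can be illustrated with a minimal Python sketch. This is not the PADOG R package. The function names, the toy data, and in particular the weight formula (scaling weights between 1 and 2 by gene-set frequency) are assumptions made for illustration; the abstract does not specify them, and the t-scores are taken as given rather than computed as moderated t-scores.

```python
from collections import Counter
from math import sqrt


def gene_weights(gene_sets):
    """Assign each gene a weight in [1, 2]; genes in fewer sets get larger weights.

    The formula is an illustrative assumption, not taken from the abstract.
    """
    # Count in how many gene sets each gene appears.
    freq = Counter(g for genes in gene_sets.values() for g in set(genes))
    f_max, f_min = max(freq.values()), min(freq.values())
    if f_max == f_min:
        # Every gene is equally frequent, so no gene is down-weighted.
        return {g: 1.0 for g in freq}
    return {g: 1.0 + sqrt((f_max - f) / (f_max - f_min)) for g, f in freq.items()}


def gene_set_score(genes, t_scores, weights):
    """Mean of absolute weighted t-scores over the genes that have a t-score."""
    vals = [abs(t_scores[g] * weights[g]) for g in genes if g in t_scores]
    return sum(vals) / len(vals) if vals else 0.0


# Hypothetical toy data: g2 belongs to all three sets, so it is down-weighted.
gene_sets = {"A": ["g1", "g2"], "B": ["g2", "g3"], "C": ["g2", "g4"]}
t_scores = {"g1": 2.0, "g2": -3.0, "g3": 0.5, "g4": -1.0}

weights = gene_weights(gene_sets)
scores = {name: gene_set_score(genes, t_scores, weights)
          for name, genes in gene_sets.items()}
print(scores)
```

    In this toy example the shared gene g2 has the largest t-score in absolute value, yet it contributes with the minimum weight to every set, so the ranking of the sets is driven more by the genes specific to each one.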

  14. Transcriptomic characterization of MRI contrast with focus on the T1-w/T2-w ratio in the cerebral cortex.

    PubMed

    Ritchie, Jacob; Pantazatos, Spiro P; French, Leon

    2018-07-01

    Magnetic resonance (MR) images of the brain are of immense clinical and research utility. At the atomic and subatomic levels, the sources of MR signals are well understood. However, we lack a comprehensive understanding of the macromolecular correlates of MR signal contrast. To address this gap, we used genome-wide measurements to correlate gene expression with MR signal intensity across the cerebral cortex in the Allen Human Brain Atlas (AHBA). We focused on the ratio of T1-weighted and T2-weighted intensities (T1-w/T2-w ratio image), which is considered to be a useful proxy for myelin content. As expected, we found enrichment of positive correlations between myelin-associated genes and the ratio image, supporting its use as a myelin marker. Genome-wide, there was an association with protein mass, with genes coding for heavier proteins expressed in regions with high T1-w/T2-w values. Oligodendrocyte gene markers were strongly correlated with the T1-w/T2-w ratio, but this was not driven by myelin-associated genes. Mitochondrial genes exhibit the strongest relationship, showing higher expression in regions with low T1-w/T2-w ratio. This may be due to the pH gradient in mitochondria as genes up-regulated by pH in the brain were also highly correlated with the ratio. While we corroborate associations with myelin and synaptic plasticity, differences in the T1-w/T2-w ratio across the cortex are more strongly linked to molecule size, oligodendrocyte markers, mitochondria, and pH. We evaluate correlations between AHBA transcriptomic measurements and a group averaged T1-w/T2-w ratio image, showing agreement with in-sample results. Expanding our analysis to the whole brain results in strong positive T1-w/T2-w correlations for immune system, inflammatory disease, and microglia marker genes. Genes with negative correlations were enriched for neuron markers and synaptic plasticity genes. Lastly, our findings are similar when performed on T1-w or inverted T2-w intensities alone. These results provide a molecular characterization of MR contrast that will aid interpretation of future MR studies of the brain. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
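
    The core step described in the abstract above, correlating each gene's expression with MR signal intensity across cortical regions, might be sketched as follows. This is a minimal illustration only: the abstract does not state which correlation statistic was used, so Pearson correlation is assumed here, and the function names and toy values are hypothetical rather than drawn from the Allen Human Brain Atlas.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def rank_genes_by_correlation(expression, ratio):
    """Return (gene, correlation) pairs sorted from most positive to most negative.

    expression: gene name -> expression values, one per region.
    ratio: T1-w/T2-w ratio values for the same regions, in the same order.
    """
    corr = {gene: pearson(values, ratio) for gene, values in expression.items()}
    return sorted(corr.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data for four cortical regions.
ratio = [1.0, 2.0, 3.0, 4.0]
expression = {
    "gene_up": [2.0, 4.0, 6.0, 8.0],    # rises with the ratio
    "gene_down": [8.0, 6.0, 4.0, 2.0],  # falls as the ratio rises
    "gene_flat": [5.0, 5.0, 5.0, 5.0],  # constant expression
}

for gene, r in rank_genes_by_correlation(expression, ratio):
    print(f"{gene}: {r:+.2f}")
```

    Genes at the top of such a ranking would correspond to the positively correlated groups reported in the abstract, and genes at the bottom to the negatively correlated ones; any enrichment analysis would be a separate step applied to the ranked list.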

  15. Application of community phylogenetic approaches to understand gene expression: differential exploration of venom gene space in predatory marine gastropods.

    PubMed

    Chang, Dan; Duda, Thomas F

    2014-06-05

    Predatory marine gastropods of the genus Conus exhibit substantial variation in venom composition both within and among species. Apart from mechanisms associated with extensive turnover of gene families and rapid evolution of genes that encode venom components ('conotoxins'), the evolution of distinct conotoxin expression patterns is an additional source of variation that may drive interspecific differences in the utilization of species' 'venom gene space'. To determine the evolution of expression patterns of venom genes of Conus species, we evaluated the expression of A-superfamily conotoxin genes of a set of closely related Conus species by comparing recovered transcripts of A-superfamily genes that were previously identified from the genomes of these species. We modified community phylogenetics approaches to incorporate phylogenetic history and disparity of genes and their expression profiles to determine patterns of venom gene space utilization. Less than half of the A-superfamily gene repertoire of these species is expressed, and only a few orthologous genes are coexpressed among species. Species exhibit substantially distinct expression strategies, with some expressing sets of closely related loci ('under-dispersed' expression of available genes) while others express sets of more disparate genes ('over-dispersed' expression). In addition, expressed genes show higher dN/dS values than either unexpressed or ancestral genes; this implies that expression exposes genes to selection and facilitates rapid evolution of these genes. Few recent lineage-specific gene duplicates are expressed simultaneously, suggesting that expression divergence among redundant gene copies may be established shortly after gene duplication. Our study demonstrates that venom gene space is explored differentially by Conus species, a process that effectively permits the independent and rapid evolution of venoms in these species.

  16. A Comparative Genomic Study in Schizophrenic and in Bipolar Disorder Patients, Based on Microarray Expression Profiling Meta-Analysis

    PubMed Central

    Logotheti, Marianthi; Papadodima, Olga; Venizelos, Nikolaos; Chatziioannou, Aristotelis; Kolisis, Fragiskos

    2013-01-01

    Schizophrenia affecting almost 1% and bipolar disorder affecting almost 3%–5% of the global population constitute two severe mental disorders. The catecholaminergic and the serotonergic pathways have been proved to play an important role in the development of schizophrenia, bipolar disorder, and other related psychiatric disorders. The aim of the study was to perform and interpret the results of a comparative genomic profiling study in schizophrenic patients as well as in healthy controls and in patients with bipolar disorder and try to relate and integrate our results with an aberrant amino acid transport through cell membranes. In particular we have focused on genes and mechanisms involved in amino acid transport through cell membranes from whole genome expression profiling data. We performed bioinformatic analysis on raw data derived from four different published studies. In two studies postmortem samples from prefrontal cortices, derived from patients with bipolar disorder, schizophrenia, and control subjects, have been used. In another study we used samples from postmortem orbitofrontal cortex of bipolar subjects while the final study was performed based on raw data from a gene expression profiling dataset in the postmortem superior temporal cortex of schizophrenics. The data were downloaded from NCBI's GEO datasets. PMID:23554570

  17. Characterization and Molecular Interpretation of the Photosynthetic Traits of Lonicera confusa in Karst Environment

    PubMed Central

    Gan, Lu; Fu, Chunhua; Zhang, Libin; Yu, Longjiang; Li, Maoteng

    2014-01-01

    Lonicera confusa is a medicinal plant that can adapt to the Ca-rich environment of the karst area of China. The photosynthesis, relative chlorophyll content, differentially expressed genes (DEGs), and differentially expressed proteins (DEPs) of L. confusa cultivated in calcareous and sandstone soils were investigated. The results showed that the relative chlorophyll content and net photosynthesis rate of L. confusa in calcareous soil were much higher than in plants grown in sandstone soil; the higher calcium content might help protect the chloroplast from harm and thereby support a higher photosynthesis rate. Transpiration and stomatal conductance were decreased in calcareous soil, which might result from stomatal closure. The GeneFishing and proteomic results showed that the expression of DEGs and DEPs critical for photosynthesis and stomatal closure, such as RuBisCO, photosynthetic electron transfer c, and malate dehydrogenase, varied in the leaves of L. confusa cultivated in different soils. These DEGs or DEPs were further found to be directly or indirectly regulated by calcium sensor proteins. This study adds, to some degree, to our knowledge of the molecular mechanism underlying the high net photosynthesis rate and lower transpiration of L. confusa cultivated in calcareous soil. PMID:24959829

  18. Comparative Proteomics Provides Insights into Metabolic Responses in Rat Liver to Isolated Soy and Meat Proteins.

    PubMed

    Song, Shangxin; Hooiveld, Guido J; Zhang, Wei; Li, Mengjie; Zhao, Fan; Zhu, Jing; Xu, Xinglian; Muller, Michael; Li, Chunbao; Zhou, Guanghong

    2016-04-01

    It has been reported that isolated dietary soy and meat proteins have distinct effects on physiology and liver gene expression, but the impact on protein expression responses is unknown. Because these may differ from gene expression responses, we investigated dietary protein-induced changes in liver proteome. Rats were fed for 1 week semisynthetic diets that differed only regarding protein source; casein (reference) was fully replaced by isolated soy, chicken, fish, or pork protein. Changes in liver proteome were measured by iTRAQ labeling and LC-ESI-MS/MS. A robust set totaling 1437 unique proteins was identified and subjected to differential protein analysis and biological interpretation. Compared with casein, all other protein sources reduced the abundance of proteins involved in fatty acid metabolism and Pparα signaling pathway. All dietary proteins, except chicken, increased oxidoreductive transformation reactions but reduced energy and essential amino acid metabolic pathways. Only soy protein increased the metabolism of sulfur-containing and nonessential amino acids. Soy and fish proteins increased translation and mRNA processing, whereas only chicken protein increased TCA cycle but reduced immune responses. These findings were partially in line with previously reported transcriptome results. This study further shows the distinct effects of soy and meat proteins on liver metabolism in rats.

  19. Catabolite and Oxygen Regulation of Enterohemorrhagic Escherichia coli Virulence.

    PubMed

    Carlson-Banning, Kimberly M; Sperandio, Vanessa

    2016-11-22

    The biogeography of the gut is diverse in its longitudinal axis, as well as within specific microenvironments. Differential oxygenation and nutrient composition drive the membership of microbial communities in these habitats. Moreover, enteric pathogens can orchestrate further modifications to gain a competitive advantage toward host colonization. These pathogens are versatile and adept when exploiting the human colon. They expertly navigate complex environmental cues and interkingdom signaling to colonize and infect their hosts. Here we demonstrate how enterohemorrhagic Escherichia coli (EHEC) uses three sugar-sensing transcription factors, Cra, KdpE, and FusR, to exquisitely regulate the expression of virulence factors associated with its type III secretion system (T3SS) when exposed to various oxygen concentrations. We also explored the effect of mucin-derived nonpreferred carbon sources on EHEC growth and expression of virulence genes. Taken together, the results show that EHEC represses the expression of its T3SS when oxygen is absent, mimicking the largely anaerobic lumen, and activates its T3SS when oxygen is available through Cra. In addition, when EHEC senses mucin-derived sugars heavily present in the O-linked and N-linked glycans of the large intestine, virulence gene expression is initiated. Sugars derived from pectin, a complex plant polysaccharide digested in the large intestine, also increased virulence gene expression. Not only does EHEC sense host- and microbiota-derived interkingdom signals, it also uses oxygen availability and mucin-derived sugars liberated by the microbiota to stimulate expression of the T3SS. This precision in gene regulation allows EHEC to be an efficient pathogen with an extremely low infectious dose. Enteric pathogens have to be crafty when interpreting multiple environmental cues to successfully establish themselves within complex and diverse gut microenvironments. Differences in oxygen tension and nutrient composition determine the biogeography of the gut microbiota and provide unique niches that can be exploited by enteric pathogens. EHEC is an enteric pathogen that colonizes the colon and causes outbreaks of bloody diarrhea and hemolytic-uremic syndrome worldwide. It has a very low infectious dose, which requires it to be an extremely effective pathogen. Hence, here we show that EHEC senses multiple sugar sources and oxygen levels to optimally control the expression of its virulence repertoire. This exquisite regulatory control equips EHEC to sense different intestinal compartments to colonize the host. Copyright © 2016 Carlson-Banning and Sperandio.

  20. NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis.

    PubMed

    Sun, Duanchen; Liu, Yinliang; Zhang, Xiang-Sun; Wu, Ling-Yun

    2017-09-21

    High-throughput experimental techniques have been dramatically improved and widely applied in the past decades. However, biological interpretation of the high-throughput experimental results, such as differential expression gene sets derived from microarray or RNA-seq experiments, is still a challenging task. Gene Ontology (GO) is commonly used in functional enrichment studies. The GO terms identified via current functional enrichment analysis tools often contain direct parent or descendant terms in the GO hierarchical structure. Highly redundant terms make it difficult for users to analyze the underlying biological processes. In this paper, a novel network-based probabilistic generative model, NetGen, was proposed to perform the functional enrichment analysis. An additional protein-protein interaction (PPI) network was explicitly used to assist the identification of significantly enriched GO terms. NetGen achieved superior performance to the existing methods in the simulation studies. The effectiveness of NetGen was explored further on four real datasets. Notably, several GO terms which were not directly linked with the active gene list for each disease were identified. These terms were closely related to the corresponding diseases when checked against the curated literature. NetGen has been implemented in the R package CopTea publicly available at GitHub (http://github.com/wulingyun/CopTea/). Our procedure leads to a more reasonable and interpretable result of the functional enrichment analysis. As a novel term combination-based functional enrichment analysis method, NetGen is complementary to current individual term-based methods, and can help to explore the underlying pathogenesis of complex diseases.
