Sample records for predict gene expression

  1. Prediction of gene expression with cis-SNPs using mixed models and regularization methods.

    PubMed

    Zeng, Ping; Zhou, Xiang; Huang, Shuiping

    2017-05-11

    It has been shown that gene expression in human tissues is heritable, thus predicting gene expression using only SNPs becomes possible. The prediction of gene expression can have important implications for the genetic architecture of individual functionally associated SNPs and further interpretations of the molecular basis underlying human diseases. We compared three types of methods for predicting gene expression using only cis-SNPs, including the polygenic model, i.e. linear mixed model (LMM), two sparse models, i.e. Lasso and elastic net (ENET), and the hybrid of LMM and sparse model, i.e. Bayesian sparse linear mixed model (BSLMM). The three kinds of prediction methods have very different assumptions of underlying genetic architectures. These methods were evaluated using simulations under various scenarios, and were applied to the Geuvadis gene expression data. The simulations showed that these four prediction methods (i.e. Lasso, ENET, LMM and BSLMM) behaved best when their respective modeling assumptions were satisfied, but BSLMM had a robust performance across a range of scenarios. According to R² of these models in the Geuvadis data, the four methods performed quite similarly. We did not observe any clustering or enrichment of predictive genes (defined as genes with R² ≥ 0.05) across the chromosomes, and also did not see any clear relationship between the proportion of the predictive genes and the proportion of genes in each chromosome. However, an interesting finding in the Geuvadis data was that highly predictive genes (e.g. R² ≥ 0.30) may have sparse genetic architectures since Lasso, ENET and BSLMM outperformed LMM for these genes; and this observation was validated in another gene expression data set. We further showed that the predictive genes were enriched in approximately independent LD blocks.
Gene expression can be predicted with only cis-SNPs using well-developed prediction models and these predictive genes were enriched in some approximately independent LD blocks. The prediction of gene expression can shed some light on the functional interpretation for identified SNPs in GWASs.
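
    The comparison of sparse and polygenic predictors described in this record can be illustrated with a minimal sketch. It is not the authors' pipeline: the genotypes and expression values are simulated, the effect sizes, penalty `lam` and mixing weight `alpha` are hypothetical, and a plain ridge penalty stands in for the polygenic LMM (a simplifying assumption, since no variance components are estimated here). R² is taken as the squared Pearson correlation between predicted and observed expression in held-out samples.

```python
import random

def soft_threshold(rho, lam):
    """Soft-thresholding operator used by the L1 penalty."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def fit_elastic_net(X, y, lam, alpha, n_iter=50):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (alpha*|b|_1 + (1-alpha)/2 * |b|_2^2).
    alpha=1 gives the Lasso, alpha=0 gives ridge regression."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    cols = [[row[j] - x_mean[j] for row in X] for j in range(p)]  # centered SNPs
    resid = [yi - y_mean for yi in y]                             # residual at b = 0
    z = [sum(v * v for v in col) / n for col in cols]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if z[j] == 0.0:
                continue  # a monomorphic SNP carries no information
            col = cols[j]
            rho = sum(c * r for c, r in zip(col, resid)) / n + z[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (z[j] + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[j] = new
    intercept = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return intercept, beta

def predict(intercept, beta, X):
    return [intercept + sum(b * x for b, x in zip(beta, row)) for row in X]

def r_squared(pred, obs):
    """Squared Pearson correlation between predicted and observed values."""
    n = len(obs)
    mp, mo = sum(pred) / n, sum(obs) / n
    cov = sum((a - mp) * (b - mo) for a, b in zip(pred, obs))
    vp = sum((a - mp) ** 2 for a in pred)
    vo = sum((b - mo) ** 2 for b in obs)
    return cov * cov / (vp * vo) if vp > 0 and vo > 0 else 0.0

def simulate(n, p, rng):
    """Hypothetical data: 0/1/2 genotypes (allele frequency 0.3), two causal cis-SNPs."""
    X = [[sum(rng.random() < 0.3 for _ in range(2)) for _ in range(p)] for _ in range(n)]
    y = [1.0 * row[0] - 0.8 * row[5] + rng.gauss(0.0, 0.5) for row in X]
    return X, y

rng = random.Random(0)
X_train, y_train = simulate(200, 30, rng)
X_test, y_test = simulate(100, 30, rng)

results = {}
for name, alpha in [("lasso", 1.0), ("enet", 0.5), ("ridge", 0.0)]:
    b0, beta = fit_elastic_net(X_train, y_train, lam=0.05, alpha=alpha)
    results[name] = r_squared(predict(b0, beta, X_test), y_test)
    print(name, "test R2 =", round(results[name], 3),
          "nonzero SNPs =", sum(b != 0.0 for b in beta))
```

    Under this simulated sparse architecture the L1-penalized fits typically keep only a few SNPs while the ridge fit keeps all of them, which mirrors the modeling assumptions contrasted in the abstract without reproducing any of its reported results.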

  2. Gene Expression Profiling Predicts the Development of Oral Cancer

    PubMed Central

    Saintigny, Pierre; Zhang, Li; Fan, You-Hong; El-Naggar, Adel K.; Papadimitrakopoulou, Vali; Feng, Lei; Lee, J. Jack; Kim, Edward S.; Hong, Waun Ki; Mao, Li

    2011-01-01

    Patients with oral preneoplastic lesions (OPL) have a high risk of developing oral cancer. Although certain risk factors such as smoking status and histology are known, our ability to predict oral cancer risk remains poor. The study objective was to determine the value of gene expression profiling in predicting oral cancer development. Gene expression profiles were measured in 86 of 162 OPL patients who were enrolled in a clinical chemoprevention trial that used the incidence of oral cancer development as a prespecified endpoint. The median follow-up time was 6.08 years and 35 of the 86 patients developed oral cancer over the course. Gene expression profiles were associated with oral cancer-free survival and used to develop multivariate predictive models for oral cancer prediction. We developed a 29-transcript predictive model which showed marked improvement in terms of prediction accuracy (with an 8% prediction error rate) over the models using previously known clinico-pathological risk factors. Based on the gene expression profile data, we also identified 2182 transcripts significantly associated with oral cancer risk associated genes (P-value<0.01, single variate Cox proportional hazards model). Functional pathway analysis revealed proteasome machinery, MYC, and ribosome components as the top gene sets associated with oral cancer risk. In multiple independent datasets, the expression profiles of the genes can differentiate head and neck cancer from normal mucosa. Our results show that gene expression profiles may improve the prediction of oral cancer risk in OPL patients and the significant genes identified may serve as potential targets for oral cancer chemoprevention. PMID:21292635

  3. A deep auto-encoder model for gene expression prediction.

    PubMed

    Xie, Rui; Wen, Jia; Quitadamo, Andrew; Cheng, Jianlin; Shi, Xinghua

    2017-11-17

    Gene expression is a key intermediate level through which genotypes lead to a particular trait. Gene expression is affected by various factors including genotypes of genetic variants. With an aim of delineating the genetic impact on gene expression, we build a deep auto-encoder model to assess how well genetic variants will contribute to gene expression changes. This new deep learning model is a regression-based predictive model based on the MultiLayer Perceptron and Stacked Denoising Auto-encoder (MLP-SAE). The model is trained using a stacked denoising auto-encoder for feature selection and a multilayer perceptron framework for backpropagation. We further improve the model by introducing dropout to prevent overfitting and improve performance. To demonstrate the usage of this model, we apply MLP-SAE to a real genomic dataset with genotypes and gene expression profiles measured in yeast. Our results show that the MLP-SAE model with dropout outperforms other models including Lasso, Random Forests and the MLP-SAE model without dropout. Using the MLP-SAE model with dropout, we show that gene expression quantifications predicted by the model solely based on genotypes align well with true gene expression patterns. We provide a deep auto-encoder model for predicting gene expression from SNP genotypes. This study demonstrates that deep learning is appropriate for tackling another genomic problem, i.e., building predictive models to understand genotypes' contribution to gene expression. With the emerging availability of richer genomic data, we anticipate that deep learning models will play a bigger role in modeling and interpreting genomics.

  4. Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks

    PubMed Central

    Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E.; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A.; Kellis, Manolis

    2012-01-01

    Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein–protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level. PMID:22456606

  5. Cohort-specific imputation of gene expression improves prediction of warfarin dose for African Americans.

    PubMed

    Gottlieb, Assaf; Daneshjou, Roxana; DeGorter, Marianne; Bourgeois, Stephane; Svensson, Peter J; Wadelius, Mia; Deloukas, Panos; Montgomery, Stephen B; Altman, Russ B

    2017-11-24

    Genome-wide association studies are useful for discovering genotype-phenotype associations but are limited because they require large cohorts to identify a signal, which can be population-specific. Mapping genetic variation to genes improves power and allows the effects of both protein-coding variation as well as variation in expression to be combined into "gene level" effects. Previous work has shown that warfarin dose can be predicted using information from genetic variation that affects protein-coding regions. Here, we introduce a method that improves dose prediction by integrating tissue-specific gene expression. In particular, we use drug pathways and expression quantitative trait loci knowledge to impute gene expression, on the assumption that differential expression of key pathway genes may impact dose requirement. We focus on 116 genes from the pharmacokinetic and pharmacodynamic pathways of warfarin within training and validation sets comprising both European and African-descent individuals. We build gene-tissue signatures associated with warfarin dose in a cohort-specific manner and identify a signature of 11 gene-tissue pairs that significantly augments the International Warfarin Pharmacogenetics Consortium dosage-prediction algorithm in both populations. Our results demonstrate that imputed expression can improve dose prediction and bridge population-specific compositions. MATLAB code is available at https://github.com/assafgo/warfarin-cohort.

  6. Hi-C Chromatin Interaction Networks Predict Co-expression in the Mouse Cortex

    PubMed Central

    Hulsman, Marc; Lelieveldt, Boudewijn P. F.; de Ridder, Jeroen; Reinders, Marcel

    2015-01-01

    The three dimensional conformation of the genome in the cell nucleus influences important biological processes such as gene expression regulation. Recent studies have shown a strong correlation between chromatin interactions and gene co-expression. However, predicting gene co-expression from frequent long-range chromatin interactions remains challenging. We address this by characterizing the topology of the cortical chromatin interaction network using scale-aware topological measures. We demonstrate that based on these characterizations it is possible to accurately predict spatial co-expression between genes in the mouse cortex. Consistent with previous findings, we find that the chromatin interaction profile of a gene-pair is a good predictor of their spatial co-expression. However, the accuracy of the prediction can be substantially improved when chromatin interactions are described using scale-aware topological measures of the multi-resolution chromatin interaction network. We conclude that, for co-expression prediction, it is necessary to take into account different levels of chromatin interactions ranging from direct interaction between genes (i.e. small-scale) to chromatin compartment interactions (i.e. large-scale). PMID:25965262

  7. Changes in Gene Expression Predicting Local Control in Cervical Cancer: Results from Radiation Therapy Oncology Group 0128

    PubMed Central

    Weidhaas, Joanne B.; Li, Shu-Xia; Winter, Kathryn; Ryu, Janice; Jhingran, Anuja; Miller, Bridgette; Dicker, Adam P.; Gaffney, David

    2009-01-01

    Purpose To evaluate the potential of gene expression signatures to predict response to treatment in locally advanced cervical cancer treated with definitive chemotherapy and radiation. Experimental Design Tissue biopsies were collected from patients participating in Radiation Therapy Oncology Group (RTOG) 0128, a phase II trial evaluating the benefit of celecoxib in addition to cisplatin chemotherapy and radiation for locally advanced cervical cancer. Gene expression profiling was done and signatures of pretreatment, mid-treatment (before the first implant), and “changed” gene expression patterns between pre- and mid-treatment samples were determined. The ability of the gene signatures to predict local control versus local failure was evaluated. Two-group t test was done to identify the initial gene set separating these end points. Supervised classification methods were used to enrich the gene sets. The results were further validated by leave-one-out and 2-fold cross-validation. Results Twenty-two patients had suitable material from pretreatment samples for analysis, and 13 paired pre- and mid-treatment samples were obtained. The changed gene expression signatures between the pre- and mid-treatment biopsies predicted response to treatment, separating patients with local failures from those who achieved local control with a seven-gene signature. The in-sample prediction rate, leave-one-out prediction rate, and 2-fold prediction rate are 100% for this seven-gene signature. This signature was enriched for cell cycle genes. Conclusions Changed gene expression signatures during therapy in cervical cancer can predict outcome as measured by local control. After further validation, such findings could be applied to direct additional therapy for cervical cancer patients treated with chemotherapy and radiation. PMID:19509178
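
    The leave-one-out validation mentioned in this record can be sketched as follows. This is a hedged illustration only: the nearest-centroid classifier is a stand-in chosen for brevity (the abstract does not name the supervised method used), and the 13 samples by 7 genes of expression values are simulated, not trial data.

```python
import random

def centroid(rows):
    """Per-gene mean expression across a group of samples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_centroid_predict(train_X, train_y, x):
    """Assign x to the class whose training centroid is closest."""
    labels = sorted(set(train_y))
    cents = {lab: centroid([r for r, l in zip(train_X, train_y) if l == lab])
             for lab in labels}
    return min(labels, key=lambda lab: sq_dist(cents[lab], x))

def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the held-out call."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)

# Hypothetical expression changes for a seven-gene signature in 13 samples.
rng = random.Random(1)
labels = ["control"] * 7 + ["failure"] * 6
X = [[rng.gauss(3.0 if lab == "failure" else 0.0, 1.0) for _ in range(7)]
     for lab in labels]

accuracy = leave_one_out_accuracy(X, labels)
print("leave-one-out accuracy:", round(accuracy, 3))
```

    One design point worth noting: if genes are selected with a t test on all samples before cross-validation, the held-out sample has already influenced the signature and the estimated prediction rate tends to be optimistic. Repeating the gene selection inside each fold avoids that bias.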

  8. Creating and validating cis-regulatory maps of tissue-specific gene expression regulation

    PubMed Central

    O'Connor, Timothy R.; Bailey, Timothy L.

    2014-01-01

    Predicting which genomic regions control the transcription of a given gene is a challenge. We present a novel computational approach for creating and validating maps that associate genomic regions (cis-regulatory modules–CRMs) with genes. The method infers regulatory relationships that explain gene expression observed in a test tissue using widely available genomic data for ‘other’ tissues. To predict the regulatory targets of a CRM, we use cross-tissue correlation between histone modifications present at the CRM and expression at genes within 1 Mbp of it. To validate cis-regulatory maps, we show that they yield more accurate models of gene expression than carefully constructed control maps. These gene expression models predict observed gene expression from transcription factor binding in the CRMs linked to that gene. We show that our maps are able to identify long-range regulatory interactions and improve substantially over maps linking genes and CRMs based on either the control maps or a ‘nearest neighbor’ heuristic. Our results also show that it is essential to include CRMs predicted in multiple tissues during map-building, that H3K27ac is the most informative histone modification, and that CAGE is the most informative measure of gene expression for creating cis-regulatory maps. PMID:25200088
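
    The linking step described in this record, correlating a histone-modification signal at a CRM with the expression of genes within 1 Mbp across tissues, can be sketched as below. This is an illustration under stated assumptions: Pearson correlation is assumed as the correlation measure, and the function name `link_crm`, the gene names, positions and per-tissue values are all hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def link_crm(crm_pos, crm_signal, genes, window=1_000_000):
    """Rank genes whose TSS lies within `window` bp of the CRM by cross-tissue
    correlation between the CRM signal and gene expression.
    genes maps name -> (tss_position, expression_per_tissue)."""
    scores = {
        name: pearson(crm_signal, expr)
        for name, (tss, expr) in genes.items()
        if abs(tss - crm_pos) <= window
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical H3K27ac signal at one CRM across five tissues.
crm_signal = [1.0, 2.0, 3.0, 4.0, 5.0]
genes = {
    "geneA": (1_500_000, [2.0, 4.0, 6.0, 8.0, 10.5]),  # 500 kb away, co-varies with the CRM
    "geneB": (1_010_000, [5.0, 1.0, 4.0, 2.0, 3.0]),   # nearest gene, weakly anti-correlated
    "geneC": (3_200_000, [1.0, 2.0, 3.0, 4.0, 5.0]),   # correlated but outside the window
}

ranked = link_crm(1_000_000, crm_signal, genes)
print(ranked)
```

    In this toy example the best-scoring target is not the nearest gene, which is the kind of long-range link that a 'nearest neighbor' heuristic would miss.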

  9. General statistics of stochastic process of gene expression in eukaryotic cells.

    PubMed Central

    Kuznetsov, V A; Knott, G D; Bonner, R F

    2002-01-01

    Thousands of genes are expressed at such very low levels (≤1 copy per cell) that global gene expression analysis of rarer transcripts remains problematic. Ambiguity in identification of rarer transcripts creates considerable uncertainty in fundamental questions such as the total number of genes expressed in an organism and the biological significance of rarer transcripts. Knowing the distribution of the true number of genes expressed at each level and the corresponding gene expression level probability function (GELPF) could help resolve these uncertainties. We found that all observed large-scale gene expression data sets in yeast, mouse, and human cells follow a Pareto-like distribution model skewed by many low-abundance transcripts. A novel stochastic model of the gene expression process predicts the universality of the GELPF both across different cell types within a multicellular organism and across different organisms. This model allows us to predict the frequency distribution of all gene expression levels within a single cell and to estimate the number of expressed genes in a single cell and in a population of cells. A random "basal" transcription mechanism for protein-coding genes in all or almost all eukaryotic cell types is predicted. This fundamental mechanism might enhance the expression of rarely expressed genes and, thus, provide a basic level of phenotypic diversity, adaptability, and random monoallelic expression in cell populations. PMID:12136033

  10. Applicability of a gene expression based prediction method to SD and Wistar rats: an example of CARCINOscreen®.

    PubMed

    Matsumoto, Hiroshi; Saito, Fumiyo; Takeyoshi, Masahiro

    2015-12-01

    Recently, the development of several gene expression-based prediction methods has been attempted in the field of toxicology. CARCINOscreen® is a gene expression-based screening method to predict carcinogenicity of chemicals which target the liver with high accuracy. In this study, we investigated the applicability of the gene expression-based screening method to SD and Wistar rats by using CARCINOscreen®, originally developed with F344 rats, with two carcinogens, 2,4-diaminotoluene and thioacetamide, and two non-carcinogens, 2,6-diaminotoluene and sodium benzoate. After the 28-day repeated dose test was conducted with each chemical in SD and Wistar rats, microarray analysis was performed using total RNA extracted from each liver. Obtained gene expression data were applied to CARCINOscreen®. Predictive scores obtained by the CARCINOscreen® for known carcinogens were > 2 in all strains of rats, while non-carcinogens gave prediction scores below 0.5. These results suggested that the gene expression-based screening method, CARCINOscreen®, can be applied to SD and Wistar rats, widely used strains in toxicological studies, by setting an appropriate boundary line of prediction score to classify the chemicals into carcinogens and non-carcinogens.

  11. Genome-wide prediction and analysis of human tissue-selective genes using microarray expression data

    PubMed Central

    2013-01-01

    Background Understanding how genes are expressed specifically in particular tissues is a fundamental question in developmental biology. Many tissue-specific genes are involved in the pathogenesis of complex human diseases. However, experimental identification of tissue-specific genes is time consuming and difficult. The accurate predictions of tissue-specific gene targets could provide useful information for biomarker development and drug target identification. Results In this study, we have developed a machine learning approach for predicting the human tissue-specific genes using microarray expression data. The lists of known tissue-specific genes for different tissues were collected from UniProt database, and the expression data retrieved from the previously compiled dataset according to the lists were used for input vector encoding. Random Forests (RFs) and Support Vector Machines (SVMs) were used to construct accurate classifiers. The RF classifiers were found to outperform SVM models for tissue-specific gene prediction. The results suggest that the candidate genes for brain or liver specific expression can provide valuable information for further experimental studies. Our approach was also applied for identifying tissue-selective gene targets for different types of tissues. Conclusions A machine learning approach has been developed for accurately identifying the candidate genes for tissue specific/selective expression. The approach provides an efficient way to select some interesting genes for developing new biomedical markers and improve our knowledge of tissue-specific expression. PMID:23369200

  12. Adipose Gene Expression Prior to Weight Loss Can Differentiate and Weakly Predict Dietary Responders

    PubMed Central

    Mutch, David M.; Temanni, M. Ramzi; Henegar, Corneliu; Combes, Florence; Pelloux, Véronique; Holst, Claus; Sørensen, Thorkild I. A.; Astrup, Arne; Martinez, J. Alfredo; Saris, Wim H. M.; Viguerie, Nathalie; Langin, Dominique; Zucker, Jean-Daniel; Clément, Karine

    2007-01-01

    Background The ability to identify obese individuals who will successfully lose weight in response to dietary intervention will revolutionize disease management. Therefore, we asked whether it is possible to identify subjects who will lose weight during dietary intervention using only a single gene expression snapshot. Methodology/Principal Findings The present study involved 54 female subjects from the Nutrient-Gene Interactions in Human Obesity-Implications for Dietary Guidelines (NUGENOB) trial to determine whether subcutaneous adipose tissue gene expression could be used to predict weight loss prior to the 10-week consumption of a low-fat hypocaloric diet. Using several statistical tests revealed that the gene expression profiles of responders (8–12 kgs weight loss) could always be differentiated from non-responders (<4 kgs weight loss). We also assessed whether this differentiation was sufficient for prediction. Using a bottom-up (i.e. black-box) approach, standard class prediction algorithms were able to predict dietary responders with up to 61.1%±8.1% accuracy. Using a top-down approach (i.e. using differentially expressed genes to build a classifier) improved prediction accuracy to 80.9%±2.2%. Conclusion Adipose gene expression profiling prior to the consumption of a low-fat diet is able to differentiate responders from non-responders as well as serve as a weak predictor of subjects destined to lose weight. While the degree of prediction accuracy currently achieved with a gene expression snapshot is perhaps insufficient for clinical use, this work reveals that the comprehensive molecular signature of adipose tissue paves the way for the future of personalized nutrition. PMID:18094752

  13. Gene expression analysis predicts insect venom anaphylaxis in indolent systemic mastocytosis.

    PubMed

    Niedoszytko, M; Bruinenberg, M; van Doormaal, J J; de Monchy, J G R; Nedoszytko, B; Koppelman, G H; Nawijn, M C; Wijmenga, C; Jassem, E; Elberink, J N G Oude

    2011-05-01

    Anaphylaxis to insect venom (Hymenoptera) is most severe in patients with mastocytosis and may even lead to death. However, not all patients with mastocytosis suffer from anaphylaxis. The aim of the study was to analyze differences in gene expression between patients with indolent systemic mastocytosis (ISM) and a history of insect venom anaphylaxis (IVA) compared to those patients without a history of anaphylaxis, and to determine the predictive use of gene expression profiling. Whole-genome gene expression analysis was performed in peripheral blood cells. Twenty-two adults with ISM were included: 12 with a history of IVA and 10 without a history of anaphylaxis of any kind. Significant differences in single gene expression corrected for multiple testing were found for 104 transcripts (P < 0.05). Gene ontology analysis revealed that the differentially expressed genes were involved in pathways responsible for the development of cancer and focal and cell adhesion, suggesting that the expression of genes related to the differentiation state of cells is higher in patients with a history of anaphylaxis. Based on the gene expression profiles, a naïve Bayes prediction model was built identifying patients with IVA. In ISM, gene expression profiles are different between patients with a history of IVA and those without. These findings might reflect a more pronounced mast cell dysfunction in patients without a history of anaphylaxis. Gene expression profiling might be a useful tool to predict the risk of anaphylaxis to insect venom in patients with ISM. Prospective studies are needed to substantiate any conclusions. © 2010 John Wiley & Sons A/S.

  14. Predicting Gene Expression Level from Relative Codon Usage Bias: An Application to Escherichia coli Genome

    PubMed Central

    Roymondal, Uttam; Das, Shibsankar; Sahoo, Satyabrata

    2009-01-01

    We present an expression measure of a gene, devised to predict the level of gene expression from relative codon bias (RCB). There are a number of measures currently in use that quantify codon usage in genes. Based on the hypothesis that gene expressivity and codon composition are strongly correlated, RCB has been defined to provide an intuitively meaningful measure of the extent of codon preference in a gene. We outline a simple approach to assess the strength of RCB (RCBS) in genes as a guide to their likely expression levels and illustrate this with an analysis of the Escherichia coli (E. coli) genome. Our efforts to quantitatively predict gene expression levels in E. coli met with a high level of success. Surprisingly, we observe a strong correlation between RCBS and protein length, indicating natural selection in favour of shorter genes being expressed at a higher level. The agreement of our result with high protein abundances, microarray data and radioactive data demonstrates that the genomic expression profile available in our method can be applied in a meaningful way to the study of cell physiology and also for more detailed studies of particular genes of interest. PMID:19131380
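
    The abstract gives no formula for RCBS, so the sketch below rests on an assumed formulation: for each codon xyz in a gene, the bias is d = (f(xyz) - f1(x) f2(y) f3(z)) / (f1(x) f2(y) f3(z)), where f is the codon's frequency in the gene and f1, f2, f3 are base frequencies at the three codon positions, and the score is the geometric mean of (1 + d) over the gene's L codons, minus 1. We believe this matches the published definition, but it should be checked against the paper before use. The function name `rcbs` and the toy sequences are hypothetical.

```python
from collections import Counter
from math import exp, log

def rcbs(seq):
    """Assumed RCBS: geometric mean over codons of f(xyz) / (f1(x) f2(y) f3(z)), minus 1.
    All frequencies are computed within the gene itself."""
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    n_codons = len(codons)
    codon_count = Counter(codons)
    pos_count = [Counter(c[k] for c in codons) for k in range(3)]
    log_sum = 0.0
    for c in codons:
        observed = codon_count[c] / n_codons
        expected = 1.0
        for k in range(3):
            expected *= pos_count[k][c[k]] / n_codons
        log_sum += log(observed / expected)   # log(1 + d), since 1 + d = observed/expected
    return exp(log_sum / n_codons) - 1.0

# Toy sequences, not real genes.
print(rcbs("ATGATGATG"))   # a single repeated codon: observed equals expected
print(rcbs("AAATTT"))      # two codons that never share a base at any position
```

    Working in log space avoids overflow or underflow when the product runs over the hundreds of codons in a real gene.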

  15. Gene expression signature in urine for diagnosing and assessing aggressiveness of bladder urothelial carcinoma.

    PubMed

    Mengual, Lourdes; Burset, Moisès; Ribal, María José; Ars, Elisabet; Marín-Aguilera, Mercedes; Fernández, Manuel; Ingelmo-Torres, Mercedes; Villavicencio, Humberto; Alcaraz, Antonio

    2010-05-01

    To develop an accurate and noninvasive method for bladder cancer diagnosis and prediction of disease aggressiveness based on the gene expression patterns of urine samples. Gene expression patterns of 341 urine samples from bladder urothelial cell carcinoma (UCC) patients and 235 controls were analyzed via TaqMan Arrays. In a first phase of the study, three consecutive gene selection steps were done to identify a gene set expression signature to detect and stratify UCC in urine. Subsequently, those genes more informative for UCC diagnosis and prediction of tumor aggressiveness were combined to obtain a classification system of bladder cancer samples. In a second phase, the obtained gene set signature was evaluated in a routine clinical scenario analyzing only voided urine samples. We have identified a 12+2 gene expression signature for UCC diagnosis and prediction of tumor aggressiveness on urine samples. Overall, this gene set panel had 98% sensitivity (SN) and 99% specificity (SP) in discriminating between UCC and control samples and 79% SN and 92% SP in predicting tumor aggressiveness. The translation of the model to the clinically applicable format corroborates that the 12+2 gene set panel described maintains a high accuracy for UCC diagnosis (SN = 89% and SP = 95%) and tumor aggressiveness prediction (SN = 79% and SP = 91%) in voided urine samples. The 12+2 gene expression signature described in urine is able to identify patients suffering from UCC and predict tumor aggressiveness. We show that a panel of molecular markers may improve the schedule for diagnosis and follow-up in UCC patients. Copyright 2010 AACR.
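
    For reference, the sensitivity (SN) and specificity (SP) figures quoted in this record are ratios taken from a confusion matrix. A minimal sketch of the calculation follows; the labels are hypothetical and do not reproduce the study's counts.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = disease, 0 = control."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical calls: 10 patients (one missed) and 10 controls (two false alarms).
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 9 + [0] + [0] * 8 + [1] * 2

sn, sp = sensitivity_specificity(y_true, y_pred)
print(f"SN = {sn:.2f}, SP = {sp:.2f}")  # SN = 0.90, SP = 0.80
```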

  16. Building predictive gene signatures through simultaneous assessment of transcription factor activation and gene expression.

    EPA Science Inventory

    Exposure to many drugs and environmentally-relevant chemicals can cause adverse outcomes. These adverse outcomes, such as cancer, have been linked to mol...

  17. Building gene expression signatures indicative of transcription factor activation to predict AOP modulation

    EPA Science Inventory

    Adverse outcome pathways (AOPs) are a framework for predicting quantitative relationships between molecular initiatin...

  18. Predicting features of breast cancer with gene expression patterns.

    PubMed

    Lu, Xuesong; Lu, Xin; Wang, Zhigang C; Iglehart, J Dirk; Zhang, Xuegong; Richardson, Andrea L

    2008-03-01

    Data from gene expression arrays hold an enormous amount of biological information. We sought to determine if global gene expression in primary breast cancers contained information about biologic, histologic, and anatomic features of the disease in individual patients. Microarray data from the tumors of 129 patients were analyzed for the ability to predict biomarkers [estrogen receptor (ER) and HER2], histologic features [grade and lymphatic-vascular invasion (LVI)], and stage parameters (tumor size and lymph node metastasis). Multiple statistical predictors were used and the prediction accuracy was determined by cross-validation error rate; multidimensional scaling (MDS) allowed visualization of the predicted states under study. Models built from gene expression data accurately predict ER and HER2 status, and divide tumor grade into high-grade and low-grade clusters; intermediate-grade tumors are not a unique group. In contrast, gene expression data is inaccurate at predicting tumor size, lymph node status or LVI. The best model for prediction of nodal status included tumor size, LVI status and pathologically defined tumor subtype (based on combinations of ER, HER2, and grade); the addition of microarray-based prediction to this model failed to improve the prediction accuracy. Global gene expression supports a binary division of ER, HER2, and grade, clearly separating tumors into two categories; intermediate values for these bio-indicators do not define intermediate tumor subsets. Results are consistent with a model of regional metastasis that depends on inherent biologic differences in metastatic propensity between breast cancer subtypes, upon which time and chance then operate.

  19. Predictable transcriptome evolution in the convergent and complex bioluminescent organs of squid

    PubMed Central

    Pankey, M. Sabrina; Minin, Vladimir N.; Imholte, Greg C.; Suchard, Marc A.; Oakley, Todd H.

    2014-01-01

    Despite contingency in life’s history, the similarity of evolutionarily convergent traits may represent predictable solutions to common conditions. However, the extent to which overall gene expression levels (transcriptomes) underlying convergent traits are themselves convergent remains largely unexplored. Here, we show strong statistical support for convergent evolutionary origins and massively parallel evolution of the entire transcriptomes in symbiotic bioluminescent organs (bacterial photophores) from two divergent squid species. The gene expression similarities are so strong that regression models of one species’ photophore can predict organ identity of a distantly related photophore from gene expression levels alone. Our results point to widespread parallel changes in gene expression evolution associated with convergent origins of complex organs. Therefore, predictable solutions may drive not only the evolution of novel, complex organs but also the evolution of overall gene expression levels that underlie them. PMID:25336755

  20. Center of Excellence for Individuation of Therapy for Breast Cancer

    DTIC Science & Technology

    2012-03-01

    Sledge, B. Leyland-Jones (2011) Gene copy number and expression of TYMP and TYMS are predictive of outcome in breast cancer patients treated with... Gene copy number and expression of TYMP and TYMS are predictive of outcome in breast cancer patients treated with capecitabine. R. Audet, C... determine if a specific gene expression signature could be used as predictive marker for treatment outcome. Results summary for Cohort A: doxorubicin

  1. Microarray-based cancer prediction using soft computing approach.

    PubMed

    Wang, Xiaosheng; Gotoh, Osamu

    2009-05-26

    One of the difficulties in using gene expression profiles to predict cancer is how to effectively select a few informative genes to construct accurate prediction models from thousands or tens of thousands of genes. We screen highly discriminative genes and gene pairs to create simple prediction models involving single genes or gene pairs on the basis of a soft computing approach and rough set theory. Accurate cancer prediction is obtained when we apply the simple prediction models to four cancer gene expression datasets: CNS tumor, colon tumor, lung cancer and DLBCL. Some genes closely correlated with the pathogenesis of specific or general cancers are identified. In contrast with other models, our models are simple, effective and robust. Moreover, our models are interpretable because they are based on decision rules. Our results demonstrate that very simple models may perform well on molecular prediction of cancer and that important gene markers of cancer can be detected if the gene selection approach is chosen reasonably.

  2. Testing the recent theories for the origin of the hermaphrodite flower by comparison of the transcriptomes of gymnosperms and angiosperms.

    PubMed

    Tavares, Raquel; Cagnon, Mathilde; Negrutiu, Ioan; Mouchiroud, Dominique

    2010-08-03

    Different theories for the origin of the angiosperm hermaphrodite flower make different predictions concerning the overlap between the genes expressed in the male and female cones of gymnosperms and the genes expressed in the hermaphrodite flower of angiosperms. The Mostly Male (MM) theory predicts that, of genes expressed primarily in male versus female gymnosperm cones, an excess of male orthologs will be expressed in flowers, excluding ovules, while Out Of Male (OOM) and Out Of Female (OOF) theories predict no such excess. In this paper, we tested these predictions by comparing the transcriptomes of three gymnosperms (Ginkgo biloba, Welwitschia mirabilis and Zamia fisheri) and two angiosperms (Arabidopsis thaliana and Oryza sativa), using EST data. We found that the proportion of orthologous genes expressed in the reproductive organs of the gymnosperms and in the angiosperms flower is significantly higher than the proportion of orthologous genes expressed in the reproductive organs of the gymnosperms and in the angiosperms vegetative tissues, which shows that the approach is correct. However, we detected no significant differences between the proportion of gymnosperm orthologous genes expressed in the male cone and in the angiosperms flower and the proportion of gymnosperm orthologous genes expressed in the female cone and in the angiosperms flower. These results do not support the MM theory prediction of an excess of male gymnosperm genes expressed in the hermaphrodite flower of the angiosperms and seem to support the OOM/OOF theories. However, other explanations can be given for the 1:1 ratio that we found. More abundant and more specific (namely carpel and ovule) expression data should be produced in order to further test these theories.

  3. Testing the recent theories for the origin of the hermaphrodite flower by comparison of the transcriptomes of gymnosperms and angiosperms

    PubMed Central

    2010-01-01

    Background Different theories for the origin of the angiosperm hermaphrodite flower make different predictions concerning the overlap between the genes expressed in the male and female cones of gymnosperms and the genes expressed in the hermaphrodite flower of angiosperms. The Mostly Male (MM) theory predicts that, of genes expressed primarily in male versus female gymnosperm cones, an excess of male orthologs will be expressed in flowers, excluding ovules, while Out Of Male (OOM) and Out Of Female (OOF) theories predict no such excess. Results In this paper, we tested these predictions by comparing the transcriptomes of three gymnosperms (Ginkgo biloba, Welwitschia mirabilis and Zamia fisheri) and two angiosperms (Arabidopsis thaliana and Oryza sativa), using EST data. We found that the proportion of orthologous genes expressed in the reproductive organs of the gymnosperms and in the angiosperms flower is significantly higher than the proportion of orthologous genes expressed in the reproductive organs of the gymnosperms and in the angiosperms vegetative tissues, which shows that the approach is correct. However, we detected no significant differences between the proportion of gymnosperm orthologous genes expressed in the male cone and in the angiosperms flower and the proportion of gymnosperm orthologous genes expressed in the female cone and in the angiosperms flower. Conclusions These results do not support the MM theory prediction of an excess of male gymnosperm genes expressed in the hermaphrodite flower of the angiosperms and seem to support the OOM/OOF theories. However, other explanations can be given for the 1:1 ratio that we found. More abundant and more specific (namely carpel and ovule) expression data should be produced in order to further test these theories. PMID:20682074

  4. Prediction of gene expression in embryonic structures of Drosophila melanogaster.

    PubMed

    Samsonova, Anastasia A; Niranjan, Mahesan; Russell, Steven; Brazma, Alvis

    2007-07-01

    Understanding how sets of genes are coordinately regulated in space and time to generate the diversity of cell types that characterise complex metazoans is a major challenge in modern biology. The use of high-throughput approaches, such as large-scale in situ hybridisation and genome-wide expression profiling via DNA microarrays, is beginning to provide insights into the complexities of development. However, in many organisms the collection and annotation of comprehensive in situ localisation data is a difficult and time-consuming task. Here, we present a widely applicable computational approach, integrating developmental time-course microarray data with annotated in situ hybridisation studies, that facilitates the de novo prediction of tissue-specific expression for genes that have no in vivo gene expression localisation data available. Using a classification approach, trained with data from microarray and in situ hybridisation studies of gene expression during Drosophila embryonic development, we made a set of predictions on the tissue-specific expression of Drosophila genes that have not been systematically characterised by in situ hybridisation experiments. The reliability of our predictions is confirmed by literature-derived annotations in FlyBase, by overrepresentation of Gene Ontology biological process annotations, and, in a selected set, by detailed gene-specific studies from the literature. Our novel organism-independent method will be of considerable utility in enriching the annotation of gene function and expression in complex multicellular organisms.

  5. Prediction of Gene Expression in Embryonic Structures of Drosophila melanogaster

    PubMed Central

    Samsonova, Anastasia A; Niranjan, Mahesan; Russell, Steven; Brazma, Alvis

    2007-01-01

    Understanding how sets of genes are coordinately regulated in space and time to generate the diversity of cell types that characterise complex metazoans is a major challenge in modern biology. The use of high-throughput approaches, such as large-scale in situ hybridisation and genome-wide expression profiling via DNA microarrays, is beginning to provide insights into the complexities of development. However, in many organisms the collection and annotation of comprehensive in situ localisation data is a difficult and time-consuming task. Here, we present a widely applicable computational approach, integrating developmental time-course microarray data with annotated in situ hybridisation studies, that facilitates the de novo prediction of tissue-specific expression for genes that have no in vivo gene expression localisation data available. Using a classification approach, trained with data from microarray and in situ hybridisation studies of gene expression during Drosophila embryonic development, we made a set of predictions on the tissue-specific expression of Drosophila genes that have not been systematically characterised by in situ hybridisation experiments. The reliability of our predictions is confirmed by literature-derived annotations in FlyBase, by overrepresentation of Gene Ontology biological process annotations, and, in a selected set, by detailed gene-specific studies from the literature. Our novel organism-independent method will be of considerable utility in enriching the annotation of gene function and expression in complex multicellular organisms. PMID:17658945

  6. Annotation of gene function in citrus using gene expression information and co-expression networks

    PubMed Central

    2014-01-01

    Background The genus Citrus encompasses major cultivated plants such as sweet orange, mandarin, lemon and grapefruit, among the world’s most economically important fruit crops. With increasing volumes of transcriptomics data available for these species, Gene Co-expression Network (GCN) analysis is a viable option for predicting gene function at a genome-wide scale. GCN analysis is based on a “guilt-by-association” principle whereby genes encoding proteins involved in similar and/or related biological processes may exhibit similar expression patterns across diverse sets of experimental conditions. While bioinformatics resources such as GCN analysis are widely available for efficient gene function prediction in model plant species including Arabidopsis, soybean and rice, in citrus these tools are not yet developed. Results We have constructed a comprehensive GCN for citrus inferred from 297 publicly available Affymetrix Genechip Citrus Genome microarray datasets, providing gene co-expression relationships at a genome-wide scale (33,000 transcripts). The comprehensive citrus GCN consists of a global GCN (condition-independent) and four condition-dependent GCNs that survey the sweet orange species only, all citrus fruit tissues, all citrus leaf tissues, or stress-exposed plants. All of these GCNs are clustered using genome-wide, gene-centric (guide) and graph clustering algorithms for flexibility of gene function prediction. For each putative cluster, gene ontology (GO) enrichment and gene expression specificity analyses were performed to enhance gene function, expression and regulation pattern prediction. The guide-gene approach was used to infer novel roles of genes involved in disease susceptibility and vitamin C metabolism, and graph-clustering approaches were used to investigate isoprenoid/phenylpropanoid metabolism in citrus peel, and citric acid catabolism via the GABA shunt in citrus fruit. 
    Conclusions Integration of citrus gene co-expression networks, functional enrichment analysis and gene expression information provides opportunities to infer gene function in citrus. We present a publicly accessible tool, Network Inference for Citrus Co-Expression (NICCE, http://citrus.adelaide.edu.au/nicce/home.aspx), for the gene co-expression analysis in citrus. PMID:25023870
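
The "guilt-by-association" idea in the record above can be illustrated with a minimal sketch: genes whose expression profiles correlate strongly across conditions are linked, and the neighbours of a guide gene become candidates for a shared function. Everything here is an assumption for illustration (Pearson correlation as the similarity measure, the 0.9 cutoff, the toy gene names and values); it is not the procedure used to build the citrus GCN or NICCE.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sx * sy)


def coexpression_edges(expr, threshold=0.9):
    """Link every gene pair whose correlation reaches the (assumed) cutoff."""
    edges = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if r >= threshold:
            edges.append((g1, g2, r))
    return edges


def guide_gene_neighbors(edges, guide):
    """Genes directly connected to a guide gene in the network."""
    found = set()
    for g1, g2, _ in edges:
        if g1 == guide:
            found.add(g2)
        elif g2 == guide:
            found.add(g1)
    return found


# Toy profiles (made-up values): one row per gene, one column per condition.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
edges = coexpression_edges(expr, threshold=0.9)
candidates = guide_gene_neighbors(edges, "geneA")
```

In this toy data only geneA and geneB track each other closely, so geneB is the single candidate returned for the guide gene geneA.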

  7. Gene expression patterns in formalin-fixed, paraffin-embedded core biopsies predict docetaxel chemosensitivity in breast cancer patients.

    PubMed

    Chang, Jenny C; Makris, Andreas; Gutierrez, M Carolina; Hilsenbeck, Susan G; Hackett, James R; Jeong, Jennie; Liu, Mei-Lan; Baker, Joffre; Clark-Langone, Kim; Baehner, Frederick L; Sexton, Krsytal; Mohsin, Syed; Gray, Tara; Alvarez, Laura; Chamness, Gary C; Osborne, C Kent; Shak, Steven

    2008-03-01

    Previously, we had identified gene expression patterns that predicted response to neoadjuvant docetaxel. Other studies have validated that a high Recurrence Score (RS) by the 21-gene RT-PCR assay is predictive of worse prognosis but better response to chemotherapy. We investigated whether tumor expression of these 21 genes and other candidate genes can predict response to docetaxel. Core biopsies from 97 patients were obtained before treatment with neoadjuvant docetaxel (4 cycles, 100 mg/m² every 3 weeks). Three 10-μm FFPE sections were submitted for quantitative RT-PCR assays of 192 genes that were selected from our previous work and the literature. Of the 97 patients, 81 (84%) had sufficient invasive cancer, 80 (82%) had sufficient RNA for the quantitative RT-PCR assay, and 72 (74%) had clinical response data. Mean age was 48.5 years, and the median tumor size was 6 cm. Clinical complete responses (CR) were observed in 12 (17%), partial responses in 41 (57%), stable disease in 17 (24%), and progressive disease in 2 patients (3%). A significant relationship (P<0.05) between gene expression and CR was observed for 14 genes, including CYBA. CR was associated with lower expression of the ER gene group and higher expression of the proliferation gene group from the 21-gene assay. Of note, CR was more likely with a high RS (P=0.008). We have established molecular profiles of sensitivity to docetaxel. RT-PCR technology provides a potential platform for a predictive test of docetaxel chemosensitivity using small amounts of routinely processed material.

  8. A cis-regulatory logic simulator.

    PubMed

    Zeigler, Robert D; Gertz, Jason; Cohen, Barak A

    2007-07-27

    A major goal of computational studies of gene regulation is to accurately predict the expression of genes based on the cis-regulatory content of their promoters. The development of computational methods to decode the interactions among cis-regulatory elements has been slow, in part, because it is difficult to know, without extensive experimental validation, whether a particular method identifies the correct cis-regulatory interactions that underlie a given set of expression data. There is an urgent need for test expression data in which the interactions among cis-regulatory sites that produce the data are known. The ability to rapidly generate such data sets would facilitate the development and comparison of computational methods that predict gene expression patterns from promoter sequence. We developed a gene expression simulator which generates expression data using user-defined interactions between cis-regulatory sites. The simulator can incorporate additive, cooperative, competitive, and synergistic interactions between regulatory elements. Constraints on the spacing, distance, and orientation of regulatory elements and their interactions may also be defined and Gaussian noise can be added to the expression values. The simulator allows for a data transformation that simulates the sigmoid shape of expression levels from real promoters. We found good agreement between sets of simulated promoters and predicted regulatory modules from real expression data. We present several data sets that may be useful for testing new methodologies for predicting gene expression from promoter sequence. We developed a flexible gene expression simulator that rapidly generates large numbers of simulated promoters and their corresponding transcriptional output based on specified interactions between cis-regulatory sites. When appropriate rule sets are used, the data generated by our simulator faithfully reproduces experimentally derived data sets. 
    We anticipate that using simulated gene expression data sets will facilitate the direct comparison of computational strategies to predict gene expression from promoter sequence. The source code is available online and as additional material. The test sets are available as additional material.
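
The ingredients the record above lists (additive and pairwise interactions between sites, a sigmoid transformation, optional Gaussian noise) can be combined in a small sketch. This is not the authors' simulator: the function name, the site names, the weights, and the choice of a logistic function for the sigmoid are all assumptions made for illustration.

```python
import math
import random


def simulate_expression(sites, weights, pair_weights, noise_sd=0.0, rng=None):
    """Toy expression level for one promoter.

    sites        -- set of cis-regulatory site names present in the promoter
    weights      -- additive contribution of each site (hypothetical values)
    pair_weights -- extra contribution when both sites of a pair are present;
                    positive values mimic cooperative or synergistic effects,
                    negative values mimic competition
    noise_sd     -- standard deviation of optional Gaussian noise
    """
    # Additive part: each site present contributes independently.
    score = sum(weights.get(s, 0.0) for s in sites)
    # Interaction part: applied only when both partners are present.
    for (a, b), w in pair_weights.items():
        if a in sites and b in sites:
            score += w
    # Sigmoid transform (logistic function assumed here).
    level = 1.0 / (1.0 + math.exp(-score))
    # Optional Gaussian noise on the output.
    if noise_sd > 0:
        rng = rng or random.Random()
        level += rng.gauss(0.0, noise_sd)
    return level


weights = {"siteA": 1.0, "siteB": 1.0}
pair_weights = {("siteA", "siteB"): 2.0}
single = simulate_expression({"siteA"}, weights, pair_weights)
both = simulate_expression({"siteA", "siteB"}, weights, pair_weights)
noisy = simulate_expression({"siteA"}, weights, pair_weights,
                            noise_sd=0.1, rng=random.Random(1))
```

Because the rules that generate the data are known, a method that tries to recover them from the simulated output can be scored directly against the `weights` and `pair_weights` that were used.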

  9. Molecular Structure-Based Large-Scale Prediction of Chemical-Induced Gene Expression Changes.

    PubMed

    Liu, Ruifeng; AbdulHameed, Mohamed Diwan M; Wallqvist, Anders

    2017-09-25

    The quantitative structure-activity relationship (QSAR) approach has been used to model a wide range of chemical-induced biological responses. However, it had not been utilized to model chemical-induced genomewide gene expression changes until very recently, owing to the complexity of training and evaluating a very large number of models. To address this issue, we examined the performance of a variable nearest neighbor (v-NN) method that uses information on near neighbors conforming to the principle that similar structures have similar activities. Using a data set of gene expression signatures of 13,150 compounds derived from cell-based measurements in the NIH Library of Integrated Network-based Cellular Signatures program, we were able to make predictions for 62% of the compounds in a 10-fold cross validation test, with a correlation coefficient of 0.61 between the predicted and experimentally derived signatures, a reproducibility rivaling that of high-throughput gene expression measurements. To evaluate the utility of the predicted gene expression signatures, we compared the predicted and experimentally derived signatures in their ability to identify drugs known to cause specific liver, kidney, and heart injuries. Overall, the predicted and experimentally derived signatures had similar receiver operating characteristics, whose areas under the curve ranged from 0.71 to 0.77 and 0.70 to 0.73, respectively, across the three organ injury models. However, detailed analyses of enrichment curves indicate that signatures predicted from multiple near neighbors outperformed those derived from experiments, suggesting that averaging information from near neighbors may help improve the signal from gene expression measurements. Our results demonstrate that the v-NN method can serve as a practical approach for modeling large-scale, genomewide, chemical-induced, gene expression changes.
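
A nearest-neighbor prediction of this kind can be sketched as a distance-weighted average over the training compounds that lie close enough to the query, with no prediction when none qualifies (which would explain why only a fraction of compounds receive one). The abstract does not give the details, so the Tanimoto distance on fingerprint bit sets, the distance cutoff `d0`, the smoothing factor `h`, and the exponential weighting are all assumptions of this sketch.

```python
import math


def tanimoto_distance(fp_a, fp_b):
    """1 - Tanimoto similarity between two fingerprints given as sets of bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def vnn_predict(query_fp, training, d0=0.6, h=0.3):
    """Distance-weighted average signature of near neighbors.

    training -- list of (fingerprint, signature) pairs; a signature is a list
                of per-gene expression changes
    d0, h    -- assumed distance cutoff and smoothing factor
    Returns None when no training compound lies within d0 of the query.
    """
    weighted, total = None, 0.0
    for fp, signature in training:
        d = tanimoto_distance(query_fp, fp)
        if d > d0:
            continue  # too dissimilar to count as a near neighbor
        w = math.exp(-((d / h) ** 2))  # closer neighbors weigh more
        if weighted is None:
            weighted = [0.0] * len(signature)
        for i, value in enumerate(signature):
            weighted[i] += w * value
        total += w
    if weighted is None:
        return None
    return [v / total for v in weighted]


# Made-up fingerprints and two-gene signatures.
training = [
    ({1, 2, 3}, [1.0, -1.0]),
    ({1, 2, 4}, [0.0, 0.0]),
    ({7, 8, 9}, [5.0, 5.0]),
]
prediction = vnn_predict({1, 2, 3}, training)
no_prediction = vnn_predict({10, 11}, training)
```

The number of neighbors is not fixed in advance: it varies with how many training compounds fall inside the cutoff, and a query with no close neighbor is left unpredicted rather than forced onto a distant one.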

  10. Machine Learning–Based Differential Network Analysis: A Study of Stress-Responsive Transcriptomes in Arabidopsis

    PubMed Central

    Ma, Chuang; Xin, Mingming; Feldmann, Kenneth A.; Wang, Xiangfeng

    2014-01-01

    Machine learning (ML) is an intelligent data mining technique that builds a prediction model based on the learning of prior knowledge to recognize patterns in large-scale data sets. We present an ML-based methodology for transcriptome analysis via comparison of gene coexpression networks, implemented as an R package called machine learning–based differential network analysis (mlDNA) and apply this method to reanalyze a set of abiotic stress expression data in Arabidopsis thaliana. The mlDNA first used a ML-based filtering process to remove nonexpressed, constitutively expressed, or non-stress-responsive “noninformative” genes prior to network construction, through learning the patterns of 32 expression characteristics of known stress-related genes. The retained “informative” genes were subsequently analyzed by ML-based network comparison to predict candidate stress-related genes showing expression and network differences between control and stress networks, based on 33 network topological characteristics. Comparative evaluation of the network-centric and gene-centric analytic methods showed that mlDNA substantially outperformed traditional statistical testing–based differential expression analysis at identifying stress-related genes, with markedly improved prediction accuracy. To experimentally validate the mlDNA predictions, we selected 89 candidates out of the 1784 predicted salt stress–related genes with available SALK T-DNA mutagenesis lines for phenotypic screening and identified two previously unreported genes, mutants of which showed salt-sensitive phenotypes. PMID:24520154

  11. A hybrid approach of gene sets and single genes for the prediction of survival risks with gene expression data.

    PubMed

    Seok, Junhee; Davis, Ronald W; Xiao, Wenzhong

    2015-01-01

    Accumulated biological knowledge is often encoded as gene sets, collections of genes associated with similar biological functions or pathways. The use of gene sets in the analyses of high-throughput gene expression data has been intensively studied and applied in clinical research. However, the main interest remains in finding modules of biological knowledge, or corresponding gene sets, significantly associated with disease conditions. Risk prediction from censored survival times using gene sets hasn't been well studied. In this work, we propose a hybrid method that uses both single gene and gene set information together to predict patient survival risks from gene expression profiles. In the proposed method, gene sets provide context-level information that is poorly reflected by single genes. Complementarily, single genes help to supplement incomplete information of gene sets due to our imperfect biomedical knowledge. Through the tests over multiple data sets of cancer and trauma injury, the proposed method showed robust and improved performance compared with the conventional approaches with only single genes or gene sets solely. Additionally, we examined the prediction result in the trauma injury data, and showed that the modules of biological knowledge used in the prediction by the proposed method were highly interpretable in biology. A wide range of survival prediction problems in clinical genomics is expected to benefit from the use of biological knowledge.

  12. A Hybrid Approach of Gene Sets and Single Genes for the Prediction of Survival Risks with Gene Expression Data

    PubMed Central

    Seok, Junhee; Davis, Ronald W.; Xiao, Wenzhong

    2015-01-01

    Accumulated biological knowledge is often encoded as gene sets, collections of genes associated with similar biological functions or pathways. The use of gene sets in the analyses of high-throughput gene expression data has been intensively studied and applied in clinical research. However, the main interest remains in finding modules of biological knowledge, or corresponding gene sets, significantly associated with disease conditions. Risk prediction from censored survival times using gene sets hasn’t been well studied. In this work, we propose a hybrid method that uses both single gene and gene set information together to predict patient survival risks from gene expression profiles. In the proposed method, gene sets provide context-level information that is poorly reflected by single genes. Complementarily, single genes help to supplement incomplete information of gene sets due to our imperfect biomedical knowledge. Through the tests over multiple data sets of cancer and trauma injury, the proposed method showed robust and improved performance compared with the conventional approaches with only single genes or gene sets solely. Additionally, we examined the prediction result in the trauma injury data, and showed that the modules of biological knowledge used in the prediction by the proposed method were highly interpretable in biology. A wide range of survival prediction problems in clinical genomics is expected to benefit from the use of biological knowledge. PMID:25933378

  13. In Silico Prediction and Validation of Gfap as an miR-3099 Target in Mouse Brain.

    PubMed

    Abidin, Shahidee Zainal; Leong, Jia-Wen; Mahmoudi, Marzieh; Nordin, Norshariza; Abdullah, Syahril; Cheah, Pike-See; Ling, King-Hwa

    2017-08-01

    MicroRNAs are small non-coding RNAs that play crucial roles in the regulation of gene expression and protein synthesis during brain development. MiR-3099 is highly expressed throughout embryogenesis, especially in the developing central nervous system. Moreover, miR-3099 is also expressed at a higher level in differentiating neurons in vitro, suggesting that it is a potential regulator during neuronal cell development. This study aimed to predict the target genes of miR-3099 via in-silico analysis using four independent prediction algorithms (miRDB, miRanda, TargetScan, and DIANA-micro-T-CDS) with emphasis on target genes related to brain development and function. Based on the analysis, a total of 3,174 miR-3099 target genes were predicted. Those predicted by at least three algorithms (324 genes) were subjected to DAVID bioinformatics analysis to understand their overall functional themes and representation. The analysis revealed that nearly 70% of the target genes were expressed in the nervous system and a significant proportion were associated with transcriptional regulation and protein ubiquitination mechanisms. Comparison of in situ hybridization (ISH) expression patterns of miR-3099 in both published and in-house-generated ISH sections with the ISH sections of target genes from the Allen Brain Atlas identified 7 target genes (Dnmt3a, Gabpa, Gfap, Itga4, Lxn, Smad7, and Tbx18) having expression patterns complementary to miR-3099 in the developing and adult mouse brain samples. Of these, we validated Gfap as a direct downstream target of miR-3099 using the luciferase reporter gene system. In conclusion, we report the successful prediction and validation of Gfap as an miR-3099 target gene using a combination of bioinformatics resources with enrichment of annotations based on functional ontologies and a spatio-temporal expression dataset.
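
The "predicted by at least three algorithms" filter described above amounts to a vote count across tools. A minimal sketch follows; the tool names come from the abstract, but the gene identifiers (`g1` to `g5`) are placeholders and do not reflect what any tool actually predicted.

```python
from collections import Counter


def consensus_targets(predictions, min_votes=3):
    """Genes predicted as targets by at least `min_votes` algorithms."""
    votes = Counter()
    for genes in predictions.values():
        votes.update(set(genes))  # one vote per algorithm per gene
    return sorted(g for g, n in votes.items() if n >= min_votes)


# Placeholder predictions: gene IDs are made up for illustration.
predictions = {
    "miRDB": {"g1", "g2", "g3"},
    "miRanda": {"g1", "g2", "g4"},
    "TargetScan": {"g1", "g3", "g5"},
    "DIANA-micro-T-CDS": {"g1", "g2"},
}
shortlist = consensus_targets(predictions, min_votes=3)
```

Raising `min_votes` trades sensitivity for specificity: requiring all four tools to agree would keep only `g1` in this toy example.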

  14. Combining classifiers to predict gene function in Arabidopsis thaliana using large-scale gene expression measurements.

    PubMed

    Lan, Hui; Carson, Rachel; Provart, Nicholas J; Bonner, Anthony J

    2007-09-21

    Arabidopsis thaliana is the model species of current plant genomic research with a genome size of 125 Mb and approximately 28,000 genes. The function of half of these genes is currently unknown. The purpose of this study is to infer gene function in Arabidopsis using machine-learning algorithms applied to large-scale gene expression data sets, with the goal of identifying genes that are potentially involved in plant response to abiotic stress. Using in house and publicly available data, we assembled a large set of gene expression measurements for A. thaliana. Using those genes of known function, we first evaluated and compared the ability of basic machine-learning algorithms to predict which genes respond to stress. Predictive accuracy was measured using ROC50 and precision curves derived through cross validation. To improve accuracy, we developed a method for combining these classifiers using a weighted-voting scheme. The combined classifier was then trained on genes of known function and applied to genes of unknown function, identifying genes that potentially respond to stress. Visual evidence corroborating the predictions was obtained using electronic Northern analysis. Three of the predicted genes were chosen for biological validation. Gene knockout experiments confirmed that all three are involved in a variety of stress responses. The biological analysis of one of these genes (At1g16850) is presented here, where it is shown to be necessary for the normal response to temperature and NaCl. Supervised learning methods applied to large-scale gene expression measurements can be used to predict gene function. However, the ability of basic learning methods to predict stress response varies widely and depends heavily on how much dimensionality reduction is used. 
    Our method of combining classifiers can improve the accuracy of such predictions (in this case, predictions of genes involved in stress response in plants), and it effectively chooses the appropriate amount of dimensionality reduction automatically. The method provides a useful means of identifying genes in A. thaliana that potentially respond to stress, and we expect it would be useful in other organisms and for other gene functions.
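
A weighted-voting combination of classifiers can be sketched as a weighted average of each classifier's score for a gene. The abstract does not say how the weights are chosen, so the sketch assumes each weight is a cross-validated performance estimate; the classifier names, scores, and weights are hypothetical.

```python
def weighted_vote(scores, weights):
    """Combine per-classifier scores for one gene into a single score.

    scores  -- classifier name -> score in [0, 1] that the gene responds to stress
    weights -- classifier name -> weight (assumed here to reflect
               cross-validated performance)
    """
    total = sum(weights[name] for name in scores)
    return sum(weights[name] * scores[name] for name in scores) / total


# Hypothetical classifiers and values for a single gene.
scores = {"clf_a": 0.9, "clf_b": 0.4, "clf_c": 0.7}
weights = {"clf_a": 0.8, "clf_b": 0.6, "clf_c": 0.6}
combined = weighted_vote(scores, weights)
```

Here the better-performing classifier pulls the combined score (about 0.69) above the plain average of the three scores (about 0.67).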

  15. A Gene Expression Profile of BRCAness That Predicts for Responsiveness to Platinum and PARP Inhibitors

    DTIC Science & Technology

    2015-10-01

    Award Number: W81XWH-10-1-0585 TITLE: A Gene Expression Profile of BRCAness That Predicts for Responsiveness to Platinum and PARP Inhibitors...BRCAlike, i.e. not HR deficient and are resistant to PARPis but are sensitive to platinum. These tumors exhibit alterations in another DNA repair

  16. Complex Expression of the Cellulolytic Transcriptome of Saccharophagus degradans

    PubMed Central

    Zhang, Haitao; Hutcheson, Steven W.

    2011-01-01

    Saccharophagus degradans is an aerobic marine bacterium that can degrade cellulose by the induced expression of an unusual cellulolytic system composed of multiple endoglucanases and glucosidases. To understand the regulation of the cellulolytic system, transcript levels for the genes predicted to contribute to the cellulolytic system were monitored by quantitative real-time PCR (qRT-PCR) during the transition to growth on cellulose. Four glucanases of the cellulolytic system exhibited basal expression during growth on glucose. All but one of the predicted cellulolytic system genes were induced strongly during growth on Avicel, with three patterns of expression observed. One group showed increased expression (up to 6-fold) within 4 h of the nutritional shift, with the relative expression remaining constant over the next 22 h. A second group of genes was strongly induced between 4 and 10 h after nutritional transfer, with relative expression declining thereafter. The third group of genes was slowly induced and was expressed maximally after 24 h. Cellodextrins and cellobiose, products of the predicted basally expressed endoglucanases, stimulated expression of representative cellulase genes. A model is proposed by which the activity of basally expressed endoglucanases releases cellodextrins from Avicel that are then perceived and transduced to initiate transcription of each of the regulated cellulolytic system genes forming an expression pattern. PMID:21705539

  17. A peripheral blood transcriptomic signature predicts autoantibody development in infants at risk of type 1 diabetes.

    PubMed

    Mehdi, Ahmed M; Hamilton-Williams, Emma E; Cristino, Alexandre; Ziegler, Anette; Bonifacio, Ezio; Le Cao, Kim-Anh; Harris, Mark; Thomas, Ranjeny

    2018-03-08

    Autoimmune-mediated destruction of pancreatic islet β cells results in type 1 diabetes (T1D). Serum islet autoantibodies usually develop in genetically susceptible individuals in early childhood before T1D onset, with multiple islet autoantibodies predicting diabetes development. However, most at-risk children remain islet-antibody negative, and no test currently identifies those likely to seroconvert. We sought a genomic signature predicting seroconversion risk by integrating longitudinal peripheral blood gene expression profiles collected in high-risk children included in the BABYDIET and DIPP cohorts, of whom 50 seroconverted. Subjects were followed for 10 years to determine time of seroconversion. Any cohort effect and the time of seroconversion were corrected to uncover genes differentially expressed (DE) in seroconverting children. Gene expression signatures associated with seroconversion were evident during the first year of life, with 67 DE genes identified in seroconverting children relative to those remaining antibody negative. These genes contribute to T cell-, DC-, and B cell-related immune responses. Near-birth expression of ADCY9, PTCH1, MEX3B, IL15RA, ZNF714, TENM1, and PLEKHA5, along with HLA risk score predicted seroconversion (AUC 0.85). The ubiquitin-proteasome pathway linked DE genes and T1D susceptibility genes. Therefore, a gene expression signature in infancy predicts risk of seroconversion. Ubiquitination may play a mechanistic role in diabetes progression.

  18. A peripheral blood transcriptomic signature predicts autoantibody development in infants at risk of type 1 diabetes

    PubMed Central

    Mehdi, Ahmed M.; Hamilton-Williams, Emma E.; Cristino, Alexandre; Ziegler, Anette; Harris, Mark

    2018-01-01

    Autoimmune-mediated destruction of pancreatic islet β cells results in type 1 diabetes (T1D). Serum islet autoantibodies usually develop in genetically susceptible individuals in early childhood before T1D onset, with multiple islet autoantibodies predicting diabetes development. However, most at-risk children remain islet-antibody negative, and no test currently identifies those likely to seroconvert. We sought a genomic signature predicting seroconversion risk by integrating longitudinal peripheral blood gene expression profiles collected in high-risk children included in the BABYDIET and DIPP cohorts, of whom 50 seroconverted. Subjects were followed for 10 years to determine time of seroconversion. Any cohort effect and the time of seroconversion were corrected to uncover genes differentially expressed (DE) in seroconverting children. Gene expression signatures associated with seroconversion were evident during the first year of life, with 67 DE genes identified in seroconverting children relative to those remaining antibody negative. These genes contribute to T cell–, DC-, and B cell–related immune responses. Near-birth expression of ADCY9, PTCH1, MEX3B, IL15RA, ZNF714, TENM1, and PLEKHA5, along with HLA risk score predicted seroconversion (AUC 0.85). The ubiquitin-proteasome pathway linked DE genes and T1D susceptibility genes. Therefore, a gene expression signature in infancy predicts risk of seroconversion. Ubiquitination may play a mechanistic role in diabetes progression. PMID:29515040

  19. Clinical Value of Prognosis Gene Expression Signatures in Colorectal Cancer: A Systematic Review

    PubMed Central

    Cordero, David; Riccadonna, Samantha; Solé, Xavier; Crous-Bou, Marta; Guinó, Elisabet; Sanjuan, Xavier; Biondo, Sebastiano; Soriano, Antonio; Jurman, Giuseppe; Capella, Gabriel; Furlanello, Cesare; Moreno, Victor

    2012-01-01

    Introduction The traditional staging system is inadequate to identify those patients with stage II colorectal cancer (CRC) at high risk of recurrence or with stage III CRC at low risk. A number of gene expression signatures to predict CRC prognosis have been proposed, but none is routinely used in the clinic. The aim of this work was to assess the prediction ability and potential clinical usefulness of these signatures in a series of independent datasets. Methods A literature review identified 31 gene expression signatures that used gene expression data to predict prognosis in CRC tissue. The search was based on the PubMed database and was restricted to papers published from January 2004 to December 2011. Eleven CRC gene expression datasets with outcome information were identified and downloaded from public repositories. Random Forest classifier was used to build predictors from the gene lists. Matthews correlation coefficient was chosen as a measure of classification accuracy and its associated p-value was used to assess association with prognosis. For clinical usefulness evaluation, positive and negative post-tests probabilities were computed in stage II and III samples. Results Five gene signatures showed significant association with prognosis and provided reasonable prediction accuracy in their own training datasets. Nevertheless, all signatures showed low reproducibility in independent data. Stratified analyses by stage or microsatellite instability status showed significant association but limited discrimination ability, especially in stage II tumors. From a clinical perspective, the most predictive signatures showed a minor but significant improvement over the classical staging system. Conclusions The published signatures show low prediction accuracy but moderate clinical usefulness. Although gene expression data may inform prognosis, better strategies for signature validation are needed to encourage their widespread use in the clinic. PMID:23145004
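
    The abstract above scores each signature with the Matthews correlation coefficient (MCC). As a minimal sketch of that metric only (the Random Forest step is not reproduced), the function below computes MCC for binary labels. The function name and the 0/1 label encoding are illustrative assumptions, not taken from the paper.

```python
import math

def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels encoded as 0 (good prognosis) / 1 (poor prognosis).

    The 0/1 encoding is an assumption made for this sketch.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: report 0 when any marginal count is zero.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

if __name__ == "__main__":
    observed = [1, 1, 0, 0, 1, 0]
    predicted = [1, 0, 0, 0, 1, 1]
    print(round(matthews_corrcoef(observed, predicted), 3))
```

    MCC ranges from -1 to 1 and stays near 0 for a classifier that ignores the data, which makes it less sensitive to class imbalance than raw accuracy.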

  20. Prediction of cardioembolic, arterial and lacunar causes of cryptogenic stroke by gene expression and infarct location

    PubMed Central

    Jickling, Glen C; Stamova, Boryana; Ander, Bradley P; Zhan, Xinhua; Liu, Dazhi; Sison, Shara-Mae; Verro, Piero; Sharp, Frank R

    2012-01-01

    Background and Purpose The cause of ischemic stroke remains unclear, or cryptogenic, in as many as 35% of stroke patients. Not knowing the cause of stroke restricts optimal implementation of prevention therapy and limits stroke research. We demonstrate how gene expression profiles in blood can be used in conjunction with a measure of infarct location on neuroimaging to predict a probable cause in cryptogenic stroke. Methods The cause of cryptogenic stroke was predicted using previously described profiles of differentially expressed genes characteristic of patients with cardioembolic, arterial and lacunar stroke. RNA was isolated from peripheral blood of 131 cryptogenic strokes and compared to profiles derived from 149 strokes of known cause. Each sample was run on Affymetrix U133 Plus2.0 microarrays. Cause of cryptogenic stroke was predicted using gene expression in blood and infarct location. Results Cryptogenic strokes were predicted to be 58% cardioembolic, 18% arterial, 12% lacunar and 12% unclear etiology. Cryptogenic stroke of predicted cardioembolic etiology had more prior myocardial infarction and higher CHA2DS2-VASc scores compared to stroke of predicted arterial etiology. Predicted lacunar strokes had higher systolic and diastolic blood pressures and lower NIHSS compared to predicted arterial and cardioembolic strokes. Cryptogenic strokes of unclear predicted etiology were less likely to have a prior TIA or ischemic stroke. Conclusions Gene expression in conjunction with a measure of infarct location can predict a probable cause in cryptogenic strokes. Predicted groups require further evaluation to determine whether relevant clinical, imaging, or therapeutic differences exist for each group. PMID:22627989

  1. Intra- and interspecies gene expression models for predicting drug response in canine osteosarcoma.

    PubMed

    Fowles, Jared S; Brown, Kristen C; Hess, Ann M; Duval, Dawn L; Gustafson, Daniel L

    2016-02-19

    Genomics-based predictors of drug response have the potential to improve outcomes associated with cancer therapy. Osteosarcoma (OS), the most common primary bone cancer in dogs, is commonly treated with adjuvant doxorubicin or carboplatin following amputation of the affected limb. We evaluated the use of gene-expression based models built in an intra- or interspecies manner to predict chemosensitivity and treatment outcome in canine OS. Models were built and evaluated using microarray gene expression and drug sensitivity data from human and canine cancer cell lines, and canine OS tumor datasets. The "COXEN" method was utilized to filter gene signatures between human and dog datasets based on strong co-expression patterns. Models were built using linear discriminant analysis via the misclassification penalized posterior algorithm. The best doxorubicin model involved genes identified in human lines that were co-expressed and trained on canine OS tumor data, which accurately predicted clinical outcome in 73% of dogs (p = 0.0262, binomial). The best carboplatin model utilized canine lines for gene identification and model training, with canine OS tumor data for co-expression. Dogs whose treatment matched our predictions had significantly better clinical outcomes than those that didn't (p = 0.0006, Log Rank), and this predictor significantly associated with longer disease free intervals in a Cox multivariate analysis (hazard ratio = 0.3102, p = 0.0124). Our data show that intra- and interspecies gene expression models can successfully predict response in canine OS, which may improve outcome in dogs and serve as pre-clinical validation for similar methods in human cancer research.

  2. Testing the predictive value of peripheral gene expression for nonremission following citalopram treatment for major depression.

    PubMed

    Guilloux, Jean-Philippe; Bassi, Sabrina; Ding, Ying; Walsh, Chris; Turecki, Gustavo; Tseng, George; Cyranowski, Jill M; Sibille, Etienne

    2015-02-01

    Major depressive disorder (MDD) in general, and anxious-depression in particular, are characterized by poor rates of remission with first-line treatments, contributing to the chronic illness burden suffered by many patients. Prospective research is needed to identify the biomarkers predicting nonremission prior to treatment initiation. We collected blood samples from a discovery cohort of 34 adult MDD patients with co-occurring anxiety and 33 matched, nondepressed controls at baseline and after 12 weeks (of citalopram plus psychotherapy treatment for the depressed cohort). Samples were processed on gene arrays and group differences in gene expression were investigated. Exploratory analyses suggest that at pretreatment baseline, nonremitting patients differ from controls with gene function and transcription factor analyses potentially related to elevated inflammation and immune activation. In a second phase, we applied an unbiased machine learning prediction model and corrected for model-selection bias. Results show that baseline gene expression predicted nonremission with 79.4% corrected accuracy with a 13-gene model. The same gene-only model predicted nonremission after 8 weeks of citalopram treatment with 76% corrected accuracy in an independent validation cohort of 63 MDD patients treated with citalopram at another institution. Together, these results demonstrate the potential, but also the limitations, of baseline peripheral blood-based gene expression to predict nonremission after citalopram treatment. These results not only support their use in future prediction tools but also suggest that increased accuracy may be obtained with the inclusion of additional predictors (eg, genetics and clinical scales).

  3. Automated Protocol for Large-Scale Modeling of Gene Expression Data.

    PubMed

    Hall, Michelle Lynn; Calkins, David; Sherman, Woody

    2016-11-28

    With the continued rise of phenotypic- and genotypic-based screening projects, computational methods to analyze, process, and ultimately make predictions in this field take on growing importance. Here we show how automated machine learning workflows can produce models that are predictive of differential gene expression as a function of a compound structure using data from A673 cells as a proof of principle. In particular, we present predictive models with an average accuracy of greater than 70% across a highly diverse ∼1000 gene expression profile. In contrast to the usual in silico design paradigm, where one interrogates a particular target-based response, this work opens the opportunity for virtual screening and lead optimization for desired multitarget gene expression profiles.

  4. Changes in gene expression associated with response to neoadjuvant chemotherapy in breast cancer.

    PubMed

    Hannemann, Juliane; Oosterkamp, Hendrika M; Bosch, Cathy A J; Velds, Arno; Wessels, Lodewyk F A; Loo, Claudette; Rutgers, Emiel J; Rodenhuis, Sjoerd; van de Vijver, Marc J

    2005-05-20

    At present, clinically useful markers predicting response of primary breast carcinomas to either doxorubicin-cyclophosphamide (AC) or doxorubicin-docetaxel (AD) are lacking. We investigated whether gene expression profiles of the primary tumor could be used to predict treatment response to either of those chemotherapy regimens. Within a single-institution, randomized, phase II trial, patients with locally advanced breast cancer received six courses of either AC (n = 24) or AD (n = 24) neoadjuvant chemotherapy. Gene expression profiles were generated from core-needle biopsies obtained before treatment and correlated with the response of the primary tumor to the chemotherapy administered. Additionally, pretreatment gene expression profiles were compared with those in tumors remaining after chemotherapy. Ten (20%) of 48 patients showed a (near) pathologic complete remission of the primary tumor after treatment. No gene expression pattern correlating with response could be identified for all patients or for the AC or AD groups separately. The comparison of the pretreatment biopsy and the tumor excised after chemotherapy revealed differences in gene expression in tumors that showed a partial remission but not in tumors that did not respond to chemotherapy. No gene expression profile predicting the response of primary breast carcinomas to AC- or AD-based neoadjuvant chemotherapy could be detected in this interim analysis. More subtle differences in gene expression are likely to be present but can only be reliably identified by studying a larger group of patients. Response of a breast tumor to neoadjuvant chemotherapy results in alterations in gene expression.

  5. Transcriptional Coupling of Neighboring Genes and Gene Expression Noise: Evidence that Gene Orientation and Noncoding Transcripts Are Modulators of Noise

    PubMed Central

    Wang, Guang-Zhong; Lercher, Martin J.; Hurst, Laurence D.

    2011-01-01

    How is noise in gene expression modulated? Do mechanisms of noise control impact genome organization? In yeast, the expression of one gene can affect that of a very close neighbor. As the effect is highly regionalized, we hypothesize that genes in different orientations will have differing degrees of coupled expression and, in turn, different noise levels. Divergently organized gene pairs, in particular those with bidirectional promoters, have close promoters, maximizing the likelihood that expression of one gene affects the neighbor. With more distant promoters, the same is less likely to hold for gene pairs in nondivergent orientation. Stochastic models suggest that coupled chromatin dynamics will typically result in low abundance-corrected noise (ACN). Transcription of noncoding RNA (ncRNA) from a bidirectional promoter, we thus hypothesize to be a noise-reduction, expression-priming, mechanism. The hypothesis correctly predicts that protein-coding genes with a bidirectional promoter, including those with a ncRNA partner, have lower ACN than other genes and divergent gene pairs uniquely have correlated ACN. Moreover, as predicted, ACN increases with the distance between promoters. The model also correctly predicts ncRNA transcripts to be often divergently transcribed from genes that a priori would be under selection for low noise (essential genes, protein complex genes) and that the latter genes should commonly reside in divergent orientation. Likewise, that genes with bidirectional promoters are rare subtelomerically, cluster together, and are enriched in essential gene clusters is expected and observed. We conclude that gene orientation and transcription of ncRNAs are candidate modulators of noise. PMID:21402863

  6. Predictive model for inflammation grades of chronic hepatitis B: Large-scale analysis of clinical parameters and gene expressions.

    PubMed

    Zhou, Weichen; Ma, Yanyun; Zhang, Jun; Hu, Jingyi; Zhang, Menghan; Wang, Yi; Li, Yi; Wu, Lijun; Pan, Yida; Zhang, Yitong; Zhang, Xiaonan; Zhang, Xinxin; Zhang, Zhanqing; Zhang, Jiming; Li, Hai; Lu, Lungen; Jin, Li; Wang, Jiucun; Yuan, Zhenghong; Liu, Jie

    2017-11-01

    Liver biopsy is the gold standard to assess pathological features (eg inflammation grades) for hepatitis B virus-infected patients although it is invasive and traumatic; meanwhile, several gene profiles of chronic hepatitis B (CHB) have been separately described in relatively small hepatitis B virus (HBV)-infected samples. We aimed to analyse correlations among inflammation grades, gene expressions and clinical parameters (serum alanine amino transaminase, aspartate amino transaminase and HBV-DNA) in large-scale CHB samples and to predict inflammation grades by using clinical parameters and/or gene expressions. We analysed gene expressions with three clinical parameters in 122 CHB samples by an improved regression model. Principal component analysis and machine-learning methods including Random Forest, K-nearest neighbour and support vector machine were used for analysis and further diagnosis models. Six normal samples were conducted to validate the predictive model. Significant genes related to clinical parameters were found enriching in the immune system, interferon-stimulated, regulation of cytokine production, anti-apoptosis, and etc. A panel of these genes with clinical parameters can effectively predict binary classifications of inflammation grade (area under the ROC curve [AUC]: 0.88, 95% confidence interval [CI]: 0.77-0.93), validated by normal samples. A panel with only clinical parameters was also valuable (AUC: 0.78, 95% CI: 0.65-0.86), indicating that liquid biopsy method for detecting the pathology of CHB is possible. This is the first study to systematically elucidate the relationships among gene expressions, clinical parameters and pathological inflammation grades in CHB, and to build models predicting inflammation grades by gene expressions and/or clinical parameters as well. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  7. Gene expression profiles in paraffin-embedded core biopsy tissue predict response to chemotherapy in women with locally advanced breast cancer.

    PubMed

    Gianni, Luca; Zambetti, Milvia; Clark, Kim; Baker, Joffre; Cronin, Maureen; Wu, Jenny; Mariani, Gabriella; Rodriguez, Jaime; Carcangiu, Marialuisa; Watson, Drew; Valagussa, Pinuccia; Rouzier, Roman; Symmans, W Fraser; Ross, Jeffrey S; Hortobagyi, Gabriel N; Pusztai, Lajos; Shak, Steven

    2005-10-10

    We sought to identify gene expression markers that predict the likelihood of chemotherapy response. We also tested whether chemotherapy response is correlated with the 21-gene Recurrence Score assay that quantifies recurrence risk. Patients with locally advanced breast cancer received neoadjuvant paclitaxel and doxorubicin. RNA was extracted from the pretreatment formalin-fixed paraffin-embedded core biopsies. The expression of 384 genes was quantified using reverse transcriptase polymerase chain reaction and correlated with pathologic complete response (pCR). The performance of genes predicting for pCR was tested in patients from an independent neoadjuvant study where gene expression was obtained using DNA microarrays. Of 89 assessable patients (mean age, 49.9 years; mean tumor size, 6.4 cm), 11 (12%) had a pCR. Eighty-six genes correlated with pCR (unadjusted P < .05); pCR was more likely with higher expression of proliferation-related genes and immune-related genes, and with lower expression of estrogen receptor (ER) -related genes. In 82 independent patients treated with neoadjuvant paclitaxel and doxorubicin, DNA microarray data were available for 79 of the 86 genes. In univariate analysis, 24 genes correlated with pCR with P < .05 (false discovery, four genes) and 32 genes showed correlation with P < .1 (false discovery, eight genes). The Recurrence Score was positively associated with the likelihood of pCR (P = .005), suggesting that the patients who are at greatest recurrence risk are more likely to have chemotherapy benefit. Quantitative expression of ER-related genes, proliferation genes, and immune-related genes are strong predictors of pCR in women with locally advanced breast cancer receiving neoadjuvant anthracyclines and paclitaxel.
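
    The abstract above does not say how its parenthetical "false discovery" counts were obtained. One reading that matches the reported numbers is the count of genes expected to pass the threshold by chance alone, i.e. the number of genes tested multiplied by the P-value cutoff. The sketch below works through that arithmetic; the function name is hypothetical and the interpretation is an assumption.

```python
def expected_false_discoveries(n_tests, alpha):
    """Expected number of tests passing threshold `alpha` if every null were true."""
    return n_tests * alpha

if __name__ == "__main__":
    n_genes = 79  # genes with microarray data in the independent cohort
    # (P-value cutoff, number of genes reported as significant at that cutoff)
    for alpha, n_significant in [(0.05, 24), (0.10, 32)]:
        expected = expected_false_discoveries(n_genes, alpha)
        share = expected / n_significant
        print(f"P < {alpha}: about {expected:.2f} expected by chance "
              f"({share:.0%} of {n_significant} significant genes)")
```

    Under this reading, 79 x 0.05 is about 4 and 79 x 0.1 is about 8, in line with the four and eight genes quoted in the abstract.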

  8. Genome-wide patterns of promoter sharing and co-expression in bovine skeletal muscle.

    PubMed

    Gu, Quan; Nagaraj, Shivashankar H; Hudson, Nicholas J; Dalrymple, Brian P; Reverter, Antonio

    2011-01-12

    Gene regulation by transcription factors (TF) is species, tissue and time specific. To better understand how the genetic code controls gene expression in bovine muscle we associated gene expression data from developing Longissimus thoracis et lumborum skeletal muscle with bovine promoter sequence information. We created a highly conserved genome-wide promoter landscape comprising 87,408 interactions relating 333 TFs with their 9,242 predicted target genes (TGs). We discovered that the complete set of predicted TGs share an average of 2.75 predicted TF binding sites (TFBSs) and that the average co-expression between a TF and its predicted TGs is higher than the average co-expression between the same TF and all genes. Conversely, pairs of TFs sharing predicted TGs showed a co-expression correlation higher than pairs of TFs not sharing TGs. Finally, we exploited the co-occurrence of predicted TFBS in the context of muscle-derived functionally-coherent modules including cell cycle, mitochondria, immune system, fat metabolism, muscle/glycolysis, and ribosome. Our findings enabled us to reverse engineer a regulatory network of core processes, and correctly identified the involvement of E2F1, GATA2 and NFKB1 in the regulation of cell cycle, fat, and muscle/glycolysis, respectively. The pivotal implication of our research is two-fold: (1) there exists a robust genome-wide expression signal between TFs and their predicted TGs in cattle muscle consistent with the extent of promoter sharing; and (2) this signal can be exploited to recover the cellular mechanisms underpinning transcription regulation of muscle structure and development in bovine. Our study represents the first genome-wide report linking tissue specific co-expression to co-regulation in a non-model vertebrate.

  9. DeSigN: connecting gene expression with therapeutics for drug repurposing and development.

    PubMed

    Lee, Bernard Kok Bang; Tiong, Kai Hung; Chang, Jit Kang; Liew, Chee Sun; Abdul Rahman, Zainal Ariff; Tan, Aik Choon; Khang, Tsung Fei; Cheong, Sok Ching

    2017-01-25

    The drug discovery and development pipeline is a long and arduous process that inevitably hampers rapid drug development. Therefore, strategies to improve the efficiency of drug development are urgently needed to enable effective drugs to enter the clinic. Precision medicine has demonstrated that genetic features of cancer cells can be used for predicting drug response, and emerging evidence suggests that gene-drug connections could be predicted more accurately by exploring the cumulative effects of many genes simultaneously. We developed DeSigN, a web-based tool for predicting drug efficacy against cancer cell lines using gene expression patterns. The algorithm correlates phenotype-specific gene signatures derived from differentially expressed genes with pre-defined gene expression profiles associated with drug response data (IC50) from 140 drugs. DeSigN successfully predicted the right drug sensitivity outcome in four published GEO studies. Additionally, it predicted bosutinib, a Src/Abl kinase inhibitor, as a sensitive inhibitor for oral squamous cell carcinoma (OSCC) cell lines. In vitro validation of bosutinib in OSCC cell lines demonstrated that indeed, these cell lines were sensitive to bosutinib with IC50 of 0.8-1.2 μM. As further confirmation, we demonstrated experimentally that bosutinib has anti-proliferative activity in OSCC cell lines, demonstrating that DeSigN was able to robustly predict a drug that could be beneficial for tumour control. DeSigN is a robust method that is useful for the identification of candidate drugs using an input gene signature obtained from gene expression analysis. This user-friendly platform could be used to identify drugs with unanticipated efficacy against cancer cell lines of interest, and therefore could be used for the repurposing of drugs, thus improving the efficiency of drug development.

  10. Microarray Meta-Analysis Identifies Acute Lung Injury Biomarkers in Donor Lungs That Predict Development of Primary Graft Failure in Recipients

    PubMed Central

    Haitsma, Jack J.; Furmli, Suleiman; Masoom, Hussain; Liu, Mingyao; Imai, Yumiko; Slutsky, Arthur S.; Beyene, Joseph; Greenwood, Celia M. T.; dos Santos, Claudia

    2012-01-01

    Objectives To perform a meta-analysis of gene expression microarray data from animal studies of lung injury, and to identify an injury-specific gene expression signature capable of predicting the development of lung injury in humans. Methods We performed a microarray meta-analysis using 77 microarray chips across six platforms, two species and different animal lung injury models exposed to lung injury with or/and without mechanical ventilation. Individual gene chips were classified and grouped based on the strategy used to induce lung injury. Effect size (change in gene expression) was calculated between non-injurious and injurious conditions comparing two main strategies to pool chips: (1) one-hit and (2) two-hit lung injury models. A random effects model was used to integrate individual effect sizes calculated from each experiment. Classification models were built using the gene expression signatures generated by the meta-analysis to predict the development of lung injury in human lung transplant recipients. Results Two injury-specific lists of differentially expressed genes generated from our meta-analysis of lung injury models were validated using external data sets and prospective data from animal models of ventilator-induced lung injury (VILI). Pathway analysis of gene sets revealed that both new and previously implicated VILI-related pathways are enriched with differentially regulated genes. Classification model based on gene expression signatures identified in animal models of lung injury predicted development of primary graft failure (PGF) in lung transplant recipients with larger than 80% accuracy based upon injury profiles from transplant donors. We also found that better classifier performance can be achieved by using meta-analysis to identify differentially-expressed genes than using single study-based differential analysis. Conclusion Taken together, our data suggests that microarray analysis of gene expression data allows for the detection of “injury” gene predictors that can classify lung injury samples and identify patients at risk for clinically relevant lung injury complications. PMID:23071521

  11. A signature inferred from Drosophila mitotic genes predicts survival of breast cancer patients.

    PubMed

    Damasco, Christian; Lembo, Antonio; Somma, Maria Patrizia; Gatti, Maurizio; Di Cunto, Ferdinando; Provero, Paolo

    2011-02-28

    The classification of breast cancer patients into risk groups provides a powerful tool for the identification of patients who will benefit from aggressive systemic therapy. The analysis of microarray data has generated several gene expression signatures that improve diagnosis and allow risk assessment. There is also evidence that cell proliferation-related genes have a high predictive power within these signatures. We thus constructed a gene expression signature (the DM signature) using the human orthologues of 108 Drosophila melanogaster genes required for either the maintenance of chromosome integrity (36 genes) or mitotic division (72 genes). The DM signature has minimal overlap with the extant signatures and is highly predictive of survival in 5 large breast cancer datasets. In addition, we show that the DM signature outperforms many widely used breast cancer signatures in predictive power, and performs comparably to other proliferation-based signatures. For most genes of the DM signature, an increased expression is negatively correlated with patient survival. The genes that provide the highest contribution to the predictive power of the DM signature are those involved in cytokinesis. This finding highlights cytokinesis as an important marker in breast cancer prognosis and as a possible target for antimitotic therapies.

  12. Genome-wide targeted prediction of ABA responsive genes in rice based on over-represented cis-motif in co-expressed genes.

    PubMed

    Lenka, Sangram K; Lohia, Bikash; Kumar, Abhay; Chinnusamy, Viswanathan; Bansal, Kailash C

    2009-02-01

    Abscisic acid (ABA), the popular plant stress hormone, plays a key role in regulation of sub-set of stress responsive genes. These genes respond to ABA through specific transcription factors which bind to cis-regulatory elements present in their promoters. We discovered the ABA Responsive Element (ABRE) core (ACGT) containing CGMCACGTGB motif as over-represented motif among the promoters of ABA responsive co-expressed genes in rice. Targeted gene prediction strategy using this motif led to the identification of 402 protein coding genes potentially regulated by ABA-dependent molecular genetic network. RT-PCR analysis of arbitrarily chosen 45 genes from the predicted 402 genes confirmed 80% accuracy of our prediction. Plant Gene Ontology (GO) analysis of ABA responsive genes showed enrichment of signal transduction and stress related genes among diverse functional categories.
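
    The motif CGMCACGTGB in the abstract above is written with IUPAC ambiguity codes (M stands for A or C; B stands for C, G or T). As an illustration of how a promoter sequence could be scanned for such a motif, the sketch below translates the codes into a regular expression. The helper names and the example sequence are made up for this sketch, and only the forward strand is searched.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def motif_to_regex(motif):
    """Translate an IUPAC motif such as CGMCACGTGB into a regex string."""
    return "".join(IUPAC[base] for base in motif.upper())

def find_motif(sequence, motif):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]

if __name__ == "__main__":
    promoter = "TTCGACACGTGCAA"  # invented sequence, for illustration only
    print(motif_to_regex("CGMCACGTGB"))
    print(find_motif(promoter, "CGMCACGTGB"))
```

    A genome-wide scan would also need to search the reverse complement and restrict hits to annotated promoter regions, neither of which is shown here.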

  13. Employing conservation of co-expression to improve functional inference

    PubMed Central

    Daub, Carsten O; Sonnhammer, Erik LL

    2008-01-01

    Background Observing co-expression between genes suggests that they are functionally coupled. Co-expression of orthologous gene pairs across species may improve function prediction beyond the level achieved in a single species. Results We used orthology between genes of the three different species S. cerevisiae, D. melanogaster, and C. elegans to combine co-expression across two species at a time. This led to increased function prediction accuracy when we incorporated expression data from either of the other two species and even further increased when conservation across both of the two other species was considered at the same time. Employing the conservation across species to incorporate abundant model organism data for the prediction of protein interactions in poorly characterized species constitutes a very powerful annotation method. Conclusion To be able to employ the most suitable co-expression distance measure for our analysis, we evaluated the ability of four popular gene co-expression distance measures to detect biologically relevant interactions between pairs of genes. For the expression datasets employed in our co-expression conservation analysis above, we used the GO and the KEGG PATHWAY databases as gold standards. While the differences between distance measures were small, Spearman correlation showed to give most robust results. PMID:18808668
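
    The abstract above reports Spearman correlation as the most robust co-expression measure. The following is a minimal sketch of Spearman correlation between two expression profiles, computed as the Pearson correlation of average ranks. Turning it into a distance as 1 minus the correlation is an assumption of this sketch, as are the function names and the toy profiles.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

def coexpression_distance(x, y):
    """Assumed distance convention for this sketch: 1 - Spearman correlation."""
    return 1.0 - spearman(x, y)

if __name__ == "__main__":
    gene_a = [2.1, 3.5, 8.0, 9.2, 15.3]  # toy expression values across samples
    gene_b = [0.4, 0.9, 4.1, 3.8, 7.7]
    print(round(spearman(gene_a, gene_b), 3))
```

    Because only ranks enter the calculation, the measure is unaffected by monotonic transformations such as taking logarithms of the expression values.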

  14. Google Goes Cancer: Improving Outcome Prediction for Cancer Patients by Network-Based Ranking of Marker Genes

    PubMed Central

    Roy, Janine; Aust, Daniela; Knösel, Thomas; Rümmele, Petra; Jahnke, Beatrix; Hentrich, Vera; Rückert, Felix; Niedergethmann, Marco; Weichert, Wilko; Bahra, Marcus; Schlitt, Hans J.; Settmacher, Utz; Friess, Helmut; Büchler, Markus; Saeger, Hans-Detlev; Schroeder, Michael; Pilarsky, Christian; Grützmann, Robert

    2012-01-01

    Predicting the clinical outcome of cancer patients based on the expression of marker genes in their tumors has received increasing interest in the past decade. Accurate predictors of outcome and response to therapy could be used to personalize and thereby improve therapy. However, state of the art methods used so far often found marker genes with limited prediction accuracy, limited reproducibility, and unclear biological relevance. To address this problem, we developed a novel computational approach to identify genes prognostic for outcome that couples gene expression measurements from primary tumor samples with a network of known relationships between the genes. Our approach ranks genes according to their prognostic relevance using both expression and network information in a manner similar to Google's PageRank. We applied this method to gene expression profiles which we obtained from 30 patients with pancreatic cancer, and identified seven candidate marker genes prognostic for outcome. Compared to genes found with state of the art methods, such as Pearson correlation of gene expression with survival time, we improve the prediction accuracy by up to 7%. Accuracies were assessed using support vector machine classifiers and Monte Carlo cross-validation. We then validated the prognostic value of our seven candidate markers using immunohistochemistry on an independent set of 412 pancreatic cancer samples. Notably, signatures derived from our candidate markers were independently predictive of outcome and superior to established clinical prognostic factors such as grade, tumor size, and nodal status. As the amount of genomic data of individual tumors grows rapidly, our algorithm meets the need for powerful computational approaches that are key to exploit these data for personalized cancer therapies in clinical practice. PMID:22615549
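
    The abstract above describes ranking genes with both expression and network information "in a manner similar to Google's PageRank". One common way to do this is a personalized PageRank in which the restart weights come from each gene's own association with outcome. The sketch below illustrates that general idea and is not necessarily the authors' exact formulation; the damping value, gene names, scores and toy network are all invented.

```python
def network_rank(adjacency, scores, damping=0.5, iterations=100):
    """Personalized-PageRank-style gene ranking by power iteration.

    adjacency: dict mapping each gene to a list of neighbours (undirected,
               symmetric, no isolated genes; assumptions of this sketch).
    scores:    dict of non-negative per-gene relevance values, e.g. the
               absolute correlation of expression with survival time.
    """
    genes = list(scores)
    total = sum(scores.values())
    prior = {g: scores[g] / total for g in genes}
    rank = dict(prior)
    for _ in range(iterations):
        updated = {}
        for g in genes:
            # Each neighbour passes on its rank split evenly over its edges.
            inflow = sum(rank[n] / len(adjacency[n]) for n in adjacency[g])
            updated[g] = (1 - damping) * prior[g] + damping * inflow
        rank = updated
    return rank

if __name__ == "__main__":
    # Toy star network: hub gene A is linked to B, C and D.
    network = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
    relevance = {"A": 0.2, "B": 0.2, "C": 0.2, "D": 0.2}
    ranking = network_rank(network, relevance)
    for gene, value in sorted(ranking.items(), key=lambda kv: -kv[1]):
        print(gene, round(value, 3))
```

    With equal starting scores, the hub gene ends up ranked highest because it collects rank from all of its neighbours. Setting damping to 0 ignores the network and reproduces the expression-only scores.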

  15. EvoCor: a platform for predicting functionally related genes using phylogenetic and expression profiles.

    PubMed

    Dittmar, W James; McIver, Lauren; Michalak, Pawel; Garner, Harold R; Valdez, Gregorio

    2014-07-01

    The wealth of publicly available gene expression and genomic data provides unique opportunities for computational inference to discover groups of genes that function to control specific cellular processes. Such genes are likely to have co-evolved and be expressed in the same tissues and cells. Unfortunately, the expertise and computational resources required to compare tens of genomes and gene expression data sets make this type of analysis difficult for the average end-user. Here, we describe the implementation of a web server that predicts genes involved in affecting specific cellular processes together with a gene of interest. We termed the server 'EvoCor', to denote that it detects functional relationships among genes through evolutionary analysis and gene expression correlation. This web server integrates profiles of sequence divergence derived by a Hidden Markov Model (HMM) and tissue-wide gene expression patterns to determine putative functional linkages between pairs of genes. This server is easy to use and freely available at http://pilot-hmm.vbi.vt.edu/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. Analyzing gene expression from relative codon usage bias in Yeast genome: a statistical significance and biological relevance.

    PubMed

    Das, Shibsankar; Roymondal, Uttam; Sahoo, Satyabrata

    2009-08-15

    Based on the hypothesis that highly expressed genes are often characterized by strong compositional bias in terms of codon usage, there are a number of measures currently in use that quantify codon usage bias in genes, and hence provide numerical indices to predict the expression levels of genes. With the recent advent of an expression measure derived from the score of the relative codon usage bias (RCBS), we have explicitly tested the performance of this numerical measure in predicting gene expression levels and illustrate this with an analysis of yeast genomes. In contrast with previous studies, we observe a weak correlation between GC content and RCBS, but a selective pressure on the codon preferences in highly expressed genes. The assertion that the expression of a given gene depends on the score of relative codon usage bias (RCBS) is supported by the data. We further observe a strong correlation between RCBS and protein length, indicating natural selection in favour of shorter genes being expressed at higher levels. We also attempt a statistical analysis to assess the strength of relative codon bias in genes as a guide to their likely expression level, suggesting a decrease of the informational entropy in the highly expressed genes.

  17. Molecular evolution and expression of oxygen transport genes in livebearing fishes (Poeciliidae) from hydrogen sulfide rich springs.

    PubMed

    Barts, Nicholas; Greenway, Ryan; Passow, Courtney N; Arias-Rodriguez, Lenin; Kelley, Joanna L; Tobler, Michael

    2018-04-01

    Hydrogen sulfide (H2S) is a natural toxicant in some aquatic environments that has diverse molecular targets. It binds to oxygen transport proteins, rendering them non-functional by reducing oxygen-binding affinity. Hence, organisms permanently inhabiting H2S-rich environments are predicted to exhibit adaptive modifications to compensate for the reduced capacity to transport oxygen. We investigated 10 lineages of fish of the family Poeciliidae that have colonized freshwater springs rich in H2S, along with related lineages from non-sulfidic environments, to test hypotheses about the expression and evolution of oxygen transport genes in a phylogenetic context. We predicted shifts in the expression of and signatures of positive selection on oxygen transport genes upon colonization of H2S-rich habitats. Our analyses indicated significant shifts in gene expression for multiple hemoglobin genes in lineages that have colonized H2S-rich environments, and three hemoglobin genes exhibited relaxed selection in sulfidic compared to non-sulfidic lineages. However, neither changes in gene expression nor signatures of selection were consistent among all lineages in H2S-rich environments. Oxygen transport genes may consequently be predictable targets of selection during adaptation to sulfidic environments, but changes in gene expression and molecular evolution of oxygen transport genes in H2S-rich environments are not necessarily repeatable across replicated lineages.

  18. Multiple fuzzy neural network system for outcome prediction and classification of 220 lymphoma patients on the basis of molecular profiling.

    PubMed

    Ando, Tatsuya; Suguro, Miyuki; Kobayashi, Takeshi; Seto, Masao; Honda, Hiroyuki

    2003-10-01

    A fuzzy neural network (FNN) using gene expression profile data can select combinations of genes from thousands of genes, and is applicable to predict outcome for cancer patients after chemotherapy. However, wide clinical heterogeneity reduces the accuracy of prediction. To overcome this problem, we have proposed an FNN system based on majoritarian decision using multiple noninferior models. We used transcriptional profiling data, which were obtained from "Lymphochip" DNA microarrays (http://llmpp.nih.gov/DLBCL), reported by Rosenwald (N Engl J Med 2002; 346: 1937-47). When the data were analyzed by our FNN system, accuracy (73.4%) of outcome prediction using only 1 FNN model with 4 genes was higher than that (68.5%) of the Cox model using 17 genes. Higher accuracy (91%) was obtained when an FNN system with 9 noninferior models, consisting of 35 independent genes, was used. The genes selected by the system included genes that are informative in the prognosis of Diffuse large B-cell lymphoma (DLBCL), such as genes showing an expression pattern similar to that of CD10 and BCL-6 or similar to that of IRF-4 and BCL-4. We classified 220 DLBCL patients into 5 groups using the prediction results of 9 FNN models. These groups may correspond to DLBCL subtypes. In group A containing half of the 220 patients, patients with poor outcome were found to satisfy 2 rules, i.e., high expression of MAX dimerization with high expression of unknown A (LC_26146), or high expression of MAX dimerization with low expression of unknown B (LC_33144). The present paper is the first to describe the multiple noninferior FNN modeling system. This system is a powerful tool for predicting outcome and classifying patients, and is applicable to other heterogeneous diseases.

  19. Predictive computation of genomic logic processing functions in embryonic development

    PubMed Central

    Peter, Isabelle S.; Faure, Emmanuel; Davidson, Eric H.

    2012-01-01

    Gene regulatory networks (GRNs) control the dynamic spatial patterns of regulatory gene expression in development. Thus, in principle, GRN models may provide system-level, causal explanations of developmental process. To test this assertion, we have transformed a relatively well-established GRN model into a predictive, dynamic Boolean computational model. This Boolean model computes spatial and temporal gene expression according to the regulatory logic and gene interactions specified in a GRN model for embryonic development in the sea urchin. Additional information input into the model included the progressive embryonic geometry and gene expression kinetics. The resulting model predicted gene expression patterns for a large number of individual regulatory genes each hour up to gastrulation (30 h) in four different spatial domains of the embryo. Direct comparison with experimental observations showed that the model predictively computed these patterns with remarkable spatial and temporal accuracy. In addition, we used this model to carry out in silico perturbations of regulatory functions and of embryonic spatial organization. The model computationally reproduced the altered developmental functions observed experimentally. Two major conclusions are that the starting GRN model contains sufficiently complete regulatory information to permit explanation of a complex developmental process of gene expression solely in terms of genomic regulatory code, and that the Boolean model provides a tool with which to test in silico regulatory circuitry and developmental perturbations. PMID:22927416
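
    The abstract above describes turning a GRN into a Boolean model that computes gene expression step by step and supports in silico perturbations. The following toy sketch illustrates that general mechanism only. The three genes and their rules are hypothetical, and it models neither the sea urchin network nor the embryonic geometry and kinetics that the authors included.

```python
def step(state, rules):
    """Synchronous update: every gene's next value is computed from the current state."""
    return {gene: bool(rule(state)) for gene, rule in rules.items()}


def simulate(initial, rules, n_steps):
    """Return the list of states from time 0 to n_steps."""
    trajectory = [dict(initial)]
    for _ in range(n_steps):
        trajectory.append(step(trajectory[-1], rules))
    return trajectory


# Hypothetical regulatory logic: an activator switches on both a target gene
# and a repressor of that target, so the target is expressed only transiently.
RULES = {
    "activator": lambda s: s["activator"],
    "repressor": lambda s: s["activator"],
    "target": lambda s: s["activator"] and not s["repressor"],
}

START = {"activator": True, "repressor": False, "target": False}

wild_type = simulate(START, RULES, 3)

# In silico perturbation: a repressor knockout is modelled by forcing its rule to False.
KNOCKOUT_RULES = dict(RULES, repressor=lambda s: False)
knockout = simulate(START, KNOCKOUT_RULES, 3)
```

    With these assumed rules the target gene is on for a single step in the unperturbed run and stays on in the knockout run, the kind of altered pattern one would then compare against an experimental perturbation.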

  20. CisMapper: predicting regulatory interactions from transcription factor ChIP-seq data

    PubMed Central

    O'Connor, Timothy; Bodén, Mikael

    2017-01-01

    Identifying the genomic regions and regulatory factors that control the transcription of genes is an important, unsolved problem. The current method of choice predicts transcription factor (TF) binding sites using chromatin immunoprecipitation followed by sequencing (ChIP-seq), and then links the binding sites to putative target genes solely on the basis of the genomic distance between them. Evidence from chromatin conformation capture experiments shows that this approach is inadequate due to long-distance regulation via chromatin looping. We present CisMapper, which predicts the regulatory targets of a TF using the correlation between a histone mark at the TF's bound sites and the expression of each gene across a panel of tissues. Using both chromatin conformation capture and differential expression data, we show that CisMapper is more accurate at predicting the target genes of a TF than the distance-based approaches currently used, and is particularly advantageous for predicting the long-range regulatory interactions typical of tissue-specific gene expression. CisMapper also predicts which TF binding sites regulate a given gene more accurately than using genomic distance. Unlike distance-based methods, CisMapper can predict which transcription start site of a gene is regulated by a particular binding site of the TF. PMID:28204599

  1. Gene expression markers in circulating tumor cells may predict bone metastasis and response to hormonal treatment in breast cancer

    PubMed Central

    Wang, Haiying; Molina, Julian; Jiang, John; Ferber, Matthew; Pruthi, Sandhya; Jatkoe, Timothy; Derecho, Carlo; Rajpurohit, Yashoda; Zheng, Jian; Wang, Yixin

    2013-01-01

    Circulating tumor cells (CTCs) have recently attracted attention due to their potential as prognostic and predictive markers for the clinical management of metastatic breast cancer patients. The isolation of CTCs from patients may enable the molecular characterization of these cells, which may help establish a minimally invasive assay for the prediction of metastasis and further optimization of treatment. Molecular markers of proven clinical value may therefore be useful in predicting disease aggressiveness and response to treatment. In our earlier study, we identified a gene signature in breast cancer that appears to be significantly associated with bone metastasis. Among the genes that constitute this signature, trefoil factor 1 (TFF1) was identified as the most differentially expressed gene associated with bone metastasis. In this study, we investigated 25 candidate gene markers in the CTCs of metastatic breast cancer patients with different metastatic sites. The panel of the 25 markers was investigated in 80 baseline samples (first blood draw of CTCs) and 30 follow-up samples. In addition, 40 healthy blood donors (HBDs) were analyzed as controls. The assay was performed using quantitative reverse transcriptase polymerase chain reaction (qRT-PCR) with RNA extracted from CTCs captured by the CellSearch system. Our study indicated that 12 of the genes were uniquely expressed in CTCs and 10 were highly expressed in the CTCs obtained from patients compared to those obtained from HBDs. Among these genes, the expression of keratin 19 was highly correlated with the CTC count. The TFF1 expression in CTCs was a strong predictor of bone metastasis and the patients with a high expression of estrogen receptor β in CTCs exhibited a better response to hormonal treatment. Molecular characterization of these genes in CTCs may provide a better understanding of the mechanism underlying tumor metastasis and identify gene markers in CTCs for predicting disease progression and response to treatment. PMID:24649289

  2. Gene expression markers in circulating tumor cells may predict bone metastasis and response to hormonal treatment in breast cancer.

    PubMed

    Wang, Haiying; Molina, Julian; Jiang, John; Ferber, Matthew; Pruthi, Sandhya; Jatkoe, Timothy; Derecho, Carlo; Rajpurohit, Yashoda; Zheng, Jian; Wang, Yixin

    2013-11-01

    Circulating tumor cells (CTCs) have recently attracted attention due to their potential as prognostic and predictive markers for the clinical management of metastatic breast cancer patients. The isolation of CTCs from patients may enable the molecular characterization of these cells, which may help establish a minimally invasive assay for the prediction of metastasis and further optimization of treatment. Molecular markers of proven clinical value may therefore be useful in predicting disease aggressiveness and response to treatment. In our earlier study, we identified a gene signature in breast cancer that appears to be significantly associated with bone metastasis. Among the genes that constitute this signature, trefoil factor 1 (TFF1) was identified as the most differentially expressed gene associated with bone metastasis. In this study, we investigated 25 candidate gene markers in the CTCs of metastatic breast cancer patients with different metastatic sites. The panel of the 25 markers was investigated in 80 baseline samples (first blood draw of CTCs) and 30 follow-up samples. In addition, 40 healthy blood donors (HBDs) were analyzed as controls. The assay was performed using quantitative reverse transcriptase polymerase chain reaction (qRT-PCR) with RNA extracted from CTCs captured by the CellSearch system. Our study indicated that 12 of the genes were uniquely expressed in CTCs and 10 were highly expressed in the CTCs obtained from patients compared to those obtained from HBDs. Among these genes, the expression of keratin 19 was highly correlated with the CTC count. The TFF1 expression in CTCs was a strong predictor of bone metastasis and the patients with a high expression of estrogen receptor β in CTCs exhibited a better response to hormonal treatment. Molecular characterization of these genes in CTCs may provide a better understanding of the mechanism underlying tumor metastasis and identify gene markers in CTCs for predicting disease progression and response to treatment.

  3. Identifying gnostic predictors of the vaccine response.

    PubMed

    Haining, W Nicholas; Pulendran, Bali

    2012-06-01

    Molecular predictors of the response to vaccination could transform vaccine development. They would allow larger numbers of vaccine candidates to be rapidly screened, shortening the development time for new vaccines. Gene-expression based predictors of vaccine response have shown early promise. However, a limitation of gene-expression based predictors is that they often fail to reveal the mechanistic basis of their ability to classify response. Linking predictive signatures to the function of their component genes would advance basic understanding of vaccine immunity and also improve the robustness of vaccine prediction. New analytic tools now allow more biological meaning to be extracted from predictive signatures. Functional genomic approaches to perturb gene expression in mammalian cells permit the function of predictive genes to be surveyed in highly parallel experiments. The challenge for vaccinologists is therefore to use these tools to embed mechanistic insights into predictors of vaccine response. Copyright © 2012 Elsevier Ltd. All rights reserved.

  4. Inferring gene expression from ribosomal promoter sequences, a crowdsourcing approach

    PubMed Central

    Meyer, Pablo; Siwo, Geoffrey; Zeevi, Danny; Sharon, Eilon; Norel, Raquel; Segal, Eran; Stolovitzky, Gustavo; Siwo, Geoffrey; Rider, Andrew K.; Tan, Asako; Pinapati, Richard S.; Emrich, Scott; Chawla, Nitesh; Ferdig, Michael T.; Tung, Yi-An; Chen, Yong-Syuan; Chen, Mei-Ju May; Chen, Chien-Yu; Knight, Jason M.; Sahraeian, Sayed Mohammad Ebrahim; Esfahani, Mohammad Shahrokh; Dreos, Rene; Bucher, Philipp; Maier, Ezekiel; Saeys, Yvan; Szczurek, Ewa; Myšičková, Alena; Vingron, Martin; Klein, Holger; Kiełbasa, Szymon M.; Knisley, Jeff; Bonnell, Jeff; Knisley, Debra; Kursa, Miron B.; Rudnicki, Witold R.; Bhattacharjee, Madhuchhanda; Sillanpää, Mikko J.; Yeung, James; Meysman, Pieter; Rodríguez, Aminael Sánchez; Engelen, Kristof; Marchal, Kathleen; Huang, Yezhou; Mordelet, Fantine; Hartemink, Alexander; Pinello, Luca; Yuan, Guo-Cheng

    2013-01-01

    The Gene Promoter Expression Prediction challenge consisted of predicting gene expression from promoter sequences in a previously unknown experimentally generated data set. The challenge was presented to the community in the framework of the sixth Dialogue for Reverse Engineering Assessments and Methods (DREAM6), a community effort to evaluate the status of systems biology modeling methodologies. Nucleotide-specific promoter activity was obtained by measuring fluorescence from promoter sequences fused upstream of a gene for yellow fluorescence protein and inserted in the same genomic site of yeast Saccharomyces cerevisiae. Twenty-one teams submitted results predicting the expression levels of 53 different promoters from yeast ribosomal protein genes. Analysis of participant predictions shows that accurate values for low-expressed and mutated promoters were difficult to obtain, although in the latter case, only when the mutation induced a large change in promoter activity compared to the wild-type sequence. As in previous DREAM challenges, we found that aggregation of participant predictions provided robust results, but did not fare better than the three best algorithms. Finally, this study not only provides a benchmark for the assessment of methods predicting activity of a specific set of promoters from their sequence, but it also shows that the top performing algorithm, which used machine-learning approaches, can be improved by the addition of biological features such as transcription factor binding sites. PMID:23950146

  5. Bone Metastasis in Advanced Breast Cancer: Analysis of Gene Expression Microarray.

    PubMed

    Cosphiadi, Irawan; Atmakusumah, Tubagus D; Siregar, Nurjati C; Muthalib, Abdul; Harahap, Alida; Mansyur, Muchtarruddin

    2018-03-08

    Approximately 30% to 40% of breast cancer recurrences involve bone metastasis (BM). Certain genes have been linked to BM; however, none have been able to predict bone involvement. In this study, we analyzed gene expression profiles in advanced breast cancer patients to elucidate genes that can be used to predict BM. A total of 92 advanced breast cancer patients, including 46 patients with BM and 46 patients without BM, were identified for this study. Immunohistochemistry and gene expression analysis was performed on 81 formalin-fixed paraffin-embedded samples. Data were collected through medical records, and gene expression of 200 selected genes compiled from 6 previous studies was performed using NanoString nCounter. Genetic expression profiles showed that 22 genes were significantly differentially expressed between breast cancer patients with metastasis in bone and other organs (BM+) and non-BM, whereas subjects with only BM showed 17 significantly differentially expressed genes. The following genes were associated with an increasing incidence of BM in the BM+ group: estrogen receptor 1 (ESR1), GATA binding protein 3 (GATA3), and melanophilin with an area under the curve (AUC) of 0.804. In the BM group, the following genes were associated with an increasing incidence of BM: ESR1, progesterone receptor, B-cell lymphoma 2, Rab escort protein, N-acetyltransferase 1, GATA3, annexin A9, and chromosome 9 open reading frame 116. ESR1 and GATA3 showed an increased strength of association with an AUC of 0.928. A combination of the identified 3 genes in BM+ and 8 genes in BM showed better prediction than did each individual gene, and this combination can be used as a training set. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  6. Global transcriptional regulatory network for Escherichia coli robustly connects gene expression to transcription factor activities

    PubMed Central

    Fang, Xin; Sastry, Anand; Mih, Nathan; Kim, Donghyuk; Tan, Justin; Lloyd, Colton J.; Gao, Ye; Yang, Laurence; Palsson, Bernhard O.

    2017-01-01

    Transcriptional regulatory networks (TRNs) have been studied intensely for >25 y. Yet, even for the Escherichia coli TRN—probably the best characterized TRN—several questions remain. Here, we address three questions: (i) How complete is our knowledge of the E. coli TRN; (ii) how well can we predict gene expression using this TRN; and (iii) how robust is our understanding of the TRN? First, we reconstructed a high-confidence TRN (hiTRN) consisting of 147 transcription factors (TFs) regulating 1,538 transcription units (TUs) encoding 1,764 genes. The 3,797 high-confidence regulatory interactions were collected from published, validated chromatin immunoprecipitation (ChIP) data and RegulonDB. For 21 different TF knockouts, up to 63% of the differentially expressed genes in the hiTRN were traced to the knocked-out TF through regulatory cascades. Second, we trained supervised machine learning algorithms to predict the expression of 1,364 TUs given TF activities using 441 samples. The algorithms accurately predicted condition-specific expression for 86% (1,174 of 1,364) of the TUs, while 193 TUs (14%) were predicted better than random TRNs. Third, we identified 10 regulatory modules whose definitions were robust against changes to the TRN or expression compendium. Using surrogate variable analysis, we also identified three unmodeled factors that systematically influenced gene expression. Our computational workflow comprehensively characterizes the predictive capabilities and systems-level functions of an organism’s TRN from disparate data types. PMID:28874552

  7. Early gene expression profiles of patients with chronic hepatitis C treated with pegylated interferon-alfa and ribavirin.

    PubMed

    Younossi, Zobair M; Baranova, Ancha; Afendy, Arian; Collantes, Rochelle; Stepanova, Maria; Manyam, Ganiraju; Bakshi, Anita; Sigua, Christopher L; Chan, Joanne P; Iverson, Ayuko A; Santini, Christopher D; Chang, Sheng-Yung P

    2009-03-01

    Responsiveness to hepatitis C virus (HCV) therapy depends on viral and host factors. Our aim was to assess sustained virologic response (SVR)-associated early gene expression in patients with HCV receiving pegylated interferon-alpha2a (PEG-IFN-alpha2a) or PEG-IFN-alpha2b and ribavirin with the duration based on genotypes. Blood samples were collected into PAXgene tubes prior to treatment as well as 1, 7, 28, and 56 days after treatment. From the peripheral blood cells, total RNA was extracted, quantified, and used for one-step reverse transcription polymerase chain reaction to profile 154 messenger RNAs. Expression levels of messenger RNAs were normalized with six "housekeeping" genes and a reference RNA. Multiple regression and stepwise selection were performed to assess differences in gene expression at different time points, and predictive performance was evaluated for each model. A total of 68 patients were enrolled in the study and treated with combination therapy. The results of gene expression showed that SVR could be predicted by the gene expression of signal transducer and activator of transcription-6 (STAT-6) and suppressor of cytokine signaling-1 in the pretreatment samples. After 24 hours, SVR was predicted by the expression of interferon-dependent genes, and this dependence continued to be prominent throughout the treatment. Early gene expression during anti-HCV therapy may elucidate important molecular pathways that may be influencing the probability of achieving virologic response.

  8. A Novel Method to Predict Highly Expressed Genes Based on Radius Clustering and Relative Synonymous Codon Usage.

    PubMed

    Tran, Tuan-Anh; Vo, Nam Tri; Nguyen, Hoang Duc; Pham, Bao The

    2015-12-01

    Recombinant proteins play an important role in many aspects of life and have generated a huge income, notably in the industrial enzyme business. A gene is introduced into a vector and expressed in a host organism, for example E. coli, to obtain a high yield of the target protein. However, genes transferred from other organisms are often incompatible with the host's expression system for various reasons, for example, codon usage bias, GC content, repetitive sequences, and secondary structure. The solution is to develop programs that design an optimized nucleotide sequence from a peptide sequence using properties of the highly expressed genes (HEGs) of the host organism. Existing HEG data, determined by experimental and computer-based methods, are insufficient both qualitatively and quantitatively. Therefore, a new HEG prediction method is needed. We proposed a new method for predicting HEGs and criteria to evaluate gene optimization. Codon usage bias was weighted by amplifying the difference between HEGs and non-highly expressed genes (non-HEGs). The number of predicted HEGs is 5% of the genome. Compared with Puigbò's method, the proposed method performs twice as well in kernel ratio and kernel sensitivity. For transcription/translation factor proteins (TF), the proposed method gives low TF sensitivity, whereas Puigbò's method gives moderate sensitivity. In summary, the results indicate that the proposed method is a good option for predicting optimized genes for particular organisms, and we generated an HEG database for further research in gene design.

  9. Ion channel gene expression predicts survival in glioma patients

    PubMed Central

    Wang, Rong; Gurguis, Christopher I.; Gu, Wanjun; Ko, Eun A; Lim, Inja; Bang, Hyoweon; Zhou, Tong; Ko, Jae-Hong

    2015-01-01

    Ion channels are important regulators in cell proliferation, migration, and apoptosis. The malfunction and/or aberrant expression of ion channels may disrupt these important biological processes and influence cancer progression. In this study, we investigate the expression pattern of ion channel genes in glioma. We designate 18 ion channel genes that are differentially expressed in high-grade glioma as a prognostic molecular signature. This ion channel gene expression based signature predicts glioma outcome in three independent validation cohorts. Interestingly, 16 of these 18 genes were down-regulated in high-grade glioma. This signature is independent of traditional clinical, molecular, and histological factors. Resampling tests indicate that the prognostic power of the signature outperforms random gene sets selected from human genome in all the validation cohorts. More importantly, this signature performs better than the random gene signatures selected from glioma-associated genes in two out of three validation datasets. This study implicates ion channels in brain cancer, thus expanding on knowledge of their roles in other cancers. Individualized profiling of ion channel gene expression serves as a superior and independent prognostic tool for glioma patients. PMID:26235283

  10. Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes.

    PubMed

    Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko

    2012-07-15

    Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into the evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E < 10^-5) are included in 27 clusters. Five clusters are associated with metabolism, containing P450 genes restricted to the Brassica family and predicted to be involved in secondary metabolism. Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system. Copyright © 2012 Elsevier B.V. All rights reserved.
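
    The abstract above compares the Pearson correlation of neighboring genes with that of randomly selected gene pairs. The sketch below illustrates that comparison in its simplest form and is not the authors' procedure: it uses an empirical p-value against random pairs rather than their FDR-based method, and the function names and the tiny expression matrix are assumptions for illustration. Rows are genes in chromosomal order and columns are samples.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (rows must not be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def neighbor_correlations(expr):
    """Correlation between each gene and the next gene along the chromosome."""
    return [pearson(expr[i], expr[i + 1]) for i in range(len(expr) - 1)]


def empirical_p_values(expr, n_random=1000, seed=0):
    """Fraction of random gene pairs at least as correlated as each neighbor pair."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_random):
        i, j = rng.sample(range(len(expr)), 2)  # two distinct genes
        null.append(pearson(expr[i], expr[j]))
    # The +1 terms keep the estimate strictly positive.
    return [(sum(r0 >= r for r0 in null) + 1) / (len(null) + 1)
            for r in neighbor_correlations(expr)]


# Hypothetical expression matrix: 3 genes x 4 samples.
toy_expr = [[1, 2, 3, 4],
            [2, 4, 6, 8],
            [4, 1, 3, 2]]
toy_r = neighbor_correlations(toy_expr)
toy_p = empirical_p_values(toy_expr, n_random=200)
```

    In a genome-scale analysis the resulting p-values would still need a multiple-testing correction, such as the FDR control mentioned in the abstract, before adjacent significant pairs are merged into candidate clusters.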

  11. Regulators of gene expression as biomarkers for prostate cancer

    PubMed Central

    Willard, Stacey S; Koochekpour, Shahriar

    2012-01-01

    Recent technological advancements in gene expression analysis have led to the discovery of a promising new group of prostate cancer (PCa) biomarkers that have the potential to influence diagnosis and the prediction of disease severity. The accumulation of deleterious changes in gene expression is a fundamental mechanism of prostate carcinogenesis. Aberrant gene expression can arise from changes in epigenetic regulation or mutation in the genome affecting either key regulatory elements or gene sequences themselves. At the epigenetic level, a myriad of abnormal histone modifications and changes in DNA methylation are found in PCa patients. In addition, many mutations in the genome have been associated with higher PCa risk. Finally, over- or underexpression of key genes involved in cell cycle regulation, apoptosis, cell adhesion and regulation of transcription has been observed. An interesting group of biomarkers are emerging from these studies which may prove more predictive than the standard prostate specific antigen (PSA) serum test. In this review, we discuss recent results in the field of gene expression analysis in PCa including the most promising biomarkers in the areas of epigenetics, genomics and the transcriptome, some of which are currently under investigation as clinical tests for early detection and better prognostic prediction of PCa. PMID:23226612

  12. Network regularised Cox regression and multiplex network models to predict disease comorbidities and survival of cancer.

    PubMed

    Xu, Haoming; Moni, Mohammad Ali; Liò, Pietro

    2015-12-01

    In cancer genomics, gene expression levels provide important molecular signatures for all types of cancer, and this could be very useful for predicting the survival of cancer patients. However, the main challenge of gene expression data analysis is high dimensionality: microarray data are characterised by a small number of samples and a large number of genes. To overcome this problem, a variety of penalised Cox proportional hazard models have been proposed. We introduce a novel network regularised Cox proportional hazard model and a novel multiplex network model to measure disease comorbidities and to predict the survival of cancer patients. Our methods are applied to analyse seven microarray cancer gene expression datasets: breast cancer, ovarian cancer, lung cancer, liver cancer, renal cancer and osteosarcoma. Firstly, we applied a principal component analysis to reduce the dimensionality of the original gene expression data. Secondly, we applied a network regularised Cox regression model to the reduced gene expression datasets. By using a normalised mutual information method and the multiplex network model, we predict the comorbidities for liver cancer based on the integration of a diverse set of omics and clinical data, and we find the diseasome associations (disease-gene associations) among different cancers based on the identified common significant genes. Finally, we evaluated the precision of the approach with respect to the accuracy of survival prediction using ROC curves. We report that colon cancer, liver cancer and renal cancer share the CXCL5 gene, and breast cancer, ovarian cancer and renal cancer share the CCND2 gene. Our methods are useful for predicting patient survival and disease comorbidities more accurately and are helpful for improving the care of patients with comorbidity. Software in Matlab and R is available on our GitHub page: https://github.com/ssnhcom/NetworkRegularisedCox.git. Copyright © 2015. Published by Elsevier Ltd.

  13. Muscle myeloid type I interferon gene expression may predict therapeutic responses to rituximab in myositis patients

    PubMed Central

    Nagaraju, Kanneboyina; Ghimbovschi, Svetlana; Rayavarapu, Sree; Phadke, Aditi; Rider, Lisa G.; Hoffman, Eric P.

    2016-01-01

    Objective. To identify muscle gene expression patterns that predict rituximab responses and assess the effects of rituximab on muscle gene expression in PM and DM. Methods. In an attempt to understand the molecular mechanism of response and non-response to rituximab therapy, we performed Affymetrix gene expression array analyses on muscle biopsy specimens taken before and after rituximab therapy from eight PM and two DM patients in the Rituximab in Myositis study. We also analysed selected muscle-infiltrating cell phenotypes in these biopsies by immunohistochemical staining. Partek and Ingenuity pathway analyses assessed the gene pathways and networks. Results. Myeloid type I IFN signature genes were expressed at higher levels at baseline in the skeletal muscle of rituximab responders than in non-responders, whereas classic non-myeloid IFN signature genes were expressed at higher levels in non-responders at baseline. Also, rituximab responders have a greater reduction of the myeloid and non-myeloid type I IFN signatures than non-responders. The decrease in the type I IFN signature following administration of rituximab may be associated with the decreases in muscle-infiltrating CD19+ B cells and CD68+ macrophages in responders. Conclusion. Our findings suggest that high levels of myeloid type I IFN gene expression in skeletal muscle predict responses to rituximab in PM/DM and that rituximab responders also have a greater decrease in the expression of these genes. These data add further evidence to recent studies defining the type I IFN signature as both a predictor of therapeutic responses and a biomarker of myositis disease activity. PMID:27215813

  14. Rrp1b, a New Candidate Susceptibility Gene for Breast Cancer Progression and Metastasis

    PubMed Central

    Crawford, Nigel P. S; Qian, Xiaolan; Ziogas, Argyrios; Papageorge, Alex G; Boersma, Brenda J; Walker, Renard C; Lukes, Luanne; Rowe, William L; Zhang, Jinghui; Ambs, Stefan; Lowy, Douglas R; Anton-Culver, Hoda; Hunter, Kent W

    2007-01-01

    A novel candidate metastasis modifier, ribosomal RNA processing 1 homolog B (Rrp1b), was identified through two independent approaches. First, yeast two-hybrid, immunoprecipitation, and functional assays demonstrated a physical and functional interaction between Rrp1b and the previously identified metastasis modifier Sipa1. In parallel, using mouse and human metastasis gene expression data, it was observed that extracellular matrix (ECM) genes are common components of metastasis predictive signatures, suggesting that ECM genes are either important markers or causal factors in metastasis. To investigate the relationship between ECM genes and poor prognosis in breast cancer, expression quantitative trait locus analysis of polyoma middle-T transgene-induced mammary tumors was performed. ECM gene expression was found to be consistently associated with Rrp1b expression. In vitro expression of Rrp1b significantly altered ECM gene expression, tumor growth, and dissemination in metastasis assays. Furthermore, a gene signature induced by ectopic expression of Rrp1b in tumor cells predicted survival in a human breast cancer gene expression dataset. Finally, constitutional polymorphism within RRP1B was found to be significantly associated with tumor progression in two independent breast cancer cohorts. These data suggest that RRP1B may be a novel susceptibility gene for breast cancer progression and metastasis. PMID:18081427

  15. Artificial neural network classifier predicts neuroblastoma patients' outcome.

    PubMed

    Cangelosi, Davide; Pelassa, Simone; Morini, Martina; Conte, Massimo; Bosco, Maria Carla; Eva, Alessandra; Sementa, Angela Rita; Varesio, Luigi

    2016-11-08

    More than fifty percent of neuroblastoma (NB) patients with adverse prognosis do not benefit from treatment, making the identification of new potential targets mandatory. Hypoxia is a condition of low oxygen tension, occurring in poorly vascularized tissues, which activates specific genes and contributes to the acquisition of an aggressive tumor phenotype. We defined a gene expression signature (NB-hypo), which measures the hypoxic status of the neuroblastoma tumor. We aimed at developing a classifier predicting neuroblastoma patients' outcome based on the assessment of the adverse effects of tumor hypoxia on the progression of the disease. A multi-layer perceptron (MLP) was trained on the expression values of the 62 probe sets constituting the NB-hypo signature to develop a predictive model for neuroblastoma patients' outcome. We utilized the expression data of 100 tumors in a leave-one-out analysis to select and construct the classifier and the expression data of the remaining 82 tumors to test the classifier performance in an external dataset. We utilized gene set enrichment analysis (GSEA) to evaluate the enrichment of hypoxia-related gene sets in patients predicted with "Poor" or "Good" outcome. We utilized the expression of the 62 probe sets of the NB-hypo signature in 182 neuroblastoma tumors to develop an MLP classifier predicting patients' outcome (NB-hypo classifier). We trained and validated the classifier in a leave-one-out cross-validation analysis on 100 tumor gene expression profiles. We externally tested the resulting NB-hypo classifier on an independent set of 82 tumors. The NB-hypo classifier predicted the patients' outcome with the remarkable accuracy of 87%. NB-hypo classifier prediction resulted in 2% classification error when applied to clinically defined low-intermediate risk neuroblastoma patients. The prediction was 100% accurate in assessing the death of five low/intermediate-risk patients. GSEA of the tumor gene expression profiles demonstrated the hypoxic status of the tumor in patients with poor prognosis. We developed a robust classifier predicting neuroblastoma patients' outcome with a very low error rate and we provided independent evidence that the poor outcome patients had hypoxic tumors, supporting the potential of using hypoxia as a target for neuroblastoma treatment.
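
    The leave-one-out scheme described in this record can be illustrated with a minimal sketch. It is not the authors' pipeline: a simple nearest-centroid rule stands in for the multi-layer perceptron, and the function names and toy expression values are hypothetical. Each sample is held out once, the model is fitted on the remaining samples, and the held-out sample is then predicted.

```python
def nearest_centroid_fit(X, y):
    """Compute one mean expression vector (centroid) per outcome label."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, value in enumerate(row):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def nearest_centroid_predict(centroids, row):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sqdist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the held-out sample."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = nearest_centroid_fit(X_train, y_train)
        if nearest_centroid_predict(model, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Toy data (hypothetical): two probe sets, six tumors, two outcome classes.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
y = ["good", "good", "good", "poor", "poor", "poor"]
loo_accuracy = leave_one_out_accuracy(X, y)
```

    In practice the classifier selected this way would still be checked on an independent set, as the record does with its 82 external tumors, because leave-one-out estimates alone can be optimistic after model selection.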

  16. Reverse-engineering the genetic circuitry of a cancer cell with predicted intervention in chronic lymphocytic leukemia.

    PubMed

    Vallat, Laurent; Kemper, Corey A; Jung, Nicolas; Maumy-Bertrand, Myriam; Bertrand, Frédéric; Meyer, Nicolas; Pocheville, Arnaud; Fisher, John W; Gribben, John G; Bahram, Seiamak

    2013-01-08

    Cellular behavior is sustained by genetic programs that are progressively disrupted in pathological conditions--notably, cancer. High-throughput gene expression profiling has been used to infer statistical models describing these cellular programs, and development is now needed to guide orientated modulation of these systems. Here we develop a regression-based model to reverse-engineer a temporal genetic program, based on relevant patterns of gene expression after cell stimulation. This method integrates the temporal dimension of biological rewiring of genetic programs and enables the prediction of the effect of targeted gene disruption at the system level. We tested the performance accuracy of this model on synthetic data before reverse-engineering the response of primary cancer cells to a proliferative (protumorigenic) stimulation in a multistate leukemia biological model (i.e., chronic lymphocytic leukemia). To validate the ability of our method to predict the effects of gene modulation on the global program, we performed an intervention experiment on a targeted gene. Comparison of the predicted and observed gene expression changes demonstrates the possibility of predicting the effects of a perturbation in a gene regulatory network, a first step toward an orientated intervention in a cancer cell genetic program.

  17. Determining Cutoff Point of Ensemble Trees Based on Sample Size in Predicting Clinical Dose with DNA Microarray Data.

    PubMed

    Yılmaz Isıkhan, Selen; Karabulut, Erdem; Alpar, Celal Reha

    2016-01-01

    Background/Aim. Evaluating the success of dose prediction based on genetic or clinical data has substantially advanced recently. The aim of this study is to predict various clinical dose values from DNA gene expression datasets using data mining techniques. Materials and Methods. Eleven real gene expression datasets containing dose values were included. First, important genes for dose prediction were selected using iterative sure independence screening. Then, the performances of regression trees (RTs), support vector regression (SVR), RT bagging, SVR bagging, and RT boosting were examined. Results. The results demonstrated that a regression-based feature selection method substantially reduced the number of irrelevant genes from raw datasets. Overall, the best prediction performance in nine of 11 datasets was achieved using SVR; the second most accurate performance was provided using a gradient-boosting machine (GBM). Conclusion. Analysis of various dose values based on microarray gene expression data identified common genes found in our study and the referenced studies. According to our findings, SVR and GBM can be good predictors of dose-gene datasets. Another result of the study was to identify the sample size of n = 25 as a cutoff point for RT bagging to outperform a single RT.

  18. A novel gene expression-based prognostic scoring system to predict survival in gastric cancer

    DOE PAGES

    Wang, Pin; Wang, Yunshan; Hang, Bo; ...

    2016-07-11

    Analysis of gene expression patterns in gastric cancer (GC) can help to identify a comprehensive panel of gene biomarkers for predicting clinical outcomes and to discover potential new therapeutic targets. Here, a multi-step bioinformatics analytic approach was developed to establish a novel prognostic scoring system for GC. We first identified 276 genes that were robustly differentially expressed between normal and GC tissues, of which 249 were found to be significantly associated with overall survival (OS) by univariate Cox regression analysis. The biological functions of the 249 genes are related to cell cycle, RNA/ncRNA process, acetylation and extracellular matrix organization. A network was generated to view the gene expression architecture of the 249 genes in 265 GCs. Finally, we applied a canonical discriminant analysis approach to identify a 53-gene signature, and a prognostic scoring system was established based on a canonical discriminant function of the 53 genes. The prognostic scores strongly predicted patients with GC to have either a poor or good OS. Our study raises the prospect that the practicality of GC patient prognosis can be assessed by this prognostic scoring system.

  19. A novel gene expression-based prognostic scoring system to predict survival in gastric cancer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Pin; Wang, Yunshan; Hang, Bo

    Analysis of gene expression patterns in gastric cancer (GC) can help to identify a comprehensive panel of gene biomarkers for predicting clinical outcomes and to discover potential new therapeutic targets. Here, a multi-step bioinformatics analytic approach was developed to establish a novel prognostic scoring system for GC. We first identified 276 genes that were robustly differentially expressed between normal and GC tissues, of which 249 were found to be significantly associated with overall survival (OS) by univariate Cox regression analysis. The biological functions of the 249 genes are related to cell cycle, RNA/ncRNA process, acetylation and extracellular matrix organization. A network was generated to view the gene expression architecture of the 249 genes in 265 GCs. Finally, we applied a canonical discriminant analysis approach to identify a 53-gene signature, and a prognostic scoring system was established based on a canonical discriminant function of the 53 genes. The prognostic scores strongly predicted patients with GC to have either a poor or good OS. Our study raises the prospect that the practicality of GC patient prognosis can be assessed by this prognostic scoring system.

  20. Prediction of the contact sensitizing potential of chemicals using analysis of gene expression changes in human THP-1 monocytes.

    PubMed

    Arkusz, Joanna; Stępnik, Maciej; Sobala, Wojciech; Dastych, Jarosław

    2010-11-10

    The aim of this study was to find differentially regulated genes in THP-1 monocytic cells exposed to sensitizers and nonsensitizers and to investigate if such genes could be reliable markers for an in vitro predictive method for the identification of skin sensitizing chemicals. Changes in expression of 35 genes in the THP-1 cell line following treatment with chemicals of different sensitizing potential (from nonsensitizers to extreme sensitizers) were assessed using real-time PCR. Verification of 13 candidate genes by testing a large number of chemicals (an additional 22 sensitizers and 8 nonsensitizers) revealed that prediction of contact sensitization potential was possible based on evaluation of changes in three genes: IL8, HMOX1 and PAIMP1. In total, changes in expression of these genes allowed correct detection of sensitization potential of 21 out of 27 (78%) test sensitizers. The gene expression levels inside potency groups varied and did not allow estimation of sensitization potency of test chemicals. Results of this study indicate that evaluation of changes in expression of proposed biomarkers in THP-1 cells could be a valuable model for preliminary screening of chemicals to discriminate an appreciable majority of sensitizers from nonsensitizers. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.

  1. Integrated analysis of gene expression and methylation profiles of 48 candidate genes in breast cancer patients.

    PubMed

    Li, Zibo; Heng, Jianfu; Yan, Jinhua; Guo, Xinwu; Tang, Lili; Chen, Ming; Peng, Limin; Wu, Yepeng; Wang, Shouman; Xiao, Zhi; Deng, Zhongping; Dai, Lizhong; Wang, Jun

    2016-11-01

    Gene-specific methylation and expression have shown biological and clinical importance for breast cancer diagnosis and prognosis. Integrated analysis of gene methylation and gene expression may identify genes associated with the biological mechanisms and clinical outcome of breast cancer and aid in clinical management. Using high-throughput microfluidic quantitative PCR, we analyzed the expression profiles of 48 candidate genes in 96 Chinese breast cancer patients and investigated their correlation with gene methylation and associations with breast cancer clinical parameters. Breast cancer-specific gene expression alteration was found in 25 genes with significant expression differences between paired tumor and normal tissues. A total of 9 genes (CCND2, EGFR, GSTP1, PGR, PTGS2, RECK, SOX17, TNFRSF10D, and WIF1) showed significant negative correlation between methylation and gene expression, which was validated in the TCGA database. A total of 23 genes (ACADL, APC, BRCA2, CADM1, CAV1, CCND2, CST6, EGFR, ESR2, GSTP1, ICAM5, NPY, PGR, PTGS2, RECK, RUNX3, SFRP1, SOX17, SYK, TGFBR2, TNFRSF10D, WIF1, and WRN) annotated with potential TFBSs in the promoter regions showed negative correlation between methylation and expression. In logistic regression analysis, 31 of the 48 genes showed improved performance in disease prediction when methylation and expression coefficients were combined. Our results demonstrated the complex correlation and the possible regulatory mechanisms between DNA methylation and gene expression. Integrated analysis of methylation and expression of candidate genes could improve performance in breast cancer prediction. These findings would contribute to molecular characterization and identification of biomarkers for potential clinical applications.
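
    A negative methylation-expression correlation of the kind reported in this record can be illustrated with a short sketch. The record does not state which correlation statistic was used, so Pearson's r is an assumption here, and the function name and per-sample values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for one gene across four samples:
# promoter methylation fraction and relative expression level.
methylation = [0.1, 0.4, 0.7, 0.9]
expression = [8.0, 6.0, 3.0, 1.0]
r = pearson_r(methylation, expression)  # strongly negative for these toy values
```

    A coefficient near -1 means that samples with higher methylation tend to show lower expression, which is the direction of association reported for the 9 and 23 genes listed above.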

  2. A whole blood gene expression-based signature for smoking status

    PubMed Central

    2012-01-01

    Background: Smoking is the leading cause of preventable death worldwide and has been shown to increase the risk of multiple diseases including coronary artery disease (CAD). We sought to identify genes whose levels of expression in whole blood correlate with self-reported smoking status. Methods: Microarrays were used to identify gene expression changes in whole blood which correlated with self-reported smoking status; a set of significant genes from the microarray analysis was validated by qRT-PCR in an independent set of subjects. Stepwise forward logistic regression was performed using the qRT-PCR data to create a predictive model whose performance was validated in an independent set of subjects and compared to cotinine, a nicotine metabolite. Results: Microarray analysis of whole blood RNA from 209 PREDICT subjects (41 current smokers, 4 quit ≤ 2 months, 64 quit > 2 months, 100 never smoked; NCT00500617) identified 4214 genes significantly correlated with self-reported smoking status. qRT-PCR was performed on 1,071 PREDICT subjects across 256 microarray genes significantly correlated with smoking or CAD. A five-gene (CLDND1, LRRN3, MUC1, GOPC, LEF1) predictive model, derived from the qRT-PCR data using stepwise forward logistic regression, had a cross-validated mean AUC of 0.93 (sensitivity=0.78; specificity=0.95), and was validated using 180 independent PREDICT subjects (AUC=0.82, CI 0.69-0.94; sensitivity=0.63; specificity=0.94). Plasma from the 180 validation subjects was used to assess levels of cotinine; a model using a threshold of 10 ng/ml cotinine resulted in an AUC of 0.89 (CI 0.81-0.97; sensitivity=0.81; specificity=0.97; kappa with expression model = 0.53). Conclusion: We have constructed and validated a whole blood gene expression score for the evaluation of smoking status, demonstrating that clinical and environmental factors contributing to cardiovascular disease risk can be assessed by gene expression. PMID:23210427
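
    The performance measures quoted in this record (AUC, sensitivity, specificity) can be computed as in the following sketch. It shows the standard definitions only; the function names, scores and the 0.5 threshold are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP). Labels are 0 or 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model scores for four subjects (1 = current smoker, 0 = non-smoker).
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
```

    AUC summarises ranking quality across all thresholds, whereas sensitivity and specificity describe a single threshold, which is why the record reports both for the expression model and for the 10 ng/ml cotinine cutoff.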

  3. Comparison of RNA-seq and microarray-based models for clinical endpoint prediction.

    PubMed

    Zhang, Wenqian; Yu, Ying; Hertwig, Falk; Thierry-Mieg, Jean; Zhang, Wenwei; Thierry-Mieg, Danielle; Wang, Jian; Furlanello, Cesare; Devanarayan, Viswanath; Cheng, Jie; Deng, Youping; Hero, Barbara; Hong, Huixiao; Jia, Meiwen; Li, Li; Lin, Simon M; Nikolsky, Yuri; Oberthuer, André; Qing, Tao; Su, Zhenqiang; Volland, Ruth; Wang, Charles; Wang, May D; Ai, Junmei; Albanese, Davide; Asgharzadeh, Shahab; Avigad, Smadar; Bao, Wenjun; Bessarabova, Marina; Brilliant, Murray H; Brors, Benedikt; Chierici, Marco; Chu, Tzu-Ming; Zhang, Jibin; Grundy, Richard G; He, Min Max; Hebbring, Scott; Kaufman, Howard L; Lababidi, Samir; Lancashire, Lee J; Li, Yan; Lu, Xin X; Luo, Heng; Ma, Xiwen; Ning, Baitang; Noguera, Rosa; Peifer, Martin; Phan, John H; Roels, Frederik; Rosswog, Carolina; Shao, Susan; Shen, Jie; Theissen, Jessica; Tonini, Gian Paolo; Vandesompele, Jo; Wu, Po-Yen; Xiao, Wenzhong; Xu, Joshua; Xu, Weihong; Xuan, Jiekun; Yang, Yong; Ye, Zhan; Dong, Zirui; Zhang, Ke K; Yin, Ye; Zhao, Chen; Zheng, Yuanting; Wolfinger, Russell D; Shi, Tieliu; Malkas, Linda H; Berthold, Frank; Wang, Jun; Tong, Weida; Shi, Leming; Peng, Zhiyu; Fischer, Matthias

    2015-06-25

    Gene expression profiling is being widely applied in cancer research to identify biomarkers for clinical endpoint prediction. Since RNA-seq provides a powerful tool for transcriptome-based applications beyond the limitations of microarrays, we sought to systematically evaluate the performance of RNA-seq-based and microarray-based classifiers in this MAQC-III/SEQC study for clinical endpoint prediction using neuroblastoma as a model. We generate gene expression profiles from 498 primary neuroblastomas using both RNA-seq and 44 k microarrays. Characterization of the neuroblastoma transcriptome by RNA-seq reveals that more than 48,000 genes and 200,000 transcripts are being expressed in this malignancy. We also find that RNA-seq provides much more detailed information on specific transcript expression patterns in clinico-genetic neuroblastoma subgroups than microarrays. To systematically compare the power of RNA-seq and microarray-based models in predicting clinical endpoints, we divide the cohort randomly into training and validation sets and develop 360 predictive models on six clinical endpoints of varying predictability. Evaluation of factors potentially affecting model performances reveals that prediction accuracies are most strongly influenced by the nature of the clinical endpoint, whereas technological platforms (RNA-seq vs. microarrays), RNA-seq data analysis pipelines, and feature levels (gene vs. transcript vs. exon-junction level) do not significantly affect performances of the models. We demonstrate that RNA-seq outperforms microarrays in determining the transcriptomic characteristics of cancer, while RNA-seq and microarray-based models perform similarly in clinical endpoint prediction. Our findings may be valuable to guide future studies on the development of gene expression-based predictive models and their implementation in clinical practice.

  4. Computational Prediction and Validation of BAHD1 as a Novel Molecule for Ulcerative Colitis

    NASA Astrophysics Data System (ADS)

    Zhu, Huatuo; Wan, Xingyong; Li, Jing; Han, Lu; Bo, Xiaochen; Chen, Wenguo; Lu, Chao; Shen, Zhe; Xu, Chenfu; Chen, Lihua; Yu, Chaohui; Xu, Guoqiang

    2015-07-01

    Ulcerative colitis (UC) is a common inflammatory bowel disease (IBD) producing intestinal inflammation and tissue damage. The precise aetiology of UC remains unknown. In this study, we applied a rank-based expression profile comparative algorithm, gene set enrichment analysis (GSEA), to evaluate the expression profiles of UC patients and small interfering RNA (siRNA)-perturbed cells to predict proteins that might be essential in UC from publicly available expression profiles. We used quantitative PCR (qPCR) to characterize the expression levels of those genes predicted to be the most important for UC in dextran sodium sulphate (DSS)-induced colitic mice. We found that bromo-adjacent homology domain (BAHD1), a novel heterochromatinization factor in vertebrates, was the most downregulated gene. We further validated a potential role of BAHD1 as a regulatory factor for inflammation through the TNF signalling pathway in vitro. Our findings indicate that computational approaches leveraging public gene expression data can be used to infer potential genes or proteins for diseases, and BAHD1 might act as an indispensable factor in regulating the cellular inflammatory response in UC.

  5. Sherlock: Detecting Gene-Disease Associations by Matching Patterns of Expression QTL and GWAS

    PubMed Central

    He, Xin; Fuller, Chris K.; Song, Yi; Meng, Qingying; Zhang, Bin; Yang, Xia; Li, Hao

    2013-01-01

    Genetic mapping of complex diseases to date depends on variations inside or close to the genes that perturb their activities. A strong body of evidence suggests that changes in gene expression play a key role in complex diseases and that numerous loci perturb gene expression in trans. The information in trans variants, however, has largely been ignored in the current analysis paradigm. Here we present a statistical framework for genetic mapping by utilizing collective information in both cis and trans variants. We reason that for a disease-associated gene, any genetic variation that perturbs its expression is also likely to influence the disease risk. Thus, the expression quantitative trait loci (eQTL) of the gene, which constitute a unique “genetic signature,” should overlap significantly with the set of loci associated with the disease. We translate this idea into a computational algorithm (named Sherlock) to search for gene-disease associations from GWASs, taking advantage of independent eQTL data. Application of this strategy to Crohn disease and type 2 diabetes predicts a number of genes with possible disease roles, including several predictions supported by solid experimental evidence. Importantly, predicted genes are often implicated by multiple trans eQTL with moderate associations. These genes are far from any GWAS association signals and thus cannot be identified from the GWAS alone. Our approach allows analysis of association data from a new perspective and is applicable to any complex phenotype. It is readily generalizable to molecular traits other than gene expression, such as metabolites, noncoding RNAs, and epigenetic modifications. PMID:23643380

  6. Transcriptomic analysis in the developing zebrafish embryo after compound exposure: Individual gene expression and pathway regulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hermsen, Sanne A.B., E-mail: Sanne.Hermsen@rivm.nl; Department of Toxicogenomics, Maastricht University, P.O. Box 616, 6200 MD, Maastricht; Institute for Risk Assessment Sciences

    2013-10-01

    The zebrafish embryotoxicity test is a promising alternative assay for developmental toxicity. Classically, morphological assessment of the embryos is applied to evaluate the effects of compound exposure. However, by applying differential gene expression analysis the sensitivity and predictability of the test may be increased. For defining gene expression signatures of developmental toxicity, we explored the possibility of using gene expression signatures of compound exposures based on commonly expressed individual genes as well as based on regulated gene pathways. Four developmental toxic compounds were tested in concentration-response design, caffeine, carbamazepine, retinoic acid and valproic acid, and two non-embryotoxic compounds, D-mannitol and saccharin, were included. With transcriptomic analyses we were able to identify commonly expressed genes, which were mostly development related, after exposure to the embryotoxicants. We also identified gene pathways regulated by the embryotoxicants, suggestive of their modes of action. Furthermore, whereas pathways may be regulated by all compounds, individual gene expression within these pathways can differ for each compound. Overall, the present study suggests that the use of individual gene expression signatures as well as pathway regulation may be useful starting points for defining gene biomarkers for predicting embryotoxicity. - Highlights: • The zebrafish embryotoxicity test in combination with transcriptomics was used. • We explored two approaches of defining gene biomarkers for developmental toxicity. • Four compounds in concentration-response design were tested. • We identified commonly expressed individual genes as well as regulated gene pathways. • Both approaches seem suitable starting points for defining gene biomarkers.

  7. Sig2GRN: a software tool linking signaling pathway with gene regulatory network for dynamic simulation.

    PubMed

    Zhang, Fan; Liu, Runsheng; Zheng, Jie

    2016-12-23

    Linking computational models of signaling pathways to predicted cellular responses such as gene expression regulation is a major challenge in computational systems biology. In this work, we present Sig2GRN, a Cytoscape plugin that is able to simulate time-course gene expression data given the user-defined external stimuli to the signaling pathways. A generalized logical model is used in modeling the upstream signaling pathways. Then a Boolean model and a thermodynamics-based model are employed to predict the downstream changes in gene expression based on the simulated dynamics of transcription factors in signaling pathways. Our empirical case studies show that the simulation of Sig2GRN can predict changes in gene expression patterns induced by DNA damage signals and drug treatments. As a software tool for modeling cellular dynamics, Sig2GRN can facilitate studies in systems biology by hypotheses generation and wet-lab experimental design. http://histone.scse.ntu.edu.sg/Sig2GRN/.

  8. Distinct gene expression profiles determine molecular treatment response in childhood acute lymphoblastic leukemia.

    PubMed

    Cario, Gunnar; Stanulla, Martin; Fine, Bernard M; Teuffel, Oliver; Neuhoff, Nils V; Schrauder, André; Flohr, Thomas; Schäfer, Beat W; Bartram, Claus R; Welte, Karl; Schlegelberger, Brigitte; Schrappe, Martin

    2005-01-15

    Treatment resistance, as indicated by the presence of high levels of minimal residual disease (MRD) after induction therapy and induction consolidation, is associated with a poor prognosis in childhood acute lymphoblastic leukemia (ALL). We hypothesized that treatment resistance is an intrinsic feature of ALL cells reflected in the gene expression pattern and that resistance to chemotherapy can be predicted before treatment. To test these hypotheses, gene expression signatures of ALL samples with high MRD load were compared with those of samples without measurable MRD during treatment. We identified 54 genes that clearly distinguished resistant from sensitive ALL samples. Genes with low expression in resistant samples were predominantly associated with cell-cycle progression and apoptosis, suggesting that impaired cell proliferation and apoptosis are involved in treatment resistance. Prediction analysis using randomly selected samples as a training set and the remaining samples as a test set revealed an accuracy of 84%. We conclude that resistance to chemotherapy seems at least in part to be an intrinsic feature of ALL cells. Because treatment response could be predicted with high accuracy, gene expression profiling could become a clinically relevant tool for treatment stratification in the early course of childhood ALL.

  9. Clustering gene expression data based on predicted differential effects of GV interaction.

    PubMed

    Pan, Hai-Yan; Zhu, Jun; Han, Dan-Fu

    2005-02-01

    Microarrays have become a popular technology in biological and medical research. However, systematic and stochastic variabilities in microarray data are expected and unavoidable, so the raw measurements carry inherent "noise" within microarray experiments. Currently, logarithmic ratios are usually analyzed directly by various clustering methods, which may introduce biased interpretation when identifying groups of genes or samples. In this paper, a statistical method based on mixed model approaches was proposed for microarray data cluster analysis. The underlying rationale of this method is to partition the observed total gene expression level into variations caused by different factors using an ANOVA model, and to predict the differential effects of GV (gene by variety) interaction using the adjusted unbiased prediction (AUP) method. The predicted GV interaction effects can then be used as the inputs of cluster analysis. We illustrated the application of our method with a gene expression dataset and elucidated the utility of our approach using an external validation.

  10. Prediction of regulatory gene pairs using dynamic time warping and gene ontology.

    PubMed

    Yang, Andy C; Hsu, Hui-Huang; Lu, Ming-Da; Tseng, Vincent S; Shih, Timothy K

    2014-01-01

    Selecting informative genes is the most important task for data analysis on microarray gene expression data. In this work, we aim at identifying regulatory gene pairs from microarray gene expression data. However, microarray data often contain multiple missing expression values. Missing value imputation is thus needed before further processing for regulatory gene pairs becomes possible. We develop a novel approach to first impute missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After missing values are imputed, we then perform gene regulation prediction based on our proposed DTW-GO distance measurement of gene pairs. Experimental results show that our approach is more accurate when compared with existing missing value imputation methods on real microarray data sets. Furthermore, our approach can also discover more regulatory gene pairs that are known in the literature than other methods.
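
    As background for this record, the classic dynamic time warping (DTW) distance between two expression time series can be sketched as follows. This is the textbook dynamic-programming form with an absolute-difference local cost, not the authors' combined DTW-GO measure; the function name and example series are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j],      # repeat b[j-1]
                                     cost[i][j - 1],      # repeat a[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# Two hypothetical expression profiles with the same shape but a delayed response.
gene_a = [1.0, 2.0, 3.0]
gene_b = [1.0, 2.0, 2.0, 3.0]
distance = dtw_distance(gene_a, gene_b)
```

    Because DTW allows one series to be stretched along the time axis, two genes whose profiles differ only by a delay obtain a small distance, which is what makes the measure attractive for pairing a regulator with a target that responds later.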

  11. [Detection and analysis of the characteristic expression of microRNAs of anal fistula patients].

    PubMed

    Qiu, Jianming; Yu, Jiping; Yang, Guangen; Xu, Kan; Tao, Yong; Lin, Ali; Wang, Dong

    2016-07-01

    To detect and analyze the characteristic miRNA profile of anal fistula and explore the possible target genes and potential clinical significance of these miRNAs. Anal mucosa close to the hemorrhoids was collected from three patients undergoing fistulectomy and hemorrhoidectomy (fistula group) and from three patients receiving only hemorrhoidectomy (hemorrhoid group), matched with the fistula group in age, gender and body weight. A miRNA microarray was used to compare the expression of 1285 human miRNAs in the anal mucosa between the two groups. Cluster analysis was adopted to analyze the accumulation of the differentially expressed miRNAs (P<0.05, fold≥2.0 or ≤0.5), and their target genes were predicted with 10 software tools such as DIANAmT, miRanda, miRDB and miRWalk. Comprehensive scoring was performed to identify the genes with the highest predictive score. Gene ontology (GO) enrichment analysis was used to analyze the biological processes associated with the target genes. Immunohistochemistry was used to examine protein expression of the genes with the highest score. Among the 1285 miRNAs, 13 were differentially expressed in the fistula group compared with the hemorrhoid group, including 2 up-regulated and 11 down-regulated miRNAs. Paired t test showed that in the fistula group miR-3609 was up-regulated 5.98-fold (P=0.0231) and miR-181a-2-3p was down-regulated to 0.13-fold (P=0.0067) compared with the hemorrhoid group; these two miRNAs had the greatest differential expression. Cluster analysis suggested that the up-regulated miR-3609 and miR-6086 had similar change trends in both groups. Among the 11 down-regulated miRNAs, miR-125bp-1-3p and miR-548q had similar expression, and the other 9 miRNAs also had similar expression, including miR-1185-1-3p, miR-532-3p, miR-1233-5p, miR-769-5p, miR-149-5p, miR-99b-3p, miR-141-3p, miR-138-5p, and miR-181a-2-3p. Target gene prediction for the above 13 miRNAs showed that target genes could be predicted for 7 miRNAs (53.8%), yielding 104 possible target genes in total. No target gene could be predicted for the remaining 6 miRNAs (46.2%). The target gene with the highest prediction score was chitinase 1 (CHIT1), and its corresponding differential miRNA was miR-769-5p (r=-0.94286, P=0.0167). Gene ontology analysis showed that the biological processes most associated with these 104 target genes were keratinization, immune response and signal transduction. Immunohistochemistry revealed that CHIT1 expression in the anal mucosa was significantly higher in the fistula group than in the hemorrhoid group (P<0.01). There is a characteristic miRNA profile in anal fistula patients, which may play a role in the occurrence and development of anal fistula.

  12. Muscle myeloid type I interferon gene expression may predict therapeutic responses to rituximab in myositis patients.

    PubMed

    Nagaraju, Kanneboyina; Ghimbovschi, Svetlana; Rayavarapu, Sree; Phadke, Aditi; Rider, Lisa G; Hoffman, Eric P; Miller, Frederick W

    2016-09-01

    To identify muscle gene expression patterns that predict rituximab responses and assess the effects of rituximab on muscle gene expression in PM and DM. In an attempt to understand the molecular mechanism of response and non-response to rituximab therapy, we performed Affymetrix gene expression array analyses on muscle biopsy specimens taken before and after rituximab therapy from eight PM and two DM patients in the Rituximab in Myositis study. We also analysed selected muscle-infiltrating cell phenotypes in these biopsies by immunohistochemical staining. Partek and Ingenuity pathway analyses assessed the gene pathways and networks. Myeloid type I IFN signature genes were expressed at higher levels at baseline in the skeletal muscle of rituximab responders than in non-responders, whereas classic non-myeloid IFN signature genes were expressed at higher levels in non-responders at baseline. Also, rituximab responders have a greater reduction of the myeloid and non-myeloid type I IFN signatures than non-responders. The decrease in the type I IFN signature following administration of rituximab may be associated with the decreases in muscle-infiltrating CD19(+) B cells and CD68(+) macrophages in responders. Our findings suggest that high levels of myeloid type I IFN gene expression in skeletal muscle predict responses to rituximab in PM/DM and that rituximab responders also have a greater decrease in the expression of these genes. These data add further evidence to recent studies defining the type I IFN signature as both a predictor of therapeutic responses and a biomarker of myositis disease activity. Published by Oxford University Press on behalf British Society for Rheumatology 2016. This work is written by US Government employees and is in the public domain in the US.

  13. Multiple biomarkers in molecular oncology. II. Molecular diagnostics applications in breast cancer management.

    PubMed

    Malinowski, Douglas P

    2007-05-01

    In recent years, the application of genomic and proteomic technologies to the problem of breast cancer prognosis and the prediction of therapy response has begun to yield encouraging results. Independent studies employing transcriptional profiling of primary breast cancer specimens using DNA microarrays have identified gene expression profiles that correlate with clinical outcome in primary breast biopsy specimens. Recent advances in microarray technology have demonstrated reproducibility, making clinical applications more achievable. In this regard, one such DNA microarray device based upon a 70-gene expression signature was recently cleared by the US FDA for application to breast cancer prognosis. These DNA microarrays often employ at least 70 gene targets for transcriptional profiling and prognostic assessment in breast cancer. The use of PCR-based methods utilizing a small subset of genes has recently demonstrated the ability to predict the clinical outcome in early-stage breast cancer. Furthermore, protein-based immunohistochemistry methods have progressed from using gene clusters and gene expression profiling to smaller subsets of expressed proteins to predict prognosis in early-stage breast cancer. Beyond prognostic applications, DNA microarray-based transcriptional profiling has demonstrated the ability to predict response to chemotherapy in early-stage breast cancer patients. In this review, recent advances in the use of multiple markers for prognosis of disease recurrence in early-stage breast cancer and the prediction of therapy response will be discussed.

  14. Prediction of miRNA-mRNA associations in Alzheimer's disease mice using network topology.

    PubMed

    Noh, Haneul; Park, Charny; Park, Soojun; Lee, Young Seek; Cho, Soo Young; Seo, Hyemyung

    2014-08-03

    Little is known about the relationship between miRNA and mRNA expression in Alzheimer's disease (AD) at early- or late-symptomatic stages. Sequence-based target prediction algorithms and anti-correlation profiles have been applied to predict miRNA targets using omics data, but this approach often leads to false positive predictions. Here, we applied the joint profiling analysis of mRNA and miRNA expression levels to Tg6799 AD model mice at 4 and 8 months of age using a network topology-based method. We constructed gene regulatory networks and used the PageRank algorithm to predict significant interactions between miRNA and mRNA. In total, 8 cluster modules were predicted by the transcriptome data for co-expression networks of AD pathology. In total, 54 miRNAs were identified as being differentially expressed in AD. Among these, 50 significant miRNA-mRNA interactions were predicted by integrating sequence target prediction, expression analysis, and the PageRank algorithm. We identified a set of miRNA-mRNA interactions that were changed in the hippocampus of Tg6799 AD model mice. We determined the expression levels of several candidate genes and miRNA. For functional validation in primary cultured neurons from Tg6799 mice (MT) and littermate (LM) controls, the overexpression of ARRDC3 enhanced PPP1R3C expression. ARRDC3 overexpression showed the tendency to decrease the expression of miR139-5p and miR3470a in both LM and MT primary cells. Pathological environment created by Aβ treatment increased the gene expression of PPP1R3C and Sfpq but did not significantly alter the expression of miR139-5p or miR3470a. Aβ treatment increased the promoter activity of ARRDC3 gene in LM primary cells but not in MT primary cells. Our results demonstrate AD-specific changes in the miRNA regulatory system as well as the relationship between the expression levels of miRNAs and their targets in the hippocampus of Tg6799 mice. 
    These data help further our understanding of the function and mechanism of various miRNAs and their target genes in the molecular pathology of AD.
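
    The PageRank step named in this abstract can be sketched as a plain power iteration over a directed regulatory graph. This is an illustrative sketch only: the node names, the damping factor of 0.85 and the handling of nodes without outgoing edges are assumptions made for the example, not details taken from the study.

```python
def pagerank(edges, damping=0.85, iters=100):
    """Power-iteration PageRank on a directed graph given as (source, target) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        # every node receives the same teleportation share
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            # a node without outgoing edges spreads its rank over all nodes
            targets = out[n] or nodes
            share = damping * rank[n] / len(targets)
            for t in targets:
                new[t] += share
        rank = new
    return rank


# Hypothetical toy network: two miRNAs pointing at one gene, which points at another.
toy_edges = [("mirA", "gene1"), ("mirB", "gene1"), ("gene1", "gene2")]
for node, score in sorted(pagerank(toy_edges).items(), key=lambda kv: -kv[1]):
    print(node, round(score, 3))
```

    Nodes that receive many incoming links accumulate rank, so in this toy graph the genes outrank the miRNAs that regulate them.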

  15. PTCH1 expression at diagnosis predicts imatinib failure in chronic myeloid leukaemia patients in chronic phase.

    PubMed

    Alonso-Dominguez, Juan M; Grinfeld, Jacob; Alikian, Mary; Marin, David; Reid, Alistair; Daghistani, Mustafa; Hedgley, Corinne; O'Brien, Stephen; Clark, Richard E; Apperley, Jane; Foroni, Letizia; Gerrard, Gareth

    2015-01-01

    The tyrosine kinase inhibitor (TKI) imatinib has revolutionized the management of chronic myeloid leukaemia (CML). However, around 25% of patients fail to sustain an adequate response. We sought to identify gene-expression biomarkers that could be used to predict imatinib response. The expression of 29 genes, previously implicated in CML pathogenesis, was measured by TaqMan Low Density Array in 73 CML patient samples. Patients were divided into low and high expression for each gene, and imatinib failure (IF), probability of achieving CCyR, progression free survival and CML related OS were compared by Kaplan-Meier analysis and the log-rank test. Results were validated in a second cohort of 56 patients, with a further technical validation using custom gene-expression assays in a conventional RT-qPCR in a sub-cohort of 37 patients. Patients with low PTCH1 expression showed a worse clinical response for all variables in all cohorts. PTCH1 was the most significant predictor in the multivariate analysis compared with Sokal, age and EUTOS. The PTCH1 expression assay showed adequate sensitivity, specificity and predictive values to predict IF. Given the different treatments available for CML, measuring PTCH1 expression at diagnosis may help establish who will benefit most from imatinib and who is better selected for second generation TKI. © 2014 Wiley Periodicals, Inc.
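
    For readers unfamiliar with the Kaplan-Meier comparison used here, the following is a minimal sketch of the product-limit estimator for a single group. The follow-up times and event flags are invented toy values, the function name `kaplan_meier` is hypothetical, and the log-rank test that compares two such curves is not included.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  : follow-up duration for each patient
    events : 1 if the event (e.g. treatment failure) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            leaving += 1
            i += 1
        if failures:
            survival *= 1.0 - failures / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # censored patients leave the risk set without an event
    return curve


# Toy cohort: the patient at month 2 is censored.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

    In an analysis like the one described, one such curve would be computed for the low-expression group and one for the high-expression group before testing the difference between them.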

  16. Identifying gnostic predictors of the vaccine response

    PubMed Central

    Haining, W. Nicholas; Pulendran, Bali

    2012-01-01

    Molecular predictors of the response to vaccination could transform vaccine development. They would allow larger numbers of vaccine candidates to be rapidly screened, shortening the development time for new vaccines. Gene-expression based predictors of vaccine response have shown early promise. However, a limitation of gene-expression based predictors is that they often fail to reveal the mechanistic basis for their ability to classify response. Linking predictive signatures to the function of their component genes would advance basic understanding of vaccine immunity and also improve the robustness of outcome classification. New analytic tools now allow more biological meaning to be extracted from predictive signatures. Functional genomic approaches to perturb gene expression in mammalian cells permit the function of predictive genes to be surveyed in highly parallel experiments. The challenge for vaccinologists is therefore to use these tools to embed mechanistic insights into predictors of vaccine response. PMID:22633886

  17. Protein disorder is positively correlated with gene expression in E. coli

    PubMed Central

    Paliy, Oleg; Gargac, Shawn M.; Cheng, Yugong; Uversky, Vladimir N.; Dunker, A. Keith

    2009-01-01

    We considered on a global scale the relationship between the predicted fraction of protein disorder and RNA and protein expression in E. coli. Fraction of protein disorder correlated positively with both measured RNA expression levels of E. coli genes in three different growth media and with predicted abundance levels of E. coli proteins. Though weak, the correlation was highly significant. Correlation of protein disorder with RNA expression did not depend on the growth rate of E. coli cultures and was not caused by a small subset of genes showing exceptionally high concordance in their disorder and expression levels. Global analysis was complemented by detailed consideration of several groups of proteins. PMID:18465893

  18. Isolation of pheromone precursor genes of Magnaporthe grisea.

    PubMed

    Shen, W C; Bobrowicz, P; Ebbole, D J

    1999-01-01

    In heterothallic ascomycetes one mating partner serves as the source of female tissue and is fertilized with spermatia from a partner of the opposite mating type. The role of pheromone signaling in mating is thought to involve recognition of cells of the opposite mating type. We have isolated two putative pheromone precursor genes of Magnaporthe grisea. The genes are present in both mating types of the fungus but they are expressed in a mating type-specific manner. The MF1-1 gene, expressed in Mat1-1 strains, is predicted to encode a 26-amino-acid polypeptide that is processed to produce a lipopeptide pheromone. The MF2-1 gene, expressed in Mat1-2 strains, is predicted to encode a precursor polypeptide that is processed by a Kex2-like protease to yield a pheromone with striking similarity to the predicted pheromone sequence of a close relative, Cryphonectria parasitica. Expression of the M. grisea putative pheromone precursor genes was observed under defined nutritional conditions and in field isolates. This suggests that the requirement for complex media for mating and the poor fertility of field isolates may not be due to limitation of pheromone precursor gene expression. Detection of putative pheromone precursor gene mRNA in conidia suggests that pheromones may be important for the fertility of conidia acting as spermatia. Copyright 1999 Academic Press.

  19. Moving Toward Integrating Gene Expression Profiling Into High-Throughput Testing: A Gene Expression Biomarker Accurately Predicts Estrogen Receptor α Modulation in a Microarray Compendium

    PubMed Central

    Ryan, Natalia; Chorley, Brian; Tice, Raymond R.; Judson, Richard; Corton, J. Christopher

    2016-01-01

    Microarray profiling of chemical-induced effects is being increasingly used in medium- and high-throughput formats. Computational methods are described here to identify molecular targets from whole-genome microarray data using as an example the estrogen receptor α (ERα), often modulated by potential endocrine disrupting chemicals. ERα biomarker genes were identified by their consistent expression after exposure to 7 structurally diverse ERα agonists and 3 ERα antagonists in ERα-positive MCF-7 cells. Most of the biomarker genes were shown to be directly regulated by ERα as determined by ESR1 gene knockdown using siRNA as well as through chromatin immunoprecipitation coupled with DNA sequencing analysis of ERα-DNA interactions. The biomarker was evaluated as a predictive tool using the fold-change rank-based Running Fisher algorithm by comparison to annotated gene expression datasets from experiments using MCF-7 cells, including those evaluating the transcriptional effects of hormones and chemicals. Using 141 comparisons from chemical- and hormone-treated cells, the biomarker gave a balanced accuracy for prediction of ERα activation or suppression of 94% and 93%, respectively. The biomarker was able to correctly classify 18 out of 21 (86%) ER reference chemicals including “very weak” agonists. Importantly, the biomarker predictions accurately replicated predictions based on 18 in vitro high-throughput screening assays that queried different steps in ERα signaling. For 114 chemicals, the balanced accuracies were 95% and 98% for activation or suppression, respectively. These results demonstrate that the ERα gene expression biomarker can accurately identify ERα modulators in large collections of microarray data derived from MCF-7 cells. PMID:26865669
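
    Balanced accuracy, the metric reported in this abstract, is the mean of sensitivity and specificity, which keeps a classifier from looking good merely because one class dominates. A minimal sketch with invented labels follows; the function name is an assumption for illustration.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy example: 3 of 4 positives and 1 of 2 negatives are classified correctly.
print(balanced_accuracy([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1]))
```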

  20. Understanding Transcription Factor Regulation by Integrating Gene Expression and DNase I Hypersensitive Sites.

    PubMed

    Wang, Guohua; Wang, Fang; Huang, Qian; Li, Yu; Liu, Yunlong; Wang, Yadong

    2015-01-01

    Transcription factors are proteins that bind to DNA sequences to regulate gene transcription. The transcription factor binding sites are short DNA sequences (5-20 bp long) specifically bound by one or more transcription factors. The identification of transcription factor binding sites and prediction of their function continue to be challenging problems in computational biology. In this study, by integrating the DNase I hypersensitive sites with known position weight matrices in the TRANSFAC database, the transcription factor binding sites in gene regulatory regions are identified. Based on the global gene expression patterns in cervical cancer HeLaS3 cells and HeLaS3-ifnα4h cells (HeLaS3 cells treated with interferon for 4 hours), we present a model-based computational approach to predict a set of transcription factors that potentially cause such differential gene expression. Significantly, 6 out of 10 predicted functional factors, namely IRF, IRF-1, IRF-2, IRF-3, IRF-9 and ICSBP, belong to the interferon regulatory factor family and upregulate gene expression levels in response to the interferon treatment. Another factor, ISGF-3, is also a transcriptional activator induced by interferon alpha. The predictions of our model are consistent across different selection criteria for transcription factor binding sites. Our model demonstrated the potential to computationally identify the functional transcription factors in gene regulation.

  1. Classification of Time Series Gene Expression in Clinical Studies via Integration of Biological Network

    PubMed Central

    Qian, Liwei; Zheng, Haoran; Zhou, Hong; Qin, Ruibin; Li, Jinlong

    2013-01-01

    The increasing availability of time series expression datasets, although promising, raises a number of new computational challenges. Accordingly, the development of suitable classification methods to make reliable and sound predictions is becoming a pressing issue. We propose, here, a new method to classify time series gene expression via integration of biological networks. We evaluated our approach on 2 different datasets and showed that the use of a hidden Markov model/Gaussian mixture models hybrid explores the time-dependence of the expression data, thereby leading to better prediction results. We demonstrated that the biclustering procedure identifies function-related genes as a whole, giving rise to high accordance in prognosis prediction across independent time series datasets. In addition, we showed that integration of biological networks into our method significantly improves prediction performance. Moreover, we compared our approach with several state-of-the-art algorithms and found that our method outperformed previous approaches with regard to various criteria. Finally, our approach achieved better prediction results on early-stage data, implying the potential of our method for practical prediction. PMID:23516469

  2. Spatial analysis and high resolution mapping of the human whole-brain transcriptome for integrative analysis in neuroimaging.

    PubMed

    Gryglewski, Gregor; Seiger, René; James, Gregory Miles; Godbersen, Godber Mathis; Komorowski, Arkadiusz; Unterholzner, Jakob; Michenthaler, Paul; Hahn, Andreas; Wadsak, Wolfgang; Mitterhauser, Markus; Kasper, Siegfried; Lanzenberger, Rupert

    2018-08-01

    The quantification of big pools of diverse molecules provides important insights on brain function, but is often restricted to a limited number of observations, which impairs integration with other modalities. To resolve this issue, a method allowing for the prediction of mRNA expression in the entire brain based on microarray data provided in the Allen Human Brain Atlas was developed. Microarray data of 3702 samples from 6 brain donors was registered to MNI and cortical surface space using FreeSurfer. For each of 18,686 genes, spatial dependence of transcription was assessed using variogram modelling. Variogram models were employed in Gaussian process regression to calculate best linear unbiased predictions for gene expression at all locations represented in well-established imaging atlases for cortex, subcortical structures and cerebellum. For validation, predicted whole-brain transcription of the HTR1A gene was correlated with [carbonyl-11C]WAY-100635 positron emission tomography data collected from 30 healthy subjects. Prediction results showed minimal bias ranging within ±0.016 (cortical surface), ±0.12 (subcortical regions) and ±0.14 (cerebellum) in units of log2 expression intensity for all genes. Across genes, the correlation of predicted and observed mRNA expression in leave-one-out cross-validation correlated with the strength of spatial dependence (cortical surface: r = 0.91, subcortical regions: r = 0.85, cerebellum: r = 0.84). 816 out of 18,686 genes exhibited a high spatial dependence accounting for more than 50% of variance in the difference of gene expression on the cortical surface. In subcortical regions and cerebellum, different sets of genes were implicated by high spatially structured variability. For the serotonin 1A receptor, correlation between PET binding potentials and predicted comprehensive mRNA expression was markedly higher (Spearman ρ = 0.72 for cortical surface, ρ = 0.84 for subcortical regions) than correlation of PET and discrete samples only (ρ = 0.55 and ρ = 0.63, respectively). Prediction of mRNA expression in the entire human brain allows for intuitive visualization of gene transcription and seamless integration in multimodal analysis without bias arising from non-uniform distribution of available samples. Extension of this methodology promises to facilitate translation of omics research and enable investigation of human brain function at a systems level. Copyright © 2018 Elsevier Inc. All rights reserved.
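
    The notion of spatial dependence assessed by variogram modelling can be made concrete with an empirical semivariogram, which averages half the squared difference in expression over sample pairs binned by distance. The sketch below computes only this binned estimate; fitting a variogram model and the Gaussian process regression described in the abstract are not shown, and the coordinates, values and function name are hypothetical.

```python
import math


def empirical_semivariogram(coords, values, bin_width, n_bins):
    """Binned semivariance: gamma(h) = sum((z_i - z_j)^2) / (2 * number of pairs in bin)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            h = math.sqrt(sum((a - b) ** 2 for a, b in zip(coords[i], coords[j])))
            b = int(h // bin_width)
            if b < n_bins:
                sums[b] += (values[i] - values[j]) ** 2
                counts[b] += 1
    # bins without any sample pair are reported as None
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]


# Three toy samples on a line with increasing expression values.
print(empirical_semivariogram([(0, 0), (1, 0), (2, 0)], [1.0, 2.0, 4.0], 1.0, 3))
```

    A semivariance that grows with distance, as in this toy case, indicates that nearby samples are more alike than distant ones, which is what makes spatial interpolation informative.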

  3. Interleukin-27 is a novel candidate diagnostic biomarker for bacterial infection in critically ill children.

    PubMed

    Wong, Hector R; Cvijanovich, Natalie Z; Hall, Mark; Allen, Geoffrey L; Thomas, Neal J; Freishtat, Robert J; Anas, Nick; Meyer, Keith; Checchia, Paul A; Lin, Richard; Bigham, Michael T; Sen, Anita; Nowak, Jeffrey; Quasney, Michael; Henricksen, Jared W; Chopra, Arun; Banschbach, Sharon; Beckman, Eileen; Harmon, Kelli; Lahni, Patrick; Shanley, Thomas P

    2012-10-29

    Differentiating between sterile inflammation and bacterial infection in critically ill patients with fever and other signs of the systemic inflammatory response syndrome (SIRS) remains a clinical challenge. The objective of our study was to mine an existing genome-wide expression database for the discovery of candidate diagnostic biomarkers to predict the presence of bacterial infection in critically ill children. Genome-wide expression data were compared between patients with SIRS having negative bacterial cultures (n = 21) and patients with sepsis having positive bacterial cultures (n = 60). Differentially expressed genes were subjected to a leave-one-out cross-validation (LOOCV) procedure to predict SIRS or sepsis classes. Serum concentrations of interleukin-27 (IL-27) and procalcitonin (PCT) were compared between 101 patients with SIRS and 130 patients with sepsis. All data represent the first 24 hours of meeting criteria for either SIRS or sepsis. Two hundred twenty one gene probes were differentially regulated between patients with SIRS and patients with sepsis. The LOOCV procedure correctly predicted 86% of the SIRS and sepsis classes, and Epstein-Barr virus-induced gene 3 (EBI3) had the highest predictive strength. Computer-assisted image analyses of gene-expression mosaics were able to predict infection with a specificity of 90% and a positive predictive value of 94%. Because EBI3 is a subunit of the heterodimeric cytokine, IL-27, we tested the ability of serum IL-27 protein concentrations to predict infection. At a cut-point value of ≥5 ng/ml, serum IL-27 protein concentrations predicted infection with a specificity and a positive predictive value of >90%, and the overall performance of IL-27 was generally better than that of PCT. A decision tree combining IL-27 and PCT improved overall predictive capacity compared with that of either biomarker alone. 
    Genome-wide expression analysis has provided the foundation for the identification of IL-27 as a novel candidate diagnostic biomarker for predicting bacterial infection in critically ill children. Additional studies will be required to test further the diagnostic performance of IL-27. The microarray data reported in this article have been deposited in the Gene Expression Omnibus under accession number GSE4607.
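
    Leave-one-out cross-validation, as used in this study, holds out each sample in turn, trains on the rest and checks the prediction for the held-out sample. The sketch below pairs LOOCV with a simple nearest-centroid classifier, which is an assumption made for illustration and not the classifier used by the authors; the expression values are invented.

```python
def centroid(rows):
    """Per-feature mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def loocv_accuracy(X, y):
    """Fraction of samples predicted correctly when each is held out once."""
    correct = 0
    for i in range(len(X)):
        train = [(x, lab) for j, (x, lab) in enumerate(zip(X, y)) if j != i]
        # class centroids are recomputed without the held-out sample
        centroids = {
            lab: centroid([x for x, l in train if l == lab])
            for lab in {l for _, l in train}
        }
        pred = min(centroids, key=lambda lab: sq_dist(X[i], centroids[lab]))
        correct += pred == y[i]
    return correct / len(X)


# Toy data: two probes, three samples per class.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
y = ["SIRS", "SIRS", "SIRS", "sepsis", "sepsis", "sepsis"]
print(loocv_accuracy(X, y))
```

    In a real analysis any gene selection step would also need to be repeated inside each fold, otherwise the held-out sample leaks into the model and the accuracy is optimistic.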

  4. Gene Expression Signatures Based on Variability can Robustly Predict Tumor Progression and Prognosis

    PubMed Central

    Dinalankara, Wikum; Bravo, Héctor Corrada

    2015-01-01

    Gene expression signatures are commonly used to create cancer prognosis and diagnosis methods, yet only a small number of them are successfully deployed in the clinic since many fail to replicate performance on subsequent validation. A primary reason for this lack of reproducibility is the fact that these signatures attempt to model the highly variable and unstable genomic behavior of cancer. Our group recently introduced gene expression anti-profiles as a robust methodology to derive gene expression signatures based on the observation that while gene expression measurements are highly heterogeneous across tumors of a specific cancer type relative to the normal tissue, their degree of deviation from normal tissue expression in specific genes involved in tissue differentiation is a stable tumor mark that is reproducible across experiments and cancer types. Here we show that constructing gene expression signatures based on variability and the anti-profile approach yields classifiers capable of successfully distinguishing benign growths from cancerous growths based on deviation from normal expression. We then show that this same approach generates stable and reproducible signatures that predict probability of relapse and survival based on tumor gene expression. These results suggest that using the anti-profile framework for the discovery of genomic signatures is an avenue leading to the development of reproducible signatures suitable for adoption in clinical settings. PMID:26078586

  5. Analysis of baseline gene expression levels from toxicogenomics study control animals to identify sources of variation and predict responses to chemicals

    EPA Science Inventory

    The use of gene expression profiling to predict chemical mode of action would be enhanced by better characterization of variance due to individual, environmental, and technical factors. Meta-analysis of microarray data from untreated or vehicle-treated animals within the control ...

  6. A long non-coding RNA expression profile can predict early recurrence in hepatocellular carcinoma after curative resection.

    PubMed

    Lv, Yufeng; Wei, Wenhao; Huang, Zhong; Chen, Zhichao; Fang, Yuan; Pan, Lili; Han, Xueqiong; Xu, Zihai

    2018-06-20

    The aim of this study was to develop a novel long non-coding RNA (lncRNA) expression signature to accurately predict early recurrence for patients with hepatocellular carcinoma (HCC) after curative resection. Using expression profiles downloaded from The Cancer Genome Atlas database, we identified multiple lncRNAs with differential expression between the early recurrence (ER) group and the non-early recurrence (non-ER) group of HCC. Least absolute shrinkage and selection operator (LASSO) logistic regression models were used to develop a lncRNA-based classifier for predicting ER in the training set. An independent test set was used to validate the predictive value of this classifier. Furthermore, a co-expression network based on these lncRNAs and their highly related genes was constructed, and Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses of genes in the network were performed. We identified 10 differentially expressed lncRNAs, including 3 that were upregulated and 7 that were downregulated in the ER group. The lncRNA-based classifier was constructed based on 7 lncRNAs (AL035661.1, PART1, AC011632.1, AC109588.1, AL365361.1, LINC00861 and LINC02084), and its accuracy was 0.83 in the training set, 0.87 in the test set and 0.84 in the total set. ROC curve analysis showed that the AUROC was 0.741 in the training set, 0.824 in the test set and 0.765 in the total set. A functional enrichment analysis suggested that the genes highly related to 4 of the lncRNAs were involved in the immune system. This 7-lncRNA expression profile can effectively predict early recurrence after surgical resection for HCC. This article is protected by copyright. All rights reserved.
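
    The AUROC values quoted above can be read as the probability that a randomly chosen early-recurrence case receives a higher classifier score than a randomly chosen non-recurrence case. The sketch below computes that quantity directly by pairwise comparison; the labels and scores are invented and the function name is hypothetical. Fitting the LASSO classifier itself is not shown.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy scores: three of the four positive/negative pairs are ranked correctly.
print(auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```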

  7. Cell-specific prediction and application of drug-induced gene expression profiles.

    PubMed

    Hodos, Rachel; Zhang, Ping; Lee, Hao-Chih; Duan, Qiaonan; Wang, Zichen; Clark, Neil R; Ma'ayan, Avi; Wang, Fei; Kidd, Brian; Hu, Jianying; Sontag, David; Dudley, Joel

    2018-01-01

    Gene expression profiling of in vitro drug perturbations is useful for many biomedical discovery applications including drug repurposing and elucidation of drug mechanisms. However, limited data availability across cell types has hindered our capacity to leverage or explore the cell-specificity of these perturbations. While recent efforts have generated a large number of drug perturbation profiles across a variety of human cell types, many gaps remain in this combinatorial drug-cell space. Hence, we asked whether it is possible to fill these gaps by predicting cell-specific drug perturbation profiles using available expression data from related conditions--i.e. from other drugs and cell types. We developed a computational framework that first arranges existing profiles into a three-dimensional array (or tensor) indexed by drugs, genes, and cell types, and then uses either local (nearest-neighbors) or global (tensor completion) information to predict unmeasured profiles. We evaluate prediction accuracy using a variety of metrics, and find that the two methods have complementary performance, each superior in different regions in the drug-cell space. Predictions achieve correlations of 0.68 with true values, and maintain accurate differentially expressed genes (AUC 0.81). Finally, we demonstrate that the predicted profiles add value for making downstream associations with drug targets and therapeutic classes.
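
    The "local" strategy described above can be sketched as follows: to predict an unmeasured drug and cell type combination, find the drugs that behave most like the target drug in the cell types where both were measured, then average their profiles in the target cell type. This is a simplified illustration under assumed choices (squared Euclidean distance, a plain dictionary instead of a tensor, hypothetical drug and cell names), not the authors' implementation, and the tensor-completion alternative is not shown.

```python
def mean_profile(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def predict_profile(profiles, drug, cell, k=1):
    """profiles maps (drug, cell) -> list of per-gene values; returns a predicted list or None."""
    cells_of = {}
    for d, c in profiles:
        cells_of.setdefault(d, set()).add(c)
    scored = []
    for other in cells_of:
        # a neighbour must be a different drug that was measured in the target cell type
        if other == drug or cell not in cells_of[other]:
            continue
        shared = (cells_of[other] & cells_of.get(drug, set())) - {cell}
        if not shared:
            continue
        dist = sum(
            sq_dist(profiles[(drug, c)], profiles[(other, c)]) for c in shared
        ) / len(shared)
        scored.append((dist, other))
    scored.sort()
    neighbours = [other for _, other in scored[:k]]
    if not neighbours:
        return None
    return mean_profile([profiles[(n, cell)] for n in neighbours])


# Toy data: drug d1 was never profiled in cell type B.
profiles = {
    ("d1", "A"): [1.0, 0.0],
    ("d2", "A"): [1.0, 0.1],
    ("d3", "A"): [-1.0, 2.0],
    ("d2", "B"): [0.5, 0.5],
    ("d3", "B"): [-2.0, 3.0],
}
print(predict_profile(profiles, "d1", "B", k=1))
```

    Here d2 is the closest neighbour of d1 in cell type A, so its profile in cell type B is used as the prediction.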

  8. Cell-specific prediction and application of drug-induced gene expression profiles

    PubMed Central

    Hodos, Rachel; Zhang, Ping; Lee, Hao-Chih; Duan, Qiaonan; Wang, Zichen; Clark, Neil R.; Ma'ayan, Avi; Wang, Fei; Kidd, Brian; Hu, Jianying; Sontag, David

    2017-01-01

    Gene expression profiling of in vitro drug perturbations is useful for many biomedical discovery applications including drug repurposing and elucidation of drug mechanisms. However, limited data availability across cell types has hindered our capacity to leverage or explore the cell-specificity of these perturbations. While recent efforts have generated a large number of drug perturbation profiles across a variety of human cell types, many gaps remain in this combinatorial drug-cell space. Hence, we asked whether it is possible to fill these gaps by predicting cell-specific drug perturbation profiles using available expression data from related conditions--i.e. from other drugs and cell types. We developed a computational framework that first arranges existing profiles into a three-dimensional array (or tensor) indexed by drugs, genes, and cell types, and then uses either local (nearest-neighbors) or global (tensor completion) information to predict unmeasured profiles. We evaluate prediction accuracy using a variety of metrics, and find that the two methods have complementary performance, each superior in different regions in the drug-cell space. Predictions achieve correlations of 0.68 with true values, and maintain accurate differentially expressed genes (AUC 0.81). Finally, we demonstrate that the predicted profiles add value for making downstream associations with drug targets and therapeutic classes. PMID:29218867

  9. Altered expression of four miRNA (miR-1238-3p, miR-202-3p, miR-630 and miR-766-3p) and their potential targets in peripheral blood from vitiligo patients.

    PubMed

    Shang, Zhiwei; Li, Hongwen

    2017-10-01

    Vitiligo is an acquired skin disease with pigmentary disorder. Autoimmune destruction of melanocytes is thought to be a major factor in the etiology of vitiligo. miRNA-based regulators of gene expression have been reported to play crucial roles in autoimmune disease. Therefore, we attempted to profile miRNA expression and predict potential targets, assessing the biological functions of differentially expressed miRNA. Total RNA was extracted from peripheral blood of vitiligo (experimental group, n = 5) and non-vitiligo (control group, n = 5) age-matched patients. Samples were hybridized to a miRNA array. Box, scatter and principal component analysis plots were generated, followed by unsupervised hierarchical clustering analysis to classify the samples. Quantitative reverse transcription polymerase chain reaction (RT-PCR) was conducted for validation of microarray data. Three different databases, TargetScan, PITA and microRNA.org, were used to predict the potential target genes. Gene ontology (GO) annotation and pathway analysis were performed to assess the potential functions of predicted genes of identified miRNA. A total of 100 (29 upregulated and 71 downregulated) miRNA were filtered by volcano plot analysis. Four miRNA were validated by quantitative RT-PCR as significantly downregulated in the vitiligo group. The functions of predicted target genes associated with differentially expressed miRNA were assessed by GO analysis, showing that the GO term with most significantly enriched target genes was axon guidance, and that the axon guidance pathway was most significantly correlated with these miRNA. In conclusion, we identified four downregulated miRNA in vitiligo and assessed the potential functions of target genes related to these differentially expressed miRNA. © 2017 Japanese Dermatological Association.

  10. A novel highly differentially expressed gene in wheat endosperm associated with bread quality

    PubMed Central

    Furtado, A.; Bundock, P. C.; Banks, P. M.; Fox, G.; Yin, X.; Henry, R. J.

    2015-01-01

    Analysis of gene expression in developing wheat seeds was used to identify a gene, wheat bread making (wbm), with highly differential expression (~1000 fold) in the starchy endosperm of genotypes varying in bread making quality. Several alleles differing in the 5’-upstream region (promoter) of this gene were identified, with one present only in genotypes with high levels of wbm expression. RNA-Seq analysis revealed low or no wbm expression in most genotypes but high expression (0.2-0.4% of total gene expression) in genotypes that had good bread loaf volume. The wbm gene is predicted to encode a mature protein of 48 amino acids (including four cysteine residues) not previously identified in association with wheat quality, possibly because of its small size and low frequency in the wheat gene pool. Genotypes with high wbm expression all had good bread making quality but not always good physical dough qualities. The predicted protein was sulphur rich, suggesting the possibility of a contribution to bread loaf volume by supporting the cross-linking of proteins in gluten. Improved understanding of the molecular basis of differences in bread making quality may allow more rapid development of high performing genotypes with acceptable end-use properties and facilitate increased wheat production. PMID:26011437

  11. A novel highly differentially expressed gene in wheat endosperm associated with bread quality.

    PubMed

    Furtado, A; Bundock, P C; Banks, P M; Fox, G; Yin, X; Henry, R J

    2015-05-26

    Analysis of gene expression in developing wheat seeds was used to identify a gene, wheat bread making (wbm), with highly differential expression (~1000 fold) in the starchy endosperm of genotypes varying in bread making quality. Several alleles differing in the 5'-upstream region (promoter) of this gene were identified, with one present only in genotypes with high levels of wbm expression. RNA-Seq analysis revealed low or no wbm expression in most genotypes but high expression (0.2-0.4% of total gene expression) in genotypes that had good bread loaf volume. The wbm gene is predicted to encode a mature protein of 48 amino acids (including four cysteine residues) not previously identified in association with wheat quality, possibly because of its small size and low frequency in the wheat gene pool. Genotypes with high wbm expression all had good bread making quality but not always good physical dough qualities. The predicted protein was sulphur rich, suggesting the possibility of a contribution to bread loaf volume by supporting the cross-linking of proteins in gluten. Improved understanding of the molecular basis of differences in bread making quality may allow more rapid development of high performing genotypes with acceptable end-use properties and facilitate increased wheat production.

  12. Integrating Gene Expression with Summary Association Statistics to Identify Genes Associated with 30 Complex Traits.

    PubMed

    Mancuso, Nicholas; Shi, Huwenbo; Goddard, Pagé; Kichaev, Gleb; Gusev, Alexander; Pasaniuc, Bogdan

    2017-03-02

    Although genome-wide association studies (GWASs) have identified thousands of risk loci for many complex traits and diseases, the causal variants and genes at these loci remain largely unknown. Here, we introduce a method for estimating the local genetic correlation between gene expression and a complex trait and utilize it to estimate the genetic correlation due to predicted expression between pairs of traits. We integrated gene expression measurements from 45 expression panels with summary GWAS data to perform 30 multi-tissue transcriptome-wide association studies (TWASs). We identified 1,196 genes whose expression is associated with these traits; of these, 168 reside more than 0.5 Mb away from any previously reported GWAS significant variant. We then used our approach to find 43 pairs of traits with significant genetic correlation at the level of predicted expression; of these, eight were not found through genetic correlation at the SNP level. Finally, we used bi-directional regression to find evidence that BMI causally influences triglyceride levels and that triglyceride levels causally influence low-density lipoprotein. Together, our results provide insight into the role of gene expression in the susceptibility of complex traits and diseases. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  13. Genomics of Mature and Immature Olfactory Sensory Neurons

    PubMed Central

    Nickell, Melissa D.; Breheny, Patrick; Stromberg, Arnold J.; McClintock, Timothy S.

    2014-01-01

    The continuous replacement of neurons in the olfactory epithelium provides an advantageous model for investigating neuronal differentiation and maturation. By calculating the relative enrichment of every mRNA detected in samples of mature mouse olfactory sensory neurons (OSNs), immature OSNs, and the residual population of neighboring cell types, and then comparing these ratios against the known expression patterns of >300 genes, enrichment criteria that accurately predicted the OSN expression patterns of nearly all genes were determined. We identified 847 immature OSN-specific and 691 mature OSN-specific genes. The control of gene expression by chromatin modification and transcription factors, and neurite growth, protein transport, RNA processing, cholesterol biosynthesis, and apoptosis via death domain receptors, were overrepresented biological processes in immature OSNs. Ion transport (ion channels), presynaptic functions, and cilia-specific processes were overrepresented in mature OSNs. Processes overrepresented among the genes expressed by all OSNs were protein and ion transport, ER overload response, protein catabolism, and the electron transport chain. To more accurately represent gradations in mRNA abundance and identify all genes expressed in each cell type, classification methods were used to produce probabilities of expression in each cell type for every gene. These probabilities, which identified 9,300 genes expressed in OSNs, were 96% accurate at identifying genes expressed in OSNs and 86% accurate at discriminating genes specific to mature and immature OSNs. This OSN gene database not only predicts the genes responsible for the major biological processes active in OSNs, but also identifies thousands of never before studied genes that support OSN phenotypes. PMID:22252456

  14. GSEH: A Novel Approach to Select Prostate Cancer-Associated Genes Using Gene Expression Heterogeneity.

    PubMed

    Kim, Hyunjin; Choi, Sang-Min; Park, Sanghyun

    2018-01-01

    When a gene shows varying levels of expression among normal people but similar levels in disease patients or shows similar levels of expression among normal people but different levels in disease patients, we can assume that the gene is associated with the disease. By utilizing this gene expression heterogeneity, we can obtain additional information that abets discovery of disease-associated genes. In this study, we used collaborative filtering to calculate the degree of gene expression heterogeneity between classes and then scored the genes on the basis of the degree of gene expression heterogeneity to find "differentially predicted" genes. Through the proposed method, we discovered more prostate cancer-associated genes than 10 comparable methods. The genes prioritized by the proposed method are potentially significant to biological processes of a disease and can provide insight into them.

  15. DiRE: identifying distant regulatory elements of co-expressed genes

    PubMed Central

    Gotea, Valer; Ovcharenko, Ivan

    2008-01-01

    Regulation of gene expression in eukaryotic genomes is established through a complex cooperative activity of proximal promoters and distant regulatory elements (REs) such as enhancers, repressors and silencers. We have developed a web server named DiRE, based on the Enhancer Identification (EI) method, for predicting distant regulatory elements in higher eukaryotic genomes, namely for determining their chromosomal location and functional characteristics. The server uses gene co-expression data, comparative genomics and profiles of transcription factor binding sites (TFBSs) to determine TFBS-association signatures that can be used for discriminating specific regulatory functions. DiRE's unique feature is its ability to detect REs outside of proximal promoter regions, as it takes advantage of the full gene locus to conduct the search. DiRE can predict common REs for any set of input genes for which the user has prior knowledge of co-expression, co-function or other biologically meaningful grouping. The server predicts function-specific REs consisting of clusters of specifically-associated TFBSs and it also scores the association of individual transcription factors (TFs) with the biological function shared by the group of input genes. Its integration with the Array2BIO server allows users to start their analysis with raw microarray expression data. The DiRE web server is freely available at http://dire.dcode.org. PMID:18487623

  16. Adrenal-kidney-gonad complex measurements may not predict gonad-specific changes in gene expression patterns during temperature-dependent sex determination in the red-eared slider turtle (Trachemys scripta elegans).

    PubMed

    Ramsey, Mary; Crews, David

    2007-08-01

    Many turtles, including the red-eared slider turtle (Trachemys scripta elegans) have temperature-dependent sex determination in which gonadal sex is determined by temperature during the middle third of incubation. The gonad develops as part of a heterogenous tissue complex that comprises the developing adrenal, kidney, and gonad (AKG complex). Owing to the difficulty in excising the gonad from the adjacent tissues, the AKG complex is often used as tissue source in assays examining gene expression in the developing gonad. However, the gonad is a relatively small component of the AKG, and gene expression in the adrenal-kidney (AK) compartment may interfere with the detection of gonad-specific changes in gene expression, particularly during early key phases of gonadal development and sex determination. In this study, we examine transcript levels as measured by quantitative real-time polymerase chain reaction for five genes important in slider turtle sex determination and differentiation (AR, ERalpha, ERbeta, aromatase, and Sf1) in AKG, AK, and isolated gonad tissues. In all cases, gonad-specific gene expression patterns were attenuated in AKG versus gonad tissue. All five genes were expressed in the AK in addition to the gonad at all stages/temperatures. Inclusion of the AK compartment masked important changes in gonadal gene expression. In addition, AK and gonad expression patterns are not additive, and gonadal gene expression cannot be predicted from intact AKG measurements. (c) 2007 Wiley-Liss, Inc.

  17. Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

    PubMed Central

    Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.; Taylor, Ronald C.; Weisenhorn, Pamela; Olson, Robert D.; Stevens, Rick L.; Rocha, Miguel; Rocha, Isabel; Best, Aaron A.; DeJongh, Matthew; Tintle, Nathan L.; Parrello, Bruce; Overbeek, Ross; Henry, Christopher S.

    2016-01-01

    Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. An important step toward meeting the challenge of understanding gene function and regulation is the identification of sets of genes that are always co-expressed. These gene sets, Atomic Regulons (ARs), represent fundamental units of function within a cell and could be used to associate genes of unknown function with cellular processes and to enable rational genetic engineering of cellular systems. Here, we describe an approach for inferring ARs that leverages large-scale expression data sets, gene context, and functional relationships among genes. We computed ARs for Escherichia coli based on 907 gene expression experiments and compared our results with gene clusters produced by two prevalent data-driven methods: Hierarchical clustering and k-means clustering. We compared ARs and purely data-driven gene clusters to the curated set of regulatory interactions for E. coli found in RegulonDB, showing that ARs are more consistent with gold standard regulons than are data-driven gene clusters. We further examined the consistency of ARs and data-driven gene clusters in the context of gene interactions predicted by Context Likelihood of Relatedness (CLR) analysis, finding that the ARs show better agreement with CLR predicted interactions. We determined the impact of increasing amounts of expression data on AR construction and find that while more data improve ARs, it is not necessary to use the full set of gene expression experiments available for E. coli to produce high quality ARs. In order to explore the conservation of co-regulated gene sets across different organisms, we computed ARs for Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus, each of which represents increasing degrees of phylogenetic distance from E. coli. 
    Comparison of the organism-specific ARs showed that the consistency of AR gene membership correlates with phylogenetic distance, but there is clear variability in the regulatory networks of closely related organisms. As large scale expression data sets become increasingly common for model and non-model organisms, comparative analyses of atomic regulons will provide valuable insights into fundamental regulatory modules used across the bacterial domain. PMID:27933038

  18. Identification of Novel Tissue-Specific Genes by Analysis of Microarray Databases: A Human and Mouse Model

    PubMed Central

    Suh, Yeunsu; Davis, Michael E.; Lee, Kichoon

    2013-01-01

    Understanding the tissue-specific pattern of gene expression is critical in elucidating the molecular mechanisms of tissue development, gene function, and transcriptional regulation of biological processes. Although tissue-specific gene expression information is available in several databases, follow-up strategies to integrate and use these data are limited. The objective of the current study was to identify and evaluate novel tissue-specific genes in human and mouse tissues by performing comparative microarray database analysis and semi-quantitative PCR analysis. We developed a powerful approach to predict tissue-specific genes by analyzing existing microarray data from the NCBI's Gene Expression Omnibus (GEO) public repository. We investigated and confirmed tissue-specific gene expression in the human and mouse kidney, liver, lung, heart, muscle, and adipose tissue. Applying our novel comparative microarray approach, we confirmed 10 kidney, 11 liver, 11 lung, 11 heart, 8 muscle, and 8 adipose specific genes. The accuracy of this approach was further verified by employing semi-quantitative PCR and by searching for gene function information in existing publications. Three novel tissue-specific genes were discovered by this approach including AMDHD1 (amidohydrolase domain containing 1) in the liver, PRUNE2 (prune homolog 2) in the heart, and ACVR1C (activin A receptor, type IC) in adipose tissue. We further confirmed the tissue-specific expression of these 3 novel genes by real-time PCR. Among them, ACVR1C is adipose tissue-specific and adipocyte-specific in adipose tissue, and can be used as an adipocyte developmental marker. From GEO profiles, we predicted the processes in which AMDHD1 and PRUNE2 may participate. Our approach provides a novel way to identify new sets of tissue-specific genes and to predict functions in which they may be involved. PMID:23741331

  19. Analysis of Antisense Expression by Whole Genome Tiling Microarrays and siRNAs Suggests Mis-Annotation of Arabidopsis Orphan Protein-Coding Genes

    PubMed Central

    Richardson, Casey R.; Luo, Qing-Jun; Gontcharova, Viktoria; Jiang, Ying-Wen; Samanta, Manoj; Youn, Eunseog; Rock, Christopher D.

    2010-01-01

    Background MicroRNAs (miRNAs) and trans-acting small-interfering RNAs (tasi-RNAs) are small (20–22 nt long) RNAs (smRNAs) generated from hairpin secondary structures or antisense transcripts, respectively, that regulate gene expression by Watson-Crick pairing to a target mRNA and altering expression by mechanisms related to RNA interference. The high sequence homology of plant miRNAs to their targets has been the mainstay of miRNA prediction algorithms, which are limited in their predictive power for other kingdoms because miRNA complementarity is less conserved yet transitive processes (production of antisense smRNAs) are active in eukaryotes. We hypothesize that antisense transcription and associated smRNAs are biomarkers which can be computationally modeled for gene discovery. Principal Findings We explored rice (Oryza sativa) sense and antisense gene expression in publicly available whole genome tiling array transcriptome data and sequenced smRNA libraries (as well as C. elegans) and found evidence of transitivity of MIRNA genes similar to that found in Arabidopsis. Statistical analysis of antisense transcript abundances, presence of antisense ESTs, and association with smRNAs suggests several hundred Arabidopsis ‘orphan’ hypothetical genes are non-coding RNAs. Consistent with this hypothesis, we found novel Arabidopsis homologues of some MIRNA genes on the antisense strand of previously annotated protein-coding genes. A Support Vector Machine (SVM) was applied using thermodynamic energy of binding plus novel expression features of sense/antisense transcription topology and siRNA abundances to build a prediction model of miRNA targets. The SVM when trained on targets could predict the “ancient” (deeply conserved) class of validated Arabidopsis MIRNA genes with an accuracy of 84%, and 76% for “new” rapidly-evolving MIRNA genes. 
    Conclusions Antisense and smRNA expression features and computational methods may identify novel MIRNA genes and other non-coding RNAs in plants and potentially other kingdoms, which can provide insight into antisense transcription, miRNA evolution, and post-transcriptional gene regulation. PMID:20520764

  20. Predicting human genetic interactions from cancer genome evolution.

    PubMed

    Lu, Xiaowen; Megchelenbrink, Wout; Notebaart, Richard A; Huynen, Martijn A

    2015-01-01

    Synthetic Lethal (SL) genetic interactions play a key role in various types of biological research, ranging from understanding genotype-phenotype relationships to identifying drug targets against cancer. Despite recent advances in empirically measuring SL interactions in human cells, the human genetic interaction map is far from complete. Here, we present a novel approach to predict this map by exploiting patterns in cancer genome evolution. First, we show that empirically determined SL interactions are reflected in various gene presence, absence, and duplication patterns in hundreds of cancer genomes. The most evident pattern that we discovered is that when one member of an SL interaction gene pair is lost, the other gene tends not to be lost, i.e. the absence of co-loss. This observation is in line with expectation, because the loss of an SL interacting pair will be lethal to the cancer cell. SL interactions are also reflected in gene expression profiles, such as an under-representation of cases where the genes in an SL pair are both under-expressed, and an over-representation of cases where one gene of an SL pair is under-expressed, while the other one is over-expressed. We integrated the various previously unknown cancer genome patterns and the gene expression patterns into a computational model to identify SL pairs. This simple, genome-wide model achieves a high prediction power (AUC = 0.75) for known genetic interactions. It allows us to present for the first time a comprehensive genome-wide list of SL interactions with a high estimated prediction precision, covering up to 591,000 gene pairs. This unique list can potentially be used in various application areas ranging from biotechnology to medical genetics.
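    Illustrative sketch, not the authors' model: the "absence of co-loss" pattern described in the abstract above can be quantified for one gene pair as the ratio of observed to expected co-loss across tumours. The function name, the 0/1 input encoding and the independence assumption used for the expected count are choices made for this example only.

```python
def co_loss_ratio(lost_a, lost_b):
    """Observed / expected number of tumours in which both genes are lost.

    lost_a, lost_b: equal-length sequences of 0/1 flags, one entry per tumour.
    The expected count assumes the two loss events are independent.
    Returns None when there are no tumours or the expected count is zero.
    """
    if len(lost_a) != len(lost_b):
        raise ValueError("both genes need one flag per tumour")
    n = len(lost_a)
    if n == 0:
        return None
    observed = sum(1 for a, b in zip(lost_a, lost_b) if a and b)
    expected = sum(lost_a) * sum(lost_b) / n
    if expected == 0:
        return None
    return observed / expected
```

    A ratio well below 1 means the two genes are lost together less often than chance would suggest, which is the direction the abstract reports for SL pairs. The published model combines several such patterns with expression features.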

  1. Integrated analyses of microRNAs demonstrate their widespread influence on gene expression in high-grade serous ovarian carcinoma.

    PubMed

    Creighton, Chad J; Hernandez-Herrera, Anadulce; Jacobsen, Anders; Levine, Douglas A; Mankoo, Parminder; Schultz, Nikolaus; Du, Ying; Zhang, Yiqun; Larsson, Erik; Sheridan, Robert; Xiao, Weimin; Spellman, Paul T; Getz, Gad; Wheeler, David A; Perou, Charles M; Gibbs, Richard A; Sander, Chris; Hayes, D Neil; Gunaratne, Preethi H

    2012-01-01

    The Cancer Genome Atlas (TCGA) Network recently comprehensively catalogued the molecular aberrations in 487 high-grade serous ovarian cancers, with much remaining to be elucidated regarding the microRNAs (miRNAs). Here, using TCGA ovarian data, we surveyed the miRNAs, in the context of their predicted gene targets. Integration of miRNA and gene patterns yielded evidence that proximal pairs of miRNAs are processed from polycistronic primary transcripts, and that intronic miRNAs and their host gene mRNAs derive from common transcripts. Patterns of miRNA expression revealed multiple tumor subtypes and a set of 34 miRNAs predictive of overall patient survival. In a global analysis, miRNA:mRNA pairs anti-correlated in expression across tumors showed a higher frequency of in silico predicted target sites in the mRNA 3'-untranslated region (with less frequency observed for coding sequence and 5'-untranslated regions). The miR-29 family and predicted target genes were among the most strongly anti-correlated miRNA:mRNA pairs; over-expression of miR-29a in vitro repressed several anti-correlated genes (including DNMT3A and DNMT3B) and substantially decreased ovarian cancer cell viability. This study establishes miRNAs as having a widespread impact on gene expression programs in ovarian cancer, further strengthening our understanding of miRNA biology as it applies to human cancer. As with gene transcripts, miRNAs exhibit high diversity reflecting the genomic heterogeneity within a clinically homogeneous disease population. Putative miRNA:mRNA interactions, as identified using integrative analysis, can be validated. TCGA data are a valuable resource for the identification of novel tumor suppressive miRNAs in ovarian as well as other cancers.
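    Illustrative sketch, not the TCGA analysis pipeline: the global search for anti-correlated miRNA:mRNA pairs described in the abstract above amounts to correlating each miRNA with each mRNA across tumours and keeping the strongly negative pairs. The dictionary layout, the function names and the `threshold` value are assumptions made for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def anti_correlated_pairs(mirna_expr, mrna_expr, threshold=-0.5):
    """Return (r, miRNA, mRNA) tuples with r below threshold, most negative first.

    mirna_expr, mrna_expr: dicts mapping a name to per-tumour expression values,
    with tumours in the same order in every list (hypothetical layout).
    """
    pairs = []
    for mirna, mirna_values in mirna_expr.items():
        for gene, gene_values in mrna_expr.items():
            r = pearson(mirna_values, gene_values)
            if r < threshold:
                pairs.append((r, mirna, gene))
    return sorted(pairs)
```

    In a real analysis the retained pairs would then be compared against in silico target-site predictions, as the abstract describes; that step is outside this sketch.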

  2. Minimising Immunohistochemical False Negative ER Classification Using a Complementary 23 Gene Expression Signature of ER Status

    PubMed Central

    Li, Qiyuan; Eklund, Aron C.; Juul, Nicolai; Haibe-Kains, Benjamin; Workman, Christopher T.; Richardson, Andrea L.; Szallasi, Zoltan; Swanton, Charles

    2010-01-01

    Background Expression of the oestrogen receptor (ER) in breast cancer predicts benefit from endocrine therapy. Minimising the frequency of false negative ER status classification is essential to identify all patients with ER positive breast cancers who should be offered endocrine therapies in order to improve clinical outcome. In routine oncological practice ER status is determined by semi-quantitative methods such as immunohistochemistry (IHC) or other immunoassays in which the ER expression level is compared to an empirical threshold[1], [2]. The clinical relevance of gene expression-based ER subtypes as compared to IHC-based determination has not been systematically evaluated. Here we attempt to reduce the frequency of false negative ER status classification using two gene expression approaches and compare these methods to IHC based ER status in terms of predictive and prognostic concordance with clinical outcome. Methodology/Principal Findings Firstly, ER status was discriminated by fitting the bimodal expression of ESR1 to a mixed Gaussian model. The discriminative power of ESR1 suggested bimodal expression as an efficient way to stratify breast cancer; therefore we identified a set of genes whose expression was both strongly bimodal, mimicking ESR expression status, and highly expressed in breast epithelial cell lines, to derive a 23-gene ER expression signature-based classifier. We assessed our classifiers in seven published breast cancer cohorts by comparing the gene expression-based ER status to IHC-based ER status as a predictor of clinical outcome in both untreated and tamoxifen treated cohorts. In untreated breast cancer cohorts, the 23 gene signature-based ER status provided significantly improved prognostic power compared to IHC-based ER status (P = 0.006). In tamoxifen-treated cohorts, the 23 gene ER expression signature predicted clinical outcome (HR = 2.20, P = 0.00035). 
    These complementary ER signature-based strategies estimated that between 15.1% and 21.8% of patients with IHC-based negative ER status would be classified as having ER positive breast cancer. Conclusion/Significance Expression-based ER status classification may complement IHC to minimise false negative ER status classification and optimise patient stratification for endocrine therapies. PMID:21152022
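    Illustrative sketch, not the authors' code: the abstract above discriminates ER status by fitting the bimodal expression of ESR1 to a mixed Gaussian model. One common way to do that is a two-component Gaussian mixture fitted by expectation-maximisation (EM); the EM choice, the initialisation from the lower and upper halves of the sorted values, the function names and the iteration count are assumptions made for this example.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * sqrt(2 * pi))


def fit_two_gaussians(values, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture by EM; returns (weights, means, sds)."""
    xs = sorted(values)
    half = len(xs) // 2
    # Initialise the two means from the lower and upper halves of the data.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    overall = sum(xs) / len(xs)
    sd0 = sqrt(sum((x - overall) ** 2 for x in xs) / len(xs)) or 1.0
    sd = [sd0, sd0]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in xs:
            p = [w[j] * normal_pdf(x, mu[j], sd[j]) for j in (0, 1)]
            total = (p[0] + p[1]) or 1e-300
            resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate weight, mean and spread of each component.
        for j in (0, 1):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue
            w[j] = nj / len(xs)
            mu[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - mu[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sd[j] = max(sqrt(var), 1e-3)  # floor avoids a collapsed component
    return w, mu, sd


def classify_positive(x, w, mu, sd):
    """True if x is more likely under the higher-mean component."""
    hi = 0 if mu[0] > mu[1] else 1
    lo = 1 - hi
    return w[hi] * normal_pdf(x, mu[hi], sd[hi]) > w[lo] * normal_pdf(x, mu[lo], sd[lo])
```

    Here the higher-mean component stands in for the ER positive group. The 23-gene signature classifier in the abstract is a separate step that this sketch does not attempt.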

  3. Survey of the Heritability and Sparse Architecture of Gene Expression Traits across Human Tissues.

    PubMed

    Wheeler, Heather E; Shah, Kaanan P; Brenner, Jonathon; Garcia, Tzintzuni; Aquino-Michaels, Keston; Cox, Nancy J; Nicolae, Dan L; Im, Hae Kyung

    2016-11-01

    Understanding the genetic architecture of gene expression traits is key to elucidating the underlying mechanisms of complex traits. Here, for the first time, we perform a systematic survey of the heritability and the distribution of effect sizes across all representative tissues in the human body. We find that local h² can be relatively well characterized with 59% of expressed genes showing significant h² (FDR < 0.1) in the DGN whole blood cohort. However, current sample sizes (n ≤ 922) do not allow us to compute distal h². Bayesian Sparse Linear Mixed Model (BSLMM) analysis provides strong evidence that the genetic contribution to local expression traits is dominated by a handful of genetic variants rather than by the collective contribution of a large number of variants each of modest size. In other words, the local architecture of gene expression traits is sparse rather than polygenic across all 40 tissues (from DGN and GTEx) examined. This result is confirmed by the sparsity of optimal performing gene expression predictors via elastic net modeling. To further explore the tissue context specificity, we decompose the expression traits into cross-tissue and tissue-specific components using a novel Orthogonal Tissue Decomposition (OTD) approach. Through a series of simulations we show that the cross-tissue and tissue-specific components are identifiable via OTD. Heritability and sparsity estimates of these derived expression phenotypes show similar characteristics to the original traits. Consistent properties relative to prior GTEx multi-tissue analysis results suggest that these traits reflect the expected biology. Finally, we apply this knowledge to develop prediction models of gene expression traits for all tissues. The prediction models, heritability, and prediction performance R² for original and decomposed expression phenotypes are made publicly available (https://github.com/hakyimlab/PrediXcan).

  4. RNA-sequence data normalization through in silico prediction of reference genes: the bacterial response to DNA damage as case study.

    PubMed

    Berghoff, Bork A; Karlsson, Torgny; Källman, Thomas; Wagner, E Gerhart H; Grabherr, Manfred G

    2017-01-01

    Measuring how gene expression changes in the course of an experiment assesses how an organism responds on a molecular level. Sequencing of RNA molecules, and their subsequent quantification, aims to assess global gene expression changes on the RNA level (transcriptome). While advances in high-throughput RNA-sequencing (RNA-seq) technologies allow for inexpensive data generation, accurate post-processing and normalization across samples is required to eliminate any systematic noise introduced by the biochemical and/or technical processes. Existing methods thus either normalize on selected known reference genes that are invariant in expression across the experiment, assume that the majority of genes are invariant, or that the effects of up- and down-regulated genes cancel each other out during the normalization. Here, we present a novel method, moose2, which predicts invariant genes in silico through a dynamic programming (DP) scheme and applies a quadratic normalization based on this subset. The method allows for specifying a set of known or experimentally validated invariant genes, which guides the DP. We experimentally verified the predictions of this method in the bacterium Escherichia coli, and show how moose2 is able to (i) estimate the expression value distances between RNA-seq samples, (ii) reduce the variation of expression values across all samples, and (iii) subsequently reveal new functional groups of genes during the late stages of DNA damage. We further applied the method to three eukaryotic data sets, on which its performance compares favourably to other methods. The software is implemented in C++ and is publicly available from http://grabherr.github.io/moose2/.
    The proposed RNA-seq normalization method, moose2, is a valuable alternative to existing methods, with two major advantages: (i) in silico prediction of invariant genes provides a list of potential reference genes for downstream analyses, and (ii) non-linear artefacts in RNA-seq data are handled adequately to minimize variations between replicates.

  5. Gene expression profile associated with superimposed non-alcoholic fatty liver disease and hepatic fibrosis in patients with chronic hepatitis C.

    PubMed

    Younossi, Zobair M; Afendy, Arian; Stepanova, Maria; Hossain, Noreen; Younossi, Issah; Ankrah, Kathy; Gramlich, Terry; Baranova, Ancha

    2009-10-01

    Hepatic steatosis occurs in 40-70% of patients chronically infected with hepatitis C virus [chronic hepatitis C (CH-C)]. Hepatic steatosis in CH-C is associated with progressive liver disease and a low response rate to antiviral therapy. Gene expression profiles were examined in CH-C patients with and without hepatic steatosis, non-alcoholic steatohepatitis (NASH) and fibrosis. This study included 65 CH-C patients who were not receiving antiviral treatment. Total RNA was extracted from peripheral blood mononuclear cells, quantified and used for one-step reverse transcriptase-polymerase chain reaction to profile 153 mRNAs that were normalized with six 'housekeeping' genes and a reference RNA. Multiple regression and stepwise selection assessed differences in gene expression and the models' performances were evaluated. Models predicting the grade of hepatic steatosis in patients with CH-C genotype 3 involved two genes: SOCS1 and IFITM1, which progressively changed their expression level with the increasing grade of steatosis. On the other hand, models predicting hepatic steatosis in non-genotype 3 patients highlighted MIP-1 cytokine encoding genes: CCL3 and CCL4 as well as IFNAR and PRKRIR. Expression levels of PRKRIR and SMAD3 differentiated patients with and without superimposed NASH only in the non-genotype 3 cohort (area under the receiver operating characteristic curve=0.822, P-value 0.006). Gene expression signatures related to hepatic fibrosis were not genotype specific. Gene expression might predict moderate to severe hepatic steatosis, NASH and fibrosis in patients with CH-C, providing potential insights into the pathogenesis of hepatic steatosis and fibrosis in these patients.

  6. Transcriptional network inference from functional similarity and expression data: a global supervised approach.

    PubMed

    Ambroise, Jérôme; Robert, Annie; Macq, Benoit; Gala, Jean-Luc

    2012-01-06

    An important challenge in system biology is the inference of biological networks from postgenomic data. Among these biological networks, a gene transcriptional regulatory network focuses on interactions existing between transcription factors (TFs) and their corresponding target genes. A large number of reverse engineering algorithms were proposed to infer such networks from gene expression profiles, but most current methods have relatively low predictive performances. In this paper, we introduce the novel TNIFSED method (Transcriptional Network Inference from Functional Similarity and Expression Data), that infers a transcriptional network from the integration of correlations and partial correlations of gene expression profiles and gene functional similarities through a supervised classifier. In the current work, TNIFSED was applied to predict the transcriptional network in Escherichia coli and in Saccharomyces cerevisiae, using datasets of 445 and 170 affymetrix arrays, respectively. Using the area under the curve of the receiver operating characteristics and the F-measure as indicators, we showed the predictive performance of TNIFSED to be better than unsupervised state-of-the-art methods. TNIFSED performed slightly worse than the supervised SIRENE algorithm for the target genes identification of the TF having a wide range of yet identified target genes but better for TF having only few identified target genes. Our results indicate that TNIFSED is complementary to the SIRENE algorithm, and particularly suitable to discover target genes of "orphan" TFs.
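
To make the distinction between correlation and partial correlation concrete, the following Python sketch computes both for two expression profiles given a third one. It uses the standard first-order partial correlation formula and is only an illustration of the kind of feature the abstract mentions; it is not the TNIFSED feature pipeline, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Toy profiles: x and y both follow z plus independent perturbations,
# so they correlate strongly, but little association remains given z.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [zi + e for zi, e in zip(z, [1, -1, 1, -1, 1, -1, 1, -1])]
y = [zi + e for zi, e in zip(z, [1, 1, -1, -1, 1, 1, -1, -1])]
r_xy = pearson(x, y)
r_xy_given_z = partial_corr(x, y, z)
```

A high plain correlation with a near-zero partial correlation suggests that two genes are co-regulated by a shared factor rather than linked directly, which is why the two quantities carry complementary information.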

  7. Integrating multiple molecular sources into a clinical risk prediction signature by extracting complementary information.

    PubMed

    Hieke, Stefanie; Benner, Axel; Schlenl, Richard F; Schumacher, Martin; Bullinger, Lars; Binder, Harald

    2016-08-30

    High-throughput technology allows for genome-wide measurements at different molecular levels for the same patient, e.g. single nucleotide polymorphisms (SNPs) and gene expression. Correspondingly, it might be beneficial to also integrate complementary information from different molecular levels when building multivariable risk prediction models for a clinical endpoint, such as treatment response or survival. Unfortunately, such a high-dimensional modeling task will often be complicated by a limited overlap of molecular measurements at different levels between patients, i.e. measurements from all molecular levels are available only for a smaller proportion of patients. We propose a sequential strategy for building clinical risk prediction models that integrate genome-wide measurements from two molecular levels in a complementary way. To deal with partial overlap, we develop an imputation approach that allows us to use all available data. This approach is investigated in two acute myeloid leukemia applications combining gene expression with either SNP or DNA methylation data. After obtaining a sparse risk prediction signature e.g. from SNP data, an automatically selected set of prognostic SNPs, by componentwise likelihood-based boosting, imputation is performed for the corresponding linear predictor by a linking model that incorporates e.g. gene expression measurements. The imputed linear predictor is then used for adjustment when building a prognostic signature from the gene expression data. For evaluation, we consider stability, as quantified by inclusion frequencies across resampling data sets. Despite an extremely small overlap in the application example with gene expression and SNPs, several genes are seen to be more stably identified when taking the (imputed) linear predictor from the SNP data into account. 
In the application with gene expression and DNA methylation, prediction performance with respect to survival also indicates that the proposed approach might work well. We consider imputation of linear predictor values to be a feasible and sensible approach for dealing with partial overlap in complementary integrative analysis of molecular measurements at different levels. More generally, these results indicate that a complementary strategy for integrating different molecular levels can result in more stable risk prediction signatures, potentially providing a more reliable insight into the underlying biology.

  8. Prediction of epigenetically regulated genes in breast cancer cell lines.

    PubMed

    Loss, Leandro A; Sadanandam, Anguraj; Durinck, Steffen; Nautiyal, Shivani; Flaucher, Diane; Carlton, Victoria E H; Moorhead, Martin; Lu, Yontao; Gray, Joe W; Faham, Malek; Spellman, Paul; Parvin, Bahram

    2010-06-04

    Methylation of CpG islands within the DNA promoter regions is one mechanism that leads to aberrant gene expression in cancer. In particular, the abnormal methylation of CpG islands may silence associated genes. Therefore, using high-throughput microarrays to measure CpG island methylation will lead to better understanding of tumor pathobiology and progression, while revealing potentially new biomarkers. We have examined a recently developed high-throughput technology for measuring genome-wide methylation patterns called mTACL. Here, we propose a computational pipeline for integrating gene expression and CpG island methylation profiles to identify epigenetically regulated genes for a panel of 45 breast cancer cell lines, which is widely used in the Integrative Cancer Biology Program (ICBP). The pipeline (i) reduces the dimensionality of the methylation data, (ii) associates the reduced methylation data with gene expression data, and (iii) ranks methylation-expression associations according to their epigenetic regulation. Dimensionality reduction is performed in two steps: (i) methylation sites are grouped across the genome to identify regions of interest, and (ii) methylation profiles are clustered within each region. Associations between the clustered methylation and the gene expression data sets generate candidate matches within a fixed neighborhood around each gene. Finally, the methylation-expression associations are ranked through a logistic regression, and their significance is quantified through permutation analysis. Our two-step dimensionality reduction compressed 90% of the original data, reducing 137,688 methylation sites to 14,505 clusters. Methylation-expression associations produced 18,312 correspondences, which were used to further analyze epigenetic regulation. 
Logistic regression was used to identify 58 genes from these correspondences that showed a statistically significant negative correlation between methylation profiles and gene expression in the panel of breast cancer cell lines. Subnetwork enrichment of these genes has identified 35 common regulators with 6 or more predicted markers. In addition to identifying epigenetically regulated genes, we show evidence of differentially expressed methylation patterns between the basal and luminal subtypes. Our results indicate that the proposed computational protocol is a viable platform for identifying epigenetically regulated genes. Our protocol has generated a list of predictors including COL1A2, TOP2A, TFF1, and VAV3, genes whose key roles in epigenetic regulation are documented in the literature. Subnetwork enrichment of these predicted markers further suggests that epigenetic regulation of individual genes occurs in a coordinated fashion and through common regulators.
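
The permutation step can be sketched in Python as follows. As a simplifying assumption, the statistic here is a plain Pearson correlation between one methylation profile and one expression profile, not the logistic-regression ranking used in the abstract; the function names, the number of permutations and the seed are all hypothetical choices.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_pvalue(methylation, expression, n_perm=999, seed=0):
    """One-sided permutation p-value for a negative association.

    Shuffles the expression values across cell lines and counts how often
    the permuted correlation is at least as negative as the observed one.
    """
    rng = random.Random(seed)
    observed = pearson(methylation, expression)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(methylation, shuffled) <= observed:
            hits += 1
    # The +1 terms count the observed arrangement itself.
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `permutation_pvalue(meth, expr)` on profiles measured over the same cell lines returns the observed correlation and its empirical p-value; with many methylation-expression pairs, these p-values would still need a multiple-testing correction.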

  9. A comparison of machine learning techniques for survival prediction in breast cancer

    PubMed Central

    2011-01-01

    Background The ability to accurately classify cancer patients into risk classes, i.e. to predict the outcome of the pathology on an individual basis, is a key ingredient in making therapeutic decisions. In recent years gene expression data have been successfully used to complement the clinical and histological criteria traditionally used in such prediction. Many "gene expression signatures" have been developed, i.e. sets of genes whose expression values in a tumor can be used to predict the outcome of the pathology. Here we investigate the use of several machine learning techniques to classify breast cancer patients using one of such signatures, the well established 70-gene signature. Results We show that Genetic Programming performs significantly better than Support Vector Machines, Multilayered Perceptrons and Random Forests in classifying patients from the NKI breast cancer dataset, and comparably to the scoring-based method originally proposed by the authors of the 70-gene signature. Furthermore, Genetic Programming is able to perform an automatic feature selection. Conclusions Since the performance of Genetic Programming is likely to be improvable compared to the out-of-the-box approach used here, and given the biological insight potentially provided by the Genetic Programming solutions, we conclude that Genetic Programming methods are worth further investigation as a tool for cancer patient classification based on gene expression data. PMID:21569330

  10. Discovering functions of unannotated genes from a transcriptome survey of wild fungal isolates.

    PubMed

    Ellison, Christopher E; Kowbel, David; Glass, N Louise; Taylor, John W; Brem, Rachel B

    2014-04-01

    Most fungal genomes are poorly annotated, and many fungal traits of industrial and biomedical relevance are not well suited to classical genetic screens. Assigning genes to phenotypes on a genomic scale thus remains an urgent need in the field. We developed an approach to infer gene function from expression profiles of wild fungal isolates, and we applied our strategy to the filamentous fungus Neurospora crassa. Using transcriptome measurements in 70 strains from two well-defined clades of this microbe, we first identified 2,247 cases in which the expression of an unannotated gene rose and fell across N. crassa strains in parallel with the expression of well-characterized genes. We then used image analysis of hyphal morphologies, quantitative growth assays, and expression profiling to test the functions of four genes predicted from our population analyses. The results revealed two factors that influenced regulation of metabolism of nonpreferred carbon and nitrogen sources, a gene that governed hyphal architecture, and a gene that mediated amino acid starvation resistance. These findings validate the power of our population-transcriptomic approach for inference of novel gene function, and we suggest that this strategy will be of broad utility for genome-scale annotation in many fungal systems. IMPORTANCE Some fungal species cause deadly infections in humans or crop plants, and other fungi are workhorses of industrial chemistry, including the production of biofuels. Advances in medical and industrial mycology require an understanding of the genes that control fungal traits. We developed a method to infer functions of uncharacterized genes by observing correlated expression of their mRNAs with those of known genes across wild fungal isolates. We applied this strategy to a filamentous fungus and predicted functions for thousands of unknown genes. 
In four cases, we experimentally validated the predictions from our method, discovering novel genes involved in the metabolism of nutrient sources relevant for biofuel production, as well as colony morphology and starvation resistance. Our strategy is straightforward, inexpensive, and applicable for predicting gene function in many fungal species.

  11. A Genomic Score Prognostic of Outcome in Trauma Patients

    PubMed Central

    Warren, H Shaw; Elson, Constance M; Hayden, Douglas L; Schoenfeld, David A; Cobb, J Perren; Maier, Ronald V; Moldawer, Lyle L; Moore, Ernest E; Harbrecht, Brian G; Pelak, Kimberly; Cuschieri, Joseph; Herndon, David N; Jeschke, Marc G; Finnerty, Celeste C; Brownstein, Bernard H; Hennessy, Laura; Mason, Philip H; Tompkins, Ronald G

    2009-01-01

    Traumatic injuries frequently lead to infection, organ failure, and death. Health care providers rely on several injury scoring systems to quantify the extent of injury and to help predict clinical outcome. Physiological, anatomical, and clinical laboratory analytic scoring systems (Acute Physiology and Chronic Health Evaluation [APACHE], Injury Severity Score [ISS]) are utilized, with limited success, to predict outcome following injury. The recent development of techniques for measuring the expression level of all of a person’s genes simultaneously may make it possible to develop an injury scoring system based on the degree of gene activation. We hypothesized that a peripheral blood leukocyte gene expression score could predict outcome, including multiple organ failure, following severe blunt trauma. To test such a scoring system, we measured gene expression of peripheral blood leukocytes from patients within 12 h of traumatic injury. cRNA derived from whole blood leukocytes obtained within 12 h of injury provided gene expression data for the entire genome that were used to create a composite gene expression score for each patient. Total blood leukocytes were chosen because they are active during inflammation, which is reflective of poor outcome. The gene expression score combines the activation levels of all the genes into a single number which compares the patient’s gene expression to the average gene expression in uninjured volunteers. Expression profiles from healthy volunteers were averaged to create a reference gene expression profile which was used to compute a difference from reference (DFR) score for each patient. This score described the overall genomic response of patients within the first 12 h following severe blunt trauma. Regression models were used to compare the association of the DFR, APACHE, and ISS scores with outcome. 
We hypothesized that patients with a total gene response more different from uninjured volunteers would tend to have poorer outcome than those more similar. Our data show that for measures of poor outcome, such as infections, organ failures, and length of hospital stay, this is correct. DFR scores were associated significantly with adverse outcome, including multiple organ failure, duration of ventilation, length of hospital stay, and infection rate. The association remained significant after adjustment for injury severity as measured by APACHE or ISS. A single score representing changes in gene expression in peripheral blood leukocytes within hours of severe blunt injury is associated with adverse clinical outcomes that develop later in the hospital course. Assessment of genome-wide gene expression provides useful clinical information that is different from that provided by currently utilized anatomic or physiologic scores. PMID:19593405
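
A difference-from-reference score of this kind can be sketched in a few lines of Python. The abstract does not give the exact formula, so the sum of squared per-gene differences used below is an assumption made purely for illustration, and the function names are hypothetical.

```python
def reference_profile(control_profiles):
    """Average the expression profiles of healthy volunteers gene by gene."""
    n = len(control_profiles)
    return [sum(values) / n for values in zip(*control_profiles)]


def dfr_score(patient_profile, reference):
    """Collapse a patient's profile into one number: the sum of squared
    differences from the reference (an assumed distance, see lead-in)."""
    return sum((p - r) ** 2 for p, r in zip(patient_profile, reference))


# Toy data: two controls and two patients measured on three genes.
controls = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
reference = reference_profile(controls)        # [2.0, 2.0, 2.0]
score_similar = dfr_score([2.0, 2.0, 2.0], reference)   # 0.0
score_distant = dfr_score([3.0, 4.0, 2.0], reference)   # 5.0
```

Under this sketch a larger score means the patient's overall expression lies further from the healthy average, which is the quantity the abstract relates to adverse outcome.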

  12. Study on predictive role of AR and EGFR family genes with response to neoadjuvant chemotherapy in locally advanced breast cancer in Indian women.

    PubMed

    Singh, L C; Chakraborty, Anurupa; Mishra, Ashwani K; Devi, Thoudam Regina; Sugandhi, Nidhi; Chintamani, Chintamani; Bhatnagar, Dinesh; Kapur, Sujala; Saxena, Sunita

    2012-06-01

    Locally advanced breast cancer (LABC) remains a clinical challenge as the majority of patients with this diagnosis develop distant metastases despite appropriate therapy. We analyzed expression of steroid and growth hormone receptor genes as well as genes associated with metabolism of chemotherapeutic drugs in locally advanced breast cancer before and after neoadjuvant chemotherapy (NACT) to study whether there is a change in gene expression induced by chemotherapy and whether such changes are associated with tumor response or non-response. Fifty patients were included with locally advanced breast cancer treated with cyclophosphamide, adriamycin, 5-fluorouracil (CAF)-based neoadjuvant chemotherapy before surgery. Total RNA was extracted from 50 matched samples of pre- and post-NACT tumor tissues. RNA expression levels of epidermal growth factor receptor family genes including EGFR, ERBB2, ERBB3, androgen receptor (AR), and multidrug-resistance gene 1 (MDR1) were determined by quantitative real-time reverse transcriptase-polymerase chain reaction. Responders show significantly high levels of pre-NACT AR gene expression (P = 0.016), which reduces following NACT (P = 0.008), and hence can serve as a useful tool for the prediction of the success of neoadjuvant chemotherapy in individual cancer patients with locally advanced breast carcinoma. Moreover, a significant post-therapeutic increase in the expression levels of EGFR and MDR1 gene in responders (P = 0.026 and P < 0.001) as well as in non-responders (P = 0.055, P = 0.001) suggests that expression of these genes changes during therapy but they do not have any impact on tumor response, whereas a post-therapeutic reduction was observed in AR in responders. This indicates an independent predictive role of AR with response to NACT.

  13. A transversal approach to predict gene product networks from ontology-based similarity

    PubMed Central

    Chabalier, Julie; Mosser, Jean; Burgun, Anita

    2007-01-01

    Background Interpretation of transcriptomic data is usually made through a "standard" approach which consists in clustering the genes according to their expression patterns and exploiting Gene Ontology (GO) annotations within each expression cluster. This approach makes it difficult to underline functional relationships between gene products that belong to different expression clusters. To address this issue, we propose a transversal analysis that aims to predict functional networks based on a combination of GO processes and data expression. Results The transversal approach presented in this paper consists in computing the semantic similarity between gene products in a Vector Space Model. Through a weighting scheme over the annotations, we take into account the representativity of the terms that annotate a gene product. Comparing annotation vectors results in a matrix of gene product similarities. Combined with expression data, the matrix is displayed as a set of functional gene networks. The transversal approach was applied to 186 genes related to the enterocyte differentiation stages. This approach resulted in 18 functional networks proved to be biologically relevant. These results were compared with those obtained through a standard approach and with an approach based on information content similarity. Conclusion Complementary to the standard approach, the transversal approach offers new insight into the cellular mechanisms and reveals new research hypotheses by combining gene product networks based on semantic similarity, and data expression. PMID:17605807
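
The vector-space comparison of annotated gene products can be sketched in Python as below. The abstract does not specify its weighting scheme, so an inverse-document-frequency weight is assumed here as a stand-in for term representativity; the gene and term identifiers are placeholders, not real annotations, and the function names are hypothetical.

```python
import math


def idf_weights(annotations):
    """Weight each term by how rare it is across gene products
    (assumed scheme: log(N / count) + 1, so shared terms still count)."""
    n = len(annotations)
    counts = {}
    for terms in annotations.values():
        for term in set(terms):
            counts[term] = counts.get(term, 0) + 1
    return {term: math.log(n / c) + 1.0 for term, c in counts.items()}


def cosine_similarity(terms_a, terms_b, weights):
    """Cosine of the angle between two weighted annotation vectors."""
    a, b = set(terms_a), set(terms_b)
    dot = sum(weights[t] ** 2 for t in a & b)
    norm_a = math.sqrt(sum(weights[t] ** 2 for t in a))
    norm_b = math.sqrt(sum(weights[t] ** 2 for t in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Placeholder annotations: gene name -> list of annotation terms.
annotations = {
    "gene_a": ["term_x", "term_y"],
    "gene_b": ["term_x", "term_y"],
    "gene_c": ["term_x", "term_z"],
    "gene_d": ["term_w"],
}
weights = idf_weights(annotations)
sim_ab = cosine_similarity(annotations["gene_a"], annotations["gene_b"], weights)
sim_ac = cosine_similarity(annotations["gene_a"], annotations["gene_c"], weights)
sim_ad = cosine_similarity(annotations["gene_a"], annotations["gene_d"], weights)
```

Computing this similarity for every pair of gene products yields the similarity matrix described in the abstract, which can then be thresholded and combined with expression data to draw networks.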

  14. Cloning and characterization of a mouse gene with homology to the human von Hippel-Lindau disease tumor suppressor gene: implications for the potential organization of the human von Hippel-Lindau disease gene.

    PubMed

    Gao, J; Naglich, J G; Laidlaw, J; Whaley, J M; Seizinger, B R; Kley, N

    1995-02-15

    The human von Hippel-Lindau disease (VHL) gene has recently been identified and, based on the nucleotide sequence of a partial cDNA clone, has been predicted to encode a novel protein with as yet unknown functions [F. Latif et al., Science (Washington DC), 260: 1317-1320, 1993]. The length of the encoded protein and the characteristics of the cellular expressed protein are as yet unclear. Here we report the cloning and characterization of a mouse gene (mVHLh1) that is widely expressed in different mouse tissues and shares high homology with the human VHL gene. It predicts a protein 181 residues long (and/or 162 amino acids, considering a potential alternative start codon), which across a core region of approximately 140 residues displays a high degree of sequence identity (98%) to the predicted human VHL protein. High stringency DNA and RNA hybridization experiments and protein expression analyses indicate that this gene is the most highly VHL-related mouse gene, suggesting that it represents the mouse VHL gene homologue rather than a related gene sharing a conserved functional domain. These findings provide new insights into the potential organization of the VHL gene and nature of its encoded protein.

  15. Association between expression of random gene sets and survival is evident in multiple cancer types and may be explained by sub-classification.

    PubMed

    Shimoni, Yishai

    2018-02-01

    One of the goals of cancer research is to identify a set of genes that cause or control disease progression. However, although multiple such gene sets were published, these are usually in very poor agreement with each other, and very few of the genes proved to be functional therapeutic targets. Furthermore, recent findings from a breast cancer gene-expression cohort showed that sets of genes selected randomly can be used to predict survival with a much higher probability than expected. These results imply that many of the genes identified in breast cancer gene expression analysis may not be causal of cancer progression, even though they can still be highly predictive of prognosis. We performed a similar analysis on all the cancer types available in the cancer genome atlas (TCGA), namely, estimating the predictive power of random gene sets for survival. Our work shows that most cancer types exhibit the property that random selections of genes are more predictive of survival than expected. In contrast to previous work, this property is not removed by using a proliferation signature, which implies that proliferation may not always be the confounder that drives this property. We suggest one possible solution in the form of data-driven sub-classification to reduce this property significantly. Our results suggest that the predictive power of random gene sets may be used to identify the existence of sub-classes in the data, and thus may allow better understanding of patient stratification. Furthermore, by reducing the observed bias this may allow more direct identification of biologically relevant, and potentially causal, genes.

  16. Association between expression of random gene sets and survival is evident in multiple cancer types and may be explained by sub-classification

    PubMed Central

    2018-01-01

    One of the goals of cancer research is to identify a set of genes that cause or control disease progression. However, although multiple such gene sets were published, these are usually in very poor agreement with each other, and very few of the genes proved to be functional therapeutic targets. Furthermore, recent findings from a breast cancer gene-expression cohort showed that sets of genes selected randomly can be used to predict survival with a much higher probability than expected. These results imply that many of the genes identified in breast cancer gene expression analysis may not be causal of cancer progression, even though they can still be highly predictive of prognosis. We performed a similar analysis on all the cancer types available in the cancer genome atlas (TCGA), namely, estimating the predictive power of random gene sets for survival. Our work shows that most cancer types exhibit the property that random selections of genes are more predictive of survival than expected. In contrast to previous work, this property is not removed by using a proliferation signature, which implies that proliferation may not always be the confounder that drives this property. We suggest one possible solution in the form of data-driven sub-classification to reduce this property significantly. Our results suggest that the predictive power of random gene sets may be used to identify the existence of sub-classes in the data, and thus may allow better understanding of patient stratification. Furthermore, by reducing the observed bias this may allow more direct identification of biologically relevant, and potentially causal, genes. PMID:29470520

  17. Co-acting gene networks predict TRAIL responsiveness of tumour cells with high accuracy.

    PubMed

    O'Reilly, Paul; Ortutay, Csaba; Gernon, Grainne; O'Connell, Enda; Seoighe, Cathal; Boyce, Susan; Serrano, Luis; Szegezdi, Eva

    2014-12-19

    Identification of differentially expressed genes from transcriptomic studies is one of the most common mechanisms to identify tumor biomarkers. This approach however is not well suited to identify interaction between genes whose protein products potentially influence each other, which limits its power to identify molecular wiring of tumour cells dictating response to a drug. Due to the fact that signal transduction pathways are not linear and highly interlinked, the biological response they drive may be better described by the relative amount of their components and their functional relationships than by their individual, absolute expression. Gene expression microarray data for 109 tumor cell lines with known sensitivity to the death ligand cytokine tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) was used to identify genes with potential functional relationships determining responsiveness to TRAIL-induced apoptosis. The machine learning technique Random Forest in the statistical environment "R" with backward elimination was used to identify the key predictors of TRAIL sensitivity and differentially expressed genes were identified using the software GeneSpring. Gene co-regulation and statistical interaction was assessed with q-order partial correlation analysis and non-rejection rate. Biological (functional) interactions amongst the co-acting genes were studied with Ingenuity network analysis. Prediction accuracy was assessed by calculating the area under the receiver operator curve using an independent dataset. We show that the gene panel identified could predict TRAIL-sensitivity with a very high degree of sensitivity and specificity (AUC=0.84). The genes in the panel are co-regulated and at least 40% of them functionally interact in signal transduction pathways that regulate cell death and cell survival, cellular differentiation and morphogenesis. 
Importantly, only 12% of the TRAIL-predictor genes were differentially expressed highlighting the importance of functional interactions in predicting the biological response. The advantage of co-acting gene clusters is that this analysis does not depend on differential expression and is able to incorporate direct- and indirect gene interactions as well as tissue- and cell-specific characteristics. This approach (1) identified a descriptor of TRAIL sensitivity which performs significantly better as a predictor of TRAIL sensitivity than any previously reported gene signatures, (2) identified potential novel regulators of TRAIL-responsiveness and (3) provided a systematic view highlighting fundamental differences between the molecular wiring of sensitive and resistant cell types.

  18. A gene expression signature associated with survival in metastatic melanoma

    PubMed Central

    Mandruzzato, Susanna; Callegaro, Andrea; Turcatel, Gianluca; Francescato, Samuela; Montesco, Maria C; Chiarion-Sileni, Vanna; Mocellin, Simone; Rossi, Carlo R; Bicciato, Silvio; Wang, Ena; Marincola, Francesco M; Zanovello, Paola

    2006-01-01

    Background Current clinical and histopathological criteria used to define the prognosis of melanoma patients are inadequate for accurate prediction of clinical outcome. We investigated whether genome screening by means of high-throughput gene microarray might provide clinically useful information on patient survival. Methods Forty-three tumor tissues from 38 patients with stage III and stage IV melanoma were profiled with a 17,500 element cDNA microarray. Expression data were analyzed using significance analysis of microarrays (SAM) to identify genes associated with patient survival, and supervised principal components (SPC) to determine survival prediction. Results SAM analysis revealed a set of 80 probes, corresponding to 70 genes, associated with survival, i.e. 45 probes characterizing longer and 35 shorter survival times, respectively. These transcripts were included in a survival prediction model designed using SPC and cross-validation which allowed identifying 30 predicting probes out of the 80 associated with survival. Conclusion The longer-survival group of genes included those expressed in immune cells, both innate and acquired, confirming the interplay between immunological mechanisms and the natural history of melanoma. Genes linked to immune cells were totally lacking in the poor-survival group, which was instead associated with a number of genes related to highly proliferative and invasive tumor cells. PMID:17129373

  19. The Application of Gene Expression Profiling in Predictions of Occult Lymph Node Metastasis in Colorectal Cancer Patients

    PubMed Central

    Peyravian, Noshad; Larki, Pegah; Gharib, Ehsan; Nazemalhosseini-Mojarad, Ehsan; Anaraki, Fakhrosadate; Young, Chris; McClellan, James; Ashrafian Bonab, Maziar; Asadzadeh-Aghdaei, Hamid; Zali, Mohammad Reza

    2018-01-01

    A key factor in determining the likely outcome for a patient with colorectal cancer is whether or not the tumour has metastasised to the lymph nodes—information which is also important in assessing any possibilities of lymph node resection so as to improve survival. In this review we perform a wide-range assessment of literature relating to recent developments in gene expression profiling (GEP) of the primary tumour, to determine their utility in assessing node status. A set of characteristic genes seems to be involved in the prediction of lymph node metastasis (LNM) in colorectal patients. Hence, GEP is applicable in personalised/individualised/tailored therapies and provides insights into developing novel therapeutic targets. Not only is GEP useful in prediction of LNM, but it also allows classification based on differences such as sample size, target gene expression, and examination method. PMID:29498671

  20. Fuzzy Neural Network Applied to Gene Expression Profiling for Predicting the Prognosis of Diffuse Large B‐cell Lymphoma

    PubMed Central

    Ando, Tatsuya; Suguro, Miyuki; Hanai, Taizo; Kobayashi, Takeshi; Seto, Masao

    2002-01-01

    Diffuse large B‐cell lymphoma (DLBCL) is the largest category of aggressive lymphomas. Less than 50% of patients can be cured by combination chemotherapy. Microarray technologies have recently shown that the response to chemotherapy reflects the molecular heterogeneity in DLBCL. On the basis of published microarray data, we attempted to develop a long‐overdue method for the precise and simple prediction of survival of DLBCL patients. We developed a fuzzy neural network (FNN) model to analyze gene expression profiling data for DLBCL. From data on 5857 genes, this model identified four genes (CD10, AA807551, AA805611 and IRF‐4) that could be used to predict prognosis with 93% accuracy. FNNs are powerful tools for extracting significant biological markers affecting prognosis, and are applicable to various kinds of expression profiling data for any malignancy. PMID:12460461

  1. Inferring evolution of gene duplicates using probabilistic models and nonparametric belief propagation.

    PubMed

    Zeng, Jia; Hannenhalli, Sridhar

    2013-01-01

    Gene duplication, followed by functional evolution of duplicate genes, is a primary engine of evolutionary innovation. In turn, gene expression evolution is a critical component of overall functional evolution of paralogs. Inferring evolutionary history of gene expression among paralogs is therefore a problem of considerable interest. It also represents significant challenges. The standard approaches of evolutionary reconstruction assume that at an internal node of the duplication tree, the two duplicates evolve independently. However, because of various selection pressures functional evolution of the two paralogs may be coupled. The coupling of paralog evolution corresponds to three major fates of gene duplicates: subfunctionalization (SF), conserved function (CF) or neofunctionalization (NF). Quantitative analysis of these fates is of great interest and clearly influences evolutionary inference of expression. These two interrelated problems of inferring gene expression and evolutionary fates of gene duplicates have not been studied together previously and motivate the present study. Here we propose a novel probabilistic framework and algorithm to simultaneously infer (i) ancestral gene expression and (ii) the likely fate (SF, NF, CF) at each duplication event during the evolution of gene family. Using tissue-specific gene expression data, we develop a nonparametric belief propagation (NBP) algorithm to predict the ancestral expression level as a proxy for function, and describe a novel probabilistic model that relates the predicted and known expression levels to the possible evolutionary fates. We validate our model using simulation and then apply it to a genome-wide set of gene duplicates in human. Our results suggest that SF tends to be more frequent at the earlier stage of gene family expansion, while NF occurs more frequently later on.

  2. Gene expression analysis in zebrafish embryos: a potential approach to predict effect concentrations in the fish early life stage test.

    PubMed

    Weil, Mirco; Scholz, Stefan; Zimmer, Michaela; Sacher, Frank; Duis, Karen

    2009-09-01

    Based on the hypothesis that analysis of gene expression could be used to predict chronic fish toxicity, the zebrafish (Danio rerio) embryo test (DarT), developed as a replacement method for the acute fish test, was expanded to a gene expression D. rerio embryo test (Gene-DarT). The effects of 14 substances on lethal and sublethal endpoints of the DarT and on expression of potential marker genes were investigated: the aryl hydrocarbon receptor 2, cytochrome P450 1A (cyp1a), heat shock protein 70, fizzy-related protein 1, the transcription factors v-maf musculoaponeurotic fibrosarcoma oncogene family protein g (avian) 1 and NF-E2-p45-related factor, and heme oxygenase 1 (hmox1). After exposure of zebrafish embryos for 48 h, differential gene expression was evaluated using reverse transcriptase-polymerase chain reaction, gel electrophoresis, and densitometric analysis of the gels. All tested compounds significantly affected the expression of at least one potential marker gene, with cyp1a and hmox1 being most sensitive. Lowest-observed-effect concentrations (LOECs) for gene expression were below concentrations resulting in 10% lethal effects in the DarT. For 10 (3,4- and 3,5-dichloroaniline, 1,4-dichlorobenzene, 2,4-dinitrophenol, atrazine, parathion-ethyl, chlorotoluron, genistein, 4-nitroquinoline-1-oxide, and cadmium) out of the 14 tested substances, LOEC values derived with the Gene-DarT differ by a factor of less than 10 from LOEC values of fish early life stage tests with zebrafish. For pentachloroaniline and pentachlorobenzene, the Gene-DarT showed a 23- and 153-fold higher sensitivity, respectively, while for lindane, it showed a 13-fold lower sensitivity. For ivermectin, the Gene-DarT was more than 1,000-fold less sensitive than the acute fish test. The results of the present study indicate that gene expression analysis in zebrafish embryos could in principle be used to predict effect concentrations in the fish early life stage test.

  3. Neighboring Genes Show Correlated Evolution in Gene Expression

    PubMed Central

    Ghanbarian, Avazeh T.; Hurst, Laurence D.

    2015-01-01

    When considering the evolution of a gene’s expression profile, we commonly assume that this is unaffected by its genomic neighborhood. This is, however, in contrast to what we know about the lack of autonomy between neighboring genes in gene expression profiles in extant taxa. Indeed, in all eukaryotic genomes genes of similar expression-profile tend to cluster, reflecting chromatin level dynamics. Does it follow that if a gene increases expression in a particular lineage then the genomic neighbors will also increase in their expression or is gene expression evolution autonomous? To address this here we consider evolution of human gene expression since the human-chimp common ancestor, allowing for both variation in estimation of current expression level and error in Bayesian estimation of the ancestral state. We find that in all tissues and both sexes, the change in gene expression of a focal gene on average predicts the change in gene expression of neighbors. The effect is highly pronounced in the immediate vicinity (<100 kb) but extends much further. Sex-specific expression change is also genomically clustered. As genes increasing their expression in humans tend to avoid nuclear lamina domains and be enriched for the gene activator 5-hydroxymethylcytosine, we conclude that, most probably owing to chromatin level control of gene expression, a change in gene expression of one gene likely affects the expression evolution of neighbors, what we term expression piggybacking, an analog of hitchhiking. PMID:25743543

  4. Whole Blood Gene Expression Profiling Predicts Severe Morbidity and Mortality in Cystic Fibrosis: A 5-Year Follow-Up Study.

    PubMed

    Saavedra, Milene T; Quon, Bradley S; Faino, Anna; Caceres, Silvia M; Poch, Katie R; Sanders, Linda A; Malcolm, Kenneth C; Nichols, David P; Sagel, Scott D; Taylor-Cousar, Jennifer L; Leach, Sonia M; Strand, Matthew; Nick, Jerry A

    2018-05-01

    Cystic fibrosis pulmonary exacerbations accelerate pulmonary decline and increase mortality. Previously, we identified a 10-gene leukocyte panel measured directly from whole blood, which indicates response to exacerbation treatment. We hypothesized that molecular characteristics of exacerbations could also predict future disease severity. We tested whether a 10-gene panel measured from whole blood could identify patient cohorts at increased risk for severe morbidity and mortality, beyond standard clinical measures. Transcript abundance for the 10-gene panel was measured from whole blood at the beginning of exacerbation treatment (n = 57). A hierarchical cluster analysis of subjects based on their gene expression was performed, yielding four molecular clusters. An analysis of cluster membership and outcomes incorporating an independent cohort (n = 21) was completed to evaluate robustness of cluster partitioning of genes to predict severe morbidity and mortality. The four molecular clusters were analyzed for differences in forced expiratory volume in 1 second, C-reactive protein, return to baseline forced expiratory volume in 1 second after treatment, time to next exacerbation, and time to morbidity or mortality events (defined as lung transplant referral, lung transplant, intensive care unit admission for respiratory insufficiency, or death). Clustering based on gene expression discriminated between patient groups with significant differences in forced expiratory volume in 1 second, admission frequency, and overall morbidity and mortality. At 5 years, all subjects in cluster 1 (very low risk) were alive and well, whereas 90% of subjects in cluster 4 (high risk) had suffered a major event (P = 0.0001). In multivariable analysis, the ability of gene expression to predict clinical outcomes remained significant, despite adjustment for forced expiratory volume in 1 second, sex, and admission frequency. 
    The robustness of gene clustering to categorize patients appropriately in terms of clinical characteristics, and short- and long-term clinical outcomes, remained consistent, even when adding in a secondary population with significantly different clinical outcomes. Whole blood gene expression profiling allows molecular classification of acute pulmonary exacerbations, beyond standard clinical measures, providing a predictive tool for identifying subjects at increased risk for mortality and disease progression.
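
    The hierarchical cluster analysis described in this abstract can be illustrated with a minimal sketch. It assumes average-linkage agglomerative clustering on Euclidean distance, which the abstract does not specify. The subject IDs and expression values are invented for illustration and are not study data, and the toy example is cut into three clusters rather than the four reported.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, points):
    """Mean Euclidean distance between all pairs drawn from clusters a and b."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerate(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    points: dict mapping a subject ID to its expression vector.
    Returns a sorted list of sorted ID lists, one per cluster.
    """
    clusters = [[key] for key in points]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]], points),
        )
        # combinations() guarantees i < j, so deleting j leaves i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical two-gene profiles for six subjects (illustrative values only).
profiles = {
    "s1": (0.1, 0.2), "s2": (0.2, 0.1),
    "s3": (5.0, 5.1), "s4": (5.2, 4.9),
    "s5": (9.8, 0.1), "s6": (10.1, 0.3),
}
groups = agglomerate(profiles, n_clusters=3)
```

    Cluster membership obtained this way could then be compared against clinical outcomes, which is the step the study reports.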

  5. Integration of somatic mutation, expression and functional data reveals potential driver genes predictive of breast cancer survival.

    PubMed

    Suo, Chen; Hrydziuszko, Olga; Lee, Donghwan; Pramana, Setia; Saputra, Dhany; Joshi, Himanshu; Calza, Stefano; Pawitan, Yudi

    2015-08-15

    Genome and transcriptome analyses can be used to explore cancers comprehensively, and it is increasingly common to have multiple omics data measured from each individual. Furthermore, there are rich functional data such as predicted impact of mutations on protein coding and gene/protein networks. However, integration of the complex information across the different omics and functional data is still challenging. Clinical validation, particularly based on patient outcomes such as survival, is important for assessing the relevance of the integrated information and for comparing different procedures. An analysis pipeline is built for integrating genomic and transcriptomic alterations from whole-exome and RNA sequence data and functional data from protein function prediction and gene interaction networks. The method accumulates evidence for the functional implications of mutated potential driver genes found within and across patients. A driver-gene score (DGscore) is developed to capture the cumulative effect of such genes. To contribute to the score, a gene has to be frequently mutated, with high or moderate mutational impact at protein level, exhibiting an extreme expression and functionally linked to many differentially expressed neighbors in the functional gene network. The pipeline is applied to 60 matched tumor and normal samples of the same patient from The Cancer Genome Atlas breast-cancer project. In clinical validation, patients with high DGscores have worse survival than those with low scores (P = 0.001). Furthermore, the DGscore outperforms the established expression-based signatures MammaPrint and PAM50 in predicting patient survival. In conclusion, integration of mutation, expression and functional data allows identification of clinically relevant potential driver genes in cancer. The documented pipeline including annotated sample scripts can be found in http://fafner.meb.ki.se/biostatwiki/driver-genes/. 
    Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  6. Knowledge-guided gene prioritization reveals new insights into the mechanisms of chemoresistance.

    PubMed

    Emad, Amin; Cairns, Junmei; Kalari, Krishna R; Wang, Liewei; Sinha, Saurabh

    2017-08-11

    Identification of genes whose basal mRNA expression predicts the sensitivity of tumor cells to cytotoxic treatments can play an important role in individualized cancer medicine. It enables detailed characterization of the mechanism of action of drugs. Furthermore, screening the expression of these genes in the tumor tissue may suggest the best course of chemotherapy or a combination of drugs to overcome drug resistance. We developed a computational method called ProGENI to identify genes most associated with the variation of drug response across different individuals, based on gene expression data. In contrast to existing methods, ProGENI also utilizes prior knowledge of protein-protein and genetic interactions, using random walk techniques. Analysis of two relatively new and large datasets including gene expression data on hundreds of cell lines and their cytotoxic responses to a large compendium of drugs reveals a significant improvement in prediction of drug sensitivity using genes identified by ProGENI compared to other methods. Our siRNA knockdown experiments on ProGENI-identified genes confirmed the role of many new genes in sensitivity to three chemotherapy drugs: cisplatin, docetaxel, and doxorubicin. Based on such experiments and extensive literature survey, we demonstrate that about 73% of our top predicted genes modulate drug response in selected cancer cell lines. In addition, global analysis of genes associated with groups of drugs uncovered pathways of cytotoxic response shared by each group. Our results suggest that knowledge-guided prioritization of genes using ProGENI gives new insight into mechanisms of drug resistance and identifies genes that may be targeted to overcome this phenomenon.
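
    For readers unfamiliar with random walk techniques on interaction networks, the following is a generic random walk with restart. It is not ProGENI's actual implementation, which the abstract does not describe in detail. The toy network, the gene names, and the restart probability of 0.5 are assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, iterations=100):
    """Return approximate steady-state visiting probabilities for each node.

    adjacency: dict mapping node -> list of neighbours; every node is assumed
               to have at least one neighbour.
    seeds: nodes the walker jumps back to with probability `restart`.
    """
    nodes = list(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    prob = dict(start)
    for _ in range(iterations):
        # Restart mass goes back to the seeds on every step.
        nxt = {n: restart * start[n] for n in nodes}
        for n in nodes:
            # The remaining mass is split evenly among the neighbours of n.
            share = (1.0 - restart) * prob[n] / len(adjacency[n])
            for m in adjacency[n]:
                nxt[m] += share
        prob = nxt
    return prob


# Hypothetical four-gene interaction network.
network = {
    "geneA": ["geneB", "geneC"],
    "geneB": ["geneA", "geneC"],
    "geneC": ["geneA", "geneB", "geneD"],
    "geneD": ["geneC"],
}
scores = random_walk_with_restart(network, seeds={"geneA"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

    Nodes that are well connected to the seed set receive higher scores, which is one way prior network knowledge can inform a gene ranking.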

  7. The transcriptional landscape of age in human peripheral blood

    PubMed Central

    Peters, Marjolein J.; Joehanes, Roby; Pilling, Luke C.; Schurmann, Claudia; Conneely, Karen N.; Powell, Joseph; Reinmaa, Eva; Sutphin, George L.; Zhernakova, Alexandra; Schramm, Katharina; Wilson, Yana A.; Kobes, Sayuko; Tukiainen, Taru; Nalls, Michael A.; Hernandez, Dena G.; Cookson, Mark R.; Gibbs, Raphael J.; Hardy, John; Ramasamy, Adaikalavan; Zonderman, Alan B.; Dillman, Allissa; Traynor, Bryan; Smith, Colin; Longo, Dan L.; Trabzuni, Daniah; Troncoso, Juan; van der Brug, Marcel; Weale, Michael E.; O'Brien, Richard; Johnson, Robert; Walker, Robert; Zielke, Ronald H.; Arepalli, Sampath; Ryten, Mina; Singleton, Andrew B.; Ramos, Yolande F.; Göring, Harald H. H.; Fornage, Myriam; Liu, Yongmei; Gharib, Sina A.; Stranger, Barbara E.; De Jager, Philip L.; Aviv, Abraham; Levy, Daniel; Murabito, Joanne M.; Munson, Peter J.; Huan, Tianxiao; Hofman, Albert; Uitterlinden, André G.; Rivadeneira, Fernando; van Rooij, Jeroen; Stolk, Lisette; Broer, Linda; Verbiest, Michael M. P. J.; Jhamai, Mila; Arp, Pascal; Metspalu, Andres; Tserel, Liina; Milani, Lili; Samani, Nilesh J.; Peterson, Pärt; Kasela, Silva; Codd, Veryan; Peters, Annette; Ward-Caviness, Cavin K.; Herder, Christian; Waldenberger, Melanie; Roden, Michael; Singmann, Paula; Zeilinger, Sonja; Illig, Thomas; Homuth, Georg; Grabe, Hans-Jörgen; Völzke, Henry; Steil, Leif; Kocher, Thomas; Murray, Anna; Melzer, David; Yaghootkar, Hanieh; Bandinelli, Stefania; Moses, Eric K.; Kent, Jack W.; Curran, Joanne E.; Johnson, Matthew P.; Williams-Blangero, Sarah; Westra, Harm-Jan; McRae, Allan F.; Smith, Jennifer A.; Kardia, Sharon L. R.; Hovatta, Iiris; Perola, Markus; Ripatti, Samuli; Salomaa, Veikko; Henders, Anjali K.; Martin, Nicholas G.; Smith, Alicia K.; Mehta, Divya; Binder, Elisabeth B.; Nylocks, K Maria; Kennedy, Elizabeth M.; Klengel, Torsten; Ding, Jingzhong; Suchy-Dicey, Astrid M.; Enquobahrie, Daniel A.; Brody, Jennifer; Rotter, Jerome I.; Chen, Yii-Der I.; Houwing-Duistermaat, Jeanine; Kloppenburg, Margreet; Slagboom, P. Eline; Helmer, Quinta; den Hollander, Wouter; Bean, Shannon; Raj, Towfique; Bakhshi, Noman; Wang, Qiao Ping; Oyston, Lisa J.; Psaty, Bruce M.; Tracy, Russell P.; Montgomery, Grant W.; Turner, Stephen T.; Blangero, John; Meulenbelt, Ingrid; Ressler, Kerry J.; Yang, Jian; Franke, Lude; Kettunen, Johannes; Visscher, Peter M.; Neely, G. Gregory; Korstanje, Ron; Hanson, Robert L.; Prokisch, Holger; Ferrucci, Luigi; Esko, Tonu; Teumer, Alexander; van Meurs, Joyce B. J.; Johnson, Andrew D.

    2015-01-01

    Disease incidences increase with age, but the molecular characteristics of ageing that lead to increased disease susceptibility remain inadequately understood. Here we perform a whole-blood gene expression meta-analysis in 14,983 individuals of European ancestry (including replication) and identify 1,497 genes that are differentially expressed with chronological age. The age-associated genes do not harbor more age-associated CpG-methylation sites than other genes, but are instead enriched for the presence of potentially functional CpG-methylation sites in enhancer and insulator regions that associate with both chronological age and gene expression levels. We further used the gene expression profiles to calculate the ‘transcriptomic age' of an individual, and show that differences between transcriptomic age and chronological age are associated with biological features linked to ageing, such as blood pressure, cholesterol levels, fasting glucose, and body mass index. The transcriptomic prediction model adds biological relevance and complements existing epigenetic prediction models, and can be used by others to calculate transcriptomic age in external cohorts. PMID:26490707
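
    The notion of a gap between transcriptomic age and chronological age can be sketched as follows. The abstract does not state the form of the prediction model, so a simple linear predictor is assumed here. The gene names, coefficients, intercept, and the example individual are all hypothetical.

```python
def transcriptomic_age(expression, weights, intercept):
    """Linear predictor: intercept plus the weighted sum of expression values."""
    return intercept + sum(weights[g] * expression[g] for g in weights)


def delta_age(expression, chronological_age, weights, intercept):
    """Predicted minus chronological age.

    Positive values mean the expression profile looks older than the person is.
    """
    return transcriptomic_age(expression, weights, intercept) - chronological_age


# Hypothetical coefficients and one hypothetical individual aged 52.
weights = {"gene1": 4.0, "gene2": -2.5}
intercept = 50.0
person = {"gene1": 1.5, "gene2": 0.4}

predicted = transcriptomic_age(person, weights, intercept)
gap = delta_age(person, 52.0, weights, intercept)
```

    In an analysis like the one described, the gap would be the quantity tested for association with traits such as blood pressure or body mass index.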

  8. Advanced colorectal adenoma related gene expression signature may predict prognostic for colorectal cancer patients with adenoma-carcinoma sequence.

    PubMed

    Li, Bing; Shi, Xiao-Yu; Liao, Dai-Xiang; Cao, Bang-Rong; Luo, Cheng-Hua; Cheng, Shu-Jun

    2015-01-01

    There are still no absolute parameters predicting progression of adenoma into cancer. The present study aimed to characterize functional differences on the multistep carcinogenetic process from the adenoma-carcinoma sequence. All samples were collected and mRNA expression profiling was performed by using Agilent Microarray high-throughput gene-chip technology. Then, the characteristics of mRNA expression profiles of adenoma-carcinoma sequence were described with bioinformatics software, and we analyzed the relationship between gene expression profiles of adenoma-adenocarcinoma sequence and clinical prognosis of colorectal cancer. The mRNA expressions of adenoma-carcinoma sequence were significantly different between high-grade intraepithelial neoplasia group and adenocarcinoma group. The biological process of gene ontology function enrichment analysis on differentially expressed genes between high-grade intraepithelial neoplasia group and adenocarcinoma group showed that genes enriched in the extracellular structure organization, skeletal system development, biological adhesion and itself regulated growth regulation, with the P value after FDR correction of less than 0.05. In addition, IPR-related protein mainly focused on the insulin-like growth factor binding proteins. The variable trends of gene expression profiles for adenoma-carcinoma sequence were mainly concentrated in high-grade intraepithelial neoplasia and adenocarcinoma. The differentially expressed genes are significantly correlated between high-grade intraepithelial neoplasia group and adenocarcinoma group. Bioinformatics analysis is an effective way to study the gene expression profiles in the adenoma-carcinoma sequence, and may provide an effective tool to involve colorectal cancer research strategy into colorectal adenoma or advanced adenoma.

  9. Promoter architecture dictates cell-to-cell variability in gene expression.

    PubMed

    Jones, Daniel L; Brewster, Robert C; Phillips, Rob

    2014-12-19

    Variability in gene expression among genetically identical cells has emerged as a central preoccupation in the study of gene regulation; however, a divide exists between the predictions of molecular models of prokaryotic transcriptional regulation and genome-wide experimental studies suggesting that this variability is indifferent to the underlying regulatory architecture. We constructed a set of promoters in Escherichia coli in which promoter strength, transcription factor binding strength, and transcription factor copy numbers are systematically varied, and used messenger RNA (mRNA) fluorescence in situ hybridization to observe how these changes affected variability in gene expression. Our parameter-free models predicted the observed variability; hence, the molecular details of transcription dictate variability in mRNA expression, and transcriptional noise is specifically tunable and thus represents an evolutionarily accessible phenotypic parameter. Copyright © 2014, American Association for the Advancement of Science.

  10. Improvement of experimental testing and network training conditions with genome-wide microarrays for more accurate predictions of drug gene targets

    PubMed Central

    2014-01-01

    Background Genome-wide microarrays have been useful for predicting chemical-genetic interactions at the gene level. However, interpreting genome-wide microarray results can be overwhelming due to the vast output of gene expression data combined with off-target transcriptional responses many times induced by a drug treatment. This study demonstrates how experimental and computational methods can interact with each other, to arrive at more accurate predictions of drug-induced perturbations. We present a two-stage strategy that links microarray experimental testing and network training conditions to predict gene perturbations for a drug with a known mechanism of action in a well-studied organism. Results S. cerevisiae cells were treated with the antifungal, fluconazole, and expression profiling was conducted under different biological conditions using Affymetrix genome-wide microarrays. Transcripts were filtered with a formal network-based method, sparse simultaneous equation models and Lasso regression (SSEM-Lasso), under different network training conditions. Gene expression results were evaluated using both gene set and single gene target analyses, and the drug’s transcriptional effects were narrowed first by pathway and then by individual genes. Variables included: (i) Testing conditions – exposure time and concentration and (ii) Network training conditions – training compendium modifications. Two analyses of SSEM-Lasso output – gene set and single gene – were conducted to gain a better understanding of how SSEM-Lasso predicts perturbation targets. Conclusions This study demonstrates that genome-wide microarrays can be optimized using a two-stage strategy for a more in-depth understanding of how a cell manifests biological reactions to a drug treatment at the transcription level. Additionally, a more detailed understanding of how the statistical model, SSEM-Lasso, propagates perturbations through a network of gene regulatory interactions is achieved. 
    PMID:24444313

  11. DEEP--a tool for differential expression effector prediction.

    PubMed

    Degenhardt, Jost; Haubrock, Martin; Dönitz, Jürgen; Wingender, Edgar; Crass, Torsten

    2007-07-01

    High-throughput methods for measuring transcript abundance, like SAGE or microarrays, are widely used for determining differences in gene expression between different tissue types, dignities (normal/malignant) or time points. Further analysis of such data frequently aims at the identification of gene interaction networks that form the causal basis for the observed properties of the systems under examination. To this end, it is usually not sufficient to rely on the measured gene expression levels alone; rather, additional biological knowledge has to be taken into account in order to generate useful hypotheses about the molecular mechanism leading to the realization of a certain phenotype. We present a method that combines gene expression data with biological expert knowledge on molecular interaction networks, as described by the TRANSPATH database on signal transduction, to predict additional--and not necessarily differentially expressed--genes or gene products which might participate in processes specific for either of the examined tissues or conditions. In a first step, significance values for over-expression in tissue/condition A or B are assigned to all genes in the expression data set. Genes with a significance value exceeding a certain threshold are used as starting points for the reconstruction of a graph with signaling components as nodes and signaling events as edges. In a subsequent graph traversal process, again starting from the previously identified differentially expressed genes, all encountered nodes 'inherit' all their starting nodes' significance values. In a final step, the graph is visualized, the nodes being colored according to a weighted average of their inherited significance values. 
    Each node's, or sub-network's, predominant color, ranging from green (significant for tissue/condition A) over yellow (not significant for either tissue/condition) to red (significant for tissue/condition B), thus gives an immediate visual clue on which molecules--differentially expressed or not--may play pivotal roles in the tissues or conditions under examination. The described method has been implemented in Java as a client/server application and a web interface called DEEP (Differential Expression Effector Prediction). The client, which features an easy-to-use graphical interface, can freely be downloaded from the following URL: http://deep.bioinf.med.uni-goettingen.de.
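
    The traversal and coloring steps described in this abstract can be sketched roughly as below. This is not the DEEP implementation (which is in Java). It assumes a breadth-first traversal from each seed gene, a signed score convention (negative for condition A, positive for condition B), and a weight of 1/(1 + distance) for the weighted average; the abstract says only that a weighted average is used. The graph, node names, depth limit, and color cutoff are hypothetical.

```python
from collections import deque


def inherit_significance(edges, seed_scores, max_depth=2):
    """Propagate seed scores through a signalling graph.

    edges: dict node -> list of neighbouring nodes.
    seed_scores: dict seed gene -> signed significance
                 (negative favours condition A, positive favours condition B).
    Returns dict node -> distance-weighted average of inherited scores.
    """
    inherited = {}  # node -> list of (weight, score) pairs
    for seed, score in seed_scores.items():
        seen = {seed: 0}  # node -> distance from this seed
        queue = deque([seed])
        while queue:
            node = queue.popleft()
            depth = seen[node]
            # Assumed weighting: closer seeds count more.
            inherited.setdefault(node, []).append((1.0 / (1 + depth), score))
            if depth < max_depth:
                for nxt in edges.get(node, []):
                    if nxt not in seen:
                        seen[nxt] = depth + 1
                        queue.append(nxt)
    return {
        node: sum(w * s for w, s in pairs) / sum(w for w, _ in pairs)
        for node, pairs in inherited.items()
    }


def colour(value, cutoff=0.1):
    """Map an averaged score to the green / yellow / red scale."""
    if value <= -cutoff:
        return "green"
    if value >= cutoff:
        return "red"
    return "yellow"


# Hypothetical chain: g1 - k1 - tf - g2, with g1 and g2 differentially expressed.
edges = {"g1": ["k1"], "k1": ["g1", "tf"], "tf": ["k1", "g2"], "g2": ["tf"]}
averaged = inherit_significance(edges, {"g1": -1.0, "g2": 1.0})
colours = {node: colour(value) for node, value in averaged.items()}
```

    In this toy chain the intermediate nodes k1 and tf are not seeds themselves, yet each receives a color from the nearer seed, which mirrors the idea of highlighting molecules that are not differentially expressed.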

  12. The Constrained Maximal Expression Level Owing to Haploidy Shapes Gene Content on the Mammalian X Chromosome.

    PubMed

    Hurst, Laurence D; Ghanbarian, Avazeh T; Forrest, Alistair R R; Huminiecki, Lukasz

    2015-12-01

    X chromosomes are unusual in many regards, not least of which is their nonrandom gene content. The causes of this bias are commonly discussed in the context of sexual antagonism and the avoidance of activity in the male germline. Here, we examine the notion that, at least in some taxa, functionally biased gene content may more profoundly be shaped by limits imposed on gene expression owing to haploid expression of the X chromosome. Notably, if the X, as in primates, is transcribed at rates comparable to the ancestral rate (per promoter) prior to the X chromosome formation, then the X is not a tolerable environment for genes with very high maximal net levels of expression, owing to transcriptional traffic jams. We test this hypothesis using The Encyclopedia of DNA Elements (ENCODE) and data from the Functional Annotation of the Mammalian Genome (FANTOM5) project. As predicted, the maximal expression of human X-linked genes is much lower than that of genes on autosomes: on average, maximal expression is three times lower on the X chromosome than on autosomes. Similarly, autosome-to-X retroposition events are associated with lower maximal expression of retrogenes on the X than seen for X-to-autosome retrogenes on autosomes. Also as expected, X-linked genes have a lesser degree of increase in gene expression than autosomal ones (compared to the human/Chimpanzee common ancestor) if highly expressed, but not if lowly expressed. The traffic jam model also explains the known lower breadth of expression for genes on the X (and the Z of birds), as genes with broad expression are, on average, those with high maximal expression. As then further predicted, highly expressed tissue-specific genes are also rare on the X and broadly expressed genes on the X tend to be lowly expressed, both indicating that the trend is shaped by the maximal expression level not the breadth of expression per se. 
    Importantly, a limit to the maximal expression level explains biased tissue of expression profiles of X-linked genes. Tissues whose tissue-specific genes are very highly expressed (e.g., secretory tissues, tissues abundant in structural proteins) are also tissues in which gene expression is relatively rare on the X chromosome. These trends cannot be fully accounted for in terms of alternative models of biased expression. In conclusion, the notion that it is hard for genes on the Therian X to be highly expressed, owing to transcriptional traffic jams, provides a simple yet robustly supported rationale of many peculiar features of X's gene content, gene expression, and evolution.

  13. The Constrained Maximal Expression Level Owing to Haploidy Shapes Gene Content on the Mammalian X Chromosome

    PubMed Central

    Hurst, Laurence D.; Ghanbarian, Avazeh T.; Forrest, Alistair R. R.; Huminiecki, Lukasz

    2015-01-01

    X chromosomes are unusual in many regards, not least of which is their nonrandom gene content. The causes of this bias are commonly discussed in the context of sexual antagonism and the avoidance of activity in the male germline. Here, we examine the notion that, at least in some taxa, functionally biased gene content may more profoundly be shaped by limits imposed on gene expression owing to haploid expression of the X chromosome. Notably, if the X, as in primates, is transcribed at rates comparable to the ancestral rate (per promoter) prior to the X chromosome formation, then the X is not a tolerable environment for genes with very high maximal net levels of expression, owing to transcriptional traffic jams. We test this hypothesis using The Encyclopedia of DNA Elements (ENCODE) and data from the Functional Annotation of the Mammalian Genome (FANTOM5) project. As predicted, the maximal expression of human X-linked genes is much lower than that of genes on autosomes: on average, maximal expression is three times lower on the X chromosome than on autosomes. Similarly, autosome-to-X retroposition events are associated with lower maximal expression of retrogenes on the X than seen for X-to-autosome retrogenes on autosomes. Also as expected, X-linked genes have a lesser degree of increase in gene expression than autosomal ones (compared to the human/Chimpanzee common ancestor) if highly expressed, but not if lowly expressed. The traffic jam model also explains the known lower breadth of expression for genes on the X (and the Z of birds), as genes with broad expression are, on average, those with high maximal expression. As then further predicted, highly expressed tissue-specific genes are also rare on the X and broadly expressed genes on the X tend to be lowly expressed, both indicating that the trend is shaped by the maximal expression level not the breadth of expression per se. 
    Importantly, a limit to the maximal expression level explains biased tissue of expression profiles of X-linked genes. Tissues whose tissue-specific genes are very highly expressed (e.g., secretory tissues, tissues abundant in structural proteins) are also tissues in which gene expression is relatively rare on the X chromosome. These trends cannot be fully accounted for in terms of alternative models of biased expression. In conclusion, the notion that it is hard for genes on the Therian X to be highly expressed, owing to transcriptional traffic jams, provides a simple yet robustly supported rationale of many peculiar features of X’s gene content, gene expression, and evolution. PMID:26685068

  14. Prediction of response to preoperative chemoradiotherapy and establishment of individualized therapy in advanced rectal cancer.

    PubMed

    Nakao, Toshihiro; Iwata, Takashi; Hotchi, Masanori; Yoshikawa, Kozo; Higashijima, Jun; Nishi, Masaaki; Takasu, Chie; Eto, Shohei; Teraoku, Hiroki; Shimada, Mitsuo

    2015-10-01

    Preoperative chemoradiotherapy (CRT) has become the standard treatment for patients with locally advanced rectal cancer. However, no specific biomarker has been identified to predict a response to preoperative CRT. The aim of the present study was to assess the gene expression patterns of patients with advanced rectal cancer to predict their responses to preoperative CRT. Fifty-nine rectal cancer patients were subjected to preoperative CRT. Patients were randomly assigned to receive CRT with tegafur/gimeracil/oteracil (S-1 group, n=30) or tegafur-uracil (UFT group, n=29). Gene expression changes were studied with cDNA and miRNA microarray. The association between gene expression and response to CRT was evaluated. cDNA microarray showed that 184 genes were significantly differentially expressed between the responders and the non‑responders in the S-1 group. Comparatively, 193 genes were significantly differentially expressed in the responders in the UFT group. TBX18 upregulation was common to both groups whereas BTNL8, LOC375010, ADH1B, HRASLS2, LOC284232, GCNT3 and ALDH1A2 were significantly differentially lower in both groups when compared with the non-responders. Using miRNA microarray, we found that 7 and 16 genes were significantly differentially expressed between the responders and non-responders in the S-1 and UFT groups, respectively. miR-223 was significantly higher in the responders in the S-1 group and tended to be higher in the responders in the UFT group. The present study identified several genes likely to be useful for establishing individualized therapies for patients with rectal cancer.

  15. Ovary transcriptome profiling via artificial intelligence reveals a transcriptomic fingerprint predicting egg quality in striped bass, Morone saxatilis.

    PubMed

    Chapman, Robert W; Reading, Benjamin J; Sullivan, Craig V

    2014-01-01

    Inherited gene transcripts deposited in oocytes direct early embryonic development in all vertebrates, but transcript profiles indicative of embryo developmental competence have not previously been identified. We employed artificial intelligence to model profiles of maternal ovary gene expression and their relationship to egg quality, evaluated as production of viable mid-blastula stage embryos, in the striped bass (Morone saxatilis), a farmed species with serious egg quality problems. In models developed using artificial neural networks (ANNs) and supervised machine learning, collective changes in the expression of a limited suite of genes (233) representing <2% of the queried ovary transcriptome explained >90% of the eventual variance in embryo survival. Egg quality related to minor changes in gene expression (<0.2-fold), with most individual transcripts making a small contribution (<1%) to the overall prediction of egg quality. These findings indicate that the predictive power of the transcriptome as regards egg quality resides not in levels of individual genes, but rather in the collective, coordinated expression of a suite of transcripts constituting a transcriptomic "fingerprint". Correlation analyses of the corresponding candidate genes indicated that dysfunction of the ubiquitin-26S proteasome, COP9 signalosome, and subsequent control of the cell cycle engenders embryonic developmental incompetence. The affected gene networks are centrally involved in regulation of early development in all vertebrates, including humans. By assessing collective levels of the relevant ovarian transcripts via ANNs we were able, for the first time in any vertebrate, to accurately predict the subsequent embryo developmental potential of eggs from individual females. 
Our results show that the transcriptomic fingerprint evidencing developmental dysfunction is highly predictive of, and therefore likely to regulate, egg quality, a biologically complex trait crucial to reproductive fitness.

  16. Ovary Transcriptome Profiling via Artificial Intelligence Reveals a Transcriptomic Fingerprint Predicting Egg Quality in Striped Bass, Morone saxatilis

    PubMed Central

    2014-01-01

    Inherited gene transcripts deposited in oocytes direct early embryonic development in all vertebrates, but transcript profiles indicative of embryo developmental competence have not previously been identified. We employed artificial intelligence to model profiles of maternal ovary gene expression and their relationship to egg quality, evaluated as production of viable mid-blastula stage embryos, in the striped bass (Morone saxatilis), a farmed species with serious egg quality problems. In models developed using artificial neural networks (ANNs) and supervised machine learning, collective changes in the expression of a limited suite of genes (233) representing <2% of the queried ovary transcriptome explained >90% of the eventual variance in embryo survival. Egg quality related to minor changes in gene expression (<0.2-fold), with most individual transcripts making a small contribution (<1%) to the overall prediction of egg quality. These findings indicate that the predictive power of the transcriptome as regards egg quality resides not in levels of individual genes, but rather in the collective, coordinated expression of a suite of transcripts constituting a transcriptomic “fingerprint”. Correlation analyses of the corresponding candidate genes indicated that dysfunction of the ubiquitin-26S proteasome, COP9 signalosome, and subsequent control of the cell cycle engenders embryonic developmental incompetence. The affected gene networks are centrally involved in regulation of early development in all vertebrates, including humans. By assessing collective levels of the relevant ovarian transcripts via ANNs we were able, for the first time in any vertebrate, to accurately predict the subsequent embryo developmental potential of eggs from individual females. 
Our results show that the transcriptomic fingerprint evidencing developmental dysfunction is highly predictive of, and therefore likely to regulate, egg quality, a biologically complex trait crucial to reproductive fitness. PMID:24820964

  17. Interleukin-27 is a novel candidate diagnostic biomarker for bacterial infection in critically ill children

    PubMed Central

    2012-01-01

    Introduction: Differentiating between sterile inflammation and bacterial infection in critically ill patients with fever and other signs of the systemic inflammatory response syndrome (SIRS) remains a clinical challenge. The objective of our study was to mine an existing genome-wide expression database for the discovery of candidate diagnostic biomarkers to predict the presence of bacterial infection in critically ill children. Methods: Genome-wide expression data were compared between patients with SIRS having negative bacterial cultures (n = 21) and patients with sepsis having positive bacterial cultures (n = 60). Differentially expressed genes were subjected to a leave-one-out cross-validation (LOOCV) procedure to predict SIRS or sepsis classes. Serum concentrations of interleukin-27 (IL-27) and procalcitonin (PCT) were compared between 101 patients with SIRS and 130 patients with sepsis. All data represent the first 24 hours of meeting criteria for either SIRS or sepsis. Results: Two hundred twenty one gene probes were differentially regulated between patients with SIRS and patients with sepsis. The LOOCV procedure correctly predicted 86% of the SIRS and sepsis classes, and Epstein-Barr virus-induced gene 3 (EBI3) had the highest predictive strength. Computer-assisted image analyses of gene-expression mosaics were able to predict infection with a specificity of 90% and a positive predictive value of 94%. Because EBI3 is a subunit of the heterodimeric cytokine, IL-27, we tested the ability of serum IL-27 protein concentrations to predict infection. At a cut-point value of ≥5 ng/ml, serum IL-27 protein concentrations predicted infection with a specificity and a positive predictive value of >90%, and the overall performance of IL-27 was generally better than that of PCT. A decision tree combining IL-27 and PCT improved overall predictive capacity compared with that of either biomarker alone. 
Conclusions: Genome-wide expression analysis has provided the foundation for the identification of IL-27 as a novel candidate diagnostic biomarker for predicting bacterial infection in critically ill children. Additional studies will be required to test further the diagnostic performance of IL-27. The microarray data reported in this article have been deposited in the Gene Expression Omnibus under accession number GSE4607. PMID:23107287
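
The leave-one-out cross-validation (LOOCV) procedure mentioned in this record can be illustrated with a minimal sketch. The abstract does not say which classifier was used, so the nearest-centroid rule below is a stand-in chosen for brevity, and the function names and toy expression values are hypothetical, not taken from the study.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Assign `sample` to the class whose mean expression vector is closest.

    Assumption: a nearest-centroid rule stands in for whatever classifier
    the study actually used; the abstract does not specify it.
    """
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(features, labels):
    """Hold out each sample once, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Hypothetical two-probe expression values, for illustration only.
toy_features = [[1.0, 0.9], [1.2, 1.1], [0.9, 1.0],
                [3.0, 3.1], [3.2, 2.9], [2.9, 3.0]]
toy_labels = ["SIRS", "SIRS", "SIRS", "sepsis", "sepsis", "sepsis"]

print(loocv_accuracy(toy_features, toy_labels))
```

In a real analysis, any gene selection step would normally be repeated inside each fold so that the held-out sample does not influence which probes are used.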

  18. Sex-specific microRNA expression networks in an acute mouse model of ozone-induced lung inflammation.

    PubMed

    Fuentes, Nathalie; Roy, Arpan; Mishra, Vikas; Cabello, Noe; Silveyra, Patricia

    2018-05-08

    Sex differences in the incidence and prognosis of respiratory diseases have been reported. Studies have shown that women are at increased risk of adverse health outcomes from air pollution than men, but sex-specific immune gene expression patterns and regulatory networks have not been well studied in the lung. MicroRNAs (miRNAs) are environmentally sensitive posttranscriptional regulators of gene expression that may mediate the damaging effects of inhaled pollutants in the lung, by altering the expression of innate immunity molecules. Male and female mice of the C57BL/6 background were exposed to 2 ppm of ozone or filtered air (control) for 3 h. Female mice were also exposed at different stages of the estrous cycle. Following exposure, lungs were harvested and total RNA was extracted. We used PCR arrays to study sex differences in the expression of 84 miRNAs predicted to target inflammatory and immune genes. We identified differentially expressed miRNA signatures in the lungs of male vs. female exposed to ozone. In silico pathway analyses identified sex-specific biological networks affected by exposure to ozone that ranged from direct predicted gene targeting to complex interactions with multiple intermediates. We also identified differences in miRNA expression and predicted regulatory networks in females exposed to ozone at different estrous cycle stages. Our results indicate that both sex and hormonal status can influence lung miRNA expression in response to ozone exposure, indicating that sex-specific miRNA regulation of inflammatory gene expression could mediate differential pollution-induced health outcomes in men and women.

  19. Supervised group Lasso with applications to microarray data analysis

    PubMed Central

    Ma, Shuangge; Song, Xiao; Huang, Jian

    2007-01-01

    Background: A tremendous amount of efforts have been devoted to identifying genes for diagnosis and prognosis of diseases using microarray gene expression data. It has been demonstrated that gene expression data have cluster structure, where the clusters consist of co-regulated genes which tend to have coordinated functions. However, most available statistical methods for gene selection do not take into consideration the cluster structure. Results: We propose a supervised group Lasso approach that takes into account the cluster structure in gene expression data for gene selection and predictive model building. For gene expression data without biological cluster information, we first divide genes into clusters using the K-means approach and determine the optimal number of clusters using the Gap method. The supervised group Lasso consists of two steps. In the first step, we identify important genes within each cluster using the Lasso method. In the second step, we select important clusters using the group Lasso. Tuning parameters are determined using V-fold cross validation at both steps to allow for further flexibility. Prediction performance is evaluated using leave-one-out cross validation. We apply the proposed method to disease classification and survival analysis with microarray data. Conclusion: We analyze four microarray data sets using the proposed approach: two cancer data sets with binary cancer occurrence as outcomes and two lymphoma data sets with survival outcomes. The results show that the proposed approach is capable of identifying a small number of influential gene clusters and important genes within those clusters, and has better prediction performance than existing methods. PMID:17316436
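
The first of the two steps described in this record (Lasso selection of genes within a single cluster) might be sketched as follows. This is only an illustration under stated assumptions: it uses a squared-error loss with plain coordinate descent, whereas the study works with binary and survival outcomes, and it omits the K-means clustering, the group Lasso step, and cross-validated tuning. All function names, the penalty value, and the toy data are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink `value` toward zero by `penalty`; exact zero inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + penalty*||b||_1 by cyclic coordinate descent.

    Assumption: squared-error loss is used for simplicity; it is not the
    loss used for the classification or survival models in the study.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            beta[j] = soft_threshold(rho, penalty) / z if z > 0 else 0.0
    return beta


def select_genes_in_cluster(X, y, penalty):
    """Return column indices of genes with nonzero Lasso coefficients."""
    beta = lasso_coordinate_descent(X, y, penalty)
    return [j for j, b in enumerate(beta) if b != 0.0]


# Hypothetical cluster of two genes: only gene 0 is related to the outcome.
toy_X = [[-1.0, 1.0], [0.0, -2.0], [1.0, 1.0]]
toy_y = [-2.0, 0.0, 2.0]

print(lasso_coordinate_descent(toy_X, toy_y, penalty=0.1))
print(select_genes_in_cluster(toy_X, toy_y, penalty=0.1))
```

In the full method, the genes retained from every cluster would then be passed to a group Lasso that keeps or drops whole clusters.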

  20. A systems approach to model the relationship between aflatoxin gene cluster expression, environmental factors, growth and toxin production by Aspergillus flavus

    PubMed Central

    Abdel-Hadi, Ahmed; Schmidt-Heydt, Markus; Parra, Roberto; Geisen, Rolf; Magan, Naresh

    2012-01-01

    A microarray analysis was used to examine the effect of combinations of water activity (aw, 0.995–0.90) and temperature (20–42°C) on the activation of aflatoxin biosynthetic genes (30 genes) in Aspergillus flavus grown on a conducive YES (20 g yeast extract, 150 g sucrose, 1 g MgSO4·7H2O) medium. The relative expression of 10 key genes (aflF, aflD, aflE, aflM, aflO, aflP, aflQ, aflX, aflR and aflS) in the biosynthetic pathway was examined in relation to different environmental factors and phenotypic aflatoxin B1 (AFB1) production. These data, plus data on relative growth rates and AFB1 production under different aw × temperature conditions were used to develop a mixed-growth-associated product formation model. The gene expression data were normalized and then used as a linear combination of the data for all 10 genes and combined with the physical model. This was used to relate gene expression to aw and temperature conditions to predict AFB1 production. The relationship between the observed AFB1 production provided a good linear regression fit to the predicted production based in the model. The model was then validated by examining datasets outside the model fitting conditions used (37°C, 40°C and different aw levels). The relationship between structural genes (aflD, aflM) in the biosynthetic pathway and the regulatory genes (aflS, aflJ) was examined in relation to aw and temperature by developing ternary diagrams of relative expression. These findings are important in developing a more integrated systems approach by combining gene expression, ecophysiological influences and growth data to predict mycotoxin production. This could help in developing a more targeted approach to develop prevention strategies to control such carcinogenic natural metabolites that are prevalent in many staple food products. The model could also be used to predict the impact of climate change on toxin production. PMID:21880616

  1. Soybean kinome: functional classification and gene expression patterns

    PubMed Central

    Liu, Jinyi; Chen, Nana; Grant, Joshua N.; Cheng, Zong-Ming (Max); Stewart, C. Neal; Hewezi, Tarek

    2015-01-01

    The protein kinase (PK) gene family is one of the largest and most highly conserved gene families in plants and plays a role in nearly all biological functions. While a large number of genes have been predicted to encode PKs in soybean, a comprehensive functional classification and global analysis of expression patterns of this large gene family is lacking. In this study, we identified the entire soybean PK repertoire or kinome, which comprised 2166 putative PK genes, representing 4.67% of all soybean protein-coding genes. The soybean kinome was classified into 19 groups, 81 families, and 122 subfamilies. The receptor-like kinase (RLK) group was remarkably large, containing 1418 genes. Collinearity analysis indicated that whole-genome segmental duplication events may have played a key role in the expansion of the soybean kinome, whereas tandem duplications might have contributed to the expansion of specific subfamilies. Gene structure, subcellular localization prediction, and gene expression patterns indicated extensive functional divergence of PK subfamilies. Global gene expression analysis of soybean PK subfamilies revealed tissue- and stress-specific expression patterns, implying regulatory functions over a wide range of developmental and physiological processes. In addition, tissue and stress co-expression network analysis uncovered specific subfamilies with narrow or wide interconnected relationships, indicative of their association with particular or broad signalling pathways, respectively. Taken together, our analyses provide a foundation for further functional studies to reveal the biological and molecular functions of PKs in soybean. PMID:25614662

  2. Genomic Features That Predict Allelic Imbalance in Humans Suggest Patterns of Constraint on Gene Expression Variation

    PubMed Central

    Fédrigo, Olivier; Haygood, Ralph; Mukherjee, Sayan; Wray, Gregory A.

    2009-01-01

    Variation in gene expression is an important contributor to phenotypic diversity within and between species. Although this variation often has a genetic component, identification of the genetic variants driving this relationship remains challenging. In particular, measurements of gene expression usually do not reveal whether the genetic basis for any observed variation lies in cis or in trans to the gene, a distinction that has direct relevance to the physical location of the underlying genetic variant, and which may also impact its evolutionary trajectory. Allelic imbalance measurements identify cis-acting genetic effects by assaying the relative contribution of the two alleles of a cis-regulatory region to gene expression within individuals. Identification of patterns that predict commonly imbalanced genes could therefore serve as a useful tool and also shed light on the evolution of cis-regulatory variation itself. Here, we show that sequence motifs, polymorphism levels, and divergence levels around a gene can be used to predict commonly imbalanced genes in a human data set. Reduction of this feature set to four factors revealed that only one factor significantly differentiated between commonly imbalanced and nonimbalanced genes. We demonstrate that these results are consistent between the original data set and a second published data set in humans obtained using different technical and statistical methods. Finally, we show that variation in the single allelic imbalance-associated factor is partially explained by the density of genes in the region of a target gene (allelic imbalance is less probable for genes in gene-dense regions), and, to a lesser extent, the evenness of expression of the gene across tissues and the magnitude of negative selection on putative regulatory regions of the gene. 
These results suggest that the genomic distribution of functional cis-regulatory variants in the human genome is nonrandom, perhaps due to local differences in evolutionary constraint. PMID:19506001

  3. Comparing machine learning and logistic regression methods for predicting hypertension using a combination of gene expression and next-generation sequencing data.

    PubMed

    Held, Elizabeth; Cape, Joshua; Tintle, Nathan

    2016-01-01

    Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk by genotypes to be able to incorporate gene expression data and rare variants. We then apply 2 different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare performance to logistic regression. Method performance was not radically different across the 3 methods, although the linear support vector machine tended to show small gains in predictive ability relative to a radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, model predictive ability showed a statistically significant decrease in performance for both the radial support vector machine and logistic regression. The linear support vector machine showed more robust performance to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.

  4. An integrated approach to reconstructing genome-scale transcriptional regulatory networks

    DOE PAGES

    Imam, Saheed; Noguera, Daniel R.; Donohue, Timothy J.; ...

    2015-02-27

    Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. 
Our results highlight the advantages of integrating comparative genomics of closely related organisms with gene expression data to assemble large-scale TRN models with high-quality predictions.

  5. Moving Toward Integrating Gene Expression Profiling Into High-Throughput Testing: A Gene Expression Biomarker Accurately Predicts Estrogen Receptor α Modulation in a Microarray Compendium.

    PubMed

    Ryan, Natalia; Chorley, Brian; Tice, Raymond R; Judson, Richard; Corton, J Christopher

    2016-05-01

    Microarray profiling of chemical-induced effects is being increasingly used in medium- and high-throughput formats. Computational methods are described here to identify molecular targets from whole-genome microarray data using as an example the estrogen receptor α (ERα), often modulated by potential endocrine disrupting chemicals. ERα biomarker genes were identified by their consistent expression after exposure to 7 structurally diverse ERα agonists and 3 ERα antagonists in ERα-positive MCF-7 cells. Most of the biomarker genes were shown to be directly regulated by ERα as determined by ESR1 gene knockdown using siRNA as well as through chromatin immunoprecipitation coupled with DNA sequencing analysis of ERα-DNA interactions. The biomarker was evaluated as a predictive tool using the fold-change rank-based Running Fisher algorithm by comparison to annotated gene expression datasets from experiments using MCF-7 cells, including those evaluating the transcriptional effects of hormones and chemicals. Using 141 comparisons from chemical- and hormone-treated cells, the biomarker gave a balanced accuracy for prediction of ERα activation or suppression of 94% and 93%, respectively. The biomarker was able to correctly classify 18 out of 21 (86%) ER reference chemicals including "very weak" agonists. Importantly, the biomarker predictions accurately replicated predictions based on 18 in vitro high-throughput screening assays that queried different steps in ERα signaling. For 114 chemicals, the balanced accuracies were 95% and 98% for activation or suppression, respectively. These results demonstrate that the ERα gene expression biomarker can accurately identify ERα modulators in large collections of microarray data derived from MCF-7 cells. Published by Oxford University Press on behalf of the Society of Toxicology 2016. This work is written by US Government employees and is in the public domain in the US.
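
For readers unfamiliar with the balanced accuracy figures quoted in this record, the metric is the mean of sensitivity and specificity. The sketch below shows the calculation on made-up labels; the function name and the toy calls are hypothetical and do not reproduce the study's 141 comparisons.

```python
def balanced_accuracy(true_labels, predicted_labels, positive):
    """Mean of sensitivity and specificity for a two-class problem."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn)  # fraction of true positives recovered
    specificity = tn / (tn + fp)  # fraction of true negatives recovered
    return (sensitivity + specificity) / 2


# Hypothetical calls: 1 = ER-alpha activator, 0 = inactive.
toy_truth = [1, 1, 1, 1, 0, 0]
toy_calls = [1, 1, 1, 0, 0, 0]

print(balanced_accuracy(toy_truth, toy_calls, positive=1))
```

Unlike plain accuracy, this measure is not inflated when one class is much larger than the other, which is why it suits screens where most chemicals are inactive.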

  6. Predicting gene regulatory networks of soybean nodulation from RNA-Seq transcriptome data.

    PubMed

    Zhu, Mingzhu; Dahmen, Jeremy L; Stacey, Gary; Cheng, Jianlin

    2013-09-22

    High-throughput RNA sequencing (RNA-Seq) is a revolutionary technique to study the transcriptome of a cell under various conditions at a systems level. Despite the wide application of RNA-Seq techniques to generate experimental data in the last few years, few computational methods are available to analyze this huge amount of transcription data. The computational methods for constructing gene regulatory networks from RNA-Seq expression data of hundreds or even thousands of genes are particularly lacking and urgently needed. We developed an automated bioinformatics method to predict gene regulatory networks from the quantitative expression values of differentially expressed genes based on RNA-Seq transcriptome data of a cell in different stages and conditions, integrating transcriptional, genomic and gene function data. We applied the method to the RNA-Seq transcriptome data generated for soybean root hair cells in three different development stages of nodulation after rhizobium infection. The method predicted a soybean nodulation-related gene regulatory network consisting of 10 regulatory modules common for all three stages, and 24, 49 and 70 modules separately for the first, second and third stage, each containing both a group of co-expressed genes and several transcription factors collaboratively controlling their expression under different conditions. 8 of 10 common regulatory modules were validated by at least two kinds of validations, such as independent DNA binding motif analysis, gene function enrichment test, and previous experimental data in the literature. We developed a computational method to reliably reconstruct gene regulatory networks from RNA-Seq transcriptome data. The method can generate valuable hypotheses for interpreting biological data and designing biological experiments such as ChIP-Seq, RNA interference, and yeast two hybrid experiments.

  7. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    PubMed Central

    Callister, Stephen J.; McCue, Lee Ann; Turse, Joshua E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-01-01

    While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits. PMID:18253490

  8. Neighboring Genes Show Correlated Evolution in Gene Expression.

    PubMed

    Ghanbarian, Avazeh T; Hurst, Laurence D

    2015-07-01

    When considering the evolution of a gene's expression profile, we commonly assume that this is unaffected by its genomic neighborhood. This is, however, in contrast to what we know about the lack of autonomy between neighboring genes in gene expression profiles in extant taxa. Indeed, in all eukaryotic genomes genes of similar expression-profile tend to cluster, reflecting chromatin level dynamics. Does it follow that if a gene increases expression in a particular lineage then the genomic neighbors will also increase in their expression or is gene expression evolution autonomous? To address this here we consider evolution of human gene expression since the human-chimp common ancestor, allowing for both variation in estimation of current expression level and error in Bayesian estimation of the ancestral state. We find that in all tissues and both sexes, the change in gene expression of a focal gene on average predicts the change in gene expression of neighbors. The effect is highly pronounced in the immediate vicinity (<100 kb) but extends much further. Sex-specific expression change is also genomically clustered. As genes increasing their expression in humans tend to avoid nuclear lamina domains and be enriched for the gene activator 5-hydroxymethylcytosine, we conclude that, most probably owing to chromatin level control of gene expression, a change in gene expression of one gene likely affects the expression evolution of neighbors, what we term expression piggybacking, an analog of hitchhiking. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  9. Global identification and expression analysis of stress-responsive genes of the Argonaute family in apple.

    PubMed

    Xu, Ruirui; Liu, Caiyun; Li, Ning; Zhang, Shizhong

    2016-12-01

    Argonaute (AGO) proteins, which are found in yeast, animals, and plants, are the core molecules of the RNA-induced silencing complex. These proteins play important roles in plant growth, development, and responses to biotic stresses. The complete analysis and classification of the AGO gene family have been recently reported in different plants. Nevertheless, systematic analysis and expression profiling of these genes have not been performed in apple (Malus domestica). Approximately 15 AGO genes were identified in the apple genome. The phylogenetic tree, chromosome location, conserved protein motifs, gene structure, and expression of the AGO gene family in apple were analyzed for gene prediction. All AGO genes were phylogenetically clustered into four groups (i.e., AGO1, AGO4, MEL1/AGO5, and ZIPPY/AGO7) with the AGO genes of Arabidopsis. These groups of the AGO gene family were statistically analyzed and compared among 31 plant species. The predicted apple AGO genes are distributed across nine chromosomes at different densities and include three segment duplications. Expression studies indicated that 15 AGO genes exhibit different expression patterns in at least one of the tissues tested. Additionally, analysis of gene expression levels indicated that the genes are mostly involved in responses to NaCl, PEG, heat, and low-temperature stresses. Hence, several candidate AGO genes are involved in different aspects of physiological and developmental processes and may play an important role in abiotic stress responses in apple. To the best of our knowledge, this study is the first to report a comprehensive analysis of the apple AGO gene family. Our results provide useful information to understand the classification and putative functions of these proteins, especially for gene members that may play important roles in abiotic stress responses in M. hupehensis.

  10. Codon usage and amino acid usage influence genes expression level.

    PubMed

    Paul, Prosenjit; Malakar, Arup Kumar; Chakraborty, Supriyo

    2018-02-01

    Highly expressed genes in any species differ in the usage frequency of synonymous codons. The relative recurrence of an event of the favored codon pair (amino acid pairs) varies between gene and genomes due to varying gene expression and different base composition. Here we propose a new measure for predicting the gene expression level, i.e., codon plus amino bias index (CABI). Our approach is based on the relative bias of the favored codon pair inclination among the genes, illustrated by analyzing the CABI score of the Medicago truncatula genes. CABI showed strong correlation with all other widely used measures (CAI, RCBS, SCUO) for gene expression analysis. Surprisingly, CABI outperforms all other measures by showing better correlation with the wet-lab data. This emphasizes the importance of the neighboring codons of the favored codon in a synonymous group while estimating the expression level of a gene.

  11. Intrinsic and extrinsic approaches for detecting genes in a bacterial genome.

    PubMed Central

    Borodovsky, M; Rudd, K E; Koonin, E V

    1994-01-01

    The unannotated regions of the Escherichia coli genome DNA sequence from the EcoSeq6 database, totaling 1,278 'intergenic' sequences of the combined length of 359,279 basepairs, were analyzed using computer-assisted methods with the aim of identifying putative unknown genes. The proposed strategy for finding new genes includes two key elements: i) prediction of expressed open reading frames (ORFs) using the GeneMark method based on Markov chain models for coding and non-coding regions of Escherichia coli DNA, and ii) search for protein sequence similarities using programs based on the BLAST algorithm and programs for motif identification. A total of 354 putative expressed ORFs were predicted by GeneMark. Using the BLASTX and TBLASTN programs, it was shown that 208 ORFs located in the unannotated regions of the E. coli chromosome are significantly similar to other protein sequences. Identification of 182 ORFs as probable genes was supported by GeneMark and BLAST, comprising 51.4% of the GeneMark 'hits' and 87.5% of the BLAST 'hits'. 73 putative new genes, comprising 20.6% of the GeneMark predictions, belong to ancient conserved protein families that include both eubacterial and eukaryotic members. This value is close to the overall proportion of highly conserved sequences among eubacterial proteins, indicating that the majority of the putative expressed ORFs that are predicted by GeneMark, but have no significant BLAST hits, nevertheless are likely to be real genes. The majority of the putative genes identified by BLAST search have been described since the release of the EcoSeq6 database, but about 70 genes have not been detected so far. Among these new identifications are genes encoding proteins with a variety of predicted functions including dehydrogenases, kinases, several other metabolic enzymes, ATPases, rRNA methyltransferases, membrane proteins, and different types of regulatory proteins. PMID:7984428
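
    The overlap percentages quoted in this abstract follow directly from the reported counts. A quick arithmetic check, with variable names that are illustrative and not taken from the paper:

```python
# Counts reported in the abstract above.
genemark_orfs = 354   # putative expressed ORFs predicted by GeneMark
blast_orfs = 208      # ORFs with significant BLAST similarity
both = 182            # ORFs supported by both GeneMark and BLAST
conserved = 73        # putative new genes in ancient conserved families


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


share_of_genemark = pct(both, genemark_orfs)    # share of GeneMark 'hits'
share_of_blast = pct(both, blast_orfs)          # share of BLAST 'hits'
conserved_share = pct(conserved, genemark_orfs) # share of GeneMark predictions
print(share_of_genemark, share_of_blast, conserved_share)
```

    The three values agree with the 51.4%, 87.5% and 20.6% stated in the abstract.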

  12. A method for generating new datasets based on copy number for cancer analysis.

    PubMed

    Kim, Shinuk; Kon, Mark; Kang, Hyunsik

    2015-01-01

    New data sources for the analysis of cancer data are rapidly supplementing the large number of gene-expression markers used for current methods of analysis. Significant among these new sources are copy number variation (CNV) datasets, which typically enumerate several hundred thousand CNVs distributed throughout the genome. Several useful algorithms allow systems-level analyses of such datasets. However, these rich data sources have not yet been analyzed as deeply as gene-expression data. To address this issue, the extensive toolsets used for analyzing expression data in cancerous and noncancerous tissue (e.g., gene set enrichment analysis and phenotype prediction) could be redirected to extract a great deal of predictive information from CNV data, in particular those derived from cancers. Here we present a software package capable of preprocessing standard Agilent copy number datasets into a form to which essentially all expression analysis tools can be applied. We illustrate the use of this toolset in predicting the survival time of patients with ovarian cancer or glioblastoma multiforme and also provide an analysis of gene- and pathway-level deletions in these two types of cancer.

  13. Gene-Expression Signature Predicts Postoperative Recurrence in Stage I Non-Small Cell Lung Cancer Patients

    PubMed Central

    Lu, Yan; Wang, Liang; Liu, Pengyuan; Yang, Ping; You, Ming

    2012-01-01

    About 30% of stage I non-small cell lung cancer (NSCLC) patients undergoing resection will experience recurrence. Robust prognostic markers are required to better manage therapy options. The purpose of this study is to develop and validate a novel gene-expression signature that can predict tumor recurrence of stage I NSCLC patients. Cox proportional hazards regression analysis was performed to identify recurrence-related genes and a partial Cox regression model was used to generate a gene signature of recurrence in the training dataset (142 stage I lung adenocarcinomas without adjunctive therapy from the Director's Challenge Consortium). Four independent validation datasets, including GSE5843, GSE8894, and two other datasets provided by Mayo Clinic and Washington University, were used to assess the prediction accuracy by calculating the correlation between risk score estimated from gene expression and real recurrence-free survival time and AUC of time-dependent ROC analysis. Pathway-based survival analyses were also performed. 104 probesets correlated with recurrence in the training dataset. They are enriched in cell adhesion, apoptosis and regulation of cell proliferation. A 51-gene expression signature was identified to distinguish patients likely to develop tumor recurrence (Dxy = −0.83, P<1e-16) and this signature was validated in four independent datasets with AUC >85%. Multiple pathways including leukocyte transendothelial migration and cell adhesion were highly correlated with recurrence-free survival. The gene signature is highly predictive of recurrence in stage I NSCLC patients, which has important prognostic and therapeutic implications for the future management of these patients. PMID:22292069

  14. ESR1 and PGR polymorphisms are associated with estrogen and progesterone receptor expression in breast tumors.

    PubMed

    Hertz, Daniel L; Henry, N Lynn; Kidwell, Kelley M; Thomas, Dafydd; Goddard, Audrey; Azzouz, Faouzi; Speth, Kelly; Li, Lang; Banerjee, Mousumi; Thibert, Jacklyn N; Kleer, Celina G; Stearns, Vered; Hayes, Daniel F; Skaar, Todd C; Rae, James M

    2016-09-01

    Hormone receptor-positive (HR+) breast cancers express the estrogen (ERα) and/or progesterone (PgR) receptors. Inherited single nucleotide polymorphisms (SNPs) in ESR1, the gene encoding ERα, have been reported to predict tamoxifen effectiveness. We hypothesized that these associations could be attributed to altered tumor gene/protein expression of ESR1/ERα and that SNPs in the PGR gene predict tumor PGR/PgR expression. Formalin-fixed paraffin-embedded breast cancer tumor specimens were analyzed for ESR1 and PGR gene transcript expression by the reverse transcription polymerase chain reaction based Oncotype DX assay and for ERα and PgR protein expression by immunohistochemistry (IHC) and an automated quantitative immunofluorescence assay (AQUA). Germline genotypes for SNPs in ESR1 (n = 41) and PGR (n = 8) were determined by allele-specific TaqMan assays. One SNP in ESR1 (rs9322336) was significantly associated with ESR1 gene transcript expression (P = 0.006) but not ERα protein expression (P > 0.05). A PGR SNP (rs518162) was associated with decreased PGR gene transcript expression (P = 0.003) and PgR protein expression measured by IHC (P = 0.016), but not AQUA (P = 0.054). There were modest, but statistically significant correlations between gene and protein expression for ESR1/ERα and PGR/PgR and for protein expression measured by IHC and AQUA (Pearson correlation = 0.32-0.64, all P < 0.001). Inherited ESR1 and PGR genotypes may affect tumor ESR1/ERα and PGR/PgR expression, respectively, which are moderately correlated. This work supports further research into germline predictors of tumor characteristics and treatment effectiveness, which may someday inform selection of hormonal treatments for patients with HR+ breast cancer. Copyright © 2016 the American Physiological Society.

  15. Semi-supervised prediction of gene regulatory networks using machine learning algorithms.

    PubMed

    Patel, Nihir; Wang, Jason T L

    2015-10-01

    Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging task. Many studies have been conducted using unsupervised methods to fulfill the task; however, such methods usually yield low prediction accuracies due to the lack of training data. In this article, we propose semi-supervised methods for GRN prediction by utilizing two machine learning algorithms, namely, support vector machines (SVM) and random forests (RF). The semi-supervised methods make use of unlabelled data for training. We investigated inductive and transductive learning approaches, both of which adopt an iterative procedure to obtain reliable negative training data from the unlabelled data. We then applied our semi-supervised methods to gene expression data of Escherichia coli and Saccharomyces cerevisiae, and evaluated the performance of our methods using the expression data. Our analysis indicated that the transductive learning approach outperformed the inductive learning approach for both organisms. However, there was no conclusive difference identified in the performance of SVM and RF. Experimental results also showed that the proposed semi-supervised methods performed better than existing supervised methods for both organisms.

  16. Combining Evidence of Preferential Gene-Tissue Relationships from Multiple Sources

    PubMed Central

    Guo, Jing; Hammar, Mårten; Öberg, Lisa; Padmanabhuni, Shanmukha S.; Bjäreland, Marcus; Dalevi, Daniel

    2013-01-01

    An important challenge in drug discovery and disease prognosis is to predict genes that are preferentially expressed in one or a few tissues, i.e. showing a considerably higher expression in one tissue(s) compared to the others. Although several data sources and methods have been published explicitly for this purpose, they often disagree and it is not evident how to retrieve these genes and how to distinguish true biological findings from those that are due to choice-of-method and/or experimental settings. In this work we have developed a computational approach that combines results from multiple methods and datasets with the aim to eliminate method/study-specific biases and to improve the predictability of preferentially expressed human genes. A rule-based score is used to merge and assign support to the results. Five sets of genes with known tissue specificity were used for parameter pruning and cross-validation. In total we identify 3434 tissue-specific genes. We compare the genes of highest scores with the public databases: PaGenBase (microarray), TiGER (EST) and HPA (protein expression data). The results have 85% overlap to PaGenBase, 71% to TiGER and only 28% to HPA. 99% of our predictions have support from at least one of these databases. Our approach also performs better than any of the databases on identifying drug targets and biomarkers with known tissue-specificity. PMID:23950964

  17. Impact of angiogenesis-related gene expression on the tracer kinetics of 18F-FDG in colorectal tumors.

    PubMed

    Strauss, Ludwig G; Koczan, Dirk; Klippel, Sven; Pan, Leyun; Cheng, Caixia; Willis, Stefan; Haberkorn, Uwe; Dimitrakopoulou-Strauss, Antonia

    2008-08-01

    18F-FDG kinetics are primarily dependent on the expression of genes associated with glucose transporters and hexokinases but may be modulated by other genes. The dependency of 18F-FDG kinetics on angiogenesis-related gene expression was evaluated in this study. Patients with primary colorectal tumors (n = 25) were examined with PET and 18F-FDG within 2 days before surgery. Tissue specimens were obtained from the tumor and the normal colon during surgery, and gene expression was assessed using gene arrays. Overall, 23 angiogenesis-related genes were identified with a tumor-to-normal ratio exceeding 1.50. Analysis revealed a significant correlation between k1 and vascular endothelial growth factor (VEGF-A, r = 0.51) and between fractal dimension and angiopoietin-2 (r = 0.48). k3 was negatively correlated with VEGF-B (r = -0.46), and a positive correlation was noted for angiopoietin-like 4 gene (r = 0.42). A multiple linear regression analysis was used for the PET parameters to predict the gene expression, and a correlation coefficient of r = 0.75 was obtained for VEGF-A and of r = 0.76 for the angiopoietin-2 expression. Thus, on the basis of these multiple correlation coefficients, angiogenesis-related gene expression contributes to about 50% of the variance of the 18F-FDG kinetic data. The global 18F-FDG uptake, as measured by the standardized uptake value and influx, was not significantly correlated with angiogenesis-associated genes. 18F-FDG kinetics are modulated by angiogenesis-related genes. The transport rate for 18F-FDG (k1) is higher in tumors with a higher expression of VEGF-A and angiopoietin-2. The regression functions for the PET parameters provide the possibility to predict the gene expression of VEGF-A and angiopoietin-2.
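
    The step from the multiple correlation coefficients to "about 50% of the variance" is the usual squaring of r. A small sketch of that arithmetic, using only the two coefficients given in the abstract (the function name is illustrative):

```python
# Multiple correlation coefficients reported in the abstract above.
r_vegfa = 0.75
r_ang2 = 0.76


def variance_explained(r):
    """Squared multiple correlation: the fraction of variance the regression accounts for."""
    return r * r


for name, r in (("VEGF-A", r_vegfa), ("angiopoietin-2", r_ang2)):
    print(f"{name}: r = {r:.2f}, r^2 = {variance_explained(r):.2f}")
```

    Squaring gives roughly 0.56 and 0.58, which the abstract rounds down to "about 50%".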

  18. Gene expression profile predicting the response to anti-TNF treatment in patients with rheumatoid arthritis; analysis of GEO datasets.

    PubMed

    Kim, Tae-Hwan; Choi, Sung Jae; Lee, Young Ho; Song, Gwan Gyu; Ji, Jong Dae

    2014-07-01

    Anti-tumor necrosis factor (TNF) therapy is the treatment of choice for rheumatoid arthritis (RA) patients in whom standard disease-modifying anti-rheumatic drugs are ineffective. However, a substantial proportion of RA patients treated with anti-TNF agents do not show a significant clinical response. Therefore, biomarkers predicting response to anti-TNF agents are needed. Recently, gene expression profiling has been applied in research for developing such biomarkers. We compared gene expression profiles reported by previous studies dealing with the responsiveness of anti-TNF therapy in RA patients and attempted to identify differentially expressed genes (DEGs) that discriminated between responders and non-responders to anti-TNF therapy. We used microarray datasets available at the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO). This analysis included 6 studies and 5 sets of microarray data that used peripheral blood samples for identification of DEGs predicting response to anti-TNF therapy. We found little overlap in the DEGs that were highly ranked in each study. Three DEGs, IL2RB, SH2D2A and G0S2, appeared in more than 1 study. In addition, a meta-analysis designed to increase statistical power found one DEG, G0S2, by Fisher's method. Our findings suggest that G0S2 may serve as a biomarker to predict response to anti-TNF therapy in patients with rheumatoid arthritis. Further investigations based on larger studies are therefore needed to confirm the significance of G0S2 in predicting response to anti-TNF therapy. Copyright © 2014 Société française de rhumatologie. Published by Elsevier SAS. All rights reserved.

  19. Prediction of epigenetically regulated genes in breast cancer cell lines

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Loss, Leandro A; Sadanandam, Anguraj; Durinck, Steffen

    Methylation of CpG islands within the DNA promoter regions is one mechanism that leads to aberrant gene expression in cancer. In particular, the abnormal methylation of CpG islands may silence associated genes. Therefore, using high-throughput microarrays to measure CpG island methylation will lead to better understanding of tumor pathobiology and progression, while revealing potentially new biomarkers. We have examined a recently developed high-throughput technology for measuring genome-wide methylation patterns called mTACL. Here, we propose a computational pipeline for integrating gene expression and CpG island methylation profiles to identify epigenetically regulated genes for a panel of 45 breast cancer cell lines, which is widely used in the Integrative Cancer Biology Program (ICBP). The pipeline (i) reduces the dimensionality of the methylation data, (ii) associates the reduced methylation data with gene expression data, and (iii) ranks methylation-expression associations according to their epigenetic regulation. Dimensionality reduction is performed in two steps: (i) methylation sites are grouped across the genome to identify regions of interest, and (ii) methylation profiles are clustered within each region. Associations between the clustered methylation and the gene expression data sets generate candidate matches within a fixed neighborhood around each gene. Finally, the methylation-expression associations are ranked through a logistic regression, and their significance is quantified through permutation analysis. Our two-step dimensionality reduction compressed 90% of the original data, reducing 137,688 methylation sites to 14,505 clusters. Methylation-expression associations produced 18,312 correspondences, which were used to further analyze epigenetic regulation. Logistic regression was used to identify 58 genes from these correspondences that showed a statistically significant negative correlation between methylation profiles and gene expression in the panel of breast cancer cell lines. Subnetwork enrichment of these genes has identified 35 common regulators with 6 or more predicted markers. In addition to identifying epigenetically regulated genes, we show evidence of differentially expressed methylation patterns between the basal and luminal subtypes. Our results indicate that the proposed computational protocol is a viable platform for identifying epigenetically regulated genes. Our protocol has generated a list of predictors including COL1A2, TOP2A, TFF1, and VAV3, genes whose key roles in epigenetic regulation are documented in the literature. Subnetwork enrichment of these predicted markers further suggests that epigenetic regulation of individual genes occurs in a coordinated fashion and through common regulators.

  20. Gene Expression-Based Survival Prediction in Lung Adenocarcinoma: A Multi-Site, Blinded Validation Study

    PubMed Central

    Shedden, Kerby; Taylor, Jeremy M.G.; Enkemann, Steve A.; Tsao, Ming S.; Yeatman, Timothy J.; Gerald, William L.; Eschrich, Steve; Jurisica, Igor; Venkatraman, Seshan E.; Meyerson, Matthew; Kuick, Rork; Dobbin, Kevin K.; Lively, Tracy; Jacobson, James W.; Beer, David G.; Giordano, Thomas J.; Misek, David E.; Chang, Andrew C.; Zhu, Chang Qi; Strumpf, Dan; Hanash, Samir; Shepherd, Francis A.; Ding, Kuyue; Seymour, Lesley; Naoki, Katsuhiko; Pennell, Nathan; Weir, Barbara; Verhaak, Roel; Ladd-Acosta, Christine; Golub, Todd; Gruidl, Mike; Szoke, Janos; Zakowski, Maureen; Rusch, Valerie; Kris, Mark; Viale, Agnes; Motoi, Noriko; Travis, William; Sharma, Anupama

    2009-01-01

    Although prognostic gene expression signatures for survival in early stage lung cancer have been proposed, for clinical application it is critical to establish their performance across different subject populations and in different laboratories. Here we report a large, training-testing, multi-site blinded validation study to characterize the performance of several prognostic models based on gene expression for 442 lung adenocarcinomas. The hypotheses proposed examined whether microarray measurements of gene expression either alone or combined with basic clinical covariates (stage, age, sex) can be used to predict overall survival in lung cancer subjects. Several models examined produced risk scores that substantially correlated with actual subject outcome. Most methods performed better with clinical data, supporting the combined use of clinical and molecular information when building prognostic models for early stage lung cancer. This study also provides the largest available set of microarray data with extensive pathological and clinical annotation for lung adenocarcinomas. PMID:18641660

  1. Reconstructing directed gene regulatory network by only gene expression data.

    PubMed

    Zhang, Lu; Feng, Xi Kang; Ng, Yen Kaow; Li, Shuai Cheng

    2016-08-18

    Accurately identifying gene regulatory network is an important task in understanding in vivo biological activities. The inference of such networks is often accomplished through the use of gene expression data. Many methods have been developed to evaluate gene expression dependencies between transcription factor and its target genes, and some methods also eliminate transitive interactions. The regulatory (or edge) direction is undetermined if the target gene is also a transcription factor. Some methods predict the regulatory directions in the gene regulatory networks by locating the eQTL single nucleotide polymorphism, or by observing the gene expression changes when knocking out/down the candidate transcript factors; regrettably, these additional data are usually unavailable, especially for the samples deriving from human tissues. In this study, we propose the Context Based Dependency Network (CBDN), a method that is able to infer gene regulatory networks with the regulatory directions from gene expression data only. To determine the regulatory direction, CBDN computes the influence of source to target by evaluating the magnitude changes of expression dependencies between the target gene and the others with conditioning on the source gene. CBDN extends the data processing inequality by involving the dependency direction to distinguish between direct and transitive relationship between genes. We also define two types of important regulators which can influence a majority of the genes in the network directly or indirectly. CBDN can detect both of these two types of important regulators by averaging the influence functions of candidate regulator to the other genes. In our experiments with simulated and real data, even with the regulatory direction taken into account, CBDN outperforms the state-of-the-art approaches for inferring gene regulatory network. CBDN identifies the important regulators in the predicted network: (1) TYROBP influences a batch of genes that are related to Alzheimer's disease; (2) ZNF329 and RB1 significantly regulate those 'mesenchymal' gene expression signature genes for brain tumors. By merely leveraging gene expression data, CBDN can efficiently infer the existence of gene-gene interactions as well as their regulatory directions. The constructed networks are helpful in the identification of important regulators for complex diseases.

  2. Evolutionary conservation of vertebrate notochord genes in the ascidian Ciona intestinalis.

    PubMed

    Kugler, Jamie E; Passamaneck, Yale J; Feldman, Taya G; Beh, Jeni; Regnier, Todd W; Di Gregorio, Anna

    2008-11-01

    To reconstruct a minimum complement of notochord genes evolutionarily conserved across chordates, we scanned the Ciona intestinalis genome using the sequences of 182 genes reported to be expressed in the notochord of different vertebrates and identified 139 candidate notochord genes. For 66 of these Ciona genes expression data were already available, hence we analyzed the expression of the remaining 73 genes and found notochord expression for 20. The predicted products of the newly identified notochord genes range from the transcription factors Ci-XBPa and Ci-miER1 to extracellular matrix proteins. We examined the expression of the newly identified notochord genes in embryos ectopically expressing Ciona Brachyury (Ci-Bra) and in embryos expressing a repressor form of this transcription factor in the notochord, and we found that while a subset of the genes examined are clearly responsive to Ci-Bra, other genes are not affected by alterations in its levels. We provide a first description of notochord genes that are not evidently influenced by the ectopic expression of Ci-Bra and we propose alternative regulatory mechanisms that might control their transcription. Copyright 2008 Wiley-Liss, Inc.

  3. Computational gene expression profiling under salt stress reveals patterns of co-expression

    PubMed Central

    Sanchita; Sharma, Ashok

    2016-01-01

    Plants respond differently to environmental conditions. Among various abiotic stresses, salt stress is a condition where excess salt in soil causes inhibition of plant growth. To understand the response of plants to the stress conditions, identification of the responsible genes is required. Clustering is a data mining technique used to group the genes with similar expression. The genes of a cluster show similar expression and function. We applied clustering algorithms on gene expression data of Solanum tuberosum showing differential expression in Capsicum annuum under salt stress. The clusters, which were common in multiple algorithms were taken further for analysis. Principal component analysis (PCA) further validated the findings of other cluster algorithms by visualizing their clusters in three-dimensional space. Functional annotation results revealed that most of the genes were involved in stress related responses. Our findings suggest that these algorithms may be helpful in the prediction of the function of co-expressed genes. PMID:26981411

  4. Convergent evolution at the pathway level: predictable regulatory changes during flower color transitions.

    PubMed

    Larter, Maximilian; Dunbar-Wallis, Amy; Berardi, Andrea E; Smith, Stacey D

    2018-06-07

    The predictability of evolution, or whether lineages repeatedly follow the same evolutionary trajectories during phenotypic convergence remains an open question of evolutionary biology. In this study, we investigate evolutionary convergence at the biochemical pathway level and test the predictability of evolution using floral anthocyanin pigmentation, a trait with a well-understood genetic and regulatory basis. We reconstructed the evolution of floral anthocyanin content across 28 species of the Andean clade Iochrominae (Solanaceae) and investigated how shifts in pigmentation are related to changes in expression of 7 key anthocyanin pathway genes. We used phylogenetic multivariate analysis of gene expression to test for phenotypic and developmental convergence at a macroevolutionary scale. Our results show that the four independent losses of the ancestral pigment delphinidin involved convergent losses of expression of the three late pathway genes (F3'5'h, Dfr and Ans). Transitions between pigment types affecting floral hue (e.g. blue to red) involve changes to the expression of branching genes F3'h and F3'5'h, while the expression levels of early steps of the pathway are strongly conserved in all species. These patterns support the idea that the macroevolution of floral pigmentation follows predictable evolutionary trajectories to reach convergent phenotype space, repeatedly involving regulatory changes. This is likely driven by constraints at the pathway level, such as pleiotropy and regulatory structure.

  5. A gene expression biomarker accurately predicts estrogen ...

    EPA Pesticide Factsheets

    The EPA’s vision for the Endocrine Disruptor Screening Program (EDSP) in the 21st Century (EDSP21) includes utilization of high-throughput screening (HTS) assays coupled with computational modeling to prioritize chemicals with the goal of eventually replacing current Tier 1 screening tests. The ToxCast program currently includes 18 HTS in vitro assays that evaluate the ability of chemicals to modulate estrogen receptor α (ERα), an important endocrine target. We propose microarray-based gene expression profiling as a complementary approach to predict ERα modulation and have developed computational methods to identify ERα modulators in an existing database of whole-genome microarray data. The ERα biomarker consisted of 46 ERα-regulated genes with consistent expression patterns across 7 known ER agonists and 3 known ER antagonists. The biomarker was evaluated as a predictive tool using the fold-change rank-based Running Fisher algorithm by comparison to annotated gene expression data sets from experiments in MCF-7 cells. Using 141 comparisons from chemical- and hormone-treated cells, the biomarker gave a balanced accuracy for prediction of ERα activation or suppression of 94% or 93%, respectively. The biomarker was able to correctly classify 18 out of 21 (86%) OECD ER reference chemicals including “very weak” agonists and replicated predictions based on 18 in vitro ER-associated HTS assays. For 114 chemicals present in both the HTS data and the MCF-7 c

  6. Open source machine-learning algorithms for the prediction of optimal cancer drug therapies.

    PubMed

    Huang, Cai; Mezencev, Roman; McDonald, John F; Vannberg, Fredrik

    2017-01-01

    Precision medicine is a rapidly growing area of modern medical science and open source machine-learning codes promise to be a critical component for the successful development of standardized and automated analysis of patient data. One important goal of precision cancer medicine is the accurate prediction of optimal drug therapies from the genomic profiles of individual patient tumors. We introduce here an open source software platform that employs a highly versatile support vector machine (SVM) algorithm combined with a standard recursive feature elimination (RFE) approach to predict personalized drug responses from gene expression profiles. Drug specific models were built using gene expression and drug response data from the National Cancer Institute panel of 60 human cancer cell lines (NCI-60). The models are highly accurate in predicting the drug responsiveness of a variety of cancer cell lines including those comprising the recent NCI-DREAM Challenge. We demonstrate that predictive accuracy is optimized when the learning dataset utilizes all probe-set expression values from a diversity of cancer cell types without pre-filtering for genes generally considered to be "drivers" of cancer onset/progression. Application of our models to publically available ovarian cancer (OC) patient gene expression datasets generated predictions consistent with observed responses previously reported in the literature. By making our algorithm "open source", we hope to facilitate its testing in a variety of cancer types and contexts leading to community-driven improvements and refinements in subsequent applications.

  7. Functional requirements for bacteriophage growth: gene essentiality and expression in mycobacteriophage Giles.

    PubMed

    Dedrick, Rebekah M; Marinelli, Laura J; Newton, Gerald L; Pogliano, Kit; Pogliano, Joseph; Hatfull, Graham F

    2013-05-01

    Bacteriophages represent a majority of all life forms, and the vast, dynamic population with early origins is reflected in their enormous genetic diversity. A large number of bacteriophage genomes have been sequenced. They are replete with novel genes without known relatives. We know little about their functions, which genes are required for lytic growth, and how they are expressed. Furthermore, the diversity is such that even genes with required functions - such as virion proteins and repressors - cannot always be recognized. Here we describe a functional genomic dissection of mycobacteriophage Giles, in which the virion proteins are identified, genes required for lytic growth are determined, the repressor is identified, and the transcription patterns determined. We find that although all of the predicted phage genes are expressed either in lysogeny or in lytic growth, 45% of the predicted genes are non-essential for lytic growth. We also describe genes required for DNA replication, show that recombination is required for lytic growth, and that Giles encodes a novel repressor. RNAseq analysis reveals abundant expression of a small non-coding RNA in a lysogen and in late lytic growth, although it is non-essential for lytic growth and does not alter lysogeny. © 2013 Blackwell Publishing Ltd.

  8. PROSPECT improves cis-acting regulatory element prediction by integrating expression profile data with consensus pattern searches

    PubMed Central

    Fujibuchi, Wataru; Anderson, John S. J.; Landsman, David

    2001-01-01

    Consensus pattern and matrix-based searches designed to predict cis-acting transcriptional regulatory sequences have historically been subject to large numbers of false positives. We sought to decrease false positives by incorporating expression profile data into a consensus pattern-based search method. We have systematically analyzed the expression phenotypes of over 6000 yeast genes, across 121 expression profile experiments, and correlated them with the distribution of 14 known regulatory elements over sequences upstream of the genes. Our method is based on a metric we term probabilistic element assessment (PEA), which is a ranking of potential sites based on sequence similarity in the upstream regions of genes with similar expression phenotypes. For eight of the 14 known elements that we examined, our method had a much higher selectivity than a naïve consensus pattern search. Based on our analysis, we have developed a web-based tool called PROSPECT, which allows consensus pattern-based searching of gene clusters obtained from microarray data. PMID:11574681

  9. Gene Expression Ratios Lead to Accurate and Translatable Predictors of DR5 Agonism across Multiple Tumor Lineages.

    PubMed

    Reddy, Anupama; Growney, Joseph D; Wilson, Nick S; Emery, Caroline M; Johnson, Jennifer A; Ward, Rebecca; Monaco, Kelli A; Korn, Joshua; Monahan, John E; Stump, Mark D; Mapa, Felipa A; Wilson, Christopher J; Steiger, Janine; Ledell, Jebediah; Rickles, Richard J; Myer, Vic E; Ettenberg, Seth A; Schlegel, Robert; Sellers, William R; Huet, Heather A; Lehár, Joseph

    2015-01-01

    Death Receptor 5 (DR5) agonists demonstrate anti-tumor activity in preclinical models but have yet to demonstrate robust clinical responses. A key limitation may be the lack of patient selection strategies to identify those most likely to respond to treatment. To overcome this limitation, we screened a DR5 agonist Nanobody across >600 cell lines representing 21 tumor lineages and assessed molecular features associated with response. High expression of DR5 and Casp8 were significantly associated with sensitivity, but their expression thresholds were difficult to translate due to low dynamic ranges. To address the translational challenge of establishing thresholds of gene expression, we developed a classifier based on ratios of genes that predicted response across lineages. The ratio classifier outperformed the DR5+Casp8 classifier, as well as standard approaches for feature selection and classification using genes, instead of ratios. This classifier was independently validated using 11 primary patient-derived pancreatic xenograft models showing perfect predictions as well as a striking linearity between prediction probability and anti-tumor response. A network analysis of the genes in the ratio classifier captured important biological relationships mediating drug response, specifically identifying key positive and negative regulators of DR5 mediated apoptosis, including DR5, CASP8, BID, cFLIP, XIAP and PEA15. Importantly, the ratio classifier shows translatability across gene expression platforms (from Affymetrix microarrays to RNA-seq) and across model systems (in vitro to in vivo). Our approach of using gene expression ratios presents a robust and novel method for constructing translatable biomarkers of compound response, which can also probe the underlying biology of treatment response.

  10. Gene Expression Ratios Lead to Accurate and Translatable Predictors of DR5 Agonism across Multiple Tumor Lineages

    PubMed Central

    Reddy, Anupama; Growney, Joseph D.; Wilson, Nick S.; Emery, Caroline M.; Johnson, Jennifer A.; Ward, Rebecca; Monaco, Kelli A.; Korn, Joshua; Monahan, John E.; Stump, Mark D.; Mapa, Felipa A.; Wilson, Christopher J.; Steiger, Janine; Ledell, Jebediah; Rickles, Richard J.; Myer, Vic E.; Ettenberg, Seth A.; Schlegel, Robert; Sellers, William R.

    2015-01-01

    Death Receptor 5 (DR5) agonists demonstrate anti-tumor activity in preclinical models but have yet to demonstrate robust clinical responses. A key limitation may be the lack of patient selection strategies to identify those most likely to respond to treatment. To overcome this limitation, we screened a DR5 agonist Nanobody across >600 cell lines representing 21 tumor lineages and assessed molecular features associated with response. High expression of DR5 and Casp8 were significantly associated with sensitivity, but their expression thresholds were difficult to translate due to low dynamic ranges. To address the translational challenge of establishing thresholds of gene expression, we developed a classifier based on ratios of genes that predicted response across lineages. The ratio classifier outperformed the DR5+Casp8 classifier, as well as standard approaches for feature selection and classification using genes, instead of ratios. This classifier was independently validated using 11 primary patient-derived pancreatic xenograft models showing perfect predictions as well as a striking linearity between prediction probability and anti-tumor response. A network analysis of the genes in the ratio classifier captured important biological relationships mediating drug response, specifically identifying key positive and negative regulators of DR5 mediated apoptosis, including DR5, CASP8, BID, cFLIP, XIAP and PEA15. Importantly, the ratio classifier shows translatability across gene expression platforms (from Affymetrix microarrays to RNA-seq) and across model systems (in vitro to in vivo). Our approach of using gene expression ratios presents a robust and novel method for constructing translatable biomarkers of compound response, which can also probe the underlying biology of treatment response. PMID:26378449
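The platform-translatability argument for ratio features in the abstract above can be illustrated with a minimal sketch. It assumes a simple linear score over log2 ratios; the gene names `GENE_A` to `GENE_D`, the pairings, and the weights are hypothetical placeholders, not the published classifier. The point it shows is that a ratio of two genes is unchanged when every gene in a sample is rescaled by the same factor, which is one plausible reason ratios transfer between platforms.

```python
import math

def log_ratio_features(expr, pairs):
    """Return log2(numerator / denominator) for each gene pair.

    expr  -- dict mapping gene name to a positive expression value
    pairs -- list of (numerator_gene, denominator_gene) tuples
    """
    return [math.log2(expr[num] / expr[den]) for num, den in pairs]

def ratio_score(expr, pairs, weights, bias=0.0):
    """Linear score over log-ratio features (hypothetical model form)."""
    feats = log_ratio_features(expr, pairs)
    return bias + sum(w * f for w, f in zip(weights, feats))

# Hypothetical gene pairs and weights, for illustration only.
PAIRS = [("GENE_A", "GENE_B"), ("GENE_C", "GENE_D")]
WEIGHTS = [1.0, -0.5]

# Toy sample, and the same sample with every gene scaled 3x
# (a stand-in for a platform-wide intensity difference).
sample = {"GENE_A": 8.0, "GENE_B": 2.0, "GENE_C": 4.0, "GENE_D": 16.0}
rescaled = {gene: 3.0 * value for gene, value in sample.items()}

score_original = ratio_score(sample, PAIRS, WEIGHTS)
score_rescaled = ratio_score(rescaled, PAIRS, WEIGHTS)
```

Both scores agree because the common scale factor cancels inside each ratio, whereas a classifier using absolute expression thresholds would need the thresholds recalibrated.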

  11. Integrated Analyses of microRNAs Demonstrate Their Widespread Influence on Gene Expression in High-Grade Serous Ovarian Carcinoma

    PubMed Central

    Levine, Douglas A.; Mankoo, Parminder; Schultz, Nikolaus; Du, Ying; Zhang, Yiqun; Larsson, Erik; Sheridan, Robert; Xiao, Weimin; Spellman, Paul T.; Getz, Gad; Wheeler, David A.; Perou, Charles M.; Gibbs, Richard A.; Sander, Chris; Hayes, D. Neil; Gunaratne, Preethi H.

    2012-01-01

    Background: The Cancer Genome Atlas (TCGA) Network recently comprehensively catalogued the molecular aberrations in 487 high-grade serous ovarian cancers, with much remaining to be elucidated regarding the microRNAs (miRNAs). Here, using TCGA ovarian data, we surveyed the miRNAs, in the context of their predicted gene targets. Methods and Results: Integration of miRNA and gene patterns yielded evidence that proximal pairs of miRNAs are processed from polycistronic primary transcripts, and that intronic miRNAs and their host gene mRNAs derive from common transcripts. Patterns of miRNA expression revealed multiple tumor subtypes and a set of 34 miRNAs predictive of overall patient survival. In a global analysis, miRNA:mRNA pairs anti-correlated in expression across tumors showed a higher frequency of in silico predicted target sites in the mRNA 3′-untranslated region (with less frequency observed for coding sequence and 5′-untranslated regions). The miR-29 family and predicted target genes were among the most strongly anti-correlated miRNA:mRNA pairs; over-expression of miR-29a in vitro repressed several anti-correlated genes (including DNMT3A and DNMT3B) and substantially decreased ovarian cancer cell viability. Conclusions: This study establishes miRNAs as having a widespread impact on gene expression programs in ovarian cancer, further strengthening our understanding of miRNA biology as it applies to human cancer. As with gene transcripts, miRNAs exhibit high diversity reflecting the genomic heterogeneity within a clinically homogeneous disease population. Putative miRNA:mRNA interactions, as identified using integrative analysis, can be validated. TCGA data are a valuable resource for the identification of novel tumor suppressive miRNAs in ovarian as well as other cancers. PMID:22479643

  12. CLIC, a tool for expanding biological pathways based on co-expression across thousands of datasets

    PubMed Central

    Li, Yang; Liu, Jun S.; Mootha, Vamsi K.

    2017-01-01

    In recent years, there has been a huge rise in the number of publicly available transcriptional profiling datasets. These massive compendia comprise billions of measurements and provide a special opportunity to predict the function of unstudied genes based on co-expression to well-studied pathways. Such analyses can be very challenging, however, since biological pathways are modular and may exhibit co-expression only in specific contexts. To overcome these challenges we introduce CLIC, CLustering by Inferred Co-expression. CLIC accepts as input a pathway consisting of two or more genes. It then uses a Bayesian partition model to simultaneously partition the input gene set into coherent co-expressed modules (CEMs), while assigning the posterior probability for each dataset in support of each CEM. CLIC then expands each CEM by scanning the transcriptome for additional co-expressed genes, quantified by an integrated log-likelihood ratio (LLR) score weighted for each dataset. As a byproduct, CLIC automatically learns the conditions (datasets) within which a CEM is operative. We implemented CLIC using a compendium of 1774 mouse microarray datasets (28628 microarrays) or 1887 human microarray datasets (45158 microarrays). CLIC analysis reveals that of 910 canonical biological pathways, 30% consist of strongly co-expressed gene modules for which new members are predicted. For example, CLIC predicts a functional connection between protein C7orf55 (FMC1) and the mitochondrial ATP synthase complex that we have experimentally validated. CLIC is freely available at www.gene-clic.org. We anticipate that CLIC will be valuable both for revealing new components of biological pathways as well as the conditions in which they are active. PMID:28719601

  13. MicroRNA expression, target genes, and signaling pathways in infants with a ventricular septal defect.

    PubMed

    Chai, Hui; Yan, Zhaoyuan; Huang, Ke; Jiang, Yuanqing; Zhang, Lin

    2018-02-01

    This study aimed to systematically investigate the relationship between miRNA expression and the occurrence of ventricular septal defect (VSD), and characterize the miRNA target genes and pathways that can lead to VSD. The miRNAs that were differentially expressed in blood samples from VSD and normal infants were screened and validated by implementing miRNA microarrays and qRT-PCR. The target genes regulated by differentially expressed miRNAs were predicted using three target gene databases. The functions and signaling pathways of the target genes were enriched using the GO database and KEGG database, respectively. The transcription and protein expression of specific target genes in critical pathways were compared in the VSD and normal control groups using qRT-PCR and western blotting, respectively. Compared with the normal control group, the VSD group had 22 differentially expressed miRNAs; 19 were downregulated and three were upregulated. The 10,677 predicted target genes participated in many biological functions related to cardiac development and morphogenesis. Four target genes (mGLUR, Gq, PLC, and PKC) were involved in the PKC pathway and four (ECM, FAK, PI3K, and PDK1) were involved in the PI3K-Akt pathway. The transcription and protein expression of these eight target genes were significantly upregulated in the VSD group. The 22 miRNAs that were dysregulated in the VSD group were mainly downregulated, which may result in the dysregulation of several key genes and biological functions related to cardiac development. These effects could also be exerted via the upregulation of eight specific target genes, the subsequent over-activation of the PKC and PI3K-Akt pathways, and the eventual abnormal cardiac development and VSD.

  14. Aberrant RNA splicing in cancer; expression changes and driver mutations of splicing factor genes.

    PubMed

    Sveen, A; Kilpinen, S; Ruusulehto, A; Lothe, R A; Skotheim, R I

    2016-05-12

    Alternative splicing is a widespread process contributing to structural transcript variation and proteome diversity. In cancer, the splicing process is commonly disrupted, resulting in both functional and non-functional end-products. Cancer-specific splicing events are known to contribute to disease progression; however, the dysregulated splicing patterns found on a genome-wide scale have until recently been less well-studied. In this review, we provide an overview of aberrant RNA splicing and its regulation in cancer. We then focus on the executors of the splicing process. Based on a comprehensive catalog of splicing factor encoding genes and analyses of available gene expression and somatic mutation data, we identify cancer-associated patterns of dysregulation. Splicing factor genes are shown to be significantly differentially expressed between cancer and corresponding normal samples, and to have reduced inter-individual expression variation in cancer. Furthermore, we identify enrichment of predicted cancer-critical genes among the splicing factors. In addition to previously described oncogenic splicing factor genes, we propose 24 novel cancer-critical splicing factors predicted from somatic mutations.

  15. Application of a fuzzy neural network model in predicting polycyclic aromatic hydrocarbon-mediated perturbations of the Cyp1b1 transcriptional regulatory network in mouse skin

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Larkin, Andrew; Department of Statistics, Oregon State University; Superfund Research Center, Oregon State University

    2013-03-01

    Polycyclic aromatic hydrocarbons (PAHs) are present in the environment as complex mixtures with components that have diverse carcinogenic potencies and mostly unknown interactive effects. Non-additive PAH interactions have been observed in regulation of cytochrome P450 (CYP) gene expression in the CYP1 family. To better understand and predict biological effects of complex mixtures, such as environmental PAHs, an 11 gene input-1 gene output fuzzy neural network (FNN) was developed for predicting PAH-mediated perturbations of dermal Cyp1b1 transcription in mice. Input values were generalized using fuzzy logic into low, medium, and high fuzzy subsets, and sorted using k-means clustering to create Mamdani logic functions for predicting Cyp1b1 mRNA expression. Model testing was performed with data from microarray analysis of skin samples from FVB/N mice treated with toluene (vehicle control), dibenzo[def,p]chrysene (DBC), benzo[a]pyrene (BaP), or 1 of 3 combinations of diesel particulate extract (DPE), coal tar extract (CTE) and cigarette smoke condensate (CSC) using leave-one-out cross-validation. Predictions were within 1 log2 fold change unit of microarray data, with the exception of the DBC treatment group, where the unexpected down-regulation of Cyp1b1 expression was predicted but did not reach statistical significance on the microarrays. Adding CTE to DPE was predicted to increase Cyp1b1 expression, whereas adding CSC to CTE and DPE was predicted to have no effect, in agreement with microarray results. The aryl hydrocarbon receptor repressor (Ahrr) was determined to be the most significant input variable for model predictions using back-propagation and normalization of FNN weights.
    Highlights: Tested a model to predict PAH mixture-mediated changes in Cyp1b1 expression. Quantitative predictions in agreement with microarrays for Cyp1b1 induction. Unexpected difference in expression between DBC and other treatments predicted. Model predictions for combining PAH mixtures in agreement with microarrays. Predictions highly dependent on aryl hydrocarbon receptor repressor expression.
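The fuzzification step described in this record (mapping each input into low, medium, and high fuzzy subsets) can be sketched as follows. This is an illustration only: the piecewise-linear membership shapes, the breakpoints, and the assumption that inputs are pre-normalized to the range 0 to 1 are all hypothetical choices, since the abstract does not state which membership functions were used.

```python
def fuzzify(x):
    """Map a normalized expression value to low/medium/high memberships.

    Assumed (hypothetical) shapes on [0, 1]:
      low    falls linearly from 1 at x=0 to 0 at x=0.5
      medium is a triangle peaking at x=0.5
      high   rises linearly from 0 at x=0.5 to 1 at x=1
    """
    x = min(1.0, max(0.0, x))  # clamp out-of-range inputs
    return {
        "low": max(0.0, 1.0 - 2.0 * x),
        "medium": max(0.0, 1.0 - abs(2.0 * x - 1.0)),
        "high": max(0.0, 2.0 * x - 1.0),
    }

def strongest_subset(x):
    """Return the label with the highest membership (ties favor the first label)."""
    memberships = fuzzify(x)
    return max(memberships, key=memberships.get)

quarter = fuzzify(0.25)   # half low, half medium
middle = fuzzify(0.5)     # fully medium
```

With these shapes the three memberships sum to 1 for any input in range, so a value is always shared between at most two adjacent subsets.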

  16. Female Behaviour Drives Expression and Evolution of Gustatory Receptors in Butterflies

    PubMed Central

    Briscoe, Adriana D.; Macias-Muñoz, Aide; Kozak, Krzysztof M.; Walters, James R.; Yuan, Furong; Jamie, Gabriel A.; Martin, Simon H.; Dasmahapatra, Kanchon K.; Ferguson, Laura C.; Mallet, James; Jacquin-Joly, Emmanuelle; Jiggins, Chris D.

    2013-01-01

    Secondary plant compounds are strong deterrents of insect oviposition and feeding, but may also be attractants for specialist herbivores. These insect-plant interactions are mediated by insect gustatory receptors (Grs) and olfactory receptors (Ors). An analysis of the reference genome of the butterfly Heliconius melpomene, which feeds on passion-flower vines (Passiflora spp.), together with whole-genome sequencing within the species and across the Heliconius phylogeny has permitted an unprecedented opportunity to study the patterns of gene duplication and copy-number variation (CNV) among these key sensory genes. We report in silico gene predictions of 73 Gr genes in the H. melpomene reference genome, including putative CO2, sugar, sugar alcohol, fructose, and bitter receptors. The majority of these Grs are the result of gene duplications since Heliconius shared a common ancestor with the monarch butterfly or the silkmoth. Among Grs but not Ors, CNVs are more common within species in those gene lineages that have also duplicated over this evolutionary time-scale, suggesting ongoing rapid gene family evolution. Deep sequencing (∼1 billion reads) of transcriptomes from proboscis and labial palps, antennae, and legs of adult H. melpomene males and females indicates that 67 of the predicted 73 Gr genes and 67 of the 70 predicted Or genes are expressed in these three tissues. Intriguingly, we find that one-third of all Grs show female-biased gene expression (n = 26) and nearly all of these (n = 21) are Heliconius-specific Grs. In fact, a significant excess of Grs that are expressed in female legs but not male legs are the result of recent gene duplication. This difference in Gr gene expression diversity between the sexes is accompanied by a striking sexual dimorphism in the abundance of gustatory sensilla on the forelegs of H. melpomene, suggesting that female oviposition behaviour drives the evolution of new gustatory receptors in butterfly genomes. PMID:23950722

  17. Biological interpretation of genome-wide association studies using predicted gene functions.

    PubMed

    Pers, Tune H; Karjalainen, Juha M; Chan, Yingleong; Westra, Harm-Jan; Wood, Andrew R; Yang, Jian; Lui, Julian C; Vedantam, Sailaja; Gustafsson, Stefan; Esko, Tonu; Frayling, Tim; Speliotes, Elizabeth K; Boehnke, Michael; Raychaudhuri, Soumya; Fehrmann, Rudolf S N; Hirschhorn, Joel N; Franke, Lude

    2015-01-19

    The main challenge for gaining biological insights from genetic associations is identifying which genes and pathways explain the associations. Here we present DEPICT, an integrative tool that employs predicted gene functions to systematically prioritize the most likely causal genes at associated loci, highlight enriched pathways and identify tissues/cell types where genes from associated loci are highly expressed. DEPICT is not limited to genes with established functions and prioritizes relevant gene sets for many phenotypes.

  18. Phylogenomic detection and functional prediction of genes potentially important for plant meiosis.

    PubMed

    Zhang, Luoyan; Kong, Hongzhi; Ma, Hong; Yang, Ji

    2018-02-15

    Meiosis is a specialized type of cell division necessary for sexual reproduction in eukaryotes. A better understanding of the cytological procedures of meiosis has been achieved by comprehensive cytogenetic studies in plants, while the genetic mechanisms regulating meiotic progression remain incompletely understood. The increasing accumulation of complete genome sequences and large-scale gene expression datasets has provided a powerful resource for phylogenomic inference and unsupervised identification of genes involved in plant meiosis. By integrating sequence homology and expression data, 164, 131, 124 and 162 genes potentially important for meiosis were identified in the genomes of Arabidopsis thaliana, Oryza sativa, Selaginella moellendorffii and Pogonatum aloides, respectively. The predicted genes were assigned to 45 meiotic GO terms, and their functions were related to different processes occurring during meiosis in various organisms. Most of the predicted meiotic genes underwent lineage-specific duplication events during plant evolution, with about 30% of the predicted genes retaining only a single copy in higher plant genomes. The results of this study provided clues to design experiments for better functional characterization of meiotic genes in plants, promoting the phylogenomic approach to the evolutionary dynamics of the plant meiotic machineries. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. Identification and Correction of Sample Mix-Ups in Expression Genetic Data: A Case Study

    PubMed Central

    Broman, Karl W.; Keller, Mark P.; Broman, Aimee Teo; Kendziorski, Christina; Yandell, Brian S.; Sen, Śaunak; Attie, Alan D.

    2015-01-01

    In a mouse intercross with more than 500 animals and genome-wide gene expression data on six tissues, we identified a high proportion (18%) of sample mix-ups in the genotype data. Local expression quantitative trait loci (eQTL; genetic loci influencing gene expression) with extremely large effect were used to form a classifier to predict an individual’s eQTL genotype based on expression data alone. By considering multiple eQTL and their related transcripts, we identified numerous individuals whose predicted eQTL genotypes (based on their expression data) did not match their observed genotypes, and then went on to identify other individuals whose genotypes did match the predicted eQTL genotypes. The concordance of predictions across six tissues indicated that the problem was due to mix-ups in the genotypes (although we further identified a small number of sample mix-ups in each of the six panels of gene expression microarrays). Consideration of the plate positions of the DNA samples indicated a number of off-by-one and off-by-two errors, likely the result of pipetting errors. Such sample mix-ups can be a problem in any genetic study, but eQTL data allow us to identify, and even correct, such problems. Our methods have been implemented in an R package, R/lineup. PMID:26290572

  20. Identification and Correction of Sample Mix-Ups in Expression Genetic Data: A Case Study.

    PubMed

    Broman, Karl W; Keller, Mark P; Broman, Aimee Teo; Kendziorski, Christina; Yandell, Brian S; Sen, Śaunak; Attie, Alan D

    2015-08-19

    In a mouse intercross with more than 500 animals and genome-wide gene expression data on six tissues, we identified a high proportion (18%) of sample mix-ups in the genotype data. Local expression quantitative trait loci (eQTL; genetic loci influencing gene expression) with extremely large effect were used to form a classifier to predict an individual's eQTL genotype based on expression data alone. By considering multiple eQTL and their related transcripts, we identified numerous individuals whose predicted eQTL genotypes (based on their expression data) did not match their observed genotypes, and then went on to identify other individuals whose genotypes did match the predicted eQTL genotypes. The concordance of predictions across six tissues indicated that the problem was due to mix-ups in the genotypes (although we further identified a small number of sample mix-ups in each of the six panels of gene expression microarrays). Consideration of the plate positions of the DNA samples indicated a number of off-by-one and off-by-two errors, likely the result of pipetting errors. Such sample mix-ups can be a problem in any genetic study, but eQTL data allow us to identify, and even correct, such problems. Our methods have been implemented in an R package, R/lineup. Copyright © 2015 Broman et al.
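The idea of predicting an eQTL genotype from expression and comparing it with the recorded genotype can be sketched as below. This is a toy, single-transcript version under stated assumptions: it uses a nearest-centroid rule chosen for brevity (not necessarily the classifier used in R/lineup), and the expression values and genotype labels are invented for illustration.

```python
from statistics import mean

def fit_centroids(expression, genotypes):
    """Mean expression of one transcript within each recorded genotype group."""
    groups = {}
    for value, genotype in zip(expression, genotypes):
        groups.setdefault(genotype, []).append(value)
    return {genotype: mean(values) for genotype, values in groups.items()}

def predict_genotype(value, centroids):
    """Assign the genotype whose centroid is closest to the expression value."""
    return min(centroids, key=lambda g: abs(value - centroids[g]))

def flag_mismatches(expression, genotypes):
    """Indices of samples whose predicted genotype differs from the recorded one."""
    centroids = fit_centroids(expression, genotypes)
    return [
        i
        for i, (value, genotype) in enumerate(zip(expression, genotypes))
        if predict_genotype(value, centroids) != genotype
    ]

# Invented data: a strong local eQTL separates the three genotype groups;
# the last sample is recorded as "BB" but expresses like "AA".
expression = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8, 9.0, 9.1, 1.05]
recorded = ["AA", "AA", "AA", "AB", "AB", "AB", "BB", "BB", "BB"]

suspects = flag_mismatches(expression, recorded)
```

Because mislabeled samples also pull the centroids toward the wrong group, a single transcript is fragile; the abstract's use of multiple eQTL and of concordance across six tissues addresses exactly that weakness.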

  1. Literature-based condition-specific miRNA-mRNA target prediction.

    PubMed

    Oh, Minsik; Rhee, Sungmin; Moon, Ji Hwan; Chae, Heejoon; Lee, Sunwon; Kang, Jaewoo; Kim, Sun

    2017-01-01

    miRNAs are small non-coding RNAs that regulate gene expression by binding to the 3'-UTR of genes. Many recent studies have reported that miRNAs play important biological roles by regulating specific mRNAs or genes. Many sequence-based target prediction algorithms have been developed to predict miRNA targets. However, these methods are not designed for condition-specific target predictions and produce many false positives; thus, expression-based target prediction algorithms have been developed for condition-specific target predictions. A typical strategy to utilize expression data is to leverage the negative control roles of miRNAs on genes. To control false positives, a stringent cutoff value is typically set, but in this case, these methods tend to reject many true target relationships, i.e., false negatives. To overcome these limitations, additional information should be utilized. The literature is probably the best resource that we can utilize. Recent literature mining systems compile millions of articles with experiments designed for specific biological questions, and the systems provide a function to search for specific information. To utilize the literature information, we used a literature mining system, BEST, that automatically extracts information from the literature in PubMed and that allows the user to perform searches of the literature with any English words. By integrating omics data analysis methods and BEST, we developed Context-MMIA, a miRNA-mRNA target prediction method that combines expression data analysis results and the literature information extracted based on the user-specified context. In the pathway enrichment analysis using genes included in the top 200 miRNA-targets, Context-MMIA outperformed the four existing target prediction methods that we tested. In another test on whether prediction methods can reproduce experimentally validated target relationships, Context-MMIA outperformed the four existing target prediction methods. In summary, Context-MMIA allows the user to specify a context of the experimental data to predict miRNA targets, and we believe that Context-MMIA is very useful for predicting condition-specific miRNA targets.

  2. Genes involved in host-parasite interactions can be revealed by their correlated expression.

    PubMed

    Reid, Adam James; Berriman, Matthew

    2013-02-01

    Molecular interactions between a parasite and its host are key to the ability of the parasite to enter the host and persist. Our understanding of the genes and proteins involved in these interactions is limited. To better understand these processes it would be advantageous to have a range of methods to predict pairs of genes involved in such interactions. Correlated gene expression profiles can be used to identify molecular interactions within a species. Here we have extended the concept to different species, showing that genes with correlated expression are more likely to encode proteins, which directly or indirectly participate in host-parasite interaction. We go on to examine our predictions of molecular interactions between the malaria parasite and both its mammalian host and insect vector. Our approach could be applied to study any interaction between species, for example, between a host and its parasites or pathogens, but also symbiotic and commensal pairings.
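The cross-species correlation idea in this abstract can be illustrated with a minimal sketch that ranks host-parasite gene pairs by the Pearson correlation of their expression across matched samples. The gene names and values are invented, and ranking by plain Pearson correlation is an assumption made for illustration; the abstract does not specify the statistic or any filtering the authors applied.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def rank_cross_species_pairs(host_expr, parasite_expr):
    """Score every host/parasite gene pair, most positively correlated first."""
    scored = [
        (host_gene, parasite_gene, pearson(h_values, p_values))
        for host_gene, h_values in host_expr.items()
        for parasite_gene, p_values in parasite_expr.items()
    ]
    return sorted(scored, key=lambda item: item[2], reverse=True)

# Invented profiles measured in the same four infected samples.
host = {"HOST_1": [1.0, 2.0, 3.0, 4.0], "HOST_2": [4.0, 3.0, 2.0, 1.0]}
parasite = {"PARA_1": [2.0, 4.0, 6.0, 8.0]}

ranked = rank_cross_species_pairs(host, parasite)
```

The sketch requires that both species are profiled in the same samples, since the correlation is computed position by position across them.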

  3. Genetic regulation of gene expression in the lung identifies CST3 and CD22 as potential causal genes for airflow obstruction.

    PubMed

    Lamontagne, Maxime; Timens, Wim; Hao, Ke; Bossé, Yohan; Laviolette, Michel; Steiling, Katrina; Campbell, Joshua D; Couture, Christian; Conti, Massimo; Sherwood, Karen; Hogg, James C; Brandsma, Corry-Anke; van den Berge, Maarten; Sandford, Andrew; Lam, Stephen; Lenburg, Marc E; Spira, Avrum; Paré, Peter D; Nickle, David; Sin, Don D; Postma, Dirkje S

    2014-11-01

    COPD is a complex chronic disease with poorly understood pathogenesis. Integrative genomic approaches have the potential to elucidate the biological networks underlying COPD and lung function. We recently combined genome-wide genotyping and gene expression in 1111 human lung specimens to map expression quantitative trait loci (eQTL). The objective was to determine causal associations between COPD and lung function-associated single nucleotide polymorphisms (SNPs) and lung tissue gene expression changes in our lung eQTL dataset. We evaluated causality between SNPs and gene expression for three COPD phenotypes: FEV(1)% predicted, FEV(1)/FVC and COPD as a categorical variable. Different models were assessed in the three cohorts independently and in a meta-analysis. SNPs associated with a COPD phenotype and gene expression were subjected to causal pathway modelling and manual curation. In silico analyses evaluated functional enrichment of biological pathways among newly identified causal genes. Biologically relevant causal genes were validated in two separate gene expression datasets of lung tissues and bronchial airway brushings. High reliability causal relations were found in SNP-mRNA-phenotype triplets for FEV(1)% predicted (n=169) and FEV(1)/FVC (n=80). Several genes of potential biological relevance for COPD were revealed. eQTL-SNPs upregulating cystatin C (CST3) and CD22 were associated with worse lung function. Signalling pathways enriched with causal genes included xenobiotic metabolism, apoptosis, protease-antiprotease and oxidant-antioxidant balance. By using integrative genomics and analysing the relationships of COPD phenotypes with SNPs and gene expression in lung tissue, we identified CST3 and CD22 as potential causal genes for airflow obstruction. This study also augmented the understanding of previously described COPD pathways. Published by the BMJ Publishing Group Limited.

  4. Tumour gene expression predicts response to cetuximab in patients with KRAS wild-type metastatic colorectal cancer.

    PubMed

    Baker, J B; Dutta, D; Watson, D; Maddala, T; Munneke, B M; Shak, S; Rowinsky, E K; Xu, L-A; Harbison, C T; Clark, E A; Mauro, D J; Khambata-Ford, S

    2011-02-01

    Although it is accepted that metastatic colorectal cancers (mCRCs) that carry activating mutations in KRAS are unresponsive to anti-epidermal growth factor receptor (EGFR) monoclonal antibodies, a significant fraction of KRAS wild-type (wt) mCRCs are also unresponsive to anti-EGFR therapy. Genes encoding EGFR ligands amphiregulin (AREG) and epiregulin (EREG) are promising gene expression-based markers but have not been incorporated into a test to dichotomise KRAS wt mCRC patients with respect to sensitivity to anti-EGFR treatment. We used RT-PCR to test 110 candidate gene expression markers in primary tumours from 144 KRAS wt mCRC patients who received monotherapy with the anti-EGFR antibody cetuximab. Results were correlated with multiple clinical endpoints: disease control, objective response, and progression-free survival (PFS). Expression of many of the tested candidate genes, including EREG and AREG, strongly associate with all clinical endpoints. Using multivariate analysis with two-layer five-fold cross-validation, we constructed a four-gene predictive classifier. Strikingly, patients below the classifier cutpoint had PFS and disease control rates similar to those of patients with KRAS mutant mCRC. Gene expression appears to identify KRAS wt mCRC patients who receive little benefit from cetuximab. It will be important to test this model in an independent validation study.
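The "two-layer five-fold cross-validation" mentioned in this abstract is commonly understood as nested cross-validation: an inner loop used for model or gene selection and an outer loop used only for performance estimation. The sketch below generates such index splits under that assumption; the deterministic round-robin fold assignment and the function names are illustrative choices, not details taken from the study.

```python
def k_folds(items, k):
    """Split a list into k folds by round-robin assignment (deterministic)."""
    return [items[i::k] for i in range(k)]

def nested_cv_splits(n_samples, k_outer=5, k_inner=5):
    """Yield (outer_train, outer_test, inner_splits) for each outer fold.

    inner_splits is a list of (inner_train, inner_validation) pairs drawn
    only from outer_train, so the outer test fold never influences tuning.
    """
    samples = list(range(n_samples))
    for outer_test in k_folds(samples, k_outer):
        held_out = set(outer_test)
        outer_train = [s for s in samples if s not in held_out]
        inner_splits = []
        for inner_val in k_folds(outer_train, k_inner):
            validation = set(inner_val)
            inner_train = [s for s in outer_train if s not in validation]
            inner_splits.append((inner_train, inner_val))
        yield outer_train, outer_test, inner_splits

# Example with 20 samples (a toy size, not the cohort size in the abstract).
splits = list(nested_cv_splits(20))
```

In practice the fold assignment would usually be shuffled or stratified by outcome; the fixed assignment here only keeps the example reproducible.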

  5. Genes located in a chromosomal inversion are correlated with territorial song in white-throated sparrows.

    PubMed

    Zinzow-Kramer, W M; Horton, B M; McKee, C D; Michaud, J M; Tharp, G K; Thomas, J W; Tuttle, E M; Yi, S; Maney, D L

    2015-11-01

    The genome of the white-throated sparrow (Zonotrichia albicollis) contains an inversion polymorphism on chromosome 2 that is linked to predictable variation in a suite of phenotypic traits including plumage color, aggression and parental behavior. Differences in gene expression between the two color morphs, which represent the two common inversion genotypes (ZAL2/ZAL2 and ZAL2/ZAL2(m)), may therefore advance our understanding of the molecular underpinnings of these phenotypes. To identify genes that are differentially expressed between the two morphs and correlated with behavior, we quantified gene expression and territorial aggression, including song, in a population of free-living white-throated sparrows. We analyzed gene expression in two brain regions, the medial amygdala (MeA) and hypothalamus. Both regions are part of a 'social behavior network', which is rich in steroid hormone receptors and previously linked with territorial behavior. Using weighted gene co-expression network analyses, we identified modules of genes that were correlated with both morph and singing behavior. The majority of these genes were located within the inversion, showing the profound effect of the inversion on the expression of genes captured by the rearrangement. These modules were enriched with genes related to retinoic acid signaling and basic cellular functioning. In the MeA, the most prominent pathways were those related to steroid hormone receptor activity. Within these pathways, the only gene encoding such a receptor was ESR1 (estrogen receptor 1), a gene previously shown to predict song rate in this species. The set of candidate genes we identified may mediate the effects of a chromosomal inversion on territorial behavior. © 2015 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.

  6. An entomopathogenic bacterium, Xenorhabdus nematophila, suppresses expression of antimicrobial peptides controlled by Toll and Imd pathways by blocking eicosanoid biosynthesis.

    PubMed

    Hwang, Jihyun; Park, Youngjin; Kim, Yonggyun; Hwang, Jihyun; Lee, Daeweon

    2013-07-01

    Immune-associated genes of the beet armyworm, Spodoptera exigua, were predicted from 454 pyrosequencing transcripts of hemocytes collected from fifth instar larvae challenged with bacteria. Out of 22,551 contigs and singletons, 36% of the transcripts had at least one significant hit (E-value cutoff of 1e-20) and were used to predict immune-associated genes implicated in pattern recognition, prophenoloxidase activation, intracellular signaling, and antimicrobial peptides (AMPs). The expression patterns of immune signaling and AMP genes were further confirmed in response to different types of microbial challenge. To discriminate the AMP expression signaling between Toll and Imd pathways, RNA interference was applied to specifically knock down each signal pathway; the separate silencing treatments resulted in differential suppression of AMP genes. An entomopathogenic bacterium, Xenorhabdus nematophila, suppressed expression of most AMP genes controlled by Toll and Imd pathways, while challenge with heat-killed X. nematophila induced expression of all AMPs in experimental larvae. Benzylideneacetone (BZA), a metabolite of X. nematophila, suppressed the AMP gene inductions when it was co-injected with the heat-killed X. nematophila. However, arachidonic acid, a catalytic product of PLA2, significantly reversed the inhibitory effect of BZA on the AMP gene expression. This study suggests that X. nematophila suppresses AMP production controlled by Toll and Imd pathways by inhibiting eicosanoid biosynthesis in S. exigua. © 2013 Wiley Periodicals, Inc.

  7. Assessment of the reliability of protein-protein interactions and protein function prediction.

    PubMed

    Deng, Minghua; Sun, Fengzhu; Chen, Ting

    2003-01-01

    As more and more high-throughput protein-protein interaction data are collected, the task of estimating the reliability of different data sets becomes increasingly important. In this paper, we present our study of two groups of protein-protein interaction data, the physical interaction data and the protein complex data, and estimate the reliability of these data sets using three different measurements: (1) the distribution of gene expression correlation coefficients, (2) the reliability based on gene expression correlation coefficients, and (3) the accuracy of protein function predictions. We develop a maximum likelihood method to estimate the reliability of protein interaction data sets according to the distribution of correlation coefficients of gene expression profiles of putative interacting protein pairs. The results of the three measurements are consistent with each other. The MIPS protein complex data have the highest mean gene expression correlation coefficients (0.256) and the highest accuracy in predicting protein functions (70% sensitivity and specificity), while Ito's Yeast two-hybrid data have the lowest mean (0.041) and the lowest accuracy (15% sensitivity and specificity). Uetz's data are more reliable than Ito's data in all three measurements, and the TAP protein complex data are more reliable than the HMS-PCI data in all three measurements as well. The complex data sets generally perform better in function predictions than do the physical interaction data sets. Proteins in complexes are shown to be more highly correlated in gene expression. The results confirm that the components of a protein complex can be assigned to functions that the complex carries out within a cell. There are three interaction data sets different from the above two groups: the genetic interaction data, the in-silico data and the syn-express data. Their capability of predicting protein functions generally falls between that of the Y2H data and that of the MIPS protein complex data. The supplementary information is available at the following Web site: http://www-hto.usc.edu/-msms/AssessInteraction/.

  8. Prediction of Bacillus weihenstephanensis acid resistance: the use of gene expression patterns to select potential biomarkers.

    PubMed

    Desriac, N; Postollec, F; Coroller, L; Sohier, D; Abee, T; den Besten, H M W

    2013-10-01

    Exposure to mild stress conditions can activate stress adaptation mechanisms and provide cross-resistance towards otherwise lethal stresses. In this study, an approach was followed to select molecular biomarkers (quantitative gene expressions) to predict induced acid resistance after exposure to various mild stresses, i.e. exposure to sublethal concentrations of salt, acid and hydrogen peroxide during 5 min to 60 min. Gene expression patterns of unstressed and mildly stressed cells of Bacillus weihenstephanensis were correlated to their acid resistance (3D value) which was estimated after exposure to lethal acid conditions. Among the twenty-nine candidate biomarkers, 12 genes showed expression patterns that were correlated either linearly or non-linearly to acid resistance, while for the 17 other genes the correlation remains to be determined. The selected genes represented two types of biomarkers, (i) four direct biomarker genes (lexA, spxA, narL, bkdR) for which expression patterns upon mild stress treatment were linearly correlated to induced acid resistance; and (ii) nine long-acting biomarker genes (spxA, BcerKBAB4_0325, katA, trxB, codY, lacI, BcerKBAB4_1716, BcerKBAB4_2108, relA) which were transiently up-regulated during mild stress exposure and correlated to increased acid resistance over time. Our results highlight that mild stress induced transcripts can be linearly or non-linearly correlated to induced acid resistance and both approaches can be used to find relevant biomarkers. This quantitative and systematic approach opens avenues to select cellular biomarkers that could be incremented in mathematical models to predict microbial behaviour. Copyright © 2013 Elsevier B.V. All rights reserved.

  9. Synergistic interactions of biotic and abiotic environmental stressors on gene expression.

    PubMed

    Altshuler, Ianina; McLeod, Anne M; Colbourne, John K; Yan, Norman D; Cristescu, Melania E

    2015-03-01

    Understanding the response of organisms to multiple stressors is critical for predicting if populations can adapt to rapid environmental change. Natural and anthropogenic stressors often interact, complicating general predictions. In this study, we examined the interactive and cumulative effects of two common environmental stressors, lowered calcium concentration, an anthropogenic stressor, and predator presence, a natural stressor, on the water flea Daphnia pulex. We analyzed expression changes of five genes involved in calcium homeostasis - cuticle proteins (Cutie, Icp2), calbindin (Calb), and calcium pump and channel (Serca and Ip3R) - using real-time quantitative PCR (RT-qPCR) in a full factorial experiment. We observed strong synergistic interactions between low calcium concentration and predator presence. While the Ip3R gene was not affected by the stressors, the other four genes were affected in their transcriptional levels by the combination of the stressors. Transcriptional patterns of genes that code for cuticle proteins (Cutie and Icp2) and a sarcoplasmic calcium pump (Serca) only responded to the combination of stressors, changing their relative expression levels in a synergistic response, while a calcium-binding protein (Calb) responded to low calcium stress and the combination of both stressors. The expression patterns of these genes (Cutie, Icp2, and Serca) were nonlinear, yet they were dose dependent across the calcium gradient. Multiple stressors can have complex, often unexpected effects on ecosystems. This study demonstrates that the dominant interaction for the set of tested genes appears to be synergism. We argue that gene expression patterns can be used to understand and predict the type of interaction expected when organisms are exposed simultaneously to natural and anthropogenic stressors.

  10. Predicting effects of structural stress in a genome-reduced model bacterial metabolism

    NASA Astrophysics Data System (ADS)

    Güell, Oriol; Sagués, Francesc; Serrano, M. Ángeles

    2012-08-01

    Mycoplasma pneumoniae is a human pathogen recently proposed as a genome-reduced model for bacterial systems biology. Here, we study the response of its metabolic network to different forms of structural stress, including removal of individual and pairs of reactions and knockout of genes and clusters of co-expressed genes. Our results reveal a network architecture as robust as that of other model bacteria regarding multiple failures, although less robust against individual reaction inactivation. Interestingly, metabolite motifs associated to reactions can predict the propagation of inactivation cascades and damage amplification effects arising in double knockouts. We also detect a significant correlation between gene essentiality and damages produced by single gene knockouts, and find that genes controlling high-damage reactions tend to be expressed independently of each other, a functional switch mechanism that, simultaneously, acts as a genetic firewall to protect metabolism. Prediction of failure propagation is crucial for metabolic engineering or disease treatment.

  11. Analysis of temporal transcription expression profiles reveal links between protein function and developmental stages of Drosophila melanogaster.

    PubMed

    Wan, Cen; Lees, Jonathan G; Minneci, Federico; Orengo, Christine A; Jones, David T

    2017-10-01

    Accurate gene or protein function prediction is a key challenge in the post-genome era. Most current methods perform well on molecular function prediction, but struggle to provide useful annotations relating to biological process functions due to the limited power of sequence-based features in that functional domain. In this work, we systematically evaluate the predictive power of temporal transcription expression profiles for protein function prediction in Drosophila melanogaster. Our results show significantly better performance on predicting protein function when transcription expression profile-based features are integrated with sequence-derived features, compared with the sequence-derived features alone. We also observe that the combination of expression-based and sequence-based features leads to further improvement of accuracy on predicting all three domains of gene function. Based on the optimal feature combinations, we then propose a novel multi-classifier-based function prediction method for Drosophila melanogaster proteins, FFPred-fly+. Interpreting our machine learning models also allows us to identify some of the underlying links between biological processes and developmental stages of Drosophila melanogaster.

  12. A Theoretical Lower Bound for Selection on the Expression Levels of Proteins

    DOE PAGES

    Price, Morgan N.; Arkin, Adam P.

    2016-06-11

    We use simple models of the costs and benefits of microbial gene expression to show that changing a protein's expression away from its optimum by 2-fold should reduce fitness by at least [Formula: see text], where P is the fraction of the cell's protein that the gene accounts for. As microbial genes are usually expressed at above 5 parts per million, and effective population sizes are likely to be above 10^6, this implies that 2-fold changes to gene expression levels are under strong selection, as [Formula: see text], where Ne is the effective population size and s is the selection coefficient. Thus, most gene duplications should be selected against. On the other hand, we predict that for most genes, small changes in the expression will be effectively neutral.

  13. A Theoretical Lower Bound for Selection on the Expression Levels of Proteins

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Price, Morgan N.; Arkin, Adam P.

    We use simple models of the costs and benefits of microbial gene expression to show that changing a protein's expression away from its optimum by 2-fold should reduce fitness by at least [Formula: see text], where P is the fraction of the cell's protein that the gene accounts for. As microbial genes are usually expressed at above 5 parts per million, and effective population sizes are likely to be above 10^6, this implies that 2-fold changes to gene expression levels are under strong selection, as [Formula: see text], where Ne is the effective population size and s is the selection coefficient. Thus, most gene duplications should be selected against. On the other hand, we predict that for most genes, small changes in the expression will be effectively neutral.

  14. Gene Expression Differences in Peripheral Blood of Parkinson’s Disease Patients with Distinct Progression Profiles

    PubMed Central

    Soreq, Lilach; Lobo, Patrícia P.; Mestre, Tiago; Coelho, Miguel; Rosa, Mário M.; Gonçalves, Nilza; Wales, Pauline; Mendes, Tiago; Gerhardt, Ellen; Fahlbusch, Christiane; Bonifati, Vincenzo; Bonin, Michael; Miltenberger-Miltényi, Gabriel; Borovecki, Fran; Soreq, Hermona; Ferreira, Joaquim J.; F. Outeiro, Tiago

    2016-01-01

    The prognosis of neurodegenerative disorders is clinically challenging due to the absence of established biomarkers for predicting disease progression. Here, we performed an exploratory cross-sectional, case-control study aimed at determining whether gene expression differences in peripheral blood may be used as a signature of Parkinson’s disease (PD) progression, thereby shedding light on potential molecular mechanisms underlying disease development. We compared transcriptional profiles in the blood from 34 PD patients who developed postural instability within ten years with those of 33 patients who did not develop postural instability within this time frame. Our study identified >200 differentially expressed genes between the two groups. The expression of several of the genes identified was previously found deregulated in animal models of PD and in PD patients. Relevant genes were selected for validation by real-time PCR in a subset of patients. The genes validated were linked to nucleic acid metabolism, mitochondria, immune response and intracellular transport. Interestingly, we also found deregulation of these genes in a dopaminergic cell model of PD, a simple paradigm that can now be used to further dissect the role of these molecular players on dopaminergic cell loss. Altogether, our study provides preliminary evidence that expression changes in specific groups of genes and pathways, detected in peripheral blood samples, may be correlated with differential PD progression. Our exploratory study suggests that peripheral gene expression profiling may prove valuable for assisting in prediction of PD prognosis, and identifies novel culprits possibly involved in dopaminergic cell death. Given the exploratory nature of our study, further investigations using independent, well-characterized cohorts will be essential in order to validate our candidates as predictors of PD prognosis and to definitively confirm the value of gene expression analysis in aiding patient stratification and therapeutic intervention. PMID:27322389

  15. Identification of predictive markers of cytarabine response in AML by integrative analysis of gene-expression profiles with multiple phenotypes

    PubMed Central

    Lamba, Jatinder K; Crews, Kristine R; Pounds, Stanley B; Cao, Xueyuan; Gandhi, Varsha; Plunkett, William; Razzouk, Bassem I; Lamba, Vishal; Baker, Sharyn D; Raimondi, Susana C; Campana, Dario; Pui, Ching-Hon; Downing, James R; Rubnitz, Jeffrey E; Ribeiro, Raul C

    2011-01-01

    Aim: To identify gene-expression signatures predicting cytarabine response by an integrative analysis of multiple clinical and pharmacological end points in acute myeloid leukemia (AML) patients. Materials & methods: We performed an integrated analysis to associate the gene expression of diagnostic bone marrow blasts from AML patients treated in the discovery set (AML97; n = 42) and in the independent validation set (AML02; n = 46) with multiple clinical and pharmacological end points. Based on prior biological knowledge, we defined a gene to show a therapeutically beneficial (detrimental) pattern of association if its expression was positively (negatively) correlated with favorable phenotypes such as intracellular cytarabine 5´-triphosphate levels, morphological response and event-free survival, and negatively (positively) correlated with unfavorable end points such as post-cytarabine DNA synthesis levels, minimal residual disease and cytarabine LC50. Results: We identified 240 probe sets predicting a therapeutically beneficial pattern and 97 predicting a detrimental pattern (p ≤ 0.005) in the discovery set. Of these, 60 were confirmed in the independent validation set. The validated probe sets correspond to genes involved in PIK3/PTEN/AKT/mTOR signaling, G-protein-coupled receptor signaling and leukemogenesis. This suggests that targeting these pathways as potential pharmacogenomic and therapeutic candidates could be useful for improving treatment outcomes in AML. Conclusion: This study illustrates the power of integrated data analysis of genomic data as well as multiple clinical and pharmacologic end points in the identification of genes and pathways of biological relevance. PMID:21449673

  16. Microarray analysis in rat liver slices correctly predicts in vivo hepatotoxicity.

    PubMed

    Elferink, M G L; Olinga, P; Draaisma, A L; Merema, M T; Bauerschmidt, S; Polman, J; Schoonen, W G; Groothuis, G M M

    2008-06-15

    The microarray technology, developed for the simultaneous analysis of a large number of genes, may be useful for the detection of toxicity in an early stage of the development of new drugs. The effect of different hepatotoxins was analyzed at the gene expression level in the rat liver both in vivo and in vitro. As in vitro model system the precision-cut liver slice model was used, in which all liver cell types are present in their natural architecture. This is important since drug-induced toxicity often is a multi-cellular process involving not only hepatocytes but also other cell types such as Kupffer and stellate cells. As model toxic compounds lipopolysaccharide (LPS, inducing inflammation), paracetamol (necrosis), carbon tetrachloride (CCl(4), fibrosis and necrosis) and gliotoxin (apoptosis) were used. The aim of this study was to validate the rat liver slice system as in vitro model system for drug-induced toxicity studies. The results of the microarray studies show that the in vitro profiles of gene expression cluster per compound and incubation time, and when analyzed in a commercial gene expression database, can predict the toxicity and pathology observed in vivo. Each toxic compound induces a specific pattern of gene expression changes. In addition, some common genes were up- or down-regulated with all toxic compounds. These data show that the rat liver slice system can be an appropriate tool for the prediction of multi-cellular liver toxicity. The same experiments and analyses are currently performed for the prediction of human specific toxicity using human liver slices.

  17. Blood Gene Expression Predicts Bronchiolitis Obliterans Syndrome

    PubMed Central

    Danger, Richard; Royer, Pierre-Joseph; Reboulleau, Damien; Durand, Eugénie; Loy, Jennifer; Tissot, Adrien; Lacoste, Philippe; Roux, Antoine; Reynaud-Gaubert, Martine; Gomez, Carine; Kessler, Romain; Mussot, Sacha; Dromer, Claire; Brugière, Olivier; Mornex, Jean-François; Guillemain, Romain; Dahan, Marcel; Knoop, Christiane; Botturi, Karine; Foureau, Aurore; Pison, Christophe; Koutsokera, Angela; Nicod, Laurent P.; Brouard, Sophie; Magnan, Antoine; Jougon, J.

    2018-01-01

    Bronchiolitis obliterans syndrome (BOS), the main manifestation of chronic lung allograft dysfunction, leads to poor long-term survival after lung transplantation. Identifying predictors of BOS is essential to prevent the progression of dysfunction before irreversible damage occurs. By using a large set of 107 samples from lung recipients, we performed microarray gene expression profiling of whole blood to identify early biomarkers of BOS, including samples from 49 patients with stable function for at least 3 years, 32 samples collected at least 6 months before BOS diagnosis (prediction group), and 26 samples at or after BOS diagnosis (diagnosis group). An independent set from 25 lung recipients was used for validation by quantitative PCR (13 stables, 11 in the prediction group, and 8 in the diagnosis group). We identified 50 transcripts differentially expressed between stable and BOS recipients. Three genes, namely POU class 2 associating factor 1 (POU2AF1), T-cell leukemia/lymphoma protein 1A (TCL1A), and B cell lymphocyte kinase, were validated as predictive biomarkers of BOS more than 6 months before diagnosis, with areas under the curve of 0.83, 0.77, and 0.78 respectively. These genes allow stratification based on BOS risk (log-rank test p < 0.01) and are not associated with time posttransplantation. This is the first published large-scale gene expression analysis of blood after lung transplantation. The three-gene blood signature could provide clinicians with new tools to improve follow-up and adapt treatment of patients likely to develop BOS. PMID:29375549

  18. Microarray analysis in rat liver slices correctly predicts in vivo hepatotoxicity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Elferink, M.G.L.; Olinga, P.; Draaisma, A.L.

    2008-06-15

    The microarray technology, developed for the simultaneous analysis of a large number of genes, may be useful for the detection of toxicity in an early stage of the development of new drugs. The effect of different hepatotoxins was analyzed at the gene expression level in the rat liver both in vivo and in vitro. As in vitro model system the precision-cut liver slice model was used, in which all liver cell types are present in their natural architecture. This is important since drug-induced toxicity often is a multi-cellular process involving not only hepatocytes but also other cell types such as Kupffer and stellate cells. As model toxic compounds lipopolysaccharide (LPS, inducing inflammation), paracetamol (necrosis), carbon tetrachloride (CCl4, fibrosis and necrosis) and gliotoxin (apoptosis) were used. The aim of this study was to validate the rat liver slice system as in vitro model system for drug-induced toxicity studies. The results of the microarray studies show that the in vitro profiles of gene expression cluster per compound and incubation time, and when analyzed in a commercial gene expression database, can predict the toxicity and pathology observed in vivo. Each toxic compound induces a specific pattern of gene expression changes. In addition, some common genes were up- or down-regulated with all toxic compounds. These data show that the rat liver slice system can be an appropriate tool for the prediction of multi-cellular liver toxicity. The same experiments and analyses are currently performed for the prediction of human specific toxicity using human liver slices.

  19. A gene expression signature of RAS pathway dependence predicts response to PI3K and RAS pathway inhibitors and expands the population of RAS pathway activated tumors.

    PubMed

    Loboda, Andrey; Nebozhyn, Michael; Klinghoffer, Rich; Frazier, Jason; Chastain, Michael; Arthur, William; Roberts, Brian; Zhang, Theresa; Chenard, Melissa; Haines, Brian; Andersen, Jannik; Nagashima, Kumiko; Paweletz, Cloud; Lynch, Bethany; Feldman, Igor; Dai, Hongyue; Huang, Pearl; Watters, James

    2010-06-30

    Hyperactivation of the Ras signaling pathway is a driver of many cancers, and RAS pathway activation can predict response to targeted therapies. Therefore, optimal methods for measuring Ras pathway activation are critical. The main focus of our work was to develop a gene expression signature that is predictive of RAS pathway dependence. We used the coherent expression of RAS pathway-related genes across multiple datasets to derive a RAS pathway gene expression signature and generate RAS pathway activation scores in pre-clinical cancer models and human tumors. We then related this signature to KRAS mutation status and drug response data in pre-clinical and clinical datasets. The RAS signature score is predictive of KRAS mutation status in lung tumors and cell lines with high (> 90%) sensitivity but relatively low (50%) specificity due to samples that have apparent RAS pathway activation in the absence of a KRAS mutation. In lung and breast cancer cell line panels, the RAS pathway signature score correlates with pMEK and pERK expression, and predicts resistance to AKT inhibition and sensitivity to MEK inhibition within both KRAS mutant and KRAS wild-type groups. The RAS pathway signature is upregulated in breast cancer cell lines that have acquired resistance to AKT inhibition, and is downregulated by inhibition of MEK. In lung cancer cell lines knockdown of KRAS using siRNA demonstrates that the RAS pathway signature is a better measure of dependence on RAS compared to KRAS mutation status. In human tumors, the RAS pathway signature is elevated in ER negative breast tumors and lung adenocarcinomas, and predicts resistance to cetuximab in metastatic colorectal cancer. These data demonstrate that the RAS pathway signature is superior to KRAS mutation status for the prediction of dependence on RAS signaling, can predict response to PI3K and RAS pathway inhibitors, and is likely to have the most clinical utility in lung and breast tumors.

  20. A Pathway Based Classification Method for Analyzing Gene Expression for Alzheimer's Disease Diagnosis.

    PubMed

    Voyle, Nicola; Keohane, Aoife; Newhouse, Stephen; Lunnon, Katie; Johnston, Caroline; Soininen, Hilkka; Kloszewska, Iwona; Mecocci, Patrizia; Tsolaki, Magda; Vellas, Bruno; Lovestone, Simon; Hodges, Angela; Kiddle, Steven; Dobson, Richard Jb

    2016-01-01

    Recent studies indicate that gene expression levels in blood may be able to differentiate subjects with Alzheimer's disease (AD) from normal elderly controls and mild cognitively impaired (MCI) subjects. However, there is limited replicability at the single marker level. A pathway-based interpretation of gene expression may prove more robust. This study aimed to investigate whether a case/control classification model built on pathway level data was more robust than a gene level model and may consequently perform better in test data. The study used two batches of gene expression data from the AddNeuroMed (ANM) and Dementia Case Registry (DCR) cohorts. Our study used Illumina Human HT-12 Expression BeadChips to collect gene expression from blood samples. Random forest modeling with recursive feature elimination was used to predict case/control status. Age and APOE ɛ4 status were used as covariates for all analyses. Gene and pathway level models performed similarly to each other and to a model based on demographic information only. Any potential increase in concordance from the novel pathway level approach used here has not led to a greater predictive ability in these datasets. However, we have only tested one method for creating pathway level scores. Further, we have been able to benchmark pathways against genes in datasets that had been extensively harmonized. Further work should focus on the use of alternative methods for creating pathway level scores, in particular those that incorporate pathway topology, and the use of an endophenotype based approach.

  1. Biological interpretation of genome-wide association studies using predicted gene functions

    PubMed Central

    Pers, Tune H.; Karjalainen, Juha M.; Chan, Yingleong; Westra, Harm-Jan; Wood, Andrew R.; Yang, Jian; Lui, Julian C.; Vedantam, Sailaja; Gustafsson, Stefan; Esko, Tonu; Frayling, Tim; Speliotes, Elizabeth K.; Boehnke, Michael; Raychaudhuri, Soumya; Fehrmann, Rudolf S.N.; Hirschhorn, Joel N.; Franke, Lude

    2015-01-01

    The main challenge for gaining biological insights from genetic associations is identifying which genes and pathways explain the associations. Here we present DEPICT, an integrative tool that employs predicted gene functions to systematically prioritize the most likely causal genes at associated loci, highlight enriched pathways and identify tissues/cell types where genes from associated loci are highly expressed. DEPICT is not limited to genes with established functions and prioritizes relevant gene sets for many phenotypes. PMID:25597830

  2. A transcriptome-wide association study of 229,000 women identifies new candidate susceptibility genes for breast cancer.

    PubMed

    Wu, Lang; Shi, Wei; Long, Jirong; Guo, Xingyi; Michailidou, Kyriaki; Beesley, Jonathan; Bolla, Manjeet K; Shu, Xiao-Ou; Lu, Yingchang; Cai, Qiuyin; Al-Ejeh, Fares; Rozali, Esdy; Wang, Qin; Dennis, Joe; Li, Bingshan; Zeng, Chenjie; Feng, Helian; Gusev, Alexander; Barfield, Richard T; Andrulis, Irene L; Anton-Culver, Hoda; Arndt, Volker; Aronson, Kristan J; Auer, Paul L; Barrdahl, Myrto; Baynes, Caroline; Beckmann, Matthias W; Benitez, Javier; Bermisheva, Marina; Blomqvist, Carl; Bogdanova, Natalia V; Bojesen, Stig E; Brauch, Hiltrud; Brenner, Hermann; Brinton, Louise; Broberg, Per; Brucker, Sara Y; Burwinkel, Barbara; Caldés, Trinidad; Canzian, Federico; Carter, Brian D; Castelao, J Esteban; Chang-Claude, Jenny; Chen, Xiaoqing; Cheng, Ting-Yuan David; Christiansen, Hans; Clarke, Christine L; Collée, Margriet; Cornelissen, Sten; Couch, Fergus J; Cox, David; Cox, Angela; Cross, Simon S; Cunningham, Julie M; Czene, Kamila; Daly, Mary B; Devilee, Peter; Doheny, Kimberly F; Dörk, Thilo; Dos-Santos-Silva, Isabel; Dumont, Martine; Dwek, Miriam; Eccles, Diana M; Eilber, Ursula; Eliassen, A Heather; Engel, Christoph; Eriksson, Mikael; Fachal, Laura; Fasching, Peter A; Figueroa, Jonine; Flesch-Janys, Dieter; Fletcher, Olivia; Flyger, Henrik; Fritschi, Lin; Gabrielson, Marike; Gago-Dominguez, Manuela; Gapstur, Susan M; García-Closas, Montserrat; Gaudet, Mia M; Ghoussaini, Maya; Giles, Graham G; Goldberg, Mark S; Goldgar, David E; González-Neira, Anna; Guénel, Pascal; Hahnen, Eric; Haiman, Christopher A; Håkansson, Niclas; Hall, Per; Hallberg, Emily; Hamann, Ute; Harrington, Patricia; Hein, Alexander; Hicks, Belynda; Hillemanns, Peter; Hollestelle, Antoinette; Hoover, Robert N; Hopper, John L; Huang, Guanmengqian; Humphreys, Keith; Hunter, David J; Jakubowska, Anna; Janni, Wolfgang; John, Esther M; Johnson, Nichola; Jones, Kristine; Jones, Michael E; Jung, Audrey; Kaaks, Rudolf; Kerin, Michael J; Khusnutdinova, Elza; Kosma, Veli-Matti; Kristensen, Vessela N; Lambrechts, 
Diether; Le Marchand, Loic; Li, Jingmei; Lindström, Sara; Lissowska, Jolanta; Lo, Wing-Yee; Loibl, Sibylle; Lubinski, Jan; Luccarini, Craig; Lux, Michael P; MacInnis, Robert J; Maishman, Tom; Kostovska, Ivana Maleva; Mannermaa, Arto; Manson, JoAnn E; Margolin, Sara; Mavroudis, Dimitrios; Meijers-Heijboer, Hanne; Meindl, Alfons; Menon, Usha; Meyer, Jeffery; Mulligan, Anna Marie; Neuhausen, Susan L; Nevanlinna, Heli; Neven, Patrick; Nielsen, Sune F; Nordestgaard, Børge G; Olopade, Olufunmilayo I; Olson, Janet E; Olsson, Håkan; Peterlongo, Paolo; Peto, Julian; Plaseska-Karanfilska, Dijana; Prentice, Ross; Presneau, Nadege; Pylkäs, Katri; Rack, Brigitte; Radice, Paolo; Rahman, Nazneen; Rennert, Gad; Rennert, Hedy S; Rhenius, Valerie; Romero, Atocha; Romm, Jane; Rudolph, Anja; Saloustros, Emmanouil; Sandler, Dale P; Sawyer, Elinor J; Schmidt, Marjanka K; Schmutzler, Rita K; Schneeweiss, Andreas; Scott, Rodney J; Scott, Christopher G; Seal, Sheila; Shah, Mitul; Shrubsole, Martha J; Smeets, Ann; Southey, Melissa C; Spinelli, John J; Stone, Jennifer; Surowy, Harald; Swerdlow, Anthony J; Tamimi, Rulla M; Tapper, William; Taylor, Jack A; Terry, Mary Beth; Tessier, Daniel C; Thomas, Abigail; Thöne, Kathrin; Tollenaar, Rob A E M; Torres, Diana; Truong, Thérèse; Untch, Michael; Vachon, Celine; Van Den Berg, David; Vincent, Daniel; Waisfisz, Quinten; Weinberg, Clarice R; Wendt, Camilla; Whittemore, Alice S; Wildiers, Hans; Willett, Walter C; Winqvist, Robert; Wolk, Alicja; Xia, Lucy; Yang, Xiaohong R; Ziogas, Argyrios; Ziv, Elad; Dunning, Alison M; Pharoah, Paul D P; Simard, Jacques; Milne, Roger L; Edwards, Stacey L; Kraft, Peter; Easton, Douglas F; Chenevix-Trench, Georgia; Zheng, Wei

    2018-06-18

    The breast cancer risk variants identified in genome-wide association studies explain only a small fraction of the familial relative risk, and the genes responsible for these associations remain largely unknown. To identify novel risk loci and likely causal genes, we performed a transcriptome-wide association study evaluating associations of genetically predicted gene expression with breast cancer risk in 122,977 cases and 105,974 controls of European ancestry. We used data from the Genotype-Tissue Expression Project to establish genetic models to predict gene expression in breast tissue and evaluated model performance using data from The Cancer Genome Atlas. Of the 8,597 genes evaluated, significant associations were identified for 48 at a Bonferroni-corrected threshold of P < 5.82 × 10⁻⁶, including 14 genes at loci not yet reported for breast cancer. We silenced 13 genes and showed an effect for 11 on cell proliferation and/or colony-forming efficiency. Our study provides new insights into breast cancer genetics and biology.
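
    The Bonferroni-corrected threshold quoted in this abstract can be reproduced with a short calculation. The sketch below assumes a family-wise error rate of 0.05, which the abstract does not state; the gene count of 8,597 is taken from the abstract, and the function name is illustrative only.

```python
# Bonferroni correction: divide the family-wise error rate by the number of tests.
ALPHA = 0.05      # ASSUMPTION: family-wise error rate, not stated in the abstract
N_GENES = 8597    # number of genes evaluated, as reported in the abstract


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold for n_tests comparisons."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_GENES)
print(f"{threshold:.2e}")  # 5.82e-06
```

    Under that assumption the result agrees with the reported threshold to three significant figures.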

  3. Digital gene expression analysis of the zebra finch genome

    PubMed Central

    2010-01-01

    Background In order to understand patterns of adaptation and molecular evolution it is important to quantify both variation in gene expression and nucleotide sequence divergence. Gene expression profiling in non-model organisms has recently been facilitated by the advent of massively parallel sequencing technology. Here we investigate tissue specific gene expression patterns in the zebra finch (Taeniopygia guttata) with special emphasis on the genes of the major histocompatibility complex (MHC). Results Almost 2 million 454-sequencing reads from cDNA of six different tissues were assembled and analysed. A total of 11,793 zebra finch transcripts were represented in this EST data, indicating a transcriptome coverage of about 65%. There was a positive correlation between the tissue specificity of gene expression and non-synonymous to synonymous nucleotide substitution ratio of genes, suggesting that genes with a specialised function are evolving at a higher rate (or with less constraint) than genes with a more general function. In line with this, there was also a negative correlation between overall expression levels and expression specificity of contigs. We found evidence for expression of 10 different genes related to the MHC. MHC genes showed relatively tissue specific expression levels and were in general primarily expressed in spleen. Several MHC genes, including MHC class I also showed expression in brain. Furthermore, for all genes with highest levels of expression in spleen there was an overrepresentation of several gene ontology terms related to immune function. Conclusions Our study highlights the usefulness of next-generation sequence data for quantifying gene expression in the genome as a whole as well as in specific candidate genes. Overall, the data show predicted patterns of gene expression profiles and molecular evolution in the zebra finch genome. 
Expression of MHC genes, in particular, corresponds well with expression patterns in other vertebrates. PMID:20359325

  4. A Prognostic Gene Expression Profile That Predicts Circulating Tumor Cell Presence in Breast Cancer Patients

    PubMed Central

    Molloy, Timothy J.; Roepman, Paul; Naume, Bjørn; van't Veer, Laura J.

    2012-01-01

    The detection of circulating tumor cells (CTCs) in the peripheral blood and microarray gene expression profiling of the primary tumor are two promising new technologies able to provide valuable prognostic data for patients with breast cancer. Meta-analyses of several established prognostic breast cancer gene expression profiles in large patient cohorts have demonstrated that despite sharing few genes, their delineation of patients into “good prognosis” or “poor prognosis” are frequently very highly correlated, and combining prognostic profiles does not increase prognostic power. In the current study, we aimed to develop a novel profile which provided independent prognostic data by building a signature predictive of CTC status rather than outcome. Microarray gene expression data from an initial training cohort of 72 breast cancer patients for which CTC status had been determined in a previous study using a multimarker QPCR-based assay was used to develop a CTC-predictive profile. The generated profile was validated in two independent datasets of 49 and 123 patients and confirmed to be both predictive of CTC status, and independently prognostic. Importantly, the “CTC profile” also provided prognostic information independent of the well-established and powerful ‘70-gene’ prognostic breast cancer signature. This profile therefore has the potential to not only add prognostic information to currently-available microarray tests but in some circumstances even replace blood-based prognostic CTC tests at time of diagnosis for those patients already undergoing testing by multigene assays. PMID:22384245

  5. A Machine Learning Approach to Predict Gene Regulatory Networks in Seed Development in Arabidopsis

    PubMed Central

    Ni, Ying; Aghamirzaie, Delasa; Elmarakeby, Haitham; Collakova, Eva; Li, Song; Grene, Ruth; Heath, Lenwood S.

    2016-01-01

    Gene regulatory networks (GRNs) provide a representation of relationships between regulators and their target genes. Several methods for GRN inference, both unsupervised and supervised, have been developed to date. Because regulatory relationships consistently reprogram in diverse tissues or under different conditions, GRNs inferred without specific biological contexts are of limited applicability. In this report, a machine learning approach is presented to predict GRNs specific to developing Arabidopsis thaliana embryos. We developed the Beacon GRN inference tool to predict GRNs occurring during seed development in Arabidopsis based on a support vector machine (SVM) model. We developed both global and local inference models and compared their performance, demonstrating that local models are generally superior for our application. Using both the expression levels of the genes expressed in developing embryos and prior known regulatory relationships, GRNs were predicted for specific embryonic developmental stages. The targets that are strongly positively correlated with their regulators are mostly expressed at the beginning of seed development. Potential direct targets were identified based on a match between the promoter regions of these inferred targets and the cis elements recognized by specific regulators. Our analysis also provides evidence for previously unknown inhibitory effects of three positive regulators of gene expression. The Beacon GRN inference tool provides a valuable model system for context-specific GRN inference and is freely available at https://github.com/BeaconProjectAtVirginiaTech/beacon_network_inference.git. PMID:28066488

  6. Predicting neuroblastoma using developmental signals and a logic-based model.

    PubMed

    Kasemeier-Kulesa, Jennifer C; Schnell, Santiago; Woolley, Thomas; Spengler, Jennifer A; Morrison, Jason A; McKinney, Mary C; Pushel, Irina; Wolfe, Lauren A; Kulesa, Paul M

    2018-07-01

    Genomic information from human patient samples of pediatric neuroblastoma cancers and known outcomes have led to specific gene lists put forward as high risk for disease progression. However, the reliance on gene expression correlations rather than mechanistic insight has shown limited potential and suggests a critical need for molecular network models that better predict neuroblastoma progression. In this study, we construct and simulate a molecular network of developmental genes and downstream signals in a 6-gene input logic model that predicts a favorable/unfavorable outcome based on the outcome of the four cell states including cell differentiation, proliferation, apoptosis, and angiogenesis. We simulate the mis-expression of the tyrosine receptor kinases, trkA and trkB, two prognostic indicators of neuroblastoma, and find differences in the number and probability distribution of steady state outcomes. We validate the mechanistic model assumptions using RNAseq of the SHSY5Y human neuroblastoma cell line to define the input states and confirm the predicted outcome with antibody staining. Lastly, we apply input gene signatures from 77 published human patient samples and show that our model makes more accurate disease outcome predictions for early stage disease than any current neuroblastoma gene list. These findings highlight the predictive strength of a logic-based model based on developmental genes and offer a better understanding of the molecular network interactions during neuroblastoma disease progression. Copyright © 2018. Published by Elsevier B.V.

  7. A network approach to predict pathogenic genes for Fusarium graminearum.

    PubMed

    Liu, Xiaoping; Tang, Wei-Hua; Zhao, Xing-Ming; Chen, Luonan

    2010-10-04

    Fusarium graminearum is the pathogenic agent of Fusarium head blight (FHB), a destructive disease of wheat and barley that causes huge economic losses and, by contaminating food, health problems for humans. Identifying pathogenic genes can shed light on the pathogenesis underlying the interaction between F. graminearum and its plant host. However, it is difficult to detect pathogenic genes for this destructive pathogen by time-consuming and expensive molecular biological experiments in the lab. Computational methods provide an alternative way to solve this problem. Since pathogenesis is a complicated procedure that involves complex regulations and interactions, the molecular interaction network of F. graminearum can give clues to potential pathogenic genes. Furthermore, the gene expression data of F. graminearum before and after its invasion into the plant host can also provide useful information. In this paper, a novel systems biology approach is presented to predict pathogenic genes of F. graminearum based on the molecular interaction network and gene expression data. With a small number of known pathogenic genes as seed genes, a subnetwork that consists of potential pathogenic genes is identified from the protein-protein interaction network (PPIN) of F. graminearum, where the genes in the subnetwork are further required to be differentially expressed before and after the invasion of the pathogenic fungus. Therefore, the candidate genes in the subnetwork are expected to be involved in the same biological processes as the seed genes, which implies that they are potential pathogenic genes. The prediction results show that most of the pathogenic genes of F. graminearum are enriched in two important signal transduction pathways, the G protein coupled receptor pathway and the MAPK signaling pathway, which are known to be related to pathogenesis in other fungi. In addition, several pathogenic genes predicted by our method are verified in other pathogenic fungi, which demonstrates the effectiveness of the proposed method. The results presented in this paper can not only provide guidelines for future experimental verification, but also shed light on the pathogenesis of the destructive fungus F. graminearum.
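
    The seed-based selection described in this abstract can be illustrated with a toy example. The sketch below is a deliberate simplification: it keeps only direct interaction partners of the seed genes and then filters them by differential expression, whereas the abstract does not say how the subnetwork is actually extracted. The gene names, the edge list and the function `candidate_genes` are all hypothetical.

```python
def candidate_genes(ppi_edges, seed_genes, diff_expressed):
    """Return non-seed genes that interact directly with a seed gene
    and are also differentially expressed (simplified illustration)."""
    # Build an undirected adjacency map from the interaction edge list.
    neighbours = {}
    for a, b in ppi_edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Collect every direct partner of any seed gene.
    partners = set()
    for seed in seed_genes:
        partners |= neighbours.get(seed, set())

    # Drop the seeds themselves and keep only differentially expressed genes.
    return sorted((partners - set(seed_genes)) & set(diff_expressed))


# Hypothetical toy data, not taken from the paper.
ppi_edges = [("seedA", "g1"), ("seedA", "g2"), ("g2", "g3"),
             ("seedB", "g3"), ("g4", "g5")]
seed_genes = {"seedA", "seedB"}
diff_expressed = {"g1", "g3", "g5"}

print(candidate_genes(ppi_edges, seed_genes, diff_expressed))  # ['g1', 'g3']
```

    Here g2 is dropped because it is not differentially expressed, and g5 is dropped because it has no interaction with a seed gene.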

  8. Superoxide radical-generating compounds activate a predicted promoter site for paraquat-inducible genes of the Chromobacterium violaceum bacterium in a dose-dependent manner.

    PubMed

    Gabriel, J E; Guerra-Slompo, E P; de Souza, E M; de Carvalho, F A L; Madeira, H M F; de Vasconcelos, A T R

    2015-08-21

    The purpose of the present study was to functionally evaluate the influence of superoxide radical-generating compounds on the heterologous induction of a predicted promoter region of open reading frames for paraquat-inducible genes (pqi genes) revealed during genome annotation analyses of the Chromobacterium violaceum bacterium. A 388-bp fragment corresponding to a pqi gene promoter of C. violaceum was amplified using specific primers and cloned into a conjugative vector containing the Escherichia coli lacZ gene without a promoter. Assessments of the expression of the β-galactosidase enzyme were performed in the presence of menadione (MEN) and phenazine methosulfate (PMS) compounds at different final concentrations to evaluate the heterologous activation of the predicted promoter region of interest in C. violaceum induced by these substrates. Under these experimental conditions, the MEN reagent promoted highly significant increases in the expression of the β-galactosidase enzyme modulated by activating the promoter region of the pqi genes at all concentrations tested. On the other hand, significantly higher levels in the expression of the β-galactosidase enzyme were detected exclusively in the presence of the PMS reagent at a final concentration of 50 μg/mL. The findings described in the present study demonstrate that superoxide radical-generating compounds can activate a predicted promoter DNA motif for pqi genes of the C. violaceum bacterium in a dose-dependent manner.

  9. Large clusters of co-expressed genes in the Drosophila genome.

    PubMed

    Boutanaev, Alexander M; Kalmykova, Alla I; Shevelyov, Yuri Y; Nurminsky, Dmitry I

    2002-12-12

    Clustering of co-expressed, non-homologous genes on chromosomes implies their co-regulation. In lower eukaryotes, co-expressed genes are often found in pairs. Clustering of genes that share aspects of transcriptional regulation has also been reported in higher eukaryotes. To advance our understanding of the mode of coordinated gene regulation in multicellular organisms, we performed a genome-wide analysis of the chromosomal distribution of co-expressed genes in Drosophila. We identified a total of 1,661 testes-specific genes, one-third of which are clustered on chromosomes. The number of clusters of three or more genes is much higher than expected by chance. We observed a similar trend for genes upregulated in the embryo and in the adult head, although the expression pattern of individual genes cannot be predicted on the basis of chromosomal position alone. Our data suggest that the prevalent mechanism of transcriptional co-regulation in higher eukaryotes operates with extensive chromatin domains that comprise multiple genes.
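
    The comparison against chance mentioned in this abstract can be sketched as a permutation test. The code below assumes that a cluster is a run of three or more adjacent flagged genes in chromosomal order and that chance is modelled by shuffling the flags; the abstract gives neither definition, and the toy gene order is hypothetical.

```python
import random


def count_clusters(flags, min_size=3):
    """Count runs of at least min_size consecutive True values."""
    clusters, run = 0, 0
    for flag in flags:
        if flag:
            run += 1
        else:
            if run >= min_size:
                clusters += 1
            run = 0
    if run >= min_size:  # close a run that reaches the end of the list
        clusters += 1
    return clusters


def permutation_p_value(flags, n_perm=1000, min_size=3, seed=0):
    """Return the observed cluster count and an empirical one-sided p-value."""
    rng = random.Random(seed)
    observed = count_clusters(flags, min_size)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_clusters(shuffled, min_size) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical gene order: True marks a tissue-specific gene.
flags = ([True] * 3 + [False] * 10 + [True] * 4 + [False] * 10
         + [True] * 3 + [False] * 10)
observed, p_value = permutation_p_value(flags)
print(observed, p_value)
```

    Shuffling keeps the number of flagged genes fixed, so only their arrangement along the chromosome is tested.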

  10. Target research on tumor biology characteristics of mir-155-5p regulation on gastric cancer cell.

    PubMed

    Feng, Jun-an

    2016-03-01

    After mir-155-5p was overexpressed in gastric cancer cells, an expression profile chip was used to screen for its target genes. Target genes that also appeared in the bioinformatics predictions were selected in order to study the mechanism by which mir-155-5p functions. Affymetrix eukaryotic gene expression profiling was used to screen for genes regulated by mir-155-5p, and Western blotting was used to detect the protein expression of the target genes. Mimics were transfected into BGC-823 gastric cancer cells. Compared with the mimics-nc group and the mock group, the mRNA expression levels of SMAD1, STAT1, CAB39, CXCR4 and CA9 were significantly lower. After the gastric cancer cell lines BGC-823 and MKN-45 had been transfected with mimics, the protein expression of SMAD1, STAT1 and CAB39 was decreased in the mimics (MIMICS) group compared with the mimics-nc (MNC) and mock (MOCK) groups. Verification by qRT-PCR demonstrated that SMAD1, STAT1, CAB39, CXCR4 and CA9 were the predicted target genes and target proteins of mir-155-5p, and that overexpression of mir-155-5p could decrease their expression levels in the gastric cancer cell lines MKN-45 and BGC-823.

  11. Differential Responses to Wnt and PCP Disruption Predict Expression and Developmental Function of Conserved and Novel Genes in a Cnidarian

    PubMed Central

    Lapébie, Pascal; Ruggiero, Antonella; Barreau, Carine; Chevalier, Sandra; Chang, Patrick; Dru, Philippe; Houliston, Evelyn; Momose, Tsuyoshi

    2014-01-01

    We have used Digital Gene Expression analysis to identify, without bilaterian bias, regulators of cnidarian embryonic patterning. Transcriptome comparison between un-manipulated Clytia early gastrula embryos and ones in which the key polarity regulator Wnt3 was inhibited using morpholino antisense oligonucleotides (Wnt3-MO) identified a set of significantly over and under-expressed transcripts. These code for candidate Wnt signaling modulators, orthologs of other transcription factors, secreted and transmembrane proteins known as developmental regulators in bilaterian models or previously uncharacterized, and also many cnidarian-restricted proteins. Comparisons between embryos injected with morpholinos targeting Wnt3 and its receptor Fz1 defined four transcript classes showing remarkable correlation with spatiotemporal expression profiles. Class 1 and 3 transcripts tended to show sustained expression at “oral” and “aboral” poles respectively of the developing planula larva, class 2 transcripts in cells ingressing into the endodermal region during gastrulation, while class 4 gene expression was repressed at the early gastrula stage. The preferential effect of Fz1-MO on expression of class 2 and 4 transcripts can be attributed to Planar Cell Polarity (PCP) disruption, since it was closely matched by morpholino knockdown of the specific PCP protein Strabismus. We conclude that endoderm and post gastrula-specific gene expression is particularly sensitive to PCP disruption while Wnt-/β-catenin signaling dominates gene regulation along the oral-aboral axis. Phenotype analysis using morpholinos targeting a subset of transcripts indicated developmental roles consistent with expression profiles for both conserved and cnidarian-restricted genes. 
Overall our unbiased screen allowed systematic identification of regionally expressed genes and provided functional support for a shared eumetazoan developmental regulatory gene set with both predicted and previously unexplored members, but also demonstrated that fundamental developmental processes including axial patterning and endoderm formation in cnidarians can involve newly evolved (or highly diverged) genes. PMID:25233086

  12. Differential responses to Wnt and PCP disruption predict expression and developmental function of conserved and novel genes in a cnidarian.

    PubMed

    Lapébie, Pascal; Ruggiero, Antonella; Barreau, Carine; Chevalier, Sandra; Chang, Patrick; Dru, Philippe; Houliston, Evelyn; Momose, Tsuyoshi

    2014-09-01

    We have used Digital Gene Expression analysis to identify, without bilaterian bias, regulators of cnidarian embryonic patterning. Transcriptome comparison between un-manipulated Clytia early gastrula embryos and ones in which the key polarity regulator Wnt3 was inhibited using morpholino antisense oligonucleotides (Wnt3-MO) identified a set of significantly over and under-expressed transcripts. These code for candidate Wnt signaling modulators, orthologs of other transcription factors, secreted and transmembrane proteins known as developmental regulators in bilaterian models or previously uncharacterized, and also many cnidarian-restricted proteins. Comparisons between embryos injected with morpholinos targeting Wnt3 and its receptor Fz1 defined four transcript classes showing remarkable correlation with spatiotemporal expression profiles. Class 1 and 3 transcripts tended to show sustained expression at "oral" and "aboral" poles respectively of the developing planula larva, class 2 transcripts in cells ingressing into the endodermal region during gastrulation, while class 4 gene expression was repressed at the early gastrula stage. The preferential effect of Fz1-MO on expression of class 2 and 4 transcripts can be attributed to Planar Cell Polarity (PCP) disruption, since it was closely matched by morpholino knockdown of the specific PCP protein Strabismus. We conclude that endoderm and post gastrula-specific gene expression is particularly sensitive to PCP disruption while Wnt-/β-catenin signaling dominates gene regulation along the oral-aboral axis. Phenotype analysis using morpholinos targeting a subset of transcripts indicated developmental roles consistent with expression profiles for both conserved and cnidarian-restricted genes. 
Overall our unbiased screen allowed systematic identification of regionally expressed genes and provided functional support for a shared eumetazoan developmental regulatory gene set with both predicted and previously unexplored members, but also demonstrated that fundamental developmental processes including axial patterning and endoderm formation in cnidarians can involve newly evolved (or highly diverged) genes.

  13. Gene expression profiles associated with anaemia and ITPA genotypes in patients with chronic hepatitis C (CH-C).

    PubMed

    Birerdinc, A; Estep, M; Afendy, A; Stepanova, M; Younossi, I; Baranova, A; Younossi, Z M

    2012-06-01

    Anaemia is a common side effect of ribavirin (RBV) which is used for the treatment of hepatitis C. Inosine triphosphatase gene polymorphism (C to A) protects against RBV-induced anaemia. The aim of our study was to genotype patients for inosine triphosphatase gene polymorphism rs1127354 SNP (CC or CA) and associate treatment-induced anaemia with gene expression profile and genotypes. We used 67 hepatitis C patients with available gene expression, clinical, laboratory data and whole-blood samples. Whole blood was used to determine inosine triphosphatase gene polymorphism rs1127354 genotypes (CC or CA). The cohort with inosine triphosphatase gene polymorphism CA genotype revealed a distinct pattern of protection against anaemia and a lower drop in haemoglobin. A variation in the propensity of CC carriers to develop anaemia prompted us to look for additional predictors of anaemia during pegylated interferon (PEG-IFN) and RBV. Pretreatment blood samples of patients receiving a full course of PEG-IFN and RBV were used to assess expression of 153 genes previously implicated in host response to viral infections. The gene expression data were analysed according to presence of anaemia and inosine triphosphatase gene polymorphism genotypes. Thirty-six genes were associated with treatment-related anaemia, six of which are involved in the response to hypoxia pathway (HIF1A, AIF1, RHOC, PTEN, LCK and PDGFB). There was a substantial overlap between sustained virological response (SVR)-predicting and anaemia-related genes; however, of the nine JAK-STAT pathway-related genes associated with SVR, none were implicated in anaemia. These observations exclude the direct involvement of antiviral response in the development of anaemia associated with PEG-IFN and RBV treatment, whereas another, distinct component within the SVR-associated gene expression response may predict anaemia. 
We have identified baseline gene expression signatures associated with RBV-induced anaemia and identified its functional pathways. In particular, we identified the hypoxia response pathway and the apoptosis/survival-related gene network, as differentially expressed in chronic hepatitis C patients with anaemia. © 2011 Blackwell Publishing Ltd.

  14. TANDEM: a two-stage approach to maximize interpretability of drug response models based on multiple molecular data types.

    PubMed

    Aben, Nanne; Vis, Daniel J; Michaut, Magali; Wessels, Lodewyk F A

    2016-09-01

    Clinical response to anti-cancer drugs varies between patients. A large portion of this variation can be explained by differences in molecular features, such as mutation status, copy number alterations, methylation and gene expression profiles. We show that the classic approach for combining these molecular features (Elastic Net regression on all molecular features simultaneously) results in models that are almost exclusively based on gene expression. The gene expression features selected by the classic approach are difficult to interpret as they often represent poorly studied combinations of genes, activated by aberrations in upstream signaling pathways. To utilize all data types in a more balanced way, we developed TANDEM, a two-stage approach in which the first stage explains response using upstream features (mutations, copy number, methylation and cancer type) and the second stage explains the remainder using downstream features (gene expression). Applying TANDEM to 934 cell lines profiled across 265 drugs (GDSC1000), we show that the resulting models are more interpretable, while retaining the same predictive performance as the classic approach. Using the more balanced contributions per data type as determined with TANDEM, we find that response to MAPK pathway inhibitors is largely predicted by mutation data, while predicting response to DNA damaging agents requires gene expression data, in particular SLFN11 expression. TANDEM is available as an R package on CRAN (for more information, see http://ccb.nki.nl/software/tandem). Contact: m.michaut@nki.nl or l.wessels@nki.nl. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
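
    The two-stage idea in this abstract (fit the response on upstream features first, then fit what remains on downstream features) can be sketched in a few lines. This is not the TANDEM R package or its API: single-feature ordinary least squares stands in for the Elastic Net regression named in the abstract, and the data and function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_fit(upstream, downstream, response):
    """Stage 1 explains the response with the upstream feature;
    stage 2 explains the stage-1 residual with the downstream feature."""
    a0, a1 = fit_line(upstream, response)
    residual = [y - (a0 + a1 * u) for u, y in zip(upstream, response)]
    b0, b1 = fit_line(downstream, residual)
    return (a0, a1), (b0, b1)


def predict(model, u, d):
    """Sum the contributions of both stages for one sample."""
    (a0, a1), (b0, b1) = model
    return (a0 + a1 * u) + (b0 + b1 * d)


# Hypothetical toy data: a mutation flag (upstream), an expression
# level (downstream) and a drug response for four cell lines.
upstream = [0.0, 0.0, 1.0, 1.0]
downstream = [1.0, 2.0, 1.0, 2.0]
response = [0.5, 1.0, 2.5, 3.0]

model = two_stage_fit(upstream, downstream, response)
print(predict(model, 1.0, 2.0))
```

    Because the downstream feature only sees the residual, any variation that the upstream feature can explain is credited to the upstream stage, which is the balancing effect the abstract describes.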

  15. Gene Expression Changes in Phosphorus Deficient Potato (Solanum tuberosum L.) Leaves and the Potential for Diagnostic Gene Expression Markers

    PubMed Central

    Hammond, John P.; Broadley, Martin R.; Bowen, Helen C.; Spracklen, William P.; Hayden, Rory M.; White, Philip J.

    2011-01-01

    Background There are compelling economic and environmental reasons to reduce our reliance on inorganic phosphate (Pi) fertilisers. Better management of Pi fertiliser applications is one option to improve the efficiency of Pi fertiliser use, whilst maintaining crop yields. Application rates of Pi fertilisers are traditionally determined from analyses of soil or plant tissues. Alternatively, diagnostic genes with altered expression under Pi limiting conditions, which suggest a physiological requirement for Pi fertilisation, could be used to manage Pi fertiliser applications, and might be more precise than indirect measurements of soil or tissue samples. Results We grew potato (Solanum tuberosum L.) plants hydroponically, under glasshouse conditions, to control their nutrient status accurately. Samples of total leaf RNA taken periodically after Pi was removed from the nutrient solution were labelled and hybridised to potato oligonucleotide arrays. A total of 1,659 genes were significantly differentially expressed following Pi withdrawal. These included genes that encode proteins involved in lipid, protein, and carbohydrate metabolism, characteristic of Pi deficient leaves and included potential novel roles for genes encoding patatin like proteins in potatoes. The array data were analysed using a support vector machine algorithm to identify groups of genes that could predict the Pi status of the crop. These groups of diagnostic genes were tested using field grown potatoes that had been either fertilised or left unfertilised. A group of 200 genes could correctly predict the Pi status of field grown potatoes. Conclusions This paper provides a proof-of-concept demonstration for using microarrays and class prediction tools to predict the Pi status of a field grown potato crop. There is potential to develop this technology for other biotic and abiotic stresses in field grown crops. 
Ultimately, a better understanding of crop stresses may improve our management of the crop, improving the sustainability of agriculture. PMID:21935429

  16. Alternative splicing and differential gene expression in colon cancer detected by a whole genome exon array

    PubMed Central

    Gardina, Paul J; Clark, Tyson A; Shimada, Brian; Staples, Michelle K; Yang, Qing; Veitch, James; Schweitzer, Anthony; Awad, Tarif; Sugnet, Charles; Dee, Suzanne; Davies, Christopher; Williams, Alan; Turpaz, Yaron

    2006-01-01

    Background Alternative splicing is a mechanism for increasing protein diversity by excluding or including exons during post-transcriptional processing. Alternatively spliced proteins are particularly relevant in oncology since they may contribute to the etiology of cancer, provide selective drug targets, or serve as a marker set for cancer diagnosis. While conventional identification of splice variants generally targets individual genes, we present here a new exon-centric array (GeneChip Human Exon 1.0 ST) that allows genome-wide identification of differential splice variation, and concurrently provides a flexible and inclusive analysis of gene expression. Results We analyzed 20 paired tumor-normal colon cancer samples using a microarray designed to detect over one million putative exons that can be virtually assembled into potential gene-level transcripts according to various levels of prior supporting evidence. Analysis of high confidence (empirically supported) transcripts identified 160 differentially expressed genes, with 42 genes occupying a network impacting cell proliferation and another twenty nine genes with unknown functions. A more speculative analysis, including transcripts based solely on computational prediction, produced another 160 differentially expressed genes, three-fourths of which have no previous annotation. We also present a comparison of gene signal estimations from the Exon 1.0 ST and the U133 Plus 2.0 arrays. Novel splicing events were predicted by experimental algorithms that compare the relative contribution of each exon to the cognate transcript intensity in each tissue. The resulting candidate splice variants were validated with RT-PCR. We found nine genes that were differentially spliced between colon tumors and normal colon tissues, several of which have not been previously implicated in cancer. 
Top scoring candidates from our analysis were also found to substantially overlap with EST-based bioinformatic predictions of alternative splicing in cancer. Conclusion Differential expression of high confidence transcripts correlated extremely well with known cancer genes and pathways, suggesting that the more speculative transcripts, largely based solely on computational prediction and mostly with no previous annotation, might be novel targets in colon cancer. Five of the identified splicing events affect mediators of cytoskeletal organization (ACTN1, VCL, CALD1, CTTN, TPM1), two affect extracellular matrix proteins (FN1, COL6A3) and another participates in integrin signaling (SLC3A2). Altogether they form a pattern of colon-cancer specific alterations that may particularly impact cell motility. PMID:17192196
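
    The abstract describes algorithms that compare the relative contribution of each exon to its transcript intensity across tissues, without giving a formula. One common way to express such a comparison is a gene-normalised log ratio; the sketch below assumes that form, and the function name and intensity values are hypothetical.

```python
import math


def splicing_index(exon_tumor, gene_tumor, exon_normal, gene_normal):
    """log2 ratio of gene-normalised exon intensity, tumor versus normal.

    A value near 0 means the exon contributes equally in both tissues;
    a negative value means relatively less inclusion in the tumor.
    """
    normalised_tumor = exon_tumor / gene_tumor
    normalised_normal = exon_normal / gene_normal
    return math.log2(normalised_tumor / normalised_normal)


# Hypothetical intensities: the exon signal halves while the gene signal doubles.
print(splicing_index(50.0, 400.0, 100.0, 200.0))  # -2.0
```

    Normalising by the gene-level signal separates a change in exon usage from a change in overall expression of the gene.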

  17. Fine mapping and identification of candidate genes for the sy-2 locus in a temperature-sensitive chili pepper (Capsicum chinense).

    PubMed

    Liu, Li; Venkatesh, Jelli; Jo, Yeong Deuk; Koeda, Sota; Hosokawa, Munetaka; Kang, Jin-Ho; Goritschnig, Sandra; Kang, Byoung-Cheorl

    2016-08-01

    The sy-2 temperature-sensitive gene from Capsicum chinense was fine mapped to a 138.8-kb region at the distal portion of pepper chromosome 1. Based on expression analyses, two putative F-box genes were identified as sy-2 candidate genes. Seychelles-2 ('sy-2') is a temperature-sensitive natural mutant of Capsicum chinense, which exhibits an abnormal leaf phenotype when grown at temperatures below 24 °C. We previously showed that the sy-2 phenotype is controlled by a single recessive gene, sy-2, located on pepper chromosome 1. In this study, a high-resolution genetic and physical map for the sy-2 locus was constructed using two individual F2 mapping populations derived from a cross between C. chinense mutant 'sy-2' and wild-type 'No. 3341'. The sy-2 gene was fine mapped to a 138.8-kb region between markers SNP 5-5 and SNP 3-8 at the distal portion of chromosome 1, based on comparative genomic analysis and genomic information from pepper. The sy-2 target region was predicted to contain 27 genes. Expression analysis of these predicted genes showed a differential expression pattern for ORF10 and ORF20 between mutant and wild-type plants; with both having significantly lower expression in 'sy-2' than in wild-type plants. In addition, the coding sequences of both ORF10 and ORF20 contained single nucleotide polymorphisms (SNPs) causing amino acid changes, which may have important functional consequences. ORF10 and ORF20 are predicted to encode F-box proteins, which are components of the SCF complex. Based on the differential expression pattern and the presence of nonsynonymous SNPs, we suggest that these two putative F-box genes are most likely responsible for the temperature-sensitive phenotypes in pepper. Further investigation of these genes may enable a better understanding of the molecular mechanisms of low temperature sensitivity in plants.

  18. Regulation of bacterial photosynthesis genes by the small noncoding RNA PcrZ

    PubMed Central

    Mank, Nils N.; Berghoff, Bork A.; Hermanns, Yannick N.; Klug, Gabriele

    2012-01-01

    The small RNA PcrZ (photosynthesis control RNA Z) of the facultative phototrophic bacterium Rhodobacter sphaeroides is induced upon a drop of oxygen tension with similar kinetics to those of genes for components of photosynthetic complexes. High expression of PcrZ depends on PrrA, the response regulator of the PrrB/PrrA two-component system with a central role in redox regulation in R. sphaeroides. In addition the FnrL protein, an activator of some photosynthesis genes at low oxygen tension, is involved in redox-dependent expression of this small (s)RNA. Overexpression of full-length PcrZ in R. sphaeroides affects expression of a small subset of genes, most of them with a function in photosynthesis. Some mRNAs from the photosynthetic gene cluster were predicted to be putative PcrZ targets and results from an in vivo reporter system support these predictions. Our data reveal a negative effect of PcrZ on expression of its target mRNAs. Thus, PcrZ counteracts the redox-dependent induction of photosynthesis genes, which is mediated by protein regulators. Because PrrA directly activates photosynthesis genes and at the same time PcrZ, which negatively affects photosynthesis gene expression, this is one of the rare cases of an incoherent feed-forward loop including an sRNA. Our data identified PcrZ as a trans acting sRNA with a direct regulatory function in formation of photosynthetic complexes and provide a model for the control of photosynthesis gene expression by a regulatory network consisting of proteins and a small noncoding RNA. PMID:22988125

  19. Regulation of bacterial photosynthesis genes by the small noncoding RNA PcrZ.

    PubMed

    Mank, Nils N; Berghoff, Bork A; Hermanns, Yannick N; Klug, Gabriele

    2012-10-02

    The small RNA PcrZ (photosynthesis control RNA Z) of the facultative phototrophic bacterium Rhodobacter sphaeroides is induced upon a drop of oxygen tension with similar kinetics to those of genes for components of photosynthetic complexes. High expression of PcrZ depends on PrrA, the response regulator of the PrrB/PrrA two-component system with a central role in redox regulation in R. sphaeroides. In addition the FnrL protein, an activator of some photosynthesis genes at low oxygen tension, is involved in redox-dependent expression of this small (s)RNA. Overexpression of full-length PcrZ in R. sphaeroides affects expression of a small subset of genes, most of them with a function in photosynthesis. Some mRNAs from the photosynthetic gene cluster were predicted to be putative PcrZ targets and results from an in vivo reporter system support these predictions. Our data reveal a negative effect of PcrZ on expression of its target mRNAs. Thus, PcrZ counteracts the redox-dependent induction of photosynthesis genes, which is mediated by protein regulators. Because PrrA directly activates photosynthesis genes and at the same time PcrZ, which negatively affects photosynthesis gene expression, this is one of the rare cases of an incoherent feed-forward loop including an sRNA. Our data identified PcrZ as a trans acting sRNA with a direct regulatory function in formation of photosynthetic complexes and provide a model for the control of photosynthesis gene expression by a regulatory network consisting of proteins and a small noncoding RNA.

  20. Exploring candidate biomarkers for lung and prostate cancers using gene expression and flux variability analysis.

    PubMed

    Asgari, Yazdan; Khosravi, Pegah; Zabihinpour, Zahra; Habibi, Mahnaz

    2018-02-19

    Genome-scale metabolic models have provided valuable resources for exploring changes in metabolism under normal and cancer conditions. However, metabolism itself is strongly linked to gene expression, so integration of gene expression data into metabolic models might improve the detection of genes involved in the control of tumor progression. Herein, we considered gene expression data as extra constraints to enhance the predictive powers of metabolic models. We reconstructed genome-scale metabolic models for lung and prostate, under normal and cancer conditions to detect the major genes associated with critical subsystems during tumor development. Furthermore, we utilized gene expression data in combination with an information theory-based approach to reconstruct co-expression networks of the human lung and prostate in both cohorts. Our results revealed 19 genes as candidate biomarkers for lung and prostate cancer cells. This study also revealed that the development of a complementary approach (integration of gene expression and metabolic profiles) could lead to proposing novel biomarkers and suggesting renovated cancer treatment strategies which have not been possible to detect using either of the methods alone.

  1. Commentary on "predicting metastasized seminoma using gene expression." Ruf CG, Linbecker M, Port M, Riecke A, Schmelz HU, Wagner W, Meineke V, Abend M, Department of Urology, Federal Armed Forces Hospital, Hamburg, Germany: BJU Int 2012;110:E14.

    PubMed

    Richie, Jerome

    2013-02-01

    Treatment options for testis cancer depend on the histological subtype as well as on the clinical stage. An accurate staging is essential for correct treatment. The 'golden standard' for staging purposes is CT, but occult metastasis cannot be detected with this method. Currently, parameters such as primary tumour size, vessel invasion or invasion of the rete testis are used for predicting occult metastasis. Last year the association of these parameters with metastasis could not be validated in a new independent cohort. Gene expression analysis in testis cancer allowed discrimination between the different histological subtypes (seminoma and non-seminoma) as well as testis cancer and normal testis tissue. In a two-stage study design we (i) screened the whole genome (using human whole genome microarrays) for candidate genes associated with the metastatic stage in seminoma and (ii) validated and quantified gene expression of our candidate genes (real-time quantitative polymerase chain reaction) on another independent group. Gene expression measurements of two of our candidate genes (dopamine receptor D1 [DRD1] and family with sequence similarity 71, member F2 [FAM71F2]) examined in primary testis cancers made it possible to discriminate the metastasis status in seminoma. The discriminative ability of the genes exceeded the predictive significance of currently used histological/pathological parameters. Based on gene expression analysis the present study provides suggestions for improved individual decision making either in favour of early adjuvant therapy or increased surveillance. To evaluate the usefulness of gene expression profiling for predicting metastatic status in testicular seminoma at the time of first diagnosis compared with established clinical and pathological parameters. 
Total RNA was isolated from testicular tumours of metastasized patients (12 patients, clinical stage IIa-III), non-metastasized patients (40, clinical stage I) and adjacent 'normal' tissue (n = 36). The RNA was then converted into cDNA and real-time quantitative polymerase chain reaction was run on 94 candidate genes selected from previous work. Normalised gene expression of these genes and histological variables, e.g. tumour size and rete testis infiltration, were analysed using logistic regression analysis. Expression of two genes (dopamine receptor D1 [DRD1] and family with sequence similarity 71, member F2 [FAM71F2], P = 0.005 and 0.024 in separate analysis and P = 0.004 and 0.016 when combining both genes, respectively) made it possible to significantly discriminate the metastasis status. Concordance increased from 77.9% (DRD1) and 72.3% (FAM71F2) in separate analysis and up to 87.7% when combining both genes in one model. Only primary tumour size in separate analysis (continuous or categorical with tumour size>6cm) was significantly associated with metastasis (P = 0.039/P = 0.02), but concordance was lower (61%). When we combined tumour size with our two genes in one model there was no further statistical improvement or increased concordance. Based on gene expression analysis our study provides suggestions for improved individual decision making either in favour of early adjuvant therapy or increased surveillance. Copyright © 2013 Elsevier Inc. All rights reserved.
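    The concordance values quoted above (for example 87.7% for the two-gene model) can be read as the share of metastasized/non-metastasized patient pairs in which the model assigns the higher risk score to the metastasized patient. The abstract does not say how concordance was computed, so the sketch below assumes the usual c-statistic definition with ties counted as one half. The function name and the toy scores are hypothetical and are not data from the study.

```python
def concordance(scores_pos, scores_neg):
    """C-statistic: fraction of (positive, negative) pairs ranked correctly.

    Assumption: ties contribute 0.5, as in the common definition.
    """
    pairs = 0
    concordant = 0.0
    for p in scores_pos:
        for n in scores_neg:
            pairs += 1
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / pairs


# Hypothetical predicted probabilities from a fitted logistic model.
metastasized = [0.9, 0.6, 0.4]
non_metastasized = [0.5, 0.3, 0.2, 0.6]
c_stat = concordance(metastasized, non_metastasized)
```

    A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two groups.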

  2. Identification of regulatory targets of tissue-specific transcription factors: application to retina-specific gene regulation

    PubMed Central

    Qian, Jiang; Esumi, Noriko; Chen, Yangjian; Wang, Qingliang; Chowers, Itay; Zack, Donald J.

    2005-01-01

    Identification of tissue-specific gene regulatory networks can yield insights into the molecular basis of a tissue's development, function and pathology. Here, we present a computational approach designed to identify potential regulatory target genes of photoreceptor cell-specific transcription factors (TFs). The approach is based on the hypothesis that genes related to the retina in terms of expression, disease and/or function are more likely to be the targets of retina-specific TFs than other genes. A list of genes that are preferentially expressed in retina was obtained by integrating expressed sequence tag, SAGE and microarray datasets. The regulatory targets of retina-specific TFs are enriched in this set of retina-related genes. A Bayesian approach was employed to integrate information about binding site location relative to a gene's transcription start site. Our method was applied to three retina-specific TFs, CRX, NRL and NR2E3, and a number of potential targets were predicted. To experimentally assess the validity of the bioinformatic predictions, mobility shift, transient transfection and chromatin immunoprecipitation assays were performed with five predicted CRX targets, and the results were suggestive of CRX regulation in 5/5, 3/5 and 4/5 cases, respectively. Together, these experiments strongly suggest that RP1, GUCY2D, ABCA4 are novel targets of CRX. PMID:15967807

  3. Analysis and modelling of septic shock microarray data using Singular Value Decomposition.

    PubMed

    Allanki, Srinivas; Dixit, Madhulika; Thangaraj, Paul; Sinha, Nandan Kumar

    2017-06-01

    Because microarrays are a high-throughput technique, enormous amounts of microarray data have been generated, and there is a need for more efficient analysis techniques in terms of speed and accuracy. Finding the differentially expressed genes based on just fold change and p-value might not extract all the vital biological signals that occur at a lower gene expression level. Besides this, numerous mathematical models have been generated to predict the clinical outcome from microarray data, while very few, if any, aim at predicting the vital genes that are important in disease progression. Such models help a basic researcher narrow down and concentrate on a promising set of genes, which can lead to the discovery of gene-based therapies. In this article, as a first objective, we have used the less widely known and used Singular Value Decomposition (SVD) technique to build a microarray data analysis tool that works with gene expression patterns and the intrinsic structure of the data in an unsupervised manner. We have re-analysed a microarray dataset covering the clinical course of septic shock from Cazalis et al. (2014) and have shown that our proposed analysis provides additional information compared to the conventional method. As a second objective, we developed a novel mathematical model that predicts a set of vital genes in the disease progression and works by generating samples in the continuum between health and disease, using a simple normal-distribution-based random number generator. We also verify that most of the predicted genes are indeed related to septic shock. Copyright © 2017 Elsevier Inc. All rights reserved.
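    To make the SVD idea concrete, the sketch below extracts only the leading singular component of a small genes-by-samples matrix using power iteration. It is not the authors' tool, which the abstract does not describe in detail. The function name, the iteration count, the uniform start vector and the toy matrix are all assumptions chosen for illustration.

```python
from math import sqrt


def leading_singular_triplet(matrix, n_iter=100):
    """Return (sigma, u, v) for the largest singular value of `matrix`.

    Uses power iteration on A^T A. The uniform start vector is an
    assumption; it fails if it is orthogonal to the leading component.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0 / sqrt(n_cols)] * n_cols
    for _ in range(n_iter):
        # u = A v (one value per gene)
        u = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
        # w = A^T u (one value per sample)
        w = [sum(matrix[i][j] * u[i] for i in range(n_rows))
             for j in range(n_cols)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("start vector is orthogonal to the data")
        v = [x / norm for x in w]
    u = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
    sigma = sqrt(sum(x * x for x in u))
    u = [x / sigma for x in u]
    return sigma, u, v


# Toy matrix: 4 genes (rows) x 3 time points (columns), one shared pattern.
expression = [
    [2.0, 4.0, 6.0],
    [1.0, 2.0, 3.0],
    [-1.0, -2.0, -3.0],
    [0.0, 0.0, 0.0],
]
sigma, gene_loadings, sample_pattern = leading_singular_triplet(expression)
# Genes with the largest absolute loading follow the dominant pattern most.
top_gene = max(range(len(gene_loadings)), key=lambda i: abs(gene_loadings[i]))
```

    In this reading, `sample_pattern` is the dominant expression pattern across samples and `gene_loadings` indicates how strongly each gene follows it, which is one way an unsupervised analysis can rank genes without fold-change or p-value thresholds.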

  4. General theory for integrated analysis of growth, gene, and protein expression in biofilms.

    PubMed

    Zhang, Tianyu; Pabst, Breana; Klapper, Isaac; Stewart, Philip S

    2013-01-01

    A theory for analysis and prediction of spatial and temporal patterns of gene and protein expression within microbial biofilms is derived. The theory integrates phenomena of solute reaction and diffusion, microbial growth, mRNA or protein synthesis, biomass advection, and gene transcript or protein turnover. Case studies illustrate the capacity of the theory to simulate heterogeneous spatial patterns and predict microbial activities in biofilms that are qualitatively different from those of planktonic cells. Specific scenarios analyzed include an inducible GFP or fluorescent protein reporter, a denitrification gene repressed by oxygen, an acid stress response gene, and a quorum sensing circuit. It is shown that the patterns of activity revealed by inducible stable fluorescent proteins or reporter unstable proteins overestimate the region of activity. This is due to advective spreading and finite protein turnover rates. In the cases of a gene induced by either limitation for a metabolic substrate or accumulation of a metabolic product, maximal expression is predicted in an internal stratum of the biofilm. A quorum sensing system that includes an oxygen-responsive negative regulator exhibits behavior that is distinct from any stage of a batch planktonic culture. Though here the analyses have been limited to simultaneous interactions of up to two substrates and two genes, the framework applies to arbitrarily large networks of genes and metabolites. Extension of reaction-diffusion modeling in biofilms to the analysis of individual genes and gene networks is an important advance that dovetails with the growing toolkit of molecular and genetic experimental techniques.

  5. Increased baseline RUNX2, caspase 3 and p21 gene expressions in the peripheral blood of disease-modifying anti-rheumatic drug-naïve rheumatoid arthritis patients are associated with improved clinical response to methotrexate therapy.

    PubMed

    Tchetina, Elena V; Demidova, Natalia V; Markova, Galina A; Taskina, Elena A; Glukhova, Svetlana I; Karateev, Dmitry E

    2017-10-01

    To investigate the potential of the baseline gene expression in the whole blood of disease-modifying anti-rheumatic drug-naïve rheumatoid arthritis (RA) patients for predicting the response to methotrexate (MTX) treatment. Twenty-six control subjects and 40 RA patients were examined. Clinical, immunological and radiographic parameters were assessed before and after 24 months of follow-up. The gene expressions in the whole blood were measured using real-time reverse transcription polymerase chain reaction. The protein concentrations in peripheral blood mononuclear cells were quantified using enzyme-linked immunosorbent assay. Receiver operating characteristic curve analyses were used to suggest thresholds that were associated with the prediction of the response. Decreases in the disease activity at the end of the study were accompanied by significant increases in joint space narrowing score (JSN). Positive correlations between the expressions of the Unc-51-like kinase 1 (ULK1) and matrix metalloproteinase 9 (MMP-9) genes with the level of C-reactive protein and MMP-9 expression with Disease Activity Score of 28 joints (DAS28) and swollen joint count were noted at baseline. The baseline tumor necrosis factor (TNF)α gene expression was positively correlated with JSN at the end of the follow-up, whereas p21, caspase 3, and runt-related transcription factor (RUNX)2 were correlated with the ΔDAS28 values. Our results suggest that the expressions of MMP-9 and ULK1 might be associated with disease activity. Increased baseline gene expressions of RUNX2, p21 and caspase 3 in the peripheral blood might predict better responses to MTX therapy. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.

  6. Two members of the mouse mdr gene family confer multidrug resistance with overlapping but distinct drug specificities.

    PubMed Central

    Devault, A; Gros, P

    1990-01-01

    We report the cloning and functional analysis of a complete clone for the third member of the mouse mdr gene family, mdr3. Nucleotide and predicted amino acid sequence analyses showed that the three mouse mdr genes encode highly homologous membrane glycoproteins, which share the same length (1,276 residues), the same predicted functional domains, and overall structural arrangement. Regions of divergence among the three proteins are concentrated in discrete segments of the predicted polypeptides. Sequence comparison indicated that the three mouse mdr genes were created from a common ancestor by two independent gene duplication events, the most recent one producing mdr1 and mdr3. When transfected and overexpressed in otherwise drug-sensitive cells, the mdr3 gene, like mdr1 and unlike mdr2, conferred multidrug resistance to these cells. In independently derived transfected cell clones expressing similar amounts of either MDR1 or MDR3 protein, the drug resistance profile conferred by mdr3 was distinct from that conferred by mdr1. Cells transfected with and expressing MDR1 showed a marked 7- to 10-fold preferential resistance to colchicine and Adriamycin compared with cells expressing equivalent amounts of MDR3. Conversely, cells transfected with and expressing MDR3 showed a two- to threefold preferential resistance to actinomycin D over their cellular counterpart expressing MDR1. These results suggest that MDR1 and MDR3 are membrane-associated efflux pumps which, in multidrug-resistant cells and perhaps normal tissues, have overlapping but distinct substrate specificities. PMID:1969610

  7. Microbial forensics: predicting phenotypic characteristics and environmental conditions from large-scale gene expression profiles.

    PubMed

    Kim, Minseung; Zorraquino, Violeta; Tagkopoulos, Ilias

    2015-03-01

    A tantalizing question in cellular physiology is whether the cellular state and environmental conditions can be inferred by the expression signature of an organism. To investigate this relationship, we created an extensive normalized gene expression compendium for the bacterium Escherichia coli that was further enriched with meta-information through an iterative learning procedure. We then constructed an ensemble method to predict environmental and cellular state, including strain, growth phase, medium, oxygen level, antibiotic and carbon source presence. Results show that gene expression is an excellent predictor of environmental structure, with multi-class ensemble models achieving balanced accuracy between 70.0% (±3.5%) and 98.3% (±2.3%) for the various characteristics. Interestingly, this performance can be significantly boosted when environmental and strain characteristics are simultaneously considered, as a composite classifier that captures the inter-dependencies of three characteristics (medium, phase and strain) achieved 10.6% (±1.0%) higher performance than any individual model. Contrary to expectations, only 59% of the top informative genes were also identified as differentially expressed under the respective conditions. Functional analysis of the respective genetic signatures implicates a wide spectrum of Gene Ontology terms and KEGG pathways with condition-specific information content, including iron transport, transferases, and enterobactin synthesis. Further experimental phenotypic-to-genotypic mapping that we conducted for knock-out mutants argues for the information content of top-ranked genes. This work demonstrates the degree to which genome-scale transcriptional information can be predictive of latent, heterogeneous and seemingly disparate phenotypic and environmental characteristics, with far-reaching applications.
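    Balanced accuracy, the metric reported above, is the mean of the per-class recalls, so a rare class counts as much as a common one. The sketch below shows that calculation on made-up labels. The function name and the growth-phase labels are hypothetical and are not taken from the study's compendium.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall; every class is weighted equally."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical growth-phase labels for six expression profiles.
truth = ["exponential"] * 4 + ["stationary"] * 2
predicted = ["exponential", "exponential", "exponential", "stationary",
             "stationary", "exponential"]
score = balanced_accuracy(truth, predicted)
plain_accuracy = sum(t == p for t, p in zip(truth, predicted)) / len(truth)
```

    Here the per-class recalls are 3/4 and 1/2, so the balanced accuracy is 0.625, slightly below the plain accuracy of 4/6 because the smaller class is classified less well.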

  8. iPcc: a novel feature extraction method for accurate disease class discovery and prediction

    PubMed Central

    Ren, Xianwen; Wang, Yong; Zhang, Xiang-Sun; Jin, Qi

    2013-01-01

    Gene expression profiling has gradually become a routine procedure for disease diagnosis and classification. In the past decade, many computational methods have been proposed, resulting in great improvements on various levels, including feature selection and algorithms for classification and clustering. In this study, we present iPcc, a novel method from the feature extraction perspective to further propel gene expression profiling technologies from bench to bedside. We define ‘correlation feature space’ for samples based on the gene expression profiles by iterative employment of Pearson’s correlation coefficient. Numerical experiments on both simulated and real gene expression data sets demonstrate that iPcc can greatly highlight the latent patterns underlying noisy gene expression data and thus greatly improve the robustness and accuracy of the algorithms currently available for disease diagnosis and classification based on gene expression profiles. PMID:23761440
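    One possible reading of "iterative employment of Pearson's correlation coefficient" is sketched below: each sample is re-described by its correlations with all samples, and that mapping is applied repeatedly. The abstract gives no further detail, so the number of iterations, the handling of constant vectors, the function names and the toy profiles are all assumptions, and this should not be taken as the published iPcc implementation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: constant vectors count as uncorrelated
    return cov / sqrt(var_x * var_y)


def correlation_features(samples):
    """Describe each sample by its correlation with every sample."""
    return [[pearson(s, t) for t in samples] for s in samples]


def iterate_correlation(samples, n_iter=3):
    """Apply the correlation mapping n_iter times (n_iter is hypothetical)."""
    features = samples
    for _ in range(n_iter):
        features = correlation_features(features)
    return features


# Toy data: four samples (rows) by five genes (columns), two noisy classes.
profiles = [
    [5.1, 4.8, 1.0, 1.2, 3.0],
    [4.9, 5.2, 0.8, 1.1, 2.7],
    [1.0, 1.3, 5.0, 4.7, 3.1],
    [1.2, 0.9, 5.3, 4.9, 2.9],
]
features = iterate_correlation(profiles, n_iter=3)
```

    On this toy input the resulting feature vectors of samples from the same class are strongly positively correlated and those from different classes strongly negatively correlated, which illustrates how such a mapping could sharpen latent class structure before clustering or classification.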

  9. Versatile control of Plasmodium falciparum gene expression with an inducible protein-RNA interaction

    PubMed Central

    Goldfless, Stephen J.; Wagner, Jeffrey C.; Niles, Jacquin C.

    2014-01-01

    The available tools for conditional gene expression in Plasmodium falciparum are limited. Here, to enable reliable control of target gene expression, we build a system to efficiently modulate translation. We overcame several problems associated with other approaches for regulating gene expression in P. falciparum. Specifically, our system functions predictably across several native and engineered promoter contexts, and affords control over reporter and native parasite proteins irrespective of their subcellular compartmentalization. Induction and repression of gene expression are rapid, homogeneous, and stable over prolonged periods. To demonstrate practical application of our system, we used it to reveal direct links between antimalarial drugs and their native parasite molecular target. This is an important outcome given the rapid spread of resistance, and intensified efforts to efficiently discover and optimize new antimalarial drugs. Overall, the studies presented highlight the utility of our system for broadly controlling gene expression and performing functional genetics in P. falciparum. PMID:25370483

  10. Transcriptome database resource and gene expression atlas for the rose

    PubMed Central

    2012-01-01

    Background For centuries roses have been selected based on a number of traits. Little information exists on the genetic and molecular basis that contributes to these traits, mainly because information on expressed genes for this economically important ornamental plant is scarce. Results Here, we used a combination of Illumina and 454 sequencing technologies to generate information on Rosa sp. transcripts using RNA from various tissues and in response to biotic and abiotic stresses. A total of 80714 transcript clusters were identified and 76611 peptides have been predicted among which 20997 have been clustered into 13900 protein families. BLASTp hits in closely related Rosaceae species revealed that about half of the predicted peptides in the strawberry and peach genomes have orthologs in Rosa dataset. Digital expression was obtained using RNA samples from organs at different development stages and under different stress conditions. qPCR validated the digital expression data for a selection of 23 genes with high or low expression levels. Comparative gene expression analyses between the different tissues and organs allowed the identification of clusters that are highly enriched in given tissues or under particular conditions, demonstrating the usefulness of the digital gene expression analysis. A web interface ROSAseq was created that allows data interrogation by BLAST, subsequent analysis of DNA clusters and access to thorough transcript annotation including best BLAST matches on Fragaria vesca, Prunus persica and Arabidopsis. The rose peptides dataset was used to create the ROSAcyc resource pathway database that allows access to the putative genes and enzymatic pathways. Conclusions The study provides useful information on Rosa expressed genes, with thorough annotation and an overview of expression patterns for transcripts with good accuracy. PMID:23164410

  11. Integrative functional transcriptomic analyses implicate specific molecular pathways in pulmonary toxicity from exposure to aluminum oxide nanoparticles.

    PubMed

    Li, Xiaobo; Zhang, Chengcheng; Bian, Qian; Gao, Na; Zhang, Xin; Meng, Qingtao; Wu, Shenshen; Wang, Shizhi; Xia, Yankai; Chen, Rui

    2016-09-01

    Gene expression profiling has developed rapidly in recent years and it can predict and define mechanisms underlying chemical toxicity. Here, RNA microarray and computational technology were used to show that aluminum oxide nanoparticles (Al2O3 NPs) were capable of triggering up-regulation of genes related to the cell cycle and cell death in a human A549 lung adenocarcinoma cell line. Gene expression levels were validated in Al2O3 NPs exposed A549 cells and mice lung tissues, most of which showed consistent trends in regulation. Gene-transcription factor network analysis coupled with cell- and animal-based assays demonstrated that the genes encoding PTPN6, RTN4, BAX and IER play a role in the biological responses induced by the nanoparticle exposure, which caused cell death and cell cycle arrest in the G2/S phase. Further, down-regulated PTPN6 expression demonstrated a core role in the network, thus expression level of PTPN6 was rescued by plasmid transfection, which showed ameliorative effects of A549 cells against cell death and cell cycle arrest. These results demonstrate the feasibility of using gene expression profiling to predict cellular responses induced by nanomaterials, which could be used to develop a comprehensive knowledge of nanotoxicity.

  12. Dynamic modelling of microRNA regulation during mesenchymal stem cell differentiation.

    PubMed

    Weber, Michael; Sotoca, Ana M; Kupfer, Peter; Guthke, Reinhard; van Zoelen, Everardus J

    2013-11-12

    Network inference from gene expression data is a typical approach to reconstruct gene regulatory networks. During chondrogenic differentiation of human mesenchymal stem cells (hMSCs), a complex transcriptional network is active and regulates the temporal differentiation progress. As modulators of transcriptional regulation, microRNAs (miRNAs) play a critical role in stem cell differentiation. Integrated network inference aims at determining interrelations between miRNAs and mRNAs on the basis of expression data as well as miRNA target predictions. We applied the NetGenerator tool in order to infer an integrated gene regulatory network. Time series experiments were performed to measure mRNA and miRNA abundances of TGF-beta1+BMP2 stimulated hMSCs. Network nodes were identified by analysing temporal expression changes, miRNA target gene predictions, time series correlation and literature knowledge. Network inference was performed using NetGenerator to reconstruct a dynamical regulatory model based on the measured data and prior knowledge. The resulting model is robust against noise and shows an optimal trade-off between fitting precision and inclusion of prior knowledge. It predicts the influence of miRNAs on the expression of chondrogenic marker genes and therefore proposes novel regulatory relations in differentiation control. By analysing the inferred network, we identified a previously unknown regulatory effect of miR-524-5p on the expression of the transcription factor SOX9 and the chondrogenic marker genes COL2A1, ACAN and COL10A1. Genome-wide exploration of miRNA-mRNA regulatory relationships is a reasonable approach to identify miRNAs which have so far not been associated with the investigated differentiation process. The NetGenerator tool is able to identify valid gene regulatory networks on the basis of miRNA and mRNA time series data.

  13. Validation of Biomarkers Predictive of Recurrence Following Prostatectomy

    DTIC Science & Technology

    2011-04-14

    Bergerheim U, Ekman P, DeMarzo AM, Tibshirani R, Botstein D, Brown PO, Brooks JD, Pollack JR: Gene expression profiling identifies clinically relevant subtypes of

  14. MusTRD can regulate postnatal fiber-specific expression.

    PubMed

    Issa, Laura L; Palmer, Stephen J; Guven, Kim L; Santucci, Nicole; Hodgson, Vanessa R M; Popovic, Kata; Joya, Josephine E; Hardeman, Edna C

    2006-05-01

    Human MusTRD1alpha1 was isolated as a result of its ability to bind a critical element within the Troponin I slow upstream enhancer (TnIslow USE) and was predicted to be a regulator of slow fiber-specific genes. To test this hypothesis in vivo, we generated transgenic mice expressing hMusTRD1alpha1 in skeletal muscle. Adult transgenic mice show a complete loss of slow fibers and a concomitant replacement by fast IIA fibers, resulting in postural muscle weakness. However, developmental analysis demonstrates that transgene expression has no impact on embryonic patterning of slow fibers but causes a gradual postnatal slow to fast fiber conversion. This conversion was underpinned by a demonstrable repression of many slow fiber-specific genes, whereas fast fiber-specific gene expression was either unchanged or enhanced. These data are consistent with our initial predictions for hMusTRD1alpha1 and suggest that slow fiber genes contain a specific common regulatory element that can be targeted by MusTRD proteins.

  15. Transcriptomic and macroevolutionary evidence for phenotypic uncoupling between frog life history phases

    PubMed Central

    Wollenberg Valero, Katharina C.; Garcia-Porta, Joan; Rodríguez, Ariel; Arias, Mónica; Shah, Abhijeet; Randrianiaina, Roger Daniel; Brown, Jason L.; Glaw, Frank; Amat, Felix; Künzel, Sven; Metzler, Dirk; Isokpehi, Raphael D.; Vences, Miguel

    2017-01-01

    Anuran amphibians undergo major morphological transitions during development, but the contribution of their markedly different life-history phases to macroevolution has rarely been analysed. Here we generate testable predictions for coupling versus uncoupling of phenotypic evolution of tadpole and adult life-history phases, and for the underlying expression of genes related to morphological feature formation. We test these predictions by combining evidence from gene expression in two distantly related frogs, Xenopus laevis and Mantidactylus betsileanus, with patterns of morphological evolution in the entire radiation of Madagascan mantellid frogs. Genes linked to morphological structure formation are expressed in a highly phase-specific pattern, suggesting uncoupling of phenotypic evolution across life-history phases. This gene expression pattern agrees with uncoupled rates of trait evolution among life-history phases in the mantellids, which we show to have undergone an adaptive radiation. Our results validate a prevalence of uncoupling in the evolution of tadpole and adult phenotypes of frogs. PMID:28504275

  16. Identification and profiling of Cyprinus carpio microRNAs during ovary differentiation by deep sequencing.

    PubMed

    Wang, Fang; Jia, Yongfang; Wang, Po; Yang, Qianwen; Du, QiYan; Chang, ZhongJie

    2017-04-28

    MicroRNAs (miRNAs) are endogenous small non-coding RNAs that regulate gene expression by targeting specific mRNAs. However, the possible role of miRNAs in the ovary differentiation and development of fish is not well understood. In this study, we examined the expression profiles and differential expression of miRNAs during three key stages of ovarian development and different developmental stages in common carp Cyprinus carpio. A total of 8765 miRNAs were identified, including 2155 miRNAs highly conserved among various species, 145 miRNAs registered in miRBase for common carp, and 6505 novel miRNAs identified in common carp for the first time. Comparison of miRNA expression profiles among the five libraries identified 714 co-expressed and 2382 specifically expressed miRNAs. Overall, 150, 628, and 431 specifically expressed miRNAs were identified in primordial gonad, juvenile ovary, and adult ovary, respectively. MiR-6758-3p, miR-3050-5p, and miR-2985-3p were highly expressed in primordial gonad, miR-3544-5p, miR-6877-3p, and miR-9086-5p were highly expressed in juvenile ovary, and miR-154-3p, miR-5307-5p, and miR-3958-3p were highly expressed in adult ovary. Predicted target genes of specific miRNAs in primordial gonad were involved in many reproductive biology signaling pathways, including transforming growth factor-β, Wnt, oocyte meiosis, mitogen-activated protein kinase, Notch, p53, and gonadotropin-releasing hormone pathways. Target-gene prediction revealed upward trends in miRNAs targeting male-bias genes, including dmrt1, atm, gsdf, and sox9, and downward trends in miRNAs targeting female-bias genes including foxl2, smad3, and smad4. Other sex-related genes such as sf1 were also predicted to be miRNA target genes. This comprehensive miRNA transcriptome analysis demonstrated differential expression profiles of miRNAs during ovary development in common carp. These results could facilitate future exploitation of the sex-regulatory roles and mechanisms of miRNAs, especially in primordial gonads, while the specifically expressed miRNAs represent candidates for studying the mechanisms of ovary determination in Yellow River carp.

  17. [Joint effects of water temperature and salinity on the expression of gill Hsp70 gene in Pinctada martensii (Dunker)].

    PubMed

    Wang, Ya-Nan; Wang, Hui; Zhu, Xiao-Wen; Luo, Ming-Ming; Liu, Zhi-Gang; Du, Xiao-Dong

    2012-12-01

    By using central composite experimental design and response surface method, the joint effects of water temperature (16-40 degrees C) and salinity (10-50) on the expression of gill Hsp70 gene in Pinctada martensii (Dunker) were studied under laboratory conditions. The results showed that the linear and quadratic effects of temperature on the expression of gill Hsp70 gene were significant, the linear effect of salinity was not significant, while the quadratic effect of salinity was significant. The interactive effect of temperature and salinity was not significant, and the effect of temperature was greater than that of salinity. The model equation of the gill Hsp70 gene expression was established, with the R2, Adj. R2, and Pred. R2 as high as 98.7%, 97.4%, and 89.2%, respectively, suggesting that the overarching predictive capability of the model was very satisfactory, and could be practicably applied for prediction. Through the optimization of the model, the expression of the gill Hsp70 gene reached its minimum (0.5276) when the temperature was 26.78 degrees C and the salinity was 29.33, with the desirability value being 98%. These experimental results could offer theoretical reference for the high expression of gill Hsp70 gene in P. martensii, the maintenance of cell internal environment stability, and the enhancement of P. martensii stress resistance.
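
    The optimization step described above (locating the temperature and salinity at which a fitted second-order response surface reaches its extremum) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `stationary_point` and the coefficient values are hypothetical and are not the fitted model from the study, whose equation is not given in the abstract.

```python
def stationary_point(b0, b1, b2, b11, b22, b12):
    """Stationary point of y = b0 + b1*T + b2*S + b11*T^2 + b22*S^2 + b12*T*S.

    Setting both partial derivatives to zero gives the 2x2 linear system
        2*b11*T +   b12*S = -b1
          b12*T + 2*b22*S = -b2
    which is solved here with Cramer's rule.
    """
    det = 4.0 * b11 * b22 - b12 ** 2
    if det == 0:
        raise ValueError("degenerate surface: no unique stationary point")
    t = (-2.0 * b22 * b1 + b12 * b2) / det
    s = (-2.0 * b11 * b2 + b12 * b1) / det
    y = b0 + b1 * t + b2 * s + b11 * t * t + b22 * s * s + b12 * t * s
    if det > 0 and b11 > 0:
        kind = "minimum"
    elif det > 0 and b11 < 0:
        kind = "maximum"
    else:
        kind = "saddle"
    return t, s, y, kind


# Hypothetical coefficients, chosen so that y = (T - 25)^2 + 0.5*(S - 30)^2 + 1.
t_opt, s_opt, y_opt, kind = stationary_point(
    b0=1076.0, b1=-50.0, b2=-30.0, b11=1.0, b22=0.5, b12=0.0
)
```

    With these toy coefficients the surface has a single minimum; a real analysis would substitute the coefficients estimated from the central composite design.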

  18. Five putative nucleoside triphosphate diphosphohydrolase genes are expressed in Trichomonas vaginalis.

    PubMed

    Frasson, Amanda Piccoli; Dos Santos, Odelta; Meirelles, Lúcia Collares; Macedo, Alexandre José; Tasca, Tiana

    2016-01-01

    Trichomonas vaginalis is a protozoan that parasitizes the human urogenital tract causing trichomoniasis, the most common non-viral sexually transmitted disease. The parasite has unique genomic characteristics such as a large genome size and expanded gene families. Ectonucleoside triphosphate diphosphohydrolase (E-NTPDase) is an enzyme responsible for hydrolyzing nucleoside tri- and diphosphates and has already been biochemically characterized in T. vaginalis. Considering the important role of this enzyme in the production of extracellular adenosine for parasite uptake, we evaluated the gene expression of five putative NTPDases in T. vaginalis. We showed that all five putative TvNTPDase genes (TvNTPDase1-5) were expressed by both fresh clinical and long-term grown isolates. The amino acid alignment predicted the presence of the five crucial apyrase conserved regions, transmembrane domains, signal peptides, phosphorylation and catalytic sites. Moreover, a phylogenetic analysis showed that TvNTPDase sequences make up a clade with NTPDases intracellularly located. Biochemical NTPDase activity (ATP and ADP hydrolysis) is responsive to the serum-restrictive conditions and the gene expression of TvNTPDases was mostly increased, mainly TvNTPDase2 and TvNTPDase4, although there was not a clear pattern of expression among them. In summary, the present report demonstrates the gene expression patterns of predicted NTPDases in T. vaginalis. © FEMS 2015.

  19. Expression of the filaggrin gene in umbilical cord blood predicts eczema risk in infancy: A birth cohort study.

    PubMed

    Ziyab, A H; Ewart, S; Lockett, G A; Zhang, H; Arshad, H; Holloway, J W; Karmaus, W

    2017-09-01

    Filaggrin gene (FLG) expression, particularly in the skin, has been linked to the development of the skin barrier and is associated with eczema risk. However, knowledge as to whether FLG expression in umbilical cord blood (UCB) is associated with eczema development and prediction is lacking. This study sought to assess whether FLG expression in UCB associates with and predicts the development of eczema in infancy. Infants enrolled in a birth cohort study (n=94) were assessed for eczema at ages 3, 6, and 12 months. Five probes measuring FLG transcripts expression in UCB were available from genomewide gene expression profiling. FLG genetic variants R501X, 2282del4, and S3247X were genotyped. Associations were assessed using Poisson regression with robust variance estimation. Area under the curve (AUC), describing the discriminatory/predictive performance of fitted models, was estimated from logistic regression. Increased level of FLG expression measured by probe A_24_P51322 was associated with reduced risk of eczema during the first year of life (RR=0.60, 95% CI: 0.38-0.95). In contrast, increased level of FLG antisense transcripts measured by probe A_21_P0014075 was associated with increased risk of eczema (RR=2.02, 95% CI: 1.10-3.72). In prediction models including FLG expression, FLG genetic variants, and sex, discrimination between children who will and will not develop eczema at 3 months of age was high (AUC: 0.91, 95% CI: 0.84-0.98). This study demonstrated, for the first time, that FLG expression in UCB is associated with eczema development in infancy. Moreover, our analysis provided prediction models that were capable of discriminating, to a great extent, between those who will and will not develop eczema in infancy. Therefore, early identification of infants at increased risk of developing eczema is possible and such high-risk newborns may benefit from early stratification and intervention. © 2017 John Wiley & Sons Ltd.

  20. Identification of a Genomic Signature Predicting for Recurrence in Early Stage Ovarian Cancer

    DTIC Science & Technology

    2015-12-01

    early stage ovarian cancer to help researchers worldwide identify biomarkers that can aid early detection and inform novel targets for therapy. This...to detect differentially expressed genes after transformation using Voom. When using the top 5 genes to build the classifier, it predicted...to analyze expression of micro-RNA in these samples. Thus, at the end of the third year of funding we started a parallel analysis of RNAseq, DNA- CNV

  1. Perturbation of B Cell Gene Expression Persists in HIV-Infected Children Despite Effective Antiretroviral Therapy and Predicts H1N1 Response.

    PubMed

    Cotugno, Nicola; De Armas, Lesley; Pallikkuth, Suresh; Rinaldi, Stefano; Issac, Biju; Cagigi, Alberto; Rossi, Paolo; Palma, Paolo; Pahwa, Savita

    2017-01-01

    Despite effective antiretroviral therapy (ART), HIV-infected individuals with apparently similar clinical and immunological characteristics can vary in responsiveness to vaccinations. However, molecular mechanisms responsible for such impairment, as well as biomarkers able to predict vaccine responsiveness in HIV-infected children, remain unknown. Following the hypothesis that a B cell qualitative impairment persists in HIV-infected children (HIV) despite effective ART and phenotypic B cell immune reconstitution, the aim of the current study was to investigate B cell gene expression of HIV compared to age-matched healthy controls (HCs) and to determine whether distinct gene expression patterns could predict the ability to respond to influenza vaccine. To do so, we analyzed prevaccination transcriptional levels of a 96-gene panel in equal numbers of sort-purified B cell subsets (SPBS) isolated from peripheral blood mononuclear cells using multiplexed RT-PCR. Immune responses to H1N1 antigen were determined by hemagglutination inhibition and memory B cell ELISpot assays following trivalent-inactivated influenza vaccination (TIV) for all study participants. Although there were no differences in terms of cell frequencies of SPBS between HIV and HC, the groups were distinguishable based upon gene expression analyses. Indeed, a 28-gene signature, characterized by higher expression of genes involved in the inflammatory response and immune activation, was observed in activated memory B cells (CD27+CD21−) from HIV when compared to HC despite long-term viral control (>24 months). Further analysis, taking into account H1N1 responses after TIV in HIV participants, revealed that a 25-gene signature in resting memory (RM) B cells (CD27+CD21+) was able to distinguish vaccine responders from non-responders (NR). In fact, prevaccination RM B cells of responders showed a higher expression of gene sets involved in B cell adaptive immune responses (APRIL, BTK, BLIMP1) and BCR signaling (MTOR, FYN, CD86) when compared to NR. Overall, these data suggest that a perturbation at a transcriptional level in the B cell compartment persists despite stable virus control achieved through ART in HIV-infected children. Additionally, the present study demonstrates the potential utility of transcriptional evaluation of RM B cells before vaccination for identifying predictive correlates of vaccine responses in this population.

  2. Perturbation of B Cell Gene Expression Persists in HIV-Infected Children Despite Effective Antiretroviral Therapy and Predicts H1N1 Response

    PubMed Central

    Cotugno, Nicola; De Armas, Lesley; Pallikkuth, Suresh; Rinaldi, Stefano; Issac, Biju; Cagigi, Alberto; Rossi, Paolo; Palma, Paolo; Pahwa, Savita

    2017-01-01

    Despite effective antiretroviral therapy (ART), HIV-infected individuals with apparently similar clinical and immunological characteristics can vary in responsiveness to vaccinations. However, molecular mechanisms responsible for such impairment, as well as biomarkers able to predict vaccine responsiveness in HIV-infected children, remain unknown. Following the hypothesis that a B cell qualitative impairment persists in HIV-infected children (HIV) despite effective ART and phenotypic B cell immune reconstitution, the aim of the current study was to investigate B cell gene expression of HIV compared to age-matched healthy controls (HCs) and to determine whether distinct gene expression patterns could predict the ability to respond to influenza vaccine. To do so, we analyzed prevaccination transcriptional levels of a 96-gene panel in equal numbers of sort-purified B cell subsets (SPBS) isolated from peripheral blood mononuclear cells using multiplexed RT-PCR. Immune responses to H1N1 antigen were determined by hemagglutination inhibition and memory B cell ELISpot assays following trivalent-inactivated influenza vaccination (TIV) for all study participants. Although there were no differences in terms of cell frequencies of SPBS between HIV and HC, the groups were distinguishable based upon gene expression analyses. Indeed, a 28-gene signature, characterized by higher expression of genes involved in the inflammatory response and immune activation, was observed in activated memory B cells (CD27+CD21−) from HIV when compared to HC despite long-term viral control (>24 months). Further analysis, taking into account H1N1 responses after TIV in HIV participants, revealed that a 25-gene signature in resting memory (RM) B cells (CD27+CD21+) was able to distinguish vaccine responders from non-responders (NR). In fact, prevaccination RM B cells of responders showed a higher expression of gene sets involved in B cell adaptive immune responses (APRIL, BTK, BLIMP1) and BCR signaling (MTOR, FYN, CD86) when compared to NR. Overall, these data suggest that a perturbation at a transcriptional level in the B cell compartment persists despite stable virus control achieved through ART in HIV-infected children. Additionally, the present study demonstrates the potential utility of transcriptional evaluation of RM B cells before vaccination for identifying predictive correlates of vaccine responses in this population. PMID:28955330

  3. MAGIA2: from miRNA and genes expression data integrative analysis to microRNA–transcription factor mixed regulatory circuits (2012 update)

    PubMed Central

    Bisognin, Andrea; Sales, Gabriele; Coppe, Alessandro; Bortoluzzi, Stefania; Romualdi, Chiara

    2012-01-01

    MAGIA2 (http://gencomp.bio.unipd.it/magia2) is an update, extension and evolution of the MAGIA web tool. It is dedicated to the integrated analysis of in silico target prediction, microRNA (miRNA) and gene expression data for the reconstruction of post-transcriptional regulatory networks. miRNAs are fundamental post-transcriptional regulators of several key biological and pathological processes. As miRNAs act prevalently through target degradation, their expression profiles are expected to be inversely correlated to those of the target genes. Low specificity of target prediction algorithms makes integration approaches an interesting solution for target prediction refinement. MAGIA2 performs this integrative approach supporting different association measures, multiple organisms and almost all target predictions algorithms. Nevertheless, miRNAs activity should be viewed as part of a more complex scenario where regulatory elements and their interactors generate a highly connected network and where gene expression profiles are the result of different levels of regulation. The updated MAGIA2 tries to dissect this complexity by reconstructing mixed regulatory circuits involving either miRNA or transcription factor (TF) as regulators. Two types of circuits are identified: (i) a TF that regulates both a miRNA and its target and (ii) a miRNA that regulates both a TF and its target. PMID:22618880
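
    The refinement idea described above (keeping only predicted miRNA-target pairs whose expression profiles are inversely correlated) can be sketched as follows. This is a minimal illustration and not MAGIA2's implementation: the function names, the toy expression profiles, and the -0.5 cutoff are all assumptions, and Pearson correlation stands in for whichever association measure a user would select.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def refine_targets(mirna_expr, gene_expr, predicted_pairs, cutoff=-0.5):
    """Keep predicted miRNA-gene pairs whose profiles are anti-correlated."""
    kept = []
    for mirna, gene in predicted_pairs:
        r = pearson(mirna_expr[mirna], gene_expr[gene])
        if r <= cutoff:  # hypothetical threshold on the correlation
            kept.append((mirna, gene, r))
    return kept


# Hypothetical toy data: four matched samples per profile.
mirna_expr = {"miR-A": [1.0, 2.0, 3.0, 4.0]}
gene_expr = {
    "GENE1": [8.0, 6.0, 4.0, 2.0],  # falls as miR-A rises
    "GENE2": [1.0, 2.0, 2.5, 4.0],  # rises with miR-A
}
pairs = [("miR-A", "GENE1"), ("miR-A", "GENE2")]
refined = refine_targets(mirna_expr, gene_expr, pairs)
```

    In this toy case only the miR-A/GENE1 pair survives the filter, because GENE2 rises together with the miRNA.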

  4. Identification, Expression Analysis, and Target Prediction of Flax Genotroph MicroRNAs Under Normal and Nutrient Stress Conditions

    PubMed Central

    Melnikova, Nataliya V.; Dmitriev, Alexey A.; Belenikin, Maxim S.; Koroban, Nadezhda V.; Speranskaya, Anna S.; Krinitsina, Anastasia A.; Krasnov, George S.; Lakunina, Valentina A.; Snezhkina, Anastasiya V.; Sadritdinova, Asiya F.; Kishlyan, Natalya V.; Rozhmina, Tatiana A.; Klimina, Kseniya M.; Amosova, Alexandra V.; Zelenin, Alexander V.; Muravenko, Olga V.; Bolsheva, Nadezhda L.; Kudryavtseva, Anna V.

    2016-01-01

    Cultivated flax (Linum usitatissimum L.) is an important plant valuable for industry. Some flax lines can undergo heritable phenotypic and genotypic changes (LIS-1 insertion being the most common) in response to nutrient stress and are called plastic lines. Offspring of plastic lines, which stably inherit the changes, are called genotrophs. MicroRNAs (miRNAs) are involved in a crucial regulatory mechanism of gene expression. They have previously been assumed to take part in nutrient stress response and can, therefore, participate in genotroph formation. In the present study, we performed high-throughput sequencing of small RNAs (sRNAs) extracted from flax plants grown under normal, phosphate deficient and nutrient excess conditions to identify miRNAs and evaluate their expression. Our analysis revealed expression of 96 conserved miRNAs from 21 families in flax. Moreover, 475 novel potential miRNAs were identified for the first time, and their targets were predicted. However, none of the identified miRNAs were transcribed from LIS-1. Expression of seven miRNAs (miR168, miR169, miR395, miR398, miR399, miR408, and lus-miR-N1) with up- or down-regulation under nutrient stress (on the basis of high-throughput sequencing data) was evaluated on extended sampling using qPCR. Reference gene search identified ETIF3H and ETIF3E genes as most suitable for this purpose. Down-regulation of novel potential lus-miR-N1 and up-regulation of conserved miR399 were revealed under the phosphate deficient conditions. In addition, a negative correlation was observed between the expression of lus-miR-N1 and its predicted target, the ubiquitin-activating enzyme E1 gene, as well as between miR399 and its predicted target, the ubiquitin-conjugating enzyme E2 gene. Thus, in our study, miRNAs expressed in flax plastic lines and genotrophs were identified and their expression and expression of their targets was evaluated using high-throughput sequencing and qPCR for the first time. These data provide new insights into nutrient stress response regulation in plastic flax cultivars. PMID:27092149

  5. Biological mechanism analysis of acute renal allograft rejection: integrated of mRNA and microRNA expression profiles.

    PubMed

    Huang, Shi-Ming; Zhao, Xia; Zhao, Xue-Mei; Wang, Xiao-Ying; Li, Shan-Shan; Zhu, Yu-Hui

    2014-01-01

    Renal transplantation is the preferred method for most patients with end-stage renal disease; however, acute renal allograft rejection is still a major risk factor for recipients, leading to renal injury. To improve the early diagnosis and treatment of acute rejection, its molecular mechanism urgently needs to be studied. MicroRNA (miRNA) and mRNA expression profiles of acute renal allograft rejection and well-functioning allografts, downloaded from the ArrayExpress database, were used to identify differentially expressed (DE) miRNAs and DE mRNAs. Targets of the DE miRNAs were predicted by combining five algorithms. By overlapping the DE mRNAs and the DE miRNA targets, common genes were obtained. Differentially co-expressed genes (DCGs) were identified by the differential co-expression profile (DCp) and differential co-expression enrichment (DCe) methods in the Differentially Co-expressed Genes and Links (DCGL) package. Then, a co-expression network of DCGs was built and cluster analysis was performed. Functional enrichment analysis of the DCGs was also carried out. A total of 1270 miRNA targets were predicted and 698 DE mRNAs were obtained. Overlapping the miRNA targets and DE mRNAs yielded 59 common genes. We obtained 103 DCGs and 5 transcription factors (TFs) based on regulatory impact factors (RIF), then built the regulation network of miRNA targets and DE mRNAs. Clustering the co-expression network yielded 5 modules. Among these, module 1 had the highest degree and module 2 contained the largest number of DCGs and common genes. The TF CEBPB and several common genes, such as RXRA, BASP1 and AKAP10, were mapped onto the co-expression network. C1R showed the highest degree in the network. These genes might be associated with human acute renal allograft rejection. We conducted an integrated biological analysis of DE mRNAs and DE miRNAs in acute renal allograft rejection, displayed gene expression patterns and screened out genes and TFs that may be related to acute renal allograft rejection.

  6. Biological mechanism analysis of acute renal allograft rejection: integrated of mRNA and microRNA expression profiles

    PubMed Central

    Huang, Shi-Ming; Zhao, Xia; Zhao, Xue-Mei; Wang, Xiao-Ying; Li, Shan-Shan; Zhu, Yu-Hui

    2014-01-01

    Objectives: Renal transplantation is the preferred method for most patients with end-stage renal disease; however, acute renal allograft rejection is still a major risk factor for recipients, leading to renal injury. To improve the early diagnosis and treatment of acute rejection, its molecular mechanism urgently needs to be studied. Methods: MicroRNA (miRNA) and mRNA expression profiles of acute renal allograft rejection and well-functioning allografts, downloaded from the ArrayExpress database, were used to identify differentially expressed (DE) miRNAs and DE mRNAs. Targets of the DE miRNAs were predicted by combining five algorithms. By overlapping the DE mRNAs and the DE miRNA targets, common genes were obtained. Differentially co-expressed genes (DCGs) were identified by the differential co-expression profile (DCp) and differential co-expression enrichment (DCe) methods in the Differentially Co-expressed Genes and Links (DCGL) package. Then, a co-expression network of DCGs was built and cluster analysis was performed. Functional enrichment analysis of the DCGs was also carried out. Results: A total of 1270 miRNA targets were predicted and 698 DE mRNAs were obtained. Overlapping the miRNA targets and DE mRNAs yielded 59 common genes. We obtained 103 DCGs and 5 transcription factors (TFs) based on regulatory impact factors (RIF), then built the regulation network of miRNA targets and DE mRNAs. Clustering the co-expression network yielded 5 modules. Among these, module 1 had the highest degree and module 2 contained the largest number of DCGs and common genes. The TF CEBPB and several common genes, such as RXRA, BASP1 and AKAP10, were mapped onto the co-expression network. C1R showed the highest degree in the network. These genes might be associated with human acute renal allograft rejection. Conclusions: We conducted an integrated biological analysis of DE mRNAs and DE miRNAs in acute renal allograft rejection, displayed gene expression patterns and screened out genes and TFs that may be related to acute renal allograft rejection. PMID:25664019

  7. Biological Networks for Predicting Chemical Hepatocarcinogenicity Using Gene Expression Data from Treated Mice and Relevance across Human and Rat Species

    PubMed Central

    Thomas, Reuben; Thomas, Russell S.; Auerbach, Scott S.; Portier, Christopher J.

    2013-01-01

    Background: Several groups have employed genomic data from subchronic chemical toxicity studies in rodents (90 days) to derive gene-centric predictors of chronic toxicity and carcinogenicity. Genes are annotated to belong to biological processes or molecular pathways that are mechanistically well understood and are described in public databases. Objectives: To develop a molecular pathway-based prediction model of long term hepatocarcinogenicity using 90-day gene expression data and to evaluate the performance of this model with respect to both intra-species, dose-dependent and cross-species predictions. Methods: Genome-wide hepatic mRNA expression was retrospectively measured in B6C3F1 mice following subchronic exposure to twenty-six (26) chemicals (10 were positive, 2 equivocal and 14 negative for liver tumors) previously studied by the US National Toxicology Program. Using these data, a pathway-based predictor model for long-term liver cancer risk was derived using random forests. The prediction model was independently validated on test sets associated with liver cancer risk obtained from mice, rats and humans. Results: Using 5-fold cross validation, the developed prediction model had reasonable predictive performance with the area under receiver-operator curve (AUC) equal to 0.66. The developed prediction model was then used to extrapolate the results to data associated with rat and human liver cancer. The extrapolated model worked well for both extrapolated species (AUC value of 0.74 for rats and 0.91 for humans). The prediction models implied a balanced interplay between all pathway responses leading to carcinogenicity predictions. Conclusions: Pathway-based prediction models estimated from sub-chronic data hold promise for predicting long-term carcinogenicity and also for its ability to extrapolate results across multiple species. PMID:23737943
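
    The AUC values quoted above summarize how well predicted scores rank tumor-positive chemicals above tumor-negative ones. A minimal sketch of that metric follows; it is not the authors' evaluation code, and the function name `auc`, the scores, and the labels are hypothetical toy values rather than data from the study.

```python
def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    A tie between a positive and a negative score counts as half a
    correctly ranked pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted scores for six chemicals (1 = liver tumor positive).
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(scores, labels)
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.66, 0.74 and 0.91 should be read.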

  8. Biological networks for predicting chemical hepatocarcinogenicity using gene expression data from treated mice and relevance across human and rat species.

    PubMed

    Thomas, Reuben; Thomas, Russell S; Auerbach, Scott S; Portier, Christopher J

    2013-01-01

    Several groups have employed genomic data from subchronic chemical toxicity studies in rodents (90 days) to derive gene-centric predictors of chronic toxicity and carcinogenicity. Genes are annotated to belong to biological processes or molecular pathways that are mechanistically well understood and are described in public databases. To develop a molecular pathway-based prediction model of long term hepatocarcinogenicity using 90-day gene expression data and to evaluate the performance of this model with respect to both intra-species, dose-dependent and cross-species predictions. Genome-wide hepatic mRNA expression was retrospectively measured in B6C3F1 mice following subchronic exposure to twenty-six (26) chemicals (10 were positive, 2 equivocal and 14 negative for liver tumors) previously studied by the US National Toxicology Program. Using these data, a pathway-based predictor model for long-term liver cancer risk was derived using random forests. The prediction model was independently validated on test sets associated with liver cancer risk obtained from mice, rats and humans. Using 5-fold cross validation, the developed prediction model had reasonable predictive performance with the area under receiver-operator curve (AUC) equal to 0.66. The developed prediction model was then used to extrapolate the results to data associated with rat and human liver cancer. The extrapolated model worked well for both extrapolated species (AUC value of 0.74 for rats and 0.91 for humans). The prediction models implied a balanced interplay between all pathway responses leading to carcinogenicity predictions. Pathway-based prediction models estimated from sub-chronic data hold promise for predicting long-term carcinogenicity and also for its ability to extrapolate results across multiple species.

  9. Extending bicluster analysis to annotate unclassified ORFs and predict novel functional modules using expression data

    PubMed Central

    Bryan, Kenneth; Cunningham, Pádraig

    2008-01-01

    Background: Microarrays have the capacity to measure the expressions of thousands of genes in parallel over many experimental samples. The unsupervised classification technique of bicluster analysis has been employed previously to uncover gene expression correlations over subsets of samples with the aim of providing a more accurate model of the natural gene functional classes. This approach also has the potential to aid functional annotation of unclassified open reading frames (ORFs). Until now this aspect of biclustering has been under-explored. In this work we illustrate how bicluster analysis may be extended into a 'semi-supervised' ORF annotation approach referred to as BALBOA. Results: The efficacy of the BALBOA ORF classification technique is first assessed via cross validation and compared to a multi-class k-Nearest Neighbour (kNN) benchmark across three independent gene expression datasets. BALBOA is then used to assign putative functional annotations to unclassified yeast ORFs. These predictions are evaluated using existing experimental and protein sequence information. Lastly, we employ a related semi-supervised method to predict the presence of novel functional modules within yeast. Conclusion: In this paper we demonstrate how unsupervised classification methods, such as bicluster analysis, may be extended using available annotations to form semi-supervised approaches within the gene expression analysis domain. We show that such methods have the potential to improve upon supervised approaches and shed new light on the functions of unclassified ORFs and their co-regulation. PMID:18831786

  10. Gene expression pattern recognition algorithm inferences to classify samples exposed to chemical agents

    NASA Astrophysics Data System (ADS)

    Bushel, Pierre R.; Bennett, Lee; Hamadeh, Hisham; Green, James; Ableson, Alan; Misener, Steve; Paules, Richard; Afshari, Cynthia

    2002-06-01

    We present an analysis of pattern recognition procedures used to predict the classes of samples exposed to pharmacologic agents by comparing gene expression patterns from samples treated with two classes of compounds. Rat liver mRNA samples following exposure for 24 hours with phenobarbital or peroxisome proliferators were analyzed using a 1700 rat cDNA microarray platform. Sets of genes that were consistently differentially expressed in the rat liver samples following treatment were stored in the MicroArray Project System (MAPS) database. MAPS identified 238 genes in common that possessed a low probability (P < 0.01) of being randomly detected as differentially expressed at the 95% confidence level. Hierarchical cluster analysis on the 238 genes clustered specific gene expression profiles that separated samples based on exposure to a particular class of compound.

  11. Integrated analysis of DNA-methylation and gene expression using high-dimensional penalized regression: a cohort study on bone mineral density in postmenopausal women.

    PubMed

    Lien, Tonje G; Borgan, Ørnulf; Reppe, Sjur; Gautvik, Kaare; Glad, Ingrid Kristine

    2018-03-07

    Using high-dimensional penalized regression we studied genome-wide DNA-methylation in bone biopsies of 80 postmenopausal women in relation to their bone mineral density (BMD). The women showed BMD varying from severely osteoporotic to normal. Global gene expression data from the same individuals was available, and since DNA-methylation often affects gene expression, the overall aim of this paper was to include both of these omics data sets into an integrated analysis. The classical penalized regression uses one penalty, but we incorporated individual penalties for each of the DNA-methylation sites. These individual penalties were guided by the strength of association between DNA-methylations and gene transcript levels. DNA-methylations that were highly associated to one or more transcripts got lower penalties and were therefore favored compared to DNA-methylations showing less association to expression. Because of the complex pathways and interactions among genes, we investigated both the association between DNA-methylations and their corresponding cis gene, as well as the association between DNA-methylations and trans-located genes. Two integrating penalized methods were used: first, an adaptive group-regularized ridge regression, and secondly, variable selection was performed through a modified version of the weighted lasso. When information from gene expressions was integrated, predictive performance was considerably improved, in terms of predictive mean square error, compared to classical penalized regression without data integration. We found a 14.7% improvement in the ridge regression case and a 17% improvement for the lasso case. Our version of the weighted lasso with data integration found a list of 22 interesting methylation sites. Several corresponded to genes that are known to be important in bone formation. 
Using BMD as response and these 22 methylation sites as covariates, least square regression analyses resulted in R² = 0.726, comparable to an average R² = 0.438 for 10000 randomly selected groups of DNA-methylations with group size 22. Two recent types of penalized regression methods were adapted to integrate DNA-methylation and their association to gene expression in the analysis of bone mineral density. In both cases predictions clearly benefit from including the additional information on gene expressions.
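
    The idea of giving each predictor its own penalty can be illustrated with a small weighted lasso. The following is only a minimal sketch in plain Python, not the authors' modified weighted lasso: the function names, the toy data and the penalty weights are all hypothetical, and the inputs are assumed to be centred so that no intercept is needed.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def weighted_lasso(X, y, lam, weights, n_iter=200):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + lam * sum_j weights[j]*|b_j|.

    X is a list of rows and y a list of responses. No intercept is fitted,
    so the inputs are assumed to be centred (an assumption of this sketch).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_norm = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_norm[j] == 0.0:
                continue
            # Covariance of feature j with the partial residual
            # (the residual with feature j's own contribution added back).
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam * weights[j]) / col_norm[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Hypothetical toy data: two orthogonal predictors that explain y equally well.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2, 0, 0, -2]

# Hypothetical weights: a low penalty for a site assumed to be strongly
# associated with expression, a high penalty for a weakly associated one.
print(weighted_lasso(X, y, lam=0.5, weights=[0.2, 4.0]))
print(weighted_lasso(X, y, lam=0.5, weights=[1.0, 1.0]))
```

    With unequal weights the lightly penalized predictor is kept and the heavily penalized one is shrunk to zero, whereas equal weights treat both alike. In the abstract above the weights are derived from the association between methylation sites and transcript levels, which this toy example does not attempt to reproduce.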

  12. Prediction of Human Disease Genes by Human-Mouse Conserved Coexpression Analysis

    PubMed Central

    Grassi, Elena; Damasco, Christian; Silengo, Lorenzo; Oti, Martin; Provero, Paolo; Di Cunto, Ferdinando

    2008-01-01

    Background Even in the post-genomic era, the identification of candidate genes within loci associated with human genetic diseases is a very demanding task, because the critical region may typically contain hundreds of positional candidates. Since genes implicated in similar phenotypes tend to share very similar expression profiles, high throughput gene expression data may represent a very important resource to identify the best candidates for sequencing. However, so far, gene coexpression has not been used very successfully to prioritize positional candidates. Methodology/Principal Findings We show that it is possible to reliably identify disease-relevant relationships among genes from massive microarray datasets by concentrating only on genes sharing similar expression profiles in both human and mouse. Moreover, we show systematically that the integration of human-mouse conserved coexpression with a phenotype similarity map allows the efficient identification of disease genes in large genomic regions. Finally, using this approach on 850 OMIM loci characterized by an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases. Conclusion Our results demonstrate that conserved coexpression, even at the human-mouse phylogenetic distance, represents a very strong criterion to predict disease-relevant relationships among human genes. PMID:18369433

  13. Predictive Genes in Adjacent Normal Tissue Are Preferentially Altered by sCNV during Tumorigenesis in Liver Cancer and May Be Rate Limiting

    PubMed Central

    Lamb, John R.; Zhang, Chunsheng; Xie, Tao; Wang, Kai; Zhang, Bin; Hao, Ke; Chudin, Eugene; Fraser, Hunter B.; Millstein, Joshua; Ferguson, Mark; Suver, Christine; Ivanovska, Irena; Scott, Martin; Philippar, Ulrike; Bansal, Dimple; Zhang, Zhan; Burchard, Julja; Smith, Ryan; Greenawalt, Danielle; Cleary, Michele; Derry, Jonathan; Loboda, Andrey; Watters, James; Poon, Ronnie T. P.; Fan, Sheung T.; Yeung, Chun; Lee, Nikki P. Y.; Guinney, Justin; Molony, Cliona; Emilsson, Valur; Buser-Doepner, Carolyn; Zhu, Jun; Friend, Stephen; Mao, Mao; Shaw, Peter M.; Dai, Hongyue; Luk, John M.; Schadt, Eric E.

    2011-01-01

    Background In hepatocellular carcinoma (HCC) genes predictive of survival have been found in both adjacent normal (AN) and tumor (TU) tissues. The relationships between these two sets of predictive genes and the general process of tumorigenesis and disease progression remain unclear. Methodology/Principal Findings Here we have investigated HCC tumorigenesis by comparing gene expression, DNA copy number variation and survival using ∼250 AN and TU samples representing, respectively, the pre-cancer state, and the result of tumorigenesis. Genes that participate in tumorigenesis were defined using a gene-gene correlation meta-analysis procedure that compared AN versus TU tissues. Genes predictive of survival in AN (AN-survival genes) were found to be enriched in the differential gene-gene correlation gene set indicating that they directly participate in the process of tumorigenesis. Additionally the AN-survival genes were mostly not predictive after tumorigenesis in TU tissue and this transition was associated with and could largely be explained by the effect of somatic DNA copy number variation (sCNV) in cis and in trans. The data was consistent with the variance of AN-survival genes being rate-limiting steps in tumorigenesis and this was confirmed using a treatment that promotes HCC tumorigenesis that selectively altered AN-survival genes and genes differentially correlated between AN and TU. Conclusions/Significance This suggests that the process of tumor evolution involves rate-limiting steps related to the background from which the tumor evolved where these were frequently predictive of clinical outcome. Additionally treatments that alter the likelihood of tumorigenesis occurring may act by altering AN-survival genes, suggesting that the process can be manipulated. Further sCNV explains a substantial fraction of tumor specific expression and may therefore be a causal driver of tumor evolution in HCC and perhaps many solid tumor types. PMID:21750698

  14. Prediction of in vivo hepatotoxicity effects using in vitro transcriptomics data (SOT)

    EPA Science Inventory

    High-throughput in vitro transcriptomics data support molecular understanding of chemical-induced toxicity. Here, we evaluated the utility of such data to predict liver toxicity. First, in vitro gene expression data for 93 genes was generated following exposure of metabolically c...

  15. Genome-wide DNA methylation profiling integrated with gene expression profiling identifies PAX9 as a novel prognostic marker in chronic lymphocytic leukemia.

    PubMed

    Rani, Lata; Mathur, Nitin; Gupta, Ritu; Gogia, Ajay; Kaur, Gurvinder; Dhanjal, Jaspreet Kaur; Sundar, Durai; Kumar, Lalit; Sharma, Atul

    2017-01-01

    In chronic lymphocytic leukemia (CLL), epigenomic and genomic studies have expanded the existing knowledge about the disease biology and led to the identification of potential biomarkers relevant for implementation of personalized medicine. In this study, an attempt has been made to examine and integrate the global DNA methylation changes with gene expression profile and their impact on clinical outcome in early stage CLL patients. The integration of DNA methylation profile (n = 14) with the gene expression profile (n = 21) revealed 142 genes as hypermethylated-downregulated and 62 genes as hypomethylated-upregulated in early stage CLL patients compared to CD19+ B-cells from healthy individuals. The mRNA expression levels of 17 genes identified to be differentially methylated and/or differentially expressed were further examined in early stage CLL patients (n = 93) by quantitative real time PCR (RQ-PCR). Significant differences were observed in the mRNA expression of MEIS1, PMEPA1, SOX7, SPRY1, CDK6, TBX2, and SPRY2 genes in CLL cells as compared to B-cells from healthy individuals. The analysis in the IGHV mutation based categories (Unmutated = 39, Mutated = 54) revealed significantly higher mRNA expression of CRY1 and PAX9 genes in the IGHV unmutated subgroup (p < 0.001). The relative risk of treatment initiation was significantly higher among patients with high expression of CRY1 (RR = 1.91, p = 0.005) or PAX9 (RR = 1.87, p = 0.001). High expression of CRY1 (HR: 3.53, p < 0.001) or PAX9 (HR: 3.14, p < 0.001) gene was significantly associated with shorter time to first treatment. The high expression of PAX9 gene (HR: 3.29, 95% CI 1.172-9.272, p = 0.016) was also predictive of shorter overall survival in CLL. The DNA methylation changes associated with mRNA expression of CRY1 and PAX9 genes allow risk stratification of early stage CLL patients.
This comprehensive analysis supports the concept that the epigenetic changes along with the altered expression of genes have the potential to predict clinical outcome in early stage CLL patients.

  16. Transcriptome analysis of cattle muscle identifies potential markers for skeletal muscle growth rate and major cell types.

    PubMed

    Guo, Bing; Greenwood, Paul L; Cafe, Linda M; Zhou, Guanghong; Zhang, Wangang; Dalrymple, Brian P

    2015-03-13

    This study aimed to identify markers for muscle growth rate and the different cellular contributors to cattle muscle and to link the muscle growth rate markers to specific cell types. The expression of two groups of genes in the longissimus muscle (LM) of 48 Brahman steers of similar age, significantly enriched for "cell cycle" and "ECM (extracellular matrix) organization" Gene Ontology (GO) terms was correlated with average daily gain/kg liveweight (ADG/kg) of the animals. However, expression of the same genes was only partly related to growth rate across a time course of postnatal LM development in two cattle genotypes, Piedmontese x Hereford (high muscling) and Wagyu x Hereford (high marbling). The deposition of intramuscular fat (IMF) altered the relationship between the expression of these genes and growth rate. K-means clustering across the development time course with a large set of genes (5,596) with similar expression profiles to the ECM genes was undertaken. The locations in the clusters of published markers of different cell types in muscle were identified and used to link clusters of genes to the cell type most likely to be expressing them. Overall correspondence between published cell type expression of markers and predicted major cell types of expression in cattle LM was high. However, some exceptions were identified: expression of SOX8 previously attributed to muscle satellite cells was correlated with angiogenesis. Analysis of the clusters and cell types suggested that the "cell cycle" and "ECM" signals were from the fibro/adipogenic lineage. Significant contributions to these signals from the muscle satellite cells, angiogenic cells and adipocytes themselves were not as strongly supported. Based on the clusters and cell type markers, sets of five genes predicted to be representative of fibro/adipogenic precursors (FAPs) and endothelial cells, and/or ECM remodelling and angiogenesis were identified. 
Gene sets and gene markers for the analysis of many of the major processes/cell populations contributing to muscle composition and growth have been proposed, enabling a consistent interpretation of gene expression datasets from cattle LM. The same gene sets are likely to be applicable in other cattle muscles and in other species.

  17. Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

    DOE PAGES

    Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.; ...

    2016-11-24

    Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. A multitude of technologies, abstractions, and interpretive frameworks have emerged to answer the challenges presented by genome function and regulatory network inference. Here, we propose a new approach for producing biologically meaningful clusters of coexpressed genes, called Atomic Regulons (ARs), based on expression data, gene context, and functional relationships. We demonstrate this new approach by computing ARs for Escherichia coli, which we compare with the coexpressed gene clusters predicted by two prevalent existing methods: hierarchical clustering and k-means clustering. We test the consistency of ARs predicted by all methods against expected interactions predicted by the Context Likelihood of Relatedness (CLR) mutual information based method, finding that the ARs produced by our approach show better agreement with CLR interactions. We then apply our method to compute ARs for four other genomes: Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus. We compare the AR clusters from all genomes to study the similarity of coexpression among a phylogenetically diverse set of species, identifying subsystems that show remarkable similarity over wide phylogenetic distances. We also study the sensitivity of our method for computing ARs to the expression data used in the computation, showing that our new approach requires less data than competing approaches to converge to a near final configuration of ARs. We go on to use our sensitivity analysis to identify the specific experiments that lead most rapidly to the final set of ARs for E. coli. As a result, this analysis produces insights into improving the design of gene expression experiments.

  18. Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.

    Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. A multitude of technologies, abstractions, and interpretive frameworks have emerged to answer the challenges presented by genome function and regulatory network inference. Here, we propose a new approach for producing biologically meaningful clusters of coexpressed genes, called Atomic Regulons (ARs), based on expression data, gene context, and functional relationships. We demonstrate this new approach by computing ARs for Escherichia coli, which we compare with the coexpressed gene clusters predicted by two prevalent existing methods: hierarchical clustering and k-means clustering. We test the consistency of ARs predicted by all methods against expected interactions predicted by the Context Likelihood of Relatedness (CLR) mutual information based method, finding that the ARs produced by our approach show better agreement with CLR interactions. We then apply our method to compute ARs for four other genomes: Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus. We compare the AR clusters from all genomes to study the similarity of coexpression among a phylogenetically diverse set of species, identifying subsystems that show remarkable similarity over wide phylogenetic distances. We also study the sensitivity of our method for computing ARs to the expression data used in the computation, showing that our new approach requires less data than competing approaches to converge to a near final configuration of ARs. We go on to use our sensitivity analysis to identify the specific experiments that lead most rapidly to the final set of ARs for E. coli. As a result, this analysis produces insights into improving the design of gene expression experiments.

  19. Combining Gene Signatures Improves Prediction of Breast Cancer Survival

    PubMed Central

    Zhao, Xi; Naume, Bjørn; Langerød, Anita; Frigessi, Arnoldo; Kristensen, Vessela N.; Børresen-Dale, Anne-Lise; Lingjærde, Ole Christian

    2011-01-01

    Background Several gene sets for prediction of breast cancer survival have been derived from whole-genome mRNA expression profiles. Here, we develop a statistical framework to explore whether combination of the information from such sets may improve prediction of recurrence and breast cancer specific death in early-stage breast cancers. Microarray data from two clinically similar cohorts of breast cancer patients are used as training (n = 123) and test set (n = 81), respectively. Gene sets from eleven previously published gene signatures are included in the study. Principal Findings To investigate the relationship between breast cancer survival and gene expression on a particular gene set, a Cox proportional hazards model is applied using partial likelihood regression with an L2 penalty to avoid overfitting and using cross-validation to determine the penalty weight. The fitted models are applied to an independent test set to obtain a predicted risk for each individual and each gene set. Hierarchical clustering of the test individuals on the basis of the vector of predicted risks results in two clusters with distinct clinical characteristics in terms of the distribution of molecular subtypes, ER, PR status, TP53 mutation status and histological grade category, and associated with significantly different survival probabilities (recurrence: p = 0.005; breast cancer death: p = 0.014). Finally, principal components analysis of the gene signatures is used to derive combined predictors used to fit a new Cox model. This model classifies test individuals into two risk groups with distinct survival characteristics (recurrence: p = 0.003; breast cancer death: p = 0.001). The latter classifier outperforms all the individual gene signatures, as well as Cox models based on traditional clinical parameters and the Adjuvant! Online for survival prediction. Conclusion Combining the predictive strength of multiple gene signatures improves prediction of breast cancer survival. 
The presented methodology is broadly applicable to breast cancer risk assessment using any new identified gene set. PMID:21423775

  20. Customized oligonucleotide microarray gene expression-based classification of neuroblastoma patients outperforms current clinical risk stratification.

    PubMed

    Oberthuer, André; Berthold, Frank; Warnat, Patrick; Hero, Barbara; Kahlert, Yvonne; Spitz, Rüdiger; Ernestus, Karen; König, Rainer; Haas, Stefan; Eils, Roland; Schwab, Manfred; Brors, Benedikt; Westermann, Frank; Fischer, Matthias

    2006-11-01

    To develop a gene expression-based classifier for neuroblastoma patients that reliably predicts courses of the disease. Two hundred fifty-one neuroblastoma specimens were analyzed using a customized oligonucleotide microarray comprising 10,163 probes for transcripts with differential expression in clinical subgroups of the disease. Subsequently, the prediction analysis for microarrays (PAM) was applied to a first set of patients with maximally divergent clinical courses (n = 77). The classification accuracy was estimated by a complete 10-times-repeated 10-fold cross validation, and a 144-gene predictor was constructed from this set. This classifier's predictive power was evaluated in an independent second set (n = 174) by comparing results of the gene expression-based classification with those of risk stratification systems of current trials from Germany, Japan, and the United States. The first set of patients was accurately predicted by PAM (cross-validated accuracy, 99%). Within the second set, the PAM classifier significantly separated cohorts with distinct courses (3-year event-free survival [EFS] 0.86 +/- 0.03 [favorable; n = 115] v 0.52 +/- 0.07 [unfavorable; n = 59] and 3-year overall survival 0.99 +/- 0.01 v 0.84 +/- 0.05; both P < .0001) and separated risk groups of current neuroblastoma trials into subgroups with divergent outcome (NB2004: low-risk 3-year EFS 0.86 +/- 0.04 v 0.25 +/- 0.15, P < .0001; intermediate-risk 1.00 v 0.57 +/- 0.19, P = .018; high-risk 0.81 +/- 0.10 v 0.56 +/- 0.08, P = .06). In a multivariate Cox regression model, the PAM predictor classified patients of the second set more accurately than risk stratification of current trials from Germany, Japan, and the United States (P < .001; hazard ratio, 4.756 [95% CI, 2.544 to 8.893]). Integration of gene expression-based class prediction of neuroblastoma patients may improve risk estimation of current neuroblastoma trials.
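
    The "10-times-repeated 10-fold cross validation" mentioned above can be sketched as index bookkeeping. This is a minimal illustration in plain Python under stated assumptions: the function name, the seed and the way folds are cut are hypothetical, and the classifier itself (PAM in the abstract) is not implemented here. It would be fitted on each training set and scored on the matching held-out set.

```python
import random


def repeated_kfold_indices(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random partition for every repeat
        # Deal the shuffled samples into n_folds nearly equal folds.
        folds = [order[k::n_folds] for k in range(n_folds)]
        for fold, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield repeat, fold, train_idx, sorted(test_idx)


# 77 samples, as in the first patient set described above.
splits = list(repeated_kfold_indices(n_samples=77))
print(len(splits))  # number of train/test splits: 10 repeats x 10 folds
repeat, fold, train_idx, test_idx = splits[0]
print(len(train_idx), len(test_idx))
```

    Within one repeat every sample is held out exactly once, so averaging the held-out accuracy over all repeats gives the cross-validated estimate. Any gene selection step would need to be redone inside each training set for the estimate to stay unbiased.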

  1. The Interrelationship between Promoter Strength, Gene Expression, and Growth Rate

    PubMed Central

    Klesmith, Justin R.; Detwiler, Emily E.; Tomek, Kyle J.; Whitehead, Timothy A.

    2014-01-01

    In exponentially growing bacteria, expression of heterologous protein impedes cellular growth rates. Quantitative understanding of the relationship between expression and growth rate will advance our ability to forward engineer bacteria, important for metabolic engineering and synthetic biology applications. Recently, a work described a scaling model based on optimal allocation of ribosomes for protein translation. This model quantitatively predicts a linear relationship between microbial growth rate and heterologous protein expression with no free parameters. With the aim of validating this model, we have rigorously quantified the fitness cost of gene expression by using a library of synthetic constitutive promoters to drive expression of two separate proteins (eGFP and amiE) in E. coli in different strains and growth media. In all cases, we demonstrate that the fitness cost is consistent with the previous findings. We expand upon the previous theory by introducing a simple promoter activity model to quantitatively predict how basal promoter strength relates to growth rate and protein expression. We then estimate the amount of protein expression needed to support high flux through a heterologous metabolic pathway and predict the sizable fitness cost associated with enzyme production. This work has broad implications across applied biological sciences because it allows for prediction of the interplay between promoter strength, protein expression, and the resulting cost to microbial growth rates. PMID:25286161

  2. Degradation of triglycerides by a pseudomonad isolated from milk: molecular analysis of a lipase-encoding gene and its expression in Escherichia coli.

    PubMed Central

    Johnson, L A; Beacham, I R; MacRae, I C; Free, M L

    1992-01-01

    Psychrotrophic lipolytic bacteria represent a significant problem in the storage of refrigerated dairy products. A lipase-encoding gene has been cloned and characterized from a highly lipolytic strain of Pseudomonas. The nucleotide sequence of the gene predicts a polypeptide of M(r) 49,905, which was identified when the gene was expressed in Escherichia coli. PMID:1622251

  3. Identification of genes associated with low furanocoumarin content in grapefruit.

    PubMed

    Chen, Chunxian; Yu, Qibin; Wei, Xu; Cancalon, Paul F; Gmitter, Fred G

    2014-10-01

    Some furanocoumarins in grapefruit (Citrus paradisi) are associated with the so-called grapefruit juice effect. Previous phytochemical quantification and genetic analysis suggested that the synthesis of these furanocoumarins may be controlled by a single gene in the pathway. In this study, cDNA-amplified fragment length polymorphism (cDNA-AFLP) analysis of fruit tissues was performed to identify the candidate gene(s) likely associated with low furanocoumarin content in grapefruit. Fifteen tentative differentially expressed fragments were cloned through the cDNA-AFLP analysis of the grapefruit variety Foster and its spontaneous low-furanocoumarin mutant Low Acid Foster. Sequence analysis revealed a cDNA-AFLP fragment, Contig 6, was homologous to a substrate-proved psoralen synthase gene, CYP71A22, and was part of citrus unigenes Cit.3003 and Csi.1332, and predicted genes Ciclev10004717m in mandarin and orange1.1g041507m in sweet orange. The two predicted genes contained the highly conserved motifs at one of the substrate recognition sites of CYP71A22. Digital gene expression profile showed the unigenes were expressed only in fruit and seed. Quantitative real-time PCR also proved Contig 6 was down-regulated in Low Acid Foster. These results showed the differentially expressed Contig 6 was related to the reduced furanocoumarin levels in the mutant. The identified fragment, homologs, unigenes, and genes may facilitate further furanocoumarin genetic study and grapefruit variety improvement.

  4. Time course of gene expression during mouse skeletal muscle hypertrophy

    PubMed Central

    Lee, Jonah D.; England, Jonathan H.; Esser, Karyn A.; McCarthy, John J.

    2013-01-01

    The purpose of this study was to perform a comprehensive transcriptome analysis during skeletal muscle hypertrophy to identify signaling pathways that are operative throughout the hypertrophic response. Global gene expression patterns were determined from microarray results on days 1, 3, 5, 7, 10, and 14 during plantaris muscle hypertrophy induced by synergist ablation in adult mice. Principal component analysis and the number of differentially expressed genes (cutoffs ≥2-fold increase or ≥50% decrease compared with control muscle) revealed three gene expression patterns during overload-induced hypertrophy: early (1 day), intermediate (3, 5, and 7 days), and late (10 and 14 days) patterns. Based on the robust changes in total RNA content and in the number of differentially expressed genes, we focused our attention on the intermediate gene expression pattern. Ingenuity Pathway Analysis revealed a downregulation of genes encoding components of the branched-chain amino acid degradation pathway during hypertrophy. Among these genes, five were predicted by Ingenuity Pathway Analysis or previously shown to be regulated by the transcription factor Kruppel-like factor-15, which was also downregulated during hypertrophy. Moreover, the integrin-linked kinase signaling pathway was activated during hypertrophy, and the downregulation of muscle-specific micro-RNA-1 correlated with the upregulation of five predicted targets associated with the integrin-linked kinase pathway. In conclusion, we identified two novel pathways that may be involved in muscle hypertrophy, as well as two upstream regulators (Kruppel-like factor-15 and micro-RNA-1) that provide targets for future studies investigating the importance of these pathways in muscle hypertrophy. PMID:23869057
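
    The differential-expression cutoffs quoted above (≥2-fold increase or ≥50% decrease relative to control) translate into a simple ratio test. The sketch below is illustrative only: the function name, gene labels and intensity values are hypothetical, and expression values are assumed to be on a linear (not log-transformed) scale.

```python
def classify_fold_change(treated, control, up_cutoff=2.0, down_cutoff=0.5):
    """Label a gene 'up', 'down' or 'unchanged' from its treated/control ratio.

    A ratio of at least up_cutoff is a >=2-fold increase; a ratio of at most
    down_cutoff corresponds to a >=50% decrease.
    """
    if control <= 0 or treated < 0:
        raise ValueError("control must be positive and treated non-negative")
    ratio = treated / control
    if ratio >= up_cutoff:
        return "up"
    if ratio <= down_cutoff:
        return "down"
    return "unchanged"


# Hypothetical (overloaded muscle, control muscle) intensities for three genes.
example = {"gene_a": (840.0, 400.0), "gene_b": (150.0, 320.0), "gene_c": (510.0, 480.0)}
for gene, (treated, control) in example.items():
    print(gene, classify_fold_change(treated, control))
```

    Counting the genes labelled "up" or "down" at each time point would give the per-day tallies that the abstract uses, together with principal component analysis, to separate early, intermediate and late patterns.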

  5. Time course of gene expression during mouse skeletal muscle hypertrophy.

    PubMed

    Chaillou, Thomas; Lee, Jonah D; England, Jonathan H; Esser, Karyn A; McCarthy, John J

    2013-10-01

    The purpose of this study was to perform a comprehensive transcriptome analysis during skeletal muscle hypertrophy to identify signaling pathways that are operative throughout the hypertrophic response. Global gene expression patterns were determined from microarray results on days 1, 3, 5, 7, 10, and 14 during plantaris muscle hypertrophy induced by synergist ablation in adult mice. Principal component analysis and the number of differentially expressed genes (cutoffs ≥2-fold increase or ≥50% decrease compared with control muscle) revealed three gene expression patterns during overload-induced hypertrophy: early (1 day), intermediate (3, 5, and 7 days), and late (10 and 14 days) patterns. Based on the robust changes in total RNA content and in the number of differentially expressed genes, we focused our attention on the intermediate gene expression pattern. Ingenuity Pathway Analysis revealed a downregulation of genes encoding components of the branched-chain amino acid degradation pathway during hypertrophy. Among these genes, five were predicted by Ingenuity Pathway Analysis or previously shown to be regulated by the transcription factor Kruppel-like factor-15, which was also downregulated during hypertrophy. Moreover, the integrin-linked kinase signaling pathway was activated during hypertrophy, and the downregulation of muscle-specific micro-RNA-1 correlated with the upregulation of five predicted targets associated with the integrin-linked kinase pathway. In conclusion, we identified two novel pathways that may be involved in muscle hypertrophy, as well as two upstream regulators (Kruppel-like factor-15 and micro-RNA-1) that provide targets for future studies investigating the importance of these pathways in muscle hypertrophy.

  6. Whole Blood Gene Expression Profile Associated with Spontaneous Preterm Birth in Women with Threatened Preterm Labor

    PubMed Central

    Heng, Yujing Jan; Pennell, Craig Edward; Chua, Hon Nian; Perkins, Jonathan Edward; Lye, Stephen James

    2014-01-01

    Threatened preterm labor (TPTL) is defined as persistent premature uterine contractions between 20 and 37 weeks of gestation and is the most common condition that requires hospitalization during pregnancy. Most of these TPTL women continue their pregnancies to term while only an estimated 5% will deliver a premature baby within ten days. The aim of this work was to study differential whole blood gene expression associated with spontaneous preterm birth (sPTB) within 48 hours of hospital admission. Peripheral blood was collected at point of hospital admission from 154 women with TPTL before any medical treatment. Microarrays were utilized to investigate differential whole blood gene expression between TPTL women who did (n = 48) or did not have a sPTB (n = 106) within 48 hours of admission. Total leukocyte and neutrophil counts were significantly higher (35% and 41% respectively) in women who had sPTB than women who did not deliver within 48 hours (p<0.001). Fetal fibronectin (fFN) test was performed on 62 women. There was no difference in the urine, vaginal and placental microbiology and histopathology reports between the two groups of women. There were 469 significant differentially expressed genes (FDR<0.05); 28 differentially expressed genes were chosen for microarray validation using qRT-PCR and 20 out of 28 genes were successfully validated (p<0.05). An optimal random forest classifier model to predict sPTB was achieved using the top nine differentially expressed genes coupled with peripheral clinical blood data (sensitivity 70.8%, specificity 75.5%). These differentially expressed genes may further elucidate the underlying mechanisms of sPTB and pave the way for future systems biology studies to predict sPTB. PMID:24828675
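
    The reported sensitivity and specificity can be related to the group sizes with a short calculation. The confusion-matrix counts below are not given in the abstract. They are back-calculated integers that reproduce 70.8% and 75.5% if the classifier was evaluated on all 48 sPTB and 106 non-sPTB women, which is an assumption of this sketch.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = true_pos / (true_pos + false_neg)  # fraction of cases detected
    specificity = true_neg / (true_neg + false_pos)  # fraction of non-cases cleared
    return sensitivity, specificity


# Assumed counts: 34 of 48 sPTB women flagged, 80 of 106 non-sPTB women cleared.
sens, spec = sensitivity_specificity(true_pos=34, false_neg=14, true_neg=80, false_pos=26)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}")
# prints: sensitivity 70.8%, specificity 75.5%
```

    Under these assumed counts, 14 of the 48 women who delivered within 48 hours would be missed and 26 of the 106 who did not would be flagged.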

  7. Translating natural genetic variation to gene expression in a computational model of the Drosophila gap gene regulatory network

    PubMed Central

    Kozlov, Konstantin N.; Kulakovskiy, Ivan V.; Zubair, Asif; Marjoram, Paul; Lawrie, David S.; Nuzhdin, Sergey V.; Samsonova, Maria G.

    2017-01-01

    Annotating the genotype-phenotype relationship, and developing a proper quantitative description of the relationship, requires understanding the impact of natural genomic variation on gene expression. We apply a sequence-level model of gap gene expression in the early development of Drosophila to analyze single nucleotide polymorphisms (SNPs) in a panel of natural sequenced D. melanogaster lines. Using a thermodynamic modeling framework, we provide both analytical and computational descriptions of how single-nucleotide variants affect gene expression. The analysis reveals that the sequence variants increase (decrease) gene expression if located within binding sites of repressors (activators). We show that the sign of SNP influence (activation or repression) may change in time and space and elucidate the origin of this change in specific examples. The thermodynamic modeling approach predicts non-local and non-linear effects arising from SNPs, and combinations of SNPs, in individual fly genotypes. Simulation of individual fly genotypes using our model reveals that this non-linearity reduces to almost additive inputs from multiple SNPs. Further, we see signatures of the action of purifying selection in the gap gene regulatory regions. To infer the specific targets of purifying selection, we analyze the patterns of polymorphism in the data at two phenotypic levels: the strengths of binding and expression. We find that combinations of SNPs show evidence of being under selective pressure, while individual SNPs do not. The model predicts that SNPs appear to accumulate in the genotypes of the natural population in a way biased towards small increases in activating action on the expression pattern. Taken together, these results provide a systems-level view of how genetic variation translates to the level of gene regulatory networks via combinatorial SNP effects. PMID:28898266

  8. Tissue- and Time-Specific Expression of Otherwise Identical tRNA Genes

    PubMed Central

    Adir, Idan; Dahan, Orna; Broday, Limor; Pilpel, Yitzhak; Rechavi, Oded

    2016-01-01

    Codon usage bias affects protein translation because tRNAs that recognize synonymous codons differ in their abundance. Although the current dogma states that tRNA expression is exclusively regulated by intrinsic control elements (A- and B-box sequences), we revealed, using a reporter that monitors the levels of individual tRNA genes in Caenorhabditis elegans, that eight tryptophan tRNA genes, 100% identical in sequence, are expressed in different tissues and change their expression dynamically. Furthermore, the expression levels of the sup-7 tRNA gene at day 6 were found to predict the animal’s lifespan. We discovered that the expression of tRNAs that reside within introns of protein-coding genes is affected by the host gene’s promoter. Pairing between specific Pol II genes and the tRNAs that are contained in their introns is most likely adaptive, since a genome-wide analysis revealed that the presence of specific intronic tRNAs within specific orthologous genes is conserved across Caenorhabditis species. PMID:27560950

  9. Screening of biomarkers for prediction of response to and prognosis after chemotherapy for breast cancers.

    PubMed

    Bing, Feng; Zhao, Yu

    2016-01-01

    This study aimed to screen for biomarkers able to predict prognosis after chemotherapy for breast cancer. Three microarray datasets of breast cancer patients undergoing chemotherapy were collected from the Gene Expression Omnibus database. After preprocessing, data in GSE41112 were analyzed using significance analysis of microarrays to screen for differentially expressed genes (DEGs). The DEGs were further analyzed with Differentially Coexpressed Genes and Links to construct a function module, the prognostic efficacy of which was verified in the other two datasets (GSE22226 and GSE58644) using Kaplan-Meier plots. The genes in the function module were subjected to univariate Cox regression analysis to confirm whether the expression of each prognostic gene was associated with survival. A total of 511 DEGs between breast cancer patients who did or did not receive chemotherapy were obtained, consisting of 421 upregulated and 90 downregulated genes. Using the Differentially Coexpressed Genes and Links package, 1,244 differentially coexpressed genes (DCGs) were identified, among which 36 DCGs were regulated by the transcription factor complex NFY (NFYA, NFYB, NFYC). These 39 genes formed a gene module that classified the samples in GSE22226 and GSE58644 into three subtypes, and these subtypes exhibited significantly different survival rates. Furthermore, several of the 39 DCGs were shown to be significantly associated with good (such as CDC20) and poor (such as ARID4A) prognoses following chemotherapy. The present study provides a series of biomarkers for predicting prognosis after chemotherapy, as well as targets for the development of alternative treatments (i.e., CDC20 and ARID4A), in breast cancer patients.
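
    The Kaplan-Meier plots mentioned in this abstract rest on the product-limit estimator, which can be sketched as follows. This is a minimal illustration on invented follow-up data, not the authors' analysis; the function name `kaplan_meier` and the toy times and event flags are assumptions made for the example.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) and event indicators.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t={t}: S(t)={s:.2f}")
```

    Comparing such curves between subtypes, as the abstract describes, would additionally need a significance test such as the log-rank test, which is not shown here.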

  10. Hereditary mixed polyposis syndrome is caused by a 40-kb upstream duplication that leads to increased and ectopic expression of the BMP antagonist GREM1.

    PubMed

    Jaeger, Emma; Leedham, Simon; Lewis, Annabelle; Segditsas, Stefania; Becker, Martin; Cuadrado, Pedro Rodenas; Davis, Hayley; Kaur, Kulvinder; Heinimann, Karl; Howarth, Kimberley; East, James; Taylor, Jenny; Thomas, Huw; Tomlinson, Ian

    2012-05-06

    Hereditary mixed polyposis syndrome (HMPS) is characterized by apparent autosomal dominant inheritance of multiple types of colorectal polyp, with colorectal carcinoma occurring in a high proportion of affected individuals. Here, we use genetic mapping, copy-number analysis, exclusion of mutations by high-throughput sequencing, gene expression analysis and functional assays to show that HMPS is caused by a duplication spanning the 3' end of the SCG5 gene and a region upstream of the GREM1 locus. This unusual mutation is associated with increased allele-specific GREM1 expression. Whereas GREM1 is expressed in intestinal subepithelial myofibroblasts in controls, GREM1 is predominantly expressed in the epithelium of the large bowel in individuals with HMPS. The HMPS duplication contains predicted enhancer elements; some of these interact with the GREM1 promoter and can drive gene expression in vitro. Increased GREM1 expression is predicted to cause reduced bone morphogenetic protein (BMP) pathway activity, a mechanism that also underlies tumorigenesis in juvenile polyposis of the large bowel.

  11. Biological and Clinical Significance of MAD2L1 and BUB1, Genes Frequently Appearing in Expression Signatures for Breast Cancer Prognosis

    PubMed Central

    Wang, Zhanwei; Katsaros, Dionyssios; Shen, Yi; Fu, Yuanyuan; Canuto, Emilie Marion; Benedetto, Chiara; Lu, Lingeng; Chu, Wen-Ming; Risch, Harvey A.; Yu, Herbert

    2015-01-01

    To investigate the biologic relevance and clinical implication of genes involved in multiple gene expression signatures for breast cancer prognosis, we identified 16 published gene expression signatures, and selected two genes, MAD2L1 and BUB1. These genes appeared in 5 signatures and were involved in cell-cycle regulation. We analyzed the expression of these genes in relation to tumor features and disease outcomes. In vitro experiments were also performed in two breast cancer cell lines, MDA-MB-231 and MDA-MB-468, to assess cell proliferation, migration and invasion after knocking down the expression of these genes. High expression of these genes was found to be associated with aggressive tumors and poor disease-free survival of 203 breast cancer patients in our study, and the association with survival was confirmed in an online database consisting of 914 patients. In vitro experiments demonstrated that lowering the expression of these genes by siRNAs reduced tumor cell growth and inhibited cell migration and invasion. Our investigation suggests that MAD2L1 and BUB1 may play important roles in breast cancer progression, and measuring the expression of these genes may assist the prediction of breast cancer prognosis. PMID:26287798

  12. Identification of differentially expressed genes in the zebrafish hypothalamus - pituitary axis

    PubMed Central

    Toro, Sabrina; Wegner, Jeremy; Muller, Marc; Westerfield, Monte; Varga, Zoltan M.

    2009-01-01

    The vertebrate hypothalamic-pituitary axis (HP) is the main link between the central nervous system and endocrine system. Although several signal pathways and regulatory genes have been implicated in adenohypophysis ontogenesis, little is known about hypothalamic and neurohypophysial development or when the HP matures and becomes functional. To identify markers of the HP, we constructed subtractive cDNA libraries between adult zebrafish hypothalamus and pituitary. We identified previously published genes and ESTs and novel zebrafish genes, some of which were predicted by genomic database analysis. We also analyzed expression patterns of these genes and found that several are expressed in the embryonic and larval hypothalamus, neurohypophysis, and/or adenohypophysis. Expression at these stages makes these genes useful markers to study HP maturation and function. PMID:19166982

  13. Environmental and genetic modulation of the phenotypic expression of antibiotic resistance

    PubMed Central

    Andersson, Dan I

    2017-01-01

    Abstract Antibiotic resistance can be acquired by mutation or horizontal transfer of a resistance gene, and generally an acquired mechanism results in a predictable increase in phenotypic resistance. However, recent findings suggest that the environment and/or the genetic context can modify the phenotypic expression of specific resistance genes/mutations. An important implication from these findings is that a given genotype does not always result in the expected phenotype. This dissociation of genotype and phenotype has important consequences for clinical bacteriology and for our ability to predict resistance phenotypes from genetics and DNA sequences. A related problem concerns the degree to which the genes/mutations currently identified in vitro can fully explain the in vivo resistance phenotype, or whether there is a significant additional amount of presently unknown mutations/genes (genetic ‘dark matter’) that could contribute to resistance in clinical isolates. Finally, a very important question is whether/how we can identify the genetic features that contribute to making a successful pathogen, and predict why some resistant clones are very successful and spread globally? In this review, we describe different environmental and genetic factors that influence phenotypic expression of antibiotic resistance genes/mutations and how this information is needed to understand why particular resistant clones spread worldwide and to what extent we can use DNA sequences to predict evolutionary success. PMID:28333270

  14. A Gene Expression Profile of BRCAness That Predicts for Responsiveness to Platinum and PARP Inhibitors

    DTIC Science & Technology

    2017-02-01

    Dates covered: 15 July 2010 – 2 Nov. 2016. ...resistance in vitro, and to investigate the mechanism for this effect. The major goal for Aim 4 was to determine the reproducibility of the BRCAness...we used the epithelial ovarian cancer (EOC) dataset from The Cancer Genome Atlas (TCGA) (4). The TCGA dataset is a unique tool for these studies as

  15. Characterization of circulating microRNA expression in patients with a ventricular septal defect.

    PubMed

    Li, Dong; Ji, Long; Liu, Lianbo; Liu, Yizhi; Hou, Haifeng; Yu, Kunkun; Sun, Qiang; Zhao, Zhongtang

    2014-01-01

    Ventricular septal defect (VSD), one of the most common types of congenital heart disease (CHD), results from a combination of environmental and genetic factors. Recent studies demonstrated that microRNAs (miRNAs) are involved in the development of CHD. This study aimed to characterize the expression of miRNAs that might be involved in the development, or reflect the consequences, of VSD. MiRNA microarray analysis and reverse transcription-polymerase chain reaction (RT-PCR) were employed to determine the miRNA expression profile from 3 patients with VSD and 3 VSD-free controls. Three target gene databases were employed to predict the target genes of differentially expressed miRNAs. miRNAs that were generally in consensus across the three databases were selected and then independently validated using real time PCR in plasma samples from 20 VSD patients and 15 VSD-free controls. Target genes of the 8 validated miRNAs were predicted using bioinformatic methods. Thirty-six differentially expressed miRNAs were found between the patients with VSD and the VSD-free controls. Compared with VSD-free controls, expression of 15 miRNAs was up-regulated and that of 21 miRNAs was downregulated in the VSD group. Fifteen miRNAs were selected based on database analysis results, and the expression levels of 8 miRNAs were validated. The results of the real time PCR were consistent with those of the microarray analysis. Gene ontology analysis indicated that the top target genes were mainly related to cardiac right ventricle morphogenesis. NOTCH1, HAND1, ZFPM2, and GATA3 were predicted as targets of hsa-let-7e-5p, hsa-miR-222-3p and hsa-miR-433. We report for the first time the circulating miRNA profile for patients with VSD and show that 7 miRNAs were downregulated and 1 upregulated when compared with VSD-free controls. Analysis revealed that target genes involved in cardiac development were probably regulated by these miRNAs.

  16. A framework for analyzing the relationship between gene expression and morphological, topological, and dynamical patterns in neuronal networks.

    PubMed

    de Arruda, Henrique Ferraz; Comin, Cesar Henrique; Miazaki, Mauro; Viana, Matheus Palhares; Costa, Luciano da Fontoura

    2015-04-30

    A key point in developmental biology is to understand how gene expression influences the morphological and dynamical patterns that are observed in living beings. In this work we propose a methodology capable of addressing this problem that is based on estimating the mutual information and Pearson correlation between the intensity of gene expression and measurements of several morphological properties of the cells. A similar approach is applied in order to identify effects of gene expression over the system dynamics. Neuronal networks were artificially grown over a lattice by considering a reference model used to generate artificial neurons. The input parameters of the artificial neurons were determined according to two distinct patterns of gene expression and the dynamical response was assessed by considering the integrate-and-fire model. As far as single gene dependence is concerned, we found that the interaction between the gene expression and the network topology, as well as between the former and the dynamics response, is strongly affected by the gene expression pattern. In addition, we observed a high correlation between the gene expression and some topological measurements of the neuronal network for particular patterns of gene expression. To our best understanding, there are no similar analyses to compare with. A proper understanding of gene expression influence requires jointly studying the morphology, topology, and dynamics of neurons. The proposed framework represents a first step towards predicting gene expression patterns from morphology and connectivity. Copyright © 2015. Published by Elsevier B.V.
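
    The two association measures named in this abstract, Pearson correlation and mutual information, can be sketched for a pair of measurement vectors as below. This is only an illustration under stated assumptions: the equal-width histogram estimator, the bin count, and the toy vectors are choices made for the example, and the abstract does not say how the authors estimated mutual information.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def discretize(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant input: fall back to one bin
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Histogram (plug-in) estimate of mutual information, in bits."""
    bx, by = discretize(x, bins), discretize(y, bins)
    n = len(x)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    return sum(
        (c / n) * math.log2((c / n) / ((px[i] / n) * (py[j] / n)))
        for (i, j), c in pxy.items()
    )


# Hypothetical data: expression intensity and one morphological measurement.
expression = [0.1, 0.4, 0.5, 0.9, 1.3, 1.8, 2.0, 2.6]
branch_length = [2 * e + 1 for e in expression]
print(pearson(expression, branch_length))
print(mutual_information(expression, branch_length))
```

    Pearson correlation captures only linear dependence, whereas mutual information also responds to non-linear relationships, which is one plausible reason to report both. The plug-in estimate is sensitive to the bin count and is biased for small samples.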

  17. Meta-analysis of Gene Expression in the Mouse Liver Reveals Biomarkers Associated with Inflammation Increased Early During Aging

    EPA Science Inventory

    Aging is associated with a predictable loss of cellular homeostasis, a decline in physiological function and an increase in various diseases. We hypothesized that similar age-related gene expression profiles would be observed in mice across independent studies. Employing a metaan...

  18. Using Gene Expression Biomarkers to Identify Chemicals that Induce Key Events in Cancer and Endocrine Disruption AOPs: Androgen Receptor as an Example

    EPA Science Inventory

    High-throughput transcriptomic (HTTr) technologies are increasingly being used to screen environmental chemicals in vitro to provide mechanistic context for regulatory testing. The development of gene expression biomarkers that accurately predict molecular and toxicological effec...

  19. A gene expression biomarker accurately predicts estrogen receptor α modulation in a human gene expression compendium

    EPA Science Inventory

    The EPA’s vision for the Endocrine Disruptor Screening Program (EDSP) in the 21st Century (EDSP21) includes utilization of high-throughput screening (HTS) assays coupled with computational modeling to prioritize chemicals with the goal of eventually replacing current Tier 1...

  20. Meta-analysis of gene expression profiles associated with histological classification and survival in 829 ovarian cancer samples.

    PubMed

    Fekete, Tibor; Rásó, Erzsébet; Pete, Imre; Tegze, Bálint; Liko, István; Munkácsy, Gyöngyi; Sipos, Norbert; Rigó, János; Györffy, Balázs

    2012-07-01

    Transcriptomic analysis of global gene expression in ovarian carcinoma can identify dysregulated genes capable of serving as molecular markers for histology subtypes and survival. The aim of our study was to validate previous candidate signatures in an independent setting and to identify single genes capable of serving as biomarkers for ovarian cancer progression. As several datasets are available in the GEO today, we were able to perform a true meta-analysis. First, 829 samples (11 datasets) were downloaded, and the predictive power of 16 previously published gene sets was assessed. Of these, eight were capable of discriminating histology subtypes, and none was capable of predicting survival. To overcome the differences in previous studies, we used the 829 samples to identify new predictors. Then, we collected 64 ovarian cancer samples (median relapse-free survival 24.5 months) and performed TaqMan Real Time Polymerase Chain Reaction (RT-PCR) analysis for the best 40 genes associated with histology subtypes and survival. Over 90% of subtype-associated genes were confirmed. Overall survival was effectively predicted by hormone receptors (PGR and ESR2) and by TSPAN8. Relapse-free survival was predicted by MAPT and SNCG. In summary, we successfully validated several gene sets in a meta-analysis in large datasets of ovarian samples. Additionally, several individual genes identified were validated in a clinical cohort. Copyright © 2011 UICC.

  1. DiffSLC: A graph centrality method to detect essential proteins of a protein-protein interaction network.

    PubMed

    Mistry, Divya; Wise, Roger P; Dickerson, Julie A

    2017-01-01

    Identification of central genes and proteins in biomolecular networks provides credible candidates for pathway analysis, functional analysis, and essentiality prediction. The DiffSLC centrality measure predicts central and essential genes and proteins using a protein-protein interaction network. Network centrality measures prioritize nodes and edges based on their importance to the network topology. These measures helped identify critical genes and proteins in biomolecular networks. The proposed centrality measure, DiffSLC, combines the number of interactions of a protein and the gene coexpression values of genes from which those proteins were translated, as a weighting factor to bias the identification of essential proteins in a protein interaction network. Potentially essential proteins with low node degree are promoted through eigenvector centrality. Thus, the gene coexpression values are used in conjunction with the eigenvector of the network's adjacency matrix and edge clustering coefficient to improve essentiality prediction. The outcome of this prediction is shown using three variations: (1) inclusion or exclusion of gene co-expression data, (2) impact of different coexpression measures, and (3) impact of different gene expression data sets. For a total of seven networks, DiffSLC is compared to other centrality measures using Saccharomyces cerevisiae protein interaction networks and gene expression data. Comparisons are also performed for the top ranked proteins against the known essential genes from the Saccharomyces Gene Deletion Project, which show that DiffSLC detects more essential proteins and has a higher area under the ROC curve than other compared methods. This makes DiffSLC a stronger alternative to other centrality methods for detecting essential genes using a protein-protein interaction network that obeys centrality-lethality principle. DiffSLC is implemented using the igraph package in R, and networkx package in Python. 
    The Python package can be obtained from git.io/diffslcpy. The R implementation and code to reproduce the analysis are available via git.io/diffslc.
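
    The general idea of blending a coexpression-weighted degree with eigenvector centrality can be sketched as below. This is not the published DiffSLC formula: the mixing parameter `beta`, the normalization, the omission of the edge clustering coefficient, and the toy network with proteins A to D are all assumptions made for illustration. The sketch uses only the standard library rather than the igraph or networkx packages named in the abstract.

```python
def eigenvector_centrality(adj, iterations=200, tol=1e-10):
    """Power iteration on an undirected graph given as {node: set(neighbors)}."""
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # Iterate with (I + A) rather than A so that the iteration also
        # converges on bipartite graphs; the leading eigenvector is the same.
        new = {v: score[v] + sum(score[u] for u in adj[v]) for v in adj}
        norm = max(new.values())
        new = {v: s / norm for v, s in new.items()}
        if max(abs(new[v] - score[v]) for v in adj) < tol:
            return new
        score = new
    return score


def weighted_degree(adj, coexpr):
    """Sum of absolute coexpression values over each node's edges."""
    return {v: sum(abs(coexpr[frozenset((v, u))]) for u in adj[v]) for v in adj}


def combined_score(adj, coexpr, beta=0.5):
    """Hypothetical blend of normalized weighted degree and eigenvector centrality."""
    wd = weighted_degree(adj, coexpr)
    top = max(wd.values()) or 1.0
    ec = eigenvector_centrality(adj)
    return {v: beta * wd[v] / top + (1 - beta) * ec[v] for v in adj}


# Toy interaction network with invented coexpression values per edge.
adj = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
coexpr = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.8,
    frozenset(("A", "D")): 0.2,
    frozenset(("B", "C")): 0.5,
}
scores = combined_score(adj, coexpr)
print(sorted(scores, key=scores.get, reverse=True))
```

    In this toy network the hub A ranks first and the weakly coexpressed leaf D ranks last. Moving `beta` toward 0 shifts the ranking toward pure network topology.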

  2. Knowledge-driven genomic interactions: an application in ovarian cancer.

    PubMed

    Kim, Dokyoon; Li, Ruowang; Dudek, Scott M; Frase, Alex T; Pendergrass, Sarah A; Ritchie, Marylyn D

    2014-01-01

    Effective cancer clinical outcome prediction for understanding of the mechanism of various types of cancer has been pursued using molecular-based data such as gene expression profiles, an approach that has promise for providing better diagnostics and supporting further therapies. However, clinical outcome prediction based on gene expression profiles varies between independent data sets. Further, single-gene expression outcome prediction is limited for cancer evaluation since genes do not act in isolation, but rather interact with other genes in complex signaling or regulatory networks. In addition, since pathways are more likely to co-operate together, it would be desirable to incorporate expert knowledge to combine pathways in a useful and informative manner. Thus, we propose a novel approach for identifying knowledge-driven genomic interactions and applying it to discover models associated with cancer clinical phenotypes using grammatical evolution neural networks (GENN). In order to demonstrate the utility of the proposed approach, an ovarian cancer data from the Cancer Genome Atlas (TCGA) was used for predicting clinical stage as a pilot project. We identified knowledge-driven genomic interactions associated with cancer stage from single knowledge bases such as sources of pathway-pathway interaction, but also knowledge-driven genomic interactions across different sets of knowledge bases such as pathway-protein family interactions by integrating different types of information. Notably, an integration model from different sources of biological knowledge achieved 78.82% balanced accuracy and outperformed the top models with gene expression or single knowledge-based data types alone. Furthermore, the results from the models are more interpretable because they are framed in the context of specific biological pathways or other expert knowledge. 
    The success of the pilot study we have presented herein will allow us to pursue further identification of models predictive of clinical cancer survival and recurrence. Understanding the underlying tumorigenesis and progression in ovarian cancer through the global view of interactions within/between different biological knowledge sources has the potential for providing more effective screening strategies and therapeutic targets for many types of cancer.
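
    The balanced accuracy reported in this abstract is, in its standard definition, the mean of per-class recall. The sketch below assumes that definition; the abstract does not spell out how the authors computed it, and the stage labels and predictions are invented for the example.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (standard definition of balanced accuracy)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += (t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical imbalanced stage labels: 8 "early" and 2 "late" samples.
y_true = ["early"] * 8 + ["late"] * 2
y_pred = ["early"] * 10  # a classifier that always predicts the majority class

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain)                              # plain accuracy
print(balanced_accuracy(y_true, y_pred))  # balanced accuracy
```

    On these toy labels plain accuracy is 0.8 while balanced accuracy is 0.5, which shows why the balanced measure is preferred when classes such as clinical stages are unevenly represented.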

  3. Genome wide predictions of miRNA regulation by transcription factors.

    PubMed

    Ruffalo, Matthew; Bar-Joseph, Ziv

    2016-09-01

    Reconstructing regulatory networks from expression and interaction data is a major goal of systems biology. While much work has focused on trying to experimentally and computationally determine the set of transcription-factors (TFs) and microRNAs (miRNAs) that regulate genes in these networks, relatively little work has focused on inferring the regulation of miRNAs by TFs. Such regulation can play an important role in several biological processes including development and disease. The main challenge for predicting such interactions is the very small positive training set currently available. Another challenge is the fact that a large fraction of miRNAs are encoded within genes, making it hard to determine the specific way in which they are regulated. To enable genome wide predictions of TF-miRNA interactions, we extended semi-supervised machine-learning approaches to integrate a large set of different types of data including sequence, expression, ChIP-seq and epigenetic data. As we show, the methods we develop achieve good performance both on a labeled test set and when analyzing general co-expression networks. We next analyze mRNA and miRNA cancer expression data, demonstrating the advantage of using the predicted set of interactions for identifying more coherent and relevant modules, genes, and miRNAs. The complete set of predictions is available on the supporting website and can be used by any method that combines miRNAs, genes, and TFs. Code and the full set of predictions are available from the supporting website: http://cs.cmu.edu/~mruffalo/tf-mirna/. Contact: zivbj@cs.cmu.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  4. Bacillus anthracis genome organization in light of whole transcriptome sequencing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Martin, Jeffrey; Zhu, Wenhan; Passalacqua, Karla D.

    2010-03-22

    Emerging knowledge of whole prokaryotic transcriptomes could validate a number of theoretical concepts introduced in the early days of genomics. What are the rules connecting gene expression levels with sequence determinants such as quantitative scores of promoters and terminators? Are translation efficiency measures, e.g. codon adaptation index and RBS score related to gene expression? We used the whole transcriptome shotgun sequencing of a bacterial pathogen Bacillus anthracis to assess correlation of gene expression level with promoter, terminator and RBS scores, codon adaptation index, as well as with a new measure of gene translational efficiency, average translation speed. We compared computational predictions of operon topologies with the transcript borders inferred from RNA-Seq reads. Transcriptome mapping may also improve existing gene annotation. Upon assessment of accuracy of current annotation of protein-coding genes in the B. anthracis genome we have shown that the transcriptome data indicate existence of more than a hundred genes missing in the annotation though predicted by an ab initio gene finder. Interestingly, we observed that many pseudogenes possess not only a sequence with detectable coding potential but also promoters that maintain transcriptional activity.

  5. Presymptomatic Diagnosis of Celiac Disease in Predisposed Children: The Role of Gene Expression Profile.

    PubMed

    Galatola, Martina; Cielo, Donatella; Panico, Camilla; Stellato, Pio; Malamisura, Basilio; Carbone, Lorenzo; Gianfrani, Carmen; Troncone, Riccardo; Greco, Luigi; Auricchio, Renata

    2017-09-01

    The prevalence of celiac disease (CD) has increased significantly in recent years, and risk prediction and early diagnosis have become imperative, especially in at-risk families. In a previous study, we identified individuals with CD based on the expression profile of a set of candidate genes in peripheral blood monocytes. Here we evaluated the expression of a panel of CD candidate genes in peripheral blood mononuclear cells from at-risk infants long before any symptoms or production of antibodies. We analyzed the expression of a set of 9 candidate genes associated with CD in 22 human leukocyte antigen-predisposed children from families at risk for CD, studied from birth to 6 years of age. Nine of them developed CD (patients) and 13 did not (controls). We analyzed gene expression at 3 different time points (age matched in the 2 groups): 4-19 months before diagnosis, at the time of CD diagnosis, and after at least 1 year of a gluten-free diet. At similar age points, controls were also evaluated. Three genes (KIAA, TAGAP [T-cell Activation GTPase Activating Protein], and SH2B3 [SH2B Adaptor Protein 3]) were overexpressed in patients, compared with controls, at least 9 months before CD diagnosis. In a stepwise discriminant analysis, 4 genes (RGS1 [Regulator of G-protein signaling 1], TAGAP, TNFSF14 [Tumor Necrosis Factor (Ligand) Superfamily member 14], and SH2B3) differentiated patients from controls before serum antibody production and clinical symptoms. The multivariate equation correctly classified CD from non-CD children in 95.5% of patients. The expression of a small set of candidate genes in peripheral blood mononuclear cells can predict CD at least 9 months before the appearance of any clinical and serological signs of the disease.

  6. Identification and consequences of miRNA-target interactions--beyond repression of gene expression.

    PubMed

    Hausser, Jean; Zavolan, Mihaela

    2014-09-01

    Comparative genomics analyses and high-throughput experimental studies indicate that a microRNA (miRNA) binds to hundreds of sites across the transcriptome. Although the knockout of components of the miRNA biogenesis pathway has profound phenotypic consequences, most predicted miRNA targets undergo small changes at the mRNA and protein levels when the expression of the miRNA is perturbed. Alternatively, miRNAs can establish thresholds in and increase the coherence of the expression of their target genes, as well as reduce the cell-to-cell variability in target gene expression. Here, we review the recent progress in identifying miRNA targets and the emerging paradigms of how miRNAs shape the dynamics of target gene expression.

  7. Interdependence of cell growth and gene expression: origins and consequences.

    PubMed

    Scott, Matthew; Gunderson, Carl W; Mateescu, Eduard M; Zhang, Zhongge; Hwa, Terence

    2010-11-19

    In bacteria, the rate of cell proliferation and the level of gene expression are intimately intertwined. Elucidating these relations is important both for understanding the physiological functions of endogenous genetic circuits and for designing robust synthetic systems. We describe a phenomenological study that reveals intrinsic constraints governing the allocation of resources toward protein synthesis and other aspects of cell growth. A theory incorporating these constraints can accurately predict how cell proliferation and gene expression affect one another, quantitatively accounting for the effect of translation-inhibiting antibiotics on gene expression and the effect of gratuitous protein expression on cell growth. The use of such empirical relations, analogous to phenomenological laws, may facilitate our understanding and manipulation of complex biological systems before underlying regulatory circuits are elucidated.

  8. Global miRNA expression profile reveals novel molecular players in aneurysmal subarachnoid haemorrhage.

    PubMed

    Lopes, Katia de Paiva; Vinasco-Sandoval, Tatiana; Vialle, Ricardo Assunção; Paschoal, Fernando Mendes; Bastos, Vanessa Albuquerque P Aviz; Bor-Seng-Shu, Edson; Teixeira, Manoel Jacobsen; Yamada, Elizabeth Sumi; Pinto, Pablo; Vidal, Amanda Ferreira; Ribeiro-Dos-Santos, Arthur; Moreira, Fabiano; Santos, Sidney; Paschoal, Eric Homero Albuquerque; Ribeiro-Dos-Santos, Ândrea

    2018-06-08

    The molecular mechanisms behind aneurysmal subarachnoid haemorrhage (aSAH) are still poorly understood. Expression patterns of miRNAs may help elucidate the post-transcriptional gene expression in aSAH. Here, we evaluate the global miRNA expression profile (miRnome) of patients with aSAH to identify potential biomarkers. We collected 33 peripheral blood samples (27 from patients with cerebral aneurysm, collected 7 to 10 days after the haemorrhage, when the risk of cerebral vasospasm usually peaks, and six from controls). Small RNA sequencing was then performed using an Illumina Next Generation Sequencing (NGS) platform. Differential expression analysis identified eight differentially expressed miRNAs, of which three were up-regulated and five down-regulated. miR-486-5p was the most abundantly expressed and is associated with poor neurological admission status. In silico miRNA gene target prediction showed 148 genes associated with at least two differentially expressed miRNAs. Among these were THBS1 and VEGFA, known to be related to thrombospondin and vascular endothelial growth factor. Moreover, the MYC gene was found to be regulated by four miRNAs, suggesting an important role in aneurysmal subarachnoid haemorrhage. Additionally, 15 novel miRNAs were predicted to be expressed only in aSAH, suggesting possible involvement in aneurysm pathogenesis. These findings may help the identification of novel biomarkers of clinical interest.

  9. Combining transcription factor binding affinities with open-chromatin data for accurate gene expression prediction

    PubMed Central

    Schmidt, Florian; Gasparoni, Nina; Gasparoni, Gilles; Gianmoena, Kathrin; Cadenas, Cristina; Polansky, Julia K.; Ebert, Peter; Nordström, Karl; Barann, Matthias; Sinha, Anupam; Fröhler, Sebastian; Xiong, Jieyi; Dehghani Amirabad, Azim; Behjati Ardakani, Fatemeh; Hutter, Barbara; Zipprich, Gideon; Felder, Bärbel; Eils, Jürgen; Brors, Benedikt; Chen, Wei; Hengstler, Jan G.; Hamann, Alf; Lengauer, Thomas; Rosenstiel, Philip; Walter, Jörn; Schulz, Marcel H.

    2017-01-01

    The binding and contribution of transcription factors (TF) to cell specific gene expression is often deduced from open-chromatin measurements to avoid costly TF ChIP-seq assays. Thus, it is important to develop computational methods for accurate TF binding prediction in open-chromatin regions (OCRs). Here, we report a novel segmentation-based method, TEPIC, to predict TF binding by combining sets of OCRs with position weight matrices. TEPIC can be applied to various open-chromatin data, e.g. DNaseI-seq and NOMe-seq. Additionally, Histone-Marks (HMs) can be used to identify candidate TF binding sites. TEPIC computes TF affinities and uses open-chromatin/HM signal intensity as quantitative measures of TF binding strength. Using machine learning, we find low affinity binding sites to improve our ability to explain gene expression variability compared to the standard presence/absence classification of binding sites. Further, we show that both footprints and peaks capture essential TF binding events and lead to a good prediction performance. In our application, gene-based scores computed by TEPIC with one open-chromatin assay nearly reach the quality of several TF ChIP-seq data sets. Finally, these scores correctly predict known transcriptional regulators as illustrated by the application to novel DNaseI-seq and NOMe-seq data for primary human hepatocytes and CD4+ T-cells, respectively. PMID:27899623

  10. Robust diagnosis of non-Hodgkin lymphoma phenotypes validated on gene expression data from different laboratories.

    PubMed

    Bhanot, Gyan; Alexe, Gabriela; Levine, Arnold J; Stolovitzky, Gustavo

    2005-01-01

    A major challenge in cancer diagnosis from microarray data is the need for robust, accurate classification models which are independent of the analysis techniques used and can combine data from different laboratories. We propose such a classification scheme, originally developed for phenotype identification from mass spectrometry data. The method uses a robust multivariate gene selection procedure and combines the results of several machine learning tools trained on raw and pattern data to produce an accurate meta-classifier. We illustrate and validate our method by applying it to gene expression datasets: the oligonucleotide HuGeneFL microarray dataset of Shipp et al. (www.genome.wi.mit.du/MPR/lymphoma) and the Hu95Av2 Affymetrix dataset (DallaFavera's laboratory, Columbia University). Our pattern-based meta-classification technique achieves higher predictive accuracies than each of the individual classifiers, is robust against data perturbations and provides subsets of related predictive genes. Our techniques predict that combinations of some genes in the p53 pathway are highly predictive of phenotype. In particular, we find that in 80% of DLBCL cases the mRNA level of at least one of the three genes p53, PLK1 and CDK2 is elevated, while in 80% of FL cases, the mRNA level of at most one of them is elevated.
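
    The "at least one" / "at most one" statement in the abstract above can be illustrated with a small counting helper. This is only a sketch: the gene names come from the abstract, but the per-gene cutoffs, the sample values and the function names are hypothetical, and the abstract does not say how "elevated" was defined.

```python
# Hypothetical sketch: gene names are from the abstract; cutoffs and data are invented.
GENES = ("p53", "PLK1", "CDK2")


def n_elevated(sample, cutoffs):
    """Count how many of the three genes exceed their (assumed) cutoff."""
    return sum(sample[g] > cutoffs[g] for g in GENES)


def fraction_matching(samples, cutoffs, predicate):
    """Fraction of samples whose elevated-gene count satisfies `predicate`."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(predicate(n_elevated(s, cutoffs)) for s in samples) / len(samples)


# Illustrative placeholder values only, not data from the study.
cutoffs = {"p53": 1.0, "PLK1": 1.0, "CDK2": 1.0}
dlbcl_like = [
    {"p53": 2.1, "PLK1": 0.4, "CDK2": 1.6},
    {"p53": 0.3, "PLK1": 0.2, "CDK2": 0.5},
]
at_least_one = fraction_matching(dlbcl_like, cutoffs, lambda n: n >= 1)
at_most_one = fraction_matching(dlbcl_like, cutoffs, lambda n: n <= 1)
```

    Note that a sample with exactly one elevated gene satisfies both conditions, so the two statements describe overlapping tendencies rather than a clean decision rule.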

  11. RNA expression of genes involved in cytarabine metabolism and transport predicts cytarabine response in acute myeloid leukemia.

    PubMed

    Abraham, Ajay; Varatharajan, Savitha; Karathedath, Sreeja; Philip, Chepsy; Lakshmi, Kavitha M; Jayavelu, Ashok Kumar; Mohanan, Ezhilpavai; Janet, Nancy Beryl; Srivastava, Vivi M; Shaji, Ramachandran V; Zhang, Wei; Abraham, Aby; Viswabandya, Auro; George, Biju; Chandy, Mammen; Srivastava, Alok; Mathews, Vikram; Balasubramanian, Poonkuzhali

    2015-07-01

    Variation in terms of outcome and toxic side effects of treatment exists among acute myeloid leukemia (AML) patients on chemotherapy with cytarabine (Ara-C) and daunorubicin (Dnr). Candidate Ara-C metabolizing gene expression in primary AML cells is proposed to account for this variation. Ex vivo Ara-C sensitivity was determined in primary AML samples using MTT assay. mRNA expression of candidate Ara-C metabolizing genes was evaluated by RQPCR analysis. Global gene expression profiling was carried out for identifying differentially expressed genes between ex vivo Ara-C sensitive and resistant samples. Wide interindividual variations in ex vivo Ara-C cytotoxicity were observed among samples from patients with AML, and samples were stratified into sensitive, intermediately sensitive and resistant, based on IC50 values obtained by MTT assay. RNA expression of deoxycytidine kinase (DCK), human equilibrative nucleoside transporter-1 (ENT1) and ribonucleotide reductase M1 (RRM1) was significantly higher and cytidine deaminase (CDA) was significantly lower in ex vivo Ara-C sensitive samples. Higher DCK and RRM1 expression in AML patients' blasts correlated with better DFS. Ara-C resistance index (RI), a mathematically derived quotient, was proposed based on candidate gene expression pattern. Ara-C ex vivo sensitive samples were found to have significantly lower RI compared with resistant samples as well as samples from patients presenting with relapse. Patients with low RI, supposedly highly sensitive to Ara-C, were found to have higher incidence of induction death (p = 0.002; RR: 4.35 [95% CI: 1.69-11.22]). Global gene expression profiling undertaken to find out additional contributors of Ara-C resistance identified many apoptosis as well as metabolic pathway genes to be differentially expressed between Ara-C resistant and sensitive samples. This study highlights the importance of evaluating expression of candidate Ara-C metabolizing genes in predicting ex vivo drug response as well as treatment outcome. RI could be a predictor of ex vivo Ara-C response irrespective of cytogenetic and molecular risk groups and a potential biomarker for AML treatment outcome and toxicity. Original submitted 22 December 2014; Revision submitted 9 April 2015.

  12. Predicting selective drug targets in cancer through metabolic networks

    PubMed Central

    Folger, Ori; Jerby, Livnat; Frezza, Christian; Gottlieb, Eyal; Ruppin, Eytan; Shlomi, Tomer

    2011-01-01

    The interest in studying metabolic alterations in cancer and their potential role as novel targets for therapy has been rejuvenated in recent years. Here, we report the development of the first genome-scale network model of cancer metabolism, validated by correctly identifying genes essential for cellular proliferation in cancer cell lines. The model predicts 52 cytostatic drug targets, of which 40% are targeted by known, approved or experimental anticancer drugs, and the rest are new. It further predicts combinations of synthetic lethal drug targets, whose synergy is validated using available drug efficacy and gene expression measurements across the NCI-60 cancer cell line collection. Finally, potential selective treatments for specific cancers that depend on cancer type-specific downregulation of gene expression and somatic mutations are compiled. PMID:21694718

  13. Investigation of gene expressions in differentiated cell derived bone marrow stem cells during bone morphogenetic protein-4 treatments with Fourier transform infrared spectroscopy

    NASA Astrophysics Data System (ADS)

    Zafari, Jaber; Jouni, Fatemeh Javani; Ahmadvand, Ali; Abdolmaleki, Parviz; Soodi, Malihe; Zendehdel, Rezvan

    2017-02-01

    A model was set up to predict the differentiation patterns based on the data extracted from FTIR spectroscopy. For this reason, bone marrow stem cells (BMSCs) were differentiated to primordial germ cells (PGCs). Changes in cellular macromolecules in the time of 0, 24, 48, 72, and 96 h of differentiation, as different steps of the differentiation procedure were investigated by using FTIR spectroscopy. Also, the expression of pluripotency (Oct-4, Nanog and c-Myc) and specific genes (Mvh, Stella and Fragilis) were investigated by real-time PCR. However, the expression of genes in five steps of differentiation was predicted by FTIR spectroscopy. FTIR spectra showed changes in the template of band intensities at different differentiation steps. There are increasing changes in the stepwise differentiation procedure for the ratio area of CH2, which is symmetric to CH2 asymmetric stretching. An ensemble of expert methods, including regression tree (RT), boosting algorithm (BA), and generalized regression neural network (GRNN), was the best method to predict the gene expression by FTIR spectroscopy. In conclusion, the model was able to distinguish the pattern of different steps from cell differentiation by using some useful features extracted from FTIR spectra.

  14. Acute Response of the Hippocampal Transcriptome Following Mild Traumatic Brain Injury After Controlled Cortical Impact in the Rat.

    PubMed

    Samal, Babru B; Waites, Cameron K; Almeida-Suhett, Camila; Li, Zheng; Marini, Ann M; Samal, Nihar R; Elkahloun, Abdel; Braga, Maria F M; Eiden, Lee E

    2015-10-01

    We have previously demonstrated that mild controlled cortical impact (mCCI) injury to rat cortex causes indirect, concussive injury to underlying hippocampus and other brain regions, providing a reproducible model for mild traumatic brain injury (mTBI) and its neurochemical, synaptic, and behavioral sequelae. Here, we extend a preliminary gene expression study of the hippocampus-specific events occurring after mCCI and identify 193 transcripts significantly upregulated, and 21 transcripts significantly downregulated, 24 h after mCCI. Fifty-three percent of genes altered by mCCI within 24 h of injury are predicted to be expressed only in the non-neuronal/glial cellular compartment, with only 13% predicted to be expressed only in neurons. The set of upregulated genes following mCCI was interrogated using Ingenuity Pathway Analysis (IPA) augmented with manual curation of the literature (190 transcripts accepted for analysis), revealing a core group of 15 first messengers, mostly inflammatory cytokines, predicted to account for >99% of the transcript upregulation occurring 24 h after mCCI. Convergent analysis of predicted transcription factors (TFs) regulating the mCCI target genes, carried out in IPA relative to the entire Affymetrix-curated transcriptome, revealed a high concordance with TFs regulated by the cohort of 15 cytokines/cytokine-like messengers independently accounting for upregulation of the mCCI transcript cohort. TFs predicted to regulate transcription of the 193-gene mCCI cohort also displayed a high degree of overlap with TFs predicted to regulate glia-, rather than neuron-specific genes in cortical tissue. We conclude that mCCI predominantly affects transcription of non-neuronal genes within the first 24 h after insult. 
This finding suggests that early non-neuronal events trigger later permanent neuronal changes after mTBI, and that early intervention after mTBI could potentially affect the neurochemical cascade leading to later reported synaptic and behavioral dysfunction.

  15. Predicting degree of benefit from adjuvant trastuzumab in NSABP trial B-31.

    PubMed

    Pogue-Geile, Katherine L; Kim, Chungyeul; Jeong, Jong-Hyeon; Tanaka, Noriko; Bandos, Hanna; Gavin, Patrick G; Fumagalli, Debora; Goldstein, Lynn C; Sneige, Nour; Burandt, Eike; Taniyama, Yusuke; Bohn, Olga L; Lee, Ahwon; Kim, Seung-Il; Reilly, Megan L; Remillard, Matthew Y; Blackmon, Nicole L; Kim, Seong-Rim; Horne, Zachary D; Rastogi, Priya; Fehrenbacher, Louis; Romond, Edward H; Swain, Sandra M; Mamounas, Eleftherios P; Wickerham, D Lawrence; Geyer, Charles E; Costantino, Joseph P; Wolmark, Norman; Paik, Soonmyung

    2013-12-04

    National Surgical Adjuvant Breast and Bowel Project (NSABP) trial B-31 suggested the efficacy of adjuvant trastuzumab, even in HER2-negative breast cancer. This finding prompted us to develop a predictive model for degree of benefit from trastuzumab using archived tumor blocks from B-31. Case subjects with tumor blocks were randomly divided into discovery (n = 588) and confirmation cohorts (n = 991). A predictive model was built from the discovery cohort through gene expression profiling of 462 genes with nCounter assay. A predefined cut point for the predictive model was tested in the confirmation cohort. Gene-by-treatment interaction was tested with Cox models, and correlations between variables were assessed with Spearman correlation. Principal component analysis was performed on the final set of selected genes. All statistical tests were two-sided. Eight predictive genes associated with HER2 (ERBB2, c17orf37, GRB7) or ER (ESR1, NAT1, GATA3, CA12, IGF1R) were selected for model building. Three-dimensional subset treatment effect pattern plot using two principal components of these genes was used to identify a subset with no benefit from trastuzumab, characterized by intermediate-level ERBB2 and high-level ESR1 mRNA expression. In the confirmation set, the predefined cut points for this model classified patients into three subsets with differential benefit from trastuzumab with hazard ratios of 1.58 (95% confidence interval [CI] = 0.67 to 3.69; P = .29; n = 100), 0.60 (95% CI = 0.41 to 0.89; P = .01; n = 449), and 0.28 (95% CI = 0.20 to 0.41; P < .001; n = 442; P(interaction) between the model and trastuzumab < .001). We developed a gene expression-based predictive model for degree of benefit from trastuzumab and demonstrated that HER2-negative tumors belong to the moderate benefit group, thus providing justification for testing trastuzumab in HER2-negative patients (NSABP B-47).

  16. Predicting dynamic metabolic demands in the photosynthetic eukaryote Chlorella vulgaris

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zuniga, Cristal; Levering, Jennifer; Antoniewicz, Maciek R.

    Phototrophic organisms exhibit a highly dynamic proteome, adapting their biomass composition in response to diurnal light/dark cycles and nutrient availability. We used experimentally determined biomass compositions over the course of growth to determine and constrain the biomass objective function (BOF) in a genome-scale metabolic model of Chlorella vulgaris UTEX 395 over time. Changes in the BOF, which encompasses all metabolites necessary to produce biomass, influence the state of the metabolic network thus directly affecting predictions. Simulations using dynamic BOFs predicted distinct proteome demands during heterotrophic or photoautotrophic growth. Model-driven analysis of extracellular nitrogen concentrations and predicted nitrogen uptake rates revealed an intracellular nitrogen pool, which contains 38% of the total nitrogen provided in the medium for photoautotrophic and 13% for heterotrophic growth. Agreement between flux and gene expression trends was determined by statistical comparison. Accordance between predicted flux trends and gene expression trends was found for 65% of multi-subunit enzymes and 75% of allosteric reactions. Reactions with the highest agreement between simulations and experimental data were associated with energy metabolism, terpenoid biosynthesis, fatty acids, nucleotides, and amino acids metabolism. Moreover, predicted flux distributions at each time point were compared with gene expression data to gain new insights into intracellular compartmentalization, specifically for transporters. A total of 103 genes related to internal transport reactions were identified and added to the updated model of C. vulgaris, iCZ946, thus increasing our knowledgebase by 10% for this model green alga.

  17. Predicting dynamic metabolic demands in the photosynthetic eukaryote Chlorella vulgaris

    DOE PAGES

    Zuniga, Cristal; Levering, Jennifer; Antoniewicz, Maciek R.; ...

    2017-09-26

    Phototrophic organisms exhibit a highly dynamic proteome, adapting their biomass composition in response to diurnal light/dark cycles and nutrient availability. We used experimentally determined biomass compositions over the course of growth to determine and constrain the biomass objective function (BOF) in a genome-scale metabolic model of Chlorella vulgaris UTEX 395 over time. Changes in the BOF, which encompasses all metabolites necessary to produce biomass, influence the state of the metabolic network thus directly affecting predictions. Simulations using dynamic BOFs predicted distinct proteome demands during heterotrophic or photoautotrophic growth. Model-driven analysis of extracellular nitrogen concentrations and predicted nitrogen uptake rates revealed an intracellular nitrogen pool, which contains 38% of the total nitrogen provided in the medium for photoautotrophic and 13% for heterotrophic growth. Agreement between flux and gene expression trends was determined by statistical comparison. Accordance between predicted flux trends and gene expression trends was found for 65% of multi-subunit enzymes and 75% of allosteric reactions. Reactions with the highest agreement between simulations and experimental data were associated with energy metabolism, terpenoid biosynthesis, fatty acids, nucleotides, and amino acids metabolism. Moreover, predicted flux distributions at each time point were compared with gene expression data to gain new insights into intracellular compartmentalization, specifically for transporters. A total of 103 genes related to internal transport reactions were identified and added to the updated model of C. vulgaris, iCZ946, thus increasing our knowledgebase by 10% for this model green alga.

  18. Expression profiles of urbilaterian genes uniquely shared between honey bee and vertebrates

    PubMed Central

    Matsui, Toshiaki; Yamamoto, Toshiyuki; Wyder, Stefan; Zdobnov, Evgeny M; Kadowaki, Tatsuhiko

    2009-01-01

    Background: Large-scale comparison of metazoan genomes has revealed that a significant fraction of genes of the last common ancestor of Bilateria (Urbilateria) is lost in each animal lineage. This event could be one of the underlying mechanisms involved in generating metazoan diversity. However, the present functions of these ancient genes have not been addressed extensively. To understand the functions and evolutionary mechanisms of such ancient Urbilaterian genes, we carried out comprehensive expression profile analysis of genes shared between vertebrates and honey bees but not with the other sequenced ecdysozoan genomes (honey bee-vertebrate specific, HVS genes) as a model. Results: We identified 30 honey bee and 55 mouse HVS genes. Many HVS genes exhibited tissue-selective expression patterns; intriguingly, the expression of 60% of honey bee HVS genes was found to be brain enriched, and 24% of mouse HVS genes were highly expressed in either or both the brain and testis. Moreover, a minimum of 38% of mouse HVS genes demonstrated neuron-enriched expression patterns, and 62% of them exhibited expression in selective brain areas, particularly the forebrain and cerebellum. Furthermore, gene ontology (GO) analysis of HVS genes predicted that 35% of genes are associated with DNA transcription and RNA processing. Conclusion: These results suggest that HVS genes include genes that are biased towards expression in the brain and gonads. They also demonstrate that at least some of Urbilaterian genes retained in the specific animal lineage may be selectively maintained to support the species-specific phenotypes. PMID:19138430

  19. Expression profiles of urbilaterian genes uniquely shared between honey bee and vertebrates.

    PubMed

    Matsui, Toshiaki; Yamamoto, Toshiyuki; Wyder, Stefan; Zdobnov, Evgeny M; Kadowaki, Tatsuhiko

    2009-01-12

    Large-scale comparison of metazoan genomes has revealed that a significant fraction of genes of the last common ancestor of Bilateria (Urbilateria) is lost in each animal lineage. This event could be one of the underlying mechanisms involved in generating metazoan diversity. However, the present functions of these ancient genes have not been addressed extensively. To understand the functions and evolutionary mechanisms of such ancient Urbilaterian genes, we carried out comprehensive expression profile analysis of genes shared between vertebrates and honey bees but not with the other sequenced ecdysozoan genomes (honey bee-vertebrate specific, HVS genes) as a model. We identified 30 honey bee and 55 mouse HVS genes. Many HVS genes exhibited tissue-selective expression patterns; intriguingly, the expression of 60% of honey bee HVS genes was found to be brain enriched, and 24% of mouse HVS genes were highly expressed in either or both the brain and testis. Moreover, a minimum of 38% of mouse HVS genes demonstrated neuron-enriched expression patterns, and 62% of them exhibited expression in selective brain areas, particularly the forebrain and cerebellum. Furthermore, gene ontology (GO) analysis of HVS genes predicted that 35% of genes are associated with DNA transcription and RNA processing. These results suggest that HVS genes include genes that are biased towards expression in the brain and gonads. They also demonstrate that at least some of Urbilaterian genes retained in the specific animal lineage may be selectively maintained to support the species-specific phenotypes.

  20. Transcriptomic Analysis of Leaf in Tree Peony Reveals Differentially Expressed Pigments Genes.

    PubMed

    Luo, Jianrang; Shi, Qianqian; Niu, Lixin; Zhang, Yanlong

    2017-02-20

    Tree peony (Paeonia suffruticosa Andrews) is an important traditional flower in China. Besides its beautiful flower, the leaf of tree peony also has good ornamental value owing to its color change in spring. So far, the molecular mechanism of leaf color change in tree peony is unclear. In this study, the pigment level and transcriptome of three different color stages of tree peony leaf were analyzed. The purplish red leaf was rich in anthocyanin, while the yellowish green leaf was rich in chlorophyll and carotenoid. Transcriptome analysis revealed that 4302 differentially expressed genes (DEGs) were upregulated, and 4225 were downregulated in the purplish red leaf vs. yellowish green leaf. Among these DEGs, eight genes were predicted to participate in anthocyanin biosynthesis, eight genes were predicted to be involved in porphyrin and chlorophyll metabolism, and 10 genes were predicted to participate in carotenoid metabolism. In addition, 27 MYBs, 20 bHLHs, and 36 WD40 genes were also identified from DEGs. Anthocyanidin synthase (ANS) is the key gene that controls the anthocyanin level in tree peony leaf. Protochlorophyllide oxido-reductase (POR) is the key gene that regulates the chlorophyll content in tree peony leaf.

  1. Sample entropy analysis of cervical neoplasia gene-expression signatures

    PubMed Central

    Botting, Shaleen K; Trzeciakowski, Jerome P; Benoit, Michelle F; Salama, Salama A; Diaz-Arrastia, Concepcion R

    2009-01-01

    Background: We introduce Approximate Entropy as a mathematical method of analysis for microarray data. Approximate entropy is applied here as a method to classify the complex gene expression patterns resulting from a clinical sample set. Since Entropy is a measure of disorder in a system, we believe that by choosing genes which display minimum entropy in normal controls and maximum entropy in the cancerous sample set we will be able to distinguish those genes which display the greatest variability in the cancerous set. Here we describe a method of utilizing Approximate Sample Entropy (ApSE) analysis to identify genes of interest with the highest probability of producing an accurate, predictive classification model from our data set. Results: In the development of a diagnostic gene-expression profile for cervical intraepithelial neoplasia (CIN) and squamous cell carcinoma of the cervix, we identified 208 genes which are unchanging in all normal tissue samples, yet exhibit a random pattern indicative of the genetic instability and heterogeneity of malignant cells. This may be measured in terms of the ApSE when compared to normal tissue. We have validated 10 of these genes on 10 normal and 20 cancer and CIN3 samples. We report that the predictive value of the sample entropy calculation for these 10 genes of interest is promising (75% sensitivity, 80% specificity for prediction of cervical cancer over CIN3). Conclusion: The success of the Approximate Sample Entropy approach in discerning alterations in complexity from a biological system with such a relatively small sample set, and in extracting biologically relevant genes of interest, holds great promise. PMID:19232110
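
    For readers unfamiliar with entropy measures of regularity, the following is a minimal sketch of the textbook sample entropy (SampEn) of a numeric series. It is not the authors' ApSE procedure, whose details the abstract does not give; the function name, the embedding length m and the absolute tolerance r are assumptions chosen for illustration.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """Textbook sample entropy of a sequence x (illustrative, not the authors' ApSE).

    m is the template length and r an absolute tolerance (both assumed defaults).
    Two templates match when their Chebyshev distance is at most r.
    Self-matches are excluded; the same n - m start positions are used
    for template lengths m and m + 1.
    """
    n = len(x)
    if n <= m + 1:
        raise ValueError("series too short for the chosen m")

    def count_matches(length):
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                    count += 1
        return count

    b = count_matches(m)      # matching template pairs of length m
    a = count_matches(m + 1)  # matching template pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined in the strict sense; reported as infinite
    return -math.log(a / b)


# A perfectly periodic series is fully predictable, so its sample entropy is zero.
regular = sample_entropy([0, 1] * 5, m=2, r=0.1)
```

    In a gene-selection setting like the one described, a low value across normal samples and a high value across tumor samples would flag a gene as a candidate, but the ranking and thresholds used in the study are not stated in the abstract.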

  2. Under the influence of the active deodorant ingredient 4-hydroxy-3-methoxybenzyl alcohol, the skin bacterium Corynebacterium jeikeium moderately responds with differential gene expression.

    PubMed

    Brune, Iris; Becker, Anke; Paarmann, Daniel; Albersmeier, Andreas; Kalinowski, Jörn; Pühler, Alfred; Tauch, Andreas

    2006-12-15

    A 70mer oligonucleotide microarray was constructed to analyze genome-wide expression profiles of Corynebacterium jeikeium, a skin bacterium that is predominantly present in the human axilla and involved in axillary odor formation. Oligonucleotides representing 100% of the predicted coding regions of the C. jeikeium K411 genome were designed and spotted in quadruplicate onto epoxy-coated glass slides. The quality of the printed microarray was demonstrated by co-hybridization with fluorescently labeled cDNA probes obtained from exponentially growing C. jeikeium cultures. Accordingly, genes detected with different intensities resulting in log2-transformed ratios greater than 0.8 or smaller than -0.8 can be regarded as differentially expressed with a confidence level greater than 99%. In an application example, we measured global changes of gene expression during growth of C. jeikeium in the presence of different concentrations of the deodorant component 4-hydroxy-3-methoxybenzyl alcohol that is active in preventing body odor formation. Global expression profiling revealed that low concentrations of 4-hydroxy-3-methoxybenzyl alcohol (0.5 and 2.5 mg/ml) had almost no detectable effect on the transcriptome of C. jeikeium. A slightly higher concentration of 4-hydroxy-3-methoxybenzyl alcohol (5 mg/ml) resulted in differential expression of 95 genes, 86 of which showed an enhanced expression when compared to a control culture. Besides many genes encoding proteins that apparently participate in transcription and translation, the drug resistance determinant cmx and the predicted virulence factors sapA and sapD showed significantly enhanced expression levels. Differential expression of relevant genes was validated by real-time reverse transcription PCR assays.
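
    The log2-ratio cutoff of 0.8 quoted in the abstract above translates into a simple calling rule, sketched here. The function name and the example intensities are hypothetical; only the cutoff value is taken from the abstract, and normalization and replicate handling are left out.

```python
import math


def call_expression(treated, control, cutoff=0.8):
    """Label a gene from two positive signal intensities using a log2-ratio cutoff.

    The default cutoff of 0.8 is the value quoted in the abstract; everything
    else here (names, labels, input form) is an illustrative assumption.
    """
    if treated <= 0 or control <= 0:
        raise ValueError("intensities must be positive")
    m = math.log2(treated / control)
    if m > cutoff:
        return "up"
    if m < -cutoff:
        return "down"
    return "unchanged"


# Placeholder intensities, not measurements from the study.
label = call_expression(treated=2.0, control=1.0)
```

    A cutoff of 0.8 on the log2 scale corresponds to roughly a 1.74-fold change in either direction.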

  3. A distinct adipose tissue gene expression response to caloric restriction predicts 6-mo weight maintenance in obese subjects.

    PubMed

    Mutch, David M; Pers, Tune H; Temanni, M Ramzi; Pelloux, Veronique; Marquez-Quiñones, Adriana; Holst, Claus; Martinez, J Alfredo; Babalis, Dimitris; van Baak, Marleen A; Handjieva-Darlenska, Teodora; Walker, Celia G; Astrup, Arne; Saris, Wim H M; Langin, Dominique; Viguerie, Nathalie; Zucker, Jean-Daniel; Clément, Karine

    2011-12-01

    Weight loss has been shown to reduce risk factors associated with cardiovascular disease and diabetes; however, successful maintenance of weight loss continues to pose a challenge. The present study was designed to assess whether changes in subcutaneous adipose tissue (scAT) gene expression during a low-calorie diet (LCD) could be used to differentiate and predict subjects who experience successful short-term weight maintenance from subjects who experience weight regain. Forty white women followed a dietary protocol consisting of an 8-wk LCD phase followed by a 6-mo weight-maintenance phase. Participants were classified as weight maintainers (WMs; 0-10% weight regain) and weight regainers (WRs; 50-100% weight regain) by considering changes in body weight during the 2 phases. Anthropometric measurements, bioclinical variables, and scAT gene expression were studied in all individuals before and after the LCD. Energy intake was estimated by using 3-d dietary records. No differences in body weight and fasting insulin were observed between WMs and WRs at baseline or after the LCD period. The LCD resulted in significant decreases in body weight and in several plasma variables in both groups. WMs experienced a significant reduction in insulin secretion in response to an oral-glucose-tolerance test after the LCD; in contrast, no changes in insulin secretion were observed in WRs after the LCD. An ANOVA of scAT gene expression showed that genes regulating fatty acid metabolism, citric acid cycle, oxidative phosphorylation, and apoptosis were regulated differently by the LCD in WM and WR subjects. This study suggests that LCD-induced changes in insulin secretion and scAT gene expression may have the potential to predict successful short-term weight maintenance. This trial was registered at clinicaltrials.gov as NCT00390637.
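
    The grouping into weight maintainers and weight regainers described above can be sketched as follows. This assumes that "weight regain" means the share of the weight lost during the LCD that was regained by the end of the maintenance phase; the abstract does not spell out the formula, and the function names and example weights are hypothetical.

```python
def percent_regain(baseline_kg, post_lcd_kg, end_kg):
    """Percent of LCD-induced weight loss regained by the end of maintenance.

    Assumed definition (not stated in the abstract):
    100 * (end - post_lcd) / (baseline - post_lcd).
    """
    lost = baseline_kg - post_lcd_kg
    if lost <= 0:
        raise ValueError("no weight was lost during the LCD phase")
    return 100.0 * (end_kg - post_lcd_kg) / lost


def classify_subject(baseline_kg, post_lcd_kg, end_kg):
    """Return 'WM' (0-10% regain), 'WR' (50-100% regain), or None if in neither band."""
    regain = percent_regain(baseline_kg, post_lcd_kg, end_kg)
    if 0 <= regain <= 10:
        return "WM"
    if 50 <= regain <= 100:
        return "WR"
    return None


# Placeholder weights in kg, not data from the trial.
group = classify_subject(baseline_kg=100.0, post_lcd_kg=90.0, end_kg=90.5)
```

    Subjects whose regain falls between the two bands are left unlabeled here, which mirrors the gap between the 0-10% and 50-100% ranges in the abstract.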

  4. An Integrative Genetics Approach to Identify Candidate Genes Regulating BMD: Combining Linkage, Gene Expression, and Association

    PubMed Central

    Farber, Charles R; van Nas, Atila; Ghazalpour, Anatole; Aten, Jason E; Doss, Sudheer; Sos, Brandon; Schadt, Eric E; Ingram-Drake, Leslie; Davis, Richard C; Horvath, Steve; Smith, Desmond J; Drake, Thomas A; Lusis, Aldons J

    2009-01-01

    Numerous quantitative trait loci (QTLs) affecting bone traits have been identified in the mouse; however, few of the underlying genes have been discovered. To improve the process of transitioning from QTL to gene, we describe an integrative genetics approach, which combines linkage analysis, expression QTL (eQTL) mapping, causality modeling, and genetic association in outbred mice. In C57BL/6J × C3H/HeJ (BXH) F2 mice, nine QTLs regulating femoral BMD were identified. To select candidate genes from within each QTL region, microarray gene expression profiles from individual F2 mice were used to identify 148 genes whose expression was correlated with BMD and regulated by local eQTLs. Many of the genes that were the most highly correlated with BMD have been previously shown to modulate bone mass or skeletal development. Candidates were further prioritized by determining whether their expression was predicted to underlie variation in BMD. Using network edge orienting (NEO), a causality modeling algorithm, 18 of the 148 candidates were predicted to be causally related to differences in BMD. To fine-map QTLs, markers in outbred MF1 mice were tested for association with BMD. Three chromosome 11 SNPs were identified that were associated with BMD within the Bmd11 QTL. Finally, our approach provides strong support for Wnt9a, Rasd1, or both underlying Bmd11. Integration of multiple genetic and genomic data sets can substantially improve the efficiency of QTL fine-mapping and candidate gene identification. PMID:18767929

  5. Molecular marker genes for ectomycorrhizal symbiosis

    Treesearch

    Shiv Hiremath; Carolyn McQuattie; Gopi Podila; Jenise Bauman

    2013-01-01

    Mycorrhizal symbiosis is a mutually beneficial association very commonly found among most vascular plants. Formation of mycorrhiza happens only between compatible partners and predicting this is often accomplished through a trial and error process. We investigated the possibility of using expression of symbiosis specific genes as markers to predict the formation of...

  6. Development and validation of a gene profile predicting benefit of postmastectomy radiotherapy in patients with high-risk breast cancer: a study of gene expression in the DBCG82bc cohort.

    PubMed

    Tramm, Trine; Mohammed, Hayat; Myhre, Simen; Kyndi, Marianne; Alsner, Jan; Børresen-Dale, Anne-Lise; Sørlie, Therese; Frigessi, Arnoldo; Overgaard, Jens

    2014-10-15

    To identify genes predicting benefit of radiotherapy in patients with high-risk breast cancer treated with systemic therapy and randomized to receive or not receive postmastectomy radiotherapy (PMRT). The study was based on the Danish Breast Cancer Cooperative Group (DBCG82bc) cohort. Gene-expression analysis was performed in a training set of frozen tumor tissue from 191 patients. Genes were identified through the Lasso method with the endpoint being locoregional recurrence (LRR). A weighted gene-expression index (DBCG-RT profile) was calculated and transferred to quantitative real-time PCR (qRT-PCR) in corresponding formalin-fixed, paraffin-embedded (FFPE) samples, before validation in FFPE from 112 additional patients. Seven genes were identified, and the derived DBCG-RT profile divided the 191 patients into "high LRR risk" and "low LRR risk" groups. PMRT significantly reduced risk of LRR in "high LRR risk" patients, whereas "low LRR risk" patients showed no additional reduction in LRR rate. Technical transfer of the DBCG-RT profile to FFPE/qRT-PCR was successful, and the predictive impact was successfully validated in another 112 patients. A DBCG-RT gene profile was identified and validated, identifying patients with very low risk of LRR and no benefit from PMRT. The profile may provide a method to individualize treatment with PMRT. ©2014 American Association for Cancer Research.

  7. Murine Hyperglycemic Vasculopathy and Cardiomyopathy: Whole-Genome Gene Expression Analysis Predicts Cellular Targets and Regulatory Networks Influenced by Mannose Binding Lectin

    PubMed Central

    Zou, Chenhui; La Bonte, Laura R.; Pavlov, Vasile I.; Stahl, Gregory L.

    2012-01-01

    Hyperglycemia, in the absence of type 1 or 2 diabetes, is an independent risk factor for cardiovascular disease. We have previously demonstrated a central role for mannose binding lectin (MBL)-mediated cardiac dysfunction in acute hyperglycemic mice. In this study, we applied whole-genome microarray data analysis to investigate MBL’s role in systematic gene expression changes. The data predict possible intracellular events taking place in multiple cellular compartments such as enhanced insulin signaling pathway sensitivity, promoted mitochondrial respiratory function, improved cellular energy expenditure and protein quality control, improved cytoskeleton structure, and facilitated intracellular trafficking, all of which may contribute to the organismal health of MBL null mice against acute hyperglycemia. Our data show a tight association between gene expression profile and tissue function which might be a very useful tool in predicting cellular targets and regulatory networks connected with in vivo observations, providing clues for further mechanistic studies. PMID:22375142

  8. Lung tumor diagnosis and subtype discovery by gene expression profiling.

    PubMed

    Wang, Lu-yong; Tu, Zhuowen

    2006-01-01

    The optimal treatment of patients with complex diseases, such as cancers, depends on accurate diagnosis using a combination of clinical and histopathological data. In many scenarios, this becomes tremendously difficult because of the limitations in clinical presentation and histopathology. To accurately diagnose complex diseases, molecular classification based on gene or protein expression profiles is indispensable for modern medicine. Moreover, many heterogeneous diseases comprise various potential subtypes at the molecular level that differ remarkably in their response to therapies. It is critical to accurately predict subgroups from disease gene expression profiles. More fundamental knowledge of the molecular basis and classification of disease could aid in the prediction of patient outcome, the informed selection of therapies, and the identification of novel molecular targets for therapy. In this paper, we propose a new disease diagnostic method, the probabilistic boosting tree (PB tree) method, applied to gene expression profiles of lung tumors. It enables accurate disease classification and subtype discovery in disease. It automatically constructs a tree in which each node combines a number of weak classifiers into a strong classifier. Also, subtype discovery is naturally embedded in the learning process. Our algorithm achieves excellent diagnostic performance and is also capable of detecting disease subtypes based on gene expression profiles.

  9. Double-filter identification of vascular-expressed genes using Arabidopsis plants with vascular hypertrophy and hypotrophy.

    PubMed

    Ckurshumova, Wenzislava; Scarpella, Enrico; Goldstein, Rochelle S; Berleth, Thomas

    2011-08-01

    Genes expressed in vascular tissues have been identified by several strategies, usually with a focus on mature vascular cells. In this study, we explored the possibility of using two opposite types of altered tissue compositions in combination with a double-filter selection to identify genes with a high probability of vascular expression in early organ primordia. Specifically, we generated full-transcriptome microarray profiles of plants with (a) genetically strongly reduced and (b) pharmacologically vastly increased vascular tissues and identified a reproducible cohort of 158 transcripts that fulfilled the dual requirement of being underrepresented in (a) and overrepresented in (b). In order to assess the predictive value of our identification scheme for vascular gene expression, we determined the expression patterns of genes in two unbiased subsamples. First, we assessed the expression patterns of all twenty annotated transcription factor genes from the cohort of 158 genes and found that seventeen of the twenty genes were preferentially expressed in leaf vascular cells. Remarkably, fifteen of these seventeen vascular genes were clearly expressed already very early in leaf vein development. Twelve genes with published leaf expression patterns served as a second subsample to monitor the representation of vascular genes in our cohort. Of those twelve genes, eleven were preferentially expressed in leaf vascular tissues. Based on these results we propose that our compendium of 158 genes represents a sample that is highly enriched for genes expressed in vascular tissues and that our approach is particularly suited to detect genes expressed in vascular cell lineages at early stages of their inception. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
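
    The double-filter idea described above can be sketched as a simple intersection of two conditions. The log-ratio values, gene identifiers, and the threshold of 1.0 are hypothetical; the abstract does not state the cutoffs or statistics the authors applied.

```python
def double_filter(log_ratio_reduced, log_ratio_increased, threshold=1.0):
    """Keep genes that are down in the reduced-vascular sample (a)
    and up in the increased-vascular sample (b).

    Both inputs map gene id -> log2 ratio versus wild type (assumed layout).
    """
    return sorted(
        gene
        for gene, down in log_ratio_reduced.items()
        if gene in log_ratio_increased
        and down <= -threshold
        and log_ratio_increased[gene] >= threshold
    )


# Made-up log2 ratios for four placeholder genes.
reduced = {"g1": -2.1, "g2": -1.5, "g3": 0.2, "g4": -1.2}
increased = {"g1": 1.8, "g2": 0.1, "g3": 2.5, "g4": 1.0}

selected = double_filter(reduced, increased)
print(selected)
```

    Requiring both conditions removes genes that respond to only one of the two perturbations, which is the point of using opposite tissue compositions.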

  10. Application of a Fuzzy Neural Network Model in Predicting Polycyclic Aromatic Hydrocarbon-Mediated Perturbations of the Cyp1b1 Transcriptional Regulatory Network in Mouse Skin

    PubMed Central

    Larkin, Andrew; Siddens, Lisbeth K.; Krueger, Sharon K.; Tilton, Susan C.; Waters, Katrina M.; Williams, David E.; Baird, William M.

    2013-01-01

    Polycyclic aromatic hydrocarbons (PAHs) are present in the environment as complex mixtures with components that have diverse carcinogenic potencies and mostly unknown interactive effects. Non-additive PAH interactions have been observed in regulation of cytochrome P450 (CYP) gene expression in the CYP1 family. To better understand and predict biological effects of complex mixtures, such as environmental PAHs, an 11 gene input-1 gene output fuzzy neural network (FNN) was developed for predicting PAH-mediated perturbations of dermal Cyp1b1 transcription in mice. Input values were generalized using fuzzy logic into low, medium, and high fuzzy subsets, and sorted using k-means clustering to create Mamdani logic functions for predicting Cyp1b1 mRNA expression. Model testing was performed with data from microarray analysis of skin samples from FVB/N mice treated with toluene (vehicle control), dibenzo[def,p]chrysene (DBC), benzo[a]pyrene (BaP), or 1 of 3 combinations of diesel particulate extract (DPE), coal tar extract (CTE) and cigarette smoke condensate (CSC) using leave one out cross-validation. Predictions were within 1 log2 fold change unit of microarray data, with the exception of the DBC treatment group, where the unexpected down-regulation of Cyp1b1 expression was predicted but did not reach statistical significance on the microarrays. Adding CTE to DPE was predicted to increase Cyp1b1 expression, whereas adding CSC to CTE and DPE was predicted to have no effect, in agreement with microarray results. The aryl hydrocarbon receptor repressor (Ahrr) was determined to be the most significant input variable for model predictions using back-propagation and normalization of FNN weights. PMID:23274566

  11. A gene expression signature of RAS pathway dependence predicts response to PI3K and RAS pathway inhibitors and expands the population of RAS pathway activated tumors

    PubMed Central

    2010-01-01

    Background: Hyperactivation of the Ras signaling pathway is a driver of many cancers, and RAS pathway activation can predict response to targeted therapies. Therefore, optimal methods for measuring Ras pathway activation are critical. The main focus of our work was to develop a gene expression signature that is predictive of RAS pathway dependence. Methods: We used the coherent expression of RAS pathway-related genes across multiple datasets to derive a RAS pathway gene expression signature and generate RAS pathway activation scores in pre-clinical cancer models and human tumors. We then related this signature to KRAS mutation status and drug response data in pre-clinical and clinical datasets. Results: The RAS signature score is predictive of KRAS mutation status in lung tumors and cell lines with high (> 90%) sensitivity but relatively low (50%) specificity due to samples that have apparent RAS pathway activation in the absence of a KRAS mutation. In lung and breast cancer cell line panels, the RAS pathway signature score correlates with pMEK and pERK expression, and predicts resistance to AKT inhibition and sensitivity to MEK inhibition within both KRAS mutant and KRAS wild-type groups. The RAS pathway signature is upregulated in breast cancer cell lines that have acquired resistance to AKT inhibition, and is downregulated by inhibition of MEK. In lung cancer cell lines, knockdown of KRAS using siRNA demonstrates that the RAS pathway signature is a better measure of dependence on RAS compared to KRAS mutation status. In human tumors, the RAS pathway signature is elevated in ER negative breast tumors and lung adenocarcinomas, and predicts resistance to cetuximab in metastatic colorectal cancer. Conclusions: These data demonstrate that the RAS pathway signature is superior to KRAS mutation status for the prediction of dependence on RAS signaling, can predict response to PI3K and RAS pathway inhibitors, and is likely to have the most clinical utility in lung and breast tumors. PMID:20591134
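
    To make the sensitivity and specificity figures concrete, the sketch below computes both for a score thresholded against mutation status. The scores, labels, and the 0.5 cutoff are invented toy values chosen only to reproduce the qualitative pattern (high sensitivity, low specificity); they are not data from the study.

```python
def sensitivity_specificity(scores, is_mutant, cutoff):
    """Return (sensitivity, specificity) when score >= cutoff calls 'activated'."""
    tp = fn = tn = fp = 0
    for score, mutant in zip(scores, is_mutant):
        called = score >= cutoff
        if mutant and called:
            tp += 1
        elif mutant:
            fn += 1
        elif called:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: four KRAS-mutant and four wild-type samples (hypothetical).
scores = [0.9, 0.8, 0.7, 0.6, 0.7, 0.55, 0.2, 0.1]
is_mutant = [True, True, True, True, False, False, False, False]

sens, spec = sensitivity_specificity(scores, is_mutant, cutoff=0.5)
print(sens, spec)
```

    Wild-type samples with high scores count as false positives against mutation status, which is how apparent pathway activation without a KRAS mutation lowers specificity.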

  12. In silico analysis of miRNA-mediated gene regulation in OCA and OA genes.

    PubMed

    Kamaraj, Balu; Gopalakrishnan, Chandrasekhar; Purohit, Rituraj

    2014-12-01

    Albinism is an autosomal recessive genetic disorder due to low secretion of melanin. The oculocutaneous albinism (OCA) and ocular albinism (OA) genes are responsible for melanin production and also act as potential targets for miRNAs. The role of miRNA is to inhibit protein synthesis partially or completely by binding to the 3'UTR of the mRNA, thus regulating gene expression. In this analysis, we predicted the genetic variation in the 3'UTR of the transcript that could be a reason for low melanin production, thus causing albinism. Single nucleotide polymorphisms (SNPs) in the 3'UTR create new binding sites for miRNA, which binds to the mRNA and inhibits the translation process either partially or completely. The SNPs in the mRNA of OCA and OA genes can create new binding sites for miRNA, which may control gene expression and lead to hypopigmentation. We have developed a computational procedure to determine the SNPs in the 3'UTR region of the mRNA of OCA (TYR, OCA2, TYRP1 and SLC45A2) and OA (GPR143) genes that are a potential cause of albinism. We identified 37 SNPs in five genes that are predicted to create 87 new binding sites on mRNA, which may lead to abrogation of the translation process. Expression analysis confirms that these genes are highly expressed in skin and eye regions. Enrichment analysis supports that these genes are mainly involved in eye pigmentation and the melanin biosynthesis process. The network analysis also shows how the genes interact and are expressed in a complex network. This insight provides clues for wet-lab researchers to understand the expression pattern of OCA and OA genes and the binding of mRNA and miRNA upon mutation, which is responsible for inhibition of the translation process at genomic levels.

  13. Characteristics of genomic signatures derived using univariate methods and mechanistically anchored functional descriptors for predicting drug- and xenobiotic-induced nephrotoxicity.

    PubMed

    Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J

    2008-01-01

    The ideal toxicity biomarker is composed of the properties of prediction (is detected prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and mechanistic relationships to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses: t-statistics, fold-change, B-statistics, and RankProd, and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, Hotelling T-square test, and, finally out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps (sets of genes coordinately involved in key biological processes) with classification power. Differentially expressed genes identified by the different statistical univariate analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or Hotelling T-square had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity. Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of predictive genomic investigations.
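
    Two of the univariate analyses named above, the t-statistic and fold-change, can be sketched per probe as follows. This is an illustrative sketch only: the Welch form of the t-statistic, the assumption that values are already on a log2 scale, and the probe names and numbers are all assumptions, and B-statistics and RankProd are not shown.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two groups with possibly unequal variances."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def log2_fold_change(a, b):
    """Difference of group means, assuming the values are log2 intensities."""
    return mean(a) - mean(b)


# Made-up log2 intensities: probe -> (treated samples, control samples).
probes = {
    "probe1": ([8.0, 8.2, 7.8], [6.0, 6.1, 5.9]),
    "probe2": ([5.0, 5.5, 4.5], [5.1, 4.9, 5.0]),
}

stats = {
    name: (welch_t(treated, control), log2_fold_change(treated, control))
    for name, (treated, control) in probes.items()
}

# Rank probes by absolute t-statistic, largest first.
ranked = sorted(stats, key=lambda name: abs(stats[name][0]), reverse=True)
print(ranked)
```

    Each univariate method yields its own ranking, so combining or intersecting the top probes from several rankings is one way to obtain the gene sets the abstract refers to.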

  14. Systematic drug safety evaluation based on public genomic expression (Connectivity Map) data: Myocardial and infectious adverse reactions as application cases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Kejian; Weng, Zuquan; Sun, Liya

    Adverse drug reaction (ADR) is of great importance to both regulatory agencies and the pharmaceutical industry. Various techniques, such as quantitative structure–activity relationship (QSAR) and animal toxicology, are widely used to identify potential risks during the preclinical stage of drug development. Despite these efforts, drugs with safety liabilities can still pass through safety checkpoints and enter the market. This situation raises the concern that conventional chemical structure analysis and phenotypic screening are not sufficient to avoid all clinical adverse events. Genomic expression data following in vitro drug treatments characterize drug actions and thus have become widely used in drug repositioning. In the present study, we explored prediction of ADRs based on the drug-induced gene-expression profiles from cultured human cells in the Connectivity Map (CMap) database. The results showed that drugs inducing comparable ADRs generally lead to similar CMap expression profiles. Based on such ADR-gene expression association, we established prediction models for various ADRs, including severe myocardial and infectious events. Drugs with FDA boxed warnings of safety liability were effectively identified. We therefore suggest that drug-induced gene expression change, in combination with effective computational methods, may provide a new dimension of information to facilitate systematic drug safety evaluation. Highlights: • Drugs causing common toxicity lead to similar in vitro gene expression changes. • We built a model to predict drug toxicity with drug-specific expression profiles. • Drugs with FDA black box warnings were effectively identified by our model. • In vitro assay can detect severe toxicity in the early stage of drug development.

  15. Characteristics of allelic gene expression in human brain cells from single-cell RNA-seq data analysis.

    PubMed

    Zhao, Dejian; Lin, Mingyan; Pedrosa, Erika; Lachman, Herbert M; Zheng, Deyou

    2017-11-10

    Monoallelic expression of autosomal genes has been implicated in human psychiatric disorders. However, there is a paucity of allelic expression studies in human brain cells at the single cell and genome wide levels. In this report, we reanalyzed a previously published single-cell RNA-seq dataset from several postmortem human brains and observed pervasive monoallelic expression in individual cells, largely in a random manner. Examining single nucleotide variants with a predicted functional disruption, we found that the "damaged" alleles were overall expressed in fewer brain cells than their counterparts, and at a lower level in cells where their expression was detected. We also identified many brain cell type-specific monoallelically expressed genes. Interestingly, many of these cell type-specific monoallelically expressed genes were enriched for functions important for those brain cell types. In addition, function analysis showed that genes displaying monoallelic expression and correlated expression across neuronal cells from different individual brains were implicated in the regulation of synaptic function. Our findings suggest that monoallelic gene expression is prevalent in human brain cells, which may play a role in generating cellular identity and neuronal diversity and thus increasing the complexity and diversity of brain cell functions.

  16. Identification of Homeotic Target Genes in Drosophila Melanogaster Including Nervy, a Proto-Oncogene Homologue

    PubMed Central

    Feinstein, P. G.; Kornfeld, K.; Hogness, D. S.; Mann, R. S.

    1995-01-01

    In Drosophila, the specific morphological characteristics of each segment are determined by the homeotic genes that regulate the expression of downstream target genes. We used a subtractive hybridization procedure to isolate activated target genes of the homeotic gene Ultrabithorax (Ubx). In addition, we constructed a set of mutant genotypes that measures the regulatory contribution of individual homeotic genes to a complex target gene expression pattern. Using these mutants, we demonstrate that homeotic genes can regulate target gene expression at the start of gastrulation, suggesting a previously unknown role for the homeotic genes at this early stage. We also show that, in abdominal segments, the levels of expression for two target genes increase in response to high levels of Ubx, demonstrating that the normal down-regulation of Ubx in these segments is functional. Finally, the DNA sequence of cDNAs for one of these genes predicts a protein that is similar to a human proto-oncogene involved in acute myeloid leukemias. These results illustrate potentially general rules about the homeotic control of target gene expression and suggest that subtractive hybridization can be used to isolate interesting homeotic target genes. PMID:7498738

  17. Predictive and therapeutic markers in ovarian cancer

    DOEpatents

    Gray, Joe W.; Guan, Yinghui; Kuo, Wen-Lin; Fridlyand, Jane; Mills, Gordon B.

    2013-03-26

    Cancer markers may be developed to detect diseases characterized by increased expression of apoptosis-suppressing genes, such as aggressive cancers. Genes in the human chromosomal regions 8q24, 11q13, and 20q11-q13 were found to be amplified, indicating in vivo drug resistance in diseases such as ovarian cancer. Diagnosis and assessment of amplification levels of certain genes shown to be amplified, including PVT1, can be useful in predicting poor patient outcome and drug resistance in ovarian cancer patients with low survival rates. Certain genes were found to be high-priority therapeutic targets through the identification of recurrent aberrations involving genome sequence, copy number and/or gene expression that are associated with reduced survival duration in certain diseases and cancers, specifically ovarian cancer. Therapeutics to inhibit amplification, and inhibitors of one of these genes, PVT1, that target drug resistance in ovarian cancer patients with low survival rates, are described.

  18. Predictive biomarkers of sensitivity to the phosphatidylinositol 3' kinase inhibitor GDC-0941 in breast cancer preclinical models.

    PubMed

    O'Brien, Carol; Wallin, Jeffrey J; Sampath, Deepak; GuhaThakurta, Debraj; Savage, Heidi; Punnoose, Elizabeth A; Guan, Jane; Berry, Leanne; Prior, Wei Wei; Amler, Lukas C; Belvin, Marcia; Friedman, Lori S; Lackner, Mark R

    2010-07-15

    The class I phosphatidylinositol 3' kinase (PI3K) plays a major role in proliferation and survival in a wide variety of human cancers. A key factor in successful development of drugs targeting this pathway is likely to be the identification of responsive patient populations with predictive diagnostic biomarkers. This study sought to identify candidate biomarkers of response to the selective PI3K inhibitor GDC-0941. We used a large panel of breast cancer cell lines and in vivo xenograft models to identify candidate predictive biomarkers for a selective inhibitor of class I PI3K that is currently in clinical development. The approach involved pharmacogenomic profiling as well as analysis of gene expression data sets from cells profiled at baseline or after GDC-0941 treatment. We found that models harboring mutations in PIK3CA, amplification of human epidermal growth factor receptor 2, or dual alterations in two pathway components were exquisitely sensitive to the antitumor effects of GDC-0941. We found that several models that do not harbor these alterations also showed sensitivity, suggesting a need for additional diagnostic markers. Gene expression studies identified a collection of genes whose expression was associated with in vitro sensitivity to GDC-0941, and expression of a subset of these genes was found to be intimately linked to signaling through the pathway. Pathway focused biomarkers and the gene expression signature described in this study may have utility in the identification of patients likely to benefit from therapy with a selective PI3K inhibitor. Copyright 2010 AACR.

  19. Natural language indicators of differential gene regulation in the human immune system.

    PubMed

    Mehl, Matthias R; Raison, Charles L; Pace, Thaddeus W W; Arevalo, Jesusa M G; Cole, Steve W

    2017-11-21

    Adverse social conditions have been linked to a conserved transcriptional response to adversity (CTRA) in circulating leukocytes that may contribute to social gradients in disease. However, the CNS mechanisms involved remain obscure, in part because CTRA gene-expression profiles often track external social-environmental variables more closely than they do self-reported internal affective states such as stress, depression, or anxiety. This study examined the possibility that variations in patterns of natural language use might provide more sensitive indicators of the automatic threat-detection and -response systems that proximally regulate autonomic induction of the CTRA. In 22,627 audio samples of natural speech sampled from the daily interactions of 143 healthy adults, both total language output and patterns of function-word use covaried with CTRA gene expression. These language features predicted CTRA gene expression substantially better than did conventional self-report measures of stress, depression, and anxiety and did so independently of demographic and behavioral factors (age, sex, race, smoking, body mass index) and leukocyte subset distributions. This predictive relationship held when language and gene expression were sampled more than a week apart, suggesting that associations reflect stable individual differences or chronic life circumstances. Given the observed relationship between personal expression and gene expression, patterns of natural language use may provide a useful behavioral indicator of nonconsciously evaluated well-being (implicit safety vs. threat) that is distinct from conscious affective experience and more closely tracks the neurobiological processes involved in peripheral gene regulation. Copyright © 2017 the Author(s). Published by PNAS.

  20. Faster-X evolution of gene expression is driven by recessive adaptive cis-regulatory variation in Drosophila.

    PubMed

    Llopart, Ana

    2018-05-01

    The hemizygosity of the X (Z) chromosome fully exposes the fitness effects of mutations on that chromosome and has evolutionary consequences on the relative rates of evolution of X and autosomes. Specifically, several population genetics models predict increased rates of evolution in X-linked loci relative to autosomal loci. This prediction of faster-X evolution has been evaluated and confirmed for both protein coding sequences and gene expression. In the case of faster-X evolution for gene expression divergence, it is often assumed that variation in 5' noncoding sequences is associated with variation in transcript abundance between species, but a formal, genomewide test of this hypothesis is still missing. Here, I use whole genome sequence data in Drosophila yakuba and D. santomea to evaluate this hypothesis and report positive correlations between sequence divergence at 5' noncoding sequences and gene expression divergence. I also examined polymorphism and divergence in 9,279 noncoding sequences located at the 5' end of annotated genes and detected multiple signals of positive selection. Notably, I used the traditional synonymous sites as neutral reference to test for adaptive evolution, but I also used bases 8-30 of introns <65 bp, which have been proposed to be a better neutral choice. X-linked genes with a high degree of male-biased expression show the most extreme adaptive pattern at 5' noncoding regions, in agreement with faster-X evolution for gene expression divergence and a higher incidence of positively selected recessive mutations. © 2018 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.

  1. An enhanced deterministic K-Means clustering algorithm for cancer subtype prediction from gene expression data.

    PubMed

    Nidheesh, N; Abdul Nazeer, K A; Ameer, P M

    2017-12-01

    Clustering algorithms with steps involving randomness usually give different results on different executions for the same dataset. This non-deterministic nature of algorithms such as the K-Means clustering algorithm limits their applicability in areas such as cancer subtype prediction using gene expression data. It is hard to sensibly compare the results of such algorithms with those of other algorithms. The non-deterministic nature of K-Means is due to its random selection of data points as initial centroids. We propose an improved, density based version of K-Means, which involves a novel and systematic method for selecting initial centroids. The key idea of the algorithm is to select data points which belong to dense regions and which are adequately separated in feature space as the initial centroids. We compared the proposed algorithm to a set of eleven widely used single clustering algorithms and a prominent ensemble clustering algorithm which is being used for cancer data classification, based on the performances on a set of datasets comprising ten cancer gene expression datasets. The proposed algorithm has shown better overall performance than the others. There is a pressing need in the Biomedical domain for simple, easy-to-use and more accurate Machine Learning tools for cancer subtype prediction. The proposed algorithm is simple, easy-to-use and gives stable results. Moreover, it provides comparatively better predictions of cancer subtypes from gene expression data. Copyright © 2017 Elsevier Ltd. All rights reserved.
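
    The key idea stated above, choosing initial centroids from dense regions that are adequately separated, can be illustrated with a generic deterministic initializer. This is not the authors' algorithm: the neighbour-count density, the `radius` and `min_sep` parameters, and the toy points are all assumptions made for the sketch.

```python
from math import dist


def density_based_centroids(points, k, radius, min_sep):
    """Pick k initial centroids deterministically.

    Density of a point is the number of points within `radius` of it
    (including itself). Points are visited from densest to sparsest, with
    ties broken by input order, and a point is accepted only if it lies
    farther than `min_sep` from every centroid chosen so far.
    """
    density = [sum(1 for q in points if dist(p, q) <= radius) for p in points]
    order = sorted(range(len(points)), key=lambda i: (-density[i], i))
    chosen = []
    for i in order:
        if all(dist(points[i], points[j]) > min_sep for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return [points[i] for i in chosen]


# Two tight toy clusters plus one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5)]
centroids = density_based_centroids(points, k=2, radius=1.5, min_sep=5.0)
print(centroids)
```

    Because no random draw is involved, repeated runs on the same data start K-Means from the same centroids, which is what makes the clustering results reproducible.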

  2. Isoflavones in soy flour diet have different effects on whole-genome expression patterns than purified isoflavone mix in human MCF-7 breast tumors in ovariectomized athymic nude mice.

    PubMed

    Liu, Yunxian; Hilakivi-Clarke, Leena; Zhang, Yukun; Wang, Xiao; Pan, Yuan-Xiang; Xuan, Jianhua; Fleck, Stefanie C; Doerge, Daniel R; Helferich, William G

    2015-08-01

    Soy flour diet (MS) prevented isoflavones from stimulating MCF-7 tumor growth in athymic nude mice, indicating that other bioactive compounds in soy can negate the estrogenic properties of isoflavones. The underlying signal transduction pathways to explain the protective effects of soy flour consumption were studied here. Ovariectomized athymic nude mice inoculated with MCF-7 human breast cancer cells were fed either Soy flour diet (MS) or purified isoflavone mix diet (MI), both with equivalent amounts of genistein. Positive controls received estradiol pellets and negative controls received sham pellets. GeneChip Human Genome U133 Plus 2.0 Array platform was used to evaluate gene expressions, and results were analyzed using bioinformatics approaches. Tumors in MS-fed mice exhibited higher expression of tumor growth suppressing genes ATP2A3 and BLNK and lower expression of oncogene MYC. Tumors in MI-fed mice expressed a higher level of oncogene MYB and a lower level of MHC-I and MHC-II, allowing tumor cells to escape immunosurveillance. MS-induced gene expression alterations were predictive of prolonged survival among estrogen-receptor-positive breast cancer patients, whilst MI-induced gene changes were predictive of shortened survival. Our findings suggest that dietary soy flour affects gene expression differently than purified isoflavones, which may explain why soy foods prevent isoflavones-induced stimulation of MCF-7 tumor growth in athymic nude mice. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. Cancer stem cell-related gene expression as a potential biomarker of response for first-in-class imipridone ONC201 in solid tumors.

    PubMed

    Prabhu, Varun V; Lulla, Amriti R; Madhukar, Neel S; Ralff, Marie D; Zhao, Dan; Kline, Christina Leah B; Van den Heuvel, A Pieter J; Lev, Avital; Garnett, Mathew J; McDermott, Ultan; Benes, Cyril H; Batchelor, Tracy T; Chi, Andrew S; Elemento, Olivier; Allen, Joshua E; El-Deiry, Wafik S

    2017-01-01

    Cancer stem cells (CSCs) correlate with recurrence, metastasis and poor survival in clinical studies. Encouraging results from clinical trials of CSC inhibitors have further validated CSCs as therapeutic targets. ONC201 is a first-in-class small molecule imipridone in Phase I/II clinical trials for advanced cancer. We have previously shown that ONC201 targets self-renewing, chemotherapy-resistant colorectal CSCs via Akt/ERK inhibition and DR5/TRAIL induction. In this study, we demonstrate that the anti-CSC effects of ONC201 involve early changes in stem cell-related gene expression prior to tumor cell death induction. A targeted network analysis of gene expression profiles in colorectal cancer cells revealed that ONC201 downregulates stem cell pathways such as Wnt signaling and modulates genes (ID1, ID2, ID3 and ALDH7A1) known to regulate self-renewal in colorectal, prostate cancer and glioblastoma. ONC201-mediated changes in CSC-related gene expression were validated at the RNA and protein level for each tumor type. Accordingly, we observed inhibition of self-renewal and CSC markers in prostate cancer cell lines and patient-derived glioblastoma cells upon ONC201 treatment. Interestingly, ONC201-mediated CSC depletion does not occur in colorectal cancer cells with acquired resistance to ONC201. Finally, we observed that basal expression of CSC-related genes (ID1, CD44, HES7 and TCF3) significantly correlate with ONC201 efficacy in >1000 cancer cell lines and combining the expression of multiple genes leads to a stronger overall prediction. These proof-of-concept studies provide a rationale for testing CSC expression at the RNA and protein level as a predictive and pharmacodynamic biomarker of ONC201 response in ongoing clinical studies.

  4. Cancer stem cell-related gene expression as a potential biomarker of response for first-in-class imipridone ONC201 in solid tumors

    PubMed Central

    Zhao, Dan; Kline, Christina Leah B.; Van den Heuvel, A. Pieter J.; Lev, Avital; Garnett, Mathew J.; McDermott, Ultan; Benes, Cyril H.; Batchelor, Tracy T.; Chi, Andrew S.; Elemento, Olivier; Allen, Joshua E.

    2017-01-01

    Cancer stem cells (CSCs) correlate with recurrence, metastasis and poor survival in clinical studies. Encouraging results from clinical trials of CSC inhibitors have further validated CSCs as therapeutic targets. ONC201 is a first-in-class small molecule imipridone in Phase I/II clinical trials for advanced cancer. We have previously shown that ONC201 targets self-renewing, chemotherapy-resistant colorectal CSCs via Akt/ERK inhibition and DR5/TRAIL induction. In this study, we demonstrate that the anti-CSC effects of ONC201 involve early changes in stem cell-related gene expression prior to tumor cell death induction. A targeted network analysis of gene expression profiles in colorectal cancer cells revealed that ONC201 downregulates stem cell pathways such as Wnt signaling and modulates genes (ID1, ID2, ID3 and ALDH7A1) known to regulate self-renewal in colorectal, prostate cancer and glioblastoma. ONC201-mediated changes in CSC-related gene expression were validated at the RNA and protein level for each tumor type. Accordingly, we observed inhibition of self-renewal and CSC markers in prostate cancer cell lines and patient-derived glioblastoma cells upon ONC201 treatment. Interestingly, ONC201-mediated CSC depletion does not occur in colorectal cancer cells with acquired resistance to ONC201. Finally, we observed that basal expression of CSC-related genes (ID1, CD44, HES7 and TCF3) significantly correlate with ONC201 efficacy in >1000 cancer cell lines and combining the expression of multiple genes leads to a stronger overall prediction. These proof-of-concept studies provide a rationale for testing CSC expression at the RNA and protein level as a predictive and pharmacodynamic biomarker of ONC201 response in ongoing clinical studies. PMID:28767654

  5. Intratumoral gene expression of 5-fluorouracil pharmacokinetics-related enzymes in stage I and II non-small cell lung cancer patients treated with uracil-tegafur after surgery: a prospective multi-institutional study in Japan.

    PubMed

    Eguchi, Keisuke; Oyama, Takahiko; Tajima, Atsushi; Abiko, Tomohiro; Sawafuji, Makoto; Horio, Hirotoshi; Hashizume, Toshinori; Matsutani, Noriyuki; Kato, Ryoichi; Nakayama, Mitsuo; Kawamura, Masafumi; Kobayashi, Koichi

    2015-01-01

    This investigation was conducted to assess the use of the intratumoral mRNA expression levels of nucleic acid-metabolizing enzymes as biomarkers of adjuvant chemotherapy for non-small cell lung cancer (NSCLC) using uracil-tegafur in a multi-institutional prospective study. 236 patients with a completely resected NSCLC (adenocarcinoma and squamous cell carcinoma) of pathological stage IA (maximum tumor diameter of 2 cm or greater), IB, and II tumors were given a dose of 250 mg of uracil-tegafur per square meter of body surface area per day orally for two years after surgery. Intratumoral mRNA levels of thymidylate synthase (TS), dihydropyrimidine dehydrogenase (DPD), orotate phosphoribosyltransferase (OPRT), and thymidine phosphorylase (TP) genes relative to an internal standard, β-actin, were determined using laser-capture microdissection and fluorescence-based real time PCR detection systems. Among 5-FU target enzymes, TS was the only one that showed a significant difference in the level of gene expression between the high and low gene expression groups, for both disease-free survival (DFS) and overall survival (OS), when patients were divided according to median values; 5-year DFS rates in high/low TS gene expression were 60.4% and 72.6%, respectively (p=0.050), 5-year OS rates were 78.1% and 88.6%, respectively (p=0.011). Cox's proportional hazard model indicated that the pathological stage and TS gene expression level were independent values for predicting DFS. The TS gene expression level was shown to be an independent predictive factor for DFS in stage I and II NSCLC patients who were treated with uracil-tegafur following surgery. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  6. Transcript profiling reveals expression differences in wild-type and glabrous soybean lines

    PubMed Central

    2011-01-01

    Background: Trichome hairs affect diverse agronomic characters such as seed weight and yield, prevent insect damage and reduce loss of water but their molecular control has not been extensively studied in soybean. Several detailed models for trichome development have been proposed for Arabidopsis thaliana, but their applicability to important crops such as cotton and soybean is not fully known. Results: Two high throughput transcript sequencing methods, Digital Gene Expression (DGE) Tag Profiling and RNA-Seq, were used to compare the transcriptional profiles in wild-type (cv. Clark standard, CS) and a mutant (cv. Clark glabrous, i.e., trichomeless or hairless, CG) soybean isoline that carries the dominant P1 allele. DGE data and RNA-Seq data were mapped to the cDNAs (Glyma models) predicted from the reference soybean genome, Williams 82. Extending the model length by 250 bp at both ends resulted in significantly more matches of authentic DGE tags indicating that many of the predicted gene models are prematurely truncated at the 5' and 3' UTRs. The genome-wide comparative study of the transcript profiles of the wild-type versus mutant line revealed a number of differentially expressed genes. One highly-expressed gene, Glyma04g35130, in wild-type soybean was of interest as it has high homology to the cotton gene GhRDL1, which has been identified as being involved in cotton fiber initiation and is a member of the BURP protein family. Sequence comparison of Glyma04g35130 among Williams 82 with our sequences derived from CS and CG isolines revealed various SNPs and indels including addition of one nucleotide C in the CG and insertion of ~60 bp in the third exon of CS that causes a frameshift mutation and premature truncation of peptides in both lines as compared to Williams 82. Conclusion: Although not a candidate for the P1 locus, a BURP family member (Glyma04g35130) from soybean has been shown to be abundantly expressed in the CS line and very weakly expressed in the glabrous CG line. RNA-Seq and DGE data are compared and provide experimental data on the expression of predicted soybean gene models as well as an overview of the genes expressed in young shoot tips of two closely related isolines. PMID:22029708

  7. MicroRNA-124-3p expression and its prospective functional pathways in hepatocellular carcinoma: A quantitative polymerase chain reaction, gene expression omnibus and bioinformatics study.

    PubMed

    He, Rong-Quan; Yang, Xia; Liang, Liang; Chen, Gang; Ma, Jie

    2018-04-01

    The present study aimed to explore the potential clinical significance of microRNA (miR)-124-3p expression in the hepatocarcinogenesis and development of hepatocellular carcinoma (HCC), as well as the potential target genes of functional HCC pathways. Reverse transcription-quantitative polymerase chain reaction was performed to evaluate the expression of miR-124-3p in 101 HCC and adjacent non-cancerous tissue samples. Additionally, the association between miR-124-3p expression and clinical parameters was also analyzed. Differentially expressed genes identified following miR-124-3p transfection, the prospective target genes predicted in silico and the key genes of HCC obtained from Natural Language Processing (NLP) were integrated to obtain potential target genes of miR-124-3p in HCC. Relevant signaling pathways were assessed with protein-protein interaction (PPI) networks, Gene Ontology (GO) enrichment analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) and Protein Annotation Through Evolutionary Relationships (PANTHER) pathway enrichment analysis. miR-124-3p expression was significantly reduced in HCC tissues compared with expression in adjacent non-cancerous liver tissues. In HCC, miR-124-3p was demonstrated to be associated with clinical stage. The mean survival time of the low miR-124-3p expression group was reduced compared with that of the high expression group. A total of 132 genes overlapped from differentially expressed genes, miR-124-3p predicted target genes and NLP identified genes. PPI network construction revealed a total of 109 nodes and 386 edges, and 20 key genes were identified. The major enriched terms of three GO categories included regulation of cell proliferation, positive regulation of cellular biosynthetic processes, cell leading edge, cytosol and cell projection, protein kinase activity, transcription activator activity and enzyme binding. 
KEGG analysis revealed pancreatic cancer, prostate cancer and non-small cell lung cancer as the top three terms. Angiogenesis, the endothelial growth factor receptor signaling pathway and the fibroblast growth factor signaling pathway were identified as the most significant terms in the PANTHER pathway analysis. The present study confirmed that miR-124-3p acts as a tumor suppressor in HCC. miR-124-3p may target multiple genes, exerting its effect spatiotemporally, or in combination with a diverse range of processes in HCC. Functional characterization of miR-124-3p targets will offer novel insight into the molecular changes that occur in HCC progression.

  8. Altered Expression of Genes Implicated in Xylan Biosynthesis Affects Penetration Resistance against Powdery Mildew.

    PubMed

    Chowdhury, Jamil; Lück, Stefanie; Rajaraman, Jeyaraman; Douchkov, Dimitar; Shirley, Neil J; Schwerdt, Julian G; Schweizer, Patrick; Fincher, Geoffrey B; Burton, Rachel A; Little, Alan

    2017-01-01

    Heteroxylan has recently been identified as an important component of papillae, which are formed during powdery mildew infection of barley leaves. Deposition of heteroxylan near the sites of attempted fungal penetration in the epidermal cell wall is believed to enhance the physical resistance to the fungal penetration peg and hence to improve pre-invasion resistance. Several glycosyltransferase (GT) families are implicated in the assembly of heteroxylan in the plant cell wall, and are likely to work together in a multi-enzyme complex. Members of key GT families reported to be involved in heteroxylan biosynthesis are up-regulated in the epidermal layer of barley leaves during powdery mildew infection. Modulation of their expression leads to altered susceptibility levels, suggesting that these genes are important for penetration resistance. The highest level of resistance was achieved when a GT43 gene was co-expressed with a GT47 candidate gene, both of which have been predicted to be involved in xylan backbone biosynthesis. Altering the expression level of several candidate heteroxylan synthesis genes can significantly alter disease susceptibility. This is predicted to occur through changes in the amount and structure of heteroxylan in barley papillae.

  9. Gene Rearrangement Attenuates Expression and Lethality of a Nonsegmented Negative Strand RNA Virus

    NASA Astrophysics Data System (ADS)

    Williams Wertz, Gail; Perepelitsa, Victoria P.; Ball, L. Andrew

    1998-03-01

    The nonsegmented negative strand RNA viruses comprise hundreds of human, animal, insect, and plant pathogens. Gene expression of these viruses is controlled by the highly conserved order of genes relative to the single transcriptional promoter. We utilized this regulatory mechanism to alter gene expression levels of vesicular stomatitis virus by rearranging the gene order. This report documents that gene expression levels and the viral phenotype can be manipulated in a predictable manner. Translocation of the promoter-proximal nucleocapsid protein gene N, whose product is required stoichiometrically for genome replication, to successive positions down the genome reduced N mRNA and protein expression in a stepwise manner. The reduction in N gene expression resulted in a stepwise decrease in genomic RNA replication. Translocation of the N gene also attenuated the viruses to increasing extents for replication in cultured cells and for lethality in mice, without compromising their ability to elicit protective immunity. Because monopartite negative strand RNA viruses have not been reported to undergo homologous recombination, gene rearrangement should be irreversible and may provide a rational strategy for developing stably attenuated live vaccines against this type of virus.

  10. In silico analysis of stomach lineage specific gene set expression pattern in gastric cancer.

    PubMed

    Pandi, Narayanan Sathiya; Suganya, Sivagurunathan; Rajendran, Suriliyandi

    2013-10-04

    Stomach lineage specific gene products act as a protective barrier in the normal stomach and their expression maintains the normal physiological processes, cellular integrity and morphology of the gastric wall. However, the regulation of stomach lineage specific genes in gastric cancer (GC) is far less clear. In the present study, we sought to investigate the role and regulation of stomach lineage specific gene set (SLSGS) in GC. SLSGS was identified by comparing the mRNA expression profiles of normal stomach tissue with other organ tissue. The obtained SLSGS was found to be under expressed in gastric tumors. Functional annotation analysis revealed that the SLSGS was enriched for digestive function and gastric epithelial maintenance. Employing a single sample prediction method across GC mRNA expression profiles identified the under expression of SLSGS in proliferative type and invasive type gastric tumors compared to the metabolic type gastric tumors. Integrative pathway activation prediction analysis revealed a close association between estrogen-α signaling and SLSGS expression pattern in GC. Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. In conclusion, our results highlight that estrogen mediated regulation of SLSGS in gastric tumor is a molecular predictor of metabolic type GC and prognostic factor in GC. Copyright © 2013 Elsevier Inc. All rights reserved.

  11. In silico analysis of stomach lineage specific gene set expression pattern in gastric cancer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pandi, Narayanan Sathiya, E-mail: sathiyapandi@gmail.com; Suganya, Sivagurunathan; Rajendran, Suriliyandi

    Highlights: • Identified stomach lineage specific gene set (SLSGS) was found to be under expressed in gastric tumors. • Elevated expression of SLSGS in gastric tumor is a molecular predictor of metabolic type gastric cancer. • In silico pathway scanning identified estrogen-α signaling is a putative regulator of SLSGS in gastric cancer. • Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients.

    Abstract: Stomach lineage specific gene products act as a protective barrier in the normal stomach and their expression maintains the normal physiological processes, cellular integrity and morphology of the gastric wall. However, the regulation of stomach lineage specific genes in gastric cancer (GC) is far less clear. In the present study, we sought to investigate the role and regulation of stomach lineage specific gene set (SLSGS) in GC. SLSGS was identified by comparing the mRNA expression profiles of normal stomach tissue with other organ tissue. The obtained SLSGS was found to be under expressed in gastric tumors. Functional annotation analysis revealed that the SLSGS was enriched for digestive function and gastric epithelial maintenance. Employing a single sample prediction method across GC mRNA expression profiles identified the under expression of SLSGS in proliferative type and invasive type gastric tumors compared to the metabolic type gastric tumors. Integrative pathway activation prediction analysis revealed a close association between estrogen-α signaling and SLSGS expression pattern in GC. Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. In conclusion, our results highlight that estrogen mediated regulation of SLSGS in gastric tumor is a molecular predictor of metabolic type GC and prognostic factor in GC.

  12. A reported 20-gene expression signature to predict lymph node-positive disease at radical cystectomy for muscle-invasive bladder cancer is clinically not applicable.

    PubMed

    van Kessel, Kim E M; van de Werken, Harmen J G; Lurkin, Irene; Ziel-van der Made, Angelique C J; Zwarthoff, Ellen C; Boormans, Joost L

    2017-01-01

    Neoadjuvant chemotherapy (NAC) for muscle-invasive bladder cancer (MIBC) provides a small but significant survival benefit. Nevertheless, controversies on applying NAC remain because the limited benefit must be weighed against chemotherapy-related toxicity and the delay of definitive local treatment. Therefore, there is a clear clinical need for tools to guide treatment decisions on NAC in MIBC. Here, we aimed to validate a previously reported 20-gene expression signature that predicted lymph node-positive disease at radical cystectomy in clinically node-negative MIBC patients, which would be a justification for upfront chemotherapy. We studied diagnostic transurethral resection of bladder tumors (dTURBT) of 150 MIBC patients (urothelial carcinoma) who were subsequently treated by radical cystectomy and pelvic lymph node dissection. RNA was isolated and the expression level of the 20 genes was determined on a qRT-PCR platform. Normalized Ct values were used to calculate a risk score to predict the presence of node-positive disease. The Cancer Genome Atlas (TCGA) RNA expression data was analyzed to subsequently validate the results. In a univariate regression analysis, none of the 20 genes significantly correlated with node-positive disease. The area under the curve of the risk score calculated by the 20-gene expression signature was 0.54 (95% Confidence Interval: 0.44-0.65) versus 0.67 for the model published by Smith et al. Node-negative patients had a significantly lower tumor grade at TURBT (p = 0.03), a lower pT stage (p<0.01) and less frequent lymphovascular invasion (13% versus 38%, p<0.01) at radical cystectomy than node-positive patients. In addition, in the TCGA data, none of the 20 genes was differentially expressed in node-negative versus node-positive patients. We conclude that a 20-gene expression signature developed for nodal staging of MIBC at radical cystectomy could not be validated on a qRT-PCR platform in a large cohort of dTURBT specimens.
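
    To make the reported area under the curve easier to read, the following is a minimal sketch of how an AUC can be computed from risk scores as the probability that a randomly chosen node-positive patient scores higher than a randomly chosen node-negative one. The function name and the toy scores are hypothetical and are not taken from the study; the authors' actual software is not stated in the abstract.

```python
def auc_from_scores(positive_scores, negative_scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a pair."""
    if not positive_scores or not negative_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical risk scores, for illustration only.
node_positive = [0.1, 0.8]
node_negative = [0.5, 0.2]
print(auc_from_scores(node_positive, node_negative))
```

    An AUC of 0.5 corresponds to a score that ranks patients no better than chance, which is why a value of 0.54 with a confidence interval spanning 0.5 is read as a failure to discriminate.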

  13. Protein-DNA binding dynamics predict transcriptional response to nutrients in archaea.

    PubMed

    Todor, Horia; Sharma, Kriti; Pittman, Adrianne M C; Schmid, Amy K

    2013-10-01

    Organisms across all three domains of life use gene regulatory networks (GRNs) to integrate varied stimuli into coherent transcriptional responses to environmental pressures. However, inferring GRN topology and regulatory causality remains a central challenge in systems biology. Previous work characterized TrmB as a global metabolic transcription factor in archaeal extremophiles. However, it remains unclear how TrmB dynamically regulates its ∼100 metabolic enzyme-coding gene targets. Using a dynamic perturbation approach, we elucidate the topology of the TrmB metabolic GRN in the model archaeon Halobacterium salinarum. Clustering of dynamic gene expression patterns reveals that TrmB functions alone to regulate central metabolic enzyme-coding genes but cooperates with various regulators to control peripheral metabolic pathways. Using a dynamical model, we predict gene expression patterns for some TrmB-dependent promoters and infer secondary regulators for others. Our data suggest feed-forward gene regulatory topology for cobalamin biosynthesis. In contrast, purine biosynthesis appears to require TrmB-independent regulators. We conclude that TrmB is an important component for mediating metabolic modularity, integrating nutrient status and regulating gene expression dynamics alone and in concert with secondary regulators.

  14. A Gene Expression Profile of BRCAness that Predicts for Responsiveness to Platinum and PARP Inhibitors

    DTIC Science & Technology

    2014-08-01

    allylamino-17-demethoxygeldanamycin) downregulated HR, ATM and Fanconi Anemia pathways. In HR-proficient EOC cells, 17-AAG suppressed HR as assessed...downregulated HR (p<0.005), ATM (p=0.015) and Fanconi Anemia (p<0.005) pathways, and downregulated the expression levels of several genes of these

  15. iTAK: A program for genome-wide prediction and classification of plant transcription factors, transcriptional regulators and protein kinases

    USDA-ARS's Scientific Manuscript database

    Transcription factors (TFs) are proteins that regulate the expression of target genes by binding to specific elements in their regulatory regions. Transcriptional regulators (TRs) also regulate the expression of target genes; however, they operate indirectly via interaction with the basal transcript...

  16. Separate and combined effects of genetic variants and pre-treatment whole blood gene expression on response to exposure-based cognitive behavioural therapy for anxiety disorders.

    PubMed

    Coleman, Jonathan R I; Lester, Kathryn J; Roberts, Susanna; Keers, Robert; Lee, Sang Hyuck; De Jong, Simone; Gaspar, Héléna; Teismann, Tobias; Wannemüller, André; Schneider, Silvia; Jöhren, Peter; Margraf, Jürgen; Breen, Gerome; Eley, Thalia C

    2017-04-01

    Exposure-based cognitive behavioural therapy (eCBT) is an effective treatment for anxiety disorders. Response varies between individuals. Gene expression integrates genetic and environmental influences. We analysed the effect of gene expression and genetic markers separately and together on treatment response. Adult participants (n ≤ 181) diagnosed with panic disorder or a specific phobia underwent eCBT as part of standard care. Percentage decrease in the Clinical Global Impression severity rating was assessed across treatment, and between baseline and a 6-month follow-up. Associations with treatment response were assessed using expression data from 3,233 probes, and expression profiles clustered in a data- and literature-driven manner. A total of 3,343,497 genetic variants were used to predict treatment response alone and combined in polygenic risk scores. Genotype and expression data were combined in expression quantitative trait loci (eQTL) analyses. Expression levels were not associated with either treatment phenotype in any analysis. A total of 1,492 eQTLs were identified with q < 0.05, but interactions between genetic variants and treatment response did not affect expression levels significantly. Genetic variants did not significantly predict treatment response alone or in polygenic risk scores. We assessed gene expression alone and alongside genetic variants. No associations with treatment outcome were identified. Future studies require larger sample sizes to discover associations.

  17. Integrative analyses shed new light on human ribosomal protein gene regulation

    PubMed Central

    Li, Xin; Zheng, Yiyu; Hu, Haiyan; Li, Xiaoman

    2016-01-01

    Ribosomal protein genes (RPGs) are important house-keeping genes that are well-known for their coordinated expression. Previous studies on RPGs are largely limited to their promoter regions. Recent high-throughput studies provide an unprecedented opportunity to study how human RPGs are transcriptionally modulated and how such transcriptional regulation may contribute to the coordinate gene expression in various tissues and cell types. By analyzing the DNase I hypersensitive sites under 349 experimental conditions, we predicted 217 RPG regulatory regions in the human genome. More than 86.6% of these computationally predicted regulatory regions were partially corroborated by independent experimental measurements. Motif analyses on these predicted regulatory regions identified 31 DNA motifs, including 57.1% of experimentally validated motifs in literature that regulate RPGs. Interestingly, we observed that the majority of the predicted motifs were shared by the predicted distal and proximal regulatory regions of the same RPGs, a likely general mechanism for enhancer-promoter interactions. We also found that RPGs may be differently regulated in different cells, indicating that condition-specific RPG regulatory regions still need to be discovered and investigated. Our study advances the understanding of how RPGs are coordinately modulated, which sheds light on the general principles of gene transcriptional regulation in mammals. PMID:27346035

  18. Integrative analyses shed new light on human ribosomal protein gene regulation.

    PubMed

    Li, Xin; Zheng, Yiyu; Hu, Haiyan; Li, Xiaoman

    2016-06-27

    Ribosomal protein genes (RPGs) are important house-keeping genes that are well-known for their coordinated expression. Previous studies on RPGs are largely limited to their promoter regions. Recent high-throughput studies provide an unprecedented opportunity to study how human RPGs are transcriptionally modulated and how such transcriptional regulation may contribute to the coordinate gene expression in various tissues and cell types. By analyzing the DNase I hypersensitive sites under 349 experimental conditions, we predicted 217 RPG regulatory regions in the human genome. More than 86.6% of these computationally predicted regulatory regions were partially corroborated by independent experimental measurements. Motif analyses on these predicted regulatory regions identified 31 DNA motifs, including 57.1% of experimentally validated motifs in literature that regulate RPGs. Interestingly, we observed that the majority of the predicted motifs were shared by the predicted distal and proximal regulatory regions of the same RPGs, a likely general mechanism for enhancer-promoter interactions. We also found that RPGs may be differently regulated in different cells, indicating that condition-specific RPG regulatory regions still need to be discovered and investigated. Our study advances the understanding of how RPGs are coordinately modulated, which sheds light on the general principles of gene transcriptional regulation in mammals.

  19. An RNA-Seq based gene expression atlas of the common bean.

    PubMed

    O'Rourke, Jamie A; Iniguez, Luis P; Fu, Fengli; Bucciarelli, Bruna; Miller, Susan S; Jackson, Scott A; McClean, Philip E; Li, Jun; Dai, Xinbin; Zhao, Patrick X; Hernandez, Georgina; Vance, Carroll P

    2014-10-06

    Common bean (Phaseolus vulgaris) is grown throughout the world and comprises roughly 50% of the grain legumes consumed worldwide. Despite this, genetic resources for common beans have been lacking. Next generation sequencing has facilitated our investigation of the gene expression profiles associated with biologically important traits in common bean. An increased understanding of gene expression in common bean will improve our understanding of gene expression patterns in other legume species. Combining recently developed genomic resources for Phaseolus vulgaris, including predicted gene calls, with RNA-Seq technology, we measured the gene expression patterns from 24 samples collected from seven tissues at developmentally important stages and from three nitrogen treatments. Gene expression patterns throughout the plant were analyzed to better understand changes due to nodulation, seed development, and nitrogen utilization. We have identified 11,010 genes differentially expressed with a fold change ≥ 2 and a P-value < 0.05 between different tissues at the same time point, 15,752 genes differentially expressed within a tissue due to changes in development, and 2,315 genes expressed only in a single tissue. These analyses identified 2,970 genes with expression patterns that appear to be directly dependent on the source of available nitrogen. Finally, we have assembled this data in a publicly available database, The Phaseolus vulgaris Gene Expression Atlas (Pv GEA), http://plantgrn.noble.org/PvGEA/. Using the website, researchers can query gene expression profiles of their gene of interest, search for genes expressed in different tissues, or download the dataset in a tabular form. These data provide the basis for a gene expression atlas, which will facilitate functional genomic studies in common bean. Analysis of this dataset has identified genes important in regulating seed composition and has increased our understanding of nodulation and impact of the nitrogen source on assimilation and distribution throughout the plant.

  20. Screening of biomarkers for prediction of response to and prognosis after chemotherapy for breast cancers

    PubMed Central

    Bing, Feng; Zhao, Yu

    2016-01-01

    Objective: To screen the biomarkers having the ability to predict prognosis after chemotherapy for breast cancers. Methods: Three microarray datasets of breast cancer patients undergoing chemotherapy were collected from Gene Expression Omnibus database. After preprocessing, data in GSE41112 were analyzed using significance analysis of microarrays to screen the differentially expressed genes (DEGs). The DEGs were further analyzed by Differentially Coexpressed Genes and Links to construct a function module, the prognosis efficacy of which was verified by the other two datasets (GSE22226 and GSE58644) using Kaplan–Meier plots. The involved genes in function module were subjected to a univariate Cox regression analysis to confirm whether the expression of each prognostic gene was associated with survival. Results: A total of 511 DEGs between breast cancer patients who received chemotherapy or not were obtained, consisting of 421 upregulated and 90 downregulated genes. Using the Differentially Coexpressed Genes and Links package, 1,244 differentially coexpressed genes (DCGs) were identified, among which 36 DCGs were regulated by the transcription factor complex NFY (NFYA, NFYB, NFYC). These 39 genes constructed a gene module to classify the samples in GSE22226 and GSE58644 into three subtypes and these subtypes exhibited significantly different survival rates. Furthermore, several genes of the 39 DCGs were shown to be significantly associated with good (such as CDC20) and poor (such as ARID4A) prognoses following chemotherapy. Conclusion: Our present study provided a series of biomarkers for predicting the prognosis of chemotherapy or targets for development of alternative treatment (ie, CDC20 and ARID4A) in breast cancer patients. PMID:27217777

  1. Risk of type 1 diabetes progression in islet autoantibody-positive children can be further stratified using expression patterns of multiple genes implicated in peripheral blood lymphocyte activation and function.

    PubMed

    Jin, Yulan; Sharma, Ashok; Bai, Shan; Davis, Colleen; Liu, Haitao; Hopkins, Diane; Barriga, Kathy; Rewers, Marian; She, Jin-Xiong

    2014-07-01

    There is tremendous scientific and clinical value to further improving the predictive power of autoantibodies because autoantibody-positive (AbP) children have heterogeneous rates of progression to clinical diabetes. This study explored the potential of gene expression profiles as biomarkers for risk stratification among 104 AbP subjects from the Diabetes Autoimmunity Study in the Young (DAISY) using a discovery data set based on microarray and a validation data set based on real-time RT-PCR. The microarray data identified 454 candidate genes with expression levels associated with various type 1 diabetes (T1D) progression rates. RT-PCR analyses of the top-27 candidate genes confirmed 5 genes (BACH2, IGLL3, EIF3A, CDC20, and TXNDC5) associated with differential progression and implicated in lymphocyte activation and function. Multivariate analyses of these five genes in the discovery and validation data sets identified and confirmed four multigene models (BI, ICE, BICE, and BITE, with each letter representing a gene) that consistently stratify high- and low-risk subsets of AbP subjects with hazard ratios >6 (P < 0.01). The results suggest that these genes may be involved in T1D pathogenesis and potentially serve as excellent gene expression biomarkers to predict the risk of progression to clinical diabetes for AbP subjects. © 2014 by the American Diabetes Association.

  2. Identification of miRNA-Mediated Core Gene Module for Glioma Patient Prediction by Integrating High-Throughput miRNA, mRNA Expression and Pathway Structure

    PubMed Central

    Han, Junwei; Shang, Desi; Zhang, Yunpeng; Zhang, Wei; Yao, Qianlan; Han, Lei; Xu, Yanjun; Yan, Wei; Bao, Zhaoshi; You, Gan; Jiang, Tao; Kang, Chunsheng; Li, Xia

    2014-01-01

    The prognosis of glioma patients is usually poor, especially in patients with glioblastoma (World Health Organization (WHO) grade IV). The regulatory functions of microRNA (miRNA) on genes have important implications in glioma cell survival. However, there are not many studies that have investigated glioma survival by integrating miRNAs and genes while also considering pathway structure. In this study, we performed sample-matched miRNA and mRNA expression profiling to systematically analyze glioma patient survival. During this analytical process, we developed a pathway-based random walk to identify a glioma core miRNA-gene module, simultaneously considering pathway structure information and multi-level involvement of miRNAs and genes. The core miRNA-gene module we identified was comprised of four apparent sub-modules; all four sub-modules displayed a significant correlation with patient survival in the testing set (P-values≤0.001). Notably, one sub-module that consisted of 6 miRNAs and 26 genes also correlated with survival time in the high-grade subgroup (WHO grade III and IV), P-value = 0.0062. Furthermore, the 26-gene expression signature from this sub-module had robust predictive power in four independent, publicly available glioma datasets. Our findings suggested that the expression signatures, which were identified by integration of miRNA and gene level, were closely associated with overall survival among the glioma patients with various grades. PMID:24809850

  3. Relationship between gene expression and GC-content in mammals: statistical significance and biological relevance.

    PubMed

    Sémon, Marie; Mouchiroud, Dominique; Duret, Laurent

    2005-02-01

    Mammalian chromosomes are characterized by large-scale variations of DNA base composition (the so-called isochores). In contradiction with previous studies, Lercher et al. (Hum. Mol. Genet., 12, 2411, 2003) recently reported a strong correlation between gene expression breadth and GC-content, suggesting that there might be a selective pressure favoring the concentration of housekeeping genes in GC-rich isochores. We reassessed this issue by examining in human and mouse the correlation between gene expression and GC-content, using different measures of gene expression (EST, SAGE and microarray) and different measures of GC-content. We show that correlations between GC-content and expression are very weak, and may vary according to the method used to measure expression. Such weak correlations have a very low predictive value. The strong correlations reported by Lercher et al. (2003) arise because they measured variables over windows of neighboring genes. We show here that using gene windows artificially enhances the correlation. The assertion that the expression of a given gene depends on the GC-content of the region where it is located is therefore not supported by the data.
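
    As an illustration of the quantities compared in this abstract, the following minimal Python sketch computes the GC-content of a sequence and a Pearson correlation between per-gene GC-content and an expression measure. The function names, the example sequences, and the expression values are hypothetical and are not taken from the study, whose own measures of GC-content and expression may differ.

```python
from math import sqrt


def gc_content(seq: str) -> float:
    """Fraction of G or C bases in a DNA sequence (case-insensitive).

    Ambiguous bases such as N count toward the length but not toward G+C.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Hypothetical per-gene data: a sequence and an expression value for each gene.
genes = {
    "gene1": ("ATGCGCGCTA", 5.1),
    "gene2": ("ATATATTTAA", 4.8),
    "gene3": ("GGGCCCATGC", 2.3),
}
gc_values = [gc_content(seq) for seq, _ in genes.values()]
expression = [level for _, level in genes.values()]
r = pearson_r(gc_values, expression)
```

    A per-gene calculation of this kind treats each gene as one data point; averaging over windows of neighboring genes before correlating is the step the abstract identifies as inflating the correlation.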

  4. Dynamic gene expression changes precede dioxin-induced liver pathogenesis in medaka fish.

    PubMed

    Volz, David C; Hinton, David E; Law, J McHugh; Kullman, Seth W

    2006-02-01

    A major challenge for environmental genomics is linking gene expression to cellular toxicity and morphological alteration. Herein, we address complexities related to hepatic gene expression responses after a single injection of the aryl hydrocarbon receptor (AHR) agonist 2,3,7,8-tetrachlorodibenzo-p-dioxin (dioxin) and illustrate an initial stress response followed by cytologic and adaptive changes in the teleost fish medaka. Using a custom 175-gene array, we find that overall hepatic gene expression and histological changes are strongly dependent on dose and time. The most pronounced dioxin-induced gene expression changes occurred early and preceded morphologic alteration in the liver. Following a systematic search for putative Ah response elements (AHREs) (5'-CACGCA-3') within 2000 bp upstream of the predicted transcriptional start site, the majority (87%) of genes screened in this study did not contain an AHRE, suggesting that gene expression was not solely dependent on AHRE-mediated transcription. Moreover, in the highest dosage, we observed gene expression changes associated with adaptation that persisted for almost two weeks, including induction of a gene putatively identified as ependymin that may function in hepatic injury repair. These data suggest that the cellular response to dioxin involves both AHRE- and non-AHRE-mediated transcription, and that coupling gene expression profiling with analysis of morphologic pathogenesis is essential for establishing temporal relationships between transcriptional changes, toxicity, and adaptation to hepatic injury.
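
    The motif search described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the upstream sequence is given 5' to 3' with the transcriptional start site immediately after its last base, and it scans both strands, which the abstract does not specify. The function names and the test sequences are hypothetical; only the motif 5'-CACGCA-3' and the 2000 bp window come from the abstract.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif_sites(upstream: str, motif: str = "CACGCA", window: int = 2000):
    """Return (offset, strand) for each motif match in the upstream window.

    Only the last `window` bases of `upstream` (those nearest the start site)
    are scanned. Offsets are 0-based positions within that window. A match to
    the reverse complement of the motif is reported on the '-' strand.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    region = upstream.upper()[-window:]
    motif = motif.upper()
    rc_motif = reverse_complement(motif)
    k = len(motif)
    hits = []
    for i in range(len(region) - k + 1):
        kmer = region[i:i + k]
        if kmer == motif:
            hits.append((i, "+"))
        elif kmer == rc_motif:
            hits.append((i, "-"))
    return hits


def has_motif(upstream: str, motif: str = "CACGCA", window: int = 2000) -> bool:
    """True if the upstream window contains at least one motif match."""
    return bool(find_motif_sites(upstream, motif, window))
```

    Applying `has_motif` to each gene's upstream sequence and counting the `False` results gives the kind of proportion reported in the abstract (the share of screened genes without an AHRE).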

  5. Deletion of the transcriptional coactivator PGC1α in skeletal muscles is associated with reduced expression of genes related to oxidative muscle function

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hatazawa, Yukino; Research Fellow of Japan Society for the Promotion of Science, Tokyo; Minami, Kimiko

    The expression of the transcriptional coactivator PGC1α is increased in skeletal muscles during exercise. Previously, we showed that increased PGC1α leads to prolonged exercise performance (the duration for which running can be continued) and, at the same time, increases the expression of branched-chain amino acid (BCAA) metabolism-related enzymes and genes that are involved in supplying substrates for the TCA cycle. We recently created mice with PGC1α knockout specifically in the skeletal muscles (PGC1α KO mice), which show decreased mitochondrial content. In this study, global gene expression (microarray) analysis was performed in the skeletal muscles of PGC1α KO mice compared with that of wild-type control mice. As a result, decreased expression of genes involved in the TCA cycle, oxidative phosphorylation, and BCAA metabolism was observed. Compared with previously obtained microarray data on PGC1α-overexpressing transgenic mice, each gene showed the completely opposite direction of expression change. Bioinformatic analysis of the promoter region of genes with decreased expression in PGC1α KO mice predicted the involvement of several transcription factors, including a nuclear receptor, ERR, in their regulation. As PGC1α KO microarray data in this study show opposing findings to the PGC1α transgenic data, a loss-of-function experiment, as well as a gain-of-function experiment, revealed PGC1α’s function in the oxidative energy metabolism of skeletal muscles. - Highlights: • Microarray analysis was performed in the skeletal muscle of PGC1α KO mice. • Expression of genes in the oxidative energy metabolism was decreased. • Bioinformatic analysis of promoter region of the genes predicted involvement of ERR. • PGC1α KO microarray data in this study show the mirror image of transgenic data.

  6. LPL is the strongest prognostic factor in a comparative analysis of RNA-based markers in early chronic lymphocytic leukemia.

    PubMed

    Kaderi, Mohd Arifin; Kanduri, Meena; Buhl, Anne Mette; Sevov, Marie; Cahill, Nicola; Gunnarsson, Rebeqa; Jansson, Mattias; Smedby, Karin Ekström; Hjalgrim, Henrik; Jurlander, Jesper; Juliusson, Gunnar; Mansouri, Larry; Rosenquist, Richard

    2011-08-01

    The expression levels of LPL, ZAP70, TCL1A, CLLU1 and MCL1 have recently been proposed as prognostic factors in chronic lymphocytic leukemia. However, few studies have systematically compared these different RNA-based markers. Using real-time quantitative PCR, we measured the mRNA expression levels of these genes in unsorted samples from 252 newly diagnosed chronic lymphocytic leukemia patients and correlated our data with established prognostic markers (for example Binet stage, CD38, IGHV gene mutational status and genomic aberrations) and clinical outcome. High expression levels of all RNA-based markers, except MCL1, predicted shorter overall survival and time to treatment, with LPL being the most significant. In multivariate analysis including the RNA-based markers, LPL expression was the only independent prognostic marker for overall survival and time to treatment. When studying LPL expression and the established markers, LPL expression retained its independent prognostic strength for overall survival. All of the RNA-based markers, albeit with varying ability, added prognostic information to established markers, with LPL expression giving the most significant results. Notably, high LPL expression predicted a worse outcome in good-prognosis subgroups, such as patients with mutated IGHV genes, Binet stage A, CD38 negativity or favorable cytogenetics. In particular, the combination of LPL expression and CD38 could further stratify Binet stage A patients. LPL expression is the strongest RNA-based prognostic marker in chronic lymphocytic leukemia that could potentially be applied to predict outcome in the clinical setting, particularly in the large group of patients with favorable prognosis.

  7. Heterologous Production of a Novel Cyclic Peptide Compound, KK-1, in Aspergillus oryzae.

    PubMed

    Yoshimi, Akira; Yamaguchi, Sigenari; Fujioka, Tomonori; Kawai, Kiyoshi; Gomi, Katsuya; Machida, Masayuki; Abe, Keietsu

    2018-01-01

    A novel cyclic peptide compound, KK-1, was originally isolated from the plant-pathogenic fungus Curvularia clavata. It consists of 10 amino acid residues, including five N-methylated amino acid residues, and has potent antifungal activity. Recently, the genome-sequencing analysis of C. clavata was completed, and the biosynthetic genes involved in KK-1 production were predicted by using a novel gene cluster mining tool, MIDDAS-M. These genes form an approximately 75-kb cluster, which includes nine open reading frames, containing a non-ribosomal peptide synthetase (NRPS) gene. To determine whether the predicted genes were responsible for the biosynthesis of KK-1, we performed heterologous production of KK-1 in Aspergillus oryzae by introduction of the cluster genes into the genome of A. oryzae. The NRPS gene was split in two fragments and then reconstructed in the A. oryzae genome, because the gene was quite large (approximately 40 kb). The remaining seven genes in the cluster, excluding the regulatory gene kkR, were simultaneously introduced into the strain of A. oryzae in which NRPS had already been incorporated. To evaluate the heterologous production of KK-1 in A. oryzae, gene expression was analyzed by RT-PCR and KK-1 productivity was quantified by HPLC. KK-1 was produced in variable quantities by a number of transformed strains, along with expression of the cluster genes. The amount of KK-1 produced by the strain with the greatest expression of all genes was lower than that produced by the original producer, C. clavata. Therefore, expression of the cluster genes is necessary and sufficient for the heterologous production of KK-1 in A. oryzae, although there may be unknown factors limiting productivity in this species.

  8. Heterologous Production of a Novel Cyclic Peptide Compound, KK-1, in Aspergillus oryzae

    PubMed Central

    Yoshimi, Akira; Yamaguchi, Sigenari; Fujioka, Tomonori; Kawai, Kiyoshi; Gomi, Katsuya; Machida, Masayuki; Abe, Keietsu

    2018-01-01

    A novel cyclic peptide compound, KK-1, was originally isolated from the plant-pathogenic fungus Curvularia clavata. It consists of 10 amino acid residues, including five N-methylated amino acid residues, and has potent antifungal activity. Recently, the genome-sequencing analysis of C. clavata was completed, and the biosynthetic genes involved in KK-1 production were predicted by using a novel gene cluster mining tool, MIDDAS-M. These genes form an approximately 75-kb cluster, which includes nine open reading frames, containing a non-ribosomal peptide synthetase (NRPS) gene. To determine whether the predicted genes were responsible for the biosynthesis of KK-1, we performed heterologous production of KK-1 in Aspergillus oryzae by introduction of the cluster genes into the genome of A. oryzae. The NRPS gene was split in two fragments and then reconstructed in the A. oryzae genome, because the gene was quite large (approximately 40 kb). The remaining seven genes in the cluster, excluding the regulatory gene kkR, were simultaneously introduced into the strain of A. oryzae in which NRPS had already been incorporated. To evaluate the heterologous production of KK-1 in A. oryzae, gene expression was analyzed by RT-PCR and KK-1 productivity was quantified by HPLC. KK-1 was produced in variable quantities by a number of transformed strains, along with expression of the cluster genes. The amount of KK-1 produced by the strain with the greatest expression of all genes was lower than that produced by the original producer, C. clavata. Therefore, expression of the cluster genes is necessary and sufficient for the heterologous production of KK-1 in A. oryzae, although there may be unknown factors limiting productivity in this species. PMID:29686660

  9. Analysis of multiplex gene expression maps obtained by voxelation.

    PubMed

    An, Li; Xie, Hongbo; Chin, Mark H; Obradovic, Zoran; Smith, Desmond J; Megalooikonomou, Vasileios

    2009-04-29

    Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological disease. Researchers have previously used voxelation in combination with microarrays for acquisition of genome-wide atlases of expression patterns in the mouse brain. On the other hand, some work has been performed on studying gene functions, without taking into account the location information of a gene's expression in a mouse brain. In this paper, we present an approach for identifying the relation between gene expression maps obtained by voxelation and gene functions. To analyze the dataset, we chose typical genes as queries and aimed at discovering similar gene groups. Gene similarity was determined by using the wavelet features extracted from the left and right hemispheres averaged gene expression maps, and by the Euclidean distance between each pair of feature vectors. We also performed a multiple clustering approach on the gene expression maps, combined with hierarchical clustering. Among each group of similar genes and clusters, the gene function similarity was measured by calculating the average gene function distances in the gene ontology structure. By applying our methodology to find similar genes to certain target genes we were able to improve our understanding of gene expression patterns and gene functions. By applying the clustering analysis method, we obtained significant clusters, which have both very similar gene expression maps and very similar gene functions respectively to their corresponding gene ontologies. The cellular component ontology resulted in prominent clusters expressed in cortex and corpus callosum. The molecular function ontology gave prominent clusters in cortex, corpus callosum and hypothalamus. The biological process ontology resulted in clusters in cortex, hypothalamus and choroid plexus. Clusters from all three ontologies combined were most prominently expressed in cortex and corpus callosum. 
    The experimental results confirm the hypothesis that genes with similar gene expression maps might have similar gene functions. The voxelation data takes into account the location information of gene expression level in mouse brain, which is novel in related research. The proposed approach can potentially be used to predict gene functions and provide helpful suggestions to biologists.
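
    The similarity search described in this abstract, ranking genes by the Euclidean distance between feature vectors, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the feature vectors below are made-up placeholders standing in for the wavelet features of the expression maps, and the function name `rank_similar_genes` is hypothetical. Feature extraction and the clustering steps are not shown.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def rank_similar_genes(query: str, features: dict, top_k: int = 3):
    """Return the top_k genes closest to `query` as (gene, distance) pairs.

    `features` maps a gene name to its feature vector; all vectors must have
    the same length. The query gene itself is excluded from the result.
    """
    query_vec = features[query]
    scored = [
        (dist(query_vec, vec), gene)
        for gene, vec in features.items()
        if gene != query
    ]
    scored.sort()  # smallest distance first; ties broken by gene name
    return [(gene, d) for d, gene in scored[:top_k]]


# Placeholder feature vectors, not data from the study.
example_features = {
    "geneA": [0.0, 0.0],
    "geneB": [3.0, 4.0],
    "geneC": [1.0, 0.0],
    "geneD": [6.0, 8.0],
}
neighbors = rank_similar_genes("geneA", example_features, top_k=2)
```

    A smaller distance means more similar expression maps under this measure; the abstract then compares the gene ontology annotations of the genes returned for each query.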

  10. A four-gene signature predicts survival in clear-cell renal-cell carcinoma.

    PubMed

    Dai, Jun; Lu, Yuchao; Wang, Jinyu; Yang, Lili; Han, Yingyan; Wang, Ying; Yan, Dan; Ruan, Qiurong; Wang, Shaogang

    2016-12-13

    Clear-cell renal-cell carcinoma (ccRCC) is the most common pathological subtype of renal cell carcinoma (RCC), accounting for about 80% of RCC. In order to find potential prognostic biomarkers in ccRCC, we presented a four-gene signature to evaluate the prognosis of ccRCC. SurvExpress and immunohistochemical (IHC) staining of tissue microarrays were used to analyze the association between the four genes and the prognosis of ccRCC. Data from the TCGA dataset revealed a prognostic role for the four genes (PTEN, PIK3C2A, ITPA and BCL3). Further analysis suggested that the four-gene signature predicted survival better than any of the four genes alone. Moreover, IHC staining gave results consistent with TCGA, indicating that the signature was an independent prognostic factor of survival in ccRCC. Univariate and multivariate Cox proportional hazard regression analyses were conducted to verify the association of clinicopathological variables and the four genes' expression levels with survival. The results further confirmed that the risk (four-gene signature) was an independent prognostic factor of both Overall Survival (OS) and Disease-free Survival (DFS) (P<0.05). In conclusion, the four-gene signature was correlated with the survival of ccRCC, and therefore may help to provide significant clinical implications for predicting the prognosis of patients.

  11. Characteristics of functional enrichment and gene expression level of human putative transcriptional target genes.

    PubMed

    Osato, Naoki

    2018-01-19

    Transcriptional target genes show functional enrichment of genes. However, how many and how significantly transcriptional target genes include functional enrichments are still unclear. To address these issues, I predicted human transcriptional target genes using open chromatin regions, ChIP-seq data and DNA binding sequences of transcription factors in databases, and examined functional enrichment and gene expression level of putative transcriptional target genes. Gene Ontology annotations showed four times larger numbers of functional enrichments in putative transcriptional target genes than gene expression information alone, independent of transcriptional target genes. To compare the number of functional enrichments of putative transcriptional target genes between cells or search conditions, I normalized the number of functional enrichment by calculating its ratios in the total number of transcriptional target genes. With this analysis, native putative transcriptional target genes showed the largest normalized number of functional enrichments, compared with target genes including 5-60% of randomly selected genes. The normalized number of functional enrichments was changed according to the criteria of enhancer-promoter interactions such as distance from transcriptional start sites and orientation of CTCF-binding sites. Forward-reverse orientation of CTCF-binding sites showed significantly higher normalized number of functional enrichments than the other orientations. Journal papers showed that the top five frequent functional enrichments were related to the cellular functions in the three cell types. The median expression level of transcriptional target genes changed according to the criteria of enhancer-promoter assignments (i.e. interactions) and was correlated with the changes of the normalized number of functional enrichments of transcriptional target genes. Human putative transcriptional target genes showed significant functional enrichments. 
    Functional enrichments were related to the cellular functions. The normalized number of functional enrichments of human putative transcriptional target genes changed according to the criteria of enhancer-promoter assignments and correlated with the median expression level of the target genes. These analyses and characters of human putative transcriptional target genes would be useful to examine the criteria of enhancer-promoter assignments and to predict the novel mechanisms and factors such as DNA binding proteins and DNA sequences of enhancer-promoter interactions.

  12. A novel pair of immunoglobulin-like receptors expressed by B cells and myeloid cells

    PubMed Central

    Kubagawa, Hiromi; Burrows, Peter D.; Cooper, Max D.

    1997-01-01

    An Fcα receptor probe of human origin was used to identify novel members of the Ig gene superfamily in mice. Paired Ig-like receptors, named PIR-A and PIR-B, are predicted from sequence analysis of the cDNAs isolated from a mouse splenic library. Both type I transmembrane proteins possess similar ectodomains with six Ig-like loops, but have different transmembrane and cytoplasmic regions. The predicted PIR-A protein has a short cytoplasmic tail and a charged Arg residue in the transmembrane region that, by analogy with the FcαR relative, suggests the potential for association with an additional transmembrane protein to form a signal transducing unit. In contrast, the PIR-B protein has an uncharged transmembrane region and a long cytoplasmic tail containing four potential immunoreceptor tyrosine-based inhibitory motifs. These features are shared by the related killer inhibitory receptors. PIR-A proteins appear to be highly variable, in that predicted peptide sequences differ for seven randomly selected PIR-A clones, whereas PIR-B cDNA clones are invariant. Southern blot analysis with PIR-B and PIR-A-specific probes suggests only one PIR-B gene and multiple PIR-A genes. The PIR-A and PIR-B genes are expressed in B lymphocytes and myeloid lineage cells, wherein both are expressed simultaneously. The characteristics of the highly-conserved PIR-A and PIR-B genes and their coordinate cellular expression suggest a potential regulatory role in humoral, inflammatory, and allergic responses. PMID:9144225

  13. Profile of microRNA in Giant Panda Blood: A Resource for Immune-Related and Novel microRNAs

    PubMed Central

    Yang, Mingyu; Du, Lianming; Li, Wujiao; Shen, Fujun; Fan, Zhenxin; Jian, Zuoyi; Hou, Rong; Shen, Yongmei; Yue, Bisong; Zhang, Xiuyue

    2015-01-01

    The giant panda (Ailuropoda melanoleuca) is one of the world’s most beloved endangered mammals. Although the draft genome of this species had been assembled, little was known about the composition of its microRNAs (miRNAs) or their functional profiles. Recent studies demonstrated that changes in the expression of miRNAs are associated with immunity. In this study, miRNAs were extracted from the blood of four healthy giant pandas and sequenced by Illumina next generation sequencing technology. As determined by miRNA screening, a total of 276 conserved miRNAs and 51 novel putative miRNA candidates were detected. After differential expression analysis, we noticed that the expression of 7 miRNAs was significantly up-regulated in young giant pandas compared with adults. Moreover, 2 miRNAs were up-regulated in female giant pandas and 1 in the male individuals. Target gene prediction suggested that the miRNAs of giant panda might be relevant to the expression of 4,602 downstream genes. Subsequently, the predicted target genes were subjected to KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis, and we found that these genes were mainly involved in host immunity, including the Ras signaling pathway, the PI3K-Akt signaling pathway, and the MAPK signaling pathway. In conclusion, our results provide the first miRNA profiles of giant panda blood, and the predicted functional analyses may open an avenue for further study of giant panda immunity. PMID:26599861

  14. Profile of microRNA in Giant Panda Blood: A Resource for Immune-Related and Novel microRNAs.

    PubMed

    Yang, Mingyu; Du, Lianming; Li, Wujiao; Shen, Fujun; Fan, Zhenxin; Jian, Zuoyi; Hou, Rong; Shen, Yongmei; Yue, Bisong; Zhang, Xiuyue

    2015-01-01

    The giant panda (Ailuropoda melanoleuca) is one of the world's most beloved endangered mammals. Although the draft genome of this species had been assembled, little was known about the composition of its microRNAs (miRNAs) or their functional profiles. Recent studies demonstrated that changes in the expression of miRNAs are associated with immunity. In this study, miRNAs were extracted from the blood of four healthy giant pandas and sequenced by Illumina next generation sequencing technology. As determined by miRNA screening, a total of 276 conserved miRNAs and 51 novel putative miRNA candidates were detected. After differential expression analysis, we noticed that the expression of 7 miRNAs was significantly up-regulated in young giant pandas compared with adults. Moreover, 2 miRNAs were up-regulated in female giant pandas and 1 in the male individuals. Target gene prediction suggested that the miRNAs of giant panda might be relevant to the expression of 4,602 downstream genes. Subsequently, the predicted target genes were subjected to KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis, and we found that these genes were mainly involved in host immunity, including the Ras signaling pathway, the PI3K-Akt signaling pathway, and the MAPK signaling pathway. In conclusion, our results provide the first miRNA profiles of giant panda blood, and the predicted functional analyses may open an avenue for further study of giant panda immunity.

  15. Brain neurotransmitter transporter/receptor genomics and efavirenz central nervous system adverse events.

    PubMed

    Haas, David W; Bradford, Yuki; Verma, Anurag; Verma, Shefali S; Eron, Joseph J; Gulick, Roy M; Riddler, Sharon A; Sax, Paul E; Daar, Eric S; Morse, Gene D; Acosta, Edward P; Ritchie, Marylyn D

    2018-05-29

    We characterized associations between central nervous system (CNS) adverse events and brain neurotransmitter transporter/receptor genomics among participants randomized to efavirenz-containing regimens in AIDS Clinical Trials Group studies in the USA. Four clinical trials randomly assigned treatment-naive participants to efavirenz-containing regimens. Genome-wide genotype and PrediXcan were used to infer gene expression levels in tissues including 10 brain regions. Multivariable regression models stratified by race/ethnicity were adjusted for CYP2B6/CYP2A6 genotypes that predict plasma efavirenz exposure, age, and sex. Combined analyses also adjusted for genetic ancestry. Analyses included 167 cases with grade 2 or greater efavirenz-consistent CNS adverse events within 48 weeks of study entry, and 653 efavirenz-tolerant controls. CYP2B6/CYP2A6 genotype level was independently associated with CNS adverse events (odds ratio: 1.07; P=0.044). Predicted expression of six genes postulated to mediate efavirenz CNS side effects (SLC6A2, SLC6A3, PGR, HTR2A, HTR2B, HTR6) were not associated with CNS adverse events after correcting for multiple testing, the lowest P value being for PGR in hippocampus (P=0.012), nor were polymorphisms in these genes or AR and HTR2C, the lowest P value being for rs12393326 in HTR2C (P=6.7×10). As a positive control, baseline plasma bilirubin concentration was associated with predicted liver UGT1A1 expression level (P=1.9×10). Efavirenz-related CNS adverse events were not associated with predicted neurotransmitter transporter/receptor gene expression levels in brain or with polymorphisms in these genes. Variable susceptibility to efavirenz-related CNS adverse events may not be explained by brain neurotransmitter transporter/receptor genomics.

  16. TransCONFIRM: Identification of a Genetic Signature of Response to Fulvestrant in Advanced Hormone Receptor-Positive Breast Cancer.

    PubMed

    Jeselsohn, Rinath; Barry, William T; Migliaccio, Ilenia; Biagioni, Chiara; Zhao, Jin; De Tribolet-Hardy, Jonas; Guarducci, Cristina; Bonechi, Martina; Laing, Naomi; Winer, Eric P; Brown, Myles; Leo, Angelo Di; Malorni, Luca

    2016-12-01

    Fulvestrant is an estrogen receptor (ER) antagonist and an approved treatment for metastatic estrogen receptor-positive (ER+) breast cancer. With the exception of ER levels, there are no established predictive biomarkers of response to single-agent fulvestrant. We attempted to identify a gene signature of response to fulvestrant in advanced breast cancer. Primary tumor samples from 134 patients enrolled in the phase III CONFIRM study of patients with metastatic ER+ breast cancer comparing treatment with either 250 mg or 500 mg fulvestrant were collected for genome-wide transcriptomic analysis. Gene expression profiling was performed using Affymetrix microarrays. An exploratory analysis was performed to identify biologic pathways and new signatures associated with response to fulvestrant. Pathway analysis demonstrated that increased EGF pathway and FOXA1 transcriptional signaling is associated with decreased response to fulvestrant. Using a multivariate Cox model, we identified a novel set of 37 genes with an expression that is independently associated with progression-free survival (PFS). TFAP2C, a known regulator of ER activity, was ranked second in this gene set, and high expression was associated with a decreased response to fulvestrant. The negative predictive value of TFAP2C expression at the protein level was confirmed by IHC. We identified biologic pathways and a novel gene signature in primary ER+ breast cancers that predicts for response to treatment in the CONFIRM study. These results suggest potential new therapeutic targets and warrant further validation as predictive biomarkers of fulvestrant treatment in metastatic breast cancer. Clin Cancer Res; 22(23); 5755-64. ©2016 AACR.

  17. Gene expression levels of gamma-glutamyl hydrolase in tumor tissues may be a useful biomarker for the proper use of S-1 and tegafur-uracil/leucovorin in preoperative chemoradiotherapy for patients with rectal cancer.

    PubMed

    Sadahiro, Sotaro; Suzuki, T; Tanaka, A; Okada, K; Saito, G; Miyakita, H; Ogimi, T; Nagase, H

    2017-06-01

    Preoperative chemoradiotherapy (CRT) using 5-fluorouracil (5-FU)-based chemotherapy is the standard of care for rectal cancer. The effect of additional chemotherapy during the period between the completion of radiotherapy and surgery remains unclear. Predictive factors for CRT may differ between combination chemotherapy with S-1 and with tegafur-uracil/leucovorin (UFT/LV). The subjects were 54 patients with locally advanced rectal cancer who received preoperative CRT with S-1 or UFT/LV. The pathological tumor response was assessed according to the tumor regression grade (TRG). The expression levels of 18 CRT-related genes were determined using RT-PCR assay. A pathological response (TRG 1-2) was observed in 23 patients (42.6%). In a multivariate logistic regression analysis for pathological response, the overall expression levels of four genes, HIF1A, MTHFD1, GGH and TYMS, were significant, and the accuracy rate of the predictive model was 83.3%. The effects of the gene expression levels of GGH on the response differed significantly according to the treatment regimen. The total pathological response rate of both high-GGH patients in the S-1 group and low-GGH patients in the UFT/LV group was 58.3%. Additional treatment with 5-FU-based chemotherapy during the interval between radiotherapy and surgery is not beneficial in patients who have received 5-FU-based CRT. The expression levels of four genes, HIF1A, MTHFD1, GGH and TYMS, in tumor tissues can predict the response to preoperative CRT including either S-1 or UFT/LV. In particular, the gene expression level of GGH in tumor tissues may be a useful biomarker for the appropriate use of S-1 and UFT/LV in CRT.

  18. Genes associated with metabolic syndrome predict disease-free survival in stage II colorectal cancer patients. A novel link between metabolic dysregulation and colorectal cancer.

    PubMed

    Vargas, Teodoro; Moreno-Rubio, Juan; Herranz, Jesús; Cejas, Paloma; Molina, Susana; González-Vallinas, Margarita; Ramos, Ricardo; Burgos, Emilio; Aguayo, Cristina; Custodio, Ana B; Reglero, Guillermo; Feliu, Jaime; Ramírez de Molina, Ana

    2014-12-01

    Studies have recently suggested that metabolic syndrome and its components increase the risk of colorectal cancer. Both diseases are increasing in most countries, and the genetic association between them has not been fully elucidated. The objective of this study was to assess the association between genetic risk factors of metabolic syndrome or related conditions (obesity, hyperlipidaemia, diabetes mellitus type 2) and clinical outcome in stage II colorectal cancer patients. Expression levels of several genes related to metabolic syndrome and associated alterations were analysed by real-time qPCR in two equivalent but independent sets of stage II colorectal cancer patients. Using logistic regression models and cross-validation analysis with all tumour samples, we developed a metabolic syndrome-related gene expression profile to predict clinical outcome in stage II colorectal cancer patients. The results showed that a gene expression profile constituted by genes previously related to metabolic syndrome was significantly associated with clinical outcome of stage II colorectal cancer patients. This metabolic profile was able to identify patients with a low risk and high risk of relapse. Its predictive value was validated using an independent set of stage II colorectal cancer patients. The identification of a set of genes related to metabolic syndrome that predict survival in intermediate-stage colorectal cancer patients allows delineation of a high-risk group that may benefit from adjuvant therapy and avoid the toxic and unnecessary chemotherapy in patients classified as low risk. Our results also confirm the linkage between metabolic disorder and colorectal cancer and suggest the potential for cancer prevention and/or treatment by targeting these genes. Copyright © 2014 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

  19. Characterization of Clostridium perfringens iota-toxin genes and expression in Escherichia coli.

    PubMed

    Perelle, S; Gibert, M; Boquet, P; Popoff, M R

    1993-12-01

    The iota toxin, which is produced by Clostridium perfringens type E, is a binary toxin consisting of two independent polypeptides: Ia, which is an ADP-ribosyltransferase, and Ib, which is involved in the binding and internalization of the toxin into the cell. Two degenerate oligonucleotide probes deduced from partial amino acid sequence of each component of C. spiroforme toxin, which is closely related to the iota toxin, were used to clone three overlapping DNA fragments containing the iota-toxin genes from C. perfringens type E plasmid DNA. Two genes, in the same orientation, coding for Ia (387 amino acids) and Ib (875 amino acids) and separated by 243 noncoding nucleotides were identified. A predicted signal peptide was found for each component, and the secreted Ib displays two domains, the propeptide (172 amino acids) and the mature protein (664 amino acids). The Ia gene has been expressed in Escherichia coli and C. perfringens, under the control of its own promoter. The recombinant polypeptide obtained was recognized by Ia antibodies and ADP-ribosylated actin. The expression of the Ib gene was obtained in E. coli harboring a recombinant plasmid encompassing the putative promoter upstream of the Ia gene and the Ia and Ib genes. Two residues which have been found to be involved in the NAD+ binding site of diphtheria and Pseudomonas toxins are conserved in the predicted Ia sequence (Glu-14 and Trp-19). The predicted amino acid Ib sequence shows 33.9% identity with and 54.4% similarity to the protective antigen of the anthrax toxin complex. In particular, the central region of Ib, which contains a predicted transmembrane segment (Leu-292 to Ser-308), presents 45% identity with the corresponding protective antigen sequence which is involved in the translocation of the toxin across the cell membrane.

  20. Over-expression of the miRNA cluster at chromosome 14q32 in the alcoholic brain correlates with suppression of predicted target mRNA required for oligodendrocyte proliferation.

    PubMed

    Manzardo, A M; Gunewardena, S; Butler, M G

    2013-09-10

    We examined miRNA expression from RNA isolated from the frontal cortex (Brodmann area 9) of 9 alcoholics (6 males, 3 females, mean age 48 years) and 9 matched controls using both the Affymetrix GeneChip miRNA 2.0 and Human Exon 1.0 ST Arrays to further characterize genetic influences in alcoholism and the effects of alcohol consumption on predicted target mRNA expression. A total of 12 human miRNAs were significantly up-regulated in alcohol-dependent subjects (fold change ≥ 1.5, false discovery rate (FDR) ≤ 0.3; p < 0.05) compared with controls, including a cluster of 4 miRNAs (e.g., miR-377, miR-379) from the maternally expressed 14q32 chromosome region. The status of the up-regulated miRNAs was supported using the high-throughput method of exon microarrays showing decreased predicted mRNA gene target expression as anticipated from the same RNA aliquot. Predicted mRNA targets were involved in cellular adhesion (e.g., THBS2), tissue differentiation (e.g., CHN2), neuronal migration (e.g., NDE1), myelination (e.g., UGT8, CNP) and oligodendrocyte proliferation (e.g., ENPP2, SEMA4D1). Our data support an association of alcoholism with up-regulation of a cluster of miRNAs located in the genomic imprinted domain on chromosome 14q32 with their predicted gene targets involved with oligodendrocyte growth, differentiation and signaling. Copyright © 2013 Elsevier B.V. All rights reserved.
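
    The selection rule quoted in this abstract (fold change ≥ 1.5, FDR ≤ 0.3, p < 0.05) can be illustrated with a minimal sketch. The abstract does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment below is an assumption, and the miRNA names and numbers are hypothetical placeholders, not data from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def select_upregulated(records, min_fold=1.5, max_fdr=0.3, max_p=0.05):
    """records: list of (name, fold_change, p_value) tuples (hypothetical layout)."""
    q = benjamini_hochberg([p for _, _, p in records])
    return [name for (name, fc, p), qv in zip(records, q)
            if fc >= min_fold and qv <= max_fdr and p < max_p]


# Hypothetical example values, for illustration only.
records = [("miR-A", 1.8, 0.01), ("miR-B", 1.2, 0.02),
           ("miR-C", 2.1, 0.04), ("miR-D", 1.6, 0.20)]
print(select_upregulated(records))
```

    With only a handful of tests the FDR threshold rarely binds; across thousands of probes on an array it typically becomes the more restrictive filter.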

  1. Temporal Expression-based Analysis of Metabolism

    PubMed Central

    Segrè, Daniel

    2012-01-01

    Metabolic flux is frequently rerouted through cellular metabolism in response to dynamic changes in the intra- and extra-cellular environment. Capturing the mechanisms underlying these metabolic transitions in quantitative and predictive models is a prominent challenge in systems biology. Progress in this regard has been made by integrating high-throughput gene expression data into genome-scale stoichiometric models of metabolism. Here, we extend previous approaches to perform a Temporal Expression-based Analysis of Metabolism (TEAM). We apply TEAM to understanding the complex metabolic dynamics of the respiratorily versatile bacterium Shewanella oneidensis grown under aerobic, lactate-limited conditions. TEAM predicts temporal metabolic flux distributions using time-series gene expression data. Increased predictive power is achieved by supplementing these data with a large reference compendium of gene expression, which allows us to take into account the unique character of the distribution of expression of each individual gene. We further propose a straightforward method for studying the sensitivity of TEAM to changes in its fundamental free threshold parameter θ, and reveal that discrete zones of distinct metabolic behavior arise as this parameter is changed. By comparing the qualitative characteristics of these zones to additional experimental data, we are able to constrain the range of θ to a small, well-defined interval. In parallel, the sensitivity analysis reveals the inherently difficult nature of dynamic metabolic flux modeling: small errors early in the simulation propagate to relatively large changes later in the simulation. We expect that handling such “history-dependent” sensitivities will be a major challenge in the future development of dynamic metabolic-modeling techniques. PMID:23209390

  2. First Generation Gene Expression Signature for Early Prediction of Late Occurring Hematological Acute Radiation Syndrome in Baboons.

    PubMed

    Port, M; Herodin, F; Valente, M; Drouet, M; Lamkowski, A; Majewski, M; Abend, M

    2016-07-01

    We implemented a two-stage study to predict late-occurring hematologic acute radiation syndrome (HARS) in a baboon model based on gene expression changes measured in peripheral blood within the first two days after irradiation. Eighteen baboons were irradiated to simulate different patterns of partial-body and total-body exposure, which corresponded to an equivalent dose of 2.5 or 5 Gy. According to changes in blood cell counts, the surviving baboons (n = 17) exhibited mild (H1-2, n = 4) or more severe (H2-3, n = 13) HARS. Blood samples taken before irradiation served as unexposed control (H0, n = 17). For stage I of this study, a whole genome screen (mRNA microarrays) was performed using a portion of the samples (H0, n = 5; H1-2, n = 4; H2-3, n = 5). For stage II, using the remaining samples and the more sensitive methodology, qRT-PCR, validation was performed on candidate genes that were differentially up- or down-regulated during the first two days after irradiation. Differential gene expression was defined as significant (P < 0.05) and greater than or equal to a twofold difference above a H0 classification. From approximately 20,000 genes, on average 46% appeared to be expressed. On day 1 postirradiation for H2-3, approximately 2-3 times more genes appeared up-regulated (1,418 vs. 550) or down-regulated (1,603 vs. 735) compared to H1-2. This pattern became more pronounced at day 2 while the number of differentially expressed genes decreased. The specific genes showed an enrichment of biological processes coding for immune system processes, natural killer cell activation and immune response (P ranging from 1 × 10^-6 to 9 × 10^-14). Based on the P values, magnitude and sustained differential gene expression over time, we selected 89 candidate genes for validation using qRT-PCR. Ultimately, 22 genes were confirmed for identification of H1-3 classifications and seven genes for identification of H2-3 classifications using qRT-PCR. For H1-3 classifications, most genes were constantly three to fivefold down-regulated relative to H0 over both days, but some genes appeared 10.3-fold (VSIG4) or even 30.7-fold up-regulated (CD177) over H0. For H2-3, some genes appeared four to sevenfold up-regulated relative to H0 (RNASE3, DAGLA, ARG2), but other genes showed a strong 14- to 33-fold down-regulation relative to H0 (WNT3, POU2AF1, CCR7). All of these genes allowed an almost completely identifiable separation among each of the HARS categories. In summary, clinically relevant HARS can be independently predicted with all 29 irradiated genes examined in the peripheral blood of baboons within the first two days postirradiation. While further studies are needed to confirm these findings, this model shows potential relevance in the prediction of clinical outcomes in exposed humans and as an aid in the prioritizing of medical treatment.

  3. Characterization of transformation related genes in oral cancer cells.

    PubMed

    Chang, D D; Park, N H; Denny, C T; Nelson, S F; Pe, M

    1998-04-16

    A cDNA representational difference analysis (cDNA-RDA) and an arrayed filter technique were used to characterize transformation-related genes in oral cancer. From an initial comparison of normal oral epithelial cells and a human papilloma virus (HPV)-immortalized oral epithelial cell line, we obtained 384 differentially expressed gene fragments and arrayed them on a filter. Two hundred and twelve redundant clones were identified by three rounds of back hybridization. Sequence analysis of the remaining clones revealed 99 unique clones corresponding to 69 genes. The expression of these transformation-related gene fragments in three nontumorigenic HPV-immortalized oral epithelial cell lines and three oral cancer cell lines was simultaneously monitored using a cDNA array hybridization. Although there was considerable cell line-to-cell line variability in the expression of these clones, a reliable prediction of their expression could be made from the cDNA array hybridization. Our study demonstrates the utility of combining cDNA-RDA and arrayed filters in high-throughput gene expression difference analysis. The differentially expressed genes identified in this study should be informative in studying oral epithelial cell carcinogenesis.

  4. The prediction of biogenic magnetic nanoparticles biomineralization in human tissues and organs

    NASA Astrophysics Data System (ADS)

    Medviediev, O.; Gorobets, O. Yu; Gorobets, S. V.; Yadrykhins'ky, V. S.

    2017-10-01

    In this study, human homologs of magnetosome island proteins were found on the basis of pairwise and multiple alignment of amino acid sequences. The expression levels of genes that encode magnetosome island proteins of M. gryphiswaldense MSR-1, cultured under oxygen-deficient and under microaerobic conditions, were compared with the expression levels of genes that encode the relevant homologs in the human organism. The possibility of BMN biomineralization was predicted for human tissues and organs in which BMN had not previously been found experimentally.

  5. Network information improves cancer outcome prediction.

    PubMed

    Roy, Janine; Winter, Christof; Isik, Zerrin; Schroeder, Michael

    2014-07-01

    Disease progression in cancer can vary substantially between patients. Yet, patients often receive the same treatment. Recently, there has been much work on predicting disease progression and patient outcome variables from gene expression in order to personalize treatment options. Despite first diagnostic kits in the market, there are open problems such as the choice of random gene signatures or noisy expression data. One approach to deal with these two problems employs protein-protein interaction networks and ranks genes using the random surfer model of Google's PageRank algorithm. In this work, we created a benchmark dataset collection comprising 25 cancer outcome prediction datasets from literature and systematically evaluated the use of networks and a PageRank derivative, NetRank, for signature identification. We show that the NetRank performs significantly better than classical methods such as fold change or t-test. Despite an order of magnitude difference in network size, a regulatory and protein-protein interaction network perform equally well. Experimental evaluation on cancer outcome prediction in all of the 25 underlying datasets suggests that the network-based methodology identifies highly overlapping signatures over all cancer types, in contrast to classical methods that fail to identify highly common gene sets across the same cancer types. Integration of network information into gene expression analysis allows the identification of more reliable and accurate biomarkers and provides a deeper understanding of processes occurring in cancer development and progression. © The Author 2012. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
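
    The idea of ranking genes with a PageRank-style random surfer on an interaction network can be sketched as follows. The abstract does not give the NetRank update rule, so this is a generic personalized-PageRank sketch rather than the authors' implementation. The damping value, the use of a per-gene outcome-relevance score as the restart distribution, and all gene names and numbers are assumptions made for illustration.

```python
def netrank_sketch(adjacency, relevance, damping=0.5, iterations=100):
    """Rank genes by propagating outcome relevance over a network.

    adjacency: dict gene -> list of neighbour genes; assumed undirected
               (symmetric) with no isolated genes.
    relevance: dict gene -> non-negative score, e.g. the absolute correlation
               of the gene's expression with patient outcome (assumed input).
    """
    genes = list(adjacency)
    total = sum(relevance[g] for g in genes)
    prior = {g: relevance[g] / total for g in genes}
    rank = dict(prior)
    for _ in range(iterations):
        new_rank = {}
        for g in genes:
            # Each neighbour shares its current rank equally among its own neighbours.
            incoming = sum(rank[n] / len(adjacency[n]) for n in adjacency[g])
            new_rank[g] = (1 - damping) * prior[g] + damping * incoming
        rank = new_rank
    return rank


# Hypothetical toy network: a weakly relevant hub linked to three relevant genes.
adjacency = {"HUB": ["A", "B", "C"], "A": ["HUB"], "B": ["HUB"], "C": ["HUB"]}
relevance = {"HUB": 0.1, "A": 0.9, "B": 0.8, "C": 0.7}
ranks = netrank_sketch(adjacency, relevance)
print(sorted(ranks, key=ranks.get, reverse=True))
```

    In this toy case the hub ends up ranked first even though its own relevance score is the lowest, which is the kind of effect a network-based ranking is meant to capture and a per-gene statistic such as fold change or a t-test cannot.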

  6. Global map of physical interactions among differentially expressed genes in multiple sclerosis relapses and remissions.

    PubMed

    Tuller, Tamir; Atar, Shimshi; Ruppin, Eytan; Gurevich, Michael; Achiron, Anat

    2011-09-15

    Multiple sclerosis (MS) is a central nervous system autoimmune inflammatory T-cell-mediated disease with a relapsing-remitting course in the majority of patients. In this study, we performed a high-resolution systems biology analysis of gene expression and physical interactions in MS relapse and remission. To this end, we integrated 164 large-scale measurements of gene expression in peripheral blood mononuclear cells of MS patients in relapse or remission and healthy subjects, with large-scale information about the physical interactions between these genes obtained from public databases. These data were analyzed with a variety of computational methods. We find that there is a clear and significant global network-level signal that is related to the changes in gene expression of MS patients in comparison to healthy subjects. However, despite the clear differences in the clinical symptoms of MS patients in relapse versus remission, the network-level signal is weaker when comparing patients in these two stages of the disease. This result suggests that most of the genes have relatively similar expression levels in the two stages of the disease. In accordance with previous studies, we found that the pathways related to regulation of cell death, chemotaxis and inflammatory response are differentially expressed in the disease in comparison to healthy subjects, while pathways related to cell adhesion, cell migration and cell-cell signaling are activated in relapse in comparison to remission. However, the current study includes a detailed report of the exact set of genes involved in these pathways and the interactions between them. For example, we found that the genes TP53 and IL1 are 'network-hubs' that interact with many of the differentially expressed genes in MS patients versus healthy subjects, and the epidermal growth factor receptor is a 'network-hub' in the case of MS patients with relapse versus remission. The statistical approaches employed in this study enabled us to report new sets of genes that according to their gene expression and physical interactions are predicted to be differentially expressed in MS versus healthy subjects, and in MS patients in relapse versus remission. Some of these genes may be useful biomarkers for diagnosing MS and predicting relapses in MS patients.

  7. Faster-X Evolution of Gene Expression in Drosophila

    PubMed Central

    Meisel, Richard P.; Malone, John H.; Clark, Andrew G.

    2012-01-01

    DNA sequences on X chromosomes often have a faster rate of evolution when compared to similar loci on the autosomes, and well articulated models provide reasons why the X-linked mode of inheritance may be responsible for the faster evolution of X-linked genes. We analyzed microarray and RNA–seq data collected from females and males of six Drosophila species and found that the expression levels of X-linked genes also diverge faster than autosomal gene expression, similar to the “faster-X” effect often observed in DNA sequence evolution. Faster-X evolution of gene expression was recently described in mammals, but it was limited to the evolutionary lineages shortly following the creation of the therian X chromosome. In contrast, we detect a faster-X effect along both deep lineages and those on the tips of the Drosophila phylogeny. In Drosophila males, the dosage compensation complex (DCC) binds the X chromosome, creating a unique chromatin environment that promotes the hyper-expression of X-linked genes. We find that DCC binding, chromatin environment, and breadth of expression are all predictive of the rate of gene expression evolution. In addition, estimates of the intraspecific genetic polymorphism underlying gene expression variation suggest that X-linked expression levels are not under relaxed selective constraints. We therefore hypothesize that the faster-X evolution of gene expression is the result of the adaptive fixation of beneficial mutations at X-linked loci that change expression level in cis. This adaptive faster-X evolution of gene expression is limited to genes that are narrowly expressed in a single tissue, suggesting that relaxed pleiotropic constraints permit a faster response to selection. Finally, we present a conceptional framework to explain faster-X expression evolution, and we use this framework to examine differences in the faster-X effect between Drosophila and mammals. PMID:23071459

  8. Multiple abiotic stimuli are integrated in the regulation of rice gene expression under field conditions.

    PubMed

    Plessis, Anne; Hafemeister, Christoph; Wilkins, Olivia; Gonzaga, Zennia Jean; Meyer, Rachel Sarah; Pires, Inês; Müller, Christian; Septiningsih, Endang M; Bonneau, Richard; Purugganan, Michael

    2015-11-26

    Plants rely on transcriptional dynamics to respond to multiple climatic fluctuations and contexts in nature. We analyzed the genome-wide gene expression patterns of rice (Oryza sativa) growing in rainfed and irrigated fields during two distinct tropical seasons and determined simple linear models that relate transcriptomic variation to climatic fluctuations. These models combine multiple environmental parameters to account for patterns of expression in the field of co-expressed gene clusters. We examined the similarities of our environmental models between tropical and temperate field conditions, using previously published data. We found that field type and macroclimate had broad impacts on transcriptional responses to environmental fluctuations, especially for genes involved in photosynthesis and development. Nevertheless, variation in solar radiation and temperature at the timescale of hours had reproducible effects across environmental contexts. These results provide a basis for broad-based predictive modeling of plant gene expression in the field.
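
    A linear model that combines several environmental parameters to explain the expression of a co-expressed gene cluster can be sketched with ordinary least squares. The abstract does not specify the fitting procedure, so plain least squares is an assumption here, as are the two predictors (temperature and solar radiation) and the synthetic values, which are generated from known coefficients purely to show that the fit recovers them.

```python
def fit_linear_model(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    rows: list of predictor tuples; an intercept column is added here.
    Returns the coefficients [intercept, b1, b2, ...].
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic (temperature in deg C, solar radiation in W/m^2) observations.
rows = [(25, 100), (30, 400), (28, 800), (22, 0), (33, 600)]
# Synthetic cluster expression generated from known coefficients.
targets = [2.0 + 0.5 * t + 0.01 * s for t, s in rows]
print(fit_linear_model(rows, targets))
```

    Real field data would add noise, many more time points, and possibly lagged or interaction terms; the sketch only shows the basic shape of such a model.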

  9. Artificial genetic selection for an efficient translation initiation site for expression of human RACK1 gene in Escherichia coli

    PubMed Central

    Zhelyabovskaya, Olga B.; Berlin, Yuri A.; Birikh, Klara R.

    2004-01-01

    In bacterial expression systems, translation initiation is usually the rate limiting and the least predictable stage of protein synthesis. Efficiency of a translation initiation site can vary dramatically depending on the sequence context. This is why many standard expression vectors provide very poor expression levels of some genes. This notion persuaded us to develop an artificial genetic selection protocol, which allows one to find for a given target gene an individual efficient ribosome binding site from a random pool. In order to create Darwinian pressure necessary for the genetic selection, we designed a system based on translational coupling, in which microorganism survival in the presence of antibiotic depends on expression of the target gene, while putting no special requirements on this gene. Using this system we obtained superproducing constructs for the human protein RACK1 (receptor for activated C kinase). PMID:15034151

  10. Regulation of human genome expression and RNA splicing by human papillomavirus 16 E2 protein.

    PubMed

    Gauson, Elaine J; Windle, Brad; Donaldson, Mary M; Caffarel, Maria M; Dornan, Edward S; Coleman, Nicholas; Herzyk, Pawel; Henderson, Scott C; Wang, Xu; Morgan, Iain M

    2014-11-01

    Human papillomavirus 16 (HPV16) is causative in human cancer. The E2 protein regulates transcription from and replication of the viral genome; the role of E2 in regulating the host genome has been less well studied. We have expressed HPV16 E2 (E2) stably in U2OS cells; these cells tolerate E2 expression well and gene expression analysis identified 74 genes showing differential expression specific to E2. Analysis of published gene expression data sets during cervical cancer progression identified 20 of the genes as being altered in a similar direction as the E2 specific genes. In addition, E2 altered the splicing of many genes implicated in cancer and cell motility. The E2 expressing cells showed no alteration in cell growth but were altered in cell motility, consistent with the E2 induced altered splicing predicted to affect this cellular function. The results present a model system for investigating E2 regulation of the host genome. Copyright © 2014 Elsevier Inc. All rights reserved.

  11. AXIN2 expression predicts prostate cancer recurrence and regulates invasion and tumor growth.

    PubMed

    Hu, Brian R; Fairey, Adrian S; Madhav, Anisha; Yang, Dongyun; Li, Meng; Groshen, Susan; Stephens, Craig; Kim, Philip H; Virk, Navneet; Wang, Lina; Martin, Sue Ellen; Erho, Nicholas; Davicioni, Elai; Jenkins, Robert B; Den, Robert B; Xu, Tong; Xu, Yucheng; Gill, Inderbir S; Quinn, David I; Goldkorn, Amir

    2016-05-01

    Treatment of prostate cancer (PCa) may be improved by identifying biological mechanisms of tumor growth that directly impact clinical disease progression. We investigated whether genes associated with a highly tumorigenic, drug resistant, progenitor phenotype impact PCa biology and recurrence. Radical prostatectomy (RP) specimens (±disease recurrence, N = 276) were analyzed by qRT-PCR to quantify expression of genes associated with self-renewal, drug resistance, and tumorigenicity in prior studies. Associations between gene expression and PCa recurrence were confirmed by bootstrap internal validation and by external validation in independent cohorts (total N = 675) and in silico. siRNA knockdown and lentiviral overexpression were used to determine the effect of gene expression on PCa invasion, proliferation, and tumor growth. Four candidate genes were differentially expressed in PCa recurrence. Of these, low AXIN2 expression was internally validated in the discovery cohort. Validation in external cohorts and in silico demonstrated that low AXIN2 was independently associated with more aggressive PCa, biochemical recurrence, and metastasis-free survival after RP. Functionally, siRNA-mediated depletion of AXIN2 significantly increased invasiveness, proliferation, and tumor growth. Conversely, ectopic overexpression of AXIN2 significantly reduced invasiveness, proliferation, and tumor growth. Low AXIN2 expression was associated with PCa recurrence after RP in our test population as well as in external validation cohorts, and its expression levels in PCa cells significantly impacted invasiveness, proliferation, and tumor growth. Given these novel roles, further study of AXIN2 in PCa may yield promising new predictive and therapeutic strategies. © 2016 Wiley Periodicals, Inc.

  12. A detailed gene expression study of the Miscanthus genus reveals changes in the transcriptome associated with the rejuvenation of spring rhizomes.

    PubMed

    Barling, Adam; Swaminathan, Kankshita; Mitros, Therese; James, Brandon T; Morris, Juliette; Ngamboma, Ornella; Hall, Megan C; Kirkpatrick, Jessica; Alabady, Magdy; Spence, Ashley K; Hudson, Matthew E; Rokhsar, Daniel S; Moose, Stephen P

    2013-12-09

    The Miscanthus genus of perennial C4 grasses contains promising biofuel crops for temperate climates. However, few genomic resources exist for Miscanthus, which limits understanding of its interesting biology and future genetic improvement. A comprehensive catalog of expressed sequences was generated from a variety of Miscanthus species and tissue types, with an emphasis on characterizing gene expression changes in spring compared to fall rhizomes. Illumina short read sequencing technology was used to produce transcriptome sequences from different tissues and organs during distinct developmental stages for multiple Miscanthus species, including Miscanthus sinensis, Miscanthus sacchariflorus, and their interspecific hybrid Miscanthus × giganteus. More than fifty billion base-pairs of Miscanthus transcript sequence were produced. Overall, 26,230 Sorghum gene models (i.e., ~ 96% of predicted Sorghum genes) had at least five Miscanthus reads mapped to them, suggesting that a large portion of the Miscanthus transcriptome is represented in this dataset. The Miscanthus × giganteus data was used to identify genes preferentially expressed in a single tissue, such as the spring rhizome, using Sorghum bicolor as a reference. Quantitative real-time PCR was used to verify examples of preferential expression predicted via RNA-Seq. Contiguous consensus transcript sequences were assembled for each species and annotated using InterProScan. Sequences from the assembled transcriptome were used to amplify genomic segments from a doubled haploid Miscanthus sinensis and from Miscanthus × giganteus to further disentangle the allelic and paralogous variations in genes. This large expressed sequence tag collection creates a valuable resource for the study of Miscanthus biology by providing detailed gene sequence information and tissue-preferred expression patterns. We have successfully generated a database of transcriptome assemblies and demonstrated its use in the study of genes of interest. Analysis of gene expression profiles revealed biological pathways that exhibit altered regulation in spring compared to fall rhizomes, which are consistent with their different physiological functions. The expression profiles of the subterranean rhizome provide a better understanding of the biological activities of the underground stem structures that are essential for perenniality and the storage or remobilization of carbon and nutrient resources.

  13. S100A9 and EGFR gene signatures predict disease progression in muscle invasive bladder cancer patients after chemotherapy.

    PubMed

    Kim, W T; Kim, J; Yan, C; Jeong, P; Choi, S Y; Lee, O J; Chae, Y B; Yun, S J; Lee, S C; Kim, W J

    2014-05-01

    In our previous gene expression profile analysis, IL1B, S100A8, S100A9, and EGFR were shown to be important mediators of muscle invasive bladder cancer (MIBC) progression. The aim of the present study was to investigate the ability of these gene signatures to predict disease progression after chemotherapy in patients with locally recurrent or metastatic MIBC. Patients with locally advanced MIBC who received chemotherapy were enrolled. The expression signatures of four genes were measured, and further functional analysis was carried out to confirm our findings. Two of the four genes, S100A9 and EGFR, were determined to significantly influence disease progression (P = 0.023, 0.045, respectively). Based on a receiver operating characteristic curve, a cut-off value for disease progression was determined. Patients in the good-prognostic signature group had a significantly longer time to progression and cancer-specific survival time than those in the poor-prognostic signature group (P < 0.001, 0.042, respectively). In the multivariate Cox regression analysis, gene signature was the only factor that significantly influenced disease progression [hazard ratio: 4.726, confidence interval: 1.623-13.763, P = 0.004]. In immunohistochemical analysis, S100A9 and EGFR positivity were associated with disease progression after chemotherapy. Protein expression of S100A9/EGFR showed modest correlation with gene expression of S100A9/EGFR (r = 0.395, P = 0.014 and r = 0.453, P = 0.004). Our functional analysis provided evidence that expression of S100A9 and EGFR is closely associated with chemoresistance, and that inhibition of S100A9 and EGFR may sensitize bladder tumor cells to cisplatin-based chemotherapy. The S100A9/EGFR level is a novel prognostic marker to predict the chemoresponsiveness of patients with locally recurrent or metastatic MIBC.
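
    Deriving a cut-off value from a receiver operating characteristic curve can be sketched as follows. The abstract does not state which criterion was used to pick the cut-off, so maximizing Youden's J (sensitivity + specificity - 1) is an assumption. The scores and outcomes are hypothetical, and higher scores are assumed to indicate progression.

```python
def youden_cutoff(scores, progressed):
    """Pick the threshold that maximizes sensitivity + specificity - 1.

    scores: expression-based risk scores (higher = predicted progression).
    progressed: parallel list of booleans (True = disease progressed).
    Returns (best_cutoff, best_j); a sample is called positive if score >= cutoff.
    """
    positives = sum(progressed)
    negatives = len(progressed) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, p in zip(scores, progressed) if p and s >= cutoff)
        fp = sum(1 for s, p in zip(scores, progressed) if not p and s >= cutoff)
        # sensitivity = tp / positives, specificity = 1 - fp / negatives
        j = tp / positives - fp / negatives
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical scores and outcomes, for illustration only.
scores = [0.1, 0.2, 0.4, 0.5, 0.7, 0.8, 0.9]
progressed = [False, False, False, True, False, True, True]
print(youden_cutoff(scores, progressed))
```

    Patients scoring at or above the chosen cut-off would then form the poor-prognostic group and the rest the good-prognostic group, which is the kind of split the survival comparison in the abstract relies on.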

  14. Analyses of Expressed Sequence Tags from Apple

    PubMed Central

    Newcomb, Richard D.; Crowhurst, Ross N.; Gleave, Andrew P.; Rikkerink, Erik H.A.; Allan, Andrew C.; Beuning, Lesley L.; Bowen, Judith H.; Gera, Emma; Jamieson, Kim R.; Janssen, Bart J.; Laing, William A.; McArtney, Steve; Nain, Bhawana; Ross, Gavin S.; Snowden, Kimberley C.; Souleyre, Edwige J.F.; Walton, Eric F.; Yauk, Yar-Khing

    2006-01-01

    The domestic apple (Malus domestica; also known as Malus pumila Mill.) has become a model fruit crop in which to study commercial traits such as disease and pest resistance, grafting, and flavor and health compound biosynthesis. To speed the discovery of genes involved in these traits, develop markers to map genes, and breed new cultivars, we have produced a substantial expressed sequence tag collection from various tissues of apple, focusing on fruit tissues of the cultivar Royal Gala. Over 150,000 expressed sequence tags have been collected from 43 different cDNA libraries representing 34 different tissues and treatments. Clustering of these sequences results in a set of 42,938 nonredundant sequences comprising 17,460 tentative contigs and 25,478 singletons, together representing what we predict are approximately one-half the expressed genes from apple. Many potential molecular markers are abundant in the apple transcripts. Dinucleotide repeats are found in 4,018 nonredundant sequences, mainly in the 5′-untranslated region of the gene, with a bias toward one repeat type (containing AG, 88%) and against another (repeats containing CG, 0.1%). Trinucleotide repeats are most common in the predicted coding regions and do not show a similar degree of sequence bias in their representation. Bi-allelic single-nucleotide polymorphisms are highly abundant with one found, on average, every 706 bp of transcribed DNA. Predictions of the numbers of representatives from protein families indicate the presence of many genes involved in disease resistance and the biosynthesis of flavor and health-associated compounds. Comparisons of some of these gene families with Arabidopsis (Arabidopsis thaliana) suggest instances where there have been duplications in the lineages leading to apple of biosynthetic and regulatory genes that are expressed in fruit. This resource paves the way for a concerted functional genomics effort in this important temperate fruit crop. PMID:16531485

  15. Dealing with the genetic load in bacterial synthetic biology circuits: convergences with the Ohm's law

    PubMed Central

    Carbonell-Ballestero, M.; Garcia-Ramallo, E.; Montañez, R.; Rodriguez-Caso, C.; Macía, J.

    2016-01-01

    Synthetic biology seeks to envision living cells as a matter of engineering. However, increasing evidence suggests that the genetic load imposed by the incorporation of synthetic devices in a living organism introduces a sort of unpredictability in the design process. As a result, individual part characterization is not enough to predict the behavior of designed circuits and thus, a costly trial-error process is eventually required. In this work, we provide a new theoretical framework for the predictive treatment of the genetic load. We mathematically and experimentally demonstrate that dependences among genes follow a quantitatively predictable behavior. Our theory predicts the observed reduction of the expression of a given synthetic gene when an extra genetic load is introduced in the circuit. The theory also explains that such dependence qualitatively differs when the extra load is added either by transcriptional or translational modifications. We finally show that the limitation of the cellular resources for gene expression leads to a mathematical formulation that converges to an expression analogous to the Ohm's law for electric circuits. Similitudes and divergences with this law are outlined. Our work provides a suitable framework with predictive character for the design process of complex genetic devices in synthetic biology. PMID:26656950

  16. A 17-gene assay to predict prostate cancer aggressiveness in the context of Gleason grade heterogeneity, tumor multifocality, and biopsy undersampling.

    PubMed

    Klein, Eric A; Cooperberg, Matthew R; Magi-Galluzzi, Cristina; Simko, Jeffry P; Falzarano, Sara M; Maddala, Tara; Chan, June M; Li, Jianbo; Cowan, Janet E; Tsiatis, Athanasios C; Cherbavaz, Diana B; Pelham, Robert J; Tenggara-Hunter, Imelda; Baehner, Frederick L; Knezevic, Dejan; Febbo, Phillip G; Shak, Steven; Kattan, Michael W; Lee, Mark; Carroll, Peter R

    2014-09-01

    Prostate tumor heterogeneity and biopsy undersampling pose challenges to accurate, individualized risk assessment for men with localized disease. To identify and validate a biopsy-based gene expression signature that predicts clinical recurrence, prostate cancer (PCa) death, and adverse pathology. Gene expression was quantified by reverse transcription-polymerase chain reaction for three studies-a discovery prostatectomy study (n=441), a biopsy study (n=167), and a prospectively designed, independent clinical validation study (n=395)-testing retrospectively collected needle biopsies from contemporary (1997-2011) patients with low to intermediate clinical risk who were candidates for active surveillance (AS). The main outcome measures defining aggressive PCa were clinical recurrence, PCa death, and adverse pathology at prostatectomy. Cox proportional hazards regression models were used to evaluate the association between gene expression and time to event end points. Results from the prostatectomy and biopsy studies were used to develop and lock a multigene-expression-based signature, called the Genomic Prostate Score (GPS); in the validation study, logistic regression was used to test the association between the GPS and pathologic stage and grade at prostatectomy. Decision-curve analysis and risk profiles were used together with clinical and pathologic characteristics to evaluate clinical utility. Of the 732 candidate genes analyzed, 288 (39%) were found to predict clinical recurrence despite heterogeneity and multifocality, and 198 (27%) were predictive of aggressive disease after adjustment for prostate-specific antigen, Gleason score, and clinical stage. Further analysis identified 17 genes representing multiple biological pathways that were combined into the GPS algorithm. 
    In the validation study, GPS predicted high-grade (odds ratio [OR] per 20 GPS units: 2.3; 95% confidence interval [CI], 1.5-3.7; p<0.001) and high-stage (OR per 20 GPS units: 1.9; 95% CI, 1.3-3.0; p=0.003) at surgical pathology. GPS predicted high-grade and/or high-stage disease after controlling for established clinical factors (p<0.005) such as an OR of 2.1 (95% CI, 1.4-3.2) when adjusting for Cancer of the Prostate Risk Assessment score. A limitation of the validation study was the inclusion of men with low-volume intermediate-risk PCa (Gleason score 3+4), for whom some providers would not consider AS. Genes representing multiple biological pathways discriminate PCa aggressiveness in biopsy tissue despite tumor heterogeneity, multifocality, and limited sampling at time of biopsy. The biopsy-based 17-gene GPS improves prediction of the presence or absence of adverse pathology and may help men with PCa make more informed decisions between AS and immediate treatment. Prostate cancer (PCa) is often present in multiple locations within the prostate and has variable characteristics. We identified genes with expression associated with aggressive PCa to develop a biopsy-based, multigene signature, the Genomic Prostate Score (GPS). GPS was validated for its ability to predict men who have high-grade or high-stage PCa at diagnosis and may help men diagnosed with PCa decide between active surveillance and immediate definitive treatment. Copyright © 2014 European Association of Urology. Published by Elsevier B.V. All rights reserved.
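
    The odds ratios above are quoted per 20 GPS units. Because logistic regression is linear on the log-odds scale, an odds ratio can be rescaled to any other increment of the score. The sketch below shows that arithmetic; the helper name `rescale_odds_ratio` is hypothetical, and the calculation assumes GPS enters the model as a single linear term, as the "per 20 units" wording suggests.

```python
import math


def rescale_odds_ratio(or_reported, units_reported, units_wanted):
    """Rescale an odds ratio from one increment of a predictor to another.

    OR(k units) = exp(beta * k), where beta = ln(OR_reported) / units_reported.
    """
    beta = math.log(or_reported) / units_reported
    return math.exp(beta * units_wanted)


# Reported in the abstract: OR 2.3 per 20 GPS units for high-grade disease.
or_per_20 = 2.3
or_per_1 = rescale_odds_ratio(or_per_20, 20, 1)    # one GPS unit
or_per_40 = rescale_odds_ratio(or_per_20, 20, 40)  # equals 2.3 squared
```

    The same rescaling applies to each confidence limit, since the limits are also computed on the log-odds scale.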

  17. Coregulation of terpenoid pathway genes and prediction of isoprene production in Bacillus subtilis using transcriptomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hess, Becky M.; Xue, Junfeng; Markillie, Lye Meng

    2013-06-19

    The isoprenoid pathway converts pyruvate to isoprene and related isoprenoid compounds in plants and some bacteria. Currently, this pathway is of great interest because of the critical role that isoprenoids play in basic cellular processes as well as the industrial value of metabolites such as isoprene. Although the regulation of several pathway genes has been described, there is a paucity of information regarding the system level regulation and control of the pathway. To address this limitation, we examined Bacillus subtilis grown under multiple conditions and then determined the relationship between altered isoprene production and the pattern of gene expression. We found that terpenoid genes appeared to fall into two distinct subsets with opposing correlations with respect to the amount of isoprene produced. The group whose expression levels positively correlated with isoprene production included dxs, the gene responsible for the commitment step in the pathway, as well as ispD, and two genes that participate in the mevalonate pathway, yhfS and pksG. The subset of terpenoid genes that inversely correlated with isoprene production included ispH, ispF, hepS, uppS, ispE, and dxr. A genome wide partial least squares regression model was created to identify other genes or pathways that contribute to isoprene production. This analysis showed that a subset of 213 regulated genes was sufficient to create a predictive model of isoprene production under different conditions and showed correlations at the transcriptional level. We conclude that gene expression levels alone are sufficiently informative about the metabolic state of a cell that produces increased isoprene and can be used to build a model which accurately predicts production of this secondary metabolite across many simulated environmental conditions.

  18. Transgenic Animals.

    ERIC Educational Resources Information Center

    Jaenisch, Rudolf

    1988-01-01

    Describes three methods and their advantages and disadvantages for introducing genes into animals. Discusses the predictability and tissue-specificity of the injected genes. Outlines the applications of transgenic technology for studying gene expression, the early stages of mammalian development, mutations, and the molecular nature of chromosomes.…

  19. Annotation of Ehux ESTs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kuo, Alan; Grigoriev, Igor

    2009-06-12

    22 percent of ESTs do not align with scaffolds. EST Pipeline assembles 17126 consensi from the nonaligned ESTs. Annotation Pipeline predicts 8564 ORFs on the consensi. Domain analysis of ORFs reveals missing genes. Cluster analysis reveals missing genes. Expression analysis reveals potential strain specific genes.

  20. Some ethylene biosynthesis and AP2/ERF genes reveal a specific pattern of expression during somatic embryogenesis in Hevea brasiliensis

    PubMed Central

    2012-01-01

    Background: Ethylene production and signalling play an important role in somatic embryogenesis, especially for species that are recalcitrant in in vitro culture. The AP2/ERF superfamily has been identified and classified in Hevea brasiliensis. This superfamily includes the ERFs involved in response to ethylene. The relative transcript abundance of ethylene biosynthesis genes and of AP2/ERF genes was analysed during somatic embryogenesis for callus lines with different regeneration potential, in order to identify genes regulated during that process. Results: The analysis of relative transcript abundance was carried out by real-time RT-PCR for 142 genes. The transcripts of ERFs from group I, VII and VIII were abundant at all stages of the somatic embryogenesis process. Forty genetic expression markers for callus regeneration capacity were identified. Fourteen markers were found for proliferating calli and 35 markers for calli at the end of the embryogenesis induction phase. Sixteen markers discriminated between normal and abnormal embryos and, lastly, there were 36 markers of conversion into plantlets. A phylogenetic analysis comparing the sequences of the AP2 domains of Hevea and Arabidopsis genes enabled us to predict the function of 13 expression marker genes. Conclusions: This first characterization of the AP2/ERF superfamily in Hevea revealed dramatic regulation of the expression of AP2/ERF genes during the somatic embryogenesis process. The gene expression markers of proliferating callus capacity to regenerate plants by somatic embryogenesis should make it possible to predict callus lines suitable to be used for multiplication. Further functional characterization of these markers opens up prospects for discovering specific AP2/ERF functions in the Hevea species for which somatic embryogenesis is difficult. PMID:23268714

  1. Comparative brain transcriptomic analyses of scouting across distinct behavioural and ecological contexts in honeybees

    PubMed Central

    Liang, Zhengzheng S.; Mattila, Heather R.; Rodriguez-Zas, Sandra L.; Southey, Bruce R.; Seeley, Thomas D.; Robinson, Gene E.

    2014-01-01

    Individual differences in behaviour are often consistent across time and contexts, but it is not clear whether such consistency is reflected at the molecular level. We explored this issue by studying scouting in honeybees in two different behavioural and ecological contexts: finding new sources of floral food resources and finding a new nest site. Brain gene expression profiles in food-source and nest-site scouts showed a significant overlap, despite large expression differences associated with the two different contexts. Class prediction and ‘leave-one-out’ cross-validation analyses revealed that a bee's role as a scout in either context could be predicted with 92.5% success using 89 genes at minimum. We also found that genes related to four neurotransmitter systems were part of a shared brain molecular signature in both types of scouts, and the two types of scouts were more similar for genes related to glutamate and GABA than catecholamine or acetylcholine signalling. These results indicate that consistent behavioural tendencies across different ecological contexts involve a mixture of similarities and differences in brain gene expression. PMID:25355476
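
    Leave-one-out cross-validation, as used above for class prediction, holds out each sample in turn, trains on the rest, and scores the held-out prediction. The sketch below shows the procedure with a nearest-centroid classifier. The abstract does not name the classifier the authors used, so that choice, the function names, and the toy expression values are all assumptions.

```python
def centroid(rows):
    """Column-wise mean of a list of equal-length expression vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_nearest_centroid(train_x, train_y, x):
    """Assign x to the class whose training centroid is closest."""
    by_class = {}
    for row, label in zip(train_x, train_y):
        by_class.setdefault(label, []).append(row)
    return min(by_class, key=lambda c: sq_dist(centroid(by_class[c]), x))


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, return the hit rate."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict_nearest_centroid(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical two-gene expression values, not data from the study.
toy_x = [[5.0, 1.0], [5.2, 0.8], [4.8, 1.1], [1.0, 4.0], [1.2, 4.2], [0.9, 3.9]]
toy_y = ["scout", "scout", "scout", "non-scout", "non-scout", "non-scout"]
toy_accuracy = leave_one_out_accuracy(toy_x, toy_y)
```

    In a real analysis any gene selection step would have to be repeated inside the loop on the training samples only; selecting genes on the full data set first would leak information from the held-out sample.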

  2. Combining transcription factor binding affinities with open-chromatin data for accurate gene expression prediction.

    PubMed

    Schmidt, Florian; Gasparoni, Nina; Gasparoni, Gilles; Gianmoena, Kathrin; Cadenas, Cristina; Polansky, Julia K; Ebert, Peter; Nordström, Karl; Barann, Matthias; Sinha, Anupam; Fröhler, Sebastian; Xiong, Jieyi; Dehghani Amirabad, Azim; Behjati Ardakani, Fatemeh; Hutter, Barbara; Zipprich, Gideon; Felder, Bärbel; Eils, Jürgen; Brors, Benedikt; Chen, Wei; Hengstler, Jan G; Hamann, Alf; Lengauer, Thomas; Rosenstiel, Philip; Walter, Jörn; Schulz, Marcel H

    2017-01-09

    The binding and contribution of transcription factors (TF) to cell specific gene expression is often deduced from open-chromatin measurements to avoid costly TF ChIP-seq assays. Thus, it is important to develop computational methods for accurate TF binding prediction in open-chromatin regions (OCRs). Here, we report a novel segmentation-based method, TEPIC, to predict TF binding by combining sets of OCRs with position weight matrices. TEPIC can be applied to various open-chromatin data, e.g. DNaseI-seq and NOMe-seq. Additionally, Histone-Marks (HMs) can be used to identify candidate TF binding sites. TEPIC computes TF affinities and uses open-chromatin/HM signal intensity as quantitative measures of TF binding strength. Using machine learning, we find low affinity binding sites to improve our ability to explain gene expression variability compared to the standard presence/absence classification of binding sites. Further, we show that both footprints and peaks capture essential TF binding events and lead to a good prediction performance. In our application, gene-based scores computed by TEPIC with one open-chromatin assay nearly reach the quality of several TF ChIP-seq data sets. Finally, these scores correctly predict known transcriptional regulators as illustrated by the application to novel DNaseI-seq and NOMe-seq data for primary human hepatocytes and CD4+ T-cells, respectively. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
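
    The idea of scoring every candidate site with a position weight matrix, rather than keeping only sites above a cutoff, can be sketched as follows. This is a simplified stand-in and not TEPIC's actual affinity model: the three-column matrix, the uniform background, and the function names are hypothetical, and the signal weighting is reduced to a single multiplier.

```python
import math

# Hypothetical position probability matrix for an imaginary TF
# (not taken from TEPIC or from any motif database).
PPM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
BACKGROUND = 0.25  # assumed uniform base composition


def window_score(window):
    """Log2-odds score of one window against the uniform background."""
    return sum(math.log2(col[base] / BACKGROUND)
               for col, base in zip(PPM, window))


def region_affinity(seq, signal=1.0):
    """Sum the likelihood ratio 2**score over all windows in the region.

    Summing over every window, instead of thresholding, keeps the
    contribution of low-affinity sites. The total is scaled by an
    open-chromatin signal value for the region.
    """
    width = len(PPM)
    total = sum(2 ** window_score(seq[i:i + width])
                for i in range(len(seq) - width + 1))
    return signal * total


strong = region_affinity("TTAGCTT")  # contains the preferred site AGC
weak = region_affinity("TTTTTTT")    # only weak matches
```

    Input sequences are assumed to be uppercase and to contain only A, C, G, and T; a gene-level score would then aggregate such region affinities over the open-chromatin regions assigned to the gene.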

  3. Image-guided genomic analysis of tissue response to laser-induced thermal stress

    NASA Astrophysics Data System (ADS)

    Mackanos, Mark A.; Helms, Mike; Kalish, Flora; Contag, Christopher H.

    2011-05-01

    The cytoprotective response to thermal injury is characterized by transcriptional activation of "heat shock proteins" (hsp) and proinflammatory proteins. Expression of these proteins may predict cellular survival. Microarray analyses were performed to identify spatially distinct gene expression patterns responding to thermal injury. Laser injury zones were identified by expression of a transgene reporter comprised of the 70 kD hsp gene and the firefly luciferase coding sequence. Zones included the laser spot, the surrounding region where hsp70-luc expression was increased, and a region adjacent to the surrounding region. A total of 145 genes were up-regulated in the laser irradiated region, while 69 were up-regulated in the adjacent region. At 7 hours the chemokine Cxcl3 was the highest expressed gene in the laser spot (24 fold) and adjacent region (32 fold). Chemokines were the most common up-regulated genes identified. Microarray gene expression was successfully validated using qRT-polymerase chain reaction for selected genes of interest. The early response genes are likely involved in cytoprotection and initiation of the healing response. Their regulatory elements will benefit creating the next generation reporter mice and controlling expression of therapeutic proteins. The identified genes serve as drug development targets that may prevent acute tissue damage and accelerate healing.

  4. A gene expression inflammatory signature specifically predicts multiple myeloma evolution and patients survival.

    PubMed

    Botta, C; Di Martino, M T; Ciliberto, D; Cucè, M; Correale, P; Rossi, M; Tagliaferri, P; Tassone, P

    2016-12-16

    Multiple myeloma (MM) is closely dependent on cross-talk between malignant plasma cells and cellular components of the inflammatory/immunosuppressive bone marrow milieu, which promotes disease progression, drug resistance, neo-angiogenesis, bone destruction and immune-impairment. We investigated the relevance of inflammatory genes in predicting disease evolution and patient survival. A bioinformatics study by Ingenuity Pathway Analysis on gene expression profiling dataset of monoclonal gammopathy of undetermined significance, smoldering and symptomatic-MM, identified inflammatory and cytokine/chemokine pathways as the most progressively affected during disease evolution. We then selected 20 candidate genes involved in B-cell inflammation and we investigated their role in predicting clinical outcome, through univariate and multivariate analyses (log-rank test, logistic regression and Cox-regression model). We defined an 8-genes signature (IL8, IL10, IL17A, CCL3, CCL5, VEGFA, EBI3 and NOS2) identifying each condition (MGUS/smoldering/symptomatic-MM) with 84% accuracy. Moreover, six genes (IFNG, IL2, LTA, CCL2, VEGFA, CCL3) were found independently correlated with patients' survival. Patients whose MM cells expressed high levels of Th1 cytokines (IFNG/LTA/IL2/CCL2) and low levels of CCL3 and VEGFA, experienced the longest survival. On these six genes, we built a prognostic risk score that was validated in three additional independent datasets. In this study, we provide proof-of-concept that inflammation has a critical role in MM patient progression and survival. The inflammatory-gene prognostic signature validated in different datasets clearly indicates novel opportunities for personalized anti-MM treatment.

  5. Visual gene-network analysis reveals the cancer gene co-expression in human endometrial cancer

    PubMed Central

    2014-01-01

    Background: Endometrial cancers (ECs) are the most common form of gynecologic malignancy. Recent studies have reported that ECs reveal distinct markers for molecular pathogenesis, which in turn is linked to the various histological types of ECs. To understand further the molecular events contributing to ECs and endometrial tumorigenesis in general, a more precise identification of cancer-associated molecules and signaling networks would be useful for the detection and monitoring of malignancy, improving clinical cancer therapy, and personalization of treatments. Results: ECs-specific gene co-expression networks were constructed by differential expression analysis and weighted gene co-expression network analysis (WGCNA). Important pathways and putative cancer hub genes contributing to tumorigenesis of ECs were identified. An elastic-net regularized classification model was built using the cancer hub gene signatures to predict the phenotypic characteristics of ECs. The 19 cancer hub gene signatures had high predictive power to distinguish among three key principal features of ECs: grade, type, and stage. Intriguingly, these hub gene networks seem to contribute to ECs progression and malignancy via cell-cycle regulation, antigen processing and the citric acid (TCA) cycle. Conclusions: The results of this study provide a powerful biomarker discovery platform to better understand the progression of ECs and to uncover potential therapeutic targets in the treatment of ECs. This information might lead to improved monitoring of ECs and a resulting improvement in the treatment of ECs, the fourth most common cancer in women. PMID:24758163

  6. Customizing chemotherapy for colon cancer: the potential of gene expression profiling.

    PubMed

    Mariadason, John M; Arango, Diego; Augenlicht, Leonard H

    2004-06-01

    The value of gene expression profiling, or microarray analysis, for the classification and prognosis of multiple forms of cancer is now clearly established. For colon cancer, expression profiling can readily discriminate between normal and tumor tissue, and to some extent between tumors of different histopathological stage and prognosis. While a definitive in vivo study demonstrating the potential of this methodology for predicting response to chemotherapy is presently lacking, the ability of microarrays to distinguish other subtleties of colon cancer phenotype, as well as recent in vitro proof-of-principle experiments utilizing colon cancer cell lines, illustrate the potential of this methodology for predicting the probability of response to specific chemotherapeutic agents. This review discusses some of the recent advances in the use of microarray analysis for understanding and distinguishing colon cancer subtypes, and attempts to identify challenges that need to be overcome in order to achieve the goal of using gene expression profiling for customizing chemotherapy in colon cancer.

  7. Expression of the Pasteurella haemolytica leukotoxin is inhibited by a locus that encodes an ATP-binding cassette homolog.

    PubMed Central

    Highlander, S K; Wickersham, E A; Garza, O; Weinstock, G M

    1993-01-01

    Multicopy and single-copy chromosomal fusions between the Pasteurella haemolytica leukotoxin regulatory region and the Escherichia coli beta-galactosidase gene have been constructed. These fusions were used as reporters to identify and isolate regulators of leukotoxin expression from a P. haemolytica cosmid library. A cosmid clone, which inhibited leukotoxin expression from multicopy and single-copy protein fusions, was isolated and found to contain the complete leukotoxin gene cluster plus additional upstream sequences. The locus responsible for inhibition of expression from leukotoxin-beta-galactosidase fusions was mapped within these upstream sequences, by transposon mutagenesis with Tn5, and its DNA sequence was determined. The inhibitory activity was found to be associated with a predicted 440-amino-acid reading frame (lapA) that lies within a four-gene arginine transport locus. LapA is predicted to be the nucleotide-binding component of this transport system and shares homology with the Clp family of proteases. PMID:8359916

  8. GENE EXPRESSION PROFILING OF MOUSE SKIN AND PAPILLOMAS FOLLOWING CHRONIC EXPOSURE TO MONOMETHYLARSONOUS ACID IN K6/ODC TRANSGENIC MICE

    EPA Science Inventory

    Methylarsonous acid [MMA(III)], a common metabolite of inorganic arsenic metabolism, increases tumor frequency in the skin of K6/ODC transgenic mice following a chronic exposure. To characterize gene expression profiles predictive of MMA(III) exposure and mode of action of carcin...

  9. Moving Toward Integrating Gene Expression Profiling into High-throughput Testing: A Gene Expression Biomarker Accurately Predicts Estrogen Receptor α Modulation in a Microarray Compendium

    EPA Science Inventory

    Microarray profiling of chemical-induced effects is being increasingly used in medium and high-throughput formats. In this study, we describe computational methods to identify molecular targets from whole-genome microarray data using as an example the estrogen receptor α (ERα), ...

  10. Aberrant Gene Expression in Humans

    PubMed Central

    Yang, Ence; Ji, Guoli; Brinkmeyer-Langford, Candice L.; Cai, James J.

    2015-01-01

    Gene expression as an intermediate molecular phenotype has been a focus of research interest. In particular, studies of expression quantitative trait loci (eQTL) have offered promise for understanding gene regulation through the discovery of genetic variants that explain variation in gene expression levels. Existing eQTL methods are designed for assessing the effects of common variants, but not rare variants. Here, we address the problem by establishing a novel analytical framework for evaluating the effects of rare or private variants on gene expression. Our method starts from the identification of outlier individuals that show markedly different gene expression from the majority of a population, and then reveals the contributions of private SNPs to the aberrant gene expression in these outliers. Using population-scale mRNA sequencing data, we identify outlier individuals using a multivariate approach. We find that outlier individuals are more readily detected with respect to gene sets that include genes involved in cellular regulation and signal transduction, and less likely to be detected with respect to the gene sets with genes involved in metabolic pathways and other fundamental molecular functions. Analysis of polymorphic data suggests that private SNPs of outlier individuals are enriched in the enhancer and promoter regions of corresponding aberrantly-expressed genes, suggesting a specific regulatory role of private SNPs, while the commonly-occurring regulatory genetic variants (i.e., eQTL SNPs) show little evidence of involvement. Additional data suggest that non-genetic factors may also underlie aberrant gene expression. Taken together, our findings advance a novel viewpoint relevant to situations wherein common eQTLs fail to predict gene expression when heritable, rare inter-individual variation exists. 
    The analytical framework we describe, taking into consideration the reality of differential phenotypic robustness, may be valuable for investigating complex traits and conditions. PMID:25617623

  11. Dexamethasone Stimulated Gene Expression in Peripheral Blood is a Sensitive Marker for Glucocorticoid Receptor Resistance in Depressed Patients

    PubMed Central

    Menke, Andreas; Arloth, Janine; Pütz, Benno; Weber, Peter; Klengel, Torsten; Mehta, Divya; Gonik, Mariya; Rex-Haffner, Monika; Rubel, Jennifer; Uhr, Manfred; Lucae, Susanne; Deussing, Jan M; Müller-Myhsok, Bertram; Holsboer, Florian; Binder, Elisabeth B

    2012-01-01

    Although gene expression profiles in peripheral blood in major depression are not likely to identify genes directly involved in the pathomechanism of affective disorders, they may serve as biomarkers for this disorder. As previous studies using baseline gene expression profiles have provided mixed results, our approach was to use an in vivo dexamethasone challenge test and to compare glucocorticoid receptor (GR)-mediated changes in gene expression between depressed patients and healthy controls. Whole genome gene expression data (baseline and following GR-stimulation with 1.5 mg dexamethasone p.o.) from two independent cohorts were analyzed to identify gene expression pattern that would predict case and control status using a training (N=18 cases/18 controls) and a test cohort (N=11/13). Dexamethasone led to reproducible regulation of 2670 genes in controls and 1151 transcripts in cases. Several genes, including FKBP5 and DUSP1, previously associated with the pathophysiology of major depression, were found to be reliable markers of GR-activation. Using random forest analyses for classification, GR-stimulated gene expression outperformed baseline gene expression as a classifier for case and control status with a correct classification of 79.1 vs 41.6% in the test cohort. GR-stimulated gene expression performed best in dexamethasone non-suppressor patients (88.7% correctly classified with 100% sensitivity), but also correctly classified 77.3% of the suppressor patients (76.7% sensitivity), when using a refined set of 19 genes. Our study suggests that in vivo stimulated gene expression in peripheral blood cells could be a promising molecular marker of altered GR-functioning, an important component of the underlying pathology, in patients suffering from depressive episodes. PMID:22237309
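
    The classification rates quoted above combine two different measures: the overall fraction correctly classified and the sensitivity for cases. The sketch below shows how both follow from a confusion matrix. The function name and the toy labels are hypothetical and are not the study's data.

```python
def classification_metrics(true_labels, predicted_labels, positive="case"):
    """Accuracy, sensitivity and specificity from paired label lists."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        # Sensitivity: fraction of true cases that were called cases.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Specificity: fraction of true controls that were called controls.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Toy example: every case is found, but one control is misclassified.
truth = ["case"] * 4 + ["control"] * 4
calls = ["case"] * 5 + ["control"] * 3
toy_metrics = classification_metrics(truth, calls)
```

    As the toy example shows, a classifier can reach 100% sensitivity while its overall accuracy stays below 100%, which is the pattern reported for the non-suppressor patients.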

  12. Toward a Public Toxicogenomics Capability for Supporting Predictive Toxicology: Survey of Current Resources and Chemical Indexing of Experiments in GEO and ArrayExpress

    EPA Science Inventory

    A publicly available toxicogenomics capability for supporting predictive toxicology and meta-analysis depends on availability of gene expression data for chemical treatment scenarios, the ability to locate and aggregate such information by chemical, and broad data coverage within...

  13. Novel molecular subtypes of serous and endometrioid ovarian cancer linked to clinical outcome.

    PubMed

    Tothill, Richard W; Tinker, Anna V; George, Joshy; Brown, Robert; Fox, Stephen B; Lade, Stephen; Johnson, Daryl S; Trivett, Melanie K; Etemadmoghadam, Dariush; Locandro, Bianca; Traficante, Nadia; Fereday, Sian; Hung, Jillian A; Chiew, Yoke-Eng; Haviv, Izhak; Gertig, Dorota; DeFazio, Anna; Bowtell, David D L

    2008-08-15

    The study aimed to identify novel molecular subtypes of ovarian cancer by gene expression profiling with linkage to clinical and pathologic features. Microarray gene expression profiling was done on 285 serous and endometrioid tumors of the ovary, peritoneum, and fallopian tube. K-means clustering was applied to identify robust molecular subtypes. Statistical analysis identified differentially expressed genes, pathways, and gene ontologies. Laser capture microdissection, pathology review, and immunohistochemistry validated the array-based findings. Patient survival within k-means groups was evaluated using Cox proportional hazards models. Class prediction validated k-means groups in an independent dataset. A semisupervised survival analysis of the array data was used to compare against unsupervised clustering results. Optimal clustering of array data identified six molecular subtypes. Two subtypes represented predominantly serous low malignant potential and low-grade endometrioid subtypes, respectively. The remaining four subtypes represented higher grade and advanced stage cancers of serous and endometrioid morphology. A novel subtype of high-grade serous cancers reflected a mesenchymal cell type, characterized by overexpression of N-cadherin and P-cadherin and low expression of differentiation markers, including CA125 and MUC1. A poor prognosis subtype was defined by a reactive stroma gene expression signature, correlating with extensive desmoplasia in such samples. A similar poor prognosis signature could be found using a semisupervised analysis. Each subtype displayed distinct levels and patterns of immune cell infiltration. Class prediction identified similar subtypes in an independent ovarian dataset with similar prognostic trends. Gene expression profiling identified molecular subtypes of ovarian cancer of biological and clinical importance.
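
    K-means clustering alternates between assigning each sample to its nearest centroid and recomputing each centroid as the mean of its members. The sketch below is a minimal version of that loop on toy two-dimensional points. The function name, the seeded random initialisation, and the data are assumptions, and the sketch does not cover how the number of clusters was chosen in the study.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Plain k-means (Lloyd's algorithm) on a list of numeric tuples."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct samples (an assumed choice).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for each point.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2
                                  for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


# Two well-separated toy groups standing in for expression profiles.
toy_points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
              (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
toy_labels, toy_centroids = kmeans(toy_points, k=2)
```

    Cluster numbering is arbitrary and depends on the initialisation, so results should be compared by group membership, not by label value.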

  14. The Metastasis Efficiency Modifier Ribosomal RNA Processing 1 Homolog B (RRP1B) Is a Chromatin-associated Factor

    PubMed Central

    Crawford, Nigel P. S.; Yang, Hailiu; Mattaini, Katherine R.; Hunter, Kent W.

    2009-01-01

    There is accumulating evidence for a role of germ line variation in breast cancer metastasis. We have recently identified a novel metastasis susceptibility gene, Rrp1b (ribosomal RNA processing 1 homolog B). Overexpression of Rrp1b in a mouse mammary tumor cell line induces a gene expression signature that predicts survival in breast cancer. Here we extend the analysis of RRP1B function by demonstrating that the Rrp1b activation gene expression signature accurately predicted the outcome in three of four publicly available breast carcinoma gene expression data sets. In addition, we provide insights into the mechanism of RRP1B. Tandem affinity purification demonstrated that RRP1B physically interacts with many nucleosome binding factors, including histone H1X, poly(ADP-ribose) polymerase 1, TRIM28 (tripartite motif-containing 28), and CSDA (cold shock domain protein A). Co-immunofluorescence and co-immunoprecipitation confirmed these interactions and also interactions with heterochromatin protein-1α and acetyl-histone H4 lysine 5. Finally, we investigated the effects of ectopic expression of an RRP1B allelic variant previously associated with improved survival in breast cancer. Gene expression analyses demonstrate that, compared with ectopic expression of wild type RRP1B in HeLa cells, the variant RRP1B differentially modulates various transcription factors controlled by TRIM28 and CSDA. These data suggest that RRP1B, a tumor progression and metastasis susceptibility candidate gene, is potentially a dynamic modulator of transcription and chromatin structure. PMID:19710015

  15. Comprehensive Assessments of RNA-seq by the SEQC Consortium: FDA-Led Efforts Advance Precision Medicine.

    PubMed

    Xu, Joshua; Gong, Binsheng; Wu, Leihong; Thakkar, Shraddha; Hong, Huixiao; Tong, Weida

    2016-03-15

    Studies on gene expression in response to therapy have led to the discovery of pharmacogenomics biomarkers and advances in precision medicine. Whole transcriptome sequencing (RNA-seq) is an emerging tool for profiling gene expression and has received wide adoption in the biomedical research community. However, its value in regulatory decision making requires rigorous assessment and consensus between various stakeholders, including the research community, regulatory agencies, and industry. The FDA-led SEquencing Quality Control (SEQC) consortium has made considerable progress in this direction, and is the subject of this review. Specifically, three RNA-seq platforms (Illumina HiSeq, Life Technologies SOLiD, and Roche 454) were extensively evaluated at multiple sites to assess cross-site and cross-platform reproducibility. The results demonstrated that relative gene expression measurements were consistently comparable across labs and platforms, but not so for the measurement of absolute expression levels. As part of the quality evaluation several studies were included to evaluate the utility of RNA-seq in clinical settings and safety assessment. The neuroblastoma study profiled tumor samples from 498 pediatric neuroblastoma patients by both microarray and RNA-seq. RNA-seq offers more utilities than microarray in determining the transcriptomic characteristics of cancer. However, RNA-seq and microarray-based models were comparable in clinical endpoint prediction, even when including additional features unique to RNA-seq beyond gene expression. The toxicogenomics study compared microarray and RNA-seq profiles of the liver samples from rats exposed to 27 different chemicals representing multiple toxicity modes of action. Cross-platform concordance was dependent on chemical treatment and transcript abundance. 
Though both RNA-seq and microarray are suitable for developing gene expression based predictive models with comparable prediction performance, RNA-seq offers advantages over microarray in profiling genes with low expression. The rat BodyMap study provided a comprehensive rat transcriptomic body map by performing RNA-Seq on 320 samples from 11 organs in either sex of juvenile, adolescent, adult and aged Fischer 344 rats. Lastly, the transferability study demonstrated that signature genes of predictive models are reciprocally transferable between microarray and RNA-seq data for model development using a comprehensive approach with two large clinical data sets. This result suggests continued usefulness of legacy microarray data in the coming RNA-seq era. In conclusion, the SEQC project enhances our understanding of RNA-seq and provides valuable guidelines for RNA-seq based clinical application and safety evaluation to advance precision medicine.

  16. Functional redundancy and/or ongoing pseudogenization among F-box protein genes expressed in Arabidopsis male gametophyte.

    PubMed

    Ikram, Sobia; Durandet, Monique; Vesa, Simona; Pereira, Serge; Guerche, Philippe; Bonhomme, Sandrine

    2014-06-01

    The F-box protein gene family is one of the largest gene families in plants, with almost 700 predicted genes in the model plant Arabidopsis. F-box proteins are key components of the ubiquitin proteasome system that allows targeted protein degradation. Transcriptome analyses indicate that half of these F-box protein genes are expressed in microspore and/or pollen, i.e., during male gametogenesis. To assess the role of F-box protein genes during this crucial developmental step, we selected 34 F-box protein genes recorded as highly and specifically expressed in pollen and isolated corresponding insertion mutants. We checked the expression level of each selected gene by RT-PCR and confirmed pollen expression for 25 genes, but specific expression for only 10 of the 34 F-box protein genes. In addition, we tested the expression level of selected F-box protein genes in 24 mutant lines and showed that 11 of them were null mutants. Transmission analysis of the mutations to the progeny showed that none of the single mutations was gametophytic lethal. These unaffected transmission efficiencies suggested leaky mutations or functional redundancy among F-box protein genes. Cytological observation of the gametophytes in the mutants confirmed these results. Combinations of mutations in F-box protein genes from the same subfamily did not lead to a transmission defect either, further highlighting functional redundancy and/or a high proportion of pseudogenes among these F-box protein genes.

  17. A chain reaction approach to modelling gene pathways.

    PubMed

    Cheng, Gary C; Chen, Dung-Tsa; Chen, James J; Soong, Seng-Jaw; Lamartiniere, Coral; Barnes, Stephen

    2012-08-01

    BACKGROUND: Of great interest in cancer prevention is how nutrient components affect gene pathways associated with the physiological events of puberty. Nutrient-gene interactions may cause changes in breast or prostate cells and, therefore, may result in cancer risk later in life. Analysis of gene pathways can lead to insights about nutrient-gene interactions and the development of more effective prevention approaches to reduce cancer risk. To date, researchers have relied heavily upon experimental assays (such as microarray analysis, etc.) to identify genes and their associated pathways that are affected by nutrient and diets. However, the vast number of genes and combinations of gene pathways, coupled with the expense of the experimental analyses, has delayed the progress of gene-pathway research. The development of an analytical approach based on available test data could greatly benefit the evaluation of gene pathways, and thus advance the study of nutrient-gene interactions in cancer prevention. In the present study, we have proposed a chain reaction model to simulate gene pathways, in which the gene expression changes through the pathway are represented by the species undergoing a set of chemical reactions. We have also developed a numerical tool to solve for the species changes due to the chain reactions over time. Through this approach we can examine the impact of nutrient-containing diets on the gene pathway; moreover, transformation of genes over time with a nutrient treatment can be observed numerically, which is very difficult to achieve experimentally. We apply this approach to microarray analysis data from an experiment which involved the effects of three polyphenols (nutrient treatments), epigallo-catechin-3-O-gallate (EGCG), genistein, and resveratrol, in a study of nutrient-gene interaction in the estrogen synthesis pathway during puberty. 
RESULTS: In this preliminary study, the estrogen synthesis pathway was simulated by a chain reaction model. By applying it to microarray data, the chain reaction model computed a set of reaction rates to examine the effects of three polyphenols (EGCG, genistein, and resveratrol) on gene expression in this pathway during puberty. We first performed statistical analysis to test the time factor on the estrogen synthesis pathway. Global tests were used to evaluate an overall gene expression change during puberty for each experimental group. Then, a chain reaction model was employed to simulate the estrogen synthesis pathway. Specifically, the model computed the reaction rates in a set of ordinary differential equations to describe interactions between genes in the pathway (a reaction rate K from A to B means that gene A induces gene B at a rate of K per unit; we give details in the "method" section). Since disparate changes of gene expression may cause numerical error problems in solving these differential equations, we used an implicit scheme to address this issue. We first applied the chain reaction model to obtain the reaction rates for the control group. A sensitivity study was conducted to evaluate how well the model fits the control group data at Day 50. Results showed a small bias and mean square error. These observations indicated the model is robust to low levels of random noise and has a good fit for the control group. Then the chain reaction model derived from the control group data was used to predict gene expression at Day 50 for the three polyphenol groups. If these nutrients affect the estrogen synthesis pathways during puberty, we expect a discrepancy between observed and expected expressions. Results indicated some genes had large differences in the EGCG (e.g., Hsd3b and Sts) and the resveratrol (e.g., Hsd3b and Hrmt12) groups. 
CONCLUSIONS: In the present study, we have presented (I) experimental studies of the effect of nutrient diets on the gene expression changes in a selected estrogen synthesis pathway. This experiment is valuable because it allows us to examine how the nutrient-containing diets regulate gene expression in the estrogen synthesis pathway during puberty; (II) global tests to assess an overall association of this particular pathway with time factor by utilizing generalized linear models to analyze microarray data; and (III) a chain reaction model to simulate the pathway. This is a novel application because we are able to translate the gene pathway into the chemical reactions in which each reaction channel describes gene-gene relationship in the pathway. In the chain reaction model, the implicit scheme is employed to efficiently solve the differential equations. Data analysis results show the proposed model is capable of predicting gene expression changes and demonstrating the effect of nutrient-containing diets on gene expression changes in the pathway. One of the objectives of this study is to explore and develop a numerical approach for simulating the gene expression change so that it can be applied and calibrated when the data of more time slices are available, and thus can be used to interpolate the expression change at a desired time point without conducting expensive experiments for a large amount of time points. Hence, we are not claiming this is either essential or the most efficient way for simulating this problem, rather a mathematical/numerical approach that can model the expression change of a large set of genes of a complex pathway. In addition, we understand the limitation of this experiment and realize that it is still far from being a complete model of predicting nutrient-gene interactions. 
The reason is that in the present model, the reaction rates were estimated based on available data at two time points; hence, the gene expression change is dependent upon the reaction rates and a linear function of the gene expressions. More data sets containing gene expression at various time slices are needed in order to improve the present model so that a non-linear variation of gene expression changes at different time can be predicted.
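
    The abstract above describes a linear system of ordinary differential equations for the pathway, solved with an implicit scheme because expression changes of very different magnitude make the system numerically awkward. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a simple first-order chain g0 -> g1 -> ... in which each step converts one species into the next, and it uses backward (implicit) Euler. The function names, the three-species chain and the rate values are all hypothetical.

```python
import numpy as np

def chain_rate_matrix(rates):
    """Build M for dx/dt = M @ x on a linear chain g0 -> g1 -> ... -> gN.

    Assumption: each step is a first-order conversion, so the total amount
    is conserved. The published model may define its reactions differently.
    """
    n = len(rates) + 1
    M = np.zeros((n, n))
    for i, k in enumerate(rates):
        M[i, i] -= k        # species i is consumed at rate k
        M[i + 1, i] += k    # species i+1 is produced at the same rate
    return M

def implicit_euler(M, x0, t_end, n_steps):
    """Backward Euler: solve (I - dt*M) x_next = x_now at every step."""
    dt = t_end / n_steps
    A = np.eye(len(x0)) - dt * M
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = np.linalg.solve(A, x)
    return x

# Hypothetical rates differing by two orders of magnitude (a stiff system).
rates = [0.5, 50.0]
M = chain_rate_matrix(rates)
x_end = implicit_euler(M, [1.0, 0.0, 0.0], t_end=10.0, n_steps=20)
print(x_end)
```

    With a step size of 0.5 and a rate of 50, an explicit Euler update would be unstable, whereas the implicit update stays non-negative and conserves the total; that is the usual motivation for choosing an implicit scheme in this setting.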

  18. Naringenin Regulates Expression of Genes Involved in Cell Wall Synthesis in Herbaspirillum seropedicae▿

    PubMed Central

    Tadra-Sfeir, M. Z.; Souza, E. M.; Faoro, H.; Müller-Santos, M.; Baura, V. A.; Tuleski, T. R.; Rigo, L. U.; Yates, M. G.; Wassem, R.; Pedrosa, F. O.; Monteiro, R. A.

    2011-01-01

    Five thousand mutants of Herbaspirillum seropedicae SmR1 carrying random insertions of transposon pTnMod-OGmKmlacZ were screened for differential expression of LacZ in the presence of naringenin. Among the 16 mutants whose expression was regulated by naringenin were genes predicted to be involved in the synthesis of exopolysaccharides, lipopolysaccharides, and auxin. These loci are probably involved in establishing interactions with host plants. PMID:21257805

  19. Naringenin regulates expression of genes involved in cell wall synthesis in Herbaspirillum seropedicae.

    PubMed

    Tadra-Sfeir, M Z; Souza, E M; Faoro, H; Müller-Santos, M; Baura, V A; Tuleski, T R; Rigo, L U; Yates, M G; Wassem, R; Pedrosa, F O; Monteiro, R A

    2011-03-01

    Five thousand mutants of Herbaspirillum seropedicae SmR1 carrying random insertions of transposon pTnMod-OGmKmlacZ were screened for differential expression of LacZ in the presence of naringenin. Among the 16 mutants whose expression was regulated by naringenin were genes predicted to be involved in the synthesis of exopolysaccharides, lipopolysaccharides, and auxin. These loci are probably involved in establishing interactions with host plants.

  20. Establishment of a 12-gene expression signature to predict colon cancer prognosis

    PubMed Central

    Zhao, Guangxi; Dong, Pingping; Wu, Bingrui

    2018-01-01

    A robust and accurate gene expression signature is essential to assist oncologists to determine which subset of patients at similar Tumor-Lymph Node-Metastasis (TNM) stage has high recurrence risk and could benefit from adjuvant therapies. Here we applied a two-step supervised machine-learning method and established a 12-gene expression signature to precisely predict colon adenocarcinoma (COAD) prognosis by using COAD RNA-seq transcriptome data from The Cancer Genome Atlas (TCGA). The predictive performance of the 12-gene signature was validated with two independent gene expression microarray datasets: GSE39582 includes 566 COAD cases for the development of six molecular subtypes with distinct clinical, molecular and survival characteristics; GSE17538 is a dataset containing 232 colon cancer patients for the generation of a metastasis gene expression profile to predict recurrence and death in COAD patients. The signature could effectively separate the poor prognosis patients from the good prognosis group (disease specific survival (DSS): Kaplan Meier (KM) Log Rank p = 0.0034; overall survival (OS): KM Log Rank p = 0.0336) in GSE17538. For patients with proficient mismatch repair system (pMMR) in GSE39582, the signature could also effectively distinguish the high risk group from the low risk group (OS: KM Log Rank p = 0.005; Relapse free survival (RFS): KM Log Rank p = 0.022). Interestingly, advanced stage patients were significantly enriched in the high 12-gene score group (Fisher’s exact test p = 0.0003). After stage stratification, the signature could still distinguish poor prognosis patients in GSE17538 from good prognosis within stage II (Log Rank p = 0.01) and stage II & III (Log Rank p = 0.017) in the outcome of DFS. Among stage III or II/III pMMR patients treated with adjuvant chemotherapy (ACT), those with higher 12-gene scores showed poorer prognosis (III, OS: KM Log Rank p = 0.046; III & II, OS: KM Log Rank p = 0.041). 
Among stage II/III pMMR patients with lower 12-gene scores in GSE39582, the subgroup receiving ACT showed significantly longer OS time compared with those who received no ACT (Log Rank p = 0.021), while there is no obvious difference between counterparts among patients with higher 12-gene scores (Log Rank p = 0.12). Besides COAD, our 12-gene signature is multifunctional in several other cancer types including kidney cancer, lung cancer, uveal and skin melanoma, brain cancer, and pancreatic cancer. Functional classification showed that seven of the twelve genes are involved in immune system function and regulation, so our 12-gene signature could potentially be used to guide decisions about adjuvant therapy for patients with stage II/III and pMMR COAD.

  1. Loss of Cytoplasmic CDK1 Predicts Poor Survival in Human Lung Cancer and Confers Chemotherapeutic Resistance

    PubMed Central

    Zhang, Chunyu; Elkahloun, Abdel G.; Robertson, Matthew; Gills, Joell J.; Tsurutani, Junji; Shih, Joanna H.; Fukuoka, Junya; Hollander, M. Christine; Harris, Curtis C.; Travis, William D.; Jen, Jin; Dennis, Phillip A.

    2011-01-01

    The dismal lethality of lung cancer is due to late stage at diagnosis and inherent therapeutic resistance. The incorporation of targeted therapies has modestly improved clinical outcomes, but the identification of new targets could further improve clinical outcomes by guiding stratification of poor-risk early stage patients and individualizing therapeutic choices. We hypothesized that a sequential, combined microarray approach would be valuable to identify and validate new targets in lung cancer. We profiled gene expression signatures during lung epithelial cell immortalization and transformation, and showed that genes involved in mitosis were progressively enhanced in carcinogenesis. 28 genes were validated by immunoblotting and 4 genes were further evaluated in non-small cell lung cancer tissue microarrays. Although CDK1 was highly expressed in tumor tissues, its loss from the cytoplasm unexpectedly predicted poor survival and conferred resistance to chemotherapy in multiple cell lines, especially microtubule-directed agents. An analysis of expression of CDK1 and CDK1-associated genes in the NCI60 cell line database confirmed the broad association of these genes with chemotherapeutic responsiveness. These results have implications for personalizing lung cancer therapy and highlight the potential of combined approaches for biomarker discovery. PMID:21887332

  2. A quantitative validated model reveals two phases of transcriptional regulation for the gap gene giant in Drosophila.

    PubMed

    Hoermann, Astrid; Cicin-Sain, Damjan; Jaeger, Johannes

    2016-03-15

    Understanding eukaryotic transcriptional regulation and its role in development and pattern formation is one of the big challenges in biology today. Most attempts at tackling this problem either focus on the molecular details of transcription factor binding, or aim at genome-wide prediction of expression patterns from sequence through bioinformatics and mathematical modelling. Here we bridge the gap between these two complementary approaches by providing an integrative model of cis-regulatory elements governing the expression of the gap gene giant (gt) in the blastoderm embryo of Drosophila melanogaster. We use a reverse-engineering method, where mathematical models are fit to quantitative spatio-temporal reporter gene expression data to infer the regulatory mechanisms underlying gt expression in its anterior and posterior domains. These models are validated through prediction of gene expression in mutant backgrounds. A detailed analysis of our data and models reveals that gt is regulated by domain-specific CREs at early stages, while a late element drives expression in both the anterior and the posterior domains. Initial gt expression depends exclusively on inputs from maternal factors. Later, gap gene cross-repression and gt auto-activation become increasingly important. We show that auto-regulation creates a positive feedback, which mediates the transition from early to late stages of regulation. We confirm the existence and role of gt auto-activation through targeted mutagenesis of Gt transcription factor binding sites. In summary, our analysis provides a comprehensive picture of spatio-temporal gene regulation by different interacting enhancer elements for an important developmental regulator. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  3. Quantitative assessment of Hox complex expression in the indirect development of the polychaete annelid Chaetopterus sp

    NASA Technical Reports Server (NTRS)

    Peterson, K. J.; Irvine, S. Q.; Cameron, R. A.; Davidson, E. H.

    2000-01-01

    A prediction from the set-aside theory of bilaterian origins is that pattern formation processes such as those controlled by the Hox cluster genes are required specifically for adult body plan formation. This prediction can be tested in animals that use maximal indirect development, in which the embryonic formation of the larva and the postembryonic formation of the adult body plan are temporally and spatially distinct. To this end, we quantitatively measured the amount of transcripts for five Hox genes in embryos of a lophotrochozoan, the polychaete annelid Chaetopterus sp. The polychaete Hox complex is shown not to be expressed during embryogenesis, but transcripts of all measured Hox complex genes are detected at significant levels during the initial stages of adult body plan formation. Temporal colinearity in the sequence of their activation is observed, so that activation follows the 3'-5' arrangement of the genes. Moreover, Hox gene expression is spatially localized to the region of teloblastic set-aside cells of the later-stage embryos. This study shows that an indirectly developing lophotrochozoan shares with an indirectly developing deuterostome, the sea urchin, a common mode of Hox complex utilization: construction of the larva, whether a trochophore or dipleurula, does not involve Hox cluster expression, but in both forms the complex is expressed in the set-aside cells from which the adult body plan derives.

  4. Dynamic Changes in Nucleosome Occupancy Are Not Predictive of Gene Expression Dynamics but Are Linked to Transcription and Chromatin Regulators

    PubMed Central

    Huebert, Dana J.; Kuan, Pei-Fen; Keleş, Sündüz

    2012-01-01

    The response to stressful stimuli requires rapid, precise, and dynamic gene expression changes that must be coordinated across the genome. To gain insight into the temporal ordering of genome reorganization, we investigated dynamic relationships between changing nucleosome occupancy, transcription factor binding, and gene expression in Saccharomyces cerevisiae yeast responding to oxidative stress. We applied deep sequencing to nucleosomal DNA at six time points before and after hydrogen peroxide treatment and revealed many distinct dynamic patterns of nucleosome gain and loss. The timing of nucleosome repositioning was not predictive of the dynamics of downstream gene expression change but instead was linked to nucleosome position relative to transcription start sites and specific cis-regulatory elements. We measured genome-wide binding of the stress-activated transcription factor Msn2p over time and found that Msn2p binds different loci with different dynamics. Nucleosome eviction from Msn2p binding sites was common across the genome; however, we show that, contrary to expectation, nucleosome loss occurred after Msn2p binding and in fact required Msn2p. This negates the prevailing model that nucleosomes obscuring Msn2p sites regulate DNA access and must be lost before Msn2p can bind DNA. Together, these results highlight the complexities of stress-dependent chromatin changes and their effects on gene expression. PMID:22354995

  5. Regulatory network involving miRNAs and genes in serous ovarian carcinoma

    PubMed Central

    Zhao, Haiyan; Xu, Hao; Xue, Luchen

    2017-01-01

    Serous ovarian carcinoma (SOC) is one of the most life-threatening types of gynecological malignancy, but the pathogenesis of SOC remains unknown. Previous studies have indicated that differentially expressed genes and microRNAs (miRNAs) serve important functions in SOC. However, genes and miRNAs are identified in a disperse form, and limited information is known about the regulatory association between miRNAs and genes in SOC. In the present study, three regulatory networks were hierarchically constructed, including a differentially-expressed network, a related network and a global network to reveal associations between each factor. In each network, there were three types of factors, which were genes, miRNAs and transcription factors that interact with each other. Focus was placed on the differentially-expressed network, in which all genes and miRNAs were differentially expressed and therefore may have affected the development of SOC. Following the comparison and analysis between the three networks, a number of signaling pathways which demonstrated differentially expressed elements were highlighted. Subsequently, the upstream and downstream elements of differentially expressed miRNAs and genes were listed, and a number of key elements (differentially expressed miRNAs, genes and TFs predicted using the P-match method) were analyzed. The differentially expressed network partially illuminated the pathogenesis of SOC. It was hypothesized that if there was no differential expression of miRNAs and genes, SOC may be prevented and treatment may be identified. The present study provided a theoretical foundation for gene therapy for SOC. PMID:29113276

  6. Determining Physical Mechanisms of Gene Expression Regulation from Single Cell Gene Expression Data.

    PubMed

    Ezer, Daphne; Moignard, Victoria; Göttgens, Berthold; Adryan, Boris

    2016-08-01

    Many genes are expressed in bursts, which can contribute to cell-to-cell heterogeneity. It is now possible to measure this heterogeneity with high throughput single cell gene expression assays (single cell qPCR and RNA-seq). These experimental approaches generate gene expression distributions which can be used to estimate the kinetic parameters of gene expression bursting, namely the rate that genes turn on, the rate that genes turn off, and the rate of transcription. We construct a complete pipeline for the analysis of single cell qPCR data that uses the mathematics behind bursty expression to develop more accurate and robust algorithms for analyzing the origin of heterogeneity in experimental samples, specifically an algorithm for clustering cells by their bursting behavior (Simulated Annealing for Bursty Expression Clustering, SABEC) and a statistical tool for comparing the kinetic parameters of bursty expression across populations of cells (Estimation of Parameter changes in Kinetics, EPiK). We applied these methods to hematopoiesis, including a new single cell dataset in which transcription factors (TFs) involved in the earliest branchpoint of blood differentiation were individually up- and down-regulated. We could identify two unique sub-populations within a seemingly homogenous group of hematopoietic stem cells. In addition, we could predict regulatory mechanisms controlling the expression levels of eighteen key hematopoietic transcription factors throughout differentiation. Detailed information about gene regulatory mechanisms can therefore be obtained simply from high throughput single cell gene expression data, which should be widely applicable given the rapid expansion of single cell genomics.
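
    The three kinetic parameters named in the abstract above (the rate at which a gene turns on, the rate at which it turns off, and the rate of transcription) are those of the standard two-state "telegraph" picture of bursty expression. As a hedged illustration only, and not the SABEC or EPiK algorithms themselves, the sketch below simulates that picture with a Gillespie-style stochastic simulation. The first-order mRNA degradation rate k_deg, the function name and all parameter values are assumptions introduced here.

```python
import numpy as np

def simulate_telegraph(k_on, k_off, k_tx, k_deg, t_end, seed=0):
    """Gillespie simulation of a two-state gene; returns time-averaged mRNA.

    k_on, k_off: switching rates; k_tx: transcription rate while on.
    k_deg is an added assumption (first-order mRNA decay) so that the
    mRNA count reaches a steady state.
    """
    rng = np.random.default_rng(seed)
    t, on, m = 0.0, False, 0
    weighted_sum = 0.0
    while True:
        rates = np.array([
            0.0 if on else k_on,    # gene switches on
            k_off if on else 0.0,   # gene switches off
            k_tx if on else 0.0,    # one mRNA is produced
            k_deg * m,              # one mRNA is degraded
        ])
        total = rates.sum()
        dt = rng.exponential(1.0 / total)
        if t + dt >= t_end:
            weighted_sum += m * (t_end - t)
            break
        weighted_sum += m * dt
        t += dt
        event = rng.choice(4, p=rates / total)
        if event == 0:
            on = True
        elif event == 1:
            on = False
        elif event == 2:
            m += 1
        else:
            m -= 1
    return weighted_sum / t_end

# Hypothetical parameters; the theoretical steady-state mean is
# k_tx * k_on / (k_on + k_off) / k_deg = 10 for these values.
mean_mrna = simulate_telegraph(k_on=1.0, k_off=1.0, k_tx=20.0,
                               k_deg=1.0, t_end=500.0)
print(mean_mrna)
```

    Running such a simulation for many cells gives an expression distribution whose shape depends on the three rates, which is what makes it possible in principle to estimate them from single cell data.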

  7. Diametrical clustering for identifying anti-correlated gene clusters.

    PubMed

    Dhillon, Inderjit S; Marcotte, Edward M; Roshan, Usman

    2003-09-01

    Clustering genes based upon their expression patterns allows us to predict gene function. Most existing clustering algorithms cluster genes together when their expression patterns show high positive correlation. However, it has been observed that genes whose expression patterns are strongly anti-correlated can also be functionally similar. Biologically, this is not unintuitive: genes responding to the same stimuli, regardless of the nature of the response, are more likely to operate in the same pathways. We present a new diametrical clustering algorithm that explicitly identifies anti-correlated clusters of genes. Our algorithm proceeds by iteratively (i) re-partitioning the genes and (ii) computing the dominant singular vector of each gene cluster, each singular vector serving as the prototype of a 'diametric' cluster. We empirically show the effectiveness of the algorithm in identifying diametrical or anti-correlated clusters. Testing the algorithm on yeast cell cycle data, fibroblast gene expression data, and DNA microarray data from yeast mutants reveals that opposed cellular pathways can be discovered with this method. We present systems whose mRNA expression patterns, and likely their functions, oppose the yeast ribosome and proteasome, along with evidence for the inverse transcriptional regulation of a number of cellular systems.
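
    The two alternating steps named in the abstract above (re-partition the genes, then take the dominant singular vector of each cluster as its prototype) can be sketched roughly as follows. This is an illustrative reading, not the published implementation: it assumes that each gene profile is mean-centred and scaled to unit length, that a gene is assigned to the prototype with the largest squared dot product (so that correlated and anti-correlated profiles land together), and that prototypes are initialised by a simple farthest-first rule. The function name and the toy data are hypothetical.

```python
import numpy as np

def diametrical_clustering(X, k, n_iter=50):
    """Alternate (i) re-partitioning and (ii) dominant-singular-vector updates."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=1, keepdims=True)             # centre each gene profile
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # scale to unit length

    # Assumed initialisation: start from the first gene, then repeatedly add
    # the gene least similar (in squared dot product) to the chosen prototypes.
    protos = [X[0]]
    while len(protos) < k:
        best = np.max((X @ np.array(protos).T) ** 2, axis=1)
        protos.append(X[np.argmin(best)])
    protos = np.array(protos)

    labels = None
    for _ in range(n_iter):
        # (i) assign each gene by squared similarity, which ignores the sign
        new_labels = np.argmax((X @ protos.T) ** 2, axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # (ii) prototype = dominant right singular vector of the member rows
        for j in range(k):
            members = X[labels == j]
            if len(members):
                protos[j] = np.linalg.svd(members, full_matrices=False)[2][0]
    return labels, protos

# Toy data: two patterns, each present in both orientations, plus small noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 20, endpoint=False)
a, b = np.sin(t), np.cos(t)
genes = np.array([a, -a, a, -a, b, -b, b, -b]) + 0.05 * rng.normal(size=(8, 20))
labels, _ = diametrical_clustering(genes, k=2)
print(labels)
```

    Squaring the similarity is what distinguishes this from ordinary correlation-based clustering: a profile and its mirror image score identically against a prototype, so they end up in the same cluster.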

  8. Epigenetic Alteration by DNA Methylation of ESR1, MYOD1 and hTERT Gene Promoters is Useful for Prediction of Response in Patients of Locally Advanced Invasive Cervical Carcinoma Treated by Chemoradiation.

    PubMed

    Sood, S; Patel, F D; Ghosh, S; Arora, A; Dhaliwal, L K; Srinivasan, R

    2015-12-01

    Locally advanced invasive cervical cancer [International Federation of Gynecology and Obstetrics (FIGO) IIB/III] is treated by chemoradiation. The response to treatment is variable within a given FIGO stage. Therefore, the aim of the present study was to evaluate the gene promoter methylation profile and corresponding transcript expression of a panel of six genes to identify genes which could predict the response of patients treated by chemoradiation. In total, 100 patients with invasive cervical cancer in FIGO stage IIB/III who underwent chemoradiation treatment were evaluated. Ten patients developed systemic metastases during therapy and were excluded. On the basis of patient follow-up, 69 patients were chemoradiation-sensitive, whereas 21 were chemoradiation-resistant. Gene promoter methylation and gene expression was determined by TaqMan assay and quantitative real-time PCR, respectively, in tissue samples. The methylation frequency of ESR1, BRCA1, RASSF1A, MLH1, MYOD1 and hTERT genes ranged from 40 to 70%. Univariate and hierarchical cluster analysis revealed that gene promoter methylation of MYOD1, ESR1 and hTERT could predict for chemoradiation response. A pattern of unmethylated MYOD1, unmethylated ESR1 and methylated hTERT promoter as well as lower ESR1 transcript levels predicted for chemoradiation resistance. Methylation profiling of a panel of three genes that includes MYOD1, ESR1 and hTERT may be useful to predict the response of invasive cervical carcinoma patients treated with standard chemoradiation therapy. Copyright © 2015 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved.

  9. Expression of an isoflavone reductase-like gene enhanced by pollen tube growth in pistils of Solanum tuberosum.

    PubMed

    van Eldik, G J; Ruiter, R K; Colla, P H; van Herpen, M M; Schrauwen, J A; Wullems, G J

    1997-03-01

    Successful sexual reproduction relies on gene products delivered by the pistil to create an environment suitable for pollen tube growth. These compounds are either produced before pollination or formed during the interactions between pistil and pollen tubes. Here we describe the pollination-enhanced expression of the cp100 gene in pistils of Solanum tuberosum. Temporal analysis of gene expression revealed enhanced expression that was already detectable one hour after pollination and lasted more than 72 h. Increase in expression also occurred after touching the stigma and was not restricted to the site of touch but spread into the style. The predicted CP100 protein shows similarity to leguminous isoflavone reductases (IFRs), but belongs to a family of IFR-like NAD(P)H-dependent oxidoreductases present in various plant species.

  10. From SNP co-association to RNA co-expression: novel insights into gene networks for intramuscular fatty acid composition in porcine.

    PubMed

    Ramayo-Caldas, Yuliaxis; Ballester, Maria; Fortes, Marina R S; Esteve-Codina, Anna; Castelló, Anna; Noguera, Jose L; Fernández, Ana I; Pérez-Enciso, Miguel; Reverter, Antonio; Folch, Josep M

    2014-03-26

    Fatty acids (FA) play a critical role in energy homeostasis and metabolic diseases; in the context of livestock species, their profile also impacts on meat quality for healthy human consumption. Molecular pathways controlling lipid metabolism are highly interconnected and are not fully understood. Elucidating these molecular processes will aid technological development towards improvement of pork meat quality and increased knowledge of FA metabolism, underpinning metabolic diseases in humans. The results from genome-wide association studies (GWAS) across 15 phenotypes were subjected to an Association Weight Matrix (AWM) approach to predict a network of 1,096 genes related to intramuscular FA composition in pigs. To identify the key regulators of FA metabolism, we focused on the minimal set of transcription factors (TF) that explored the majority of the network topology. Pathway and network analyses pointed towards a trio of TF as key regulators of FA metabolism: NCOA2, FHL2 and EP300. Promoter sequence analyses confirmed that these TF have binding sites for some well-known regulators of lipid and carbohydrate metabolism. For the first time in a non-model species, some of the co-associations observed at the genetic level were validated through co-expression at the transcriptomic level based on real-time PCR of 40 genes in adipose tissue, and a further 55 genes in liver. In particular, liver expression of NCOA2 and EP300 differed between pig breeds (Iberian and Landrace) extreme in terms of fat deposition. Highly clustered co-expression networks in both liver and adipose tissues were observed. EP300 and NCOA2 showed centrality parameters above average in both networks. Over all genes, co-expression analyses confirmed 28.9% of the AWM predicted gene-gene interactions in liver and 33.0% in adipose tissue. The magnitude of this validation varied across genes, with up to 60.8% of the connections of NCOA2 in adipose tissue being validated via co-expression. 
Our results recapitulate the known transcriptional regulation of FA metabolism, predict gene interactions that can be experimentally validated, and suggest that genetic variants mapped to EP300, FHL2, and NCOA2 modulate lipid metabolism and control energy homeostasis in pigs.

  11. A statistical approach to identify, monitor, and manage incomplete curated data sets.

    PubMed

    Howe, Douglas G

    2018-04-02

    Many biological knowledge bases gather data through expert curation of published literature. High data volume, selective partial curation, delays in access, and publication of data prior to the ability to curate it can result in incomplete curation of published data. Knowing which data sets are incomplete and how incomplete they are remains a challenge. Awareness that a data set may be incomplete is important for proper interpretation, helps to avoid flawed hypothesis generation, and can justify further exploration of published literature for additional relevant data. Computational methods to assess data set completeness are needed. One such method is presented here. In this work, a multivariate linear regression model was used to identify genes in the Zebrafish Information Network (ZFIN) Database having incomplete curated gene expression data sets. Starting with 36,655 gene records from ZFIN, data aggregation, cleansing, and filtering reduced the set to 9870 gene records suitable for training and testing the model to predict the number of expression experiments per gene. Feature engineering and selection identified the following predictive variables: the number of journal publications; the number of journal publications already attributed for gene expression annotation; the percent of journal publications already attributed for expression data; the gene symbol; and the number of transgenic constructs associated with each gene. Twenty-five percent of the gene records (2483 genes) were used to train the model. The remaining 7387 genes were used to test the model. One hundred and twenty-two and 165 of the 7387 tested genes were identified as missing expression annotations based on their residuals being outside the model lower or upper 95% confidence interval respectively. The model had precision of 0.97 and recall of 0.71 at the negative 95% confidence interval and precision of 0.76 and recall of 0.73 at the positive 95% confidence interval. 
This method can be used to identify data sets that are incompletely curated, as demonstrated using the gene expression data set from ZFIN. This information can help both database resources and data consumers gauge when it may be useful to look further for published data to augment the existing expertly curated information.
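
    The residual-based flagging described in this abstract can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the feature columns (`n_pubs`, `n_constructs`), the coefficients, and the use of plus or minus 1.96 residual standard deviations as the "95% interval" are all assumptions made for the example.

```python
import numpy as np

def add_intercept(X):
    # Prepend a column of ones so the model has an intercept term.
    return np.column_stack([np.ones(len(X)), X])

def fit_ols(X, y):
    # Ordinary least squares fit; returns coefficients and residual SD.
    X1 = add_intercept(X)
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    sigma = np.sqrt(resid @ resid / (len(y) - X1.shape[1]))
    return coef, sigma

def flag_outliers(X, y, coef, sigma, z=1.96):
    # Flag records whose residual (observed - predicted) lies outside +/- z * sigma.
    resid = y - add_intercept(X) @ coef
    return resid < -z * sigma, resid > z * sigma

# Synthetic stand-in for per-gene features (hypothetical values).
rng = np.random.default_rng(0)
n = 200
n_pubs = rng.integers(1, 50, size=n).astype(float)
n_constructs = rng.integers(0, 5, size=n).astype(float)
X = np.column_stack([n_pubs, n_constructs])
y = 2.0 * n_pubs + 1.0 * n_constructs + rng.normal(0.0, 1.0, size=n)

# Train on 25% of the records and test on the rest, as in the abstract.
train, test = slice(0, 50), slice(50, n)
coef, sigma = fit_ols(X[train], y[train])

X_test, y_test = X[test], y[test].copy()
y_test[0] -= 30.0  # simulate one gene with far fewer recorded experiments than expected
below, above = flag_outliers(X_test, y_test, coef, sigma)
```

    Records flagged in `below` or `above` would be the candidates for a closer look at the literature; which direction signals missing annotations depends on how the residual is defined.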

  12. Prediction on the inhibition ratio of pyrrolidine derivatives on matrix metalloproteinase based on gene expression programming.

    PubMed

    Li, Yuqin; You, Guirong; Jia, Baoxiu; Si, Hongzong; Yao, Xiaojun

    2014-01-01

    Quantitative structure-activity relationships (QSAR) were developed to predict the inhibition ratio of pyrrolidine derivatives on matrix metalloproteinase via heuristic method (HM) and gene expression programming (GEP). The descriptors of 33 pyrrolidine derivatives were calculated by the software CODESSA, which can calculate quantum chemical, topological, geometrical, constitutional, and electrostatic descriptors. HM was also used for the preselection of 5 appropriate molecular descriptors. Linear and nonlinear QSAR models were developed based on HM and GEP separately, and the two prediction models yielded good correlation coefficients (R²) of 0.93 and 0.94. The two QSAR models are useful in predicting the inhibition ratio of pyrrolidine derivatives on matrix metalloproteinase during the discovery of new anticancer drugs and in providing theoretical information for studying new drugs.

  13. Construction of a novel multi-gene assay (42-gene classifier) for prediction of late recurrence in ER-positive breast cancer patients.

    PubMed

    Tsunashima, Ryo; Naoi, Yasuto; Shimazu, Kenzo; Kagara, Naofumi; Shimoda, Masashi; Tanei, Tomonori; Miyake, Tomohiro; Kim, Seung Jin; Noguchi, Shinzaburo

    2018-05-04

    Prediction models for late (> 5 years) recurrence in ER-positive breast cancer need to be developed for the accurate selection of patients for extended hormonal therapy. We attempted to develop such a prediction model focusing on the differences in gene expression between breast cancers with early and late recurrence. For the training set, 779 ER-positive breast cancers treated with tamoxifen alone for 5 years were selected from the databases (GSE6532, GSE12093, GSE17705, and GSE26971). For the validation set, 221 ER-positive breast cancers treated with adjuvant hormonal therapy for 5 years with or without chemotherapy at our hospital were included. Gene expression was assayed by DNA microarray analysis (Affymetrix U133 plus 2.0). With the 42 genes differentially expressed in early and late recurrence breast cancers in the training set, a prediction model (42GC) for late recurrence was constructed. The patients classified by 42GC into the late recurrence-like group showed a significantly (P = 0.006) higher late recurrence rate, as expected, but a significantly (P = 1.62 × 10^-13) lower early recurrence rate than the non-late recurrence-like group. These observations were confirmed for the validation set, i.e., P = 0.020 for late recurrence and P = 5.70 × 10^-5 for early recurrence. We developed a unique prediction model (42GC) for late recurrence by focusing on the biological differences between breast cancers with early and late recurrence. Interestingly, patients in the late recurrence-like group by 42GC were at low risk for early recurrence.

  14. Predicting Response to Histone Deacetylase Inhibitors Using High-Throughput Genomics.

    PubMed

    Geeleher, Paul; Loboda, Andrey; Lenkala, Divya; Wang, Fan; LaCroix, Bonnie; Karovic, Sanja; Wang, Jacqueline; Nebozhyn, Michael; Chisamore, Michael; Hardwick, James; Maitland, Michael L; Huang, R Stephanie

    2015-11-01

    Many disparate biomarkers have been proposed as predictors of response to histone deacetylase inhibitors (HDI); however, all have failed when applied clinically. Rather than this being entirely an issue of reproducibility, response to the HDI vorinostat may be determined by the additive effect of multiple molecular factors, many of which have previously been demonstrated. We conducted a large-scale gene expression analysis using the Cancer Genome Project for discovery and generated another large independent cancer cell line dataset across different cancers for validation. We compared different approaches in terms of how accurately vorinostat response can be predicted on an independent out-of-batch set of samples and applied the polygenic marker prediction principles in a clinical trial. Using machine learning, the small effects that aggregate, resulting in sensitivity or resistance, can be recovered from gene expression data in a large panel of cancer cell lines. This approach can predict vorinostat response accurately, whereas single gene or pathway markers cannot. Our analyses recapitulated and contextualized many previous findings and suggest an important role for processes such as chromatin remodeling, autophagy, and apoptosis. As a proof of concept, we also discovered a novel causative role for CHD4, a helicase involved in the histone deacetylase complex that is associated with poor clinical outcome. As a clinical validation, we demonstrated that a common dose-limiting toxicity of vorinostat, thrombocytopenia, can be predicted (r = 0.55, P = .004) several days before it is detected clinically. Our work suggests a paradigm shift from single-gene/pathway evaluation to simultaneously evaluating multiple independent high-throughput gene expression datasets, which can be easily extended to other investigational compounds where similar issues are hampering clinical adoption. © The Author 2015. Published by Oxford University Press. All rights reserved.

  15. Complete genome sequence and the expression pattern of plasmids of the model ethanologen Zymomonas mobilis ZM4 and its xylose-utilizing derivatives 8b and 2032.

    PubMed

    Yang, Shihui; Vera, Jessica M; Grass, Jeff; Savvakis, Giannis; Moskvin, Oleg V; Yang, Yongfu; McIlwain, Sean J; Lyu, Yucai; Zinonos, Irene; Hebert, Alexander S; Coon, Joshua J; Bates, Donna M; Sato, Trey K; Brown, Steven D; Himmel, Michael E; Zhang, Min; Landick, Robert; Pappas, Katherine M; Zhang, Yaoping

    2018-01-01

    Zymomonas mobilis is a natural ethanologen being developed and deployed as an industrial biofuel producer. To date, eight Z. mobilis strains have been completely sequenced and found to contain 2-8 native plasmids. However, systematic verification of predicted Z. mobilis plasmid genes and their contribution to cell fitness has not been hitherto addressed. Moreover, the precise number and identities of plasmids in Z. mobilis model strain ZM4 have been unclear. The lack of functional information about plasmid genes in ZM4 impedes ongoing studies for this model biofuel-producing strain. In this study, we determined the complete chromosome and plasmid sequences of ZM4 and its engineered xylose-utilizing derivatives 2032 and 8b. Compared to previously published and revised ZM4 chromosome sequences, the ZM4 chromosome sequence reported here contains 65 nucleotide sequence variations as well as a 2400-bp insertion. Four plasmids were identified in all three strains, with 150 plasmid genes predicted in strain ZM4 and 2032, and 153 plasmid genes predicted in strain 8b due to the insertion of heterologous DNA for expanded substrate utilization. Plasmid genes were then annotated using Blast2GO, InterProScan, and systems biology data analyses, and most genes were found to have apparent orthologs in other organisms or identifiable conserved domains. To verify plasmid gene prediction, RNA-Seq was used to map transcripts and also compare relative gene expression under various growth conditions, including anaerobic and aerobic conditions, or growth in different concentrations of biomass hydrolysates. Overall, plasmid genes were more responsive to varying hydrolysate concentrations than to oxygen availability. Additionally, our results indicated that although all plasmids were present in low copy number (about 1-2 per cell), the copy number of some plasmids varied under specific growth conditions or due to heterologous gene insertion. 
The complete genome of ZM4 and two xylose-utilizing derivatives is reported in this study, with an emphasis on identifying and characterizing plasmid genes. Plasmid gene annotation, validation, expression levels at growth conditions of interest, and contribution to host fitness are reported for the first time.

  16. Polycistronic gene expression in Aspergillus niger.

    PubMed

    Schuetze, Tabea; Meyer, Vera

    2017-09-25

    Genome mining approaches predict dozens of biosynthetic gene clusters in each of the filamentous fungal genomes sequenced so far. However, the majority of these gene clusters still remain cryptic because they are not expressed in their natural host. Simultaneous expression of all genes belonging to a biosynthetic pathway in a heterologous host is one approach to activate biosynthetic gene clusters and to screen the metabolites produced for bioactivities. Polycistronic expression of all pathway genes under control of a single and tunable promoter would be the method of choice, as this not only simplifies cloning procedures but also offers control over the timing and strength of expression. However, polycistronic gene expression is a feature not commonly found in eukaryotic host systems, such as Aspergillus niger. In this study, we tested the suitability of the viral P2A peptide for co-expression of three genes in A. niger. Two genes descend from Fusarium oxysporum and are essential to produce the secondary metabolite enniatin (esyn1, ekivR). The third gene (luc) encodes the reporter luciferase which was included to study position effects. Expression of the polycistronic gene cassette was put under control of the Tet-On system to ensure tunable gene expression in A. niger. In total, three polycistronic expression cassettes which differed in the position of luc were constructed and targeted to the pyrG locus in A. niger. This allowed direct comparison of the luciferase activity based on the position of the luciferase gene. Doxycycline-mediated induction of the Tet-On expression cassettes resulted in the production of one long polycistronic mRNA as proven by Northern analyses, and ensured comparable production of enniatin in all three strains. Notably, gene position within the polycistronic expression cassette matters, as luciferase activity was lowest at position one and comparable at positions two and three.
The P2A peptide can be used to express at least three genes polycistronically in A. niger. This approach can now be applied to heterologously express entire secondary metabolite gene clusters polycistronically or to co-express any genes of interest in equimolar amounts.

  17. Coregulation of Terpenoid Pathway Genes and Prediction of Isoprene Production in Bacillus subtilis Using Transcriptomics.

    PubMed

    Hess, Becky M; Xue, Junfeng; Markillie, Lye Meng; Taylor, Ronald C; Wiley, H Steven; Ahring, Birgitte K; Linggi, Bryan

    2013-01-01

    The isoprenoid pathway converts pyruvate to isoprene and related isoprenoid compounds in plants and some bacteria. Currently, this pathway is of great interest because of the critical role that isoprenoids play in basic cellular processes, as well as the industrial value of metabolites such as isoprene. Although the regulation of several pathway genes has been described, there is a paucity of information regarding system level regulation and control of the pathway. To address these limitations, we examined Bacillus subtilis grown under multiple conditions and determined the relationship between altered isoprene production and gene expression patterns. We found that with respect to the amount of isoprene produced, terpenoid genes fall into two distinct subsets with opposing correlations. The group whose expression levels positively correlated with isoprene production included dxs, which is responsible for the commitment step in the pathway, ispD, and two genes that participate in the mevalonate pathway, yhfS and pksG. The subset of terpenoid genes that inversely correlated with isoprene production included ispH, ispF, hepS, uppS, ispE, and dxr. A genome-wide partial least squares regression model was created to identify other genes or pathways that contribute to isoprene production. These analyses showed that a subset of 213 regulated genes was sufficient to create a predictive model of isoprene production under different conditions and showed correlations at the transcriptional level. We conclude that gene expression levels alone are sufficiently informative about the metabolic state of a cell that produces increased isoprene and can be used to build a model that accurately predicts production of this secondary metabolite across many simulated environmental conditions.
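
    The split of genes into positively and inversely correlated subsets described in this abstract can be illustrated with a short sketch. It assumes a plain Pearson correlation between each gene's expression and the measured production across conditions; the abstract does not state which correlation measure was used, and the gene names and values below are hypothetical.

```python
import numpy as np

def split_by_correlation(expr, production, gene_names):
    # expr: (conditions x genes) expression matrix; production: one value per condition.
    expr_c = expr - expr.mean(axis=0)
    prod_c = production - production.mean()
    # Pearson correlation of every gene column with production.
    r = (expr_c.T @ prod_c) / (np.linalg.norm(expr_c, axis=0) * np.linalg.norm(prod_c))
    positive = [g for g, ri in zip(gene_names, r) if ri > 0]
    negative = [g for g, ri in zip(gene_names, r) if ri < 0]
    return r, positive, negative

# Hypothetical measurements over five growth conditions.
production = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
expr = np.column_stack([
    [1.1, 1.9, 3.2, 3.9, 5.1],  # made-up gene that rises with production
    [5.0, 4.2, 2.9, 2.1, 1.0],  # made-up gene that falls with production
])
r, pos, neg = split_by_correlation(expr, production, ["geneA", "geneB"])
```

    A genome-wide regression model such as the partial least squares model mentioned in the abstract would go further by combining many genes into one predictor; this sketch only covers the per-gene correlation step.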

  18. Coregulation of Terpenoid Pathway Genes and Prediction of Isoprene Production in Bacillus subtilis Using Transcriptomics

    PubMed Central

    Hess, Becky M.; Xue, Junfeng; Markillie, Lye Meng; Taylor, Ronald C.; Wiley, H. Steven; Ahring, Birgitte K.; Linggi, Bryan

    2013-01-01

    The isoprenoid pathway converts pyruvate to isoprene and related isoprenoid compounds in plants and some bacteria. Currently, this pathway is of great interest because of the critical role that isoprenoids play in basic cellular processes, as well as the industrial value of metabolites such as isoprene. Although the regulation of several pathway genes has been described, there is a paucity of information regarding system level regulation and control of the pathway. To address these limitations, we examined Bacillus subtilis grown under multiple conditions and determined the relationship between altered isoprene production and gene expression patterns. We found that with respect to the amount of isoprene produced, terpenoid genes fall into two distinct subsets with opposing correlations. The group whose expression levels positively correlated with isoprene production included dxs, which is responsible for the commitment step in the pathway, ispD, and two genes that participate in the mevalonate pathway, yhfS and pksG. The subset of terpenoid genes that inversely correlated with isoprene production included ispH, ispF, hepS, uppS, ispE, and dxr. A genome-wide partial least squares regression model was created to identify other genes or pathways that contribute to isoprene production. These analyses showed that a subset of 213 regulated genes was sufficient to create a predictive model of isoprene production under different conditions and showed correlations at the transcriptional level. We conclude that gene expression levels alone are sufficiently informative about the metabolic state of a cell that produces increased isoprene and can be used to build a model that accurately predicts production of this secondary metabolite across many simulated environmental conditions. PMID:23840410

  19. Expression signature as a biomarker for prenatal diagnosis of trisomy 21.

    PubMed

    Volk, Marija; Maver, Aleš; Lovrečić, Luca; Juvan, Peter; Peterlin, Borut

    2013-01-01

    A universal biomarker panel with the potential to predict high-risk pregnancies or adverse pregnancy outcome does not exist. Transcriptome analysis is a powerful tool to capture differentially expressed genes (DEG), which can be used as a biomarker-diagnostic-predictive tool for various conditions in the prenatal setting. In search of a biomarker set for predicting high-risk pregnancies, we performed global expression profiling to find DEG in Ts21. Subsequently, we performed targeted validation and diagnostic performance evaluation on a larger group of case and control samples. Initially, transcriptomic profiles of 10 cultivated amniocyte samples with Ts21 and 9 with normal euploid constitution were determined using expression microarrays. Datasets from Ts21 transcriptomic studies from the GEO repository were incorporated. DEG were discovered using linear regression modelling and validated using RT-PCR quantification on an independent sample of 16 cases with Ts21 and 32 controls. Classification of Ts21 status based on expression profiling was performed using a supervised machine learning algorithm and evaluated using a leave-one-out cross validation approach. Global gene expression profiling revealed significant expression changes between normal and Ts21 samples, which, in combination with data from previously performed Ts21 transcriptomic studies, were used to generate a multi-gene biomarker for Ts21 comprising 9 gene expression profiles. In addition to the biomarker's high performance in discriminating samples from global expression profiling, we were also able to show its discriminatory performance on a larger sample set 2, validated using RT-PCR experiment (AUC=0.97), while its performance on data from previously published studies reached discriminatory AUC values of 1.00. Our results show that transcriptomic changes might potentially be used to discriminate trisomy of chromosome 21 in the prenatal setting.
As expressional alterations reflect both causal and reactive cellular mechanisms, transcriptomic changes may thus have future potential in the diagnosis of a wide array of heterogeneous diseases that result from genetic disturbances.
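
    Leave-one-out cross validation, as named in this abstract, can be sketched as below. The abstract does not say which supervised algorithm was used, so a nearest-centroid classifier stands in here as an assumption; the data are random numbers shaped like a 9-feature panel measured on 9 controls and 10 cases, not the study's measurements.

```python
import numpy as np

def loocv_nearest_centroid(X, y):
    # Hold out each sample once, fit class centroids on the rest, predict the held-out sample.
    n = len(y)
    correct = 0
    for i in range(n):
        keep = np.arange(n) != i
        X_tr, y_tr = X[keep], y[keep]
        centroids = {c: X_tr[y_tr == c].mean(axis=0) for c in np.unique(y_tr)}
        pred = min(centroids, key=lambda c: np.linalg.norm(X[i] - centroids[c]))
        correct += int(pred == y[i])
    return correct / n

# Synthetic, well-separated classes (illustrative only).
rng = np.random.default_rng(1)
controls = rng.normal(0.0, 1.0, size=(9, 9))
cases = rng.normal(3.0, 1.0, size=(10, 9))
X = np.vstack([controls, cases])
y = np.array([0] * 9 + [1] * 10)
accuracy = loocv_nearest_centroid(X, y)
```

    Because every sample is scored by a model that never saw it, the resulting accuracy is less optimistic than one computed on the training data, which matters with sample sizes this small.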

  20. Noise in gene expression is coupled to growth rate.

    PubMed

    Keren, Leeat; van Dijk, David; Weingarten-Gabbay, Shira; Davidi, Dan; Jona, Ghil; Weinberger, Adina; Milo, Ron; Segal, Eran

    2015-12-01

    Genetically identical cells exposed to the same environment display variability in gene expression (noise), with important consequences for the fidelity of cellular regulation and biological function. Although population average gene expression is tightly coupled to growth rate, the effects of changes in environmental conditions on expression variability are not known. Here, we measure the single-cell expression distributions of approximately 900 Saccharomyces cerevisiae promoters across four environmental conditions using flow cytometry, and find that gene expression noise is tightly coupled to the environment and is generally higher at lower growth rates. Nutrient-poor conditions, which support lower growth rates, display elevated levels of noise for most promoters, regardless of their specific expression values. We present a simple model of noise in expression that results from having an asynchronous population, with cells at different cell-cycle stages, and with different partitioning of the cells between the stages at different growth rates. This model predicts non-monotonic global changes in noise at different growth rates as well as overall higher variability in expression for cell-cycle-regulated genes in all conditions. The consistency between this model and our data, as well as with noise measurements of cells growing in a chemostat at well-defined growth rates, suggests that cell-cycle heterogeneity is a major contributor to gene expression noise. Finally, we identify gene and promoter features that play a role in gene expression noise across conditions. Our results show the existence of growth-related global changes in gene expression noise and suggest their potential phenotypic implications. © 2015 Keren et al.; Published by Cold Spring Harbor Laboratory Press.
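
    One way to see why an asynchronous population could give non-monotonic noise is a toy two-stage mixture. This is not the authors' model, only an application of the law of total variance under assumed numbers: a fraction `p` of cells sits in a stage with mean `m1` and variance `v1`, and the rest in a stage with mean `m2` and variance `v2`. Noise is taken here as variance divided by squared mean.

```python
def mixture_noise(p, m1, v1, m2, v2):
    # Population mean of a two-component mixture.
    mean = p * m1 + (1 - p) * m2
    # Law of total variance: within-stage variance plus between-stage variance.
    var = p * v1 + (1 - p) * v2 + p * (1 - p) * (m1 - m2) ** 2
    return var / mean ** 2

# Hypothetical stages: expression doubles between stage 1 and stage 2.
fractions = [0.0, 0.25, 0.5, 0.75, 1.0]
noise = [mixture_noise(p, 1.0, 0.01, 2.0, 0.01) for p in fractions]
```

    In this toy setting the between-stage term `p * (1 - p) * (m1 - m2) ** 2` vanishes when all cells share one stage and peaks at intermediate partitioning, so noise rises and then falls as `p` changes.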

  1. Noise in gene expression is coupled to growth rate

    PubMed Central

    Keren, Leeat; van Dijk, David; Weingarten-Gabbay, Shira; Davidi, Dan; Jona, Ghil; Weinberger, Adina; Milo, Ron; Segal, Eran

    2015-01-01

    Genetically identical cells exposed to the same environment display variability in gene expression (noise), with important consequences for the fidelity of cellular regulation and biological function. Although population average gene expression is tightly coupled to growth rate, the effects of changes in environmental conditions on expression variability are not known. Here, we measure the single-cell expression distributions of approximately 900 Saccharomyces cerevisiae promoters across four environmental conditions using flow cytometry, and find that gene expression noise is tightly coupled to the environment and is generally higher at lower growth rates. Nutrient-poor conditions, which support lower growth rates, display elevated levels of noise for most promoters, regardless of their specific expression values. We present a simple model of noise in expression that results from having an asynchronous population, with cells at different cell-cycle stages, and with different partitioning of the cells between the stages at different growth rates. This model predicts non-monotonic global changes in noise at different growth rates as well as overall higher variability in expression for cell-cycle–regulated genes in all conditions. The consistency between this model and our data, as well as with noise measurements of cells growing in a chemostat at well-defined growth rates, suggests that cell-cycle heterogeneity is a major contributor to gene expression noise. Finally, we identify gene and promoter features that play a role in gene expression noise across conditions. Our results show the existence of growth-related global changes in gene expression noise and suggest their potential phenotypic implications. PMID:26355006

  2. Increased lipoprotein lipase activity in non-small cell lung cancer tissue predicts shorter patient survival.

    PubMed

    Trost, Zoran; Sok, Miha; Marc, Janja; Cerne, Darko

    2009-07-01

    Cumulative evidence suggests the involvement of lipoprotein lipase (LPL) in tumor progression. We tested the hypothesis that increased LPL activity in resectable non-small cell lung cancer (NSCLC) tissue and the increased LPL gene expression in the surrounding non-cancer lung tissue found in our previous study are predictors of patient survival. Forty two consecutive patients with resected NSCLC were enrolled in the study. Paired samples of lung cancer tissue and adjacent non-cancer lung tissue were collected from resected specimens for baseline LPL activity and gene expression estimation. During a 4-year follow-up, 21 patients died due to tumor progression. One patient died due to a non-cancer reason and was not included in Cox regression analysis. High LPL activity in cancer tissue (relative to the adjacent non-cancer lung tissue) predicted shorter survival, independently of standard prognostic factors (p=0.003). High gene expression in the non-cancer lung tissue surrounding the tumor had no predictive value. Our study further underlines the involvement of cancer tissue LPL activity in tumor progression.

  3. Ewing's Sarcoma: An Analysis of miRNA Expression Profiles and Target Genes in Paraffin-Embedded Primary Tumor Tissue.

    PubMed

    Parafioriti, Antonina; Bason, Caterina; Armiraglio, Elisabetta; Calciano, Lucia; Daolio, Primo Andrea; Berardocco, Martina; Di Bernardo, Andrea; Colosimo, Alessia; Luksch, Roberto; Berardi, Anna C

    2016-04-30

    The molecular mechanism responsible for Ewing's Sarcoma (ES) remains largely unknown. MicroRNAs (miRNAs), a class of small non-coding RNAs able to regulate gene expression, are deregulated in tumors and may serve as a tool for diagnosis and prediction. However, the status of miRNAs in ES has not yet been thoroughly investigated. This study compared global miRNA expression in paraffin-embedded tumor tissue samples from 20 ES patients, affected by primary untreated tumors, with miRNAs expressed in normal human mesenchymal stromal cells (MSCs) by microarray analysis. The miRTarBase database was used to identify the predicted target genes for differentially expressed miRNAs. The miRNA microarray analysis revealed distinct patterns of miRNA expression between ES samples and normal MSCs. Fifty-eight of the 954 analyzed miRNAs were significantly differentially expressed in ES samples compared to MSCs. Moreover, the qRT-PCR analysis carried out on three selected miRNAs showed that miR-181b, miR-1915 and miR-1275 were significantly aberrantly regulated, confirming the microarray results. Bio-database analysis identified BCL-2 as a bona fide target gene of miR-21, miR-181a, miR-181b, miR-29a, miR-29b, miR-497, miR-195, miR-let-7a, miR-34a and miR-1915. Using paraffin-embedded tissues from ES patients, this study has identified several potential target miRNAs and one gene that might be considered a novel critical biomarker for ES pathogenesis.

  4. Molecular Characteristics of High-Dose Melphalan Associated Oral Mucositis in Patients with Multiple Myeloma: A Gene Expression Study on Human Mucosa

    PubMed Central

    Bødker, Julie Støve; Christensen, Heidi Søgaard; Johansen, Preben; Nielsen, Søren; Christiansen, Ilse; Bergmann, Olav Jonas; Bøgsted, Martin; Dybkær, Karen; Vyberg, Mogens; Johnsen, Hans Erik

    2017-01-01

    Background: Toxicity of the oral and gastrointestinal mucosa induced by high-dose melphalan is a clinical challenge with no documented prophylactic interventions or predictive tests. The aim of this study was to describe molecular changes in human oral mucosa and to identify biomarkers correlated with the grade of clinical mucositis. Methods and Findings: Ten patients with multiple myeloma (MM) were included. For each patient, we acquired three buccal biopsies, one before, one at 2 days, and one at 20 days after high-dose melphalan administration. We also acquired buccal biopsies from 10 healthy individuals that served as controls. We analyzed the biopsies for global gene expression and performed an immunohistochemical analysis to determine HLA-DRB5 expression. We evaluated associations between clinical mucositis and gene expression profiles. Compared to gene expression levels before and 20 days after therapy, at two days after melphalan treatment, we found gene regulation in the p53 and TNF pathways (MDM2, INPPD5, TIGAR), which favored anti-apoptotic defense, and upregulation of immunoregulatory genes (TREM2, LAMP3) in mucosal dendritic cells. This upregulation was independent of clinical mucositis. HLA-DRB1 and HLA-DRB5 (surface receptors on dendritic cells) were expressed at low levels in all patients with MM, in the subgroup of patients with ulcerative mucositis (UM), and in controls; in contrast, the subgroup with low-grade mucositis (NM) displayed 5–6 fold increases in HLA-DRB1 and HLA-DRB5 expression in the first two biopsies, independent of melphalan treatment. Moreover, different splice variants of HLA-DRB1 were expressed in the UM and NM subgroups. Conclusions: Our results revealed that, among patients with MM, immunoregulatory genes and genes involved in defense against apoptosis were affected immediately after melphalan administration, independent of the presence of clinical mucositis.
Furthermore, our results suggested that the expression levels of HLA-DRB1 and HLA-DRB5 may serve as potential predictive biomarkers for mucositis severity. PMID:28052121

  5. Spatial gradients of protein-level time delays set the pace of the traveling segmentation clock waves

    PubMed Central

    Ay, Ahmet; Holland, Jack; Sperlea, Adriana; Devakanmalai, Gnanapackiam Sheela; Knierer, Stephan; Sangervasi, Sebastian; Stevenson, Angel; Özbudak, Ertuğrul M.

    2014-01-01

    The vertebrate segmentation clock is a gene expression oscillator controlling rhythmic segmentation of the vertebral column during embryonic development. The period of oscillations becomes longer as cells are displaced along the posterior to anterior axis, which results in traveling waves of clock gene expression sweeping in the unsegmented tissue. Although various hypotheses necessitating the inclusion of additional regulatory genes into the core clock network at different spatial locations have been proposed, the mechanism underlying traveling waves has remained elusive. Here, we combined molecular-level computational modeling and quantitative experimentation to solve this puzzle. Our model predicts the existence of an increasing gradient of gene expression time delays along the posterior to anterior direction to recapitulate spatiotemporal profiles of the traveling segmentation clock waves in different genetic backgrounds in zebrafish. We validated this prediction by measuring an increased time delay of oscillatory Her1 protein production along the unsegmented tissue. Our results refuted the need for spatial expansion of the core feedback loop to explain the occurrence of traveling waves. Spatial regulation of gene expression time delays is a novel way of creating dynamic patterns; this is the first report demonstrating such a control mechanism in any tissue and future investigations will explore the presence of analogous examples in other biological systems. PMID:25336742

  6. L1000CDS2: LINCS L1000 characteristic direction signatures search engine.

    PubMed

    Duan, Qiaonan; Reid, St Patrick; Clark, Neil R; Wang, Zichen; Fernandez, Nicolas F; Rouillard, Andrew D; Readhead, Ben; Tritsch, Sarah R; Hodos, Rachel; Hafner, Marc; Niepel, Mario; Sorger, Peter K; Dudley, Joel T; Bavari, Sina; Panchal, Rekha G; Ma'ayan, Avi

    2016-01-01

    The library of integrated network-based cellular signatures (LINCS) L1000 data set currently comprises over a million gene expression profiles of chemically perturbed human cell lines. Through several unique intrinsic and extrinsic benchmarking schemes, we demonstrate that processing the L1000 data with the characteristic direction (CD) method significantly improves signal to noise compared with the MODZ method currently used to compute L1000 signatures. The CD processed L1000 signatures are served through a state-of-the-art web-based search engine application called L1000CDS2. The L1000CDS2 search engine provides prioritization of thousands of small-molecule signatures, and their pairwise combinations, predicted to either mimic or reverse an input gene expression signature using two methods. The L1000CDS2 search engine also predicts drug targets for all the small molecules profiled by the L1000 assay that we processed. Targets are predicted by computing the cosine similarity between the L1000 small-molecule signatures and a large collection of signatures extracted from the gene expression omnibus (GEO) for single-gene perturbations in mammalian cells. We applied L1000CDS2 to prioritize small molecules that are predicted to reverse expression in 670 disease signatures also extracted from GEO, and prioritized small molecules that can mimic expression of 22 endogenous ligand signatures profiled by the L1000 assay. As a case study, to further demonstrate the utility of L1000CDS2, we collected expression signatures from human cells infected with Ebola virus at 30, 60 and 120 min. Querying these signatures with L1000CDS2, we identified kenpaullone, a GSK3B/CDK2 inhibitor that we show, in subsequent experiments, has a dose-dependent efficacy in inhibiting Ebola infection in vitro without causing cellular toxicity in human cell lines.
In summary, the L1000CDS2 tool can be applied in many biological and biomedical settings, while improving the extraction of knowledge from the LINCS L1000 resource.
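
    The cosine-similarity comparison of signatures mentioned in this abstract can be sketched as follows. The abstract states that cosine similarity is used for target prediction; applying it to rank "mimic" and "reverse" candidates, as done here, is an assumption for illustration, and the compound names and vectors are made up.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two signature vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_signatures(query, library, mode="reverse"):
    # mode="reverse": most anti-correlated signatures first;
    # mode="mimic": most similar signatures first.
    scores = {name: cosine_similarity(query, sig) for name, sig in library.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=(mode == "mimic"))

# Hypothetical four-gene signatures.
query = np.array([1.0, -1.0, 0.5, 0.0])
library = {
    "compound_A": -query,                          # exact opposite of the query
    "compound_B": 2.0 * query,                     # same direction, larger magnitude
    "compound_C": np.array([0.0, 0.0, 0.0, 1.0]),  # orthogonal to the query
}
reversers = rank_signatures(query, library, mode="reverse")
mimickers = rank_signatures(query, library, mode="mimic")
```

    Cosine similarity ignores vector magnitude, so `compound_B` scores 1 despite being twice as large as the query; only the direction of the expression change matters in this sketch.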

  7. A genome-wide analysis of the flax (Linum usitatissimum L.) dirigent protein family: from gene identification and evolution to differential regulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Corbin, Cyrielle; Drouet, Samantha; Markulin, Lucija

    Identification of DIR encoding genes in flax genome. Analysis of phylogeny, gene/protein structures and evolution. Identification of new conserved motifs linked to biochemical functions. Investigation of spatio-temporal gene expression and response to stress. Dirigent proteins (DIRs) were discovered during 8-8' lignan biosynthesis studies, through identification of stereoselective coupling to afford either (+)- or (-)-pinoresinols from E-coniferyl alcohol. DIRs are also involved or potentially involved in terpenoid, allyl/propenyl phenol lignan, pterocarpan and lignin biosynthesis. DIRs have very large multigene families in different vascular plants including flax, with most still of unknown function. DIR studies typically focus on a small subset of genes and identification of biochemical/physiological functions. Herein, a genome-wide analysis and characterization of the predicted flax DIR 44-membered multigene family was performed, this species being a rich natural grain source of 8-8' linked secoisolariciresinol-derived lignan oligomers. All predicted DIR sequences, including their promoters, were analyzed together with their public gene expression datasets. Expression patterns of selected DIRs were examined using qPCR, as well as through clustering analysis of DIR gene expression. These analyses further implicated roles for specific DIRs in (-)-pinoresinol formation in seed-coats, as well as (+)-pinoresinol in vegetative organs and/or specific responses to stress. Phylogeny and gene expression analysis segregated flax DIRs into six distinct clusters with new cluster-specific motifs identified. We propose that these findings can serve as a foundation to further systematically determine functions of DIRs, i.e. other than those already known in lignan biosynthesis in flax and other species.
Given the differential expression profiles and inducibility of the flax DIR family, we provisionally propose that some DIR genes of unknown function could be involved in different aspects of secondary cell wall biosynthesis and plant defense.

  8. A genome-wide analysis of the flax (Linum usitatissimum L.) dirigent protein family: from gene identification and evolution to differential regulation.

    PubMed

    Corbin, Cyrielle; Drouet, Samantha; Markulin, Lucija; Auguin, Daniel; Lainé, Éric; Davin, Laurence B; Cort, John R; Lewis, Norman G; Hano, Christophe

    2018-05-01

    Identification of DIR encoding genes in flax genome. Analysis of phylogeny, gene/protein structures and evolution. Identification of new conserved motifs linked to biochemical functions. Investigation of spatio-temporal gene expression and response to stress. Dirigent proteins (DIRs) were discovered during 8-8' lignan biosynthesis studies, through identification of stereoselective coupling to afford either (+)- or (-)-pinoresinols from E-coniferyl alcohol. DIRs are also involved or potentially involved in terpenoid, allyl/propenyl phenol lignan, pterocarpan and lignin biosynthesis. DIRs have very large multigene families in different vascular plants including flax, with most still of unknown function. DIR studies typically focus on a small subset of genes and identification of biochemical/physiological functions. Herein, a genome-wide analysis and characterization of the predicted flax DIR 44-membered multigene family was performed, this species being a rich natural grain source of 8-8' linked secoisolariciresinol-derived lignan oligomers. All predicted DIR sequences, including their promoters, were analyzed together with their public gene expression datasets. Expression patterns of selected DIRs were examined using qPCR, as well as through clustering analysis of DIR gene expression. These analyses further implicated roles for specific DIRs in (-)-pinoresinol formation in seed-coats, as well as (+)-pinoresinol in vegetative organs and/or specific responses to stress. Phylogeny and gene expression analysis segregated flax DIRs into six distinct clusters with new cluster-specific motifs identified. We propose that these findings can serve as a foundation to further systematically determine functions of DIRs, i.e. other than those already known in lignan biosynthesis in flax and other species. 
Given the differential expression profiles and inducibility of the flax DIR family, we provisionally propose that some DIR genes of unknown function could be involved in different aspects of secondary cell wall biosynthesis and plant defense.

  9. Prediction of Microbial Infection of Cultured Cells Using DNA Microarray Gene-Expression Profiles of Host Responses

    PubMed Central

    Park, Yu Rang; Chung, Tae Su; Lee, Young Joo; Song, Yeong Wook; Lee, Eun Young; Sohn, Yeo Won; Song, Sukgil; Park, Woong Yang

    2012-01-01

    Infection by microorganisms may cause fatally erroneous interpretations in biological research based on cell culture. Contamination of cell cultures by microorganisms is quite frequent (5% to 35%). However, current approaches to identifying contamination have many limitations, such as high cost in time and labor and difficulty in interpreting the results. In this paper, we propose a model to predict cell infection using a microarray technique that gives an overview of the whole-genome profile. By analysis of 62 microarray expression profiles under various experimental conditions altering cell type, source of infection and collection time, we discovered 5 marker genes: NM_005298, NM_016408, NM_014588, S76389, and NM_001853. In addition, we discovered that two of these genes, S76389 and NM_001853, are involved in a Mycoplasma-specific infection process. We also suggest models to predict the source of infection, cell type or time after infection. We implemented a web-based prediction tool for microarray data, named Prediction of Microbial Infection (http://www.snubi.org/software/PMI). PMID:23091307

  10. GENOMIC IMPRINTING, DISRUPTED PLACENTAL EXPRESSION, AND SPECIATION

    PubMed Central

    Brekke, Thomas D.; Henry, Lindy A.; Good, Jeffrey M.

    2016-01-01

    The importance of regulatory incompatibilities to the early stages of speciation remains unclear. Hybrid mammals often show extreme parent-of-origin growth effects that are thought to be a consequence of disrupted genetic imprinting (parent-specific epigenetic gene silencing) during early development. Here we test the long-standing hypothesis that abnormal hybrid growth reflects disrupted gene expression due to loss of imprinting (LOI) in hybrid placentas, resulting in dosage imbalances between paternal growth factors and maternal growth repressors. We analyzed placental gene expression in reciprocal dwarf hamster hybrids that show extreme parent-of-origin growth effects relative to their parental species. In massively enlarged hybrid placentas, we observed both extensive transgressive expression of growth-related genes and bi-allelic expression of many genes that were paternally silenced in normal sized hybrids. However, the apparent widespread disruption of paternal silencing was coupled with reduced gene expression levels overall. These patterns are contrary to the predictions of the LOI model and indicate that hybrid misexpression of dosage sensitive genes is caused by other regulatory mechanisms in this system. Collectively, our results support a central role for disrupted gene expression and imprinting in the evolution of mammalian hybrid inviability, but call into question the generality of the widely invoked LOI model. PMID:27714796

  11. A Dopaminergic Gene Cluster in the Prefrontal Cortex Predicts Performance Indicative of General Intelligence in Genetically Heterogeneous Mice

    PubMed Central

    Kolata, Stefan; Light, Kenneth; Wass, Christopher D.; Colas-Zelin, Danielle; Roy, Debasri; Matzel, Louis D.

    2010-01-01

    Background Genetically heterogeneous mice express a trait that is qualitatively and psychometrically analogous to general intelligence in humans, and as in humans, this trait co-varies with the processing efficacy of working memory (including its dependence on selective attention). Dopamine signaling in the prefrontal cortex (PFC) has been established to play a critical role in animals' performance in both working memory and selective attention tasks. Owing to this role of the PFC in the regulation of working memory, here we compared PFC gene expression profiles of 60 genetically diverse CD-1 mice that exhibited a wide range of general learning abilities (i.e., aggregate performance across five diverse learning tasks). Methodology/Principal Findings Animals' general cognitive abilities were first determined based on their aggregate performance across a battery of five diverse learning tasks. With a procedure designed to minimize false positive identifications, analysis of gene expression microarrays (comprised of ≈25,000 genes) identified a small number (<20) of genes that were differentially expressed across animals that exhibited fast and slow aggregate learning abilities. Of these genes, one functional cluster was identified, and this cluster (Darpp-32, Drd1a, and Rgs9) is an established modulator of dopamine signaling. Subsequent quantitative PCR found that expression of these dopaminergic genes plus one vascular gene (Nudt6) was significantly correlated with individual animals' general cognitive performance. Conclusions/Significance These results indicate that D1-mediated dopamine signaling in the PFC, possibly through its modulation of working memory, is predictive of general cognitive abilities. Furthermore, these results provide the first direct evidence of specific molecular pathways that might potentially regulate general intelligence. PMID:21103339

  12. Genome-wide identification and expression analysis of MAPK and MAPKK gene family in Malus domestica.

    PubMed

    Zhang, Shizhong; Xu, Ruirui; Luo, Xiaocui; Jiang, Zesheng; Shu, Huairui

    2013-12-01

    MAPK signal transduction modules, which are composed of three classes of hierarchically organized protein kinases, namely MAPKKKs, MAPKKs, and MAPKs, play crucial roles in regulating many biological processes in plants. Although genome-wide analysis of this family has been carried out in some species, little is known about MAPK and MAPKK genes in apple (Malus domestica). In this study, a total of 26 putative apple MAPK genes (MdMPKs) and 9 putative apple MAPKK genes (MdMKKs) have been identified and located within the apple genome. Phylogenetic analysis revealed that MdMAPKs and MdMAPKKs could be divided into 4 subfamilies (groups A, B, C and D), respectively. The predicted MdMAPKs and MdMAPKKs were distributed across 13 out of 17 chromosomes with different densities. In addition, analysis of exon-intron junctions and of intron phase inside the predicted coding region of each candidate gene has revealed high levels of conservation within and between phylogenetic groups. According to the microarray and expressed sequence tag (EST) analysis, the different expression patterns indicate that they may play different roles during fruit development and the rootstock-scion interaction process. Moreover, expression profile analyses of MAPK and MAPKK genes were performed in different tissues (root, stem, leaf, flower and fruit), and all of the selected genes were expressed in at least one of the tissues tested, indicating that the MAPKs and MAPKKs are involved in various aspects of physiological and developmental processes of apple. To our knowledge, this is the first report of a genome-wide analysis of the apple MAPK and MAPKK gene family. This study provides valuable information for understanding the classification and putative functions of the MAPK signal in apple. © 2013.

  13. Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

    PubMed Central

    Oduru, Sreedhar; Campbell, Janee L; Karri, SriTulasi; Hendry, William J; Khan, Shafiq A; Williams, Simon C

    2003-01-01

    Background Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells. PMID:12783626

  14. Tumor Cell Gene Expression Changes Following Short-term In vivo Exposure to Single Agent Chemotherapeutics are Related to Survival in Multiple Myeloma

    PubMed Central

    Burington, Bart; Barlogie, Bart; Zhan, Fenghuang; Crowley, John; Shaughnessy, John D.

    2013-01-01

    Changes in global gene expression patterns in tumor cells following in vivo therapy may vary by treatment and provide added or synergistic prognostic power over pretherapy gene expression profiles (GEP). This molecular readout of drug-cell interaction may also point to mechanisms of action/resistance. In newly diagnosed patients with multiple myeloma (MM), microarray data were obtained on tumor cells prior to and 48 hours after in vivo treatment using dexamethasone (n = 45) or thalidomide (n = 42); in the case of relapsed MM, microarray data were obtained prior to (n = 36) and after (n = 19) lenalidomide administration. Dexamethasone and thalidomide induced both common and unique GEP changes in tumor cells. Combined baseline and 48-hour changes in GEP in a subset of genes, many related to oxidative stress and cytoskeletal dynamics, were predictive of outcome in newly diagnosed MM patients receiving tandem transplants. Thalidomide-altered genes also changed following lenalidomide exposure and predicted event-free and overall survival in relapsed patients receiving lenalidomide as a single agent. Combined with baseline molecular features, changes in GEP following short-term single-agent exposure may help guide treatment decisions for patients with MM. Genes whose drug-altered expression were found to be related to survival may point to molecular switches related to response and/or resistance to different classes of drugs. PMID:18676754

  15. A stochastic model for optimizing composite predictors based on gene expression profiles.

    PubMed

    Ramanathan, Murali

    2003-07-01

    This project was done to develop a mathematical model for optimizing composite predictors based on gene expression profiles from DNA arrays and proteomics. The problem was amenable to a formulation and solution analogous to the portfolio optimization problem in mathematical finance: it requires the optimization of a quadratic function subject to linear constraints. The performance of the approach was compared to that of neighborhood analysis using a data set containing cDNA array-derived gene expression profiles from 14 multiple sclerosis patients receiving intramuscular interferon-beta1a. The Markowitz portfolio model predicts that the covariance between genes can be exploited to construct an efficient composite. The model predicts that a composite is not needed for maximizing the mean value of a treatment effect: only a single gene is needed, but the usefulness of the effect measure may be compromised by high variability. The model optimized the composite to yield the highest mean for a given level of variability or the least variability for a given mean level. The choices that meet these optimization criteria lie on a curve in the composite mean vs. composite variability plot, referred to as the "efficient frontier." When a composite is constructed using the model, it outperforms the composite constructed using the neighborhood analysis method. The Markowitz portfolio model may find potential applications in constructing composite biomarkers and in the pharmacogenomic modeling of treatment effects derived from gene expression endpoints.
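
    The quadratic-programming formulation described in this abstract can be illustrated with a minimal sketch. It assumes the textbook Markowitz setup (minimize composite variance for a target composite mean, with weights summing to one and negative weights allowed); the abstract does not state the author's exact constraints, and the function name `min_variance_composite` and the example numbers are hypothetical.

```python
import numpy as np


def min_variance_composite(mean_effects, covariance, target_mean):
    """Weights w minimizing w' S w subject to w' mu = target_mean and sum(w) = 1.

    Assumption: this is the classical equality-constrained Markowitz problem,
    not necessarily the constraint set used in the cited paper.
    """
    mu = np.asarray(mean_effects, dtype=float)
    sigma = np.asarray(covariance, dtype=float)
    n = mu.size
    # Two linear constraints: target mean and weights summing to one.
    constraints = np.vstack([mu, np.ones(n)])  # shape (2, n)
    # KKT system from the Lagrangian of the quadratic program.
    kkt = np.block([
        [2.0 * sigma, constraints.T],
        [constraints, np.zeros((2, 2))],
    ])
    rhs = np.concatenate([np.zeros(n), [target_mean, 1.0]])
    solution = np.linalg.solve(kkt, rhs)
    weights = solution[:n]  # remaining entries are Lagrange multipliers
    variance = float(weights @ sigma @ weights)
    return weights, variance


# Hypothetical three-gene example (values are illustrative only).
example_mu = [1.0, 2.0, 3.0]
example_sigma = [[1.0, 0.2, 0.0],
                 [0.2, 1.0, 0.3],
                 [0.0, 0.3, 2.0]]
example_weights, example_variance = min_variance_composite(
    example_mu, example_sigma, target_mean=2.0
)
```

    Sweeping `target_mean` over a range and plotting the resulting variance against the mean would trace a curve of the kind the abstract calls the efficient frontier.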

  16. Plasticity in the Rat Prefrontal Cortex: Linking Gene Expression and an Operant Learning with a Computational Theory

    PubMed Central

    Rapanelli, Maximiliano; Lew, Sergio Eduardo; Frick, Luciana Romina; Zanutto, Bonifacio Silvano

    2010-01-01

    Plasticity in the medial prefrontal cortex (mPFC) of rodents, or the lateral prefrontal cortex (lPFC) of non-human primates, plays a key role in neural circuits involved in learning and memory. Several genes, like brain-derived neurotrophic factor (BDNF), cAMP response element binding (CREB), Synapsin I, Calcium/calmodulin-dependent protein kinase II (CamKII), activity-regulated cytoskeleton-associated protein (Arc), c-jun and c-fos have been related to plasticity processes. We analysed differential expression of plasticity-related genes and immediate early genes in the mPFC of rats while they learned an operant conditioning task. Incompletely and completely trained animals were studied because of the distinct events predicted by our computational model at different learning stages. We measured changes in mRNA levels by real-time RT-PCR during learning of the task; expression of these plasticity-associated markers increased during learning, and the increases began to decline once the task was learned. The plasticity changes in the lPFC during learning predicted by the model matched up with those of the representative gene BDNF. Herein, we showed for the first time that plasticity in the mPFC of rats is higher while an operant conditioning task is being learned than when the task has been learned, using an integrative approach of a computational model and gene expression. PMID:20111591

  17. An 80-gene set to predict response to preoperative chemoradiotherapy for rectal cancer by principle component analysis.

    PubMed

    Empuku, Shinichiro; Nakajima, Kentaro; Akagi, Tomonori; Kaneko, Kunihiko; Hijiya, Naoki; Etoh, Tsuyoshi; Shiraishi, Norio; Moriyama, Masatsugu; Inomata, Masafumi

    2016-05-01

    Preoperative chemoradiotherapy (CRT) for locally advanced rectal cancer not only improves the postoperative local control rate, but also induces downstaging. However, it has not been established how to individually select patients who receive effective preoperative CRT. The aim of this study was to identify a predictor of response to preoperative CRT for locally advanced rectal cancer. This study is additional to our multicenter phase II study evaluating the safety and efficacy of preoperative CRT using oral fluorouracil (UMIN ID: 03396). From April, 2009 to August, 2011, 26 biopsy specimens obtained prior to CRT were analyzed by cyclopedic microarray analysis. Response to CRT was evaluated according to a histological grading system using surgically resected specimens. To decide on the number of genes for dividing into responder and non-responder groups, we statistically analyzed the data using a dimension reduction method, principal component analysis. Of the 26 cases, 11 were responders and 15 non-responders. No significant difference was found in clinical background data between the two groups. We determined that the optimal number of genes for the prediction of response was 80 of 40,000 and the functions of these genes were analyzed. When comparing non-responders with responders, genes expressed at a high level functioned in alternative splicing, whereas those expressed at a low level functioned in the septin complex. Thus, an 80-gene expression set that predicts response to preoperative CRT for locally advanced rectal cancer was identified using a novel statistical method.
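
    As a rough illustration of the dimension-reduction step mentioned in this abstract, the sketch below runs a principal component analysis on a samples-by-genes matrix and ranks genes by their absolute loading on the first component. The ranking rule, the function name `rank_genes_by_first_pc` and the synthetic data are assumptions for illustration; the abstract does not describe how the 80 genes were actually chosen.

```python
import numpy as np


def rank_genes_by_first_pc(expression, n_top):
    """Return indices of the n_top genes with the largest |loading| on PC1.

    expression: array of shape (n_samples, n_genes).
    Also returns the fraction of variance explained by each component.
    """
    x = np.asarray(expression, dtype=float)
    centered = x - x.mean(axis=0)  # center each gene across samples
    # Rows of vt are the principal axes (one loading per gene).
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    loadings = np.abs(vt[0])
    top_genes = np.argsort(loadings)[::-1][:n_top]
    return top_genes, explained


# Synthetic data: 26 samples, 50 genes, with gene 7 made highly variable.
rng = np.random.default_rng(0)
synthetic = rng.normal(size=(26, 50))
synthetic[:, 7] *= 10.0
top_genes, explained = rank_genes_by_first_pc(synthetic, n_top=5)
```

    With real data, the selected gene set would still need to be checked against the responder and non-responder labels, which this sketch does not use.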

  18. Modeling leaderless transcription and atypical genes results in more accurate gene prediction in prokaryotes.

    PubMed

    Lomsadze, Alexandre; Gemayel, Karl; Tang, Shiyuyun; Borodovsky, Mark

    2018-05-17

    In a conventional view of the prokaryotic genome organization, promoters precede operons and ribosome binding sites (RBSs) with Shine-Dalgarno consensus precede genes. However, recent experimental research suggesting a more diverse view motivated us to develop an algorithm with improved gene-finding accuracy. We describe GeneMarkS-2, an ab initio algorithm that uses a model derived by self-training for finding species-specific (native) genes, along with an array of precomputed "heuristic" models designed to identify harder-to-detect genes (likely horizontally transferred). Importantly, we designed GeneMarkS-2 to identify several types of distinct sequence patterns (signals) involved in gene expression control, among them the patterns characteristic for leaderless transcription as well as noncanonical RBS patterns. To assess the accuracy of GeneMarkS-2, we used genes validated by COG (Clusters of Orthologous Groups) annotation, proteomics experiments, and N-terminal protein sequencing. We observed that GeneMarkS-2 performed better on average in all accuracy measures when compared with the current state-of-the-art gene prediction tools. Furthermore, the screening of ∼5000 representative prokaryotic genomes made by GeneMarkS-2 predicted frequent leaderless transcription in both archaea and bacteria. We also observed that the RBS sites in some species with leadered transcription did not necessarily exhibit the Shine-Dalgarno consensus. The modeling of different types of sequence motifs regulating gene expression prompted a division of prokaryotic genomes into five categories with distinct sequence patterns around the gene starts. © 2018 Lomsadze et al.; Published by Cold Spring Harbor Laboratory Press.

  19. Prediction and characterisation of a highly conserved, remote and cAMP responsive enhancer that regulates Msx1 gene expression in cardiac neural crest and outflow tract.

    PubMed

    Miller, Kerry Ann; Davidson, Scott; Liaros, Angela; Barrow, John; Lear, Marissa; Heine, Danielle; Hoppler, Stefan; MacKenzie, Alasdair

    2008-05-15

    Double knockouts of the Msx1 and Msx2 genes in the mouse result in severe cardiac outflow tract malformations similar to those frequently found in newborn infants. Despite the known role of the Msx genes in cardiac formation little is known of the regulatory systems (ligand receptor, signal transduction and protein-DNA interactions) that regulate the tissue-specific expression of the Msx genes in mammals during the formation of the outflow tract. In the present study we have used a combination of multi-species comparative genomics, mouse transgenic analysis and in-situ hybridisation to predict and validate the existence of a remote ultra-conserved enhancer that supports the expression of the Msx1 gene in migrating mouse cardiac neural crest and the outflow tract primordia. Furthermore, culturing of embryonic explants derived from transgenic lines with agonists of the PKC and PKA signal transduction systems demonstrates that this remote enhancer is influenced by PKA but not PKC dependent gene regulatory systems. These studies demonstrate the efficacy of combining comparative genomics and transgenic analyses and provide a platform for the study of the possible roles of Msx gene mis-regulation in the aetiology of congenital heart malformation.

  20. Genomewide identification and expression analysis of the ARF gene family in apple.

    PubMed

    Luo, Xiao-Cui; Sun, Mei-Hong; Xu, Rui-Rui; Shu, Huai-Rui; Wang, Jia-Wei; Zhang, Shi-Zhong

    2014-12-01

    Auxin response factors (ARF) are transcription factors that regulate auxin responses in plants. Although the genomewide analysis of this family has been performed in some species, little is known regarding ARF genes in apple (Malus domestica). In this study, 31 putative apple ARF genes have been identified and located within the apple genome. The phylogenetic analysis revealed that MdARFs could be divided into three subfamilies (groups I, II and III). The predicted MdARFs were distributed across 15 of 17 chromosomes with different densities. In addition, the analysis of exon-intron junctions and of the intron phase inside the predicted coding region of each candidate gene has revealed high levels of conservation within and between phylogenetic groups. Expression profile analyses of MdARF genes were performed in different tissues (root, stem, leaf, flower and fruit), and all the selected genes were expressed in at least one of the tissues that were tested, which indicated that MdARFs are involved in various aspects of physiological and developmental processes of apple. To our knowledge, this report is the first to provide a genomewide analysis of the apple ARF gene family. This study provides valuable information for understanding the classification and putative functions of the ARF signal in apple.

  1. Learning Parsimonious Classification Rules from Gene Expression Data Using Bayesian Networks with Local Structure.

    PubMed

    Lustgarten, Jonathan Lyle; Balasubramanian, Jeya Balaji; Visweswaran, Shyam; Gopalakrishnan, Vanathi

    2017-03-01

    The comprehensibility of good predictive models learned from high-dimensional gene expression data is attractive because it can lead to biomarker discovery. Several good classifiers provide comparable predictive performance but differ in their abilities to summarize the observed data. We extend a Bayesian Rule Learning (BRL-GSS) algorithm, previously shown to be a significantly better predictor than other classical approaches in this domain. It searches a space of Bayesian networks using a decision tree representation of its parameters with global constraints, and infers a set of IF-THEN rules. The number of parameters and therefore the number of rules are combinatorial in the number of predictor variables in the model. We relax these global constraints to a more generalizable local structure (BRL-LSS). BRL-LSS entails a more parsimonious set of rules because it does not have to generate all combinatorial rules. The search space of local structures is much richer than the space of global structures. We design the BRL-LSS with the same worst-case time-complexity as BRL-GSS while exploring a richer and more complex model space. We measure predictive performance using Area Under the ROC curve (AUC) and Accuracy. We measure model parsimony performance by noting the average number of rules and variables needed to describe the observed data. We evaluate the predictive and parsimony performance of BRL-GSS, BRL-LSS and the state-of-the-art C4.5 decision tree algorithm, across 10-fold cross-validation using ten microarray gene-expression diagnostic datasets. In these experiments, we observe that BRL-LSS is similar to BRL-GSS in terms of predictive performance, while generating a much more parsimonious set of rules to explain the same observed data. BRL-LSS also needs fewer variables than C4.5 to explain the data with similar predictive performance. 
We also conduct a feasibility study to demonstrate the general applicability of our BRL methods on the newer RNA sequencing gene-expression data.
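
    The AUC metric named in this abstract can be computed directly from its rank interpretation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The sketch below is a plain-Python illustration of that definition, not the evaluation code used in the cited study; the function name `auc_from_scores` is hypothetical.

```python
def auc_from_scores(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: iterable of 0/1 class labels; scores: predicted scores,
    where a higher score means "more likely positive".
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))
```

    For example, `auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])` gives 0.75, because three of the four positive-negative pairs are ranked correctly. In a 10-fold cross-validation, this value would be computed on each held-out fold and then averaged.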

  2. Two pheromone precursor genes are transcriptionally expressed in the homothallic ascomycete Sordaria macrospora.

    PubMed

    Pöggeler, S

    2000-06-01

    In order to analyze the involvement of pheromones in cell recognition and mating in a homothallic fungus, two putative pheromone precursor genes, named ppg1 and ppg2, were isolated from a genomic library of Sordaria macrospora. The ppg1 gene is predicted to encode a precursor pheromone that is processed by a Kex2-like protease to yield a pheromone that is structurally similar to the alpha-factor of the yeast Saccharomyces cerevisiae. The ppg2 gene encodes a 24-amino-acid polypeptide that contains a putative farnesylated and carboxy methylated C-terminal cysteine residue. The sequences of the predicted pheromones display strong structural similarity to those encoded by putative pheromones of heterothallic filamentous ascomycetes. Both genes are expressed during the life cycle of S. macrospora. This is the first description of pheromone precursor genes encoded by a homothallic fungus. Southern-hybridization experiments indicated that ppg1 and ppg2 homologues are also present in other homothallic ascomycetes.

  3. Expression profiling in canine osteosarcoma: identification of biomarkers and pathways associated with outcome

    PubMed Central

    2010-01-01

    Background Osteosarcoma (OSA) spontaneously arises in the appendicular skeleton of large breed dogs and shares many physiological and molecular biological characteristics with human OSA. The standard treatment for OSA in both species is amputation or limb-sparing surgery, followed by chemotherapy. Unfortunately, OSA is an aggressive cancer with a high metastatic rate. Characterization of OSA with regard to its metastatic potential and chemotherapeutic resistance will improve both prognostic capabilities and treatment modalities. Methods We analyzed archived primary OSA tissue from dogs treated with limb amputation followed by doxorubicin or platinum-based drug chemotherapy. Samples were selected from two groups: dogs with disease free intervals (DFI) of less than 100 days (n = 8) and greater than 300 days (n = 7). Gene expression was assessed with Affymetrix Canine 2.0 microarrays and analyzed with a two-tailed t-test. A subset of genes was confirmed using qRT-PCR and used in classification analysis to predict prognosis. Systems-based gene ontology analysis was conducted on genes selected using a standard J5 metric. The genes identified using this approach were converted to their human homologues and assigned to functional pathways using the GeneGo MetaCore platform. Results Potential biomarkers were identified using gene expression microarray analysis and 11 differentially expressed (p < 0.05) genes were validated with qRT-PCR (n = 10/group). Statistical classification models using the qRT-PCR profiles predicted patient outcomes with 100% accuracy in the training set and up to 90% accuracy upon stratified cross validation. Pathway analysis revealed alterations in pathways associated with oxidative phosphorylation, hedgehog and parathyroid hormone signaling, cAMP/Protein Kinase A (PKA) signaling, immune responses, cytoskeletal remodeling and focal adhesion. 
Conclusions This profiling study has identified potential new biomarkers to predict patient outcome in OSA and new pathways that may be targeted for therapeutic intervention. PMID:20860831

  4. Increased Brahma-related Gene 1 Expression Predicts Distant Metastasis and Shorter Survival in Patients with Invasive Ductal Carcinoma of the Breast.

    PubMed

    Do, Sung-Im; Yoon, Gun; Kim, Hyun-Soo; Kim, Kyungeun; Lee, Hyunjoo; Do, In-Gu; Kim, Dong-Hoon; Chae, Seoung Wan; Sohn, Jin Hee

    2016-09-01

    Previous studies have demonstrated aberrant Brahma-related gene 1 (BRG1) expression in various tumor types. Increased BRG1 expression has recently been shown to correlate with aggressive oncogenic behavior in many different types of human cancer. However, the role of BRG1 in breast cancer development and progression is not fully understood. We evaluated BRG1 expression in 224 patients with invasive ductal carcinoma (IDC) of the breast using tissue microarray samples and immunohistochemistry. We also investigated whether BRG1 expression status is associated with clinicopathological characteristics and outcomes of patients with IDC. Among the 224 patients with IDC, 37.5% (84/224) exhibited high BRG1 expression. IDC exhibited significantly higher BRG1 expression compared to ductal carcinoma in situ (p=0.009) and normal breast tissue (p=0.005). High BRG1 expression in IDC significantly correlated with higher histological grade (p=0.035) and presence of distant metastasis (p=0.002). Furthermore, high BRG1 expression was an independent factor for predicting distant metastasis (relative risk=4.079; p=0.007). In addition, high BRG1 expression predicted shorter overall (p=0.011) and recurrence-free (p=0.003) survival in patients with IDC. In particular, BRG1 had a significant prognostic value in predicting recurrence-free survival of patients with IDC with lymph node metastasis or stage III disease. BRG1 is involved in the progression and metastasis of breast cancer and can serve as a novel biomarker predictive of distant metastasis and patient outcomes. Copyright© 2016 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved.

  5. Clinical Value of miR-101-3p and Biological Analysis of its Prospective Targets in Breast Cancer: A Study Based on The Cancer Genome Atlas (TCGA) and Bioinformatics.

    PubMed

    Li, Chun-Yao; Xiong, Dan-Dan; Huang, Chun-Qin; He, Rong-Quan; Liang, Hai-Wei; Pan, Deng-Hua; Wang, Han-Lin; Wang, Yi-Wen; Zhu, Hua-Wei; Chen, Gang

    2017-04-18

    BACKGROUND MiR-101-3p can promote apoptosis and inhibit proliferation, invasion, and metastasis in breast cancer (BC) cells. However, its mechanisms in BC are not fully understood. Therefore, a comprehensive analysis of the target genes, pathways, and networks of miR-101-3p in BC is necessary. MATERIAL AND METHODS The miR-101 profiles for 781 patients with BC from The Cancer Genome Atlas (TCGA) were analyzed. Gene expression profiling of GSE31397 with miR-101-3p transfected MCF-7 cells and scramble control cells was downloaded from Gene Expression Omnibus (GEO), and the differentially expressed genes (DEGs) were identified. The potential genes targeted by miR-101-3p were also predicted. Gene Ontology (GO) and pathway and network analyses were constructed for the DEGs and predicted genes. RESULTS In the TCGA data, a low level of miR-101-2 expression might represent a diagnostic (AUC: 0.63) marker, and the miR-101-1 was a prognostic (HR=1.79) marker. MiR-101-1 was linked to the estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2), and miR-101-2 was associated with the tumor (T), lymph node (N), and metastasis (M) stages of BC. Moreover, 427 genes were selected from the 921 DEGs in GEO and the 7924 potential target genes from the prediction databases. These genes were related to transcription, metabolism, biosynthesis, and proliferation. The results were also significantly enriched in the VEGF, mTOR, focal adhesion, Wnt, and chemokine signaling pathways. CONCLUSIONS MiR-101-1 and miR-101-2 may be prospective biomarkers for the prognosis and diagnosis of BC, respectively, and are associated with diverse clinical parameters. The target genes of miR-101-3p regulate the development and progression of BC. These results provide insight into the pathogenic mechanism and potential therapies for BC.

  6. Evaluation of a 30-gene paclitaxel, fluorouracil, doxorubicin and cyclophosphamide chemotherapy response predictor in a multicenter randomized trial in breast cancer

    PubMed Central

    Tabchy, Adel; Valero, Vicente; Vidaurre, Tatiana; Lluch, Ana; Gomez, Henry; Martin, Miguel; Qi, Yuan; Barajas-Figueroa, Luis Javier; Souchon, Eduardo; Coutant, Charles; Doimi, Franco D; Ibrahim, Nuhad K; Gong, Yun; Hortobagyi, Gabriel N; Hess, Kenneth R; Symmans, W Fraser; Pusztai, Lajos

    2010-01-01

    Purpose: We examined in a prospective, randomized, international clinical trial the performance of a previously defined 30-gene predictor (DLDA-30) of pathologic complete response (pCR) to preoperative weekly paclitaxel and fluorouracil, doxorubicin, cyclophosphamide (T/FAC) chemotherapy, and assessed if DLDA-30 also predicts increased sensitivity to FAC-only chemotherapy. We compared the pCR rates after T/FAC versus FAC × 6 preoperative chemotherapy. We also performed an exploratory analysis to identify novel candidate genes that differentially predict response in the two treatment arms. Experimental Design: 273 patients were randomly assigned to receive either weekly paclitaxel × 12 followed by FAC × 4 (T/FAC, n=138), or FAC × 6 (n=135) neoadjuvant chemotherapy. All patients underwent a pretreatment FNA biopsy of the tumor for gene expression profiling and treatment response prediction. Results: The pCR rates were 19% and 9% in the T/FAC and FAC arms, respectively (p<0.05). In the T/FAC arm, the positive predictive value (PPV) of the genomic predictor was 38% (95%CI:21–56%), the negative predictive value (NPV) 88% (CI:77–95%) and the AUC 0.711. In the FAC arm, the PPV was 9% (CI:1–29%) and the AUC 0.584. This suggests that the genomic predictor may have regimen-specificity. Its performance was similar to a clinical variable-based predictor nomogram. Conclusions: Gene expression profiling for prospective response prediction was feasible in this international trial. The 30-gene predictor can identify patients with greater than average sensitivity to T/FAC chemotherapy. However, it captured molecular equivalents of clinical phenotype. Next generation predictive markers will need to be developed separately for different molecular subsets of breast cancers. PMID:20829329
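
    For readers unfamiliar with the PPV and NPV figures quoted in this record, the following is a minimal sketch of how the two quantities are computed from a confusion matrix. The function name and the counts are hypothetical illustrations, not the trial's data, which the abstract does not report.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from confusion-matrix counts.

    PPV = TP / (TP + FP): fraction of predicted responders who responded.
    NPV = TN / (TN + FN): fraction of predicted non-responders who did not respond.
    """
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Hypothetical counts chosen only for illustration (not from the trial).
ppv, npv = predictive_values(tp=10, fp=15, tn=80, fn=5)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

    Both values depend on how common the outcome is in the cohort, so they may not carry over to a population with a different pCR rate.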

  7. Major Shifts in Glial Regional Identity Are a Transcriptional Hallmark of Human Brain Aging.

    PubMed

    Soreq, Lilach; Rose, Jamie; Soreq, Eyal; Hardy, John; Trabzuni, Daniah; Cookson, Mark R; Smith, Colin; Ryten, Mina; Patani, Rickie; Ule, Jernej

    2017-01-10

    Gene expression studies suggest that aging of the human brain is determined by a complex interplay of molecular events, although both its region- and cell-type-specific consequences remain poorly understood. Here, we extensively characterized aging-altered gene expression changes across ten human brain regions from 480 individuals ranging in age from 16 to 106 years. We show that astrocyte- and oligodendrocyte-specific genes, but not neuron-specific genes, shift their regional expression patterns upon aging, particularly in the hippocampus and substantia nigra, while the expression of microglia- and endothelial-specific genes increases in all brain regions. In line with these changes, high-resolution immunohistochemistry demonstrated decreased numbers of oligodendrocytes and of neuronal subpopulations in the aging brain cortex. Finally, glial-specific genes predict age with greater precision than neuron-specific genes, thus highlighting the need for greater mechanistic understanding of neuron-glia interactions in aging and late-life diseases.

  8. Microarray analysis of long non-coding RNA expression profiles in monocytic myeloid-derived suppressor cells in Echinococcus granulosus-infected mice.

    PubMed

    Yu, Aiping; Wang, Ying; Yin, Jianhai; Zhang, Jing; Cao, Shengkui; Cao, Jianping; Shen, Yujuan

    2018-05-30

    Cystic echinococcosis is a worldwide chronic zoonotic disease caused by infection with the larval stage of Echinococcus granulosus. Previously, we found significant accumulation of myeloid-derived suppressor cells (MDSCs) in E. granulosus infection mouse models and that they play a key role in immunosuppressing T lymphocytes. Here, we compared the long non-coding RNA (lncRNA) and mRNA expression patterns between the splenic monocytic MDSCs (M-MDSCs) of E. granulosus protoscoleces-infected mice and normal mice using microarray analysis. LncRNA functions were predicted using Gene Ontology enrichment and the Kyoto Encyclopedia of Genes and Genomes pathway analysis. Cis- and trans-regulation analyses revealed potential relationships between the lncRNAs and their target genes or related transcription factors. We found that 649 lncRNAs were differentially expressed (fold change ≥ 2, P < 0.05): 582 lncRNAs were upregulated and 67 lncRNAs were downregulated. Among mRNAs, 28 were upregulated and 1043 were downregulated. The microarray data were validated by quantitative reverse transcription-PCR. The results indicated that mRNAs co-expressed with the lncRNAs are mainly involved in regulating the actin cytoskeleton, Salmonella infection, leishmaniasis, and the vascular endothelial growth factor (VEGF) signaling pathway. The lncRNA NONMMUT021591 was predicted to cis-regulate the retinoblastoma gene (Rb1), whose expression is associated with abnormal M-MDSCs differentiation. We found that 372 lncRNAs were predicted to interact with 60 transcription factors; among these, C/EBPβ (CCAAT/enhancer binding protein beta) was previously demonstrated to be a transcription factor of MDSCs. Our study identified dysregulated lncRNAs in the M-MDSCs of E. granulosus infection mouse models; they might be involved in M-MDSC-derived immunosuppression in related diseases.

  9. Identification of Transcription Factors ZmMYB111 and ZmMYB148 Involved in Phenylpropanoid Metabolism.

    PubMed

    Zhang, Junjie; Zhang, Shuangshuang; Li, Hui; Du, Hai; Huang, Huanhuan; Li, Yangping; Hu, Yufeng; Liu, Hanmei; Liu, Yinghong; Yu, Guowu; Huang, Yubi

    2016-01-01

    Maize is the leading crop worldwide in terms of both planting area and total yields, but environmental stresses cause significant losses in productivity. Phenylpropanoid compounds play an important role in plant stress resistance; however, the mechanism of their synthesis is not fully understood, especially in regard to the expression and regulation of key genes. Phenylalanine ammonia-lyase (PAL) is the first key enzyme involved in phenylpropanoid metabolism, and it has a significant effect on the synthesis of important phenylpropanoid compounds. According to the results of sequence alignments and functional prediction, we selected two conserved R2R3-MYB transcription factors as candidate genes for the regulation of phenylpropanoid metabolism. The two candidate R2R3-MYB genes, which we named ZmMYB111 and ZmMYB148, were cloned, and then their structural characteristics and phylogenetic placement were predicted and analyzed. In addition, a series of evaluations were performed, including expression profiles, subcellular localization, transcription activation, protein-DNA interaction, and transient expression in maize endosperm. Our results indicated that both ZmMYB111 and ZmMYB148 are indeed R2R3-MYB transcription factors and that they may play a regulatory role in PAL gene expression.

  10. Defective Cell Cycle Checkpoint Functions in Melanoma Are Associated with Altered Patterns of Gene Expression

    PubMed Central

    Kaufmann, William K.; Nevis, Kathleen R.; Qu, Pingping; Ibrahim, Joseph G.; Zhou, Tong; Zhou, Yingchun; Simpson, Dennis A.; Helms-Deaton, Jennifer; Cordeiro-Stone, Marila; Moore, Dominic T.; Thomas, Nancy E.; Hao, Honglin; Liu, Zhi; Shields, Janiel M.; Scott, Glynis A.; Sharpless, Norman E.

    2009-01-01

    Defects in DNA damage responses may underlie genetic instability and malignant progression in melanoma. Cultures of normal human melanocytes (NHMs) and melanoma lines were analyzed to determine whether global patterns of gene expression could predict the efficacy of DNA damage cell cycle checkpoints that arrest growth and suppress genetic instability. NHMs displayed effective G1 and G2 checkpoint responses to ionizing radiation-induced DNA damage. A majority of melanoma cell lines (11/16) displayed significant quantitative defects in one or both checkpoints. Melanomas with B-RAF mutations as a class displayed a significant defect in DNA damage G2 checkpoint function. In contrast the epithelial-like subtype of melanomas with wild-type N-RAS and B-RAF alleles displayed an effective G2 checkpoint but a significant defect in G1 checkpoint function. RNA expression profiling revealed that melanoma lines with defects in the DNA damage G1 checkpoint displayed reduced expression of p53 transcriptional targets, such as CDKN1A and DDB2, and enhanced expression of proliferation-associated genes, such as CDC7 and GEMININ. A Bayesian analysis tool was more accurate than significance analysis of microarrays for predicting checkpoint function using a leave-one-out method. The results suggest that defects in DNA damage checkpoints may be recognized in melanomas through analysis of gene expression. PMID:17597816

  11. Gene expression profiling reveals two separate mechanisms regulating apoptosis in rectal carcinomas in vivo

    PubMed Central

    de Bruin, Elza C.; van de Pas, Simone; van de Velde, Cornelis J. H.; van Krieken, J. Han J. M.; Peltenburg, Lucy T. C.; Marijnen, Corrie A. M.

    2007-01-01

    The level of apoptosis in rectal carcinomas of patients treated by surgery only predicts local failure; patients with intrinsically high-apoptotic tumors develop fewer local recurrences than patients with low levels of apoptosis. To identify genes involved in this intrinsic apoptotic process in vivo, 47 rectal tumors with known apoptotic phenotype (24 low- and 23 high-apoptotic) were analyzed by oligonucleotide microarray technology. We identified several genes differentially expressed between low- and high-apoptotic tumors. Unsupervised clustering of the tumors based on expression levels of these genes separated the low-apoptotic from the high-apoptotic tumors, indicating a gene expression-dependent regulation. In addition, this clustering revealed two subgroups of high-apoptotic tumors. One high-apoptotic subgroup showed subtle differences in mRNA and protein expression of the known apoptotic regulators BAX, cIAP2 and ARC compared to the low-apoptotic tumors. The other subgroup of high-apoptotic tumors showed high expression of immune-related genes; predominantly HLA class II and chemokines, but also HLA class I and interferon-inducible genes were highly expressed. Immunohistochemistry revealed HLA-DR expression in epithelial tumor cells in 70% of these high-apoptotic tumors. The expression data suggest that high levels of apoptosis in rectal carcinoma patients can be the result of either slightly altered expression of known pro- and anti-apoptotic genes or high expression of immune-related genes. Electronic supplementary material The online version of this article (doi: 10.1007/s10495-007-0088-2) contains supplementary material, which is available to authorized users. PMID:17610066

  12. Analysis of functional importance of binding sites in the Drosophila gap gene network model.

    PubMed

    Kozlov, Konstantin; Gursky, Vitaly V; Kulakovskiy, Ivan V; Dymova, Arina; Samsonova, Maria

    2015-01-01

    The statistical thermodynamics based approach provides a promising framework for construction of the genotype-phenotype map in many biological systems. Among important aspects of a good model connecting the DNA sequence information with that of a molecular phenotype (gene expression) is the selection of regulatory interactions and relevant transcription factor binding sites. As the model may predict different levels of the functional importance of specific binding sites in different genomic and regulatory contexts, it is essential to formulate and study such models under different modeling assumptions. We elaborate a two-layer model for the Drosophila gap gene network and include in the model a combined set of transcription factor binding sites and concentration-dependent regulatory interaction between gap genes hunchback and Kruppel. We show that the new variants of the model are more consistent in terms of gene expression predictions for various genetic constructs in comparison to previous work. We quantify the functional importance of binding sites by calculating their impact on gene expression in the model and calculate how these impacts correlate across all sites under different modeling assumptions. The assumption about the dual interaction between hb and Kr leads to the most consistent modeling results, but, on the other hand, may obscure existence of indirect interactions between binding sites in regulatory regions of distinct genes. The analysis confirms the previously formulated regulation concept of many weak binding sites working in concert. The model predicts a more or less uniform distribution of functionally important binding sites over the sets of experimentally characterized regulatory modules and other open chromatin domains.

  13. A 15-gene signature for prediction of colon cancer recurrence and prognosis based on SVM.

    PubMed

    Xu, Guangru; Zhang, Minghui; Zhu, Hongxing; Xu, Jinhua

    2017-03-10

    This study aimed to screen for a gene signature that distinguishes patients at high risk from those at low risk of colon cancer recurrence and predicts their prognosis. Five microarray datasets of colon cancer samples were collected from Gene Expression Omnibus database and one was obtained from The Cancer Genome Atlas (TCGA). After preprocessing, data in GSE17537 were analyzed using the Linear Models for Microarray data (LIMMA) method to identify the differentially expressed genes (DEGs). The DEGs further underwent PPI network-based neighborhood scoring and support vector machine (SVM) analyses to screen the feature genes associated with recurrence and prognosis, which were then validated by four datasets GSE38832, GSE17538, GSE28814 and TCGA using SVM and Cox regression analyses. A total of 1207 genes were identified as DEGs between recurrence and no-recurrence samples, including 726 downregulated and 481 upregulated genes. Using SVM analysis and confirmation in five gene expression profile datasets, a 15-gene signature (HES5, ZNF417, GLRA2, OR8D2, HOXA7, FABP6, MUSK, HTR6, GRIP2, KLRK1, VEGFA, AKAP12, RHEB, NCRNA00152 and PMEPA1) was identified as a predictor of recurrence risk and prognosis for colon cancer patients. Our identified 15-gene signature may be useful to classify colon cancer patients with different prognosis and some genes in this signature may represent new therapeutic targets.

  14. A general framework for optimization of probes for gene expression microarray and its application to the fungus Podospora anserina.

    PubMed

    Bidard, Frédérique; Imbeaud, Sandrine; Reymond, Nancie; Lespinet, Olivier; Silar, Philippe; Clavé, Corinne; Delacroix, Hervé; Berteaux-Lecellier, Véronique; Debuchy, Robert

    2010-06-18

    The development of new microarray technologies makes custom long oligonucleotide arrays affordable for many experimental applications, notably gene expression analyses. Reliable results depend on probe design quality and selection. Probe design strategy should cope with the limited accuracy of de novo gene prediction programs, and annotation up-dating. We present a novel in silico procedure which addresses these issues and includes experimental screening, as an empirical approach is the best strategy to identify optimal probes in the in silico outcome. We used four criteria for in silico probe selection: cross-hybridization, hairpin stability, probe location relative to coding sequence end and intron position. This latter criterion is critical when exon-intron gene structure predictions for intron-rich genes are inaccurate. For each coding sequence (CDS), we selected a sub-set of four probes. These probes were included in a test microarray, which was used to evaluate the hybridization behavior of each probe. The best probe for each CDS was selected according to three experimental criteria: signal-to-noise ratio, signal reproducibility, and representative signal intensities. This procedure was applied for the development of a gene expression Agilent platform for the filamentous fungus Podospora anserina and the selection of a single 60-mer probe for each of the 10,556 P. anserina CDS. A reliable gene expression microarray version based on the Agilent 44K platform was developed with four spot replicates of each probe to increase statistical significance of analysis.

  15. Molecular Profile of Peripheral Blood Mononuclear Cells from Patients with Rheumatoid Arthritis

    PubMed Central

    Edwards, Christopher J; Feldman, Jeffrey L; Beech, Jonathan; Shields, Kathleen M; Stover, Jennifer A; Trepicchio, William L; Larsen, Glenn; Foxwell, Brian MJ; Brennan, Fionula M; Feldmann, Marc; Pittman, Debra D

    2007-01-01

    Rheumatoid arthritis (RA) is a chronic inflammatory arthritis. Currently, diagnosis of RA may take several weeks, and factors used to predict a poor prognosis are not always reliable. Gene expression in RA may consist of a unique signature. Gene expression analysis has been applied to synovial tissue to define molecularly distinct forms of RA; however, expression analysis of tissue taken from a synovial joint is invasive and clinically impractical. Recent studies have demonstrated that unique gene expression changes can be identified in peripheral blood mononuclear cells (PBMCs) from patients with cancer, multiple sclerosis, and lupus. To identify RA disease-related genes, we performed a global gene expression analysis. RNA from PBMCs of 9 RA patients and 13 normal volunteers was analyzed on an oligonucleotide array. Compared with normal PBMCs, 330 transcripts were differentially expressed in RA. The differentially regulated genes belong to diverse functional classes and include genes involved in calcium binding, chaperones, cytokines, transcription, translation, signal transduction, extracellular matrix, integral to plasma membrane, integral to intracellular membrane, mitochondrial, ribosomal, structural, enzymes, and proteases. A k-nearest neighbor analysis identified 29 transcripts that were preferentially expressed in RA. Ten genes with increased expression in RA PBMCs compared with controls mapped to a RA susceptibility locus, 6p21.3. These results suggest that analysis of RA PBMCs at the molecular level may provide a set of candidate genes that could yield an easily accessible gene signature to aid in early diagnosis and treatment. PMID:17515956

  16. Possible pathways used to predict different stages of lung adenocarcinoma.

    PubMed

    Chen, Xiaodong; Duan, Qiongyu; Xuan, Ying; Sun, Yunan; Wu, Rong

    2017-04-01

    We aimed to find some specific pathways that can be used to predict the stage of lung adenocarcinoma. RNA-Seq expression profile data and clinical data of lung adenocarcinoma (stage I [37], stage II [161], stage III [75], and stage IV [45]) were obtained from the TCGA dataset. The differentially expressed genes were merged, a correlation coefficient matrix between genes was constructed with correlation analysis, and unsupervised clustering was carried out with a hierarchical clustering method. The specific coexpression network in every stage was constructed with Cytoscape software. Kyoto Encyclopedia of Genes and Genomes pathway enrichment analysis was performed with the KOBAS database and Fisher exact test. A Euclidean distance algorithm was used to calculate the total deviation score. The diagnostic model was constructed with an SVM algorithm. Eighteen specific genes were obtained by intersecting the 4 groups of differentially expressed genes. Ten significantly enriched pathways were obtained. In the distribution map of the 10 pathway scores in different groups, the degrees to which sample groups deviated from the normal level were as follows: stage I < stage II < stage III < stage IV. The pathway scores of the 4 stages exhibited linear change in some pathways, and the scores of 1 or 2 stages were significantly different from the remaining stages in some pathways. There was a significant difference between dead and alive patients for these pathways except the thyroid hormone signaling pathway. Those 10 pathways are associated with the development of lung adenocarcinoma and may be able to predict different stages of it. Furthermore, these pathways except the thyroid hormone signaling pathway may be able to predict the prognosis.

  17. ModuleMiner - improved computational detection of cis-regulatory modules: are there different modes of gene regulation in embryonic development and adult tissues?

    PubMed Central

    Van Loo, Peter; Aerts, Stein; Thienpont, Bernard; De Moor, Bart; Moreau, Yves; Marynen, Peter

    2008-01-01

    We present ModuleMiner, a novel algorithm for computationally detecting cis-regulatory modules (CRMs) in a set of co-expressed genes. ModuleMiner outperforms other methods for CRM detection on benchmark data, and successfully detects CRMs in tissue-specific microarray clusters and in embryonic development gene sets. Interestingly, CRM predictions for differentiated tissues exhibit strong enrichment close to the transcription start site, whereas CRM predictions for embryonic development gene sets are depleted in this region. PMID:18394174

  18. Transcriptomic correlates of neuron electrophysiological diversity

    PubMed Central

    Li, Brenna; Crichlow, Cindy-Lee; Mancarci, B. Ogan; Pavlidis, Paul

    2017-01-01

    How neuronal diversity emerges from complex patterns of gene expression remains poorly understood. Here we present an approach to understand electrophysiological diversity through gene expression by integrating pooled- and single-cell transcriptomics with intracellular electrophysiology. Using neuroinformatics methods, we compiled a brain-wide dataset of 34 neuron types with paired gene expression and intrinsic electrophysiological features from publicly accessible sources, the largest such collection to date. We identified 420 genes whose expression levels significantly correlated with variability in one or more of 11 physiological parameters. We next trained statistical models to infer cellular features from multivariate gene expression patterns. Such models were predictive of gene-electrophysiological relationships in an independent collection of 12 visual cortex cell types from the Allen Institute, suggesting that these correlations might reflect general principles relating expression patterns to phenotypic diversity across very different cell types. Many associations reported here have the potential to provide new insights into how neurons generate functional diversity, and correlations of ion channel genes like Gabrd and Scn1a (Nav1.1) with resting potential and spiking frequency are consistent with known causal mechanisms. Our work highlights the promise and inherent challenges in using cell type-specific transcriptomics to understand the mechanistic origins of neuronal diversity. PMID:29069078

  19. Structure-related clustering of gene expression fingerprints of thp-1 cells exposed to smaller polycyclic aromatic hydrocarbons.

    PubMed

    Wan, B; Yarbrough, J W; Schultz, T W

    2008-01-01

    This study was undertaken to test the hypothesis that structurally similar PAHs induce similar gene expression profiles. THP-1 cells were exposed to a series of 12 selected PAHs at 50 microM for 24 hours and gene expression profiles were analyzed using both unsupervised and supervised methods. Clustering analysis of gene expression profiles revealed that the 12 tested chemicals were grouped into five clusters. Within each cluster, the gene expression profiles are more similar to each other than to the ones outside the cluster. One-methylanthracene and 1-methylfluorene were found to have the most similar profiles; dibenzothiophene and dibenzofuran were found to share common profiles with fluorene. As expression pattern comparisons were expanded, similarity in genomic fingerprint dropped off dramatically. Prediction analysis of microarrays (PAM) based on the clustering pattern generated 49 predictor genes that can be used for sample discrimination. Moreover, significance analysis of microarrays (SAM) identified 598 genes modulated by the tested chemicals, with a variety of biological processes, such as cell cycle, metabolism, and protein binding, and KEGG pathways being significantly (p < 0.05) affected. It is feasible to distinguish structurally different PAHs based on their genomic fingerprints, which are mechanism based.

  20. Low-rank regularization for learning gene expression programs.

    PubMed

    Ye, Guibo; Tang, Mengfan; Cai, Jian-Feng; Nie, Qing; Xie, Xiaohui

    2013-01-01

    Learning gene expression programs directly from a set of observations is challenging due to the complexity of gene regulation, high noise of experimental measurements, and insufficient number of experimental measurements. Imposing additional constraints with strong and biologically motivated regularizations is critical in developing reliable and effective algorithms for inferring gene expression programs. Here we propose a new form of regularization that constrains the number of independent connectivity patterns between regulators and targets, motivated by the modular design of gene regulatory programs and the belief that the total number of independent regulatory modules should be small. We formulate a multi-target linear regression framework to incorporate this type of regularization, in which the number of independent connectivity patterns is expressed as the rank of the connectivity matrix between regulators and targets. We then generalize the linear framework to nonlinear cases, and prove that the generalized low-rank regularization model is still convex. Efficient algorithms are derived to solve both the linear and nonlinear low-rank regularized problems. Finally, we test the algorithms on three gene expression datasets, and show that the low-rank regularization improves the accuracy of gene expression prediction in these three datasets.
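
    The linear case described in this record can be illustrated with a minimal sketch. It assumes the rank constraint is relaxed to a nuclear-norm penalty and solved by proximal gradient descent with singular value thresholding; the abstract does not state which solver the authors derived, so the functions, parameter values, and synthetic data below are all hypothetical.

```python
import numpy as np


def svt(M, tau):
    """Singular value soft-thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def low_rank_regression(X, Y, lam, n_iter=500):
    """Minimize 0.5 * ||Y - X W||_F^2 + lam * ||W||_* by proximal gradient descent.

    X: (samples, regulators), Y: (samples, targets), W: (regulators, targets).
    """
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = X.T @ (X @ W - Y)            # gradient of the squared-error term
        W = svt(W - step * grad, step * lam)
    return W


# Synthetic example (assumed data): 20 regulators, 15 targets, true rank 2.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
W_true = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
Y = X @ W_true + 0.1 * rng.standard_normal((100, 15))

W_hat = low_rank_regression(X, Y, lam=20.0)
rel_err = np.linalg.norm(W_hat - W_true) / np.linalg.norm(W_true)
print("estimated rank:", np.linalg.matrix_rank(W_hat, tol=1e-8))
print("relative error:", rel_err)
```

    The nuclear norm is a common convex surrogate for matrix rank, which is why it is used here; larger values of `lam` push more singular values of the connectivity matrix to zero, trading fit for fewer independent connectivity patterns.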

  1. Final technical report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Edward DeLong

    2011-10-07

    Our overarching goals in this project were to: Develop and improve high-throughput sequencing methods and analytical approaches for quantitative analyses of microbial gene expression at the Hawaii Ocean Time Series Station and the Bermuda Atlantic Time Series Station; Conduct field analyses following gene expression patterns in picoplankton microbial communities in general, and Prochlorococcus flow sorted from that community, as they respond to different environmental variables (light, macronutrients, dissolved organic carbon), that are predicted to influence activity, productivity, and carbon cycling; Use the expression analyses of flow sorted Prochlorococcus to identify horizontally transferred genes and gene products, in particular those that are located in genomic islands and likely to confer habitat-specific fitness advantages; Use the microbial community gene expression data that we generate to gain insights, and test hypotheses, about the variability, genomic context, activity and function of as yet uncharacterized gene products, that appear highly expressed in the environment. We achieved the above goals, and even more over the course of the project. This includes a number of novel methodological developments, as well as the standardization of microbial community gene expression analyses in both field surveys, and experimental modalities. The availability of these methods, tools and approaches is changing current practice in microbial community analyses.

  2. Characterization and expression of amphioxus ApoD gene encoding an archetype of vertebrate ApoD proteins.

    PubMed

    Wang, Lei; Zhang, Shicui; Liu, Zhenhui; Li, Hongyan; Wang, Yongjun; Jiang, Shengjuan

    2007-01-01

    Here we report a homologue of the apolipoprotein D gene (AmphiApoD) in amphioxus, Branchiostoma belcheri tsingtauense, the first such finding in a cephalochordate, a basal chordate. The main features of the protein predicted from AmphiApoD are characteristic of apolipoprotein D. Phylogenetic analysis places AmphiApoD at the base of the phylogenetic tree, suggesting that AmphiApoD is the archetype of the vertebrate ApoD genes. Whole-mount in situ hybridization, Northern blotting, RT-PCR, and in situ hybridization histochemistry all reveal that AmphiApoD is expressed in tissues derived from mesoderm and endoderm including notochord and hind-gut, which contrasts with the strong expression patterns of ApoD genes in the ectodermal derivatives in mammals and birds. The expression profiles of the ApoD gene may have been changed to be expressed in the endo-mesodermal derivatives in amphioxus after the vertebrate and cephalochordate lineages diverged; alternatively, the ApoD gene may first have been expressed in the endo-mesoderm during embryogenesis in the last common ancestor of all chordates, and subsequently came to be expressed in the ectodermal derivatives of vertebrates including mammals and birds.

  3. Bioinformatic identification and expression analysis of banana microRNAs and their targets.

    PubMed

    Chai, Juan; Feng, Renjun; Shi, Hourui; Ren, Mengyun; Zhang, Yindong; Wang, Jingyi

    2015-01-01

    MicroRNAs (miRNAs) represent a class of endogenous non-coding small RNAs that play important roles in multiple biological processes by degrading targeted mRNAs or repressing mRNA translation. Thousands of miRNAs have been identified in many plant species, whereas only a limited number of miRNAs have been predicted in M. acuminata (A genome) and M. balbisiana (B genome). Here, previously known plant miRNAs were BLASTed against the Expressed Sequence Tag (EST) and Genomic Survey Sequence (GSS), a database of banana genes. A total of 32 potential miRNAs belonging to 13 miRNAs families were detected using a range of filtering criteria. 244 miRNA:target pairs were subsequently predicted, most of which encode transcription factors or enzymes that participate in the regulation of development, growth, metabolism, and other physiological processes. In order to validate the predicted miRNAs and the mutual relationship between miRNAs and their target genes, qRT-PCR was applied to detect the tissue-specific expression levels of 12 putative miRNAs and 6 target genes in roots, leaves, flowers, and fruits. This study provides some important information about banana pre-miRNAs, mature miRNAs, and miRNA target genes and these findings can be applied to future research of miRNA functions.

  4. Bioinformatic Identification and Expression Analysis of Banana MicroRNAs and Their Targets

    PubMed Central

    Shi, Hourui; Ren, Mengyun; Zhang, Yindong; Wang, Jingyi

    2015-01-01

    MicroRNAs (miRNAs) represent a class of endogenous non-coding small RNAs that play important roles in multiple biological processes by degrading targeted mRNAs or repressing mRNA translation. Thousands of miRNAs have been identified in many plant species, whereas only a limited number of miRNAs have been predicted in M. acuminata (A genome) and M. balbisiana (B genome). Here, previously known plant miRNAs were BLASTed against the Expressed Sequence Tag (EST) and Genomic Survey Sequence (GSS), a database of banana genes. A total of 32 potential miRNAs belonging to 13 miRNAs families were detected using a range of filtering criteria. 244 miRNA:target pairs were subsequently predicted, most of which encode transcription factors or enzymes that participate in the regulation of development, growth, metabolism, and other physiological processes. In order to validate the predicted miRNAs and the mutual relationship between miRNAs and their target genes, qRT-PCR was applied to detect the tissue-specific expression levels of 12 putative miRNAs and 6 target genes in roots, leaves, flowers, and fruits. This study provides some important information about banana pre-miRNAs, mature miRNAs, and miRNA target genes and these findings can be applied to future research of miRNA functions. PMID:25856313

  5. Transcriptional response of Leptospira interrogans to iron limitation and characterization of a PerR homolog.

    PubMed

    Lo, Miranda; Murray, Gerald L; Khoo, Chen Ai; Haake, David A; Zuerner, Richard L; Adler, Ben

    2010-11-01

    Leptospirosis is a globally significant zoonosis caused by Leptospira spp. Iron is essential for growth of most bacterial species. Since iron availability is low in the host, pathogens have evolved complex iron acquisition mechanisms to survive and establish infection. In many bacteria, expression of iron uptake and storage proteins is regulated by Fur. L. interrogans encodes four predicted Fur homologs; we have constructed a mutation in one of these, la1857. We conducted microarray analysis to identify iron-responsive genes and to study the effects of la1857 mutation on gene expression. Under iron-limiting conditions, 43 genes were upregulated and 49 genes were downregulated in the wild type. Genes encoding proteins with predicted involvement in inorganic ion transport and metabolism (including TonB-dependent proteins and outer membrane transport proteins) were overrepresented in the upregulated list, while 54% of differentially expressed genes had no known function. There were 16 upregulated genes of unknown function which are absent from the saprophyte L. biflexa and which therefore may encode virulence-associated factors. Expression of iron-responsive genes was not significantly affected by mutagenesis of la1857, indicating that LA1857 is not a global regulator of iron homeostasis. Upregulation of heme biosynthetic genes and a putative catalase in the mutant suggested that LA1857 is more similar to PerR, a regulator of the oxidative stress response. Indeed, the la1857 mutant was more resistant to peroxide stress than the wild type. Our results provide insights into the role of iron in leptospiral metabolism and regulation of the oxidative stress response, including genes likely to be important for virulence.

  6. Upregulation of LYAR induces neuroblastoma cell proliferation and survival.

    PubMed

    Sun, Yuting; Atmadibrata, Bernard; Yu, Denise; Wong, Matthew; Liu, Bing; Ho, Nicholas; Ling, Dora; Tee, Andrew E; Wang, Jenny; Mungrue, Imran N; Liu, Pei Y; Liu, Tao

    2017-09-01

    The N-Myc oncoprotein induces neuroblastoma by regulating gene transcription and consequently causing cell proliferation. Paradoxically, N-Myc is well known to induce apoptosis by upregulating pro-apoptosis genes, and it is not clear how N-Myc overexpressing neuroblastoma cells escape N-Myc-mediated apoptosis. The nuclear zinc finger protein LYAR has recently been shown to modulate gene expression by forming a protein complex with the protein arginine methyltransferase PRMT5. Here we showed that N-Myc upregulated LYAR gene expression by binding to its gene promoter. Genome-wide differential gene expression studies revealed that knocking down LYAR considerably upregulated the expression of oxidative stress genes including CHAC1, which depletes intracellular glutathione and induces oxidative stress. Although knocking down LYAR expression with siRNAs induced oxidative stress, neuroblastoma cell growth inhibition and apoptosis, co-treatment with the glutathione supplement N-acetyl-l-cysteine or co-transfection with CHAC1 siRNAs blocked the effect of LYAR siRNAs. Importantly, high levels of LYAR gene expression in human neuroblastoma tissues predicted poor event-free and overall survival in neuroblastoma patients, independent of the best current markers for poor prognosis. Taken together, our data suggest that LYAR induces proliferation and promotes survival of neuroblastoma cells by repressing the expression of oxidative stress genes such as CHAC1 and suppressing oxidative stress, and identify LYAR as a novel co-factor in N-Myc oncogenesis.

  7. Gene expression of Caenorhabditis elegans neurons carries information on their synaptic connectivity.

    PubMed

    Kaufman, Alon; Dror, Gideon; Meilijson, Isaac; Ruppin, Eytan

    2006-12-08

    The claim that genetic properties of neurons significantly influence their synaptic network structure is a common notion in neuroscience. The nematode Caenorhabditis elegans provides an exciting opportunity to approach this question in a large-scale quantitative manner. Its synaptic connectivity network has been identified, and, combined with cellular studies, we currently have characteristic connectivity and gene expression signatures for most of its neurons. By using two complementary analysis assays we show that the expression signature of a neuron carries significant information about its synaptic connectivity signature, and identify a list of putative genes predicting neural connectivity. The current study rigorously quantifies the relation between gene expression and synaptic connectivity signatures in the C. elegans nervous system and identifies subsets of neurons where this relation is highly marked. The results presented and the genes identified provide a promising starting point for further, more detailed computational and experimental investigations.

  8. A specific glycerol kinase induces rapid cold hardening of the diamondback moth, Plutella xylostella.

    PubMed

    Park, Youngjin; Kim, Yonggyun

    2014-08-01

    Insects in temperate zones survive low temperatures by migrating or tolerating the cold. The diamondback moth, Plutella xylostella, is a serious insect pest on cabbage and other cruciferous crops worldwide. We showed that P. xylostella became cold-tolerant by expressing rapid cold hardiness (RCH) in response to a brief exposure to moderately low temperature (4°C) for 7h along with glycerol accumulation in hemolymph. Glycerol played a crucial role in the cold-hardening process because exogenously supplying glycerol significantly increased the cold tolerance of P. xylostella larvae without cold acclimation. To determine the genetic factor(s) responsible for RCH and the increase of glycerol, four glycerol kinases (GKs), and glycerol-3-phosphate dehydrogenase (PxGPDH) were predicted from the whole P. xylostella genome and analyzed for their function associated with glycerol biosynthesis. All predicted genes were expressed, but differed in their expression during different developmental stages and in different tissues. Expression of the predicted genes was individually suppressed by RNA interference (RNAi) using double-stranded RNAs specific to target genes. RNAi of PxGPDH expression significantly suppressed RCH and glycerol accumulation. Only PxGK1 among the four GKs was responsible for RCH and glycerol accumulation. Furthermore, PxGK1 expression was significantly enhanced during RCH. These results indicate that a specific GK, the terminal enzyme to produce glycerol, is specifically inducible during RCH to accumulate the main cryoprotectant. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. Gene Signature for Predicting Solid Tumors Patient Prognosis

    Cancer.gov

    The National Cancer Institute’s Laboratory of Human Carcinogenesis seeks parties to license or co-develop a method of predicting the prognosis of a patient diagnosed with hepatocellular carcinoma (HCC) or breast cancer by detecting expression of one or more cancer-associated genes, and a method of identifying an agent for use in treating HCC.

  10. BICD1 expression, as a potential biomarker for prognosis and predicting response to therapy in patients with glioblastomas

    PubMed Central

    Huang, Shang-Pen; Chang, Yu-Chan; Low, Qie Hua; Wu, Alexander T.H.; Chen, Chi-Long; Lin, Yuan-Feng; Hsiao, Michael

    2017-01-01

    There is variation in the survival and therapeutic outcome of patients with glioblastomas (GBMs). Therapy resistance is an important challenge in the treatment of GBM patients. The aim of this study was to identify Temozolomide (TMZ) related genes and confirm their clinical relevance. The TMZ-related genes were discovered by analysis of the gene-expression profiling in our cell-based microarray. Their clinical relevance was verified by in silico meta-analysis of the Cancer Genome Atlas (TCGA) and the Chinese Glioma Genome Atlas (CGGA) datasets. Our results demonstrated that BICD1 expression could predict both prognosis and response to therapy in GBM patients. First, high BICD1 expression was correlated with poor prognosis in the TCGA GBM cohort (n=523) and in the CGGA glioma cohort (n=220). Second, high BICD1 expression predicted poor outcome in patients with TMZ treatment (n=301) and radiation therapy (n=405). Third, multivariable Cox regression analysis confirmed BICD1 expression as an independent factor affecting the prognosis and therapeutic response to TMZ and radiation in GBM patients. Additionally, age, MGMT and BICD1 expression were used in combination to stratify GBM patients into more distinct risk groups, which may provide better outcome assessment. Finally, we observed a strong correlation between BICD1 expression and epithelial-mesenchymal transition (EMT) in GBMs, and proposed a possible mechanism of BICD1-associated survival or therapeutic resistance in GBMs accordingly. In conclusion, our study suggests that high BICD1 expression may result in worse prognosis and could be a predictor of poor response to TMZ and radiation therapies in GBM patients. PMID:29371945

  11. Analysis of functional redundancies within the Arabidopsis TCP transcription factor family.

    PubMed

    Danisman, Selahattin; van Dijk, Aalt D J; Bimbo, Andrea; van der Wal, Froukje; Hennig, Lars; de Folter, Stefan; Angenent, Gerco C; Immink, Richard G H

    2013-12-01

    Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein-protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein-protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family.

  12. Analysis of functional redundancies within the Arabidopsis TCP transcription factor family

    PubMed Central

    Danisman, Selahattin; de Folter, Stefan; Immink, Richard G. H.

    2013-01-01

    Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein–protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein–protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family. PMID:24129704

  13. Clinical value of miR-182-5p in lung squamous cell carcinoma: a study combining data from TCGA, GEO, and RT-qPCR validation.

    PubMed

    Luo, Jie; Shi, Ke; Yin, Shu-Ya; Tang, Rui-Xue; Chen, Wen-Jie; Huang, Lin-Zhen; Gan, Ting-Qing; Cai, Zheng-Wen; Chen, Gang

    2018-04-10

    MiR-182-5p, a member of the miRNA family, can be detected in lung cancer and plays an important role in it. This study aimed to explore the clinical value of miR-182-5p in lung squamous cell carcinoma (LUSC) and to unveil the molecular mechanism of LUSC. The clinical value of miR-182-5p in LUSC was investigated by collecting and calculating data from The Cancer Genome Atlas (TCGA) database, the Gene Expression Omnibus (GEO) database, and real-time quantitative polymerase chain reaction (RT-qPCR). Twelve prediction platforms were used to predict the target genes of miR-182-5p. Protein-protein interaction (PPI) networks and gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were used to explore the molecular mechanism of LUSC. The expression of miR-182-5p was significantly higher in LUSC than in non-cancerous tissues, as evidenced by various approaches, including the TCGA database, GEO microarrays, RT-qPCR, and a comprehensive meta-analysis of 501 LUSC cases and 148 non-cancerous cases. Furthermore, a total of 81 potential target genes were chosen from the union of predicted genes and the TCGA database. GO and KEGG analyses demonstrated that the target genes are involved in pathways related to biological processes. PPIs revealed the relationships between these genes, with EPAS1, PRKCE, NR3C1, and RHOB being located in the center of the PPI network. MiR-182-5p upregulation greatly contributes to LUSC and may serve as a biomarker in LUSC.

  14. CK2 phosphorylates and inhibits TAp73 tumor suppressor function to promote expression of cancer stem cell genes and phenotype in head and neck cancer.

    PubMed

    Lu, Hai; Yan, Carol; Quan, Xin Xin; Yang, Xinping; Zhang, Jialing; Bian, Yansong; Chen, Zhong; Van Waes, Carter

    2014-10-01

    Cancer stem cells (CSC) and genes have been linked to cancer development and therapeutic resistance, but the signaling mechanisms regulating CSC genes and phenotype are incompletely understood. CK2 has emerged as a key signal serine/threonine kinase that modulates diverse signal cascades regulating cell fate and growth. We previously showed that CK2 is often aberrantly expressed and activated in head and neck squamous cell carcinomas (HNSCC), concomitantly with mutant (mt) tumor suppressor TP53, and inactivation of its family member, TAp73. Unexpectedly, we observed that classical stem cell genes Nanog, Sox2, and Oct4 are overexpressed in HNSCC with inactivated TAp73 and mtTP53. However, the potential relationship between CK2, TAp73 inactivation, and CSC phenotype is unknown. We reveal that inhibition of CK2 by pharmacologic inhibitors or siRNA inhibits the expression of CSC genes and side population (SP), while enhancing TAp73 mRNA and protein expression. Conversely, CK2 inhibitor attenuation of CSC protein expression and the SP was abrogated by TAp73 siRNA. Bioinformatic analysis uncovered a single predicted CK2 threonine phosphorylation site (T27) within the N-terminal transactivation domain of TAp73. Nuclear CK2 and TAp73 interaction, confirmed by co-immunoprecipitation, was attenuated by CK2 inhibitor, or a T27A point-mutation of this predicted CK2 threonine phospho-acceptor site of TAp73. Further, T27A mutation attenuated phosphorylation, while enhancing TAp73 function in repressing CSC gene expression and SP cells. A new CK2 inhibitor, CX-4945, inhibited CSC related SP cells, clonogenic survival, and spheroid formation. Our study unveils a novel regulatory mechanism whereby aberrant CK2 signaling inhibits TAp73 to promote the expression of CSC genes and phenotype.

  15. Psychological Well-Being and the Human Conserved Transcriptional Response to Adversity

    PubMed Central

    Fredrickson, Barbara L.; Grewen, Karen M.; Algoe, Sara B.; Firestine, Ann M.; Arevalo, Jesusa M. G.; Ma, Jeffrey; Cole, Steve W.

    2015-01-01

    Research in human social genomics has identified a conserved transcriptional response to adversity (CTRA) characterized by up-regulated expression of pro-inflammatory genes and down-regulated expression of Type I interferon- and antibody-related genes. This report seeks to identify the specific aspects of positive psychological well-being that oppose such effects and predict reduced CTRA gene expression. In a new confirmation study of 122 healthy adults that replicated the approach of a previously reported discovery study, mixed effect linear model analyses identified a significant inverse association between expression of CTRA indicator genes and a summary measure of eudaimonic well-being from the Mental Health Continuum – Short Form. Analyses of a 2-factor representation of eudaimonia converged in finding correlated psychological and social subdomains of eudaimonic well-being to be the primary carriers of CTRA associations. Hedonic well-being showed no consistent CTRA association independent of eudaimonic well-being, and summary measures integrating hedonic and eudaimonic well-being showed less stable CTRA associations than did focal measures of eudaimonia (psychological and social well-being). Similar results emerged from analyses of pooled discovery and confirmation samples (n = 198). Similar results also emerged from analyses of a second new generalization study of 107 healthy adults that included the more detailed Ryff Scales of Psychological Well-being and found this more robust measure of eudaimonic well-being to also associate with reduced CTRA gene expression. Five of the 6 major sub-domains of psychological well-being predicted reduced CTRA gene expression when analyzed separately, and 3 remained distinctively prognostic in mutually adjusted analyses. All associations were independent of demographic characteristics, health-related confounders, and RNA indicators of leukocyte subset distribution. These results identify specific sub-dimensions of eudaimonic well-being as promising targets for future interventions to mitigate CTRA gene expression, and provide no support for any independent favorable contribution from hedonic well-being. PMID:25811656

  16. Multi-walled carbon nanotube-induced gene signatures in the mouse lung: potential predictive value for human lung cancer risk and prognosis

    PubMed Central

    Guo, Nancy L; Wan, Ying-Wooi; Denvir, James; Porter, Dale W; Pacurari, Maricica; Wolfarth, Michael G; Castranova, Vincent; Qian, Yong

    2012-01-01

    Concerns over the potential for multi-walled carbon nanotubes (MWCNT) to induce lung carcinogenesis have emerged. This study sought to (1) identify gene expression signatures in the mouse lungs following pharyngeal aspiration of well-dispersed MWCNT and (2) determine if these genes were associated with human lung cancer risk and progression. Genome-wide mRNA expression profiles were analyzed in mouse lungs (n=160) exposed to 0, 10, 20, 40, or 80 µg of MWCNT by pharyngeal aspiration at 1, 7, 28, and 56 days post-exposure. By using pairwise-Statistical Analysis of Microarray (SAM) and linear modeling, 24 genes were selected, which have significant changes in at least two time points, have a more than 1.5 fold change at all doses, and are significant in the linear model for the dose or the interaction of time and dose. Additionally, a 38-gene set was identified as related to cancer from 330 genes differentially expressed at day 56 post-exposure in functional pathway analysis. Using the expression profiles of the cancer-related gene set in 8 mice at day 56 post-exposure to 10 µg of MWCNT, a nearest centroid classification accurately predicts human lung cancer survival with a significant hazard ratio in training set (n=256) and test set (n=186). Furthermore, both gene signatures were associated with human lung cancer risk (n=164) with significant odds ratios. These results may lead to development of a surveillance approach for early detection of lung cancer and prognosis associated with MWCNT in the workplace. PMID:22891886
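    The abstract above names nearest centroid classification without giving its details. As a rough illustration only, the sketch below shows the general technique on made-up two-gene expression profiles: each class is summarized by the mean of its training profiles, and a new sample is assigned to the class whose centroid is closest. The Euclidean distance, the toy values, the "low_risk"/"high_risk" labels and the function names are all assumptions for this example, not the study's actual data or pipeline.

```python
from math import dist  # Euclidean distance, Python 3.8+


def fit_centroids(samples, labels):
    """Return {label: mean expression profile} for each class."""
    sums, counts = {}, {}
    for profile, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(profile)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], profile)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def predict(centroids, profile):
    """Assign a profile to the class with the nearest centroid."""
    return min(centroids, key=lambda lab: dist(centroids[lab], profile))


# Hypothetical training data: two genes per sample, two risk groups.
train_profiles = [[1.0, 1.2], [0.8, 1.0], [3.0, 3.1], [3.2, 2.9]]
train_labels = ["low_risk", "low_risk", "high_risk", "high_risk"]

centroids = fit_centroids(train_profiles, train_labels)
print(predict(centroids, [1.1, 0.9]))
print(predict(centroids, [2.9, 3.3]))
```

    In a real analysis the profiles would typically be normalized first, and the predicted groups would then be compared on survival, for example with a hazard ratio as the abstract reports.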

  17. Identification of Biomarker Genes To Predict Biodegradation of 1,4-Dioxane

    PubMed Central

    Gedalanga, Phillip B.; Pornwongthong, Peerapong; Mora, Rebecca; Chiang, Sheau-Yun Dora; Baldwin, Brett; Ogles, Dora

    2014-01-01

    Bacterial multicomponent monooxygenase gene targets in Pseudonocardia dioxanivorans CB1190 were evaluated for their use as biomarkers to identify the potential for 1,4-dioxane biodegradation in pure cultures and environmental samples. Our studies using laboratory pure cultures and industrial activated sludge samples suggest that the presence of genes associated with dioxane monooxygenase, propane monooxygenase, alcohol dehydrogenase, and aldehyde dehydrogenase are promising indicators of 1,4-dioxane biotransformation; however, gene abundance was insufficient to predict actual biodegradation. A time course gene expression analysis of dioxane and propane monooxygenases in Pseudonocardia dioxanivorans CB1190 and mixed communities in wastewater samples revealed important associations with the rates of 1,4-dioxane removal. In addition, transcripts of alcohol dehydrogenase and aldehyde dehydrogenase genes were upregulated during biodegradation, although only the aldehyde dehydrogenase was significantly correlated with 1,4-dioxane concentrations. Expression of the propane monooxygenase demonstrated a time-dependent relationship with 1,4-dioxane biodegradation in P. dioxanivorans CB1190, with increased expression occurring after over 50% of the 1,4-dioxane had been removed. While the fraction of P. dioxanivorans CB1190-like bacteria among the total bacterial population significantly increased with decrease in 1,4-dioxane concentrations in wastewater treatment samples undergoing active biodegradation, the abundance and expression of monooxygenase-based biomarkers were better predictors of 1,4-dioxane degradation than taxonomic 16S rRNA genes. This study illustrates that specific bacterial monooxygenase and dehydrogenase gene targets together can serve as effective biomarkers for 1,4-dioxane biodegradation in the environment. PMID:24632253

  18. Relaxed selection is a precursor to the evolution of phenotypic plasticity.

    PubMed

    Hunt, Brendan G; Ometto, Lino; Wurm, Yannick; Shoemaker, DeWayne; Yi, Soojin V; Keller, Laurent; Goodisman, Michael A D

    2011-09-20

    Phenotypic plasticity allows organisms to produce alternative phenotypes under different conditions and represents one of the most important ways by which organisms adaptively respond to the environment. However, the relationship between phenotypic plasticity and molecular evolution remains poorly understood. We addressed this issue by investigating the evolution of genes associated with phenotypically plastic castes, sexes, and developmental stages of the fire ant Solenopsis invicta. We first determined if genes associated with phenotypic plasticity in S. invicta evolved at a rapid rate, as predicted under theoretical models. We found that genes differentially expressed between S. invicta castes, sexes, and developmental stages all exhibited elevated rates of evolution compared with ubiquitously expressed genes. We next investigated the evolutionary history of genes associated with the production of castes. Surprisingly, we found that orthologs of caste-biased genes in S. invicta and the social bee Apis mellifera evolved rapidly in lineages without castes. Thus, in contrast to some theoretical predictions, our results suggest that rapid rates of molecular evolution may not arise primarily as a consequence of phenotypic plasticity. Instead, genes evolving under relaxed purifying selection may more readily adopt new forms of biased expression during the evolution of alternate phenotypes. These results suggest that relaxed selective constraint on protein-coding genes is an important and underappreciated element in the evolutionary origin of phenotypic plasticity.

  19. Ion Channel Gene Expression in Lung Adenocarcinoma: Potential Role in Prognosis and Diagnosis

    PubMed Central

    Ko, Jae-Hong; Gu, Wanjun; Lim, Inja; Bang, Hyoweon; Ko, Eun A.; Zhou, Tong

    2014-01-01

    Ion channels are known to regulate cancer processes at all stages. The roles of ion channels in cancer pathology are extremely diverse. We systematically analyzed the expression patterns of ion channel genes in lung adenocarcinoma. First, we compared the expression of ion channel genes between normal and tumor tissues in patients with lung adenocarcinoma. Thirty-seven ion channel genes were identified as being differentially expressed between the two groups. Next, we investigated the prognostic power of ion channel genes in lung adenocarcinoma. We assigned a risk score to each lung adenocarcinoma patient based on the expression of the differentially expressed ion channel genes. We demonstrated that the risk score effectively predicted overall survival and recurrence-free survival in lung adenocarcinoma. We also found that the risk scores for ever-smokers were higher than those for never-smokers. Multivariate analysis indicated that the risk score was a significant prognostic factor for survival, which is independent of patient age, gender, stage, smoking history, Myc level, and EGFR/KRAS/ALK gene mutation status. Finally, we investigated the difference in ion channel gene expression between the two major subtypes of non-small cell lung cancer: adenocarcinoma and squamous-cell carcinoma. Thirty ion channel genes were identified as being differentially expressed between the two groups. We suggest that ion channel gene expression can be used to improve the subtype classification in non-small cell lung cancer at the molecular level. The findings in this study have been validated in several independent lung cancer cohorts. PMID:24466154

  20. Daytime soybean transcriptome fluctuations during water deficit stress.

    PubMed

    Rodrigues, Fabiana Aparecida; Fuganti-Pagliarini, Renata; Marcolino-Gomes, Juliana; Nakayama, Thiago Jonas; Molinari, Hugo Bruno Correa; Lobo, Francisco Pereira; Harmon, Frank G; Nepomuceno, Alexandre Lima

    2015-07-07

    Since drought can seriously affect plant growth and development and little is known about how the oscillations of gene expression during the drought stress-acclimation response in soybean are affected, we applied Illumina technology to sequence 36 cDNA libraries synthesized from control and drought-stressed soybean plants to verify the dynamic changes in gene expression during a 24-h time course. Cycling variables were measured from the expression data to determine the putative circadian rhythm regulation of gene expression. We identified 4866 genes differentially expressed in soybean plants in response to water deficit. Of these genes, 3715 were differentially expressed during the light period, from which approximately 9.55% were observed in both light and darkness. We found 887 genes that were either up- or down-regulated in different periods of the day. Of 54,175 predicted soybean genes, 35.52% exhibited expression oscillations in a 24 h period. This number increased to 39.23% when plants were submitted to water deficit. Major differences in gene expression were observed in the control plants from late day (ZT16) until predawn (ZT20) periods, indicating that gene expression oscillates during the course of 24 h in normal development. Under water deficit, dissimilarity increased in all time-periods, indicating that the applied stress influenced gene expression. Such differences in plants under stress were primarily observed in ZT0 (early morning) to ZT8 (late day) and also from ZT4 to ZT12. Stress-related pathways were triggered in response to water deficit primarily during midday, when more genes were up-regulated compared to early morning. Additionally, genes known to be involved in secondary metabolism and hormone signaling were also expressed in the dark period. Gene expression networks can be dynamically shaped to acclimate plant metabolism under environmental stressful conditions. We have identified putative cycling genes that are expressed in soybean leaves under normal developmental conditions and genes whose expression oscillates under conditions of water deficit. These results suggest that time of day, as well as the light and temperature oscillations that occur, considerably affect the regulation of the water deficit stress response in soybean plants.
