Science.gov

Sample records for endochitinase-encoding genes identified

  1. QTLminer: identifying genes regulating quantitative traits.

    PubMed

    Alberts, Rudi; Schughart, Klaus

    2010-10-15

    Quantitative trait locus (QTL) mapping identifies genomic regions that likely contain genes regulating a quantitative trait. However, QTL regions may encompass tens to hundreds of genes. To find the most promising candidate genes that regulate the trait, the biologist typically collects information from multiple resources about the genes in the QTL interval. This process is very laborious and time consuming. QTLminer is a bioinformatics tool that automatically performs QTL region analysis. It is available in GeneNetwork and it integrates information such as gene annotation, gene expression and sequence polymorphisms for all the genes within a given genomic interval. QTLminer substantially speeds up discovery of the most promising candidate genes within a QTL region.

  2. Identifying essential genes in Arabidopsis thaliana.

    PubMed

    Meinke, David; Muralla, Rosanna; Sweeney, Colleen; Dickerman, Allan

    2008-09-01

    Eight years after publication of the Arabidopsis genome sequence and two years before completing the first phase of an international effort to characterize the function of every Arabidopsis gene, plant biologists remain unable to provide a definitive answer to the following basic question: what is the minimal gene set required for normal growth and development? The purpose of this review is to summarize different strategies employed to identify essential genes in Arabidopsis, an important component of the minimal gene set in plants, to present an overview of the datasets and specific genes identified to date, and to discuss the prospects for future saturation of this important class of genes. The long-term goal of this collaborative effort is to facilitate basic research in plant biology and complement ongoing research with other model organisms.

  3. Stratified gene expression analysis identifies major amyotrophic lateral sclerosis genes.

    PubMed

    Jones, Ashley R; Troakes, Claire; King, Andrew; Sahni, Vibhu; De Jong, Simone; Bossers, Koen; Papouli, Efterpi; Mirza, Muddassar; Al-Sarraj, Safa; Shaw, Christopher E; Shaw, Pamela J; Kirby, Janine; Veldink, Jan H; Macklis, Jeffrey D; Powell, John F; Al-Chalabi, Ammar

    2015-05-01

    Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disease of motor neurons resulting in progressive paralysis. Gene expression studies of ALS only rarely identify the same gene pathways as gene association studies. We hypothesized that analyzing tissues by matching on degree of disease severity would identify different patterns of gene expression from a traditional case-control comparison. We analyzed gene expression changes in 4 postmortem central nervous system regions, stratified by severity of motor neuron loss. An overall comparison of cases (n = 6) and controls (n = 3) identified a known ALS gene, SOX5, as showing differential expression (log2 fold change = 0.09, p = 5.5 × 10^-5). Analyses stratified by disease severity identified expression changes in C9orf72 (p = 2.77 × 10^-3), MATR3 (p = 3.46 × 10^-3), and VEGFA (p = 8.21 × 10^-4), all implicated in ALS through genetic studies, and changes in other genes in pathways involving RNA processing and immune response. These findings suggest that analysis of gene expression stratified by disease severity can identify major ALS genes and may be more efficient than traditional case-control comparison.
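
    To put the reported effect size in context, a log2 fold change converts to a linear expression ratio by raising 2 to that value. The sketch below applies this standard conversion to the 0.09 quoted in the abstract; the function name is illustrative and does not come from the study.

```python
def log2fc_to_ratio(log2_fold_change: float) -> float:
    """Convert a log2 fold change to a linear expression ratio."""
    return 2.0 ** log2_fold_change


# Value quoted in the abstract for SOX5 (cases vs. controls).
ratio = log2fc_to_ratio(0.09)
percent_change = (ratio - 1.0) * 100.0
print(f"ratio = {ratio:.3f}, change = {percent_change:.1f}%")
```

    On this conversion, a log2 fold change of 0.09 corresponds to an expression difference of roughly 6%.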

  4. Phenoscape: Identifying Candidate Genes for Evolutionary Phenotypes

    PubMed Central

    Edmunds, Richard C.; Su, Baofeng; Balhoff, James P.; Eames, B. Frank; Dahdul, Wasila M.; Lapp, Hilmar; Lundberg, John G.; Vision, Todd J.; Dunham, Rex A.; Mabee, Paula M.; Westerfield, Monte

    2016-01-01

    Phenotypes resulting from mutations in genetic model organisms can help reveal candidate genes for evolutionarily important phenotypic changes in related taxa. Although testing candidate gene hypotheses experimentally in nonmodel organisms is typically difficult, ontology-driven information systems can help generate testable hypotheses about developmental processes in experimentally tractable organisms. Here, we tested candidate gene hypotheses suggested by expert use of the Phenoscape Knowledgebase, specifically looking for genes that are candidates responsible for evolutionarily interesting phenotypes in the ostariophysan fishes that bear resemblance to mutant phenotypes in zebrafish. For this, we searched ZFIN for genetic perturbations that result in either loss of basihyal element or loss of scales phenotypes, because these are the ancestral phenotypes observed in catfishes (Siluriformes). We tested the identified candidate genes by examining their endogenous expression patterns in the channel catfish, Ictalurus punctatus. The experimental results were consistent with the hypotheses that these features evolved through disruption in developmental pathways at, or upstream of, brpf1 and eda/edar for the ancestral losses of basihyal element and scales, respectively. These results demonstrate that ontological annotations of the phenotypic effects of genetic alterations in model organisms, when aggregated within a knowledgebase, can be used effectively to generate testable, and useful, hypotheses about evolutionary changes in morphology. PMID:26500251

  5. Identifying genes of gene regulatory networks using formal concept analysis.

    PubMed

    Gebert, Jutta; Motameny, Susanne; Faigle, Ulrich; Forst, Christian V; Schrader, Rainer

    2008-03-01

    In order to understand the behavior of a gene regulatory network, it is essential to know the genes that belong to it. Identifying the correct members (e.g., in order to build a model) is a difficult task even for small subnetworks. Usually only a few members of a network are known and one needs to guess the missing members based on experience or informed speculation. It is beneficial if one can additionally rely on experimental data to support this guess. In this work we present a new method based on formal concept analysis to detect unknown members of a gene regulatory network from gene expression time series data. We show that formal concept analysis is able to find a list of candidate genes for inclusion into a partially known basic network. This list can then be reduced by a statistical analysis so that the resulting genes interact strongly with the basic network and therefore should be included when modeling the network. The method has been applied to the DNA repair system of Mycobacterium tuberculosis. In this application, our method produces comparable results to an already existing method of component selection while it is applicable to a broader range of problems.
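
    For readers unfamiliar with formal concept analysis, the following minimal sketch enumerates the formal concepts of a tiny binary context. It illustrates only the basic building block, not the authors' method: the gene names, the attribute names (such as "up_t1" for "up-regulated at time point 1"), and the brute-force enumeration are all assumptions made for this example.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) of a binary context.

    `context` maps each object (here, a gene) to the set of attributes it has.
    Brute force over attribute subsets, so only suitable for tiny examples.
    """
    all_attrs = set().union(*context.values())
    concepts = set()
    for size in range(len(all_attrs) + 1):
        for subset in combinations(sorted(all_attrs), size):
            wanted = set(subset)
            # Extent: every object that has all the wanted attributes.
            extent = frozenset(g for g, attrs in context.items() if wanted <= attrs)
            if extent:
                # Intent: every attribute shared by all objects in the extent.
                intent = frozenset(set.intersection(*(context[g] for g in extent)))
            else:
                # By convention, the empty extent is paired with all attributes.
                intent = frozenset(all_attrs)
            concepts.add((extent, intent))
    return concepts


# Hypothetical discretized time-series data: which genes are "up" at which time.
context = {
    "geneA": {"up_t1", "up_t2"},
    "geneB": {"up_t1"},
    "geneC": {"up_t2", "up_t3"},
}
for extent, intent in sorted(formal_concepts(context), key=lambda c: -len(c[0])):
    print(sorted(extent), sorted(intent))
```

    Each concept groups the genes that share exactly a given set of attributes, which is the kind of grouping a candidate list could be drawn from.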

  6. Virus induced gene silencing of Arabidopsis gene homologues in wheat identify genes conferring improved drought tolerance

    USDA-ARS's Scientific Manuscript database

    In a non-model staple crop like wheat, functional validation of potential drought stress responsive genes identified in Arabidopsis could provide gene targets for wheat breeding. Virus induced gene silencing (VIGS) of genes of interest can overcome the inherent problems of polyploidy and limited tra...

  7. NIH Researchers Identify OCD Risk Gene

    MedlinePlus

    ... gene (SERT), site of action for the selective serotonin reuptake inhibitors (SSRIs) that are today's mainstay medications for OCD, other anxiety disorders and depression. "Improved knowledge of SERT's role in OCD raises ...

  8. Cancer genomics identifies disrupted epigenetic genes.

    PubMed

    Simó-Riudalbas, Laia; Esteller, Manel

    2014-06-01

    Latest advances in genome technologies have greatly advanced the discovery of epigenetic genes altered in cancer. The initial single candidate gene approaches have been coupled with newly developed epigenomic platforms to hasten the convergence of scientific discoveries and translational applications. Here, we present an overview of the evolution of cancer epigenomics and an updated catalog of disruptions in epigenetic pathways, whose misregulation can culminate in cancer. The creation of these basic mutational catalogs in cell lines and primary tumors will provide us with enough knowledge to move diagnostics and therapy from the laboratory bench to the bedside.

  9. Identifying driver genes in cancer by triangulating gene expression, gene location, and survival data.

    PubMed

    Rouam, Sigrid; Miller, Lance D; Karuturi, R Krishna Murthy

    2014-01-01

    Driver genes are directly responsible for oncogenesis and identifying them is essential in order to fully understand the mechanisms of cancer. However, it is difficult to delineate them from the larger pool of genes that are deregulated in cancer (ie, passenger genes). In order to address this problem, we developed an approach called TRIAngulating Gene Expression (TRIAGE through clinico-genomic intersects). Here, we present a refinement of this approach incorporating a new scoring methodology to identify putative driver genes that are deregulated in cancer. TRIAGE triangulates - or integrates - three levels of information: gene expression, gene location, and patient survival. First, TRIAGE identifies regions of deregulated expression (ie, expression footprints) by deriving a newly established measure called the Local Singular Value Decomposition (LSVD) score for each locus. Driver genes are then distinguished from passenger genes using dual survival analyses. Incorporating measurements of gene expression and weighting them according to the LSVD weight of each tumor, these analyses are performed using the genes located in significant expression footprints. Here, we first use simulated data to characterize the newly established LSVD score. We then present the results of our application of this refined version of TRIAGE to gene expression data from five cancer types. This refined version of TRIAGE not only allowed us to identify known prominent driver genes, such as MMP1, IL8, and COL1A2, but it also led us to identify several novel ones. These results illustrate that TRIAGE complements existing tools, allows for the identification of genes that drive cancer and could perhaps elucidate potential future targets of novel anticancer therapeutics.

  10. Identifying Cancer Driver Genes Using Replication-Incompetent Retroviral Vectors

    PubMed Central

    Bii, Victor M.; Trobridge, Grant D.

    2016-01-01

    Identifying novel genes that drive tumor metastasis and drug resistance has significant potential to improve patient outcomes. High-throughput sequencing approaches have identified cancer genes, but distinguishing driver genes from passengers remains challenging. Insertional mutagenesis screens using replication-incompetent retroviral vectors have emerged as a powerful tool to identify cancer genes. Unlike replicating retroviruses and transposons, replication-incompetent retroviral vectors lack additional mutagenesis events that can complicate the identification of driver mutations from passenger mutations. They can also be used for almost any human cancer due to the broad tropism of the vectors. Replication-incompetent retroviral vectors have the ability to dysregulate nearby cancer genes via several mechanisms including enhancer-mediated activation of gene promoters. The integrated provirus acts as a unique molecular tag for nearby candidate driver genes which can be rapidly identified using well established methods that utilize next generation sequencing and bioinformatics programs. Recently, retroviral vector screens have been used to efficiently identify candidate driver genes in prostate, breast, liver and pancreatic cancers. Validated driver genes can be potential therapeutic targets and biomarkers. In this review, we describe the emergence of retroviral insertional mutagenesis screens using replication-incompetent retroviral vectors as a novel tool to identify cancer driver genes in different cancer types. PMID:27792127

  11. Identifying Cancer Driver Genes Using Replication-Incompetent Retroviral Vectors.

    PubMed

    Bii, Victor M; Trobridge, Grant D

    2016-10-25

    Identifying novel genes that drive tumor metastasis and drug resistance has significant potential to improve patient outcomes. High-throughput sequencing approaches have identified cancer genes, but distinguishing driver genes from passengers remains challenging. Insertional mutagenesis screens using replication-incompetent retroviral vectors have emerged as a powerful tool to identify cancer genes. Unlike replicating retroviruses and transposons, replication-incompetent retroviral vectors lack additional mutagenesis events that can complicate the identification of driver mutations from passenger mutations. They can also be used for almost any human cancer due to the broad tropism of the vectors. Replication-incompetent retroviral vectors have the ability to dysregulate nearby cancer genes via several mechanisms including enhancer-mediated activation of gene promoters. The integrated provirus acts as a unique molecular tag for nearby candidate driver genes which can be rapidly identified using well established methods that utilize next generation sequencing and bioinformatics programs. Recently, retroviral vector screens have been used to efficiently identify candidate driver genes in prostate, breast, liver and pancreatic cancers. Validated driver genes can be potential therapeutic targets and biomarkers. In this review, we describe the emergence of retroviral insertional mutagenesis screens using replication-incompetent retroviral vectors as a novel tool to identify cancer driver genes in different cancer types.

  12. Identifying gene-environment and gene-gene interactions using a progressive penalization approach.

    PubMed

    Zhu, Ruoqing; Zhao, Hongyu; Ma, Shuangge

    2014-05-01

    In genomic studies, identifying important gene-environment and gene-gene interactions is a challenging problem. In this study, we adopt the statistical modeling approach, where interactions are represented by product terms in regression models. For the identification of important interactions, we adopt penalization, which has been used in many genomic studies. Straightforward application of penalization does not respect the "main effect, interaction" hierarchical structure. A few recently proposed methods respect this structure by applying constrained penalization. However, they demand very complicated computational algorithms and can only accommodate a small number of genomic measurements. We propose a computationally fast penalization method that can identify important gene-environment and gene-gene interactions and respect a strong hierarchical structure. The method takes a stagewise approach and progressively expands its optimization domain to account for possible hierarchical interactions. It is applicable to multiple data types and models. A coordinate descent method is utilized to produce the entire regularized solution path. Simulation study demonstrates the superior performance of the proposed method. We analyze a lung cancer prognosis study with gene expression measurements and identify important gene-environment interactions.
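
    Two ideas in this abstract can be made concrete with a short sketch: interactions represented as product terms, and the strong hierarchy constraint, under which an interaction may have a nonzero coefficient only if both of its main effects do. The code below is an illustration under those standard definitions, not the authors' progressive penalization algorithm; the variable names ("smoking", "geneX") and the "a:b" naming of interactions are assumptions made for this example.

```python
from itertools import combinations


def add_product_terms(row, names):
    """Return main effects plus all pairwise product (interaction) terms."""
    features = dict(zip(names, row))
    for a, b in combinations(names, 2):
        features[f"{a}:{b}"] = features[a] * features[b]
    return features


def respects_strong_hierarchy(coefs):
    """True if every nonzero interaction has both main effects nonzero."""
    for name, value in coefs.items():
        if ":" in name and value != 0:
            a, b = name.split(":")
            if coefs.get(a, 0) == 0 or coefs.get(b, 0) == 0:
                return False
    return True


# One hypothetical observation: an environmental exposure and a gene measurement.
print(add_product_terms([2.0, 3.0], ["smoking", "geneX"]))

# A fitted model that keeps the interaction but drops one main effect
# violates strong hierarchy.
print(respects_strong_hierarchy({"smoking": 0.5, "geneX": 0.0, "smoking:geneX": 0.2}))
```

    A hierarchy-respecting method would only return coefficient sets for which the second check holds.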

  13. Identifying gene-environment and gene-gene interactions using a progressive penalization approach

    PubMed Central

    Zhu, Ruoqing; Zhao, Hongyu; Ma, Shuangge

    2015-01-01

    In genomic studies, identifying important gene-environment and gene-gene interactions is a challenging problem. In this study, we adopt the statistical modeling approach, where interactions are represented by product terms in regression models. For the identification of important interactions, we adopt penalization, which has been used in many genomic studies. Straightforward application of penalization does not respect the “main effect, interaction” hierarchical structure. A few recently proposed methods respect this structure by applying constrained penalization. However, they demand very complicated computational algorithms and can only accommodate a small number of genomic measurements. We propose a computationally fast penalization method that can identify important gene-environment and gene-gene interactions and respect a strong hierarchical structure. The method takes a stagewise approach and progressively expands its optimization domain to account for possible hierarchical interactions. It is applicable to multiple data types and models. A coordinate descent method is utilized to produce the entire regularized solution path. Simulation study demonstrates the superior performance of the proposed method. We analyze a lung cancer prognosis study with gene expression measurements and identify important gene-environment interactions. PMID:24723356

  14. A predictive approach to identify genes differentially expressed

    NASA Astrophysics Data System (ADS)

    Saraiva, Erlandson F.; Louzada, Francisco; Milan, Luís A.; Meira, Silvana; Cobre, Juliana

    2012-10-01

    The main objective of gene expression data analysis is to identify genes that present significant changes in expression levels between a treatment and a control biological condition. In this paper, we propose a Bayesian approach that identifies differentially expressed genes by calculating credibility intervals from predictive densities. These densities are constructed using the sampled mean treatment effect from all genes in the study, excluding the treatment effect of genes previously identified with statistical evidence of a difference. We compare our Bayesian approach with the standard ones based on the t-test and modified t-tests via a simulation study, using small sample sizes, which are common in gene expression data analysis. The results indicate that the proposed approach performs better than the standard ones, especially for cases with mean differences and increases in treatment variance relative to control variance. We also apply the methodologies to a publicly available data set on Escherichia coli bacteria.

  15. Identifying potential cancer driver genes by genomic data integration

    PubMed Central

    Chen, Yong; Hao, Jingjing; Jiang, Wei; He, Tong; Zhang, Xuegong; Jiang, Tao; Jiang, Rui

    2013-01-01

    Cancer is a genomic disease associated with a plethora of gene mutations resulting in a loss of control over vital cellular functions. Among these mutated genes, driver genes are defined as being causally linked to oncogenesis, while passenger genes are thought to be irrelevant for cancer development. With increasing numbers of large-scale genomic datasets available, integrating these genomic data to identify driver genes from aberration regions of cancer genomes becomes an important goal of cancer genome analysis and investigations into mechanisms responsible for cancer development. A computational method, MAXDRIVER, is proposed here to identify potential driver genes on the basis of copy number aberration (CNA) regions of cancer genomes, by integrating publicly available human genomic data. MAXDRIVER employs several optimization strategies to construct a heterogeneous network, by means of combining a fused gene functional similarity network, gene-disease associations and a disease phenotypic similarity network. MAXDRIVER was validated to effectively recall known associations among genes and cancers. Previously identified as well as novel driver genes were detected by scanning CNAs of breast cancer, melanoma and liver carcinoma. Three predicted driver genes (CDKN2A, AKT1, RNF139) were found common in these three cancers by comparative analysis. PMID:24346768

  16. Identifying potential cancer driver genes by genomic data integration

    NASA Astrophysics Data System (ADS)

    Chen, Yong; Hao, Jingjing; Jiang, Wei; He, Tong; Zhang, Xuegong; Jiang, Tao; Jiang, Rui

    2013-12-01

    Cancer is a genomic disease associated with a plethora of gene mutations resulting in a loss of control over vital cellular functions. Among these mutated genes, driver genes are defined as being causally linked to oncogenesis, while passenger genes are thought to be irrelevant for cancer development. With increasing numbers of large-scale genomic datasets available, integrating these genomic data to identify driver genes from aberration regions of cancer genomes becomes an important goal of cancer genome analysis and investigations into mechanisms responsible for cancer development. A computational method, MAXDRIVER, is proposed here to identify potential driver genes on the basis of copy number aberration (CNA) regions of cancer genomes, by integrating publicly available human genomic data. MAXDRIVER employs several optimization strategies to construct a heterogeneous network, by means of combining a fused gene functional similarity network, gene-disease associations and a disease phenotypic similarity network. MAXDRIVER was validated to effectively recall known associations among genes and cancers. Previously identified as well as novel driver genes were detected by scanning CNAs of breast cancer, melanoma and liver carcinoma. Three predicted driver genes (CDKN2A, AKT1, RNF139) were found common in these three cancers by comparative analysis.

  17. Identifying gene expression modules that define human cell fates.

    PubMed

    Germanguz, I; Listgarten, J; Cinkornpumin, J; Solomon, A; Gaeta, X; Lowry, W E

    2016-05-01

    Using a compendium of cell-state-specific gene expression data, we identified genes that uniquely define cell states, including those thought to represent various developmental stages. Our analysis sheds light on human cell fate through the identification of core genes that are altered over several developmental milestones, and across regional specification. Here we present cell-type specific gene expression data for 17 distinct cell states and demonstrate that these modules of genes can in fact define cell fate. Lastly, we introduce a web-based database to disseminate the results.

  18. Identifying gene networks underlying the neurobiology of ethanol and alcoholism.

    PubMed

    Wolen, Aaron R; Miles, Michael F

    2012-01-01

    For complex disorders such as alcoholism, identifying the genes linked to these diseases and their specific roles is difficult. Traditional genetic approaches, such as genetic association studies (including genome-wide association studies) and analyses of quantitative trait loci (QTLs) in both humans and laboratory animals already have helped identify some candidate genes. However, because of technical obstacles, such as the small impact of any individual gene, these approaches only have limited effectiveness in identifying specific genes that contribute to complex diseases. The emerging field of systems biology, which allows for analyses of entire gene networks, may help researchers better elucidate the genetic basis of alcoholism, both in humans and in animal models. Such networks can be identified using approaches such as high-throughput molecular profiling (e.g., through microarray-based gene expression analyses) or strategies referred to as genetical genomics, such as the mapping of expression QTLs (eQTLs). Characterization of gene networks can shed light on the biological pathways underlying complex traits and provide the functional context for identifying those genes that contribute to disease development.

  19. Rice transcriptome analysis to identify possible herbicide quinclorac detoxification genes

    PubMed Central

    Xu, Wenying; Di, Chao; Zhou, Shaoxia; Liu, Jia; Li, Li; Liu, Fengxia; Yang, Xinling; Ling, Yun; Su, Zhen

    2015-01-01

    Quinclorac is a highly selective auxin-type herbicide and is widely used in the effective control of barnyard grass in paddy rice fields, improving the world's rice yield. The herbicide mode of action of quinclorac has been proposed, and hormone interactions affecting quinclorac signaling have been identified. Because of widespread use, quinclorac may be transported outside rice fields with the drainage waters, leading to soil and water pollution and other environmental health problems. In this study, we used the 57K Affymetrix rice whole-genome array to identify quinclorac signaling response genes to study the molecular mechanisms of action and detoxification of quinclorac in rice plants. Overall, 637 probe sets were identified with differential expression levels under either 6 or 24 h of quinclorac treatment. Auxin-related genes such as GH3 and OsIAAs responded to quinclorac treatment. Gene Ontology analysis showed that detoxification-related gene families were significantly enriched, including cytochrome P450, GST, UGT, and ABC and drug transporter genes. Moreover, real-time RT-PCR analysis showed that top candidate genes of P450 families such as CYP81, CYP709C, and CYP72A were universally induced by different herbicides. Some Arabidopsis genes of the same P450 family were up-regulated under quinclorac treatment. We conducted rice whole-genome GeneChip analysis and the first global identification of quinclorac response genes. This work may provide potential markers for detoxification of quinclorac and biomonitors of environmental chemical pollution. PMID:26483837

  20. GENE EXPRESSION PROFILING TO IDENTIFY BIOMARKERS OF REPRODUCTIVE TOXICITY

    EPA Science Inventory

    SOT 2005 SESSION ABSTRACT

    GENE EXPRESSION PROFILING TO IDENTIFY BIOMARKERS OF REPRODUCTIVE TOXICITY

    David J. Dix. National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency, Research Triangle...

  2. Identifying a gene expression signature of cluster headache in blood

    PubMed Central

    Eising, Else; Pelzer, Nadine; Vijfhuizen, Lisanne S.; Vries, Boukje de; Ferrari, Michel D.; 't Hoen, Peter A. C.; Terwindt, Gisela M.; van den Maagdenberg, Arn M. J. M.

    2017-01-01

    Cluster headache is a relatively rare headache disorder, typically characterized by multiple daily, short-lasting attacks of excruciating, unilateral (peri-)orbital or temporal pain associated with autonomic symptoms and restlessness. To better understand the pathophysiology of cluster headache, we used RNA sequencing to identify differentially expressed genes and pathways in whole blood of patients with episodic (n = 19) or chronic (n = 20) cluster headache in comparison with headache-free controls (n = 20). Gene expression data were analysed by gene and by module of co-expressed genes with particular attention to previously implicated disease pathways including hypocretin dysregulation. Only moderate gene expression differences were identified and no associations were found with previously reported pathogenic mechanisms. At the level of functional gene sets, associations were observed for genes involved in several brain-related mechanisms such as GABA receptor function and voltage-gated channels. In addition, genes and modules of co-expressed genes showed a role for intracellular signalling cascades, mitochondria and inflammation. Although larger study samples may be required to identify the full range of involved pathways, these results indicate a role for mitochondria, intracellular signalling and inflammation in cluster headache. PMID:28074859

  3. Identifying genes preferentially expressed in undifferentiated embryonic stem cells

    PubMed Central

    Li, Xiajun; Leder, Philip

    2007-01-01

    Background: The mechanism involved in the maintenance and differentiation of embryonic stem (ES) cells is incompletely understood. Results: To address this issue, we have developed a retroviral gene trap vector that can target genes expressed in undifferentiated ES cells. This gene trap vector harbors both GFP and Neo reporter genes. G-418 drug resistance was used to select ES clones in which the vector was integrated into transcriptionally active loci. This was then followed by GFP FACS profiling to identify ES clones with reduced GFP fluorescence and, hence, reduced transcriptional activity when ES cells differentiate. Reduced expression of the GFP reporter in six of three hundred ES clones in our pilot screening was confirmed to be down-regulated by Northern blot analysis during ES cell differentiation. These six ES clones represent four different genes. Among the six integration sites, one was at Zfp-57 whose gene product is known to be enriched in undifferentiated ES cells. Three were located in an intron of a novel isoform of CSL/RBP-Jkappa which encodes the key transcription factor of the LIN-12/Notch pathway. Another was inside a gene that may encode noncoding RNA transcripts. The last integration event occurred at a locus that may harbor a novel gene. Conclusion: Taken together, we demonstrate the use of a novel retroviral gene trap vector in identifying genes preferentially expressed in undifferentiated ES cells. PMID:17725840

  4. A Gene Recommender Algorithm to Identify Coexpressed Genes in C. elegans

    PubMed Central

    Owen, Art B.; Stuart, Josh; Mach, Kathy; Villeneuve, Anne M.; Kim, Stuart

    2003-01-01

    One of the most important uses of whole-genome expression data is for the discovery of new genes with similar function to a given list of genes (the query) already known to have closely related function. We have developed an algorithm, called the gene recommender, that ranks genes according to how strongly they correlate with a set of query genes in those experiments for which the query genes are most strongly coregulated. We used the gene recommender to find other genes coexpressed with several sets of query genes, including genes known to function in the retinoblastoma complex. Genetic experiments confirmed that one gene (JC8.6) identified by the gene recommender acts with lin-35 Rb to regulate vulval cell fates, and that another gene (wrm-1) acts antagonistically. We find that the gene recommender returns lists of genes with better precision, for fixed levels of recall, than lists generated using the C. elegans expression topomap. PMID:12902378
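
    The ranking idea described above (score genes by correlation with the query genes, using only the experiments in which the query genes move together) can be sketched in a few lines. This is a deliberately simplified illustration, not the published gene recommender scoring: the coherence measure, the fixed number of experiments, the function names, and the toy data are all assumptions made for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def recommend(expr, query, n_experiments):
    """Rank non-query genes by correlation with the query-gene centroid.

    `expr` maps gene -> list of expression values, one per experiment.
    Only the `n_experiments` experiments where the query genes agree most
    tightly (large |mean| relative to spread) are used.
    """
    n = len(next(iter(expr.values())))

    def coherence(j):
        vals = [expr[g][j] for g in query]
        m = sum(vals) / len(vals)
        spread = sqrt(sum((v - m) ** 2 for v in vals) / len(vals))
        return abs(m) / (spread + 1e-9)

    chosen = sorted(range(n), key=coherence, reverse=True)[:n_experiments]
    centroid = [sum(expr[g][j] for g in query) / len(query) for j in chosen]
    scores = {
        g: pearson([v[j] for j in chosen], centroid)
        for g, v in expr.items()
        if g not in query
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: q1 and q2 disagree in the third experiment, so it is ignored.
expr = {
    "q1": [2.0, -2.0, 0.1, 1.0],
    "q2": [2.2, -1.8, -3.0, 1.2],
    "a": [1.9, -2.1, 5.0, 0.9],
    "b": [-2.0, 2.0, 0.0, -1.0],
}
print(recommend(expr, ["q1", "q2"], n_experiments=3))
```

    In this toy case, gene "a" ranks above gene "b" because it tracks the query genes in the experiments where they are coregulated, even though it diverges in the experiment that is excluded.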

  5. ENU mutagenesis in mice identifies candidate genes for hypogonadism.

    PubMed

    Weiss, Jeffrey; Hurley, Lisa A; Harris, Rebecca M; Finlayson, Courtney; Tong, Minghan; Fisher, Lisa A; Moran, Jennifer L; Beier, David R; Mason, Christopher; Jameson, J Larry

    2012-06-01

    Genome-wide mutagenesis was performed in mice to identify candidate genes for male infertility, for which the predominant causes remain idiopathic. Mice were mutagenized using N-ethyl-N-nitrosourea (ENU), bred, and screened for phenotypes associated with the male urogenital system. Fifteen heritable lines were isolated and chromosomal loci were assigned using low-density genome-wide SNP arrays. Ten of the 15 lines were pursued further using higher-resolution SNP analysis to narrow the candidate gene regions. Exon sequencing of candidate genes identified mutations in mice with cystic kidneys (Bicc1), cryptorchidism (Rxfp2), restricted germ cell deficiency (Plk4), and severe germ cell deficiency (Prdm9). In two other lines with severe hypogonadism, candidate sequencing failed to identify mutations, suggesting defects in genes with previously undocumented roles in gonadal function. These genomic intervals were sequenced in their entirety and a candidate mutation was identified in SnrpE in one of the two lines. The line harboring the SnrpE variant retains substantial spermatogenesis despite small testis size, an unusual phenotype. In addition to the reproductive defects, heritable phenotypes were observed in mice with ataxia (Myo5a), tremors (Pmp22), growth retardation (unknown gene), and hydrocephalus (unknown gene). These results demonstrate that the ENU screen is an effective tool for identifying potential causes of male infertility.

  6. Analysis of gene expression profile identifies potential biomarkers for atherosclerosis

    PubMed Central

    Liu, Luran; Liu, Yan; Liu, Chang; Zhang, Zhuobo; Du, Yaojun; Zhao, Hao

    2016-01-01

    The present study aimed to identify potential biomarkers for atherosclerosis via analysis of gene expression profiles. The microarray dataset no. GSE20129 was downloaded from the Gene Expression Omnibus database. A total of 118 samples from the peripheral blood of female patients was used, including 47 atherosclerotic and 71 non-atherosclerotic patients. The differentially expressed genes (DEGs) in the atherosclerosis samples were identified using the Limma package. Gene ontology term and Kyoto Encyclopedia of Genes and Genomes pathway analyses for DEGs were performed using the Database for Annotation, Visualization and Integrated Discovery tool. The recursive feature elimination (RFE) algorithm was applied for feature selection via iterative classification, and support vector machine classifier was used for the validation of prediction accuracy. A total of 430 DEGs in the atherosclerosis samples were identified, including 149 up- and 281 downregulated genes. Subsequently, the RFE algorithm was used to identify 11 biomarkers, whose receiver operating characteristic curves had an area under curve of 0.92, indicating that the identified 11 biomarkers were representative. The present study indicated that APH1B, JAM3, FBLN2, CSAD and PSTPIP2 may have important roles in the progression of atherosclerosis in females and may be potential biomarkers for early diagnosis and prognosis as well as treatment targets for this disease. PMID:27573188
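
    The area under the ROC curve quoted in this abstract can be computed without plotting the curve. The sketch below uses the rank-based (Mann-Whitney) formulation; the function name `roc_auc` and the toy scores are assumptions, and the code does not reproduce the RFE or support vector machine steps of the study.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive sample outscores a negative one.

    scores: classifier outputs (higher means more likely positive).
    labels: 1 for positive (e.g. atherosclerotic), 0 for negative.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Invented scores: one positive sample (0.4) is ranked below one negative (0.8).
auc = roc_auc([0.9, 0.4, 0.8, 0.3], [1, 1, 0, 0])
```

    An AUC of 1.0 means every positive sample outranks every negative one, and 0.5 corresponds to random ordering.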

  7. Inferring Gene Family Histories in Yeast Identifies Lineage Specific Expansions

    PubMed Central

    Ames, Ryan M.; Money, Daniel; Lovell, Simon C.

    2014-01-01

    The complement of genes found in the genome is a balance between gene gain and gene loss. Knowledge of the specific genes that are gained and lost over evolutionary time allows an understanding of the evolution of biological functions. Here we use new evolutionary models to infer gene family histories across complete yeast genomes; these models allow us to estimate the relative genome-wide rates of gene birth, death, innovation and extinction (loss of an entire family) for the first time. We show that the rates of gene family evolution vary both between gene families and between species. We are also able to identify those families that have experienced rapid lineage specific expansion/contraction and show that these families are enriched for specific functions. Moreover, we find that families with specific functions are repeatedly expanded in multiple species, suggesting the presence of common adaptations and that these family expansions/contractions are not random. Additionally, we identify potential specialisations, unique to specific species, in the functions of lineage specific expanded families. These results suggest that an important mechanism in the evolution of genome content is the presence of lineage-specific gene family changes. PMID:24921666

  8. Functional epigenetic approach identifies frequently methylated genes in Ewing sarcoma.

    PubMed

    Alholle, Abdullah; Brini, Anna T; Gharanei, Seley; Vaiyapuri, Sumathi; Arrigoni, Elena; Dallol, Ashraf; Gentle, Dean; Kishida, Takeshi; Hiruma, Toru; Avigad, Smadar; Grimer, Robert; Maher, Eamonn R; Latif, Farida

    2013-11-01

    Using a candidate gene approach we recently identified frequent methylation of the RASSF2 gene associated with poor overall survival in Ewing sarcoma (ES). To identify effective biomarkers in ES on a genome-wide scale, we used a functionally proven epigenetic approach, in which gene expression was induced in ES cell lines by treatment with a demethylating agent followed by hybridization onto high density gene expression microarrays. After applying strict selection criteria, 34 genes were selected for expression and methylation analysis in ES cell lines and primary ES. Eight genes (CTHRC1, DNAJA4, ECHDC2, NEFH, NPTX2, PHF11, RARRES2, TSGA14) showed methylation frequencies of >20% in ES tumors (range 24-71%); these genes were expressed in human bone marrow derived mesenchymal stem cells (hBMSC), and hypermethylation was associated with transcriptional silencing. Methylation of NPTX2 or PHF11 was associated with poorer prognosis in ES. In addition, six of the above genes also showed methylation frequencies of >20% (range 36-50%) in osteosarcomas. Identification of these genes may provide insights into bone cancer tumorigenesis and development of epigenetic biomarkers for prognosis and detection of these rare tumor types.

  9. Identifying genes related with rheumatoid arthritis via system biology analysis.

    PubMed

    Liu, Tao; Lin, Xinmei; Yu, Hongjian

    2015-10-15

    Rheumatoid arthritis (RA) is a chronic, inflammatory joint disease that mainly attacks synovial joints. However, the underlying systematic relationships among the different genes and biological processes involved in its pathogenesis are still unclear. By analyzing and comparing the transcriptional profiles of RA and OA (osteoarthritis) patients as well as ND (normal donors) with bioinformatics methods, we aimed to uncover the potential molecular networks and critical genes that play important roles in RA and OA development. Initially, hierarchical clustering was performed to classify the overall transcriptional profiles. Differentially expressed genes (DEGs) between ND and RA and OA patients were identified. Furthermore, PPI networks were constructed, functional modules were extracted, and functional annotation was applied. Our functional analysis identified 22 biological processes and 2 KEGG pathways enriched in the commonly regulated gene set. However, we found that the number of genes differentially expressed only between RA and ND reaches 244, indicating that this gene set may specifically account for progression to RA. Additionally, 142 biological processes and 19 KEGG pathways are over-represented among these 244 genes. Meanwhile, although another 21 genes were differentially expressed only between OA and ND, no biological process or pathway is over-represented among them.

  10. Multiple differential expression networks identify key genes in rectal cancer.

    PubMed

    Li, Ri-Heng; Zhang, Ai-Min; Li, Shuang; Li, Tian-Yang; Wang, Lian-Jing; Zhang, Hao-Ran; Li, Ping; Jia, Xiong-Jie; Zhang, Tao; Peng, Xin-Yu; Liu, Min-Di; Wang, Xu; Lang, Yan; Xue, Wei-Lan; Liu, Jing; Wang, Yan-Yan

    2016-01-01

    Rectal cancer is an important contributor to cancer mortality. The objective of this paper is to identify key genes across three phenotypes (fungating, polypoid and polypoid & small-ulcer) of rectal cancer based on multiple differential expression networks (DENs). Differential interactions and non-differential interactions were evaluated according to the Spearman correlation coefficient (SCC) algorithm, and were selected to construct DENs. Topological analysis was performed to explore hub genes in the largest components of the DENs. Key genes were defined as the intersections between nodes of the DENs and rectal cancer associated genes from Genecards. Finally, we used hub genes to classify phenotypes of rectal cancer with a support vector machine (SVM) methodology. We obtained 19 hub genes and a total of 12 common key genes across the three largest components of the DENs, and EGFR was the common element. The SVM results revealed that hub genes could classify phenotypes, and validated the feasibility of DEN methods. We have successfully identified significant genes (such as EGFR and UBC) across the fungating, polypoid and polypoid & small-ulcer phenotypes of rectal cancer. They might be potential biomarkers for classification, detection and therapy of this cancer.
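
    How a Spearman correlation coefficient can flag a differential interaction between two conditions is sketched below. The abstract does not give the selection rule, so the absolute-difference threshold, the function names, and the gene labels `g1` to `g3` are assumptions for illustration only.

```python
from math import sqrt

def ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def differential_edges(cond_a, cond_b, threshold):
    """Gene pairs whose correlation changes by at least `threshold` (assumed rule)."""
    genes = sorted(cond_a)
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            delta = abs(spearman(cond_a[g], cond_a[h]) - spearman(cond_b[g], cond_b[h]))
            if delta >= threshold:
                edges.append((g, h, delta))
    return edges

# Invented expression values for two conditions; g2 reverses direction in cond_b.
cond_a = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 3, 2, 1]}
cond_b = {"g1": [1, 2, 3, 4], "g2": [8, 6, 4, 2], "g3": [4, 3, 2, 1]}
edges = differential_edges(cond_a, cond_b, threshold=1.0)
```

    Pairs whose correlation is stable across conditions, such as g1 and g3 here, are left out of the differential edge list.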

  11. Identifying Causal Genes and Dysregulated Pathways in Complex Diseases

    PubMed Central

    Kim, Yoo-Ah; Wuchty, Stefan; Przytycka, Teresa M.

    2011-01-01

    In complex diseases, various combinations of genomic perturbations often lead to the same phenotype. On a molecular level, combinations of genomic perturbations are assumed to dys-regulate the same cellular pathways. Such a pathway-centric perspective is fundamental to understanding the mechanisms of complex diseases and the identification of potential drug targets. In order to provide an integrated perspective on complex disease mechanisms, we developed a novel computational method to simultaneously identify causal genes and dys-regulated pathways. First, we identified a representative set of genes that are differentially expressed in cancer compared to non-tumor control cases. Assuming that disease-associated gene expression changes are caused by genomic alterations, we determined potential paths from such genomic causes to target genes through a network of molecular interactions. Applying our method to sets of genomic alterations and gene expression profiles of 158 Glioblastoma multiforme (GBM) patients we uncovered candidate causal genes and causal paths that are potentially responsible for the altered expression of disease genes. We discovered a set of putative causal genes that potentially play a role in the disease. Combining an expression Quantitative Trait Loci (eQTL) analysis with pathway information, our approach allowed us not only to identify potential causal genes but also to find intermediate nodes and pathways mediating the information flow between causal and target genes. Our results indicate that different genomic perturbations indeed dys-regulate the same functional pathways, supporting a pathway-centric perspective of cancer. While copy number alterations and gene expression data of glioblastoma patients provided opportunities to test our approach, our method can be applied to any disease system where genetic variations play a fundamental causal role. PMID:21390271

  12. Integrative Genomics Identifies Gene Signature Associated with Melanoma Ulceration

    PubMed Central

    Toth, Reka; Vizkeleti, Laura; Hernandez-Vargas, Hector; Lazar, Viktoria; Emri, Gabriella; Szatmari, Istvan; Herceg, Zdenko; Adany, Roza; Balazs, Margit

    2013-01-01

    Background Despite the extensive research approaches applied to characterise malignant melanoma, no specific molecular markers are available that are clearly related to the progression of this disease. In this study, our aims were to define a gene expression signature associated with the clinical outcome of melanoma patients and to provide an integrative interpretation of the gene expression, copy number alteration, and promoter methylation patterns that contribute to clinically relevant molecular functional alterations. Methods Gene expression profiles were determined using the Affymetrix U133 Plus2.0 array. The NimbleGen Human CGH Whole-Genome Tiling array was used to define CNAs, and the Illumina GoldenGate Methylation platform was applied to characterise the methylation patterns of overlapping genes. Results We identified two subclasses of primary melanoma: one representing patients with better prognoses and the other being characteristic of patients with unfavourable outcomes. We assigned 1,080 genes as being significantly correlated with ulceration; 987 genes were downregulated and significantly enriched in the p53, Nf-kappaB, and WNT/beta-catenin pathways. Through integrated genome analysis, we defined 150 downregulated genes whose expression correlated with copy number losses in ulcerated samples. These genes were significantly enriched on chromosome 6q and 10q, which contained a total of 36 genes. Ten of these genes were downregulated and involved in cell-cell and cell-matrix adhesion or apoptosis. The expression and methylation patterns of additional genes exhibited an inverse correlation, suggesting that transcriptional silencing of these genes is driven by epigenetic events. Conclusion Using an integrative genomic approach, we were able to identify functionally relevant molecular hotspots characterised by copy number losses and promoter hypermethylation in distinct molecular subtypes of melanoma that contribute to specific transcriptomic silencing

  13. Identifying new human oocyte marker genes: a microarray approach

    PubMed Central

    Gasca, Stephan; Pellestor, Franck; Assou, Said; Loup, Vanessa; Anahory, Tal; Dechaud, Hervé; De Vos, John; Hamamah, Samir

    2007-01-01

    Efficiency in classical IVF (cIVF) techniques is still impaired by poor implantation and pregnancy rates after embryo transfer. This is mostly due to a lack of reliable criteria for the selection of embryos with sufficient development potential. Several studies have provided evidence that the expression levels of some genes could be used as objective markers of oocyte and embryo competence and of their capacity to sustain a successful pregnancy. These analyses usually used reverse transcription-polymerase chain reaction to look at small sets of pre-selected genes. However, microarray approaches make it possible to identify a wider range of cellular marker genes. Thus they allow the identification of additional and perhaps better suited genes that could serve as embryo selection markers. Microarray screenings of circa 30,000 genes on U133P Affymetrix gene chips made it possible to establish the expression profile of these genes as well as other related genes in human oocytes and cumulus cells. In this study, we identified new potential regulators and marker genes such as BARD1, RBL2, RBBP7, BUB3 or BUB1B, which are involved in oocyte maturation. PMID:17298719

  14. Sleeping Beauty mouse models identify candidate genes involved in gliomagenesis.

    PubMed

    Vyazunova, Irina; Maklakova, Vilena I; Berman, Samuel; De, Ishani; Steffen, Megan D; Hong, Won; Lincoln, Hayley; Morrissy, A Sorana; Taylor, Michael D; Akagi, Keiko; Brennan, Cameron W; Rodriguez, Fausto J; Collier, Lara S

    2014-01-01

    Genomic studies of human high-grade gliomas have discovered known and candidate tumor drivers. Studies in both cell culture and mouse models have complemented these approaches and have identified additional genes and processes important for gliomagenesis. Previously, we found that mobilization of Sleeping Beauty transposons in mice ubiquitously throughout the body from the Rosa26 locus led to gliomagenesis with low penetrance. Here we report the characterization of mice in which transposons are mobilized in the Glial Fibrillary Acidic Protein (GFAP) compartment. Glioma formation in these mice did not occur on an otherwise wild-type genetic background, but rare gliomas were observed when mobilization occurred in a p19Arf heterozygous background. Through cloning insertions from additional gliomas generated by transposon mobilization in the Rosa26 compartment, several candidate glioma genes were identified. Comparisons to genetic, epigenetic and mRNA expression data from human gliomas implicate several of these genes as tumor suppressor genes and oncogenes in human glioblastoma.

  15. Identifying new human oocyte marker genes: a microarray approach.

    PubMed

    Gasca, Stéphan; Pellestor, Franck; Assou, Saïd; Loup, Vanessa; Anahory, Tal; Dechaud, Hervé; De Vos, John; Hamamah, Samir

    2007-02-01

    The efficacy of classical IVF techniques is still impaired by poor implantation and pregnancy rates after embryo transfer. This is mainly due to a lack of reliable criteria for the selection of embryos with sufficient development potential. Several studies have provided evidence that some gene expression levels could be used as objective markers of oocyte and embryo competence and capacity to sustain a successful pregnancy. These analyses usually use reverse transcription-polymerase chain reaction to look at small sets of pre-selected genes. However, microarray approaches allow the identification of a wider range of cellular marker genes which could include additional and perhaps more suitable genes that could serve as embryo selection markers. Microarray screenings of around 30,000 genes on U133P Affymetrix gene chips made it possible to establish the expression profile of these genes as well as other related genes in human oocytes and cumulus cells. This study identifies new potential regulators and marker genes such as BARD1, RBL2, RBBP7, BUB3 or BUB1B, which are involved in oocyte maturation.

  16. Identifying gene regulatory network rewiring using latent differential graphical models

    PubMed Central

    Tian, Dechao; Gu, Quanquan; Ma, Jian

    2016-01-01

    Gene regulatory networks (GRNs) are highly dynamic among different tissue types. Identifying tissue-specific gene regulation is critically important to understand gene function in a particular cellular context. Graphical models have been used to estimate GRNs from gene expression data to distinguish direct interactions from indirect associations. However, most existing methods estimate the GRN for a specific cell/tissue type or in a tissue-naive way, or do not specifically focus on network rewiring between different tissues. Here, we describe a new method called Latent Differential Graphical Model (LDGM). The motivation of our method is to estimate the differential network between two tissue types directly without inferring the network for individual tissues, which has the advantage of requiring a much smaller sample size to achieve reliable differential network estimation. Our simulation results demonstrated that LDGM consistently outperforms other Gaussian graphical model based methods. We further evaluated LDGM by applying it to the brain and blood gene expression data from the GTEx consortium. We also applied LDGM to identify network rewiring between cancer subtypes using the TCGA breast cancer samples. Our results suggest that LDGM is an effective method to infer differential networks from high-throughput gene expression data and to identify GRN dynamics among different cellular conditions. PMID:27378774

  17. The genetics of alcoholism: identifying specific genes through family studies.

    PubMed

    Edenberg, Howard J; Foroud, Tatiana

    2006-09-01

    Alcoholism is a complex disorder with both genetic and environmental risk factors. Studies in humans have begun to elucidate the genetic underpinnings of the risk for alcoholism. Here we briefly review strategies for identifying individual genes in which variations affect the risk for alcoholism and related phenotypes, in the context of one large study that has successfully identified such genes. The Collaborative Study on the Genetics of Alcoholism (COGA) is a family-based study that has collected detailed phenotypic data on individuals in families with multiple alcoholic members. A genome-wide linkage approach led to the identification of chromosomal regions containing genes that influenced alcoholism risk and related phenotypes. Subsequently, single nucleotide polymorphisms (SNPs) were genotyped in positional candidate genes located within the linked chromosomal regions, and analyzed for association with these phenotypes. Using this sequential approach, COGA has detected association with GABRA2, CHRM2 and ADH4; these associations have all been replicated by other researchers. COGA has detected association to additional genes including GABRG3, TAS2R16, SNCA, OPRK1 and PDYN, results that are awaiting confirmation. These successes demonstrate that genes contributing to the risk for alcoholism can be reliably identified using human subjects.

  18. Automatically identifying gene/protein terms in MEDLINE abstracts.

    PubMed

    Yu, Hong; Hatzivassiloglou, Vasileios; Rzhetsky, Andrey; Wilbur, W John

    2002-01-01

    Natural language processing (NLP) techniques are used to extract information automatically from computer-readable literature. In biology, the identification of terms corresponding to biological substances (e.g., genes and proteins) is a necessary step that precedes the application of other NLP systems that extract biological information (e.g., protein-protein interactions, gene regulation events, and biochemical pathways). We have developed GPmarkup (for "gene/protein-full name mark up"), a software system that automatically identifies gene/protein terms (i.e., symbols or full names) in MEDLINE abstracts. As part of the mark-up process, we also automatically generated a knowledge source of paired gene/protein symbols and full names (e.g., LARD for lymphocyte associated receptor of death) from MEDLINE. We found that many of the pairs in our knowledge source do not appear in the current GenBank database. Therefore our methods may also be used for automatic lexicon generation. GPmarkup has 73% recall and 93% precision in identifying and marking up gene/protein terms in MEDLINE abstracts. A random sample of gene/protein symbols and full names and a sample set of marked up abstracts can be viewed at http://www.cpmc.columbia.edu/homepages/yuh9001/GPmarkup/. Contact: hy52@columbia.edu. Voice: 212-939-7028; fax: 212-666-0140.

  19. Gene Signature in Sessile Serrated Polyps Identifies Colon Cancer Subtype

    PubMed Central

    Kanth, Priyanka; Bronner, Mary P.; Boucher, Kenneth M.; Burt, Randall W.; Neklason, Deborah W.; Hagedorn, Curt H.; Delker, Don A.

    2016-01-01

    Sessile serrated colon adenoma/polyps (SSA/Ps) are found during routine screening colonoscopy and may account for 20–30% of colon cancers. However, differentiating SSA/Ps from hyperplastic polyps (HP) with little risk of cancer is challenging and complementary molecular markers are needed. Additionally, the molecular mechanisms of colon cancer development from SSA/Ps are poorly understood. RNA sequencing was performed on 21 SSA/Ps, 10 HPs, 10 adenomas, 21 uninvolved colon and 20 control colon specimens. Differential expression and leave-one-out cross validation methods were used to define a unique gene signature of SSA/Ps. Our SSA/P gene signature was evaluated in colon cancer RNA-Seq data from The Cancer Genome Atlas (TCGA) to identify a subtype of colon cancers that may develop from SSA/Ps. A total of 1422 differentially expressed genes were found in SSA/Ps relative to controls. Serrated polyposis syndrome (n=12) and sporadic SSA/Ps (n=9) exhibited almost complete (96%) gene overlap. A 51-gene panel in SSA/P showed similar expression in a subset of TCGA colon cancers with high microsatellite instability (MSI-H). A smaller seven-gene panel showed high sensitivity and specificity in identifying BRAF mutant, CpG island methylator phenotype high (CIMP-H) and MLH1 silenced colon cancers. We describe a unique gene signature in SSA/Ps that identifies a subset of colon cancers likely to develop through the serrated pathway. These gene panels may be utilized for improved differentiation of SSA/Ps from HPs and provide insights into novel molecular pathways altered in colon cancer arising from the serrated pathway. PMID:27026680

  20. GESearch: An Interactive GUI Tool for Identifying Gene Expression Signature.

    PubMed

    Ye, Ning; Yin, Hengfu; Liu, Jingjing; Dai, Xiaogang; Yin, Tongming

    2015-01-01

    The huge amount of gene expression data generated by microarray and next-generation sequencing technologies makes it challenging to extract their biological meaning. When searching for coexpressed genes, the data mining process is largely affected by the selection of algorithms. Thus, it is highly desirable to provide multiple algorithm options in a user-friendly analytical toolkit for exploring gene expression signatures. For this purpose, we developed GESearch, an interactive graphical user interface (GUI) toolkit, which is written in MATLAB and supports a variety of gene expression data files. This analytical toolkit provides four models, including the mean, the regression, the delegate, and the ensemble models, to identify coexpressed genes, and enables users to filter data and to select gene expression patterns by browsing the display window or by importing knowledge-based genes. Subsequently, the utility of this analytical toolkit is demonstrated by analyzing two real-life microarray datasets from cell-cycle experiments. Overall, we have developed an interactive GUI toolkit that allows users to choose among multiple algorithms for analyzing gene expression signatures.

  1. Machine Learning-Based Gene Prioritization Identifies Novel Candidate Risk Genes for Inflammatory Bowel Disease.

    PubMed

    Isakov, Ofer; Dotan, Iris; Ben-Shachar, Shay

    2017-09-01

    The inflammatory bowel diseases (IBDs) are chronic inflammatory disorders, associated with genetic, immunologic, and environmental factors. Although hundreds of genes are implicated in IBD etiology, it is likely that additional genes play a role in the disease process. We developed a machine learning-based gene prioritization method to identify novel IBD-risk genes. Known IBD genes were collected from genome-wide association studies and annotated with expression and pathway information. Using these genes, a model was trained to identify IBD-risk genes. A comprehensive list of 16,390 genes was then scored and classified. Immune and inflammatory responses, as well as pathways such as cell adhesion, cytokine-cytokine receptor interaction, and sulfur metabolism were identified to be related to IBD. Scores predicted for IBD genes were significantly higher than those for non-IBD genes (P < 10). There was a significant association between the score and having an IBD publication (P < 10). Overall, 347 genes had a high prediction score (>0.8). A literature review of the genes, excluding those used to train the model, identified 67 genes without any publication concerning IBD. These genes represent novel candidate IBD-risk genes, which can be targeted in future studies. Our method successfully differentiated IBD-risk genes from non-IBD genes by using information from expression data and a multitude of gene annotations. Crucial features were defined, and we were able to detect novel candidate risk genes for IBD. These findings may help detect new IBD-risk genes and improve the understanding of IBD pathogenesis.

  2. Identifying Mendelian disease genes with the Variant Effect Scoring Tool

    PubMed Central

    2013-01-01

    Background Whole exome sequencing studies identify hundreds to thousands of rare protein coding variants of ambiguous significance for human health. Computational tools are needed to accelerate the identification of specific variants and genes that contribute to human disease. Results We have developed the Variant Effect Scoring Tool (VEST), a supervised machine learning-based classifier, to prioritize rare missense variants with likely involvement in human disease. The VEST classifier training set comprised ~45,000 disease mutations from the latest Human Gene Mutation Database release and another ~45,000 high frequency (allele frequency >1%) putatively neutral missense variants from the Exome Sequencing Project. VEST outperforms some of the most popular methods for prioritizing missense variants in carefully designed holdout benchmarking experiments (VEST ROC AUC = 0.91, PolyPhen2 ROC AUC = 0.86, SIFT4.0 ROC AUC = 0.84). VEST estimates variant score p-values against a null distribution of VEST scores for neutral variants not included in the VEST training set. These p-values can be aggregated at the gene level across multiple disease exomes to rank genes for probable disease involvement. We tested the ability of an aggregate VEST gene score to identify candidate Mendelian disease genes, based on whole-exome sequencing of a small number of disease cases. We used whole-exome data for two Mendelian disorders for which the causal gene is known. Considering only genes that contained variants in all cases, the VEST gene score ranked dihydroorotate dehydrogenase (DHODH) number 2 of 2253 genes in four cases of Miller syndrome, and myosin-3 (MYH3) number 2 of 2313 genes in three cases of Freeman Sheldon syndrome. Conclusions Our results demonstrate the potential power gain of aggregating bioinformatics variant scores into gene-level scores and the general utility of bioinformatics in assisting the search for disease genes in large-scale exome sequencing studies. VEST is
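
    The two steps described in this abstract, a p-value for a variant score against a null distribution and a gene-level aggregate of several p-values, can be sketched as follows. The abstract does not say how VEST aggregates p-values; Fisher's combined probability test is used here purely as a stand-in, and the function names and toy numbers are assumptions.

```python
from math import exp, log

def empirical_p(score, null_scores):
    """Fraction of null scores at least as large as `score`.

    The +1 in numerator and denominator keeps the estimate above zero.
    """
    at_least = sum(1 for s in null_scores if s >= score)
    return (at_least + 1) / (len(null_scores) + 1)

def fisher_combine(pvalues):
    """Combine independent p-values with Fisher's method (stand-in aggregator).

    The statistic X = -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom under the null; for even degrees of freedom its
    upper tail is exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(pvalues)
    half = -sum(log(p) for p in pvalues)  # equals X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return exp(-half) * total

# Invented numbers: one variant score against four null scores, then two
# per-case p-values for the same hypothetical gene.
p_variant = empirical_p(0.9, [0.1, 0.2, 0.3, 0.95])
p_gene = fisher_combine([0.05, 0.05])
```

    Fisher's method assumes the combined p-values are independent, which may not hold for variants observed in related cases.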

  3. Using RNA interference to identify genes required for RNA interference

    PubMed Central

    Dudley, Nathaniel R.; Labbé, Jean-Claude; Goldstein, Bob

    2002-01-01

    RNA interference (RNAi) is a phenomenon in which double-stranded RNA (dsRNA) silences endogenous gene expression. By injecting pools of dsRNAs into Caenorhabditis elegans, we identified a dsRNA that acts as a potent suppressor of the RNAi mechanism. We have used coinjection of dsRNAs to identify four additional candidates for genes involved in the RNAi mechanism in C. elegans. Three of the genes are C. elegans mes genes, some of which encode homologs of the Drosophila chromatin-binding Polycomb-group proteins. We have used loss-of-function mutants to confirm a role for mes-3, -4, and -6 in RNAi. Interestingly, introducing very low levels of dsRNA can bypass a requirement for these genes in RNAi. The finding that genes predicted to encode proteins that associate with chromatin are involved in RNAi in C. elegans raises the possibility that chromatin may play a role in RNAi in animals, as it does in plants. PMID:11904378

  4. Candidate olfaction genes identified within the Helicoverpa armigera Antennal Transcriptome.

    PubMed

    Liu, Yang; Gu, Shaohua; Zhang, Yongjun; Guo, Yuyuan; Wang, Guirong

    2012-01-01

    Antennal olfaction is extremely important for insect survival, mediating key behaviors such as host preference, mate choice, and oviposition site selection. Multiple antennal proteins are involved in olfactory signal transduction pathways. Of these, odorant receptors (ORs) and ionotropic receptors (IRs) confer specificity on olfactory sensory neuron responses. In this study, we identified the olfactory gene repertoire of the economically important agricultural pest moth, Helicoverpa armigera, by assembling the adult male and female antennal transcriptomes. Within the male and female antennal transcriptomes we identified a total of 47 OR candidate genes, including 6 pheromone receptor candidates. Additionally, 12 IR genes as well as 26 odorant-binding proteins and 12 chemosensory proteins were annotated. Our results allow a systematic functional analysis across much of the conventional OR repertoire and the newly reported IRs that mediate the key olfaction-driven behaviors of H. armigera.

  5. GENE EXPRESSION PROFILING TO IDENTIFY MECHANISMS OF MALE REPRODUCTIVE TOXICITY

    EPA Science Inventory

    Gene Expression Profiling to Identify Mechanisms of Male Reproductive Toxicity
    David J. Dix
    National Health and Environmental Effects Research Laboratory, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA.
    Ab...

  7. A genomics approach identifies senescence-specific gene expression regulation.

    PubMed

    Lackner, Daniel H; Hayashi, Makoto T; Cesare, Anthony J; Karlseder, Jan

    2014-10-01

    Replicative senescence is a fundamental tumor-suppressive mechanism triggered by telomere erosion that results in a permanent cell cycle arrest. To understand the impact of telomere shortening on gene expression, we analyzed the transcriptome of diploid human fibroblasts as they progressed toward and entered into senescence. We distinguished novel transcription regulation due to replicative senescence by comparing senescence-specific expression profiles to profiles from cells arrested by DNA damage or serum starvation. Only a small specific subset of genes was identified that was truly senescence-regulated and changes in gene expression were exacerbated from presenescent to senescent cells. The majority of gene expression regulation in replicative senescence was shown to occur due to telomere shortening, as exogenous telomerase activity reverted most of these changes.

  8. A genomics approach identifies senescence-specific gene expression regulation

    PubMed Central

    Lackner, Daniel H; Hayashi, Makoto T; Cesare, Anthony J; Karlseder, Jan

    2014-01-01

    Replicative senescence is a fundamental tumor-suppressive mechanism triggered by telomere erosion that results in a permanent cell cycle arrest. To understand the impact of telomere shortening on gene expression, we analyzed the transcriptome of diploid human fibroblasts as they progressed toward and entered into senescence. We distinguished novel transcription regulation due to replicative senescence by comparing senescence-specific expression profiles to profiles from cells arrested by DNA damage or serum starvation. Only a small specific subset of genes was identified that was truly senescence-regulated and changes in gene expression were exacerbated from presenescent to senescent cells. The majority of gene expression regulation in replicative senescence was shown to occur due to telomere shortening, as exogenous telomerase activity reverted most of these changes. PMID:24863242

  9. Multiregional gene expression profiling identifies MRPS6 as a possible candidate gene for Parkinson's disease.

    PubMed

    Papapetropoulos, Spiridon; Ffrench-Mullen, Jarlath; McCorquodale, Donald; Qin, Yujing; Pablo, John; Mash, Deborah C

    2006-01-01

    Combining large-scale gene expression approaches and bioinformatics may provide insights into the molecular variability of biological processes underlying neurodegeneration. To identify novel candidate genes and mechanisms, we conducted a multiregional gene expression analysis in postmortem brain. Gene arrays were performed utilizing Affymetrix HG U133 Plus 2.0 gene chips. Brain specimens from 21 different brain regions were taken from Parkinson's disease (PD) (n = 22) and normal aged (n = 23) brain donors. The rationale for conducting a multiregional survey of gene expression changes was based on the assumption that if a gene is changed in more than one brain region, it may be a higher probability candidate gene compared to genes that are changed in a single region. Although no gene was significantly changed in all of the 21 brain regions surveyed, we identified 11 candidate genes whose pattern of expression was regulated in at least 18 out of 21 regions. The expression of a gene encoding the mitochondria ribosomal protein S6 (MRPS6) had the highest combined mean fold change and topped the list of regulated genes. The analysis revealed other genes related to apoptosis, cell signaling, and cell cycle that may be of importance to disease pathophysiology. High throughput gene expression is an emerging technology for molecular target discovery in neurological and psychiatric disorders. The top gene reported here is the nuclear encoded MRPS6, a building block of the human mitoribosome of the oxidative phosphorylation system (OXPHOS). Impairments in mitochondrial OXPHOS have been linked to the pathogenesis of PD.
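
    The multiregional selection rule described in this abstract (keep genes that are changed in at least 18 of 21 regions) can be sketched as a simple count-and-threshold filter. This is only an illustration under stated assumptions: the function name, the gene labels and the per-region flags are hypothetical, and how "changed" was called in each region is not specified in the abstract.

```python
from typing import Dict, List


def multiregion_candidates(changed: Dict[str, List[bool]], min_regions: int) -> List[str]:
    """Return genes flagged as changed in at least `min_regions` regions.

    `changed` maps a gene label to one boolean per brain region
    (True = expression called as changed in that region).
    Results are ordered by number of changed regions, then by name.
    """
    counts = {gene: sum(flags) for gene, flags in changed.items()}
    hits = [gene for gene, n in counts.items() if n >= min_regions]
    return sorted(hits, key=lambda gene: (-counts[gene], gene))


# Hypothetical toy data: 21 region flags per gene (not from the study).
toy = {
    "GENE_A": [True] * 19 + [False] * 2,
    "GENE_B": [True] * 10 + [False] * 11,
    "GENE_C": [True] * 18 + [False] * 3,
}
print(multiregion_candidates(toy, min_regions=18))
```

    With the toy flags above, only the two genes changed in 18 or more regions pass the filter.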

  10. Variation in Arabidopsis flooding responses identifies numerous putative "tolerance genes".

    PubMed

    Vashisht, Divya; van Veen, Hans; Akman, Melis; Sasidharan, Rashmi

    2016-11-01

    Plant survival in flooded environments requires a combinatory response to multiple stress conditions such as limited light availability, reduced gas exchange and nutrient uptake. The ability to fine-tune the molecular response at the transcriptional and/or post-transcriptional level, which can eventually lead to metabolic and anatomical adjustments, is the underlying requirement to confer tolerance. Previously, we compared the transcriptomic adjustment of submergence-tolerant and -intolerant accessions and identified a core conserved and genotype-specific response to flooding stress, identifying numerous 'putative' tolerance genes. Here, we performed genome wide association analyses on 81 natural Arabidopsis accessions that identified 30 additional SNP markers associated with flooding tolerance. We argue that, given the many genes associated with flooding tolerance in Arabidopsis, improving resistance to submergence requires numerous genetic changes.

  11. Identifying genes for neurobehavioural traits in rodents: progress and pitfalls.

    PubMed

    Baud, Amelie; Flint, Jonathan

    2017-04-01

    Identifying genes and pathways that contribute to differences in neurobehavioural traits is a key goal in psychiatric research. Despite considerable success in identifying quantitative trait loci (QTLs) associated with behaviour in laboratory rodents, pinpointing the causal variants and genes is more challenging. For a long time, the main obstacle was the size of QTLs, which could encompass tens if not hundreds of genes. However, recent studies have exploited mouse and rat resources that allow mapping of phenotypes to narrow intervals, encompassing only a few genes. Here, we review these studies, showcase the rodent resources they have used and highlight the insights into neurobehavioural traits provided to date. We discuss what we see as the biggest challenge in the field - translating QTLs into biological knowledge by experimentally validating and functionally characterizing candidate genes - and propose that the CRISPR/Cas genome-editing system holds the key to overcoming this obstacle. Finally, we challenge traditional views on inbred versus outbred resources in the light of recent resource and technology developments. © 2017. Published by The Company of Biologists Ltd.

  12. Gene expression profiling in bladder cancer identifies potential therapeutic targets

    PubMed Central

    Hussain, Syed A.; Palmer, Daniel H.; Syn, Wing-Kin; Sacco, Joseph J.; Greensmith, Richard M.D.; Elmetwali, Taha; Aachi, Vijay; Lloyd, Bryony H.; Jithesh, Puthen V.; Arrand, John; Barton, Darren; Ansari, Jawaher; Sibson, D. Ross; James, Nicholas D.

    2017-01-01

    Despite advances in management, bladder cancer remains a major cause of cancer related complications. Characterisation of gene expression patterns in bladder cancer allows the identification of pathways involved in its pathogenesis, and may stimulate the development of novel therapies targeting these pathways. Between 2004 and 2005, cystoscopic bladder biopsies were obtained from 19 patients and 11 controls. These were subjected to whole transcript-based microarray analysis. Unsupervised hierarchical clustering was used to identify samples with similar expression profiles. Hypergeometric analysis was used to identify canonical pathways and curated networks having statistically significant enrichment of differentially expressed genes. Osteopontin (OPN) expression was validated by immunohistochemistry. Hierarchical clustering defined signatures, which differentiated between cancer and healthy tissue, muscle-invasive or non-muscle invasive cancer and healthy tissue, grade 1 and grade 3. Pathways associated with cell cycle and proliferation were markedly upregulated in muscle-invasive and grade 3 cancers. Genes associated with the classical complement pathway were downregulated in non-muscle invasive cancer. Osteopontin was markedly overexpressed in invasive cancer compared to healthy tissue. The present study contributes to a growing body of work on gene expression signatures in bladder cancer. The data support an important role for osteopontin in bladder cancer, and identify several pathways worthy of further investigation. PMID:28259975
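
    The hypergeometric enrichment step mentioned in this abstract can be illustrated with a one-sided tail probability. The sketch below is a generic over-representation test, not the authors' pipeline; the function name and all numbers are assumptions chosen for illustration.

```python
from math import comb


def hypergeom_enrichment_p(total_genes: int, pathway_genes: int,
                           de_genes: int, overlap: int) -> float:
    """Upper-tail hypergeometric p-value.

    Probability of seeing at least `overlap` pathway members among
    `de_genes` differentially expressed genes drawn without replacement
    from `total_genes`, of which `pathway_genes` belong to the pathway.
    """
    upper = min(pathway_genes, de_genes)
    denom = comb(total_genes, de_genes)
    tail = sum(
        comb(pathway_genes, i) * comb(total_genes - pathway_genes, de_genes - i)
        for i in range(overlap, upper + 1)
    )
    return tail / denom


# Hypothetical numbers: 20000 genes measured, 150 in the pathway,
# 400 differentially expressed, 12 of them in the pathway.
print(hypergeom_enrichment_p(20000, 150, 400, 12))
```

    In practice such p-values are usually corrected for the number of pathways tested; the abstract does not say which correction was applied.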

  13. Phage cluster relationships identified through single gene analysis

    PubMed Central

    2013-01-01

    Background Phylogenetic comparison of bacteriophages requires whole genome approaches such as dotplot analysis, genome pairwise maps, and gene content analysis. Currently mycobacteriophages, a highly studied phage group, are categorized into related clusters based on the comparative analysis of whole genome sequences. With the recent explosion of phage isolation, a simple method for phage cluster prediction would facilitate analysis of crude or complex samples without whole genome isolation and sequencing. The hypothesis of this study was that mycobacteriophage-cluster prediction is possible using comparison of a single, ubiquitous, semi-conserved gene. Tape Measure Protein (TMP) was selected to test the hypothesis because it is typically the longest gene in mycobacteriophage genomes and because regions within the TMP gene are conserved. Results A single gene, TMP, identified the known Mycobacteriophage clusters and subclusters using a Gepard dotplot comparison or a phylogenetic tree constructed from global alignment and maximum likelihood comparisons. Gepard analysis of 247 mycobacteriophage TMP sequences appropriately recovered 98.8% of the subcluster assignments that were made by whole-genome comparison. Subcluster-specific primers within TMP allow for PCR determination of the mycobacteriophage subcluster from DNA samples. Using the single-gene comparison approach for siphovirus coliphages, phage groupings by TMP comparison reflected relationships observed in a whole genome dotplot comparison and confirm the potential utility of this approach to another widely studied group of phages. Conclusions TMP sequence comparison and PCR results support the hypothesis that a single gene can be used for distinguishing phage cluster and subcluster assignments. TMP single-gene analysis can quickly and accurately aid in mycobacteriophage classification. PMID:23777341

  14. DAWN: a framework to identify autism genes and subnetworks using gene expression and genetics

    PubMed Central

    2014-01-01

    Background De novo loss-of-function (dnLoF) mutations are found twofold more often in autism spectrum disorder (ASD) probands than their unaffected siblings. Multiple independent dnLoF mutations in the same gene implicate the gene in risk and hence provide a systematic, albeit arduous, path forward for ASD genetics. It is likely that using additional non-genetic data will enhance the ability to identify ASD genes. Methods To accelerate the search for ASD genes, we developed a novel algorithm, DAWN, to model two kinds of data: rare variations from exome sequencing and gene co-expression in the mid-fetal prefrontal and motor-somatosensory neocortex, a critical nexus for risk. The algorithm casts the ensemble data as a hidden Markov random field in which the graph structure is determined by gene co-expression and it combines these interrelationships with node-specific observations, namely gene identity, expression, genetic data and the estimated effect on risk. Results Using currently available genetic data and a specific developmental time period for gene co-expression, DAWN identified 127 genes that plausibly affect risk, and a set of likely ASD subnetworks. Validation experiments making use of published targeted resequencing results demonstrate its efficacy in reliably predicting ASD genes. DAWN also successfully predicts known ASD genes, not included in the genetic data used to create the model. Conclusions Validation studies demonstrate that DAWN is effective in predicting ASD genes and subnetworks by leveraging genetic and gene expression data. The findings reported here implicate neurite extension and neuronal arborization as risks for ASD. Using DAWN on emerging ASD sequence data and gene expression data from other brain regions and tissues would likely identify novel ASD genes. DAWN can also be used for other complex disorders to identify genes and subnetworks in those disorders. PMID:24602502

  15. [Key effect genes responding to nerve injury identified by gene ontology and computer pattern recognition].

    PubMed

    Pan, Qian; Peng, Jin; Zhou, Xue; Yang, Hao; Zhang, Wei

    2012-07-01

    In order to screen out important genes from the large gene datasets produced by gene microarrays after nerve injury, we combined the gene ontology (GO) method with computer pattern recognition technology to find key genes responding to nerve injury, and then verified one of these screened-out genes. Data mining and gene ontology analysis of gene chip data GSE26350 were carried out in MATLAB. Cd44 was selected from the screened-out key gene molecular spectrum by comparing the genes' GO terms and positions on the principal component score map. Functional interference was used to disrupt the normal binding of Cd44 to one of its ligands, chondroitin sulfate C (CSC), and neurite extension was observed. Gene ontology analysis showed that the first genes on the score map (marked by red *) were mainly distributed in molecular function GO terms such as molecular transducer activity, receptor activity and protein binding. Cd44 is one of six effector protein genes and drew our attention because of its functional diversity. After different reagents were added to the medium to interfere with the normal binding of CSC and Cd44, varying degrees of relief of CSC's inhibition of neurite extension were observed. CSC can inhibit neurite extension by binding Cd44 on the neuron membrane. This verifies that important genes in given physiological processes can be identified by gene ontology analysis of gene chip data.

  16. Identifying candidate driver genes by integrative ovarian cancer genomics data

    NASA Astrophysics Data System (ADS)

    Lu, Xinguo; Lu, Jibo

    2017-08-01

    Integrative analysis of molecular mechanics underlying cancer can distinguish interactions that cannot be revealed based on one kind of data for the appropriate diagnosis and treatment of cancer patients. Tumor samples exhibit heterogeneity in omics data, such as somatic mutations, copy number variations (CNVs), gene expression profiles and so on. In this paper we combined gene co-expression modules and mutation modulators separately in tumor patients to obtain the candidate driver genes for resistant and sensitive tumors from the heterogeneous data. The modulators in the final list, such as CCL17, CACTIN, CCL16, CCL22, APOB, KDF1, CCL11, HNF1B, LRG1, MED1 and so on, are well known in biological processes associated with ovarian cancer and can help to facilitate the discovery of biomarkers, molecular diagnostics, and drug discovery.

  17. Identifying key genes in rheumatoid arthritis by weighted gene co-expression network analysis.

    PubMed

    Ma, Chunhui; Lv, Qi; Teng, Songsong; Yu, Yinxian; Niu, Kerun; Yi, Chengqin

    2017-08-01

    This study aimed to identify rheumatoid arthritis (RA) related genes based on microarray data using the WGCNA (weighted gene co-expression network analysis) method. Two gene expression profile datasets GSE55235 (10 RA samples and 10 healthy controls) and GSE77298 (16 RA samples and seven healthy controls) were downloaded from Gene Expression Omnibus database. Characteristic genes were identified using metaDE package. WGCNA was used to find disease-related networks based on gene expression correlation coefficients, and module significance was defined as the average gene significance of all genes used to assess the correlation between the module and RA status. Genes in the disease-related gene co-expression network were subject to functional annotation and pathway enrichment analysis using Database for Annotation Visualization and Integrated Discovery. Characteristic genes were also mapped to the Connectivity Map to screen small molecules. A total of 599 characteristic genes were identified. For each dataset, characteristic genes in the green, red and turquoise modules were most closely associated with RA, with gene numbers of 54, 43 and 79, respectively. These genes were enriched in a total of 17 Gene Ontology terms, mainly related to immune response (CD97, FYB, CXCL1, IKBKE, CCR1, etc.), inflammatory response (CD97, CXCL1, C3AR1, CCR1, LYZ, etc.) and homeostasis (C3AR1, CCR1, PLN, CCL19, PPT1, etc.). Two small-molecule drugs sanguinarine and papaverine were predicted to have a therapeutic effect against RA. Genes related to immune response, inflammatory response and homeostasis presumably have critical roles in RA pathogenesis. Sanguinarine and papaverine have a potential therapeutic effect against RA. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.
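
    The definition of module significance given in this abstract (the average gene significance over the genes in a module) can be sketched in a few lines. One assumption is made here that the abstract does not state: gene significance is taken as the absolute Pearson correlation between a gene's expression and a 0/1 disease status vector. All names and values below are hypothetical.

```python
from math import sqrt
from statistics import mean
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient of two equal-length vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def gene_significance(expr: Sequence[float], status: Sequence[float]) -> float:
    """Assumed definition: |correlation| between expression and status."""
    return abs(pearson(expr, status))


def module_significance(module_expr: Dict[str, Sequence[float]],
                        status: Sequence[float]) -> float:
    """Average gene significance over all genes in one module."""
    return mean(gene_significance(v, status) for v in module_expr.values())


# Hypothetical toy module: four samples, status 1 = RA, 0 = control.
status = [0, 0, 1, 1]
module = {"g1": [1, 1, 2, 2], "g3": [1, 2, 1, 2]}
print(module_significance(module, status))
```

    Here g1 tracks status perfectly and g3 is uncorrelated with it, so the module average lands halfway between the two.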

  18. Genes Necessary for Bacterial Magnetite Biomineralization Identified by Transposon Mutagenesis

    NASA Astrophysics Data System (ADS)

    Nash, C. Z.; Komeili, A.; Newman, D. K.; Kirschvink, J. L.

    2004-12-01

    Magnetic bacteria synthesize nanoscale crystals of magnetite in intracellular, membrane-bounded organelles (magnetosomes). These crystals are preserved in the fossil record at least as far back as the late Neoproterozoic and have been tentatively identified in much older rocks (1). This fossil record may provide deep time calibration points for molecular evolution studies once the genes involved in biologically controlled magnetic mineralization (BCMM) are known. Further, a genetic and biochemical understanding of BCMM will give insight into the depositional environment and biogeochemical cycles in which magnetic bacteria play a role. The BCMM process is not well understood, though proteins have been identified from the magnetosome membrane and genetic manipulation and biochemical characterization of these proteins are underway. Most of the proteins currently thought to be involved are encoded within the mam cluster, a large cluster of genes whose products localize to the magnetosome membrane and are conserved among magnetic bacteria (2). In an effort to identify all of the genes necessary for bacterial BCMM, we undertook a transposon mutagenesis of Magnetospirillum magneticum AMB-1. Non-magnetic mutants (MNMs) were identified by growth in liquid culture followed by a magnetic assay. The insertion site of the transposon was identified in two ways. First, MNMs were screened with a PCR assay to determine if the transposon had inserted into the mam cluster. Second, the transposon was rescued from the mutant DNA and cloned for sequencing. The majority of insertion sites are located within the mam cluster. Insertion sites also occur in operons which have not previously been suspected to be involved in magnetite biomineralization. None of the insertion sites have occurred within genes reported from previous transposon mutagenesis studies of AMB-1 (3, 4). Two of the non-mam cluster insertion sites occur in operons containing genes conserved particularly between MS-1 and MC-1. We

  19. Sleeping Beauty Mouse Models Identify Candidate Genes Involved in Gliomagenesis

    PubMed Central

    Vyazunova, Irina; Maklakova, Vilena I.; Berman, Samuel; De, Ishani; Steffen, Megan D.; Hong, Won; Lincoln, Hayley; Morrissy, A. Sorana; Taylor, Michael D.; Akagi, Keiko; Brennan, Cameron W.; Rodriguez, Fausto J.; Collier, Lara S.

    2014-01-01

    Genomic studies of human high-grade gliomas have discovered known and candidate tumor drivers. Studies in both cell culture and mouse models have complemented these approaches and have identified additional genes and processes important for gliomagenesis. Previously, we found that mobilization of Sleeping Beauty transposons in mice ubiquitously throughout the body from the Rosa26 locus led to gliomagenesis with low penetrance. Here we report the characterization of mice in which transposons are mobilized in the Glial Fibrillary Acidic Protein (GFAP) compartment. Glioma formation in these mice did not occur on an otherwise wild-type genetic background, but rare gliomas were observed when mobilization occurred in a p19Arf heterozygous background. Through cloning insertions from additional gliomas generated by transposon mobilization in the Rosa26 compartment, several candidate glioma genes were identified. Comparisons to genetic, epigenetic and mRNA expression data from human gliomas implicates several of these genes as tumor suppressor genes and oncogenes in human glioblastoma. PMID:25423036

  20. Axon Regeneration Genes Identified by RNAi Screening in C. elegans

    PubMed Central

    Nix, Paola; Hammarlund, Marc; Hauth, Linda; Lachnit, Martina; Jorgensen, Erik M.

    2014-01-01

    Axons of the mammalian CNS lose the ability to regenerate soon after development due to both an inhibitory CNS environment and the loss of cell-intrinsic factors necessary for regeneration. The complex molecular events required for robust regeneration of mature neurons are not fully understood, particularly in vivo. To identify genes affecting axon regeneration in Caenorhabditis elegans, we performed both an RNAi-based screen for defective motor axon regeneration in unc-70/β-spectrin mutants and a candidate gene screen. From these screens, we identified at least 50 conserved genes with growth-promoting or growth-inhibiting functions. Through our analysis of mutants, we shed new light on certain aspects of regeneration, including the role of β-spectrin and membrane dynamics, the antagonistic activity of MAP kinase signaling pathways, and the role of stress in promoting axon regeneration. Many gene candidates had not previously been associated with axon regeneration and implicate new pathways of interest for therapeutic intervention. PMID:24403161

  1. Identifying genes required for respiratory growth of fission yeast

    PubMed Central

    2016-01-01

    We have used both auxotroph and prototroph versions of the latest deletion-mutant library to identify genes required for respiratory growth on solid glycerol medium in fission yeast. This data set complements and enhances our recent study on functional and regulatory aspects of energy metabolism by providing additional proteins that are involved in respiration. Most proteins identified in this mutant screen have not been implicated in respiration in budding yeast. We also provide a protocol to generate a prototrophic mutant library, and data on technical and biological reproducibility of colony-based high-throughput screens. PMID:27918601

  2. Methods for identifying an essential gene in a prokaryotic microorganism

    SciTech Connect

    Shizuya, Hiroaki

    2006-01-31

    Methods are provided for the rapid identification of essential or conditionally essential DNA segments in any species of haploid cell (one copy chromosome per cell) that is capable of being transformed by artificial means and is capable of undergoing DNA recombination. This system offers an enhanced means of identifying essential function genes in diploid pathogens, such as gram-negative and gram-positive bacteria.

  3. Analysis of gene order conservation in eukaryotes identifies transcriptionally and functionally linked genes.

    PubMed

    Dávila López, Marcela; Martínez Guerra, Juan José; Samuelsson, Tore

    2010-05-14

    The order of genes in eukaryotes is not entirely random. Studies of gene order conservation are important to understand genome evolution and to reveal mechanisms why certain neighboring genes are more difficult to separate during evolution. Here, genome-wide gene order information was compiled for 64 species, representing a wide variety of eukaryotic phyla. This information is presented in a browser where gene order may be displayed and compared between species. Factors related to non-random gene order in eukaryotes were examined by considering pairs of neighboring genes. The evolutionary conservation of gene pairs was studied with respect to relative transcriptional direction, intergenic distance and functional relationship as inferred by gene ontology. The results show that among gene pairs that are conserved the divergently and co-directionally transcribed genes are much more common than those that are convergently transcribed. Furthermore, highly conserved pairs, in particular those of fungi, are characterized by a short intergenic distance. Finally, gene pairs of metazoa and fungi that are evolutionary conserved and that are divergently transcribed are much more likely to be related by function as compared to poorly conserved gene pairs. One example is the ribosomal protein gene pair L13/S16, which is unusual as it occurs both in fungi and alveolates. A specific functional relationship between these two proteins is also suggested by the fact that they are part of the same operon in both eubacteria and archaea. In conclusion, factors associated with non-random gene order in eukaryotes include relative gene orientation, intergenic distance and functional relationships. It seems likely that certain pairs of genes are conserved because the genes involved have a transcriptional and/or functional relationship. The results also indicate that studies of gene order conservation aid in identifying genes that are related in terms of transcriptional control.
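
    The three relative orientations compared in this abstract (divergent, co-directional, convergent) follow directly from the strands of two neighboring genes. The sketch below classifies and counts adjacent pairs; the function names and the toy strand list are hypothetical, and genes are assumed to be listed in chromosomal order.

```python
from collections import Counter
from typing import List


def pair_orientation(left_strand: str, right_strand: str) -> str:
    """Classify a neighboring gene pair by transcriptional direction.

    `left_strand` belongs to the upstream-positioned gene on the
    reference axis, `right_strand` to its right-hand neighbor.
    """
    if left_strand not in "+-" or right_strand not in "+-" \
            or len(left_strand) != 1 or len(right_strand) != 1:
        raise ValueError("strands must be '+' or '-'")
    if left_strand == right_strand:
        return "co-directional"   # ++ or --
    if left_strand == "-":
        return "divergent"        # <- ->  transcribed away from each other
    return "convergent"           # -> <-  transcribed toward each other


def orientation_counts(strands: List[str]) -> Counter:
    """Count orientation classes over all adjacent pairs in gene order."""
    return Counter(pair_orientation(a, b) for a, b in zip(strands, strands[1:]))


# Hypothetical strand sequence for six consecutive genes.
print(orientation_counts(["+", "+", "-", "+", "-", "-"]))
```

    Comparing such counts between conserved and non-conserved pairs is the kind of tabulation the abstract summarizes; the conservation calls themselves are not modeled here.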

  4. Analysis of Gene Order Conservation in Eukaryotes Identifies Transcriptionally and Functionally Linked Genes

    PubMed Central

    Dávila López, Marcela; Martínez Guerra, Juan José; Samuelsson, Tore

    2010-01-01

    The order of genes in eukaryotes is not entirely random. Studies of gene order conservation are important to understand genome evolution and to reveal mechanisms why certain neighboring genes are more difficult to separate during evolution. Here, genome-wide gene order information was compiled for 64 species, representing a wide variety of eukaryotic phyla. This information is presented in a browser where gene order may be displayed and compared between species. Factors related to non-random gene order in eukaryotes were examined by considering pairs of neighboring genes. The evolutionary conservation of gene pairs was studied with respect to relative transcriptional direction, intergenic distance and functional relationship as inferred by gene ontology. The results show that among gene pairs that are conserved the divergently and co-directionally transcribed genes are much more common than those that are convergently transcribed. Furthermore, highly conserved pairs, in particular those of fungi, are characterized by a short intergenic distance. Finally, gene pairs of metazoa and fungi that are evolutionary conserved and that are divergently transcribed are much more likely to be related by function as compared to poorly conserved gene pairs. One example is the ribosomal protein gene pair L13/S16, which is unusual as it occurs both in fungi and alveolates. A specific functional relationship between these two proteins is also suggested by the fact that they are part of the same operon in both eubacteria and archaea. In conclusion, factors associated with non-random gene order in eukaryotes include relative gene orientation, intergenic distance and functional relationships. It seems likely that certain pairs of genes are conserved because the genes involved have a transcriptional and/or functional relationship. The results also indicate that studies of gene order conservation aid in identifying genes that are related in terms of transcriptional control. PMID:20498846

  5. Efficient Strategy to Identify Gene-Gene Interactions and Its Application to Type 2 Diabetes

    PubMed Central

    Li, Donghe

    2016-01-01

    Over the past decade, the detection of gene-gene interactions has become more and more popular in the field of genome-wide association studies (GWASs). The goal of the GWAS is to identify genetic susceptibility to complex diseases by assaying and analyzing hundreds of thousands of single-nucleotide polymorphisms. However, such tests are computationally demanding and methodologically challenging. Recently, a simple but powerful method, named “BOolean Operation-based Screening and Testing” (BOOST), was proposed for genome-wide gene-gene interaction analyses. BOOST was designed with a Boolean representation of genotype data and is approximately equivalent to the log-linear model. It is extremely fast, and genome-wide gene-gene interaction analyses can be completed within a few hours. However, BOOST cannot adjust for covariate effects, and its type-1 error control is not correct. Thus, we considered two-step approaches for gene-gene interaction analyses. First, we selected gene-gene interactions with BOOST; second, we applied logistic regression with covariate adjustments to the selected gene-gene interactions. We applied the two-step approach to type 2 diabetes (T2D) in the Korea Association Resource (KARE) cohort and identified some promising pairs of single-nucleotide polymorphisms associated with T2D. PMID:28154506
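
    The speed attributed to a Boolean representation of genotype data can be illustrated with bitmasks: each SNP is stored as three bit strings (one per genotype 0, 1, 2), and the 3 x 3 genotype-pair table for two SNPs then comes from bitwise AND plus a bit count. This is a minimal sketch of that general idea only. It is not the BOOST implementation, it omits the case/control split and the interaction test, and all names and data are hypothetical.

```python
from typing import List


def genotype_bitmasks(genotypes: List[int]) -> List[int]:
    """Encode one SNP as three integers used as bit sets.

    Bit i of masks[g] is set when sample i carries genotype g (0, 1 or 2).
    """
    masks = [0, 0, 0]
    for i, g in enumerate(genotypes):
        masks[g] |= 1 << i
    return masks


def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def pair_table(snp_a: List[int], snp_b: List[int]) -> List[List[int]]:
    """3 x 3 table: table[i][j] = samples with genotype i at A and j at B."""
    ma, mb = genotype_bitmasks(snp_a), genotype_bitmasks(snp_b)
    return [[popcount(ma[i] & mb[j]) for j in range(3)] for i in range(3)]


# Hypothetical genotypes for six samples at two SNPs.
snp_a = [0, 1, 2, 1, 0, 2]
snp_b = [0, 1, 1, 1, 2, 2]
print(pair_table(snp_a, snp_b))
```

    In a two-step scheme like the one described, pairs passing a fast screen built on such tables would then be re-tested with a covariate-adjusted logistic regression, which is not shown here.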

  6. Meta-analysis of gene expression data identifies causal genes for prostate cancer.

    PubMed

    Wang, Xiang-Yang; Hao, Jian-Wei; Zhou, Rui-Jin; Zhang, Xiang-Sheng; Yan, Tian-Zhong; Ding, De-Gang; Shan, Lei

    2013-01-01

    Prostate cancer is a leading cause of death in male populations across the globe. With the advent of gene expression arrays, many microarray studies have been conducted in prostate cancer, but the results have varied across different studies. To better understand the genetic and biologic mechanisms of prostate cancer, we conducted a meta-analysis of two studies on prostate cancer. Eight key genes were identified to be differentially expressed with progression. After gene co-expression analysis based on data from the GEO database, we obtained a co-expressed gene list which included 725 genes. Gene Ontology analysis revealed that these genes are involved in actin filament-based processes, locomotion and cell morphogenesis. Further analysis of the gene list should provide important clues for developing new prognostic markers and therapeutic targets.

  7. A recellularized human colon model identifies cancer driver genes

    PubMed Central

    Chen, Huanhuan Joyce; Wei, Zhubo; Sun, Jian; Bhattacharya, Asmita; Savage, David J; Serda, Rita; Mackeyev, Yuri; Curley, Steven A.; Bu, Pengcheng; Wang, Lihua; Chen, Shuibing; Cohen-Gould, Leona; Huang, Emina; Shen, Xiling; Lipkin, Steven M.; Copeland, Neal G.; Jenkins, Nancy A.; Shuler, Michael L.

    2016-01-01

    Refined cancer models are needed to bridge the gap between cell-line, animal and clinical research. Here we describe the engineering of an organotypic colon cancer model by recellularization of a native human matrix that contains cell-populated mucosa and an intact muscularis mucosa layer. This ex vivo system recapitulates the pathophysiological progression from APC-mutant neoplasia to submucosal invasive tumor. We used it to perform a Sleeping Beauty transposon mutagenesis screen to identify genes that cooperate with mutant APC in driving invasive neoplasia. 38 candidate invasion driver genes were identified, 17 of which have been previously implicated in colorectal cancer progression, including TCF7L2, TWIST2, MSH2, DCC and EPHB1/2. Six invasion driver genes that to our knowledge have not been previously described were validated in vitro using cell proliferation, migration and invasion assays, and ex vivo using recellularized human colon. These results demonstrate the utility of our organoid model for studying cancer biology. PMID:27398792

  8. Identifying sleep regulatory genes using a Drosophila model of insomnia

    PubMed Central

    Seugnet, Laurent; Suzuki, Yasuko; Thimgan, Matthew; Donlea, Jeff; Gimbel, Sarah I.; Gottschalk, Laura; Duntley, Steve P.; Shaw, Paul J.

    2009-01-01

    Although it is widely accepted that sleep must serve an essential biological function, little is known about molecules that underlie sleep regulation. Given that insomnia is a common sleep disorder that disrupts the ability to initiate and maintain restorative sleep, a better understanding of its molecular underpinning may provide crucial insights into sleep regulatory processes. Thus, we created a line of flies using laboratory selection that share traits with human insomnia. After 60 generations, insomnia-like (ins-l) flies sleep 60 min a day, exhibit difficulty initiating sleep, difficulty maintaining sleep, and show evidence of daytime cognitive impairment. ins-l flies are also hyperactive and hyperresponsive to environmental perturbations. In addition, they have difficulty maintaining their balance, have elevated levels of dopamine, are short-lived and show increased levels of triglycerides, cholesterol, and free fatty acids. While their core molecular clock remains intact, ins-l flies lose their ability to sleep when placed into constant darkness. Whole genome profiling identified genes that are modified in ins-l flies. Among the differentially expressed transcripts, genes involved in metabolism, neuronal activity, and sensory perception constituted over-represented categories. We demonstrate that two of these genes are upregulated in human subjects following acute sleep deprivation. Together these data indicate that the ins-l flies are a useful tool that can be used to identify molecules important for sleep regulation and may provide insights into both the causes and long-term consequences of insomnia. PMID:19494137

  9. Identifying Francisella tularensis Genes Required for Growth in Host Cells

    PubMed Central

    Brunton, J.; Steele, S.; Miller, C.; Lovullo, E.; Taft-Benz, S.

    2015-01-01

    Francisella tularensis is a highly virulent Gram-negative intracellular pathogen capable of infecting a vast diversity of hosts, ranging from amoebae to humans. A hallmark of F. tularensis virulence is its ability to quickly grow to high densities within a diverse set of host cells, including, but not limited to, macrophages and epithelial cells. We developed a luminescence reporter system to facilitate a large-scale transposon mutagenesis screen to identify genes required for growth in macrophage and epithelial cell lines. We screened 7,454 individual mutants, 269 of which exhibited reduced intracellular growth. Transposon insertions in the 269 growth-defective strains mapped to 68 different genes. FTT_0924, a gene of unknown function but highly conserved among Francisella species, was identified in this screen to be defective for intracellular growth within both macrophage and epithelial cell lines. FTT_0924 was required for full Schu S4 virulence in a murine pulmonary infection model. The ΔFTT_0924 mutant bacterial membrane is permeable when replicating in hypotonic solution and within macrophages, resulting in strongly reduced viability. The permeability and reduced viability were rescued when the mutant was grown in a hypertonic solution, indicating that FTT_0924 is required for resisting osmotic stress. The ΔFTT_0924 mutant was also significantly more sensitive to β-lactam antibiotics than Schu S4. Taken together, the data strongly suggest that FTT_0924 is required for maintaining peptidoglycan integrity and virulence. PMID:25987704

  10. Identifying Francisella tularensis genes required for growth in host cells.

    PubMed

    Brunton, J; Steele, S; Miller, C; Lovullo, E; Taft-Benz, S; Kawula, T

    2015-08-01

    Francisella tularensis is a highly virulent Gram-negative intracellular pathogen capable of infecting a vast diversity of hosts, ranging from amoebae to humans. A hallmark of F. tularensis virulence is its ability to quickly grow to high densities within a diverse set of host cells, including, but not limited to, macrophages and epithelial cells. We developed a luminescence reporter system to facilitate a large-scale transposon mutagenesis screen to identify genes required for growth in macrophage and epithelial cell lines. We screened 7,454 individual mutants, 269 of which exhibited reduced intracellular growth. Transposon insertions in the 269 growth-defective strains mapped to 68 different genes. FTT_0924, a gene of unknown function but highly conserved among Francisella species, was identified in this screen to be defective for intracellular growth within both macrophage and epithelial cell lines. FTT_0924 was required for full Schu S4 virulence in a murine pulmonary infection model. The ΔFTT_0924 mutant bacterial membrane is permeable when replicating in hypotonic solution and within macrophages, resulting in strongly reduced viability. The permeability and reduced viability were rescued when the mutant was grown in a hypertonic solution, indicating that FTT_0924 is required for resisting osmotic stress. The ΔFTT_0924 mutant was also significantly more sensitive to β-lactam antibiotics than Schu S4. Taken together, the data strongly suggest that FTT_0924 is required for maintaining peptidoglycan integrity and virulence.

  11. Anaerobically expressed Escherichia coli genes identified by operon fusion techniques.

    PubMed Central

    Choe, M; Reznikoff, W S

    1991-01-01

    Genes that are expressed under anaerobic conditions were identified by operon fusion techniques with a hybrid bacteriophage of lambda and Mu, lambda placMu53, which creates transcriptional fusions to lacZY. Cells were screened for anaerobic expression on XG medium. Nine strains were selected, and the insertion point of the hybrid phage in each strain was mapped on the Escherichia coli chromosome linkage map. The anaerobic and aerobic expression levels of these genes were measured by beta-galactosidase assays in different medium conditions and in the presence of three regulatory mutations (fnr, narL, and rpoN). The anaerobically expressed genes (aeg) located at minute 99 (aeg-99) and 75 (aeg-75) appeared to be partially regulated by fnr, and aeg-93 is tightly regulated by fnr. aeg-60 requires a functional rpoN gene for its anaerobic expression. aeg-46.5 is repressed by narL. aeg-65A and aeg-65C are partially controlled by fnr but only in media containing nitrate or fumarate. aeg-47.5 and aeg-48.5 were found to be anaerobically induced only in rich media. The effects of a narL mutation on aeg-46.5 expression were observed in all medium conditions regardless of the presence or absence of nitrate. This suggests that narL has a regulatory function in the absence of exogenously added nitrate. PMID:1917846

  12. A Metastatic Mouse Model Identifies Genes That Regulate Neuroblastoma Metastasis.

    PubMed

    Seong, Bo Kyung A; Fathers, Kelly E; Hallett, Robin; Yung, Christina K; Stein, Lincoln D; Mouaaz, Samar; Kee, Lynn; Hawkins, Cynthia E; Irwin, Meredith S; Kaplan, David R

    2017-02-01

    Metastatic relapse is the major cause of death in pediatric neuroblastoma, where there remains a lack of therapies to target this stage of disease. To understand the molecular mechanisms mediating neuroblastoma metastasis, we developed a mouse model using intracardiac injection and in vivo selection to isolate malignant cell subpopulations with a higher propensity for metastasis to bone and the central nervous system. Gene expression profiling revealed primary and metastatic cells as two distinct cell populations defined by differential expression of 412 genes and of multiple pathways, including CADM1, SPHK1, and YAP/TAZ, whose expression independently predicted survival. In the metastatic subpopulations, a gene signature was defined (MET-75) that predicted survival of neuroblastoma patients with metastatic disease. Mechanistic investigations demonstrated causal roles for CADM1, SPHK1, and YAP/TAZ in mediating metastatic phenotypes in vitro and in vivo. Notably, pharmacologic targeting of SPHK1 or YAP/TAZ was sufficient to inhibit neuroblastoma metastasis in vivo. Overall, we identify gene expression signatures and candidate therapeutics that could improve the treatment of metastatic neuroblastoma. Cancer Res; 77(3); 696-706. ©2017 AACR.

  13. Gastric Cancer Associated Genes Identified by an Integrative Analysis of Gene Expression Data

    PubMed Central

    Jiang, Bing; Li, Shuwen; Jiang, Zhi

    2017-01-01

    Gastric cancer is one of the most severe complex diseases with high morbidity and mortality in the world. The molecular mechanisms and risk factors for this disease are still not clear because of the cancer heterogeneity caused by different genetic and environmental factors. With more and more expression data accumulating, we can perform integrative analysis of these data to understand the complexity of gastric cancer and to identify consensus players for the heterogeneous cancer. In the present work, we screened the published gene expression data and analyzed them with an integrative tool, combined with pathway and gene ontology enrichment investigation. We identified several consensus differentially expressed genes and these genes were further confirmed with literature mining; finally, two genes, that is, immunoglobulin J chain and C-X-C motif chemokine ligand 17, were screened as novel gastric cancer associated genes. Experimental validation is proposed to further confirm this finding. PMID:28232943

  14. Identifying Neighborhoods of Coordinated Gene Expression and Metabolite Profiles

    PubMed Central

    Hancock, Timothy; Wicker, Nicolas; Takigawa, Ichigaku; Mamitsuka, Hiroshi

    2012-01-01

    In this paper we investigate how metabolic network structure affects any coordination between transcript and metabolite profiles. To achieve this goal we conduct two complementary analyses focused on the metabolic response to stress. First, we investigate the general size of any relationship between metabolic network gene expression and metabolite profiles. We find that strongly correlated transcript-metabolite profiles are sustained over surprisingly long network distances away from any target metabolite. Secondly, we employ a novel pathway mining method to investigate the structure of this transcript-metabolite relationship. The objective of this method is to identify a minimum set of metabolites which are the target of significantly correlated gene expression pathways. The results reveal that in general, a global regulation signature targeting a small number of metabolites is responsible for a large scale metabolic response. However, our method also reveals pathway specific effects that can degrade this global regulation signature and complicate the observed coordination between transcript-metabolite profiles. PMID:22355360

  15. Identifying genes that mediate anthracycline toxicity in immune cells

    PubMed Central

    Frick, Amber; Suzuki, Oscar T.; Benton, Cristina; Parks, Bethany; Fedoriw, Yuri; Richards, Kristy L.; Thomas, Russell S.; Wiltshire, Tim

    2015-01-01

    The role of the immune system in response to chemotherapeutic agents remains elusive. The interpatient variability observed in immune and chemotherapeutic cytotoxic responses is likely, at least in part, due to complex genetic differences. Through the use of a panel of genetically diverse mouse inbred strains, we developed a drug screening platform aimed at identifying genes underlying these chemotherapeutic cytotoxic effects on immune cells. Using genome-wide association studies (GWAS), we identified four genome-wide significant quantitative trait loci (QTL) that contributed to the sensitivity of doxorubicin and idarubicin in immune cells. Of particular interest, a locus on chromosome 16 was significantly associated with cell viability following idarubicin administration (p = 5.01 × 10^-8). Within this QTL lies App, which encodes amyloid beta precursor protein. Comparison of dose-response curves verified that T-cells in App knockout mice were more sensitive to idarubicin than those of C57BL/6J control mice (p < 0.05). In conclusion, the cellular screening approach coupled with GWAS led to the identification and subsequent validation of a gene involved in T-cell viability after idarubicin treatment. Previous studies have suggested a role for App in in vitro and in vivo cytotoxicity to anticancer agents; the overexpression of App enhances resistance, while the knockdown of this gene is deleterious to cell viability. Further investigations should include performing mechanistic studies, validating additional genes from the GWAS, including Ppfia1 and Ppfibp1, and ultimately translating the findings to in vivo and human studies. PMID:25926793

  16. Comparative Transcriptomics to Identify Novel Genes and Pathways in Dinoflagellates

    NASA Astrophysics Data System (ADS)

    Ryan, D.

    2016-02-01

    The unarmored dinoflagellate Karenia brevis is among the most prominent harmful, bloom-forming phytoplankton species in the Gulf of Mexico. During blooms, the polyketides PbTx-1 and PbTx-2 (brevetoxins) are produced by K. brevis. Brevetoxins negatively impact human health and the Gulf shellfish harvest. However, the genes underlying brevetoxin synthesis are currently unknown. Because the K. brevis genome is extremely large (1 × 10^11 base pairs long), and with a high proportion of repetitive, non-coding DNA, it has not been sequenced. In fact, large, repetitive genomes are common among the dinoflagellate group. High-throughput RNA sequencing technology enabled us to assemble Karenia transcriptomes de novo and investigate potential genes in the brevetoxin pathway through comparative transcriptomics. The brevetoxin profile varies among K. brevis clonal cultures. For example, well-documented Wilson-CCFWC268 typically produces 8-10 pg PbTx per cell, whereas SP1 produces < 2 pg PbTx/cell, and the mutant low-toxin Wilson clone produces undetectable to low (<0.05 pg/cell) amounts. Further, PbTx-2 has been measured in Karenia papilionacea but not Karenia mikimotoi. We compared the transcriptomes of four K. brevis clones (Wilson-CCFWC268, SP3, SP1, and mutant low-toxin Wilson) with K. papilionacea and K. mikimotoi to investigate nucleotide-level genetic variations and differences in gene expression. Of the 85,000 transcripts in the K. brevis transcriptome, 4,600 transcripts, including novel unannotated orthologs and putative polyketide synthases (PKSs), were only expressed by brevetoxin-producing K. brevis and K. papilionacea, not K. mikimotoi. Examination of gene expression between the typical- and low-toxin Wilson clones identified about 3,500 genes with significantly different expression levels, including 2 putative PKSs. One of the 2 PKSs was only found in the brevetoxin-producing Karenia species. These transcriptomes could not have been characterized without high

  17. Novel radiation response genes identified in gene-trapped MCF10A mammary epithelial cells.

    PubMed

    Malone, Jennifer; Ullrich, Robert

    2007-02-01

    We have used a gene-trapping strategy to screen human mammary epithelial cells for radiation response genes. Relative mRNA expression levels of five candidate genes in MCF10A cells were analyzed, both with and without exposure to radiation. In all five cases, the trapped genes were significantly down-regulated after radiation treatment. Sequence analysis of the fusion transcripts identified the trapped genes: (1) the human androgen receptor, (2) the uncharacterized DREV1 gene, which has known homology to DNA methyltransferases, (3) the human creatine kinase gene, (4) the human eukaryotic translation elongation factor 1 beta 2, and (5) the human ribosomal protein L27. All five genes were down-regulated significantly after treatment with varying doses of ionizing radiation (0.10 to 4.0 Gy) and at varying times (2-30 h after treatment). The genes were also analyzed in human fibroblast and lymphoblastoid cell lines to determine whether the radiation response being observed was cell-type specific. The results verified that the observed radiation response was not a cell-type-specific phenomenon, suggesting that the genes play essential roles in the radiation damage control pathways. This study demonstrates the potential of the gene-trap approach for the identification and functional analysis of novel radiation response genes.

  18. Identifying redundant and missing relations in the gene ontology.

    PubMed

    Mougin, Fleur

    2015-01-01

    Significant efforts have been undertaken for providing the Gene Ontology (GO) in a computable format as well as for enriching it with logical definitions. Automated approaches can thus be applied to GO for assisting its maintenance and for checking its internal coherence. However, inconsistencies may still remain within GO. In this frame, the objective of this work was to audit GO relationships. First, reasoning over relationships was exploited for detecting redundant relations existing between GO concepts. Missing necessary and sufficient conditions were then identified based on the compositional structure of the preferred names of GO concepts. More than one thousand redundant relations and 500 missing necessary and sufficient conditions were found. The proposed approach was thus successful for detecting inconsistencies within GO relations. The application of lexical approaches as well as the exploitation of synonyms and textual definitions could be useful for identifying additional necessary and sufficient conditions. Multiple necessary and sufficient conditions for a given GO concept may be indicative of inconsistencies.
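    The redundancy check described in this abstract can be illustrated with a minimal sketch. It assumes a single transitive relation type (is_a) stored as child-to-parent pairs, and it flags a direct relation as redundant when the same parent is already reachable through another direct parent. The term names and the function redundant_relations are hypothetical, and the published audit reasons over more relation types than this toy example does.

```python
from collections import defaultdict


def redundant_relations(edges):
    """Return direct (child, parent) edges that are also implied by a longer path."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)

    def ancestors(term, seen=None):
        # Collect every term reachable by following one or more parent links.
        seen = set() if seen is None else seen
        for p in parents.get(term, ()):
            if p not in seen:
                seen.add(p)
                ancestors(p, seen)
        return seen

    redundant = set()
    for child, parent in edges:
        # The edge is redundant if some other direct parent already leads to `parent`.
        for other in parents[child] - {parent}:
            if parent in ancestors(other):
                redundant.add((child, parent))
                break
    return redundant


# Hypothetical toy hierarchy: A is_a B, B is_a C, plus a direct A is_a C.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C")]
flagged = redundant_relations(toy_edges)
```

    In the toy hierarchy, only the direct A-to-C relation is flagged, because it is already implied by the path through B.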

  19. Identifying the genes of unconventional high temperature superconductors.

    PubMed

    Hu, Jiangping

    We elucidate a recently emergent framework in unifying the two families of high temperature (high Tc) superconductors, cuprates and iron-based superconductors. The unification suggests that the latter is simply the counterpart of the former to realize robust extended s-wave pairing symmetries in a square lattice. The unification identifies that the key ingredient (gene) of high Tc superconductors is a quasi two dimensional electronic environment in which the d-orbitals of cations that participate in strong in-plane couplings to the p-orbitals of anions are isolated near Fermi energy. With this gene, the superexchange magnetic interactions mediated by anions could maximize their contributions to superconductivity. Creating the gene requires special arrangements between local electronic structures and crystal lattice structures. The speciality explains why high Tc superconductors are so rare. An explicit prediction is made to realize high Tc superconductivity in Co/Ni-based materials with a quasi two dimensional hexagonal lattice structure formed by trigonal bipyramidal complexes.

  20. Utilizing Gene Tree Variation to Identify Candidate Effector Genes in Zymoseptoria tritici.

    PubMed

    McDonald, Megan C; McGinness, Lachlan; Hane, James K; Williams, Angela H; Milgate, Andrew; Solomon, Peter S

    2016-04-07

    Zymoseptoria tritici is a host-specific, necrotrophic pathogen of wheat. Infection by Z. tritici is characterized by its extended latent period, which typically lasts 2 wks, and is followed by extensive host cell death, and rapid proliferation of fungal biomass. This work characterizes the level of genomic variation in 13 isolates, for which we have measured virulence on 11 wheat cultivars with differential resistance genes. Between the reference isolate, IPO323, and the 13 Australian isolates we identified over 800,000 single nucleotide polymorphisms, of which ∼10% had an effect on the coding regions of the genome. Furthermore, we identified over 1700 probable presence/absence polymorphisms in genes across the Australian isolates using de novo assembly. Finally, we developed a gene tree sorting method that quickly identifies groups of isolates within a single gene alignment whose sequence haplotypes correspond with virulence scores on a single wheat cultivar. Using this method, we have identified < 100 candidate effector genes whose gene sequence correlates with virulence toward a wheat cultivar carrying a major resistance gene.

  1. New mutations identified in the ocular albinism type 1 gene.

    PubMed

    Roma, Cristin; Ferrante, Paola; Guardiola, Ombretta; Ballabio, Andrea; Zollo, Massimo

    2007-11-01

    As the most common form of ocular albinism, ocular albinism type I (OA1) is an X-linked disorder that has an estimated prevalence of about 1:50,000. We searched for mutations through the human genome sequence draft by direct sequencing on eighteen patients with OA1, both within the coding region and in a thousand base pairs upstream of its start site. Here, we have identified eight new mutations located in the coding region of the gene. Two independent mutations, both located in the most carboxyterminal protein regions, were further characterized by immunofluorescence confocal microscopy, thus showing an impairment in their subcellular distribution into the lysosomal compartment of Cos-7A cells. The mutations found can result in protein misfolding, thus underlining the importance of the structure-function relationships of the protein as a major pathogenic mechanism in ocular albinism. Seven individuals out of eighteen (38.9%) with a clinical diagnosis of ocular albinism showed mutations, thus underlining the discrepancies between the clinical phenotype features and their genotype correlations. We postulate that mutations that have not yet been identified are potentially located in non-coding conserved regions or regulatory sequences of the OA1 gene.

  2. GeneValidator: identify problems with protein-coding gene predictions

    PubMed Central

    Drăgan, Monica-Andreea; Moghul, Ismail; Priyam, Anurag; Bustos, Claudio; Wurm, Yannick

    2016-01-01

    Summary: Genomes of emerging model organisms are now being sequenced at very low cost. However, obtaining accurate gene predictions remains challenging: even the best gene prediction algorithms make substantial errors and can jeopardize subsequent analyses. Therefore, many predicted genes must be time-consumingly visually inspected and manually curated. We developed GeneValidator (GV) to automatically identify problematic gene predictions and to aid manual curation. For each gene, GV performs multiple analyses based on comparisons to gene sequences from large databases. The resulting report identifies problematic gene predictions and includes extensive statistics and graphs for each prediction to guide manual curation efforts. GV thus accelerates and enhances the work of biocurators and researchers who need accurate gene predictions from newly sequenced genomes. Availability and implementation: GV can be used through a web interface or in the command-line. GV is open-source (AGPL), available at https://wurmlab.github.io/tools/genevalidator. Contact: y.wurm@qmul.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26787666

  3. Blood pressure loci identified with a gene-centric array.

    PubMed

    Johnson, Toby; Gaunt, Tom R; Newhouse, Stephen J; Padmanabhan, Sandosh; Tomaszewski, Maciej; Kumari, Meena; Morris, Richard W; Tzoulaki, Ioanna; O'Brien, Eoin T; Poulter, Neil R; Sever, Peter; Shields, Denis C; Thom, Simon; Wannamethee, Sasiwarang G; Whincup, Peter H; Brown, Morris J; Connell, John M; Dobson, Richard J; Howard, Philip J; Mein, Charles A; Onipinla, Abiodun; Shaw-Hawkins, Sue; Zhang, Yun; Davey Smith, George; Day, Ian N M; Lawlor, Debbie A; Goodall, Alison H; Fowkes, F Gerald; Abecasis, Gonçalo R; Elliott, Paul; Gateva, Vesela; Braund, Peter S; Burton, Paul R; Nelson, Christopher P; Tobin, Martin D; van der Harst, Pim; Glorioso, Nicola; Neuvrith, Hani; Salvi, Erika; Staessen, Jan A; Stucchi, Andrea; Devos, Nabila; Jeunemaitre, Xavier; Plouin, Pierre-François; Tichet, Jean; Juhanson, Peeter; Org, Elin; Putku, Margus; Sõber, Siim; Veldre, Gudrun; Viigimaa, Margus; Levinsson, Anna; Rosengren, Annika; Thelle, Dag S; Hastie, Claire E; Hedner, Thomas; Lee, Wai K; Melander, Olle; Wahlstrand, Björn; Hardy, Rebecca; Wong, Andrew; Cooper, Jackie A; Palmen, Jutta; Chen, Li; Stewart, Alexandre F R; Wells, George A; Westra, Harm-Jan; Wolfs, Marcel G M; Clarke, Robert; Franzosi, Maria Grazia; Goel, Anuj; Hamsten, Anders; Lathrop, Mark; Peden, John F; Seedorf, Udo; Watkins, Hugh; Ouwehand, Willem H; Sambrook, Jennifer; Stephens, Jonathan; Casas, Juan-Pablo; Drenos, Fotios; Holmes, Michael V; Kivimaki, Mika; Shah, Sonia; Shah, Tina; Talmud, Philippa J; Whittaker, John; Wallace, Chris; Delles, Christian; Laan, Maris; Kuh, Diana; Humphries, Steve E; Nyberg, Fredrik; Cusi, Daniele; Roberts, Robert; Newton-Cheh, Christopher; Franke, Lude; Stanton, Alice V; Dominiczak, Anna F; Farrall, Martin; Hingorani, Aroon D; Samani, Nilesh J; Caulfield, Mark J; Munroe, Patricia B

    2011-12-09

    Raised blood pressure (BP) is a major risk factor for cardiovascular disease. Previous studies have identified 47 distinct genetic variants robustly associated with BP, but collectively these explain only a few percent of the heritability for BP phenotypes. To find additional BP loci, we used a bespoke gene-centric array to genotype an independent discovery sample of 25,118 individuals that combined hypertensive case-control and general population samples. We followed up four SNPs associated with BP at our p < 8.56 × 10^-7 study-specific significance threshold and six suggestively associated SNPs in a further 59,349 individuals. We identified and replicated a SNP at LSP1/TNNT3, a SNP at MTHFR-NPPB independent (r^2 = 0.33) of previous reports, and replicated SNPs at AGT and ATP2B1 reported previously. An analysis of combined discovery and follow-up data identified SNPs significantly associated with BP at p < 8.56 × 10^-7 at four further loci (NPR3, HFE, NOS3, and SOX6). The high number of discoveries made with modest genotyping effort can be attributed to using a large-scale yet targeted genotyping array and to the development of a weighting scheme that maximized power when meta-analyzing results from samples ascertained with extreme phenotypes, in combination with results from nonascertained or population samples. Chromatin immunoprecipitation and transcript expression data highlight potential gene regulatory mechanisms at the MTHFR and NOS3 loci. These results provide candidates for further study to help dissect mechanisms affecting BP and highlight the utility of studying SNPs and samples that are independent of those studied previously even when the sample size is smaller than that in previous studies.

  4. Network-Based Inference Framework for Identifying Cancer Genes from Gene Expression Data

    PubMed Central

    Yang, Bo; Zhang, Junying; Yin, Yaling; Zhang, Yuanyuan

    2013-01-01

    Great efforts have been devoted to alleviating uncertainty about detected cancer genes, as accurate identification of oncogenes is of tremendous significance and helps unravel the biological behavior of tumors. In this paper, we present a differential network-based framework to detect biologically meaningful cancer-related genes. Firstly, a gene regulatory network construction algorithm is proposed, in which a boosting regression based on likelihood score and informative prior is employed for improving accuracy of identification. Secondly, with the algorithm, two gene regulatory networks are constructed from case and control samples independently. Thirdly, by subtracting the two networks, a differential-network model is obtained and then used to rank differentially expressed hub genes for identification of cancer biomarkers. Compared with two existing gene-based methods (t-test and lasso), the method has a significant improvement in accuracy both on synthetic datasets and two real breast cancer datasets. Furthermore, six identified genes (TSPYL5, CD55, CCNE2, DCK, BBC3, and MUC1) susceptible to breast cancer were verified through literature mining, GO analysis, and pathway functional enrichment analysis. Among these oncogenes, TSPYL5 and CCNE2 are already known as prognostic biomarkers in breast cancer, CD55 has been suspected of playing an important role in breast cancer prognosis from literature evidence, and the other three genes are newly discovered breast cancer biomarkers. More generally, the differential-network schema can be extended to other complex diseases for detection of disease-associated genes. PMID:24073403
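    The subtract-and-rank step in this abstract can be sketched as follows. This is only an illustration under stated assumptions: each network is a dictionary mapping a gene pair to an edge weight, pairs are keyed in a consistent order, and a gene's hub score is the sum of absolute weight changes over its edges. The boosting-regression network construction from the paper is not reproduced, and the gene labels G1 to G3 and the function differential_hub_ranking are hypothetical.

```python
def differential_hub_ranking(case_net, control_net):
    """Rank genes by total absolute change in edge weight between two networks."""
    score = {}
    # Consider every edge that appears in either network; missing edges count as 0.
    for edge in set(case_net) | set(control_net):
        diff = abs(case_net.get(edge, 0.0) - control_net.get(edge, 0.0))
        for gene in edge:
            score[gene] = score.get(gene, 0.0) + diff
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(score, key=lambda g: (-score[g], g))


# Hypothetical edge weights for case and control samples.
case = {("G1", "G2"): 0.9, ("G1", "G3"): 0.8, ("G2", "G3"): 0.1}
control = {("G1", "G2"): 0.1, ("G1", "G3"): 0.1, ("G2", "G3"): 0.1}
ranking = differential_hub_ranking(case, control)
```

    Here G1 ranks first because both of its edges change between the two conditions, which is the sense in which a differential hub gene is meant in this sketch.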

  5. Reconstructability analysis as a tool for identifying gene-gene interactions in studies of human diseases.

    PubMed

    Shervais, Stephen; Kramer, Patricia L; Westaway, Shawn K; Cox, Nancy J; Zwick, Martin

    2010-01-01

    There are a number of common human diseases for which the genetic component may include an epistatic interaction of multiple genes. Detecting these interactions with standard statistical tools is difficult because there may be an interaction effect, but minimal or no main effect. Reconstructability analysis (RA) uses Shannon's information theory to detect relationships between variables in categorical datasets. We applied RA to simulated data for five different models of gene-gene interaction, and find that even with heritability levels as low as 0.008, and with the inclusion of 50 non-associated genes in the dataset, we can identify the interacting gene pairs with an accuracy of ≥80%. We applied RA to a real dataset of type 2 non-insulin-dependent diabetes (NIDDM) cases and controls, and closely approximated the results of more conventional single SNP disease association studies. In addition, we replicated prior evidence for epistatic interactions between SNPs on chromosomes 2 and 15.
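    The point that an interaction can exist with little or no main effect can be illustrated with Shannon mutual information on categorical data. The sketch below is not reconstructability analysis itself, which searches over a lattice of candidate models; it only shows the information-theoretic quantity involved. The XOR-style toy data and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical observations."""
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * log2(c / n) for c in counts.values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two equally long sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


# Hypothetical genotypes at two loci and a phenotype driven purely by their interaction.
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
disease = [a ^ b for a, b in zip(snp_a, snp_b)]

main_effect_a = mutual_information(snp_a, disease)                  # single locus alone
pair_effect = mutual_information(list(zip(snp_a, snp_b)), disease)  # both loci jointly
```

    In this toy dataset each locus alone carries no information about the phenotype, while the pair of loci determines it completely, which is why single-variable tests can miss purely epistatic effects.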

  6. Gene expression patterns combined with network analysis identify hub genes associated with bladder cancer.

    PubMed

    Bi, Dongbin; Ning, Hao; Liu, Shuai; Que, Xinxiang; Ding, Kejia

    2015-06-01

    To explore molecular mechanisms of bladder cancer (BC), network strategy was used to find biomarkers for early detection and diagnosis. The differentially expressed genes (DEGs) between bladder carcinoma patients and normal subjects were screened using empirical Bayes method of the linear models for microarray data package. Co-expression networks were constructed by differentially co-expressed genes and links. Regulatory impact factors (RIF) metric was used to identify critical transcription factors (TFs). The protein-protein interaction (PPI) networks were constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and clusters were obtained through molecular complex detection (MCODE) algorithm. Centralities analyses for complex networks were performed based on degree, stress and betweenness. Enrichment analyses were performed based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Co-expression networks and TFs (based on expression data of global DEGs and DEGs in different stages and grades) were identified. Hub genes of complex networks, such as UBE2C, ACTA2, FABP4, CKS2, FN1 and TOP2A, were also obtained according to analysis of degree. In gene enrichment analyses of global DEGs, cell adhesion, proteinaceous extracellular matrix and extracellular matrix structural constituent were top three GO terms. ECM-receptor interaction, focal adhesion, and cell cycle were significant pathways. Our results provide some potential underlying biomarkers of BC. However, further validation is required and deep studies are needed to elucidate the pathogenesis of BC. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. BCIP: a gene-centered platform for identifying potential regulatory genes in breast cancer

    PubMed Central

    Wu, Jiaqi; Hu, Shuofeng; Chen, Yaowen; Li, Zongcheng; Zhang, Jian; Yuan, Hanyu; Shi, Qiang; Shao, Ningsheng; Ying, Xiaomin

    2017-01-01

    Breast cancer is a disease with high heterogeneity. Many issues on tumorigenesis and progression are still elusive. It is critical to identify genes that play important roles in the progression of tumors, especially for tumors with poor prognosis such as basal-like breast cancer and tumors in very young women. To facilitate the identification of potential regulatory or driver genes, we present the Breast Cancer Integrative Platform (BCIP, http://omics.bmi.ac.cn/bcancer/). BCIP maintains multi-omics data selected with strict quality control and processed with uniform normalization methods, including gene expression profiles from 9,005 tumor and 376 normal tissue samples, copy number variation information from 3,035 tumor samples, microRNA-target interactions, co-expressed genes, KEGG pathways, and mammary tissue-specific gene functional networks. This platform provides a user-friendly interface integrating comprehensive and flexible analysis tools on differential gene expression, copy number variation, and survival analysis. The prominent characteristic of BCIP is that users can perform analysis by customizing subgroups with single or combined clinical features, including subtypes, histological grades, pathologic stages, metastasis status, lymph node status, ER/PR/HER2 status, TP53 mutation status, menopause status, age, tumor size, therapy responses, and prognosis. BCIP will help to identify regulatory or driver genes and candidate biomarkers for further research in breast cancer. PMID:28327601

  8. A gene-trap strategy identifies quiescence-induced genes in synchronized myoblasts.

    PubMed

    Sambasivan, Ramkumar; Pavlath, Grace K; Dhawan, Jyotsna

    2008-03-01

    Cellular quiescence is characterized not only by reduced mitotic and metabolic activity but also by altered gene expression. Growing evidence suggests that quiescence is not merely a basal state but is regulated by active mechanisms. To understand the molecular programme that governs reversible cell cycle exit, we focused on quiescence-related gene expression in a culture model of myogenic cell arrest and activation. Here we report the identification of quiescence-induced genes using a gene-trap strategy. Using a retroviral vector, we generated a library of gene traps in C2C12 myoblasts that were screened for arrest-induced insertions by live cell sorting (FACS-gal). Several independent gene-trap lines revealed arrest-dependent induction of betagal activity, confirming the efficacy of the FACS screen. The locus of integration was identified in 15 lines. In three lines, insertion occurred in genes previously implicated in the control of quiescence, i.e. EMSY, a BRCA2-interacting protein; p8/com1, a p300 HAT-binding protein; and MLL5, a SET domain protein. Our results demonstrate that expression of chromatin modulatory genes is induced in G0, providing support to the notion that this reversibly arrested state is actively regulated.

  9. Figmop: a profile HMM to identify genes and bypass troublesome gene models in draft genomes.

    PubMed

    Curran, David M; Gilleard, John S; Wasmuth, James D

    2014-11-15

    Gene models from draft genome assemblies of metazoan species are often incorrect, missing exons or entire genes, particularly for large gene families. Consequently, labour-intensive manual curation is often necessary. We present Figmop (Finding Genes using Motif Patterns) to help with the manual curation of gene families in draft genome assemblies. The program uses a pattern of short sequence motifs to identify putative genes directly from the genome sequence. Using a large gene family as a test case, Figmop was found to be more sensitive and specific than a BLAST-based approach. The visualization used allows the validation of potential genes to be carried out quickly and easily, saving hours if not days from an analysis. Figmop is implemented in C and Python and is supported on Linux, Unix and MacOSX; its source code is freely available for download at https://github.com/dave-the-scientist. Contact: curran.dave.m@gmail.com. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  10. Gene-based rare allele analysis identified a risk gene of Alzheimer's disease.

    PubMed

    Kim, Jong Hun; Song, Pamela; Lim, Hyunsun; Lee, Jae-Hyung; Lee, Jun Hong; Park, Sun Ah

    2014-01-01

    Alzheimer's disease (AD) has a strong propensity to run in families. However, the known risk genes excluding APOE are not clinically useful. In various complex diseases, gene studies have targeted rare alleles for unsolved heritability. Our study aims to elucidate previously unknown risk genes for AD by targeting rare alleles. We used data from five publicly available genetic studies from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the database of Genotypes and Phenotypes (dbGaP). A total of 4,171 cases and 9,358 controls were included. The genotype information of rare alleles was imputed using 1000 Genomes data. We performed gene-based analysis of rare alleles (minor allele frequency ≤ 3%). The genome-wide significance level was defined as meta P<1.8×10(-6) (0.05/number of genes in human genome = 0.05/28,517). ZNF628, which is located at chromosome 19q13.42, showed a genome-wide significant association with AD. The association of ZNF628 with AD was not dependent on APOE ε4. APOE and TREM2 were also significantly associated with AD, although not at genome-wide significance levels. Other genes identified by targeting common alleles could not be replicated in our gene-based rare allele analysis. We identified that rare variants in ZNF628 are associated with AD. The protein encoded by ZNF628 is known as a transcription factor. Furthermore, the associations of APOE and TREM2 with AD were highly significant, even in gene-based rare allele analysis, which implies that further deep sequencing of these genes is required in AD heritability studies.
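
    The genome-wide significance level quoted in this abstract is a Bonferroni-style correction: the nominal level of 0.05 divided by the number of genes tested. The short sketch below reproduces that arithmetic; the function and variable names are illustrative and do not come from the study.

```python
ALPHA = 0.05       # nominal significance level, as stated in the abstract
N_GENES = 28_517   # number of genes in the human genome, as stated in the abstract


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after dividing alpha by the number of tests."""
    return alpha / n_tests


def is_genome_wide_significant(p_value, alpha=ALPHA, n_tests=N_GENES):
    """True when a gene-level p-value falls below the corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


threshold = bonferroni_threshold(ALPHA, N_GENES)
print(f"{threshold:.2e}")  # 1.75e-06, which the abstract rounds to 1.8×10(-6)
```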

  11. Gene expression patterns combined with bioinformatics analysis identify genes associated with cholangiocarcinoma.

    PubMed

    Li, Chen; Shen, Weixing; Shen, Sheng; Ai, Zhilong

    2013-12-01

    To explore the molecular mechanisms of cholangiocarcinoma (CC), microarray technology was used to find biomarkers for early detection and diagnosis. The gene expression profiles from 6 patients with CC and 5 normal controls were downloaded from Gene Expression Omnibus and compared. As a result, 204 differentially co-expressed genes (DCGs) in CC patients compared to normal controls were identified using a computational bioinformatics analysis. These genes were mainly involved in coenzyme metabolic process, peptidase activity and oxidation reduction. A regulatory network was constructed by mapping the DCGs to known regulation data. Four transcription factors, FOXC1, ZIC2, NKX2-2 and GCGR, were hub nodes in the network. In conclusion, this study provides a set of targets useful for future investigations into molecular biomarker studies. Copyright © 2013 Elsevier Ltd. All rights reserved.

  12. Gene expression analysis at multiple time-points identifies key genes for nerve regeneration.

    PubMed

    Pan, Bin; Liu, Yi; Yan, Jia-Yin; Wang, Yao; Yao, Xue; Zhou, Heng-Xing; Lu, Lu; Kong, Xiao-Hong; Feng, Shi-Qing

    2017-03-01

    The purpose of this study was to provide a comprehensive understanding of gene expression during Wallerian degeneration and axon regeneration after peripheral nerve injury. A microarray was used to detect gene expression in the distal nerve 0, 3, 7, and 14 days after sciatic nerve crush. Bioinformatic analysis was used to predict function of the differentially expressed mRNAs. Microarray results and the key pathways were validated by quantitative real-time polymerase chain reaction (qRT-PCR). Differentially expressed mRNAs at different time-points (3, 7, and 14 days) after injury were identified and compared with a control group (0 day). Nine general trends of changes in gene expression were identified. Key signal pathways and 9 biological processes closely associated with nerve regeneration were identified and verified. Differentially expressed genes and biological processes and pathways associated with axonal regeneration may elucidate the molecular-biological mechanisms underlying peripheral nerve regeneration. Muscle Nerve 55: 373-383, 2017. © 2016 Wiley Periodicals, Inc.

  13. Functional gene group analysis identifies synaptic gene groups as risk factor for schizophrenia

    PubMed Central

    Lips, E S; Cornelisse, L N; Toonen, R F; Min, J L; Hultman, C M; Holmans, P A; O'Donovan, M C; Purcell, S M; Smit, A B; Verhage, M; Sullivan, P F; Visscher, P M; Posthuma, D

    2012-01-01

    Schizophrenia is a highly heritable disorder with a polygenic pattern of inheritance and a population prevalence of ∼1%. Previous studies have implicated synaptic dysfunction in schizophrenia. We tested the accumulated association of genetic variants in expert-curated synaptic gene groups with schizophrenia in 4673 cases and 4965 healthy controls, using functional gene group analysis. Identifying groups of genes with similar cellular function rather than genes in isolation may have clinical implications for finding additional drug targets. We found that a group of 1026 synaptic genes was significantly associated with the risk of schizophrenia (P=7.6 × 10(-11)) and more strongly associated than 100 randomly drawn, matched control groups of genetic variants (P<0.01). Subsequent analysis of synaptic subgroups suggested that the strongest association signals are derived from three synaptic gene groups: intracellular signal transduction (P=2.0 × 10(-4)), excitability (P=9.0 × 10(-4)) and cell adhesion and trans-synaptic signaling (P=2.4 × 10(-3)). These results are consistent with a role of synaptic dysfunction in schizophrenia and imply that impaired intracellular signal transduction in synapses, synaptic excitability and cell adhesion and trans-synaptic signaling play a role in the pathology of schizophrenia. PMID:21931320

  14. Functional gene group analysis identifies synaptic gene groups as risk factor for schizophrenia.

    PubMed

    Lips, E S; Cornelisse, L N; Toonen, R F; Min, J L; Hultman, C M; Holmans, P A; O'Donovan, M C; Purcell, S M; Smit, A B; Verhage, M; Sullivan, P F; Visscher, P M; Posthuma, D

    2012-10-01

    Schizophrenia is a highly heritable disorder with a polygenic pattern of inheritance and a population prevalence of ~1%. Previous studies have implicated synaptic dysfunction in schizophrenia. We tested the accumulated association of genetic variants in expert-curated synaptic gene groups with schizophrenia in 4673 cases and 4965 healthy controls, using functional gene group analysis. Identifying groups of genes with similar cellular function rather than genes in isolation may have clinical implications for finding additional drug targets. We found that a group of 1026 synaptic genes was significantly associated with the risk of schizophrenia (P=7.6 × 10(-11)) and more strongly associated than 100 randomly drawn, matched control groups of genetic variants (P<0.01). Subsequent analysis of synaptic subgroups suggested that the strongest association signals are derived from three synaptic gene groups: intracellular signal transduction (P=2.0 × 10(-4)), excitability (P=9.0 × 10(-4)) and cell adhesion and trans-synaptic signaling (P=2.4 × 10(-3)). These results are consistent with a role of synaptic dysfunction in schizophrenia and imply that impaired intracellular signal transduction in synapses, synaptic excitability and cell adhesion and trans-synaptic signaling play a role in the pathology of schizophrenia.

  15. GeneCOST: a novel scoring-based prioritization framework for identifying disease causing genes.

    PubMed

    Ozer, Bugra; Sağıroğlu, Mahmut; Demirci, Hüseyin

    2015-11-15

    Due to the big data produced by next-generation sequencing studies, there is an evident need for methods to extract the valuable information gathered from these experiments. In this work, we propose GeneCOST, a novel scoring-based method to evaluate every gene for its disease association. Without any prior filtering and any prior knowledge, we assign a disease likelihood score to each gene in correspondence with its variations. Then, we rank all genes based on frequency, conservation, pedigree and detailed variation information to find the causative reason for the disease state. We demonstrate the usage of GeneCOST with public and real-life Mendelian disease cases including recessive, dominant, compound heterozygous and sporadic models. As a result, we were able to identify the causative reason behind the disease state in the top rankings of our list, suggesting that this novel prioritization framework provides a powerful environment for analysis in genetic disease studies as an alternative to filtering-based approaches. GeneCOST software is freely available at www.igbam.bilgem.tubitak.gov.tr/en/softwares/genecost-en/index.html. Contact: buozer@gmail.com. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  16. LNDriver: identifying driver genes by integrating mutation and expression data based on gene-gene interaction network.

    PubMed

    Wei, Pi-Jing; Zhang, Di; Xia, Junfeng; Zheng, Chun-Hou

    2016-12-23

    Cancer is a complex disease which is characterized by the accumulation of genetic alterations during the patient's lifetime. With the development of the next-generation sequencing technology, multiple omics data, such as cancer genomic, epigenomic and transcriptomic data etc., can be measured from each individual. Correspondingly, one of the key challenges is to pinpoint functional driver mutations or pathways, which contributes to tumorigenesis, from millions of functional neutral passenger mutations. In this paper, in order to identify driver genes effectively, we applied a generalized additive model to mutation profiles to filter genes with long length and constructed a new gene-gene interaction network. Then we integrated the mutation data and expression data into the gene-gene interaction network. Lastly, greedy algorithm was used to prioritize candidate driver genes from the integrated data. We named the proposed method Length-Net-Driver (LNDriver). Experiments on three TCGA datasets, i.e., head and neck squamous cell carcinoma, kidney renal clear cell carcinoma and thyroid carcinoma, demonstrated that the proposed method was effective. Also, it can identify not only frequently mutated drivers, but also rare candidate driver genes.

  17. Identifying nonspecific SAGE tags by context of gene expression.

    PubMed

    Ge, Xijin; Wang, San Ming

    2008-01-01

    Many serial analysis of gene expression (SAGE) tags can be matched to multiple genes, leading to difficulty in SAGE data interpretation and analysis. As only a subset of genes in the human genome are transcribed in a certain type of tissue/cell, we used microarray expression data from different tissue types to define contexts of gene expression and to annotate SAGE tags collected from the same or similar tissue sources. To predict the original transcript contributing a nonspecific SAGE tag collected from a particular tissue, we ranked the corresponding genes by their expression levels determined by microarray. We developed a tissue-specific SAGE tag annotation database based on microarray data collected from 73 normal human tissues and 18 cancer tissues and cell lines. The database can be queried online at: http://www.basic.northwestern.edu/SAGE/. The accuracy of this database was confirmed by experimental data.
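
    The annotation idea in this abstract, ranking the genes that match an ambiguous SAGE tag by their microarray expression level in the tissue the tag came from, can be sketched as follows. All tag names, gene names and expression values are invented placeholders, and rank_candidates is a hypothetical helper, not the interface of the authors' database.

```python
# Hypothetical data: one nonspecific tag matching three genes, plus
# invented microarray expression levels for two tissues.
tag_to_genes = {"TAG_A": ["GENE1", "GENE2", "GENE3"]}
expression = {
    "liver": {"GENE1": 12.0, "GENE2": 850.0, "GENE3": 40.0},
    "brain": {"GENE1": 300.0, "GENE2": 5.0},  # GENE3 not detected in brain
}


def rank_candidates(tag, tissue):
    """Order the genes matching a tag by expression in the given tissue.

    Genes with no measurement in that tissue are treated as unexpressed (0.0).
    """
    levels = expression.get(tissue, {})
    candidates = tag_to_genes.get(tag, [])
    return sorted(candidates, key=lambda gene: levels.get(gene, 0.0), reverse=True)


print(rank_candidates("TAG_A", "liver"))
print(rank_candidates("TAG_A", "brain"))
```

    In this toy data the same tag is assigned to a different most likely gene in each tissue, which is the tissue-context effect the abstract describes.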

  18. An in vivo screening system to identify tumorigenic genes.

    PubMed

    Ihara, T; Hosokawa, Y; Kumazawa, K; Ishikawa, K; Fujimoto, J; Yamamoto, M; Muramkami, T; Goshima, N; Ito, E; Watanabe, S; Semba, K

    2017-04-06

    Screening for oncogenes has mostly been performed by in vitro transformation assays. However, some oncogenes might not exhibit their transforming activities in vitro unless putative essential factors from in vivo microenvironments are adequately supplied. Here, we have developed an in vivo screening system that evaluates the tumorigenicity of target genes. This system uses a retroviral high-efficiency gene transfer technique, a large collection of human cDNA clones corresponding to ~70% of human genes and a luciferase-expressing immortalized mouse mammary epithelial cell line (NMuMG-luc). From 845 genes that were highly expressed in human breast cancer cell lines, we focused on 205 genes encoding membrane proteins and/or kinases, as these had the greater possibility of being oncogenes or drug targets. The 205 genes were divided into five subgroups, each containing 34-43 genes, and then introduced into NMuMG-luc cells. These cells were subcutaneously injected into nude mice and monitored for tumor development by in vivo imaging. Tumors were observed in three subgroups. Using DNA microarray analyses and individual tumorigenic assays, we found that three genes, ADORA2B, PRKACB and LPAR3, were tumorigenic. ADORA2B and LPAR3 encode G-protein-coupled receptors and PRKACB encodes a protein kinase A catalytic subunit. Cells overexpressing ADORA2B, LPAR3 or PRKACB did not show transforming phenotypes in vitro, suggesting that transformation by these genes requires in vivo microenvironments. In addition, several clinical data sets, including one for breast cancer, showed that the expression of these genes correlated with lower overall survival rate.

  19. Dissecting the Gene Network of Dietary Restriction to Identify Evolutionarily Conserved Pathways and New Functional Genes

    PubMed Central

    Wuttke, Daniel; Connor, Richard; Vora, Chintan; Craig, Thomas; Li, Yang; Wood, Shona; Vasieva, Olga; Shmookler Reis, Robert; Tang, Fusheng; de Magalhães, João Pedro

    2012-01-01

    Dietary restriction (DR), limiting nutrient intake from diet without causing malnutrition, delays the aging process and extends lifespan in multiple organisms. The conserved life-extending effect of DR suggests the involvement of fundamental mechanisms, although these remain a subject of debate. To help decipher the life-extending mechanisms of DR, we first compiled a list of genes that if genetically altered disrupt or prevent the life-extending effects of DR. We called these DR–essential genes and identified more than 100 in model organisms such as yeast, worms, flies, and mice. In order for other researchers to benefit from this first curated list of genes essential for DR, we established an online database called GenDR (http://genomics.senescence.info/diet/). To dissect the interactions of DR–essential genes and discover the underlying lifespan-extending mechanisms, we then used a variety of network and systems biology approaches to analyze the gene network of DR. We show that DR–essential genes are more conserved at the molecular level and have more molecular interactions than expected by chance. Furthermore, we employed a guilt-by-association method to predict novel DR–essential genes. In budding yeast, we predicted nine genes related to vacuolar functions; we show experimentally that mutations deleting eight of those genes prevent the life-extending effects of DR. Three of these mutants (OPT2, FRE6, and RCR2) had extended lifespan under ad libitum, indicating that the lack of further longevity under DR is not caused by a general compromise of fitness. These results demonstrate how network analyses of DR using GenDR can be used to make phenotypically relevant predictions. Moreover, gene-regulatory circuits reveal that the DR–induced transcriptional signature in yeast involves nutrient-sensing, stress responses and meiotic transcription factors. Finally, comparing the influence of gene expression changes during DR on the interactomes of multiple

  20. Dissecting the gene network of dietary restriction to identify evolutionarily conserved pathways and new functional genes.

    PubMed

    Wuttke, Daniel; Connor, Richard; Vora, Chintan; Craig, Thomas; Li, Yang; Wood, Shona; Vasieva, Olga; Shmookler Reis, Robert; Tang, Fusheng; de Magalhães, João Pedro

    2012-01-01

    Dietary restriction (DR), limiting nutrient intake from diet without causing malnutrition, delays the aging process and extends lifespan in multiple organisms. The conserved life-extending effect of DR suggests the involvement of fundamental mechanisms, although these remain a subject of debate. To help decipher the life-extending mechanisms of DR, we first compiled a list of genes that if genetically altered disrupt or prevent the life-extending effects of DR. We called these DR-essential genes and identified more than 100 in model organisms such as yeast, worms, flies, and mice. In order for other researchers to benefit from this first curated list of genes essential for DR, we established an online database called GenDR (http://genomics.senescence.info/diet/). To dissect the interactions of DR-essential genes and discover the underlying lifespan-extending mechanisms, we then used a variety of network and systems biology approaches to analyze the gene network of DR. We show that DR-essential genes are more conserved at the molecular level and have more molecular interactions than expected by chance. Furthermore, we employed a guilt-by-association method to predict novel DR-essential genes. In budding yeast, we predicted nine genes related to vacuolar functions; we show experimentally that mutations deleting eight of those genes prevent the life-extending effects of DR. Three of these mutants (OPT2, FRE6, and RCR2) had extended lifespan under ad libitum, indicating that the lack of further longevity under DR is not caused by a general compromise of fitness. These results demonstrate how network analyses of DR using GenDR can be used to make phenotypically relevant predictions. Moreover, gene-regulatory circuits reveal that the DR-induced transcriptional signature in yeast involves nutrient-sensing, stress responses and meiotic transcription factors. 
    Finally, comparing the influence of gene expression changes during DR on the interactomes of multiple organisms led

  1. Gene expression in human hippocampus from cocaine abusers identifies genes which regulate extracellular matrix remodeling.

    PubMed

    Mash, Deborah C; ffrench-Mullen, Jarlath; Adi, Nikhil; Qin, Yujing; Buck, Andrew; Pablo, John

    2007-11-14

    The chronic effects of cocaine abuse on brain structure and function are blamed for the inability of most addicts to remain abstinent. Part of the difficulty in preventing relapse is the persisting memory of the intense euphoria or cocaine "rush". Most abused drugs and alcohol induce neuroplastic changes in brain pathways subserving emotion and cognition. Such changes may account for the consolidation and structural reconfiguration of synaptic connections with exposure to cocaine. Adaptive hippocampal plasticity could be related to specific patterns of gene expression with chronic cocaine abuse. Here, we compare gene expression profiles in the human hippocampus from cocaine addicts and age-matched drug-free control subjects. Cocaine abusers had 151 gene transcripts upregulated, while 91 gene transcripts were downregulated. Topping the list of cocaine-regulated transcripts was RECK in the human hippocampus (FC = 2.0; p<0.05). RECK is a membrane-anchored MMP inhibitor that is implicated in the coordinated regulation of extracellular matrix integrity and angiogenesis. In keeping with elevated RECK expression, active MMP9 protein levels were decreased in the hippocampus from cocaine abusers. Pathway analysis identified other genes regulated by cocaine that code for proteins involved in the remodeling of the cytomatrix and synaptic connections and the inhibition of blood vessel proliferation (PCDH8, LAMB1, ITGB6, CTGF and EphB4). The observed microarray phenotype in the human hippocampus identified RECK and other region-specific genes that may promote long-lasting structural changes with repeated cocaine abuse. Extracellular matrix remodeling in the hippocampus may be a persisting effect of chronic abuse that contributes to the compulsive and relapsing nature of cocaine addiction.

  2. Identifying the Interaction between Genes and Gene Products Based on Frequently Seen Verbs in Medline Abstracts.

    PubMed

    Sekimizu; Park; Tsujii

    1998-01-01

    We have selected the most frequently seen verbs from raw texts made up of 1 million words of Medline abstracts, and we were able to identify (or bracket) noun phrases contained in the corpus, with a precision rate of 90%. Then, based on the noun-phrase-bracketed corpus, we tried to find the subject and object terms for some frequently seen verbs in the domain. The precision rate of finding the right subject and object for each verb was about 73%. This task was only made possible because we were able to linguistically analyze (or parse) a large quantity of a raw corpus. Our approach will be useful for classifying genes and gene products and for identifying the interaction between them. It is the first step of our effort in building a genome-related thesaurus and hierarchies in a fully automatic way.

  3. Epidermal growth factor gene is a newly identified candidate gene for gout

    PubMed Central

    Han, Lin; Cao, Chunwei; Jia, Zhaotong; Liu, Shiguo; Liu, Zhen; Xin, Ruosai; Wang, Can; Li, Xinde; Ren, Wei; Wang, Xuefeng; Li, Changgui

    2016-01-01

    Chromosome 4q25 has been identified as a genomic region associated with gout. However, the associations of gout with the genes in this region have not yet been confirmed. Here, we performed two-stage analysis to determine whether variations in candidate genes in the 4q25 region are associated with gout in a male Chinese Han population. We first evaluated 96 tag single nucleotide polymorphisms (SNPs) in eight inflammatory/immune pathway- or glucose/lipid metabolism-related genes in the 4q25 region in 480 male gout patients and 480 controls. The SNP rs12504538, located in the elongation of very-long-chain-fatty-acid-like family member 6 gene (Elovl6), was found to be associated with gout susceptibility (Padjusted = 0.00595). In the second stage of analysis, we performed fine mapping analysis of 93 tag SNPs in Elovl6 and in the epidermal growth factor gene (EGF) and its flanking regions in 1017 male gout patients and 1897 healthy male controls. We observed a significant association between the T allele of EGF rs2298999 and gout (odds ratio = 0.77, 95% confidence interval = 0.67–0.88, Padjusted = 6.42 × 10(-3)). These results provide the first evidence for an association between the EGF rs2298999 C/T polymorphism and gout. Our findings should be validated in additional populations. PMID:27506295

  4. Systems Approaches to Identifying Gene Regulatory Networks in Plants

    PubMed Central

    Long, Terri A.; Brady, Siobhan M.; Benfey, Philip N.

    2009-01-01

    Complex gene regulatory networks are composed of genes, noncoding RNAs, proteins, metabolites, and signaling components. The availability of genome-wide mutagenesis libraries; large-scale transcriptome, proteome, and metabolome data sets; and new high-throughput methods that uncover protein interactions underscores the need for mathematical modeling techniques that better enable scientists to synthesize these large amounts of information and to understand the properties of these biological systems. Systems biology approaches can allow researchers to move beyond a reductionist approach and to both integrate and comprehend the interactions of multiple components within these systems. Descriptive and mathematical models for gene regulatory networks can reveal emergent properties of these plant systems. This review highlights methods that researchers are using to obtain large-scale data sets, and examples of gene regulatory networks modeled with these data. Emergent properties revealed by the use of these network models and perspectives on the future of systems biology are discussed. PMID:18616425

  5. Virus-induced gene silencing of Arabidopsis thaliana gene homologues in wheat identifies genes conferring improved drought tolerance.

    PubMed

    Manmathan, Harish; Shaner, Dale; Snelling, Jacob; Tisserat, Ned; Lapitan, Nora

    2013-03-01

    In a non-model staple crop like wheat (Triticum aestivum L.), functional validation of potential drought stress responsive genes identified in Arabidopsis could provide gene targets for breeding. Virus-induced gene silencing (VIGS) of genes of interest can overcome the inherent problems of polyploidy and limited transformation potential that hamper functional validation studies in wheat. In this study, three potential candidate genes shown to be involved in abiotic stress response pathways in Arabidopsis thaliana were selected for VIGS experiments in wheat. These include Era1 (enhanced response to abscisic acid), Cyp707a (ABA 8'-hydroxylase), and Sal1 (inositol polyphosphate 1-phosphatase). Gene homologues for these three genes were identified in wheat and cloned in the viral vector barley stripe mosaic virus (BSMV) in the antisense direction, followed by rub inoculation of BSMV viral RNA transcripts onto wheat plants. Quantitative real-time PCR showed that VIGS-treated wheat plants had significant reductions in target gene transcripts. When VIGS-treated plants generated for Era1 and Sal1 were subjected to limiting water conditions, they showed increased relative water content, improved water use efficiency, reduced gas exchange, and better vigour compared to water-stressed control plants inoculated with RNA from the empty viral vector (BSMV0). In comparison, the Cyp707a-silenced plants showed no improvement over BSMV0-inoculated plants under limited water conditions. These results indicate that Era1 and Sal1 play important roles in conferring drought tolerance in wheat. Other traits affected by Era1 silencing were also studied. Delayed seed germination in Era1-silenced plants suggests this gene may be a useful target for developing resistance to pre-harvest sprouting.

  6. Virus-induced gene silencing of Arabidopsis thaliana gene homologues in wheat identifies genes conferring improved drought tolerance

    PubMed Central

    Lapitan, Nora

    2013-01-01

    In a non-model staple crop like wheat (Triticum aestivum L.), functional validation of potential drought stress responsive genes identified in Arabidopsis could provide gene targets for breeding. Virus-induced gene silencing (VIGS) of genes of interest can overcome the inherent problems of polyploidy and limited transformation potential that hamper functional validation studies in wheat. In this study, three potential candidate genes shown to be involved in abiotic stress response pathways in Arabidopsis thaliana were selected for VIGS experiments in wheat. These include Era1 (enhanced response to abscisic acid), Cyp707a (ABA 8’-hydroxylase), and Sal1 (inositol polyphosphate 1-phosphatase). Gene homologues for these three genes were identified in wheat and cloned in the viral vector barley stripe mosaic virus (BSMV) in the antisense direction, followed by rub inoculation of BSMV viral RNA transcripts onto wheat plants. Quantitative real-time PCR showed that VIGS-treated wheat plants had significant reductions in target gene transcripts. When VIGS-treated plants generated for Era1 and Sal1 were subjected to limiting water conditions, they showed increased relative water content, improved water use efficiency, reduced gas exchange, and better vigour compared to water-stressed control plants inoculated with RNA from the empty viral vector (BSMV0). In comparison, the Cyp707a-silenced plants showed no improvement over BSMV0-inoculated plants under limited water conditions. These results indicate that Era1 and Sal1 play important roles in conferring drought tolerance in wheat. Other traits affected by Era1 silencing were also studied. Delayed seed germination in Era1-silenced plants suggests this gene may be a useful target for developing resistance to pre-harvest sprouting. PMID:23364940

  7. Protein networks identify novel symbiogenetic genes resulting from plastid endosymbiosis

    PubMed Central

    Méheust, Raphaël; Zelzion, Ehud; Bhattacharya, Debashish; Lopez, Philippe; Bapteste, Eric

    2016-01-01

    The integration of foreign genetic information is central to the evolution of eukaryotes, as has been demonstrated for the origin of the Calvin cycle and of the heme and carotenoid biosynthesis pathways in algae and plants. For photosynthetic lineages, this coordination involved three genomes of divergent phylogenetic origins (the nucleus, plastid, and mitochondrion). Major hurdles overcome by the ancestor of these lineages were harnessing the oxygen-evolving organelle, optimizing the use of light, and stabilizing the partnership between the plastid endosymbiont and host through retargeting of proteins to the nascent organelle. Here we used protein similarity networks that can disentangle reticulate gene histories to explore how these significant challenges were met. We discovered a previously hidden component of algal and plant nuclear genomes that originated from the plastid endosymbiont: symbiogenetic genes (S genes). These composite proteins, exclusive to photosynthetic eukaryotes, encode a cyanobacterium-derived domain fused to one of cyanobacterial or another prokaryotic origin and have emerged multiple, independent times during evolution. Transcriptome data demonstrate the existence and expression of S genes across a wide swath of algae and plants, and functional data indicate their involvement in tolerance to oxidative stress, phototropism, and adaptation to nitrogen limitation. Our research demonstrates the “recycling” of genetic information by photosynthetic eukaryotes to generate novel composite genes, many of which function in plastid maintenance. PMID:26976593

  8. Protein networks identify novel symbiogenetic genes resulting from plastid endosymbiosis.

    PubMed

    Méheust, Raphaël; Zelzion, Ehud; Bhattacharya, Debashish; Lopez, Philippe; Bapteste, Eric

    2016-03-29

    The integration of foreign genetic information is central to the evolution of eukaryotes, as has been demonstrated for the origin of the Calvin cycle and of the heme and carotenoid biosynthesis pathways in algae and plants. For photosynthetic lineages, this coordination involved three genomes of divergent phylogenetic origins (the nucleus, plastid, and mitochondrion). Major hurdles overcome by the ancestor of these lineages were harnessing the oxygen-evolving organelle, optimizing the use of light, and stabilizing the partnership between the plastid endosymbiont and host through retargeting of proteins to the nascent organelle. Here we used protein similarity networks that can disentangle reticulate gene histories to explore how these significant challenges were met. We discovered a previously hidden component of algal and plant nuclear genomes that originated from the plastid endosymbiont: symbiogenetic genes (S genes). These composite proteins, exclusive to photosynthetic eukaryotes, encode a cyanobacterium-derived domain fused to one of cyanobacterial or another prokaryotic origin and have emerged multiple, independent times during evolution. Transcriptome data demonstrate the existence and expression of S genes across a wide swath of algae and plants, and functional data indicate their involvement in tolerance to oxidative stress, phototropism, and adaptation to nitrogen limitation. Our research demonstrates the "recycling" of genetic information by photosynthetic eukaryotes to generate novel composite genes, many of which function in plastid maintenance.

  9. Using new genetic tools to identify potato resistance genes

    USDA-ARS's Scientific Manuscript database

    Plant diseases present a burden to agriculture through yield losses due to plant stress, costs associated with disease control, and efforts to detect infections and limit disease epidemics. Plant breeders are interested in the identification and incorporation of simply inherited genes that confer ro...

  10. Exploiting natural variation to identify insect-resistance genes.

    PubMed

    Broekgaarden, Colette; Snoeren, Tjeerd A L; Dicke, Marcel; Vosman, Ben

    2011-10-01

    Herbivorous insects are widespread and often serious constraints to crop production. The use of insect-resistant crops is a very effective way to control insect pests in agriculture, and the development of such crops can be greatly enhanced by knowledge on plant resistance mechanisms and the genes involved. Plants have evolved diverse ways to cope with insect attack that has resulted in natural variation for resistance towards herbivorous insects. Studying the molecular genetics and transcriptional background of this variation has facilitated the identification of resistance genes and processes that lead to resistance against insects. With the development of new technologies, molecular studies are not restricted to model plants anymore. This review addresses the need to exploit natural variation in resistance towards insects to increase our knowledge on resistance mechanisms and the genes involved. We will discuss how this knowledge can be exploited in breeding programmes to provide sustainable crop protection against insect pests. Additionally, we discuss the current status of genetic research on insect-resistance genes. We conclude that insect-resistance mechanisms are still unclear at the molecular level and that exploiting natural variation with novel technologies will contribute greatly to the development of insect-resistant crop varieties.

  11. Gene expression profiling identifies genes predictive of oral squamous cell carcinoma.

    PubMed

    Chen, Chu; Méndez, Eduardo; Houck, John; Fan, Wenhong; Lohavanichbutr, Pawadee; Doody, Dave; Yueh, Bevan; Futran, Neal D; Upton, Melissa; Farwell, D Gregory; Schwartz, Stephen M; Zhao, Lue Ping

    2008-08-01

    Oral squamous cell carcinoma (OSCC) is associated with substantial mortality and morbidity. To identify potential biomarkers for the early detection of invasive OSCC, we compared the gene expressions of incident primary OSCC, oral dysplasia, and clinically normal oral tissue from surgical patients without head and neck cancer or preneoplastic oral lesions (controls), using Affymetrix U133 2.0 Plus arrays. We identified 131 differentially expressed probe sets using a training set of 119 OSCC patients and 35 controls. Forward and stepwise logistic regression analyses identified 10 successive combinations of genes whose expression differentiated OSCC from controls. The best model included LAMC2, encoding laminin-gamma2 chain, and COL4A1, encoding collagen, type IV alpha1 chain. Subsequent modeling without these two markers showed that COL1A1, encoding collagen, type I alpha1 chain, and PADI1, encoding peptidyl arginine deiminase, type 1, could also distinguish OSCC from controls. We validated these two models using an internal independent testing set of 48 invasive OSCC and 10 controls and an external testing set of 42 head and neck squamous cell carcinoma cases and 14 controls (GEO GSE6791), with sensitivity and specificity above 95%. These two models were also able to distinguish dysplasia (n = 17) from control (n = 35) tissue. Differential expression of these four genes was confirmed by quantitative reverse transcription-PCR. If confirmed in larger studies, the proposed models may hold promise for monitoring local recurrence at surgical margins and the development of second primary oral cancer in patients with OSCC.
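
    The validation figures in this record are sensitivity and specificity on held-out cases and controls. As a minimal sketch of how those two quantities are computed from classification counts (the function name and the counts below are hypothetical illustrations, not results from the study):

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = fraction of true cases called positive
    specificity = fraction of true controls called negative
    """
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts for a test set of 48 cases and 10 controls.
sens, spec = sensitivity_specificity(true_pos=47, false_neg=1,
                                     true_neg=10, false_pos=0)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

    With small control groups, a single misclassified control moves specificity by a large step, which is one reason external test sets are useful.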

  12. Transcriptome Sequencing Identified Genes and Gene Ontologies Associated with Early Freezing Tolerance in Maize

    PubMed Central

    Li, Zhao; Hu, Guanghui; Liu, Xiangfeng; Zhou, Yao; Li, Yu; Zhang, Xu; Yuan, Xiaohui; Zhang, Qian; Yang, Deguang; Wang, Tianyu; Zhang, Zhiwu

    2016-01-01

    Originating in a tropical climate, maize has faced great challenges as cultivation has expanded to the majority of the world's temperate zones. In these zones, frost and cold temperatures are major factors that prevent maize from reaching its full yield potential. Among 30 elite maize inbred lines adapted to northern China, we identified two lines of extreme, but opposite, freezing tolerance levels: highly tolerant and highly sensitive. During the seedling stage of these two lines, we used RNA-seq to measure changes in maize whole genome transcriptome before and after freezing treatment. In total, 19,794 genes were expressed, of which 4550 exhibited differential expression due to either treatment (before or after freezing) or line type (tolerant or sensitive). Of the 4550 differentially expressed genes, 948 exhibited differential expression due to treatment within line or lines under freezing condition. Analysis of gene ontology found that these 948 genes were significantly enriched for binding functions (DNA binding, ATP binding, and metal ion binding), protein kinase activity, and peptidase activity. Based on their enrichment, literature support, and significant levels of differential expression, 30 of these 948 genes were selected for quantitative real-time PCR (qRT-PCR) validation. The validation confirmed our RNA-Seq-based findings, with squared correlation coefficients of 80% and 50% in the tolerant and sensitive lines, respectively. This study provided valuable resources for further studies to enhance understanding of the molecular mechanisms underlying maize early freezing response and enable targeted breeding strategies for developing varieties with superior frost resistance to achieve yield potential. PMID:27774095
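
    The agreement between RNA-seq and qRT-PCR is reported here as a squared correlation coefficient. A minimal sketch of that calculation, assuming paired log2 fold changes per gene (the variable names and values are made up for illustration and are not data from the study):

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2 fold changes for five genes measured by both methods.
rnaseq_log2fc = [2.0, -1.0, 0.5, 3.0, -2.0]
qpcr_log2fc = [1.8, -0.7, 0.9, 2.5, -1.6]

r_squared = pearson_r(rnaseq_log2fc, qpcr_log2fc) ** 2
print(f"R^2 = {r_squared:.2f}")
```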

  13. Prokaryotic cDNA Subtraction: A Method to Rapidly Identify Functional Gene Biomarkers

    DTIC Science & Technology

    2008-10-01

    Final report. Prokaryotic cDNA Subtraction: A Method to Rapidly Identify Functional Gene Biomarkers. SERDP Project ER-1563, October 2008... Studying Biological Treatment with MBT... 2.2 Methods for Obtaining Gene Sequences

  14. Identifying mechanistic indicators of childhood asthma from blood gene expression

    EPA Science Inventory

    Asthmatic individuals have been identified as a susceptible subpopulation for air pollutants. However, asthma represents a syndrome with multiple probable etiologies, and the identification of these asthma endotypes is critical to accurately define the most susceptible subpopula...

  16. Transcriptome profiling to identify genes involved in peroxisome assembly and function

    PubMed Central

    Smith, Jennifer J.; Marelli, Marcello; Christmas, Rowan H.; Vizeacoumar, Franco J.; Dilworth, David J.; Ideker, Trey; Galitski, Timothy; Dimitrov, Krassen; Rachubinski, Richard A.; Aitchison, John D.

    2002-01-01

    Yeast cells were induced to proliferate peroxisomes, and microarray transcriptional profiling was used to identify PEX genes encoding peroxins involved in peroxisome assembly and genes involved in peroxisome function. Clustering algorithms identified 224 genes with expression profiles similar to those of genes encoding peroxisomal proteins and genes involved in peroxisome biogenesis. Several previously uncharacterized genes were identified, two of which, YPL112c and YOR084w, encode proteins of the peroxisomal membrane and matrix, respectively. Ypl112p, renamed Pex25p, is a novel peroxin required for the regulation of peroxisome size and maintenance. These studies demonstrate the utility of comparative gene profiling as an alternative to functional assays to identify genes with roles in peroxisome biogenesis. PMID:12135984

  17. A P-Norm Robust Feature Extraction Method for Identifying Differentially Expressed Genes

    PubMed Central

    Liu, Jian; Liu, Jin-Xing; Gao, Ying-Lian; Kong, Xiang-Zhen; Wang, Xue-Song; Wang, Dong

    2015-01-01

    In current molecular biology, it becomes more and more important to identify differentially expressed genes closely correlated with a key biological process from gene expression data. In this paper, based on the Schatten p-norm and Lp-norm, a novel p-norm robust feature extraction method is proposed to identify the differentially expressed genes. In our method, the Schatten p-norm is used as the regularization function to obtain a low-rank matrix and the Lp-norm is taken as the error function to improve the robustness to outliers in the gene expression data. The results on simulation data show that our method can obtain higher identification accuracies than the competitive methods. Numerous experiments on real gene expression data sets demonstrate that our method can identify more differentially expressed genes than the others. Moreover, we confirmed that the identified genes are closely correlated with the corresponding gene expression data. PMID:26201006
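
    For reference, the two norms named in this record have standard textbook definitions, written out below for a matrix X with singular values sigma_i and an error matrix E with entries e_ij. The abstract does not give the objective function, so only the definitions are shown; reading the Lp-norm as entrywise is an assumption.

```latex
% Schatten p-norm: the l_p norm of the vector of singular values of X
\|X\|_{S_p} = \Bigl( \sum_{i=1}^{\min(m,n)} \sigma_i(X)^{p} \Bigr)^{1/p}

% Entrywise L_p norm of the error matrix E (assumed reading)
\|E\|_{p} = \Bigl( \sum_{i=1}^{m} \sum_{j=1}^{n} |e_{ij}|^{p} \Bigr)^{1/p}
```

    With p = 1 the Schatten norm is the nuclear norm, the usual convex surrogate for matrix rank. For 0 < p < 1 both expressions are quasi-norms rather than true norms.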

  18. Candidate genes for limiting cholestatic intestinal injury identified by gene expression profiling

    PubMed Central

    Alaish, Samuel M; Timmons, Jennifer; Smith, Alexis; Buzza, Marguerite S; Murphy, Ebony; Zhao, Aiping; Sun, Yezhou; Turner, Douglas J; Shea-Donahue, Terez; Antalis, Toni M; Cross, Alan; Dorsey, Susan G

    2013-01-01

    The lack of bile flow from the liver into the intestine can have devastating complications including hepatic failure, sepsis, and even death. This pathologic condition known as cholestasis can result from etiologies as diverse as total parenteral nutrition (TPN), hepatitis, and pancreatic cancer. The intestinal injury associated with cholestasis has been shown to result in decreased intestinal resistance, increased bacterial translocation, and increased endotoxemia. Anecdotal clinical evidence suggests a genetic predisposition to exaggerated injury. Recent animal research on two different strains of inbred mice demonstrating different rates of bacterial translocation with different mortality rates supports this premise. In this study, a microarray analysis of intestinal tissue following common bile duct ligation (CBDL) performed under general anesthesia on these same two strains of inbred mice was done with the goal of identifying the potential molecular mechanistic pathways responsible. Over 500 genes were increased more than 2.0-fold following CBDL. The most promising candidate genes included major urinary proteins (MUPs), serine protease-1-inhibitor (Serpina1a), and lipocalin-2 (LCN-2). Quantitative polymerase chain reaction (qPCR) validated the microarray results for these candidate genes. In an in vitro experiment using differentiated intestinal epithelial cells, inhibition of MUP-1 by siRNA resulted in increased intestinal epithelial cell permeability. Diverse novel mechanisms involving the growth hormone pathway, the acute phase response, and the innate immune response are thus potential avenues for limiting cholestatic intestinal injury. Changes in gene expression were at times found to be not only due to the CBDL but also due to the murine strain. Should further studies in cholestatic patients demonstrate interindividual variability similar to what we have shown in mice, then a “personalized medicine” approach to cholestatic patients may become

  19. DNA methylation profiling identifies CG methylation clusters in Arabidopsis genes.

    PubMed

    Tran, Robert K; Henikoff, Jorja G; Zilberman, Daniel; Ditt, Renata F; Jacobsen, Steven E; Henikoff, Steven

    2005-01-26

    Cytosine DNA methylation in vertebrates is widespread, but methylation in plants is found almost exclusively at transposable elements and repetitive DNA. Within regions of methylation, methylcytosines are typically found in CG, CNG, and asymmetric contexts. CG sites are maintained by a plant homolog of mammalian Dnmt1 acting on hemi-methylated DNA after replication. Methylation of CNG and asymmetric sites appears to be maintained at each cell cycle by other mechanisms. We report a new type of DNA methylation in Arabidopsis, dense CG methylation clusters found at scattered sites throughout the genome. These clusters lack non-CG methylation and are preferentially found in genes, although they are relatively deficient toward the 5' end. CG methylation clusters are present in lines derived from different accessions and in mutants that eliminate de novo methylation, indicating that CG methylation clusters are stably maintained at specific sites. Because 5-methylcytosine is mutagenic, the appearance of CG methylation clusters over evolutionary time predicts a genome-wide deficiency of CG dinucleotides and an excess of C(A/T)G trinucleotides within transcribed regions. This is exactly what we find, implying that CG methylation clusters have contributed profoundly to plant gene evolution. We suggest that CG methylation clusters silence cryptic promoters that arise sporadically within transcription units.
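
    A deficiency of CG dinucleotides is commonly quantified with an observed/expected ratio. The sketch below uses that common ratio as an illustration; it is not necessarily the statistic used by the authors, and the function name is hypothetical.

```python
def cg_observed_expected(seq):
    """Observed/expected CG dinucleotide ratio for a DNA sequence.

    expected = (#C * #G) / length, so the ratio is (#CG * length) / (#C * #G).
    Values below 1 indicate fewer CG dinucleotides than base composition predicts.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    cg_count = seq.count("CG")
    return cg_count * len(seq) / (c_count * g_count)


print(cg_observed_expected("CCGG"))      # 1 CG, 2 C, 2 G, length 4
print(cg_observed_expected("CAGCTGCAG"))  # no CG dinucleotides at all
```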

  20. Linkage, Association, and Gene-Expression Analyses Identify CNTNAP2 as an Autism-Susceptibility Gene

    PubMed Central

    Alarcón, Maricela; Abrahams, Brett S.; Stone, Jennifer L.; Duvall, Jacqueline A.; Perederiy, Julia V.; Bomar, Jamee M.; Sebat, Jonathan; Wigler, Michael; Martin, Christa L.; Ledbetter, David H.; Nelson, Stanley F.; Cantor, Rita M.; Geschwind, Daniel H.

    2008-01-01

    Autism is a genetically complex neurodevelopmental syndrome in which language deficits are a core feature. We describe results from two complementary approaches used to identify risk variants on chromosome 7 that likely contribute to the etiology of autism. A two-stage association study tested 2758 SNPs across a 10 Mb 7q35 language-related autism QTL in AGRE (Autism Genetic Resource Exchange) trios and found significant association with Contactin Associated Protein-Like 2 (CNTNAP2), a strong a priori candidate. Male-only containing families were identified as primarily responsible for this association signal, consistent with the strong male affection bias in ASD and other language-based disorders. Gene-expression analyses in developing human brain further identified CNTNAP2 as enriched in circuits important for language development. Together, these results provide convergent evidence for involvement of CNTNAP2, a Neurexin family member, in autism, and demonstrate a connection between genetic risk for autism and specific brain structures. PMID:18179893

  1. Systematic analysis of microarray datasets to identify Parkinson's disease-associated pathways and genes

    PubMed Central

    Feng, Yinling; Wang, Xuefeng

    2017-01-01

    In order to investigate commonly disturbed genes and pathways in various brain regions of patients with Parkinson's disease (PD), microarray datasets from previous studies were collected and systematically analyzed. Different normalization methods were applied to microarray datasets from different platforms. A strategy combining gene co-expression networks and clinical information was adopted, using weighted gene co-expression network analysis (WGCNA) to screen for commonly disturbed genes in different brain regions of patients with PD. Functional enrichment analysis of commonly disturbed genes was performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). Co-pathway relationships were identified with Pearson's correlation coefficient tests and a hypergeometric distribution-based test. Common genes in pathway pairs were selected out and regarded as risk genes. A total of 17 microarray datasets from 7 platforms were retained for further analysis. Five gene coexpression modules were identified, containing 9,745, 736, 233, 101 and 93 genes, respectively. One module was significantly correlated with PD samples and thus the 736 genes it contained were considered to be candidate PD-associated genes. Functional enrichment analysis demonstrated that these genes were implicated in oxidative phosphorylation and PD. A total of 44 pathway pairs and 52 risk genes were revealed, and a risk gene pathway relationship network was constructed. Eight modules were identified and were revealed to be associated with PD, cancers and metabolism. A number of disturbed pathways and risk genes were unveiled in PD, and these findings may help advance understanding of PD pathogenesis. PMID:28098893
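
    A hypergeometric test of the kind mentioned in this record asks whether two gene sets share more genes than chance would predict. A minimal sketch using only the standard library (the function name, parameter names, and example counts are hypothetical, and the exact test configuration used in the study is not stated in the abstract):

```python
from math import comb


def hypergeom_upper_tail(overlap, size_a, size_b, universe):
    """P(X >= overlap) for X ~ Hypergeometric.

    Draw size_b genes without replacement from a universe of `universe`
    genes, of which size_a belong to set A; X counts draws that fall in A.
    """
    total = comb(universe, size_b)
    upper = min(size_a, size_b)
    favourable = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / total


# Hypothetical example: two pathways of 40 and 50 genes sharing 8 genes,
# against a background of 20,000 genes.
p_value = hypergeom_upper_tail(overlap=8, size_a=40, size_b=50, universe=20000)
print(f"p = {p_value:.3g}")
```

    Exact integer arithmetic with `math.comb` avoids floating-point underflow until the final division.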

  2. Network properties of complex human disease genes identified through genome-wide association studies.

    PubMed

    Barrenas, Fredrik; Chavali, Sreenivas; Holme, Petter; Mobini, Reza; Benson, Mikael

    2009-11-30

    Previous studies of network properties of human disease genes have mainly focused on monogenic diseases or cancers and have suffered from discovery bias. Here we investigated the network properties of complex disease genes identified by genome-wide association studies (GWAs), thereby eliminating discovery bias. We derived a network of complex diseases (n = 54) and complex disease genes (n = 349) to explore the shared genetic architecture of complex diseases. We evaluated the centrality measures of complex disease genes in comparison with essential and monogenic disease genes in the human interactome. The complex disease network showed that diseases belonging to the same disease class do not always share common disease genes. A possible explanation could be that the variants with higher minor allele frequency and larger effect size identified using GWAs constitute disjoint parts of the allelic spectra of similar complex diseases. The complex disease gene network showed high modularity with the size of the largest component being smaller than expected from a randomized null-model. This is consistent with limited sharing of genes between diseases. Complex disease genes are less central than the essential and monogenic disease genes in the human interactome. Genes associated with the same disease, compared to genes associated with different diseases, more often tend to share a protein-protein interaction and a Gene Ontology Biological Process. This indicates that network neighbors of known disease genes form an important class of candidates for identifying novel genes for the same disease.

  3. Systematic analysis of microarray datasets to identify Parkinson's disease‑associated pathways and genes.

    PubMed

    Feng, Yinling; Wang, Xuefeng

    2017-03-01

    In order to investigate commonly disturbed genes and pathways in various brain regions of patients with Parkinson's disease (PD), microarray datasets from previous studies were collected and systematically analyzed. Different normalization methods were applied to microarray datasets from different platforms. A strategy combining gene co‑expression networks and clinical information was adopted, using weighted gene co‑expression network analysis (WGCNA) to screen for commonly disturbed genes in different brain regions of patients with PD. Functional enrichment analysis of commonly disturbed genes was performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). Co‑pathway relationships were identified with Pearson's correlation coefficient tests and a hypergeometric distribution‑based test. Common genes in pathway pairs were selected out and regarded as risk genes. A total of 17 microarray datasets from 7 platforms were retained for further analysis. Five gene coexpression modules were identified, containing 9,745, 736, 233, 101 and 93 genes, respectively. One module was significantly correlated with PD samples and thus the 736 genes it contained were considered to be candidate PD‑associated genes. Functional enrichment analysis demonstrated that these genes were implicated in oxidative phosphorylation and PD. A total of 44 pathway pairs and 52 risk genes were revealed, and a risk gene pathway relationship network was constructed. Eight modules were identified and were revealed to be associated with PD, cancers and metabolism. A number of disturbed pathways and risk genes were unveiled in PD, and these findings may help advance understanding of PD pathogenesis.

  4. Sleeping Beauty transposon mutagenesis identifies genes that cooperate with mutant Smad4 in gastric cancer development.

    PubMed

    Takeda, Haruna; Rust, Alistair G; Ward, Jerrold M; Yew, Christopher Chin Kuan; Jenkins, Nancy A; Copeland, Neal G

    2016-04-05

    Mutations in SMAD4 predispose to the development of gastrointestinal cancer, which is the third leading cause of cancer-related deaths. To identify genes driving gastric cancer (GC) development, we performed a Sleeping Beauty (SB) transposon mutagenesis screen in the stomach of Smad4(+/-) mutant mice. This screen identified 59 candidate GC trunk drivers and a much larger number of candidate GC progression genes. Strikingly, 22 SB-identified trunk drivers are known or candidate cancer genes, whereas four SB-identified trunk drivers, including PTEN, SMAD4, RNF43, and NF1, are known human GC trunk drivers. Similar to human GC, pathway analyses identified WNT, TGF-β, and PI3K-PTEN signaling, ubiquitin-mediated proteolysis, adherens junctions, and RNA degradation in addition to genes involved in chromatin modification and organization as highly deregulated pathways in GC. Comparative oncogenomic filtering of the complete list of SB-identified genes showed that they are highly enriched for genes mutated in human GC and identified many candidate human GC genes. Finally, by comparing our complete list of SB-identified genes against the list of mutated genes identified in five large-scale human GC sequencing studies, we identified LDL receptor-related protein 1B (LRP1B) as a previously unidentified human candidate GC tumor suppressor gene. In LRP1B, 129 mutations were found in 462 human GC samples sequenced, and LRP1B is one of the top 10 most deleted genes identified in a panel of 3,312 human cancers. SB mutagenesis has, thus, helped to catalog the cooperative molecular mechanisms driving SMAD4-induced GC growth and discover genes with potential clinical importance in human GC.

  5. Sleeping Beauty transposon mutagenesis identifies genes that cooperate with mutant Smad4 in gastric cancer development

    PubMed Central

    Takeda, Haruna; Rust, Alistair G.; Ward, Jerrold M.; Yew, Christopher Chin Kuan; Jenkins, Nancy A.; Copeland, Neal G.

    2016-01-01

    Mutations in SMAD4 predispose to the development of gastrointestinal cancer, which is the third leading cause of cancer-related deaths. To identify genes driving gastric cancer (GC) development, we performed a Sleeping Beauty (SB) transposon mutagenesis screen in the stomach of Smad4+/− mutant mice. This screen identified 59 candidate GC trunk drivers and a much larger number of candidate GC progression genes. Strikingly, 22 SB-identified trunk drivers are known or candidate cancer genes, whereas four SB-identified trunk drivers, including PTEN, SMAD4, RNF43, and NF1, are known human GC trunk drivers. Similar to human GC, pathway analyses identified WNT, TGF-β, and PI3K-PTEN signaling, ubiquitin-mediated proteolysis, adherens junctions, and RNA degradation in addition to genes involved in chromatin modification and organization as highly deregulated pathways in GC. Comparative oncogenomic filtering of the complete list of SB-identified genes showed that they are highly enriched for genes mutated in human GC and identified many candidate human GC genes. Finally, by comparing our complete list of SB-identified genes against the list of mutated genes identified in five large-scale human GC sequencing studies, we identified LDL receptor-related protein 1B (LRP1B) as a previously unidentified human candidate GC tumor suppressor gene. In LRP1B, 129 mutations were found in 462 human GC samples sequenced, and LRP1B is one of the top 10 most deleted genes identified in a panel of 3,312 human cancers. SB mutagenesis has, thus, helped to catalog the cooperative molecular mechanisms driving SMAD4-induced GC growth and discover genes with potential clinical importance in human GC. PMID:27006499

  6. Gene-trapping to identify and analyze genes expressed in the mouse hippocampus.

    PubMed

    Steel, M; Moss, J; Clark, K A; Kearns, I R; Davies, C H; Morris, R G; Skarnes, W C; Lathe, R

    1998-01-01

    Mice harboring random gene-trap insertions of a lacZ (beta-galactosidase)-neomycin resistance fusion cassette (beta-geo) were analyzed for expression in the hippocampus. In 4 of 15 lines reporter gene activity was observed in the hippocampal formation. In the obn line, enzyme activity was detected in the CA1-3 hippocampal subfields, in hpk expression was restricted to CA1, but in both lines reporter activity was also present in other brain regions. In the third line, kin, reporter activity was robustly expressed throughout the stratum pyramidale of CA1-3, with only low-level expression elsewhere. The final line (glnC) displayed ubiquitous expression of the reporter and was not analyzed further. Fusion transcripts for the first three lines were characterized; all encode polypeptides with features of membrane-associated signalling proteins. The obn fusion identified a human cDNA (B2-1) encoding a pleckstrin homology (PH) domain, while hpk sequences matched the Epstein-Barr Virus (EBV) inducible G-protein coupled receptor, EBI-1. kin identified an alternative form of the abl-related nonreceptor tyrosine kinase c-arg. Electrophysiological studies on mice homozygous for the insertions revealed normal synaptic transmission, paired-pulse facilitation and paired-pulse depression at Schaffer collateral-commissural CA1 synapses, and normal long-term potentiation (LTP) in obn and kin. hpk mice displayed an increase in hippocampal CA1 long-term potentiation (LTP), suggesting a role for this receptor in synaptic plasticity.

  7. A multistep screening method to identify genes using evolutionary transcriptome of plants.

    PubMed

    Kim, Chang-Kug; Lim, Hye-Min; Na, Jong-Kuk; Choi, Ji-Weon; Sohn, Seong-Han; Park, Soo-Chul; Kim, Young-Hwan; Kim, Yong-Kab; Kim, Dool-Yi

    2014-01-01

    We introduced a multistep screening method to identify the genes in plants using microarrays and ribonucleic acid (RNA)-seq transcriptome data. Our method describes the process for identifying genes using the salt-tolerance response pathways of the potato (Solanum tuberosum) plant. Gene expression was analyzed using microarrays and RNA-seq experiments that examined three potato lines (high, intermediate, and low salt tolerance) under conditions of salt stress. We screened the orthologous genes and pathway genes involved in salinity-related biosynthetic pathways, and identified nine potato genes that were candidates for salinity-tolerance pathways. The nine genes were selected to characterize their phylogenetic reconstruction with homologous genes of Arabidopsis thaliana, and a Circos diagram was generated to understand the relationships among the selected genes. The involvement of the selected genes in salt-tolerance pathways was verified by reverse transcription polymerase chain reaction analysis. One candidate potato gene was selected for physiological validation by generating dehydration-responsive element-binding 1 (DREB1)-overexpressing transgenic potato plants. The DREB1 overexpression lines exhibited increased salt tolerance and plant growth when compared to that of the control. Although the nine genes identified by our multistep screening method require further characterization and validation, this study demonstrates the power of our screening strategy after the initial identification of genes using microarrays and RNA-seq experiments.

  8. Identifying the rooted species tree from the distribution of unrooted gene trees under the coalescent.

    PubMed

    Allman, Elizabeth S; Degnan, James H; Rhodes, John A

    2011-06-01

    Gene trees are evolutionary trees representing the ancestry of genes sampled from multiple populations. Species trees represent populations of individuals, each with many genes, splitting into new populations or species. The coalescent process, which models ancestry of gene copies within populations, is often used to model the probability distribution of gene trees given a fixed species tree. This multispecies coalescent model provides a framework for phylogeneticists to infer species trees from gene trees using maximum likelihood or Bayesian approaches. Because the coalescent models a branching process over time, all trees are typically assumed to be rooted in this setting. Often, however, gene trees inferred by traditional phylogenetic methods are unrooted. We investigate probabilities of unrooted gene trees under the multispecies coalescent model. We show that when there are four species with one gene sampled per species, the distribution of unrooted gene tree topologies identifies the unrooted species tree topology and some, but not all, information in the species tree edges (branch lengths). The location of the root on the species tree is not identifiable in this situation. However, for 5 or more species with one gene sampled per species, we show that the distribution of unrooted gene tree topologies identifies the rooted species tree topology and all its internal branch lengths. The length of any pendant branch leading to a leaf of the species tree is also identifiable for any species from which more than one gene is sampled.
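
    The four-species case can be illustrated with a short calculation. The sketch below assumes one sampled lineage per species, branch lengths in coalescent units, and a species tree whose unrooted topology is ab|cd with internal edge length t; under those assumptions the matching unrooted gene tree has probability 1 - (2/3)exp(-t) and each of the other two has probability (1/3)exp(-t). The function name is hypothetical.

```python
import math


def unrooted_quartet_probs(internal_length):
    """Unrooted gene tree topology probabilities for four species.

    internal_length: length t (coalescent units) of the internal edge of
    the unrooted species tree ab|cd. Returns a dict keyed by topology.
    """
    if internal_length < 0:
        raise ValueError("branch length must be non-negative")
    discordant = math.exp(-internal_length) / 3.0
    return {
        "ab|cd": 1.0 - 2.0 * discordant,  # matches the species tree
        "ac|bd": discordant,
        "ad|bc": discordant,
    }


for t in (0.0, 0.5, 2.0):
    print(t, unrooted_quartet_probs(t))
```

    Because the distribution depends only on the single length t, different rootings of the same unrooted species tree that share that internal edge length yield identical distributions, which is consistent with the statement that the root is not identifiable for four species.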

  9. A search engine to identify pathway genes from expression data on multiple organisms

    PubMed Central

    Chen, Chunnuan; Weirauch, Matthew T; Powell, Corey C; Zambon, Alexander C; Stuart, Joshua M

    2007-01-01

    Background: The completion of several genome projects showed that most genes have not yet been characterized, especially in multicellular organisms. Although most genes have unknown functions, a large collection of data is available describing their transcriptional activities under many different experimental conditions. In many cases, the coregulation of a set of genes across a set of conditions can be used to infer roles for genes of unknown function. Results: We developed a search engine, the Multiple-Species Gene Recommender (MSGR), which scans gene expression datasets from multiple organisms to identify genes that participate in a genetic pathway. The MSGR takes a query consisting of a list of genes that function together in a genetic pathway from one of six organisms: Homo sapiens, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Arabidopsis thaliana, and Helicobacter pylori. Using a probabilistic method to merge searches, the MSGR identifies genes that are significantly coregulated with the query genes in one or more of those organisms. The MSGR achieves its highest accuracy for many human pathways when searches are combined across species. We describe specific examples in which new genes were identified to be involved in a neuromuscular signaling pathway and a cell-adhesion pathway. Conclusion: The search engine can scan large collections of gene expression data for new genes that are significantly coregulated with a pathway of interest. By integrating searches across organisms, the MSGR can identify pathway members whose coregulation is either ancient or newly evolved. PMID:17477880

  10. Effective Boolean dynamics analysis to identify functionally important genes in large-scale signaling networks.

    PubMed

    Trinh, Hung-Cuong; Kwon, Yung-Keun

    2015-11-01

    Efficiently identifying functionally important genes in order to understand the minimal requirements of normal cellular development is challenging. To this end, a variety of structural measures have been proposed and their effectiveness has been investigated in recent literature; however, few studies have shown the effectiveness of dynamics-based measures. This led us to investigate a dynamic measure for identifying functionally important genes, the effectiveness of which was verified by application to two large-scale human signaling networks. We specifically consider Boolean sensitivity-based dynamics against an update-rule perturbation (BSU) as a dynamic measure. Through investigations on two large-scale human signaling networks, we found that genes with relatively high BSU values show a slower evolutionary rate and higher proportions of essential genes and drug targets than other genes. Gene-ontology analysis showed clear differences between the former and latter groups of genes. Furthermore, we compare the identification accuracies of essential genes and drug targets via BSU and five well-known structural measures. Although BSU did not always show the best performance, it effectively identified the putative set of genes, which is significantly different from the results obtained via the structural measures. Most interestingly, BSU showed the highest synergy effect in identifying the functionally important genes in conjunction with other measures. Our results imply that Boolean-sensitive dynamics can be used as a measure to effectively identify functionally important genes in signaling networks.
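
    The idea of a sensitivity measure under an update-rule perturbation can be sketched on a toy network. The abstract does not give the exact BSU definition, so the version below is an assumption: a node's rule is negated, and sensitivity is the fraction of initial states whose attractor changes under synchronous updating. The three-node network and all names are hypothetical.

```python
from itertools import product


def step(state, rules):
    """Apply every node's update rule once (synchronous update)."""
    return tuple(rule(state) for rule in rules)


def attractor(state, rules):
    """Iterate until a state repeats and return the resulting cycle as a frozenset."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state, rules)
    return frozenset(seen[seen.index(state):])


def rule_sensitivity(rules, node):
    """Fraction of initial states whose attractor changes when node's rule is negated."""
    original = rules[node]
    perturbed = list(rules)
    perturbed[node] = lambda s: not original(s)  # assumed form of the perturbation
    states = list(product([False, True], repeat=len(rules)))
    changed = sum(attractor(s, rules) != attractor(s, perturbed) for s in states)
    return changed / len(states)


# Hypothetical toy network over nodes (A, B, C).
toy_rules = [
    lambda s: s[1] and s[2],  # A <- B AND C
    lambda s: s[0],           # B <- A
    lambda s: s[2],           # C <- C (self-sustaining input)
]

sensitivities = [rule_sensitivity(toy_rules, i) for i in range(len(toy_rules))]
```

    Exhaustive enumeration of initial states is only feasible for very small networks; for networks of the size described in the abstract, sampling initial states would be the natural substitute.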

  11. LGscore: A method to identify disease-related genes using biological literature and Google data.

    PubMed

    Kim, Jeongwoo; Kim, Hyunjin; Yoon, Youngmi; Park, Sanghyun

    2015-04-01

    Since the genome project in the 1990s, a number of studies associated with genes have been conducted and researchers have confirmed that genes are involved in disease. For this reason, the identification of the relationships between diseases and genes is important in biology. We propose a method called LGscore, which identifies disease-related genes using Google data and literature data. To implement this method, first, we construct a disease-related gene network using text-mining results. We then extract gene-gene interactions based on co-occurrences in abstract data obtained from PubMed, and calculate the weights of edges in the gene network by means of Z-scoring. The weights contain two values: the frequency and the Google search results. The frequency value is extracted from literature data, and the Google search result is obtained using Google. We assign a score to each gene through a network analysis. We assume that genes with a large number of links and numerous Google search results and frequency values are more likely to be involved in disease. For validation, we investigated the top 20 inferred genes for five different diseases using answer sets. The answer sets comprised six databases that contain information on disease-gene relationships. We identified a significant number of disease-related genes as well as candidate genes for Alzheimer's disease, diabetes, colon cancer, lung cancer, and prostate cancer. Our method was up to 40% more accurate than existing methods.

  12. Digital gene expression profiling of flax (Linum usitatissimum L.) stem peel identifies genes enriched in fiber-bearing phloem tissue.

    PubMed

    Guo, Yuan; Qiu, Caisheng; Long, Songhua; Chen, Ping; Hao, Dongmei; Preisner, Marta; Wang, Hui; Wang, Yufu

    2017-08-30

    To better understand the molecular mechanisms and gene expression characteristics associated with development of bast fiber cells within flax stem phloem, the gene expression profiles of flax stem peels and leaves were screened using Illumina's Digital Gene Expression (DGE) analysis. Four DGE libraries (2 for stem peel and 2 for leaf), ranging from 6.7 to 9.2 million clean reads, were obtained, which produced 7.0 million and 6.8 million mapped reads for flax stem peel and leaf, respectively. By differential gene expression analysis, a total of 975 genes, of which 708 (73%) have protein-coding annotation, were identified as phloem-enriched genes putatively involved in the processes of polysaccharide and cell wall metabolism. Differentially expressed genes (DEGs) were validated using quantitative RT-PCR; the expression pattern of all nine genes determined by qRT-PCR agreed well with that obtained by sequencing analysis. Cluster and Gene Ontology (GO) analysis revealed that a large number of genes related to metabolic process, catalytic activity and binding category were expressed predominantly in the stem peels. The Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the phloem-enriched genes suggested approximately 111 biological pathways. The large number of genes and pathways produced from DGE sequencing will expand our understanding of the complex molecular and cellular events in flax bast fiber development and provide a foundation for future studies on fiber development in other bast fiber crops.

  13. Comparing classification performance of several types of significant genes to identify key genes in uremia.

    PubMed

    Ying, X-X; Zhou, C-X

    2016-06-01

    End-stage renal failure causes profound changes in human gene expression, but the molecular causation of these pleomorphic effects, termed uremia, is poorly understood. The purpose of this study was to explore key genes in uremia by comparing the classification performance of five kinds of significant genes based on the support vector machine (SVM) model. The five kinds of genes were differentially expressed genes (DEGs), differential pathway genes (DPGs), common differential genes between DEGs and DPGs (CDGs), hub genes (HUGs) and common genes of hub genes and DEGs (CHDGs). In detail, DEGs were detected by the linear models for microarray data (Limma) package. The Attract method was utilized to capture DPGs from differential pathways. HUGs were determined according to topological centrality analysis of a mutual information network (MIN). Subsequently, the SVM model was implemented to assess the classification performance of DEGs, DPGs, CDGs, HUGs and CHDGs, using as indices the area under the receiver operating characteristic curve (AUC), true negative rate (TNR), true positive rate (TPR) and the Matthews correlation coefficient (MCC). A total of 166 DEGs, 597 DPGs, 13 CDGs, 29 HUGs and 10 CHDGs were obtained in uremia. In the SVM classification analysis, CHDGs had the best performance of all, with AUC = 0.99, TNR = 1.00, TPR = 0.97 and MCC = 0.95. Hence, we considered the CHDGs as key genes in uremia. The key genes identified in this investigation might provide vital insights into uremia progression and new therapies.
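
    Three of the indices named above follow directly from a confusion matrix; a minimal sketch is given here. The counts are hypothetical and are not the study's data, and AUC is omitted because it requires the classifier's continuous decision scores rather than counts.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return TPR, TNR and MCC from confusion-matrix counts."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate (sensitivity)
    tnr = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate (specificity)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"TPR": tpr, "TNR": tnr, "MCC": mcc}


# Hypothetical counts for illustration only.
example = classification_metrics(tp=3, fp=1, tn=3, fn=1)
```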

  14. Identifying the optimal gene and gene set in hepatocellular carcinoma based on differential expression and differential co-expression algorithm.

    PubMed

    Dong, Li-Yang; Zhou, Wei-Zhong; Ni, Jun-Wei; Xiang, Wei; Hu, Wen-Hao; Yu, Chang; Li, Hai-Yan

    2017-02-01

    The objective of this study was to identify the optimal gene and gene set for hepatocellular carcinoma (HCC) utilizing the differential expression and differential co-expression (DEDC) algorithm. The DEDC algorithm consisted of four parts: calculating differential expression (DE) by absolute t-value in t-statistics; computing differential co-expression (DC) based on Z-test; determining optimal thresholds on the basis of Chi-squared (χ2) maximization, with the corresponding gene being the optimal gene; and evaluating functional relevance of genes categorized into different partitions to determine the optimal gene set with highest mean minimum functional information (FI) gain (Δ*G). The optimal thresholds divided genes into four partitions: high DE and high DC (HDE-HDC), high DE and low DC (HDE-LDC), low DE and high DC (LDE-HDC), and low DE and low DC (LDE-LDC). In addition, the optimal gene was validated by conducting a reverse transcription-polymerase chain reaction (RT-PCR) assay. The optimal thresholds for DC and DE were 1.032 and 1.911, respectively. Using the optimal gene, the genes were divided into four partitions: HDE-HDC (2,053 genes), HDE-LDC (2,822 genes), LDE-HDC (2,622 genes), and LDE-LDC (6,169 genes). The optimal gene was microtubule-associated protein RP/EB family member 1 (MAPRE1), and the RT-PCR assay validated the significant difference between the HCC and normal state. The optimal gene set was nucleoside metabolic process (GO:0009116) with Δ*G = 18.681 and 24 HDE-HDC partitions in total. In conclusion, we successfully investigated the optimal gene, MAPRE1, and gene set, nucleoside metabolic process, which may be potential biomarkers for targeted therapy and provide significant insight for revealing the pathological mechanism underlying HCC.
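
    The partitioning step can be sketched as a simple two-threshold rule. The thresholds are the values reported in the abstract; treating a score equal to the threshold as "high" is an assumption, and the gene names and scores are made up for illustration.

```python
from collections import Counter

DE_THRESHOLD = 1.911  # optimal DE threshold reported in the abstract
DC_THRESHOLD = 1.032  # optimal DC threshold reported in the abstract


def partition(de, dc, de_threshold=DE_THRESHOLD, dc_threshold=DC_THRESHOLD):
    """Label a gene by its DE and DC scores (assumes >= threshold counts as high)."""
    de_label = "HDE" if de >= de_threshold else "LDE"
    dc_label = "HDC" if dc >= dc_threshold else "LDC"
    return f"{de_label}-{dc_label}"


def count_partitions(scores):
    """scores maps gene name -> (DE, DC); returns the number of genes per partition."""
    return Counter(partition(de, dc) for de, dc in scores.values())


# Hypothetical scores, not data from the study.
toy_scores = {
    "geneA": (2.5, 1.5),
    "geneB": (2.5, 0.5),
    "geneC": (1.0, 1.5),
    "geneD": (1.0, 0.5),
    "geneE": (3.0, 2.0),
}
partition_sizes = count_partitions(toy_scores)
```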

  15. Analysis of gene expression in the nervous system identifies key genes and novel candidates for health and disease.

    PubMed

    Carpanini, Sarah M; Wishart, Thomas M; Gillingwater, Thomas H; Manson, Jean C; Summers, Kim M

    2017-04-01

    The incidence of neurodegenerative diseases in the developed world has risen over the last century, concomitant with an increase in average human lifespan. A major challenge is therefore to identify genes that control neuronal health and viability with a view to enhancing neuronal health during ageing and reducing the burden of neurodegeneration. Analysis of gene expression data has recently been used to infer gene functions for a range of tissues from co-expression networks. We have now applied this approach to transcriptomic datasets from the mammalian nervous system available in the public domain. We have defined the genes critical for influencing neuronal health and disease in different neurological cell types and brain regions. The functional contribution of genes in each co-expression cluster was validated using human disease and knockout mouse phenotypes, pathways and gene ontology term annotation. Additionally a number of poorly annotated genes were implicated by this approach in nervous system function. Exploiting gene expression data available in the public domain allowed us to validate key nervous system genes and, importantly, to identify additional genes with minimal functional annotation but with the same expression pattern. These genes are thus novel candidates for a role in neurological health and disease and could now be further investigated to confirm their function and regulation during ageing and neurodegeneration.

  16. Network Diffusion-Based Prioritization of Autism Risk Genes Identifies Significantly Connected Gene Modules

    PubMed Central

    Mosca, Ettore; Bersanelli, Matteo; Gnocchi, Matteo; Moscatelli, Marco; Castellani, Gastone; Milanesi, Luciano; Mezzelani, Alessandra

    2017-01-01

    Autism spectrum disorder (ASD) is marked by a strong genetic heterogeneity, which is underlined by the low overlap between ASD risk gene lists proposed in different studies. In this context, molecular networks can be used to analyze the results of several genome-wide studies in order to underline those network regions harboring genetic variations associated with ASD, the so-called “disease modules.” In this work, we used a recent network diffusion-based approach to jointly analyze multiple ASD risk gene lists. We defined genome-scale prioritizations of human genes in relation to ASD genes from multiple studies, found significantly connected gene modules associated with ASD and predicted genes functionally related to ASD risk genes. Most of them play a role in synapsis and neuronal development and function; many are related to syndromes that can be in comorbidity with ASD and the remaining are involved in epigenetics, cell cycle, cell adhesion and cancer. PMID:28993790

  17. An Orthologous Epigenetic Gene Expression Signature Derived from Differentiating Embryonic Stem Cells Identifies Regulators of Cardiogenesis.

    PubMed

    Busser, Brian W; Lin, Yongshun; Yang, Yanqin; Zhu, Jun; Chen, Guokai; Michelson, Alan M

    2015-01-01

    Here we used predictive gene expression signatures within a multi-species framework to identify the genes that underlie cardiac cell fate decisions in differentiating embryonic stem cells. We show that the overlapping orthologous mouse and human genes are the most accurate candidate cardiogenic genes as these genes identified the most conserved developmental pathways that characterize the cardiac lineage. An RNAi-based screen of the candidate genes in Drosophila uncovered numerous novel cardiogenic genes. shRNA knockdown combined with transcriptome profiling of the newly-identified transcription factors zinc finger protein 503 and zinc finger E-box binding homeobox 2 and the well-known cardiac regulatory factor NK2 homeobox 5 revealed that zinc finger E-box binding homeobox 2 activates terminal differentiation genes required for cardiomyocyte structure and function whereas zinc finger protein 503 and NK2 homeobox 5 are required for specification of the cardiac lineage. We further demonstrated that an essential role of NK2 homeobox 5 and zinc finger protein 503 in specification of the cardiac lineage is the repression of gene expression programs characteristic of alternative cell fates. Collectively, these results show that orthologous gene expression signatures can be used to identify conserved cardiogenic pathways.

  18. Identifying differentially expressed genes in cancer patients using a non-parameter Ising model.

    PubMed

    Li, Xumeng; Feltus, Frank A; Sun, Xiaoqian; Wang, James Z; Luo, Feng

    2011-10-01

    Identification of genes and pathways involved in diseases and physiological conditions is a major task in systems biology. In this study, we developed a novel non-parameter Ising model to integrate protein-protein interaction network and microarray data for identifying differentially expressed (DE) genes. We also proposed a simulated annealing algorithm to find the optimal configuration of the Ising model. The Ising model was applied to two breast cancer microarray data sets. The results showed that more cancer-related DE sub-networks and genes were identified by the Ising model than those by the Markov random field model. Furthermore, cross-validation experiments showed that DE genes identified by Ising model can improve classification performance compared with DE genes identified by Markov random field model.

  19. Weighted gene co-expression network analysis identifies specific modules and hub genes related to coronary artery disease.

    PubMed

    Liu, Jing; Jing, Ling; Tu, Xilin

    2016-03-05

    The analysis of the potential molecular targets of coronary artery disease (CAD) is critical for understanding the molecular mechanisms of the disease. However, studies of global microarray gene co-expression analysis of CAD remain limited. Microarray data of CAD (GSE23561) were downloaded from Gene Expression Omnibus, including peripheral blood samples from CAD patients (n = 6) and controls (n = 9). The Limma package in R was used to identify the differentially expressed genes (DEGs) between CAD and control samples. Using the weighted gene co-expression network analysis (WGCNA) package in R, WGCNA was performed to identify significant modules in the network. Then, functional and pathway enrichment analyses were conducted for genes in the most significant module using DAVID software. Moreover, hub genes in the module were analyzed by the isubpathwayminer package in R and the GenCLiP 2.0 tool to identify the significant sub-pathways. A total of 3711 DEGs and 21 modules for them were identified in CAD samples. The most significant module was associated with the pathways of hypertrophic cardiomyopathy and membrane-related functions. In addition, the top 30 hub genes with high connectivity in the module were selected, and two genes (G6PD and S100A7) were taken as key molecules via sub-pathway screening and data mining. A module associated with the hypertrophic cardiomyopathy pathway was detected in CAD samples. G6PD and S100A7 were the potential targets in CAD. Our findings might provide novel insight into the underlying molecular mechanism of CAD.

  20. Transcriptome sequencing in pediatric acute lymphoblastic leukemia identifies fusion genes associated with distinct DNA methylation profiles.

    PubMed

    Marincevic-Zuniga, Yanara; Dahlberg, Johan; Nilsson, Sara; Raine, Amanda; Nystedt, Sara; Lindqvist, Carl Mårten; Berglund, Eva C; Abrahamsson, Jonas; Cavelier, Lucia; Forestier, Erik; Heyman, Mats; Lönnerholm, Gudmar; Nordlund, Jessica; Syvänen, Ann-Christine

    2017-08-14

    Structural chromosomal rearrangements that lead to expressed fusion genes are a hallmark of acute lymphoblastic leukemia (ALL). In this study, we performed transcriptome sequencing of 134 primary ALL patient samples to comprehensively detect fusion transcripts. We combined fusion gene detection with genome-wide DNA methylation analysis, gene expression profiling, and targeted sequencing to determine molecular signatures of emerging ALL subtypes. We identified 64 unique fusion events distributed among 80 individual patients, of which over 50% have not previously been reported in ALL. Although the majority of the fusion genes were found only in a single patient, we identified several recurrent fusion gene families defined by promiscuous fusion gene partners, such as ETV6, RUNX1, PAX5, and ZNF384, or recurrent fusion genes, such as DUX4-IGH. Our data show that patients harboring these fusion genes displayed characteristic genome-wide DNA methylation and gene expression signatures in addition to distinct patterns in single nucleotide variants and recurrent copy number alterations. Our study delineates the fusion gene landscape in pediatric ALL, including both known and novel fusion genes, and highlights fusion gene families with shared molecular etiologies, which may provide additional information for prognosis and therapeutic options in the future.

  1. Gene Expression Signature-Based Screening Identifies New Broadly Effective Influenza A Antivirals

    PubMed Central

    Josset, Laurence; Textoris, Julien; Loriod, Béatrice; Ferraris, Olivier; Moules, Vincent; Lina, Bruno; N'Guyen, Catherine; Diaz, Jean-Jacques; Rosa-Calatrava, Manuel

    2010-01-01

    Classical antiviral therapies target viral proteins and are consequently subject to resistance. To counteract this limitation, alternative strategies have been developed that target cellular factors. We hypothesized that such an approach could also be useful to identify broad-spectrum antivirals. The influenza A virus was used as a model for its viral diversity and because of the need to develop therapies against unpredictable viruses as recently underlined by the H1N1 pandemic. We proposed to identify a gene-expression signature associated with infection by different influenza A virus subtypes which would allow the identification of potential antiviral drugs with a broad anti-influenza spectrum of activity. We analyzed the cellular gene expression response to infection with five different human and avian influenza A virus strains and identified 300 genes as differentially expressed between infected and non-infected samples. The 20 most dysregulated genes were used to screen the connectivity map, a database of drug-associated gene expression profiles. Candidate antivirals were then identified by their inverse correlation to the query signature. We hypothesized that such molecules would induce an unfavorable cellular environment for influenza virus replication. Eight potential antivirals including ribavirin were identified and their effects were tested in vitro on five influenza A strains. Six of the molecules inhibited influenza viral growth. The new pandemic H1N1 virus, which was not used to define the gene expression signature of infection, was inhibited by five out of the eight identified molecules, demonstrating that this strategy could contribute to identifying new broad anti-influenza agents acting on cellular gene expression. The identified infection signature genes, the expression of which is modified upon infection, could encode cellular proteins involved in the viral life cycle.
This is the first study showing that gene expression-based screening can be

  2. Utility and Limitations of Using Gene Expression Data to Identify Functional Associations

    PubMed Central

    Peng, Cheng; Shiu, Shin-Han

    2016-01-01

    Gene co-expression has been widely used to hypothesize gene function through guilt-by-association. However, it is not clear to what degree co-expression is informative, whether it can be applied to genes involved in different biological processes, and how the type of dataset impacts inferences about gene functions. Here our goal is to assess the utility and limitations of using co-expression as a criterion to recover functional associations between genes. Using Arabidopsis thaliana as an example, we determined the percentage of gene pairs in a metabolic pathway with significant expression correlation. We found that many genes in the same pathway do not have similar transcript profiles, and that the choice of dataset, annotation quality, gene function, expression similarity measure, and clustering approach significantly impacts the ability to recover functional associations between genes. Some datasets are more informative in capturing coordinated expression profiles, and larger datasets are not always better. In addition, to recover the maximum number of known pathways and identify candidate genes with similar functions, it is important to explore rather exhaustively multiple dataset combinations, similarity measures, clustering algorithms and parameters. Finally, we validated the biological relevance of co-expression cluster memberships with an independent phenomics dataset and found that genes that consistently cluster with leucine degradation genes tend to have similar leucine levels in mutants. This study provides a framework for obtaining gene functional associations by maximizing the information that can be obtained from gene expression datasets. PMID:27935950
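
    The pathway-level statistic described above, the share of gene pairs whose expression profiles are correlated, can be sketched as follows. This is an illustration only: it uses Pearson correlation with a fixed cutoff as a stand-in for a significance test, and the cutoff, gene names and profiles are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / (sd_x * sd_y)


def coexpressed_pair_fraction(profiles, cutoff=0.8):
    """Fraction of gene pairs in one pathway with correlation >= cutoff."""
    pairs = list(combinations(sorted(profiles), 2))
    if not pairs:
        return 0.0
    hits = sum(pearson(profiles[a], profiles[b]) >= cutoff for a, b in pairs)
    return hits / len(pairs)


# Hypothetical pathway with three genes measured under four conditions.
toy_pathway = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
}
fraction = coexpressed_pair_fraction(toy_pathway)
```

    In the toy pathway only g1 and g2 rise together, so one of three pairs passes the cutoff; swapping the similarity measure or cutoff changes the result, which is the kind of dependence the abstract reports.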

  3. Systems Biology Approach to Identify Gene Network Signatures for Colorectal Cancer

    PubMed Central

    Sonachalam, Madhankumar; Shen, Jeffrey; Huang, Hui; Wu, Xiaogang

    2012-01-01

    In this work, we integrated prior knowledge from gene signatures and protein interactions with gene set enrichment analysis (GSEA) and gene/protein network modeling to identify gene network signatures from gene expression microarray data. We demonstrated how to apply this approach to discovering gene network signatures for colorectal cancer (CRC) from microarray datasets. First, we used GSEA to analyze the microarray data through enriching differential genes in different CRC-related gene sets from two publicly available up-to-date gene set databases: Molecular Signatures Database (MSigDB) and Gene Signatures Database (GeneSigDB). Second, we compared the enriched gene sets through enrichment score, false-discovery rate, and nominal p-value. Third, we constructed an integrated protein–protein interaction (PPI) network through connecting these enriched genes by high-quality interactions from a human annotated and predicted protein interaction database, with a confidence score labeled for each interaction. Finally, we mapped differential gene expressions onto the constructed network to build a comprehensive network model containing visualized transcriptome and proteome data. The results show that although MSigDB has more CRC-relevant gene sets than GeneSigDB, the integrated PPI network connecting the enriched genes from both MSigDB and GeneSigDB can provide a more complete view for discovering gene network signatures. We also found several important sub-network signatures for CRC, such as TP53 sub-network, PCNA sub-network, and IL8 sub-network, corresponding to apoptosis, DNA repair, and immune response, respectively. PMID:22629282

  4. Common Marker Genes Identified from Various Sample Types for Systemic Lupus Erythematosus.

    PubMed

    Bing, Peng-Fei; Xia, Wei; Wang, Lan; Zhang, Yong-Hong; Lei, Shu-Feng; Deng, Fei-Yan

    2016-01-01

    Systemic lupus erythematosus (SLE) is a complex auto-immune disease. Gene expression studies have been conducted to identify SLE-related genes in various types of samples. It is unknown whether there are common marker genes significant for SLE but independent of sample types, which may have potential for follow-up translational research. The aim of this study is to identify common marker genes across various sample types for SLE. Based on four public microarray gene expression datasets for SLE covering three representative types of blood-borne samples (monocyte; peripheral blood mononuclear cell, PBMC; whole blood), we utilized three statistics (fold-change, FC; t-test p value; false discovery rate adjusted p value) to scrutinize genes simultaneously regulated with SLE across various sample types. For common marker genes, we conducted Gene Ontology enrichment analysis and Protein-Protein Interaction analysis to gain insights into their functions. We identified 10 common marker genes associated with SLE (IFI6, IFI27, IFI44L, OAS1, OAS2, EIF2AK2, PLSCR1, STAT1, RNASE2, and GSTO1). Significant up-regulation of IFI6, IFI27, and IFI44L with SLE was observed in all the studied sample types, though the FC was most striking in monocyte, compared with PBMC and whole blood (8.82-251.66 vs. 3.73-74.05 vs. 1.19-1.87). Eight of the above 10 genes, all except RNASE2 and GSTO1, interact with each other and with known SLE susceptibility genes, and participate in immune response, RNA and protein catabolism, and cell death. Our data suggest that there exist common marker genes across various sample types for SLE. The 10 common marker genes identified herein deserve follow-up studies to dissect their potential as diagnostic or therapeutic markers to predict SLE or treatment response.
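
    Two of the three statistics named above can be sketched briefly. The abstract does not say which false discovery rate procedure was used; the Benjamini-Hochberg adjustment below is an assumption, and the fold-change helper and all input values are hypothetical.

```python
def fold_change(case_values, control_values):
    """Ratio of mean expression in cases to mean expression in controls."""
    case_mean = sum(case_values) / len(case_values)
    control_mean = sum(control_values) / len(control_values)
    return case_mean / control_mean


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical inputs for illustration only.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
fc = fold_change([8.0, 10.0, 12.0], [2.0, 2.0, 2.0])
```

    A gene would then be kept only if it passes the fold-change, p value and adjusted p value criteria in every sample type, which is how the abstract describes the selection of common markers.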

  5. Transposon insertional mutagenesis in mice identifies human breast cancer susceptibility genes and signatures for stratification.

    PubMed

    Chen, Liming; Jenjaroenpun, Piroon; Pillai, Andrea Mun Ching; Ivshina, Anna V; Ow, Ghim Siong; Efthimios, Motakis; Zhiqun, Tang; Tan, Tuan Zea; Lee, Song-Choon; Rogers, Keith; Ward, Jerrold M; Mori, Seiichi; Adams, David J; Jenkins, Nancy A; Copeland, Neal G; Ban, Kenneth Hon-Kim; Kuznetsov, Vladimir A; Thiery, Jean Paul

    2017-03-14

    Robust prognostic gene signatures and therapeutic targets are difficult to derive from expression profiling because of the significant heterogeneity within breast cancer (BC) subtypes. Here, we performed forward genetic screening in mice using Sleeping Beauty transposon mutagenesis to identify candidate BC driver genes in an unbiased manner, using a stabilized N-terminal truncated β-catenin gene as a sensitizer. We identified 134 mouse susceptibility genes from 129 common insertion sites within 34 mammary tumors. Of these, 126 genes were orthologous to protein-coding genes in the human genome (hereafter, human BC susceptibility genes, hBCSGs), 70% of which are previously reported cancer-associated genes, and ∼16% are known BC suppressor genes. Network analysis revealed a gene hub consisting of E1A binding protein P300 (EP300), CD44 molecule (CD44), neurofibromin (NF1) and phosphatase and tensin homolog (PTEN), which are linked to a significant number of mutated hBCSGs. From our survival prediction analysis of the expression of human BC genes in 2,333 BC cases, we isolated a six-gene-pair classifier that stratifies BC patients with high confidence into prognostically distinct low-, moderate-, and high-risk subgroups. Furthermore, we proposed prognostic classifiers identifying three basal and three claudin-low tumor subgroups. Intriguingly, our hBCSGs are mostly unrelated to cell cycle/mitosis genes and are distinct from the prognostic signatures currently used for stratifying BC patients. Our findings illustrate the strength and validity of integrating functional mutagenesis screens in mice with human cancer transcriptomic data to identify highly prognostic BC subtyping biomarkers.

  6. Transposon insertional mutagenesis in mice identifies human breast cancer susceptibility genes and signatures for stratification

    PubMed Central

    Chen, Liming; Jenjaroenpun, Piroon; Pillai, Andrea Mun Ching; Ivshina, Anna V.; Ow, Ghim Siong; Efthimios, Motakis; Zhiqun, Tang; Lee, Song-Choon; Rogers, Keith; Ward, Jerrold M.; Mori, Seiichi; Adams, David J.; Jenkins, Nancy A.; Copeland, Neal G.; Ban, Kenneth Hon-Kim; Kuznetsov, Vladimir A.; Thiery, Jean Paul

    2017-01-01

    Robust prognostic gene signatures and therapeutic targets are difficult to derive from expression profiling because of the significant heterogeneity within breast cancer (BC) subtypes. Here, we performed forward genetic screening in mice using Sleeping Beauty transposon mutagenesis to identify candidate BC driver genes in an unbiased manner, using a stabilized N-terminal truncated β-catenin gene as a sensitizer. We identified 134 mouse susceptibility genes from 129 common insertion sites within 34 mammary tumors. Of these, 126 genes were orthologous to protein-coding genes in the human genome (hereafter, human BC susceptibility genes, hBCSGs), 70% of which are previously reported cancer-associated genes, and ∼16% are known BC suppressor genes. Network analysis revealed a gene hub consisting of E1A binding protein P300 (EP300), CD44 molecule (CD44), neurofibromin (NF1) and phosphatase and tensin homolog (PTEN), which are linked to a significant number of mutated hBCSGs. From our survival prediction analysis of the expression of human BC genes in 2,333 BC cases, we isolated a six-gene-pair classifier that stratifies BC patients with high confidence into prognostically distinct low-, moderate-, and high-risk subgroups. Furthermore, we proposed prognostic classifiers identifying three basal and three claudin-low tumor subgroups. Intriguingly, our hBCSGs are mostly unrelated to cell cycle/mitosis genes and are distinct from the prognostic signatures currently used for stratifying BC patients. Our findings illustrate the strength and validity of integrating functional mutagenesis screens in mice with human cancer transcriptomic data to identify highly prognostic BC subtyping biomarkers. PMID:28251929

  7. Identifying and Analyzing Novel Epilepsy-Related Genes Using Random Walk with Restart Algorithm

    PubMed Central

    Guo, Wei; Shang, Dong-Mei; Cao, Jing-Hui; Feng, Kaiyan; Wang, ShaoPeng

    2017-01-01

    As a pathological condition, epilepsy is caused by abnormal neuronal discharge in the brain, which temporarily disrupts cerebral functions. Epilepsy is a chronic disease that occurs at all ages and can seriously affect patients' personal lives. Thus, effective medicines or instruments to treat the disease are highly needed. Identifying epilepsy-related genes is essential for understanding and treating the disease because the proteins encoded by these genes are candidate drug targets. In this study, a pioneering computational workflow was proposed to predict novel epilepsy-related genes using the random walk with restart (RWR) algorithm. As reported in the literature, the RWR algorithm often produces a number of false-positive genes, so in this study a permutation test and functional association tests were implemented to filter the genes identified by the RWR algorithm, which greatly reduced the number of suspected genes and resulted in only thirty-three novel epilepsy genes. Finally, these novel genes were analyzed in light of recently published literature. Our findings suggest that all novel genes are closely related to epilepsy. It is believed that the proposed workflow can also be applied to identify genes related to other diseases and deepen our understanding of the mechanisms of these diseases. PMID:28255556
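
    The random walk with restart step named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy network, the node names, the restart probability of 0.8, and the function name are all assumptions chosen for illustration, and the permutation and functional association filters are not covered.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.8, tol=1e-9, max_iter=10_000):
    """Score every node by its proximity to the seed nodes.

    adjacency: dict mapping node -> set of neighbouring nodes (undirected graph).
    seeds:     set of nodes known to be associated with the disease.
    restart:   probability of jumping back to a seed at each step (assumed value).
    """
    nodes = sorted(adjacency)
    # Start vector: probability mass spread evenly over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: return to the seeds with probability `restart`.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk term: each node passes its remaining mass evenly to its neighbours.
        for n in nodes:
            neighbours = adjacency[n]
            if not neighbours:
                continue  # isolated node: its walk mass is dropped in this sketch
            share = (1.0 - restart) * p[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical path-shaped network: geneA - geneB - geneC - geneD - geneE.
toy_network = {
    "geneA": {"geneB"},
    "geneB": {"geneA", "geneC"},
    "geneC": {"geneB", "geneD"},
    "geneD": {"geneC", "geneE"},
    "geneE": {"geneD"},
}
scores = random_walk_with_restart(toy_network, seeds={"geneA"})
```

    Non-seed nodes with high scores are the candidate genes; in the toy network the score falls off with distance from the seed.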

  8. Using phylogenomic patterns and gene ontology to identify proteins of importance in plant evolution.

    PubMed

    Cibrián-Jaramillo, Angélica; De la Torre-Bárcena, Jose E; Lee, Ernest K; Katari, Manpreet S; Little, Damon P; Stevenson, Dennis W; Martienssen, Rob; Coruzzi, Gloria M; DeSalle, Rob

    2010-07-12

    We use measures of congruence on a combined expressed sequenced tag genome phylogeny to identify proteins that have potential significance in the evolution of seed plants. Relevant proteins are identified based on the direction of partitioned branch and hidden support on the hypothesis obtained on a 16-species tree, constructed from 2,557 concatenated orthologous genes. We provide a general method for detecting genes or groups of genes that may be under selection in directions that are in agreement with the phylogenetic pattern. Gene partitioning methods and estimates of the degree and direction of support of individual gene partitions to the overall data set are used. Using this approach, we correlate positive branch support of specific genes for key branches in the seed plant phylogeny. In addition to basic metabolic functions, such as photosynthesis or hormones, genes involved in posttranscriptional regulation by small RNAs were significantly overrepresented in key nodes of the phylogeny of seed plants. Two genes in our matrix are of critical importance as they are involved in RNA-dependent regulation, essential during embryo and leaf development. These are Argonaute and the RNA-dependent RNA polymerase 6 found to be overrepresented in the angiosperm clade. We use these genes as examples of our phylogenomics approach and show that identifying partitions or genes in this way provides a platform to explain some of the more interesting organismal differences among species, and in particular, in the evolution of plants.

  9. Using Phylogenomic Patterns and Gene Ontology to Identify Proteins of Importance in Plant Evolution

    PubMed Central

    Cibrián-Jaramillo, Angélica; De la Torre-Bárcena, Jose E.; Lee, Ernest K.; Katari, Manpreet S.; Little, Damon P.; Stevenson, Dennis W.; Martienssen, Rob; Coruzzi, Gloria M.; DeSalle, Rob

    2010-01-01

    We use measures of congruence on a combined expressed sequenced tag genome phylogeny to identify proteins that have potential significance in the evolution of seed plants. Relevant proteins are identified based on the direction of partitioned branch and hidden support on the hypothesis obtained on a 16-species tree, constructed from 2,557 concatenated orthologous genes. We provide a general method for detecting genes or groups of genes that may be under selection in directions that are in agreement with the phylogenetic pattern. Gene partitioning methods and estimates of the degree and direction of support of individual gene partitions to the overall data set are used. Using this approach, we correlate positive branch support of specific genes for key branches in the seed plant phylogeny. In addition to basic metabolic functions, such as photosynthesis or hormones, genes involved in posttranscriptional regulation by small RNAs were significantly overrepresented in key nodes of the phylogeny of seed plants. Two genes in our matrix are of critical importance as they are involved in RNA-dependent regulation, essential during embryo and leaf development. These are Argonaute and the RNA-dependent RNA polymerase 6 found to be overrepresented in the angiosperm clade. We use these genes as examples of our phylogenomics approach and show that identifying partitions or genes in this way provides a platform to explain some of the more interesting organismal differences among species, and in particular, in the evolution of plants. PMID:20624728

  10. Genes identified by visible mutant phenotypes show increased bias toward one of two subgenomes of maize.

    PubMed

    Schnable, James C; Freeling, Michael

    2011-03-10

    Not all genes are created equal. Despite being supported by sequence conservation and expression data, knockout homozygotes of many genes show no visible effects, at least under laboratory conditions. We have identified a set of maize (Zea mays L.) genes which have been the subject of a disproportionate share of publications recorded at MaizeGDB. We manually anchored these "classical" maize genes to gene models in the B73 reference genome, and identified syntenic orthologs in other grass genomes. In addition to proofing the most recent version 2 maize gene models, we show that a subset of these genes, those that were identified by morphological phenotype prior to cloning, are retained at syntenic locations throughout the grasses at much higher levels than the average expressed maize gene, and are preferentially found on the maize1 subgenome even when a duplicate copy is still retained on the opposite subgenome. Maize1 is the subgenome that experienced less gene loss following the whole genome duplication in maize lineage 5-12 million years ago and genes located on this subgenome tend to be expressed at higher levels in modern maize. Links to the web based software that supported our syntenic analyses in the grasses should empower further research and support teaching involving the history of maize genetic research. Our findings exemplify the concept of "grasses as a single genetic system," where what is learned in one grass may be applied to another.

  11. Identifying suitable reference genes for gene expression analysis in developing skeletal muscle in pigs.

    PubMed

    Niu, Guanglin; Yang, Yalan; Zhang, YuanYuan; Hua, Chaoju; Wang, Zishuai; Tang, Zhonglin; Li, Kui

    2016-01-01

    The selection of suitable reference genes is crucial to accurately evaluate and normalize the relative expression level of target genes for gene function analysis. However, commonly used reference genes have variable expression levels in developing skeletal muscle. There are few reports that systematically evaluate the expression stability of reference genes across prenatal and postnatal developing skeletal muscle in mammals. Here, we used quantitative PCR to examine the expression levels of 15 candidate reference genes (ACTB, GAPDH, RNF7, RHOA, RPS18, RPL32, PPIA, H3F3, API5, B2M, AP1S1, DRAP1, TBP, WSB, and VAPB) in porcine skeletal muscle at 26 different developmental stages (15 prenatal and 11 postnatal periods). We evaluated gene expression stability using the computer algorithms geNorm, NormFinder, and BestKeeper. Our results indicated that GAPDH and ACTB had the greatest variability among the candidate genes across prenatal and postnatal stages of skeletal muscle development. RPS18, API5, and VAPB had stable expression levels in prenatal stages, whereas API5, RPS18, RPL32, and H3F3 had stable expression levels in postnatal stages. API5 and H3F3 expression levels had the greatest stability in all tested prenatal and postnatal stages, and were the most appropriate reference genes for gene expression normalization in developing skeletal muscle. Our data provide valuable information for gene expression analysis during different stages of skeletal muscle development in mammals. This information can provide a valuable guide for the analysis of human diseases.
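
    As a rough illustration of how a stability ranking such as geNorm's works, the sketch below computes an M value per gene: the mean, over all other candidate genes, of the standard deviation of the log2 expression ratios across samples. Lower M means more stable expression. This is a simplified sketch of the commonly described geNorm measure, not the geNorm software itself; the gene names and expression values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import math
from statistics import stdev


def genorm_m(expression):
    """Return a stability value M for each gene (lower = more stable).

    expression: dict mapping gene -> list of relative quantities on a linear
                scale, with samples in the same order for every gene.
    """
    genes = list(expression)
    m_values = {}
    for j in genes:
        pairwise_variation = []
        for k in genes:
            if k == j:
                continue
            # Log2 ratio of gene j to gene k in every sample.
            log_ratios = [math.log2(a / b) for a, b in zip(expression[j], expression[k])]
            # A constant ratio across samples gives a standard deviation of 0.
            pairwise_variation.append(stdev(log_ratios))
        m_values[j] = sum(pairwise_variation) / len(pairwise_variation)
    return m_values


# Hypothetical data: refA and refB keep a constant ratio, varC does not.
toy_expression = {
    "refA": [1.0, 2.0, 4.0],
    "refB": [2.0, 4.0, 8.0],
    "varC": [1.0, 8.0, 1.0],
}
m = genorm_m(toy_expression)
```

    In the toy data, varC receives the highest M and would be the first candidate to discard.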

  12. Identifying suitable reference genes for gene expression analysis in developing skeletal muscle in pigs

    PubMed Central

    Zhang, YuanYuan; Hua, Chaoju; Wang, Zishuai; Li, Kui

    2016-01-01

    The selection of suitable reference genes is crucial to accurately evaluate and normalize the relative expression level of target genes for gene function analysis. However, commonly used reference genes have variable expression levels in developing skeletal muscle. There are few reports that systematically evaluate the expression stability of reference genes across prenatal and postnatal developing skeletal muscle in mammals. Here, we used quantitative PCR to examine the expression levels of 15 candidate reference genes (ACTB, GAPDH, RNF7, RHOA, RPS18, RPL32, PPIA, H3F3, API5, B2M, AP1S1, DRAP1, TBP, WSB, and VAPB) in porcine skeletal muscle at 26 different developmental stages (15 prenatal and 11 postnatal periods). We evaluated gene expression stability using the computer algorithms geNorm, NormFinder, and BestKeeper. Our results indicated that GAPDH and ACTB had the greatest variability among the candidate genes across prenatal and postnatal stages of skeletal muscle development. RPS18, API5, and VAPB had stable expression levels in prenatal stages, whereas API5, RPS18, RPL32, and H3F3 had stable expression levels in postnatal stages. API5 and H3F3 expression levels had the greatest stability in all tested prenatal and postnatal stages, and were the most appropriate reference genes for gene expression normalization in developing skeletal muscle. Our data provide valuable information for gene expression analysis during different stages of skeletal muscle development in mammals. This information can provide a valuable guide for the analysis of human diseases. PMID:27994956

  13. TILLING by sequencing to identify induced mutations in stress resistance genes of peanut (Arachis hypogaea).

    PubMed

    Guo, Yufang; Abernathy, Brian; Zeng, Yajuan; Ozias-Akins, Peggy

    2015-03-07

    Targeting Induced Local Lesions in Genomes (TILLING) is a powerful reverse genetics approach for functional genomics studies. We used high-throughput sequencing, combined with a two-dimensional pooling strategy, with either minimum read percentage with non-reference nucleotide or minimum variance multiplier as mutation prediction parameters, to detect genes related to abiotic and biotic stress resistances. In peanut, lipoxygenase genes were reported to be highly induced in mature seeds infected with Aspergillus spp., indicating their importance in plant-fungus interactions. Recent studies showed that phospholipase D (PLD) expression was elevated more quickly in drought sensitive lines than in drought tolerant lines of peanut. A newly discovered lipoxygenase (LOX) gene in peanut, along with two peanut PLD genes from previous publications were selected for TILLING. Additionally, two major allergen genes Ara h 1 and Ara h 2, and fatty acid desaturase AhFAD2, a gene which controls the ratio of oleic to linoleic acid in the seed, were also used in our study. The objectives of this research were to develop a suitable TILLING by sequencing method for this allotetraploid, and use this method to identify mutations induced in stress related genes. We screened a peanut root cDNA library and identified three candidate LOX genes. The gene AhLOX7 was selected for TILLING due to its high expression in seeds and roots. By screening 768 M2 lines from the TILLING population, four missense mutations were identified for AhLOX7, three missense mutations were identified for AhPLD, one missense and two silent mutations were identified for Ara h 1.01, three silent and five missense mutations were identified for Ara h 1.02, one missense mutation was identified for AhFAD2B, and one silent mutation was identified for Ara h 2.02. The overall mutation frequency was 1 SNP/1,066 kb. The SNP detection frequency for single copy genes was 1 SNP/344 kb and 1 SNP/3,028 kb for multiple copy genes. 
Our

  14. Identifying genes for neuron survival and axon outgrowth in Hirudo medicinalis.

    PubMed

    Blackshaw, S E; Babington, E J; Emes, R D; Malek, J; Wang, W-Z

    2004-01-01

    We have studied the molecular basis of nervous system repair in invertebrate (Hirudo medicinalis) nerve cells. Unlike in mammals, neurons in invertebrates survive injury and regrow processes to restore the connections that they held before the damage occurred. To identify genes whose expression is regulated after injury, we have used subtractive probes, constructed from regenerating and non-regenerating ganglia from the leech Hirudo medicinalis, to screen cDNA libraries made from whole leech CNS or from identified microdissected neurons. We have identified genes of known or predicted function as well as novel genes. Known genes up-regulated within hours of injury and that are widely expressed in invertebrate and mammalian cells include thioredoxin and tubulin. Other known genes, e.g. Cysteine Rich Intestinal Protein (CRIP), have previously been identified in mammalian cells though not in regenerating adult neurons. Two regulated genes identified, myohemerythrin and the novel protein ReN3 are exclusively expressed in invertebrates. Thus our approach has enabled us to identify genes, present in a neuron of known function, that are up- and down-regulated within hours of axotomy, and that may underpin the intrinsic ability of invertebrate neurons to survive damage and initiate regrowth programmes.

  15. Identifying genes for neuron survival and axon outgrowth in Hirudo medicinalis

    PubMed Central

    Blackshaw, SE; Babington, EJ; Emes, RD; Malek, J; Wang, W-Z

    2004-01-01

    We have studied the molecular basis of nervous system repair in invertebrate (Hirudo medicinalis) nerve cells. Unlike in mammals, neurons in invertebrates survive injury and regrow processes to restore the connections that they held before the damage occurred. To identify genes whose expression is regulated after injury, we have used subtractive probes, constructed from regenerating and non-regenerating ganglia from the leech Hirudo medicinalis, to screen cDNA libraries made from whole leech CNS or from identified microdissected neurons. We have identified genes of known or predicted function as well as novel genes. Known genes up-regulated within hours of injury and that are widely expressed in invertebrate and mammalian cells include thioredoxin and tubulin. Other known genes, e.g. Cysteine Rich Intestinal Protein (CRIP), have previously been identified in mammalian cells though not in regenerating adult neurons. Two regulated genes identified, myohemerythrin and the novel protein ReN3 are exclusively expressed in invertebrates. Thus our approach has enabled us to identify genes, present in a neuron of known function, that are up- and down-regulated within hours of axotomy, and that may underpin the intrinsic ability of invertebrate neurons to survive damage and initiate regrowth programmes. PMID:14690474

  16. Analysis of pan-genome to identify the core genes and essential genes of Brucella spp.

    PubMed

    Yang, Xiaowen; Li, Yajie; Zang, Juan; Li, Yexia; Bie, Pengfei; Lu, Yanli; Wu, Qingmin

    2016-04-01

    Brucella spp. are facultative intracellular pathogens that cause a contagious zoonotic disease, which can result in abortion or sterility in susceptible animal hosts and grave, debilitating illness in humans. To decipher the survival mechanism of Brucella spp. in vivo, 42 complete Brucella genomes from NCBI were analyzed for the pan-genome and core genome by identifying the composition and function of the genomes. The results showed that the total 132,143 protein-coding genes in these genomes were divided into 5369 clusters. Among these, 1710 clusters were associated with the core genome, 1182 clusters with strain-specific genes and 2477 clusters with dispensable genomes. COG analysis indicated that 44% of the core genes were devoted to metabolism, mainly energy production and conversion (COG category C) and amino acid transport and metabolism (COG category E). Meanwhile, approximately 35% of the core genes were under positive selection. In addition, 1252 potential essential genes were predicted in the core genome by comparison with a prokaryote database of essential genes. The results suggested that the core genes in Brucella genomes are relatively conserved, and that energy and amino acid metabolism play a more important role in growth and reproduction in Brucella spp. This study might help us to better understand the mechanisms of Brucella persistent infection and provide some clues for further exploring the gene modules of intracellular survival in Brucella spp.
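
    The three cluster categories reported here (1710 + 1182 + 2477 = 5369) can be illustrated with a small sketch. It assumes the conventional definitions, which the abstract does not spell out: a core cluster is present in every genome, a strain-specific cluster in exactly one, and a dispensable cluster in anything in between. The cluster and genome names are hypothetical.

```python
from collections import Counter


def classify_clusters(cluster_to_genomes, n_genomes):
    """Label each gene cluster by how many genomes contain it.

    cluster_to_genomes: dict mapping cluster id -> iterable of genome ids
                        in which at least one member gene was found.
    n_genomes:          total number of genomes analysed.
    """
    labels = {}
    for cluster, genomes in cluster_to_genomes.items():
        present_in = len(set(genomes))
        if present_in == n_genomes:
            labels[cluster] = "core"             # found in every genome
        elif present_in == 1:
            labels[cluster] = "strain_specific"  # found in a single genome
        else:
            labels[cluster] = "dispensable"      # found in some but not all
    return labels


# Hypothetical toy input with three genomes.
toy_clusters = {
    "cluster1": ["g1", "g2", "g3"],
    "cluster2": ["g2"],
    "cluster3": ["g1", "g3"],
    "cluster4": ["g1", "g1", "g2", "g3"],  # paralogues in g1 count once
}
labels = classify_clusters(toy_clusters, n_genomes=3)
summary = Counter(labels.values())
```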

  17. A novel reverse-genetic approach (SIMF) identifies Mutator insertions in new Myb genes.

    PubMed

    Rabinowicz, P D; Grotewold, E

    2000-11-01

    We have developed a new strategy designated SIMF (Systematic Insertional Mutagenesis of Families), to identify DNA insertions in many members of a gene family simultaneously. This method requires only a short amino acid sequence conserved in all members of the family to make a degenerate oligonucleotide, and a sequence from the end of the DNA insertion. The SIMF strategy was successfully applied to the large maize R2R3 Myb family of regulatory genes, and Mutator insertions in several novel Myb genes were identified. Application of this technique to identify insertions in other large gene families could significantly decrease the effort involved in screening at the same time for insertions in all members of groups of genes that share a limited sequence identity.

  18. Targeted sequencing identifies 91 neurodevelopmental-disorder risk genes with autism and developmental-disability biases.

    PubMed

    Stessman, Holly A F; Xiong, Bo; Coe, Bradley P; Wang, Tianyun; Hoekzema, Kendra; Fenckova, Michaela; Kvarnung, Malin; Gerdts, Jennifer; Trinh, Sandy; Cosemans, Nele; Vives, Laura; Lin, Janice; Turner, Tychele N; Santen, Gijs; Ruivenkamp, Claudia; Kriek, Marjolein; van Haeringen, Arie; Aten, Emmelien; Friend, Kathryn; Liebelt, Jan; Barnett, Christopher; Haan, Eric; Shaw, Marie; Gecz, Jozef; Anderlid, Britt-Marie; Nordgren, Ann; Lindstrand, Anna; Schwartz, Charles; Kooy, R Frank; Vandeweyer, Geert; Helsmoortel, Celine; Romano, Corrado; Alberti, Antonino; Vinci, Mirella; Avola, Emanuela; Giusto, Stefania; Courchesne, Eric; Pramparo, Tiziano; Pierce, Karen; Nalabolu, Srinivasa; Amaral, David G; Scheffer, Ingrid E; Delatycki, Martin B; Lockhart, Paul J; Hormozdiari, Fereydoun; Harich, Benjamin; Castells-Nobau, Anna; Xia, Kun; Peeters, Hilde; Nordenskjöld, Magnus; Schenck, Annette; Bernier, Raphael A; Eichler, Evan E

    2017-04-01

    Gene-disruptive mutations contribute to the biology of neurodevelopmental disorders (NDDs), but most of the related pathogenic genes are not known. We sequenced 208 candidate genes from >11,730 cases and >2,867 controls. We identified 91 genes, including 38 new NDD genes, with an excess of de novo mutations or private disruptive mutations in 5.7% of cases. Drosophila functional assays revealed a subset with increased involvement in NDDs. We identified 25 genes showing a bias for autism versus intellectual disability and highlighted a network associated with high-functioning autism (full-scale IQ >100). Clinical follow-up for NAA15, KMT5B, and ASH1L highlighted new syndromic and nonsyndromic forms of disease.

  19. Combining Genome-Scale Experimental and Computational Methods To Identify Essential Genes in Rhodobacter sphaeroides

    PubMed Central

    Burger, Brian T.; Imam, Saheed; Scarborough, Matthew J.; Noguera, Daniel R.

    2017-01-01

    Abstract: Rhodobacter sphaeroides is one of the best-studied alphaproteobacteria from biochemical, genetic, and genomic perspectives. To gain a better systems-level understanding of this organism, we generated a large transposon mutant library and used transposon sequencing (Tn-seq) to identify genes that are essential under several growth conditions. Using newly developed Tn-seq analysis software (TSAS), we identified 493 genes as essential for aerobic growth on a rich medium. We then used the mutant library to identify conditionally essential genes under two laboratory growth conditions, identifying 85 additional genes required for aerobic growth in a minimal medium and 31 additional genes required for photosynthetic growth. In all instances, our analyses confirmed essentiality for many known genes and identified genes not previously considered to be essential. We used the resulting Tn-seq data to refine and improve a genome-scale metabolic network model (GEM) for R. sphaeroides. Together, we demonstrate how genetic, genomic, and computational approaches can be combined to obtain a systems-level understanding of the genetic framework underlying metabolic diversity in bacterial species. Importance: Knowledge about the role of genes under a particular growth condition is required for a holistic understanding of a bacterial cell and has implications for health, agriculture, and biotechnology. We developed the Tn-seq analysis software (TSAS) package to provide a flexible and statistically rigorous workflow for the high-throughput analysis of insertion mutant libraries, advanced the knowledge of gene essentiality in R. sphaeroides, and illustrated how Tn-seq data can be used to more accurately identify genes that play important roles in metabolism and other processes that are essential for cellular survival. PMID:28744485

  20. Genome-Wide RNAi Screens in C. elegans to Identify Genes Influencing Lifespan and Innate Immunity.

    PubMed

    Sinha, Amit; Rae, Robbie

    2016-01-01

    RNA interference is a rapid, inexpensive, and highly effective tool used to inhibit gene function. In C. elegans, whole genome screens have been used to identify genes involved with numerous traits including aging and innate immunity. RNAi in C. elegans can be carried out via feeding, soaking, or injection. Here we outline protocols used to maintain, grow, and carry out RNAi via feeding in C. elegans and determine whether the inhibited genes are essential for lifespan or innate immunity.

  1. Protein functional links in Trypanosoma brucei, identified by gene fusion analysis

    PubMed Central

    2011-01-01

    Background: Domain or gene fusion analysis is a bioinformatics method for detecting gene fusions in one organism by comparing its genome to that of other organisms. The occurrence of gene fusions suggests that the two original genes that participated in the fusion are functionally linked, i.e. their gene products interact either as part of a multi-subunit protein complex, or in a metabolic pathway. Gene fusion analysis has been used to identify protein functional links in prokaryotes as well as in eukaryotic model organisms, such as yeast and Drosophila. Results: In this study we have extended this approach to include a number of recently sequenced protists, four of which are pathogenic, to identify fusion linked proteins in Trypanosoma brucei, the causative agent of African sleeping sickness. We have also examined the evolution of the gene fusion events identified, to determine whether they can be attributed to fusion or fission, by looking at the conservation of the fused genes and of the individual component genes across the major eukaryotic and prokaryotic lineages. We find relatively limited occurrence of gene fusions/fissions within the protist lineages examined. Our results point to two trypanosome-specific gene fissions, which have recently been experimentally confirmed, one fusion involving proteins involved in the same metabolic pathway, as well as two novel putative functional links between fusion-linked protein pairs. Conclusions: This is the first study of protein functional links in T. brucei identified by gene fusion analysis. We have used strict thresholds and only discuss results which are highly likely to be genuine and which either have already been or can be experimentally verified. We discuss the possible impact of the identification of these novel putative protein-protein interactions, to the development of new trypanosome therapeutic drugs. PMID:21729286
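
    The core idea of gene fusion analysis described above can be sketched as follows: two separate proteins in the query organism are proposed as functionally linked when both match non-overlapping regions of a single composite protein in another organism. This is only an illustrative sketch, not the pipeline used in the study; the hit format, the protein names, and the `max_overlap` parameter are assumptions, and real analyses apply similarity and coverage thresholds that are omitted here.

```python
from collections import defaultdict
from itertools import combinations


def fusion_links(hits, max_overlap=0):
    """Find pairs of query proteins linked through a shared composite protein.

    hits: iterable of (query_protein, composite_protein, start, end) tuples,
          where start/end are 1-based coordinates of the match on the composite.
    max_overlap: largest number of shared residues tolerated between the two
                 matched regions (assumed parameter).
    Returns a set of ((query_a, query_b), composite_protein) tuples.
    """
    by_composite = defaultdict(list)
    for query, composite, start, end in hits:
        by_composite[composite].append((query, start, end))

    links = set()
    for composite, regions in by_composite.items():
        for (q1, s1, e1), (q2, s2, e2) in combinations(regions, 2):
            if q1 == q2:
                continue  # two matches of the same protein are not a link
            overlap = min(e1, e2) - max(s1, s2) + 1
            if overlap <= max_overlap:
                links.add((tuple(sorted((q1, q2))), composite))
    return links


# Hypothetical hits: TbA and TbB cover separate halves of fusedX; TbC overlaps both.
toy_hits = [
    ("TbA", "fusedX", 1, 200),
    ("TbB", "fusedX", 210, 400),
    ("TbC", "fusedX", 150, 300),
]
links = fusion_links(toy_hits)
```

    Only the TbA/TbB pair is reported for the toy input, because TbC overlaps both of the other matched regions.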

  2. The compact Selaginella genome identifies changes in gene content associated with the evolution of vascular plants

    SciTech Connect

    Grigoriev, Igor V.; Banks, Jo Ann; Nishiyama, Tomoaki; Hasebe, Mitsuyasu; Bowman, John L.; Gribskov, Michael; dePamphilis, Claude; Albert, Victor A.; Aono, Naoki; Aoyama, Tsuyoshi; Ambrose, Barbara A.; Ashton, Neil W.; Axtell, Michael J.; Barker, Elizabeth; Barker, Michael S.; Bennetzen, Jeffrey L.; Bonawitz, Nicholas D.; Chapple, Clint; Cheng, Chaoyang; Correa, Luiz Gustavo Guedes; Dacre, Michael; DeBarry, Jeremy; Dreyer, Ingo; Elias, Marek; Engstrom, Eric M.; Estelle, Mark; Feng, Liang; Finet, Cedric; Floyd, Sandra K.; Frommer, Wolf B.; Fujita, Tomomichi; Gramzow, Lydia; Gutensohn, Michael; Harholt, Jesper; Hattori, Mitsuru; Heyl, Alexander; Hirai, Tadayoshi; Hiwatashi, Yuji; Ishikawa, Masaki; Iwata, Mineko; Karol, Kenneth G.; Koehler, Barbara; Kolukisaoglu, Uener; Kubo, Minoru; Kurata, Tetsuya; Lalonde, Sylvie; Li, Kejie; Li, Ying; Litt, Amy; Lyons, Eric; Manning, Gerard; Maruyama, Takeshi; Michael, Todd P.; Mikami, Koji; Miyazaki, Saori; Morinaga, Shin-ichi; Murata, Takashi; Mueller-Roeber, Bernd; Nelson, David R.; Obara, Mari; Oguri, Yasuko; Olmstead, Richard G.; Onodera, Naoko; Petersen, Bent Larsen; Pils, Birgit; Prigge, Michael; Rensing, Stefan A.; Riano-Pachon, Diego Mauricio; Roberts, Alison W.; Sato, Yoshikatsu; Scheller, Henrik Vibe; Schulz, Burkhard; Schulz, Christian; Shakirov, Eugene V.; Shibagaki, Nakako; Shinohara, Naoki; Shippen, Dorothy E.; Sorensen, Iben; Sotooka, Ryo; Sugimoto, Nagisa; Sugita, Mamoru; Sumikawa, Naomi; Tanurdzic, Milos; Theilsen, Gunter; Ulvskov, Peter; Wakazuki, Sachiko; Weng, Jing-Ke; Willats, William W.G.T.; Wipf, Daniel; Wolf, Paul G.; Yang, Lixing; Zimmer, Andreas D.; Zhu, Qihui; Mitros, Therese; Hellsten, Uffe; Loque, Dominique; Otillar, Robert; Salamov, Asaf; Schmutz, Jeremy; Shapiro, Harris; Lindquist, Erika; Lucas, Susan; Rokhsar, Daniel

    2011-04-28

    We report the genome sequence of the nonseed vascular plant, Selaginella moellendorffii, and by comparative genomics identify genes that likely played important roles in the early evolution of vascular plants and their subsequent evolution

  3. Epigenetic Characterization of the Growth Hormone Gene Identifies SmcHD1 as a Regulator of Autosomal Gene Clusters

    PubMed Central

    Massah, Shabnam; Hollebakken, Robert; Labrecque, Mark P.; Kolybaba, Addie M.; Beischlag, Timothy V.; Prefontaine, Gratien G.

    2014-01-01

    Regulatory elements for the mouse growth hormone (GH) gene are located distally in a putative locus control region (LCR) in addition to key elements in the promoter proximal region. The role of promoter DNA methylation for GH gene regulation is not well understood. Pit-1 is a POU transcription factor required for normal pituitary development and obligatory for GH gene expression. In mammals, Pit-1 mutations eliminate GH production resulting in a dwarf phenotype. In this study, dwarf mice illustrated that Pit-1 function was obligatory for GH promoter hypomethylation. By monitoring promoter methylation levels during developmental GH expression we found that the GH promoter became hypomethylated coincident with gene expression. We identified a promoter differentially methylated region (DMR) that was used to characterize a methylation-dependent DNA binding activity. Upon DNA affinity purification using the DMR and nuclear extracts, we identified structural maintenance of chromosomes hinge domain containing -1 (SmcHD1). To better understand the role of SmcHD1 in genome-wide gene expression, we performed microarray analysis and compared changes in gene expression upon reduced levels of SmcHD1 in human cells. Knock-down of SmcHD1 in human embryonic kidney (HEK293) cells revealed a disproportionate number of up-regulated genes were located on the X-chromosome, but also suggested regulation of genes on non-sex chromosomes. Among those, we identified several genes located in the protocadherin β cluster. In addition, we found that imprinted genes in the H19/Igf2 cluster associated with Beckwith-Wiedemann and Silver-Russell syndromes (BWS & SRS) were dysregulated. For the first time using human cells, we showed that SmcHD1 is an important regulator of imprinted and clustered genes. PMID:24818964

  4. Exploiting natural variation in Saccharomyces cerevisiae to identify genes for increased ethanol resistance.

    PubMed

    Lewis, Jeffrey A; Elkon, Isaac M; McGee, Mick A; Higbee, Alan J; Gasch, Audrey P

    2010-12-01

    Ethanol production from lignocellulosic biomass holds promise as an alternative fuel. However, industrial stresses, including ethanol stress, limit microbial fermentation and thus prevent cost competitiveness with fossil fuels. To identify novel engineering targets for increased ethanol tolerance, we took advantage of natural diversity in wild Saccharomyces cerevisiae strains. We previously showed that an S288c-derived lab strain cannot acquire higher ethanol tolerance after a mild ethanol pretreatment, which is distinct from other stresses. Here, we measured acquired ethanol tolerance in a large panel of wild strains and show that most strains can acquire higher tolerance after pretreatment. We exploited this major phenotypic difference to address the mechanism of acquired ethanol tolerance, by comparing the global gene expression response to 5% ethanol in S288c and two wild strains. Hundreds of genes showed variation in ethanol-dependent gene expression across strains. Computational analysis identified several transcription factor modules and known coregulated genes as differentially expressed, implicating genetic variation in the ethanol signaling pathway. We used this information to identify genes required for acquisition of ethanol tolerance in wild strains, including new genes and processes not previously linked to ethanol tolerance, and four genes that increase ethanol tolerance when overexpressed. Our approach shows that comparative genomics across natural isolates can quickly identify genes for industrial engineering while expanding our understanding of natural diversity.

  5. Exploiting Natural Variation in Saccharomyces cerevisiae to Identify Genes for Increased Ethanol Resistance

    PubMed Central

    Lewis, Jeffrey A.; Elkon, Isaac M.; McGee, Mick A.; Higbee, Alan J.; Gasch, Audrey P.

    2010-01-01

    Ethanol production from lignocellulosic biomass holds promise as an alternative fuel. However, industrial stresses, including ethanol stress, limit microbial fermentation and thus prevent cost competitiveness with fossil fuels. To identify novel engineering targets for increased ethanol tolerance, we took advantage of natural diversity in wild Saccharomyces cerevisiae strains. We previously showed that an S288c-derived lab strain cannot acquire higher ethanol tolerance after a mild ethanol pretreatment, which is distinct from other stresses. Here, we measured acquired ethanol tolerance in a large panel of wild strains and show that most strains can acquire higher tolerance after pretreatment. We exploited this major phenotypic difference to address the mechanism of acquired ethanol tolerance, by comparing the global gene expression response to 5% ethanol in S288c and two wild strains. Hundreds of genes showed variation in ethanol-dependent gene expression across strains. Computational analysis identified several transcription factor modules and known coregulated genes as differentially expressed, implicating genetic variation in the ethanol signaling pathway. We used this information to identify genes required for acquisition of ethanol tolerance in wild strains, including new genes and processes not previously linked to ethanol tolerance, and four genes that increase ethanol tolerance when overexpressed. Our approach shows that comparative genomics across natural isolates can quickly identify genes for industrial engineering while expanding our understanding of natural diversity. PMID:20855568

  6. CTDGFinder: A Novel Homology-Based Algorithm for Identifying Closely Spaced Clusters of Tandemly Duplicated Genes.

    PubMed

    Ortiz, Juan F; Rokas, Antonis

    2017-01-01

    Closely spaced clusters of tandemly duplicated genes (CTDGs) contribute to the diversity of many phenotypes, including chemosensation, snake venom, and animal body plans. CTDGs have traditionally been identified subjectively as genomic neighborhoods containing several gene duplicates in close proximity; however, CTDGs are often highly variable with respect to gene number, intergenic distance, and synteny. This lack of formal definition hampers the study of CTDG evolutionary dynamics and the discovery of novel CTDGs in the exponentially growing body of genomic data. To address this gap, we developed a novel homology-based algorithm, CTDGFinder, which formalizes and automates the identification of CTDGs by examining the physical distribution of individual members of families of duplicated genes across chromosomes. Application of CTDGFinder accurately identified CTDGs for many well-known gene clusters (e.g., Hox and beta-globin gene clusters) in the human, mouse and 20 other mammalian genomes. Differences between previously annotated gene clusters and our inferred CTDGs were due to the exclusion of nonhomologs that have historically been considered parts of specific gene clusters, the inclusion or absence of genes between the CTDGs and their corresponding gene clusters, and the splitting of certain gene clusters into distinct CTDGs. Examination of human genes showing tissue-specific enhancement of their expression by CTDGFinder identified members of several well-known gene clusters (e.g., cytochrome P450s and olfactory receptors) and revealed that they were unequally distributed across tissues. By formalizing and automating CTDG identification, CTDGFinder will facilitate understanding of CTDG evolutionary dynamics, their functional implications, and how they are associated with phenotypic diversity. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  7. Microarray Analysis Identifies Cerebellar Genes Sensitive to Chronic Ethanol Treatment in PKCγ Mice

    PubMed Central

    Bowers, Barbara J.; Radcliffe, Richard A.; Smith, Amy M.; Miyamoto-Ditmon, Jill; Wehner, Jeanne M.

    2007-01-01

    Neuroadaptive changes that occur in the development of ethanol tolerance may be the result of alterations in gene expression. We have shown that PKCγ wild-type mice develop tolerance to the sedative-hypnotic effects of ethanol after chronic ethanol treatment, whereas mutant mice do not, making these genotypes a suitable model for identifying changes in gene expression related to tolerance development. Using a two-stage process, several genes were initially identified using microarray analyses of cerebellar tissue from ethanol-treated PKCγ mutant and wild-type mice. Subsequent confirmation of a subset of these genes using qRT-PCR was done to verify gene expression changes. A total of 109 genes from different functional classifications were identified in these groups on the microarrays. Eight genes were selected for verification: three, Twik-1, Plp, and Adk2, were chosen as genes related to tolerance; another three, Hsp70.2, Bdnf, and Th, were chosen as genes related to resistance to tolerance; and two genes, JunB and Nur77, were selected as candidate genes sensitive to chronic ethanol. The results from the verification experiments indicated that Twik-1, which codes for a potassium channel, was associated with tolerance and appeared to be dependent on the presence of PKCγ. No genes were confirmed to be related to resistance to tolerance; however, expression of two of these, Hsp70.2 and Th, was found to be sensitive to chronic ethanol, and they were added to the transcription factors, JunB and Nur77, confirmed by qRT-PCR, as a subset of genes that respond to chronic ethanol. PMID:17157717

  8. Oscope identifies oscillatory genes in unsynchronized single-cell RNA-seq experiments.

    PubMed

    Leng, Ning; Chu, Li-Fang; Barry, Chris; Li, Yuan; Choi, Jeea; Li, Xiaomao; Jiang, Peng; Stewart, Ron M; Thomson, James A; Kendziorski, Christina

    2015-10-01

    Oscillatory gene expression is fundamental to development, but technologies for monitoring expression oscillations are limited. We have developed a statistical approach called Oscope to identify and characterize the transcriptional dynamics of oscillating genes in single-cell RNA-seq data from an unsynchronized cell population. Applying Oscope to a number of data sets, we demonstrated its utility and also identified a potential artifact in the Fluidigm C1 platform.

  9. Bioinformatic analysis of nematode migration-associated genes identifies novel vertebrate neural crest markers.

    PubMed

    Kwon, Seung-Hae; Park, Ok Kyu; Nie, Shuyi; Kwak, Jina; Hwang, Byung Joon; Bronner, Marianne E; Kee, Yun

    2014-01-01

    Neural crest cells are highly motile, yet a limited number of genes governing neural crest migration have been identified by conventional studies. To test the hypothesis that cell migration genes are likely to be conserved over large evolutionary distances and from diverse tissues, we searched for vertebrate homologs of genes important for migration of various cell types in the invertebrate nematode and examined their expression during vertebrate neural crest cell migration. Our systematic analysis utilized a combination of comparative genomic scanning, functional pathway analysis and gene expression profiling to uncover previously unidentified genes expressed by premigratory, emigrating and/or migrating neural crest cells. The results demonstrate that similar gene sets are expressed in migratory cell types across distant animals and different germ layers. Bioinformatics analysis of these factors revealed relationships between these genes within signaling pathways that may be important during neural crest cell migration.

  10. A Novel Prioritization Method in Identifying Recurrent Venous Thromboembolism-Related Genes

    PubMed Central

    Xie, Ruiqiang; Chen, Binbin; Huang, Hao; Li, Yiran; He, Yuehan; Lv, Junjie; He, Weiming; Chen, Lina

    2016-01-01

    Identifying the genes involved in venous thromboembolism (VTE) recurrence is important not only for understanding the pathogenesis but also for discovering therapeutic targets. We proposed a novel prioritization method called Function-Interaction-Pearson (FIP) by creating gene-disease similarity scores to prioritize candidate genes underlying VTE. The scores were calculated by integrating and optimizing three types of resources: gene expression, gene ontology and protein-protein interaction. As a result, 124 of the top 200 prioritized candidate genes had been confirmed in the literature, among which there were 34 antithrombotic drug targets. Compared with two well-known gene prioritization tools, Endeavour and ToppNet, FIP was shown to have better performance. The approach provides a valuable alternative for drug target discovery and disease therapy. PMID:27050193

  11. Gene dosage, expression, and ontology analysis identifies driver genes in the carcinogenesis and chemoradioresistance of cervical cancer.

    PubMed

    Lando, Malin; Holden, Marit; Bergersen, Linn C; Svendsrud, Debbie H; Stokke, Trond; Sundfør, Kolbein; Glad, Ingrid K; Kristensen, Gunnar B; Lyng, Heidi

    2009-11-01

    Integrative analysis of gene dosage, expression, and ontology (GO) data was performed to discover driver genes in the carcinogenesis and chemoradioresistance of cervical cancers. Gene dosage and expression profiles of 102 locally advanced cervical cancers were generated by microarray techniques. Fifty-two of these patients were also analyzed with the Illumina expression method to confirm the gene expression results. An independent cohort of 41 patients was used for validation of gene expressions associated with clinical outcome. Statistical analysis identified 29 recurrent gains and losses and 3 losses (on 3p, 13q, 21q) associated with poor outcome after chemoradiotherapy. The intratumor heterogeneity, assessed from the gene dosage profiles, was low for these alterations, showing that they had emerged prior to many other alterations and probably were early events in carcinogenesis. Integration of the alterations with gene expression and GO data identified genes that were regulated by the alterations and revealed five biological processes that were significantly overrepresented among the affected genes: apoptosis, metabolism, macromolecule localization, translation, and transcription. Four genes on 3p (RYBP, GBE1) and 13q (FAM48A, MED4) correlated with outcome at both the gene dosage and expression level and were satisfactorily validated in the independent cohort. These integrated analyses yielded 57 candidate drivers of 24 genetic events, including novel loci responsible for chemoradioresistance. Further mapping of the connections among genetic events, drivers, and biological processes suggested that each individual event stimulates specific processes in carcinogenesis through the coordinated control of multiple genes. The present results may provide novel therapeutic opportunities for both early and advanced stage cervical cancers.

  12. Identifying Novel Transcriptional and Epigenetic Features of Nuclear Lamina-associated Genes.

    PubMed

    Wu, Feinan; Yao, Jie

    2017-12-01

    Because a large portion of the mammalian genome is associated with the nuclear lamina (NL), it is interesting to study how native genes residing there are transcribed and regulated. In this study, we report unique transcriptional and epigenetic features of nearly 3,500 NL-associated genes (NL genes). Promoter regions of active NL genes are often excluded from NL-association, suggesting that NL-promoter interactions may repress transcription. Active NL genes with higher RNA polymerase II (Pol II) recruitment levels tend to display Pol II promoter-proximal pausing, while Pol II recruitment and Pol II pausing are not correlated among non-NL genes. At the genome-wide scale, NL-association and H3K27me3 distinguish two large gene classes with low transcriptional activities. Notably, NL-association is anti-correlated with both transcription and active histone mark levels among genes not significantly enriched with H3K9me3 or H3K27me3, suggesting that NL-association may represent a novel gene repression pathway. Interestingly, an NL gene subgroup is not significantly enriched with H3K9me3 or H3K27me3 and is transcribed at higher levels than the rest of the NL genes. Furthermore, we identified distal enhancers associated with active NL genes and reported their epigenetic features.

  13. Senescence-associated-gene signature identifies genes linked to age, prognosis, and progression of human gliomas.

    PubMed

    Coppola, Domenico; Balducci, Lodovico; Chen, Dung-Tsa; Loboda, Andrey; Nebozhyn, Michael; Staller, Aileen; Fulp, William J; Dalton, William; Yeatman, Timothy; Brem, Steven

    2014-10-01

    Senescence-associated genes (SAGs) are responsible for the senescence-associated secretory phenotype, linked in turn to cellular aging, the aging brain, and the pathogenesis of cancer. We hypothesized that senescence-associated genes are overexpressed in older patients and in higher grades of glioma, and portend a poor prognosis. Forty-seven gliomas were arrayed on a custom version of the Affymetrix HG-U133+2.0 GeneChip, for expression of fourteen senescence-associated genes: CCL2, CCL7, CDKN1A, COPG, CSF2RB, CXCL1, ICAM-1, IGFBP-3, IL-6, IL-8, SAA4, TNFRSF-11B, TNFSF-11 and TP53. A combined "senescence score" was generated using principal component analysis to measure the combined effect of the senescence-associated gene signature. An elevated senescence score correlated with older age (r=0.37; P=.01) as well as a higher degree of malignancy, as determined by WHO histological grade (r=0.49; P<.001). There was a mild association with poor prognosis (P=.06). Gliosarcomas showed the highest scores. Six genes independently correlated with either age (IL-6, TNFRSF-11B, IGFBP-3, SAA4, and COPG), prognosis (IL-6, SAA4), or the grade of the glioma (IL-6, IL-8, ICAM-1, IGFBP-3, and COPG). We report: 1) a novel molecular signature in human gliomas, based on cellular senescence, translating the concept of SAG to human cancer; 2) the senescence signature is composed of genes central to the pathogenesis of gliomas, defining a novel, aggressive subtype of glioma; and 3) these genes provide prognostic biomarkers, as well as targets, for drug discovery and immunotherapy. Copyright © 2014 Elsevier Inc. All rights reserved.
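
    As an illustration of how a combined score can be derived with principal component analysis, the following is a minimal sketch and not the authors' implementation. It assumes the score is the projection of each sample onto the first principal component of the mean-centered expression matrix. The function name `first_pc_scores`, the power-iteration approach, and the toy matrix are all hypothetical.

```python
from math import sqrt

def first_pc_scores(samples, iterations=200):
    """Project each sample onto the first principal component.

    samples: list of equal-length lists (rows = samples, columns = genes).
    Assumption: the combined score is the first-PC projection.
    """
    n, p = len(samples), len(samples[0])
    # Center each gene (column) on its mean.
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in samples]
    # Sample covariance matrix (p x p).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration converges to the leading eigenvector of the covariance.
    vec = [1.0] * p
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Score = dot product of the centered sample with the leading eigenvector.
    return [sum(r[j] * vec[j] for j in range(p)) for r in centered]

# Hypothetical toy data: 4 samples x 3 genes, not taken from the study.
expression = [
    [1.0, 2.0, 1.5],
    [2.0, 3.0, 2.5],
    [3.0, 4.5, 3.0],
    [4.0, 6.0, 4.5],
]
scores = first_pc_scores(expression)
```

    In this toy matrix all three genes rise together, so the scores increase from the first sample to the last; with real data the sign of a principal component is arbitrary and would need to be fixed by convention.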

  14. Application of biclustering of gene expression data and gene set enrichment analysis methods to identify potentially disease causing nanomaterials.

    PubMed

    Williams, Andrew; Halappanavar, Sabina

    2015-01-01

    The presence of diverse types of nanomaterials (NMs) in commerce is growing at an exponential pace. As a result, human exposure to these materials in the environment is inevitable, necessitating rapid and reliable toxicity testing methods to accurately assess the potential hazards associated with NMs. In this study, we applied biclustering and gene set enrichment analysis methods to derive essential features of altered lung transcriptome following exposure to NMs that are associated with lung-specific diseases. Several datasets from public microarray repositories describing pulmonary diseases in mouse models following exposure to a variety of substances were examined and functionally related biclusters of genes showing similar expression profiles were identified. The identified biclusters were then used to conduct a gene set enrichment analysis on pulmonary gene expression profiles derived from mice exposed to nano-titanium dioxide (nano-TiO2), carbon black (CB) or carbon nanotubes (CNTs) to determine the disease significance of these data-driven gene sets. Biclusters representing inflammation (chemokine activity), DNA binding, cell cycle, apoptosis, reactive oxygen species (ROS) and fibrosis processes were identified. All of the NM studies were significant with respect to the bicluster related to chemokine activity (DAVID; FDR p-value = 0.032). The bicluster related to pulmonary fibrosis was enriched in studies where toxicity induced by CNT and CB was investigated, suggesting the potential for these materials to induce lung fibrosis. The pro-fibrogenic potential of CNTs is well established. Although CB has not been shown to induce fibrosis, it induces stronger inflammatory, oxidative stress and DNA damage responses than nano-TiO2 particles. The results of the analysis correctly identified all NMs to be inflammogenic and only CB and CNTs as potentially fibrogenic. In addition to identifying several previously defined, functionally relevant gene

  15. A genomic strategy for the functional validation of colorectal cancer genes identifies potential therapeutic targets.

    PubMed

    Grade, Marian; Hummon, Amanda B; Camps, Jordi; Emons, Georg; Spitzner, Melanie; Gaedcke, Jochen; Hoermann, Patrick; Ebner, Reinhard; Becker, Heinz; Difilippantonio, Michael J; Ghadimi, B Michael; Beissbarth, Tim; Caplen, Natasha J; Ried, Thomas

    2011-03-01

    Genes that are highly overexpressed in tumor cells can be required for tumor cell survival and have the potential to be selective therapeutic targets. In an attempt to identify such targets, we combined a functional genomics and a systems biology approach to assess the consequences of RNAi-mediated silencing of overexpressed genes that were selected from 140 gene expression profiles from colorectal cancers (CRCs) and matched normal mucosa. In order to identify credible models for in-depth functional analysis, we first confirmed the overexpression of these genes in 25 different CRC cell lines. We then identified five candidate genes that profoundly reduced the viability of CRC cell lines when silenced with either siRNAs or short-hairpin RNAs (shRNAs), i.e., HMGA1, TACSTD2, RRM2, RPS2 and NOL5A. These genes were further studied by systematic analysis of comprehensive gene expression profiles generated following siRNA-mediated silencing. Exploration of these RNAi-specific gene expression signatures allowed the identification of the functional space in which the five genes operate and showed enrichment for cancer-specific signaling pathways, some known to be involved in CRC. By comparing the expression of the RNAi signature genes with their respective expression levels in an independent set of primary rectal carcinomas, we could recapitulate these defined RNAi signatures, therefore, establishing the biological relevance of our observations. This strategy identified the signaling pathways that are affected by the prominent oncogenes HMGA1 and TACSTD2, established a yet unknown link between RRM2 and PLK1 and identified RPS2 and NOL5A as promising potential therapeutic targets in CRC.

  16. Identifying candidate genes for Type 2 Diabetes Mellitus and obesity through gene expression profiling in multiple tissues or cells.

    PubMed

    Chen, Junhui; Meng, Yuhuan; Zhou, Jinghui; Zhuo, Min; Ling, Fei; Zhang, Yu; Du, Hongli; Wang, Xiaoning

    2013-01-01

    Type 2 Diabetes Mellitus (T2DM) and obesity have become increasingly prevalent in recent years. Recent studies have focused on identifying causal variations or candidate genes for obesity and T2DM via analysis of expression quantitative trait loci (eQTL) within a single tissue. T2DM and obesity are affected by comprehensive sets of genes in multiple tissues. In the current study, gene expression levels in multiple human tissues from GEO datasets were analyzed, and 21 candidate genes displaying high percentages of differential expression were filtered out. Specifically, DENND1B, LYN, MRPL30, POC1B, PRKCB, RP4-655J12.3, HIBADH, and TMBIM4 were identified from the T2DM-control study, and BCAT1, BMP2K, CSRNP2, MYNN, NCKAP5L, SAP30BP, SLC35B4, SP1, BAP1, GRB14, HSP90AB1, ITGA5, and TOMM5 were identified from the obesity-control study. The majority of these genes are known to be involved in T2DM and obesity. Therefore, analysis of gene expression in various tissues using GEO datasets may be an effective and feasible method to determine novel or causal genes associated with T2DM and obesity.

  17. Dysregulated module approach identifies disrupted genes and pathways associated with acute myelocytic leukemia.

    PubMed

    Fang, Y; Xie, L-N; Liu, X-M; Yu, Z; Kong, F-S; Song, N-X; Zhou, F

    2015-12-01

    The aim of this study was to identify disrupted genes and pathways involved in acute myelocytic leukemia (AML) by systematically tracking the dysregulated modules across normal and AML conditions. In this study, we first integrated the protein interaction data and expression profiles to infer and reweight the normal and AML networks using the Pearson correlation coefficient (PCC). Next, a clustering-based on maximal cliques (CMC) approach and a maximum weight bipartite matching method were implemented to infer the condition-specific modules and capture the disturbed modules, respectively, from the two conditional networks. Then, gene composition and functional enrichment analyses were performed to identify the dysregulated genes and pathways. Finally, reverse transcription polymerase chain reaction (RT-PCR) was implemented to study the expression level of several key genes in AML patients. In the two condition-specific networks, universal changes of gene correlations were revealed, making the differential correlation density among disrupted module pairs. In this work, a total of 84 altered modules were identified by comparing modules in normal and AML networks. Functional enrichment analysis showed that genes in altered modules were mainly involved in cell cycle, nucleic acids and cancer signaling process, and differentially expressed genes (DEGs) and changed gene correlations mainly participated in natural killer cell-mediated cytotoxicity and acute myeloid leukemia pathway. The key genes, such as MYC, EGFR, MAPK1 and CCNA1, were all significantly differentially expressed in AML patients. This module approach effectively identifies dysregulated pathways and genes associated with AML. The considerable differences of gene correlations give rise to these dysfunctional modules, and the coordinated disruption of these very modules contributes to leukemogenesis.
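
    The PCC-based reweighting step mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: each interaction is assumed to receive the absolute Pearson correlation of its two genes' expression profiles as its weight, and the gene names, expression values, and function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def reweight_edges(edges, expression):
    """Weight each interaction by |PCC| of the two genes' profiles.

    Assumption: the absolute value is used; the abstract does not say.
    """
    return {
        (g1, g2): abs(pearson(expression[g1], expression[g2]))
        for g1, g2 in edges
    }

# Hypothetical toy data: three genes measured in four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [1.0, 3.0, 2.0, 2.0],
}
edges = [("geneA", "geneB"), ("geneA", "geneC")]
weights = reweight_edges(edges, expression)
```

    Running the same function once on normal samples and once on AML samples would give the two condition-specific weighted networks that the module comparison starts from.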

  18. A cross-species bi-clustering approach to identifying conserved co-regulated genes

    PubMed Central

    Sun, Jiangwen; Jiang, Zongliang; Tian, Xiuchun; Bi, Jinbo

    2016-01-01

    Motivation: A growing number of studies have explored the process of pre-implantation embryonic development of multiple mammalian species. However, the conservation and variation among different species in their developmental programming are poorly defined due to the lack of effective computational methods for detecting co-regulated genes that are conserved across species. The most sophisticated method to date for identifying conserved co-regulated genes is a two-step approach. This approach first identifies gene clusters for each species by a cluster analysis of gene expression data, and subsequently computes the overlaps of clusters identified from different species to reveal common subgroups. This approach is ineffective at dealing with the noise in the expression data introduced by the complicated procedures in quantifying gene expression. Furthermore, due to the sequential nature of the approach, the gene clusters identified in the first step may have little overlap among different species in the second step, making it difficult to detect conserved co-regulated genes. Results: We propose a cross-species bi-clustering approach which first denoises the gene expression data of each species into a data matrix. The rows of the data matrices of different species represent the same set of genes that are characterized by their expression patterns over the developmental stages of each species as columns. A novel bi-clustering method is then developed to cluster genes into subgroups by a joint sparse rank-one factorization of all the data matrices. This method decomposes a data matrix into a product of a column vector and a row vector where the column vector is a consistent indicator across the matrices (species) to identify the same gene cluster and the row vector specifies for each species the developmental stages that the clustered genes co-regulate. An efficient optimization algorithm has been developed with convergence analysis. This approach was first validated on

  19. Gene-Based Genome-Wide Association Analysis in European and Asian Populations Identified Novel Genes for Rheumatoid Arthritis

    PubMed Central

    Zhu, Hong; Xia, Wei; Mo, Xing-Bo; Lin, Xiang; Qiu, Ying-Hua; Yi, Neng-Jun; Zhang, Yong-Hong; Deng, Fei-Yan; Lei, Shu-Feng

    2016-01-01

    Objective: Rheumatoid arthritis (RA) is a complex autoimmune disease. Using a gene-based association research strategy, the present study aims to detect unknown susceptibility to RA and to address the ethnic differences in genetic susceptibility to RA between European and Asian populations. Methods: Gene-based association analyses were performed with KGG 2.5 by using publicly available large RA datasets (14,361 RA cases and 43,923 controls of European subjects, 4,873 RA cases and 17,642 controls of Asian subjects). For the newly identified RA-associated genes, gene set enrichment analyses and protein-protein interactions analyses were carried out with DAVID and STRING version 10.0, respectively. Differential expression verification was conducted using 4 GEO datasets. The expression levels of three selected ‘highly verified’ genes were measured by ELISA among our in-house RA cases and controls. Results: A total of 221 RA-associated genes were newly identified by gene-based association study, including 71 ‘overlapped’, 76 ‘European-specific’ and 74 ‘Asian-specific’ genes. Among them, 105 genes had significant differential expressions between RA patients and healthy controls in at least one dataset, especially for 20 genes including 11 ‘overlapped’ (ABCF1, FLOT1, HLA-F, IER3, TUBB, ZKSCAN4, BTN3A3, HSP90AB1, CUTA, BRD2, HLA-DMA), 5 ‘European-specific’ (PHTF1, RPS18, BAK1, TNFRSF14, SUOX) and 4 ‘Asian-specific’ (RNASET2, HFE, BTN2A2, MAPK13) genes whose differential expressions were significant in at least three datasets. The protein expressions of two selected genes FLOT1 (P value = 1.70E-02) and HLA-DMA (P value = 4.70E-02) in plasma were significantly different in our in-house samples. Conclusion: Our study identified 221 novel RA-associated genes and especially highlighted the importance of 20 candidate genes on RA. The results addressed ethnic genetic background differences for RA susceptibility between European and Asian populations and

  20. Microarray analysis of hepatic gene expression identifies new genes involved in steatotic liver

    PubMed Central

    Guillén, Natalia; Navarro, María A.; Arnal, Carmen; Noone, Enda; Arbonés-Mainar, José M.; Acín, Sergio; Surra, Joaquín C.; Muniesa, Pedro; Roche, Helen M.; Osada, Jesús

    2009-01-01

    Trans-10, cis-12-conjugated linoleic acid (CLA)-enriched diets promote fatty liver in mice, while cis-9, trans-11-CLA ameliorates this effect, suggesting regulation of multiple genes. To test this hypothesis, apoE-deficient mice were fed a Western-type diet enriched with linoleic acid isomers, and their hepatic gene expression was analyzed with DNA microarrays. To provide an initial screening of candidate genes, only 12 with remarkably modified expression between both CLA isomers were considered and confirmed by quantitative RT-PCR. Additionally mRNA expression of 15 genes involved in lipid metabolism was also studied. Ten genes (Fsp27, Aqp4, Cd36, Ly6d, Scd1, Hsd3b5, Syt1, Cyp7b1, and Tff3) showed significant associations among their expressions and the degree of hepatic steatosis. Their involvement was also analyzed in other models of steatosis. In hyperhomocysteinemic mice lacking Cbs gene, only Fsp27, Cd36, Scd1, Syt1, and Hsd3b5 hepatic expressions were associated with steatosis. In apoE-deficient mice consuming olive-enriched diet displaying reduction of the fatty liver, only Fsp27 and Syt1 expressions were found associated. Using this strategy, we have shown that expression of these genes is highly associated with hepatic steatosis in a genetic disease such as Cbs deficiency and in two common situations such as Western diets containing CLA isomers or a Mediterranean-type diet. Conclusion: The results highlight new processes involved in lipid handling in liver and will help to understand the complex human pathology providing new proteins and new strategies to cope with hepatic steatosis. PMID:19258494

  1. Oligonucleotide microarray identifies genes differentially expressed during tumorigenesis of DMBA-induced pancreatic cancer in rats.

    PubMed

    Guo, Jun-Chao; Li, Jian; Yang, Ying-Chi; Zhou, Li; Zhang, Tai-Ping; Zhao, Yu-Pei

    2013-01-01

    The extremely dismal prognosis of pancreatic cancer (PC) is attributed, at least in part, to lack of early diagnosis. Therefore, identifying differentially expressed genes in multiple steps of tumorigenesis of PC is of great interest. In the present study, a 7,12-dimethylbenzanthracene (DMBA)-induced PC model was established in male Sprague-Dawley rats. The gene expression profile was screened using an oligonucleotide microarray, followed by real-time quantitative polymerase chain reaction (qRT-PCR) and immunohistochemical staining validation. A total of 661 differentially expressed genes were identified in stages of pancreatic carcinogenesis. According to GO classification, these genes were involved in multiple molecular pathways. Using two-way hierarchical clustering analysis, normal pancreas, acute and chronic pancreatitis, PanIN, early and advanced pancreatic cancer were completely discriminated. Furthermore, 11 upregulated and 142 downregulated genes (probes) were found by Mann-Kendall trend Monotone test, indicating homologous genes of rat and human. The qRT-PCR and immunohistochemistry analysis of CXCR7 and UBe2c, two of the identified genes, confirmed the microarray results. In human PC cell lines, knockdown of CXCR7 resulted in decreased migration and invasion. Collectively, our data identified several promising markers and therapeutic targets of PC based on a comprehensive screening and systemic validation.

  2. Transcriptional Analysis of Gli3 Mutants Identifies Wnt Target Genes in the Developing Hippocampus

    PubMed Central

    Hasenpusch-Theil, Kerstin; Magnani, Dario; Amaniti, Eleni-Maria; Han, Lin; Armstrong, Douglas

    2012-01-01

    Early development of the hippocampus, which is essential for spatial memory and learning, is controlled by secreted signaling molecules of the Wnt gene family and by Wnt/β-catenin signaling. Despite its importance, however, little is known about Wnt-regulated genes during hippocampal development. Here, we used the Gli3 mutant mouse extra-toes (XtJ), in which Wnt gene expression in the forebrain is severely affected, as a tool in a microarray analysis to identify potential Wnt target genes. This approach revealed 53 candidate genes with restricted or graded expression patterns in the dorsomedial telencephalon. We identified conserved Tcf/Lef-binding sites in telencephalon-specific enhancers of several of these genes, including Dmrt3, Gli3, Nfia, and Wnt8b. Binding of Lef1 to these sites was confirmed using electrophoretic mobility shift assays. Mutations in these Tcf/Lef-binding sites disrupted or reduced enhancer activity in vivo. Moreover, ectopic activation of Wnt/β-catenin signaling in an ex vivo explant system led to increased telencephalic expression of these genes. Finally, conditional inactivation of Gli3 results in defective hippocampal growth. Collectively, these data strongly suggest that we have identified a set of direct Wnt target genes in the developing hippocampus and provide insight into the genetic hierarchy underlying Wnt-regulated hippocampal development. PMID:22235033

  3. Gene Expression Profiling Combined with Bioinformatics Analysis Identify Biomarkers for Parkinson Disease

    PubMed Central

    Diao, Hongyu; Li, Xinxing; Hu, Sheng; Liu, Yunhui

    2012-01-01

    Parkinson disease (PD) progresses relentlessly and affects approximately 4% of the population aged over 80 years. It is difficult to diagnose in its early stages. The purpose of our study is to identify molecular biomarkers for PD initiation using a computational bioinformatics analysis of gene expression. We downloaded the gene expression profile of PD from Gene Expression Omnibus and identified differentially coexpressed genes (DCGs) and dysfunctional pathways in PD patients compared to controls. In addition, we built a regulatory network by mapping the DCGs to known regulatory data between transcription factors (TFs) and target genes and calculated the regulatory impact factor of each transcription factor. As a result, a total of 1004 genes associated with PD initiation were identified. Pathway enrichment of these genes suggests that biological processes of protein turnover were impaired in PD. In the regulatory network, HLF, E2F1 and STAT4 were found to have altered expression levels in PD patients. The expression levels of other transcription factors, NKX3-1, TAL1, RFX1 and EGR3, were not found to be altered. However, they regulated differentially expressed genes. In conclusion, we suggest that HLF, E2F1 and STAT4 may be used as molecular biomarkers for PD; however, more work is needed to validate our result. PMID:23284986

  4. Gene expression profiling combined with bioinformatics analysis identify biomarkers for Parkinson disease.

    PubMed

    Diao, Hongyu; Li, Xinxing; Hu, Sheng; Liu, Yunhui

    2012-01-01

    Parkinson disease (PD) progresses relentlessly and affects approximately 4% of the population aged over 80 years. It is difficult to diagnose in its early stages. The purpose of our study is to identify molecular biomarkers for PD initiation using a computational bioinformatics analysis of gene expression. We downloaded the gene expression profile of PD from Gene Expression Omnibus and identified differentially coexpressed genes (DCGs) and dysfunctional pathways in PD patients compared to controls. In addition, we built a regulatory network by mapping the DCGs to known regulatory data between transcription factors (TFs) and target genes and calculated the regulatory impact factor of each transcription factor. As a result, a total of 1004 genes associated with PD initiation were identified. Pathway enrichment of these genes suggests that biological processes of protein turnover were impaired in PD. In the regulatory network, HLF, E2F1 and STAT4 were found to have altered expression levels in PD patients. The expression levels of other transcription factors, NKX3-1, TAL1, RFX1 and EGR3, were not found to be altered. However, they regulated differentially expressed genes. In conclusion, we suggest that HLF, E2F1 and STAT4 may be used as molecular biomarkers for PD; however, more work is needed to validate our result.

  5. Identifying stably expressed genes from multiple RNA-Seq data sets

    PubMed Central

    Emerson, Sarah; Chang, Jeff H.; Di, Yanming

    2016-01-01

    We examined RNA-Seq data on 211 biological samples from 24 different Arabidopsis experiments carried out by different labs. We grouped the samples according to tissue types, and in each of the groups, we identified genes that are stably expressed across biological samples, treatment conditions, and experiments. We fit a Poisson log-linear mixed-effect model to the read counts for each gene and decomposed the total variance into between-sample, between-treatment and between-experiment variance components. Identifying stably expressed genes is useful for count normalization and differential expression analysis. The variance component analysis that we explore here is a first step towards understanding the sources and nature of the RNA-Seq count variation. When using a numerical measure to identify stably expressed genes, the outcome depends on multiple factors: the background sample set and the reference gene set used for count normalization, the technology used for measuring gene expression, and the specific numerical stability measure used. Since differential expression (DE) is measured by relative frequencies, we argue that DE is a relative concept. We advocate using an explicit reference gene set for count normalization to improve interpretability of DE results, and recommend using a common reference gene set when analyzing multiple RNA-Seq experiments to avoid potentially inconsistent conclusions. PMID:28028467
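    The idea of ranking genes by a numerical stability measure can be illustrated with a much simpler stand-in than the mixed-effect model described in this abstract. The sketch below is an assumption-laden toy, not the authors' method: it normalizes counts by library size (counts per million) and ranks genes by coefficient of variation. The function names `cpm` and `stability_ranking` and the gene labels are hypothetical.

```python
from statistics import mean, pstdev

def cpm(counts):
    """Scale raw counts to counts per million using each sample's library size.

    counts: dict mapping gene name -> list of read counts, one per sample.
    """
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(counts[g][j] for g in counts) for j in range(n_samples)]
    return {g: [1e6 * c / lib_sizes[j] for j, c in enumerate(v)]
            for g, v in counts.items()}

def stability_ranking(counts):
    """Return gene names sorted from most to least stable (lowest CV first).

    Note: this is a plain coefficient of variation, a simplification chosen
    for illustration; it does not decompose variance by source.
    """
    norm = cpm(counts)
    cv = {g: pstdev(v) / mean(v) for g, v in norm.items() if mean(v) > 0}
    return sorted(cv, key=cv.get)

# Toy data with three hypothetical genes and three samples.
toy = {"g1": [50, 50, 50], "g2": [30, 10, 40], "g3": [20, 40, 10]}
ranking = stability_ranking(toy)
```

    As the abstract stresses, the ranking depends on the normalization reference; here the reference is implicitly the total count of all genes in the toy table.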

  6. Identifying reproducible cancer-associated highly expressed genes with important functional significances using multiple datasets

    PubMed Central

    Huang, Haiyan; Li, Xiangyu; Guo, You; Zhang, Yuncong; Deng, Xusheng; Chen, Lufei; Zhang, Jiahui; Guo, Zheng; Ao, Lu

    2016-01-01

    Identifying differentially expressed (DE) genes between cancer and normal tissues is of basic importance for studying cancer mechanisms. However, current methods, such as the commonly used Significance Analysis of Microarrays (SAM), are biased toward genes with low expression levels. Recently, we proposed an algorithm, named the pairwise difference (PD) algorithm, to identify highly expressed DE genes based on reproducibility evaluation of top-ranked expression differences between paired technical replicates of cells under two experimental conditions. In this study, we extended the application of the algorithm to the identification of DE genes between two types of tissue samples (biological replicates) based on several independent datasets or sub-datasets of a dataset, by constructing multiple paired average gene expression profiles for the two types of samples. Using multiple datasets for lung and esophageal cancers, we demonstrated that PD could identify many DE genes highly expressed in both cancer and normal tissues that tended to be missed by the commonly used SAM. These highly expressed DE genes, including many housekeeping genes, were significantly enriched in many conserved pathways, such as the ribosome, proteasome, phagosome and TNF signaling pathways, that are functionally important in oncogenesis. PMID:27796338

  7. Genes associated with agronomic traits in non-heading Chinese cabbage identified by expression profiling

    PubMed Central

    2014-01-01

    Background: The genomes of non-heading Chinese cabbage (Brassica rapa ssp. chinensis), heading Chinese cabbage (Brassica rapa ssp. pekinensis) and their close relative Arabidopsis thaliana have provided important resources for studying the evolution and genetic improvement of cruciferous plants. Natural growing conditions present these plants with a variety of physiological challenges for which they have a repertoire of genes that ensure adaptability and normal growth. We investigated the differential expression of genes that control adaptability and development in plants growing in the natural environment to study underlying mechanisms of their expression. Results: Using digital gene expression tag profiling, we constructed an expression profile to identify genes related to important agronomic traits under natural growing conditions. Among three non-heading Chinese cabbage cultivars, we found thousands of genes that exhibited significant differences in expression levels at five developmental stages. Through comparative analysis and previous reports, we identified several candidate genes associated with late flowering, cold tolerance, self-incompatibility, and leaf color. Two genes related to cold tolerance were verified using quantitative real-time PCR. Conclusions: We identified a large number of genes associated with important agronomic traits of non-heading Chinese cabbage. This analysis will provide a wealth of resources for molecular-assisted breeding of cabbage. The raw data and detailed results of this analysis are available at the website http://nhccdata.njau.edu.cn. PMID:24655567

  8. Comparative Analysis of Cluster Validity Indices in Identifying Some Possible Genes Mediating Certain Cancers.

    PubMed

    Ghosh, Anupam; Dhara, Bibhas Chandra; De, Rajat K

    2013-04-01

    In this article, we compare the performance of 19 cluster validity indices, in identifying some possible genes mediating certain cancers, based on gene expression data. For the purpose of this comparison, we have developed a method. The proposed method involves cluster generation, selection of the best k-value or c-values, cluster identification, identifying the altered gene cluster, scoring an altered gene cluster and determining the best k-value or c-value exploring through biological repositories. The effectiveness of the method has been demonstrated on three gene expression data sets dealing with human lung cancer, colon cancer, and leukemia. Here, we have used three clustering algorithms, i.e., k-means, PAM and fuzzy c-means. We have used biochemical pathways related to these cancers and p-value statistics for validating the study. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Genes that affect brain structure and function identified by rare variant analyses of Mendelian neurologic disease

    PubMed Central

    Karaca, Ender; Harel, Tamar; Pehlivan, Davut; Jhangiani, Shalini N.; Gambin, Tomasz; Akdemir, Zeynep Coban; Gonzaga-Jauregui, Claudia; Erdin, Serkan; Bayram, Yavuz; Campbell, Ian M.; Hunter, Jill V.; Atik, Mehmed M.; Van Esch, Hilde; Yuan, Bo; Wiszniewski, Wojciech; Isikay, Sedat; Yesil, Gozde; Yuregir, Ozge O.; Bozdogan, Sevcan Tug; Aslan, Huseyin; Aydin, Hatip; Tos, Tulay; Aksoy, Ayse; De Vivo, Darryl C.; Jain, Preti; Geckinli, B. Bilge; Sezer, Ozlem; Gul, Davut; Durmaz, Burak; Cogulu, Ozgur; Ozkinay, Ferda; Topcu, Vehap; Candan, Sukru; Cebi, Alper Han; Ikbal, Mevlit; Gulec, Elif Yilmaz; Gezdirici, Alper; Koparir, Erkan; Ekici, Fatma; Coskun, Salih; Cicek, Salih; Karaer, Kadri; Koparir, Asuman; Duz, Mehmet Bugrahan; Kirat, Emre; Fenercioglu, Elif; Ulucan, Hakan; Seven, Mehmet; Guran, Tulay; Elcioglu, Nursel; Yildirim, Mahmut Selman; Aktas, Dilek; Alikaşifoğlu, Mehmet; Ture, Mehmet; Yakut, Tahsin; Overton, John D.; Yuksel, Adnan; Ozen, Mustafa; Muzny, Donna M.; Adams, David R.; Boerwinkle, Eric; Chung, Wendy K.; Gibbs, Richard A.; Lupski, James R

    2015-01-01

    Development of the human nervous system involves complex interactions between fundamental cellular processes and requires a multitude of genes, many of which remain to be associated with human disease. We applied whole exome sequencing to 128 mostly consanguineous families with neurogenetic disorders that often included brain malformations. Rare variant analyses for both single nucleotide variant (SNV) and copy number variant (CNV) alleles allowed for identification of 45 novel variants in 43 known disease genes, 41 candidate genes, and CNVs in 10 families, with an overall potential molecular cause identified in >85% of families studied. Among the candidate genes identified, we found PRUNE, VARS, and DHX37 in multiple families, and homozygous loss of function variants in AGBL2, SLC18A2, SMARCA1, UBQLN1, and CPLX1. Neuroimaging and in silico analysis of functional and expression proximity between candidate and known disease genes allowed for further understanding of genetic networks underlying specific types of brain malformations. PMID:26539891

  10. The FUN of identifying gene function in bacterial pathogens; insights from Salmonella functional genomics.

    PubMed

    Hammarlöf, Disa L; Canals, Rocío; Hinton, Jay C D

    2013-10-01

    The availability of thousands of genome sequences of bacterial pathogens poses a particular challenge because each genome contains hundreds of genes of unknown function (FUN). How can we easily discover which FUN genes encode important virulence factors? One solution is to combine two different functional genomic approaches. First, transcriptomics identifies bacterial FUN genes that show differential expression during the process of mammalian infection. Second, global mutagenesis identifies individual FUN genes that the pathogen requires to cause disease. The intersection of these datasets can reveal a small set of candidate genes most likely to encode novel virulence attributes. We demonstrate this approach with the Salmonella infection model, and propose that a similar strategy could be used for other bacterial pathogens.
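    The intersection step described in this abstract amounts to a set operation. The following sketch is illustrative only; the function name and the locus tags are hypothetical placeholders, not data from the study.

```python
def candidate_virulence_genes(fun_genes, differentially_expressed, required_for_infection):
    """Return FUN genes that appear in both functional-genomics datasets.

    fun_genes: genes of unknown function
    differentially_expressed: genes flagged by transcriptomics during infection
    required_for_infection: genes flagged by global mutagenesis
    """
    return set(fun_genes) & set(differentially_expressed) & set(required_for_infection)

# Hypothetical locus tags used purely as placeholders.
fun = {"locus_0001", "locus_0002", "locus_0003", "locus_0004"}
expressed = {"locus_0002", "locus_0003", "locus_0099"}
required = {"locus_0003", "locus_0004", "locus_0099"}
candidates = candidate_virulence_genes(fun, expressed, required)
```

    Only genes supported by both lines of evidence survive, which is why the resulting candidate list is small.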

  11. Using Formal Concept Analysis to Identify Negative Correlations in Gene Expression Data.

    PubMed

    Tu, Xudong; Wang, Yuanliang; Zhang, Maolan; Wu, Jinchuan

    2016-01-01

    Many recent biological studies have reported that two groups of genes tend to show negatively correlated or opposite expression tendencies in many biological processes or pathways. The negative correlation between genes may imply an important biological mechanism. In this study, we proposed an FCA-based negative correlation algorithm (NCFCA) that can effectively identify opposite expression tendencies between two gene groups in gene expression data. After applying it to expression data of cell cycle-regulated genes in yeast, we found that six minichromosome maintenance (MCM) family genes showed an expression trend opposite to that of eight core histone family genes. Furthermore, we confirmed that the negative correlation expression pattern between these two families may be conserved in the cell cycle. Finally, we discussed the reasons underlying the negative correlation of the six MCM family genes with the eight core histone family genes. Our results revealed that negative correlation is an important potential mechanism that maintains the balance of biological systems by repressing some genes while inducing others. It can thus provide new understanding of gene expression and regulation, the causes of diseases, etc.
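    To make the notion of "opposite expression tendency" concrete, the sketch below screens two gene groups for strongly negative Pearson correlations. This is a plain correlation screen swapped in for illustration; it is not the FCA-based algorithm of the abstract. The function names, the threshold of -0.8 and the toy profiles are all assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sx * sy)

def negative_pairs(group_a, group_b, threshold=-0.8):
    """List (gene_a, gene_b, r) for cross-group pairs with r <= threshold.

    group_a, group_b: dicts mapping gene name -> list of expression values
    measured over the same conditions or time points.
    """
    pairs = []
    for ga, pa in group_a.items():
        for gb, pb in group_b.items():
            r = pearson(pa, pb)
            if r <= threshold:
                pairs.append((ga, gb, r))
    return pairs

# Toy profiles over four time points (hypothetical values).
group_a = {"a1": [1, 2, 3, 4]}
group_b = {"b1": [4, 3, 2, 1], "b2": [1, 2, 3, 5]}
pairs = negative_pairs(group_a, group_b)
```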

  12. Utilization of digital differential display to identify differentially expressed genes related to rumen development.

    PubMed

    Kato, Daichi; Suzuki, Yutaka; Haga, Satoshi; So, KyoungHa; Yamauchi, Eri; Nakano, Miwa; Ishizaki, Hiroshi; Choi, Kichoon; Katoh, Kazuo; Roh, Sang-Gun

    2016-04-01

    This study aimed to identify the genes associated with the development of the rumen epithelium by screening for candidate genes by digital differential display (DDD) in silico. Using DDD in NCBI's UniGene database, expressed sequence tag (EST)-based gene expression profiles were analyzed in rumen, reticulum, omasum, abomasum and other tissues in cattle. One hundred and ten candidate genes with high expression in the rumen were derived from a library of all tissues. The expression levels of 11 of the candidate genes were analyzed in the rumen, reticulum, omasum and abomasum of nine Japanese Black male calves (5-week-old pre-weaning: n = 3; 15-week-old weaned calves: n = 6). Among the 11 genes, only 3-hydroxy-3-methylglutaryl-CoA synthase 2 (HMGCS2), aldo-keto reductase family 1, member C1-like (AKR1C1), and fatty acid binding protein 3 (FABP3) showed significant changes in the levels of gene expression in the rumen between pre- and post-weaning calves. These results indicate that DDD analysis in silico can be useful for screening candidate genes related to rumen development, and that the changes in expression levels of three genes in the rumen may have been caused by weaning, aging or both.

  13. A computational approach to identifying gene-microRNA modules in cancer.

    PubMed

    Jin, Daeyong; Lee, Hyunju

    2015-01-01

    MicroRNAs (miRNAs) play key roles in the initiation and progression of various cancers by regulating genes. Regulatory interactions between genes and miRNAs are complex, as multiple miRNAs can regulate multiple genes. In addition, these interactions vary from patient to patient and even among patients with the same cancer type, as cancer development is a heterogeneous process. These relationships are more complicated because transcription factors and other regulatory molecules can also regulate miRNAs and genes. Hence, it is important to identify the complex relationships between genes and miRNAs in cancer. In this study, we propose a computational approach to constructing modules that represent these relationships by integrating the expression data of genes and miRNAs with gene-gene interaction data. First, we used a biclustering algorithm to construct modules consisting of a subset of genes and a subset of samples to incorporate the heterogeneity of cancer cells. Second, we combined gene-gene interactions to include genes that play important roles in cancer-related pathways. Then, we selected miRNAs that are closely associated with genes in the modules based on a Gaussian Bayesian network and the Bayesian information criterion. When we applied our approach to ovarian cancer and glioblastoma (GBM) data sets, 33 and 54 modules were constructed, respectively. In these modules, 91% and 94% of ovarian cancer and GBM modules, respectively, were explained either by direct regulation between genes and miRNAs or by indirect relationships via transcription factors. In addition, 48.4% and 74.0% of modules from ovarian cancer and GBM, respectively, were enriched with cancer-related pathways, and 51.7% and 71.7% of miRNAs in modules were ovarian cancer-related miRNAs and GBM-related miRNAs, respectively. Finally, we extensively analyzed significant modules and showed that most genes in these modules were related to ovarian cancer and GBM.

  14. CAsubtype: An R Package to Identify Gene Sets Predictive of Cancer Subtypes and Clinical Outcomes.

    PubMed

    Kong, Hualei; Tong, Pan; Zhao, Xiaodong; Sun, Jielin; Li, Hua

    2017-01-21

    In the past decade, molecular classification of cancer has gained high popularity owing to its high predictive power on clinical outcomes as compared with traditional methods commonly used in clinical practice. In particular, using gene expression profiles, recent studies have successfully identified a number of gene sets for the delineation of cancer subtypes that are associated with distinct prognosis. However, identification of such gene sets remains a laborious task due to the lack of tools with flexibility, integration and ease of use. To reduce the burden, we have developed an R package, CAsubtype, to efficiently identify gene sets predictive of cancer subtypes and clinical outcomes. By integrating more than 13,000 annotated gene sets, CAsubtype provides a comprehensive repertoire of candidates for new cancer subtype identification. For easy data access, CAsubtype further includes the gene expression and clinical data of more than 2000 cancer patients from TCGA. CAsubtype first employs principal component analysis to identify gene sets (from user-provided or package-integrated ones) with robust principal components representing significantly large variation between cancer samples. Based on these principal components, CAsubtype visualizes the sample distribution in low-dimensional space for better understanding of the distinction between samples and classifies samples into subgroups with prevalent clustering algorithms. Finally, CAsubtype performs survival analysis to compare the clinical outcomes between the identified subgroups, assessing their clinical value as potentially novel cancer subtypes. In conclusion, CAsubtype is a flexible and well-integrated tool in the R environment to identify gene sets for cancer subtype identification and clinical outcome prediction. Its simple R commands and comprehensive data sets enable efficient examination of the clinical value of any given gene set, thus facilitating hypothesis generating and testing in biological and

  15. Overexpression screens identify conserved dosage chromosome instability genes in yeast and human cancer

    PubMed Central

    Duffy, Supipi; Fam, Hok Khim; Wang, Yi Kan; Styles, Erin B.; Kim, Jung-Hyun; Ang, J. Sidney; Singh, Tejomayee; Larionov, Vladimir; Shah, Sohrab P.; Andrews, Brenda; Boerkoel, Cornelius F.; Hieter, Philip

    2016-01-01

    Somatic copy number amplification and gene overexpression are common features of many cancers. To determine the role of gene overexpression on chromosome instability (CIN), we performed genome-wide screens in the budding yeast for yeast genes that cause CIN when overexpressed, a phenotype we refer to as dosage CIN (dCIN), and identified 245 dCIN genes. This catalog of genes reveals human orthologs known to be recurrently overexpressed and/or amplified in tumors. We show that two genes, TDP1, a tyrosyl-DNA-phosphodiesterase, and TAF12, an RNA polymerase II TATA-box binding factor, cause CIN when overexpressed in human cells. Rhabdomyosarcoma lines with elevated human Tdp1 levels also exhibit CIN that can be partially rescued by siRNA-mediated knockdown of TDP1. Overexpression of dCIN genes represents a genetic vulnerability that could be leveraged for selective killing of cancer cells through targeting of an unlinked synthetic dosage lethal (SDL) partner. Using SDL screens in yeast, we identified a set of genes that when deleted specifically kill cells with high levels of Tdp1. One gene was the histone deacetylase RPD3, for which there are known inhibitors. Both HT1080 cells overexpressing hTDP1 and rhabdomyosarcoma cells with elevated levels of hTdp1 were more sensitive to histone deacetylase inhibitors valproic acid (VPA) and trichostatin A (TSA), recapitulating the SDL interaction in human cells and suggesting VPA and TSA as potential therapeutic agents for tumors with elevated levels of hTdp1. The catalog of dCIN genes presented here provides a candidate list to identify genes that cause CIN when overexpressed in cancer, which can then be leveraged through SDL to selectively target tumors. PMID:27551064

  16. Variability of Gene Expression Identifies Transcriptional Regulators of Early Human Embryonic Development

    PubMed Central

    Hasegawa, Yu; Taylor, Deanne; Ovchinnikov, Dmitry A.; Wolvetang, Ernst J.; de Torrenté, Laurence; Mar, Jessica C.

    2015-01-01

    An analysis of gene expression variability can provide an insightful window into how regulatory control is distributed across the transcriptome. In a single cell analysis, the inter-cellular variability of gene expression measures the consistency of transcript copy numbers observed between cells in the same population. Application of these ideas to the study of early human embryonic development may reveal important insights into the transcriptional programs controlling this process, based on which components are most tightly regulated. Using a published single cell RNA-seq data set of human embryos collected at four-cell, eight-cell, morula and blastocyst stages, we identified genes with the most stable, invariant expression across all four developmental stages. Stably-expressed genes were found to be enriched for those sharing indispensable features, including essentiality, haploinsufficiency, and ubiquitous expression. The stable genes were less likely to be associated with loss-of-function variant genes or human recessive disease genes affected by a DNA copy number variant deletion, suggesting that stable genes have a functional impact on the regulation of some of the basic cellular processes. Genes with low expression variability at early stages of development are involved in regulation of DNA methylation, responses to hypoxia and telomerase activity, whereas by the blastocyst stage, low-variability genes are enriched for metabolic processes as well as telomerase signaling. Based on changes in expression variability, we identified a putative set of gene expression markers of morulae and blastocyst stages. Experimental validation of a blastocyst-expressed variability marker demonstrated that HDDC2 plays a role in the maintenance of pluripotency in human ES and iPS cells. Collectively our analyses identified new regulators involved in human embryonic development that would have otherwise been missed using methods that focus on assessment of the average expression

  17. A Screen for Genes Expressed in the Olfactory Organs of Drosophila melanogaster Identifies Genes Involved in Olfactory Behaviour

    PubMed Central

    Tunstall, Narelle E.; Herr, Anabel; de Bruyne, Marien; Warr, Coral G.

    2012-01-01

    Background: For insects the sense of smell and associated olfactory-driven behaviours are essential for survival. Insects detect odorants with families of olfactory receptor proteins that are very different to those of mammals, and there are likely to be other unique genes and genetic pathways involved in the function and development of the insect olfactory system. Methodology/Principal Findings: We have performed a genetic screen of a set of 505 Drosophila melanogaster gene trap insertion lines to identify novel genes expressed in the adult olfactory organs. We identified 16 lines with expression in the olfactory organs, many of which exhibited expression of the trapped genes in olfactory receptor neurons. Phenotypic analysis showed that six of the lines have decreased olfactory responses in a behavioural assay, and for one of these we showed that precise excision of the P element reverts the phenotype to wild type, confirming a role for the trapped gene in olfaction. To confirm the identity of the genes trapped in the lines we performed molecular analysis of some of the insertion sites. While for many lines the reported insertion sites were correct, we also demonstrated that for a number of lines the reported location of the element was incorrect, and in three lines there were in fact two pGT element insertions. Conclusions/Significance: We identified 16 new genes expressed in the Drosophila olfactory organs, the majority in neurons, and for several of the gene trap lines demonstrated a defect in olfactory-driven behaviour. Further characterisation of these genes and their roles in olfactory system function and development will increase our understanding of how the insect olfactory system has evolved to perform the same essential function as that of mammals, but using very different molecular genetic mechanisms. PMID:22530061

  18. A combined analysis of microarray gene expression studies of the human prefrontal cortex identifies genes implicated in schizophrenia.

    PubMed

    Pérez-Santiago, Josué; Diez-Alarcia, Rebeca; Callado, Luis F; Zhang, Jin X; Chana, Gursharan; White, Cory H; Glatt, Stephen J; Tsuang, Ming T; Everall, Ian P; Meana, J Javier; Woelk, Christopher H

    2012-11-01

    Small cohort sizes and modest levels of gene expression changes in brain tissue have plagued the statistical approaches employed in microarray studies investigating the mechanism of schizophrenia. To combat these problems a combined analysis of six prior microarray studies was performed to facilitate the robust statistical analysis of gene expression data from the dorsolateral prefrontal cortex of 107 patients with schizophrenia and 118 healthy subjects. Multivariate permutation tests identified 144 genes that were differentially expressed between schizophrenia and control groups. Seventy of these genes were identified as differentially expressed in at least one component microarray study but none of these individual studies had the power to identify the remaining 74 genes, demonstrating the utility of a combined approach. Gene ontology terms and biological pathways that were significantly enriched for differentially expressed genes were related to neuronal cell-cell signaling, mesenchymal induction, and mitogen-activated protein kinase signaling, which have all previously been associated with the etiopathogenesis of schizophrenia. The differential expression of BAG3, C4B, EGR1, MT1X, NEUROD6, SST and S100A8 was confirmed by real-time quantitative PCR in an independent cohort using postmortem human prefrontal cortex samples. Comparison of gene expression between schizophrenic subjects with and without detectable levels of antipsychotics in their blood suggests that the modulation of MT1X and S100A8 may be the result of drug exposure. In conclusion, this combined analysis has resulted in a statistically robust identification of genes whose dysregulation may contribute to the mechanism of schizophrenia.

  19. Identifying Novel Candidate Genes Related to Apoptosis from a Protein-Protein Interaction Network

    PubMed Central

    Wang, Baoman; Yuan, Fei; Kong, Xiangyin; Hu, Lan-Dian; Cai, Yu-Dong

    2015-01-01

    Apoptosis is the process of programmed cell death (PCD) that occurs in multicellular organisms. This process of normal cell death is required to maintain the balance of homeostasis. In addition, some diseases, such as obesity, cancer, and neurodegenerative diseases, can be cured through apoptosis, which produces few side effects. An effective comprehension of the mechanisms underlying apoptosis will be helpful to prevent and treat some diseases. The identification of genes related to apoptosis is essential to uncover its underlying mechanisms. In this study, a computational method was proposed to identify novel candidate genes related to apoptosis. First, protein-protein interaction information was used to construct a weighted graph. Second, a shortest path algorithm was applied to the graph to search for new candidate genes. Finally, the obtained genes were filtered by a permutation test. As a result, 26 genes were obtained, and we discuss their likelihood of being novel apoptosis-related genes by collecting evidence from published literature. PMID:26543496
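    The first two steps outlined in this abstract (build a weighted interaction graph, then search shortest paths for new candidates) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that candidates are the interior nodes of shortest paths between pairs of known genes, it uses a generic Dijkstra search, and it omits the permutation-test filter. All node names and weights are hypothetical.

```python
import heapq
from itertools import combinations

def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a graph given as {node: {neighbor: weight}}.

    Weights must be non-negative. Returns the node list of one shortest
    path from source to target, or None if target is unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            break
        for nbr, w in graph.get(node, {}).items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def candidate_genes(graph, known):
    """Collect interior nodes of shortest paths between all pairs of known genes."""
    found = set()
    for a, b in combinations(sorted(known), 2):
        path = shortest_path(graph, a, b)
        if path:
            found.update(path[1:-1])
    return found - set(known)

# Toy undirected network: K1 and K2 are "known" genes, X and Y are unknown.
toy_graph = {
    "K1": {"X": 1, "Y": 5},
    "K2": {"X": 1, "Y": 5},
    "X": {"K1": 1, "K2": 1},
    "Y": {"K1": 5, "K2": 5},
}
candidates = candidate_genes(toy_graph, {"K1", "K2"})
```

    In the toy network the cheap route between the two known genes runs through X, so X is proposed and Y is not. A real analysis would then test whether such hits arise more often than expected by chance.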

  20. Targeted sequencing identifies 91 neurodevelopmental disorder risk genes with autism and developmental disability biases

    PubMed Central

    Stessman, Holly A. F.; Xiong, Bo; Coe, Bradley P.; Wang, Tianyun; Hoekzema, Kendra; Fenckova, Michaela; Kvarnung, Malin; Gerdts, Jennifer; Trinh, Sandy; Cosemans, Nele; Vives, Laura; Lin, Janice; Turner, Tychele N.; Santen, Gijs; Ruivenkamp, Claudia; Kriek, Marjolein; van Haeringen, Arie; Aten, Emmelien; Friend, Kathryn; Liebelt, Jan; Barnett, Christopher; Haan, Eric; Shaw, Marie; Gecz, Jozef; Anderlid, Britt-Marie; Nordgren, Ann; Lindstrand, Anna; Schwartz, Charles; Kooy, R. Frank; Vandeweyer, Geert; Helsmoortel, Celine; Romano, Corrado; Alberti, Antonino; Vinci, Mirella; Avola, Emanuela; Giusto, Stefania; Courchesne, Eric; Pramparo, Tiziano; Pierce, Karen; Nalabolu, Srinivasa; Amaral, David; Scheffer, Ingrid E.; Delatycki, Martin B.; Lockhart, Paul J.; Hormozdiari, Fereydoun; Harich, Benjamin; Castells-Nobau, Anna; Xia, Kun; Peeters, Hilde; Nordenskjöld, Magnus; Schenck, Annette; Bernier, Raphael A.; Eichler, Evan E.

    2017-01-01

    Gene-disruptive mutations contribute to the biology of neurodevelopmental disorders (NDDs), but most pathogenic genes are not known. We sequenced 208 candidate genes from >11,730 patients and >2,867 controls. We report 91 genes with an excess of de novo mutations or private disruptive mutations in 5.7% of patients, including 38 novel NDD genes. Drosophila functional assays of a subset bolster their involvement in NDDs. We identify 25 genes that show a bias for autism versus intellectual disability and highlight a network associated with high-functioning autism (FSIQ>100). Clinical follow-up for NAA15, KMT5B, and ASH1L reveals novel syndromic and non-syndromic forms of disease. PMID:28191889

  1. Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics

    PubMed Central

    Zeeberg, Barry R; Riss, Joseph; Kane, David W; Bussey, Kimberly J; Uchio, Edward; Linehan, W Marston; Barrett, J Carl; Weinstein, John N

    2004-01-01

    Background: When processing microarray data sets, we recently noticed that some gene names were being changed inadvertently to non-gene names. Results: A little detective work traced the problem to default date format conversions and floating-point format conversions in the very useful Excel program package. The date conversions affect at least 30 gene names; the floating-point conversions affect at least 2,000 if Riken identifiers are included. These conversions are irreversible; the original gene names cannot be recovered. Conclusions: Users of Excel for analyses involving gene names should be aware of this problem, which can cause genes, including medically important ones, to be lost from view and which has contaminated even carefully curated public databases. We provide work-arounds and scripts for circumventing the problem. PMID:15214961
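    A pre-flight check for the two conversion types named in this abstract might look like the sketch below. It is not one of the authors' scripts. The regular expressions and the function name `excel_risk` are assumptions: they flag symbols that look like a month abbreviation followed by a number (for example SEPT2 or MARCH1) and identifiers of the form digits, "E", digits (for example the RIKEN-style 2310009E13), which a spreadsheet may read as a date or as scientific notation. The month list is not claimed to be exhaustive.

```python
import re

# Month-like prefixes that a spreadsheet may interpret as dates (assumed list).
MONTH_PREFIXES = ("JAN", "FEB", "MAR", "MARCH", "APR", "MAY", "JUN",
                  "JUL", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC")
DATE_LIKE = re.compile(r"^(%s)-?(\d{1,2})$" % "|".join(MONTH_PREFIXES),
                       re.IGNORECASE)
# Digits, the letter E, digits: looks like scientific notation.
FLOAT_LIKE = re.compile(r"^\d+E\d+$", re.IGNORECASE)

def excel_risk(symbol):
    """Return 'date', 'float', or None for a gene symbol or identifier."""
    s = symbol.strip()
    if DATE_LIKE.match(s):
        return "date"
    if FLOAT_LIKE.match(s):
        return "float"
    return None

# Example screen over a small list of identifiers.
symbols = ["SEPT2", "MARCH1", "DEC1", "TP53", "2310009E13"]
flagged = {s: excel_risk(s) for s in symbols if excel_risk(s)}
```

    Flagged identifiers can then be imported as text columns rather than left to automatic type detection.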

  2. Identifying genes and gene networks involved in chromium metabolism and detoxification in Crambe abyssinica.

    PubMed

    Zulfiqar, Asma; Paulose, Bibin; Chhikara, Sudesh; Dhankher, Om Parkash

    2011-10-01

    Chromium pollution is a serious environmental problem with few cost-effective remediation strategies available. Crambe abyssinica (a member of Brassicaceae), a non-food, fast growing high biomass crop, is an ideal candidate for phytoremediation of heavy metal-contaminated soils. The present study used a PCR-Select Suppression Subtraction Hybridization approach in C. abyssinica to isolate differentially expressed genes in response to Cr exposure. A total of 72 differentially expressed subtracted cDNAs were sequenced and found to represent 43 genes. The subtracted cDNAs suggest that Cr stress significantly affects pathways related to stress/defense, ion transporters, sulfur assimilation, cell signaling, protein degradation, photosynthesis and cell metabolism. The regulation of these genes in response to Cr exposure was further confirmed by semi-quantitative RT-PCR. Characterization of these differentially expressed genes may enable the engineering of non-food, high-biomass plants, including C. abyssinica, for phytoremediation of Cr-contaminated soils and sediments.

  3. Gene expression differences between Noccaea caerulescens ecotypes help to identify candidate genes for metal phytoremediation.

    PubMed

    Halimaa, Pauliina; Lin, Ya-Fen; Ahonen, Viivi H; Blande, Daniel; Clemens, Stephan; Gyenesei, Attila; Häikiö, Elina; Kärenlampi, Sirpa O; Laiho, Asta; Aarts, Mark G M; Pursiheimo, Juha-Pekka; Schat, Henk; Schmidt, Holger; Tuomainen, Marjo H; Tervahauta, Arja I

    2014-03-18

    Populations of Noccaea caerulescens show tremendous differences in their capacity to hyperaccumulate and hypertolerate metals. To explore the differences that could contribute to these traits, we undertook SOLiD high-throughput sequencing of the root transcriptomes of three phenotypically well-characterized N. caerulescens accessions, i.e., Ganges, La Calamine, and Monte Prinzera. Genes with possible contribution to zinc, cadmium, and nickel hyperaccumulation and hypertolerance were predicted. The most significant differences between the accessions were related to metal ion (di-, trivalent inorganic cation) transmembrane transporter activity, iron and calcium ion binding, (inorganic) anion transmembrane transporter activity, and antioxidant activity. Analysis of correlation between the expression profile of each gene and the metal-related characteristics of the accessions disclosed both previously characterized (HMA4, HMA3) and new candidate genes (e.g., for nickel IRT1, ZIP10, and PDF2.3) as possible contributors to the hyperaccumulation/tolerance phenotype. A number of unknown Noccaea-specific transcripts also showed correlation with Zn(2+), Cd(2+), or Ni(2+) hyperaccumulation/tolerance. This study shows that N. caerulescens populations have evolved great diversity in the expression of metal-related genes, facilitating adaptation to various metalliferous soils. The information will be helpful in the development of improved plants for metal phytoremediation.

  4. Candidate Luminal B Breast Cancer Genes Identified by Genome, Gene Expression and DNA Methylation Profiling

    PubMed Central

    Addou-Klouche, Lynda; Finetti, Pascal; Saade, Marie-Rose; Manai, Marwa; Carbuccia, Nadine; Bekhouche, Ismahane; Letessier, Anne; Charafe-Jauffret, Emmanuelle; Jacquemier, Jocelyne; Spicuglia, Salvatore; de The, Hugues; Viens, Patrice; Bertucci, François; Birnbaum, Daniel; Chaffanet, Max

    2014-01-01

    Breast cancers (BCs) of the luminal B subtype are estrogen receptor-positive (ER+), highly proliferative, resistant to standard therapies and have a poor prognosis. To better understand this subtype we compared DNA copy number aberrations (CNAs), DNA promoter methylation, gene expression profiles, and somatic mutations in nine selected genes, in 32 luminal B tumors with those observed in 156 BCs of the other molecular subtypes. Frequent CNAs included 8p11-p12 and 11q13.1-q13.2 amplifications, 7q11.22-q34, 8q21.12-q24.23, 12p12.3-p13.1, 12q13.11-q24.11, 14q21.1-q23.1, 17q11.1-q25.1, 20q11.23-q13.33 gains and 6q14.1-q24.2, 9p21.3-p24.3, 9q21.2, 18p11.31-p11.32 losses. A total of 237 and 101 luminal B-specific candidate oncogenes and tumor suppressor genes (TSGs) presented a deregulated expression in relation with their CNAs, including 11 genes previously reported to be associated with endocrine resistance. Interestingly, 88% of the potential TSGs are located within chromosome arm 6q, and seven candidate oncogenes are potential therapeutic targets. A total of 100 candidate oncogenes were validated in a public series of 5,765 BCs and the overexpression of 67 of these was associated with poor survival in luminal tumors. Twenty-four genes presented a deregulated expression in relation with a high DNA methylation level. FOXO3, PIK3CA and TP53 were the most frequently mutated genes among the nine tested. In a meta-analysis of next-generation sequencing data in 875 BCs, KCNB2 mutations were associated with luminal B cases while candidate TSGs MDN1 (6q15) and UTRN (6q24) were mutated in this subtype. In conclusion, we have reported luminal B candidate genes that may play a role in the development and/or hormone resistance of this aggressive subtype. PMID:24416132

  5. Candidate luminal B breast cancer genes identified by genome, gene expression and DNA methylation profiling.

    PubMed

    Cornen, Stéphanie; Guille, Arnaud; Adélaïde, José; Addou-Klouche, Lynda; Finetti, Pascal; Saade, Marie-Rose; Manai, Marwa; Carbuccia, Nadine; Bekhouche, Ismahane; Letessier, Anne; Raynaud, Stéphane; Charafe-Jauffret, Emmanuelle; Jacquemier, Jocelyne; Spicuglia, Salvatore; de The, Hugues; Viens, Patrice; Bertucci, François; Birnbaum, Daniel; Chaffanet, Max

    2014-01-01

    Breast cancers (BCs) of the luminal B subtype are estrogen receptor-positive (ER+), highly proliferative, resistant to standard therapies and have a poor prognosis. To better understand this subtype we compared DNA copy number aberrations (CNAs), DNA promoter methylation, gene expression profiles, and somatic mutations in nine selected genes, in 32 luminal B tumors with those observed in 156 BCs of the other molecular subtypes. Frequent CNAs included 8p11-p12 and 11q13.1-q13.2 amplifications, 7q11.22-q34, 8q21.12-q24.23, 12p12.3-p13.1, 12q13.11-q24.11, 14q21.1-q23.1, 17q11.1-q25.1, 20q11.23-q13.33 gains and 6q14.1-q24.2, 9p21.3-p24.3, 9q21.2, 18p11.31-p11.32 losses. A total of 237 and 101 luminal B-specific candidate oncogenes and tumor suppressor genes (TSGs) presented a deregulated expression in relation with their CNAs, including 11 genes previously reported to be associated with endocrine resistance. Interestingly, 88% of the potential TSGs are located within chromosome arm 6q, and seven candidate oncogenes are potential therapeutic targets. A total of 100 candidate oncogenes were validated in a public series of 5,765 BCs and the overexpression of 67 of these was associated with poor survival in luminal tumors. Twenty-four genes presented a deregulated expression in relation with a high DNA methylation level. FOXO3, PIK3CA and TP53 were the most frequently mutated genes among the nine tested. In a meta-analysis of next-generation sequencing data in 875 BCs, KCNB2 mutations were associated with luminal B cases while candidate TSGs MDN1 (6q15) and UTRN (6q24) were mutated in this subtype. In conclusion, we have reported luminal B candidate genes that may play a role in the development and/or hormone resistance of this aggressive subtype.

  6. A novel bioinformatics approach identifies candidate genes for the synthesis and feruloylation of arabinoxylan.

    PubMed

    Mitchell, Rowan A C; Dupree, Paul; Shewry, Peter R

    2007-05-01

    Arabinoxylans (AXs) are major components of graminaceous plant cell walls, including those in the grain and straw of economically important cereals. Despite some recent advances in identifying the genes encoding biosynthetic enzymes for a number of other plant cell wall polysaccharides, the genes encoding enzymes of the final stages of AX synthesis have not been identified. We have therefore adopted a novel bioinformatics approach based on estimation of differential expression of orthologous genes between taxonomic divisions of species. Over 3 million public domain cereal and dicot expressed sequence tags were mapped onto the complete sets of rice (Oryza sativa) and Arabidopsis (Arabidopsis thaliana) genes, respectively. It was assumed that genes in cereals involved in AX biosynthesis would be expressed at high levels and that their orthologs in dicotyledonous plants would be expressed at much lower levels. Considering all rice genes encoding putative glycosyl transferases (GTs) predicted to be integral membrane proteins, genes in the GT43, GT47, and GT61 families emerged as much the strongest candidates. When the search was widened to all other rice or Arabidopsis genes predicted to encode integral membrane proteins, cereal genes in Pfam family PF02458 emerged as candidates for the feruloylation of AX. Our analysis, known activities, and recent findings elsewhere are most consistent with genes in the GT43 family encoding beta-1,4-xylan synthases, genes in the GT47 family encoding xylan alpha-1,2- or alpha-1,3-arabinosyl transferases, and genes in the GT61 family encoding feruloyl-AX beta-1,2-xylosyl transferases.

  7. Suppression subtractive hybridization and comparative expression analysis to identify developmentally regulated genes in filamentous fungi.

    PubMed

    Gesing, Stefan; Schindler, Daniel; Nowrousian, Minou

    2013-09-01

    Ascomycetes differentiate four major morphological types of fruiting bodies (apothecia, perithecia, pseudothecia and cleistothecia) that are derived from an ancestral fruiting body. Thus, fruiting body differentiation is most likely controlled by a set of common core genes. One way to identify such genes is to search for genes with evolutionarily conserved expression patterns. Using suppression subtractive hybridization (SSH), we selected differentially expressed transcripts in Pyronema confluens (Pezizales) by comparing two cDNA libraries specific for sexual and for vegetative development, respectively. The expression patterns of selected genes from both libraries were verified by quantitative real-time PCR. Expression of several corresponding homologous genes was found to be conserved in two members of the Sordariales (Sordaria macrospora and Neurospora crassa), a derived group of ascomycetes that is only distantly related to the Pezizales. Knockout studies with N. crassa orthologues of differentially regulated genes revealed a functional role during fruiting body development for the gene NCU05079, encoding a putative MFS peptide transporter. These data indicate conserved gene expression patterns and a functional role of the corresponding genes during fruiting body development; such genes are candidates of choice for further functional analysis.

  8. Haplotype Association Mapping Identifies a Candidate Gene Region in Mice Infected With Staphylococcus aureus.

    PubMed

    Johnson, Nicole V; Ahn, Sun Hee; Deshmukh, Hitesh; Levin, Mikhail K; Nelson, Charlotte L; Scott, William K; Allen, Andrew; Fowler, Vance G; Cowell, Lindsay G

    2012-06-01

    Exposure to Staphylococcus aureus has a variety of outcomes, from asymptomatic colonization to fatal infection. Strong evidence suggests that host genetics play an important role in susceptibility, but the specific host genetic factors involved are not known. The availability of genome-wide single nucleotide polymorphism (SNP) data for inbred Mus musculus strains means that haplotype association mapping can be used to identify candidate susceptibility genes. We applied haplotype association mapping to Perlegen SNP data and kidney bacterial counts from Staphylococcus aureus-infected mice from 13 inbred strains and detected an associated block on chromosome 7. Strong experimental evidence supports the result: a separate study demonstrated the presence of a susceptibility locus on chromosome 7 using consomic mice. The associated block contains no genes, but lies within the gene cluster of the 26-member extended kallikrein gene family, whose members have well-recognized roles in the generation of antimicrobial peptides and the regulation of inflammation. Efficient mixed-model association (EMMA) testing of all SNPs with two alleles and located within the gene cluster boundaries finds two significant associations: one of the three polymorphisms defining the associated block and one in the gene closest to the block, Klk1b11. In addition, we find that 7 of the 26 kallikrein genes are differentially expressed between susceptible and resistant mice, including the Klk1b11 gene. These genes represent a promising set of candidate genes influencing susceptibility to Staphylococcus aureus.

  9. Candidate chemosensory genes identified in the endoparasitoid Meteorus pulchricornis (Hymenoptera: Braconidae) by antennal transcriptome analysis.

    PubMed

    Sheng, Sheng; Liao, Cheng-Wu; Zheng, Yu; Zhou, Yu; Xu, Yan; Song, Wen-Miao; He, Peng; Zhang, Jian; Wu, Fu-An

    2017-06-01

    Meteorus pulchricornis is an endoparasitoid wasp which attacks the larvae of various lepidopteran pests. We present the first antennal transcriptome dataset for M. pulchricornis. A total of 48,845,072 clean reads were obtained and 34,967 unigenes were assembled. Of these, 15,458 unigenes showed a significant similarity (E-value <10(-5)) to known proteins in the NCBI non-redundant protein database. Gene ontology (GO) and cluster of orthologous groups (COG) analyses were used to classify the functions of M. pulchricornis antennae genes. We identified 16 putative odorant-binding protein (OBP) genes, eight chemosensory protein (CSP) genes, 99 olfactory receptor (OR) genes, 19 ionotropic receptor (IR) genes and one sensory neuron membrane protein (SNMP) gene. BLASTx best hit results and phylogenetic analysis both indicated that these chemosensory genes were most closely related to those found in other hymenopteran species. Real-time quantitative PCR assays showed that 14 MpulOBP genes were antennae-specific. Of these, MpulOBP6, MpulOBP9, MpulOBP10, MpulOBP12, MpulOBP15 and MpulOBP16 were found to have greater expression in the antennae than in other body parts, while MpulOBP2 and MpulOBP3 were expressed predominately in the legs and abdomens, respectively. These results might provide a foundation for future studies of olfactory genes and chemoreception in M. pulchricornis. Copyright © 2017 Elsevier Inc. All rights reserved.

  10. Functional genomics screening utilizing mutant mouse embryonic stem cells identifies novel radiation-response genes.

    PubMed

    Loesch, Kimberly; Galaviz, Stacy; Hamoui, Zaher; Clanton, Ryan; Akabani, Gamal; Deveau, Michael; DeJesus, Michael; Ioerger, Thomas; Sacchettini, James C; Wallis, Deeann

    2015-01-01

    Elucidating the genetic determinants of radiation response is crucial to optimizing and individualizing radiotherapy for cancer patients. In order to identify genes that are involved in enhanced sensitivity or resistance to radiation, a library of stable mutant murine embryonic stem cells (ESCs), each with a defined mutation, was screened for cell viability and gene expression in response to radiation exposure. We focused on a cancer-relevant subset of over 500 mutant ESC lines. We identified 13 genes; 7 genes that have been previously implicated in radiation response and 6 other genes that have never been implicated in radiation response. After screening, proteomic analysis showed enrichment for genes involved in cellular component disassembly (e.g. Dstn and Pex14) and regulation of growth (e.g. Adnp2, Epc1, and Ing4). Overall, the best targets with the highest potential for sensitizing cancer cells to radiation were Dstn and Map2k6, and the best targets for enhancing resistance to radiation were Iqgap and Vcan. Hence, we provide compelling evidence that screening mutant ESCs is a powerful approach to identify genes that alter radiation response. Ultimately, this knowledge can be used to define genetic variants or therapeutic targets that will enhance clinical therapy.

  11. Candidate genes for the progression of malignant gliomas identified by microarray analysis.

    PubMed

    Bozinov, Oliver; Köhler, Sylvia; Samans, Birgit; Benes, Ludwig; Miller, Dorothea; Ritter, Markus; Sure, Ulrich; Bertalanffy, Helmut

    2008-01-01

    Malignant astrocytomas of World Health Organization (WHO) grade III or IV have a reduced median survival time, and possible pathways have been described for the progression of anaplastic astrocytomas and glioblastomas, but the molecular basis of malignant astrocytoma progression is still poorly understood. Microarray analysis provides the chance to accelerate studies by comparing the expression of thousands of genes in these tumours and consequently identifying target genes. We compared the transcriptional profile of 4,608 genes in tumours of 15 patients, including 6 anaplastic astrocytomas (WHO grade III) and 9 glioblastomas (WHO grade IV), using microarray analysis. The microarray data were corroborated by real-time reverse transcription-polymerase chain reaction analysis of two selected genes. We identified 166 gene alterations with a fold change of 2 or higher whose mRNA levels differed (absolute value of the t statistic of 1.96) between the two malignant glioma groups. Further analyses confirmed the same transcription directions for Olig2 and IL-13Ralpha2 in anaplastic astrocytomas as compared to glioblastomas. Microarray analyses with a closed binary question reveal numerous interesting candidate genes, which need further histochemical testing after selection for confirmation. IL-13Ralpha2 and Olig2 have been identified and confirmed to be interesting candidate genes whose differential expression likely plays a role in malignant progression of astrocytomas.
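
The selection rule in the abstract above (fold change of at least 2 combined with an absolute t statistic of at least 1.96) can be sketched as a simple filter. The abstract does not say which t statistic or fold-change definition was used, so the Welch statistic, the ratio of group means on a linear scale, the function names, and the toy expression values below are all assumptions for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t statistic (unequal variances assumed)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def fold_change(a, b):
    """Ratio of group means, always >= 1; assumes strictly positive values."""
    ma, mb = mean(a), mean(b)
    return max(ma, mb) / min(ma, mb)


def candidates(expr_a, expr_b, min_fold=2.0, min_abs_t=1.96):
    """Genes passing both the fold-change and the t-statistic threshold."""
    hits = []
    for gene in expr_a:
        a, b = expr_a[gene], expr_b[gene]
        if fold_change(a, b) >= min_fold and abs(welch_t(a, b)) >= min_abs_t:
            hits.append(gene)
    return hits


if __name__ == "__main__":
    # Hypothetical expression values for two tumour groups.
    grade3 = {"geneA": [10, 12, 11], "geneB": [10, 11, 12], "geneC": [1, 30, 2]}
    grade4 = {"geneA": [30, 33, 31], "geneB": [11, 10, 12], "geneC": [40, 5, 60]}
    print(candidates(grade3, grade4))
```

In the toy data, geneC has a large fold change but noisy replicates, so it fails the t-statistic threshold; requiring both criteria is what removes such genes.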

  12. Novel linkage disequilibrium clustering algorithm identifies new lupus genes on meta-analysis of GWAS datasets.

    PubMed

    Saeed, Mohammad

    2017-05-01

    Systemic lupus erythematosus (SLE) is a complex disorder. Genetic association studies of complex disorders suffer from the following three major issues: phenotypic heterogeneity, false positive (type I error), and false negative (type II error) results. Hence, genes with low to moderate effects are missed in standard analyses, especially after statistical corrections. OASIS is a novel linkage disequilibrium clustering algorithm that can potentially address false positives and negatives in genome-wide association studies (GWAS) of complex disorders such as SLE. OASIS was applied to two SLE dbGAP GWAS datasets (6077 subjects; ∼0.75 million single-nucleotide polymorphisms). OASIS identified three known SLE genes viz. IFIH1, TNIP1, and CD44, not previously reported using these GWAS datasets. In addition, 22 novel loci for SLE were identified and the 5 SLE genes previously reported using these datasets were verified. OASIS methodology was validated using single-variant replication and gene-based analysis with GATES. This led to the verification of 60% of OASIS loci. New SLE genes that OASIS identified and were further verified include TNFAIP6, DNAJB3, TTF1, GRIN2B, MON2, LATS2, SNX6, RBFOX1, NCOA3, and CHAF1B. This study presents the OASIS algorithm, software, and the meta-analyses of two publicly available SLE GWAS datasets along with the novel SLE genes. Hence, OASIS is a novel linkage disequilibrium clustering method that can be universally applied to existing GWAS datasets for the identification of new genes.

  13. Using machine learning algorithms to identify genes essential for cell survival.

    PubMed

    Philips, Santosh; Wu, Heng-Yi; Li, Lang

    2017-10-03

    With the explosion of data comes a proportional opportunity to identify novel knowledge with the potential for application in targeted therapies. In spite of these huge amounts of data, solutions for treating complex diseases remain elusive. One reason is that these diseases are driven by a network of genes that need to be targeted in order to understand and treat them effectively. Part of the solution lies in mining and integrating information from various disciplines. Here we propose a machine learning method for mining publicly available literature on RNA interference with the goal of identifying genes essential for cell survival. A total of 32,164 RNA interference abstracts were identified from 10.5 million PubMed abstracts (2001-2015). These abstracts spanned 1467 cancer cell lines and 4373 genes, representing a total of 25,891 cell-gene associations. Among the 1467 cell lines, 88% had between 1 and 25 genes studied. Among the 4373 genes, 96% were studied in between 1 and 25 different cell lines. Identifying genes that are crucial for cell survival can provide a critical piece of information, especially in treating complex diseases such as cancer. The efficacy of a therapeutic intervention is multifactorial in nature, and in many cases the therapeutic disruption could come from an unsuspected source. Machine learning algorithms help to narrow down the search and provide information about essential genes in different cancer types. They also provide the building blocks to generate a network of interconnected genes and processes. The information thus gained can be used to generate hypotheses that can be experimentally validated to improve our understanding of what triggers and maintains the growth of cancerous cells.

  14. Statistical completion of a partially identified graph with applications for the estimation of gene regulatory networks

    PubMed Central

    Yu, Donghyeon; Son, Won; Lim, Johan; Xiao, Guanghua

    2015-01-01

    We study the estimation of a Gaussian graphical model whose dependent structures are partially identified. In a Gaussian graphical model, an off-diagonal zero entry in the concentration matrix (the inverse covariance matrix) implies the conditional independence of two corresponding variables, given all other variables. A number of methods have been proposed to estimate a sparse large-scale Gaussian graphical model or, equivalently, a sparse large-scale concentration matrix. In practice, the graph structure to be estimated is often partially identified by other sources or a pre-screening. In this paper, we propose a simple modification of existing methods to take into account this information in the estimation. We show that the partially identified dependent structure reduces the error in estimating the dependent structure. We apply the proposed method to estimating the gene regulatory network from lung cancer data, where protein–protein interactions are partially identified from the human protein reference database. The application shows that proposed method identified many important cancer genes as hub genes in the constructed lung cancer network. In addition, we validated the prognostic importance of a newly identified cancer gene, PTPN13, in four independent lung cancer datasets. The results indicate that the proposed method could facilitate studying underlying lung cancer mechanisms and identifying reliable biomarkers for lung cancer prognosis. PMID:25837438
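
The statement above that an off-diagonal zero in the concentration matrix implies conditional independence can be checked on a small example. This sketch is only an illustration of that property, not the estimation method proposed in the paper: the three-variable chain, the covariance values, and the helper `invert` are assumptions chosen so the arithmetic is exact.

```python
from fractions import Fraction


def invert(matrix):
    """Gauss-Jordan inversion with exact rational arithmetic.

    Raises StopIteration if the matrix is singular (no non-zero pivot).
    """
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Covariance of a chain X1 - X2 - X3 with correlation 1/2 between neighbours:
# X1 and X3 are marginally correlated (1/4) but only through X2.
half, quarter = Fraction(1, 2), Fraction(1, 4)
sigma = [[1, half, quarter],
         [half, 1, half],
         [quarter, half, 1]]

precision = invert(sigma)

if __name__ == "__main__":
    for row in precision:
        print([str(x) for x in row])
```

The covariance entry for (X1, X3) is non-zero, yet the corresponding concentration entry is exactly zero, which is the graph with no edge between X1 and X3.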

  15. A novel approach to identify driver genes involved in androgen-independent prostate cancer

    PubMed Central

    2014-01-01

    Background Insertional mutagenesis screens have been used with great success to identify oncogenes and tumor suppressor genes. Typically, these screens use gammaretroviruses (γRV) or transposons as insertional mutagens. However, insertional mutations from replication-competent γRVs or transposons that occur later during oncogenesis can produce passenger mutations that do not drive cancer progression. Here, we utilized a replication-incompetent lentiviral vector (LV) to perform an insertional mutagenesis screen to identify genes in the progression to androgen-independent prostate cancer (AIPC). Methods Prostate cancer cells were mutagenized with a LV to enrich for clones with a selective advantage in an androgen-deficient environment provided by a dysregulated gene(s) near the vector integration site. We performed our screen using an in vitro AIPC model and also an in vivo xenotransplant model for AIPC. Our approach identified proviral integration sites utilizing a shuttle vector that allows for rapid rescue of plasmids in E. coli that contain LV long terminal repeat (LTR)-chromosome junctions. This shuttle vector approach does not require PCR amplification and has several advantages over PCR-based techniques. Results Proviral integrations were enriched near prostate cancer susceptibility loci in cells grown in androgen-deficient medium (p < 0.001), and five candidate genes that influence AIPC were identified; ATPAF1, GCOM1, MEX3D, PTRF, and TRPM4. Additionally, we showed that RNAi knockdown of ATPAF1 significantly reduces growth (p < 0.05) in androgen-deficient conditions. Conclusions Our approach has proven effective for use in PCa, identifying a known prostate cancer gene, PTRF, and also several genes not previously associated with prostate cancer. The replication-incompetent shuttle vector approach has broad potential applications for cancer gene discovery, and for interrogating diverse biological and disease processes. PMID:24885513

  16. Ectopic Activation of Germline and Placental Genes Identifies Aggressive Metastasis-Prone Lung Cancers

    PubMed Central

    Rousseaux, Sophie; Debernardi, Alexandra; Jacquiau, Baptiste; Vitte, Anne-Laure; Vesin, Aurélien; Nagy-Mignotte, Hélène; Moro-Sibilot, Denis; Brichon, Pierre-Yves; Lantuejoul, Sylvie; Hainaut, Pierre; Laffaire, Julien; de Reyniès, Aurélien; Beer, David G.; Timsit, Jean-François; Brambilla, Christian; Brambilla, Elisabeth; Khochbin, Saadi

    2016-01-01

    Activation of normally silent tissue-specific genes and the resulting cell “identity crisis” are the unexplored consequences of malignant epigenetic reprogramming. We designed a strategy for investigating this reprogramming, which consisted of identifying a large number of tissue-restricted genes that are epigenetically silenced in normal somatic cells and then detecting their expression in cancer. This approach led to the demonstration that large-scale “off-context” gene activations systematically occur in a variety of cancer types. In our series of 293 lung tumors, we identified an ectopic gene expression signature associated with a subset of highly aggressive tumors, which predicted poor prognosis independently of the TNM (tumor size, node positivity, and metastasis) stage or histological subtype. The ability to isolate these tumors allowed us to reveal their common molecular features characterized by the acquisition of embryonic stem cell/germ cell gene expression profiles and the down-regulation of immune response genes. The methodical recognition of ectopic gene activations in cancer cells could serve as a basis for gene signature–guided tumor stratification, as well as for the discovery of oncogenic mechanisms, and expand the understanding of the biology of very aggressive tumors. PMID:23698379

  17. Integrating Epigenomic Elements and GWASs Identifies BDNF Gene Affecting Bone Mineral Density and Osteoporotic Fracture Risk

    PubMed Central

    Guo, Yan; Dong, Shan-Shan; Chen, Xiao-Feng; Jing, Ying-Aisha; Yang, Man; Yan, Han; Shen, Hui; Chen, Xiang-Ding; Tan, Li-Jun; Tian, Qing; Deng, Hong-Wen; Yang, Tie-Lin

    2016-01-01

    To identify susceptibility genes for osteoporosis, we conducted an integrative analysis that combined epigenomic elements and previous genome-wide association studies (GWASs) data, followed by validation at population and functional levels, which could identify common regulatory elements and predict new susceptibility genes that are biologically meaningful to osteoporosis. By this approach, we found a set of distinct epigenomic elements significantly enriched or depleted in the promoters of osteoporosis-associated genes, including 4 transcription factor binding sites, 27 histone marks, and 21 chromatin state segmentation types. Using these epigenomic marks, we performed reverse prediction analysis to prioritize the discovery of new candidate genes. Functional enrichment analysis of all the prioritized genes revealed several key osteoporosis-related pathways, including Wnt signaling. Genes with high priority were further subjected to validation using available GWASs datasets. Three genes were significantly associated with spine bone mineral density, including BDNF, PDE4D, and SATB2, all of which are closely related to bone metabolism. The most significant gene, BDNF, was also associated with osteoporotic fractures. RNA interference revealed that BDNF knockdown can suppress osteoblast differentiation. Our results demonstrated that epigenomic data could be used to indicate common epigenomic marks to discover additional loci with biological functions for osteoporosis. PMID:27465306

  18. Integrating Epigenomic Elements and GWASs Identifies BDNF Gene Affecting Bone Mineral Density and Osteoporotic Fracture Risk.

    PubMed

    Guo, Yan; Dong, Shan-Shan; Chen, Xiao-Feng; Jing, Ying-Aisha; Yang, Man; Yan, Han; Shen, Hui; Chen, Xiang-Ding; Tan, Li-Jun; Tian, Qing; Deng, Hong-Wen; Yang, Tie-Lin

    2016-07-28

    To identify susceptibility genes for osteoporosis, we conducted an integrative analysis that combined epigenomic elements and previous genome-wide association studies (GWASs) data, followed by validation at population and functional levels, which could identify common regulatory elements and predict new susceptibility genes that are biologically meaningful to osteoporosis. By this approach, we found a set of distinct epigenomic elements significantly enriched or depleted in the promoters of osteoporosis-associated genes, including 4 transcription factor binding sites, 27 histone marks, and 21 chromatin state segmentation types. Using these epigenomic marks, we performed reverse prediction analysis to prioritize the discovery of new candidate genes. Functional enrichment analysis of all the prioritized genes revealed several key osteoporosis-related pathways, including Wnt signaling. Genes with high priority were further subjected to validation using available GWASs datasets. Three genes were significantly associated with spine bone mineral density, including BDNF, PDE4D, and SATB2, all of which are closely related to bone metabolism. The most significant gene, BDNF, was also associated with osteoporotic fractures. RNA interference revealed that BDNF knockdown can suppress osteoblast differentiation. Our results demonstrated that epigenomic data could be used to indicate common epigenomic marks to discover additional loci with biological functions for osteoporosis.

  19. Microarray expression profiling identifies genes with altered expression in HDL-deficient mice

    SciTech Connect

    Callow, Matthew J.; Dudoit, Sandrine; Gong, Elaine L.; Speed, Terence P.; Rubin, Edward M.

    2000-05-05

    Based on the assumption that severe alterations in the expression of genes known to be involved in HDL metabolism may affect the expression of other genes, we screened an array of over 5000 mouse expressed sequence tags (ESTs) for altered gene expression in the livers of two lines of mice with dramatic decreases in HDL plasma concentrations. Labeled cDNA from livers of apolipoprotein AI (apo AI) knockout mice, Scavenger Receptor BI (SR-BI) transgenic mice and control mice were co-hybridized to microarrays. Two-sample t-statistics were used to identify genes with altered expression levels in the knockout or transgenic mice compared with the control mice. In the SR-BI group we found 9 array elements representing at least 5 genes to be significantly altered on the basis of an adjusted p value of less than 0.05. In the apo AI knockout group 8 array elements representing 4 genes were altered compared with the control group (p < 0.05). Several of the genes identified in the SR-BI transgenic mice suggest altered sterol metabolism and oxidative processes. These studies illustrate the use of multiple-testing methods for the identification of genes with altered expression in replicated microarray experiments of apo AI knockout and SR-BI transgenic mice.
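
The abstract above reports significance on the basis of adjusted p values but does not name the adjustment procedure. As a stand-in, the sketch below uses the Holm step-down adjustment, a standard family-wise error rate method; it is not claimed to be the procedure used in the study, and the function name and input p values are hypothetical.

```python
def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1.
        value = min(1.0, (m - rank) * pvalues[idx])
        # Adjusted values must not decrease as raw p-values increase.
        running_max = max(running_max, value)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical per-element p-values
    adj = holm_adjust(raw)
    significant = [i for i, p in enumerate(adj) if p < 0.05]
    print(adj, significant)
```

With thousands of array elements tested at once, an unadjusted threshold of 0.05 would be expected to flag many elements by chance alone, which is why the cut-off is applied to adjusted values.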

  20. Toward an Efficient Method of Identifying Core Genes for Evolutionary and Functional Microbial Phylogenies

    PubMed Central

    Segata, Nicola; Huttenhower, Curtis

    2011-01-01

    Microbial community metagenomes and individual microbial genomes are becoming increasingly accessible by means of high-throughput sequencing. Assessing organismal membership within a community is typically performed using one or a few taxonomic marker genes such as the 16S rDNA, and these same genes are also employed to reconstruct molecular phylogenies. There is thus a growing need to bioinformatically catalog strongly conserved core genes that can serve as effective taxonomic markers, to assess the agreement among phylogenies generated from different core genes, and to characterize the biological functions enriched within core genes and thus conserved throughout large microbial clades. We present a method to recursively identify core genes (i.e. genes ubiquitous within a microbial clade) in high-throughput from a large number of complete input genomes. We analyzed over 1,100 genomes to produce core gene sets spanning 2,861 bacterial and archaeal clades, ranging in size from one to >2,000 genes in inverse correlation with the α-diversity (total phylogenetic branch length) spanned by each clade. These cores are enriched as expected for housekeeping functions including translation, transcription, and replication, in addition to significant representations of regulatory, chaperone, and conserved uncharacterized proteins. In agreement with previous manually curated core gene sets, phylogenies constructed from one or more of these core genes agree with those built using 16S rDNA sequence similarity, suggesting that systematic core gene selection can be used to optimize both comparative genomics and determination of microbial community structure. Finally, we examine functional phylogenies constructed by clustering genomes by the presence or absence of orthologous gene families and show that they provide an informative complement to standard sequence-based molecular phylogenies. PMID:21931822
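
The recursive definition of core genes in the abstract above (genes ubiquitous within a clade) can be sketched as a set intersection computed bottom-up over a taxonomy. This is a simplified illustration rather than the published method: strict presence in every genome, the tuple-based tree representation, the function name, and the toy genomes are all assumptions.

```python
def core_genes(clade, genomes, cores=None):
    """Return (core set of `clade`, dict of core sets for every sub-clade).

    `clade` is a (name, children) tuple; a leaf has an empty children list
    and its name must be a key of `genomes`.
    """
    if cores is None:
        cores = {}
    name, children = clade
    if not children:
        core = set(genomes[name])  # leaf: a single genome
    else:
        child_cores = [core_genes(child, genomes, cores)[0] for child in children]
        core = set.intersection(*child_cores)  # genes present in every child
    cores[name] = core
    return core, cores


if __name__ == "__main__":
    # Toy gene content; gene labels are placeholders for ortholog families.
    genomes = {
        "g1": {"rpoB", "recA", "gyrB", "nifH"},
        "g2": {"rpoB", "recA", "gyrB"},
        "g3": {"rpoB", "recA"},
    }
    tree = ("root", [("cladeA", [("g1", []), ("g2", [])]), ("g3", [])])
    _, cores = core_genes(tree, genomes)
    for clade_name in ("cladeA", "root"):
        print(clade_name, sorted(cores[clade_name]))
```

The core shrinks as the clade widens (three genes for cladeA, two for the root), consistent with the inverse relationship between core size and clade diversity reported above.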

  1. De Novo Transcriptome Sequencing of Oryza officinalis Wall ex Watt to Identify Disease-Resistance Genes.

    PubMed

    He, Bin; Gu, Yinghong; Tao, Xiang; Cheng, Xiaojie; Wei, Changhe; Fu, Jian; Cheng, Zaiquan; Zhang, Yizheng

    2015-12-10

    Oryza officinalis Wall ex Watt is one of the most important wild relatives of cultivated rice and exhibits high resistance to many diseases. It has been used as a source of genes for introgression into cultivated rice. However, there are limited genomic resources and little genetic information publicly reported for this species. To better understand the pathways and factors involved in disease resistance and to accelerate the process of rice breeding, we carried out de novo transcriptome sequencing of O. officinalis. In this research, 137,229 contigs were obtained ranging from 200 to 19,214 bp with an N50 of 2331 bp through de novo assembly of leaves, stems and roots in O. officinalis using an Illumina HiSeq 2000 platform. Based on sequence similarity searches against a non-redundant protein database, a total of 88,249 contigs were annotated with gene descriptions and 75,589 transcripts were further assigned to GO terms. Candidate genes for plant-pathogen interaction and plant hormone regulation pathways involved in disease resistance were identified. Further analyses of gene expression profiles showed that the majority of genes related to disease resistance were expressed in all three tissues. In addition, there are two kinds of rice bacterial blight-resistance genes in O. officinalis, including two Xa1 genes and three Xa26 genes. Both Xa1 genes showed the highest expression level in stem, whereas one of the Xa26 genes was expressed dominantly in leaf and the other two Xa26 genes displayed low expression levels in all three tissues. This transcriptomic database provides an opportunity for identifying the genes involved in disease resistance and will provide a basis for studying functional genomics of O. officinalis and genetic improvement of cultivated rice in the future.
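
    The N50 statistic quoted in this abstract is the contig length at which contigs of that length or longer hold at least half of the assembled bases. A minimal sketch of the calculation follows; the function name `n50` and the example lengths are illustrative and not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    total assembly size.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50() requires at least one contig length")


# Hypothetical example: four contigs totalling 1000 bp.
example_n50 = n50([100, 200, 300, 400])
```

    For the example lengths, the two longest contigs (400 and 300 bp) together cover 700 of 1000 bp, so the N50 is 300.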

  2. Transcriptional Profile Analysis of RPGRORF15 Frameshift Mutation Identifies Novel Genes Associated with Retinal Degeneration

    PubMed Central

    Genini, Sem; Zangerl, Barbara; Slavik, Julianna; Acland, Gregory M.; Beltran, William A.

    2010-01-01

    Purpose. To identify genes and molecular mechanisms associated with photoreceptor degeneration in a canine model of XLRP caused by an RPGR exon ORF15 microdeletion. Methods. Expression profiles of mutant and normal retinas were compared by using canine retinal custom cDNA microarrays. qRT-PCR, Western blot analysis, and immunohistochemistry (IHC) were applied to selected genes, to confirm and expand the microarray results. Results. At 7 and 16 weeks, respectively, 56 and 18 transcripts were downregulated in the mutant retinas, but none were differentially expressed (DE) at both ages, suggesting the involvement of temporally distinct pathways. Downregulated genes included the known retina-relevant genes PAX6, CHML, and RDH11 at 7 weeks and CRX and SAG at 16 weeks. Genes directly or indirectly active in apoptotic processes were altered at 7 weeks (CAMK2G, NTRK2, PRKCB, RALA, RBBP6, RNF41, SMYD3, SPP1, and TUBB2C) and 16 weeks (SLC25A5 and NKAP). Furthermore, the DE genes at 7 weeks (ELOVL6, GLOD4, NDUFS4, and REEP1) and 16 weeks (SLC25A5 and TARS2) are related to mitochondrial functions. qRT-PCR of 18 genes confirmed the microarray results and showed DE of additional genes not on the array. Only GFAP was DE at 3 weeks of age. Western blot and IHC analyses also confirmed the high reliability of the transcriptomic data. Conclusions. Several DE genes were identified in mutant retinas. At 7 weeks, a combination of nonclassic anti- and proapoptosis genes appear to be involved in photoreceptor degeneration, whereas at both 7 and 16 weeks, the expression of mitochondria-related genes indicates that they may play a relevant role in the disease process. PMID:20574030

  3. Analysis of global gene expression profiles to identify differentially expressed genes critical for embryo development in Brassica rapa.

    PubMed

    Zhang, Yu; Peng, Lifang; Wu, Ya; Shen, Yanyue; Wu, Xiaoming; Wang, Jianbo

    2014-11-01

    Embryo development represents a crucial developmental period in the life cycle of flowering plants. To gain insights into the genetic programs that control embryo development in Brassica rapa L., RNA sequencing technology was used to perform transcriptome profiling analysis of B. rapa developing embryos. The results generated 42,906,229 sequence reads aligned with 32,941 genes. In total, 27,760, 28,871, 28,384, and 25,653 genes were identified from embryos at globular, heart, early cotyledon, and mature developmental stages, respectively, and analysis between stages revealed a subset of stage-specific genes. We next investigated 9,884 differentially expressed genes with more than fivefold changes in expression and false discovery rate ≤ 0.001 from three adjacent-stage comparisons; 1,514, 3,831, and 6,633 genes were detected between globular and heart stage embryo libraries, heart stage and early cotyledon stage, and early cotyledon and mature stage, respectively. Large numbers of genes related to cellular process, metabolism process, response to stimulus, and biological process were expressed during the early and middle stages of embryo development. Fatty acid biosynthesis, biosynthesis of secondary metabolites, and photosynthesis-related genes were expressed predominantly in embryos at the middle stage. Genes for lipid metabolism and storage proteins were highly expressed in the middle and late stages of embryo development. We also identified 911 transcription factor genes that show differential expression across embryo developmental stages. These results increase our understanding of the complex molecular and cellular events during embryo development in B. rapa and provide a foundation for future studies on other oilseed crops.
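
    The selection rule stated in this abstract (more than fivefold change and false discovery rate ≤ 0.001) can be sketched as a simple filter. The abstract does not say how the false discovery rate was computed, so the Benjamini-Hochberg adjustment used here is an assumption, as are the record layout (`gene`, `a`, `b`, `p`) and the function names.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(records, min_fold=5.0, max_fdr=0.001):
    """Keep genes whose fold change exceeds min_fold at FDR <= max_fdr.

    Each record is a dict with hypothetical keys: "gene", expression
    values "a" and "b" for the two stages, and a raw p-value "p".
    """
    fdrs = benjamini_hochberg([r["p"] for r in records])
    selected = []
    for record, fdr in zip(records, fdrs):
        hi = max(record["a"], record["b"])
        lo = min(record["a"], record["b"])
        if lo > 0:
            fold = hi / lo
        else:
            # Expression in one stage only counts as an unbounded change.
            fold = float("inf") if hi > 0 else 1.0
        if fold > min_fold and fdr <= max_fdr:
            selected.append(record["gene"])
    return selected


# Hypothetical toy records for one adjacent-stage comparison.
records = [
    {"gene": "geneA", "a": 10.0, "b": 60.0, "p": 1e-6},
    {"gene": "geneB", "a": 10.0, "b": 20.0, "p": 1e-6},
    {"gene": "geneC", "a": 5.0, "b": 50.0, "p": 0.2},
]
degs = select_degs(records)
```

    Only `geneA` passes both thresholds here: `geneB` changes twofold, and `geneC` changes tenfold but is not significant after adjustment.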

  4. Transcriptome-based gene expression profiling identifies differentially expressed genes critical for salt stress response in radish (Raphanus sativus L.).

    PubMed

    Sun, Xiaochuan; Xu, Liang; Wang, Yan; Luo, Xiaobo; Zhu, Xianwen; Kinuthia, Karanja Benard; Nie, Shanshan; Feng, Haiyang; Li, Chao; Liu, Liwang

    2016-02-01

    Transcriptome-based gene expression analysis identifies many critical salt-responsive genes in radish and facilitates further dissecting the molecular mechanism underlying salt stress response. Salt stress severely impacts plant growth and development. Radish, a moderately salt-sensitive vegetable crop, has been studied for decades with respect to its physiological and biochemical performance under salt stress. However, no systematic study on isolation and identification of genes involved in salt stress response has been performed in radish, and the molecular mechanism governing this process is still unclear. Here, the RNA-Seq technique was applied to analyze the transcriptomic changes in radish roots treated with salt (200 mM NaCl) for 48 h in comparison with those cultured in normal condition. A total of 8709 differentially expressed genes (DEGs) including 3931 up- and 4778 down-regulated genes were identified. Functional annotation analysis indicated that many genes could be involved in several aspects of salt stress response including stress sensing and signal transduction, osmoregulation, ion homeostasis and ROS scavenging. The association analysis of salt-responsive genes and miRNAs showed that 36 miRNA-mRNA pairs were negatively correlated in expression trends. Reverse-transcription quantitative PCR (RT-qPCR) analysis revealed that the expression profiles of DEGs were in line with results from the RNA-Seq analysis. Furthermore, a putative model of DEGs and miRNA-mediated gene regulation was proposed to elucidate how radish sensed and responded to salt stress. This study represents the first comprehensive transcriptome-based gene expression profiling under salt stress in radish. The outcomes of this study could facilitate further dissecting the molecular mechanism underlying salt stress response and provide a valuable platform for further genetic improvement of salt tolerance in radish breeding programs.

  5. EPIG-Seq: extracting patterns and identifying co-expressed genes from RNA-Seq data.

    PubMed

    Li, Jianying; Bushel, Pierre R

    2016-03-22

    RNA sequencing (RNA-Seq) measures genome-wide gene expression. RNA-Seq data is count-based rendering normal distribution models for analysis inappropriate. Normalization of RNA-Seq data to transform the data has limitations which can adversely impact the analysis. Furthermore, there are a few count-based methods for analysis of RNA-Seq data but they are essentially for pairwise analysis of treatment groups or multiclasses but not pattern-based to identify co-expressed genes. We adapted our extracting patterns and identifying genes methodology for RNA-Seq (EPIG-Seq) count data. The software uses count-based correlation to measure similarity between genes, quasi-Poisson modelling to estimate dispersion in the data and a location parameter to indicate magnitude of differential expression. EPIG-Seq is different than any other software currently available for pattern analysis of RNA-Seq data in that EPIG-Seq 1) uses count level data and supports cases of inflated zeros, 2) identifies statistically significant clusters of genes that are co-expressed across experimental conditions, 3) takes into account dispersion in the replicate data and 4) provides reliable results even with small sample sizes. EPIG-Seq operates in two steps: 1) extract the pattern profiles from data as seeds for clustering co-expressed genes and 2) cluster the genes to the pattern seeds and compute statistical significance of the pattern of co-expressed genes. EPIG-Seq provides a table of the genes with bootstrapped p-values and profile plots of the patterns of co-expressed genes. In addition, EPIG-Seq provides a heat map and principal component dimension reduction plot of the clustered genes as visual aids. We demonstrate the utility of EPIG-Seq through the analysis of toxicogenomics and cancer data sets to identify biologically relevant co-expressed genes. EPIG-Seq is available at: sourceforge.net/projects/epig-seq. EPIG-Seq is unlike any other software currently available for pattern analysis of

  6. Transposon mutagenesis identifies genes that cooperate with mutant Pten in breast cancer progression.

    PubMed

    Rangel, Roberto; Lee, Song-Choon; Hon-Kim Ban, Kenneth; Guzman-Rojas, Liliana; Mann, Michael B; Newberg, Justin Y; Kodama, Takahiro; McNoe, Leslie A; Selvanesan, Luxmanan; Ward, Jerrold M; Rust, Alistair G; Chin, Kuan-Yew; Black, Michael A; Jenkins, Nancy A; Copeland, Neal G

    2016-11-29

    Triple-negative breast cancer (TNBC) has the worst prognosis of any breast cancer subtype. To better understand the genetic forces driving TNBC, we performed a transposon mutagenesis screen in phosphatase and tensin homolog (Pten) mutant mice and identified 12 candidate trunk drivers and a much larger number of progression genes. Validation studies identified eight TNBC tumor suppressor genes, including the GATA-like transcriptional repressor TRPS1. Down-regulation of TRPS1 in TNBC cells promoted epithelial-to-mesenchymal transition (EMT) by deregulating multiple EMT pathway genes, in addition to increasing the expression of SERPINE1 and SERPINB2 and the subsequent migration, invasion, and metastasis of tumor cells. Transposon mutagenesis has thus provided a better understanding of the genetic forces driving TNBC and discovered genes with potential clinical importance in TNBC.

  7. Harnessing Single Cell Sorting to Identify Cell Division Genes and Regulators in Bacteria

    PubMed Central

    Burke, Catherine; Liu, Michael; Britton, Warwick; Triccas, James A.; Thomas, Torsten; Smith, Adrian L.; Allen, Steven; Salomon, Robert; Harry, Elizabeth

    2013-01-01

    Cell division is an essential cellular process that requires an array of known and unknown proteins for its spatial and temporal regulation. Here we develop a novel, high-throughput screening method for the identification of bacterial cell division genes and regulators. The method combines the over-expression of a shotgun genomic expression library to perturb the cell division process with high-throughput flow cytometry sorting to screen many thousands of clones. Using this approach, we recovered clones with a filamentous morphology for the model bacterium, Escherichia coli. Genetic analysis revealed that our screen identified both known cell division genes, and genes that have not previously been identified to be involved in cell division. This novel screening strategy is applicable to a wide range of organisms, including pathogenic bacteria, where cell division genes and regulators are attractive drug targets for antibiotic development. PMID:23565292

  8. Transposon mutagenesis identifies genes that cooperate with mutant Pten in breast cancer progression

    PubMed Central

    Rangel, Roberto; Lee, Song-Choon; Hon-Kim Ban, Kenneth; Guzman-Rojas, Liliana; Mann, Michael B.; Newberg, Justin Y.; McNoe, Leslie A.; Selvanesan, Luxmanan; Ward, Jerrold M.; Rust, Alistair G.; Chin, Kuan-Yew; Black, Michael A.; Jenkins, Nancy A.; Copeland, Neal G.

    2016-01-01

    Triple-negative breast cancer (TNBC) has the worst prognosis of any breast cancer subtype. To better understand the genetic forces driving TNBC, we performed a transposon mutagenesis screen in phosphatase and tensin homolog (Pten) mutant mice and identified 12 candidate trunk drivers and a much larger number of progression genes. Validation studies identified eight TNBC tumor suppressor genes, including the GATA-like transcriptional repressor TRPS1. Down-regulation of TRPS1 in TNBC cells promoted epithelial-to-mesenchymal transition (EMT) by deregulating multiple EMT pathway genes, in addition to increasing the expression of SERPINE1 and SERPINB2 and the subsequent migration, invasion, and metastasis of tumor cells. Transposon mutagenesis has thus provided a better understanding of the genetic forces driving TNBC and discovered genes with potential clinical importance in TNBC. PMID:27849608

  9. A loss of function screen identifies nine new radiation susceptibility genes

    SciTech Connect

    Sudo, Hitomi; Tsuji, Atsushi B.; Sugyo, Aya; Imai, Takashi; Saga, Tsuneo; Harada, Yoshi-nobu

    2007-12-21

    Genomic instability is considered a hallmark of carcinogenesis, and dysfunction of DNA repair and cell cycle regulation in response to DNA damage caused by ionizing radiation are thought to be important factors in the early stages of genomic instability. We performed cell-based functional screening using an RNA interference library targeting 200 genes in human cells. We identified three known and nine new radiation susceptibility genes, eight of which are linked directly or potentially with cell cycle progression. Cell cycle analysis on four of the genes not previously linked to cell cycle progression demonstrated that one, ZDHHC8, was associated with the G2/M checkpoint in response to DNA damage. Further study of the 12 radiation susceptibility genes identified in this screen may help to elucidate the molecular mechanisms of cell cycle progression, DNA repair, cell death, cell growth and genomic instability, and to develop new radiation sensitizing agents for radiotherapy.

  10. Harnessing single cell sorting to identify cell division genes and regulators in bacteria.

    PubMed

    Burke, Catherine; Liu, Michael; Britton, Warwick; Triccas, James A; Thomas, Torsten; Smith, Adrian L; Allen, Steven; Salomon, Robert; Harry, Elizabeth

    2013-01-01

    Cell division is an essential cellular process that requires an array of known and unknown proteins for its spatial and temporal regulation. Here we develop a novel, high-throughput screening method for the identification of bacterial cell division genes and regulators. The method combines the over-expression of a shotgun genomic expression library to perturb the cell division process with high-throughput flow cytometry sorting to screen many thousands of clones. Using this approach, we recovered clones with a filamentous morphology for the model bacterium, Escherichia coli. Genetic analysis revealed that our screen identified both known cell division genes, and genes that have not previously been identified to be involved in cell division. This novel screening strategy is applicable to a wide range of organisms, including pathogenic bacteria, where cell division genes and regulators are attractive drug targets for antibiotic development.

  11. Cluster Analysis of Tumor Suppressor Genes in Canine Leukocytes Identifies Activation State

    PubMed Central

    Daly, Julie-Anne; Mortlock, Sally-Anne; Taylor, Rosanne M.; Williamson, Peter

    2015-01-01

    Cells of the immune system undergo activation and subsequent proliferation in the normal course of an immune response. Infrequently, the molecular and cellular events that underlie the mechanisms of proliferation are dysregulated and may lead to oncogenesis, leading to tumor formation. The most common forms of immunological cancers are lymphomas, which in dogs account for 8%–20% of all cancers, affecting up to 1.2% of the dog population. Key genes involved in negatively regulating proliferation of lymphocytes include a group classified as tumor suppressor genes (TSGs). These genes are also known to be associated with progression of lymphoma in humans, mice, and dogs and are potential candidates for pathological grading and diagnosis. The aim of the present study was to analyze TSG profiles in stimulated leukocytes from dogs to identify genes that discriminate an activated phenotype. A total of 554 TSGs and three gene set collections were analyzed from microarray data. Cluster analysis of three subsets of genes discriminated between stimulated and unstimulated cells. These included 20 most upregulated and downregulated TSGs, TSG in hallmark gene sets significantly enriched in active cells, and a selection of candidate TSGs, p15 (CDKN2B), p18 (CDKN2C), p19 (CDKN1A), p21 (CDKN2A), p27 (CDKN1B), and p53 (TP53) in the third set. Analysis of two subsets suggested that these genes or a subset of these genes may be used as a specialized PCR set for additional analysis. PMID:27478369

  12. Systematically identify key genes in inflammatory and non-inflammatory breast cancer.

    PubMed

    Chai, Fan; Liang, Yan; Zhang, Fan; Wang, Minghao; Zhong, Ling; Jiang, Jun

    2016-01-10

    Although gene expression in breast tumor stroma, which plays a critical role in determining the inflammatory breast cancer (IBC) phenotype, has been shown to differ significantly between IBC and non-inflammatory breast cancer (non-IBC), more effort is needed to systematically investigate the gene expression profiles of tumor epithelium and stroma and to efficiently uncover the potential molecular networks and critical genes for IBC and non-IBC. Here, we comprehensively analyzed and compared the transcriptional profiles from IBC and non-IBC patients using hierarchical clustering, protein-protein interaction (PPI) network, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) database analyses, and identified PDGFRβ, SUMO1, COL1A1, FYN, CAV1, COL5A1 and MMP2 to be the key genes for breast cancer. Interestingly, PDGFRβ was found to be the hub gene in both IBC and non-IBC; SUMO1 and COL1A1 were respectively the key genes for IBC and non-IBC. These results indicate that these key genes might play important roles in IBC and non-IBC and provide some clues for future studies.

  13. Transcriptome Profiling Identifies Differentially Expressed Genes in Postnatal Developing Pituitary Gland of Miniature Pig

    PubMed Central

    Shan, Lei; Wu, Qi; Li, Yuli; Shang, Haitao; Guo, Kenan; Wu, Jiayan; Wei, Hong; Zhao, Jianguo; Yu, Jun; Li, Meng-Hua

    2014-01-01

    In recent years, Tibetan pig and Bama pig have been widely used as animal models for medical research. However, little genomic information is available for the two breeds, particularly regarding gene expression pattern at the whole-transcriptome level. In this study, we characterized the pituitary transcriptome profile along their postnatal developmental stages within and between the two breeds in order to illustrate the differential dynamics and functions of differentially expressed genes. We obtained a total of ∼300 million 80-bp paired-end reads and detected 15,715 previously annotated genes. Most of the genes (90.33%) were shared between the two breeds with the main functions in metabolic process. Four hormone genes (GH, PRL, LHB, and FSHB) were detected in all samples with extremely high levels of expression. Functional differences between the three developmental stages (infancy, puberty and adulthood) in each breed were dominantly presented by the gene expressions at the first stage. That is, Bama pig was over-represented in the genes involved in the cellular process, while Tibetan pig was over-represented in the genes represented by the reproductive process. The identified SNPs indicated that the divergence between the miniature pig breeds and the large pig (Duroc) was greater than that between the two miniature pig breeds. This study substantially expands our knowledge concerning the genes transcribed in the pig pituitary gland and provides an overview of pituitary transcriptome dynamics throughout the period of postnatal development. PMID:24282060

  14. Transcriptome profiling identifies differentially expressed genes in postnatal developing pituitary gland of miniature pig.

    PubMed

    Shan, Lei; Wu, Qi; Li, Yuli; Shang, Haitao; Guo, Kenan; Wu, Jiayan; Wei, Hong; Zhao, Jianguo; Yu, Jun; Li, Meng-Hua

    2014-01-01

    In recent years, Tibetan pig and Bama pig have been widely used as animal models for medical research. However, little genomic information is available for the two breeds, particularly regarding gene expression pattern at the whole-transcriptome level. In this study, we characterized the pituitary transcriptome profile along their postnatal developmental stages within and between the two breeds in order to illustrate the differential dynamics and functions of differentially expressed genes. We obtained a total of ∼300 million 80-bp paired-end reads and detected 15,715 previously annotated genes. Most of the genes (90.33%) were shared between the two breeds with the main functions in metabolic process. Four hormone genes (GH, PRL, LHB, and FSHB) were detected in all samples with extremely high levels of expression. Functional differences between the three developmental stages (infancy, puberty and adulthood) in each breed were dominantly presented by the gene expressions at the first stage. That is, Bama pig was over-represented in the genes involved in the cellular process, while Tibetan pig was over-represented in the genes represented by the reproductive process. The identified SNPs indicated that the divergence between the miniature pig breeds and the large pig (Duroc) was greater than that between the two miniature pig breeds. This study substantially expands our knowledge concerning the genes transcribed in the pig pituitary gland and provides an overview of pituitary transcriptome dynamics throughout the period of postnatal development.

  15. High-throughput screening of mouse gene knockouts identifies established and novel skeletal phenotypes

    PubMed Central

    Brommage, Robert; Liu, Jeff; Hansen, Gwenn M; Kirkpatrick, Laura L; Potter, David G; Sands, Arthur T; Zambrowicz, Brian; Powell, David R; Vogel, Peter

    2014-01-01

    Screening gene function in vivo is a powerful approach to discover novel drug targets. We present high-throughput screening (HTS) data for 3 762 distinct global gene knockout (KO) mouse lines with viable adult homozygous mice generated using either gene-trap or homologous recombination technologies. Bone mass was determined from DEXA scans of male and female mice at 14 weeks of age and by microCT analyses of bones from male mice at 16 weeks of age. Wild-type (WT) cagemates/littermates were examined for each gene KO. Lethality was observed in an additional 850 KO lines. Since primary HTS are susceptible to false positive findings, additional cohorts of mice from KO lines with intriguing HTS bone data were examined. Aging, ovariectomy, histomorphometry and bone strength studies were performed and possible non-skeletal phenotypes were explored. Together, these screens identified multiple genes affecting bone mass: 23 previously reported genes (Calcr, Cebpb, Crtap, Dcstamp, Dkk1, Duoxa2, Enpp1, Fgf23, Kiss1/Kiss1r, Kl (Klotho), Lrp5, Mstn, Neo1, Npr2, Ostm1, Postn, Sfrp4, Slc30a5, Slc39a13, Sost, Sumf1, Src, Wnt10b), five novel genes extensively characterized (Cldn18, Fam20c, Lrrk1, Sgpl1, Wnt16), five novel genes with preliminary characterization (Agpat2, Rassf5, Slc10a7, Slc26a7, Slc30a10) and three novel undisclosed genes coding for potential osteoporosis drug targets. PMID:26273529

  16. Network-Based Method for Identifying Co-Regeneration Genes in Bone, Dentin, Nerve and Vessel Tissues.

    PubMed

    Chen, Lei; Pan, Hongying; Zhang, Yu-Hang; Feng, Kaiyan; Kong, XiangYin; Huang, Tao; Cai, Yu-Dong

    2017-10-02

    Bone and dental diseases are serious public health problems. Most current clinical treatments for these diseases can produce side effects. Regeneration is a promising therapy for bone and dental diseases, yielding natural tissue recovery with few side effects. Because soft tissues inside the bone and dentin are densely populated with nerves and vessels, the study of bone and dentin regeneration should also consider the co-regeneration of nerves and vessels. In this study, a network-based method to identify co-regeneration genes for bone, dentin, nerve and vessel was constructed based on an extensive network of protein-protein interactions. Three procedures were applied in the network-based method. The first procedure, searching, sought the shortest paths connecting regeneration genes of one tissue type with regeneration genes of other tissues, thereby extracting possible co-regeneration genes. The second procedure, testing, employed a permutation test to evaluate whether possible genes were false discoveries; these genes were excluded by the testing procedure. The last procedure, screening, employed two rules, the betweenness ratio rule and interaction score rule, to select the most essential genes. A total of seventeen genes were inferred by the method, which were deemed to contribute to co-regeneration of at least two tissues. All these seventeen genes were extensively discussed to validate the utility of the method.
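
    The searching procedure described in this abstract (shortest paths between regeneration genes of different tissues, with the genes along those paths taken as candidates) can be sketched as follows. This is a simplified illustration, not the published method: it uses an unweighted breadth-first search and keeps only one shortest path per gene pair, and the graph, gene names, and function names are hypothetical. The testing and screening procedures are not covered.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Return one shortest path from source to target, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parent links to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


def candidate_genes(graph, seeds_a, seeds_b):
    """Collect inner nodes of shortest paths linking the two seed sets."""
    known = set(seeds_a) | set(seeds_b)
    candidates = set()
    for a in seeds_a:
        for b in seeds_b:
            path = shortest_path(graph, a, b)
            if path:
                candidates.update(g for g in path[1:-1] if g not in known)
    return candidates


# Hypothetical undirected interaction network as an adjacency mapping.
ppi = {
    "B1": ["X", "Y"],
    "X": ["B1", "N1"],
    "N1": ["X", "Z"],
    "Y": ["B1", "Z"],
    "Z": ["Y", "N1"],
}
found = candidate_genes(ppi, seeds_a=["B1"], seeds_b=["N1"])
```

    Here the bone seed `B1` and the nerve seed `N1` are linked most directly through `X`, so `X` is the only candidate; the longer route through `Y` and `Z` is ignored.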

  17. Spatial Clustering of de Novo Missense Mutations Identifies Candidate Neurodevelopmental Disorder-Associated Genes.

    PubMed

    Lelieveld, Stefan H; Wiel, Laurens; Venselaar, Hanka; Pfundt, Rolph; Vriend, Gerrit; Veltman, Joris A; Brunner, Han G; Vissers, Lisenka E L M; Gilissen, Christian

    2017-09-07

    Haploinsufficiency (HI) is the best characterized mechanism through which dominant mutations exert their effect and cause disease. Non-haploinsufficiency (NHI) mechanisms, such as gain-of-function and dominant-negative mechanisms, are often characterized by the spatial clustering of mutations, thereby affecting only particular regions or base pairs of a gene. Variants leading to haploinsufficiency might occasionally cluster as well, for example in critical domains, but such clustering is on the whole less pronounced with mutations often spread throughout the gene. Here we exploit this property and develop a method to specifically identify genes with significant spatial clustering patterns of de novo mutations in large cohorts. We apply our method to a dataset of 4,061 de novo missense mutations from published exome studies of trios with intellectual disability and developmental disorders (ID/DD) and successfully identify 15 genes with clustering mutations, including 12 genes for which mutations are known to cause neurodevelopmental disorders. For 11 out of these 12, NHI mutation mechanisms have been reported. Additionally, we identify three candidate ID/DD-associated genes of which two have an established role in neuronal processes. We further observe a higher intolerance to normal genetic variation of the identified genes compared to known genes for which mutations lead to HI. Finally, 3D modeling of these mutations on their protein structures shows that 81% of the observed mutations are unlikely to affect the overall structural integrity and that they therefore most likely act through a mechanism other than HI. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
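
    One generic way to test whether mutations cluster within a gene is a permutation test on the mean pairwise distance between mutated positions. The abstract does not specify the statistic the authors used, so the sketch below is an assumption throughout: the statistic, the uniform null model, the function names, and the example positions are all illustrative.

```python
import random
from itertools import combinations


def mean_pairwise_distance(positions):
    """Mean absolute distance over all pairs (needs >= 2 positions)."""
    pairs = list(combinations(positions, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def clustering_p_value(positions, gene_length, n_perm=2000, seed=0):
    """Empirical p-value that mutations are closer than expected.

    The null model places the same number of mutations uniformly at
    random along the gene (positions 1..gene_length).
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(positions)
    hits = 0
    for _ in range(n_perm):
        simulated = [rng.randint(1, gene_length) for _ in positions]
        if mean_pairwise_distance(simulated) <= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical residue positions in a 1000-residue protein.
p_clustered = clustering_p_value([100, 101, 102, 103, 105], 1000)
p_spread = clustering_p_value([10, 250, 500, 750, 990], 1000)
```

    Tightly grouped positions yield a small p-value, whereas positions spread along the whole gene do not, which matches the contrast the abstract draws between NHI and HI mutation patterns.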

  18. Integrative CAGE and DNA Methylation Profiling Identify Epigenetically Regulated Genes in NSCLC.

    PubMed

    Horie, Masafumi; Kaczkowski, Bogumil; Ohshima, Mitsuhiro; Matsuzaki, Hirotaka; Noguchi, Satoshi; Mikami, Yu; Lizio, Marina; Itoh, Masayoshi; Kawaji, Hideya; Lassmann, Timo; Carninci, Piero; Hayashizaki, Yoshihide; Forrest, Alistair R R; Takai, Daiya; Yamaguchi, Yoko; Micke, Patrick; Saito, Akira; Nagase, Takahide

    2017-10-01

    Lung cancer is the leading cause of cancer-related deaths worldwide. The majority of cancer driver mutations have been identified; however, relevant epigenetic regulation involved in tumorigenesis has only been fragmentarily analyzed. Epigenetically regulated genes have a great theranostic potential, especially in tumors with no apparent driver mutations. Here, epigenetically regulated genes were identified in lung cancer by an integrative analysis of promoter-level expression profiles from Cap Analysis of Gene Expression (CAGE) of 16 non-small cell lung cancer (NSCLC) cell lines and 16 normal lung primary cell specimens with DNA methylation data of 69 NSCLC cell lines and 6 normal lung epithelial cells. A core set of 49 coding genes and 10 long noncoding RNAs (lncRNA), which are upregulated in NSCLC cell lines due to promoter hypomethylation, was uncovered. Twenty-two epigenetically regulated genes were validated (upregulated genes with hypomethylated promoters) in the adenocarcinoma and squamous cell cancer subtypes of lung cancer using The Cancer Genome Atlas data. Furthermore, it was demonstrated that multiple copies of the REP522 DNA repeat family are prominently upregulated due to hypomethylation in NSCLC cell lines, which leads to cancer-specific expression of lncRNAs, such as RP1-90G24.10, AL022344.4, and PCAT7. Finally, Myeloma Overexpressed (MYEOV) was identified as the most promising candidate. Functional studies demonstrated that MYEOV promotes cell proliferation, survival, and invasion. Moreover, high MYEOV expression levels were associated with poor prognosis. Implications: This report identifies a robust list of 22 candidate driver genes that are epigenetically regulated in lung cancer; such genes may complement the known mutational drivers. Visual Overview: http://mcr.aacrjournals.org/content/early/2017/10/01/1354-1365.MCR-17-0191-ET/F1.large.jpg Mol Cancer Res; 15(10); 1354-65. ©2017 AACR. ©2017 American Association for Cancer Research.

  19. A Comprehensive Gene Expression Meta-analysis Identifies Novel Immune Signatures in Rheumatoid Arthritis Patients

    PubMed Central

    Afroz, Sumbul; Giddaluru, Jeevan; Vishwakarma, Sandeep; Naz, Saima; Khan, Aleem Ahmed; Khan, Nooruddin

    2017-01-01

    Rheumatoid arthritis (RA), a symmetric polyarticular arthritis, has long been feared as one of the most disabling forms of arthritis. Identification of gene signatures associated with RA onset and progression would lead toward development of novel diagnostics and therapeutic interventions. This study was undertaken to identify unique gene signatures of RA patients through large-scale meta-profiling of a diverse collection of gene expression data sets. We carried out a meta-analysis of 8 publicly available RA patients’ (107 RA patients and 76 healthy controls) gene expression data sets and further validated a few meta-signatures in RA patients through quantitative real-time PCR (RT-qPCR). We identified a robust meta-profile comprising 33 differentially expressed genes, which were consistently and significantly expressed across all the data sets. Our meta-analysis unearthed upregulation of a few novel gene signatures including PLCG2, HLA-DOB, HLA-F, EIF4E2, and CYFIP2, which were validated in peripheral blood mononuclear cell samples of RA patients. Further, functional and pathway enrichment analysis reveals perturbation of several meta-genes involved in signaling pathways pertaining to inflammation, antigen presentation, hypoxia, and apoptosis during RA. Additionally, PLCG2 (phospholipase Cγ2) popped out as a novel meta-gene involved in most of the pathways relevant to RA including inflammasome activation, platelet aggregation, and activation, thereby suggesting PLCG2 as a potential therapeutic target for controlling excessive inflammation during RA. In conclusion, these findings highlight the utility of meta-analysis approach in identifying novel gene signatures that might provide mechanistic insights into disease onset, progression and possibly lead toward the development of better diagnostic and therapeutic interventions against RA. PMID:28210261

  20. A Comprehensive Gene Expression Meta-analysis Identifies Novel Immune Signatures in Rheumatoid Arthritis Patients.

    PubMed

    Afroz, Sumbul; Giddaluru, Jeevan; Vishwakarma, Sandeep; Naz, Saima; Khan, Aleem Ahmed; Khan, Nooruddin

    2017-01-01

    Rheumatoid arthritis (RA), a symmetric polyarticular arthritis, has long been feared as one of the most disabling forms of arthritis. Identification of gene signatures associated with RA onset and progression would lead toward development of novel diagnostics and therapeutic interventions. This study was undertaken to identify unique gene signatures of RA patients through large-scale meta-profiling of a diverse collection of gene expression data sets. We carried out a meta-analysis of 8 publicly available RA patients' (107 RA patients and 76 healthy controls) gene expression data sets and further validated a few meta-signatures in RA patients through quantitative real-time PCR (RT-qPCR). We identified a robust meta-profile comprising 33 differentially expressed genes, which were consistently and significantly expressed across all the data sets. Our meta-analysis unearthed upregulation of a few novel gene signatures including PLCG2, HLA-DOB, HLA-F, EIF4E2, and CYFIP2, which were validated in peripheral blood mononuclear cell samples of RA patients. Further, functional and pathway enrichment analysis reveals perturbation of several meta-genes involved in signaling pathways pertaining to inflammation, antigen presentation, hypoxia, and apoptosis during RA. Additionally, PLCG2 (phospholipase Cγ2) popped out as a novel meta-gene involved in most of the pathways relevant to RA including inflammasome activation, platelet aggregation, and activation, thereby suggesting PLCG2 as a potential therapeutic target for controlling excessive inflammation during RA. In conclusion, these findings highlight the utility of meta-analysis approach in identifying novel gene signatures that might provide mechanistic insights into disease onset, progression and possibly lead toward the development of better diagnostic and therapeutic interventions against RA.
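
    The notion of a meta-profile, genes that change in the same direction in every data set, can be pictured as a set intersection. What follows is a minimal sketch in Python, assuming each data set has already been reduced to a mapping from gene symbol to direction of change. The helper name `meta_signature` and the toy input are hypothetical; the study's own statistical pipeline is not described in the abstract and is not reproduced here.

```python
def meta_signature(datasets):
    """Return genes whose direction of change agrees in every data set.

    `datasets` is a list of dicts mapping gene symbol -> "up" or "down".
    Hypothetical helper for illustration only.
    """
    if not datasets:
        return {}
    first, rest = datasets[0], datasets[1:]
    return {
        gene: direction
        for gene, direction in first.items()
        # Keep a gene only if every other data set reports the same direction.
        if all(d.get(gene) == direction for d in rest)
    }


# Toy input: three made-up data sets, not the eight used in the study.
toy = [
    {"PLCG2": "up", "HLA-F": "up", "GENE_X": "down"},
    {"PLCG2": "up", "HLA-F": "up", "GENE_X": "up"},
    {"PLCG2": "up", "HLA-F": "up"},
]
signature = meta_signature(toy)
```

    In this toy input, `GENE_X` is dropped because its direction disagrees between data sets and it is missing from the third one.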

  1. Flux variability scanning based on enforced objective flux for identifying gene amplification targets

    PubMed Central

    2012-01-01

    Background: In order to reduce time and efforts to develop microbial strains with better capability of producing desired bioproducts, genome-scale metabolic simulations have proven useful in identifying gene knockout and amplification targets. Constraints-based flux analysis has successfully been employed for such simulation, but is limited in its ability to properly describe the complex nature of biological systems. Gene knockout simulations are relatively straightforward to implement, simply by constraining the flux values of the target reaction to zero, but the identification of reliable gene amplification targets is rather difficult. Here, we report a new algorithm which incorporates physiological data into a model to improve the model’s prediction capabilities and to capitalize on the relationships between genes and metabolic fluxes. Results: We developed an algorithm, flux variability scanning based on enforced objective flux (FVSEOF) with grouping reaction (GR) constraints, in an effort to identify gene amplification targets by considering reactions that co-carry flux values based on physiological omics data via “GR constraints”. This method scans changes in the variabilities of metabolic fluxes in response to an artificially enforced objective flux of product formation. The gene amplification targets predicted using this method were validated by comparing the predicted effects with the previous experimental results obtained for the production of shikimic acid and putrescine in Escherichia coli. Moreover, new gene amplification targets for further enhancing putrescine production were validated through experiments involving the overexpression of each identified targeted gene under condition-controlled batch cultivation. Conclusions: FVSEOF with GR constraints allows identification of gene amplification targets for metabolic engineering of microbial strains in order to enhance the production of desired bioproducts. The algorithm was validated through the

  2. Integrative strategies to identify candidate genes in rodent models of human alcoholism.

    PubMed

    Treadwell, Julie A

    2006-01-01

    The search for genes underlying alcohol-related behaviours in rodent models of human alcoholism has been ongoing for many years with only limited success. Recently, new strategies that integrate several of the traditional approaches have provided new insights into the molecular mechanisms underlying ethanol's actions in the brain. We have used alcohol-preferring C57BL/6J (B6) and alcohol-avoiding DBA/2J (D2) genetic strains of mice in an integrative strategy combining high-throughput gene expression screening, genetic segregation analysis, and mapping to previously published quantitative trait loci to uncover candidate genes for the ethanol-preference phenotype. In our study, 2 genes, retinaldehyde binding protein 1 (Rlbp1) and syntaxin 12 (Stx12), were found to be strong candidates for ethanol preference. Such experimental approaches have the power and the potential to greatly speed up the laborious process of identifying candidate genes for the animal models of human alcoholism.

  3. Identifying the source of unknown microcystin genes and predicting microcystin variants by comparing genes within uncultured cyanobacterial cells.

    PubMed

    Allender, Christopher J; LeCleir, Gary R; Rinta-Kanto, Johanna M; Small, Randall L; Satchwell, Michael F; Boyer, Gregory L; Wilhelm, Steven W

    2009-06-01

    While multiple phylogenetic markers have been used in the culture-independent study of microcystin-producing cyanobacteria, in only a few instances have multiple markers been studied within individual cells, and in all cases these studies have been conducted with cultured isolates. Here, we isolate and evaluate large DNA fragments (>6 kb) encompassing two genes involved in microcystin biosynthesis (mcyA2 and mcyB1) and use them to identify the source of gene fragments found in water samples. Further investigation of these gene loci from individual cyanobacterial cells allowed for improved analysis of the genetic diversity within microcystin producers as well as a method to predict microcystin variants for individuals. These efforts have also identified the source of the novel mcyA genotype previously termed Microcystis-like that is pervasive in the Laurentian Great Lakes and they predict the microcystin variant(s) that it produces.

  4. A gene expression biomarker identifies in vitro and in vivo ERα modulators in a human gene expression compendium

    EPA Science Inventory

    We propose the use of gene expression profiling to complement the chemical characterization currently based on HTS assay data and present a case study relevant to the Endocrine Disruptor Screening Program. We have developed computational methods to identify estrogen receptor α...

  6. De novo transcriptome sequencing to identify the sex-determination genes in Hyriopsis schlegelii.

    PubMed

    Shi, Jianwu; Hong, Yijiang; Sheng, Junqing; Peng, Kou; Wang, Junhua

    2015-01-01

    This study presents the first analysis of expressed transcripts in the spermary and ovary of Hyriopsis schlegelii (H. schlegelii). A total of 132,055 unigenes were obtained and 31,781 of these genes were annotated. In addition, 19,511 upregulated and 25,911 downregulated unigenes were identified in the spermary. Ten sex-determination genes were selected and further analyzed by real-time PCR. In addition, mammalian genes reported to govern sex-determination pathways, including Sry, Dmrt1, Dmrt2, Sox9, GATA4, and WT1 in males and Wnt4, Rspo1, Foxl2, and β-catenin in females, were also identified in H. schlegelii. These results suggest that H. schlegelii and mammals use similar gene regulatory mechanisms to control sex determination. Moreover, genes associated with dosage compensation mechanisms, such as Msl1, Msl2, and Msl3, and hermaphrodite phenotypes, such as Tra-1, Tra-2α, Tra-2β, Fem1A, Fem1B, and Fem1C, were also identified in H. schlegelii. The identification of these genes indicates that diverse regulatory mechanisms regulate sexual polymorphism in H. schlegelii.

  7. Functional genomics identifies neural stem cell sub-type expression profiles and genes regulating neuroblast homeostasis

    PubMed Central

    Carney, Travis D.; Miller, Michael R.; Robinson, Kristin J.; Bayraktar, Omer A.; Osterhout, Jessica A.; Doe, Chris Q.

    2014-01-01

    The Drosophila larval central brain contains about 10,000 differentiated neurons and 200 scattered neural progenitors (neuroblasts), which can be further subdivided into ~95 type I neuroblasts and eight type II neuroblasts per brain lobe. Only type II neuroblasts generate self-renewing intermediate neural progenitors (INPs), and consequently each contributes more neurons to the brain, including much of the central complex. We characterized six different mutant genotypes that lead to expansion of neuroblast numbers; some preferentially expand type II or type I neuroblasts. Transcriptional profiling of larval brains from these mutant genotypes versus wild-type allowed us to identify small clusters of transcripts enriched in type II or type I neuroblasts, and we validated these clusters by gene expression analysis. Unexpectedly, only a few genes were found to be differentially expressed between type I/II neuroblasts, suggesting that these genes play a large role in establishing the different cell types. We also identified a large group of genes predicted to be expressed in all neuroblasts but not neurons. We performed a neuroblast-specific, RNAi-based functional screen and identified 84 genes that are required to maintain proper neuroblast numbers; all have conserved mammalian orthologs. These genes are excellent candidates for regulating neural progenitor self-renewal in Drosophila and mammals. PMID:22061480

  8. Copy number variation analysis identifies novel CAKUT candidate genes in children with a solitary functioning kidney

    PubMed Central

    Westland, Rik; Verbitsky, Miguel; Vukojevic, Katarina; Perry, Brittany J.; Fasel, David A.; Zwijnenburg, Petra J.G.; Bökenkamp, Arend; Gille, Johan J.P.; Saraga-Babic, Mirna; Ghiggeri, Gian Marco; D’Agati, Vivette D.; Schreuder, Michiel F.; Gharavi, Ali G.; van Wijk, Joanna A.E.; Sanna-Cherchi, Simone

    2016-01-01

    Copy number variations associate with different developmental phenotypes and represent a major cause of congenital anomalies of the kidney and urinary tract (CAKUT). Because rare pathogenic copy number variations are often large and contain multiple genes, identification of the underlying genetic drivers has proven to be difficult. Here we studied the role of rare copy number variations in 80 patients from the KIMONO-study cohort for which pathogenic mutations in three genes commonly implicated in CAKUT were excluded. In total, 13 known or novel genomic imbalances in 11 of 80 patients were absent or extremely rare in 23,362 population controls. To identify the most likely genetic drivers for the CAKUT phenotype underlying these rare copy number variations, we used a systematic in silico approach based on frequency in a large dataset of controls, annotation with publicly available databases for developmental diseases, tolerance and haploinsufficiency scores, and gene expression profile in the developing kidney and urinary tract. Five novel candidate genes for CAKUT were identified that showed specific expression in the human and mouse developing urinary tract. Among these genes, DLG1 and KIF12 are likely novel susceptibility genes for CAKUT in humans. Thus, there is a significant role of genomic imbalance in the determination of kidney developmental phenotypes. Additionally, we defined a systematic strategy to identify genetic drivers underlying rare copy number variations. PMID:26352300

  9. Copy number variation analysis identifies novel CAKUT candidate genes in children with a solitary functioning kidney.

    PubMed

    Westland, Rik; Verbitsky, Miguel; Vukojevic, Katarina; Perry, Brittany J; Fasel, David A; Zwijnenburg, Petra J G; Bökenkamp, Arend; Gille, Johan J P; Saraga-Babic, Mirna; Ghiggeri, Gian Marco; D'Agati, Vivette D; Schreuder, Michiel F; Gharavi, Ali G; van Wijk, Joanna A E; Sanna-Cherchi, Simone

    2015-12-01

    Copy number variations associate with different developmental phenotypes and represent a major cause of congenital anomalies of the kidney and urinary tract (CAKUT). Because rare pathogenic copy number variations are often large and contain multiple genes, identification of the underlying genetic drivers has proven to be difficult. Here we studied the role of rare copy number variations in 80 patients from the KIMONO study cohort for which pathogenic mutations in three genes commonly implicated in CAKUT were excluded. In total, 13 known or novel genomic imbalances in 11 of 80 patients were absent or extremely rare in 23,362 population controls. To identify the most likely genetic drivers for the CAKUT phenotype underlying these rare copy number variations, we used a systematic in silico approach based on frequency in a large data set of controls, annotation with publicly available databases for developmental diseases, tolerance and haploinsufficiency scores, and gene expression profile in the developing kidney and urinary tract. Five novel candidate genes for CAKUT were identified that showed specific expression in the human and mouse developing urinary tract. Among these genes, DLG1 and KIF12 are likely novel susceptibility genes for CAKUT in humans. Thus, there is a significant role of genomic imbalance in the determination of kidney developmental phenotypes. Additionally, we defined a systematic strategy to identify genetic drivers underlying rare copy number variations.

  10. A genome-wide screen for identifying all regulators of a target gene

    PubMed Central

    Baptist, Guillaume; Pinel, Corinne; Ranquet, Caroline; Izard, Jérôme; Ropers, Delphine; de Jong, Hidde; Geiselmann, Johannes

    2013-01-01

    We have developed a new screening methodology for identifying all genes that control the expression of a target gene through genetic or metabolic interactions. The screen combines mutant libraries with luciferase reporter constructs, whose expression can be monitored in vivo and over time in different environmental conditions. We apply the method to identify the genes that control the expression of the gene acs, encoding the acetyl coenzyme A synthetase, in Escherichia coli. We confirm most of the known genetic regulators, including CRP–cAMP, IHF and components of the phosphotransferase system. In addition, we identify new regulatory interactions, many of which involve metabolic intermediates or metabolic sensing, such as the genes pgi, pfkA, sucB and lpdA, encoding enzymes in glycolysis and the TCA cycle. Some of these novel interactions were validated by quantitative reverse transcriptase-polymerase chain reaction. More generally, we observe that a large number of mutants directly or indirectly influence acs expression, an effect confirmed for a second promoter, sdhC. The method is applicable to any promoter fused to a luminescent reporter gene in combination with a deletion mutant library. PMID:23892289

  11. Comparative Genomics of Ralstonia solanacearum Identifies Candidate Genes Associated with Cool Virulence

    PubMed Central

    Bocsanczy, Ana M.; Huguet-Tapia, Jose C.; Norman, David J.

    2017-01-01

    Strains of the Ralstonia solanacearum species complex in the phylotype IIB group are capable of causing Bacterial Wilt disease in potato and tomato at temperatures lower than 24°C. The capability of these strains to survive and to incite infection at temperatures colder than their normally tropical boundaries represents a threat to United States agriculture in temperate regions. In this work, we used a comparative genomics approach to identify orthologous genes linked to the lower temperature virulence phenotype. Six R. solanacearum cool virulent (CV) strains were compared to six strains non-pathogenic at low temperature (NPLT). CV strains can cause Bacterial Wilt symptoms at temperatures below 24°C, while NPLT cannot. Four R. solanacearum strains were sequenced for this work in order to complete the comparison. An orthologous genes comparison identified 44 genes present only in CV strains and 19 genes present only in NPLT strains. Gene annotation revealed a high percentage of genes compared with whole genomes in the transcriptional regulator and transport categories. A single nucleotide polymorphism (SNP) analysis identified 265 genes containing conserved non-synonymous SNPs in CV strains. Ten genes in the pathogenicity category were identified in this group. Comparisons of type 3 secretion system, type 6 secretion system (T6SS) clusters, and associated effectors did not indicate a correlation with the CV phenotype except for one T6SS VGR effector potentially associated with the CV phenotype. This is the first R. solanacearum genomic comparative analysis of multiple strains with different temperature related virulence. The candidate genes identified by this comparison are potential factors involved in virulence at low temperatures that need to be investigated. The high percentage of transcriptional regulators among the genes present only in CV strains supports the hypothesis that temperature dependent regulation of virulence genes explains the differential

  12. Evaluating the automatic mapping of human gene and protein mentions to unique identifiers.

    PubMed

    Morgan, Alexander A; Wellner, Benjamin; Colombe, Jeffrey B; Arens, Robert; Colosimo, Marc E; Hirschman, Lynette

    2007-01-01

    We have developed a challenge task for the second BioCreAtIvE (Critical Assessment of Information Extraction in Biology) that requires participating systems to provide lists of the EntrezGene (formerly LocusLink) identifiers for all human genes and proteins mentioned in a MEDLINE abstract. We are distributing 281 annotated abstracts and another 5,000 noisily annotated abstracts along with a gene name lexicon to participants. We have performed a series of baseline experiments to better characterize this dataset and form a foundation for participant exploration.

  13. Comparison of gene expression in segregating families identifies genes and genomic regions involved in a novel adaptation, zinc hyperaccumulation.

    PubMed

    Filatov, Victor; Dowdle, John; Smirnoff, Nicholas; Ford-Lloyd, Brian; Newbury, H John; Macnair, Mark R

    2006-09-01

    One of the challenges of comparative genomics is to identify specific genetic changes associated with the evolution of a novel adaptation or trait. We need to be able to disassociate the genes involved with a particular character from all the other genetic changes that take place as lineages diverge. Here we show that by comparing the transcriptional profile of segregating families with that of parent species differing in a novel trait, it is possible to narrow down substantially the list of potential target genes. In addition, by assuming synteny with a related model organism for which the complete genome sequence is available, it is possible to use the cosegregation of markers differing in transcription level to identify regions of the genome which probably contain quantitative trait loci (QTLs) for the character. This novel combination of genomics and classical genetics provides a very powerful tool to identify candidate genes. We use this methodology to investigate zinc hyperaccumulation in Arabidopsis halleri, the sister species to the model plant, Arabidopsis thaliana. We compare the transcriptional profile of A. halleri with that of its sister nonaccumulator species, Arabidopsis petraea, and between accumulator and nonaccumulator F(3)s derived from the cross between the two species. We identify eight genes which consistently show greater expression in accumulator phenotypes in both roots and shoots, including two metal transporter genes (NRAMP3 and ZIP6), and cytoplasmic aconitase, a gene involved in iron homeostasis in mammals. We also show that there appear to be two QTLs for zinc accumulation, on chromosomes 3 and 7.

  14. Identifying novel genes and chemicals related to nasopharyngeal cancer in a heterogeneous network.

    PubMed

    Li, Zhandong; An, Lifeng; Li, Hao; Wang, ShaoPeng; Zhou, You; Yuan, Fei; Li, Lin

    2016-05-05

    Nasopharyngeal cancer or nasopharyngeal carcinoma (NPC) is the most common cancer originating in the nasopharynx. The factors that induce nasopharyngeal cancer are still not clear. Additional information about the chemicals or genes related to nasopharyngeal cancer will promote a better understanding of the pathogenesis of this cancer and the factors that induce it. Thus, a computational method NPC-RGCP was proposed in this study to identify the possible relevant chemicals and genes based on the presently known chemicals and genes related to nasopharyngeal cancer. To extensively utilize the functional associations between proteins and chemicals, a heterogeneous network was constructed based on interactions of proteins and chemicals. The NPC-RGCP included two stages: the searching stage and the screening stage. The former stage is for finding new possible genes and chemicals in the heterogeneous network, while the latter stage is for screening and removing false discoveries and selecting the core genes and chemicals. As a result, five putative genes, CXCR3, IRF1, CDK1, GSTP1, and CDH2, and seven putative chemicals, iron, propionic acid, dimethyl sulfoxide, isopropanol, erythrose 4-phosphate, β-D-Fructose 6-phosphate, and flavin adenine dinucleotide, were identified by NPC-RGCP. Extensive analyses provided confirmation that the putative genes and chemicals have significant associations with nasopharyngeal cancer.

  15. A Genome-Wide Regulatory Framework Identifies Maize Pericarp Color1 Controlled Genes

    PubMed Central

    Morohashi, Kengo; Casas, María Isabel; Ferreyra, Lorena Falcone; Mejía-Guerra, María Katherine; Pourcel, Lucille; Yilmaz, Alper; Feller, Antje; Carvalho, Bruna; Emiliani, Julia; Rodriguez, Eduardo; Pellegrinet, Silvina; McMullen, Michael; Casati, Paula; Grotewold, Erich

    2012-01-01

    Pericarp Color1 (P1) encodes an R2R3-MYB transcription factor responsible for the accumulation of insecticidal flavones in maize (Zea mays) silks and red phlobaphene pigments in pericarps and other floral tissues, which makes P1 an important visual marker. Using genome-wide expression analyses (RNA sequencing) in pericarps and silks of plants with contrasting P1 alleles combined with chromatin immunoprecipitation coupled with high-throughput sequencing, we show here that the regulatory functions of P1 are much broader than the activation of genes corresponding to enzymes in a branch of flavonoid biosynthesis. P1 modulates the expression of several thousand genes, and ∼1500 of them were identified as putative direct targets of P1. Among them, we identified F2H1, corresponding to a P450 enzyme that converts naringenin into 2-hydroxynaringenin, a key branch point in the P1-controlled pathway and the first step in the formation of insecticidal C-glycosyl flavones. Unexpectedly, the binding of P1 to gene regulatory regions can result in both gene activation and repression. Our results indicate that P1 is the major regulator for a set of genes involved in flavonoid biosynthesis and a minor modulator of the expression of a much larger gene set that includes genes involved in primary metabolism and production of other specialized compounds. PMID:22822204

  16. Integration of QTL and bioinformatic tools to identify candidate genes for triglycerides in mice.

    PubMed

    Leduc, Magalie S; Hageman, Rachael S; Verdugo, Ricardo A; Tsaih, Shirng-Wern; Walsh, Kenneth; Churchill, Gary A; Paigen, Beverly

    2011-09-01

    To identify genetic loci influencing lipid levels, we performed quantitative trait loci (QTL) analysis between inbred mouse strains MRL/MpJ and SM/J, measuring triglyceride levels at 8 weeks of age in F2 mice fed a chow diet. We identified one significant QTL on chromosome (Chr) 15 and three suggestive QTL on Chrs 2, 7, and 17. We also carried out microarray analysis on the livers of parental strains of 282 F2 mice and used these data to find cis-regulated expression QTL. We then narrowed the list of candidate genes under significant QTL using a "toolbox" of bioinformatic resources, including haplotype analysis; parental strain comparison for gene expression differences and nonsynonymous coding single nucleotide polymorphisms (SNP); cis-regulated eQTL in livers of F2 mice; correlation between gene expression and phenotype; and conditioning of expression on the phenotype. We suggest Slc25a7 as a candidate gene for the Chr 7 QTL and, based on expression differences, five genes (Polr3h, Cyp2d22, Cyp2d26, Tspo, and Ttll12) as candidate genes for Chr 15 QTL. This study shows how bioinformatics can be used effectively to reduce candidate gene lists for QTL related to complex traits.
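
    One way to picture the "toolbox" narrowing step is as a filter that keeps genes supported by several independent lines of evidence. The Python sketch below makes that assumption explicit. The `Candidate` class, the `rank_candidates` function, the evidence labels, the gene names, and the two-line threshold are all hypothetical illustrations, not the authors' actual criteria.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    gene: str
    evidence: frozenset  # labels of the evidence lines supporting this gene


def rank_candidates(candidates, min_lines=2):
    """Keep genes with at least `min_lines` evidence lines, strongest first.

    The threshold is an assumption chosen for illustration.
    """
    kept = [c for c in candidates if len(c.evidence) >= min_lines]
    # Sort by descending evidence count, then alphabetically for stable output.
    return sorted(kept, key=lambda c: (-len(c.evidence), c.gene))


# Made-up genes and evidence labels, loosely named after the abstract's toolbox.
toy = [
    Candidate("GeneA", frozenset({"haplotype", "cis_eqtl", "expression_difference"})),
    Candidate("GeneB", frozenset({"coding_snp"})),
    Candidate("GeneC", frozenset({"cis_eqtl", "phenotype_correlation"})),
]
shortlist = [c.gene for c in rank_candidates(toy)]
```

    Counting evidence lines treats every resource as equally informative, which is a simplification; a weighted score would be a natural variation.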

  17. Identifying novel genes and chemicals related to nasopharyngeal cancer in a heterogeneous network

    PubMed Central

    Li, Zhandong; An, Lifeng; Li, Hao; Wang, ShaoPeng; Zhou, You; Yuan, Fei; Li, Lin

    2016-01-01

    Nasopharyngeal cancer or nasopharyngeal carcinoma (NPC) is the most common cancer originating in the nasopharynx. The factors that induce nasopharyngeal cancer are still not clear. Additional information about the chemicals or genes related to nasopharyngeal cancer will promote a better understanding of the pathogenesis of this cancer and the factors that induce it. Thus, a computational method NPC-RGCP was proposed in this study to identify the possible relevant chemicals and genes based on the presently known chemicals and genes related to nasopharyngeal cancer. To extensively utilize the functional associations between proteins and chemicals, a heterogeneous network was constructed based on interactions of proteins and chemicals. The NPC-RGCP included two stages: the searching stage and the screening stage. The former stage is for finding new possible genes and chemicals in the heterogeneous network, while the latter stage is for screening and removing false discoveries and selecting the core genes and chemicals. As a result, five putative genes, CXCR3, IRF1, CDK1, GSTP1, and CDH2, and seven putative chemicals, iron, propionic acid, dimethyl sulfoxide, isopropanol, erythrose 4-phosphate, β-D-Fructose 6-phosphate, and flavin adenine dinucleotide, were identified by NPC-RGCP. Extensive analyses provided confirmation that the putative genes and chemicals have significant associations with nasopharyngeal cancer. PMID:27149165

  18. Comparison of inherently essential genes of Porphyromonas gingivalis identified in two transposon-sequencing libraries.

    PubMed

    Hutcherson, J A; Gogeneni, H; Yoder-Himes, D; Hendrickson, E L; Hackett, M; Whiteley, M; Lamont, R J; Scott, D A

    2016-08-01

    Porphyromonas gingivalis is a Gram-negative anaerobe and keystone periodontal pathogen. A mariner transposon insertion mutant library has recently been used to define 463 genes as putatively essential for the in vitro growth of P. gingivalis ATCC 33277 in planktonic culture (Library 1). We have independently generated a transposon insertion mutant library (Library 2) for the same P. gingivalis strain and herein compare genes that are putatively essential for in vitro growth in complex media, as defined by both libraries. In all, 281 genes (61%) identified by Library 1 were common to Library 2. Many of these common genes are involved in fundamentally important metabolic pathways, notably pyrimidine cycling as well as lipopolysaccharide, peptidoglycan, pantothenate and coenzyme A biosynthesis, and nicotinate and nicotinamide metabolism. Also in common are genes encoding heat-shock protein homologues, sigma factors, enzymes with proteolytic activity, and the majority of sec-related protein export genes. In addition to facilitating a better understanding of critical physiological processes, transposon-sequencing technology has the potential to identify novel strategies for the control of P. gingivalis infections. Those genes defined as essential by two independently generated TnSeq mutant libraries are likely to represent particularly attractive therapeutic targets.
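
    The comparison of the two libraries amounts to a set intersection, with the shared fraction expressed relative to Library 1. The short Python sketch below assumes each library is available as a collection of gene identifiers. The function name `overlap_summary` and the toy identifiers are hypothetical; only the counts 281 and 463 come from the abstract.

```python
def overlap_summary(library_1, library_2):
    """Return the shared genes and their percentage of library_1."""
    set_1, set_2 = set(library_1), set(library_2)
    common = set_1 & set_2
    percent = 100.0 * len(common) / len(set_1) if set_1 else 0.0
    return common, percent


# Toy identifiers, not real P. gingivalis locus tags.
common, percent = overlap_summary({"g1", "g2", "g3", "g4"}, {"g2", "g3", "g5"})

# Consistency check on the figures quoted in the abstract: 281 of 463 genes.
reported_percent = round(100 * 281 / 463)
```

    The rounded value agrees with the 61% quoted in the abstract.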

  19. Transcriptome analysis identifies genes involved in ethanol response of Saccharomyces cerevisiae in Agave tequilana juice.

    PubMed

    Ramirez-Córdova, Jesús; Drnevich, Jenny; Madrigal-Pulido, Jaime Alberto; Arrizon, Javier; Allen, Kirk; Martínez-Velázquez, Moisés; Alvarez-Maya, Ikuri

    2012-08-01

    During ethanol fermentation, yeast cells are exposed to stress due to the accumulation of ethanol, cell growth is altered and the output of the target product is reduced. For Agave beverages, like tequila, no reports have been published on the global gene expression under ethanol stress. In this work, we used microarray analysis to identify Saccharomyces cerevisiae genes involved in the ethanol response. Gene expression of a tequila yeast strain of S. cerevisiae (AR5) was explored by comparing global gene expression with that of laboratory strain S288C, both after ethanol exposure. Additionally, we used two different culture conditions, cells grown in Agave tequilana juice as a natural fermentation media or grown in yeast-extract peptone dextrose as artificial media. Of the 6368 S. cerevisiae genes in the microarray, 657 genes were identified that had different expression responses to ethanol stress due to strain and/or media. A cluster of 28 genes was found over-expressed specifically in the AR5 tequila strain that could be involved in the adaptation to tequila yeast fermentation, 14 of which are unknown such as yor343c, ylr162w, ygr182c, ymr265c, yer053c-a or ydr415c. These could be the most suitable genes for transforming tequila yeast to increase ethanol tolerance in the tequila fermentation process. Other genes involved in response to stress (RFC4, TSA1, MLH1, PAU3, RAD53) or transport (CYB2, TIP20, QCR9) were expressed in the same cluster. Unknown genes could be good candidates for the development of recombinant yeasts with ethanol tolerance for use in industrial tequila fermentation.

  20. Candidate gene linkage approach to identify DNA variants that predispose to preterm birth

    PubMed Central

    Bream, Elise N.A.; Leppellere, Cara R.; Cooper, Margaret E.; Dagle, John M.; Merrill, David C.; Christensen, Kaare; Simhan, Hyagriv N.; Fong, Chin-To; Hallman, Mikko; Muglia, Louis J.; Marazita, Mary L.; Murray, Jeffrey C.

    2013-01-01

    Background: To identify genetic variants contributing to preterm birth using a linkage candidate gene approach. Methods: We studied 99 single nucleotide polymorphisms for 33 genes in 257 families with preterm births segregating. Nonparametric and parametric analyses were used. Premature infants and mothers of premature infants were defined as affected cases in independent analyses. Results: Analyses with the infant as the case identified two genes with evidence of linkage: CRHR1 (p=0.0012) and CYP2E1 (p=0.0011). Analyses with the mother as the case identified four genes with evidence of linkage: ENPP1 (p=0.003), IGFBP3 (p=0.006), DHCR7 (p=0.009), and TRAF2 (p=0.01). DNA sequence analysis of the coding exons and splice sites for CRHR1 and TRAF2 identified no new likely etiologic variants. Conclusion: These findings suggest the involvement of six genes acting through the infant and/or the mother in the etiology of preterm birth. PMID:23168575

  1. Microarray and differential display identify genes involved in jasmonate-dependent anther development.

    PubMed

    Mandaokar, Ajin; Kumar, V Dinesh; Amway, Matt; Browse, John

    2003-07-01

    Jasmonate (JA) is a signaling compound essential for anther development and pollen fertility in Arabidopsis. Mutations that block the pathway of JA synthesis result in male sterility. To understand the processes of anther and pollen maturation, we used microarray and differential display approaches to compare gene expression patterns in anthers of wild-type Arabidopsis and the male-sterile mutant, opr3. The microarray experiment revealed 25 genes that were up-regulated more than 1.8-fold in wild-type anthers as compared to mutant anthers. Experiments based on differential display identified 13 additional genes up-regulated in wild-type anthers compared to opr3 for a total of 38 differentially expressed genes. Searches of the Arabidopsis and non-redundant databases disclosed known or likely functions for 28 of the 38 genes identified, while 10 genes encode proteins of unknown function. Northern blot analysis using eight representative clones as probes confirmed low expression in opr3 anthers compared with wild-type anthers. JA responsiveness of these same genes was also investigated by northern blot analysis of anther RNA isolated from wild-type and opr3 plants. In these experiments, four genes were induced in opr3 anthers within 0.5-1 h of JA treatment while the remaining genes were up-regulated only 1-8 h after JA application. None of these genes was induced by JA in anthers of the coi1 mutant that is deficient in JA responsiveness. The four early-induced genes in opr3 encode lipoxygenase, a putative bHLH transcription factor, epithiospecifier protein and an unknown protein. We propose that these and other early components may be involved in JA signaling and in the initiation of developmental processes. The four late genes encode an extensin-like protein, a peptide transporter and two unknown proteins, which may represent components required later in anther and pollen maturation. Transcript profiling has provided a successful approach to identify genes involved in

  2. A Sleeping Beauty forward genetic screen identifies new genes and pathways driving osteosarcoma development and metastasis

    PubMed Central

    Moriarity, Branden S; Otto, George M; Rahrmann, Eric P; Rathe, Susan K; Wolf, Natalie K; Weg, Madison T; Manlove, Luke A; LaRue, Rebecca S; Temiz, Nuri A; Molyneux, Sam D; Choi, Kwangmin; Holly, Kevin J; Sarver, Aaron L; Scott, Milcah C; Forster, Colleen L; Modiano, Jaime F; Khanna, Chand; Hewitt, Stephen M; Khokha, Rama; Yang, Yi; Gorlick, Richard; Dyer, Michael A; Largaespada, David A

    2016-01-01

    Osteosarcomas are sarcomas of the bone, derived from osteoblasts or their precursors, with a high propensity to metastasize. Osteosarcoma is associated with massive genomic instability, making it problematic to identify driver genes using human tumors or prototypical mouse models, many of which involve loss of Trp53 function. To identify the genes driving osteosarcoma development and metastasis, we performed a Sleeping Beauty (SB) transposon-based forward genetic screen in mice with and without somatic loss of Trp53. Common insertion site (CIS) analysis of 119 primary tumors and 134 metastatic nodules identified 232 sites associated with osteosarcoma development and 43 sites associated with metastasis, respectively. Analysis of CIS-associated genes identified numerous known and new osteosarcoma-associated genes enriched in the ErbB, PI3K-AKT-mTOR and MAPK signaling pathways. Lastly, we identified several oncogenes involved in axon guidance, including Sema4d and Sema6d, which we functionally validated as oncogenes in human osteosarcoma. PMID:25961939

  3. A Sleeping Beauty forward genetic screen identifies new genes and pathways driving osteosarcoma development and metastasis.

    PubMed

    Moriarity, Branden S; Otto, George M; Rahrmann, Eric P; Rathe, Susan K; Wolf, Natalie K; Weg, Madison T; Manlove, Luke A; LaRue, Rebecca S; Temiz, Nuri A; Molyneux, Sam D; Choi, Kwangmin; Holly, Kevin J; Sarver, Aaron L; Scott, Milcah C; Forster, Colleen L; Modiano, Jaime F; Khanna, Chand; Hewitt, Stephen M; Khokha, Rama; Yang, Yi; Gorlick, Richard; Dyer, Michael A; Largaespada, David A

    2015-06-01

    Osteosarcomas are sarcomas of the bone, derived from osteoblasts or their precursors, with a high propensity to metastasize. Osteosarcoma is associated with massive genomic instability, making it problematic to identify driver genes using human tumors or prototypical mouse models, many of which involve loss of Trp53 function. To identify the genes driving osteosarcoma development and metastasis, we performed a Sleeping Beauty (SB) transposon-based forward genetic screen in mice with and without somatic loss of Trp53. Common insertion site (CIS) analysis of 119 primary tumors and 134 metastatic nodules identified 232 sites associated with osteosarcoma development and 43 sites associated with metastasis, respectively. Analysis of CIS-associated genes identified numerous known and new osteosarcoma-associated genes enriched in the ErbB, PI3K-AKT-mTOR and MAPK signaling pathways. Lastly, we identified several oncogenes involved in axon guidance, including Sema4d and Sema6d, which we functionally validated as oncogenes in human osteosarcoma.

  4. Transcriptomic Analysis Using Olive Varieties and Breeding Progenies Identifies Candidate Genes Involved in Plant Architecture.

    PubMed

    González-Plaza, Juan J; Ortiz-Martín, Inmaculada; Muñoz-Mérida, Antonio; García-López, Carmen; Sánchez-Sevilla, José F; Luque, Francisco; Trelles, Oswaldo; Bejarano, Eduardo R; De La Rosa, Raúl; Valpuesta, Victoriano; Beuzón, Carmen R

    2016-01-01

    Plant architecture is a critical trait in fruit crops that can significantly influence yield, pruning, planting density and harvesting. Little is known about how plant architecture is genetically determined in olive, where most of the existing varieties are traditional with an architecture poorly suited for modern growing and harvesting systems. In the present study, we have carried out microarray analysis of meristematic tissue to compare expression profiles of olive varieties displaying differences in architecture, as well as seedlings from their cross pooled on the basis of their sharing architecture-related phenotypes. The microarray used, previously developed by our group, has already been applied to identify candidate genes involved in regulating juvenile to adult transition in the shoot apex of seedlings. Varieties with distinct architecture phenotypes and individuals from segregating progenies displaying opposite architecture features were used to link phenotype to expression. Here, we identify 2252 differentially expressed genes (DEGs) associated with differences in plant architecture. Microarray results were validated by quantitative RT-PCR carried out on genes with functional annotation likely related to plant architecture. Twelve of these genes were further analyzed in individual seedlings of the corresponding pool. We also examined Arabidopsis mutants in putative orthologs of these targeted candidate genes, finding altered architecture for most of them. This supports a functional conservation between species and potential biological relevance of the candidate genes identified. This study is the first to identify genes associated with plant architecture in olive, and the results obtained could be of great help in future programs aimed at selecting phenotypes adapted to modern cultivation practices in this species.

  5. Transcriptomic Analysis Using Olive Varieties and Breeding Progenies Identifies Candidate Genes Involved in Plant Architecture

    PubMed Central

    González-Plaza, Juan J.; Ortiz-Martín, Inmaculada; Muñoz-Mérida, Antonio; García-López, Carmen; Sánchez-Sevilla, José F.; Luque, Francisco; Trelles, Oswaldo; Bejarano, Eduardo R.; De La Rosa, Raúl; Valpuesta, Victoriano; Beuzón, Carmen R.

    2016-01-01

    Plant architecture is a critical trait in fruit crops that can significantly influence yield, pruning, planting density and harvesting. Little is known about how plant architecture is genetically determined in olive, where most of the existing varieties are traditional with an architecture poorly suited for modern growing and harvesting systems. In the present study, we have carried out microarray analysis of meristematic tissue to compare expression profiles of olive varieties displaying differences in architecture, as well as seedlings from their cross pooled on the basis of their sharing architecture-related phenotypes. The microarray used, previously developed by our group, has already been applied to identify candidate genes involved in regulating juvenile to adult transition in the shoot apex of seedlings. Varieties with distinct architecture phenotypes and individuals from segregating progenies displaying opposite architecture features were used to link phenotype to expression. Here, we identify 2252 differentially expressed genes (DEGs) associated with differences in plant architecture. Microarray results were validated by quantitative RT-PCR carried out on genes with functional annotation likely related to plant architecture. Twelve of these genes were further analyzed in individual seedlings of the corresponding pool. We also examined Arabidopsis mutants in putative orthologs of these targeted candidate genes, finding altered architecture for most of them. This supports a functional conservation between species and potential biological relevance of the candidate genes identified. This study is the first to identify genes associated with plant architecture in olive, and the results obtained could be of great help in future programs aimed at selecting phenotypes adapted to modern cultivation practices in this species. PMID:26973682

  6. Muscle transcriptomic investigation of late fetal development identifies candidate genes for piglet maturity.

    PubMed

    Voillet, Valentin; SanCristobal, Magali; Lippi, Yannick; Martin, Pascal G P; Iannuccelli, Nathalie; Lascor, Christine; Vignoles, Florence; Billon, Yvon; Canario, Laurianne; Liaubet, Laurence

    2014-09-17

    In pigs, the perinatal period is the most critical time for survival. Piglet maturation, which occurs at the end of gestation, leads to a state of full development after birth. Therefore, maturity is an important determinant of early survival. Skeletal muscle plays a key role in adaptation to extra-uterine life, e.g. glycogen storage and thermoregulation. In this study, we performed microarray analysis to identify the genes and biological processes involved in piglet muscle maturity. Progeny from two breeds with extreme muscle maturity phenotypes were analyzed at two time points during gestation (gestational days 90 and 110). The Large White (LW) breed is a selected breed with an increased rate of mortality at birth, whereas the Meishan (MS) breed produces piglets with extremely low mortality at birth. The impact of the parental genome was analyzed with reciprocal crossed fetuses. Microarray analysis identified 12,326 differentially expressed probes for gestational age and genotype. Such a high number reflects an important transcriptomic change that occurs between 90 and 110 days of gestation. 2,000 probes, corresponding to 1,120 unique annotated genes, involved more particularly in the maturation process were further studied. Functional enrichment and graph inference studies underlined genes involved in muscular development around 90 days of gestation, and genes involved in metabolic functions, such as gluconeogenesis, around 110 days of gestation. Moreover, a difference in the expression of key genes, e.g. PCK2, LDHA or PGK1, was detected between MS and LW just before birth. Reciprocal crossing analysis resulted in the identification of 472 genes with an expression preferentially regulated by one parental genome. Most of these genes (366) were regulated by the paternal genome. Among these paternally regulated genes, some known imprinted genes, such as MAGEL2 or IGF2, were identified and could have a key role in the maturation process. These results reveal the

  7. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes

    PubMed Central

    Hu, H; Haas, S A; Chelly, J; Van Esch, H; Raynaud, M; de Brouwer, A P M; Weinert, S; Froyen, G; Frints, S G M; Laumonnier, F; Zemojtel, T; Love, M I; Richard, H; Emde, A-K; Bienek, M; Jensen, C; Hambrock, M; Fischer, U; Langnick, C; Feldkamp, M; Wissink-Lindhout, W; Lebrun, N; Castelnau, L; Rucci, J; Montjean, R; Dorseuil, O; Billuart, P; Stuhlmann, T; Shaw, M; Corbett, M A; Gardner, A; Willis-Owen, S; Tan, C; Friend, K L; Belet, S; van Roozendaal, K E P; Jimenez-Pocquet, M; Moizard, M-P; Ronce, N; Sun, R; O'Keeffe, S; Chenna, R; van Bömmel, A; Göke, J; Hackett, A; Field, M; Christie, L; Boyle, J; Haan, E; Nelson, J; Turner, G; Baynam, G; Gillessen-Kaesbach, G; Müller, U; Steinberger, D; Budny, B; Badura-Stronka, M; Latos-Bieleńska, A; Ousager, L B; Wieacker, P; Rodríguez Criado, G; Bondeson, M-L; Annerén, G; Dufke, A; Cohen, M; Van Maldergem, L; Vincent-Delorme, C; Echenne, B; Simon-Bouy, B; Kleefstra, T; Willemsen, M; Fryns, J-P; Devriendt, K; Ullmann, R; Vingron, M; Wrogemann, K; Wienker, T F; Tzschach, A; van Bokhoven, H; Gecz, J; Jentsch, T J; Chen, W; Ropers, H-H; Kalscheuer, V M

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or loci are yet to be identified. Here, we have investigated 405 unresolved families with XLID. We employed massively parallel sequencing of all X-chromosome exons in the index males. The majority of these males were previously tested negative for copy number variations and for mutations in a subset of known XLID genes by Sanger sequencing. In total, 745 X-chromosomal genes were screened. After stringent filtering, a total of 1297 non-recurrent exonic variants remained for prioritization. Co-segregation analysis of potential clinically relevant changes revealed that 80 families (20%) carried pathogenic variants in established XLID genes. In 19 families, we detected likely causative protein truncating and missense variants in 7 novel and validated XLID genes (CLCN4, CNKSR2, FRMPD4, KLHL15, LAS1L, RLIM and USP27X) and potentially deleterious variants in 2 novel candidate XLID genes (CDK16 and TAF1). We show that the CLCN4 and CNKSR2 variants impair protein functions as indicated by electrophysiological studies and altered differentiation of cultured primary neurons from Clcn4−/− mice or after mRNA knock-down. The newly identified and candidate XLID proteins belong to pathways and networks with established roles in cognitive function and intellectual disability in particular. We suggest that systematic sequencing of all X-chromosomal genes in a cohort of patients with genetic evidence for X-chromosome locus involvement may resolve up to 58% of Fragile X-negative cases. PMID:25644381

  8. Transcriptome analysis identifies genes with enriched expression in the mouse central Extended Amygdala

    PubMed Central

    Becker, Jérôme A. J.; Befort, Katia; Blad, Clara; Filliol, Dominique; Ghate, Aditee; Dembele, Doulaye; Thibault, Christelle; Koch, Muriel; Muller, Jean; Lardenois, Aurélie; Poch, Olivier; Kieffer, Brigitte L.

    2008-01-01

    The central Extended Amygdala (EAc) is an ensemble of highly interconnected limbic structures of the anterior brain, and forms a cellular continuum including the Bed Nucleus of the Stria Terminalis (BNST), the central nucleus of the Amygdala (CeA) and the Nucleus Accumbens shell (AcbSh). This neural network is a key site for interactions between brain reward and stress systems, and has been implicated in several aspects of drug abuse. In order to increase our understanding of EAc function at the molecular level, we undertook a genome-wide screen (Affymetrix) to identify genes whose expression is enriched in the EAc. We focused on the less well-known BNST-CeA areas of the EAc, and identified 121 genes that exhibit more than 2-fold higher expression level in the EAc compared to whole brain. Among these, forty-three genes have never been described to be expressed in the EAc. We mapped these genes throughout the brain, using non-radioactive in situ hybridization, and identified eight genes with a unique and distinct rostro-caudal expression pattern along AcbSh, BNST and CeA. Q-PCR analysis performed in brain and peripheral organ tissues indicated that, with the exception of one (Spata13), all these genes are predominantly expressed in brain. These genes encode signaling proteins (Adora2, GPR88, Arpp21 and Rem2), a transcription factor (Limh6) or proteins of unknown function (Rik130, Spata13 and Wfs1). The identification of genes with enriched expression expands our knowledge of EAc at a molecular level, and provides useful information towards genetic manipulations within the EAc. PMID:18786617
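
    The enrichment criterion in this abstract (more than 2-fold higher expression in the region than in whole brain) can be illustrated with a minimal sketch. The function name, gene labels and intensity values below are hypothetical, and linear-scale (not log-transformed) intensities are assumed; the abstract does not describe the authors' actual analysis pipeline.

```python
def enriched_genes(region_expr, reference_expr, min_fold=2.0):
    """Return {gene: fold change} for genes whose region/reference
    expression ratio is strictly greater than min_fold.

    Both arguments map gene name -> linear-scale expression value.
    Genes missing from the reference, or with a non-positive reference
    value, are skipped because the ratio is undefined for them.
    """
    hits = {}
    for gene, region_value in region_expr.items():
        ref_value = reference_expr.get(gene)
        if ref_value is None or ref_value <= 0:
            continue
        fold = region_value / ref_value
        if fold > min_fold:
            hits[gene] = fold
    return hits


# Hypothetical intensities, for illustration only.
region = {"geneA": 300.0, "geneB": 150.0, "geneC": 90.0}
whole_brain = {"geneA": 100.0, "geneB": 100.0, "geneC": 0.0}
print(enriched_genes(region, whole_brain))
```

    With log2-transformed microarray data, the same threshold would instead be a difference of more than 1 between the two values.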

  9. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes.

    PubMed

    Hu, H; Haas, S A; Chelly, J; Van Esch, H; Raynaud, M; de Brouwer, A P M; Weinert, S; Froyen, G; Frints, S G M; Laumonnier, F; Zemojtel, T; Love, M I; Richard, H; Emde, A-K; Bienek, M; Jensen, C; Hambrock, M; Fischer, U; Langnick, C; Feldkamp, M; Wissink-Lindhout, W; Lebrun, N; Castelnau, L; Rucci, J; Montjean, R; Dorseuil, O; Billuart, P; Stuhlmann, T; Shaw, M; Corbett, M A; Gardner, A; Willis-Owen, S; Tan, C; Friend, K L; Belet, S; van Roozendaal, K E P; Jimenez-Pocquet, M; Moizard, M-P; Ronce, N; Sun, R; O'Keeffe, S; Chenna, R; van Bömmel, A; Göke, J; Hackett, A; Field, M; Christie, L; Boyle, J; Haan, E; Nelson, J; Turner, G; Baynam, G; Gillessen-Kaesbach, G; Müller, U; Steinberger, D; Budny, B; Badura-Stronka, M; Latos-Bieleńska, A; Ousager, L B; Wieacker, P; Rodríguez Criado, G; Bondeson, M-L; Annerén, G; Dufke, A; Cohen, M; Van Maldergem, L; Vincent-Delorme, C; Echenne, B; Simon-Bouy, B; Kleefstra, T; Willemsen, M; Fryns, J-P; Devriendt, K; Ullmann, R; Vingron, M; Wrogemann, K; Wienker, T F; Tzschach, A; van Bokhoven, H; Gecz, J; Jentsch, T J; Chen, W; Ropers, H-H; Kalscheuer, V M

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or loci are yet to be identified. Here, we have investigated 405 unresolved families with XLID. We employed massively parallel sequencing of all X-chromosome exons in the index males. The majority of these males were previously tested negative for copy number variations and for mutations in a subset of known XLID genes by Sanger sequencing. In total, 745 X-chromosomal genes were screened. After stringent filtering, a total of 1297 non-recurrent exonic variants remained for prioritization. Co-segregation analysis of potential clinically relevant changes revealed that 80 families (20%) carried pathogenic variants in established XLID genes. In 19 families, we detected likely causative protein truncating and missense variants in 7 novel and validated XLID genes (CLCN4, CNKSR2, FRMPD4, KLHL15, LAS1L, RLIM and USP27X) and potentially deleterious variants in 2 novel candidate XLID genes (CDK16 and TAF1). We show that the CLCN4 and CNKSR2 variants impair protein functions as indicated by electrophysiological studies and altered differentiation of cultured primary neurons from Clcn4(-/-) mice or after mRNA knock-down. The newly identified and candidate XLID proteins belong to pathways and networks with established roles in cognitive function and intellectual disability in particular. We suggest that systematic sequencing of all X-chromosomal genes in a cohort of patients with genetic evidence for X-chromosome locus involvement may resolve up to 58% of Fragile X-negative cases.

  10. Identifying genes of agronomic importance in maize by screening microsatellites for evidence of selection during domestication.

    PubMed

    Vigouroux, Y; McMullen, M; Hittinger, C T; Houchins, K; Schulz, L; Kresovich, S; Matsuoka, Y; Doebley, J

    2002-07-23

    Crop species experienced strong selective pressure directed at genes controlling traits of agronomic importance during their domestication and subsequent episodes of selective breeding. Consequently, these genes are expected to exhibit the signature of selection. We screened 501 maize genes for the signature of selection using microsatellites or simple sequence repeats (SSRs). We applied the Ewens-Watterson test, which can reveal deviations from a neutral-equilibrium model, as well as two nonequilibrium tests that incorporate the domestication bottleneck. We investigated two classes of SSRs: those known to be polymorphic in maize (Class I) and those previously classified as monomorphic in maize (Class II). Fifteen SSRs exhibited some evidence for selection in maize and 10 showed evidence under stringent criteria. The genes containing nonneutral SSRs are candidates for agronomically important genes. Because demographic factors can bias our tests, further independent tests of these candidates are necessary. We applied such an additional test to one candidate, which encodes a MADS box transcriptional regulator, and confirmed that this gene experienced a selective sweep during maize domestication. Genomic scans for the signature of selection offer a means of identifying new genes of agronomic importance even when gene function and the phenotype of interest are unknown.

  11. New SigD-regulated genes identified in the rhizobacterium Bacillus amyloliquefaciens FZB42

    PubMed Central

    Li, Yu-Long; Mariappan, Aruljothi; Becker, Anke; Wu, Xiao-Qin; Borriss, Rainer

    2016-01-01

    The alternative sigma factor D is known to be involved in at least three biological processes in Bacilli: flagellin synthesis, methyl-accepting chemotaxis and autolysin synthesis. Although many Bacillus genes have been identified as part of the SigD regulon, the list may not be complete. With microarray-based systemic screening, we found a set of genes downregulated in the sigD knockout mutant of the plant growth-promoting rhizobacterium B. amyloliquefaciens subsp. plantarum FZB42. Eight genes (appA, blsA, dhaS, spoVG, yqgA, RBAM_004640, RBAM_018080 and ytk) were further confirmed by quantitative PCR and/or northern blot to be controlled by SigD at the transcriptional level. These genes have not hitherto been reported to be controlled by SigD. Among them, four genes are of unknown function and two genes (RBAM_004640 and RBAM_018080), absent in the model strain B. subtilis 168, are unique to B. amyloliquefaciens strains. The eight genes are involved in sporulation, biofilm formation, metabolite transport and several other functions. These findings extend our knowledge of the regulatory network governed by SigD in Bacillus and will further help to decipher the roles of the genes. PMID:27797724

  12. Use of suppression subtractive hybridization to identify downy mildew genes expressed during infection of Arabidopsis thaliana.

    PubMed

    Bittner-Eddy, Peter D; Allen, Rebecca L; Rehmany, Anne P; Birch, Paul; Beynon, Jim L

    2003-11-01

    Peronospora parasitica is an obligate biotrophic oomycete that causes downy mildew in Arabidopsis thaliana and Brassica species. Our goal is to identify P. parasitica (At) genes that are involved in pathogenicity. We used suppression subtractive hybridization (SSH) to generate cDNA libraries enriched for in planta-expressed parasite genes and up-regulated host genes. A total of 1345 clones were sequenced representing cDNA fragments from 25 putative P. parasitica (At) genes (Ppat 1-25) and 618 Arabidopsis genes. Analyses of expression patterns showed that 15 Ppats were expressed only in planta. Eleven Ppats encoded peptides with homology (BlastP values < 1e-05) to proteins with roles in membrane or cell wall biosynthesis, amino acid metabolism, osmoregulation, cation transport, phosphorylation or protein secretion. The other 14 represent potentially novel oomycete genes with none having homologues in an extensive Phytophthora species EST database. Full-length sequences were obtained for four Ppats, and each encoded a small cysteine-rich protein with an amino-terminal signal peptide sequence. These results demonstrate the utility of SSH in obtaining novel in planta-expressed genes from P. parasitica (At) that complements other gene discovery approaches such as EST sequencing.

  13. Wigwams: identifying gene modules co-regulated across multiple biological conditions

    PubMed Central

    Polanski, Krzysztof; Rhodes, Johanna; Hill, Claire; Zhang, Peijun; Jenkins, Dafyd J.; Kiddle, Steven J.; Jironkin, Aleksey; Beynon, Jim; Buchanan-Wollaston, Vicky; Ott, Sascha; Denby, Katherine J.

    2014-01-01

    Motivation: Identification of modules of co-regulated genes is a crucial first step towards dissecting the regulatory circuitry underlying biological processes. Co-regulated genes are likely to reveal themselves by showing tight co-expression, e.g. high correlation of expression profiles across multiple time series datasets. However, numbers of up- or downregulated genes are often large, making it difficult to discriminate between dependent co-expression resulting from co-regulation and independent co-expression. Furthermore, modules of co-regulated genes may only show tight co-expression across a subset of the time series, i.e. show condition-dependent regulation. Results: Wigwams is a simple and efficient method to identify gene modules showing evidence for co-regulation in multiple time series of gene expression data. Wigwams analyzes similarities of gene expression patterns within each time series (condition) and directly tests the dependence or independence of these across different conditions. The expression pattern of each gene in each subset of conditions is tested statistically as a potential signature of a condition-dependent regulatory mechanism regulating multiple genes. Wigwams does not require particular time points and can process datasets that are on different time scales. Differential expression relative to control conditions can be taken into account. The output is succinct and non-redundant, enabling gene network reconstruction to be focused on those gene modules and combinations of conditions that show evidence for shared regulatory mechanisms. Wigwams was run using six Arabidopsis time series expression datasets, producing a set of biologically significant modules spanning different combinations of conditions. Availability and implementation: A Matlab implementation of Wigwams, complete with graphical user interfaces and documentation, is available at: warwick.ac.uk/wigwams. Contact: k.j.denby@warwick.ac.uk Supplementary Data: Supplementary

  14. Wigwams: identifying gene modules co-regulated across multiple biological conditions.

    PubMed

    Polanski, Krzysztof; Rhodes, Johanna; Hill, Claire; Zhang, Peijun; Jenkins, Dafyd J; Kiddle, Steven J; Jironkin, Aleksey; Beynon, Jim; Buchanan-Wollaston, Vicky; Ott, Sascha; Denby, Katherine J

    2014-04-01

    Identification of modules of co-regulated genes is a crucial first step towards dissecting the regulatory circuitry underlying biological processes. Co-regulated genes are likely to reveal themselves by showing tight co-expression, e.g. high correlation of expression profiles across multiple time series datasets. However, numbers of up- or downregulated genes are often large, making it difficult to discriminate between dependent co-expression resulting from co-regulation and independent co-expression. Furthermore, modules of co-regulated genes may only show tight co-expression across a subset of the time series, i.e. show condition-dependent regulation. Wigwams is a simple and efficient method to identify gene modules showing evidence for co-regulation in multiple time series of gene expression data. Wigwams analyzes similarities of gene expression patterns within each time series (condition) and directly tests the dependence or independence of these across different conditions. The expression pattern of each gene in each subset of conditions is tested statistically as a potential signature of a condition-dependent regulatory mechanism regulating multiple genes. Wigwams does not require particular time points and can process datasets that are on different time scales. Differential expression relative to control conditions can be taken into account. The output is succinct and non-redundant, enabling gene network reconstruction to be focused on those gene modules and combinations of conditions that show evidence for shared regulatory mechanisms. Wigwams was run using six Arabidopsis time series expression datasets, producing a set of biologically significant modules spanning different combinations of conditions. A Matlab implementation of Wigwams, complete with graphical user interfaces and documentation, is available at: warwick.ac.uk/wigwams.
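
    The idea that two genes may be tightly co-expressed in only a subset of time series can be illustrated with a small sketch. This is not the Wigwams algorithm, which statistically tests dependence across conditions; it only computes a Pearson correlation per condition and lists the conditions in which two profiles correlate strongly. The function names, the 0.9 threshold, the condition labels and the expression values are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.
    Returns 0.0 if either sequence is constant (correlation undefined)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def coexpressed_conditions(gene_a, gene_b, min_r=0.9):
    """Return the sorted conditions in which the two genes' time-series
    profiles have a Pearson correlation of at least min_r.

    gene_a, gene_b: dicts mapping condition name -> list of expression
    values over time. Time series may differ in length between
    conditions, but must match between the two genes within a condition.
    """
    shared = sorted(set(gene_a) & set(gene_b))
    return [c for c in shared if pearson(gene_a[c], gene_b[c]) >= min_r]


# Hypothetical profiles for two genes under three conditions.
gene_a = {"cold": [1, 2, 3, 4], "drought": [1, 3, 2, 4], "heat": [4, 3, 2, 1]}
gene_b = {"cold": [2, 4, 6, 8], "drought": [4, 1, 3, 2], "heat": [1, 2, 3, 4]}
print(coexpressed_conditions(gene_a, gene_b))
```

    Treating each condition separately is what allows a pair of genes to qualify in one time series while being uncorrelated, or anti-correlated, in the others.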

  15. Comparative genomics between fly, mouse, and cattle identifies genes associated with sire conception rate.

    PubMed

    Li, G; Peñagaricano, F; Weigel, K A; Zhang, Y; Rosa, G; Khatib, H

    2012-10-01

    The decline in reproductive performance in cattle is of major concern to farmers and the dairy industry worldwide. Most fertility studies in cattle have focused on fertility of the cow, whereas the genetics of male fertility have not been thoroughly investigated. The present study hypothesizes that the high conservation of spermatogenesis genes from fly to human implies important roles of these genes in male fertility in cattle. To test this hypothesis, we performed an association analysis between highly conserved spermatogenesis genes and sire conception rate (SCR) in US Holsteins as a measure of bull fertility. Sequencing analysis revealed 24 single nucleotide polymorphisms (SNP) in 9 genes in the bull population using the pooled DNA sequencing approach. Five SNP previously identified in 5 genes from the POU1F1 pathway were also included in this study because they have shown significant associations with female and male fertility traits. Overall, 29 SNP located in 14 candidate genes were tested for association with sire conception rate in a population of 1,988 bulls. Three SNP located in MAP1B and 1 SNP in PPP1R11 showed significant associations with SCR. For the POU1F1 pathway, single gene analysis revealed significant associations of POU1F1 and STAT5A with SCR. Analysis of genotypic interactions between adjacent genes in the pathway revealed significant associations of STAT5A and UTMP genotypic combinations with SCR. The most significant spermatogenesis gene, MAP1B, was found to be associated with fertilization and blastocyst rates. Thus, the association of these genes with bull fertility testifies to the usefulness of the comparative genomics approach in selecting candidate male fertility genes. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  16. Gene expression analysis identifies new candidate genes associated with the development of black skin spots in Corriedale sheep.

    PubMed

    Peñagaricano, Francisco; Zorrilla, Pilar; Naya, Hugo; Robello, Carlos; Urioste, Jorge I

    2012-02-01

    The white coat colour of sheep is an important economic trait. For unknown reasons, some animals are born with, and others develop with time, black skin spots that can also produce pigmented fibres. The presence of pigmented fibres in the white wool significantly decreases the fibre quality. The aim of this work was to study gene expression in black spots (with and without pigmented fibres) and white skin by microarray techniques, in order to identify the possible genes involved in the development of this trait. Five unrelated Corriedale sheep were used and, for each animal, the three possible comparisons (three different hybridisations) between the three samples of interest were performed. Differential gene expression patterns were analysed using different t-test approaches. Most of the major genes with well-known roles in skin pigmentation, e.g. ASIP, MC1R and C-KIT, showed no significant difference in the gene expression between white skin and black spots. On the other hand, many of the differentially expressed genes (raw P-value < 0.005) detected in this study, e.g. C-FOS, KLF4 and UFC1, fulfil biological functions that are plausible to be involved in the formation of black spots. The gene expression of C-FOS and KLF4, transcription factors involved in the cellular response to external factors such as ultraviolet light, was validated by quantitative polymerase chain reaction (PCR). This exploratory study provides a list of candidate genes that could be associated with the development of black skin spots that should be studied in more detail. Characterisation of these genes will enable us to discern the molecular mechanisms involved in the development of this feature and, hence, increase our understanding of melanocyte biology and skin pigmentation. In sheep, understanding this phenomenon is a first step towards developing molecular tools to assist in the selection against the presence of pigmented fibres in white wool.

  17. A Modified Entropy-Based Approach for Identifying Gene-Gene Interactions in Case-Control Study

    PubMed Central

    Yee, Jaeyong; Kwon, Min-Seok; Park, Taesung; Park, Mira

    2013-01-01

    Gene-gene interactions may play an important role in the genetics of a complex disease. Detection and characterization of gene-gene interactions is a challenging issue that has stimulated the development of various statistical methods to address it. In this study, we introduce a method to measure gene interactions using entropy-based statistics from a contingency table of trait and genotype combinations. We also developed an exploration procedure by using graphs. We propose a standardized relative information gain (RIG) measure to evaluate the interactions between single nucleotide polymorphism (SNP) combinations. To identify the kth order interactions, contingency tables of trait and genotype combinations of k SNPs are constructed, with which RIGs are calculated. The RIGs are standardized using the mean and standard deviation from the permuted datasets. SNP combinations yielding high standardized RIG are chosen for gene-gene interactions. Detection of high-order interactions and comparison of interaction strengths between different orders are made possible by using standardized RIG. We have applied the proposed standardized entropy-based method to two types of data sets from a simulation study and a real genetic association study. We have compared our method and the multifactor dimensionality reduction (MDR) method through power analysis of eight different genetic models with varying penetrance rates, number of SNPs, and sample sizes. Our method shows successful identification of genetic associations and gene-gene interactions both in simulation and real genetic data. Simulation results suggest that the proposed entropy-based method is better able to detect high-order interactions and is superior to the MDR method in most cases. The proposed method is well suited for detecting interactions without main effects as well as for models including main effects. PMID:23874943
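
    The standardization step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact RIG formula, so the sketch assumes RIG is the information gain of the trait given a genotype combination, divided by the trait entropy, and it assumes the permuted datasets are produced by shuffling trait labels. All function names are hypothetical.

```python
import math
import random
from collections import Counter
from statistics import mean, stdev


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of hashable labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def relative_information_gain(trait, genotypes):
    """Assumed RIG: (H(trait) - H(trait | genotype combination)) / H(trait).

    `genotypes` holds one tuple of k SNP genotypes per subject, so each
    distinct tuple is one column of the trait-by-genotype contingency table.
    """
    h_trait = entropy(trait)
    if h_trait == 0:
        return 0.0
    n = len(trait)
    cells = {}
    for t, g in zip(trait, genotypes):
        cells.setdefault(g, []).append(t)
    h_cond = sum(len(ts) / n * entropy(ts) for ts in cells.values())
    return (h_trait - h_cond) / h_trait


def standardized_rig(trait, genotypes, n_perm=200, seed=0):
    """Standardize RIG with the mean and SD of RIGs from label permutations."""
    rng = random.Random(seed)
    observed = relative_information_gain(trait, genotypes)
    shuffled = list(trait)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # assumed permutation scheme: shuffle trait labels
        null.append(relative_information_gain(shuffled, genotypes))
    sd = stdev(null)
    return (observed - mean(null)) / sd if sd > 0 else 0.0
```

    In a toy dataset where the trait is the exclusive-or of two binary SNPs, each SNP alone gives a RIG of zero while the pair gives a RIG of one, which is the "interaction without main effects" situation the abstract mentions.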

  18. Selection on Plant Male Function Genes Identifies Candidates for Reproductive Isolation of Yellow Monkeyflowers

    PubMed Central

    Aagaard, Jan E.; George, Renee D.; Fishman, Lila; MacCoss, Michael J.; Swanson, Willie J.

    2013-01-01

    Understanding the genetic basis of reproductive isolation promises insight into speciation and the origins of biological diversity. While progress has been made in identifying genes underlying barriers to reproduction that function after fertilization (post-zygotic isolation), we know much less about earlier acting pre-zygotic barriers. Of particular interest are barriers involved in mating and fertilization that can evolve extremely rapidly under sexual selection, suggesting they may play a prominent role in the initial stages of reproductive isolation. A significant challenge to the field of speciation genetics is developing new approaches for identification of candidate genes underlying these barriers, particularly among non-traditional model systems. We employ powerful proteomic and genomic strategies to study the genetic basis of conspecific pollen precedence, an important component of pre-zygotic reproductive isolation among yellow monkeyflowers (Mimulus spp.) resulting from male pollen competition. We use isotopic labeling in combination with shotgun proteomics to identify more than 2,000 male function (pollen tube) proteins within maternal reproductive structures (styles) of M. guttatus flowers where pollen competition occurs. We then sequence array-captured pollen tube exomes from a large outcrossing population of M. guttatus, and identify those genes with evidence of selective sweeps or balancing selection consistent with their role in pollen competition. We also test for evidence of positive selection on these genes more broadly across yellow monkeyflowers, because a signal of adaptive divergence is a common feature of genes causing reproductive isolation. Together the molecular evolution studies identify 159 pollen tube proteins that are candidate genes for conspecific pollen precedence. Our work demonstrates how powerful proteomic and genomic tools can be readily adapted to non-traditional model systems, allowing for genome-wide screens towards the

  19. Gene expression in bovine rumen epithelium during weaning identifies molecular regulators of rumen development and growth.

    PubMed

    Connor, Erin E; Baldwin, Ransom L; Li, Cong-jun; Li, Robert W; Chung, Hoyoung

    2013-03-01

    During weaning, epithelial cell function in the rumen transitions in response to conversion from a pre-ruminant to a true ruminant environment to ensure efficient nutrient absorption and metabolism. To identify gene networks affected by weaning in bovine rumen, Holstein bull calves were fed commercial milk replacer only (MRO) until 42 days of age, then were provided diets of either milk + orchardgrass hay (MH) or milk + grain-based calf starter (MG). Rumen epithelial RNA was extracted from calves sacrificed at four time points: day 14 (n = 3) and day 42 (n = 3) of age while fed the MRO diet and day 56 (n = 3/diet) and day 70 (n = 3/diet) while fed the MH and MG diets for transcript profiling by microarray hybridization. Five two-group comparisons were made using Permutation Analysis of Differential Expression® to identify differentially expressed genes over time and developmental stage between days 14 and 42 within the MRO diet, between day 42 on the MRO diet and day 56 on the MG or MH diets, and between the MG and MH diets at days 56 and 70. Ingenuity Pathway Analysis (IPA) of differentially expressed genes during weaning indicated the top 5 gene networks involving molecules participating in lipid metabolism, cell morphology and death, cellular growth and proliferation, molecular transport, and the cell cycle. Putative genes functioning in the establishment of the rumen microbial population and associated rumen epithelial inflammation during weaning were identified. Activation of transcription factor PPAR-α was identified by IPA software as an important regulator of molecular changes in rumen epithelium that function in papillary development and fatty acid oxidation during the transition from pre-rumination to rumination. Thus, molecular markers of rumen development and gene networks regulating differentiation and growth of rumen epithelium were identified for selecting targets and methods for improving and assessing rumen development and

  20. De novo Transcriptome Analysis of Miscanthus lutarioriparius Identifies Candidate Genes in Rhizome Development

    PubMed Central

    Hu, Ruibo; Yu, Changjiang; Wang, Xiaoyu; Jia, Chunlin; Pei, Shengqiang; He, Kang; He, Guo; Kong, Yingzhen; Zhou, Gongke

    2017-01-01

    HIGHLIGHT De novo transcriptome profiling of five tissues reveals candidate genes putatively involved in rhizome development in M. lutarioriparius. Miscanthus lutarioriparius is a promising lignocellulosic feedstock for second-generation bioethanol production. However, the genomic resources for this species are relatively limited, which hampers our understanding of the molecular mechanisms underlying many important biological processes. In this study, we performed the first de novo transcriptome analysis of five tissues (leaf, stem, root, lateral bud and rhizome bud) of M. lutarioriparius with an emphasis on identifying putative genes involved in rhizome development. Approximately 66 gigabases (Gb) of paired-end clean reads were obtained and assembled into 169,064 unigenes with an average length of 759 bp. Among these unigenes, 103,899 (61.5%) were annotated in seven public protein databases. Differential gene expression profiling analysis revealed that 4,609, 3,188, 1,679, 1,218, and 1,077 genes were predominantly expressed in root, leaf, stem, lateral bud, and rhizome bud, respectively. Their expression patterns were further classified into 12 distinct clusters. Pathway enrichment analysis revealed that genes predominantly expressed in rhizome bud were mainly involved in primary metabolism and hormone signaling and transduction pathways. Notably, 19 transcription factors (TFs) and 16 hormone signaling pathway-related genes were identified to be predominantly expressed in rhizome bud compared with the other tissues, suggesting putative roles in rhizome formation and development. In addition, a predictive regulatory network was constructed between four TFs and six auxin and abscisic acid (ABA)-related genes. Furthermore, the expression of 24 rhizome-specific genes was validated by quantitative real-time RT-PCR (qRT-PCR) analysis. Taken together, this study provides a global portrait of gene expression across five different tissues and reveals preliminary insights

  1. Integrative Analysis of Genomics and Transcriptome Data to Identify Potential Functional Genes of BMDs in Females.

    PubMed

    Chen, Yuan-Cheng; Guo, Yan-Fang; He, Hao; Lin, Xu; Wang, Xia-Fang; Zhou, Rou; Li, Wen-Ting; Pan, Dao-Yan; Shen, Jie; Deng, Hong-Wen

    2016-05-01

    Osteoporosis is known to be highly heritable. However, to date, the findings from more than 20 genome-wide association studies (GWASs) have explained less than 6% of genetic risks. Studies suggest that the missing heritability data may be because of joint effects among genes. To identify novel heritability for osteoporosis, we performed a system-level study on bone mineral density (BMD) by weighted gene coexpression network analysis (WGCNA), using the largest GWAS data set for BMD in the field, Genetic Factors for Osteoporosis Consortium (GEFOS-2), and a transcriptomic gene expression data set generated from transiliac bone biopsies in women. A weighted gene coexpression network was generated for 1574 genes with GWAS nominal evidence of association (p ≤ 0.05) based on dissimilarity measurement on the expression data. Twelve distinct gene modules were identified, and four modules showed nominally significant associations with BMD (p ≤ 0.05), but only one module, the yellow module, demonstrated a good correlation between module membership (MM) and gene significance (GS), suggesting that the yellow module serves an important biological role in bone regulation. Interestingly, through characterization of module content and topology, the yellow module was found to be significantly enriched with contractile fiber part (GO:0044449), which is widely recognized as having a close relationship between muscle and bone. Furthermore, detailed submodule analyses of important candidate genes (HOMER1, SPTBN1) by all edges within the yellow module implied significant enrichment of functional connections between bone and cytoskeletal protein binding. Our study yielded novel information from system genetics analyses of GWAS data jointly with transcriptomic data. The findings highlighted a module and several genes in the model as playing important roles in the regulation of bone mass in females, which may yield novel insights into the genetic basis of osteoporosis.

  2. Low-copy piggyBac transposon mutagenesis in mice identifies genes driving melanoma.

    PubMed

    Ni, Thomas K; Landrette, Sean F; Bjornson, Robert D; Bosenberg, Marcus W; Xu, Tian

    2013-09-17

    Despite considerable efforts to sequence hypermutated cancers such as melanoma, distinguishing cancer-driving genes from thousands of recurrently mutated genes remains a significant challenge. To circumvent the problematic background mutation rates and identify new melanoma driver genes, we carried out a low-copy piggyBac transposon mutagenesis screen in mice. We induced eleven melanomas with mutation burdens that were 100-fold lower relative to human melanomas. Thirty-eight implicated genes, including two known drivers of human melanoma, were classified into three groups based on high, low, or background-level mutation frequencies in human melanomas, and we further explored the functional significance of genes in each group. For two genes overlooked by prevailing discovery methods, we found that loss of membrane associated guanylate kinase, WW and PDZ domain containing 2 and protein tyrosine phosphatase, receptor type, O cooperated with the v-raf murine sarcoma viral oncogene homolog B (BRAF) recurrent V600E mutation to promote cellular transformation. Moreover, for infrequently mutated genes often disregarded by current methods, we discovered recurrent mitogen-activated protein kinase kinase kinase 1 (Map3k1)-activating insertions in our screen, mirroring recurrent MAP3K1 up-regulation in human melanomas. Aberrant expression of Map3k1 enabled growth factor-autonomous proliferation and drove BRAF-independent ERK signaling, thus shedding light on alternative means of activating this prominent signaling pathway in melanoma. In summary, our study contributes several previously undescribed genes involved in melanoma and establishes an important proof-of-principle for the utility of the low-copy transposon mutagenesis approach for identifying cancer-driving genes, especially those masked by hypermutation.

  3. Transcriptome Analysis Identifies Key Candidate Genes Mediating Purple Ovary Coloration in Asiatic Hybrid Lilies

    PubMed Central

    Xu, Leifeng; Yang, Panpan; Yuan, Suxia; Feng, Yayan; Xu, Hua; Cao, Yuwei; Ming, Jun

    2016-01-01

    Lily tepals have a short lifespan. Once the tepals senesce, the ornamental value of the flower is lost. Some cultivars have attractive purple ovaries and fruits which greatly enhance the ornamental value of Asiatic hybrid lilies. However, little is known about the molecular mechanisms of anthocyanin biosynthesis in Asiatic hybrid lily ovaries. To investigate the transcriptional network that governs purple ovary coloration in Asiatic hybrid lilies, we obtained transcriptome data from green ovaries (S1) and purple ovaries (S2) of Asiatic “Tiny Padhye”. Comparative transcriptome analysis revealed 4228 differentially expressed genes. Differential expression analysis revealed that ten unigenes including four CHS genes, one CHI gene, one F3H gene, one F3′H gene, one DFR gene, one UFGT gene, and one 3RT gene were significantly up-regulated in purple ovaries. One MYB gene, LhMYB12-Lat, was identified as a key transcription factor determining the distribution of anthocyanins in Asiatic hybrid lily ovaries. Further qPCR results showed unigenes related to anthocyanin biosynthesis were highly expressed in purple ovaries of three purple-ovaried Asiatic hybrid lilies at stages 2 and 3, while they showed an extremely low level of expression in ovaries of three green-ovaried Asiatic hybrid lilies during all developmental stages. In addition, shading treatment significantly decreased pigment accumulation by suppressing the expression of several unigenes related to anthocyanin biosynthesis in ovaries of Asiatic “Tiny Padhye”. Lastly, a total of 15,048 Simple Sequence Repeats (SSRs) were identified in 13,710 sequences, and primer pairs for SSRs were designed. The results could further our understanding of the molecular mechanisms of anthocyanin biosynthesis in Asiatic hybrid lily ovaries. PMID:27879624

  4. Identifying Significant Features in Cancer Methylation Data Using Gene Pathway Segmentation

    PubMed Central

    Hira, Zena M.; Gillies, Duncan F.

    2016-01-01

    In order to provide the most effective therapy for cancer, it is important to be able to diagnose whether a patient’s cancer will respond to a proposed treatment. Methylation profiling could contain information from which such predictions could be made. Currently, hypothesis testing is used to determine whether possible biomarkers for cancer progression produce statistically significant results. However, this approach requires the identification of individual genes, or sets of genes, as candidate hypotheses, and with the increasing size of modern microarrays, this task is becoming progressively harder. Exhaustive testing of small sets of genes is computationally infeasible, and so hypothesis generation depends either on the use of established biological knowledge or on heuristic methods. As an alternative, machine learning methods can be used to identify groups of genes that are acting together within sets of cancer data and associate their behaviors with cancer progression. These methods have the advantage of being multivariate and unbiased but unfortunately also rapidly become computationally infeasible as the number of gene probes and datasets increases. To address this problem, we have investigated a way of utilizing prior knowledge to segment microarray datasets in such a way that machine learning can be used to identify candidate sets of genes for hypothesis testing. A methylation dataset is divided into subsets, where each subset contains only the probes that relate to a known gene pathway. Each of these pathway subsets is used independently for classification. The classification method is AdaBoost with decision trees as weak classifiers. Since each pathway subset contains a relatively small number of gene probes, it is possible to train and test its classification accuracy quickly and determine whether it has valuable diagnostic information. Finally, genes from successful pathway subsets can be combined to create a classifier of high accuracy. PMID
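
    The pathway segmentation idea can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' code: it uses a small pure-Python AdaBoost with decision stumps (depth-1 trees) in place of whatever tree learner the study used, scores each pathway on a simple alternating hold-out split (an assumption; the abstract does not say how accuracy was tested), and all function, probe and pathway names are hypothetical.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """Depth-1 decision tree: +polarity if the probe value exceeds the threshold."""
    return polarity if row[feature] > threshold else -polarity


def best_stump(X, y, w):
    """Return (weighted error, feature, threshold, polarity) of the best stump."""
    best = (float("inf"), 0, 0.0, 1)
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, thr, pol) != yi)
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best


def adaboost_fit(X, y, rounds=5):
    """Train AdaBoost with decision stumps; labels in y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, thr, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, pol))
        # Up-weight misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, thr, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def adaboost_predict(model, row):
    score = sum(a * stump_predict(row, j, thr, pol) for a, j, thr, pol in model)
    return 1 if score >= 0 else -1


def score_pathways(data, labels, probe_to_pathway, rounds=5):
    """Classify with each pathway's probes alone; return hold-out accuracy per pathway.

    `data` maps probe name -> list of methylation values, one per sample.
    """
    by_pathway = {}
    for probe, pathway in probe_to_pathway.items():
        by_pathway.setdefault(pathway, []).append(probe)
    n = len(labels)
    train, test = range(0, n, 2), range(1, n, 2)  # assumed hold-out scheme
    scores = {}
    for pathway, probes in by_pathway.items():
        rows = [[data[p][i] for p in probes] for i in range(n)]
        model = adaboost_fit([rows[i] for i in train],
                             [labels[i] for i in train], rounds)
        hits = sum(adaboost_predict(model, rows[i]) == labels[i] for i in test)
        scores[pathway] = hits / len(test)
    return scores
```

    Because each pathway subset holds only a few probes, each fit stays cheap, which is the practical point the abstract makes; pathways with high hold-out accuracy would then be the candidates to combine into a final classifier.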

  5. A Cross-Disorder Method to Identify Novel Candidate Genes for Developmental Brain Disorders

    PubMed Central

    Gonzalez-Mantilla, Andrea J.; Moreno-De-Luca, Andres; Ledbetter, David H.; Martin, Christa Lese

    2017-01-01

    IMPORTANCE Developmental brain disorders are a group of clinically and genetically heterogeneous disorders characterized by high heritability. Specific highly penetrant genetic causes can often be shared by a subset of individuals with different phenotypic features, and recent advances in genome sequencing have allowed the rapid and cost-effective identification of many of these pathogenic variants. OBJECTIVES To identify novel candidate genes for developmental brain disorders and provide additional evidence of previously implicated genes. DATA SOURCES The PubMed database was searched for studies published from March 28, 2003, through May 7, 2015, with large cohorts of individuals with developmental brain disorders. DATA EXTRACTION AND SYNTHESIS A tiered, multilevel data-integration approach was used, which intersects (1) whole-genome data from structural and sequence pathogenic loss-of-function (pLOF) variants, (2) phenotype data from 6 apparently distinct disorders (intellectual disability, autism, attention-deficit/hyperactivity disorder, schizophrenia, bipolar disorder, and epilepsy), and (3) additional data from large-scale studies, smaller cohorts, and case reports focusing on specific candidate genes. All candidate genes were ranked into 4 tiers based on the strength of evidence as follows: tier 1, genes with 3 or more de novo pathogenic loss-of-function variants; tier 2, genes with 2 de novo pathogenic loss-of-function variants; tier 3, genes with 1 de novo pathogenic loss-of-function variant; and tier 4, genes with only inherited (or unknown inheritance) pathogenic loss-of-function variants. MAIN OUTCOMES AND MEASURES Development of a comprehensive knowledge base of candidate genes related to developmental brain disorders. Genes were prioritized based on the inheritance pattern and total number of pathogenic loss-of-function variants identified amongst unrelated individuals with any one of six developmental brain disorders. STUDY SELECTION A combination of
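
    The four-tier ranking stated in this abstract can be written as a small rule. This is a minimal sketch of the tier definitions as quoted above; the function name, argument names and the handling of genes with no pLOF variants at all (returning None) are assumptions made for illustration.

```python
def assign_tier(de_novo_plof, inherited_or_unknown_plof=0):
    """Map pLOF variant counts for one gene to the evidence tier in the abstract.

    de_novo_plof: number of de novo pathogenic loss-of-function variants.
    inherited_or_unknown_plof: number of pLOF variants that are inherited
        or of unknown inheritance.
    Returns 1-4, or None when the gene has no pLOF variants (assumed behavior).
    """
    if de_novo_plof >= 3:
        return 1  # tier 1: 3 or more de novo pLOF variants
    if de_novo_plof == 2:
        return 2  # tier 2: exactly 2 de novo pLOF variants
    if de_novo_plof == 1:
        return 3  # tier 3: exactly 1 de novo pLOF variant
    if inherited_or_unknown_plof > 0:
        return 4  # tier 4: only inherited or unknown-inheritance pLOF variants
    return None
```

    Note that the de novo count alone decides tiers 1 to 3, so a gene with one de novo variant and several inherited variants is still tier 3 under this reading.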

  6. Use of eQTL Analysis for the Discovery of Target Genes Identified by GWAS

    DTIC Science & Technology

    2012-04-01

    prostate tissue-specific expression quantitative trait loci (eQTL) dataset; and 2) utilize this dataset to identify candidate genes for existing...set of 500 samples of normal prostate tissue sampled from men with PC. To date, we have pre-screened normal prostate tissue with the use of H&E...stained sections from 4000 men having a radical prostatectomy in order to identify those cases meeting our strict selection criteria for further

  7. Gene Expression Profile of High IFN-γ Producers Stimulated with Leishmania braziliensis Identifies Genes Associated with Cutaneous Leishmaniasis

    PubMed Central

    Carneiro, Marcia W.; Fukutani, Kiyoshi F.; Andrade, Bruno B.; Curvelo, Rebecca P.; Cristal, Juqueline R.; Carvalho, Augusto M.; Barral, Aldina

    2016-01-01

    Background The initial response to Leishmania parasites is essential in determining disease development or resistance. In vitro, a divergent response to Leishmania, characterized by high or low IFN-γ production, has been described as a potential tool to predict both vaccine response and disease susceptibility in vivo. Methods and findings We identified uninfected and healthy individuals that were shown to be either high or low IFN-γ producers (HPs and LPs, respectively) following stimulation of peripheral blood cells with Leishmania braziliensis. Following stimulation, RNA was processed for gene expression analysis using immune gene arrays. Both HPs and LPs were shown to upregulate the expression of CXCL10, IFI27, IL6 and LTA. Genes expressed in HPs only (CCL7, IL8, IFI44L and IL1B) were associated with pathways related to IL17 and TREM 1 signaling. In LPs, uniquely expressed genes (for example IL9, IFI44, IFIT1 and IL2RA) were associated with pathways related to pattern recognition receptors and interferon signaling. We then investigated whether the unique gene expression profiles described here could be recapitulated in vivo, in individuals with active Cutaneous Leishmaniasis or with subclinical infection. Indeed, using a set of six genes (TLR2, JAK2, IFI27, IFIT1, IRF1 and IL6) modulated in HPs and LPs, we could successfully discriminate these two clinical groups. Finally, we demonstrate that these six genes are significantly overexpressed in CL lesions. Conclusion Upon interrogation of the peripheral response of naive individuals with diverging IFN-γ production to L. braziliensis, we identified differences in the innate response to the parasite that are recapitulated in vivo and that discriminate CL patients from individuals presenting a subclinical infection. PMID:27870860

  8. Genetic‐Genomic Replication to Identify Candidate Mouse Atherosclerosis Modifier Genes

    PubMed Central

    Hsu, Jeffrey; Smith, Jonathan D.

    2013-01-01

    Objective Genetics plays a large role in atherosclerosis susceptibility in humans and mice. We attempted to confirm previously determined mouse atherosclerosis‐associated loci and use bioinformatics and transcriptomics to create a catalog of candidate atherosclerosis modifier genes at these loci. Methods and Results A strain intercross was performed between AKR and DBA/2 mice on the apoE−/− background generating 166 F2 progeny. Using the phenotype log10 of the aortic root lesion area, we identified 3 suggestive atherosclerosis quantitative trait loci (Ath QTLs). When combined with our prior strain intercross, we confirmed 3 significant Ath QTLs on chromosomes 2, 15, and 17, with combined logarithm of odds scores of 5.9, 5.3, and 5.6, respectively, which each met the genome‐wide 5% false discovery rate threshold. We identified all of the protein coding differences between these 2 mouse strains within the Ath QTL intervals. Microarray gene expression profiling was performed on macrophages and endothelial cells from this intercross to identify expression QTLs (eQTLs), the loci that are associated with variation in the expression levels of specific transcripts. Cross tissue eQTLs and macrophage eQTLs that replicated from a prior strain intercross were identified. These bioinformatic and eQTL analyses produced a comprehensive list of candidate genes that may be responsible for the Ath QTLs. Conclusions Replication studies for clinical traits as well as gene expression traits are worthwhile in identifying true versus false genetic associations. We have replicated 3 loci on mouse chromosomes 2, 15, and 17 that are associated with atherosclerosis. We have also identified protein coding differences and multiple replicated eQTLs, which may be useful in the identification of atherosclerosis modifier genes. PMID:23525445

  9. Convergence of Mutation and Epigenetic Alterations Identifies Common Genes in Cancer That Predict for Poor Prognosis

    PubMed Central

    Chan, Timothy A; Glockner, Sabine; Yi, Joo Mi; Chen, Wei; Van Neste, Leander; Cope, Leslie; Herman, James G; Velculescu, Victor; Schuebel, Kornel E; Ahuja, Nita; Baylin, Stephen B

    2008-01-01

    Background The identification and characterization of tumor suppressor genes has enhanced our understanding of the biology of cancer and enabled the development of new diagnostic and therapeutic modalities. Whereas in past decades, a handful of tumor suppressors have been slowly identified using techniques such as linkage analysis, large-scale sequencing of the cancer genome has enabled the rapid identification of a large number of genes that are mutated in cancer. However, determining which of these many genes play key roles in cancer development has proven challenging. Specifically, recent sequencing of human breast and colon cancers has revealed a large number of somatic gene mutations, but virtually all are heterozygous, occur at low frequency, and are tumor-type specific. We hypothesize that key tumor suppressor genes in cancer may be subject to mutation or hypermethylation. Methods and Findings Here, we show that combined genetic and epigenetic analysis of these genes reveals many with a higher putative tumor suppressor status than would otherwise be appreciated. At least 36 of the 189 genes newly recognized to be mutated are targets of promoter CpG island hypermethylation, often in both colon and breast cancer cell lines. Analyses of primary tumors show that 18 of these genes are hypermethylated strictly in primary cancers and often with an incidence that is much higher than for the mutations and which is not restricted to a single tumor-type. In the identical breast cancer cell lines in which the mutations were identified, hypermethylation is usually, but not always, mutually exclusive from genetic changes for a given tumor, and there is a high incidence of concomitant loss of expression. Sixteen out of 18 (89%) of these genes map to loci deleted in human cancers. Lastly, and most importantly, the reduced expression of a subset of these genes strongly correlates with poor clinical outcome. Conclusions Using an unbiased genome-wide approach, our analysis has

  10. Expression profiling identifies novel Hh/Gli-regulated genes in developing zebrafish embryos.

    PubMed

    Bergeron, Sadie A; Milla, Luis A; Villegas, Rosario; Shen, Meng-Chieh; Burgess, Shawn M; Allende, Miguel L; Karlstrom, Rolf O; Palma, Verónica

    2008-02-01

    The Hedgehog (Hh) signaling pathway plays critical instructional roles during embryonic development. Misregulation of Hh/Gli signaling is a major causative factor in human congenital disorders and in a variety of cancers. The zebrafish is a powerful genetic model for the study of Hh signaling during embryogenesis, as a large number of mutants that affect different components of the Hh/Gli signaling system have been identified. By performing global profiling of gene expression in different Hh/Gli gain- and loss-of-function scenarios we identified known (e.g., ptc1 and nkx2.2a) and novel Hh-regulated genes that are differentially expressed in embryos with altered Hh/Gli signaling function. By uncovering changes in tissue-specific gene expression, we revealed new embryological processes that are influenced by Hh signaling. We thus provide a comprehensive survey of Hh/Gli-regulated genes during embryogenesis and we identify new Hh-regulated genes that may be targets of misregulation during tumorigenesis.

  11. A misexpression screen identifies genes that can modulate RAS1 pathway signaling in Drosophila melanogaster.

    PubMed Central

    Huang, A M; Rubin, G M

    2000-01-01

    Differentiation of the R7 photoreceptor cell is dependent on the Sevenless receptor tyrosine kinase, which activates the RAS1/mitogen-activated protein kinase signaling cascade. Kinase suppressor of Ras (KSR) functions genetically downstream of RAS1 in this signal transduction cascade. Expression of dominant-negative KSR (KDN) in the developing eye blocks RAS pathway signaling, prevents R7 cell differentiation, and causes a rough eye phenotype. To identify genes that modulate RAS signaling, we screened for genes that alter RAS1/KSR signaling efficiency when misexpressed. In this screen, we recovered three known genes, Lk6, misshapen, and Akap200. We also identified seven previously undescribed genes; one encodes a novel rel domain member of the NFAT family, and six encode novel proteins. These genes may represent new components of the RAS pathway or components of other signaling pathways that can modulate signaling by RAS. We discuss the utility of gain-of-function screens in identifying new components of signaling pathways in Drosophila. PMID:11063696

  12. Integrated genomic analyses identify ERRFI1 and TACC3 as glioblastoma-targeted genes

    PubMed Central

    Payne, Cathy A.; Lampson, Benjamin; Chen, William C.; Liu, Jeff; Solomon, David; Waldman, Todd; Towers, Aaron J.; Gregory, Simon G.; McDonald, Kerrie L.; McLendon, Roger E.; Bigner, Darell D.; Yan, Hai

    2010-01-01

    The glioblastoma genome displays remarkable chromosomal aberrations, which harbor critical glioblastoma-specific genes contributing to several oncogenetic pathways. To identify glioblastoma-targeted genes, we completed a multifaceted genome-wide analysis to characterize the most significant aberrations of DNA content occurring in glioblastomas. We performed copy number analysis of 111 glioblastomas by Digital Karyotyping and Illumina BeadChip assays and validated our findings using data from the TCGA (The Cancer Genome Atlas) glioblastoma project. From this study, we identified recurrent focal copy number alterations in 1p36.23 and 4p16.3. Expression analyses of genes located in the two regions revealed genes which are dysregulated in glioblastomas. Specifically, we identify EGFR negative regulator, ERRFI1, within the minimal region of deletion in 1p36.23. In glioblastoma cells with a focal deletion of the ERRFI1 locus, restoration of ERRFI1 expression slowed cell migration. Furthermore, we demonstrate that TACC3, an Aurora-A kinase substrate, on 4p16.3, displays gain of copy number, is overexpressed in a glioma-grade-specific pattern, and correlates with Aurora kinase overexpression in glioblastomas. Our multifaceted genomic evaluation of glioblastoma establishes ERRFI1 as a potential candidate tumor suppressor gene and TACC3 as a potential oncogene, and provides insight on targets for oncogenic pathway-based therapy. PMID:21113414

  13. Expression profiling identifies novel Hh/Gli regulated genes in developing zebrafish embryos.

    PubMed Central

    Bergeron, Sadie A.; Milla, Luis A.; Villegas, Rosario; Shen, Meng-Chieh; Burgess, Shawn M.; Allende, Miguel L.; Karlstrom, Rolf O.; Palma, Verónica

    2008-01-01

    The Hedgehog (Hh) signaling pathway plays critical instructional roles during embryonic development. Mis-regulation of Hh/Gli signaling is a major causative factor in human congenital disorders and in a variety of cancers. The zebrafish is a powerful genetic model for the study of Hh signaling during embryogenesis, as a large number of mutants have been identified affecting different components of the Hh/Gli signaling system. By performing global profiling of gene expression in different Hh/Gli gain- and loss-of-function scenarios we identified several known (e.g. ptc1 and nkx2.2a) as well as a large number of novel Hh regulated genes that are differentially expressed in embryos with altered Hh/Gli signaling function. By uncovering changes in tissue specific gene expression, we revealed new embryological processes that are influenced by Hh signaling. We thus provide a comprehensive survey of Hh/Gli regulated genes during embryogenesis and we identify new Hh-regulated genes that may be targets of mis-regulation during tumorigenesis. PMID:18055165

  14. Identifying Context-Specific Transcription Factor Targets from Prior Knowledge and Gene Expression Data

    PubMed Central

    Fertig, Elana J; Favorov, Alexander V; Ochs, Michael F

    2013-01-01

    Numerous methodologies, assays, and databases presently provide candidate targets of transcription factors (TFs). However, TFs rarely regulate their targets universally. The context of activation of a TF can change the transcriptional response of targets. Direct multiple regulation typical to mammalian genes complicates direct inference of TF targets from gene expression data. We present a novel statistic that infers context-specific TF regulation based upon the CoGAPS algorithm, which infers overlapping gene expression patterns resulting from coregulation. Numerical experiments with simulated data showed that this statistic correctly inferred targets that are common to multiple TFs, except in cases where the signal from a TF is negligible relative to noise level and signal from other TFs. The statistic is robust to moderate levels of error in the simulated gene sets, identifying fewer false positives than false negatives. Significantly, the regulatory statistic refines the number of TF targets relevant to cell signaling in gastrointestinal stromal tumors (GIST) to genes consistent with the phosphorylation patterns of TFs identified in previous studies. As formulated, the proposed regulatory statistic has wide applicability to inferring set membership in integrated datasets. This statistic could be naturally extended to account for prior probabilities of set membership or to add candidate gene targets. PMID:23694699

  15. Identifying context-specific transcription factor targets from prior knowledge and gene expression data.

    PubMed

    Fertig, Elana J; Favorov, Alexander V; Ochs, Michael F

    2013-09-01

    Numerous methodologies, assays, and databases presently provide candidate targets of transcription factors (TFs). However, TFs rarely regulate their targets universally. The context of activation of a TF can change the transcriptional response of targets. Direct multiple regulation typical to mammalian genes complicates direct inference of TF targets from gene expression data. We present a novel statistic that infers context-specific TF regulation based upon the CoGAPS algorithm, which infers overlapping gene expression patterns resulting from coregulation. Numerical experiments with simulated data showed that this statistic correctly inferred targets that are common to multiple TFs, except in cases where the signal from a TF is negligible relative to noise level and signal from other TFs. The statistic is robust to moderate levels of error in the simulated gene sets, identifying fewer false positives than false negatives. Significantly, the regulatory statistic refines the number of TF targets relevant to cell signaling in gastrointestinal stromal tumors (GIST) to genes consistent with the phosphorylation patterns of TFs identified in previous studies. As formulated, the proposed regulatory statistic has wide applicability to inferring set membership in integrated datasets. This statistic could be naturally extended to account for prior probabilities of set membership or to add candidate gene targets.

  16. Analysis of Pigeon (Columba) Ovary Transcriptomes to Identify Genes Involved in Blue Light Regulation.

    PubMed

    Wang, Ying; Ding, Jia-Tong; Yang, Hai-Ming; Yan, Zheng-Jie; Cao, Wei; Li, Yang-Bai

    2015-01-01

    Monochromatic light is widely applied to promote poultry reproductive performance, yet little is currently known regarding the mechanism by which light wavelengths affect pigeon reproduction. Recently, high-throughput sequencing technologies have been used to provide genomic information for solving this problem. In this study, we employed Illumina Hiseq 2000 to identify differentially expressed genes in ovary tissue from pigeons under blue and white light conditions and de novo transcriptome assembly to construct a comprehensive sequence database containing information on the mechanisms of follicle development. A total of 157,774 unigenes (mean length: 790 bp) were obtained by the Trinity program, and 35.83% of these unigenes were matched to genes in a non-redundant protein database. Gene description, gene ontology, and the clustering of orthologous group terms were performed to annotate the transcriptome assembly. Differentially expressed genes between blue and white light conditions included those related to oocyte maturation, hormone biosynthesis, and circadian rhythm. Furthermore, 17,574 SSRs and 533,887 potential SNPs were identified in this transcriptome assembly. This work is the first transcriptome analysis of the Columba ovary using Illumina technology, and the resulting transcriptome and differentially expressed gene data can facilitate further investigations into the molecular mechanism of the effect of blue light on follicle development and reproduction in pigeons and other bird species.

  17. Methods for simultaneously identifying coherent local clusters with smooth global patterns in gene expression profiles.

    PubMed

    Tien, Yin-Jing; Lee, Yun-Shien; Wu, Han-Ming; Chen, Chun-Houh

    2008-03-20

    The hierarchical clustering tree (HCT) with a dendrogram [1] and the singular value decomposition (SVD) with a dimension-reduced representative map [2] are popular methods for two-way sorting the gene-by-array matrix map employed in gene expression profiling. While HCT dendrograms tend to optimize local coherent clustering patterns, SVD leading eigenvectors usually identify better global grouping and transitional structures. This study proposes a flipping mechanism for a conventional agglomerative HCT using a rank-two ellipse (R2E, an improved SVD algorithm for sorting purposes) seriation by Chen [3] as an external reference. While HCTs always produce permutations with good local behaviour, the rank-two ellipse seriation gives the best global grouping patterns and smooth transitional trends. The resulting algorithm automatically integrates the desirable properties of each method so that users have access to a clustering and visualization environment for gene expression profiles that preserves coherent local clusters and identifies global grouping trends. We demonstrate, through four examples, that the proposed method not only possesses better numerical and statistical properties, but also provides more meaningful biomedical insights than other sorting algorithms. We suggest that sorted proximity matrices for genes and arrays, in addition to the gene-by-array expression matrix, can greatly aid in the search for comprehensive understanding of gene expression structures. Software for the proposed methods can be obtained at http://gap.stat.sinica.edu.tw/Software/GAP.

  18. A fast and high performance multiple data integration algorithm for identifying human disease genes

    PubMed Central

    2015-01-01

    Background: Integrating multiple data sources is indispensable for improving disease gene identification, both because disease genes associated with similar genetic diseases tend to lie close to each other in various biological networks and because gene-disease associations are complex. Although various algorithms have been proposed to identify disease genes, their prediction performance and computational time still need to be improved. Results: In this study, we propose a fast and high performance multiple data integration algorithm for identifying human disease genes. A posterior probability of each candidate gene being associated with individual diseases is calculated by using a Bayesian analysis method and a binary logistic regression model. Two prior probability estimation strategies and two feature vector construction methods are developed to test the performance of the proposed algorithm. Conclusions: The proposed algorithm not only generates predictions with high AUC scores, but also runs very fast. When only a single PPI network is employed, the AUC score is 0.769 by using F2 as feature vectors. The average running time for each leave-one-out experiment is only around 1.5 seconds. When three biological networks are integrated, the AUC score using F3 as feature vectors increases to 0.830, and the average running time for each leave-one-out experiment is only about 12.54 seconds. It is better than many existing algorithms. PMID:26399620
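    The AUC scores quoted in this record summarize how well a ranking of candidate genes separates known disease genes from the rest. As a minimal sketch of how such a score can be computed, the snippet below uses the standard pairwise (Mann-Whitney) definition of AUC. It is not the authors' evaluation code; the function name `auc_score` and the six example scores are hypothetical and chosen only for illustration.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: 1 for a known disease gene, 0 otherwise.
    scores: predicted scores, higher meaning "more likely a disease gene".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six candidate genes (illustrative values only).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = auc_score(labels, scores)
print(f"AUC = {auc:.3f}")
```

    An AUC of 0.5 corresponds to a random ranking and 1.0 to a perfect one, which is the scale on which the reported 0.769 and 0.830 should be read.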

  19. Integrated genomic analyses identify frequent gene fusion events and VHL inactivation in gastrointestinal stromal tumors

    PubMed Central

    Sun, Choong-Hyun; Park, Inho; Lee, Seungmook; Kwon, Jekeun; Do, Ingu; Hong, Min Eui; Van Vrancken, Michael; Lee, Jeeyun; Park, Joon Oh; Cho, Jeonghee; Kim, Kyoung-Mee; Sohn, Tae Sung

    2016-01-01

    Gastrointestinal stromal tumors (GISTs) are the most common mesenchymal tumors of the gastrointestinal tract. We sequenced nine exomes and transcriptomes, and two genomes of GISTs for integrated analyses. We detected 306 somatic variants in nine GISTs and recurrent protein-altering mutations in 29 genes. Transcriptome sequencing revealed 328 gene fusions, and the most frequently involved fusion events were associated with IGF2 fused to several partner genes including CCND1, FUS, and LASP1. We additionally identified three recurrent read-through fusion transcripts: POLA2-CDC42EP2, C8orf42-FBXO25, and STX16-NPEPL1. Notably, we found intragenic deletions in one of three exons of the VHL gene and increased mRNAs of VEGF, PDGF-β, and IGF-1/2 in 56% of GISTs, suggesting a mechanistic link between VHL inactivation and overexpression of hypoxia-inducible factor target genes in the absence of hypoxia. We also identified copy number gain and increased mRNA expression of AMACR, CRIM1, SKP2, and CACNA1E. Mapping of copy number and gene expression results to the KEGG pathways revealed activation of the JAK-STAT pathway in small intestinal GISTs and the MAPK pathway in wild-type GISTs. These observations will allow us to determine the genetic basis of GISTs and will facilitate further investigation to develop new therapeutic options. PMID:25987131

  20. Integrated genomic analyses identify frequent gene fusion events and VHL inactivation in gastrointestinal stromal tumors.

    PubMed

    Kang, Guhyun; Yun, Hongseok; Sun, Choong-Hyun; Park, Inho; Lee, Seungmook; Kwon, Jekeun; Do, Ingu; Hong, Min Eui; Van Vrancken, Michael; Lee, Jeeyun; Park, Joon Oh; Cho, Jeonghee; Kim, Kyoung-Mee; Sohn, Tae Sung

    2016-02-09

    Gastrointestinal stromal tumors (GISTs) are the most common mesenchymal tumors of the gastrointestinal tract. We sequenced nine exomes and transcriptomes, and two genomes of GISTs for integrated analyses. We detected 306 somatic variants in nine GISTs and recurrent protein-altering mutations in 29 genes. Transcriptome sequencing revealed 328 gene fusions, and the most frequently involved fusion events were associated with IGF2 fused to several partner genes including CCND1, FUS, and LASP1. We additionally identified three recurrent read-through fusion transcripts: POLA2-CDC42EP2, C8orf42-FBXO25, and STX16-NPEPL1. Notably, we found intragenic deletions in one of three exons of the VHL gene and increased mRNAs of VEGF, PDGF-β, and IGF-1/2 in 56% of GISTs, suggesting a mechanistic link between VHL inactivation and overexpression of hypoxia-inducible factor target genes in the absence of hypoxia. We also identified copy number gain and increased mRNA expression of AMACR, CRIM1, SKP2, and CACNA1E. Mapping of copy number and gene expression results to the KEGG pathways revealed activation of the JAK-STAT pathway in small intestinal GISTs and the MAPK pathway in wild-type GISTs. These observations will allow us to determine the genetic basis of GISTs and will facilitate further investigation to develop new therapeutic options.

  1. Genome-Wide Overexpression Screen Identifies Genes Able to Bypass p16-Mediated Senescence in Melanoma.

    PubMed

    Lee, Won Jae; Škalamera, Dubravka; Dahmer-Heath, Mareike; Shakhbazov, Konstanin; Ranall, Max V; Fox, Carly; Lambie, Duncan; Stevenson, Alexander J; Yaswen, Paul; Gonda, Thomas J; Gabrielli, Brian

    2017-03-01

    Malignant melanomas often arise from nevi, which result from initial oncogene-induced hyperproliferation of melanocytes that are maintained in a CDKN2A/p16-mediated senescent state. Thus, genes that can bypass this senescence barrier are likely to contribute to melanoma development. We have performed a gain-of-function screen of 17,030 lentivirally expressed human open reading frames (ORFs) in a melanoma cell line containing an inducible p16 construct to identify such genes. Genes known to bypass p16-induced senescence arrest, including the human papilloma virus 18 E7 gene (HPV18E7), and genes such as the p16-binding CDK6 with expected functions, as well as a panel of novel genes, were identified, including high-mobility group box (HMGB) proteins. A number of these were further validated in two other models of p16-induced senescence. Tissue immunohistochemistry demonstrated higher levels of CDK6 in primary melanomas compared with normal skin and nevi. Reduction of CDK6 levels drove melanoma cells expressing functional p16 into senescence, demonstrating its contribution to bypass senescence.

  2. Challenges in identifying cancer genes by analysis of exome sequencing data

    PubMed Central

    Hofree, Matan; Carter, Hannah; Kreisberg, Jason F.; Bandyopadhyay, Sourav; Mischel, Paul S.; Friend, Stephen; Ideker, Trey

    2016-01-01

    Massively parallel sequencing has permitted an unprecedented examination of the cancer exome, leading to predictions that all genes important to cancer will soon be identified by genetic analysis of tumours. To examine this potential, here we evaluate the ability of state-of-the-art sequence analysis methods to specifically recover known cancer genes. While some cancer genes are identified by analysis of recurrence, spatial clustering or predicted impact of somatic mutations, many remain undetected due to lack of power to discriminate driver mutations from the background mutational load (13–60% recall of cancer genes impacted by somatic single-nucleotide variants, depending on the method). Cancer genes not detected by mutation recurrence also tend to be missed by all types of exome analysis. Nonetheless, these genes are implicated by other experiments such as functional genetic screens and expression profiling. These challenges are only partially addressed by increasing sample size and will likely hold even as greater numbers of tumours are analysed. PMID:27417679

  3. Evolutionary analysis of vision genes identifies potential drivers of visual differences between giraffe and okapi

    PubMed Central

    Agaba, Morris; Cavener, Douglas R.

    2017-01-01

    Background: The capacity of visually oriented species to perceive and respond to visual signals is integral to their evolutionary success. Giraffes are closely related to okapi, but the two species have a broad range of phenotypic differences including their visual capacities. Vision studies rank giraffe’s visual acuity higher than all other artiodactyls despite sharing similar vision ecological determinants with many of them. The extent to which the giraffe’s unique visual capacity and its difference with okapi is reflected by changes in their vision genes is not understood. Methods: The recent availability of giraffe and okapi genomes provided an opportunity to identify giraffe and okapi vision genes. Multiple strategies were employed to identify thirty-six candidate mammalian vision genes in giraffe and okapi genomes. Quantification of selection pressure was performed by a combination of branch-site tests of positive selection and clade models of selection divergence through comparing giraffe and okapi vision genes and orthologous sequences from other mammals. Results: Signatures of selection were identified in key genes that could potentially underlie giraffe and okapi visual adaptations. Importantly, some genes that contribute to optical transparency of the eye and those that are critical in the light signaling pathway were found to show signatures of adaptive evolution or selection divergence. Comparison between giraffe and other ruminants identifies significant selection divergence in CRYAA and OPN1LW. Significant selection divergence was identified in SAG while positive selection was detected in LUM when okapi is compared with ruminants and other mammals. Sequence analysis of OPN1LW showed that at least one of the sites known to affect spectral sensitivity of the red pigment is uniquely divergent between giraffe and other ruminants. Discussion: By taking a systemic approach to gene function in vision, the results provide the first molecular clues associated with

  4. Evolutionary analysis of vision genes identifies potential drivers of visual differences between giraffe and okapi.

    PubMed

    Ishengoma, Edson; Agaba, Morris; Cavener, Douglas R

    2017-01-01

    The capacity of visually oriented species to perceive and respond to visual signals is integral to their evolutionary success. Giraffes are closely related to okapi, but the two species have a broad range of phenotypic differences including their visual capacities. Vision studies rank giraffe's visual acuity higher than all other artiodactyls despite sharing similar vision ecological determinants with many of them. The extent to which the giraffe's unique visual capacity and its difference with okapi is reflected by changes in their vision genes is not understood. The recent availability of giraffe and okapi genomes provided an opportunity to identify giraffe and okapi vision genes. Multiple strategies were employed to identify thirty-six candidate mammalian vision genes in giraffe and okapi genomes. Quantification of selection pressure was performed by a combination of branch-site tests of positive selection and clade models of selection divergence through comparing giraffe and okapi vision genes and orthologous sequences from other mammals. Signatures of selection were identified in key genes that could potentially underlie giraffe and okapi visual adaptations. Importantly, some genes that contribute to optical transparency of the eye and those that are critical in the light signaling pathway were found to show signatures of adaptive evolution or selection divergence. Comparison between giraffe and other ruminants identifies significant selection divergence in CRYAA and OPN1LW. Significant selection divergence was identified in SAG while positive selection was detected in LUM when okapi is compared with ruminants and other mammals. Sequence analysis of OPN1LW showed that at least one of the sites known to affect spectral sensitivity of the red pigment is uniquely divergent between giraffe and other ruminants. By taking a systemic approach to gene function in vision, the results provide the first molecular clues associated with giraffe and okapi vision adaptations. At

  5. Salmonid microarrays identify intestinal genes that reliably monitor P deficiency in rainbow trout aquaculture.

    PubMed

    Kirchner, S; McDaniel, N K; Sugiura, S H; Soteropoulos, P; Tian, B; Fletcher, J W; Ferraris, R P

    2007-08-01

    Nutrient-responsive genes can identify important metabolic pathways and evaluate optimal dietary levels. Using a 16K Salmo salar microarray, we identified in rainbow trout (Oncorhynchus mykiss) 21 potential phosphorus (P)-responsive genes, mainly involved in immune response, proteolysis or transport, whose expression levels changed in the intestine after 5 days of feeding a low-P (LP) diet. Diet-induced changes in the expression levels of several genes in each fish were tightly correlated with changes in serum P, and the changes persisted for an additional 15 days after dietary P deficiency. We then evaluated these and previously identified P-responsive genes under simulated farm conditions, and monitored the intestinal gene expression from 6 h to 7 days after the trout were switched from a sufficient-P (SP) diet to a LP diet (SP→LP), and from a LP diet to a SP diet (LP→SP). After 7 days, mean serum P decreased 0.14 mM/day for SP→LP and increased 0.10 mM/day for LP→SP. The mRNA abundance of the metalloendopeptidase meprin 1alpha (MEP1alpha), the Na(+)-dependent phosphate co-transporter (NaPi2b, SLC34A2), the sulfotransferase SULT2beta1 and carbonic anhydrase XIII genes all increased after SP→LP and decreased after LP→SP, suggesting that adaptive expression is reversible and correlated with dietary P. The duration of change in gene expression in response to SP→LP was generally shorter than that of LP→SP, suggesting potentially different mechanisms of adaptation to deficiency as opposed to excess. Diet-induced changes in mRNA abundance of other genes were either transient or modest. We identified, by heterologous microarray hybridization, new genes sensitive to perturbations in dietary P, and then showed that these genes can reliably monitor P deficiency under field conditions. Simultaneous changes in the expression of these P biomarkers could predict either P deficiency (to prevent economic losses to the farmers) or P excess (to prevent inadvertent

  6. Joint QTL mapping and gene expression analysis identify positional candidate genes influencing pork quality traits

    PubMed Central

    González-Prendes, Rayner; Quintanilla, Raquel; Cánovas, Angela; Manunza, Arianna; Figueiredo Cardoso, Tainã; Jordana, Jordi; Noguera, José Luis; Pena, Ramona N.; Amills, Marcel

    2017-01-01

    Meat quality traits have an increasing importance in the pig industry because of their strong impact on consumer acceptance. Herewith, we have combined phenotypic and microarray expression data to map loci with potential effects on five meat quality traits recorded in the longissimus dorsi (LD) and gluteus medius (GM) muscles of 350 Duroc pigs, i.e. pH at 24 hours post-mortem (pH24), electric conductivity (CE) and muscle redness (a*), lightness (L*) and yellowness (b*). We have found significant genome-wide associations for CE of LD on SSC4 (~104 Mb), SSC5 (~15 Mb) and SSC13 (~137 Mb), while several additional regions were significantly associated with meat quality traits at the chromosome-wide level. There was a low positional concordance between the associations found for LD and GM traits, a feature that reflects the existence of differences in the genetic determinism of meat quality phenotypes in these two muscles. The performance of an eQTL search for SNPs mapping to the regions associated with meat quality traits demonstrated that the GM a* SSC3 and pH24 SSC17 QTL display positional concordance with cis-eQTL regulating the expression of several genes with a potential role in muscle metabolism. PMID:28054563

  7. Global Gene-Expression Analysis to Identify Differentially Expressed Genes Critical for the Heat Stress Response in Brassica rapa.

    PubMed

    Dong, Xiangshu; Yi, Hankuil; Lee, Jeongyeo; Nou, Ill-Sup; Han, Ching-Tack; Hur, Yoonkang

    2015-01-01

    Genome-wide dissection of the heat stress response (HSR) is necessary to overcome problems in crop production caused by global warming. To identify HSR genes, we profiled gene expression in two Chinese cabbage inbred lines with different thermotolerances, Chiifu and Kenshin. Many genes exhibited >2-fold changes in expression upon exposure to 0.5–4 h at 45°C (high temperature, HT): 5.2% (2,142 genes) in Chiifu and 3.7% (1,535 genes) in Kenshin. The most enriched GO (Gene Ontology) items included 'response to heat', 'response to reactive oxygen species (ROS)', 'response to temperature stimulus', 'response to abiotic stimulus', and 'MAPKKK cascade'. In both lines, the genes most highly induced by HT encoded small heat shock proteins (Hsps) and heat shock factor (Hsf)-like proteins such as HsfB2A (Bra029292), whereas high-molecular weight Hsps were constitutively expressed. Other upstream HSR components were also up-regulated: ROS-scavenging genes like glutathione peroxidase 2 (BrGPX2, Bra022853), protein kinases, and phosphatases. Among heat stress (HS) marker genes in Arabidopsis, only exportin 1A (XPO1A) (Bra008580, Bra006382) can be applied to B. rapa as a marker gene for basal thermotolerance (BT) and short-term acquired thermotolerance (SAT). CYP707A3 (Bra025083, Bra021965), which is involved in the dehydration response in Arabidopsis, was associated with membrane leakage in both lines following HS. Although many transcription factor (TF) genes, including DREB2A (Bra005852), were involved in HS tolerance in both lines, Bra024224 (MYB41) and Bra021735 (a bZIP/AIR1 [Anthocyanin-Impaired-Response-1]) were specific to Kenshin. Several candidate TFs involved in thermotolerance were confirmed as HSR genes by real-time PCR, and these assignments were further supported by promoter analysis. Although some of our findings are similar to those obtained using other plant species, clear differences in Brassica rapa reveal a distinct HSR in this species. Our data could also provide a
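    The ">2-fold change" criterion used in this record can be illustrated with a short filter on log2 fold changes. This is only a sketch under stated assumptions: the gene names, expression values, the pseudocount of 1.0, and the function `fold_change_hits` are hypothetical, and the study's actual normalization and statistics are not described in the abstract.

```python
import math


def fold_change_hits(control, treated, min_fold=2.0, pseudocount=1.0):
    """Return {gene: log2 fold change} for genes changing by more than min_fold.

    control, treated: dicts mapping gene name to an expression value.
    pseudocount: added to both values to avoid division by zero (an assumption).
    """
    threshold = math.log2(min_fold)
    hits = {}
    for gene, base in control.items():
        if gene not in treated:
            continue  # skip genes not measured in both conditions
        log2fc = math.log2((treated[gene] + pseudocount) / (base + pseudocount))
        # Keep genes that go up or down by more than the fold threshold.
        if abs(log2fc) > threshold:
            hits[gene] = log2fc
    return hits


# Hypothetical expression values before and after heat treatment.
control = {"geneA": 99.0, "geneB": 49.0, "geneC": 199.0}
treated = {"geneA": 799.0, "geneB": 59.0, "geneC": 39.0}
hits = fold_change_hits(control, treated)
for gene, log2fc in sorted(hits.items()):
    print(gene, round(log2fc, 2))
```

    Working on the log2 scale treats induction and repression symmetrically, so a 2-fold increase and a 2-fold decrease sit at +1 and -1 respectively.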

  8. Global Gene-Expression Analysis to Identify Differentially Expressed Genes Critical for the Heat Stress Response in Brassica rapa

    PubMed Central

    Dong, Xiangshu; Yi, Hankuil; Lee, Jeongyeo; Nou, Ill-Sup; Han, Ching-Tack; Hur, Yoonkang

    2015-01-01

    Genome-wide dissection of the heat stress response (HSR) is necessary to overcome problems in crop production caused by global warming. To identify HSR genes, we profiled gene expression in two Chinese cabbage inbred lines with different thermotolerances, Chiifu and Kenshin. Many genes exhibited >2-fold changes in expression upon exposure to 0.5–4 h at 45°C (high temperature, HT): 5.2% (2,142 genes) in Chiifu and 3.7% (1,535 genes) in Kenshin. The most enriched GO (Gene Ontology) items included ‘response to heat’, ‘response to reactive oxygen species (ROS)’, ‘response to temperature stimulus’, ‘response to abiotic stimulus’, and ‘MAPKKK cascade’. In both lines, the genes most highly induced by HT encoded small heat shock proteins (Hsps) and heat shock factor (Hsf)-like proteins such as HsfB2A (Bra029292), whereas high-molecular weight Hsps were constitutively expressed. Other upstream HSR components were also up-regulated: ROS-scavenging genes like glutathione peroxidase 2 (BrGPX2, Bra022853), protein kinases, and phosphatases. Among heat stress (HS) marker genes in Arabidopsis, only exportin 1A (XPO1A) (Bra008580, Bra006382) can be applied to B. rapa as a marker gene for basal thermotolerance (BT) and short-term acquired thermotolerance (SAT). CYP707A3 (Bra025083, Bra021965), which is involved in the dehydration response in Arabidopsis, was associated with membrane leakage in both lines following HS. Although many transcription factor (TF) genes, including DREB2A (Bra005852), were involved in HS tolerance in both lines, Bra024224 (MYB41) and Bra021735 (a bZIP/AIR1 [Anthocyanin-Impaired-Response-1]) were specific to Kenshin. Several candidate TFs involved in thermotolerance were confirmed as HSR genes by real-time PCR, and these assignments were further supported by promoter analysis. Although some of our findings are similar to those obtained using other plant species, clear differences in Brassica rapa reveal a distinct HSR in this species. Our data

  9. Association Analysis Suggests SOD2 as a Newly Identified Candidate Gene Associated With Leprosy Susceptibility.

    PubMed

    Ramos, Geovana Brotto; Salomão, Heloisa; Francio, Angela Schneider; Fava, Vinícius Medeiros; Werneck, Renata Iani; Mira, Marcelo Távora

    2016-08-01

    Genetic studies have identified several genes and genomic regions contributing to the control of host susceptibility to leprosy. Here, we test variants of the positional and functional candidate gene SOD2 for association with leprosy in 2 independent population samples. Family-based analysis revealed an association between leprosy and allele G of marker rs295340 (P = .042) and borderline evidence of an association between leprosy and alleles C and A of markers rs4880 (P = .077) and rs5746136 (P = .071), respectively. Findings were validated in an independent case-control sample for markers rs295340 (P = .049) and rs4880 (P = .038). These results suggest SOD2 as a newly identified gene conferring susceptibility to leprosy. © The Author 2016. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail journals.permissions@oup.com.

  10. Exome sequencing in amyotrophic lateral sclerosis identifies risk genes and pathways

    PubMed Central

    Cirulli, Elizabeth T.; Lasseigne, Brittany N.; Petrovski, Slavé; Sapp, Peter C.; Dion, Patrick A.; Leblond, Claire S.; Couthouis, Julien; Lu, Yi-Fan; Wang, Quanli; Krueger, Brian J.; Ren, Zhong; Keebler, Jonathan; Han, Yujun; Levy, Shawn E.; Boone, Braden E.; Wimbish, Jack R.; Waite, Lindsay L.; Jones, Angela L.; Carulli, John P.; Day-Williams, Aaron G.; Staropoli, John F.; Xin, Winnie W.; Chesi, Alessandra; Raphael, Alya R.; McKenna-Yasek, Diane; Cady, Janet; de Jong, J.M.B. Vianney; Kenna, Kevin P.; Smith, Bradley N.; Topp, Simon; Miller, Jack; Gkazi, Athina; Al-Chalabi, Ammar; van den Berg, Leonard H.; Veldink, Jan; Silani, Vincenzo; Ticozzi, Nicola; Shaw, Christopher E.; Baloh, Robert H.; Appel, Stanley; Simpson, Ericka; Lagier-Tourenne, Clotilde; Pulst, Stefan M.; Gibson, Summer; Trojanowski, John Q.; Elman, Lauren; McCluskey, Leo; Grossman, Murray; Shneider, Neil A.; Chung, Wendy K.; Ravits, John M.; Glass, Jonathan D.; Sims, Katherine B.; Van Deerlin, Vivianna M.; Maniatis, Tom; Hayes, Sebastian D.; Ordureau, Alban; Swarup, Sharan; Landers, John; Baas, Frank; Allen, Andrew S.; Bedlack, Richard S.; Harper, J. Wade; Gitler, Aaron D.; Rouleau, Guy A.; Brown, Robert; Harms, Matthew B.; Cooper, Gregory M.; Harris, Tim; Myers, Richard M.; Goldstein, David B.

    2015-01-01

    Amyotrophic lateral sclerosis (ALS) is a devastating neurological disease with no effective treatment. Here we report the results of a moderate-scale sequencing study aimed at identifying new genes contributing to predisposition for ALS. We performed whole exome sequencing of 2,874 ALS patients and compared them to 6,405 controls. Several known ALS genes were found to be associated, and the non-canonical IκB kinase family TANK-Binding Kinase 1 (TBK1) was identified as an ALS gene. TBK1 is known to bind to and phosphorylate a number of proteins involved in innate immunity and autophagy, including optineurin (OPTN) and p62 (SQSTM1/sequestosome), both of which have also been implicated in ALS. These observations reveal a key role of the autophagic pathway in ALS and suggest specific targets for therapeutic intervention. PMID:25700176

  11. Regularized Non-negative Matrix Factorization for Identifying Differential Genes and Clustering Samples: a Survey.

    PubMed

    Liu, Jin-Xing; Wang, Dong; Gao, Ying-Lian; Zheng, Chun-Hou; Xu, Yong; Yu, Jiguo

    2017-02-07

    Non-negative Matrix Factorization (NMF), a classical method for dimensionality reduction, has been applied in many fields. It is based on the idea that negative numbers are physically meaningless in various data-processing tasks. Apart from its contribution to conventional data analysis, the recent overwhelming interest in NMF is due to its newly discovered ability to solve challenging data mining and machine learning problems, especially in relation to gene expression data. This survey paper mainly focuses on research examining the application of NMF to identify differentially expressed genes and to cluster samples, and the main NMF models, properties, principles, and algorithms with its various generalizations, extensions, and modifications are summarized. The experimental results demonstrate the performance of the various NMF algorithms in identifying differentially expressed genes and clustering samples.
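    To make the factorization idea in this record concrete, the sketch below implements basic NMF with the classical multiplicative update rules for the squared-error objective, factoring a non-negative matrix V into W and H with V ≈ WH. It is plain, unregularized NMF, not any of the regularized variants the survey covers; the toy gene-by-sample matrix, the rank of 2, and all function names are hypothetical choices for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor non-negative V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so the multiplicative updates can move.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n)]
    return w, h


# Toy 4-gene x 4-sample matrix with two sample groups (hypothetical values).
expression = [[5.0, 5.0, 0.0, 0.0],
              [4.0, 4.0, 0.0, 0.0],
              [0.0, 0.0, 3.0, 3.0],
              [0.0, 0.0, 6.0, 6.0]]
w, h = nmf(expression, rank=2)
print("reconstruction error:", frobenius_error(expression, w, h))
```

    In this reading, each row of H can be used to group samples (for example, by the component with the largest coefficient) and each column of W to rank genes by their loading on a component, which is how NMF supports both sample clustering and differential gene selection.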

  12. RNA Sequencing of Sessile Serrated Colon Polyps Identifies Differentially Expressed Genes and Immunohistochemical Markers

    PubMed Central

    Delker, Don A.; Pop, Stelian; Neklason, Deborah W.; Bronner, Mary P.; Burt, Randall W.; Hagedorn, Curt H.

    2014-01-01

    Background Sessile serrated adenomas/polyps (SSA/Ps) may account for 20–30% of colon cancers. Although large SSA/Ps are generally recognized phenotypically, small (<1 cm) or dysplastic SSA/Ps are difficult to differentiate from hyperplastic or small adenomatous polyps by endoscopy and histopathology. Our aim was to define the comprehensive gene expression phenotype of SSA/Ps to better define this cancer precursor. Results RNA sequencing was performed on 5′ capped RNA from seven SSA/Ps collected from patients with the serrated polyposis syndrome (SPS) versus eight controls. Highly expressed genes were analyzed by qPCR in additional SSA/Ps, adenomas and controls. The cellular localization and level of gene products were examined by immunohistochemistry in syndromic and sporadic SSA/Ps, adenomatous and hyperplastic polyps and controls. We identified 1,294 differentially expressed annotated genes, with 106 increased ≥10-fold, in SSA/Ps compared to controls. Comparing these genes with an array dataset for adenomatous polyps identified 30 protein coding genes uniquely expressed ≥10-fold in SSA/Ps. Biological pathways altered in SSA/Ps included mucosal integrity, cell adhesion, and cell development. Markedly increased expression of MUC17, the cell junction protein genes VSIG1 and GJB5, and the antiapoptotic gene REG4 was found in SSA/Ps relative to controls and adenomas, and was verified by qPCR analysis of additional SSA/Ps (n = 21) and adenomas (n = 10). Immunohistochemical staining of syndromic (n≥11) and sporadic SSA/Ps (n≥17), adenomatous (n≥13) and hyperplastic (n≥10) polyps plus controls (n≥16) identified unique expression patterns for VSIG1 and MUC17 in SSA/Ps. Conclusion A subset of genes and pathways are uniquely increased in SSA/Ps, compared to adenomatous polyps, thus supporting the concept that cancer develops by different pathways in these phenotypically distinct polyps with markedly different gene expression profiles. Immunostaining

  13. Tsukamurella pulmonis bloodstream infection identified by secA1 gene sequencing.

    PubMed

    Pérez Del Molino Bernal, Inmaculada C; Cano, María E; García de la Fuente, Celia; Martínez-Martínez, Luis; López, Mónica; Fernández-Mazarrasa, Carlos; Agüero, Jesús

    2015-02-01

    Recurrent bloodstream infections caused by a Gram-positive bacterium affected an immunocompromised child. Tsukamurella pulmonis was the microorganism identified by secA1 gene sequencing. Antibiotic treatment in combination with removal of the subcutaneous port healed the patient. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  14. Systems Biology in Animal Breeding: Identifying relationships among markers, genes, and phenotypes

    USDA-ARS's Scientific Manuscript database

    The Breeding and Genetics Symposium titled “Systems Biology in Animal Breeding: Identifying relationships among markers, genes, and phenotypes” was held at the Joint Annual Meeting of the American Dairy Science Association and the American Society of Animal Science in Phoenix, AZ, July 15 to 19, 201...

  15. Candidate fire blight resistance genes in Malus identified with the use of genomic tools and approaches

    USDA-ARS's Scientific Manuscript database

    The goal of this research is to utilize current advances in Rosaceae genomics to identify DNA markers for use in marker-assisted selection of durable resistance to fire blight. Candidate fire blight resistance genes were selected and ranked based upon differential expression after inoculation with ...

  16. Gene from a novel plant virus satellite from grapevine identifies a viral satellite lineage

    USDA-ARS's Scientific Manuscript database

    We have identified the genome of a novel viral satellite in deep sequence analysis of double-stranded RNA from grapevine. The genome was 1,060 bases in length, and encoded two open reading frames. Neither frame was related to any known plant virus gene. But translation of the longer frame showed ...

  17. Multiple gene mutations identified in patients infected with influenza A (H7N9) virus

    PubMed Central

    Chen, Cuicui; Wang, Mingbang; Zhu, Zhaoqin; Qu, Jieming; Xi, Xiuhong; Tang, Xinjun; Lao, Xiangda; Seeley, Eric; Li, Tao; Fan, Xiaomei; Du, Chunling; Wang, Qin; Yang, Lin; Hu, Yunwen; Bai, Chunxue; Zhang, Zhiyong; Lu, Shuihua; Song, Yuanlin; Zhou, Wenhao

    2016-01-01

    Influenza A (H7N9) virus has caused high mortality since 2013. It is important to elucidate the potential genetic variations that contribute to susceptibility to virus infection. In order to identify genetic mutations that might increase host susceptibility to infection, we performed exon sequencing and validated the SNPs by Sanger sequencing on 18 H7N9 patients. Blood samples were collected from 18 confirmed H7N9 patients. The genomic DNA was captured with the Agilent SureSelect Human All Exon kit, sequenced on the Illumina Hiseq 2000, and the resulting data processed and annotated with Genome analysis Tool. SNPs were verified by independent Sanger sequencing. The DAVID database and the DAPPLE database were used for bioinformatics analysis. Through exon sequencing and Sanger sequencing, we identified 21 genes that were highly associated with H7N9 influenza infection. Protein-protein interaction analysis showed that direct interactions among genetic products were significantly higher than expected (p = 0.004), and DAVID analysis confirmed the defense-related functions of these genes. Gene mutation profiles of surviving and non-surviving patients were similar, suggesting that some of the genes identified in this study may be associated with H7N9 influenza susceptibility. Host-specific genetic determinants of disease severity identified by this approach may provide new targets for the treatment of H7N9 influenza. PMID:27156515

  18. Tsukamurella pulmonis Bloodstream Infection Identified by secA1 Gene Sequencing

    PubMed Central

    Cano, María E.; García de la Fuente, Celia; Martínez-Martínez, Luis; López, Mónica; Fernández-Mazarrasa, Carlos

    2014-01-01

    Recurrent bloodstream infections caused by a Gram-positive bacterium affected an immunocompromised child. Tsukamurella pulmonis was the microorganism identified by secA1 gene sequencing. Antibiotic treatment in combination with removal of the subcutaneous port healed the patient. PMID:25520439

  19. IVIAT: a novel method to identify microbial genes expressed specifically during human infections.

    PubMed

    Handfield, M; Brady, L J; Progulske-Fox, A; Hillman, J D

    2000-07-01

    In vivo induced antigen technology (IVIAT) is a novel technology that can quickly and easily identify in vivo induced genes in human infections, without the use of animal models. This technology is expected to facilitate the discovery of new targets for vaccines, antimicrobials and diagnostic strategies in a wide range of microbial pathogens.

  20. Identify Huntington's disease associated genes based on restricted Boltzmann machine with RNA-seq data.

    PubMed

    Jiang, Xue; Zhang, Han; Duan, Feng; Quan, Xiongwen

    2017-10-11

    Predicting disease-associated genes is helpful for understanding the molecular mechanisms during the disease progression. Since the pathological mechanisms of neurodegenerative diseases are very complex, traditional statistic-based methods are not suitable for identifying key genes related to the disease development. Recent studies have shown that the computational models with deep structure can learn automatically the features of biological data, which is useful for exploring the characteristics of gene expression during the disease progression. In this paper, we propose a deep learning approach based on the restricted Boltzmann machine to analyze the RNA-seq data of Huntington's disease, namely stacked restricted Boltzmann machine (SRBM). According to the SRBM, we also design a novel framework to screen the key genes during the Huntington's disease development. In this work, we assume that the effects of regulatory factors can be captured by the hierarchical structure and narrow hidden layers of the SRBM. First, we select disease-associated factors with different time period datasets according to the differentially activated neurons in hidden layers. Then, we select disease-associated genes according to the changes of the gene energy in SRBM at different time periods. The experimental results demonstrate that SRBM can detect the important information for differential analysis of time series gene expression datasets. The identification accuracy of the disease-associated genes is improved to some extent using the novel framework. Moreover, the prediction precision of disease-associated genes for top ranking genes using SRBM is effectively improved compared with that of the state of the art methods.

  1. Newly identified CSP41b gene localized in chloroplasts affects leaf color in rice.

    PubMed

    Mei, Jiasong; Li, Feifei; Liu, Xuri; Hu, Guocheng; Fu, Yaping; Liu, Wenzhen

    2017-03-01

    A rice mutant with light-green leaves was discovered from a transgenic line of Oryza sativa. The mutant has reduced chlorophyll content and abnormal chloroplast morphology throughout its life cycle. Genetic analysis revealed that a single nuclear-encoded recessive gene is responsible for the mutation, here designated as lgl1. To isolate the lgl1 gene, a high-resolution physical map of the chromosomal region around the lgl1 gene was made using a mapping population consisting of 1984 mutant individuals. The lgl1 gene was mapped in the 76.5kb region between marker YG4 and marker YG5 on chromosome 12. Sequence analysis revealed that there was a 39bp deletion within the fourth exon of the candidate gene Os12g0420200 (TIGR locus Os12g23180) encoding a chloroplast stem-loop-binding protein of 41kDa b (CSP41b). The lgl1 mutation was rescued by transformation with the wild type CSP41b gene. Accordingly, the CSP41b gene is identified as the LGL1 gene. CSP41b was transcribed in various tissues and was mainly expressed in leaves. Expression of CSP41b-GFP fusion protein indicated that CSP41b is localized in chloroplasts. The expression levels of some key genes involved in chlorophyll biosynthesis and photosynthesis, such as ChlD, ChlI, Hema1, Ygl1, POR, Cab1R, Cab2R, PsaA, and rbcL, were significantly changed in the lgl1 mutant. Our results demonstrate that CSP41b is a novel gene required for normal leaf color and chloroplast morphology in rice.

  2. Exome sequencing in amyotrophic lateral sclerosis identifies risk genes and pathways.

    PubMed

    Cirulli, Elizabeth T; Lasseigne, Brittany N; Petrovski, Slavé; Sapp, Peter C; Dion, Patrick A; Leblond, Claire S; Couthouis, Julien; Lu, Yi-Fan; Wang, Quanli; Krueger, Brian J; Ren, Zhong; Keebler, Jonathan; Han, Yujun; Levy, Shawn E; Boone, Braden E; Wimbish, Jack R; Waite, Lindsay L; Jones, Angela L; Carulli, John P; Day-Williams, Aaron G; Staropoli, John F; Xin, Winnie W; Chesi, Alessandra; Raphael, Alya R; McKenna-Yasek, Diane; Cady, Janet; Vianney de Jong, J M B; Kenna, Kevin P; Smith, Bradley N; Topp, Simon; Miller, Jack; Gkazi, Athina; Al-Chalabi, Ammar; van den Berg, Leonard H; Veldink, Jan; Silani, Vincenzo; Ticozzi, Nicola; Shaw, Christopher E; Baloh, Robert H; Appel, Stanley; Simpson, Ericka; Lagier-Tourenne, Clotilde; Pulst, Stefan M; Gibson, Summer; Trojanowski, John Q; Elman, Lauren; McCluskey, Leo; Grossman, Murray; Shneider, Neil A; Chung, Wendy K; Ravits, John M; Glass, Jonathan D; Sims, Katherine B; Van Deerlin, Vivianna M; Maniatis, Tom; Hayes, Sebastian D; Ordureau, Alban; Swarup, Sharan; Landers, John; Baas, Frank; Allen, Andrew S; Bedlack, Richard S; Harper, J Wade; Gitler, Aaron D; Rouleau, Guy A; Brown, Robert; Harms, Matthew B; Cooper, Gregory M; Harris, Tim; Myers, Richard M; Goldstein, David B

    2015-03-27

    Amyotrophic lateral sclerosis (ALS) is a devastating neurological disease with no effective treatment. We report the results of a moderate-scale sequencing study aimed at increasing the number of genes known to contribute to predisposition for ALS. We performed whole-exome sequencing of 2869 ALS patients and 6405 controls. Several known ALS genes were found to be associated, and TBK1 (the gene encoding TANK-binding kinase 1) was identified as an ALS gene. TBK1 is known to bind to and phosphorylate a number of proteins involved in innate immunity and autophagy, including optineurin (OPTN) and p62 (SQSTM1/sequestosome), both of which have also been implicated in ALS. These observations reveal a key role of the autophagic pathway in ALS and suggest specific targets for therapeutic intervention. Copyright © 2015, American Association for the Advancement of Science.

  3. Gene trapping identifies a putative tumor suppressor and a new inducer of cell migration

    SciTech Connect

    Guardiola-Serrano, Francisca; Haendeler, Judith; Lukosz, Margarete; Sturm, Karsten; Melchner, Harald von; Altschmied, Joachim

    2008-11-28

    Tumor necrosis factor alpha (TNFα) is a pleiotropic cytokine involved in apoptotic cell death, cellular proliferation, differentiation, inflammation, and tumorigenesis. In tumors it is secreted by tumor associated macrophages and can have both pro- and anti-tumorigenic effects. To identify genes regulated by TNFα, we performed a gene trap screen in the mammary carcinoma cell line MCF-7 and recovered 64 unique, TNFα-induced gene trap integration sites. Among these were the genes coding for the zinc finger protein ZC3H10 and for the transcription factor grainyhead-like 3 (GRHL3). In line with the dual effects of TNFα on tumorigenesis, we found that ZC3H10 inhibits anchorage independent growth in soft agar suggesting a tumor suppressor function, whereas GRHL3 strongly stimulated the migration of endothelial cells which is consistent with an angiogenic, pro-tumorigenic function.

  4. Identifying promoter features of co-regulated genes with similar network motifs.

    PubMed

    Harari, Oscar; del Val, Coral; Romero-Zaliz, Rocío; Shin, Dongwoo; Huang, Henry; Groisman, Eduardo A; Zwir, Igor

    2009-04-29

    A large amount of computational and experimental work has been devoted to uncovering network motifs in gene regulatory networks. The leading hypothesis is that evolutionary processes independently selected recurrent architectural relationships among regulators and target genes (motifs) to produce characteristic expression patterns of its members. However, even with the same architecture, the genes may still be differentially expressed. Therefore, to define fully the expression of a group of genes, the strength of the connections in a network motif must be specified, and the cis-promoter features that participate in the regulation must be determined. We have developed a model-based approach to analyze proteobacterial genomes for promoter features that is specifically designed to account for the variability in sequence, location and topology intrinsic to differential gene expression. We provide methods for annotating regulatory regions by detecting their subjacent cis-features. This includes identifying binding sites for a transcriptional regulator, distinguishing between activation and repression sites, direct and reverse orientation, and among sequences that weakly reflect a particular pattern; binding sites for the RNA polymerase, characterizing different classes, and locations relative to the transcription factor binding sites; the presence of riboswitches in the 5'UTR, and for other transcription factors. We applied our approach to characterize network motifs controlled by the PhoP/PhoQ regulatory system of Escherichia coli and Salmonella enterica serovar Typhimurium. We identified key features that enable the PhoP protein to control its target genes, and distinct features may produce different expression patterns even within the same network motif. Global transcriptional regulators control multiple promoters by a variety of network motifs. This is clearly the case for the regulatory protein PhoP. In this work, we studied this regulatory protein and demonstrated

  5. Identifying differentially expressed genes and small molecule drugs for prostate cancer by a bioinformatics strategy.

    PubMed

    Li, Jian; Xu, Ya-Hong; Lu, Yi; Ma, Xiao-Ping; Chen, Ping; Luo, Shun-Wen; Jia, Zhi-Gang; Liu, Yang; Guo, Yu

    2013-01-01

    Prostate cancer, caused by the abnormal, disorderly growth of prostatic acinar cells, is the most prevalent cancer of men in western countries. We aimed to screen out differentially expressed genes (DEGs) and explore small molecule drugs for prostate cancer. The GSE3824 gene expression profile of prostate cancer, which includes 21 normal samples and 18 prostate cancer cells, was downloaded from the Gene Expression Omnibus database. The DEGs were identified with the Limma package in R, and gene ontology and pathway enrichment analyses were performed. In addition, potential regulatory microRNAs and the target sites of the transcription factors were screened out based on the molecular signature database. The DEGs were also mapped to the connectivity map database to identify potential small molecule drugs. A total of 6,588 genes were filtered as DEGs between normal and prostate cancer samples. Examples such as ITGB6, ITGB3, ITGAV and ITGA2 may induce prostate cancer through actions on the focal adhesion pathway. Furthermore, the transcription factor, SP1, and its target genes ARHGAP26 and USF1 were identified. The most significant microRNA, MIR-506, was screened and found to regulate genes including ITGB1 and ITGB3. Additionally, small molecules MS-275, 8-azaguanine and pyrvinium were discovered to have the potential to repair the disordered metabolic pathways, and furthermore to remedy prostate cancer. The results of our analysis bear on the mechanism of prostate cancer and allow screening for small molecular drugs for this cancer. The findings have the potential for future use in the clinic for treatment of prostate cancer.

  6. Transposon mutagenesis identifies genes and cellular processes driving epithelial-mesenchymal transition in hepatocellular carcinoma

    PubMed Central

    Kodama, Takahiro; Newberg, Justin Y.; Kodama, Michiko; Rangel, Roberto; Yoshihara, Kosuke; Tien, Jean C.; Parsons, Pamela H.; Wu, Hao; Finegold, Milton J.; Copeland, Neal G.; Jenkins, Nancy A.

    2016-01-01

    Epithelial-mesenchymal transition (EMT) is thought to contribute to metastasis and chemoresistance in patients with hepatocellular carcinoma (HCC), leading to their poor prognosis. The genes driving EMT in HCC are not yet fully understood, however. Here, we show that mobilization of Sleeping Beauty (SB) transposons in immortalized mouse hepatoblasts induces mesenchymal liver tumors on transplantation to nude mice. These tumors show significant down-regulation of epithelial markers, along with up-regulation of mesenchymal markers and EMT-related transcription factors (EMT-TFs). Sequencing of transposon insertion sites from tumors identified 233 candidate cancer genes (CCGs) that were enriched for genes and cellular processes driving EMT. Subsequent trunk driver analysis identified 23 CCGs that are predicted to function early in tumorigenesis and whose mutation or alteration in patients with HCC is correlated with poor patient survival. Validation of the top trunk drivers identified in the screen, including MET (MET proto-oncogene, receptor tyrosine kinase), GRB2-associated binding protein 1 (GAB1), HECT, UBA, and WWE domain containing 1 (HUWE1), lysine-specific demethylase 6A (KDM6A), and protein-tyrosine phosphatase, nonreceptor-type 12 (PTPN12), showed that deregulation of these genes activates an EMT program in human HCC cells that enhances tumor cell migration. Finally, deregulation of these genes in human HCC was found to confer sorafenib resistance through apoptotic tolerance and reduced proliferation, consistent with recent studies showing that EMT contributes to the chemoresistance of tumor cells. Our unique cell-based transposon mutagenesis screen appears to be an excellent resource for discovering genes involved in EMT in human HCC and potentially for identifying new drug targets. PMID:27247392

  7. A New Strategy to Identify and Annotate Human RPE-Specific Gene Expression

    PubMed Central

    Booij, Judith C.; ten Brink, Jacoline B.; Swagemakers, Sigrid M. A.; Verkerk, Annemieke J. M. H.; Essing, Anke H. W.; van der Spek, Peter J.; Bergen, Arthur A. B.

    2010-01-01

    Background To identify and functionally annotate cell type-specific gene expression in the human retinal pigment epithelium (RPE), a key tissue involved in age-related macular degeneration and retinitis pigmentosa. Methodology RPE, photoreceptor and choroidal cells were isolated from selected freshly frozen healthy human donor eyes using laser microdissection. RNA isolation, amplification and hybridization to 44 k microarrays were carried out according to Agilent specifications. Bioinformatics was carried out using Rosetta Resolver, David and Ingenuity software. Principal Findings Our previous 22 k analysis of the RPE transcriptome showed that the RPE has high levels of protein synthesis, strong energy demands, is exposed to high levels of oxidative stress and a variable degree of inflammation. We currently use a complementary new strategy aimed at the identification and functional annotation of RPE-specific expressed transcripts. This strategy takes advantage of the multilayered cellular structure of the retina and overcomes a number of limitations of previous studies. In triplicate, we compared the transcriptomes of RPE, photoreceptor and choroidal cells and we deduced RPE specific expression. We identified at least 114 entries with RPE-specific gene expression. Thirty-nine of these 114 genes also show high expression in the RPE; comparison with the literature showed that 85% of these 39 were previously identified to be expressed in the RPE. In the group of 114 RPE specific genes there was an overrepresentation of genes involved in (membrane) transport, vision and ophthalmic disease. More fundamentally, we found RPE-specific involvement in the RAR-activation, retinol metabolism and GABA receptor signaling pathways. Conclusions In this study we provide a further specification and understanding of the RPE transcriptome by identifying and analyzing genes that are specifically expressed in the RPE. PMID:20479888

  8. Gene Expression Signature Analysis Identifies Vorinostat as a Candidate Therapy for Gastric Cancer

    PubMed Central

    Choi, Woonyoung; Park, Yun-Yong; Kim, KyoungHyun; Kim, Sang-Bae; Lee, Ju-Seog; Mills, Gordon B.; Cho, Jae Yong

    2011-01-01

    Background Gastric cancer continues to be one of the deadliest cancers in the world, and identification of new drugs targeting this type of cancer is therefore of significant importance. The purpose of this study was to identify and validate a therapeutic agent which might improve the outcomes for gastric cancer patients in the future. Methodology/Principal Findings Using microarray technology, we generated a gene expression profile of human gastric cancer–specific genes from human gastric cancer tissue samples. We used this profile in the Broad Institute's Connectivity Map analysis to identify candidate therapeutic compounds for gastric cancer. We found the histone deacetylase inhibitor vorinostat as the lead compound and thus a potential therapeutic drug for gastric cancer. Vorinostat induced both apoptosis and autophagy in gastric cancer cell lines. Pharmacological and genetic inhibition of autophagy, however, increased the therapeutic efficacy of vorinostat, indicating that a combination of vorinostat with autophagy inhibitors may be therapeutically more beneficial. Moreover, gene expression analysis of gastric cancer identified a collection of genes (ITGB5, TYMS, MYB, APOC1, CBX5, PLA2G2A, and KIF20A) whose expression was elevated in gastric tumor tissue and downregulated more than 2-fold by vorinostat treatment in gastric cancer cell lines. In contrast, SCGB2A1, TCN1, CFD, APLP1, and NQO1 manifested a reversed pattern. Conclusions/Significance We showed that analysis of gene expression signature may represent an emerging approach to discover therapeutic agents for gastric cancer, such as vorinostat. The observation of altered gene expression after vorinostat treatment may provide the clue to identify the molecular mechanism of vorinostat and those patients likely to benefit from vorinostat treatment. PMID:21931799

  9. Deep sequencing identifies viral and wasp genes with potential roles in replication of Microplitis demolitor Bracovirus.

    PubMed

    Burke, Gaelen R; Strand, Michael R

    2012-03-01

    Viruses in the genus Bracovirus (BV) (Polydnaviridae) are symbionts of parasitoid wasps that specifically replicate in the ovaries of females. Recent analysis of expressed sequence tags from two wasp species, Cotesia congregata and Chelonus inanitus, identified transcripts related to 24 different nudivirus genes. These results together with other data strongly indicate that BVs evolved from a nudivirus ancestor. However, it remains unclear whether BV-carrying wasps contain other nudivirus-like genes and what types of wasp genes may also be required for BV replication. Microplitis demolitor carries Microplitis demolitor bracovirus (MdBV). Here we characterized MdBV replication and performed massively parallel sequencing of M. demolitor ovary transcripts. Our results indicated that MdBV replication begins in stage 2 pupae and continues in adults. Analysis of prereplication- and active-replication-stage ovary RNAs yielded 22 Gb of sequence that assembled into 66,425 transcripts. This breadth of sampling indicated that a large percentage of genes in the M. demolitor genome were sequenced. A total of 41 nudivirus-like transcripts were identified, of which a majority were highly expressed during MdBV replication. Our results also identified a suite of wasp genes that were highly expressed during MdBV replication. Among these products were several transcripts with conserved roles in regulating locus-specific DNA amplification by eukaryotes. Overall, our data set together with prior results likely identify the majority of nudivirus-related genes that are transcriptionally functional during BV replication. Our results also suggest that amplification of proviral DNAs for packaging into BV virions may depend upon the replication machinery of wasps.

  10. A Special Local Clustering Algorithm for Identifying the Genes Associated With Alzheimer’s Disease

    PubMed Central

    Pang, Chao-Yang; Hu, Wei; Hu, Ben-Qiong; Shi, Ying; Vanderburg, Charles R.; Rogers, Jack T.

    2010-01-01

    Clustering is the grouping of similar objects into a class. Local clustering feature refers to the phenomenon whereby one group of data is separated from another, and the data from these different groups are clustered locally. A compact class is defined as one cluster in which all similar elements cluster tightly within the cluster. Herein, the essence of the local clustering feature, revealed by mathematical manipulation, results in a novel clustering algorithm termed as the special local clustering (SLC) algorithm that was used to process gene microarray data related to Alzheimer’s disease (AD). SLC algorithm was able to group together genes with similar expression patterns and identify significantly varied gene expression values as isolated points. If a gene belongs to a compact class in control data and appears as an isolated point in incipient, moderate and/or severe AD gene microarray data, this gene is possibly associated with AD. Application of a clustering algorithm in disease-associated gene identification such as in AD is rarely reported. PMID:20089478
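
    The selection rule described above (a gene that sits in a compact class in control data but appears as an isolated point in AD data) can be sketched as follows. This is not the SLC algorithm itself, whose details the abstract does not give; a simple nearest-neighbour distance test is swapped in as a stand-in, and the function names, the `radius` threshold, and the toy data are all hypothetical.

```python
import numpy as np

def isolated_points(X, radius):
    """Boolean mask: True where a row of X has no other row within `radius`."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # ignore each point's distance to itself
    return d.min(axis=1) > radius

def candidate_genes(control, disease, radius):
    """Indices of genes clustered in `control` but isolated in `disease`."""
    clustered_in_control = ~isolated_points(control, radius)
    isolated_in_disease = isolated_points(disease, radius)
    return np.flatnonzero(clustered_in_control & isolated_in_disease)

# Hypothetical toy data: 4 genes, 2 expression measurements each.
control = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]])
disease = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
print(candidate_genes(control, disease, radius=0.5))
```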

  11. Transcriptome Analysis Identifies the Dysregulation of Ultraviolet Target Genes in Human Skin Cancers.

    PubMed

    Shen, Yao; Kim, Arianna L; Du, Rong; Liu, Liang

    2016-01-01

    Exposure to ultraviolet radiation (UVR) is a major risk factor for both melanoma and non-melanoma skin cancers. In addition to its mutagenic effect, UVR can also induce substantial transcriptional instability in skin cells affecting thousands of genes, including many cancer genes, suggesting that transcriptional instability may be another important etiological factor in skin photocarcinogenesis. In this study, we performed detailed transcriptomic profiling studies to characterize the kinetic changes in global gene expression in human keratinocytes exposed to different UVR conditions. We identified a subset of UV-responsive genes as UV signature genes (UVSGs) based on 1) conserved UV-responsiveness of this subset of genes among different keratinocyte lines; and 2) UV-induced persistent changes in their mRNA levels long after exposure. Interestingly, 11 of the UVSGs were shown to be critical to skin cancer cell proliferation and survival. Through computational Gene Set Enrichment Analysis, we demonstrated that a significant portion of the UVSGs were dysregulated in human skin squamous cell carcinomas, but not in other human malignancies. This highlights the potential and specificity of the UVSGs in clinical diagnosis of UV damage and stratification of skin cancer risk.

  12. De novo transcriptome sequencing of Momordica cochinchinensis to identify genes involved in the carotenoid biosynthesis.

    PubMed

    Hyun, Tae Kyung; Rim, Yeonggil; Jang, Hui-Jeong; Kim, Cheol Hong; Park, Jongsun; Kumar, Ritesh; Lee, Sunghoon; Kim, Byung Chul; Bhak, Jong; Nguyen-Quoc, Binh; Kim, Seon-Won; Lee, Sang Yeol; Kim, Jae-Yean

    2012-07-01

    The ripe fruit of Momordica cochinchinensis Spreng, known as gac, is characterized by very high carotenoid content. Although this plant might be a good resource for carotenoid metabolic engineering, so far, the genes involved in the carotenoid metabolic pathways in gac were unidentified due to lack of genomic information in the public database. In order to expedite the process of gene discovery, we have undertaken Illumina deep sequencing of mRNA prepared from aril of gac fruit. From 51,446,670 high-quality reads, we obtained 81,404 assembled unigenes with average length of 388 base pairs. At the protein level, gac aril transcripts showed about 81.5% similarity with cucumber proteomes. In addition, 17,104 unigenes have been assigned to specific metabolic pathways in Kyoto Encyclopedia of Genes and Genomes, and all of the known enzymes involved in terpenoid backbone biosynthetic and carotenoid biosynthetic pathways were also identified in our library. To analyze the relationship between putative carotenoid biosynthesis genes and alteration of carotenoid content during fruit ripening, digital gene expression analysis was performed on three different ripening stages of aril. This study has revealed that putative phytoene synthase, 15-cis-phytoene desaturase, zeta-carotene desaturase, carotenoid isomerase and lycopene epsilon cyclase might be key factors for controlling carotenoid contents during aril ripening. Taken together, this study has also made available a large gene database. This unique information for gac gene discovery would be helpful to facilitate functional studies for improving carotenoid quantities.

  13. Enhancer connectome in primary human cells identifies target genes of disease-associated DNA elements.

    PubMed

    Mumbach, Maxwell R; Satpathy, Ansuman T; Boyle, Evan A; Dai, Chao; Gowen, Benjamin G; Cho, Seung Woo; Nguyen, Michelle L; Rubin, Adam J; Granja, Jeffrey M; Kazane, Katelynn R; Wei, Yuning; Nguyen, Trieu; Greenside, Peyton G; Corces, M Ryan; Tycko, Josh; Simeonov, Dimitre R; Suliman, Nabeela; Li, Rui; Xu, Jin; Flynn, Ryan A; Kundaje, Anshul; Khavari, Paul A; Marson, Alexander; Corn, Jacob E; Quertermous, Thomas; Greenleaf, William J; Chang, Howard Y

    2017-09-25

    The challenge of linking intergenic mutations to target genes has limited molecular understanding of human diseases. Here we show that H3K27ac HiChIP generates high-resolution contact maps of active enhancers and target genes in rare primary human T cell subtypes and coronary artery smooth muscle cells. Differentiation of naive T cells into T helper 17 cells or regulatory T cells creates subtype-specific enhancer-promoter interactions, specifically at regions of shared DNA accessibility. These data provide a principled means of assigning molecular functions to autoimmune and cardiovascular disease risk variants, linking hundreds of noncoding variants to putative gene targets. Target genes identified with HiChIP are further supported by CRISPR interference and activation at linked enhancers, by the presence of expression quantitative trait loci, and by allele-specific enhancer loops in patient-derived primary cells. The majority of disease-associated enhancers contact genes beyond the nearest gene in the linear genome, leading to a fourfold increase in the number of potential target genes for autoimmune and cardiovascular diseases.

  14. Cross-species microarray hybridization to identify developmentally regulated genes in the filamentous fungus Sordaria macrospora.

    PubMed

    Nowrousian, Minou; Ringelberg, Carol; Dunlap, Jay C; Loros, Jennifer J; Kück, Ulrich

    2005-04-01

    The filamentous fungus Sordaria macrospora forms complex three-dimensional fruiting bodies that protect the developing ascospores and ensure their proper discharge. Several regulatory genes essential for fruiting body development were previously isolated by complementation of the sterile mutants pro1, pro11 and pro22. To establish the genetic relationships between these genes and to identify downstream targets, we have conducted cross-species microarray hybridizations using cDNA arrays derived from the closely related fungus Neurospora crassa and RNA probes prepared from wild-type S. macrospora and the three developmental mutants. Of the 1,420 genes which gave a signal with the probes from all the strains used, 172 (12%) were regulated differently in at least one of the three mutants compared to the wild type, and 17 (1.2%) were regulated differently in all three mutant strains. Microarray data were verified by Northern analysis or quantitative real time PCR. Among the genes that are up- or down-regulated in the mutant strains are genes encoding the pheromone precursors, enzymes involved in melanin biosynthesis and a lectin-like protein. Analysis of gene expression in double mutants revealed a complex network of interaction between the pro gene products.

  15. Transcriptome Analysis Identifies the Dysregulation of Ultraviolet Target Genes in Human Skin Cancers

    PubMed Central

    Shen, Yao; Kim, Arianna L.; Du, Rong; Liu, Liang

    2016-01-01

    Exposure to ultraviolet radiation (UVR) is a major risk factor for both melanoma and non-melanoma skin cancers. In addition to its mutagenic effect, UVR can also induce substantial transcriptional instability in skin cells affecting thousands of genes, including many cancer genes, suggesting that transcriptional instability may be another important etiological factor in skin photocarcinogenesis. In this study, we performed detailed transcriptomic profiling studies to characterize the kinetic changes in global gene expression in human keratinocytes exposed to different UVR conditions. We identified a subset of UV-responsive genes as UV signature genes (UVSGs) based on 1) conserved UV-responsiveness of this subset of genes among different keratinocyte lines; and 2) UV-induced persistent changes in their mRNA levels long after exposure. Interestingly, 11 of the UVSGs were shown to be critical to skin cancer cell proliferation and survival. Through computational Gene Set Enrichment Analysis, we demonstrated that a significant portion of the UVSGs were dysregulated in human skin squamous cell carcinomas, but not in other human malignancies. This highlights the potential and specificity of the UVSGs in clinical diagnosis of UV damage and stratification of skin cancer risk. PMID:27643989

  16. Yeast functional screen to identify genes conferring salt stress tolerance in Salicornia europaea

    PubMed Central

    Nakahara, Yoshiki; Sawabe, Shogo; Kainuma, Kenta; Katsuhara, Maki; Shibasaka, Mineo; Suzuki, Masanori; Yamamoto, Kosuke; Oguri, Suguru; Sakamoto, Hikaru

    2015-01-01

    Salinity is a critical environmental factor that adversely affects crop productivity. Halophytes have evolved various mechanisms to adapt to saline environments. Salicornia europaea L. is one of the most salt-tolerant plant species. It does not have special salt-secreting structures like a salt gland or salt bladder, and is therefore a good model for studying the common mechanisms underlying plant salt tolerance. To identify candidate genes encoding key proteins in the mediation of salt tolerance in S. europaea, we performed a functional screen of a cDNA library in yeast. The library was screened for genes that allowed the yeast to grow in the presence of 1.3 M NaCl. We obtained three full-length S. europaea genes that confer salt tolerance. The genes are predicted to encode (1) a novel protein highly homologous to thaumatin-like proteins, (2) a novel coiled-coil protein of unknown function, and (3) a novel short peptide of 32 residues. Exogenous application of a synthetic peptide corresponding to the 32 residues improved salt tolerance of Arabidopsis. The approach described in this report provides a rapid assay system for large-scale screening of S. europaea genes involved in salt stress tolerance and supports the identification of genes responsible for such mechanisms. These genes may be useful candidates for improving crop salt tolerance by genetic transformation. PMID:26579166

  17. Mapping of Craniofacial Traits in Outbred Mice Identifies Major Developmental Genes Involved in Shape Determination

    PubMed Central

    Pallares, Luisa F.; Carbonetto, Peter; Gopalakrishnan, Shyam; Parker, Clarissa C.; Ackert-Bicknell, Cheryl L.; Palmer, Abraham A.; Tautz, Diethard

    2015-01-01

    The vertebrate cranium is a prime example of the high evolvability of complex traits. While evidence of genes and developmental pathways underlying craniofacial shape determination is accumulating, we are still far from understanding how such variation at the genetic level is translated into craniofacial shape variation. Here we used 3D geometric morphometrics to map genes involved in shape determination in a population of outbred mice (Carworth Farms White, or CFW). We defined shape traits via principal component analysis of 3D skull and mandible measurements. We mapped genetic loci associated with shape traits at ~80,000 candidate single nucleotide polymorphisms in ~700 male mice. We found that craniofacial shape and size are highly heritable, polygenic traits. Despite the polygenic nature of the traits, we identified 17 loci that explain variation in skull shape, and 8 loci associated with variation in mandible shape. Together, the associated variants account for 11.4% of skull and 4.4% of mandible shape variation; however, the total additive genetic variance associated with phenotypic variation was estimated at ~45%. Candidate genes within the associated loci have known roles in craniofacial development; these include 6 transcription factors and several regulators of bone developmental pathways. One gene, Mn1, has an unusually large effect on shape variation in our study. A knockout of this gene was previously shown to negatively affect the development of membranous bones of the cranial skeleton, and evolutionary analysis shows that the gene arose at the base of the bony vertebrates (Euteleostomi), where the ossified head first appeared. Therefore, Mn1 emerges as a key gene for both skull formation and within-population shape variation. Our study shows that it is possible to identify important developmental genes through genome-wide mapping of high-dimensional shape features in an outbred population. PMID:26523602
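
    The abstract above defines shape traits by principal component analysis of 3D measurements. A minimal sketch of that idea follows, assuming a samples-by-features matrix of measurements; the function name, the toy data and the choice of two components are hypothetical illustrations, not the authors' pipeline.

```python
import numpy as np


def shape_principal_components(measurements, n_components=2):
    """Project a samples x features matrix onto its leading principal components.

    Returns (scores, explained): per-sample PC scores and the fraction of
    total variance captured by each retained component.
    """
    X = np.asarray(measurements, dtype=float)
    centered = X - X.mean(axis=0)            # remove the mean shape
    U, S, _ = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)    # variance share per component
    return scores, explained[:n_components]


# Hypothetical toy data: 20 animals, 6 flattened landmark coordinates each.
rng = np.random.default_rng(0)
toy_measurements = rng.normal(size=(20, 6))
scores, explained = shape_principal_components(toy_measurements)
```

    Each column of scores could then serve as a quantitative trait in an association scan, which is the general role PCA plays in this kind of mapping.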

  18. Comprehensively identifying and characterizing the missing gene sequences in human reference genome with integrated analytic approaches.

    PubMed

    Chen, Geng; Wang, Charles; Shi, Leming; Tong, Weida; Qu, Xiongfei; Chen, Jiwei; Yang, Jianmin; Shi, Caiping; Chen, Long; Zhou, Peiying; Lu, Bingxin; Shi, Tieliu

    2013-08-01

    The human reference genome is still incomplete and a number of gene sequences are missing from it. The approaches to uncover them, the reasons for their absence and their functions remain little explored. Here, we comprehensively identified and characterized the missing genes of the human reference genome with RNA-Seq data from 16 different human tissues. By using a combined approach of genome-guided transcriptome reconstruction coupled with genome-wide comparison, we uncovered 3.78 and 2.37 Mb transcribed regions in the human genome assemblies of Celera and HuRef either missed from their homologous chromosomes of NCBI human reference genome build 37.2 or partially or entirely absent from the reference. We further identified a significant number of novel transcript contigs in each tissue from de novo transcriptome assembly that are unalignable to NCBI build 37.2 but can be aligned to at least one of the genomes from Celera, HuRef, chimpanzee, macaque or mouse. Our analyses indicate that the missing genes could result from genome misassembly, transposition, copy number variation, translocation and other structural variations. Moreover, our results further suggest that a large portion of these missing genes are conserved between human and other mammals, implying important biological functions. In total, 1,233 functional protein domains were detected in these missing genes. Collectively, our study not only provides approaches for uncovering the missing genes of a genome, but also proposes potential reasons why genes are missing from the genome and highlights the importance of uncovering the missing genes of incomplete genomes.

  19. Differentially expressed genes identified by cross-species microarray in the blind cavefish Astyanax.

    PubMed

    Strickler, Allen G; Jeffery, William R

    2009-03-01

    Changes in gene expression were examined by microarray analysis during development of the eyed surface dwelling (surface fish) and blind cave-dwelling (cavefish) forms of the teleost Astyanax mexicanus De Filippi, 1853. The cross-species microarray used surface and cavefish RNA hybridized to a DNA chip prepared from a closely related species, the zebrafish Danio rerio Hamilton, 1822. We identified a total of 67 differentially expressed probe sets at three days post-fertilization: six upregulated and 61 downregulated in cavefish relative to surface fish. Many of these genes function either in eye development and/or maintenance, or in programmed cell death. The upregulated probe set showing the highest mean fold change was similar to the human ubiquitin specific protease 53 gene. The downregulated probe sets showing some of the highest fold changes corresponded to genes with roles in eye development, including those encoding gamma crystallins, the guanine nucleotide binding proteins Gnat1 and Gnat2, a BarH-like homeodomain transcription factor, and rhodopsin. Downregulation of gamma-crystallin and rhodopsin was confirmed by in situ hybridization and immunostaining with specific antibodies. Additional downregulated genes encode molecules that inhibit or activate programmed cell death. The results suggest that cross-species microarray can be used for identifying differentially expressed genes in cavefish, that many of these genes might be involved in eye degeneration via apoptotic processes, and that more genes are downregulated than upregulated in cavefish, consistent with the predominance of morphological losses over gains during regressive evolution. © 2009 ISZS, Blackwell Publishing and IOZ/CAS.

  20. The characteristic direction: a geometrical approach to identify differentially expressed genes.

    PubMed

    Clark, Neil R; Hu, Kevin S; Feldmann, Axel S; Kou, Yan; Chen, Edward Y; Duan, Qiaonan; Ma'ayan, Avi

    2014-03-21

    Identifying differentially expressed genes (DEG) is a fundamental step in studies that perform genome wide expression profiling. Typically, DEG are identified by univariate approaches such as Significance Analysis of Microarrays (SAM) or Linear Models for Microarray Data (LIMMA) for processing cDNA microarrays, and differential gene expression analysis based on the negative binomial distribution (DESeq) or Empirical analysis of Digital Gene Expression data in R (edgeR) for RNA-seq profiling. Here we present a new geometrical multivariate approach to identify DEG called the Characteristic Direction. We demonstrate that the Characteristic Direction method is significantly more sensitive than existing methods for identifying DEG in the context of transcription factor (TF) and drug perturbation responses over a large number of microarray experiments. We also benchmarked the Characteristic Direction method using synthetic data, as well as RNA-Seq data. A large collection of microarray expression data from TF perturbations (73 experiments) and drug perturbations (130 experiments) extracted from the Gene Expression Omnibus (GEO), as well as an RNA-Seq study that profiled genome-wide gene expression and STAT3 DNA binding in two subtypes of diffuse large B-cell Lymphoma, were used for benchmarking the method using real data. ChIP-Seq data identifying DNA binding sites of the perturbed TFs, as well as known drug targets of the perturbing drugs, were used as prior knowledge silver-standard for validation. In all cases the Characteristic Direction DEG calling method outperformed other methods. We find that when drugs are applied to cells in various contexts, the proteins that interact with the drug-targets are differentially expressed and more of the corresponding genes are discovered by the Characteristic Direction method. In addition, we show that the Characteristic Direction conceptualization can be used to perform improved gene set enrichment analyses when compared with

  1. Identifying gene-gene interactions that are highly associated with Body Mass Index using Quantitative Multifactor Dimensionality Reduction (QMDR).

    PubMed

    De, Rishika; Verma, Shefali S; Drenos, Fotios; Holzinger, Emily R; Holmes, Michael V; Hall, Molly A; Crosslin, David R; Carrell, David S; Hakonarson, Hakon; Jarvik, Gail; Larson, Eric; Pacheco, Jennifer A; Rasmussen-Torvik, Laura J; Moore, Carrie B; Asselbergs, Folkert W; Moore, Jason H; Ritchie, Marylyn D; Keating, Brendan J; Gilbert-Diamond, Diane

    2015-01-01

    Despite heritability estimates of 40-70% for obesity, less than 2% of its variation is explained by the Body Mass Index (BMI) associated loci identified so far. Epistasis, or gene-gene interaction, is a plausible source to explain portions of the missing heritability of BMI. Using genotypic data from 18,686 individuals across five study cohorts (ARIC, CARDIA, FHS, CHS, MESA), we filtered SNPs (Single Nucleotide Polymorphisms) using two parallel approaches. SNPs were filtered either on the strength of their main effects of association with BMI, or on the number of knowledge sources supporting a specific SNP-SNP interaction in the context of BMI. Filtered SNPs were specifically analyzed for interactions that are highly associated with BMI using QMDR (Quantitative Multifactor Dimensionality Reduction). QMDR is a nonparametric, genetic model-free method that detects non-linear interactions associated with a quantitative trait. We identified seven novel epistatic models with a Bonferroni corrected p-value of association < 0.1. Prior experimental evidence helps explain the plausible biological interactions highlighted within our results and their relationship with obesity. We identified interactions between genes involved in mitochondrial dysfunction (POLG2), cholesterol metabolism (SOAT2), lipid metabolism (CYP11B2), cell adhesion (EZR), cell proliferation (MAP2K5), and insulin resistance (IGF1R). Moreover, we found an 8.8% increase in the variance in BMI explained by these seven SNP-SNP interactions, beyond what is explained by the main effects of an index FTO SNP and the SNPs within these interactions. We also replicated one of these interactions and 58 proxy SNP-SNP models representing it in an independent dataset from the eMERGE study. This study highlights a novel approach for discovering gene-gene interactions by combining methods such as QMDR with traditional statistics.
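
    The abstract above reports models that pass a Bonferroni corrected p-value threshold of 0.1. A minimal sketch of that correction follows, assuming a plain list of raw p-values; the function names and example values are hypothetical, and the number of tests actually corrected for in the study is not stated in the abstract.

```python
def bonferroni_adjust(p_values):
    """Multiply each raw p-value by the number of tests, capping at 1.0."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


def passes_threshold(p_values, alpha=0.1):
    """Flag tests whose Bonferroni-adjusted p-value is below alpha."""
    return [adjusted < alpha for adjusted in bonferroni_adjust(p_values)]


# Hypothetical raw p-values for three candidate interaction models.
raw_p = [0.01, 0.04, 0.5]
adjusted_p = bonferroni_adjust(raw_p)
flags = passes_threshold(raw_p, alpha=0.1)
```

    Only the first hypothetical model survives here: its adjusted value stays below 0.1, while the second rises above the threshold once multiplied by the number of tests.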

  2. Cross-species gene expression analysis identifies a novel set of genes implicated in human insulin sensitivity.

    PubMed

    Chaudhuri, Rima; Khoo, Poh Sim; Tonks, Katherine; Junutula, Jagath R; Kolumam, Ganesh; Modrusan, Zora; Samocha-Bonet, Dorit; Meoli, Christopher C; Hocking, Samantha; Fazakerley, Daniel J; Stöckli, Jacqueline; Hoehn, Kyle L; Greenfield, Jerry R; Yang, Jean Yee Hwa; James, David E

    2015-01-01

    Insulin resistance (IR) is one of the earliest predictors of type 2 diabetes. However, diagnosis of IR is limited. High fat fed mouse models provide key insights into IR. We hypothesized that early features of IR are associated with persistent changes in gene expression (GE) and endeavored to (a) develop novel methods for improving signal:noise in analysis of human GE using mouse models; (b) identify a GE motif that accurately diagnoses IR in humans; and (c) identify novel biology associated with IR in humans. We integrated human muscle GE data with longitudinal mouse GE data and developed an unbiased three-level cross-species analysis platform (single gene, gene set, and networks) to generate a gene expression motif (GEM) indicative of IR. A logistic regression classification model validated GEM in three independent human data sets (n=115). This GEM of 93 genes substantially improved diagnosis of IR compared with routine clinical measures across multiple independent data sets. Individuals misclassified by GEM possessed other metabolic features, raising the possibility that they represent a separate metabolic subclass. The GEM was enriched in pathways previously implicated in insulin action and revealed novel associations between β-catenin and Jak1 and IR. Functional analyses using small molecule inhibitors showed an important role for these proteins in insulin action. This study shows that systems approaches for identifying molecular signatures provide a powerful way to stratify individuals into discrete metabolic groups. Moreover, we speculate that the β-catenin pathway may represent a novel biomarker for IR in humans that warrants future investigation.

  3. Identifying time-delayed gene regulatory networks via an evolvable hierarchical recurrent neural network.

    PubMed

    Kordmahalleh, Mina Moradi; Sefidmazgi, Mohammad Gorji; Harrison, Scott H; Homaifar, Abdollah

    2017-01-01

    The modeling of genetic interactions within a cell is crucial for a basic understanding of physiology and for applied areas such as drug design. Interactions in gene regulatory networks (GRNs) include effects of transcription factors, repressors, small metabolites, and microRNA species. In addition, the effects of regulatory interactions are not always simultaneous, but can occur after a finite time delay, or as a combined outcome of simultaneous and time delayed interactions. Powerful biotechnologies have been rapidly and successfully measuring levels of genetic expression to illuminate different states of biological systems. This has led to an ensuing challenge to improve the identification of specific regulatory mechanisms through regulatory network reconstructions. Solutions to this challenge will ultimately help to spur forward efforts based on the usage of regulatory network reconstructions in systems biology applications. We have developed a hierarchical recurrent neural network (HRNN) that identifies time-delayed gene interactions using time-course data. A customized genetic algorithm (GA) was used to optimize hierarchical connectivity of regulatory genes and a target gene. The proposed design provides a non-fully connected network with the flexibility of using recurrent connections inside the network. These features and the non-linearity of the HRNN facilitate the process of identifying temporal patterns of a GRN. Our HRNN method was implemented with the Python language. It was first evaluated on simulated data representing linear and nonlinear time-delayed gene-gene interaction models across a range of network sizes and variances of noise. We then further demonstrated the capability of our method in reconstructing GRNs of the Saccharomyces cerevisiae synthetic network for in vivo benchmarking of reverse-engineering and modeling approaches (IRMA). We compared the performance of our method to TD-ARACNE, HCC-CLINDE, TSNI and ebdbNet across different network

  4. Using RNA sequencing for identifying gene imprinting and random monoallelic expression in human placenta

    PubMed Central

    Metsalu, Tauno; Viltrop, Triin; Tiirats, Airi; Rajashekar, Balaji; Reimann, Ene; Kõks, Sulev; Rull, Kristiina; Milani, Lili; Acharya, Ganesh; Basnet, Purusotam; Vilo, Jaak; Mägi, Reedik; Metspalu, Andres; Peters, Maire; Haller-Kikkatalo, Kadri; Salumets, Andres

    2014-01-01

    Given the possible critical importance of placental gene imprinting and random monoallelic expression on fetal and infant health, most of those genes must be identified, in order to understand the risks that the baby might meet during pregnancy and after birth. Therefore, the aim of the current study was to introduce a workflow and tools for analyzing imprinted and random monoallelic gene expression in human placenta, by applying whole-transcriptome (WT) RNA sequencing of placental tissue and genotyping of coding DNA variants in family trios. Ten family trios, each with a healthy spontaneous single-term pregnancy, were recruited. Total RNA was extracted for WT analysis, providing the full sequence information for the placental transcriptome. Parental and child blood DNA genotypes were analyzed by exome SNP genotyping microarrays, mapping the inheritance and estimating the abundance of parental expressed alleles. Imprinted genes showed consistent expression from either parental allele, as demonstrated by the SNP content of sequenced transcripts, while monoallelically expressed genes had random activity of parental alleles. We revealed 4 novel possible imprinted genes (LGALS8, LGALS14, PAPPA2 and SPTLC3) and confirmed the imprinting of 4 genes (AIM1, PEG10, RHOBTB3 and ZFAT-AS1) in human placenta. The major finding was the identification of 4 genes (ABP1, BCLAF1, IFI30 and ZFAT) with random allelic bias, expressing one of the parental alleles preferentially. The main functions of the imprinted and monoallelically expressed genes included: i) mediating cellular apoptosis and tissue development; ii) regulating inflammation and immune system; iii) facilitating metabolic processes; and iv) regulating cell cycle. PMID:25437054

  5. Expression of DOF genes identifies early stages of vascular development in Arabidopsis leaves.

    PubMed

    Gardiner, Jason; Sherr, Ira; Scarpella, Enrico

    2010-01-01

    The sequence of events underlying the formation of vascular networks in the leaf has long fascinated developmental biologists. In Arabidopsis leaves, vascular-precursor procambial cells derive from the elongation of morphologically inconspicuous ground cells that selectively activate expression of the HD-ZIP III gene ATHB8. Inception of ATHB8 expression operationally defines acquisition of a typically irreversible preprocambial cell state that preludes to vein formation. A view of the constellation of genes whose expression is activated at preprocambial stages would therefore be particularly desirable; however, very few preprocambial gene expression profiles have been identified. Here, we show that expression of three genes encoding members of the DOF family of plant-specific transcription factors is activated at stages overlapping onset of ATHB8 expression. Expression of DOF genes is initiated in wide domains that become confined to sites of vein development. Congruence between DOF expression fields and zones of vein formation persists upon experimental manipulation of leaf vascular patterning, suggesting that DOF expression identifies consistently recurring steps in vein ontogeny. Our results contribute to defining preprocambial cell identity at the molecular level.

  6. EUF1 - a newly identified gene involved in erythritol utilization in Yarrowia lipolytica.

    PubMed

    Rzechonek, Dorota A; Neuvéglise, Cécile; Devillers, Hugo; Rymowicz, Waldemar; Mirończuk, Aleksandra M

    2017-10-02

    The gene YALI0F01562g was identified as an important factor involved in erythritol catabolism of the unconventional yeast Yarrowia lipolytica. Its putative role was identified for the first time by comparative analysis of four Y. lipolytica strains: A-101.1.31, Wratislavia K1, MK1 and AMM. The presence of a mutation that seriously damaged the gene corresponded to inability of the strain Wratislavia K1 to utilize erythritol. RT-PCR analysis of the strain MK1 demonstrated a significant increase in YALI0F01562g expression during growth on erythritol. Further studies involving deletion and overexpression of the selected gene showed that it is indeed essential for efficient erythritol assimilation. The deletion strain Y. lipolytica AMM∆euf1 was almost unable to grow on erythritol as the sole carbon source. When the strain was applied in the process of erythritol production from glycerol, the amount of erythritol remained constant after reaching the maximal concentration. Analysis of the YALI0F01562g gene sequence revealed the presence of domains characteristic for transcription factors. Therefore we suggest naming the studied gene Erythritol Utilization Factor - EUF1.

  7. Expressed sequences tags of the anther smut fungus, Microbotryum violaceum, identify mating and pathogenicity genes

    PubMed Central

    Yockteng, Roxana; Marthey, Sylvain; Chiapello, Hélène; Gendrault, Annie; Hood, Michael E; Rodolphe, François; Devier, Benjamin; Wincker, Patrick; Dossat, Carole; Giraud, Tatiana

    2007-01-01

    Background: The basidiomycete fungus Microbotryum violaceum is responsible for the anther-smut disease in many plants of the Caryophyllaceae family and is a model in genetics and evolutionary biology. Infection is initiated by dikaryotic hyphae produced after the conjugation of two haploid sporidia of opposite mating type. This study describes M. violaceum ESTs corresponding to nuclear genes expressed during conjugation and early hyphal production. Results: A normalized cDNA library generated 24,128 sequences, which were assembled into 7,765 unique genes; 25.2% of them displayed significant similarity to annotated proteins from other organisms, 74.3% a weak similarity to the same set of known proteins, and 0.5% were orphans. We identified putative pheromone receptors and genes that in other fungi are involved in the mating process. We also identified many sequences similar to genes known to be involved in pathogenicity in other fungi. The M. violaceum EST database, MICROBASE, is available on the Web and provides access to the sequences, assembled contigs, annotations and programs to compare similarities against MICROBASE. Conclusion: This study provides a basis for cloning the mating type locus, for further investigation of pathogenicity genes in the anther smut fungi, and for comparative genomics. PMID:17692127

  8. Meta-analysis of transcriptomic datasets identifies genes enriched in the mammalian circadian pacemaker.

    PubMed

    Brown, Laurence A; Williams, John; Taylor, Lewis; Thomson, Ross J; Nolan, Patrick M; Foster, Russell G; Peirson, Stuart N

    2017-09-29

    The master circadian pacemaker in mammals is located in the suprachiasmatic nuclei (SCN) which regulate physiology and behaviour, as well as coordinating peripheral clocks throughout the body. Investigating the function of the SCN has often focused on the identification of rhythmically expressed genes. However, not all genes critical for SCN function are rhythmically expressed. An alternative strategy is to characterize those genes that are selectively enriched in the SCN. Here, we examined the transcriptome of the SCN and whole brain (WB) of mice using meta-analysis of publicly deposited data across a range of microarray platforms and RNA-Seq data. A total of 79 microarrays were used (24 SCN and 55 WB samples, 4 different microarray platforms), alongside 17 RNA-Seq data files (7 SCN and 10 WB). 31 684 MGI gene symbols had data for at least one platform. Meta-analysis using a random effects model for weighting individual effect sizes (derived from differential expression between relevant SCN and WB samples) reliably detected known SCN markers. SCN-enriched transcripts identified in this study provide novel insights into SCN function, including identifying genes which may play key roles in SCN physiology or provide SCN-specific drivers. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
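
    The abstract above pools per-platform effect sizes with a random effects model. A minimal sketch of one common estimator (DerSimonian-Laird) follows; the choice of estimator, the function name and the example numbers are assumptions for illustration, since the abstract does not say which random-effects procedure was used.

```python
import math


def random_effects_pool(effects, variances):
    """Pool study-level effects with DerSimonian-Laird random-effects weights.

    Returns (pooled_effect, standard_error, tau_squared), where tau_squared
    is the estimated between-study variance.
    """
    k = len(effects)
    fixed_w = [1.0 / v for v in variances]
    sum_w = sum(fixed_w)
    fixed_mean = sum(w * y for w, y in zip(fixed_w, effects)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(fixed_w, effects))
    c = sum_w - sum(w ** 2 for w in fixed_w) / sum_w
    tau_sq = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add the between-study variance to each study.
    re_w = [1.0 / (v + tau_sq) for v in variances]
    pooled = sum(w * y for w, y in zip(re_w, effects)) / sum(re_w)
    return pooled, math.sqrt(1.0 / sum(re_w)), tau_sq


# Hypothetical effect sizes for one gene measured on two platforms.
pooled, se, tau_sq = random_effects_pool([0.0, 1.0], [0.01, 0.01])
```

    When the platforms disagree, as in this toy input, the estimated between-study variance grows and the pooled standard error widens accordingly.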

  9. Characterization of novel antibiotic resistance genes identified by functional metagenomics on soil samples.

    PubMed

    Torres-Cortés, Gloria; Millán, Vicenta; Ramírez-Saad, Hugo C; Nisa-Martínez, Rafael; Toro, Nicolás; Martínez-Abarca, Francisco

    2011-04-01

    The soil microbial community is highly complex and contains a high density of antibiotic-producing bacteria, making it a likely source of diverse antibiotic resistance determinants. We used functional metagenomics to search for antibiotic resistance genes in libraries generated from three different soil samples, containing 3.6 Gb of DNA in total. We identified 11 new antibiotic resistance genes: 3 conferring resistance to ampicillin, 2 to gentamicin, 2 to chloramphenicol and 4 to trimethoprim. One of the clones identified was a new trimethoprim resistance gene encoding a 26.8 kDa protein closely resembling unassigned reductases of the dihydrofolate reductase group. This protein, Tm8-3, conferred trimethoprim resistance in Escherichia coli and Sinorhizobium meliloti (γ- and α-proteobacteria respectively). We demonstrated that this gene encoded an enzyme with dihydrofolate reductase activity, with kinetic constants similar to other type I and II dihydrofolate reductases (K(m) of 8.9 µM for NADPH and 3.7 µM for dihydrofolate and IC(50) of 20 µM for trimethoprim). This is the first description of a new type of reductase conferring resistance to trimethoprim. Our results indicate that soil bacteria display a high level of genetic diversity and are a reservoir of antibiotic resistance genes, supporting the use of this approach for the discovery of novel enzymes with unexpected activities unpredictable from their amino acid sequences.

  10. Gene coexpression networks in human brain identify epigenetic modifications in alcohol dependence.

    PubMed

    Ponomarev, Igor; Wang, Shi; Zhang, Lingling; Harris, R Adron; Mayfield, R Dayne

    2012-02-01

    Alcohol abuse causes widespread changes in gene expression in human brain, some of which contribute to alcohol dependence. Previous microarray studies identified individual genes as candidates for alcohol phenotypes, but efforts to generate an integrated view of molecular and cellular changes underlying alcohol addiction are lacking. Here, we applied a novel systems approach to transcriptome profiling in postmortem human brains and generated a systemic view of brain alterations associated with alcohol abuse. We identified critical cellular components and previously unrecognized epigenetic determinants of gene coexpression relationships and discovered novel markers of chromatin modifications in alcoholic brain. Higher expression levels of endogenous retroviruses and genes with high GC content in alcoholics were associated with DNA hypomethylation and increased histone H3K4 trimethylation, suggesting a critical role of epigenetic mechanisms in alcohol addiction. Analysis of cell-type-specific transcriptomes revealed remarkable consistency between molecular profiles and cellular abnormalities in alcoholic brain. Based on evidence from this study and others, we generated a systems hypothesis for the central role of chromatin modifications in alcohol dependence that integrates epigenetic regulation of gene expression with pathophysiological and neuroadaptive changes in alcoholic brain. Our results offer implications for epigenetic therapeutics in alcohol and drug addiction.

  11. A novel DNA replication origin identified in the human heat shock protein 70 gene promoter.

    PubMed Central

    Taira, T; Iguchi-Ariga, S M; Ariga, H

    1994-01-01

    A general and sensitive method for the mapping of initiation sites of DNA replication in vivo, developed by Vassilev and Johnson, has revealed replication origins in the region of simian virus 40 ori, in the regions upstream from the human c-myc gene and downstream from the Chinese hamster dihydrofolate reductase gene, and in the enhancer region of the mouse immunoglobulin heavy-chain gene. Here we report that the region containing the promoter of the human heat shock protein 70 (hsp70) gene was identified as a DNA replication origin in HeLa cells by this method. Several segments of the region were cloned into pUC19 and examined for autonomously replicating sequence (ARS) activity. The plasmids carrying the segments replicated episomally and semiconservatively when transfected into HeLa cells. The segments of ARS activity contained the sequences previously identified as binding sequences for a c-myc protein complex (T. Taira, Y. Negishi, F. Kihara, S. M. M. Iguchi-Ariga, and H. Ariga, Biochem. Biophys. Acta 1130:166-174, 1992). Mutations introduced within the c-myc protein complex binding sequences abolished the ARS activity. Moreover, the ARS plasmids stably replicated at episomal state for a long time in established cell lines. The results suggest that the promoter region of the human hsp70 gene plays a role in DNA replication as well as in transcription. PMID:8065368

  12. Immunogenetic mechanisms leading to thyroid autoimmunity: recent advances in identifying susceptibility genes and regions.

    PubMed

    Brand, Oliver J; Gough, Stephen C L

    2011-12-01

    The autoimmune thyroid diseases (AITD) include Graves' disease (GD) and Hashimoto's thyroiditis (HT), which are characterised by a breakdown in immune tolerance to thyroid antigens. Unravelling the genetic architecture of AITD is vital to a better understanding of AITD pathogenesis, which is required to advance therapeutic options in both disease management and prevention. The early whole-genome linkage and candidate gene association studies provided the first evidence that the HLA region and CTLA-4 represented AITD risk loci. Recent improvements in high-throughput genotyping technologies, the collection of larger disease cohorts and the cataloguing of genome-scale variation have facilitated genome-wide association studies and more thorough screening of candidate gene regions. This has allowed identification of many novel AITD risk genes and more detailed association mapping. The growing number of confirmed AITD susceptibility loci implicates a number of putative disease mechanisms, most of which are tightly linked with aspects of immune system function. The unprecedented advances in genetic study will allow future studies to identify further novel disease risk genes and to identify aetiological variants within specific gene regions, which will undoubtedly lead to a better understanding of AITD pathophysiology.

  13. Genome-wide functional screen identifies a compendium of genes affecting sensitivity to tamoxifen.

    PubMed

    Mendes-Pereira, Ana M; Sims, David; Dexter, Tim; Fenwick, Kerry; Assiotis, Ioannis; Kozarewa, Iwanka; Mitsopoulos, Costas; Hakas, Jarle; Zvelebil, Marketa; Lord, Christopher J; Ashworth, Alan

    2012-02-21

    Therapies that target estrogen signaling have made a very considerable contribution to reducing mortality from breast cancer. However, resistance to tamoxifen remains a major clinical problem. Here we have used a genome-wide functional profiling approach to identify multiple genes that confer resistance or sensitivity to tamoxifen. Combining whole-genome shRNA screening with massively parallel sequencing, we have profiled the impact of more than 56,670 RNA interference reagents targeting 16,487 genes on the cellular response to tamoxifen. This screen, along with subsequent validation experiments, identifies a compendium of genes whose silencing causes tamoxifen resistance (including BAP1, CLPP, GPRC5D, NAE1, NF1, NIPBL, NSD1, RAD21, RARG, SMC3, and UBA3) and also a set of genes whose silencing causes sensitivity to this endocrine agent (C10orf72, C15orf55/NUT, EDF1, ING5, KRAS, NOC3L, PPP1R15B, RRAS2, TMPRSS2, and TPM4). Multiple individual genes, including NF1, a regulator of RAS signaling, also correlate with clinical outcome after tamoxifen treatment.

  14. A strategy to identify genes associated with circulating solid tumor cell survival in peripheral blood.

    PubMed Central

    Fournier, M. V.; Carvalho, M. G.; Pardee, A. B.

    1999-01-01

    Efforts in metastasis research have centered on the phenotypic and genetic differences between primary site and metastatic site tumors. However, genes that may be used as molecular markers of metastasis in circulating tumor cells remain unidentified. Genes regulating the dissemination and survival of solid tumor cells in the blood, as well as their adaptation to new environments, could be candidates for unique metastatic tumor markers. Differential display (DD) was conducted to compare the blood of tumor-free individuals with the blood of patients with lung, breast, and colon cancers. Twenty-one up-expressed genes in the tumor patient blood samples but none in the tumor-free donor blood samples were identified. Nine of these samples were isolated, amplified, and directly sequenced. A gene AB-1 homologous to a Bcl-2 family member, which might function as an apoptosis inhibitor, was identified. The overexpression of an apoptosis inhibitor in blood from patients with metastatic tumors might be correlated with the capability of solid tumor cells to survive in peripheral blood. This is the first demonstration of the usefulness of comparing control and patient blood samples by DD to find novel potential genetic markers identifying metastasis in the blood. http://link.springer-ny.com/link/service/journals/00020/bibs/5n5p313.html PMID:10390547

  15. Lentiviral Vector-based Insertional Mutagenesis Identifies Genes Involved in the Resistance to Targeted Anticancer Therapies

    PubMed Central

    Ranzani, Marco; Annunziato, Stefano; Calabria, Andrea; Brasca, Stefano; Benedicenti, Fabrizio; Gallina, Pierangela; Naldini, Luigi; Montini, Eugenio

    2014-01-01

    The high transduction efficiency of lentiviral vectors in a wide variety of cells makes them an ideal tool for forward genetics screenings addressing issues of cancer research. Although molecular targeted therapies have provided significant advances in tumor treatment, relapses often occur by the expansion of tumor cell clones carrying mutations that confer resistance. Identification of the culprits of anticancer drug resistance is fundamental for the achievement of long-term response. Here, we developed a new lentiviral vector-based insertional mutagenesis screening to identify genes that confer resistance to clinically relevant targeted anticancer therapies. By applying this genome-wide approach to cell lines representing two subtypes of HER2+ breast cancer, we identified 62 candidate lapatinib resistance genes. We validated the top ranking genes, i.e., PIK3CA and PIK3CB, by showing that their forced expression confers resistance to lapatinib in vitro and found that their mutation/overexpression is associated with poor prognosis in human breast tumors. Then, we successfully applied this approach to the identification of erlotinib resistance genes in pancreatic cancer, thus showing the intrinsic versatility of the approach. The acquired knowledge can help identify combinations of targeted drugs to overcome the occurrence of resistance, thus opening new horizons for more effective treatment of tumors. PMID:25195596

  16. Progressive retinal atrophy in Schapendoes dogs: mutation of the newly identified CCDC66 gene.

    PubMed

    Dekomien, Gabriele; Vollrath, Conni; Petrasch-Parwez, Elisabeth; Boevé, Michael H; Akkad, Denis A; Gerding, Wanda M; Epplen, Jörg T

    2010-05-01

    Canine generalized progressive retinal atrophy (gPRA) is characterized by continuous degeneration of photoreceptor cells leading to night blindness and progressive vision loss. Until now, mutations in 11 genes have been described that account for gPRA in dogs, mostly following an autosomal recessive inheritance mode. Here, we describe a gPRA locus comprising the newly identified gene coiled-coil domain containing 66 (CCDC66) on canine chromosome 20, as identified via linkage analysis in the Schapendoes breed. Mutation screening of the CCDC66 gene revealed a 1-bp insertion in exon 6 leading to a stop codon as the underlying cause of disease. The insertion is present in all affected dogs in the homozygous state as well as in all obligatory mutation carriers in the heterozygous state. The CCDC66 gene is evolutionarily conserved in different vertebrate species and exhibits a complex pattern of differential RNA splicing resulting in various isoforms in the retina. Immunohistochemically, CCDC66 protein is detected mainly in the inner segments of photoreceptors in mouse, dog, and man. The affected Schapendoes retina lacks CCDC66 protein. Thus this natural canine model for gPRA yields superior potential to understand functional implications of this newly identified protein including its physiology, and it opens new perspectives for analyzing different aspects of the general pathophysiology of gPRA.

  17. A novel approach identifies new differentially methylated regions (DMRs) associated with imprinted genes

    PubMed Central

    Choufani, Sanaa; Shapiro, Jonathan S.; Susiarjo, Martha; Butcher, Darci T.; Grafodatskaya, Daria; Lou, Youliang; Ferreira, Jose C.; Pinto, Dalila; Scherer, Stephen W.; Shaffer, Lisa G.; Coullin, Philippe; Caniggia, Isabella; Beyene, Joseph; Slim, Rima; Bartolomei, Marisa S.; Weksberg, Rosanna

    2011-01-01

    Imprinted genes are critical for normal human growth and neurodevelopment. They are characterized by differentially methylated regions (DMRs) of DNA that confer parent of origin-specific transcription. We developed a new strategy to identify imprinted gene-associated DMRs. Using genome-wide methylation profiling of sodium bisulfite modified DNA from normal human tissues of biparental origin, candidate DMRs were identified by selecting CpGs with methylation levels consistent with putative allelic differential methylation. In parallel, the methylation profiles of tissues of uniparental origin, i.e., paternally-derived androgenetic complete hydatidiform moles (AnCHMs), and maternally-derived mature cystic ovarian teratoma (MCT), were examined and then used to identify CpGs with parent of origin-specific DNA methylation. With this approach, we found known DMRs associated with imprinted genomic regions as well as new DMRs for known imprinted genes, NAP1L5 and ZNF597, and novel candidate imprinted genes. The paternally methylated DMR for one candidate, AXL, a receptor tyrosine kinase, was also validated in experiments with mouse embryos that demonstrated Axl was expressed preferentially from the maternal allele in a DNA methylation-dependent manner. PMID:21324877

  18. Comparative Transcriptome Analysis Identifies Putative Genes Involved in the Biosynthesis of Xanthanolides in Xanthium strumarium L.

    PubMed Central

    Li, Yuanjun; Gou, Junbo; Chen, Fangfang; Li, Changfu; Zhang, Yansheng

    2016-01-01

    Xanthium strumarium L. is a traditional Chinese herb belonging to the Asteraceae family. The major bioactive components of this plant are sesquiterpene lactones (STLs), which include the xanthanolides. To date, the biogenesis of xanthanolides, especially their downstream pathway, remains largely unknown. In X. strumarium, xanthanolides primarily accumulate in its glandular trichomes. To identify putative gene candidates involved in the biosynthesis of xanthanolides, three X. strumarium transcriptomes, which were derived from the young leaves of two different cultivars and the purified glandular trichomes from one of the cultivars, were constructed in this study. In total, 157 million clean reads were generated and assembled into 91,861 unigenes, of which 59,858 unigenes were successfully annotated. All the genes coding for known enzymes in the upstream pathway to the biosynthesis of xanthanolides were present in the X. strumarium transcriptomes. From a comparative analysis of the X. strumarium transcriptomes, this study identified a number of gene candidates that are putatively involved in the downstream pathway to the synthesis of xanthanolides, such as four unigenes encoding CYP71 P450s, 50 unigenes for dehydrogenases, and 27 genes for acetyltransferases. The possible functions of these four CYP71 candidates are extensively discussed. In addition, 116 transcription factors that are highly expressed in X. strumarium glandular trichomes were also identified. Their possible regulatory roles in the biosynthesis of STLs are discussed. The global transcriptomic data for X. strumarium should provide a valuable resource for further research into the biosynthesis of xanthanolides. PMID:27625674

  19. Integromic Analysis of Genetic Variation and Gene Expression Identifies Networks for Cardiovascular Disease Phenotypes

    PubMed Central

    Yao, Chen; Chen, Brian H.; Joehanes, Roby; Otlu, Burcak; Zhang, Xiaoling; Liu, Chunyu; Huan, Tianxiao; Tastan, Oznur; Cupples, L. Adrienne; Meigs, James B.; Fox, Caroline S.; Freedman, Jane E.; Courchesne, Paul; O’Donnell, Christopher J.; Munson, Peter J.; Keles, Sunduz; Levy, Daniel

    2015-01-01

    Background: Cardiovascular disease (CVD) reflects a highly coordinated complex of traits. Although genome-wide association studies have reported numerous single nucleotide polymorphisms (SNPs) to be associated with CVD, the role of most of these variants in disease processes remains unknown. Methods and Results: We built a CVD network using 1512 SNPs associated with 21 CVD traits in genome-wide association studies (at P ≤ 5×10⁻⁸) and cross-linked different traits by virtue of their shared SNP associations. We then explored whole blood gene expression in relation to these SNPs in 5257 participants in the Framingham Heart Study. At a false discovery rate <0.05, we identified 370 cis-expression quantitative trait loci (eQTLs; SNPs associated with altered expression of nearby genes) and 44 trans-eQTLs (SNPs associated with altered expression of remote genes). The eQTL network revealed 13 CVD-related modules. Searching for association of eQTL genes with CVD risk factors (lipids, blood pressure, fasting blood glucose, and body mass index) in the same individuals, we found examples in which the expression of eQTL genes was significantly associated with these CVD phenotypes. In addition, mediation tests suggested that a subset of SNPs previously associated with CVD phenotypes in genome-wide association studies may exert their function by altering expression of eQTL genes (e.g., LDLR and PCSK7), which in turn may promote interindividual variation in phenotypes. Conclusions: Using a network approach to analyze CVD traits, we identified complex networks of SNP-phenotype and SNP-transcript connections. Integrating the CVD network with phenotypic data, we identified biological pathways that may provide insights into potential drug targets for treatment or prevention of CVD. PMID:25533967

  20. Interacting networks of resistance, virulence and core machinery genes identified by genome-wide epistasis analysis

    PubMed Central

    Pesonen, Maiju; Musser, James M.; Bentley, Stephen D.; Aurell, Erik; Corander, Jukka

    2017-01-01

    Recent advances in the scale and diversity of population genomic datasets for bacteria now provide the potential for genome-wide patterns of co-evolution to be studied at the resolution of individual bases. Here we describe a new statistical method, genomeDCA, which uses recent advances in computational structural biology to identify the polymorphic loci under the strongest co-evolutionary pressures. We apply genomeDCA to two large population data sets representing the major human pathogens Streptococcus pneumoniae (pneumococcus) and Streptococcus pyogenes (group A Streptococcus). For pneumococcus we identified 5,199 putative epistatic interactions between 1,936 sites. Over three-quarters of the links were between sites within the pbp2x, pbp1a and pbp2b genes, the sequences of which are critical in determining non-susceptibility to beta-lactam antibiotics. A network-based analysis found these genes were also coupled to that encoding dihydrofolate reductase, changes to which underlie trimethoprim resistance. Distinct from these antibiotic resistance genes, a large network component of 384 protein coding sequences encompassed many genes critical in basic cellular functions, while another distinct component included genes associated with virulence. The group A Streptococcus (GAS) data set population represents a clonal population with relatively little genetic variation and a high level of linkage disequilibrium across the genome. Despite this, we were able to pinpoint two RNA pseudouridine synthases, which were each strongly linked to a separate set of loci across the chromosome, representing biologically plausible targets of co-selection. The population genomic analysis method applied here identifies statistically significantly co-evolving locus pairs, potentially arising from fitness selection interdependence reflecting underlying protein-protein interactions, or genes whose product activities contribute to the same phenotype. This discovery approach greatly

  1. Interacting networks of resistance, virulence and core machinery genes identified by genome-wide epistasis analysis.

    PubMed

    Skwark, Marcin J; Croucher, Nicholas J; Puranen, Santeri; Chewapreecha, Claire; Pesonen, Maiju; Xu, Ying Ying; Turner, Paul; Harris, Simon R; Beres, Stephen B; Musser, James M; Parkhill, Julian; Bentley, Stephen D; Aurell, Erik; Corander, Jukka

    2017-02-01

    Recent advances in the scale and diversity of population genomic datasets for bacteria now provide the potential for genome-wide patterns of co-evolution to be studied at the resolution of individual bases. Here we describe a new statistical method, genomeDCA, which uses recent advances in computational structural biology to identify the polymorphic loci under the strongest co-evolutionary pressures. We apply genomeDCA to two large population data sets representing the major human pathogens Streptococcus pneumoniae (pneumococcus) and Streptococcus pyogenes (group A Streptococcus). For pneumococcus we identified 5,199 putative epistatic interactions between 1,936 sites. Over three-quarters of the links were between sites within the pbp2x, pbp1a and pbp2b genes, the sequences of which are critical in determining non-susceptibility to beta-lactam antibiotics. A network-based analysis found these genes were also coupled to that encoding dihydrofolate reductase, changes to which underlie trimethoprim resistance. Distinct from these antibiotic resistance genes, a large network component of 384 protein coding sequences encompassed many genes critical in basic cellular functions, while another distinct component included genes associated with virulence. The group A Streptococcus (GAS) data set population represents a clonal population with relatively little genetic variation and a high level of linkage disequilibrium across the genome. Despite this, we were able to pinpoint two RNA pseudouridine synthases, which were each strongly linked to a separate set of loci across the chromosome, representing biologically plausible targets of co-selection. The population genomic analysis method applied here identifies statistically significantly co-evolving locus pairs, potentially arising from fitness selection interdependence reflecting underlying protein-protein interactions, or genes whose product activities contribute to the same phenotype. This discovery approach greatly

  2. Exome sequencing identifies potential novel candidate genes in patients with unexplained colorectal adenomatous polyposis.

    PubMed

    Spier, Isabel; Kerick, Martin; Drichel, Dmitriy; Horpaopan, Sukanya; Altmüller, Janine; Laner, Andreas; Holzapfel, Stefanie; Peters, Sophia; Adam, Ronja; Zhao, Bixiao; Becker, Tim; Lifton, Richard P; Holinski-Feder, Elke; Perner, Sven; Thiele, Holger; Nöthen, Markus M; Hoffmann, Per; Timmermann, Bernd; Schweiger, Michal R; Aretz, Stefan

    2016-04-01

    In up to 30% of patients with colorectal adenomatous polyposis, no germline mutation in the known genes APC, causing familial adenomatous polyposis, MUTYH, causing MUTYH-associated polyposis, and POLE or POLD1, causing Polymerase-Proofreading-associated polyposis can be identified, although a hereditary etiology is likely. To uncover new causative genes, exome sequencing was performed using DNA from leukocytes and a total of 12 colorectal adenomas from seven unrelated patients with unexplained sporadic adenomatous polyposis. For data analysis and variant filtering, an established bioinformatics pipeline including in-house tools was applied. Variants were filtered for rare truncating point mutations and copy-number variants assuming a dominant, recessive, or tumor suppressor model of inheritance. Subsequently, targeted sequence analysis of the most promising candidate genes was performed in a validation cohort of 191 unrelated patients. All relevant variants were validated by Sanger sequencing. The analysis of exome sequencing data resulted in the identification of rare loss-of-function germline mutations in three promising candidate genes (DSC2, PIEZO1, ZSWIM7). In the validation cohort, further variants predicted to be pathogenic were identified in DSC2 and PIEZO1. According to the somatic mutation spectra, the adenomas in this patient cohort follow the classical pathways of colorectal tumorigenesis. The present study identified three candidate genes which might represent rare causes for a predisposition to colorectal adenoma formation. Especially PIEZO1 (FAM38A) and ZSWIM7 (SWS1) warrant further exploration. To evaluate the clinical relevance of these genes, investigation of larger patient cohorts and functional studies are required.

  3. De novo transcriptome sequencing in Pueraria lobata to identify putative genes involved in isoflavones biosynthesis.

    PubMed

    Wang, Xin; Li, Shutao; Li, Jia; Li, Changfu; Zhang, Yansheng

    2015-05-01

    Using Illumina sequencing technology, we generated large-scale transcriptome sequencing data and identified many putative genes involved in isoflavones biosynthesis in Pueraria lobata. Pueraria lobata, a member of the Leguminosae family, is a traditional Chinese herb which has been used since ancient times. P. lobata root has extensive clinical uses because it is a rich source of isoflavones, including daidzin and puerarin. However, isoflavone metabolism and the corresponding genes in this pathway remain largely uncharacterized. In this study, the de novo transcriptome of P. lobata root and leaf was sequenced using the Solexa sequencing platform. Over 140 million high-quality reads were assembled into 163,625 unigenes, of which about 43.1% were aligned to the Nr protein database. Using the RPKM (reads per kilobase per million reads) method, 3,148 unigenes were found to be upregulated, and 2,011 genes were downregulated in the leaf as compared to those in the root. Towards a further understanding of these differentially expressed genes, Gene Ontology enrichment and metabolic pathway enrichment analyses were performed. Based on these results, 47 novel structural genes were identified in the biosynthesis of isoflavones. Also, 22 putative UDP glycosyltransferases and 45 O-methyltransferases unigenes were identified as the candidates most likely to be involved in the tailoring processes of the isoflavonoid downstream pathway. Moreover, MYB transcription factors were analyzed, and 133 of them were found to have higher expression levels in the roots than in the leaves. In conclusion, the de novo transcriptome investigation of these unique transcripts provided an invaluable resource for the global discovery of functional genes related to isoflavones biosynthesis in P. lobata.
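
    The RPKM normalization named in this abstract can be illustrated with a minimal sketch. The function name, argument names and the example numbers below are hypothetical and are not taken from the study; the sketch only shows the standard definition (reads mapped to a gene, scaled by gene length in kilobases and by library size in millions of mapped reads).

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = read_count / (gene_length_bp / 1e3) / (total_mapped_reads / 1e6)
         = read_count * 1e9 / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000 bp unigene
# in a library of 10 million mapped reads.
example_value = rpkm(500, 2000, 10_000_000)
print(example_value)  # 25.0
```

    Dividing by both length and library size is what makes values comparable between unigenes of different lengths and between samples sequenced to different depths, such as the leaf and root libraries compared above.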

  4. Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes

    PubMed Central

    Nowrousian, Minou; Teichert, Ines; Masloff, Sandra; Kück, Ulrich

    2012-01-01

    The study of mutants to elucidate gene functions has a long and successful history; however, to discover causative mutations in mutants that were generated by random mutagenesis often takes years of laboratory work and requires previously generated genetic and/or physical markers, or resources like DNA libraries for complementation. Here, we present an alternative method to identify defective genes in developmental mutants of the filamentous fungus Sordaria macrospora through Illumina/Solexa whole-genome sequencing. We sequenced pooled DNA from progeny of crosses of three mutants and the wild type and were able to pinpoint the causative mutations in the mutant strains through bioinformatics analysis. One mutant is a spore color mutant, and the mutated gene encodes a melanin biosynthesis enzyme. The causative mutation is a G to A change in the first base of an intron, leading to a splice defect. The second mutant carries an allelic mutation in the pro41 gene encoding a protein essential for sexual development. In the mutant, we detected a complex pattern of deletion/rearrangements at the pro41 locus. In the third mutant, a point mutation in the stop codon of a transcription factor-encoding gene leads to the production of immature fruiting bodies. For all mutants, transformation with a wild type-copy of the affected gene restored the wild-type phenotype. Our data demonstrate that whole-genome sequencing of mutant strains is a rapid method to identify developmental genes in an organism that can be genetically crossed and where a reference genome sequence is available, even without prior mapping information. PMID:22384404

  5. A functional screen for copper homeostasis genes identifies a pharmacologically tractable cellular system.

    PubMed

    Schlecht, Ulrich; Suresh, Sundari; Xu, Weihong; Aparicio, Ana Maria; Chu, Angela; Proctor, Michael J; Davis, Ronald W; Scharfe, Curt; St Onge, Robert P

    2014-04-05

    Copper is essential for the survival of aerobic organisms. If copper is not properly regulated in the body, however, it can be extremely cytotoxic, and genetic mutations that compromise copper homeostasis result in severe clinical phenotypes. Understanding how cells maintain optimal copper levels is therefore highly relevant to human health. We found that addition of copper (Cu) to culture medium leads to increased respiratory growth of yeast, a phenotype which we then systematically and quantitatively measured in 5050 homozygous diploid deletion strains. Cu's positive effect on respiratory growth was quantitatively reduced in deletion strains representing 73 different genes, the functions of which identify increased iron uptake as a cause of the increase in growth rate. Conversely, these effects were enhanced in strains representing 93 genes. Many of these strains exhibited respiratory defects that were specifically rescued by supplementing the growth medium with Cu. Among the genes identified are known and direct regulators of copper homeostasis, genes required to maintain low vacuolar pH, and genes where evidence supporting a functional link with Cu has been heretofore lacking. Roughly half of the genes are conserved in man, and several of these are associated with Mendelian disorders, including the Cu-imbalance syndromes Menkes and Wilson's disease. We additionally demonstrate that pharmacological agents, including the approved drug disulfiram, can rescue Cu-deficiencies of both environmental and genetic origin. A functional screen in yeast has expanded the list of genes required for Cu-dependent fitness, revealing a complex cellular system with implications for human health. Respiratory fitness defects arising from perturbations in this system can be corrected with pharmacological agents that increase intracellular copper concentrations.

  6. Expression profiling identifies genes expressed early during lint fibre initiation in cotton.

    PubMed

    Wu, Yingru; Machado, Adriane C; White, Rosemary G; Llewellyn, Danny J; Dennis, Elizabeth S

    2006-01-01

    Cotton fibres are a subset of single epidermal cells that elongate from the seed coat to produce the long cellulose strands or lint used for spinning into yarn. To identify genes that might regulate lint fibre initiation, expression profiles of 0 days post-anthesis (dpa) whole ovules from six reduced fibre or fibreless mutants were compared with wild-type linted cotton using cDNA microarrays. Numerous clones were differentially expressed, but when only those genes that are normally expressed in the ovule outer integument (where fibres develop) were considered, just 13 different cDNA clones were down-regulated in some or all of the mutants. These included: a Myb transcription factor (GhMyb25) similar to the Antirrhinum Myb AmMIXTA, a putative homeodomain protein (related to Arabidopsis ATML1), a cyclin D gene, some previously identified fibre-expressed structural and metabolic genes, such as lipid transfer protein, alpha-expansin and sucrose synthase, as well as some unknown genes. Laser capture microdissection and reverse transcription-PCR were used to show that both the GhMyb25 and the homeodomain gene were predominantly ovule specific and were up-regulated on the day of anthesis in fibre initials relative to adjacent non-fibre ovule epidermal cells. Their spatial and temporal expression pattern therefore coincided with the time and location of fibre initiation. Constitutive overexpression of GhMyb25 in transgenic tobacco resulted in an increase in branched long-stalked leaf trichomes. The involvement of cell cycle genes prompted DNA content measurements that indicated that fibre initials, like leaf trichomes, undergo DNA endoreduplication. Cotton fibre initiation therefore has some parallels with leaf trichome development, although the detailed molecular mechanisms are clearly different.

  7. Gene expression profiling identifies distinct molecular subgroups of leiomyosarcoma with clinical relevance

    PubMed Central

    Lee, Yin-Fai; Roe, Toby; Mangham, D Chas; Fisher, Cyril; Grimer, Robert J; Judson, Ian

    2016-01-01

    Background: Soft tissue sarcomas are heterogeneous and a major complication in their management is that the existing classification scheme is not definitive and is still evolving. Leiomyosarcomas, a major histologic category of soft tissue sarcomas, are malignant tumours displaying smooth muscle differentiation. Although defined as a single group, they exhibit a wide range of clinical behaviour. We aimed to carry out molecular classification to identify new molecular subgroups with clinical relevance. Methods: We used gene expression profiling on 20 extra-uterine leiomyosarcomas and cross-study analyses for molecular classification of leiomyosarcomas. Clinical significance of the subgroupings was investigated. Results: We have identified two distinct molecular subgroups of leiomyosarcomas. One group was characterised by high expression of 26 genes that included many genes from the sub-classification gene cluster proposed by Nielsen et al. These sub-classification genes include genes that have importance structurally, as well as in cell signalling. Notably, we found a statistically significant association of the subgroupings with tumour grade. Further refinement led to a group of 15 genes that could recapitulate the tumour subgroupings in our data set and in a second independent sarcoma set. Remarkably, cross-study analyses suggested that these molecular subgroups could be found in four independent data sets, providing strong support for their existence. Conclusions: Our study strongly supported the existence of distinct leiomyosarcoma molecular subgroups, which have clinical association with tumour grade. Our findings will aid in advancing the classification of leiomyosarcomas and lead to more individualised and better management of the disease. PMID:27607470

  8. A functional screen for copper homeostasis genes identifies a pharmacologically tractable cellular system

    PubMed Central

    2014-01-01

    Background Copper is essential for the survival of aerobic organisms. If copper is not properly regulated in the body, however, it can be extremely cytotoxic, and genetic mutations that compromise copper homeostasis result in severe clinical phenotypes. Understanding how cells maintain optimal copper levels is therefore highly relevant to human health. Results We found that addition of copper (Cu) to culture medium leads to increased respiratory growth of yeast, a phenotype which we then systematically and quantitatively measured in 5050 homozygous diploid deletion strains. Cu’s positive effect on respiratory growth was quantitatively reduced in deletion strains representing 73 different genes, the functions of which identify increased iron uptake as a cause of the increase in growth rate. Conversely, these effects were enhanced in strains representing 93 genes. Many of these strains exhibited respiratory defects that were specifically rescued by supplementing the growth medium with Cu. Among the genes identified are known and direct regulators of copper homeostasis, genes required to maintain low vacuolar pH, and genes where evidence supporting a functional link with Cu has been heretofore lacking. Roughly half of the genes are conserved in man, and several of these are associated with Mendelian disorders, including the Cu-imbalance syndromes Menkes and Wilson’s disease. We additionally demonstrate that pharmacological agents, including the approved drug disulfiram, can rescue Cu-deficiencies of both environmental and genetic origin. Conclusions A functional screen in yeast has expanded the list of genes required for Cu-dependent fitness, revealing a complex cellular system with implications for human health. Respiratory fitness defects arising from perturbations in this system can be corrected with pharmacological agents that increase intracellular copper concentrations. PMID:24708151

  9. Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes.

    PubMed

    Nowrousian, Minou; Teichert, Ines; Masloff, Sandra; Kück, Ulrich

    2012-02-01

    The study of mutants to elucidate gene functions has a long and successful history; however, to discover causative mutations in mutants that were generated by random mutagenesis often takes years of laboratory work and requires previously generated genetic and/or physical markers, or resources like DNA libraries for complementation. Here, we present an alternative method to identify defective genes in developmental mutants of the filamentous fungus Sordaria macrospora through Illumina/Solexa whole-genome sequencing. We sequenced pooled DNA from progeny of crosses of three mutants and the wild type and were able to pinpoint the causative mutations in the mutant strains through bioinformatics analysis. One mutant is a spore color mutant, and the mutated gene encodes a melanin biosynthesis enzyme. The causative mutation is a G to A change in the first base of an intron, leading to a splice defect. The second mutant carries an allelic mutation in the pro41 gene encoding a protein essential for sexual development. In the mutant, we detected a complex pattern of deletion/rearrangements at the pro41 locus. In the third mutant, a point mutation in the stop codon of a transcription factor-encoding gene leads to the production of immature fruiting bodies. For all mutants, transformation with a wild-type copy of the affected gene restored the wild-type phenotype. Our data demonstrate that whole-genome sequencing of mutant strains is a rapid method to identify developmental genes in an organism that can be genetically crossed and where a reference genome sequence is available, even without prior mapping information.

  10. Engineering and Functional Characterization of Fusion Genes Identifies Novel Oncogenic Drivers of Cancer.

    PubMed

    Lu, Hengyu; Villafane, Nicole; Dogruluk, Turgut; Grzeskowiak, Caitlin L; Kong, Kathleen; Tsang, Yiu Huen; Zagorodna, Oksana; Pantazi, Angeliki; Yang, Lixing; Neill, Nicholas J; Kim, Young Won; Creighton, Chad J; Verhaak, Roel G; Mills, Gordon B; Park, Peter J; Kucherlapati, Raju; Scott, Kenneth L

    2017-07-01

    Oncogenic gene fusions drive many human cancers, but tools to more quickly unravel their functional contributions are needed. Here we describe methodology permitting fusion gene construction for functional evaluation. Using this strategy, we engineered the known fusion oncogenes, BCR-ABL1, EML4-ALK, and ETV6-NTRK3, as well as 20 previously uncharacterized fusion genes identified in The Cancer Genome Atlas datasets. In addition to confirming oncogenic activity of the known fusion oncogenes engineered by our construction strategy, we validated five novel fusion genes involving MET, NTRK2, and BRAF kinases that exhibited potent transforming activity and conferred sensitivity to FDA-approved kinase inhibitors. Our fusion construction strategy also enabled domain-function studies of BRAF fusion genes. Our results confirmed other reports that the transforming activity of BRAF fusions results from truncation-mediated loss of inhibitory domains within the N-terminus of the BRAF protein. BRAF mutations residing within this inhibitory region may provide a means for BRAF activation in cancer, therefore we leveraged the modular design of our fusion gene construction methodology to screen N-terminal domain mutations discovered in tumors that are wild-type at the BRAF mutation hotspot, V600. We identified an oncogenic mutation, F247L, whose expression robustly activated the MAPK pathway and sensitized cells to BRAF and MEK inhibitors. When applied broadly, these tools will facilitate rapid fusion gene construction for subsequent functional characterization and translation into personalized treatment strategies. Cancer Res; 77(13); 3502-12. ©2017 AACR.

  11. Genome wide transcriptome analysis of dendritic cells identifies genes with altered expression in psoriasis.

    PubMed

    Filkor, Kata; Hegedűs, Zoltán; Szász, András; Tubak, Vilmos; Kemény, Lajos; Kondorosi, Éva; Nagy, István

    2013-01-01

    Activation of dendritic cells by different pathogens induces the secretion of proinflammatory mediators resulting in local inflammation. Importantly, innate immunity must be properly controlled, as its continuous activation leads to the development of chronic inflammatory diseases such as psoriasis. Lipopolysaccharide (LPS) or peptidoglycan (PGN) induced tolerance, a phenomenon of transient unresponsiveness of cells to repeated or prolonged stimulation, proved a valuable model for the study of chronic inflammation. Thus, the aim of this study was the identification of the transcriptional diversity of primary human immature dendritic cells (iDCs) upon PGN induced tolerance. Using the SAGE-Seq approach, a tag-based transcriptome sequencing method, we investigated gene expression changes of primary human iDCs upon stimulation or restimulation with Staphylococcus aureus derived PGN, a widely used TLR2 ligand. Based on the expression pattern of the altered genes, we identified non-tolerizeable and tolerizeable genes. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis showed marked enrichment of immune-, cell cycle- and apoptosis-related genes. In parallel to the marked induction of proinflammatory mediators, negative feedback regulators of innate immunity, such as TNFAIP3, TNFAIP8, Tyro3 and Mer, are markedly downregulated in tolerant cells. We also demonstrate that the expression pattern of TNFAIP3 and TNFAIP8 is altered in both lesional and non-lesional skin of psoriatic patients. Finally, we show that pretreatment of immature dendritic cells with anti-TNF-α inhibits the expression of IL-6 and CCL1 in tolerant iDCs and partially releases the suppression of TNFAIP8. Our findings suggest that after PGN stimulation/restimulation the host cell utilizes different mechanisms in order to maintain critical balance between inflammation and tolerance. Importantly, the transcriptome sequencing of stimulated/restimulated iDCs identified numerous genes with

  12. Iterative carotenogenic screens identify combinations of yeast gene deletions that enhance sclareol production.

    PubMed

    Trikka, Fotini A; Nikolaidis, Alexandros; Athanasakoglou, Anastasia; Andreadelli, Aggeliki; Ignea, Codruta; Kotta, Konstantia; Argiriou, Anagnostis; Kampranis, Sotirios C; Makris, Antonios M

    2015-04-24

    Terpenoids (isoprenoids) have numerous applications in flavors, fragrances, drugs and biofuels. The number of microbially produced terpenoids is increasing as new biosynthetic pathways are being elucidated. However, efforts to improve terpenoid production in yeast have mostly taken advantage of existing knowledge of the sterol biosynthetic pathway, while many additional factors may affect the output of the engineered system. Aiming to develop a yeast strain that can support high titers of sclareol, a diterpene of great importance for the perfume industry, we sought to identify gene deletions that improved carotenoid, and thus potentially sclareol, production. Using a carotenogenic screen, the best 100 deletion mutants, out of 4,700 mutant strains, were selected to create a subset for further analysis. To identify combinations of deletions that cooperate to further boost production, iterative carotenogenic screens were applied, and each time the top performing gene deletions were further ranked according to the number of genetic and physical interactions known for each specific gene. The gene selected in each round was deleted and the resulting strain was employed in a new round of selection. This approach led to the development of an EG60 derived haploid strain combining six deletions (rox1, dos2, yer134c, vba5, ynr063w and ygr259c) and exhibiting a 40-fold increase in carotenoid and 12-fold increase in sclareol titers, reaching 750 mg/L sclareol in shake flask cultivation. Using an iterative approach, we identified novel combinations of yeast gene deletions that improve carotenoid and sclareol production titers without compromising strain growth and viability. Most of the identified deletions have not previously been implicated in sterol pathway control. Applying the same approach using a different starting point could yield alternative sets of deletions with similar or improved outcome.
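
    The iterative selection described in this abstract can be sketched as a greedy loop. This is only an illustration under stated assumptions, not the authors' pipeline: the `screen` callback, the `interaction_count` table, the `top_n` cutoff, and the rule of picking the shortlisted deletion with the most known interactions are all hypothetical (the abstract does not say in which direction the interaction ranking was applied).

```python
from typing import Callable, Dict, List


def iterative_screen(
    candidates: List[str],
    screen: Callable[[List[str], str], float],
    interaction_count: Dict[str, int],
    rounds: int,
    top_n: int,
) -> List[str]:
    """Greedy selection: one gene deletion is added to the strain per round."""
    strain: List[str] = []        # deletions accumulated so far
    remaining = list(candidates)  # deletions not yet in the strain
    for _ in range(rounds):
        if not remaining:
            break
        # Score every remaining deletion in the current strain background.
        scored = sorted(remaining, key=lambda g: screen(strain, g), reverse=True)
        shortlist = scored[:top_n]
        # ASSUMPTION: prefer the shortlisted gene with the most known interactions.
        chosen = max(shortlist, key=lambda g: interaction_count.get(g, 0))
        strain.append(chosen)
        remaining.remove(chosen)
    return strain


# Toy inputs, invented for illustration only.
toy_scores = {"a": 1.0, "b": 3.0, "c": 2.0, "d": 0.5}
toy_interactions = {"a": 5, "b": 1, "c": 4, "d": 9}
selected = iterative_screen(
    ["a", "b", "c", "d"],
    lambda strain, gene: toy_scores[gene] + len(strain),
    toy_interactions,
    rounds=2,
    top_n=2,
)
print(selected)
```

    In a real screen the `screen` callback would stand for a wet-lab measurement in the current strain background, so each round depends on the deletions chosen before it.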

  13. Identifying novel mycobacterial stress associated genes using a random mutagenesis screen in Mycobacterium smegmatis.

    PubMed

    Viswanathan, Gopinath; Joshi, Shrilaxmi V; Sridhar, Aditi; Dutta, Sayantanee; Raghunand, Tirumalai R

    2015-12-10

    Cell envelope associated components of Mycobacterium tuberculosis (M.tb) have been implicated in stress response, immune modulation and in vivo survival of the pathogen. Although many such factors have been identified, there is a large disparity between the number of genes predicted to be involved in functions linked to the envelope and those described in the literature. To identify and characterise novel stress related factors associated with the mycobacterial cell envelope, we isolated colony morphotype mutants of Mycobacterium smegmatis (M. smegmatis), based on the hypothesis that mutants with unusual colony morphology may have defects in the biosynthesis of cell envelope components. On testing their susceptibility to stress conditions relevant to M.tb physiology, multiple mutants were found to be sensitive to Isoniazid, Diamide and H2O2, indicative of altered permeability due to changes in cell envelope composition. Two mutants showed defects in biofilm formation implying possible roles for the target genes in antibiotic tolerance and/or virulence. These assays identified novel stress associated roles for several mycobacterial genes including sahH, tatB and aceE. Complementation analysis of selected mutants with the M. smegmatis genes and their M.tb homologues showed phenotypic restoration, validating their link to the observed phenotypes. A mutant carrying an insertion in fhaA encoding a forkhead associated domain containing protein, showed reduced survival in THP-1 macrophages, providing in vivo validation to this screen. Taken together, these results suggest that the M.tb homologues of a majority of the identified genes may play significant roles in the pathogenesis of tuberculosis.

  14. A cross-species transcriptomics approach to identify genes involved in leaf development

    PubMed Central

    Street, Nathaniel Robert; Sjödin, Andreas; Bylesjö, Max; Gustafsson, Petter; Trygg, Johan; Jansson, Stefan

    2008-01-01

    Background We have made use of publicly available gene expression data to identify transcription factors and transcriptional modules (regulons) associated with leaf development in Populus. Different tissue types were compared to identify genes informative in the discrimination of leaf and non-leaf tissues. Transcriptional modules within this set of genes were identified in a much wider set of microarray data collected from leaves in a number of developmental, biotic, abiotic and transgenic experiments. Results Transcription factors that were over represented in leaf EST libraries and that were useful for discriminating leaves from other tissues were identified, revealing that the C2C2-YABBY, CCAAT-HAP3 and 5, MYB, and ZF-HD families are particularly important in leaves. The expression of transcriptional modules and transcription factors was examined across a number of experiments to select those that were particularly active during the early stages of leaf development. Two transcription factors were found to collocate to previously published Quantitative Trait Loci (QTL) for leaf length. We also found that miRNA family 396 may be important in the control of leaf development, with three members of the family collocating with clusters of leaf development QTL. Conclusion This work provides a set of candidate genes involved in the control and processes of leaf development. This resource can be used for a wide variety of purposes such as informing the selection of candidate genes for association mapping or for the selection of targets for reverse genetics studies to further understanding of the genetic control of leaf size and shape. PMID:19061504

  15. Transposon mutagenesis identifies genes driving hepatocellular carcinoma in a chronic hepatitis B mouse model

    PubMed Central

    Bard-Chapeau, Emilie A.; Nguyen, Anh-Tuan; Rust, Alistair G.; Sayadi, Ahmed; Lee, Philip; Chua, Belinda Q; New, Lee-Sun; de Jong, Johann; Ward, Jerrold M.; Chin, Christopher KY.; Chew, Valerie; Toh, Han Chong; Abastado, Jean-Pierre; Benoukraf, Touati; Soong, Richie; Bard, Frederic A.; Dupuy, Adam J.; Johnson, Randy L.; Radda, George K.; Chan, Eric CY.; Wessels, Lodewyk FA.; Adams, David J.

    2014-01-01

    The most common risk factor for developing hepatocellular carcinoma (HCC) is chronic infection with hepatitis B virus (HBV). To better understand the evolutionary forces driving HCC we performed a near saturating transposon mutagenesis screen in a mouse HBV model of HCC. This screen identified 21 candidate early stage drivers, and a bewildering number (2860) of candidate later stage drivers, that were enriched for genes mutated, deregulated, or that function in signaling pathways important for human HCC, with a striking 1199 genes linked to cellular metabolic processes. Our study provides a comprehensive overview of the genetic landscape of HCC. PMID:24316982

  16. An in vivo screen identifies ependymoma oncogenes and tumor-suppressor genes.

    PubMed

    Mohankumar, Kumarasamypet M; Currle, David S; White, Elsie; Boulos, Nidal; Dapper, Jason; Eden, Christopher; Nimmervoll, Birgit; Thiruvenkatam, Radhika; Connelly, Michele; Kranenburg, Tanya A; Neale, Geoffrey; Olsen, Scott; Wang, Yong-Dong; Finkelstein, David; Wright, Karen; Gupta, Kirti; Ellison, David W; Thomas, Arzu Onar; Gilbertson, Richard J

    2015-08-01

    Cancers are characterized by non-random chromosome copy number alterations that presumably contain oncogenes and tumor-suppressor genes (TSGs). The affected loci are often large, making it difficult to pinpoint which genes are driving the cancer. Here we report a cross-species in vivo screen of 84 candidate oncogenes and 39 candidate TSGs, located within 28 recurrent chromosomal alterations in ependymoma. Through a series of mouse models, we validate eight new ependymoma oncogenes and ten new ependymoma TSGs that converge on a small number of cell functions, including vesicle trafficking, DNA modification and cholesterol biosynthesis, identifying these as potential new therapeutic targets.

  17. Back to the sea twice: identifying candidate plant genes for molecular evolution to marine life

    PubMed Central

    2011-01-01

    Background Seagrasses are a polyphyletic group of monocotyledonous angiosperms that have adapted to a completely submerged lifestyle in marine waters. Here, we exploit two collections of expressed sequence tags (ESTs) of two wide-spread and ecologically important seagrass species, the Mediterranean seagrass Posidonia oceanica (L.) Delile and the eelgrass Zostera marina L., which have independently evolved from aquatic ancestors. This replicated, yet independent evolutionary history facilitates the identification of traits that may have evolved in parallel and are possible instrumental candidates for adaptation to a marine habitat. Results In our study, we provide the first quantitative perspective on molecular adaptations in two seagrass species. By constructing orthologous gene clusters shared between two seagrasses (Z. marina and P. oceanica) and eight distantly related terrestrial angiosperm species, 51 genes could be identified with detection of positive selection along the seagrass branches of the phylogenetic tree. Characterization of these positively selected genes using KEGG pathways and the Gene Ontology uncovered that these genes are mostly involved in translation, metabolism, and photosynthesis. Conclusions These results provide first insights into which seagrass genes have diverged from their terrestrial counterparts via an initial aquatic stage characteristic of the order and to the derived fully-marine stage characteristic of seagrasses. We discuss how adaptive changes in these processes may have contributed to the evolution towards an aquatic and marine existence. PMID:21226908

  18. Quantitative analysis of bristle number in Drosophila mutants identifies genes involved in neural development

    NASA Technical Reports Server (NTRS)

    Norga, Koenraad K.; Gurganus, Marjorie C.; Dilda, Christy L.; Yamamoto, Akihiko; Lyman, Richard F.; Patel, Prajal H.; Rubin, Gerald M.; Hoskins, Roger A.; Mackay, Trudy F.; Bellen, Hugo J.

    2003-01-01

    BACKGROUND: The identification of the function of all genes that contribute to specific biological processes and complex traits is one of the major challenges in the postgenomic era. One approach is to employ forward genetic screens in genetically tractable model organisms. In Drosophila melanogaster, P element-mediated insertional mutagenesis is a versatile tool for the dissection of molecular pathways, and there is an ongoing effort to tag every gene with a P element insertion. However, the vast majority of P element insertion lines are viable and fertile as homozygotes and do not exhibit obvious phenotypic defects, perhaps because of the tendency for P elements to insert 5' of transcription units. Quantitative genetic analysis of subtle effects of P element mutations that have been induced in an isogenic background may be a highly efficient method for functional genome annotation. RESULTS: Here, we have tested the efficacy of this strategy by assessing the extent to which screening for quantitative effects of P elements on sensory bristle number can identify genes affecting neural development. We find that such quantitative screens uncover an unusually large number of genes that are known to function in neural development, as well as genes with yet uncharacterized effects on neural development, and novel loci. CONCLUSIONS: Our findings establish the use of quantitative trait analysis for functional genome annotation through forward genetics. Similar analyses of quantitative effects of P element insertions will facilitate our understanding of the genes affecting many other complex traits in Drosophila.

  19. Identifying photoreceptors in blind eyes caused by RPE65 mutations: Prerequisite for human gene therapy success.

    PubMed

    Jacobson, Samuel G; Aleman, Tomas S; Cideciyan, Artur V; Sumaroka, Alexander; Schwartz, Sharon B; Windsor, Elizabeth A M; Traboulsi, Elias I; Heon, Elise; Pittler, Steven J; Milam, Ann H; Maguire, Albert M; Palczewski, Krzysztof; Stone, Edwin M; Bennett, Jean

    2005-04-26

    Mutations in RPE65, a gene essential to normal operation of the visual (retinoid) cycle, cause the childhood blindness known as Leber congenital amaurosis (LCA). Retinal gene therapy restores vision to blind canine and murine models of LCA. Gene therapy in blind humans with LCA from RPE65 mutations may also have potential for success but only if the retinal photoreceptor layer is intact, as in the early-disease stage-treated animals. Here, we use high-resolution in vivo microscopy to quantify photoreceptor layer thickness in the human disease to define the relationship of retinal structure to vision and determine the potential for gene therapy success. The normally cone photoreceptor-rich central retina and rod-rich regions were studied. Despite severely reduced cone vision, many RPE65-mutant retinas had near-normal central microstructure. Absent rod vision was associated with a detectable but thinned photoreceptor layer. We asked whether abnormally thinned RPE65-mutant retina with photoreceptor loss would respond to treatment. Gene therapy in Rpe65(-/-) mice at advanced-disease stages, a more faithful mimic of the humans we studied, showed success but only in animals with better-preserved photoreceptor structure. The results indicate that identifying and then targeting retinal locations with retained photoreceptors will be a prerequisite for successful gene therapy in humans with RPE65 mutations and in other retinal degenerative disorders now moving from proof-of-concept studies toward clinical trials.

  20. GWAS identifies novel SLE susceptibility genes and explains the association of the HLA region.

    PubMed

    Armstrong, D L; Zidovetzki, R; Alarcón-Riquelme, M E; Tsao, B P; Criswell, L A; Kimberly, R P; Harley, J B; Sivils, K L; Vyse, T J; Gaffney, P M; Langefeld, C D; Jacob, C O

    2014-09-01

    In a genome-wide association study (GWAS) of individuals of European ancestry afflicted with systemic lupus erythematosus (SLE) the extensive utilization of imputation, step-wise multiple regression, lasso regularization and increasing study power by utilizing false discovery rate instead of a Bonferroni multiple test correction enabled us to identify 13 novel non-human leukocyte antigen (HLA) genes and confirmed the association of four genes previously reported to be associated. Novel genes associated with SLE susceptibility included two transcription factors (EHF and MED1), two components of the NF-κB pathway (RASSF2 and RNF114), one gene involved in adhesion and endothelial migration (CNTN6) and two genes involved in antigen presentation (BIN1 and SEC61G). In addition, the strongly significant association of multiple single-nucleotide polymorphisms (SNPs) in the HLA region was assigned to HLA alleles and serotypes and deconvoluted into four primary signals. The novel SLE-associated genes point to new directions for both the diagnosis and treatment of this debilitating autoimmune disease.

  1. GWAS identifies novel SLE susceptibility genes and explains the association of the HLA region

    PubMed Central

    Armstrong, Don L.; Zidovetzki, Raphael; Alarcón-Riquelme, Marta E; Tsao, Betty P; Criswell, Lindsey A; Kimberly, Robert P; Harley, John B; Sivils, Kathy L; Vyse, Timothy J; Gaffney, Patrick M.; Langefeld, Carl D; Jacob, Chaim O.

    2014-01-01

    In a Genome Wide Association Study (GWAS) of individuals of European ancestry afflicted with Systemic Lupus Erythematosus (SLE) the extensive utilization of imputation, stepwise multiple regression, lasso regularization, and increasing study power by utilizing False Discovery Rate (FDR) instead of a Bonferroni multiple test correction enabled us to identify 13 novel non-human leukocyte antigen (HLA) genes and confirmed the association of 4 genes previously reported to be associated. Novel genes associated with SLE susceptibility included two transcription factors (EHF and MED1), two components of the NFκB pathway (RASSF2 and RNF114), one gene involved in adhesion and endothelial migration (CNTN6), and two genes involved in antigen presentation (BIN1 and SEC61G). In addition, the strongly significant association of multiple single nucleotide polymorphisms (SNPs) in the HLA region was assigned to HLA alleles and serotypes and deconvoluted into four primary signals. The novel SLE-associated genes point to new directions for both the diagnosis and treatment of this debilitating autoimmune disease. PMID:24871463
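
    To illustrate why controlling the false discovery rate can give more power than a Bonferroni correction, the sketch below compares the two on the same p-values. It uses the Benjamini-Hochberg step-up procedure as one common FDR method; the abstract does not say which FDR procedure the authors used, and the p-values, function names, and 0.05 thresholds here are invented for illustration.

```python
from typing import List


def bonferroni_reject(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Reject each test whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values: List[float], q: float = 0.05) -> List[bool]:
    """Benjamini-Hochberg step-up procedure controlling the FDR at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * q / m.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            largest_k = rank
    # Reject the k smallest p-values.
    reject = [False] * m
    for idx in order[:largest_k]:
        reject[idx] = True
    return reject


# Made-up p-values for illustration only; not data from the study.
example_p = [0.001, 0.008, 0.012, 0.03, 0.04, 0.2, 0.5, 0.9]
n_bonferroni = sum(bonferroni_reject(example_p))
n_bh = sum(benjamini_hochberg_reject(example_p))
print(n_bonferroni, n_bh)
```

    On these toy values the Bonferroni threshold (0.05 / 8) admits one test, while the step-up rule admits three, at the cost of tolerating a controlled fraction of false positives among the discoveries.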

  2. Transcriptomic profiling in muscle and adipose tissue identifies genes related to growth and lipid deposition.

    PubMed

    Tao, Xuan; Liang, Yan; Yang, Xuemei; Pang, Jianhui; Zhong, Zhijun; Chen, Xiaohui; Yang, Yuekui; Zeng, Kai; Kang, Runming; Lei, Yunfeng; Ying, Sancheng; Gong, Jianjun; Gu, Yiren; Lv, Xuebin

    2017-01-01

    Growth performance and meat quality are important traits for the pig industry and consumers. Adipose tissue is the main site at which fat storage and fatty acid synthesis occur. Therefore, we combined high-throughput transcriptomic sequencing in adipose and muscle tissues with the quantification of corresponding phenotypic features using seven Chinese indigenous pig breeds and one Western commercial breed (Yorkshire). We obtained data on 101 phenotypic traits, from which principal component analysis distinguished two groups: one associated with the Chinese breeds and one with Yorkshire. The numbers of differentially expressed genes between all Chinese breeds and Yorkshire were shown to be 673 and 1056 in adipose and muscle tissues, respectively. Functional enrichment analysis revealed that these genes are associated with biological functions and canonical pathways related to oxidoreductase activity, immune response, and metabolic process. Weighted gene coexpression network analysis found more coexpression modules significantly correlated with the measured phenotypic traits in adipose than in muscle, indicating that adipose regulates meat and carcass quality. Using the combination of differential expression, QTL information, gene significance, and module hub genes, we identified a large number of candidate genes potentially related to economically important traits in pig, which should help us improve meat production and quality.

  3. A CRISPR-Based Screen Identifies Genes Essential for West-Nile-Virus-Induced Cell Death.

    PubMed

    Ma, Hongming; Dang, Ying; Wu, Yonggan; Jia, Gengxiang; Anaya, Edgar; Zhang, Junli; Abraham, Sojan; Choi, Jang-Gi; Shi, Guojun; Qi, Ling; Manjunath, N; Wu, Haoquan

    2015-07-28

    West Nile virus (WNV) causes an acute neurological infection attended by massive neuronal cell death. However, the mechanism(s) behind the virus-induced cell death is poorly understood. Using a library containing 77,406 sgRNAs targeting 20,121 genes, we performed a genome-wide screen followed by a second screen with a sub-library. Among the genes identified, seven genes, EMC2, EMC3, SEL1L, DERL2, UBE2G2, UBE2J1, and HRD1, stood out as having the strongest phenotype, whose knockout conferred strong protection against WNV-induced cell death with two different WNV strains and in three cell lines. Interestingly, knockout of these genes did not block WNV replication. Thus, these appear to be essential genes that link WNV replication to downstream cell death pathway(s). In addition, the fact that all of these genes belong to the ER-associated protein degradation (ERAD) pathway suggests that this might be the primary driver of WNV-induced cell death. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  4. Temporal patterns of gene expression in developing maize endosperm identified through transcriptome sequencing.

    PubMed

    Li, Guosheng; Wang, Dongfang; Yang, Ruolin; Logan, Kyle; Chen, Hao; Zhang, Shanshan; Skaggs, Megan I; Lloyd, Alan; Burnett, William J; Laurie, John D; Hunter, Brenda G; Dannenhoffer, Joanne M; Larkins, Brian A; Drews, Gary N; Wang, Xiangfeng; Yadegari, Ramin

    2014-05-27

    Endosperm is a filial structure resulting from a second fertilization event in angiosperms. As an absorptive storage organ, endosperm plays an essential role in support of embryo development and seedling germination. The accumulation of carbohydrate and protein storage products in cereal endosperm provides humanity with a major portion of its food, feed, and renewable resources. Little is known regarding the regulatory gene networks controlling endosperm proliferation and differentiation. As a first step toward understanding these networks, we profiled all mRNAs in the maize kernel and endosperm at eight successive stages during the first 12 d after pollination. Analysis of these gene sets identified temporal programs of gene expression, including hundreds of transcription-factor genes. We found a close correlation of the sequentially expressed gene sets with distinct cellular and metabolic programs in distinct compartments of the developing endosperm. The results constitute a preliminary atlas of spatiotemporal patterns of endosperm gene expression in support of future efforts for understanding the underlying mechanisms that control seed yield and quality.

  5. Temporal patterns of gene expression in developing maize endosperm identified through transcriptome sequencing

    PubMed Central

    Li, Guosheng; Wang, Dongfang; Yang, Ruolin; Logan, Kyle; Chen, Hao; Zhang, Shanshan; Skaggs, Megan I.; Lloyd, Alan; Burnett, William J.; Laurie, John D.; Hunter, Brenda G.; Dannenhoffer, Joanne M.; Larkins, Brian A.; Drews, Gary N.; Wang, Xiangfeng; Yadegari, Ramin

    2014-01-01

    Endosperm is a filial structure resulting from a second fertilization event in angiosperms. As an absorptive storage organ, endosperm plays an essential role in support of embryo development and seedling germination. The accumulation of carbohydrate and protein storage products in cereal endosperm provides humanity with a major portion of its food, feed, and renewable resources. Little is known regarding the regulatory gene networks controlling endosperm proliferation and differentiation. As a first step toward understanding these networks, we profiled all mRNAs in the maize kernel and endosperm at eight successive stages during the first 12 d after pollination. Analysis of these gene sets identified temporal programs of gene expression, including hundreds of transcription-factor genes. We found a close correlation of the sequentially expressed gene sets with distinct cellular and metabolic programs in distinct compartments of the developing endosperm. The results constitute a preliminary atlas of spatiotemporal patterns of endosperm gene expression in support of future efforts for understanding the underlying mechanisms that control seed yield and quality. PMID:24821765

  6. Molecular profiling of experimental endometriosis identified gene expression patterns in common with human disease

    PubMed Central

    Flores, Idhaliz; Rivera, Elizabeth; Ruiz, Lynnette A.; Santiago, Olga I.; Vernon, Michael W.; Appleyard, Caroline B.

    2007-01-01

    OBJECTIVE To validate a rat model of endometriosis using cDNA microarrays by identifying common gene expression patterns between experimental and natural disease. DESIGN Autotransplantation rat model. SETTING Medical school department. ANIMALS Female Sprague-Dawley rats. INTERVENTIONS Endometriosis was surgically induced by suturing uterine horn implants next to the small intestine’s mesentery. Control rats received sutures with no implants. After 60 days, endometriotic implants and uterine horn were obtained. MAIN OUTCOME MEASURES Gene expression levels determined by cDNA microarrays and QRT-PCR. METHODS Cy5-labeled cDNA was synthesized from total RNA obtained from endometriotic implants. Cy3-labeled cDNA was synthesized using uterine RNA from a control rat. Gene expression levels were analyzed after hybridizing experimental and control labeled cDNA to PIQOR™ Toxicology Rat Microarrays (Miltenyi Biotec) containing 1,252 known genes. Cy5/Cy3 ratios were determined and genes with >2-fold higher or <0.5-fold lower expression levels were selected. Microarray results were validated by QRT-PCR. RESULTS We observed differential expression of genes previously shown to be upregulated in patients, including growth factors, inflammatory cytokines/receptors, tumor invasion/metastasis factors, adhesion molecules, and anti-apoptotic factors. CONCLUSIONS This study presents evidence in support of using this rat model to study the natural history of endometriosis and test novel therapeutics for this incurable disease. PMID:17478174
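
    The selection rule described above (Cy5/Cy3 ratio above 2 or below 0.5) can be sketched as a simple filter. This is a minimal illustration only: the function name, the gene labels, and the ratio values are hypothetical and are not taken from the study.

```python
def select_by_fold_change(ratios, upper=2.0, lower=0.5):
    """Split genes into up- and down-regulated sets by Cy5/Cy3 ratio.

    ratios maps a gene label to its Cy5/Cy3 ratio. The default cut-offs
    mirror the >2-fold and <0.5-fold thresholds quoted in the abstract.
    """
    up = {gene: r for gene, r in ratios.items() if r > upper}
    down = {gene: r for gene, r in ratios.items() if r < lower}
    return up, down


# Hypothetical ratios, for illustration only (not data from the study).
example_ratios = {"gene_a": 3.1, "gene_b": 0.4, "gene_c": 1.2, "gene_d": 2.0}
up, down = select_by_fold_change(example_ratios)
print(sorted(up), sorted(down))
```

    The comparisons are strict, so a ratio of exactly 2.0 is not selected; whether the study treated boundary values this way is not stated.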

  7. Identifying candidate genes affecting developmental time in Drosophila melanogaster: pervasive pleiotropy and gene-by-environment interaction

    PubMed Central

    Mensch, Julián; Lavagnino, Nicolás; Carreira, Valeria Paula; Massaldi, Ana; Hasson, Esteban; Fanara, Juan José

    2008-01-01

    Background Understanding the genetic architecture of ecologically relevant adaptive traits requires the contribution of developmental and evolutionary biology. The time to reach the age of reproduction is a complex life history trait commonly known as developmental time. In particular, in holometabolous insects that occupy ephemeral habitats, like fruit flies, the impact of developmental time on fitness is further exaggerated. The present work is one of the first systematic studies of the genetic basis of developmental time, in which we also evaluate the impact of environmental variation on the expression of the trait. Results We analyzed 179 co-isogenic single P[GT1]-element insertion lines of Drosophila melanogaster to identify novel genes affecting developmental time in flies reared at 25°C. Sixty percent of the lines showed a heterochronic phenotype, suggesting that a large number of genes affect this trait. Mutant lines for the genes Merlin and Karl showed the most extreme phenotypes exhibiting a developmental time reduction and increase, respectively, of over 2 days and 4 days relative to the control (a co-isogenic P-element insertion free line). In addition, a subset of 42 lines selected at random from the initial set of 179 lines was screened at 17°C. Interestingly, the gene-by-environment interaction accounted for 52% of total phenotypic variance. Plastic reaction norms were found for a large number of developmental time candidate genes. Conclusion We identified components of several integrated time-dependent pathways affecting egg-to-adult developmental time in Drosophila. At the same time, we also show that many heterochronic phenotypes may arise from changes in genes involved in several developmental mechanisms that do not explicitly control the timing of specific events. We also demonstrate that many developmental time genes have pleiotropic effects on several adult traits and that the action of most of them is sensitive to temperature during

  8. SVM-T-RFE: a novel gene selection algorithm for identifying metastasis-related genes in colorectal cancer using gene expression profiles.

    PubMed

    Li, Xiaobo; Peng, Sihua; Chen, Jian; Lü, Bingjian; Zhang, Honghe; Lai, Maode

    2012-03-09

    Although metastasis is the principal cause of death for colorectal cancer (CRC) patients, the molecular mechanisms underlying CRC metastasis are still not fully understood. In an attempt to identify metastasis-related genes in CRC, we obtained gene expression profiles of 55 early stage primary CRCs, 56 late stage primary CRCs, and 34 metastatic CRCs from the expression project in Oncology (http://www.intgen.org/expo/). We developed a novel gene selection algorithm (SVM-T-RFE), which extends the support vector machine recursive feature elimination (SVM-RFE) algorithm by incorporating a T-statistic. We achieved the highest classification accuracy (100%) with smaller gene subsets (10 and 6, respectively), when classifying between early and late stage primary CRCs, as well as between metastatic CRCs and late stage primary CRCs. We also compared the performance of the SVM-T-RFE and SVM-RFE gene selection algorithms on another large-scale CRC dataset and five public microarray datasets. SVM-T-RFE outperformed the SVM-RFE algorithm in identifying more differentially expressed genes, and in achieving the highest prediction accuracy using an equal or smaller number of selected genes. A fraction of the selected genes have been reported to be associated with CRC development or metastasis.
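
    The abstract does not say how the T-statistic is combined with the SVM weights, so the sketch below covers only the T-statistic half of the idea: ranking genes by a two-sample t-statistic between two classes. It assumes Welch's unequal-variance form, which may differ from the variant the authors used, and it does not implement SVM-RFE or SVM-T-RFE. All names and expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Two-sample t-statistic with unequal variances (Welch's form)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_genes_by_t(class_a, class_b):
    """Return gene labels ordered by decreasing |t| between two classes.

    class_a and class_b map each gene label to a list of expression
    values for the samples of that class.
    """
    scores = {g: abs(welch_t(class_a[g], class_b[g])) for g in class_a}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical expression values, for illustration only.
early = {"gene_x": [5.0, 5.2, 4.8], "gene_y": [2.0, 2.5, 1.5]}
late = {"gene_x": [1.0, 1.1, 0.9], "gene_y": [2.1, 2.4, 1.6]}
print(rank_genes_by_t(early, late))
```

    In a recursive feature elimination scheme, a score of this kind would be recomputed or combined with classifier weights at each round while the lowest-ranked genes are dropped; that loop is omitted here.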

  9. Hyperoxia-induced neurodegeneration as a tool to identify neuroprotective genes in Drosophila melanogaster.

    PubMed

    Gruenewald, Christoph; Botella, Jose A; Bayersdorfer, Florian; Navarro, Juan A; Schneuwly, Stephan

    2009-06-15

    Oxidative stress has been reported to be a common underlying mechanism in the pathogenesis of many neurodegenerative disorders such as Alzheimer, Huntington, Creutzfeldt-Jakob, and Parkinson disease. Despite the increasing number of articles showing a correlation between oxidative damage and neurodegeneration, little is known about the genetic elements that confer protection against the deleterious effects of an oxidative imbalance in neurons. We show that oxygen-induced damage is a direct cause of brain degeneration in Drosophila and establish an experimental setup measuring dopaminergic neuron survival to model oxidative stress-induced neurodegeneration in flies. The overexpression of superoxide dismutase but not catalase was able to protect dopaminergic neurons against oxidative imbalance under hyperoxia treatment. In an effort to identify new genes involved in the process of oxidative stress-induced neurodegeneration, we have carried out a genome-wide expression analysis to identify genes whose expression is upregulated in fly heads under hyperoxia. Among them, a number of mitochondrial and cytoplasmic chaperones could be identified and were shown to protect dopaminergic neurons when overexpressed, thus validating our approach to identifying new genes involved in the neuronal defense mechanism against oxidative stress.

  10. Identifying genes related to choriogenesis in insect panoistic ovaries by Suppression Subtractive Hybridization

    PubMed Central

    Irles, Paula; Bellés, Xavier; Piulachs, M Dolors

    2009-01-01

    Background Insect ovarioles are classified into two categories: panoistic and meroistic, the latter having apparently evolved from an ancestral panoistic type. Molecular data on oogenesis is practically restricted to meroistic ovaries. If we aim at studying the evolutionary transition from panoistic to meroistic, data on panoistic ovaries should be gathered. To this end, we planned the construction of a Suppression Subtractive Hybridization (SSH) library to identify genes involved in panoistic choriogenesis, using the cockroach Blattella germanica as model. Results We constructed a post-vitellogenic ovary library by SSH to isolate genes involved in choriogenesis in B. germanica. The tester library was prepared with an ovary pool from 6- to 7-day-old females, whereas the driver library was prepared with an ovary pool from 3- to 4-day-old females. From the SSH library, we obtained 258 high quality sequences which clustered into 34 unique sequences grouped in 19 contigs and 15 singlets. The sequences were compared against non-redundant NCBI databases using BLAST. We found that 44% of the unique sequences had homologous sequences in known genes of other organisms, whereas 56% had no significant similarity to any of the database entries. A Gene Ontology analysis was carried out, classifying the 34 sequences into different functional categories. Seven of these gene sequences, representative of different categories and processes, were chosen to perform expression studies during the first gonadotrophic cycle by real-time PCR. Results showed that they were mainly expressed during post-vitellogenesis, which validates the SSH technique. In two of them corresponding to novel genes, we demonstrated that they are specifically expressed in the cytoplasm of follicular cells in basal oocytes at the time of choriogenesis. Conclusion The SSH approach has proven to be useful in identifying ovarian genes expressed after vitellogenesis in B. germanica. For most of the genes, functions

  11. Transcriptomic and genetic studies identify IL-33 as a candidate gene for Alzheimer’s disease

    PubMed Central

    Chapuis, J; Hot, D; Hansmannel, F; Kerdraon, O; Ferreira, S; Hubans, C; Maurage, CA; Huot, L; Bensemain, F; Laumet, G; Ayral, AM; Fievet, N; Hauw, JJ; DeKosky, ST; Lemoine, Y; Iwatsubo, T; Wavrant-Devrièze, F; Dartigues, JF; Tzourio, C; Buée, L; Pasquier, F; Berr, C; Mann, D; Lendon, C; Alpérovitch, A; Kamboh, MI; Amouyel, P; Lambert, JC

    2010-01-01

    The only recognised genetic determinant of the common forms of Alzheimer’s disease (AD) is the ε4 allele of the apolipoprotein E gene (APOE). To identify new candidate genes, we recently performed transcriptomic analysis of 2,741 genes in chromosomal regions of interest using brain tissue of AD cases and controls. From 82 differentially expressed genes, 1,156 polymorphisms were genotyped in two independent discovery sub-samples (n=945). Seventeen genes exhibited at least one polymorphism associated with AD risk and, following correction for multiple testing, we retained the IL-33 gene. We first confirmed that IL-33 expression was decreased in the brain of AD cases compared with that of controls. Further genetic analysis led us to select 3 polymorphisms within this gene, which we analysed in three independent case-control studies. These polymorphisms and a resulting protective haplotype were systematically associated with AD risk in non-APOE ε4 carriers. Using a large prospective study, these associations were also detected when analyzing the prevalent and incident AD cases together or the incident AD cases alone. These polymorphisms were also associated with less cerebral amyloid angiopathy (CAA) in the brain of non-APOE ε4 AD cases. Immunohistochemistry experiments finally indicated that IL-33 expression was consistently restricted to vascular capillaries in the brain. Moreover, IL-33 overexpression in cellular models led to a specific decrease in secretion of the Aβ40 peptides, the main CAA component. In conclusion, our data suggest that genetic variants in the IL-33 gene may be associated with a decrease in AD risk, potentially by modulating CAA formation. PMID:19204726

  12. Transcriptomic and genetic studies identify NFAT5 as a candidate gene for cocaine dependence

    PubMed Central

    Fernàndez-Castillo, N; Cabana-Domínguez, J; Soriano, J; Sànchez-Mora, C; Roncero, C; Grau-López, L; Ros-Cucurull, E; Daigre, C; van Donkelaar, M M J; Franke, B; Casas, M; Ribasés, M; Cormand, B

    2015-01-01

    Cocaine reward and reinforcing effects are mediated mainly by dopaminergic neurotransmission. In this study, we aimed at evaluating gene expression changes induced by acute cocaine exposure on SH-SY5Y-differentiated cells, which have been widely used as a dopaminergic neuronal model. Expression changes and a concomitant increase in neuronal activity were observed after a 5 μM cocaine exposure, whereas no changes in gene expression or in neuronal activity took place at 1 μM cocaine. Changes in gene expression were identified in a total of 756 genes, mainly related to regulation of transcription and gene expression, cell cycle, adhesion and cell projection, as well as mitogen-activated protein kinase (MAPK), CREB, neurotrophin and neuregulin signaling pathways. Some genes displaying altered expression were subsequently targeted with predicted functional single-nucleotide polymorphisms (SNPs) in a case–control association study in a sample of 806 cocaine-dependent patients and 817 controls. This study highlighted associations between cocaine dependence and five SNPs predicted to alter microRNA binding at the 3′-untranslated region of the NFAT5 gene. The association of SNP rs1437134 with cocaine dependence survived the Bonferroni correction for multiple testing. A functional effect was confirmed for this variant by a luciferase reporter assay, with lower expression observed for the rs1437134G allele, which was more pronounced in the presence of hsa-miR-509. However, brain volumes in regions of relevance to addiction, as assessed with magnetic resonance imaging, did not correlate with NFAT5 variation. These results suggest that the NFAT5 gene, which is upregulated a few hours after cocaine exposure, may be involved in the genetic predisposition to cocaine dependence. PMID:26506053
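
    The Bonferroni correction mentioned above divides the significance level by the number of tests and keeps only the results at or below that stricter threshold. A minimal sketch follows; the SNP labels, the p-values, and the number of tests are hypothetical, since the abstract does not report them.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return the tests whose p-value passes the Bonferroni threshold.

    p_values maps a test label to its unadjusted p-value. The threshold
    is alpha divided by the number of tests performed.
    """
    threshold = alpha / len(p_values)
    return {name: p for name, p in p_values.items() if p <= threshold}


# Hypothetical p-values, for illustration only (not data from the study).
p_values = {"snp_1": 0.0004, "snp_2": 0.02, "snp_3": 0.2, "snp_4": 0.011, "snp_5": 0.049}
print(bonferroni_significant(p_values))
```

    With five tests the threshold is 0.05 / 5 = 0.01, so four of the five nominally significant values in this made-up example no longer pass.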

  13. Evaluation of voltage-dependent calcium channel γ gene families identified several novel potential susceptible genes to schizophrenia

    PubMed Central

    Guan, Fanglin; Zhang, Tianxiao; Liu, Xinshe; Han, Wei; Lin, Huali; Li, Lu; Chen, Gang; Li, Tao

    2016-01-01

    Voltage-gated L-type calcium channels (VLCC) are distributed widely throughout the brain. Among the genes involved in schizophrenia (SCZ), genes encoding VLCC subunits have attracted widespread attention. Among the four subunits comprising the VLCC (α-1, α-2/δ, β, and γ), the γ subunit, which comprises an eight-member protein family, is the least well understood. In our study, to further investigate the susceptibility to SCZ conferred by the γ subunit gene family, we conducted a large-scale association study in Han Chinese individuals. The SNP rs17645023, located in the intergenic region of CACNG4 and CACNG5, was identified to be significantly associated with SCZ (OR = 0.856, P = 5.43 × 10^-5). Similar results were obtained in the meta-analysis with the current SCZ PGC data (OR = 0.8853). We also identified a two-SNP haplotype (rs10420331-rs11084307, P = 1.4 × 10^-6) covering the intronic region of CACNG8 to be significantly associated with SCZ. Epistasis analyses were conducted, and significant statistical interaction (OR = 0.622, P = 2.93 × 10^-6, P_perm < 0.001) was observed between rs192808 (CACNG6) and rs2048137 (CACNG5). Our results indicate that CACNG4, CACNG5, CACNG6 and CACNG8 may contribute to the risk of SCZ. The statistical epistasis identified between CACNG5 and CACNG6 suggests that there may be an underlying biological interaction between the two genes. PMID:27102562
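
    Odds ratios such as those quoted above compare the odds of carrying an allele in cases with the odds in controls; a value below 1 means the allele is less common among cases. The sketch below computes an allelic odds ratio from a 2x2 table of counts. The counts are hypothetical and do not reproduce the study's data, and the study's own estimates may come from a model with covariates rather than from raw counts.

```python
def odds_ratio(case_with, case_without, control_with, control_without):
    """Odds ratio from a 2x2 table of allele counts.

    Computed as (case_with / case_without) / (control_with / control_without).
    """
    if 0 in (case_without, control_with, control_without):
        raise ValueError("zero cell: odds ratio is undefined without a correction")
    return (case_with * control_without) / (case_without * control_with)


# Hypothetical allele counts, for illustration only.
ratio = odds_ratio(case_with=300, case_without=700, control_with=350, control_without=650)
print(round(ratio, 3))
```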

  14. Gene from a novel plant virus satellite from grapevine identifies a viral satellite lineage.

    PubMed

    Al Rwahnih, Maher; Daubert, Steve; Sudarshana, Mysore R; Rowhani, Adib

    2013-08-01

    We have identified the genome of a novel viral satellite in deep sequence analysis of double-stranded RNA from grapevine. The genome was 1,060 bases in length, and encoded two open reading frames. Neither frame was related to any known plant virus gene. But translation of the longer frame showed a protein sequence similar to those of other plant virus satellites. Other than in commonalities they shared in this gene sequence, members of that group were extensively divergent. The reading frame in this gene from the novel satellite could be translationally coupled to an adjacent reading frame in the -1 register, through overlapping start/stop codons. These overlapping AUGA start/stop codons were adjacent to a sequence that could be folded into a pseudoknot structure. Field surveys with PCR probes specific for the novel satellite revealed its presence in 3% of the grapevines (n = 346) sampled.

  15. Transcriptome Analysis to Identify Cold-Responsive Genes in Amur Carp (Cyprinus carpio haematopterus)

    PubMed Central

    He, XuLing

    2015-01-01

    The adaptation of fish to low temperatures is the result of long-term evolution. Amur carp (Cyprinus carpio haematopterus) survives low temperatures (0-4°C) for six months per year. Therefore, we chose this fish as a model organism to study the mechanisms of cold-adaptive responses using high-throughput sequencing technology. This system provided an excellent model for exploring the relationship between evolutionary genomic changes and environmental adaptations. The Amur carp transcriptome was sequenced using the Illumina platform and was assembled into 163,121 cDNA contigs, with an average read length of 594 bp and an N50 length of 913 bp. A total of 162,339 coding sequences (CDSs) were identified and 32,730 unique CDSs were annotated. Gene Ontology (GO), EuKaryotic Orthologous Groups (KOG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed to classify all CDSs into different functional categories. A large number of cold-responsive genes were detected in different tissues at different temperatures. A total of 9,427 microsatellites were identified and classified, with 1,952 identified in cold-responsive genes. Based on GO enrichment analysis of the cold-induced genes, “protein localization” and “protein transport” were the most highly represented biological processes. “Circadian rhythm,” “protein processing in endoplasmic reticulum,” “endocytosis,” “insulin signaling pathway,” and “lysosome” were the most highly enriched pathways for the genes induced by cold stress. Our data greatly contribute to the common carp (C. carpio) transcriptome resource, and the identification of cold-responsive genes in different tissues at different temperatures will aid in deciphering the genetic basis of ecological and environmental adaptations in this species. Based on our results, the Amur carp has evolved special strategies to survive low temperatures, and these strategies include the system-wide or tissue-specific induction
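
    The N50 length reported for the assembly above is the contig length at which contigs of that length or longer account for at least half of the total assembled bases. A minimal sketch of that calculation, using made-up contig lengths rather than the study's assembly:

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are taken from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of all bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical contig lengths in bp, for illustration only.
print(n50([100, 200, 300, 400, 500]))
```

    For these five contigs the total is 1,500 bp; the two longest (500 + 400 = 900 bp) already cover half of it, so the N50 is 400.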

  16. Comparative analysis of gene expression profiles for several migrating cell types identifies cell migration regulators.

    PubMed

    Bae, Young-Kyung; Macabenta, Frank; Curtis, Heather Leigh; Stathopoulos, Angelike

    2017-04-18

    Cell migration is an instrumental process that ensures cells are properly positioned to support the specification of distinct tissue types during development. To provide insight, we used fluorescence activated cell sorting (FACS) to isolate two migrating cell types from the Drosophila embryo: caudal visceral mesoderm (CVM) cells, precursors of longitudinal muscles of the gut, and hemocytes (HCs), the Drosophila equivalent of blood cells. ~350 genes were identified from each of the sorted samples using RNA-seq, and in situ hybridization was used to confirm expression within each cell type or, alternatively, within other interacting, co-sorted cell types. To start, the two gene expression profiling datasets were compared to identify cell migration regulators that are potentially generally-acting. 73 genes were present in both CVM cell and HC gene expression profiles, including the transcription factor zinc finger homeodomain-1 (zfh1). Comparisons with gene expression profiles of Drosophila border cells that migrate during oogenesis had a more limited overlap, with only the genes neyo (neo) and singed (sn) found to be expressed in border cells as well as CVM cells and HCs, respectively. Neo encodes a protein with Zona pellucida domain linked to cell polarity, while sn encodes an actin binding protein. Tissue specific RNAi expression coupled with live in vivo imaging was used to confirm cell-autonomous roles for zfh1 and neo in supporting CVM cell migration, whereas previous studies had demonstrated a role for Sn in supporting HC migration. In addition, comparisons were made to migrating cells from vertebrates. Seven genes were found expressed by chick neural crest cells, CVM cells, and HCs including extracellular matrix (ECM) proteins and proteases. In summary, we show that genes shared in common between CVM cells, HCs, and other migrating cell types can help identify regulators of cell migration. Our analyses show that neo in addition to zfh1 and sn studied
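
    The comparison step described above, finding genes present in more than one expression profile, amounts to a set intersection. The sketch below uses placeholder gene labels and is not the authors' pipeline; the profile contents are hypothetical.

```python
def shared_genes(*profiles):
    """Return the genes present in every profile (at least one is required)."""
    return set.intersection(*(set(profile) for profile in profiles))


# Placeholder gene labels, for illustration only.
cvm_profile = ["gene_1", "gene_2", "gene_3", "gene_4"]
hc_profile = ["gene_2", "gene_4", "gene_5"]
border_cell_profile = ["gene_4", "gene_6"]

print(sorted(shared_genes(cvm_profile, hc_profile)))
print(sorted(shared_genes(cvm_profile, hc_profile, border_cell_profile)))
```

    Adding a third profile can only shrink the shared set, which matches the more limited overlap the abstract reports once border cells are included.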

  17. Transcriptional profiling identifies differentially expressed genes in developing turkey skeletal muscle

    PubMed Central

    2011-01-01

    involved in extracellular matrix regulation, cell death/apoptosis, and calcium signaling/muscle function, as well as genes with miscellaneous function was confirmed by qPCR. Conclusions The current study identified gene pathways and uncovered novel genes important in turkey muscle growth and development. Future experiments will focus further on several of these candidate genes and the expression and mechanism of action of their protein products. PMID:21385442

  18. A novel approach of homozygous haplotype sharing identifies candidate genes in autism spectrum disorder.

    PubMed

    Casey, Jillian P; Magalhaes, Tiago; Conroy, Judith M; Regan, Regina; Shah, Naisha; Anney, Richard; Shields, Denis C; Abrahams, Brett S; Almeida, Joana; Bacchelli, Elena; Bailey, Anthony J; Baird, Gillian; Battaglia, Agatino; Berney, Tom; Bolshakova, Nadia; Bolton, Patrick F; Bourgeron, Thomas; Brennan, Sean; Cali, Phil; Correia, Catarina; Corsello, Christina; Coutanche, Marc; Dawson, Geraldine; de Jonge, Maretha; Delorme, Richard; Duketis, Eftichia; Duque, Frederico; Estes, Annette; Farrar, Penny; Fernandez, Bridget A; Folstein, Susan E; Foley, Suzanne; Fombonne, Eric; Freitag, Christine M; Gilbert, John; Gillberg, Christopher; Glessner, Joseph T; Green, Jonathan; Guter, Stephen J; Hakonarson, Hakon; Holt, Richard; Hughes, Gillian; Hus, Vanessa; Igliozzi, Roberta; Kim, Cecilia; Klauck, Sabine M; Kolevzon, Alexander; Lamb, Janine A; Leboyer, Marion; Le Couteur, Ann; Leventhal, Bennett L; Lord, Catherine; Lund, Sabata C; Maestrini, Elena; Mantoulan, Carine; Marshall, Christian R; McConachie, Helen; McDougle, Christopher J; McGrath, Jane; McMahon, William M; Merikangas, Alison; Miller, Judith; Minopoli, Fiorella; Mirza, Ghazala K; Munson, Jeff; Nelson, Stanley F; Nygren, Gudrun; Oliveira, Guiomar; Pagnamenta, Alistair T; Papanikolaou, Katerina; Parr, Jeremy R; Parrini, Barbara; Pickles, Andrew; Pinto, Dalila; Piven, Joseph; Posey, David J; Poustka, Annemarie; Poustka, Fritz; Ragoussis, Jiannis; Roge, Bernadette; Rutter, Michael L; Sequeira, Ana F; Soorya, Latha; Sousa, Inês; Sykes, Nuala; Stoppioni, Vera; Tancredi, Raffaella; Tauber, Maïté; Thompson, Ann P; Thomson, Susanne; Tsiantis, John; Van Engeland, Herman; Vincent, John B; Volkmar, Fred; Vorstman, Jacob A S; Wallace, Simon; Wang, Kai; Wassink, Thomas H; White, Kathy; Wing, Kirsty; Wittemeyer, Kerstin; Yaspan, Brian L; Zwaigenbaum, Lonnie; Betancur, Catalina; Buxbaum, Joseph D; Cantor, Rita M; Cook, Edwin H; Coon, Hilary; Cuccaro, Michael L; Geschwind, Daniel H; Haines, Jonathan L; Hallmayer, Joachim; Monaco, Anthony P; Nurnberger, John I; Pericak-Vance, Margaret A; Schellenberg, Gerard D; Scherer, Stephen W; Sutcliffe, James S; Szatmari, Peter; Vieland, Veronica J; Wijsman, Ellen M; Green, Andrew; Gill, Michael; Gallagher, Louise; Vicente, Astrid; Ennis, Sean

    2012-04-01

    Autism spectrum disorder (ASD) is a highly heritable disorder of complex and heterogeneous aetiology. It is primarily characterized by altered cognitive ability including impaired language and communication skills and fundamental deficits in social reciprocity. Despite some notable successes in neuropsychiatric genetics, overall, the high heritability of ASD (~90%) remains poorly explained by common genetic risk variants. However, recent studies suggest that rare genomic variation, in particular copy number variation, may account for a significant proportion of the genetic basis of ASD. We present a large scale analysis to identify candidate genes which may contain low-frequency recessive variation contributing to ASD while taking into account the potential contribution of population differences to the genetic heterogeneity of ASD. Our strategy, homozygous haplotype (HH) mapping, aims to detect homozygous segments of identical haplotype structure that are shared at a higher frequency amongst ASD patients compared to parental controls. The analysis was performed on 1,402 Autism Genome Project trios genotyped for 1 million single nucleotide polymorphisms (SNPs). We identified 25 known and 1,218 novel ASD candidate genes in the discovery analysis including CADM2, ABHD14A, CHRFAM7A, GRIK2, GRM3, EPHA3, FGF10, KCND2, PDZK1, IMMP2L and FOXP2. Furthermore, 10 of the previously reported ASD genes and 300 of the novel candidates identified in the discovery analysis were replicated in an independent sample of 1,182 trios. Our results demonstrate that regions of HH are significantly enriched for previously reported ASD candidate genes and the observed association is independent of gene size (odds ratio 2.10). Our findings highlight the applicability of HH mapping in complex disorders such as ASD and offer an alternative approach to the analysis of genome-wide association data.

  19. Fine Mapping and Whole-Genome Resequencing Identify the Seed Coat Color Gene in Brassica rapa

    PubMed Central

    Guo, Shaomin; An, Fengyun; Du, Dezhi

    2016-01-01

    A yellow seed coat is a desirable agronomic trait in the seeds of oilseed-type Brassica crops. In this study, we identified a candidate gene for seed coat color in Dahuang, a landrace of Brassica rapa. A previous study of Dahuang mapped the seed coat color gene Brsc1 to a 2.8-Mb interval on chromosome A9 of B. rapa. In the present study, the density of the linkage map for Brsc1 was increased by adding simple sequence repeat (SSR) markers, and the candidate region for Brsc1 was narrowed to 1.04 Mb. In addition, whole-genome resequencing with bulked segregant analysis (BSA) was conducted to identify candidate intervals for Brsc1. A genome-wide comparison of SNP profiles was performed between yellow-seeded and brown-seeded bulk samples. SNP index analyses identified a major candidate interval on chromosome A9 (A09:18,255,838–18,934,000, 678 kb) containing a long overlap with the target region recovered from the fine mapping results. According to gene annotation, Bra028067 (BrTT1) is an important candidate gene for Brsc1 in the overlapping region. Quantitative reverse transcription (qRT)-PCR revealed that BrTT1 mainly functions in the seed. Point mutations and small deletions in BrTT1 were found between yellow- and brown-seeded Dahuang plants. Collectively, the expression and sequence analysis results provide preliminary evidence that BrTT1 is a candidate gene for the seed coat color trait in Dahuang. PMID:27829069
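
    In bulked segregant analysis, a SNP-index is commonly defined as the fraction of reads at a position that carry the non-reference allele, and the difference between the two bulks is used to locate candidate intervals. The sketch below follows that common definition; the abstract does not give the exact formula or the direction of subtraction the authors used, and the read counts are hypothetical.

```python
def snp_index(alt_reads, total_reads):
    """Fraction of reads at a position that carry the non-reference allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def delta_snp_index(bulk_a, bulk_b):
    """Difference in SNP-index between two bulks at the same position.

    Each bulk is an (alt_reads, total_reads) pair. Subtracting bulk_b
    from bulk_a is an assumed convention.
    """
    return snp_index(*bulk_a) - snp_index(*bulk_b)


# Hypothetical read counts at one position, for illustration only.
yellow_bulk = (18, 20)
brown_bulk = (4, 20)
print(round(delta_snp_index(yellow_bulk, brown_bulk), 2))
```

    Positions where the absolute difference stays close to 1 across a window are candidates for linkage to the trait; in practice the values are averaged over sliding windows and compared against a confidence threshold.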

  20. Genome Screen to Identify Susceptibility Genes for Parkinson Disease in a Sample without parkin Mutations

    PubMed Central

    Pankratz, Nathan; Nichols, William C.; Uniacke, Sean K.; Halter, Cheryl; Rudolph, Alice; Shults, Cliff; Conneally, P. Michael; Foroud, Tatiana

    2002-01-01

    Parkinson disease (PD) is a common neurodegenerative disorder characterized by bradykinesia, resting tremor, muscular rigidity, and postural instability, as well as by a clinically significant response to treatment with levodopa. Mutations in the α-synuclein gene have been found to result in autosomal dominant PD, and mutations in the parkin gene produce autosomal recessive juvenile-onset PD. We have studied 203 sibling pairs with PD who were evaluated by a rigorous neurological assessment based on (a) inclusion criteria consisting of clinical features highly associated with autopsy-confirmed PD and (b) exclusion criteria highly associated with other, non-PD pathological diagnoses. Families with positive LOD scores for a marker in an intron of the parkin gene were prioritized for parkin-gene testing, and mutations in the parkin gene were identified in 22 families. To reduce genetic heterogeneity, these families were not included in subsequent genome-screen analysis. Thus, a total of 160 multiplex families without evidence of a parkin mutation were used in multipoint nonparametric linkage analysis to identify PD-susceptibility genes. Two models of PD affection status were considered: model I included only those individuals with a more stringent diagnosis of verified PD (96 sibling pairs from 90 families), whereas model II included all examined individuals as affected, regardless of their final diagnostic classification (170 sibling pairs from 160 families). Under model I, the highest LOD scores were observed on chromosome X (LOD score 2.1) and on chromosome 2 (LOD score 1.9). Analyses performed with all available sibling pairs (model II) found even greater evidence of linkage to chromosome X (LOD score 2.7) and to chromosome 2 (LOD score 2.5). Evidence of linkage was also found to chromosomes 4, 5, and 13 (LOD scores >1.5). Our findings are consistent with those of other linkage studies that have reported linkage to chromosomes 5 and X. PMID:12058349

  1. Comparative Transcriptome Analysis of White and Purple Potato to Identify Genes Involved in Anthocyanin Biosynthesis

    PubMed Central

    Liu, Yuhui; Lin-Wang, Kui; Deng, Cecilia; Warran, Ben; Wang, Li; Yu, Bin; Yang, Hongyu; Wang, Jing; Espley, Richard V.; Zhang, Junlian; Wang, Di; Allan, Andrew C.

    2015-01-01

    Introduction The potato (Solanum tuberosum) cultivar ‘Xin Daping’ is tetraploid with white skin and white flesh, while the cultivar ‘Hei Meiren’ is also tetraploid with purple skin and purple flesh. Comparative transcriptome analysis of white and purple cultivars was carried out using high-throughput RNA sequencing in order to further understand the mechanism of anthocyanin biosynthesis in potato. Methods and Results By aligning transcript reads to the recently published diploid potato genome and de novo assembly, 209 million paired-end Illumina RNA-seq reads from these tetraploid cultivars were assembled into 60,930 transcripts, of which 27,754 (45.55%) are novel transcripts and 9,393 are alternative transcripts. Using a comparison of the RNA-sequence datasets, multiple versions of the genes encoding anthocyanin biosynthetic steps and regulatory transcription factors were identified. Other novel genes potentially involved in anthocyanin biosynthesis in potato tubers were also discovered. Real-time qPCR validation of candidate genes revealed good correlation with the transcriptome data. Single nucleotide polymorphisms (SNPs) and indels were predicted and validated for the transcription factors MYB AN1 and bHLH1 and the biosynthetic gene anthocyanidin 3-O-glucosyltransferase (UFGT). Conclusions These results contribute to our understanding of the molecular mechanism of white and purple potato development, by identifying differential responses of biosynthetic gene family members together with the variation in structural genes and transcription factors in this highly heterozygous crop. This provides an excellent platform and resource for future genetic and functional genomic research. PMID:26053878

  2. Large-Scale Gene-Centric Meta-analysis across 32 Studies Identifies Multiple Lipid Loci

    PubMed Central

    Asselbergs, Folkert W.; Guo, Yiran; van Iperen, Erik P.A.; Sivapalaratnam, Suthesh; Tragante, Vinicius; Lanktree, Matthew B.; Lange, Leslie A.; Almoguera, Berta; Appelman, Yolande E.; Barnard, John; Baumert, Jens; Beitelshees, Amber L.; Bhangale, Tushar R.; Chen, Yii-Der Ida; Gaunt, Tom R.; Gong, Yan; Hopewell, Jemma C.; Johnson, Toby; Kleber, Marcus E.; Langaee, Taimour Y.; Li, Mingyao; Li, Yun R.; Liu, Kiang; McDonough, Caitrin W.; Meijs, Matthijs F.L.; Middelberg, Rita P.S.; Musunuru, Kiran; Nelson, Christopher P.; O’Connell, Jeffery R.; Padmanabhan, Sandosh; Pankow, James S.; Pankratz, Nathan; Rafelt, Suzanne; Rajagopalan, Ramakrishnan; Romaine, Simon P.R.; Schork, Nicholas J.; Shaffer, Jonathan; Shen, Haiqing; Smith, Erin N.; Tischfield, Sam E.; van der Most, Peter J.; van Vliet-Ostaptchouk, Jana V.; Verweij, Niek; Volcik, Kelly A.; Zhang, Li; Bailey, Kent R.; Bailey, Kristian M.; Bauer, Florianne; Boer, Jolanda M.A.; Braund, Peter S.; Burt, Amber; Burton, Paul R.; Buxbaum, Sarah G.; Chen, Wei; Cooper-DeHoff, Rhonda M.; Cupples, L. Adrienne; deJong, Jonas S.; Delles, Christian; Duggan, David; Fornage, Myriam; Furlong, Clement E.; Glazer, Nicole; Gums, John G.; Hastie, Claire; Holmes, Michael V.; Illig, Thomas; Kirkland, Susan A.; Kivimaki, Mika; Klein, Ronald; Klein, Barbara E.; Kooperberg, Charles; Kottke-Marchant, Kandice; Kumari, Meena; LaCroix, Andrea Z.; Mallela, Laya; Murugesan, Gurunathan; Ordovas, Jose; Ouwehand, Willem H.; Post, Wendy S.; Saxena, Richa; Scharnagl, Hubert; Schreiner, Pamela J.; Shah, Tina; Shields, Denis C.; Shimbo, Daichi; Srinivasan, Sathanur R.; Stolk, Ronald P.; Swerdlow, Daniel I.; Taylor, Herman A.; Topol, Eric J.; Toskala, Elina; van Pelt, Joost L.; van Setten, Jessica; Yusuf, Salim; Whittaker, John C.; Zwinderman, A.H.; Anand, Sonia S.; Balmforth, Anthony J.; Berenson, Gerald S.; Bezzina, Connie R.; Boehm, Bernhard O.; Boerwinkle, Eric; Casas, Juan P.; Caulfield, Mark J.; Clarke, Robert; Connell, John M.; Cruickshanks, Karen J.; Davidson, Karina W.; Day, Ian N.M.; de Bakker, Paul I.W.; Doevendans, Pieter A.; Dominiczak, Anna F.; Hall, Alistair S.; Hartman, Catharina A.; Hengstenberg, Christian; Hillege, Hans L.; Hofker, Marten H.; Humphries, Steve E.; Jarvik, Gail P.; Johnson, Julie A.; Kaess, Bernhard M.; Kathiresan, Sekar; Koenig