Sample records for identified numerous genes

  1. A P-Norm Robust Feature Extraction Method for Identifying Differentially Expressed Genes

    PubMed Central

    Liu, Jian; Liu, Jin-Xing; Gao, Ying-Lian; Kong, Xiang-Zhen; Wang, Xue-Song; Wang, Dong

    2015-01-01

    In current molecular biology, it becomes more and more important to identify differentially expressed genes closely correlated with a key biological process from gene expression data. In this paper, based on the Schatten p-norm and Lp-norm, a novel p-norm robust feature extraction method is proposed to identify the differentially expressed genes. In our method, the Schatten p-norm is used as the regularization function to obtain a low-rank matrix and the Lp-norm is taken as the error function to improve the robustness to outliers in the gene expression data. The results on simulation data show that our method can obtain higher identification accuracies than the competitive methods. Numerous experiments on real gene expression data sets demonstrate that our method can identify more differentially expressed genes than the others. Moreover, we confirmed that the identified genes are closely correlated with the corresponding gene expression data. PMID:26201006
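
    The robustness claim for the Lp-norm error term can be illustrated with a minimal sketch. It assumes hypothetical residuals and illustrative values of p (2 and 0.5) that are not taken from the paper, and it does not implement the Schatten p-norm regularizer. It only shows that a single outlier accounts for a smaller share of the total error when p is small.

```python
def lp_loss(residuals, p):
    """Sum of |r|**p over all residuals (the p-th power of the Lp-norm)."""
    return sum(abs(r) ** p for r in residuals)


def outlier_share(residuals, p):
    """Fraction of the total loss contributed by the largest residual."""
    worst = max(abs(r) for r in residuals)
    return worst ** p / lp_loss(residuals, p)


# Hypothetical residuals: four small errors and one outlier (5.0).
residuals = [0.1, -0.2, 0.15, -0.1, 5.0]

share_l2 = outlier_share(residuals, 2.0)  # squared-error loss
share_lp = outlier_share(residuals, 0.5)  # illustrative p < 1

print(f"outlier share with p=2.0: {share_l2:.3f}")
print(f"outlier share with p=0.5: {share_lp:.3f}")
```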

  3. Gene expression meta-analysis identifies chromosomal regions and candidate genes involved in breast cancer metastasis.

    PubMed

    Thomassen, Mads; Tan, Qihua; Kruse, Torben A

    2009-01-01

    Breast cancer cells exhibit complex karyotypic alterations causing deregulation of numerous genes. Some of these genes are probably causal for cancer formation and local growth whereas others are causal for the various steps of metastasis. In a fraction of tumors deregulation of the same genes might be caused by epigenetic modulations, point mutations or the influence of other genes. We have investigated the relation of gene expression and chromosomal position, using eight datasets including more than 1200 breast tumors, to identify chromosomal regions and candidate genes possibly causal for breast cancer metastasis. By use of "Gene Set Enrichment Analysis" we have ranked chromosomal regions according to their relation to metastasis. Overrepresentation analysis identified regions with increased expression for chromosome 1q41-42, 8q24, 12q14, 16q22, 16q24, 17q12-21.2, 17q21-23, 17q25, 20q11, and 20q13 among metastasizing tumors and reduced gene expression at 1p31-21, 8p22-21, and 14q24. By analysis of genes with extremely imbalanced expression in these regions we identified DIRAS3 at 1p31, PSD3, LPL, EPHX2 at 8p21-22, and FOS at 14q24 as candidate metastasis suppressor genes. Potential metastasis promoting genes include RECQL4 at 8q24, PRMT7 at 16q22, GINS2 at 16q24, and AURKA at 20q13.

  4. An Integrative Genetics Approach to Identify Candidate Genes Regulating BMD: Combining Linkage, Gene Expression, and Association

    PubMed Central

    Farber, Charles R; van Nas, Atila; Ghazalpour, Anatole; Aten, Jason E; Doss, Sudheer; Sos, Brandon; Schadt, Eric E; Ingram-Drake, Leslie; Davis, Richard C; Horvath, Steve; Smith, Desmond J; Drake, Thomas A; Lusis, Aldons J

    2009-01-01

    Numerous quantitative trait loci (QTLs) affecting bone traits have been identified in the mouse; however, few of the underlying genes have been discovered. To improve the process of transitioning from QTL to gene, we describe an integrative genetics approach, which combines linkage analysis, expression QTL (eQTL) mapping, causality modeling, and genetic association in outbred mice. In C57BL/6J × C3H/HeJ (BXH) F2 mice, nine QTLs regulating femoral BMD were identified. To select candidate genes from within each QTL region, microarray gene expression profiles from individual F2 mice were used to identify 148 genes whose expression was correlated with BMD and regulated by local eQTLs. Many of the genes that were the most highly correlated with BMD have been previously shown to modulate bone mass or skeletal development. Candidates were further prioritized by determining whether their expression was predicted to underlie variation in BMD. Using network edge orienting (NEO), a causality modeling algorithm, 18 of the 148 candidates were predicted to be causally related to differences in BMD. To fine-map QTLs, markers in outbred MF1 mice were tested for association with BMD. Three chromosome 11 SNPs were identified that were associated with BMD within the Bmd11 QTL. Finally, our approach provides strong support for Wnt9a, Rasd1, or both underlying Bmd11. Integration of multiple genetic and genomic data sets can substantially improve the efficiency of QTL fine-mapping and candidate gene identification. PMID:18767929

  5. Resistance gene candidates identified by PCR with degenerate oligonucleotide primers map to clusters of resistance genes in lettuce.

    PubMed

    Shen, K A; Meyers, B C; Islam-Faridi, M N; Chin, D B; Stelly, D M; Michelmore, R W

    1998-08-01

    The recent cloning of genes for resistance against diverse pathogens from a variety of plants has revealed that many share conserved sequence motifs. This provides the possibility of isolating numerous additional resistance genes by polymerase chain reaction (PCR) with degenerate oligonucleotide primers. We amplified resistance gene candidates (RGCs) from lettuce with multiple combinations of primers with low degeneracy designed from motifs in the nucleotide binding sites (NBSs) of RPS2 of Arabidopsis thaliana and N of tobacco. Genomic DNA, cDNA, and bacterial artificial chromosome (BAC) clones were successfully used as templates. Four families of sequences were identified that had the same similarity to each other as to resistance genes from other species. The relationship of the amplified products to resistance genes was evaluated by several sequence and genetic criteria. The amplified products contained open reading frames with additional sequences characteristic of NBSs. Hybridization of RGCs to genomic DNA and to BAC clones revealed large numbers of related sequences. Genetic analysis demonstrated the existence of clustered multigene families for each of the four RGC sequences. This parallels classical genetic data on clustering of disease resistance genes. Two of the four families mapped to known clusters of resistance genes; these two families were therefore studied in greater detail. Additional evidence that these RGCs could be resistance genes was gained by the identification of leucine-rich repeat (LRR) regions in sequences adjoining the NBS similar to those in RPM1 and RPS2 of A. thaliana. Fluorescent in situ hybridization confirmed the clustered genomic distribution of these sequences. The use of PCR with degenerate oligonucleotide primers is therefore an efficient method to identify numerous RGCs in plants.

  6. [Numeric alterations in the dys gene and their association with clinical features].

    PubMed

    Mampel, Alejandra; Echeverría, María Inés; Vargas, Ana Lía; Roque, María

    2011-01-01

    Duchenne/Becker muscular dystrophy is a hereditary myopathy with a recessive sex-linked inheritance pattern. The related gene is called DYS, and the encoded protein plays a crucial role in anchoring the cytoskeleton to the cell membrane in muscle cells. Different clinical manifestations are observed depending on the impact of the genetic alteration on the protein. The global register of mutations reveals an enhanced frequency of deletions/duplications of one or more exons affecting the DYS gene. In the present work, numeric alterations have been studied in the 79 exons of the DYS gene. The study was performed on 59 individuals, including 31 independent cases and 28 cases with a familial link. The applied methodology was Multiplex Ligation Dependent Probe Amplification (MLPA). In the 31 independent cases clinical data were established, i.e. the clinical score, the Raven test percentiles, and the creatine phosphokinase (CPK) blood values. Our results reveal a 61.3% frequency of numeric alterations affecting the DYS gene in our population, all of which cause a reading frame shift. The rate of de novo mutations was 35.2%. Alterations involving a specific region of one exon were observed with high frequency. A significant association was found between numeric alterations and a low percentile for the Raven test. These data contribute to the local knowledge of genetic alterations and their phenotypic impact in Duchenne/Becker disease.

  7. NIH Researchers Identify OCD Risk Gene

    MedlinePlus

    ... News From NIH NIH Researchers Identify OCD Risk Gene Past Issues / Summer 2006 Table of Contents For ... and Alcoholism (NIAAA) have identified a previously unknown gene variant that doubles an individual's risk for obsessive- ...

  8. A Sleeping Beauty forward genetic screen identifies new genes and pathways driving osteosarcoma development and metastasis

    PubMed Central

    Moriarity, Branden S; Otto, George M; Rahrmann, Eric P; Rathe, Susan K; Wolf, Natalie K; Weg, Madison T; Manlove, Luke A; LaRue, Rebecca S; Temiz, Nuri A; Molyneux, Sam D; Choi, Kwangmin; Holly, Kevin J; Sarver, Aaron L; Scott, Milcah C; Forster, Colleen L; Modiano, Jaime F; Khanna, Chand; Hewitt, Stephen M; Khokha, Rama; Yang, Yi; Gorlick, Richard; Dyer, Michael A; Largaespada, David A

    2016-01-01

    Osteosarcomas are sarcomas of the bone, derived from osteoblasts or their precursors, with a high propensity to metastasize. Osteosarcoma is associated with massive genomic instability, making it problematic to identify driver genes using human tumors or prototypical mouse models, many of which involve loss of Trp53 function. To identify the genes driving osteosarcoma development and metastasis, we performed a Sleeping Beauty (SB) transposon-based forward genetic screen in mice with and without somatic loss of Trp53. Common insertion site (CIS) analysis of 119 primary tumors and 134 metastatic nodules identified 232 sites associated with osteosarcoma development and 43 sites associated with metastasis, respectively. Analysis of CIS-associated genes identified numerous known and new osteosarcoma-associated genes enriched in the ErbB, PI3K-AKT-mTOR and MAPK signaling pathways. Lastly, we identified several oncogenes involved in axon guidance, including Sema4d and Sema6d, which we functionally validated as oncogenes in human osteosarcoma. PMID:25961939

  9. LGscore: A method to identify disease-related genes using biological literature and Google data.

    PubMed

    Kim, Jeongwoo; Kim, Hyunjin; Yoon, Youngmi; Park, Sanghyun

    2015-04-01

    Since the genome project in the 1990s, a number of studies associated with genes have been conducted and researchers have confirmed that genes are involved in disease. For this reason, the identification of the relationships between diseases and genes is important in biology. We propose a method called LGscore, which identifies disease-related genes using Google data and literature data. To implement this method, first, we construct a disease-related gene network using text-mining results. We then extract gene-gene interactions based on co-occurrences in abstract data obtained from PubMed, and calculate the weights of edges in the gene network by means of Z-scoring. The weights contain two values: the frequency and the Google search results. The frequency value is extracted from literature data, and the Google search result is obtained using Google. We assign a score to each gene through a network analysis. We assume that genes with a large number of links and numerous Google search results and frequency values are more likely to be involved in disease. For validation, we investigated the top 20 inferred genes for five different diseases using answer sets. The answer sets comprised six databases that contain information on disease-gene relationships. We identified a significant number of disease-related genes as well as candidate genes for Alzheimer's disease, diabetes, colon cancer, lung cancer, and prostate cancer. Our method was up to 40% more accurate than existing methods. Copyright © 2015 Elsevier Inc. All rights reserved.
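
    The co-occurrence and Z-scoring steps described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the gene labels and abstracts are toy data, the Google search component is omitted, and scoring a gene by the sum of its edge weights is a stand-in for the network analysis, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def cooccurrence_counts(abstract_gene_sets):
    """Count how many abstracts mention each unordered gene pair together."""
    counts = Counter()
    for genes in abstract_gene_sets:
        for a, b in combinations(sorted(set(genes)), 2):
            counts[(a, b)] += 1
    return counts


def z_scores(values):
    """Standardize a dict of raw values to zero mean and unit variance."""
    mu = mean(values.values())
    sigma = pstdev(values.values())
    if sigma == 0:
        return {key: 0.0 for key in values}
    return {key: (v - mu) / sigma for key, v in values.items()}


def gene_scores(edge_weights):
    """Hypothetical scoring rule: sum the weights of a gene's edges."""
    scores = Counter()
    for (a, b), weight in edge_weights.items():
        scores[a] += weight
        scores[b] += weight
    return dict(scores)


# Toy data: each set lists the genes mentioned in one abstract.
abstracts = [{"G1", "G2"}, {"G1", "G2", "G3"}, {"G1", "G3"}, {"G1", "G2"}]

counts = cooccurrence_counts(abstracts)
weights = z_scores(counts)
scores = gene_scores(weights)
top_gene = max(scores, key=scores.get)
print(top_gene, scores)
```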

  10. Utilizing Gene Tree Variation to Identify Candidate Effector Genes in Zymoseptoria tritici

    PubMed Central

    McDonald, Megan C.; McGinness, Lachlan; Hane, James K.; Williams, Angela H.; Milgate, Andrew; Solomon, Peter S.

    2016-01-01

    Zymoseptoria tritici is a host-specific, necrotrophic pathogen of wheat. Infection by Z. tritici is characterized by its extended latent period, which typically lasts 2 weeks, and is followed by extensive host cell death and rapid proliferation of fungal biomass. This work characterizes the level of genomic variation in 13 isolates, for which we have measured virulence on 11 wheat cultivars with differential resistance genes. Between the reference isolate, IPO323, and the 13 Australian isolates we identified over 800,000 single nucleotide polymorphisms, of which ∼10% had an effect on the coding regions of the genome. Furthermore, we identified over 1700 probable presence/absence polymorphisms in genes across the Australian isolates using de novo assembly. Finally, we developed a gene tree sorting method that quickly identifies groups of isolates within a single gene alignment whose sequence haplotypes correspond with virulence scores on a single wheat cultivar. Using this method, we have identified fewer than 100 candidate effector genes whose gene sequence correlates with virulence toward a wheat cultivar carrying a major resistance gene. PMID:26837952

  11. Identifying Cancer Driver Genes Using Replication-Incompetent Retroviral Vectors

    PubMed Central

    Bii, Victor M.; Trobridge, Grant D.

    2016-01-01

    Identifying novel genes that drive tumor metastasis and drug resistance has significant potential to improve patient outcomes. High-throughput sequencing approaches have identified cancer genes, but distinguishing driver genes from passengers remains challenging. Insertional mutagenesis screens using replication-incompetent retroviral vectors have emerged as a powerful tool to identify cancer genes. Unlike replicating retroviruses and transposons, replication-incompetent retroviral vectors lack additional mutagenesis events that can complicate the identification of driver mutations from passenger mutations. They can also be used for almost any human cancer due to the broad tropism of the vectors. Replication-incompetent retroviral vectors have the ability to dysregulate nearby cancer genes via several mechanisms including enhancer-mediated activation of gene promoters. The integrated provirus acts as a unique molecular tag for nearby candidate driver genes which can be rapidly identified using well established methods that utilize next generation sequencing and bioinformatics programs. Recently, retroviral vector screens have been used to efficiently identify candidate driver genes in prostate, breast, liver and pancreatic cancers. Validated driver genes can be potential therapeutic targets and biomarkers. In this review, we describe the emergence of retroviral insertional mutagenesis screens using replication-incompetent retroviral vectors as a novel tool to identify cancer driver genes in different cancer types. PMID:27792127

  12. Identifying signatures of positive selection in pigmentation genes in two South Asian populations.

    PubMed

    Jonnalagadda, Manjari; Bharti, Neeraj; Patil, Yatish; Ozarkar, Shantanu; K, Sunitha Manjari; Joshi, Rajendra; Norton, Heather

    2017-09-10

    Skin pigmentation is a polygenic trait showing wide phenotypic variations among global populations. While numerous pigmentation genes have been identified to be under positive selection among European and East Asian populations, genes contributing to phenotypic variation in skin pigmentation within and among South Asian populations are still poorly understood. The present study uses data from Phase 3 of the 1000 Genomes Project, focusing on two South Asian populations, GIH (Gujarati Indian from Houston, Texas) and ITU (Indian Telugu from the UK), to decode the genetic architecture involved in adaptation to ultraviolet radiation in South Asian populations. Statistical tests included were (1) tests to identify deviations of the Site Frequency Spectrum (SFS) from neutral expectations (Tajima's D, Fay and Wu's H and Fu and Li's D* and F*), (2) tests focused on the identification of high-frequency haplotypes with extended linkage disequilibrium (iHS and Rsb), and (3) tests based on genetic differentiation between populations (LSBL). Twenty-two pigmentation genes fall in the top 1% for at least one statistic in the GIH population, 5 of which (LYST, OCA2, SLC24A5, SLC45A2, and TYR) have been previously associated with normal variation in skin, hair, or eye color. In comparison, 17 genes fall in the top 1% for at least one statistic in the ITU population. Twelve loci which are identified as outliers in the ITU scan were also identified in the GIH population. These results suggest that selection may have affected these loci broadly across the region. © 2017 Wiley Periodicals, Inc.

  13. Candidate genes for panhypopituitarism identified by gene expression profiling

    PubMed Central

    Mortensen, Amanda H.; MacDonald, James W.; Ghosh, Debashis

    2011-01-01

    Mutations in the transcription factors PROP1 and PIT1 (POU1F1) lead to pituitary hormone deficiency and hypopituitarism in mice and humans. The dysmorphology of developing Prop1 mutant pituitaries readily distinguishes them from those of Pit1 mutants and normal mice. This and other features suggest that Prop1 controls the expression of genes besides Pit1 that are important for pituitary cell migration, survival, and differentiation. To identify genes involved in these processes we used microarray analysis of gene expression to compare pituitary RNA from newborn Prop1 and Pit1 mutants and wild-type littermates. Significant differences in gene expression were noted between each mutant and their normal littermates, as well as between Prop1 and Pit1 mutants. Otx2, a gene critical for normal eye and pituitary development in humans and mice, exhibited elevated expression specifically in Prop1 mutant pituitaries. We report the spatial and temporal regulation of Otx2 in normal mice and Prop1 mutants, and the results suggest Otx2 could influence pituitary development by affecting signaling from the ventral diencephalon and regulation of gene expression in Rathke's pouch. The discovery that Otx2 expression is affected by Prop1 deficiency provides support for our hypothesis that identifying molecular differences in mutants will contribute to understanding the molecular mechanisms that control pituitary organogenesis and lead to human pituitary disease. PMID:21828248

  14. Identifying potential maternal genes of Bombyx mori using digital gene expression profiling

    PubMed Central

    Xu, Pingzhen

    2018-01-01

    Maternal genes present in mature oocytes play a crucial role in the early development of silkworm. Although maternal genes have been widely studied in many other species, there has been limited research in Bombyx mori. High-throughput next generation sequencing provides a practical method for gene discovery on a genome-wide level. Herein, a transcriptome study was used to identify maternal-related genes from silkworm eggs. Unfertilized eggs from five different stages of early development were used to detect changes in gene expression. The expressed genes showed different patterns over time. Seventy-six maternal genes were annotated according to homology analysis with Drosophila melanogaster. More than half of the differentially expressed maternal genes fell into four expression patterns, while the expression patterns showed a downward trend over time. The functional annotation of these maternal genes was mainly related to transcription factor activity, growth factor activity, nucleic acid binding, RNA binding, ATP binding, and ion binding. Additionally, twenty-two gene clusters including maternal genes were identified from 18 scaffolds. Altogether, we plotted a profile for the maternal genes of Bombyx mori using a digital gene expression profiling method. This will provide the basis for maternal-specific signature research and improve the understanding of the early development of silkworm. PMID:29462160

  15. A Sparse Reconstruction Approach for Identifying Gene Regulatory Networks Using Steady-State Experiment Data

    PubMed Central

    Zhang, Wanhong; Zhou, Tong

    2015-01-01

    Motivation: Identifying gene regulatory networks (GRNs), which consist of a large number of interacting units, has become a problem of paramount importance in systems biology. In many situations, causal interactions among these units must be reconstructed from measured expression data and other a priori information. Though numerous classical methods have been developed to unravel the interactions of GRNs, these methods have either high computational complexity or low estimation accuracy. Note that great similarities exist between the identification of genes that directly regulate a specific gene and sparse vector reconstruction, which often relates to the determination of the number, location and magnitude of nonzero entries of an unknown vector by solving an underdetermined system of linear equations y = Φx. Based on these similarities, we propose a novel framework of sparse reconstruction to identify the structure of a GRN, so as to increase the accuracy of causal regulation estimates, as well as to reduce their computational complexity. Results: In this paper, a sparse reconstruction framework is proposed on the basis of steady-state experiment data to identify GRN structure. Unlike traditional methods, this approach is well suited to a large-scale underdetermined problem in inferring a sparse vector. We investigate how to combine the noisy steady-state experiment data and a sparse reconstruction algorithm to identify causal relationships. The efficiency of this method is tested on an artificial linear network, a mitogen-activated protein kinase (MAPK) pathway network and the in silico networks of the DREAM challenges. The performance of the suggested approach is compared with two state-of-the-art algorithms, the widely adopted total least-squares (TLS) method and the available results of the DREAM project. Actual results show that, with a lower computational cost, the proposed method can

  16. Identifying key genes in rheumatoid arthritis by weighted gene co-expression network analysis.

    PubMed

    Ma, Chunhui; Lv, Qi; Teng, Songsong; Yu, Yinxian; Niu, Kerun; Yi, Chengqin

    2017-08-01

    This study aimed to identify rheumatoid arthritis (RA) related genes based on microarray data using the WGCNA (weighted gene co-expression network analysis) method. Two gene expression profile datasets, GSE55235 (10 RA samples and 10 healthy controls) and GSE77298 (16 RA samples and seven healthy controls), were downloaded from the Gene Expression Omnibus database. Characteristic genes were identified using the metaDE package. WGCNA was used to find disease-related networks based on gene expression correlation coefficients, and module significance was defined as the average gene significance of all genes used to assess the correlation between the module and RA status. Genes in the disease-related gene co-expression network were subject to functional annotation and pathway enrichment analysis using the Database for Annotation, Visualization and Integrated Discovery. Characteristic genes were also mapped to the Connectivity Map to screen small molecules. A total of 599 characteristic genes were identified. For each dataset, characteristic genes in the green, red and turquoise modules were most closely associated with RA, with gene numbers of 54, 43 and 79, respectively. These genes were enriched in a total of 17 Gene Ontology terms, mainly related to immune response (CD97, FYB, CXCL1, IKBKE, CCR1, etc.), inflammatory response (CD97, CXCL1, C3AR1, CCR1, LYZ, etc.) and homeostasis (C3AR1, CCR1, PLN, CCL19, PPT1, etc.). Two small-molecule drugs, sanguinarine and papaverine, were predicted to have a therapeutic effect against RA. Genes related to immune response, inflammatory response and homeostasis presumably have critical roles in RA pathogenesis. Sanguinarine and papaverine have a potential therapeutic effect against RA. © 2017 Asia Pacific League of Associations for Rheumatology and John Wiley & Sons Australia, Ltd.
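
    The definition of module significance as the average gene significance can be sketched as below. This is a rough illustration, not the WGCNA package itself: it assumes that gene significance is the absolute Pearson correlation between a gene's expression and a 0/1 disease status, and the gene names, module memberships and expression values are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def module_significance(expression, module_genes, status):
    """Average of |correlation with status| over the genes in one module."""
    gene_significance = [abs(pearson(expression[g], status)) for g in module_genes]
    return mean(gene_significance)


# Hypothetical data: three disease samples (1) followed by three controls (0).
status = [1, 1, 1, 0, 0, 0]
expression = {
    "g1": [5.0, 5.2, 4.9, 2.0, 2.1, 1.9],  # higher in disease samples
    "g2": [1.0, 1.2, 0.9, 3.0, 3.1, 2.8],  # lower in disease samples
    "g3": [2.0, 3.0, 2.5, 2.4, 2.9, 2.1],  # unrelated to status
}

ms_related = module_significance(expression, ["g1", "g2"], status)
ms_unrelated = module_significance(expression, ["g3"], status)
print(round(ms_related, 3), round(ms_unrelated, 3))
```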

  17. Identifying gene networks underlying the neurobiology of ethanol and alcoholism.

    PubMed

    Wolen, Aaron R; Miles, Michael F

    2012-01-01

    For complex disorders such as alcoholism, identifying the genes linked to these diseases and their specific roles is difficult. Traditional genetic approaches, such as genetic association studies (including genome-wide association studies) and analyses of quantitative trait loci (QTLs) in both humans and laboratory animals already have helped identify some candidate genes. However, because of technical obstacles, such as the small impact of any individual gene, these approaches only have limited effectiveness in identifying specific genes that contribute to complex diseases. The emerging field of systems biology, which allows for analyses of entire gene networks, may help researchers better elucidate the genetic basis of alcoholism, both in humans and in animal models. Such networks can be identified using approaches such as high-throughput molecular profiling (e.g., through microarray-based gene expression analyses) or strategies referred to as genetical genomics, such as the mapping of expression QTLs (eQTLs). Characterization of gene networks can shed light on the biological pathways underlying complex traits and provide the functional context for identifying those genes that contribute to disease development.

  18. Identifying key genes in glaucoma based on a benchmarked dataset and the gene regulatory network.

    PubMed

    Chen, Xi; Wang, Qiao-Ling; Zhang, Meng-Hui

    2017-10-01

    The current study aimed to identify key genes in glaucoma based on a benchmarked dataset and gene regulatory network (GRN). Local and global noise was added to the gene expression dataset to produce a benchmarked dataset. Differentially-expressed genes (DEGs) between patients with glaucoma and normal controls were identified utilizing the Linear Models for Microarray Data (Limma) package based on the benchmarked dataset. A total of 5 GRN inference methods, including Zscore, GeneNet, context likelihood of relatedness (CLR) algorithm, Partial Correlation coefficient with Information Theory (PCIT) and GEne Network Inference with Ensemble of Trees (Genie3), were evaluated using receiver operating characteristic (ROC) and precision and recall (PR) curves. The inference method with the best performance was selected to construct the GRN. Subsequently, topological centrality analysis (degree, closeness and betweenness) was conducted to identify key genes in the GRN of glaucoma. Finally, the key genes were validated by performing reverse transcription-quantitative polymerase chain reaction (RT-qPCR). A total of 176 DEGs were detected from the benchmarked dataset. The ROC and PR curves of the 5 methods were analyzed and it was determined that Genie3 had a clear advantage over the other methods; thus, Genie3 was used to construct the GRN. Following topological centrality analysis, 14 key genes for glaucoma were identified, including IL6, EPHA2 and GSTT1, and 5 of these 14 key genes were validated by RT-qPCR. Therefore, the current study identified 14 key genes in glaucoma, which may be potential biomarkers to use in the diagnosis of glaucoma and aid in identifying the molecular mechanism of this disease.
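
    The three centrality measures named in this abstract can be sketched for a small unweighted, undirected network as follows. The network and node labels are hypothetical, and the implementation (breadth-first search for closeness, Brandes' algorithm for unnormalized betweenness) is one common choice rather than the procedure used in the study.

```python
from collections import deque


def degree_centrality(graph):
    """Number of neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def closeness_centrality(graph):
    """(Reachable nodes) / (sum of shortest-path distances) for each node."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(graph):
    """Unnormalized betweenness via Brandes' algorithm (undirected graph)."""
    bc = dict.fromkeys(graph, 0.0)
    for s in graph:
        stack, queue = [], deque([s])
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {node: value / 2 for node, value in bc.items()}  # pairs counted twice


# Hypothetical network: hub A linked to B, C and D; D also linked to E.
network = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A", "E"], "E": ["D"]}

degree = degree_centrality(network)
closeness = closeness_centrality(network)
betweenness = betweenness_centrality(network)
print(degree, closeness, betweenness)
```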

  19. Diametrical clustering for identifying anti-correlated gene clusters.

    PubMed

    Dhillon, Inderjit S; Marcotte, Edward M; Roshan, Usman

    2003-09-01

    Clustering genes based upon their expression patterns allows us to predict gene function. Most existing clustering algorithms cluster genes together when their expression patterns show high positive correlation. However, it has been observed that genes whose expression patterns are strongly anti-correlated can also be functionally similar. Biologically, this is not unintuitive: genes responding to the same stimuli, regardless of the nature of the response, are more likely to operate in the same pathways. We present a new diametrical clustering algorithm that explicitly identifies anti-correlated clusters of genes. Our algorithm proceeds by iteratively (i) re-partitioning the genes and (ii) computing the dominant singular vector of each gene cluster, each singular vector serving as the prototype of a 'diametric' cluster. We empirically show the effectiveness of the algorithm in identifying diametrical or anti-correlated clusters. Testing the algorithm on yeast cell cycle data, fibroblast gene expression data, and DNA microarray data from yeast mutants reveals that opposed cellular pathways can be discovered with this method. We present systems whose mRNA expression patterns, and likely their functions, oppose the yeast ribosome and proteasome, along with evidence for the inverse transcriptional regulation of a number of cellular systems.
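
    The two alternating steps of the algorithm can be sketched as follows. Several details are assumptions made for illustration and are not stated in the abstract: expression profiles are mean-centered and scaled to unit length, a gene is assigned to the prototype with the largest squared dot product (so correlated and anti-correlated genes land together), the dominant singular vector is approximated by power iteration, and the initial prototypes and toy profiles are hypothetical.

```python
from math import sqrt


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def center_and_normalize(v):
    """Subtract the mean and scale to unit Euclidean length."""
    mu = sum(v) / len(v)
    centered = [x - mu for x in v]
    norm = sqrt(dot(centered, centered))
    return [x / norm for x in centered]


def dominant_singular_vector(members, start, iterations=100):
    """Power iteration on the sum of outer products x x^T of the members."""
    v = start
    for _ in range(iterations):
        w = [0.0] * len(v)
        for x in members:
            c = dot(x, v)
            for i, xi in enumerate(x):
                w[i] += c * xi
        norm = sqrt(dot(w, w))
        if norm == 0:
            return v
        v = [wi / norm for wi in w]
    return v


def diametrical_clustering(profiles, initial_prototypes, rounds=10):
    genes = [center_and_normalize(p) for p in profiles]
    prototypes = [center_and_normalize(p) for p in initial_prototypes]
    labels = []
    for _ in range(rounds):
        # (i) re-partition: the squared dot product ignores the sign of correlation
        labels = [
            max(range(len(prototypes)), key=lambda j: dot(g, prototypes[j]) ** 2)
            for g in genes
        ]
        # (ii) update each prototype from its current members
        for j in range(len(prototypes)):
            members = [g for g, label in zip(genes, labels) if label == j]
            if members:
                prototypes[j] = dominant_singular_vector(members, prototypes[j])
    return labels, prototypes


# Toy profiles: genes 0 and 1 are anti-correlated, as are genes 2 and 3.
profiles = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 3, 1, 3], [3, 1, 3, 1]]
labels, prototypes = diametrical_clustering(profiles, [profiles[0], profiles[2]])
print(labels)
```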

  20. MVisAGe Identifies Concordant and Discordant Genomic Alterations of Driver Genes in Squamous Tumors.

    PubMed

    Walter, Vonn; Du, Ying; Danilova, Ludmila; Hayward, Michele C; Hayes, D Neil

    2018-06-15

    Integrated analyses of multiple genomic datatypes are now common in cancer profiling studies. Such data present opportunities for numerous computational experiments, yet analytic pipelines are limited. Tools such as the cBioPortal and Regulome Explorer, although useful, are not easy to access programmatically or to implement locally. Here, we introduce the MVisAGe R package, which allows users to quantify gene-level associations between two genomic datatypes to investigate the effect of genomic alterations (e.g., DNA copy number changes on gene expression). Visualizing Pearson/Spearman correlation coefficients according to the genomic positions of the underlying genes provides a powerful yet novel tool for conducting exploratory analyses. We demonstrate its utility by analyzing three publicly available cancer datasets. Our approach highlights canonical oncogenes in chr11q13 that displayed the strongest associations between expression and copy number, including CCND1 and CTTN, genes not identified by copy number analysis in the primary reports. We demonstrate highly concordant usage of shared oncogenes on chr3q, yet strikingly diverse oncogene usage on chr11q as a function of HPV infection status. Regions of chr19 that display remarkable associations between methylation and gene expression were identified, as were previously unreported miRNA-gene expression associations that may contribute to the epithelial-to-mesenchymal transition. Significance: This study presents an important bioinformatics tool that will enable integrated analyses of multiple genomic datatypes. Cancer Res; 78(12); 3375-85. ©2018 American Association for Cancer Research.
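
    The idea of computing a per-gene correlation between two datatypes and ordering the results by genomic position can be sketched as follows. This is not the MVisAGe R package or its interface; the function names, gene labels, positions and values are all hypothetical, and Spearman correlation is computed here as the Pearson correlation of average ranks.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def correlations_by_position(genes):
    """Return (position, gene, rho) tuples sorted by genomic position."""
    rows = [
        (g["position"], g["name"], spearman(g["copy_number"], g["expression"]))
        for g in genes
    ]
    return sorted(rows)


# Hypothetical per-sample values for two genes measured in five tumours.
copy_number = [-0.5, 0.0, 0.3, 0.8, 1.2]
genes = [
    {"name": "geneA", "position": 100, "copy_number": copy_number,
     "expression": [1.0, 2.0, 2.5, 4.0, 6.0]},
    {"name": "geneB", "position": 50, "copy_number": copy_number,
     "expression": [3.0, 1.0, 4.0, 2.0, 3.5]},
]

profile = correlations_by_position(genes)
for position, name, rho in profile:
    print(position, name, round(rho, 2))
```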

  1. ENU Mutagenesis in Mice Identifies Candidate Genes For Hypogonadism

    PubMed Central

    Weiss, Jeffrey; Hurley, Lisa A.; Harris, Rebecca M.; Finlayson, Courtney; Tong, Minghan; Fisher, Lisa A.; Moran, Jennifer L.; Beier, David R.; Mason, Christopher; Jameson, J. Larry

    2012-01-01

    Genome-wide mutagenesis was performed in mice to identify candidate genes for male infertility, for which the predominant causes remain idiopathic. Mice were mutagenized using N-ethyl-N-nitrosourea (ENU), bred, and screened for phenotypes associated with the male urogenital system. Fifteen heritable lines were isolated and chromosomal loci were assigned using low density genome-wide SNP arrays. Ten of the fifteen lines were pursued further using higher resolution SNP analysis to narrow the candidate gene regions. Exon sequencing of candidate genes identified mutations in mice with cystic kidneys (Bicc1), cryptorchidism (Rxfp2), restricted germ cell deficiency (Plk4), and severe germ cell deficiency (Prdm9). In two other lines with severe hypogonadism candidate sequencing failed to identify mutations, suggesting defects in genes with previously undocumented roles in gonadal function. These genomic intervals were sequenced in their entirety and a candidate mutation was identified in SnrpE in one of the two lines. The line harboring the SnrpE variant retains substantial spermatogenesis despite small testis size, an unusual phenotype. In addition to the reproductive defects, heritable phenotypes were observed in mice with ataxia (Myo5a), tremors (Pmp22), growth retardation (unknown gene), and hydrocephalus (unknown gene). These results demonstrate that the ENU screen is an effective tool for identifying potential causes of male infertility. PMID:22258617

  2. LNDriver: identifying driver genes by integrating mutation and expression data based on gene-gene interaction network.

    PubMed

    Wei, Pi-Jing; Zhang, Di; Xia, Junfeng; Zheng, Chun-Hou

    2016-12-23

    Cancer is a complex disease characterized by the accumulation of genetic alterations during the patient's lifetime. With the development of next-generation sequencing technology, multiple omics data, such as cancer genomic, epigenomic and transcriptomic data, can be measured from each individual. Correspondingly, one of the key challenges is to pinpoint functional driver mutations or pathways, which contribute to tumorigenesis, from millions of functionally neutral passenger mutations. In this paper, in order to identify driver genes effectively, we applied a generalized additive model to mutation profiles to filter genes with long length and constructed a new gene-gene interaction network. Then we integrated the mutation data and expression data into the gene-gene interaction network. Lastly, a greedy algorithm was used to prioritize candidate driver genes from the integrated data. We named the proposed method Length-Net-Driver (LNDriver). Experiments on three TCGA datasets, i.e., head and neck squamous cell carcinoma, kidney renal clear cell carcinoma and thyroid carcinoma, demonstrated that the proposed method was effective. It can identify not only frequently mutated drivers but also rare candidate driver genes.

  3. Identifying key genes associated with acute myocardial infarction.

    PubMed

    Cheng, Ming; An, Shoukuan; Li, Junquan

    2017-10-01

    This study aimed to identify key genes associated with acute myocardial infarction (AMI) by reanalyzing microarray data. Three gene expression profile datasets GSE66360, GSE34198, and GSE48060 were downloaded from GEO database. After data preprocessing, genes without heterogeneity across different platforms were subjected to differential expression analysis between the AMI group and the control group using metaDE package. P < .05 was used as the cutoff for a differentially expressed gene (DEG). The expression data matrices of DEGs were imported in ReactomeFIViz to construct a gene functional interaction (FI) network. Then, DEGs in each module were subjected to pathway enrichment analysis using DAVID. MiRNAs and transcription factors predicted to regulate target DEGs were identified. Quantitative real-time polymerase chain reaction (RT-PCR) was applied to verify the expression of genes. A total of 913 upregulated genes and 1060 downregulated genes were identified in the AMI group. A FI network consists of 21 modules and DEGs in 12 modules were significantly enriched in pathways. The transcription factor-miRNA-gene network contains 2 transcription factors FOXO3 and MYBL2, and 2 miRNAs hsa-miR-21-5p and hsa-miR-30c-5p. RT-PCR validations showed that expression levels of FOXO3 and MYBL2 were significantly increased in AMI, and expression levels of hsa-miR-21-5p and hsa-miR-30c-5p were obviously decreased in AMI. A total of 41 DEGs, such as SOCS3, VAPA, and COL5A2, are speculated to have roles in the pathogenesis of AMI; 2 transcription factors FOXO3 and MYBL2, and 2 miRNAs hsa-miR-21-5p and hsa-miR-30c-5p may be involved in the regulation of the expression of these DEGs.

  4. Mapping eQTLs in the Norfolk Island Genetic Isolate Identifies Candidate Genes for CVD Risk Traits

    PubMed Central

    Benton, Miles C.; Lea, Rod A.; Macartney-Coxson, Donia; Carless, Melanie A.; Göring, Harald H.; Bellis, Claire; Hanna, Michelle; Eccles, David; Chambers, Geoffrey K.; Curran, Joanne E.; Harper, Jacquie L.; Blangero, John; Griffiths, Lyn R.

    2013-01-01

    Cardiovascular disease (CVD) affects millions of people worldwide and is influenced by numerous factors, including lifestyle and genetics. Expression quantitative trait loci (eQTLs) influence gene expression and are good candidates for CVD risk. Founder-effect pedigrees can provide additional power to map genes associated with disease risk. Therefore, we identified eQTLs in the genetic isolate of Norfolk Island (NI) and tested for associations between these and CVD risk factors. We measured genome-wide transcript levels of blood lymphocytes in 330 individuals and used pedigree-based heritability analysis to identify heritable transcripts. eQTLs were identified by genome-wide association testing of these transcripts. Testing for association between CVD risk factors (i.e., blood lipids, blood pressure, and body fat indices) and eQTLs revealed 1,712 heritable transcripts (p < 0.05) with heritability values ranging from 0.18 to 0.84. From these, we identified 200 cis-acting and 70 trans-acting eQTLs (p < 1.84 × 10^-7). An eQTL-centric analysis of CVD risk traits revealed multiple associations, including 12 previously associated with CVD-related traits. Trait versus eQTL regression modeling identified four CVD risk candidates (NAAA, PAPSS1, NME1, and PRDX1), all of which have known biological roles in disease. In addition, we implicated several genes previously associated with CVD risk traits, including MTHFR and FN3KRP. We have successfully identified a panel of eQTLs in the NI pedigree and used this to implicate several genes in CVD risk. Future studies are required for further assessing the functional importance of these eQTLs and whether the findings here also relate to outbred populations. PMID:24314549

  5. A genomic approach to identify hybrid incompatibility genes.

    PubMed

    Cooper, Jacob C; Phadnis, Nitin

    2016-07-02

    Uncovering the genetic and molecular basis of barriers to gene flow between populations is key to understanding how new species are born. Intrinsic postzygotic reproductive barriers such as hybrid sterility and hybrid inviability are caused by deleterious genetic interactions known as hybrid incompatibilities. The difficulty in identifying these hybrid incompatibility genes remains a rate-limiting step in our understanding of the molecular basis of speciation. We recently described how whole genome sequencing can be applied to identify hybrid incompatibility genes, even from genetically terminal hybrids. Using this approach, we discovered a new hybrid incompatibility gene, gfzf, between Drosophila melanogaster and Drosophila simulans, and found that it plays an essential role in cell cycle regulation. Here, we discuss the history of the hunt for incompatibility genes between these species, discuss the molecular roles of gfzf in cell cycle regulation, and explore how intragenomic conflict drives the evolution of fundamental cellular mechanisms that lead to the developmental arrest of hybrids.

  6. A genomic approach to identify hybrid incompatibility genes

    PubMed Central

    Cooper, Jacob C.; Phadnis, Nitin

    2016-01-01

    Uncovering the genetic and molecular basis of barriers to gene flow between populations is key to understanding how new species are born. Intrinsic postzygotic reproductive barriers such as hybrid sterility and hybrid inviability are caused by deleterious genetic interactions known as hybrid incompatibilities. The difficulty in identifying these hybrid incompatibility genes remains a rate-limiting step in our understanding of the molecular basis of speciation. We recently described how whole genome sequencing can be applied to identify hybrid incompatibility genes, even from genetically terminal hybrids. Using this approach, we discovered a new hybrid incompatibility gene, gfzf, between Drosophila melanogaster and Drosophila simulans, and found that it plays an essential role in cell cycle regulation. Here, we discuss the history of the hunt for incompatibility genes between these species, discuss the molecular roles of gfzf in cell cycle regulation, and explore how intragenomic conflict drives the evolution of fundamental cellular mechanisms that lead to the developmental arrest of hybrids. PMID:27230814

  7. Vasohibin-1 is identified as a master-regulator of endothelial cell apoptosis using gene network analysis

    PubMed Central

    2013-01-01

    Background: Apoptosis is a critical process in endothelial cell (EC) biology and pathology, which has been extensively studied at protein level. Numerous gene expression studies of EC apoptosis have also been performed; however, few attempts have been made to use gene expression data to identify the molecular relationships and master regulators that underlie EC apoptosis. Therefore, we sought to understand these relationships by generating a Bayesian gene regulatory network (GRN) model. Results: ECs were induced to undergo apoptosis using serum withdrawal and followed over a time course in triplicate, using microarrays. When generating the GRN, this EC time course data was supplemented by a library of microarray data from EC treated with siRNAs targeting over 350 signalling molecules. The GRN model proposed Vasohibin-1 (VASH1) as one of the candidate master-regulators of EC apoptosis with numerous downstream mRNAs. To evaluate the role played by VASH1 in EC, we used siRNA to reduce the expression of VASH1. Of 10 mRNAs downstream of VASH1 in the GRN that were examined, 7 were significantly up- or down-regulated in the direction predicted by the GRN. Further supporting an important biological role of VASH1 in EC, targeted reduction of VASH1 mRNA abundance conferred resistance to serum withdrawal-induced EC death. Conclusion: We have utilised Bayesian GRN modelling to identify a novel candidate master regulator of EC apoptosis. This study demonstrates how GRN technology can complement traditional methods to hypothesise the regulatory relationships that underlie important biological processes. PMID:23324451

  8. Identifying a gene expression signature of cluster headache in blood

    PubMed Central

    Eising, Else; Pelzer, Nadine; Vijfhuizen, Lisanne S.; Vries, Boukje de; Ferrari, Michel D.; ‘t Hoen, Peter A. C.; Terwindt, Gisela M.; van den Maagdenberg, Arn M. J. M.

    2017-01-01

    Cluster headache is a relatively rare headache disorder, typically characterized by multiple daily, short-lasting attacks of excruciating, unilateral (peri-)orbital or temporal pain associated with autonomic symptoms and restlessness. To better understand the pathophysiology of cluster headache, we used RNA sequencing to identify differentially expressed genes and pathways in whole blood of patients with episodic (n = 19) or chronic (n = 20) cluster headache in comparison with headache-free controls (n = 20). Gene expression data were analysed by gene and by module of co-expressed genes with particular attention to previously implicated disease pathways including hypocretin dysregulation. Only moderate gene expression differences were identified and no associations were found with previously reported pathogenic mechanisms. At the level of functional gene sets, associations were observed for genes involved in several brain-related mechanisms such as GABA receptor function and voltage-gated channels. In addition, genes and modules of co-expressed genes showed a role for intracellular signalling cascades, mitochondria and inflammation. Although larger study samples may be required to identify the full range of involved pathways, these results indicate a role for mitochondria, intracellular signalling and inflammation in cluster headache. PMID:28074859

  9. Identifying key genes associated with acute myocardial infarction

    PubMed Central

    Cheng, Ming; An, Shoukuan; Li, Junquan

    2017-01-01

    Background: This study aimed to identify key genes associated with acute myocardial infarction (AMI) by reanalyzing microarray data. Methods: Three gene expression profile datasets GSE66360, GSE34198, and GSE48060 were downloaded from GEO database. After data preprocessing, genes without heterogeneity across different platforms were subjected to differential expression analysis between the AMI group and the control group using metaDE package. P < .05 was used as the cutoff for a differentially expressed gene (DEG). The expression data matrices of DEGs were imported in ReactomeFIViz to construct a gene functional interaction (FI) network. Then, DEGs in each module were subjected to pathway enrichment analysis using DAVID. MiRNAs and transcription factors predicted to regulate target DEGs were identified. Quantitative real-time polymerase chain reaction (RT-PCR) was applied to verify the expression of genes. Result: A total of 913 upregulated genes and 1060 downregulated genes were identified in the AMI group. A FI network consists of 21 modules and DEGs in 12 modules were significantly enriched in pathways. The transcription factor-miRNA-gene network contains 2 transcription factors FOXO3 and MYBL2, and 2 miRNAs hsa-miR-21-5p and hsa-miR-30c-5p. RT-PCR validations showed that expression levels of FOXO3 and MYBL2 were significantly increased in AMI, and expression levels of hsa-miR-21-5p and hsa-miR-30c-5p were obviously decreased in AMI. Conclusion: A total of 41 DEGs, such as SOCS3, VAPA, and COL5A2, are speculated to have roles in the pathogenesis of AMI; 2 transcription factors FOXO3 and MYBL2, and 2 miRNAs hsa-miR-21-5p and hsa-miR-30c-5p may be involved in the regulation of the expression of these DEGs. PMID:29049183

  10. Gene-based rare allele analysis identified a risk gene of Alzheimer's disease.

    PubMed

    Kim, Jong Hun; Song, Pamela; Lim, Hyunsun; Lee, Jae-Hyung; Lee, Jun Hong; Park, Sun Ah

    2014-01-01

    Alzheimer's disease (AD) has a strong propensity to run in families. However, the known risk genes excluding APOE are not clinically useful. In various complex diseases, gene studies have targeted rare alleles for unsolved heritability. Our study aims to elucidate previously unknown risk genes for AD by targeting rare alleles. We used data from five publicly available genetic studies from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the database of Genotypes and Phenotypes (dbGaP). A total of 4,171 cases and 9,358 controls were included. The genotype information of rare alleles was imputed using 1,000 genomes. We performed gene-based analysis of rare alleles (minor allele frequency ≤ 3%). The genome-wide significance level was defined as meta P < 1.8×10^-6 (0.05/number of genes in human genome = 0.05/28,517). ZNF628, which is located at chromosome 19q13.42, showed a genome-wide significant association with AD. The association of ZNF628 with AD was not dependent on APOE ε4. APOE and TREM2 were also significantly associated with AD, although not at genome-wide significance levels. Other genes identified by targeting common alleles could not be replicated in our gene-based rare allele analysis. We identified that rare variants in ZNF628 are associated with AD. The protein encoded by ZNF628 is known as a transcription factor. Furthermore, the associations of APOE and TREM2 with AD were highly significant, even in gene-based rare allele analysis, which implies that further deep sequencing of these genes is required in AD heritability studies.
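
    The genome-wide significance level quoted in this abstract is a Bonferroni-style correction: the nominal 0.05 level divided by the number of genes tested. The arithmetic can be checked with a few lines; the variable and function names below are hypothetical, and only the two numbers (0.05 and 28,517) come from the abstract.

```python
ALPHA = 0.05        # nominal significance level, from the abstract
N_GENES = 28_517    # number of genes tested, from the abstract

# Bonferroni-style per-gene threshold: alpha divided by the number of tests.
threshold = ALPHA / N_GENES

def is_genome_wide_significant(p_value, cutoff=threshold):
    """Return True when a gene-level p-value passes the corrected cutoff."""
    return p_value < cutoff

print(f"{threshold:.2e}")  # about 1.75e-06, which the abstract reports as 1.8x10^-6
```

    The exact quotient is slightly below the rounded figure of 1.8×10^-6 given in the abstract, so a p-value between the two would be classified differently depending on which cutoff is applied.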

  11. A gene-trap strategy identifies quiescence-induced genes in synchronized myoblasts.

    PubMed

    Sambasivan, Ramkumar; Pavlath, Grace K; Dhawan, Jyotsna

    2008-03-01

    Cellular quiescence is characterized not only by reduced mitotic and metabolic activity but also by altered gene expression. Growing evidence suggests that quiescence is not merely a basal state but is regulated by active mechanisms. To understand the molecular programme that governs reversible cell cycle exit, we focused on quiescence-related gene expression in a culture model of myogenic cell arrest and activation. Here we report the identification of quiescence-induced genes using a gene-trap strategy. Using a retroviral vector, we generated a library of gene traps in C2C12 myoblasts that were screened for arrest-induced insertions by live cell sorting (FACS-gal). Several independent gene-trap lines revealed arrest-dependent induction of betagal activity, confirming the efficacy of the FACS screen. The locus of integration was identified in 15 lines. In three lines, insertion occurred in genes previously implicated in the control of quiescence, i.e. EMSY - a BRCA2-interacting protein, p8/com1 - a p300HAT-binding protein and MLL5 - a SET domain protein. Our results demonstrate that expression of chromatin modulatory genes is induced in G0, providing support to the notion that this reversibly arrested state is actively regulated.

  12. A Strategy for Identifying Quantitative Trait Genes Using Gene Expression Analysis and Causal Analysis.

    PubMed

    Ishikawa, Akira

    2017-11-27

    Large numbers of quantitative trait loci (QTL) affecting complex diseases and other quantitative traits have been reported in humans and model animals. However, the genetic architecture of these traits remains elusive due to the difficulty in identifying causal quantitative trait genes (QTGs) for common QTL with relatively small phenotypic effects. A traditional strategy based on techniques such as positional cloning does not always enable identification of a single candidate gene for a QTL of interest because it is difficult to narrow down a target genomic interval of the QTL to a very small interval harboring only one gene. A combination of gene expression analysis and statistical causal analysis can greatly reduce the number of candidate genes. This integrated approach provides causal evidence that one of the candidate genes is a putative QTG for the QTL. Using this approach, I have recently succeeded in identifying a single putative QTG for resistance to obesity in mice. Here, I outline the integration approach and discuss its usefulness using my studies as an example.

  13. Genome-wide RNAi screening identifies protein damage as a regulator of osmoprotective gene expression.

    PubMed

    Lamitina, Todd; Huang, Chunyi George; Strange, Kevin

    2006-08-08

    The detection, stabilization, and repair of stress-induced damage are essential requirements for cellular life. All cells respond to osmotic stress-induced water loss with increased expression of genes that mediate accumulation of organic osmolytes, solutes that function as chemical chaperones and restore osmotic homeostasis. The signals and signaling mechanisms that regulate osmoprotective gene expression in animal cells are poorly understood. Here, we show that gpdh-1 and gpdh-2, genes that mediate the accumulation of the organic osmolyte glycerol, are essential for survival of the nematode Caenorhabditis elegans during osmotic stress. Expression of GFP driven by the gpdh-1 promoter (P(gpdh-1)::GFP) is detected only during hypertonic stress but is not induced by other stressors. Using P(gpdh-1)::GFP expression as a phenotype, we screened approximately 16,000 genes by RNAi feeding and identified 122 that cause constitutive activation of gpdh-1 expression and glycerol accumulation. Many of these genes function to regulate protein translation and cotranslational protein folding and to target and degrade denatured proteins, suggesting that the accumulation of misfolded proteins functions as a signal to activate osmoprotective gene expression and organic osmolyte accumulation in animal cells. Consistent with this hypothesis, 73% of these protein-homeostasis genes have been shown to slow age-dependent protein aggregation in C. elegans. Because diverse environmental stressors and numerous disease states result in protein misfolding, mechanisms must exist that discriminate between osmotically induced and other forms of stress-induced protein damage. Our findings provide a foundation for understanding how these damage-selectivity mechanisms function.

  14. Genome-wide RNAi screening identifies protein damage as a regulator of osmoprotective gene expression

    PubMed Central

    Lamitina, Todd; Huang, Chunyi George; Strange, Kevin

    2006-01-01

    The detection, stabilization, and repair of stress-induced damage are essential requirements for cellular life. All cells respond to osmotic stress-induced water loss with increased expression of genes that mediate accumulation of organic osmolytes, solutes that function as chemical chaperones and restore osmotic homeostasis. The signals and signaling mechanisms that regulate osmoprotective gene expression in animal cells are poorly understood. Here, we show that gpdh-1 and gpdh-2, genes that mediate the accumulation of the organic osmolyte glycerol, are essential for survival of the nematode Caenorhabditis elegans during osmotic stress. Expression of GFP driven by the gpdh-1 promoter (Pgpdh-1::GFP) is detected only during hypertonic stress but is not induced by other stressors. Using Pgpdh-1::GFP expression as a phenotype, we screened ≈16,000 genes by RNAi feeding and identified 122 that cause constitutive activation of gpdh-1 expression and glycerol accumulation. Many of these genes function to regulate protein translation and cotranslational protein folding and to target and degrade denatured proteins, suggesting that the accumulation of misfolded proteins functions as a signal to activate osmoprotective gene expression and organic osmolyte accumulation in animal cells. Consistent with this hypothesis, 73% of these protein-homeostasis genes have been shown to slow age-dependent protein aggregation in C. elegans. Because diverse environmental stressors and numerous disease states result in protein misfolding, mechanisms must exist that discriminate between osmotically induced and other forms of stress-induced protein damage. Our findings provide a foundation for understanding how these damage-selectivity mechanisms function. PMID:16880390

  15. Gene-Trap Mutagenesis Identifies Mammalian Genes Contributing to Intoxication by Clostridium perfringens ε-Toxin

    PubMed Central

    Ivie, Susan E.; Fennessey, Christine M.; Sheng, Jinsong; Rubin, Donald H.; McClain, Mark S.

    2011-01-01

    The Clostridium perfringens ε-toxin is an extremely potent toxin associated with lethal toxemias in domesticated ruminants and may be toxic to humans. Intoxication results in fluid accumulation in various tissues, most notably in the brain and kidneys. Previous studies suggest that the toxin is a pore-forming toxin, leading to dysregulated ion homeostasis and ultimately cell death. However, mammalian host factors that likely contribute to ε-toxin-induced cytotoxicity are poorly understood. A library of insertional mutant Madin Darby canine kidney (MDCK) cells, which are highly susceptible to the lethal effects of ε-toxin, was used to select clones of cells resistant to ε-toxin-induced cytotoxicity. The genes mutated in 9 surviving resistant cell clones were identified. We focused additional experiments on one of the identified genes as a means of validating the experimental approach. Gene expression microarray analysis revealed that one of the identified genes, hepatitis A virus cellular receptor 1 (HAVCR1, KIM-1, TIM1), is more abundantly expressed in human kidney cell lines than it is expressed in human cells known to be resistant to ε-toxin. One human kidney cell line, ACHN, was found to be sensitive to the toxin and expresses a larger isoform of the HAVCR1 protein than the HAVCR1 protein expressed by other, toxin-resistant human kidney cell lines. RNA interference studies in MDCK and in ACHN cells confirmed that HAVCR1 contributes to ε-toxin-induced cytotoxicity. Additionally, ε-toxin was shown to bind to HAVCR1 in vitro. The results of this study indicate that HAVCR1 and the other genes identified through the use of gene-trap mutagenesis and RNA interference strategies represent important targets for investigation of the process by which ε-toxin induces cell death and new targets for potential therapeutic intervention. PMID:21412435

  16. Gene-trap mutagenesis identifies mammalian genes contributing to intoxication by Clostridium perfringens ε-toxin.

    PubMed

    Ivie, Susan E; Fennessey, Christine M; Sheng, Jinsong; Rubin, Donald H; McClain, Mark S

    2011-03-11

    The Clostridium perfringens ε-toxin is an extremely potent toxin associated with lethal toxemias in domesticated ruminants and may be toxic to humans. Intoxication results in fluid accumulation in various tissues, most notably in the brain and kidneys. Previous studies suggest that the toxin is a pore-forming toxin, leading to dysregulated ion homeostasis and ultimately cell death. However, mammalian host factors that likely contribute to ε-toxin-induced cytotoxicity are poorly understood. A library of insertional mutant Madin Darby canine kidney (MDCK) cells, which are highly susceptible to the lethal effects of ε-toxin, was used to select clones of cells resistant to ε-toxin-induced cytotoxicity. The genes mutated in 9 surviving resistant cell clones were identified. We focused additional experiments on one of the identified genes as a means of validating the experimental approach. Gene expression microarray analysis revealed that one of the identified genes, hepatitis A virus cellular receptor 1 (HAVCR1, KIM-1, TIM1), is more abundantly expressed in human kidney cell lines than it is expressed in human cells known to be resistant to ε-toxin. One human kidney cell line, ACHN, was found to be sensitive to the toxin and expresses a larger isoform of the HAVCR1 protein than the HAVCR1 protein expressed by other, toxin-resistant human kidney cell lines. RNA interference studies in MDCK and in ACHN cells confirmed that HAVCR1 contributes to ε-toxin-induced cytotoxicity. Additionally, ε-toxin was shown to bind to HAVCR1 in vitro. The results of this study indicate that HAVCR1 and the other genes identified through the use of gene-trap mutagenesis and RNA interference strategies represent important targets for investigation of the process by which ε-toxin induces cell death and new targets for potential therapeutic intervention.

  17. A Penalized Robust Method for Identifying Gene-Environment Interactions

    PubMed Central

    Shi, Xingjie; Liu, Jin; Huang, Jian; Zhou, Yong; Xie, Yang; Ma, Shuangge

    2015-01-01

    In high-throughput studies, an important objective is to identify gene-environment interactions associated with disease outcomes and phenotypes. Many commonly adopted methods assume specific parametric or semiparametric models, which may be subject to model mis-specification. In addition, they usually use significance level as the criterion for selecting important interactions. In this study, we adopt the rank-based estimation, which is much less sensitive to model specification than some of the existing methods and includes several commonly encountered data and models as special cases. Penalization is adopted for the identification of gene-environment interactions. It achieves simultaneous estimation and identification and does not rely on significance level. For computation feasibility, a smoothed rank estimation is further proposed. Simulation shows that under certain scenarios, for example with contaminated or heavy-tailed data, the proposed method can significantly outperform the existing alternatives with more accurate identification. We analyze a lung cancer prognosis study with gene expression measurements under the AFT (accelerated failure time) model. The proposed method identifies interactions different from those using the alternatives. Some of the identified genes have important implications. PMID:24616063

  18. Predicting hepatocellular carcinoma through cross-talk genes identified by risk pathways

    PubMed Central

    Shao, Zhuo; Huo, Diwei; Zhang, Denan; Xie, Hongbo; Yang, Jingbo; Liu, Qiuqi; Chen, Xiujie

    2018-01-01

    Hepatocellular carcinoma (HCC) is the most frequent type of liver cancer, with a poor survival rate and high mortality. Despite efforts to understand the mechanism of HCC, new molecular markers are needed for exact diagnosis, evaluation and treatment. Here, we combined the HCC transcriptome with networks and pathways to identify reliable molecular markers. By integrating 249 differentially expressed genes with syncretic protein interaction networks, we constructed an HCC-specific network, from which we further extracted 480 pivotal genes. Based on the cross-talk between the enriched pathways of the pivotal genes, we finally identified an HCC signature of 45 genes, which could accurately distinguish HCC patients from normal individuals and reveal the prognosis of HCC patients. Among these 45 genes, 15 showed dysregulated expression patterns and some have been reported to be associated with HCC and/or other cancers. These findings suggest that the identified 45-gene signature could provide valuable molecular markers for diagnosis and evaluation of HCC. PMID:29765536

  19. High-resolution linkage analyses to identify genes that influence Varroa sensitive hygiene behavior in honey bees.

    PubMed

    Tsuruda, Jennifer M; Harris, Jeffrey W; Bourgeois, Lanie; Danka, Robert G; Hunt, Greg J

    2012-01-01

    Varroa mites (V. destructor) are a major threat to honey bees (Apis mellifera) and beekeeping worldwide and likely lead to colony decline if colonies are not treated. Most treatments involve chemical control of the mites; however, Varroa has evolved resistance to many of these miticides, leaving beekeepers with a limited number of alternatives. A non-chemical control method is highly desirable for numerous reasons including lack of chemical residues and decreased likelihood of resistance. Varroa sensitive hygiene behavior is one of two behaviors identified that are most important for controlling the growth of Varroa populations in bee hives. To identify genes influencing this trait, a study was conducted to map quantitative trait loci (QTL). Individual workers of a backcross family were observed and evaluated for their VSH behavior in a mite-infested observation hive. Bees that uncapped or removed pupae were identified. The genotypes for 1,340 informative single nucleotide polymorphisms were used to construct a high-resolution genetic map and interval mapping was used to analyze the association of the genotypes with the performance of Varroa sensitive hygiene. We identified one major QTL on chromosome 9 (LOD score = 3.21) and a suggestive QTL on chromosome 1 (LOD = 1.95). The QTL confidence interval on chromosome 9 contains the gene 'no receptor potential A' and a dopamine receptor. 'No receptor potential A' is involved in vision and olfaction in Drosophila, and dopamine signaling has been previously shown to be required for aversive olfactory learning in honey bees, which is probably necessary for identifying mites within brood cells. Further studies on these candidate genes may allow for breeding bees with this trait using marker-assisted selection.

  20. Inferring Gene Family Histories in Yeast Identifies Lineage Specific Expansions

    PubMed Central

    Ames, Ryan M.; Money, Daniel; Lovell, Simon C.

    2014-01-01

    The complement of genes found in the genome is a balance between gene gain and gene loss. Knowledge of the specific genes that are gained and lost over evolutionary time allows an understanding of the evolution of biological functions. Here we use new evolutionary models to infer gene family histories across complete yeast genomes; these models allow us to estimate the relative genome-wide rates of gene birth, death, innovation and extinction (loss of an entire family) for the first time. We show that the rates of gene family evolution vary both between gene families and between species. We are also able to identify those families that have experienced rapid lineage specific expansion/contraction and show that these families are enriched for specific functions. Moreover, we find that families with specific functions are repeatedly expanded in multiple species, suggesting the presence of common adaptations and that these family expansions/contractions are not random. Additionally, we identify potential specialisations, unique to specific species, in the functions of lineage specific expanded families. These results suggest that an important mechanism in the evolution of genome content is the presence of lineage-specific gene family changes. PMID:24921666

  1. The genetics of alcoholism: identifying specific genes through family studies.

    PubMed

    Edenberg, Howard J; Foroud, Tatiana

    2006-09-01

    Alcoholism is a complex disorder with both genetic and environmental risk factors. Studies in humans have begun to elucidate the genetic underpinnings of the risk for alcoholism. Here we briefly review strategies for identifying individual genes in which variations affect the risk for alcoholism and related phenotypes, in the context of one large study that has successfully identified such genes. The Collaborative Study on the Genetics of Alcoholism (COGA) is a family-based study that has collected detailed phenotypic data on individuals in families with multiple alcoholic members. A genome-wide linkage approach led to the identification of chromosomal regions containing genes that influenced alcoholism risk and related phenotypes. Subsequently, single nucleotide polymorphisms (SNPs) were genotyped in positional candidate genes located within the linked chromosomal regions, and analyzed for association with these phenotypes. Using this sequential approach, COGA has detected association with GABRA2, CHRM2 and ADH4; these associations have all been replicated by other researchers. COGA has detected association to additional genes including GABRG3, TAS2R16, SNCA, OPRK1 and PDYN, results that are awaiting confirmation. These successes demonstrate that genes contributing to the risk for alcoholism can be reliably identified using human subjects.

  2. Gene expression patterns combined with bioinformatics analysis identify genes associated with cholangiocarcinoma.

    PubMed

    Li, Chen; Shen, Weixing; Shen, Sheng; Ai, Zhilong

    2013-12-01

    To explore the molecular mechanisms of cholangiocarcinoma (CC), microarray technology was used to find biomarkers for early detection and diagnosis. The gene expression profiles from 6 patients with CC and 5 normal controls were downloaded from Gene Expression Omnibus and compared. As a result, 204 differentially co-expressed genes (DCGs) in CC patients compared to normal controls were identified using a computational bioinformatics analysis. These genes were mainly involved in coenzyme metabolic process, peptidase activity and oxidation reduction. A regulatory network was constructed by mapping the DCGs to known regulation data. Four transcription factors, FOXC1, ZIC2, NKX2-2 and GCGR, were hub nodes in the network. In conclusion, this study provides a set of targets useful for future investigations into molecular biomarker studies. Copyright © 2013 Elsevier Ltd. All rights reserved.

  3. Identifying conserved gene clusters in the presence of homology families.

    PubMed

    He, Xin; Goldwasser, Michael H

    2005-01-01

    The study of conserved gene clusters is important for understanding the forces behind genome organization and evolution, as well as the function of individual genes or gene groups. In this paper, we present a new model and algorithm for identifying conserved gene clusters from pairwise genome comparison. This generalizes a recent model called "gene teams." A gene team is a set of genes that appear homologously in two or more species, possibly in a different order yet with the distance of adjacent genes in the team for each chromosome always no more than a certain threshold. We remove the constraint in the original model that each gene must have a unique occurrence in each chromosome and thus allow the analysis on complex prokaryotic or eukaryotic genomes with extensive paralogs. Our algorithm analyzes a pair of chromosomes in O(mn) time and uses O(m+n) space, where m and n are the number of genes in the respective chromosomes. We demonstrate the utility of our methods by studying two bacterial genomes, E. coli K-12 and B. subtilis. Many of the teams identified by our algorithm correlate with documented E. coli operons, while several others match predicted operons, previously suggested by computational techniques. Our implementation and data are publicly available at euler.slu.edu/~goldwasser/homologyteams/.
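
    The distance condition in the gene-team definition above can be illustrated with a small check. This is a minimal sketch, not the authors' O(mn) algorithm: it assumes the simpler original setting in which each gene occurs once per chromosome, it tests only the adjacency threshold (not maximality), and the names `is_delta_chain`, `is_gene_team_candidate`, and `delta` are hypothetical.

```python
from typing import Dict, Iterable


def is_delta_chain(positions: Dict[str, int], genes: Iterable[str], delta: int) -> bool:
    """Return True if, sorted by position, neighbouring genes are at most delta apart."""
    coords = sorted(positions[g] for g in genes)
    return all(b - a <= delta for a, b in zip(coords, coords[1:]))


def is_gene_team_candidate(
    chrom_a: Dict[str, int],
    chrom_b: Dict[str, int],
    genes: Iterable[str],
    delta: int,
) -> bool:
    """Check the distance condition on both chromosomes (assumes unique occurrences)."""
    genes = list(genes)
    # Every gene must be present on both chromosomes.
    if not all(g in chrom_a and g in chrom_b for g in genes):
        return False
    return is_delta_chain(chrom_a, genes, delta) and is_delta_chain(chrom_b, genes, delta)


# Toy positions (hypothetical data): gene name -> coordinate on the chromosome.
chrom_a = {"g1": 1, "g2": 3, "g3": 4, "g4": 12}
chrom_b = {"g3": 2, "g1": 5, "g2": 6, "g4": 7}
```

    With these toy positions and `delta = 3`, the set {g1, g2, g3} satisfies the condition on both chromosomes even though the gene order differs, while adding g4 breaks it on the first chromosome.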

  4. High-Resolution Linkage Analyses to Identify Genes That Influence Varroa Sensitive Hygiene Behavior in Honey Bees

    PubMed Central

    Tsuruda, Jennifer M.; Harris, Jeffrey W.; Bourgeois, Lanie; Danka, Robert G.; Hunt, Greg J.

    2012-01-01

    Varroa mites (V. destructor) are a major threat to honey bees (Apis mellifera) and beekeeping worldwide and likely lead to colony decline if colonies are not treated. Most treatments involve chemical control of the mites; however, Varroa has evolved resistance to many of these miticides, leaving beekeepers with a limited number of alternatives. A non-chemical control method is highly desirable for numerous reasons including lack of chemical residues and decreased likelihood of resistance. Varroa sensitive hygiene behavior is one of two behaviors identified that are most important for controlling the growth of Varroa populations in bee hives. To identify genes influencing this trait, a study was conducted to map quantitative trait loci (QTL). Individual workers of a backcross family were observed and evaluated for their VSH behavior in a mite-infested observation hive. Bees that uncapped or removed pupae were identified. The genotypes for 1,340 informative single nucleotide polymorphisms were used to construct a high-resolution genetic map and interval mapping was used to analyze the association of the genotypes with the performance of Varroa sensitive hygiene. We identified one major QTL on chromosome 9 (LOD score = 3.21) and a suggestive QTL on chromosome 1 (LOD = 1.95). The QTL confidence interval on chromosome 9 contains the gene ‘no receptor potential A’ and a dopamine receptor. ‘No receptor potential A’ is involved in vision and olfaction in Drosophila, and dopamine signaling has been previously shown to be required for aversive olfactory learning in honey bees, which is probably necessary for identifying mites within brood cells. Further studies on these candidate genes may allow for breeding bees with this trait using marker-assisted selection. PMID:23133626

  5. Gene Signature in Sessile Serrated Polyps Identifies Colon Cancer Subtype

    PubMed Central

    Kanth, Priyanka; Bronner, Mary P.; Boucher, Kenneth M.; Burt, Randall W.; Neklason, Deborah W.; Hagedorn, Curt H.; Delker, Don A.

    2016-01-01

    Sessile serrated colon adenoma/polyps (SSA/Ps) are found during routine screening colonoscopy and may account for 20–30% of colon cancers. However, differentiating SSA/Ps from hyperplastic polyps (HP) with little risk of cancer is challenging and complementary molecular markers are needed. Additionally, the molecular mechanisms of colon cancer development from SSA/Ps are poorly understood. RNA sequencing was performed on 21 SSA/Ps, 10 HPs, 10 adenomas, 21 uninvolved colon and 20 control colon specimens. Differential expression and leave-one-out cross validation methods were used to define a unique gene signature of SSA/Ps. Our SSA/P gene signature was evaluated in colon cancer RNA-Seq data from The Cancer Genome Atlas (TCGA) to identify a subtype of colon cancers that may develop from SSA/Ps. A total of 1422 differentially expressed genes were found in SSA/Ps relative to controls. Serrated polyposis syndrome (n=12) and sporadic SSA/Ps (n=9) exhibited almost complete (96%) gene overlap. A 51-gene panel in SSA/P showed similar expression in a subset of TCGA colon cancers with high microsatellite instability (MSI-H). A smaller seven-gene panel showed high sensitivity and specificity in identifying BRAF mutant, CpG island methylator phenotype high (CIMP-H) and MLH1 silenced colon cancers. We describe a unique gene signature in SSA/Ps that identifies a subset of colon cancers likely to develop through the serrated pathway. These gene panels may be utilized for improved differentiation of SSA/Ps from HPs and provide insights into novel molecular pathways altered in colon cancer arising from the serrated pathway. PMID:27026680

  6. Phenoscape: Identifying Candidate Genes for Evolutionary Phenotypes

    PubMed Central

    Edmunds, Richard C.; Su, Baofeng; Balhoff, James P.; Eames, B. Frank; Dahdul, Wasila M.; Lapp, Hilmar; Lundberg, John G.; Vision, Todd J.; Dunham, Rex A.; Mabee, Paula M.; Westerfield, Monte

    2016-01-01

    Phenotypes resulting from mutations in genetic model organisms can help reveal candidate genes for evolutionarily important phenotypic changes in related taxa. Although testing candidate gene hypotheses experimentally in nonmodel organisms is typically difficult, ontology-driven information systems can help generate testable hypotheses about developmental processes in experimentally tractable organisms. Here, we tested candidate gene hypotheses suggested by expert use of the Phenoscape Knowledgebase, specifically looking for genes that are candidates responsible for evolutionarily interesting phenotypes in the ostariophysan fishes that bear resemblance to mutant phenotypes in zebrafish. For this, we searched ZFIN for genetic perturbations that result in either loss of basihyal element or loss of scales phenotypes, because these are the ancestral phenotypes observed in catfishes (Siluriformes). We tested the identified candidate genes by examining their endogenous expression patterns in the channel catfish, Ictalurus punctatus. The experimental results were consistent with the hypotheses that these features evolved through disruption in developmental pathways at, or upstream of, brpf1 and eda/edar for the ancestral losses of basihyal element and scales, respectively. These results demonstrate that ontological annotations of the phenotypic effects of genetic alterations in model organisms, when aggregated within a knowledgebase, can be used effectively to generate testable, and useful, hypotheses about evolutionary changes in morphology. PMID:26500251

  7. Common Marker Genes Identified from Various Sample Types for Systemic Lupus Erythematosus.

    PubMed

    Bing, Peng-Fei; Xia, Wei; Wang, Lan; Zhang, Yong-Hong; Lei, Shu-Feng; Deng, Fei-Yan

    2016-01-01

    Systemic lupus erythematosus (SLE) is a complex auto-immune disease. Gene expression studies have been conducted to identify SLE-related genes in various types of samples. It is unknown whether there are common marker genes significant for SLE but independent of sample types, which may have potential for follow-up translational research. The aim of this study is to identify common marker genes across various sample types for SLE. Based on four public microarray gene expression datasets for SLE covering three representative types of blood-borne samples (monocyte; peripheral blood mononuclear cell, PBMC; whole blood), we utilized three statistics (fold-change, FC; t-test p value; false discovery rate adjusted p value) to scrutinize genes simultaneously regulated with SLE across various sample types. For common marker genes, we conducted the Gene Ontology enrichment analysis and Protein-Protein Interaction analysis to gain insights into their functions. We identified 10 common marker genes associated with SLE (IFI6, IFI27, IFI44L, OAS1, OAS2, EIF2AK2, PLSCR1, STAT1, RNASE2, and GSTO1). Significant up-regulation of IFI6, IFI27, and IFI44L with SLE was observed in all the studied sample types, though the FC was most striking in monocyte, compared with PBMC and whole blood (8.82-251.66 vs. 3.73-74.05 vs. 1.19-1.87). Eight of the above 10 genes, except RNASE2 and GSTO1, interact with each other and with known SLE susceptibility genes, participate in immune response, RNA and protein catabolism, and cell death. Our data suggest that there exist common marker genes across various sample types for SLE. The 10 common marker genes, identified herein, deserve follow-up studies to dissect their potential as diagnostic or therapeutic markers to predict SLE or treatment response.
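
    Two of the three statistics named in this abstract, fold-change and the false discovery rate adjusted p value, can be sketched in a few lines. This is an illustration only: the abstract does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment here is an assumption, as is the use of linear-scale (not log-transformed) expression values. The function names and the input numbers are hypothetical.

```python
from typing import List, Sequence


def fold_change(case: Sequence[float], control: Sequence[float]) -> float:
    """Ratio of mean expression in cases to mean expression in controls (linear scale)."""
    return (sum(case) / len(case)) / (sum(control) / len(control))


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices from smallest to largest p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p values from a t-test (not data from the study).
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
```

    A gene would then be flagged only if it passes the chosen thresholds on all statistics in every sample type, which is the sense in which the abstract speaks of genes "simultaneously regulated" across samples.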

  8. Integrative Analysis of GWASs, Human Protein Interaction, and Gene Expression Identified Gene Modules Associated With BMDs

    PubMed Central

    He, Hao; Zhang, Lei; Li, Jian; Wang, Yu-Ping; Zhang, Ji-Gang; Shen, Jie; Guo, Yan-Fang

    2014-01-01

    Context: To date, few systems genetics studies in the bone field have been performed. We designed our study from a systems-level perspective by integrating genome-wide association studies (GWASs), human protein-protein interaction (PPI) network, and gene expression to identify gene modules contributing to osteoporosis risk. Methods: First we searched for modules significantly enriched with bone mineral density (BMD)-associated genes in human PPI network by using 2 large meta-analysis GWAS datasets through a dense module search algorithm. One included 7 individual GWAS samples (Meta7). The other was from the Genetic Factors for Osteoporosis Consortium (GEFOS2). One was assigned as a discovery dataset and the other as an evaluation dataset, and vice versa. Results: In total, 42 modules and 129 modules were identified significantly in both Meta7 and GEFOS2 datasets for femoral neck and spine BMD, respectively. There were 3340 modules identified for hip BMD only in Meta7. As candidate modules, they were assessed for the biological relevance to BMD by gene set enrichment analysis in 2 expression profiles generated from circulating monocytes in subjects with low versus high BMD values. Interestingly, there were 2 modules significantly enriched in monocytes from the low BMD group in both gene expression datasets (nominal P value <.05). Two modules had 16 nonredundant genes. Functional enrichment analysis revealed that both modules were enriched for genes involved in Wnt receptor signaling and osteoblast differentiation. Conclusion: We highlighted 2 modules and novel genes playing important roles in the regulation of bone mass, providing important clues for therapeutic approaches for osteoporosis. PMID:25119315

  9. Integration of ATAC-seq and RNA-seq identifies human alpha cell and beta cell signature genes.

    PubMed

    Ackermann, Amanda M; Wang, Zhiping; Schug, Jonathan; Naji, Ali; Kaestner, Klaus H

    2016-03-01

    Although glucagon-secreting α-cells and insulin-secreting β-cells have opposing functions in regulating plasma glucose levels, the two cell types share a common developmental origin and exhibit overlapping transcriptomes and epigenomes. Notably, destruction of β-cells can stimulate repopulation via transdifferentiation of α-cells, at least in mice, suggesting plasticity between these cell fates. Furthermore, dysfunction of both α- and β-cells contributes to the pathophysiology of type 1 and type 2 diabetes, and β-cell de-differentiation has been proposed to contribute to type 2 diabetes. Our objective was to delineate the molecular properties that maintain islet cell type specification yet allow for cellular plasticity. We hypothesized that correlating cell type-specific transcriptomes with an atlas of open chromatin will identify novel genes and transcriptional regulatory elements such as enhancers involved in α- and β-cell specification and plasticity. We sorted human α- and β-cells and performed the "Assay for Transposase-Accessible Chromatin with high throughput sequencing" (ATAC-seq) and mRNA-seq, followed by integrative analysis to identify cell type-selective gene regulatory regions. We identified numerous transcripts with either α-cell- or β-cell-selective expression and discovered the cell type-selective open chromatin regions that correlate with these gene activation patterns. We confirmed cell type-selective expression on the protein level for two of the top hits from our screen. The "group specific protein" (GC; or vitamin D binding protein) was restricted to α-cells, while CHODL (chondrolectin) immunoreactivity was only present in β-cells. Furthermore, α-cell- and β-cell-selective ATAC-seq peaks were identified to overlap with known binding sites for islet transcription factors, as well as with single nucleotide polymorphisms (SNPs) previously identified as risk loci for type 2 diabetes. We have determined the genetic landscape of

  10. Gene expression patterns combined with network analysis identify hub genes associated with bladder cancer.

    PubMed

    Bi, Dongbin; Ning, Hao; Liu, Shuai; Que, Xinxiang; Ding, Kejia

    2015-06-01

    To explore molecular mechanisms of bladder cancer (BC), network strategy was used to find biomarkers for early detection and diagnosis. The differentially expressed genes (DEGs) between bladder carcinoma patients and normal subjects were screened using empirical Bayes method of the linear models for microarray data package. Co-expression networks were constructed by differentially co-expressed genes and links. Regulatory impact factors (RIF) metric was used to identify critical transcription factors (TFs). The protein-protein interaction (PPI) networks were constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and clusters were obtained through molecular complex detection (MCODE) algorithm. Centralities analyses for complex networks were performed based on degree, stress and betweenness. Enrichment analyses were performed based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Co-expression networks and TFs (based on expression data of global DEGs and DEGs in different stages and grades) were identified. Hub genes of complex networks, such as UBE2C, ACTA2, FABP4, CKS2, FN1 and TOP2A, were also obtained according to analysis of degree. In gene enrichment analyses of global DEGs, cell adhesion, proteinaceous extracellular matrix and extracellular matrix structural constituent were top three GO terms. ECM-receptor interaction, focal adhesion, and cell cycle were significant pathways. Our results provide some potential underlying biomarkers of BC. However, further validation is required and deep studies are needed to elucidate the pathogenesis of BC. Copyright © 2015 Elsevier Ltd. All rights reserved.
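
    The "analysis of degree" used to pick hub genes can be illustrated as a simple count of interaction partners per node. This is a minimal sketch on a toy undirected network; the node labels are placeholders rather than genes from the study, `hub_genes` is a hypothetical name, and the stress and betweenness centralities mentioned in the abstract are not covered.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def hub_genes(edges: Iterable[Tuple[str, str]], top_k: int) -> List[Tuple[str, int]]:
    """Return the top_k nodes by degree in an undirected network.

    Duplicate edges and self-loops are ignored; ties are broken alphabetically.
    """
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree: Counter = Counter()
    for pair in unique_edges:
        a, b = pair
        degree[a] += 1
        degree[b] += 1
    ranked = sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Toy interaction list (placeholder node names); ("B", "A") duplicates ("A", "B").
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "A")]
top_two = hub_genes(toy_edges, 2)
```

    In this toy network node A has three distinct partners and B has two, so they rank first and second.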

  11. A cross-species bi-clustering approach to identifying conserved co-regulated genes.

    PubMed

    Sun, Jiangwen; Jiang, Zongliang; Tian, Xiuchun; Bi, Jinbo

    2016-06-15

    A growing number of studies have explored the process of pre-implantation embryonic development of multiple mammalian species. However, the conservation and variation among different species in their developmental programming are poorly defined due to the lack of effective computational methods for detecting co-regulated genes that are conserved across species. The most sophisticated method to date for identifying conserved co-regulated genes is a two-step approach. This approach first identifies gene clusters for each species by a cluster analysis of gene expression data, and subsequently computes the overlaps of clusters identified from different species to reveal common subgroups. This approach is ineffective at dealing with the noise in the expression data introduced by the complicated procedures in quantifying gene expression. Furthermore, due to the sequential nature of the approach, the gene clusters identified in the first step may have little overlap among different species in the second step, making it difficult to detect conserved co-regulated genes. We propose a cross-species bi-clustering approach which first denoises the gene expression data of each species into a data matrix. The rows of the data matrices of different species represent the same set of genes that are characterized by their expression patterns over the developmental stages of each species as columns. A novel bi-clustering method is then developed to cluster genes into subgroups by a joint sparse rank-one factorization of all the data matrices. This method decomposes a data matrix into a product of a column vector and a row vector where the column vector is a consistent indicator across the matrices (species) to identify the same gene cluster and the row vector specifies for each species the developmental stages that the clustered genes co-regulate. An efficient optimization algorithm has been developed with convergence analysis. This approach was first validated on synthetic data and compared

  12. ICan: an integrated co-alteration network to identify ovarian cancer-related genes.

    PubMed

    Zhou, Yuanshuai; Liu, Yongjing; Li, Kening; Zhang, Rui; Qiu, Fujun; Zhao, Ning; Xu, Yan

    2015-01-01

    Over the last decade, an increasing number of integrative studies on cancer-related genes have been published. Integrative analyses aim to overcome the limitation of a single data type, and provide a more complete view of carcinogenesis. The vast majority of these studies used sample-matched data of gene expression and copy number to investigate the impact of copy number alteration on gene expression, and to predict and prioritize candidate oncogenes and tumor suppressor genes. However, correlations between genes were neglected in these studies. Our work aimed to evaluate the co-alteration of copy number, methylation and expression, allowing us to identify cancer-related genes and essential functional modules in cancer. We built the Integrated Co-alteration network (ICan) based on multi-omics data, and analyzed the network to uncover cancer-related genes. After comparison with random networks, we identified 155 ovarian cancer-related genes, including well-known (TP53, BRCA1, RB1 and PTEN) and also novel cancer-related genes, such as PDPN and EphA2. We compared the results with a conventional method: CNAmet, and obtained a significantly better area under the curve value (ICan: 0.8179, CNAmet: 0.5183). In this paper, we describe a framework to find cancer-related genes based on an Integrated Co-alteration network. Our results proved that ICan could precisely identify candidate cancer genes and provide increased mechanistic understanding of carcinogenesis. This work suggested a new research direction for biological network analyses involving multi-omics data.

  13. Identifying differentially expressed genes in cancer patients using a non-parameter Ising model.

    PubMed

    Li, Xumeng; Feltus, Frank A; Sun, Xiaoqian; Wang, James Z; Luo, Feng

    2011-10-01

    Identification of genes and pathways involved in diseases and physiological conditions is a major task in systems biology. In this study, we developed a novel non-parameter Ising model to integrate protein-protein interaction network and microarray data for identifying differentially expressed (DE) genes. We also proposed a simulated annealing algorithm to find the optimal configuration of the Ising model. The Ising model was applied to two breast cancer microarray data sets. The results showed that more cancer-related DE sub-networks and genes were identified by the Ising model than those by the Markov random field model. Furthermore, cross-validation experiments showed that DE genes identified by Ising model can improve classification performance compared with DE genes identified by Markov random field model. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. A recellularized human colon model identifies cancer driver genes

    PubMed Central

    Chen, Huanhuan Joyce; Wei, Zhubo; Sun, Jian; Bhattacharya, Asmita; Savage, David J; Serda, Rita; Mackeyev, Yuri; Curley, Steven A.; Bu, Pengcheng; Wang, Lihua; Chen, Shuibing; Cohen-Gould, Leona; Huang, Emina; Shen, Xiling; Lipkin, Steven M.; Copeland, Neal G.; Jenkins, Nancy A.; Shuler, Michael L.

    2016-01-01

    Refined cancer models are needed to bridge the gap between cell-line, animal and clinical research. Here we describe the engineering of an organotypic colon cancer model by recellularization of a native human matrix that contains cell-populated mucosa and an intact muscularis mucosa layer. This ex vivo system recapitulates the pathophysiological progression from APC-mutant neoplasia to submucosal invasive tumor. We used it to perform a Sleeping Beauty transposon mutagenesis screen to identify genes that cooperate with mutant APC in driving invasive neoplasia. 38 candidate invasion driver genes were identified, 17 of which have been previously implicated in colorectal cancer progression, including TCF7L2, TWIST2, MSH2, DCC and EPHB1/2. Six invasion driver genes that to our knowledge have not been previously described were validated in vitro using cell proliferation, migration and invasion assays, and ex vivo using recellularized human colon. These results demonstrate the utility of our organoid model for studying cancer biology. PMID:27398792

  15. Gene expression profiling combined with bioinformatics analysis identify biomarkers for Parkinson disease.

    PubMed

    Diao, Hongyu; Li, Xinxing; Hu, Sheng; Liu, Yunhui

    2012-01-01

    Parkinson disease (PD) progresses relentlessly and affects approximately 4% of the population aged over 80 years. It is difficult to diagnose in its early stages. The purpose of our study is to identify molecular biomarkers for PD initiation using a computational bioinformatics analysis of gene expression. We downloaded the gene expression profile of PD from Gene Expression Omnibus and identified differentially coexpressed genes (DCGs) and dysfunctional pathways in PD patients compared to controls. In addition, we built a regulatory network by mapping the DCGs to known regulatory data between transcription factors (TFs) and target genes and calculated the regulatory impact factor of each transcription factor. As a result, a total of 1004 genes associated with PD initiation were identified. Pathway enrichment of these genes suggests that biological processes of protein turnover were impaired in PD. In the regulatory network, HLF, E2F1 and STAT4 were found to have altered expression levels in PD patients. The expression levels of other transcription factors, NKX3-1, TAL1, RFX1 and EGR3, were not found to be altered. However, they regulated differentially expressed genes. In conclusion, we suggest that HLF, E2F1 and STAT4 may be used as molecular biomarkers for PD; however, more work is needed to validate our result.

  16. Gene Expression Profiling Combined with Bioinformatics Analysis Identify Biomarkers for Parkinson Disease

    PubMed Central

    Diao, Hongyu; Li, Xinxing; Hu, Sheng; Liu, Yunhui

    2012-01-01

    Parkinson disease (PD) progresses relentlessly and affects approximately 4% of the population aged over 80 years. It is difficult to diagnose in its early stages. The purpose of our study is to identify molecular biomarkers for PD initiation using a computational bioinformatics analysis of gene expression. We downloaded the gene expression profile of PD from Gene Expression Omnibus and identified differentially coexpressed genes (DCGs) and dysfunctional pathways in PD patients compared to controls. In addition, we built a regulatory network by mapping the DCGs to known regulatory data between transcription factors (TFs) and target genes and calculated the regulatory impact factor of each transcription factor. As a result, a total of 1004 genes associated with PD initiation were identified. Pathway enrichment of these genes suggests that biological processes of protein turnover were impaired in PD. In the regulatory network, HLF, E2F1 and STAT4 were found to have altered expression levels in PD patients. The expression levels of other transcription factors, NKX3-1, TAL1, RFX1 and EGR3, were not found to be altered. However, they regulated differentially expressed genes. In conclusion, we suggest that HLF, E2F1 and STAT4 may be used as molecular biomarkers for PD; however, more work is needed to validate our result. PMID:23284986

  17. Axon Regeneration Genes Identified by RNAi Screening in C. elegans

    PubMed Central

    Nix, Paola; Hammarlund, Marc; Hauth, Linda; Lachnit, Martina; Jorgensen, Erik M.

    2014-01-01

    Axons of the mammalian CNS lose the ability to regenerate soon after development due to both an inhibitory CNS environment and the loss of cell-intrinsic factors necessary for regeneration. The complex molecular events required for robust regeneration of mature neurons are not fully understood, particularly in vivo. To identify genes affecting axon regeneration in Caenorhabditis elegans, we performed both an RNAi-based screen for defective motor axon regeneration in unc-70/β-spectrin mutants and a candidate gene screen. From these screens, we identified at least 50 conserved genes with growth-promoting or growth-inhibiting functions. Through our analysis of mutants, we shed new light on certain aspects of regeneration, including the role of β-spectrin and membrane dynamics, the antagonistic activity of MAP kinase signaling pathways, and the role of stress in promoting axon regeneration. Many gene candidates had not previously been associated with axon regeneration and implicate new pathways of interest for therapeutic intervention. PMID:24403161

  18. Gene expression profiles analysis identifies key genes for acute lung injury in patients with sepsis.

    PubMed

    Guo, Zhiqiang; Zhao, Chuncheng; Wang, Zheng

    2014-09-26

    To identify critical genes and biological pathways in acute lung injury (ALI), a comparative analysis of gene expression profiles of patients with ALI + sepsis and patients with sepsis alone was performed with bioinformatic tools. GSE10474 was downloaded from Gene Expression Omnibus, including a collection of 13 whole blood samples with ALI + sepsis and 21 whole blood samples with sepsis alone. After pre-treatment with the robust multichip averaging (RMA) method, differential analysis was conducted using the simpleaffy package based upon t-test and fold change. Hierarchical clustering was also performed using the function hclust from the package stats. Besides, functional enrichment analysis was conducted using iGepros. Moreover, the gene regulatory network was constructed with information from Kyoto Encyclopedia of Genes and Genomes (KEGG) and then visualized by Cytoscape. A total of 128 differentially expressed genes (DEGs) were identified, including 47 up- and 81 down-regulated genes. The significantly enriched functions included negative regulation of cell proliferation, regulation of response to stimulus and cellular component morphogenesis. A total of 27 DEGs were significantly enriched in 16 KEGG pathways, such as protein digestion and absorption, fatty acid metabolism, amoebiasis, etc. Furthermore, the regulatory network of these 27 DEGs was constructed, which involved several key genes, including protein tyrosine kinase 2 (PTK2), v-src avian sarcoma (SRC) and Caveolin 2 (CAV2). PTK2, SRC and CAV2 may be potential markers for diagnosis and treatment of ALI. The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/5865162912987143.

  19. Combining Genome-Scale Experimental and Computational Methods To Identify Essential Genes in Rhodobacter sphaeroides

    DOE PAGES

    Burger, Brian T.; Imam, Saheed; Scarborough, Matthew J.; ...

    2017-06-06

    Rhodobacter sphaeroides is one of the best-studied alphaproteobacteria from biochemical, genetic, and genomic perspectives. To gain a better systems-level understanding of this organism, we generated a large transposon mutant library and used transposon sequencing (Tn-seq) to identify genes that are essential under several growth conditions. Using newly developed Tn-seq analysis software (TSAS), we identified 493 genes as essential for aerobic growth on a rich medium. We then used the mutant library to identify conditionally essential genes under two laboratory growth conditions, identifying 85 additional genes required for aerobic growth in a minimal medium and 31 additional genes required for photosynthetic growth. In all instances, our analyses confirmed essentiality for many known genes and identified genes not previously considered to be essential. We used the resulting Tn-seq data to refine and improve a genome-scale metabolic network model (GEM) for R. sphaeroides. Together, we demonstrate how genetic, genomic, and computational approaches can be combined to obtain a systems-level understanding of the genetic framework underlying metabolic diversity in bacterial species.

  20. A Functional Genomics Approach to Identify Novel Breast Cancer Gene Targets in Yeast

    DTIC Science & Technology

    2004-05-01

    Award Number: DAMD17-03-1-0232. Title: A Functional Genomics Approach to Identify Novel Breast Cancer Gene Targets in Yeast. Author: Craig Bennett, Ph.D. Abstract (Maximum 200 Words): We are using the yeast Saccharomyces cerevisiae to identify new cancer gene targets that interact with the

  1. ICan: An Integrated Co-Alteration Network to Identify Ovarian Cancer-Related Genes

    PubMed Central

    Zhou, Yuanshuai; Liu, Yongjing; Li, Kening; Zhang, Rui; Qiu, Fujun; Zhao, Ning; Xu, Yan

    2015-01-01

    Background: Over the last decade, an increasing number of integrative studies on cancer-related genes have been published. Integrative analyses aim to overcome the limitation of a single data type, and provide a more complete view of carcinogenesis. The vast majority of these studies used sample-matched data of gene expression and copy number to investigate the impact of copy number alteration on gene expression, and to predict and prioritize candidate oncogenes and tumor suppressor genes. However, correlations between genes were neglected in these studies. Our work aimed to evaluate the co-alteration of copy number, methylation and expression, allowing us to identify cancer-related genes and essential functional modules in cancer. Results: We built the Integrated Co-alteration network (ICan) based on multi-omics data, and analyzed the network to uncover cancer-related genes. After comparison with random networks, we identified 155 ovarian cancer-related genes, including well-known (TP53, BRCA1, RB1 and PTEN) and also novel cancer-related genes, such as PDPN and EphA2. We compared the results with a conventional method, CNAmet, and obtained a significantly better area under the curve value (ICan: 0.8179, CNAmet: 0.5183). Conclusion: In this paper, we describe a framework to find cancer-related genes based on an Integrated Co-alteration network. Our results proved that ICan could precisely identify candidate cancer genes and provide increased mechanistic understanding of carcinogenesis. This work suggested a new research direction for biological network analyses involving multi-omics data. PMID:25803614

  2. MMTV insertional mutagenesis identifies genes, gene families and pathways involved in mammary cancer.

    PubMed

    Theodorou, Vassiliki; Kimm, Melanie A; Boer, Mandy; Wessels, Lodewyk; Theelen, Wendy; Jonkers, Jos; Hilkens, John

    2007-06-01

    We performed a high-throughput retroviral insertional mutagenesis screen in mouse mammary tumor virus (MMTV)-induced mammary tumors and identified 33 common insertion sites, of which 17 genes were previously not known to be associated with mammary cancer and 13 had not previously been linked to cancer in general. Although members of the Wnt and fibroblast growth factors (Fgf) families were frequently tagged, our exhaustive screening for MMTV insertion sites uncovered a new repertoire of candidate breast cancer oncogenes. We validated one of these genes, Rspo3, as an oncogene by overexpression in a p53-deficient mammary epithelial cell line. The human orthologs of the candidate oncogenes were frequently deregulated in human breast cancers and associated with several tumor parameters. Computational analysis of all MMTV-tagged genes uncovered specific gene families not previously associated with cancer and showed a significant overrepresentation of protein domains and signaling pathways mainly associated with development and growth factor signaling. Comparison of all tagged genes in MMTV and Moloney murine leukemia virus-induced malignancies showed that both viruses target mostly different genes that act predominantly in distinct pathways.

  3. Identifying Mendelian disease genes with the Variant Effect Scoring Tool

    PubMed Central

    2013-01-01

    Background: Whole exome sequencing studies identify hundreds to thousands of rare protein coding variants of ambiguous significance for human health. Computational tools are needed to accelerate the identification of specific variants and genes that contribute to human disease. Results: We have developed the Variant Effect Scoring Tool (VEST), a supervised machine learning-based classifier, to prioritize rare missense variants with likely involvement in human disease. The VEST classifier training set comprised ~45,000 disease mutations from the latest Human Gene Mutation Database release and another ~45,000 high frequency (allele frequency >1%) putatively neutral missense variants from the Exome Sequencing Project. VEST outperforms some of the most popular methods for prioritizing missense variants in carefully designed holdout benchmarking experiments (VEST ROC AUC = 0.91, PolyPhen2 ROC AUC = 0.86, SIFT4.0 ROC AUC = 0.84). VEST estimates variant score p-values against a null distribution of VEST scores for neutral variants not included in the VEST training set. These p-values can be aggregated at the gene level across multiple disease exomes to rank genes for probable disease involvement. We tested the ability of an aggregate VEST gene score to identify candidate Mendelian disease genes, based on whole-exome sequencing of a small number of disease cases. We used whole-exome data for two Mendelian disorders for which the causal gene is known. Considering only genes that contained variants in all cases, the VEST gene score ranked dihydroorotate dehydrogenase (DHODH) number 2 of 2253 genes in four cases of Miller syndrome, and myosin-3 (MYH3) number 2 of 2313 genes in three cases of Freeman Sheldon syndrome. Conclusions: Our results demonstrate the potential power gain of aggregating bioinformatics variant scores into gene-level scores and the general utility of bioinformatics in assisting the search for disease genes in large-scale exome sequencing studies. VEST is

  4. Deep developmental transcriptome sequencing uncovers numerous new genes and enhances gene annotation in the sponge Amphimedon queenslandica.

    PubMed

    Fernandez-Valverde, Selene L; Calcino, Andrew D; Degnan, Bernard M

    2015-05-15

    The demosponge Amphimedon queenslandica is amongst the few early-branching metazoans with an assembled and annotated draft genome, making it an important species in the study of the origin and early evolution of animals. Current gene models in this species are largely based on in silico predictions and low coverage expressed sequence tag (EST) evidence. Amphimedon queenslandica protein-coding gene models are improved using deep RNA-Seq data from four developmental stages and CEL-Seq data from 82 developmental samples. Over 86% of previously predicted genes are retained in the new gene models, although 24% have additional exons; there is also a marked increase in the total number of annotated 3' and 5' untranslated regions (UTRs). Importantly, these new developmental transcriptome data reveal numerous previously unannotated protein-coding genes in the Amphimedon genome, increasing the total gene number by 25%, from 30,060 to 40,122. In general, Amphimedon genes have introns that are markedly smaller than those in other animals and most of the alternatively spliced genes in Amphimedon undergo intron-retention; exon-skipping is the least common mode of alternative splicing. Finally, in addition to canonical polyadenylation signal sequences, Amphimedon genes are enriched in a number of unique AT-rich motifs in their 3' UTRs. The inclusion of developmental transcriptome data has substantially improved the structure and composition of protein-coding gene models in Amphimedon queenslandica, providing a more accurate and comprehensive set of genes for functional and comparative studies. These improvements reveal the Amphimedon genome is comprised of a remarkably high number of tightly packed genes. These genes have small introns and there is pervasive intron retention amongst alternatively spliced transcripts. These aspects of the sponge genome are more similar to unicellular opisthokont genomes than to other animal genomes.

  5. Gene-Based Genome-Wide Association Analysis in European and Asian Populations Identified Novel Genes for Rheumatoid Arthritis.

    PubMed

    Zhu, Hong; Xia, Wei; Mo, Xing-Bo; Lin, Xiang; Qiu, Ying-Hua; Yi, Neng-Jun; Zhang, Yong-Hong; Deng, Fei-Yan; Lei, Shu-Feng

    2016-01-01

    Rheumatoid arthritis (RA) is a complex autoimmune disease. Using a gene-based association research strategy, the present study aims to detect unknown susceptibility to RA and to address the ethnic differences in genetic susceptibility to RA between European and Asian populations. Gene-based association analyses were performed with KGG 2.5 by using publicly available large RA datasets (14,361 RA cases and 43,923 controls of European subjects, 4,873 RA cases and 17,642 controls of Asian subjects). For the newly identified RA-associated genes, gene set enrichment analyses and protein-protein interaction analyses were carried out with DAVID and STRING version 10.0, respectively. Differential expression verification was conducted using 4 GEO datasets. The expression levels of three selected 'highly verified' genes were measured by ELISA among our in-house RA cases and controls. A total of 221 RA-associated genes were newly identified by gene-based association study, including 71 'overlapped', 76 'European-specific' and 74 'Asian-specific' genes. Among them, 105 genes had significant differential expression between RA patients and healthy controls in at least one dataset, especially 20 genes including 11 'overlapped' (ABCF1, FLOT1, HLA-F, IER3, TUBB, ZKSCAN4, BTN3A3, HSP90AB1, CUTA, BRD2, HLA-DMA), 5 'European-specific' (PHTF1, RPS18, BAK1, TNFRSF14, SUOX) and 4 'Asian-specific' (RNASET2, HFE, BTN2A2, MAPK13) genes whose differential expression was significant in at least three datasets. The protein expression of two selected genes, FLOT1 (P value = 1.70E-02) and HLA-DMA (P value = 4.70E-02), in plasma was significantly different in our in-house samples. Our study identified 221 novel RA-associated genes and especially highlighted the importance of 20 candidate genes in RA. The results addressed ethnic genetic background differences for RA susceptibility between European and Asian populations and detected a long list of overlapped or ethnic-specific RA genes. 
The

  6. Systematic analysis of microarray datasets to identify Parkinson's disease‑associated pathways and genes.

    PubMed

    Feng, Yinling; Wang, Xuefeng

    2017-03-01

    In order to investigate commonly disturbed genes and pathways in various brain regions of patients with Parkinson's disease (PD), microarray datasets from previous studies were collected and systematically analyzed. Different normalization methods were applied to microarray datasets from different platforms. A strategy combining gene co‑expression networks and clinical information was adopted, using weighted gene co‑expression network analysis (WGCNA) to screen for commonly disturbed genes in different brain regions of patients with PD. Functional enrichment analysis of commonly disturbed genes was performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). Co‑pathway relationships were identified with Pearson's correlation coefficient tests and a hypergeometric distribution‑based test. Common genes in pathway pairs were selected out and regarded as risk genes. A total of 17 microarray datasets from 7 platforms were retained for further analysis. Five gene coexpression modules were identified, containing 9,745, 736, 233, 101 and 93 genes, respectively. One module was significantly correlated with PD samples and thus the 736 genes it contained were considered to be candidate PD‑associated genes. Functional enrichment analysis demonstrated that these genes were implicated in oxidative phosphorylation and PD. A total of 44 pathway pairs and 52 risk genes were revealed, and a risk gene pathway relationship network was constructed. Eight modules were identified and were revealed to be associated with PD, cancers and metabolism. A number of disturbed pathways and risk genes were unveiled in PD, and these findings may help advance understanding of PD pathogenesis.
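
    The hypergeometric distribution-based test mentioned in this abstract can be sketched as an upper-tail overlap test: given a background of N genes, how likely is it that two gene sets of sizes K and n share at least k genes by chance? The function name and the numbers below are illustrative assumptions; the abstract does not give the authors' exact formulation.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from a
    background of N genes, K of which belong to the reference set."""
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Small hand-checkable case: N=10, K=3, n=4, overlap of at least 2.
# (C(3,2)*C(7,2) + C(3,3)*C(7,1)) / C(10,4) = 70 / 210 = 1/3
p_small = hypergeom_sf(2, 10, 3, 4)

# Hypothetical pathway pair: 20,000 background genes, pathways of
# 100 and 80 genes sharing 10 genes.
p_pair = hypergeom_sf(10, 20000, 100, 80)
```

    A small p_pair would indicate that the two pathways share more genes than expected by chance; across many pathway pairs the p-values would need multiple-testing correction.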

  7. Identifying Candidate Reprogramming Genes in Mouse Induced Pluripotent Stem Cells.

    PubMed

    Gao, Fang; Li, Jingyu; Zhang, Heng; Yang, Xu; An, Tiezhu

    2017-08-01

    Factor-based induced reprogramming approaches have tremendous potential for human regenerative medicine, but the efficiencies of these approaches are still low. In this study, we analyzed the global transcriptional profiles of mouse induced pluripotent stem cells (miPSCs) and mouse embryonic stem cells (mESCs) from seven different labs and present here the first successful clustering according to cell type, not by lab of origin. We identified 2131 differentially expressed genes (DEs) as candidate pluripotency-associated genes by comparing mESCs/miPSCs with somatic cells and 720 DEs between miPSCs and mESCs. Interestingly, there was a significant overlap between the two DE sets. Therefore, we defined the overlap DEs as "consensus DEs" including 313 miPSC-specific genes expressed at a higher level in miPSCs versus mESCs and 184 mESC-specific genes in total and reasoned that these may contribute to the differences in pluripotency between mESCs and miPSCs. A classification of "consensus DEs" according to their different expression levels between somatic cells and mESCs/miPSCs shows that 86% of the miPSC-specific genes are more highly expressed in somatic cells, while 73% of mESC-specific genes are highly expressed in mESCs/miPSCs, indicating that the miPSCs have not efficiently silenced the expression pattern of the somatic cells from which they are derived and failed to completely induce the genes with high expression levels in mESCs. We further revealed a strong correlation between oocyte-enriched factors and insufficiently induced mESC-specific genes and identified 11 hub genes via network analysis. In light of these findings, we postulated that these key hub genes might not only drive somatic cell nuclear transfer (SCNT) reprogramming but also augment the efficiency and quality of miPSC reprogramming.

  8. Exome Sequencing Identifies Three Novel Candidate Genes Implicated in Intellectual Disability

    PubMed Central

    Azam, Maleeha; Ayub, Humaira; Vissers, Lisenka E. L. M.; Gilissen, Christian; Ali, Syeda Hafiza Benish; Riaz, Moeen; Veltman, Joris A.; Pfundt, Rolph; van Bokhoven, Hans; Qamar, Raheel

    2014-01-01

    Intellectual disability (ID) is a major health problem mostly with an unknown etiology. Recently exome sequencing of individuals with ID identified novel genes implicated in the disease. Therefore the purpose of the present study was to identify the genetic cause of ID in one syndromic and two non-syndromic Pakistani families. Whole exome of three ID probands was sequenced. Missense variations in two plausible novel genes implicated in autosomal recessive ID were identified: lysine (K)-specific methyltransferase 2B (KMT2B), zinc finger protein 589 (ZNF589), as well as hedgehog acyltransferase (HHAT) with a de novo mutation with autosomal dominant mode of inheritance. The KMT2B recessive variant is the first report of recessive Kleefstra syndrome-like phenotype. Identification of plausible causative mutations for two recessive and a dominant type of ID, in genes not previously implicated in disease, underscores the large genetic heterogeneity of ID. These results also support the viewpoint that large number of ID genes converge on limited number of common networks i.e. ZNF589 belongs to KRAB-domain zinc-finger proteins previously implicated in ID, HHAT is predicted to affect sonic hedgehog, which is involved in several disorders with ID, KMT2B associated with syndromic ID fits the epigenetic module underlying the Kleefstra syndromic spectrum. The association of these novel genes in three different Pakistani ID families highlights the importance of screening these genes in more families with similar phenotypes from different populations to confirm the involvement of these genes in pathogenesis of ID. PMID:25405613

  9. Integrating Gene Expression with Summary Association Statistics to Identify Genes Associated with 30 Complex Traits.

    PubMed

    Mancuso, Nicholas; Shi, Huwenbo; Goddard, Pagé; Kichaev, Gleb; Gusev, Alexander; Pasaniuc, Bogdan

    2017-03-02

    Although genome-wide association studies (GWASs) have identified thousands of risk loci for many complex traits and diseases, the causal variants and genes at these loci remain largely unknown. Here, we introduce a method for estimating the local genetic correlation between gene expression and a complex trait and utilize it to estimate the genetic correlation due to predicted expression between pairs of traits. We integrated gene expression measurements from 45 expression panels with summary GWAS data to perform 30 multi-tissue transcriptome-wide association studies (TWASs). We identified 1,196 genes whose expression is associated with these traits; of these, 168 reside more than 0.5 Mb away from any previously reported GWAS significant variant. We then used our approach to find 43 pairs of traits with significant genetic correlation at the level of predicted expression; of these, eight were not found through genetic correlation at the SNP level. Finally, we used bi-directional regression to find evidence that BMI causally influences triglyceride levels and that triglyceride levels causally influence low-density lipoprotein. Together, our results provide insight into the role of gene expression in the susceptibility of complex traits and diseases. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  10. Using SCOPE to identify potential regulatory motifs in coregulated genes.

    PubMed

    Martyanov, Viktor; Gross, Robert H

    2011-05-31

    SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data. In this article, we utilize a web version of SCOPE to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs and has been used in other studies. The three algorithms that comprise SCOPE are BEAM, which finds non-degenerate motifs (ACCGGT), PRISM, which finds degenerate motifs (ASCGWT), and SPACER, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well. Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor. Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run. SCOPE has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. 
These can be entered in browser text fields, or read from
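
    The three motif types named in this abstract (non-degenerate ACCGGT, degenerate ASCGWT, bipartite ACCnnnnnnnnGGT) are written in IUPAC nucleotide codes. The sketch below shows only how such motifs can be located in a sequence by translating them to regular expressions; it is not SCOPE's motif discovery or scoring, and the function names and test sequence are assumptions made for illustration.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate an IUPAC motif such as 'ASCGWT' into a regex string."""
    return "".join(IUPAC[ch] for ch in motif.upper())


def find_motif(motif, sequence):
    """Return 0-based start positions of all matches, including
    overlapping ones (hence the zero-width lookahead)."""
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


# Hypothetical sequence with two occurrences of the degenerate motif.
hits = find_motif("ASCGWT", "TTACCGATGGAGCGTTAA")
```

    Here the degenerate motif matches ACCGAT at position 2 and AGCGTT at position 10. The same functions handle the bipartite form, since each n simply becomes [ACGT].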

  11. [Key effect genes responding to nerve injury identified by gene ontology and computer pattern recognition].

    PubMed

    Pan, Qian; Peng, Jin; Zhou, Xue; Yang, Hao; Zhang, Wei

    2012-07-01

    In order to screen out important genes from large gene microarray data sets after nerve injury, we combined the gene ontology (GO) method and computer pattern recognition technology to find key genes responding to nerve injury, and then verified one of these screened-out genes. Data mining and gene ontology analysis of gene chip data GSE26350 was carried out with MATLAB software. Cd44 was selected from the screened-out key gene molecular spectrum by comparing the genes' different GO terms and positions on the score map of principal components. Functional interference was used to disrupt the normal binding of Cd44 and one of its ligands, chondroitin sulfate C (CSC), to observe neurite extension. Gene ontology analysis showed that the first genes on the score map (marked by red *) were mainly distributed in molecular function GO terms such as molecular transducer activity, receptor activity and protein binding. Cd44 is one of six effector protein genes, and attracted us with its functional diversity. After adding different reagents into the medium to interfere with the normal binding of CSC and Cd44, varying degrees of relief of CSC's inhibition of neurite extension were observed. CSC can inhibit neurite extension through binding Cd44 on the neuron membrane. This verifies that important genes in given physiological processes can be identified by gene ontology analysis of gene chip data.

  12. Sleeping Beauty transposon mutagenesis identifies genes that cooperate with mutant Smad4 in gastric cancer development

    PubMed Central

    Takeda, Haruna; Rust, Alistair G.; Ward, Jerrold M.; Yew, Christopher Chin Kuan; Jenkins, Nancy A.; Copeland, Neal G.

    2016-01-01

    Mutations in SMAD4 predispose to the development of gastrointestinal cancer, which is the third leading cause of cancer-related deaths. To identify genes driving gastric cancer (GC) development, we performed a Sleeping Beauty (SB) transposon mutagenesis screen in the stomach of Smad4+/− mutant mice. This screen identified 59 candidate GC trunk drivers and a much larger number of candidate GC progression genes. Strikingly, 22 SB-identified trunk drivers are known or candidate cancer genes, whereas four SB-identified trunk drivers, including PTEN, SMAD4, RNF43, and NF1, are known human GC trunk drivers. Similar to human GC, pathway analyses identified WNT, TGF-β, and PI3K-PTEN signaling, ubiquitin-mediated proteolysis, adherens junctions, and RNA degradation in addition to genes involved in chromatin modification and organization as highly deregulated pathways in GC. Comparative oncogenomic filtering of the complete list of SB-identified genes showed that they are highly enriched for genes mutated in human GC and identified many candidate human GC genes. Finally, by comparing our complete list of SB-identified genes against the list of mutated genes identified in five large-scale human GC sequencing studies, we identified LDL receptor-related protein 1B (LRP1B) as a previously unidentified human candidate GC tumor suppressor gene. In LRP1B, 129 mutations were found in 462 human GC samples sequenced, and LRP1B is one of the top 10 most deleted genes identified in a panel of 3,312 human cancers. SB mutagenesis has, thus, helped to catalog the cooperative molecular mechanisms driving SMAD4-induced GC growth and discover genes with potential clinical importance in human GC. PMID:27006499

  13. Sleeping Beauty transposon mutagenesis identifies genes that cooperate with mutant Smad4 in gastric cancer development.

    PubMed

    Takeda, Haruna; Rust, Alistair G; Ward, Jerrold M; Yew, Christopher Chin Kuan; Jenkins, Nancy A; Copeland, Neal G

    2016-04-05

    Mutations in SMAD4 predispose to the development of gastrointestinal cancer, which is the third leading cause of cancer-related deaths. To identify genes driving gastric cancer (GC) development, we performed a Sleeping Beauty (SB) transposon mutagenesis screen in the stomach of Smad4(+/-) mutant mice. This screen identified 59 candidate GC trunk drivers and a much larger number of candidate GC progression genes. Strikingly, 22 SB-identified trunk drivers are known or candidate cancer genes, whereas four SB-identified trunk drivers, including PTEN, SMAD4, RNF43, and NF1, are known human GC trunk drivers. Similar to human GC, pathway analyses identified WNT, TGF-β, and PI3K-PTEN signaling, ubiquitin-mediated proteolysis, adherens junctions, and RNA degradation in addition to genes involved in chromatin modification and organization as highly deregulated pathways in GC. Comparative oncogenomic filtering of the complete list of SB-identified genes showed that they are highly enriched for genes mutated in human GC and identified many candidate human GC genes. Finally, by comparing our complete list of SB-identified genes against the list of mutated genes identified in five large-scale human GC sequencing studies, we identified LDL receptor-related protein 1B (LRP1B) as a previously unidentified human candidate GC tumor suppressor gene. In LRP1B, 129 mutations were found in 462 human GC samples sequenced, and LRP1B is one of the top 10 most deleted genes identified in a panel of 3,312 human cancers. SB mutagenesis has, thus, helped to catalog the cooperative molecular mechanisms driving SMAD4-induced GC growth and discover genes with potential clinical importance in human GC.

  14. Novel Myopia Genes and Pathways Identified From Syndromic Forms of Myopia

    PubMed Central

    Loughman, James; Wildsoet, Christine F.; Williams, Cathy; Guggenheim, Jeremy A.

    2018-01-01

    Purpose: To test the hypothesis that genes known to cause clinical syndromes featuring myopia also harbor polymorphisms contributing to nonsyndromic refractive errors. Methods: Clinical phenotypes and syndromes that have refractive errors as a recognized feature were identified using the Online Mendelian Inheritance in Man (OMIM) database. One hundred fifty-four unique causative genes were identified, of which 119 were specifically linked with myopia and 114 represented syndromic myopia (i.e., myopia and at least one other clinical feature). Myopia was the only refractive error listed for 98 genes and hyperopia the only refractive error noted for 28 genes, with the remaining 28 genes linked to phenotypes with multiple forms of refractive error. Pathway analysis was carried out to find biological processes overrepresented within these sets of genes. Genetic variants located within 50 kb of the 119 myopia-related genes were evaluated for involvement in refractive error by analysis of summary statistics from genome-wide association studies (GWAS) conducted by the CREAM Consortium and 23andMe, using both single-marker and gene-based tests. Results: Pathway analysis identified several biological processes already implicated in refractive error development through prior GWAS analyses and animal studies, including extracellular matrix remodeling, focal adhesion, and axon guidance, supporting the research hypothesis. Novel pathways also implicated in myopia development included mannosylation, glycosylation, lens development, gliogenesis, and Schwann cell differentiation. Hyperopia was found to be linked to a different pattern of biological processes, mostly related to organogenesis. Comparison with GWAS findings further confirmed that syndromic myopia genes were enriched for genetic variants that influence refractive errors in the general population. 
Gene-based analyses implicated 21 novel candidate myopia genes (ADAMTS18, ADAMTS2, ADAMTSL4, AGK, ALDH18A1, ASXL1, COL4A1

  15. Expression profiling identifies novel Hh/Gli regulated genes in developing zebrafish embryos.

    PubMed Central

    Bergeron, Sadie A.; Milla, Luis A.; Villegas, Rosario; Shen, Meng-Chieh; Burgess, Shawn M.; Allende, Miguel L.; Karlstrom, Rolf O.; Palma, Verónica

    2008-01-01

    The Hedgehog (Hh) signaling pathway plays critical instructional roles during embryonic development. Mis-regulation of Hh/Gli signaling is a major causative factor in human congenital disorders and in a variety of cancers. The zebrafish is a powerful genetic model for the study of Hh signaling during embryogenesis, as a large number of mutants have been identified affecting different components of the Hh/Gli signaling system. By performing global profiling of gene expression in different Hh/Gli gain- and loss-of-function scenarios we identified several known (e.g. ptc1 and nkx2.2a) as well as a large number of novel Hh regulated genes that are differentially expressed in embryos with altered Hh/Gli signaling function. By uncovering changes in tissue specific gene expression, we revealed new embryological processes that are influenced by Hh signaling. We thus provide a comprehensive survey of Hh/Gli regulated genes during embryogenesis and we identify new Hh-regulated genes that may be targets of mis-regulation during tumorigenesis. PMID:18055165

  16. Identifying Stress Transcription Factors Using Gene Expression and TF-Gene Association Data

    PubMed Central

    Wu, Wei-Sheng; Chen, Bor-Sen

    2007-01-01

    Unicellular organisms such as yeasts have evolved to survive environmental stresses by rapidly reorganizing the genomic expression program to meet the challenges of harsh environments. The complex adaptation mechanisms to stress remain to be elucidated. In this study, we developed Stress Transcription Factor Identification Algorithm (STFIA), which integrates gene expression and TF-gene association data to identify the stress transcription factors (TFs) of six kinds of stresses. We identified some general stress TFs that are in response to various stresses, and some specific stress TFs that are in response to one specific stress. The biological significance of our findings is validated by the literature. We found that a small number of TFs may be sufficient to control a wide variety of expression patterns in yeast under different stresses. Two implications can be inferred from this observation. First, the adaptation mechanisms to different stresses may have a bow-tie structure. Second, there may exist extensive regulatory cross-talk among different stress responses. In conclusion, this study proposes a network of the regulators of stress responses and their mechanism of action. PMID:20066130

  17. GeneCOST: a novel scoring-based prioritization framework for identifying disease causing genes.

    PubMed

    Ozer, Bugra; Sağıroğlu, Mahmut; Demirci, Hüseyin

    2015-11-15

    Due to the big data produced by next-generation sequencing studies, there is an evident need for methods to extract the valuable information gathered from these experiments. In this work, we propose GeneCOST, a novel scoring-based method to evaluate every gene for their disease association. Without any prior filtering and any prior knowledge, we assign a disease likelihood score to each gene in correspondence with their variations. Then, we rank all genes based on frequency, conservation, pedigree and detailed variation information to find out the causative reason of the disease state. We demonstrate the usage of GeneCOST with public and real life Mendelian disease cases including recessive, dominant, compound heterozygous and sporadic models. As a result, we were able to identify causative reason behind the disease state in top rankings of our list, proving that this novel prioritization framework provides a powerful environment for the analysis in genetic disease studies alternative to filtering-based approaches. GeneCOST software is freely available at www.igbam.bilgem.tubitak.gov.tr/en/softwares/genecost-en/index.html.

  18. Pathway-driven gene stability selection of two rheumatoid arthritis GWAS identifies and validates new susceptibility genes in receptor mediated signalling pathways.

    PubMed

    Eleftherohorinou, Hariklia; Hoggart, Clive J; Wright, Victoria J; Levin, Michael; Coin, Lachlan J M

    2011-09-01

    Rheumatoid arthritis (RA) is the commonest chronic, systemic, inflammatory disorder affecting ∼1% of the world population. It has a strong genetic component and a growing number of associated genes have been discovered in genome-wide association studies (GWAS), which nevertheless only account for 23% of the total genetic risk. We aimed to identify additional susceptibility loci through the analysis of GWAS in the context of biological function. We bridge the gap between pathway and gene-oriented analyses of GWAS, by introducing a pathway-driven gene stability-selection methodology that identifies potential causal genes in the top-associated disease pathways that may be driving the pathway association signals. We analysed the WTCCC and the NARAC studies of ∼5000 and ∼2000 subjects, respectively. We examined 700 pathways comprising ∼8000 genes. Ranking pathways by significance revealed that the NARAC top-ranked ∼6% lay within the top 10% of WTCCC. Gene selection on those pathways identified 58 genes in WTCCC and 61 in NARAC; 21 of those were common (P(overlap) < 10^-21), of which 16 were novel discoveries. Among the identified genes, we validated 10 known RA associations in WTCCC and 13 in NARAC, not discovered using single-SNP approaches on the same data. Gene ontology functional enrichment analysis on the identified genes showed significant over-representation of signalling activity (P < 10^-29) in both studies. Our findings suggest a novel model of RA genetic predisposition, which involves cell-membrane receptors and genes in second messenger signalling systems, in addition to genes that regulate immune responses, which have been the focus of interest previously.
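
    The significance of an overlap such as the 21 genes shared between the 58 WTCCC and 61 NARAC selections can be illustrated with a hypergeometric tail probability. The sketch below is not the authors' calculation: it assumes a gene universe of 8000 (taken from the "∼8000 genes" in the abstract), and the function name overlap_p_value is hypothetical.

```python
from math import comb


def overlap_p_value(universe, set_a, set_b, shared):
    """Probability of seeing at least `shared` common genes when two gene
    sets of sizes `set_a` and `set_b` are drawn at random from `universe`."""
    total = comb(universe, set_b)
    upper = min(set_a, set_b)
    # Sum the hypergeometric probabilities for overlaps of size shared..upper.
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(shared, upper + 1)
    )
    return tail / total


# Assumed universe of 8000 genes; 58 and 61 selected genes; 21 in common.
p = overlap_p_value(8000, 58, 61, 21)
print(f"P(overlap >= 21) = {p:.3g}")
```

    Under this assumed universe the probability is far below the bound quoted in the abstract; a different background set would give a different value.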

  19. Use of RNA-seq to identify cardiac genes and gene pathways differentially expressed between dogs with and without dilated cardiomyopathy

    PubMed Central

    Friedenberg, Steven G.; Chdid, Lhoucine; Keene, Bruce; Sherry, Barbara; Motsinger-Reif, Alison; Meurs, Kathryn M.

    2017-01-01

    OBJECTIVE To identify cardiac tissue genes and gene pathways differentially expressed between dogs with and without dilated cardiomyopathy (DCM). ANIMALS 8 dogs with and 5 dogs without DCM. PROCEDURES Following euthanasia, samples of left ventricular myocardium were collected from each dog. Total RNA was extracted from tissue samples, and RNA sequencing was performed on each sample. Samples from dogs with and without DCM were grouped to identify genes that were differentially regulated between the 2 populations. Overrepresentation analysis was performed on upregulated and downregulated gene sets to identify altered molecular pathways in dogs with DCM. RESULTS Genes involved in cellular energy metabolism, especially metabolism of carbohydrates and fats, were significantly downregulated in dogs with DCM. Expression of cardiac structural proteins was also altered in affected dogs. CONCLUSIONS AND CLINICAL RELEVANCE Results suggested that RNA sequencing may provide important insights into the pathogenesis of DCM in dogs and highlight pathways that should be explored to identify causative mutations and develop novel therapeutic interventions. PMID:27347821

  20. Use of RNA-seq to identify cardiac genes and gene pathways differentially expressed between dogs with and without dilated cardiomyopathy.

    PubMed

    Friedenberg, Steven G; Chdid, Lhoucine; Keene, Bruce; Sherry, Barbara; Motsinger-Reif, Alison; Meurs, Kathryn M

    2016-07-01

    OBJECTIVE To identify cardiac tissue genes and gene pathways differentially expressed between dogs with and without dilated cardiomyopathy (DCM). ANIMALS 8 dogs with and 5 dogs without DCM. PROCEDURES Following euthanasia, samples of left ventricular myocardium were collected from each dog. Total RNA was extracted from tissue samples, and RNA sequencing was performed on each sample. Samples from dogs with and without DCM were grouped to identify genes that were differentially regulated between the 2 populations. Overrepresentation analysis was performed on upregulated and downregulated gene sets to identify altered molecular pathways in dogs with DCM. RESULTS Genes involved in cellular energy metabolism, especially metabolism of carbohydrates and fats, were significantly downregulated in dogs with DCM. Expression of cardiac structural proteins was also altered in affected dogs. CONCLUSIONS AND CLINICAL RELEVANCE Results suggested that RNA sequencing may provide important insights into the pathogenesis of DCM in dogs and highlight pathways that should be explored to identify causative mutations and develop novel therapeutic interventions.

  1. Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes.

    PubMed

    Nowrousian, Minou; Teichert, Ines; Masloff, Sandra; Kück, Ulrich

    2012-02-01

    The study of mutants to elucidate gene functions has a long and successful history; however, to discover causative mutations in mutants that were generated by random mutagenesis often takes years of laboratory work and requires previously generated genetic and/or physical markers, or resources like DNA libraries for complementation. Here, we present an alternative method to identify defective genes in developmental mutants of the filamentous fungus Sordaria macrospora through Illumina/Solexa whole-genome sequencing. We sequenced pooled DNA from progeny of crosses of three mutants and the wild type and were able to pinpoint the causative mutations in the mutant strains through bioinformatics analysis. One mutant is a spore color mutant, and the mutated gene encodes a melanin biosynthesis enzyme. The causative mutation is a G to A change in the first base of an intron, leading to a splice defect. The second mutant carries an allelic mutation in the pro41 gene encoding a protein essential for sexual development. In the mutant, we detected a complex pattern of deletion/rearrangements at the pro41 locus. In the third mutant, a point mutation in the stop codon of a transcription factor-encoding gene leads to the production of immature fruiting bodies. For all mutants, transformation with a wild type-copy of the affected gene restored the wild-type phenotype. Our data demonstrate that whole-genome sequencing of mutant strains is a rapid method to identify developmental genes in an organism that can be genetically crossed and where a reference genome sequence is available, even without prior mapping information.

  2. Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes

    PubMed Central

    Nowrousian, Minou; Teichert, Ines; Masloff, Sandra; Kück, Ulrich

    2012-01-01

    The study of mutants to elucidate gene functions has a long and successful history; however, to discover causative mutations in mutants that were generated by random mutagenesis often takes years of laboratory work and requires previously generated genetic and/or physical markers, or resources like DNA libraries for complementation. Here, we present an alternative method to identify defective genes in developmental mutants of the filamentous fungus Sordaria macrospora through Illumina/Solexa whole-genome sequencing. We sequenced pooled DNA from progeny of crosses of three mutants and the wild type and were able to pinpoint the causative mutations in the mutant strains through bioinformatics analysis. One mutant is a spore color mutant, and the mutated gene encodes a melanin biosynthesis enzyme. The causative mutation is a G to A change in the first base of an intron, leading to a splice defect. The second mutant carries an allelic mutation in the pro41 gene encoding a protein essential for sexual development. In the mutant, we detected a complex pattern of deletion/rearrangements at the pro41 locus. In the third mutant, a point mutation in the stop codon of a transcription factor-encoding gene leads to the production of immature fruiting bodies. For all mutants, transformation with a wild type-copy of the affected gene restored the wild-type phenotype. Our data demonstrate that whole-genome sequencing of mutant strains is a rapid method to identify developmental genes in an organism that can be genetically crossed and where a reference genome sequence is available, even without prior mapping information. PMID:22384404

  3. A general method for identifying major hybrid male sterility genes in Drosophila.

    PubMed

    Zeng, L W; Singh, R S

    1995-10-01

    The genes responsible for hybrid male sterility in species crosses are usually identified by introgressing chromosome segments, monitored by visible markers, between closely related species by continuous backcrosses. This commonly used method, however, suffers from two problems. First, it relies on the availability of markers to monitor the introgressed regions and so the portion of the genome examined is limited to the marked regions. Secondly, the introgressed regions are usually large and it is impossible to tell if the effects of the introgressed regions are the result of single (or few) major genes or many minor genes (polygenes). Here we introduce a simple and general method for identifying putative major hybrid male sterility genes which is free of these problems. In this method, the actual hybrid male sterility genes (rather than markers), or tightly linked gene complexes with large effects, are selectively introgressed from one species into the background of another species by repeated backcrosses. This is performed by selectively backcrossing heterozygous (for hybrid male sterility gene or genes) females producing fertile and sterile sons in roughly equal proportions to males of either parental species. As no marker gene is required for this procedure, this method can be used with any species pairs that produce unisexual sterility. With the application of this method, a small X chromosome region of Drosophila mauritiana which produces complete hybrid male sterility (aspermic testes) in the background of D. simulans was identified. Recombination analysis reveals that this region contains a second major hybrid male sterility gene linked to the forked locus located at either 62.7 +/- 0.66 map units or at the centromere region of the X chromosome of D. mauritiana.

  4. Epidermal growth factor gene is a newly identified candidate gene for gout.

    PubMed

    Han, Lin; Cao, Chunwei; Jia, Zhaotong; Liu, Shiguo; Liu, Zhen; Xin, Ruosai; Wang, Can; Li, Xinde; Ren, Wei; Wang, Xuefeng; Li, Changgui

    2016-08-10

    Chromosome 4q25 has been identified as a genomic region associated with gout. However, the associations of gout with the genes in this region have not yet been confirmed. Here, we performed two-stage analysis to determine whether variations in candidate genes in the 4q25 region are associated with gout in a male Chinese Han population. We first evaluated 96 tag single nucleotide polymorphisms (SNPs) in eight inflammatory/immune pathway- or glucose/lipid metabolism-related genes in the 4q25 region in 480 male gout patients and 480 controls. The SNP rs12504538, located in the elongation of very-long-chain-fatty-acid-like family member 6 gene (Elovl6), was found to be associated with gout susceptibility (Padjusted = 0.00595). In the second stage of analysis, we performed fine mapping analysis of 93 tag SNPs in Elovl6 and in the epidermal growth factor gene (EGF) and its flanking regions in 1017 male gout patients and 1897 healthy male controls. We observed a significant association between the T allele of EGF rs2298999 and gout (odds ratio = 0.77, 95% confidence interval = 0.67-0.88, Padjusted = 6.42 × 10^-3). These results provide the first evidence for an association between the EGF rs2298999 C/T polymorphism and gout. Our findings should be validated in additional populations.
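
    An odds ratio with a 95% confidence interval, as reported for the T allele above, can be computed from a 2x2 allele-count table. The abstract does not give the allele counts, and the published estimate may have been adjusted for covariates, so the counts below are purely hypothetical and the function odds_ratio_ci is an illustrative name. The interval uses the standard Wald approximation on the log odds ratio.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_allele, case_other, ctrl_allele, ctrl_other, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from allele counts."""
    odds_ratio = (case_allele * ctrl_other) / (case_other * ctrl_allele)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se = sqrt(1 / case_allele + 1 / case_other + 1 / ctrl_allele + 1 / ctrl_other)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts: T / non-T alleles in cases, then in controls.
estimate, lower, upper = odds_ratio_ci(400, 1600, 500, 1500)
print(f"OR = {estimate:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

    An interval that lies entirely below 1, as in the abstract, indicates that the allele is less frequent in cases than in controls.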

  5. Whole exome sequencing identifies novel candidate genes that modify chronic obstructive pulmonary disease susceptibility.

    PubMed

    Bruse, Shannon; Moreau, Michael; Bromberg, Yana; Jang, Jun-Ho; Wang, Nan; Ha, Hongseok; Picchi, Maria; Lin, Yong; Langley, Raymond J; Qualls, Clifford; Klensney-Tait, Julia; Zabner, Joseph; Leng, Shuguang; Mao, Jenny; Belinsky, Steven A; Xing, Jinchuan; Nyunoya, Toru

    2016-01-07

    Chronic obstructive pulmonary disease (COPD) is characterized by an irreversible airflow limitation in response to inhalation of noxious stimuli, such as cigarette smoke. However, only 15-20% of smokers manifest COPD, suggesting a role for genetic predisposition. Although genome-wide association studies have identified common genetic variants that are associated with susceptibility to COPD, effect sizes of the identified variants are modest, as is the total heritability accounted for by these variants. In this study, an extreme phenotype exome sequencing study was combined with in vitro modeling to identify COPD candidate genes. We performed whole exome sequencing of 62 highly susceptible smokers and 30 exceptionally resistant smokers to identify rare variants that may contribute to disease risk or resistance to COPD. This was a cross-sectional case-control study without therapeutic intervention or longitudinal follow-up information. We identified candidate genes based on rare variant analyses and evaluated exonic variants to pinpoint individual genes whose function was computationally established to be significantly different between susceptible and resistant smokers. Top scoring candidate genes from these analyses were further filtered by requiring that each gene be expressed in human bronchial epithelial cells (HBECs). A total of 81 candidate genes were thus selected for in vitro functional testing in cigarette smoke extract (CSE)-exposed HBECs. Using small interfering RNA (siRNA)-mediated gene silencing experiments, we showed that silencing of several candidate genes augmented CSE-induced cytotoxicity in vitro. Our integrative analysis through both genetic and functional approaches identified two candidate genes (TACC2 and MYO1E) that augment cigarette smoke (CS)-induced cytotoxicity and, potentially, COPD susceptibility.

  6. Clustering approaches to identifying gene expression patterns from DNA microarray data.

    PubMed

    Do, Jin Hwan; Choi, Dong-Kug

    2008-04-30

    The analysis of microarray data is essential for large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weakness and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.
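
    As a minimal sketch of the crisp clustering the review surveys, the following pure-Python K-means groups toy expression profiles by squared Euclidean distance. The data, the function names and the fixed seed are assumptions made for illustration; as the abstract notes, results depend on the dissimilarity measure, the number of clusters and the initial values.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, iterations=50, seed=0):
    """Cluster profiles into k groups; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in profiles
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Toy data: three profiles rising over time, three falling.
profiles = [
    [0.0, 1.0, 2.0], [0.1, 1.1, 2.1], [0.0, 0.9, 2.2],
    [2.0, 1.0, 0.0], [2.1, 1.0, 0.1], [1.9, 1.1, 0.0],
]
labels, centroids = kmeans(profiles, k=2)
print(labels)
```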

  7. Genes Important for Schizosaccharomyces pombe Meiosis Identified Through a Functional Genomics Screen

    PubMed Central

    Blyth, Julie; Makrantoni, Vasso; Barton, Rachael E.; Spanos, Christos; Rappsilber, Juri; Marston, Adele L.

    2018-01-01

    Meiosis is a specialized cell division that generates gametes, such as eggs and sperm. Errors in meiosis result in miscarriages and are the leading cause of birth defects; however, the molecular origins of these defects remain unknown. Studies in model organisms are beginning to identify the genes and pathways important for meiosis, but the parts list is still poorly defined. Here we present a comprehensive catalog of genes important for meiosis in the fission yeast, Schizosaccharomyces pombe. Our genome-wide functional screen surveyed all nonessential genes for roles in chromosome segregation and spore formation. Novel genes important at distinct stages of the meiotic chromosome segregation and differentiation program were identified. Preliminary characterization implicated three of these genes in centrosome/spindle pole body, centromere, and cohesion function. Our findings represent a near-complete parts list of genes important for meiosis in fission yeast, providing a valuable resource to advance our molecular understanding of meiosis. PMID:29259000

  8. A systems approach to identifying correlated gene targets for the loss of colour pigmentation in plants

    PubMed Central

    2011-01-01

    Background The numerous diverse metabolic pathways by which plant compounds can be produced make it difficult to predict how colour pigmentation is lost for different tissues and plants. This study employs mathematical and in silico methods to identify correlated gene targets for the loss of colour pigmentation in plants from a whole cell perspective based on the full metabolic network of Arabidopsis. This involves extracting a self-contained flavonoid subnetwork from the AraCyc database and calculating feasible metabolic routes or elementary modes (EMs) for it. Those EMs leading to anthocyanin compounds are taken to constitute the anthocyanin biosynthetic pathway (ABP) and their interplay with the rest of the EMs is used to study the minimal cut sets (MCSs), which are different combinations of reactions to block for eliminating colour pigmentation. By relating the reactions to their corresponding genes, the MCSs are used to explore the phenotypic roles of the ABP genes, their relevance to the ABP and the impact their eliminations would have on other processes in the cell. Results Simulation and prediction results of the effect of different MCSs for eliminating colour pigmentation correspond with existing experimental observations. Two examples are: i) two MCSs which require the simultaneous suppression of genes DFR and ANS to eliminate colour pigmentation, correspond to observational results of the same genes being co-regulated for eliminating floral pigmentation in Aquilegia and; ii) the impact of another MCS requiring CHS suppression, corresponds to findings where the suppression of the early gene CHS eliminated nearly all flavonoids but did not affect the production of volatile benzenoids responsible for floral scent. 
    Conclusions From the various MCSs identified for eliminating colour pigmentation, several correlate to existing experimental observations, indicating that different MCSs are suitable for different plants, different cells, and different conditions.
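
    The idea of a minimal cut set can be sketched as a minimal hitting set: a smallest combination of reactions such that every elementary mode leading to the target product contains at least one blocked reaction. The brute-force search below is a simplification under that assumption; the reaction labels R1 to R4 are hypothetical, and a full analysis would also weigh the effect of each cut on the remaining modes.

```python
from itertools import combinations


def minimal_cut_sets(target_modes, max_size=3):
    """Enumerate minimal reaction sets that intersect every target mode."""
    reactions = sorted(set().union(*target_modes))
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(reactions, size):
            cut = set(combo)
            # Skip supersets of a smaller cut set: they are not minimal.
            if any(previous <= cut for previous in found):
                continue
            # A cut set must disable every mode that yields the target.
            if all(cut & mode for mode in target_modes):
                found.append(cut)
    return found


# Two hypothetical elementary modes that both produce the target pigment.
modes = [{"R1", "R2", "R4"}, {"R1", "R3", "R4"}]
cuts = minimal_cut_sets(modes)
print(cuts)
```

    Here blocking R1 or R4 alone suffices, whereas R2 and R3 must be blocked together because each appears in only one mode.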

  9. Microarray and differential display identify genes involved in jasmonate-dependent anther development.

    PubMed

    Mandaokar, Ajin; Kumar, V Dinesh; Amway, Matt; Browse, John

    2003-07-01

    Jasmonate (JA) is a signaling compound essential for anther development and pollen fertility in Arabidopsis. Mutations that block the pathway of JA synthesis result in male sterility. To understand the processes of anther and pollen maturation, we used microarray and differential display approaches to compare gene expression pattern in anthers of wild-type Arabidopsis and the male-sterile mutant, opr3. Microarray experiment revealed 25 genes that were up-regulated more than 1.8-fold in wild-type anthers as compared to mutant anthers. Experiments based on differential display identified 13 additional genes up-regulated in wild-type anthers compared to opr3 for a total of 38 differentially expressed genes. Searches of the Arabidopsis and non-redundant databases disclosed known or likely functions for 28 of the 38 genes identified, while 10 genes encode proteins of unknown function. Northern blot analysis of eight representative clones as probes confirmed low expression in opr3 anthers compared with wild-type anthers. JA responsiveness of these same genes was also investigated by northern blot analysis of anther RNA isolated from wild-type and opr3 plants. In these experiments, four genes were induced in opr3 anthers within 0.5-1 h of JA treatment while the remaining genes were up-regulated only 1-8 h after JA application. None of these genes was induced by JA in anthers of the coi1 mutant that is deficient in JA responsiveness. The four early-induced genes in opr3 encode lipoxygenase, a putative bHLH transcription factor, epithiospecifier protein and an unknown protein. We propose that these and other early components may be involved in JA signaling and in the initiation of developmental processes. The four late genes encode an extensin-like protein, a peptide transporter and two unknown proteins, which may represent components required later in anther and pollen maturation. Transcript profiling has provided a successful approach to identify genes involved in
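
    A fold-change cutoff such as the 1.8-fold threshold mentioned above can be applied as a simple ratio filter. The sketch below assumes one normalized intensity per gene and condition; the gene labels and values are hypothetical, and the actual study's normalization and replicate handling are not described in the abstract.

```python
def upregulated(wild_type, mutant, min_fold=1.8):
    """Return genes whose wild-type/mutant intensity ratio exceeds min_fold."""
    hits = {}
    for gene, wt_value in wild_type.items():
        mutant_value = mutant.get(gene)
        # Skip genes without a usable mutant measurement.
        if mutant_value is None or mutant_value <= 0:
            continue
        fold = wt_value / mutant_value
        if fold > min_fold:
            hits[gene] = fold
    return hits


# Hypothetical normalized intensities for three genes.
wild_type = {"geneA": 360.0, "geneB": 150.0, "geneC": 85.0}
mutant = {"geneA": 100.0, "geneB": 100.0, "geneC": 50.0}
hits = upregulated(wild_type, mutant)
print(hits)
```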

  10. Lentiviral vector-based insertional mutagenesis identifies genes associated with liver cancer

    PubMed Central

    Ranzani, Marco; Cesana, Daniela; Bartholomae, Cynthia C.; Sanvito, Francesca; Pala, Mauro; Benedicenti, Fabrizio; Gallina, Pierangela; Sergi, Lucia Sergi; Merella, Stefania; Bulfone, Alessandro; Doglioni, Claudio; von Kalle, Christof; Kim, Yoon Jun; Schmidt, Manfred; Tonon, Giovanni; Naldini, Luigi; Montini, Eugenio

    2013-01-01

    Transposons and γ-retroviruses have been efficiently used as insertional mutagens in different tissues to identify molecular culprits of cancer. However, these systems are characterized by recurring integrations that accumulate in tumor cells, hampering the identification of early cancer-driving events amongst bystander and progression-related events. We developed an insertional mutagenesis platform based on lentiviral vectors (LVV) by which we could efficiently induce hepatocellular carcinoma (HCC) in 3 different mouse models. By virtue of LVV’s replication-deficient nature and broad genome-wide integration pattern, LVV-based insertional mutagenesis allowed identification of 4 new liver cancer genes from a limited number of integrations. We validated the oncogenic potential of all the identified genes in vivo, with different levels of penetrance. Our newly identified cancer genes are likely to play a role in human disease, since they are upregulated and/or amplified/deleted in human HCCs and can predict clinical outcome of patients. PMID:23314173

  11. GESearch: An Interactive GUI Tool for Identifying Gene Expression Signature.

    PubMed

    Ye, Ning; Yin, Hengfu; Liu, Jingjing; Dai, Xiaogang; Yin, Tongming

    2015-01-01

    The huge amount of gene expression data generated by microarray and next-generation sequencing technologies presents challenges in exploiting its biological meaning. When searching for the coexpression genes, the data mining process is largely affected by selection of algorithms. Thus, it is highly desirable to provide multiple options of algorithms in the user-friendly analytical toolkit to explore the gene expression signatures. For this purpose, we developed GESearch, an interactive graphical user interface (GUI) toolkit, which is written in MATLAB and supports a variety of gene expression data files. This analytical toolkit provides four models, including the mean, the regression, the delegate, and the ensemble models, to identify the coexpression genes, and enables the users to filter data and to select gene expression patterns by browsing the display window or by importing knowledge-based genes. Subsequently, the utility of this analytical toolkit is demonstrated by analyzing two sets of real-life microarray datasets from cell-cycle experiments. Overall, we have developed an interactive GUI toolkit that allows for choosing multiple algorithms for analyzing the gene expression signatures.

  12. Identifying candidate genes for Type 2 Diabetes Mellitus and obesity through gene expression profiling in multiple tissues or cells.

    PubMed

    Chen, Junhui; Meng, Yuhuan; Zhou, Jinghui; Zhuo, Min; Ling, Fei; Zhang, Yu; Du, Hongli; Wang, Xiaoning

    2013-01-01

    Type 2 Diabetes Mellitus (T2DM) and obesity have become increasingly prevalent in recent years. Recent studies have focused on identifying causal variations or candidate genes for obesity and T2DM via analysis of expression quantitative trait loci (eQTL) within a single tissue. T2DM and obesity are affected by comprehensive sets of genes in multiple tissues. In the current study, gene expression levels in multiple human tissues from GEO datasets were analyzed, and 21 candidate genes displaying high percentages of differential expression were filtered out. Specifically, DENND1B, LYN, MRPL30, POC1B, PRKCB, RP4-655J12.3, HIBADH, and TMBIM4 were identified from the T2DM-control study, and BCAT1, BMP2K, CSRNP2, MYNN, NCKAP5L, SAP30BP, SLC35B4, SP1, BAP1, GRB14, HSP90AB1, ITGA5, and TOMM5 were identified from the obesity-control study. The majority of these genes are known to be involved in T2DM and obesity. Therefore, analysis of gene expression in various tissues using GEO datasets may be an effective and feasible method to determine novel or causal genes associated with T2DM and obesity.

  13. Applying Multivariate Adaptive Splines to Identify Genes With Expressions Varying After Diagnosis in Microarray Experiments.

    PubMed

    Duan, Fenghai; Xu, Ye

    2017-01-01

    To analyze a microarray experiment to identify the genes with expressions varying after the diagnosis of breast cancer. A total of 44 928 probe sets in an Affymetrix microarray data publicly available on Gene Expression Omnibus from 249 patients with breast cancer were analyzed by the nonparametric multivariate adaptive splines. Then, the identified genes with turning points were grouped by K-means clustering, and their network relationship was subsequently analyzed by the Ingenuity Pathway Analysis. In total, 1640 probe sets (genes) were reliably identified to have turning points along with the age at diagnosis in their expression profiling, of which 927 expressed lower after turning points and 713 expressed higher after the turning points. K-means clustered them into 3 groups with turning points centering at 54, 62.5, and 72, respectively. The pathway analysis showed that the identified genes were actively involved in various cancer-related functions or networks. In this article, we applied the nonparametric multivariate adaptive splines method to a publicly available gene expression data and successfully identified genes with expressions varying before and after breast cancer diagnosis.

  14. TGMI: an efficient algorithm for identifying pathway regulators through evaluation of triple-gene mutual interaction

    PubMed Central

    Gunasekara, Chathura; Zhang, Kui; Deng, Wenping; Brown, Laura

    2018-01-01

    Despite their important roles, the regulators for most metabolic pathways and biological processes remain elusive. Presently, the methods for identifying metabolic pathway and biological process regulators are intensively sought after. We developed a novel algorithm called triple-gene mutual interaction (TGMI) for identifying these regulators using high-throughput gene expression data. It first calculated the regulatory interactions among triple gene blocks (two pathway genes and one transcription factor (TF)), using conditional mutual information, and then identified significantly interacted triple genes using a newly identified novel mutual interaction measure (MIM), which was substantiated to reflect strengths of regulatory interactions within each triple gene block. The TGMI calculated the MIM for each triple gene block and then examined its statistical significance using bootstrap. Finally, the frequencies of all TFs present in all significantly interacted triple gene blocks were calculated and ranked. We showed that the TFs with higher frequencies were usually genuine pathway regulators upon evaluating multiple pathways in plants, animals and yeast. Comparison of TGMI with several other algorithms demonstrated its higher accuracy. Therefore, TGMI will be a valuable tool that can help biologists to identify regulators of metabolic pathways and biological processes from the exploded high-throughput gene expression data in public repositories. PMID:29579312
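
    Conditional mutual information, the quantity TGMI builds on, can be estimated from discretized expression levels by counting joint occurrences. The sketch below computes I(X;Y|Z) in bits for two pathway genes X and Y given a transcription factor Z. Which variable is conditioned on, the discretization and the function name are assumptions; the MIM itself is not defined in the abstract and is not reproduced here.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in bits from discrete samples."""
    n = len(x)
    count_xyz = Counter(zip(x, y, z))
    count_xz = Counter(zip(x, z))
    count_yz = Counter(zip(y, z))
    count_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), c in count_xyz.items():
        # p(x,y,z) * log2[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ], written in counts.
        ratio = (c * count_z[zi]) / (count_xz[(xi, zi)] * count_yz[(yi, zi)])
        cmi += (c / n) * log2(ratio)
    return cmi


# Toy case 1: X and Y move together while Z is constant (1 bit shared).
print(conditional_mutual_information([0, 1, 0, 1], [0, 1, 0, 1], [0, 0, 0, 0]))
# Toy case 2: X and Y are both fully explained by Z (nothing left to share).
print(conditional_mutual_information([0, 1, 0, 1], [0, 1, 0, 1], [0, 1, 0, 1]))
```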

  15. Epidermal growth factor gene is a newly identified candidate gene for gout

    PubMed Central

    Han, Lin; Cao, Chunwei; Jia, Zhaotong; Liu, Shiguo; Liu, Zhen; Xin, Ruosai; Wang, Can; Li, Xinde; Ren, Wei; Wang, Xuefeng; Li, Changgui

    2016-01-01

    Chromosome 4q25 has been identified as a genomic region associated with gout. However, the associations of gout with the genes in this region have not yet been confirmed. Here, we performed two-stage analysis to determine whether variations in candidate genes in the 4q25 region are associated with gout in a male Chinese Han population. We first evaluated 96 tag single nucleotide polymorphisms (SNPs) in eight inflammatory/immune pathway- or glucose/lipid metabolism-related genes in the 4q25 region in 480 male gout patients and 480 controls. The SNP rs12504538, located in the elongation of very-long-chain-fatty-acid-like family member 6 gene (Elovl6), was found to be associated with gout susceptibility (Padjusted = 0.00595). In the second stage of analysis, we performed fine mapping analysis of 93 tag SNPs in Elovl6 and in the epidermal growth factor gene (EGF) and its flanking regions in 1017 male gout patients and 1897 healthy male controls. We observed a significant association between the T allele of EGF rs2298999 and gout (odds ratio = 0.77, 95% confidence interval = 0.67–0.88, Padjusted = 6.42 × 10^-3). These results provide the first evidence for an association between the EGF rs2298999 C/T polymorphism and gout. Our findings should be validated in additional populations. PMID:27506295

  16. Identifying the genes of unconventional high temperature superconductors.

    PubMed

    Hu, Jiangping

    We elucidate a recently emergent framework for unifying the two families of high temperature (high Tc) superconductors, cuprates and iron-based superconductors. The unification suggests that the latter is simply the counterpart of the former to realize robust extended s-wave pairing symmetries in a square lattice. The unification identifies that the key ingredient (gene) of high Tc superconductors is a quasi two dimensional electronic environment in which the d-orbitals of cations that participate in strong in-plane couplings to the p-orbitals of anions are isolated near the Fermi energy. With this gene, the superexchange magnetic interactions mediated by anions could maximize their contributions to superconductivity. Creating the gene requires special arrangements between local electronic structures and crystal lattice structures. The speciality explains why high Tc superconductors are so rare. An explicit prediction is made to realize high Tc superconductivity in Co/Ni-based materials with a quasi two dimensional hexagonal lattice structure formed by trigonal bipyramidal complexes.

  17. Combining gene expression and genetic analyses to identify candidate genes involved in cold responses in pea.

    PubMed

    Legrand, Sylvain; Marque, Gilles; Blassiau, Christelle; Bluteau, Aurélie; Canoy, Anne-Sophie; Fontaine, Véronique; Jaminon, Odile; Bahrman, Nasser; Mautord, Julie; Morin, Julie; Petit, Aurélie; Baranger, Alain; Rivière, Nathalie; Wilmer, Jeroen; Delbreil, Bruno; Lejeune-Hénaut, Isabelle

    2013-09-01

    Cold stress affects plant growth and development. In order to better understand the responses to cold (chilling or freezing tolerance), we used two contrasting pea lines. Following a chilling period, the Champagne line becomes tolerant to frost whereas the Terese line remains sensitive. Four suppression subtractive hybridisation libraries were obtained using mRNAs isolated from pea genotypes Champagne and Terese. Using quantitative polymerase chain reaction (qPCR) performed on 159 genes, 43 and 54 genes were identified as differentially expressed at the initial time point and during the time course study, respectively. Molecular markers were developed from the differentially expressed genes and were genotyped on a population of 164 RILs derived from a cross between Champagne and Terese. We identified 5 candidate genes colocalizing with 3 different frost damage quantitative trait loci (QTL) intervals and a protein quantity locus (PQL) rich region previously reported. This investigation revealed the role of constitutive differences between both genotypes in the cold responses, in particular with genes related to the glycine degradation pathway that could confer better frost tolerance on Champagne. We showed that freezing tolerance involves a decrease of expression of genes related to photosynthesis and the expression of a gene involved in the production of cysteine and methionine that could act as cryoprotectant molecules. Although it remains to be confirmed, this study could also reveal the involvement of the jasmonate pathway in the cold responses, since we observed that two genes related to this pathway were mapped in a frost damage QTL interval and in a PQL rich region interval, respectively. Copyright © 2013 Elsevier GmbH. All rights reserved.

  18. Microarray expression profiling identifies genes with altered expression in HDL-deficient mice

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Callow, Matthew J.; Dudoit, Sandrine; Gong, Elaine L.

    2000-05-05

    Based on the assumption that severe alterations in the expression of genes known to be involved in HDL metabolism may affect the expression of other genes, we screened an array of over 5000 mouse expressed sequence tags (ESTs) for altered gene expression in the livers of two lines of mice with dramatic decreases in HDL plasma concentrations. Labeled cDNA from livers of apolipoprotein AI (apo AI) knockout mice, Scavenger Receptor BI (SR-BI) transgenic mice and control mice were co-hybridized to microarrays. Two-sample t-statistics were used to identify genes with altered expression levels in the knockout or transgenic mice compared with the control mice. In the SR-BI group we found 9 array elements representing at least 5 genes to be significantly altered on the basis of an adjusted p value of less than 0.05. In the apo AI knockout group 8 array elements representing 4 genes were altered compared with the control group (p < 0.05). Several of the genes identified in the SR-BI transgenic mice suggest altered sterol metabolism and oxidative processes. These studies illustrate the use of multiple-testing methods for the identification of genes with altered expression in replicated microarray experiments of apo AI knockout and SR-BI transgenic mice.
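
    The two ingredients mentioned here, a per-gene two-sample t-statistic and a multiple-testing adjustment of the resulting p values, can be sketched as follows. The abstract does not say which adjustment procedure was used, so Holm's step-down method is substituted purely as a simple, well-known example. The function names are hypothetical, and the conversion of t-statistics into p values is left out.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Two-sample t-statistic with unequal variances (Welch form)."""
    se = sqrt(variance(sample_a) / len(sample_a)
              + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se


def holm_adjust(p_values):
    """Holm step-down adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p value is multiplied by (m - k + 1); the
        # running maximum keeps the adjusted values monotone.
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted
```

    A gene would then be called significantly altered when its adjusted p value falls below the chosen threshold, 0.05 in the abstract.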

  19. Identifying the rooted species tree from the distribution of unrooted gene trees under the coalescent.

    PubMed

    Allman, Elizabeth S; Degnan, James H; Rhodes, John A

    2011-06-01

    Gene trees are evolutionary trees representing the ancestry of genes sampled from multiple populations. Species trees represent populations of individuals-each with many genes-splitting into new populations or species. The coalescent process, which models ancestry of gene copies within populations, is often used to model the probability distribution of gene trees given a fixed species tree. This multispecies coalescent model provides a framework for phylogeneticists to infer species trees from gene trees using maximum likelihood or Bayesian approaches. Because the coalescent models a branching process over time, all trees are typically assumed to be rooted in this setting. Often, however, gene trees inferred by traditional phylogenetic methods are unrooted. We investigate probabilities of unrooted gene trees under the multispecies coalescent model. We show that when there are four species with one gene sampled per species, the distribution of unrooted gene tree topologies identifies the unrooted species tree topology and some, but not all, information in the species tree edges (branch lengths). The location of the root on the species tree is not identifiable in this situation. However, for 5 or more species with one gene sampled per species, we show that the distribution of unrooted gene tree topologies identifies the rooted species tree topology and all its internal branch lengths. The length of any pendant branch leading to a leaf of the species tree is also identifiable for any species from which more than one gene is sampled.
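
    The four-species case can be made concrete with the standard multispecies coalescent result for quartets: with one gene sampled per species, the unrooted gene tree matching the species tree has probability 1 - (2/3)exp(-t), and each of the two alternatives has probability (1/3)exp(-t), where t is the internal branch length of the unrooted species tree in coalescent units. The sketch below only evaluates this formula; the function name and dictionary keys are hypothetical.

```python
from math import exp


def unrooted_quartet_probs(internal_length):
    """Unrooted gene tree topology probabilities for a four-species tree.

    internal_length is the internal branch length of the unrooted species
    tree in coalescent units, with one gene sampled per species.
    """
    if internal_length < 0:
        raise ValueError("branch length must be non-negative")
    discordant = exp(-internal_length) / 3.0
    return {
        "concordant": 1.0 - 2.0 * discordant,
        "discordant_1": discordant,
        "discordant_2": discordant,
    }
```

    Because the distribution depends only on the unrooted topology and the single value t, any placement of the root that yields the same t gives the same probabilities, which is consistent with the root not being identifiable from four species.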

  20. Genome-wide methylation analysis identifies genes silenced in non-seminoma cell lines

    PubMed Central

    Noor, Dzul Azri Mohamed; Jeyapalan, Jennie N; Alhazmi, Safiah; Carr, Matthew; Squibb, Benjamin; Wallace, Claire; Tan, Christopher; Cusack, Martin; Hughes, Jaime; Reader, Tom; Shipley, Janet; Sheer, Denise; Scotting, Paul J

    2016-01-01

    Silencing of genes by DNA methylation is a common phenomenon in many types of cancer. However, the genome-wide effect of DNA methylation on gene expression has been analysed in relatively few cancers. Germ cell tumours (GCTs) are a complex group of malignancies. They are unique in developing from a pluripotent progenitor cell. Previous analyses have suggested that non-seminomas exhibit much higher levels of DNA methylation than seminomas. The genomic targets that are methylated, the extent to which this results in gene silencing and the identity of the silenced genes most likely to play a role in the tumours’ biology have not yet been established. In this study, genome-wide methylation and expression analysis of GCT cell lines was combined with gene expression data from primary tumours to address this question. Genome methylation was analysed using the Illumina infinium HumanMethylome450 bead chip system and gene expression was analysed using Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays. Regulation by methylation was confirmed by demethylation using 5-aza-2-deoxycytidine and reverse transcription–quantitative PCR. Large differences in the level of methylation of the CpG islands of individual genes between tumour cell lines correlated well with differential gene expression. Treatment of non-seminoma cells with 5-aza-2-deoxycytidine verified that methylation of all genes tested played a role in their silencing in yolk sac tumour cells and many of these genes were also differentially expressed in primary tumours. Genes silenced by methylation in the various GCT cell lines were identified. Several pluripotency-associated genes were identified as a major functional group of silenced genes. PMID:29263807

  1. Genome-wide methylation analysis identifies genes silenced in non-seminoma cell lines.

    PubMed

    Noor, Dzul Azri Mohamed; Jeyapalan, Jennie N; Alhazmi, Safiah; Carr, Matthew; Squibb, Benjamin; Wallace, Claire; Tan, Christopher; Cusack, Martin; Hughes, Jaime; Reader, Tom; Shipley, Janet; Sheer, Denise; Scotting, Paul J

    2016-01-01

    Silencing of genes by DNA methylation is a common phenomenon in many types of cancer. However, the genome-wide effect of DNA methylation on gene expression has been analysed in relatively few cancers. Germ cell tumours (GCTs) are a complex group of malignancies. They are unique in developing from a pluripotent progenitor cell. Previous analyses have suggested that non-seminomas exhibit much higher levels of DNA methylation than seminomas. The genomic targets that are methylated, the extent to which this results in gene silencing and the identity of the silenced genes most likely to play a role in the tumours' biology have not yet been established. In this study, genome-wide methylation and expression analysis of GCT cell lines was combined with gene expression data from primary tumours to address this question. Genome methylation was analysed using the Illumina infinium HumanMethylome450 bead chip system and gene expression was analysed using Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays. Regulation by methylation was confirmed by demethylation using 5-aza-2-deoxycytidine and reverse transcription-quantitative PCR. Large differences in the level of methylation of the CpG islands of individual genes between tumour cell lines correlated well with differential gene expression. Treatment of non-seminoma cells with 5-aza-2-deoxycytidine verified that methylation of all genes tested played a role in their silencing in yolk sac tumour cells and many of these genes were also differentially expressed in primary tumours. Genes silenced by methylation in the various GCT cell lines were identified. Several pluripotency-associated genes were identified as a major functional group of silenced genes.

  2. Haplotype Analysis in Multiple Crosses to Identify a QTL Gene

    PubMed Central

    Wang, Xiaosong; Korstanje, Ron; Higgins, David; Paigen, Beverly

    2004-01-01

    Identifying quantitative trait locus (QTL) genes is a challenging task. Herein, we report using a two-step process to identify Apoa2 as the gene underlying Hdlq5, a QTL for plasma high-density lipoprotein cholesterol (HDL) levels on mouse chromosome 1. First, we performed a sequence analysis of the Apoa2 coding region in 46 genetically diverse mouse strains and found five different APOA2 protein variants, which we named APOA2a to APOA2e. Second, we conducted a haplotype analysis of the strains in 21 crosses that have so far detected HDL QTLs; we found that Hdlq5 was detected only in the nine crosses where one parent had the APOA2b protein variant characterized by an Ala61-to-Val61 substitution. We then found that strains with the APOA2b variant had significantly higher (P ≤ 0.002) plasma HDL levels than those with either the APOA2a or the APOA2c variant. These findings support Apoa2 as the underlying Hdlq5 gene and suggest the Apoa2 polymorphisms responsible for the Hdlq5 phenotype. Therefore, haplotype analysis in multiple crosses can be used to support a candidate QTL gene. PMID:15310659

  3. Haplotype analysis in multiple crosses to identify a QTL gene.

    PubMed

    Wang, Xiaosong; Korstanje, Ron; Higgins, David; Paigen, Beverly

    2004-09-01

    Identifying quantitative trait locus (QTL) genes is a challenging task. Herein, we report using a two-step process to identify Apoa2 as the gene underlying Hdlq5, a QTL for plasma high-density lipoprotein cholesterol (HDL) levels on mouse chromosome 1. First, we performed a sequence analysis of the Apoa2 coding region in 46 genetically diverse mouse strains and found five different APOA2 protein variants, which we named APOA2a to APOA2e. Second, we conducted a haplotype analysis of the strains in 21 crosses that have so far detected HDL QTLs; we found that Hdlq5 was detected only in the nine crosses where one parent had the APOA2b protein variant characterized by an Ala61-to-Val61 substitution. We then found that strains with the APOA2b variant had significantly higher (P < or = 0.002) plasma HDL levels than those with either the APOA2a or the APOA2c variant. These findings support Apoa2 as the underlying Hdlq5 gene and suggest the Apoa2 polymorphisms responsible for the Hdlq5 phenotype. Therefore, haplotype analysis in multiple crosses can be used to support a candidate QTL gene.

  4. Novel numerical and graphical representation of DNA sequences and proteins.

    PubMed

    Randić, M; Novic, M; Vikić-Topić, D; Plavsić, D

    2006-12-01

    We have introduced novel numerical and graphical representations of DNA, which offer a simple and unique characterization of DNA sequences. The numerical representation of a DNA sequence is given as a sequence of real numbers derived from a unique graphical representation of the standard genetic code. There is no loss of information on the primary structure of a DNA sequence associated with this numerical representation. The novel representations are illustrated with the coding sequences of the first exon of beta-globin gene of half a dozen species in addition to human. The method can be extended to proteins as is exemplified by humanin, a 24-aa peptide that has recently been identified as a specific inhibitor of neuronal cell death induced by familial Alzheimer's disease mutant genes.

  5. A transposon-based genetic screen in mice identifies genes altered in colorectal cancer.

    PubMed

    Starr, Timothy K; Allaei, Raha; Silverstein, Kevin A T; Staggs, Rodney A; Sarver, Aaron L; Bergemann, Tracy L; Gupta, Mihir; O'Sullivan, M Gerard; Matise, Ilze; Dupuy, Adam J; Collier, Lara S; Powers, Scott; Oberg, Ann L; Asmann, Yan W; Thibodeau, Stephen N; Tessarollo, Lino; Copeland, Neal G; Jenkins, Nancy A; Cormier, Robert T; Largaespada, David A

    2009-03-27

    Human colorectal cancers (CRCs) display a large number of genetic and epigenetic alterations, some of which are causally involved in tumorigenesis (drivers) and others that have little functional impact (passengers). To help distinguish between these two classes of alterations, we used a transposon-based genetic screen in mice to identify candidate genes for CRC. Mice harboring mutagenic Sleeping Beauty (SB) transposons were crossed with mice expressing SB transposase in gastrointestinal tract epithelium. Most of the offspring developed intestinal lesions, including intraepithelial neoplasia, adenomas, and adenocarcinomas. Analysis of over 16,000 transposon insertions identified 77 candidate CRC genes, 60 of which are mutated and/or dysregulated in human CRC and thus are most likely to drive tumorigenesis. These genes include APC, PTEN, and SMAD4. The screen also identified 17 candidate genes that had not previously been implicated in CRC, including POLI, PTPRK, and RSPO2.

  6. Comparison of gene expression in segregating families identifies genes and genomic regions involved in a novel adaptation, zinc hyperaccumulation.

    PubMed

    Filatov, Victor; Dowdle, John; Smirnoff, Nicholas; Ford-Lloyd, Brian; Newbury, H John; Macnair, Mark R

    2006-09-01

    One of the challenges of comparative genomics is to identify specific genetic changes associated with the evolution of a novel adaptation or trait. We need to be able to disassociate the genes involved with a particular character from all the other genetic changes that take place as lineages diverge. Here we show that by comparing the transcriptional profile of segregating families with that of parent species differing in a novel trait, it is possible to narrow down substantially the list of potential target genes. In addition, by assuming synteny with a related model organism for which the complete genome sequence is available, it is possible to use the cosegregation of markers differing in transcription level to identify regions of the genome which probably contain quantitative trait loci (QTLs) for the character. This novel combination of genomics and classical genetics provides a very powerful tool to identify candidate genes. We use this methodology to investigate zinc hyperaccumulation in Arabidopsis halleri, the sister species to the model plant, Arabidopsis thaliana. We compare the transcriptional profile of A. halleri with that of its sister nonaccumulator species, Arabidopsis petraea, and between accumulator and nonaccumulator F(3)s derived from the cross between the two species. We identify eight genes which consistently show greater expression in accumulator phenotypes in both roots and shoots, including two metal transporter genes (NRAMP3 and ZIP6), and cytoplasmic aconitase, a gene involved in iron homeostasis in mammals. We also show that there appear to be two QTLs for zinc accumulation, on chromosomes 3 and 7.

  7. Integrating genome-wide association study and expression quantitative trait loci data identifies multiple genes and gene set associated with neuroticism.

    PubMed

    Fan, Qianrui; Wang, Wenyu; Hao, Jingcan; He, Awen; Wen, Yan; Guo, Xiong; Wu, Cuiyan; Ning, Yujie; Wang, Xi; Wang, Sen; Zhang, Feng

    2017-08-01

    Neuroticism is a fundamental personality trait with a significant genetic determinant. To identify novel susceptibility genes for neuroticism, we conducted an integrative analysis of genomic and transcriptomic data from a genome-wide association study (GWAS) and an expression quantitative trait locus (eQTL) study. GWAS summary data were derived from published studies of neuroticism, involving a total of 170,906 subjects. An eQTL dataset containing 927,753 eQTLs was obtained from an eQTL meta-analysis of 5311 samples. Integrative analysis of GWAS and eQTL data was conducted with summary data-based Mendelian randomization (SMR) analysis software. To identify neuroticism-associated gene sets, the SMR analysis results were further subjected to gene set enrichment analysis (GSEA). The gene set annotation dataset (containing 13,311 annotated gene sets) of the GSEA Molecular Signatures Database was used. SMR single gene analysis identified 6 significant genes for neuroticism, including MSRA (p value = 2.27×10⁻¹⁰), MGC57346 (p value = 6.92×10⁻⁷), BLK (p value = 1.01×10⁻⁶), XKR6 (p value = 1.11×10⁻⁶), C17ORF69 (p value = 1.12×10⁻⁶) and KIAA1267 (p value = 4.00×10⁻⁶). Gene set enrichment analysis detected a significant association for the Chr8p23 gene set (false discovery rate = 0.033). Our results provide novel clues for studies of the genetic mechanism of neuroticism. Copyright © 2017. Published by Elsevier Inc.

  8. A Multiomics Approach to Identify Genes Associated with Childhood Asthma Risk and Morbidity.

    PubMed

    Forno, Erick; Wang, Ting; Yan, Qi; Brehm, John; Acosta-Perez, Edna; Colon-Semidey, Angel; Alvarez, Maria; Boutaoui, Nadia; Cloutier, Michelle M; Alcorn, John F; Canino, Glorisa; Chen, Wei; Celedón, Juan C

    2017-10-01

    Childhood asthma is a complex disease. In this study, we aim to identify genes associated with childhood asthma through a multiomics "vertical" approach that integrates multiple analytical steps using linear and logistic regression models. In a case-control study of childhood asthma in Puerto Ricans (n = 1,127), we used adjusted linear or logistic regression models to evaluate associations between several analytical steps of omics data, including genome-wide (GW) genotype data, GW methylation, GW expression profiling, cytokine levels, asthma-intermediate phenotypes, and asthma status. At each point, only the top genes/single-nucleotide polymorphisms/probes/cytokines were carried forward for subsequent analysis. In step 1, asthma modified the gene expression-protein level association for 1,645 genes; pathway analysis showed an enrichment of these genes in the cytokine signaling system (n = 269 genes). In steps 2-3, expression levels of 40 genes were associated with intermediate phenotypes (asthma onset age, forced expiratory volume in 1 second, exacerbations, eosinophil counts, and skin test reactivity); of those, methylation of seven genes was also associated with asthma. Of these seven candidate genes, IL5RA was also significant in analytical steps 4-8. We then measured plasma IL-5 receptor α levels, which were associated with asthma age of onset and moderate-severe exacerbations. In addition, in silico database analysis showed that several of our identified IL5RA single-nucleotide polymorphisms are associated with transcription factors related to asthma and atopy. This approach integrates several analytical steps and is able to identify biologically relevant asthma-related genes, such as IL5RA. It differs from other methods that rely on complex statistical models with various assumptions.

  9. A large-scale RNA interference screen identifies genes that regulate autophagy at different stages.

    PubMed

    Guo, Sujuan; Pridham, Kevin J; Virbasius, Ching-Man; He, Bin; Zhang, Liqing; Varmark, Hanne; Green, Michael R; Sheng, Zhi

    2018-02-12

    Dysregulated autophagy is central to the pathogenesis and therapeutic development of cancer. However, how autophagy is regulated in cancer is not well understood and genes that modulate cancer autophagy are not fully defined. To gain more insights into autophagy regulation in cancer, we performed a large-scale RNA interference screen in K562 human chronic myeloid leukemia cells using monodansylcadaverine staining, an autophagy-detecting approach equivalent to immunoblotting of the autophagy marker LC3B or fluorescence microscopy of GFP-LC3B. By coupling monodansylcadaverine staining with fluorescence-activated cell sorting, we successfully isolated autophagic K562 cells where we identified 336 short hairpin RNAs. After candidate validation using Cyto-ID fluorescence spectrophotometry, LC3B immunoblotting, and quantitative RT-PCR, 82 genes were identified as autophagy-regulating genes. Twenty genes have been reported previously and the remaining 62 candidates are novel autophagy mediators. Bioinformatic analyses revealed that most candidate genes were involved in molecular pathways regulating autophagy, rather than directly participating in the autophagy process. Further autophagy flux assays revealed that 57 autophagy-regulating genes suppressed autophagy initiation, whereas 21 candidates promoted autophagy maturation. Our RNA interference screen identified genes that regulate autophagy at different stages, which helps decode autophagy regulation in cancer and offers novel avenues to develop autophagy-related therapies for cancer.

  10. Overexpression screens identify conserved dosage chromosome instability genes in yeast and human cancer

    PubMed Central

    Duffy, Supipi; Fam, Hok Khim; Wang, Yi Kan; Styles, Erin B.; Kim, Jung-Hyun; Ang, J. Sidney; Singh, Tejomayee; Larionov, Vladimir; Shah, Sohrab P.; Andrews, Brenda; Boerkoel, Cornelius F.; Hieter, Philip

    2016-01-01

    Somatic copy number amplification and gene overexpression are common features of many cancers. To determine the role of gene overexpression in chromosome instability (CIN), we performed genome-wide screens in the budding yeast for yeast genes that cause CIN when overexpressed, a phenotype we refer to as dosage CIN (dCIN), and identified 245 dCIN genes. This catalog of genes reveals human orthologs known to be recurrently overexpressed and/or amplified in tumors. We show that two genes, TDP1, a tyrosyl-DNA-phosphodiesterase, and TAF12, an RNA polymerase II TATA-box binding factor, cause CIN when overexpressed in human cells. Rhabdomyosarcoma lines with elevated human Tdp1 levels also exhibit CIN that can be partially rescued by siRNA-mediated knockdown of TDP1. Overexpression of dCIN genes represents a genetic vulnerability that could be leveraged for selective killing of cancer cells through targeting of an unlinked synthetic dosage lethal (SDL) partner. Using SDL screens in yeast, we identified a set of genes that when deleted specifically kill cells with high levels of Tdp1. One gene was the histone deacetylase RPD3, for which there are known inhibitors. Both HT1080 cells overexpressing hTDP1 and rhabdomyosarcoma cells with elevated levels of hTdp1 were more sensitive to histone deacetylase inhibitors valproic acid (VPA) and trichostatin A (TSA), recapitulating the SDL interaction in human cells and suggesting VPA and TSA as potential therapeutic agents for tumors with elevated levels of hTdp1. The catalog of dCIN genes presented here provides a candidate list to identify genes that cause CIN when overexpressed in cancer, which can then be leveraged through SDL to selectively target tumors. PMID:27551064

  11. Integrating mean and variance heterogeneities to identify differentially expressed genes.

    PubMed

    Ouyang, Weiwei; An, Qiang; Zhao, Jinying; Qin, Huaizhen

    2016-12-06

    In functional genomics studies, tests on mean heterogeneity have been widely employed to identify differentially expressed genes with distinct mean expression levels under different experimental conditions. Variance heterogeneity (aka, the difference between condition-specific variances) of gene expression levels is simply neglected or calibrated for as an impediment. The mean heterogeneity in the expression level of a gene reflects one aspect of its distribution alteration; and variance heterogeneity induced by condition change may reflect another aspect. Change in condition may alter both mean and some higher-order characteristics of the distributions of expression levels of susceptible genes. In this report, we put forth a conception of mean-variance differentially expressed (MVDE) genes, whose expression means and variances are sensitive to the change in experimental condition. We mathematically proved the null independence of existent mean heterogeneity tests and variance heterogeneity tests. Based on the independence, we proposed an integrative mean-variance test (IMVT) to combine gene-wise mean heterogeneity and variance heterogeneity induced by condition change. The IMVT outperformed its competitors under comprehensive simulations of normality and Laplace settings. For moderate samples, the IMVT well controlled type I error rates, and so did existent mean heterogeneity test (i.e., the Welch t test (WT), the moderated Welch t test (MWT)) and the procedure of separate tests on mean and variance heterogeneities (SMVT), but the likelihood ratio test (LRT) severely inflated type I error rates. In presence of variance heterogeneity, the IMVT appeared noticeably more powerful than all the valid mean heterogeneity tests. Application to the gene profiles of peripheral circulating B raised solid evidence of informative variance heterogeneity. After adjusting for background data structure, the IMVT replicated previous discoveries and identified novel experiment
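
    The idea of combining a mean heterogeneity test and a variance heterogeneity test that are independent under the null can be illustrated with Fisher's combination of two p values. This is a generic stand-in chosen for illustration; the truncated abstract above does not state the IMVT's own combining rule, and the function name is hypothetical. For two independent p values the statistic -2(ln p1 + ln p2) follows a chi-square distribution with 4 degrees of freedom under the null, whose tail probability has the closed form exp(-x/2)(1 + x/2).

```python
from math import exp, log


def fisher_combine_two(p_mean, p_var):
    """Combine two independent p values with Fisher's method.

    p_mean and p_var are assumed to come from a mean heterogeneity test
    and a variance heterogeneity test that are independent under the null.
    """
    if not (0.0 < p_mean <= 1.0 and 0.0 < p_var <= 1.0):
        raise ValueError("p values must lie in (0, 1]")
    stat = -2.0 * (log(p_mean) + log(p_var))
    # Survival function of a chi-square variable with 4 degrees of freedom.
    return exp(-stat / 2.0) * (1.0 + stat / 2.0)
```

    With p_mean = p_var = 0.05 the combined value is about 0.0175, which shows how two moderate signals can reinforce each other when the tests are independent.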

  12. G20210A prothrombin gene mutation identified in patients with venous leg ulcers.

    PubMed

    Jebeleanu, G; Procopciuc, L

    2001-01-01

    The G20210A variant of the prothrombin gene is the second most frequent mutation identified in patients with deep venous thrombosis, after factor V Leiden. The risk of developing deep venous thrombosis is high in patients identified as heterozygous for the G20210A mutation. To identify this polymorphism in the gene coding for prothrombin, a 345-bp fragment of the 3'-untranslated region of the prothrombin gene was amplified by polymerase chain reaction and digested with the restriction endonuclease HindIII. The amplification and digestion products were analyzed using agarose gel electrophoresis. We investigated 20 patients with venous leg ulcers and found 2 heterozygotes (10%) for the G20210A mutation. None of the patients in the control group had the G20210A mutation. Our study confirms the presence of the G20210A mutation in the Romanian population. Our study also shows a link between venous leg ulcers and this polymorphism in the prothrombin gene.

  13. A Novel Yeast Genomics Method for Identifying New Breast Cancer Susceptibility Genes

    DTIC Science & Technology

    2007-05-01

    find new candidate genes for breast cancer susceptibility in women and identifying these human genes can further improve monitoring and treatment...breast cancer susceptibility genes in humans that are currently unknown and not deducible from current methodologies. It is a fundamental...template to faithfully repair the broken strand. In human cancer it is loss of HR, rather than NHEJ, that is more important in increasing cancer

  14. Next-generation sequencing to identify candidate genes and develop diagnostic markers for a novel Phytophthora resistance gene, RpsHC18, in soybean.

    PubMed

    Zhong, Chao; Sun, Suli; Li, Yinping; Duan, Canxing; Zhu, Zhendong

    2018-03-01

    A novel Phytophthora sojae resistance gene RpsHC18 was identified and finely mapped on soybean chromosome 3. Two NBS-LRR candidate genes were identified and two diagnostic markers of RpsHC18 were developed. Phytophthora root rot caused by Phytophthora sojae is a destructive disease of soybean. The most effective disease-control strategy is to deploy resistant cultivars carrying Phytophthora-resistant Rps genes. The soybean cultivar Huachun 18 has a broad and distinct resistance spectrum to 12 P. sojae isolates. Quantitative trait loci sequencing (QTL-seq), based on the whole-genome resequencing (WGRS) of two extreme resistant and susceptible phenotype bulks from an F2:3 population, was performed, and one 767-kb genomic region with ΔSNP-index ≥ 0.9 on chromosome 3 was identified as the RpsHC18 candidate region in Huachun 18. The candidate region was reduced to a 146-kb region by fine mapping. Nonsynonymous SNP and haplotype analyses were carried out in the 146-kb region among ten soybean genotypes using WGRS. Four specific nonsynonymous SNPs were identified in two nucleotide-binding site-leucine-rich repeat (NBS-LRR) genes, RpsHC18-NBL1 and RpsHC18-NBL2, which were considered to be the candidate genes. Finally, one specific SNP marker in each candidate gene was successfully developed using a tetra-primer ARMS-PCR assay, and the two markers were verified to be specific for RpsHC18 and to effectively distinguish it from other known Rps genes. In this study, we applied an integrated genomic-based strategy combining WGRS with traditional genetic mapping to identify RpsHC18 candidate genes and develop diagnostic markers. These results suggest that next-generation sequencing is a precise, rapid and cost-effective way to identify candidate genes and develop diagnostic markers, and it can accelerate Rps gene cloning and marker-assisted selection for breeding of P. sojae-resistant soybean cultivars.

  15. Transcriptomic analysis of the mussel Elliptio complanata identifies candidate stress-response genes and an abundance of novel or noncoding transcripts

    USGS Publications Warehouse

    Cornman, Robert S.; Robertson, Laura S.; Galbraith, Heather S.; Blakeslee, Carrie J.

    2014-01-01

    Mussels are useful indicator species of environmental stress and degradation, and the global decline in freshwater mussel diversity and abundance is of conservation concern. Elliptio complanata is a common freshwater mussel of eastern North America that can serve both as an indicator and as an experimental model for understanding mussel physiology and genetics. To support genetic components of these research goals, we assembled transcriptome contigs from Illumina paired-end reads. Despite efforts to collapse similar contigs, the final assembly was in excess of 136,000 contigs with an N50 of 982 bp. Even so, comparisons to the CEGMA database of conserved eukaryotic genes indicated that ∼20% of genes remain unrepresented. However, numerous candidate stress-response genes were present, and we identified lineage-specific patterns of diversification among molluscs for cytochrome P450 detoxification genes and two saccharide-modifying enzymes: 1,3 beta-galactosyltransferase and fucosyltransferase. Less than a quarter of contigs had protein-level similarity based on modest BLAST and Hmmer3 statistical thresholds. These results add comparative genomic resources for molluscs and suggest a wealth of novel proteins and noncoding transcripts.
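
    The N50 statistic quoted for this assembly can be sketched as follows. N50 is the contig length L such that contigs of length L or longer together contain at least half of all assembled bases. The function name and the example lengths are illustrative only and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("need at least one contig")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half of the bases are covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Made-up contig lengths: total is 54, and 10 + 9 + 8 = 27 covers half.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # 8
```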

  16. A 6-gene signature identifies four molecular subgroups of neuroblastoma

    PubMed Central

    2011-01-01

    Background: There are currently three postulated genomic subtypes of the childhood tumour neuroblastoma (NB): Type 1, Type 2A, and Type 2B. The most aggressive forms of NB are characterized by amplification of the oncogene MYCN (MNA) and low expression of the favourable marker NTRK1. Recently, mutations or high expression of the familial predisposition gene Anaplastic Lymphoma Kinase (ALK) were associated with unfavourable biology of sporadic NB. Also, various other genes have been linked to NB pathogenesis. Results: The present study explores subgroup discrimination by gene expression profiling using three published microarray studies on NB (47 samples). Four distinct clusters were identified by Principal Components Analysis (PCA) in two separate data sets, which could be verified by an unsupervised hierarchical clustering in a third independent data set (101 NB samples) using a set of 74 discriminative genes. The expression signature of six NB-associated genes ALK, BIRC5, CCND1, MYCN, NTRK1, and PHOX2B, significantly discriminated the four clusters (p < 0.05, one-way ANOVA test). PCA clusters p1, p2, and p3 were found to correspond well to the postulated subtypes 1, 2A, and 2B, respectively. Remarkably, a fourth novel cluster was detected in all three independent data sets. This cluster comprised mainly 11q-deleted MNA-negative tumours with low expression of ALK, BIRC5, and PHOX2B, and was significantly associated with higher tumour stage, poor outcome and poor survival compared to the Type 1-corresponding favourable group (INSS stage 4 and/or dead of disease, p < 0.05, Fisher's exact test). Conclusions: Based on expression profiling we have identified four molecular subgroups of neuroblastoma, which can be distinguished by a 6-gene signature. The fourth subgroup has not been described elsewhere, and efforts are currently made to further investigate this group's specific characteristics. PMID:21492432

  17. Exome sequencing of a large family identifies potential candidate genes contributing risk to bipolar disorder.

    PubMed

    Zhang, Tianxiao; Hou, Liping; Chen, David T; McMahon, Francis J; Wang, Jen-Chyong; Rice, John P

    2018-03-01

    Bipolar disorder is a mental illness with lifetime prevalence of about 1%. Previous genetic studies have identified multiple chromosomal linkage regions and candidate genes that might be associated with bipolar disorder. The present study aimed to identify potential susceptibility variants for bipolar disorder using 6 related case samples from a four-generation family. A combination of exome sequencing and linkage analysis was performed to identify potential susceptibility variants for bipolar disorder. Our study identified a list of five potential candidate genes for bipolar disorder. Among these five genes, GRID1(Glutamate Receptor Delta-1 Subunit), which was previously reported to be associated with several psychiatric disorders and brain related traits, is particularly interesting. Variants with functional significance in this gene were identified from two cousins in our bipolar disorder pedigree. Our findings suggest a potential role for these genes and the related rare variants in the onset and development of bipolar disorder in this one family. Additional research is needed to replicate these findings and evaluate their patho-biological significance. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. Systems approach identifies an organic nitrogen-responsive gene network that is regulated by the master clock control gene CCA1.

    PubMed

    Gutiérrez, Rodrigo A; Stokes, Trevor L; Thum, Karen; Xu, Xiaodong; Obertello, Mariana; Katari, Manpreet S; Tanurdzic, Milos; Dean, Alexis; Nero, Damion C; McClung, C Robertson; Coruzzi, Gloria M

    2008-03-25

    Understanding how nutrients affect gene expression will help us to understand the mechanisms controlling plant growth and development as a function of nutrient availability. Nitrate has been shown to serve as a signal for the control of gene expression in Arabidopsis. There is also evidence, on a gene-by-gene basis, that downstream products of nitrogen (N) assimilation such as glutamate (Glu) or glutamine (Gln) might serve as signals of organic N status that in turn regulate gene expression. To identify genome-wide responses to such organic N signals, Arabidopsis seedlings were transiently treated with ammonium nitrate in the presence or absence of MSX, an inhibitor of glutamine synthetase, resulting in a block of Glu/Gln synthesis. Genes that responded to organic N were identified as those whose response to ammonium nitrate treatment was blocked in the presence of MSX. We showed that some genes previously identified to be regulated by nitrate are under the control of an organic N-metabolite. Using an integrated network model of molecular interactions, we uncovered a subnetwork regulated by organic N that included CCA1 and target genes involved in N-assimilation. We validated some of the predicted interactions and showed that regulation of the master clock control gene CCA1 by Glu or a Glu-derived metabolite in turn regulates the expression of key N-assimilatory genes. Phase response curve analysis shows that distinct N-metabolites can advance or delay the CCA1 phase. Regulation of CCA1 by organic N signals may represent a novel input mechanism for N-nutrients to affect plant circadian clock function.

  19. Weighted gene co-expression network analysis of expression data of monozygotic twins identifies specific modules and hub genes related to BMI.

    PubMed

    Wang, Weijing; Jiang, Wenjie; Hou, Lin; Duan, Haiping; Wu, Yili; Xu, Chunsheng; Tan, Qihua; Li, Shuxia; Zhang, Dongfeng

    2017-11-13

    The therapeutic management of obesity is challenging, hence further elucidating the underlying mechanisms of obesity development and identifying new diagnostic biomarkers and therapeutic targets are urgent and necessary. Here, we performed differential gene expression analysis and weighted gene co-expression network analysis (WGCNA) to identify significant genes and specific modules related to BMI based on gene expression profile data of 7 discordant monozygotic twins. In the differential gene expression analysis, 32 differentially expressed genes (DEGs) showed a trend of up-regulation in twins with higher BMI when compared to their siblings. Categories of positive regulation of nitric-oxide synthase biosynthetic process, positive regulation of NF-kappa B import into nucleus, and peroxidase activity were significantly enriched within GO database and NF-kappa B signaling pathway within KEGG database. DEGs of NAMPT, TLR9, PTGS2, HBD, and PCSK1N might be associated with obesity. In the WGCNA, among the total 20 distinct co-expression modules identified, coral1 module (68 genes) had the strongest positive correlation with BMI (r = 0.56, P = 0.04) and disease status (r = 0.56, P = 0.04). Categories of positive regulation of phospholipase activity, high-density lipoprotein particle clearance, chylomicron remnant clearance, reverse cholesterol transport, intermediate-density lipoprotein particle, chylomicron, low-density lipoprotein particle, very-low-density lipoprotein particle, voltage-gated potassium channel complex, cholesterol transporter activity, and neuropeptide hormone activity were significantly enriched within GO database for this module. The alcoholism and cell adhesion molecules pathways were also significantly enriched within KEGG database. Several hub genes, such as GAL, ASB9, NPPB, TBX2, IL17C, APOE, ABCG4, and APOC2 were also identified. The module eigengene of saddlebrown module (212 genes) was also significantly

  20. Analysis of global gene expression profiles to identify differentially expressed genes critical for embryo development in Brassica rapa.

    PubMed

    Zhang, Yu; Peng, Lifang; Wu, Ya; Shen, Yanyue; Wu, Xiaoming; Wang, Jianbo

    2014-11-01

    Embryo development represents a crucial developmental period in the life cycle of flowering plants. To gain insights into the genetic programs that control embryo development in Brassica rapa L., RNA sequencing technology was used to perform transcriptome profiling analysis of B. rapa developing embryos. The results generated 42,906,229 sequence reads aligned with 32,941 genes. In total, 27,760, 28,871, 28,384, and 25,653 genes were identified from embryos at globular, heart, early cotyledon, and mature developmental stages, respectively, and analysis between stages revealed a subset of stage-specific genes. We next investigated 9,884 differentially expressed genes with more than fivefold changes in expression and false discovery rate ≤ 0.001 from three adjacent-stage comparisons; 1,514, 3,831, and 6,633 genes were detected between globular and heart stage embryo libraries, heart stage and early cotyledon stage, and early cotyledon and mature stage, respectively. Large numbers of genes related to cellular process, metabolism process, response to stimulus, and biological process were expressed during the early and middle stages of embryo development. Fatty acid biosynthesis, biosynthesis of secondary metabolites, and photosynthesis-related genes were expressed predominantly in embryos at the middle stage. Genes for lipid metabolism and storage proteins were highly expressed in the middle and late stages of embryo development. We also identified 911 transcription factor genes that show differential expression across embryo developmental stages. These results increase our understanding of the complex molecular and cellular events during embryo development in B. rapa and provide a foundation for future studies on other oilseed crops.
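
    The selection rule described in this abstract (more than fivefold change and false discovery rate ≤ 0.001) can be sketched as below. The abstract does not say how the false discovery rate was estimated, so the Benjamini-Hochberg adjustment used here is an assumption, as are the function names, the tuple layout and the example values.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(genes, min_fold=5.0, max_fdr=0.001):
    """Keep genes changed more than `min_fold` in either direction with FDR <= max_fdr.

    `genes` is a list of hypothetical (name, fold_change_ratio, p_value) tuples,
    where fold_change_ratio is a positive expression ratio between two stages.
    """
    fdrs = benjamini_hochberg([p for _, _, p in genes])
    cutoff = math.log2(min_fold)
    return [
        name
        for (name, ratio, _), fdr in zip(genes, fdrs)
        if abs(math.log2(ratio)) > cutoff and fdr <= max_fdr
    ]


# Made-up values: g3 changes too little, g4 fails the FDR cutoff.
example = [("g1", 8.0, 1e-6), ("g2", 0.1, 2e-6), ("g3", 2.0, 1e-7), ("g4", 6.0, 0.5)]
print(select_degs(example))  # ['g1', 'g2']
```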

  1. GeneBreak: detection of recurrent DNA copy number aberration-associated chromosomal breakpoints within genes.

    PubMed

    van den Broek, Evert; van Lieshout, Stef; Rausch, Christian; Ylstra, Bauke; van de Wiel, Mark A; Meijer, Gerrit A; Fijneman, Remond J A; Abeln, Sanne

    2016-01-01

    Development of cancer is driven by somatic alterations, including numerical and structural chromosomal aberrations. Currently, several computational methods are available and are widely applied to detect numerical copy number aberrations (CNAs) of chromosomal segments in tumor genomes. However, there is lack of computational methods that systematically detect structural chromosomal aberrations by virtue of the genomic location of CNA-associated chromosomal breaks and identify genes that appear non-randomly affected by chromosomal breakpoints across (large) series of tumor samples. 'GeneBreak' is developed to systematically identify genes recurrently affected by the genomic location of chromosomal CNA-associated breaks by a genome-wide approach, which can be applied to DNA copy number data obtained by array-Comparative Genomic Hybridization (CGH) or by (low-pass) whole genome sequencing (WGS). First, 'GeneBreak' collects the genomic locations of chromosomal CNA-associated breaks that were previously pinpointed by the segmentation algorithm that was applied to obtain CNA profiles. Next, a tailored annotation approach for breakpoint-to-gene mapping is implemented. Finally, dedicated cohort-based statistics is incorporated with correction for covariates that influence the probability to be a breakpoint gene. In addition, multiple testing correction is integrated to reveal recurrent breakpoint events. This easy-to-use algorithm, 'GeneBreak', is implemented in R ( www.cran.r-project.org ) and is available from Bioconductor ( www.bioconductor.org/packages/release/bioc/html/GeneBreak.html ).
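
    The breakpoint-to-gene mapping step described in this abstract can be pictured with a simplified sketch. This is not the GeneBreak R API and it omits the tailored annotation, covariate correction and multiple-testing steps. It only counts, for each gene, how many samples have at least one breakpoint inside the gene interval. All names, coordinates and the data layout are hypothetical.

```python
def samples_with_gene_breakpoints(breakpoints, genes):
    """Count samples that have a breakpoint inside each gene.

    breakpoints: {sample_id: [(chromosome, position), ...]}
    genes:       {gene_name: (chromosome, start, end)}  with inclusive bounds
    """
    hit_samples = {name: set() for name in genes}
    for sample, points in breakpoints.items():
        for chrom, pos in points:
            for name, (g_chrom, start, end) in genes.items():
                if chrom == g_chrom and start <= pos <= end:
                    # A set ensures each sample is counted once per gene.
                    hit_samples[name].add(sample)
    return {name: len(samples) for name, samples in hit_samples.items()}


# Made-up breakpoints for three samples and two gene intervals.
example_breaks = {
    "s1": [("chr1", 150)],
    "s2": [("chr1", 180), ("chr2", 50)],
    "s3": [("chr1", 900)],
}
example_genes = {"geneX": ("chr1", 100, 200), "geneY": ("chr2", 10, 40)}
print(samples_with_gene_breakpoints(example_breaks, example_genes))
# {'geneX': 2, 'geneY': 0}
```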

  2. Evolutionary Inference across Eukaryotes Identifies Specific Pressures Favoring Mitochondrial Gene Retention.

    PubMed

    Johnston, Iain G; Williams, Ben P

    2016-02-24

    Since their endosymbiotic origin, mitochondria have lost most of their genes. Although many selective mechanisms underlying the evolution of mitochondrial genomes have been proposed, a data-driven exploration of these hypotheses is lacking, and a quantitatively supported consensus remains absent. We developed HyperTraPS, a methodology coupling stochastic modeling with Bayesian inference, to identify the ordering of evolutionary events and suggest their causes. Using 2015 complete mitochondrial genomes, we inferred evolutionary trajectories of mtDNA gene loss across the eukaryotic tree of life. We find that proteins comprising the structural cores of the electron transport chain are preferentially encoded within mitochondrial genomes across eukaryotes. A combination of high GC content and high protein hydrophobicity is required to explain patterns of mtDNA gene retention; a model that accounts for these selective pressures can also predict the success of artificial gene transfer experiments in vivo. This work provides a general method for data-driven inference of the ordering of evolutionary and progressive events, here identifying the distinct features shaping mitochondrial genomes of present-day species. Copyright © 2016 Elsevier Inc. All rights reserved.

  3. Coalitional game theory as a promising approach to identify candidate autism genes.

    PubMed

    Gupta, Anika; Sun, Min Woo; Paskov, Kelley Marie; Stockham, Nate Tyler; Jung, Jae-Yoon; Wall, Dennis Paul

    2018-01-01

    Despite mounting evidence for the strong role of genetics in the phenotypic manifestation of Autism Spectrum Disorder (ASD), the specific genes responsible for the variable forms of ASD remain undefined. ASD may be best explained by a combinatorial genetic model with varying epistatic interactions across many small effect mutations. Coalitional or cooperative game theory is a technique that studies the combined effects of groups of players, known as coalitions, seeking to identify players who tend to improve the performance--the relationship to a specific disease phenotype--of any coalition they join. This method has been previously shown to boost biologically informative signal in gene expression data but to-date has not been applied to the search for cooperative mutations among putative ASD genes. We describe our approach to highlight genes relevant to ASD using coalitional game theory on alteration data of 1,965 fully sequenced genomes from 756 multiplex families. Alterations were encoded into binary matrices for ASD (case) and unaffected (control) samples, indicating likely gene-disrupting, inherited mutations in altered genes. To determine individual gene contributions given an ASD phenotype, a "player" metric, referred to as the Shapley value, was calculated for each gene in the case and control cohorts. Sixty seven genes were found to have significantly elevated player scores and likely represent significant contributors to the genetic coordination underlying ASD. Using network and cross-study analysis, we found that these genes are involved in biological pathways known to be affected in the autism cases and that a subset directly interact with several genes known to have strong associations to autism. These findings suggest that coalitional game theory can be applied to large-scale genomic data to identify hidden yet influential players in complex polygenic disorders such as autism.
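
    The Shapley value named in this abstract can be illustrated on a toy game. The sketch below computes exact Shapley values by averaging each player's marginal contribution over all orderings, which is only feasible for a handful of players. The characteristic function (number of case samples in which at least one gene of the coalition is altered) and the gene and sample labels are assumptions made for illustration and are not the authors' scoring scheme.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: mean marginal contribution over all orderings."""
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += Fraction(value(with_p)) - Fraction(value(coalition))
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}


# Hypothetical binary alteration data: gene -> set of case samples it is altered in.
altered_in = {"geneA": {1, 2, 3}, "geneB": {3}, "geneC": set()}


def covered_cases(coalition):
    """Toy characteristic function: cases with at least one altered gene."""
    covered = set()
    for gene in coalition:
        covered |= altered_in[gene]
    return len(covered)


print(shapley_values(list(altered_in), covered_cases))
# {'geneA': Fraction(5, 2), 'geneB': Fraction(1, 2), 'geneC': Fraction(0, 1)}
```

    In this toy game the values sum to 3, the number of cases covered by all genes together, which is the efficiency property of the Shapley value.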

  4. The FUN of identifying gene function in bacterial pathogens; insights from Salmonella functional genomics.

    PubMed

    Hammarlöf, Disa L; Canals, Rocío; Hinton, Jay C D

    2013-10-01

    The availability of thousands of genome sequences of bacterial pathogens poses a particular challenge because each genome contains hundreds of genes of unknown function (FUN). How can we easily discover which FUN genes encode important virulence factors? One solution is to combine two different functional genomic approaches. First, transcriptomics identifies bacterial FUN genes that show differential expression during the process of mammalian infection. Second, global mutagenesis identifies individual FUN genes that the pathogen requires to cause disease. The intersection of these datasets can reveal a small set of candidate genes most likely to encode novel virulence attributes. We demonstrate this approach with the Salmonella infection model, and propose that a similar strategy could be used for other bacterial pathogens. Copyright © 2013 Elsevier Ltd. All rights reserved.

  5. Transposon mutagenesis identifies genes that cooperate with mutant Pten in breast cancer progression

    PubMed Central

    Rangel, Roberto; Lee, Song-Choon; Hon-Kim Ban, Kenneth; Guzman-Rojas, Liliana; Mann, Michael B.; Newberg, Justin Y.; McNoe, Leslie A.; Selvanesan, Luxmanan; Ward, Jerrold M.; Rust, Alistair G.; Chin, Kuan-Yew; Black, Michael A.; Jenkins, Nancy A.; Copeland, Neal G.

    2016-01-01

    Triple-negative breast cancer (TNBC) has the worst prognosis of any breast cancer subtype. To better understand the genetic forces driving TNBC, we performed a transposon mutagenesis screen in phosphatase and tensin homolog (Pten) mutant mice and identified 12 candidate trunk drivers and a much larger number of progression genes. Validation studies identified eight TNBC tumor suppressor genes, including the GATA-like transcriptional repressor TRPS1. Down-regulation of TRPS1 in TNBC cells promoted epithelial-to-mesenchymal transition (EMT) by deregulating multiple EMT pathway genes, in addition to increasing the expression of SERPINE1 and SERPINB2 and the subsequent migration, invasion, and metastasis of tumor cells. Transposon mutagenesis has thus provided a better understanding of the genetic forces driving TNBC and discovered genes with potential clinical importance in TNBC. PMID:27849608

  6. Identifying Novel Transcriptional and Epigenetic Features of Nuclear Lamina-associated Genes.

    PubMed

    Wu, Feinan; Yao, Jie

    2017-03-07

    Because a large portion of the mammalian genome is associated with the nuclear lamina (NL), it is interesting to study how native genes residing there are transcribed and regulated. In this study, we report unique transcriptional and epigenetic features of nearly 3,500 NL-associated genes (NL genes). Promoter regions of active NL genes are often excluded from NL-association, suggesting that NL-promoter interactions may repress transcription. Active NL genes with higher RNA polymerase II (Pol II) recruitment levels tend to display Pol II promoter-proximal pausing, while Pol II recruitment and Pol II pausing are not correlated among non-NL genes. At the genome-wide scale, NL-association and H3K27me3 distinguish two large gene classes with low transcriptional activities. Notably, NL-association is anti-correlated with both transcription and active histone mark levels among genes not significantly enriched with H3K9me3 or H3K27me3, suggesting that NL-association may represent a novel gene repression pathway. Interestingly, an NL gene subgroup is not significantly enriched with H3K9me3 or H3K27me3 and is transcribed at higher levels than the rest of NL genes. Furthermore, we identified distal enhancers associated with active NL genes and reported their epigenetic features.

  7. Using reporter gene assays to identify cis regulatory differences between humans and chimpanzees.

    PubMed

    Chabot, Adrien; Shrit, Ralla A; Blekhman, Ran; Gilad, Yoav

    2007-08-01

    Most phenotypic differences between human and chimpanzee are likely to result from differences in gene regulation, rather than changes to protein-coding regions. To date, however, only a handful of human-chimpanzee nucleotide differences leading to changes in gene regulation have been identified. To home in on differences in regulatory elements between human and chimpanzee, we focused on 10 genes that were previously found to be differentially expressed between the two species. We then designed reporter gene assays for the putative human and chimpanzee promoters of the 10 genes. Of seven promoters that we found to be active in human liver cell lines, human and chimpanzee promoters had significantly different activity in four cases, three of which recapitulated the gene expression difference seen in the microarray experiment. For these three genes, we were therefore able to demonstrate that a change in cis influences expression differences between humans and chimpanzees. Moreover, using site-directed mutagenesis on one construct, the promoter for the DDA3 gene, we were able to identify three nucleotides that together lead to a cis regulatory difference between the species. High-throughput application of this approach can provide a map of regulatory element differences between humans and our close evolutionary relatives.

  8. Genome-Wide Association Study Identifies Candidate Genes for Starch Content Regulation in Maize Kernels

    PubMed Central

    Liu, Na; Xue, Yadong; Guo, Zhanyong; Li, Weihua; Tang, Jihua

    2016-01-01

    Kernel starch content is an important trait in maize (Zea mays L.) as it accounts for 65–75% of the dry kernel weight and positively correlates with seed yield. A number of starch synthesis-related genes have been identified in maize in recent years. However, many loci underlying variation in starch content among maize inbred lines still remain to be identified. The current study is a genome-wide association study that used a set of 263 maize inbred lines. In this panel, the average kernel starch content was 66.99%, ranging from 60.60 to 71.58% over the three study years. These inbred lines were genotyped with the SNP50 BeadChip maize array, which is comprised of 56,110 evenly spaced, random SNPs. Population structure was controlled by a mixed linear model (MLM) as implemented in the software package TASSEL. After the statistical analyses, four SNPs were identified as significantly associated with starch content (P ≤ 0.0001), among which one each are located on chromosomes 1 and 5 and two are on chromosome 2. Furthermore, 77 candidate genes associated with starch synthesis were found within the 100-kb intervals containing these four QTLs, and four highly associated genes were within 20-kb intervals of the associated SNPs. Among the four genes, Glucose-1-phosphate adenylyltransferase (APS1; Gene ID GRMZM2G163437) is known as an important regulator of kernel starch content. The identified SNPs, QTLs, and candidate genes may not only be readily used for germplasm improvement by marker-assisted selection in breeding, but can also elucidate the genetic basis of starch content. Further studies on these identified candidate genes may help determine the molecular mechanisms regulating kernel starch content in maize and other important cereal crops. PMID:27512395

  9. Identifying resistance gene analogs associated with resistances to different pathogens in common bean.

    PubMed

    López, Camilo E; Acosta, Iván F; Jara, Carlos; Pedraza, Fabio; Gaitán-Solís, Eliana; Gallego, Gerardo; Beebe, Steve; Tohme, Joe

    2003-01-01

    A polymerase chain reaction approach using degenerate primers that targeted the conserved domains of cloned plant disease resistance genes (R genes) was used to isolate a set of 15 resistance gene analogs (RGAs) from common bean (Phaseolus vulgaris). Eight different classes of RGAs were obtained from nucleotide binding site (NBS)-based primers and seven from not previously described Toll/Interleukin-1 receptor-like (TIR)-based primers. Putative amino acid sequences of RGAs were significantly similar to R genes and contained additional conserved motifs. The NBS-type RGAs were classified in two subgroups according to the expected final residue in the kinase-2 motif. Eleven RGAs were mapped at 19 loci on eight linkage groups of the common bean genetic map constructed at Centro Internacional de Agricultura Tropical. Genetic linkage was shown for eight RGAs with partial resistance to anthracnose, angular leaf spot (ALS) and Bean golden yellow mosaic virus (BGYMV). RGA1 and RGA2 were associated with resistance loci to anthracnose and BGYMV and were part of two clusters of R genes previously described. A new major cluster was detected by RGA7 and explained up to 63.9% of resistance to ALS and has a putative contribution to anthracnose resistance. These results show the usefulness of RGAs as candidate genes to detect and eventually isolate numerous R genes in common bean.

  10. GENE EXPRESSION PROFILING TO IDENTIFY MECHANISMS OF MALE REPRODUCTIVE TOXICITY

    EPA Science Inventory

    Gene Expression Profiling to Identify Mechanisms of Male Reproductive Toxicity
    David J. Dix
    National Health and Environmental Effects Research Laboratory, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA.
    Ab...

  11. Application of identified sensitive physical parameters in reducing the uncertainty of numerical simulation

    NASA Astrophysics Data System (ADS)

    Sun, Guodong; Mu, Mu

    2016-04-01

    An important source of uncertainty, which then causes further uncertainty in numerical simulations, is that residing in the parameters describing physical processes in numerical models. There are many physical parameters in numerical models in the atmospheric and oceanic sciences, and it would cost a great deal to reduce uncertainties in all physical parameters. Therefore, finding a subset of these parameters, which are relatively more sensitive and important parameters, and reducing the errors in the physical parameters in this subset would be a far more efficient way to reduce the uncertainties involved in simulations. In this context, we present a new approach based on the conditional nonlinear optimal perturbation related to parameter (CNOP-P) method. The approach provides a framework to ascertain the subset of those relatively more sensitive and important parameters among the physical parameters. The Lund-Potsdam-Jena (LPJ) dynamical global vegetation model was utilized to test the validity of the new approach. The results imply that nonlinear interactions among parameters play a key role in the uncertainty of numerical simulations in arid and semi-arid regions of China compared to those in northern, northeastern and southern China. The uncertainties in the numerical simulations were reduced considerably by reducing the errors of the subset of relatively more sensitive and important parameters. The results demonstrate that our approach not only offers a new route to identify relatively more sensitive and important physical parameters but also that it is viable to then apply "target observations" to reduce the uncertainties in model parameters.

  12. A data mining paradigm for identifying key factors in biological processes using gene expression data.

    PubMed

    Li, Jin; Zheng, Le; Uchiyama, Akihiko; Bin, Lianghua; Mauro, Theodora M; Elias, Peter M; Pawelczyk, Tadeusz; Sakowicz-Burkiewicz, Monika; Trzeciak, Magdalena; Leung, Donald Y M; Morasso, Maria I; Yu, Peng

    2018-06-13

    A large volume of biological data is being generated for studying mechanisms of various biological processes. These precious data enable large-scale computational analyses to gain biological insights. However, it remains a challenge to mine the data efficiently for knowledge discovery. The heterogeneity of these data makes it difficult to consistently integrate them, slowing down the process of biological discovery. We introduce a data processing paradigm to identify key factors in biological processes via systematic collection of gene expression datasets, primary analysis of data, and evaluation of consistent signals. To demonstrate its effectiveness, our paradigm was applied to epidermal development and identified many genes that play a potential role in this process. Besides the known epidermal development genes, a substantial proportion of the identified genes are still not supported by gain- or loss-of-function studies, yielding many novel genes for future studies. Among them, we selected a top gene for loss-of-function experimental validation and confirmed its function in epidermal differentiation, proving the ability of this paradigm to identify new factors in biological processes. In addition, this paradigm revealed many key genes in cold-induced thermogenesis using data from cold-challenged tissues, demonstrating its generalizability. This paradigm can lead to fruitful results for studying molecular mechanisms in an era of explosive accumulation of publicly available biological data.

  13. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes.

    PubMed

    Hu, H; Haas, S A; Chelly, J; Van Esch, H; Raynaud, M; de Brouwer, A P M; Weinert, S; Froyen, G; Frints, S G M; Laumonnier, F; Zemojtel, T; Love, M I; Richard, H; Emde, A-K; Bienek, M; Jensen, C; Hambrock, M; Fischer, U; Langnick, C; Feldkamp, M; Wissink-Lindhout, W; Lebrun, N; Castelnau, L; Rucci, J; Montjean, R; Dorseuil, O; Billuart, P; Stuhlmann, T; Shaw, M; Corbett, M A; Gardner, A; Willis-Owen, S; Tan, C; Friend, K L; Belet, S; van Roozendaal, K E P; Jimenez-Pocquet, M; Moizard, M-P; Ronce, N; Sun, R; O'Keeffe, S; Chenna, R; van Bömmel, A; Göke, J; Hackett, A; Field, M; Christie, L; Boyle, J; Haan, E; Nelson, J; Turner, G; Baynam, G; Gillessen-Kaesbach, G; Müller, U; Steinberger, D; Budny, B; Badura-Stronka, M; Latos-Bieleńska, A; Ousager, L B; Wieacker, P; Rodríguez Criado, G; Bondeson, M-L; Annerén, G; Dufke, A; Cohen, M; Van Maldergem, L; Vincent-Delorme, C; Echenne, B; Simon-Bouy, B; Kleefstra, T; Willemsen, M; Fryns, J-P; Devriendt, K; Ullmann, R; Vingron, M; Wrogemann, K; Wienker, T F; Tzschach, A; van Bokhoven, H; Gecz, J; Jentsch, T J; Chen, W; Ropers, H-H; Kalscheuer, V M

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or loci are yet to be identified. Here, we have investigated 405 unresolved families with XLID. We employed massively parallel sequencing of all X-chromosome exons in the index males. The majority of these males were previously tested negative for copy number variations and for mutations in a subset of known XLID genes by Sanger sequencing. In total, 745 X-chromosomal genes were screened. After stringent filtering, a total of 1297 non-recurrent exonic variants remained for prioritization. Co-segregation analysis of potential clinically relevant changes revealed that 80 families (20%) carried pathogenic variants in established XLID genes. In 19 families, we detected likely causative protein truncating and missense variants in 7 novel and validated XLID genes (CLCN4, CNKSR2, FRMPD4, KLHL15, LAS1L, RLIM and USP27X) and potentially deleterious variants in 2 novel candidate XLID genes (CDK16 and TAF1). We show that the CLCN4 and CNKSR2 variants impair protein functions as indicated by electrophysiological studies and altered differentiation of cultured primary neurons from Clcn4(-/-) mice or after mRNA knock-down. The newly identified and candidate XLID proteins belong to pathways and networks with established roles in cognitive function and intellectual disability in particular. We suggest that systematic sequencing of all X-chromosomal genes in a cohort of patients with genetic evidence for X-chromosome locus involvement may resolve up to 58% of Fragile X-negative cases.

  14. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes

    PubMed Central

    Hu, H; Haas, S A; Chelly, J; Van Esch, H; Raynaud, M; de Brouwer, A P M; Weinert, S; Froyen, G; Frints, S G M; Laumonnier, F; Zemojtel, T; Love, M I; Richard, H; Emde, A-K; Bienek, M; Jensen, C; Hambrock, M; Fischer, U; Langnick, C; Feldkamp, M; Wissink-Lindhout, W; Lebrun, N; Castelnau, L; Rucci, J; Montjean, R; Dorseuil, O; Billuart, P; Stuhlmann, T; Shaw, M; Corbett, M A; Gardner, A; Willis-Owen, S; Tan, C; Friend, K L; Belet, S; van Roozendaal, K E P; Jimenez-Pocquet, M; Moizard, M-P; Ronce, N; Sun, R; O'Keeffe, S; Chenna, R; van Bömmel, A; Göke, J; Hackett, A; Field, M; Christie, L; Boyle, J; Haan, E; Nelson, J; Turner, G; Baynam, G; Gillessen-Kaesbach, G; Müller, U; Steinberger, D; Budny, B; Badura-Stronka, M; Latos-Bieleńska, A; Ousager, L B; Wieacker, P; Rodríguez Criado, G; Bondeson, M-L; Annerén, G; Dufke, A; Cohen, M; Van Maldergem, L; Vincent-Delorme, C; Echenne, B; Simon-Bouy, B; Kleefstra, T; Willemsen, M; Fryns, J-P; Devriendt, K; Ullmann, R; Vingron, M; Wrogemann, K; Wienker, T F; Tzschach, A; van Bokhoven, H; Gecz, J; Jentsch, T J; Chen, W; Ropers, H-H; Kalscheuer, V M

    2016-01-01

    X-linked intellectual disability (XLID) is a clinically and genetically heterogeneous disorder. During the past two decades in excess of 100 X-chromosome ID genes have been identified. Yet, a large number of families mapping to the X-chromosome remained unresolved suggesting that more XLID genes or loci are yet to be identified. Here, we have investigated 405 unresolved families with XLID. We employed massively parallel sequencing of all X-chromosome exons in the index males. The majority of these males were previously tested negative for copy number variations and for mutations in a subset of known XLID genes by Sanger sequencing. In total, 745 X-chromosomal genes were screened. After stringent filtering, a total of 1297 non-recurrent exonic variants remained for prioritization. Co-segregation analysis of potential clinically relevant changes revealed that 80 families (20%) carried pathogenic variants in established XLID genes. In 19 families, we detected likely causative protein truncating and missense variants in 7 novel and validated XLID genes (CLCN4, CNKSR2, FRMPD4, KLHL15, LAS1L, RLIM and USP27X) and potentially deleterious variants in 2 novel candidate XLID genes (CDK16 and TAF1). We show that the CLCN4 and CNKSR2 variants impair protein functions as indicated by electrophysiological studies and altered differentiation of cultured primary neurons from Clcn4−/− mice or after mRNA knock-down. The newly identified and candidate XLID proteins belong to pathways and networks with established roles in cognitive function and intellectual disability in particular. We suggest that systematic sequencing of all X-chromosomal genes in a cohort of patients with genetic evidence for X-chromosome locus involvement may resolve up to 58% of Fragile X-negative cases. PMID:25644381

  15. A Genome-Wide Knockout Screen to Identify Genes Involved in Acquired Carboplatin Resistance

    DTIC Science & Technology

    2016-07-01

    library screen to identify genes that when knocked out render human ovarian cells > 2.5-fold resistant to CBDCA; 2) Validate the ability of...a GeCKOv2 library screen to identify genes that when knocked out render human ovarian cells > 2.5-fold resistant to CBDCA; 2) validate the ability of...resistance in either cell lines or clinical samples. The CRISPR-Cas9 technology now provides us with a major new tool to introduce knock out mutations

  16. A computational approach to identify cellular heterogeneity and tissue-specific gene regulatory networks.

    PubMed

    Jambusaria, Ankit; Klomp, Jeff; Hong, Zhigang; Rafii, Shahin; Dai, Yang; Malik, Asrar B; Rehman, Jalees

    2018-06-07

    The heterogeneity of cells across tissue types represents a major challenge for studying biological mechanisms as well as for therapeutic targeting of distinct tissues. Computational prediction of tissue-specific gene regulatory networks may provide important insights into the mechanisms underlying the cellular heterogeneity of cells in distinct organs and tissues. Using three pathway analysis techniques, gene set enrichment analysis (GSEA), parametric analysis of gene set enrichment (PGSEA), alongside our novel model (HeteroPath), which assesses heterogeneously upregulated and downregulated genes within the context of pathways, we generated distinct tissue-specific gene regulatory networks. We analyzed gene expression data derived from freshly isolated heart, brain, and lung endothelial cells and populations of neurons in the hippocampus, cingulate cortex, and amygdala. In both datasets, we found that HeteroPath segregated the distinct cellular populations by identifying regulatory pathways that were not identified by GSEA or PGSEA. Using simulated datasets, HeteroPath demonstrated robustness that was comparable to what was seen using existing gene set enrichment methods. Furthermore, we generated tissue-specific gene regulatory networks involved in vascular heterogeneity and neuronal heterogeneity by performing motif enrichment of the heterogeneous genes identified by HeteroPath and linking the enriched motifs to regulatory transcription factors in the ENCODE database. HeteroPath assesses contextual bidirectional gene expression within pathways and thus allows for transcriptomic assessment of cellular heterogeneity. Unraveling tissue-specific heterogeneity of gene expression can lead to a better understanding of the molecular underpinnings of tissue-specific phenotypes.

  17. Identifying candidate driver genes by integrative ovarian cancer genomics data

    NASA Astrophysics Data System (ADS)

    Lu, Xinguo; Lu, Jibo

    2017-08-01

    Integrative analysis of molecular mechanics underlying cancer can distinguish interactions that cannot be revealed based on one kind of data for the appropriate diagnosis and treatment of cancer patients. Tumor samples exhibit heterogeneity in omics data, such as somatic mutations, copy number variations (CNVs), gene expression profiles and so on. In this paper we combined gene co-expression modules and mutation modulators separately in tumor patients to obtain the candidate driver genes for resistant and sensitive tumor from the heterogeneous data. The final list of modulators identified are well known in biological processes associated with ovarian cancer, such as CCL17, CACTIN, CCL16, CCL22, APOB, KDF1, CCL11, HNF1B, LRG1, MED1 and so on, which can help to facilitate the discovery of biomarkers, molecular diagnostics, and drug discovery.

  18. Identifying novel genes and chemicals related to nasopharyngeal cancer in a heterogeneous network.

    PubMed

    Li, Zhandong; An, Lifeng; Li, Hao; Wang, ShaoPeng; Zhou, You; Yuan, Fei; Li, Lin

    2016-05-05

    Nasopharyngeal cancer or nasopharyngeal carcinoma (NPC) is the most common cancer originating in the nasopharynx. The factors that induce nasopharyngeal cancer are still not clear. Additional information about the chemicals or genes related to nasopharyngeal cancer will promote a better understanding of the pathogenesis of this cancer and the factors that induce it. Thus, a computational method NPC-RGCP was proposed in this study to identify the possible relevant chemicals and genes based on the presently known chemicals and genes related to nasopharyngeal cancer. To extensively utilize the functional associations between proteins and chemicals, a heterogeneous network was constructed based on interactions of proteins and chemicals. The NPC-RGCP included two stages: the searching stage and the screening stage. The former stage is for finding new possible genes and chemicals in the heterogeneous network, while the latter stage is for screening and removing false discoveries and selecting the core genes and chemicals. As a result, five putative genes, CXCR3, IRF1, CDK1, GSTP1, and CDH2, and seven putative chemicals, iron, propionic acid, dimethyl sulfoxide, isopropanol, erythrose 4-phosphate, β-D-Fructose 6-phosphate, and flavin adenine dinucleotide, were identified by NPC-RGCP. Extensive analyses provided confirmation that the putative genes and chemicals have significant associations with nasopharyngeal cancer.

  19. Identifying novel genes and chemicals related to nasopharyngeal cancer in a heterogeneous network

    PubMed Central

    Li, Zhandong; An, Lifeng; Li, Hao; Wang, ShaoPeng; Zhou, You; Yuan, Fei; Li, Lin

    2016-01-01

    Nasopharyngeal cancer or nasopharyngeal carcinoma (NPC) is the most common cancer originating in the nasopharynx. The factors that induce nasopharyngeal cancer are still not clear. Additional information about the chemicals or genes related to nasopharyngeal cancer will promote a better understanding of the pathogenesis of this cancer and the factors that induce it. Thus, a computational method NPC-RGCP was proposed in this study to identify the possible relevant chemicals and genes based on the presently known chemicals and genes related to nasopharyngeal cancer. To extensively utilize the functional associations between proteins and chemicals, a heterogeneous network was constructed based on interactions of proteins and chemicals. The NPC-RGCP included two stages: the searching stage and the screening stage. The former stage is for finding new possible genes and chemicals in the heterogeneous network, while the latter stage is for screening and removing false discoveries and selecting the core genes and chemicals. As a result, five putative genes, CXCR3, IRF1, CDK1, GSTP1, and CDH2, and seven putative chemicals, iron, propionic acid, dimethyl sulfoxide, isopropanol, erythrose 4-phosphate, β-D-Fructose 6-phosphate, and flavin adenine dinucleotide, were identified by NPC-RGCP. Extensive analyses provided confirmation that the putative genes and chemicals have significant associations with nasopharyngeal cancer. PMID:27149165

  20. A whole-blood transcriptome meta-analysis identifies gene expression signatures of cigarette smoking

    PubMed Central

    Huan, Tianxiao; Joehanes, Roby; Schurmann, Claudia; Schramm, Katharina; Pilling, Luke C.; Peters, Marjolein J.; Mägi, Reedik; DeMeo, Dawn; O'Connor, George T.; Ferrucci, Luigi; Teumer, Alexander; Homuth, Georg; Biffar, Reiner; Völker, Uwe; Herder, Christian; Waldenberger, Melanie; Peters, Annette; Zeilinger, Sonja; Metspalu, Andres; Hofman, Albert; Uitterlinden, André G.; Hernandez, Dena G.; Singleton, Andrew B.; Bandinelli, Stefania; Munson, Peter J.; Lin, Honghuang; Benjamin, Emelia J.; Esko, Tõnu; Grabe, Hans J.; Prokisch, Holger; van Meurs, Joyce B.J.; Melzer, David; Levy, Daniel

    2016-01-01

    Cigarette smoking is a leading modifiable cause of death worldwide. We hypothesized that cigarette smoking induces extensive transcriptomic changes that lead to target-organ damage and smoking-related diseases. We performed a meta-analysis of transcriptome-wide gene expression using whole blood-derived RNA from 10,233 participants of European ancestry in six cohorts (including 1421 current and 3955 former smokers) to identify associations between smoking and altered gene expression levels. At a false discovery rate (FDR) <0.1, we identified 1270 differentially expressed genes in current vs. never smokers, and 39 genes in former vs. never smokers. Expression levels of 12 genes remained elevated up to 30 years after smoking cessation, suggesting that the molecular consequence of smoking may persist for decades. Gene ontology analysis revealed enrichment of smoking-related genes for activation of platelets and lymphocytes, immune response, and apoptosis. Many of the top smoking-related differentially expressed genes, including LRRN3 and GPR15, have DNA methylation loci in promoter regions that were recently reported to be hypomethylated among smokers. By linking differential gene expression with smoking-related disease phenotypes, we demonstrated that stroke and pulmonary function show enrichment for smoking-related gene expression signatures. Mediation analysis revealed the expression of several genes (e.g. ALAS2) to be putative mediators of the associations between smoking and inflammatory biomarkers (IL6 and C-reactive protein levels). Our transcriptomic study provides potential insights into the effects of cigarette smoking on gene expression in whole blood and their relations to smoking-related diseases. The results of such analyses may highlight attractive targets for treating or preventing smoking-related health effects. PMID:28158590
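
    The record above reports genes selected at a false discovery rate below 0.1 but does not say which FDR procedure was used. As a rough illustration only, the sketch below shows the widely used Benjamini-Hochberg step-up procedure; the function name, the example p-values, and the choice of this procedure are assumptions, not details taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Return one boolean per p-value: True if rejected at FDR level alpha.

    Illustrative Benjamini-Hochberg step-up procedure (an assumption;
    the abstract does not name the FDR method it used).
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Reject every hypothesis whose rank is at most max_rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical per-gene p-values, for demonstration only.
example_p = [0.001, 0.02, 0.04, 0.5, 0.9]
example_calls = benjamini_hochberg(example_p, alpha=0.1)
```

    In this toy input the three smallest p-values fall under their rank-scaled thresholds, so only those three hypothetical genes would be called differentially expressed.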

  1. Gene-environment interaction involving recently identified colorectal cancer susceptibility loci

    PubMed Central

    Kantor, Elizabeth D.; Hutter, Carolyn M.; Minnier, Jessica; Berndt, Sonja I.; Brenner, Hermann; Caan, Bette J.; Campbell, Peter T.; Carlson, Christopher S.; Casey, Graham; Chan, Andrew T.; Chang-Claude, Jenny; Chanock, Stephen J.; Cotterchio, Michelle; Du, Mengmeng; Duggan, David; Fuchs, Charles S.; Giovannucci, Edward L.; Gong, Jian; Harrison, Tabitha A.; Hayes, Richard B.; Henderson, Brian E.; Hoffmeister, Michael; Hopper, John L.; Jenkins, Mark A.; Jiao, Shuo; Kolonel, Laurence N.; Le Marchand, Loic; Lemire, Mathieu; Ma, Jing; Newcomb, Polly A.; Ochs-Balcom, Heather M.; Pflugeisen, Bethann M.; Potter, John D.; Rudolph, Anja; Schoen, Robert E.; Seminara, Daniela; Slattery, Martha L.; Stelling, Deanna L.; Thomas, Fridtjof; Thornquist, Mark; Ulrich, Cornelia M.; Warnick, Greg S.; Zanke, Brent W.; Peters, Ulrike; Hsu, Li; White, Emily

    2014-01-01

    BACKGROUND Genome-wide association studies have identified several single nucleotide polymorphisms (SNPs) that are associated with risk of colorectal cancer (CRC). Prior research has evaluated the presence of gene-environment interaction involving the first 10 identified susceptibility loci, but little work has been conducted on interaction involving SNPs at recently identified susceptibility loci, including: rs10911251, rs6691170, rs6687758, rs11903757, rs10936599, rs647161, rs1321311, rs719725, rs1665650, rs3824999, rs7136702, rs11169552, rs59336, rs3217810, rs4925386, and rs2423279. METHODS Data on 9160 cases and 9280 controls from the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO) and Colon Cancer Family Registry (CCFR) were used to evaluate the presence of interaction involving the above-listed SNPs and sex, body mass index (BMI), alcohol consumption, smoking, aspirin use, post-menopausal hormone (PMH) use, as well as intake of dietary calcium, dietary fiber, dietary folate, red meat, processed meat, fruit, and vegetables. Interaction was evaluated using a fixed-effects meta-analysis of an efficient Empirical Bayes estimator, and permutation was used to account for multiple comparisons. RESULTS None of the permutation-adjusted p-values reached statistical significance. CONCLUSIONS The associations between recently identified genetic susceptibility loci and CRC are not strongly modified by sex, BMI, alcohol, smoking, aspirin, PMH use, and various dietary factors. IMPACT Results suggest no evidence of strong gene-environment interactions involving the recently identified 16 susceptibility loci for CRC taken one at a time. PMID:24994789
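
    The record above pools per-study interaction estimates with a fixed-effects meta-analysis. The sketch below shows generic inverse-variance fixed-effects pooling, assuming each study supplies an effect estimate and its standard error; it does not reproduce the Empirical Bayes estimator or the permutation adjustment described in the abstract, and all names and numbers are hypothetical.

```python
import math


def fixed_effect_meta(estimates, std_errors):
    """Inverse-variance fixed-effects pooling of per-study estimates.

    Returns (pooled_estimate, pooled_standard_error, two_sided_p).
    Generic textbook formulation, not the study's own estimator.
    """
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)

    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Two-sided p-value from a standard normal reference distribution.
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, p_value


# Hypothetical log-odds interaction estimates from two studies.
pooled, pooled_se, p_value = fixed_effect_meta([0.1, 0.3], [0.1, 0.1])
```

    With equal standard errors the pooled estimate is simply the mean of the two inputs; studies with smaller standard errors would otherwise dominate the weighted average.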

  2. Identifying candidate genes affecting developmental time in Drosophila melanogaster: pervasive pleiotropy and gene-by-environment interaction

    PubMed Central

    Mensch, Julián; Lavagnino, Nicolás; Carreira, Valeria Paula; Massaldi, Ana; Hasson, Esteban; Fanara, Juan José

    2008-01-01

    Background Understanding the genetic architecture of ecologically relevant adaptive traits requires the contribution of developmental and evolutionary biology. The time to reach the age of reproduction is a complex life history trait commonly known as developmental time. In particular, in holometabolous insects that occupy ephemeral habitats, like fruit flies, the impact of developmental time on fitness is further exaggerated. The present work is one of the first systematic studies of the genetic basis of developmental time, in which we also evaluate the impact of environmental variation on the expression of the trait. Results We analyzed 179 co-isogenic single P[GT1]-element insertion lines of Drosophila melanogaster to identify novel genes affecting developmental time in flies reared at 25°C. Sixty percent of the lines showed a heterochronic phenotype, suggesting that a large number of genes affect this trait. Mutant lines for the genes Merlin and Karl showed the most extreme phenotypes exhibiting a developmental time reduction and increase, respectively, of over 2 days and 4 days relative to the control (a co-isogenic P-element insertion free line). In addition, a subset of 42 lines selected at random from the initial set of 179 lines was screened at 17°C. Interestingly, the gene-by-environment interaction accounted for 52% of total phenotypic variance. Plastic reaction norms were found for a large number of developmental time candidate genes. Conclusion We identified components of several integrated time-dependent pathways affecting egg-to-adult developmental time in Drosophila. At the same time, we also show that many heterochronic phenotypes may arise from changes in genes involved in several developmental mechanisms that do not explicitly control the timing of specific events. We also demonstrate that many developmental time genes have pleiotropic effects on several adult traits and that the action of most of them is sensitive to temperature during

  3. The application of artificial intelligence to microarray data: identification of a novel gene signature to identify bladder cancer progression.

    PubMed

    Catto, James W F; Abbod, Maysam F; Wild, Peter J; Linkens, Derek A; Pilarsky, Christian; Rehman, Ishtiaq; Rosario, Derek J; Denzinger, Stefan; Burger, Maximilian; Stoehr, Robert; Knuechel, Ruth; Hartmann, Arndt; Hamdy, Freddie C

    2010-03-01

    New methods for identifying bladder cancer (BCa) progression are required. Gene expression microarrays can reveal insights into disease biology and identify novel biomarkers. However, these experiments produce large datasets that are difficult to interpret. To develop a novel method of microarray analysis combining two forms of artificial intelligence (AI): neurofuzzy modelling (NFM) and artificial neural networks (ANN) and validate it in a BCa cohort. We used AI and statistical analyses to identify progression-related genes in a microarray dataset (n=66 tumours, n=2800 genes). The AI-selected genes were then investigated in a second cohort (n=262 tumours) using immunohistochemistry. We compared the accuracy of AI and statistical approaches to identify tumour progression. AI identified 11 progression-associated genes (odds ratio [OR]: 0.70; 95% confidence interval [CI], 0.56-0.87; p=0.0004), and these were more discriminate than genes chosen using statistical analyses (OR: 1.24; 95% CI, 0.96-1.60; p=0.09). The expression of six AI-selected genes (LIG3, FAS, KRT18, ICAM1, DSG2, and BRCA2) was determined using commercial antibodies and successfully identified tumour progression (concordance index: 0.66; log-rank test: p=0.01). AI-selected genes were more discriminate than pathologic criteria at determining progression (Cox multivariate analysis: p=0.01). Limitations include the use of statistical correlation to identify 200 genes for AI analysis and that we did not compare regression identified genes with immunohistochemistry. AI and statistical analyses use different techniques of inference to determine gene-phenotype associations and identify distinct prognostic gene signatures that are equally valid. We have identified a prognostic gene signature whose members reflect a variety of carcinogenic pathways that could identify progression in non-muscle-invasive BCa. 2009 European Association of Urology. Published by Elsevier B.V. All rights reserved.

  4. Comprehensive Ex Vivo Transposon Mutagenesis Identifies Genes That Promote Growth Factor Independence and Leukemogenesis.

    PubMed

    Guo, Yabin; Updegraff, Barrett L; Park, Sunho; Durakoglugil, Deniz; Cruz, Victoria H; Maddux, Sarah; Hwang, Tae Hyun; O'Donnell, Kathryn A

    2016-02-15

    Aberrant signaling through cytokine receptors and their downstream signaling pathways is a major oncogenic mechanism underlying hematopoietic malignancies. To better understand how these pathways become pathologically activated and to potentially identify new drivers of hematopoietic cancers, we developed a high-throughput functional screening approach using ex vivo mutagenesis with the Sleeping Beauty transposon. We analyzed over 1,100 transposon-mutagenized pools of Ba/F3 cells, an IL3-dependent pro-B-cell line, which acquired cytokine independence and tumor-forming ability. Recurrent transposon insertions could be mapped to genes in the JAK/STAT and MAPK pathways, confirming the ability of this strategy to identify known oncogenic components of cytokine signaling pathways. In addition, recurrent insertions were identified in a large set of genes that have been found to be mutated in leukemia or associated with survival, but were not previously linked to the JAK/STAT or MAPK pathways nor shown to functionally contribute to leukemogenesis. Forced expression of these novel genes resulted in IL3-independent growth in vitro and tumorigenesis in vivo, validating this mutagenesis-based approach for identifying new genes that promote cytokine signaling and leukemogenesis. Therefore, our findings provide a broadly applicable approach for classifying functionally relevant genes in diverse malignancies and offer new insights into the impact of cytokine signaling on leukemia development. ©2015 American Association for Cancer Research.

  5. The Search for Autism Disease Genes

    ERIC Educational Resources Information Center

    Wassink, Thomas H.; Brzustowicz, Linda M.; Bartlett, Christopher W.; Szatmari, Peter

    2004-01-01

    Autism is a heritable disorder characterized by phenotypic and genetic complexity. This review begins by surveying current linkage, gene association, and cytogenetic studies performed with the goal of identifying autism disease susceptibility variants. Though numerous linkages and associations have been identified, they tend to diminish upon…

  6. Selection on plant male function genes identifies candidates for reproductive isolation of yellow monkeyflowers.

    PubMed

    Aagaard, Jan E; George, Renee D; Fishman, Lila; Maccoss, Michael J; Swanson, Willie J

    2013-01-01

    Understanding the genetic basis of reproductive isolation promises insight into speciation and the origins of biological diversity. While progress has been made in identifying genes underlying barriers to reproduction that function after fertilization (post-zygotic isolation), we know much less about earlier acting pre-zygotic barriers. Of particular interest are barriers involved in mating and fertilization that can evolve extremely rapidly under sexual selection, suggesting they may play a prominent role in the initial stages of reproductive isolation. A significant challenge to the field of speciation genetics is developing new approaches for identification of candidate genes underlying these barriers, particularly among non-traditional model systems. We employ powerful proteomic and genomic strategies to study the genetic basis of conspecific pollen precedence, an important component of pre-zygotic reproductive isolation among yellow monkeyflowers (Mimulus spp.) resulting from male pollen competition. We use isotopic labeling in combination with shotgun proteomics to identify more than 2,000 male function (pollen tube) proteins within maternal reproductive structures (styles) of M. guttatus flowers where pollen competition occurs. We then sequence array-captured pollen tube exomes from a large outcrossing population of M. guttatus, and identify those genes with evidence of selective sweeps or balancing selection consistent with their role in pollen competition. We also test for evidence of positive selection on these genes more broadly across yellow monkeyflowers, because a signal of adaptive divergence is a common feature of genes causing reproductive isolation. Together the molecular evolution studies identify 159 pollen tube proteins that are candidate genes for conspecific pollen precedence. Our work demonstrates how powerful proteomic and genomic tools can be readily adapted to non-traditional model systems, allowing for genome-wide screens towards the

  7. Comparative Transcriptional Profiling of the Axolotl Limb Identifies a Tripartite Regeneration-Specific Gene Program

    PubMed Central

    Knapp, Dunja; Schulz, Herbert; Rascon, Cynthia Alexander; Volkmer, Michael; Scholz, Juliane; Nacu, Eugen; Le, Mu; Novozhilov, Sergey; Tazaki, Akira; Protze, Stephanie; Jacob, Tina; Hubner, Norbert; Habermann, Bianca; Tanaka, Elly M.

    2013-01-01

    Understanding how the limb blastema is established after the initial wound healing response is an important aspect of regeneration research. Here we performed parallel expression profile time courses of healing lateral wounds versus amputated limbs in axolotl. This comparison between wound healing and regeneration allowed us to identify amputation-specific genes. By clustering the expression profiles of these samples, we could detect three distinguishable phases of gene expression – early wound healing followed by a transition-phase leading to establishment of the limb development program, which correspond to the three phases of limb regeneration that had been defined by morphological criteria. By focusing on the transition-phase, we identified 93 strictly amputation-associated genes many of which are implicated in oxidative-stress response, chromatin modification, epithelial development or limb development. We further classified the genes based on whether they were or were not significantly expressed in the developing limb bud. The specific localization of 53 selected candidates within the blastema was investigated by in situ hybridization. In summary, we identified a set of genes that are expressed specifically during regeneration and are therefore, likely candidates for the regulation of blastema formation. PMID:23658691

  8. Genome-wide gene by lead exposure interaction analysis identifies UNC5D as a candidate gene for neurodevelopment.

    PubMed

    Wang, Zhaoxi; Claus Henn, Birgit; Wang, Chaolong; Wei, Yongyue; Su, Li; Sun, Ryan; Chen, Han; Wagner, Peter J; Lu, Quan; Lin, Xihong; Wright, Robert; Bellinger, David; Kile, Molly; Mazumdar, Maitreyi; Tellez-Rojo, Martha Maria; Schnaas, Lourdes; Christiani, David C

    2017-07-28

    Neurodevelopment is a complex process involving both genetic and environmental factors. Prenatal exposure to lead (Pb) has been associated with lower performance on neurodevelopmental tests. Adverse neurodevelopmental outcomes are more frequent and/or more severe when toxic exposures interact with genetic susceptibility. To explore possible loci associated with increased susceptibility to prenatal Pb exposure, we performed a genome-wide gene-environment interaction study (GWIS) in young children from Mexico (n = 390) and Bangladesh (n = 497). Prenatal Pb exposure was estimated by cord blood Pb concentration. Neurodevelopment was assessed using the Bayley Scales of Infant Development. We identified a locus on chromosome 8, containing UNC5D, and demonstrated evidence of its genome-wide significance with mental composite scores (rs9642758, p_meta = 4.35 × 10⁻⁶). Within this locus, the joint effects of two independent single nucleotide polymorphisms (SNPs, rs9642758 and rs10503970) had a p-value of 4.38 × 10⁻⁹ for mental composite scores. Correlating GWIS results with in vitro transcriptomic profiles identified one common gene, SLC1A5, which is involved in synaptic function, neuronal development, and excitotoxicity. Further analysis revealed interconnected interactions that formed a large network of 52 genes enriched with oxidative stress genes and neurodevelopmental genes. Our findings suggest that certain genetic polymorphisms within/near genes relevant to neurodevelopment might modify the toxic effects of Pb exposure via oxidative stress.

  9. Preferential Allele Expression Analysis Identifies Shared Germline and Somatic Driver Genes in Advanced Ovarian Cancer

    PubMed Central

    Halabi, Najeeb M.; Martinez, Alejandra; Al-Farsi, Halema; Mery, Eliane; Puydenus, Laurence; Pujol, Pascal; Khalak, Hanif G.; McLurcan, Cameron; Ferron, Gwenael; Querleu, Denis; Al-Azwani, Iman; Al-Dous, Eman; Mohamoud, Yasmin A.; Malek, Joel A.; Rafii, Arash

    2016-01-01

    Identifying genes where a variant allele is preferentially expressed in tumors could lead to a better understanding of cancer biology and optimization of targeted therapy. However, tumor sample heterogeneity complicates standard approaches for detecting preferential allele expression. We therefore developed a novel approach combining genome and transcriptome sequencing data from the same sample that corrects for sample heterogeneity and identifies significant preferentially expressed alleles. We applied this analysis to epithelial ovarian cancer samples consisting of matched primary ovary and peritoneum and lymph node metastasis. We find that preferentially expressed variant alleles include germline and somatic variants, are shared at a relatively high frequency between patients, and are in gene networks known to be involved in cancer processes. Analysis at a patient level identifies patient-specific preferentially expressed alleles in genes that are targets for known drugs. Analysis at a site level identifies patterns of site specific preferential allele expression with similar pathways being impacted in the primary and metastasis sites. We conclude that genes with preferentially expressed variant alleles can act as cancer drivers and that targeting those genes could lead to new therapeutic strategies. PMID:26735499

  10. An Optimal Mean Based Block Robust Feature Extraction Method to Identify Colorectal Cancer Genes with Integrated Data.

    PubMed

    Liu, Jian; Cheng, Yuhu; Wang, Xuesong; Zhang, Lin; Liu, Hui

    2017-08-17

    It is urgent to diagnose colorectal cancer in the early stage. Some feature genes which are important to colorectal cancer development have been identified. However, for the early stage of colorectal cancer, less is known about the identity of specific cancer genes that are associated with advanced clinical stage. In this paper, we conducted a feature extraction method named Optimal Mean based Block Robust Feature Extraction method (OMBRFE) to identify feature genes associated with advanced colorectal cancer in clinical stage by using the integrated colorectal cancer data. Firstly, based on the optimal mean and L2,1-norm, a novel feature extraction method called Optimal Mean based Robust Feature Extraction method (OMRFE) is proposed to identify feature genes. Then the OMBRFE method which introduces the block ideology into OMRFE method is put forward to process the colorectal cancer integrated data which includes multiple genomic data: copy number alterations, somatic mutations, methylation expression alteration, as well as gene expression changes. Experimental results demonstrate that the OMBRFE is more effective than previous methods in identifying the feature genes. Moreover, genes identified by OMBRFE are verified to be closely associated with advanced colorectal cancer in clinical stage.

  11. Genes contributing to the development of alcoholism: an overview.

    PubMed

    Edenberg, Howard J

    2012-01-01

    Genetic factors (i.e., variations in specific genes) account for a substantial portion of the risk for alcoholism. However, identifying those genes and the specific variations involved is challenging. Researchers have used both case-control and family studies to identify genes related to alcoholism risk. In addition, different strategies such as candidate gene analyses and genome-wide association studies have been used. The strongest effects have been found for specific variants of genes that encode two enzymes involved in alcohol metabolism: alcohol dehydrogenase and aldehyde dehydrogenase. Accumulating evidence indicates that variations in numerous other genes have smaller but measurable effects.

  12. Exome sequencing in amyotrophic lateral sclerosis identifies risk genes and pathways.

    PubMed

    Cirulli, Elizabeth T; Lasseigne, Brittany N; Petrovski, Slavé; Sapp, Peter C; Dion, Patrick A; Leblond, Claire S; Couthouis, Julien; Lu, Yi-Fan; Wang, Quanli; Krueger, Brian J; Ren, Zhong; Keebler, Jonathan; Han, Yujun; Levy, Shawn E; Boone, Braden E; Wimbish, Jack R; Waite, Lindsay L; Jones, Angela L; Carulli, John P; Day-Williams, Aaron G; Staropoli, John F; Xin, Winnie W; Chesi, Alessandra; Raphael, Alya R; McKenna-Yasek, Diane; Cady, Janet; Vianney de Jong, J M B; Kenna, Kevin P; Smith, Bradley N; Topp, Simon; Miller, Jack; Gkazi, Athina; Al-Chalabi, Ammar; van den Berg, Leonard H; Veldink, Jan; Silani, Vincenzo; Ticozzi, Nicola; Shaw, Christopher E; Baloh, Robert H; Appel, Stanley; Simpson, Ericka; Lagier-Tourenne, Clotilde; Pulst, Stefan M; Gibson, Summer; Trojanowski, John Q; Elman, Lauren; McCluskey, Leo; Grossman, Murray; Shneider, Neil A; Chung, Wendy K; Ravits, John M; Glass, Jonathan D; Sims, Katherine B; Van Deerlin, Vivianna M; Maniatis, Tom; Hayes, Sebastian D; Ordureau, Alban; Swarup, Sharan; Landers, John; Baas, Frank; Allen, Andrew S; Bedlack, Richard S; Harper, J Wade; Gitler, Aaron D; Rouleau, Guy A; Brown, Robert; Harms, Matthew B; Cooper, Gregory M; Harris, Tim; Myers, Richard M; Goldstein, David B

    2015-03-27

    Amyotrophic lateral sclerosis (ALS) is a devastating neurological disease with no effective treatment. We report the results of a moderate-scale sequencing study aimed at increasing the number of genes known to contribute to predisposition for ALS. We performed whole-exome sequencing of 2869 ALS patients and 6405 controls. Several known ALS genes were found to be associated, and TBK1 (the gene encoding TANK-binding kinase 1) was identified as an ALS gene. TBK1 is known to bind to and phosphorylate a number of proteins involved in innate immunity and autophagy, including optineurin (OPTN) and p62 (SQSTM1/sequestosome), both of which have also been implicated in ALS. These observations reveal a key role of the autophagic pathway in ALS and suggest specific targets for therapeutic intervention. Copyright © 2015, American Association for the Advancement of Science.

  13. MADGiC: a model-based approach for identifying driver genes in cancer

    PubMed Central

    Korthauer, Keegan D.; Kendziorski, Christina

    2015-01-01

    Motivation: Identifying and prioritizing somatic mutations is an important and challenging area of cancer research that can provide new insights into gene function as well as new targets for drug development. Most methods for prioritizing mutations rely primarily on frequency-based criteria, where a gene is identified as having a driver mutation if it is altered in significantly more samples than expected according to a background model. Although useful, frequency-based methods are limited in that all mutations are treated equally. It is well known, however, that some mutations have no functional consequence, while others may have a major deleterious impact. The spatial pattern of mutations within a gene provides further insight into their functional consequence. Properly accounting for these factors improves both the power and accuracy of inference. Also important is an accurate background model. Results: Here, we develop a Model-based Approach for identifying Driver Genes in Cancer (termed MADGiC) that incorporates both frequency and functional impact criteria and accommodates a number of factors to improve the background model. Simulation studies demonstrate advantages of the approach, including a substantial increase in power over competing methods. Further advantages are illustrated in an analysis of ovarian and lung cancer data from The Cancer Genome Atlas (TCGA) project. Availability and implementation: R code to implement this method is available at http://www.biostat.wisc.edu/ kendzior/MADGiC/. Contact: kendzior@biostat.wisc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25573922
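
    The frequency-based criterion that this abstract contrasts with MADGiC can be sketched as a one-sided binomial test: a gene is flagged if it is altered in more samples than a background rate would predict. This is a minimal illustration of that baseline idea only, not of MADGiC itself; the function name, the sample counts and the 3% background probability are all hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical numbers: a gene altered in 12 of 100 samples when the
# background model predicts a 3% per-sample alteration probability.
p_value = binomial_upper_tail(12, 100, 0.03)
print(p_value < 0.001)  # True
```

    A test of this kind treats every mutation equally, which is the limitation the abstract points out; functional impact and spatial pattern are not used.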

  14. Male specific genes from dioecious white campion identified by fluorescent differential display.

    PubMed

    Scutt, Charles P; Jenkins, Tom; Furuya, Masaki; Gilmartin, Philip M

    2002-05-01

    Fluorescent differential display (FDD) has been used to screen for cDNAs that are differentially up-regulated in male flowers of the dioecious plant Silene latifolia in which an X/Y chromosome system of sex determination operates. To adapt FDD to the cloning of large numbers of differential cDNAs, a novel method of confirming the differential expression of these has been devised. FDD gels were Southern electro-blotted and probed with mixtures of individual cDNA clones derived from different FDD product ligation reactions. These Southern blots were then stripped and re-probed with further mixtures of individual cloned FDD products to identify the maximum number of recombinant clones carrying the true differential amplification products. Of 135 differential bands identified by FDD, 56 differential amplification products were confirmed; these represent 23 unique differentially expressed genes as determined by virtual Northern analysis and two genes expressed at or below the level of detection by virtual Northern analysis. These two low expressed genes show bands of hybridization on genomic Southern blots that are specific to male plants, indicating that they are derived from, or closely related to, Y chromosome genes.

  15. Gene-set analysis based on the pharmacological profiles of drugs to identify repurposing opportunities in schizophrenia.

    PubMed

    de Jong, Simone; Vidler, Lewis R; Mokrab, Younes; Collier, David A; Breen, Gerome

    2016-08-01

    Genome-wide association studies (GWAS) have identified thousands of novel genetic associations for complex genetic disorders, leading to the identification of potential pharmacological targets for novel drug development. In schizophrenia, 108 conservatively defined loci that meet genome-wide significance have been identified and hundreds of additional sub-threshold associations harbour information on the genetic aetiology of the disorder. In the present study, we used gene-set analysis based on the known binding targets of chemical compounds to identify the 'drug pathways' most strongly associated with schizophrenia-associated genes, with the aim of identifying potential drug repositioning opportunities and clues for novel treatment paradigms, especially in multi-target drug development. We compiled 9389 gene sets (2496 with unique gene content) and interrogated gene-based p-values from the PGC2-SCZ analysis. Although no single drug exceeded experiment-wide significance (corrected p<0.05), highly ranked gene-sets reaching suggestive significance included the dopamine receptor antagonists metoclopramide and trifluoperazine and the tyrosine kinase inhibitor neratinib. This is a proof of principle analysis showing the potential utility of GWAS data of schizophrenia for the direct identification of candidate drugs and molecules that show polypharmacy. © The Author(s) 2016.

  16. Exome Sequencing and Linkage Analysis Identified Novel Candidate Genes in Recessive Intellectual Disability Associated with Ataxia.

    PubMed

    Jazayeri, Roshanak; Hu, Hao; Fattahi, Zohreh; Musante, Luciana; Abedini, Seyedeh Sedigheh; Hosseini, Masoumeh; Wienker, Thomas F; Ropers, Hans Hilger; Najmabadi, Hossein; Kahrizi, Kimia

    2015-10-01

    Intellectual disability (ID) is a neuro-developmental disorder which causes considerable socio-economic problems. Some ID individuals are also affected by ataxia, and the condition includes different mutations affecting several genes. We used whole exome sequencing (WES) in combination with homozygosity mapping (HM) to identify the genetic defects in five consanguineous families among our cohort study, with two affected children with ID and ataxia as major clinical symptoms. We identified three novel candidate genes, RIPPLY1, MRPL10, SNX14, and a new mutation in known gene SURF1. All are autosomal genes, except RIPPLY1, which is located on the X chromosome. Two are housekeeping genes, implicated in transcription and translation regulation and intracellular trafficking, and two encode mitochondrial proteins. The pathogenesis of these variants was evaluated by mutation classification, bioinformatic methods, review of medical and biological relevance, co-segregation studies in the particular family, and a normal population study. Linkage analysis and exome sequencing of a small number of affected family members is a powerful new technique which can be used to decrease the number of candidate genes in heterogenic disorders such as ID, and may even identify the responsible gene(s).

  17. High-Throughput Screening to Identify Regulators of Meiosis-Specific Gene Expression in Saccharomyces cerevisiae.

    PubMed

    Kassir, Yona

    2017-01-01

    Meiosis and gamete formation are processes that are essential for sexual reproduction in all eukaryotic organisms. Multiple intracellular and extracellular signals feed into pathways that converge on transcription factors that induce the expression of meiosis-specific genes. Once triggered, the meiosis-specific gene expression program proceeds in a cascade that drives progress through the events of meiosis and gamete formation. Meiosis-specific gene expression is tightly controlled by a balance of positive and negative regulatory factors that respond to a plethora of signaling pathways. The budding yeast Saccharomyces cerevisiae has proven to be an outstanding model for the dissection of gametogenesis owing to the sophisticated genetic manipulations that can be performed with the cells. It is possible to use a variety of selection and screening methods to identify genes and their functions. High-throughput screening technology has been developed to allow an array of all viable yeast gene deletion mutants to be screened for phenotypes and for regulators of gene expression. This chapter describes a protocol that has been used to screen a library of homozygous diploid yeast deletion strains to identify regulators of the meiosis-specific IME1 gene.

  18. Transposon mutagenesis identifies genes and cellular processes driving epithelial-mesenchymal transition in hepatocellular carcinoma

    PubMed Central

    Kodama, Takahiro; Newberg, Justin Y.; Kodama, Michiko; Rangel, Roberto; Yoshihara, Kosuke; Tien, Jean C.; Parsons, Pamela H.; Wu, Hao; Finegold, Milton J.; Copeland, Neal G.; Jenkins, Nancy A.

    2016-01-01

    Epithelial-mesenchymal transition (EMT) is thought to contribute to metastasis and chemoresistance in patients with hepatocellular carcinoma (HCC), leading to their poor prognosis. The genes driving EMT in HCC are not yet fully understood, however. Here, we show that mobilization of Sleeping Beauty (SB) transposons in immortalized mouse hepatoblasts induces mesenchymal liver tumors on transplantation to nude mice. These tumors show significant down-regulation of epithelial markers, along with up-regulation of mesenchymal markers and EMT-related transcription factors (EMT-TFs). Sequencing of transposon insertion sites from tumors identified 233 candidate cancer genes (CCGs) that were enriched for genes and cellular processes driving EMT. Subsequent trunk driver analysis identified 23 CCGs that are predicted to function early in tumorigenesis and whose mutation or alteration in patients with HCC is correlated with poor patient survival. Validation of the top trunk drivers identified in the screen, including MET (MET proto-oncogene, receptor tyrosine kinase), GRB2-associated binding protein 1 (GAB1), HECT, UBA, and WWE domain containing 1 (HUWE1), lysine-specific demethylase 6A (KDM6A), and protein-tyrosine phosphatase, nonreceptor-type 12 (PTPN12), showed that deregulation of these genes activates an EMT program in human HCC cells that enhances tumor cell migration. Finally, deregulation of these genes in human HCC was found to confer sorafenib resistance through apoptotic tolerance and reduced proliferation, consistent with recent studies showing that EMT contributes to the chemoresistance of tumor cells. Our unique cell-based transposon mutagenesis screen appears to be an excellent resource for discovering genes involved in EMT in human HCC and potentially for identifying new drug targets. PMID:27247392

  19. Theoretical and Numerical Modeling of Transport of Land Use-Specific Fecal Source Identifiers

    NASA Astrophysics Data System (ADS)

    Bombardelli, F. A.; Sirikanchana, K. J.; Bae, S.; Wuertz, S.

    2008-12-01

    Microbial contamination in coastal and estuarine waters is of particular concern to public health officials. In this work, we advocate that well-formulated and developed mathematical and numerical transport models can be combined with modern molecular techniques in order to predict continuous concentrations of microbial indicators under diverse scenarios of interest, and that they can help in source identification of fecal pollution. As a proof of concept, we present initially the theory, numerical implementation and validation of one- and two-dimensional numerical models aimed at computing the distribution of fecal source identifiers in water bodies (based on Bacteroidales marker DNA sequences) coming from different land uses such as wildlife, livestock, humans, dogs or cats. These models have been developed to allow for source identification of fecal contamination in large bodies of water. We test the model predictions using diverse velocity fields and boundary conditions. Then, we present some preliminary results of an application of a three-dimensional water quality model to address the source of fecal contamination in the San Pablo Bay (SPB), United States, which constitutes an important sub-embayment of the San Francisco Bay. The transport equations for Bacteroidales include the processes of advection, diffusion, and decay of Bacteroidales. We discuss the validation of the developed models through comparisons of numerical results with field campaigns developed in the SPB. We determine the extent and importance of the contamination in the bay for two decay rates obtained from field observations, corresponding to total host-specific Bacteroidales DNA and host-specific viable Bacteroidales cells, respectively. Finally, we infer transport conditions in the SPB based on the numerical results, characterizing the fate of outflows coming from the Napa, Petaluma and Sonoma rivers.

  20. Transcriptomic Analysis Using Olive Varieties and Breeding Progenies Identifies Candidate Genes Involved in Plant Architecture.

    PubMed

    González-Plaza, Juan J; Ortiz-Martín, Inmaculada; Muñoz-Mérida, Antonio; García-López, Carmen; Sánchez-Sevilla, José F; Luque, Francisco; Trelles, Oswaldo; Bejarano, Eduardo R; De La Rosa, Raúl; Valpuesta, Victoriano; Beuzón, Carmen R

    2016-01-01

    Plant architecture is a critical trait in fruit crops that can significantly influence yield, pruning, planting density and harvesting. Little is known about how plant architecture is genetically determined in olive, where most of the existing varieties are traditional, with an architecture poorly suited for modern growing and harvesting systems. In the present study, we have carried out microarray analysis of meristematic tissue to compare expression profiles of olive varieties displaying differences in architecture, as well as seedlings from their cross pooled on the basis of their sharing architecture-related phenotypes. The microarray used, previously developed by our group, has already been applied to identify candidate genes involved in regulating juvenile to adult transition in the shoot apex of seedlings. Varieties with distinct architecture phenotypes and individuals from segregating progenies displaying opposite architecture features were used to link phenotype to expression. Here, we identify 2252 differentially expressed genes (DEGs) associated with differences in plant architecture. Microarray results were validated by quantitative RT-PCR carried out on genes with functional annotation likely related to plant architecture. Twelve of these genes were further analyzed in individual seedlings of the corresponding pool. We also examined Arabidopsis mutants in putative orthologs of these targeted candidate genes, finding altered architecture for most of them. This supports a functional conservation between species and potential biological relevance of the candidate genes identified. This study is the first to identify genes associated with plant architecture in olive, and the results obtained could be of great help in future programs aimed at selecting phenotypes adapted to modern cultivation practices in this species.

  1. Systematic analysis of mutation distribution in three dimensional protein structures identifies cancer driver genes.

    PubMed

    Fujimoto, Akihiro; Okada, Yukinori; Boroevich, Keith A; Tsunoda, Tatsuhiko; Taniguchi, Hiroaki; Nakagawa, Hidewaki

    2016-05-26

    Protein tertiary structure determines molecular function, interaction, and stability of the protein, therefore distribution of mutation in the tertiary structure can facilitate the identification of new driver genes in cancer. To analyze mutation distribution in protein tertiary structures, we applied a novel three dimensional permutation test to the mutation positions. We analyzed somatic mutation datasets of 21 types of cancers obtained from exome sequencing conducted by the TCGA project. Of the 3,622 genes that had ≥3 mutations in the regions with tertiary structure data, 106 genes showed significant skew in mutation distribution. Known tumor suppressors and oncogenes were significantly enriched in these identified cancer gene sets. Physical distances between mutations in known oncogenes were significantly smaller than those of tumor suppressors. Twenty-three genes were detected in multiple cancers. Candidate genes with significant skew of the 3D mutation distribution included kinases (MAPK1, EPHA5, ERBB3, and ERBB4), an apoptosis related gene (APP), an RNA splicing factor (SF1), a miRNA processing factor (DICER1), an E3 ubiquitin ligase (CUL1) and transcription factors (KLF5 and EEF1B2). Our study suggests that systematic analysis of mutation distribution in the tertiary protein structure can help identify cancer driver genes.
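
    The idea of testing whether mutations cluster in three-dimensional space can be sketched with a generic permutation test. The abstract does not state the test statistic, so the mean pairwise distance used below is an assumption, as are the function names, the toy coordinates and the number of permutations; this is an illustration of the general technique, not the authors' procedure.

```python
import itertools
import math
import random

def mean_pairwise_distance(points):
    """Average Euclidean distance over all pairs of 3D points."""
    pairs = list(itertools.combinations(points, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

def clustering_p_value(residue_coords, mutated_idx, n_perm=2000, seed=0):
    """Share of random residue sets at least as compact as the observed set."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean_pairwise_distance([residue_coords[i] for i in mutated_idx])
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(range(len(residue_coords)), len(mutated_idx))
        if mean_pairwise_distance([residue_coords[i] for i in sample]) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0

# Toy structure (hypothetical): 50 residues on a line, 4 adjacent mutated residues.
coords = [(float(i), 0.0, 0.0) for i in range(50)]
p = clustering_p_value(coords, [10, 11, 12, 13])
print(p < 0.05)  # True
```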

  2. Systematic analysis of mutation distribution in three dimensional protein structures identifies cancer driver genes

    PubMed Central

    Fujimoto, Akihiro; Okada, Yukinori; Boroevich, Keith A.; Tsunoda, Tatsuhiko; Taniguchi, Hiroaki; Nakagawa, Hidewaki

    2016-01-01

    Protein tertiary structure determines molecular function, interaction, and stability of the protein, therefore distribution of mutation in the tertiary structure can facilitate the identification of new driver genes in cancer. To analyze mutation distribution in protein tertiary structures, we applied a novel three dimensional permutation test to the mutation positions. We analyzed somatic mutation datasets of 21 types of cancers obtained from exome sequencing conducted by the TCGA project. Of the 3,622 genes that had ≥3 mutations in the regions with tertiary structure data, 106 genes showed significant skew in mutation distribution. Known tumor suppressors and oncogenes were significantly enriched in these identified cancer gene sets. Physical distances between mutations in known oncogenes were significantly smaller than those of tumor suppressors. Twenty-three genes were detected in multiple cancers. Candidate genes with significant skew of the 3D mutation distribution included kinases (MAPK1, EPHA5, ERBB3, and ERBB4), an apoptosis related gene (APP), an RNA splicing factor (SF1), a miRNA processing factor (DICER1), an E3 ubiquitin ligase (CUL1) and transcription factors (KLF5 and EEF1B2). Our study suggests that systematic analysis of mutation distribution in the tertiary protein structure can help identify cancer driver genes. PMID:27225414

  3. Specific PCR primers directed to identify cryI and cryIII genes within a Bacillus thuringiensis strain collection.

    PubMed Central

    Cerón, J; Ortíz, A; Quintero, R; Güereca, L; Bravo, A

    1995-01-01

    In this paper we describe a PCR strategy that can be used to rapidly identify Bacillus thuringiensis strains that harbor any of the known cryI or cryIII genes. Four general PCR primers which amplify DNA fragments from the known cryI or cryIII genes were selected from conserved regions. Once a strain was identified as an organism that contains a particular type of cry gene, it could be easily characterized by performing additional PCR with specific cryI and cryIII primers selected from variable regions. The method described in this paper can be used to identify the 10 different cryI genes and the five different cryIII genes. One feature of this screening method is that each cry gene is expected to produce a PCR product having a precise molecular weight. The genes which produce PCR products having different sizes probably represent strains that harbor a potentially novel cry gene. Finally, we present evidence that novel crystal genes can be identified by the method described in this paper. PMID:8526493

  4. A method to identify differential expression profiles of time-course gene data with Fourier transformation.

    PubMed

    Kim, Jaehee; Ogden, Robert Todd; Kim, Haseong

    2013-10-18

    Time course gene expression experiments are an increasingly popular method for exploring biological processes. Temporal gene expression profiles provide an important characterization of gene function, as biological systems are both developmental and dynamic. With such data it is possible to study gene expression changes over time and thereby to detect differential genes. Much of the early work on analyzing time series expression data relied on methods developed originally for static data and thus there is a need for improved methodology. Since time series expression is a temporal process, its unique features such as autocorrelation between successive points should be incorporated into the analysis. This work aims to identify genes that show different gene expression profiles across time. We propose a statistical procedure to discover gene groups with similar profiles using a nonparametric representation that accounts for the autocorrelation in the data. In particular, we first represent each profile in terms of a Fourier basis, and then we screen out genes that are not differentially expressed based on the Fourier coefficients. Finally, we cluster the remaining gene profiles using a model-based approach in the Fourier domain. We evaluate the screening results in terms of sensitivity, specificity, FDR and FNR, compare with the Gaussian process regression screening in a simulation study and illustrate the results by application to yeast cell-cycle microarray expression data with alpha-factor synchronization. The key elements of the proposed methodology: (i) representation of gene profiles in the Fourier domain; (ii) automatic screening of genes based on the Fourier coefficients and taking into account autocorrelation in the data, while controlling the false discovery rate (FDR); (iii) model-based clustering of the remaining gene profiles. Using this method, we identified a set of cell-cycle-regulated time-course yeast genes. The proposed method is general and can be

  5. A method to identify differential expression profiles of time-course gene data with Fourier transformation

    PubMed Central

    2013-01-01

    Background: Time course gene expression experiments are an increasingly popular method for exploring biological processes. Temporal gene expression profiles provide an important characterization of gene function, as biological systems are both developmental and dynamic. With such data it is possible to study gene expression changes over time and thereby to detect differential genes. Much of the early work on analyzing time series expression data relied on methods developed originally for static data and thus there is a need for improved methodology. Since time series expression is a temporal process, its unique features such as autocorrelation between successive points should be incorporated into the analysis. Results: This work aims to identify genes that show different gene expression profiles across time. We propose a statistical procedure to discover gene groups with similar profiles using a nonparametric representation that accounts for the autocorrelation in the data. In particular, we first represent each profile in terms of a Fourier basis, and then we screen out genes that are not differentially expressed based on the Fourier coefficients. Finally, we cluster the remaining gene profiles using a model-based approach in the Fourier domain. We evaluate the screening results in terms of sensitivity, specificity, FDR and FNR, compare with the Gaussian process regression screening in a simulation study and illustrate the results by application to yeast cell-cycle microarray expression data with alpha-factor synchronization. The key elements of the proposed methodology: (i) representation of gene profiles in the Fourier domain; (ii) automatic screening of genes based on the Fourier coefficients and taking into account autocorrelation in the data, while controlling the false discovery rate (FDR); (iii) model-based clustering of the remaining gene profiles. Conclusions: Using this method, we identified a set of cell-cycle-regulated time-course yeast genes. The

  6. Transcriptome Sequencing Identified Genes and Gene Ontologies Associated with Early Freezing Tolerance in Maize

    PubMed Central

    Li, Zhao; Hu, Guanghui; Liu, Xiangfeng; Zhou, Yao; Li, Yu; Zhang, Xu; Yuan, Xiaohui; Zhang, Qian; Yang, Deguang; Wang, Tianyu; Zhang, Zhiwu

    2016-01-01

    Originating in a tropical climate, maize has faced great challenges as cultivation has expanded to the majority of the world's temperate zones. In these zones, frost and cold temperatures are major factors that prevent maize from reaching its full yield potential. Among 30 elite maize inbred lines adapted to northern China, we identified two lines of extreme, but opposite, freezing tolerance levels: highly tolerant and highly sensitive. During the seedling stage of these two lines, we used RNA-seq to measure changes in maize whole genome transcriptome before and after freezing treatment. In total, 19,794 genes were expressed, of which 4550 exhibited differential expression due to either treatment (before or after freezing) or line type (tolerant or sensitive). Of the 4550 differentially expressed genes, 948 exhibited differential expression due to treatment within line or lines under freezing condition. Analysis of gene ontology found that these 948 genes were significantly enriched for binding functions (DNA binding, ATP binding, and metal ion binding), protein kinase activity, and peptidase activity. Based on their enrichment, literature support, and significant levels of differential expression, 30 of these 948 genes were selected for quantitative real-time PCR (qRT-PCR) validation. The validation confirmed our RNA-Seq-based findings, with squared correlation coefficients of 80% and 50% in the tolerant and sensitive lines, respectively. This study provided valuable resources for further studies to enhance understanding of the molecular mechanisms underlying maize early freezing response and enable targeted breeding strategies for developing varieties with superior frost resistance to achieve yield potential. PMID:27774095

  7. Utility and Limitations of Using Gene Expression Data to Identify Functional Associations

    PubMed Central

    Peng, Cheng; Shiu, Shin-Han

    2016-01-01

    Gene co-expression has been widely used to hypothesize gene function through guilt-by-association. However, it is not clear to what degree co-expression is informative, whether it can be applied to genes involved in different biological processes, and how the type of dataset impacts inferences about gene functions. Here our goal is to assess the utility and limitations of using co-expression as a criterion to recover functional associations between genes. By determining the percentage of gene pairs in a metabolic pathway with significant expression correlation, and using Arabidopsis thaliana as an example, we found that many genes in the same pathway do not have similar transcript profiles and that the choice of dataset, annotation quality, gene function, expression similarity measure, and clustering approach significantly impacts the ability to recover functional associations between genes. Some datasets are more informative in capturing coordinated expression profiles, and larger datasets are not always better. In addition, to recover the maximum number of known pathways and identify candidate genes with similar functions, it is important to explore rather exhaustively multiple dataset combinations, similarity measures, clustering algorithms and parameters. Finally, we validated the biological relevance of co-expression cluster memberships with an independent phenomics dataset and found that genes that consistently cluster with leucine degradation genes tend to have similar leucine levels in mutants. This study provides a framework for obtaining gene functional associations by maximizing the information that can be obtained from gene expression datasets. PMID:27935950
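
    The quantity this abstract measures, the percentage of gene pairs in a pathway with correlated expression, can be sketched as follows. The abstract refers to statistically significant correlation; the fixed Pearson cutoff of 0.8 used here is a simplifying assumption, and the gene names and expression values are hypothetical.

```python
import itertools
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def fraction_coexpressed(profiles, threshold=0.8):
    """Share of gene pairs whose expression correlation is >= threshold."""
    pairs = list(itertools.combinations(profiles, 2))  # pairs of gene names
    strong = sum(
        1 for g1, g2 in pairs if pearson(profiles[g1], profiles[g2]) >= threshold
    )
    return strong / len(pairs)

# Hypothetical pathway: geneA and geneB track each other, geneC does not.
pathway = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 1.0, 3.5, 2.0],
}
print(fraction_coexpressed(pathway))  # one of three pairs passes the cutoff
```

    Swapping in a different similarity measure or dataset changes this fraction, which is the sensitivity the abstract reports.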

  8. A Systems Biology Framework Identifies Molecular Underpinnings of Coronary Heart Disease

    PubMed Central

    Huan, Tianxiao; Zhang, Bin; Wang, Zhi; Joehanes, Roby; Zhu, Jun; Johnson, Andrew D.; Ying, Saixia; Munson, Peter J.; Raghavachari, Nalini; Wang, Richard; Liu, Poching; Courchesne, Paul; Hwang, Shih-Jen; Assimes, Themistocles L.; McPherson, Ruth; Samani, Nilesh J.; Schunkert, Heribert; Meng, Qingying; Suver, Christine; O'Donnell, Christopher J.; Derry, Jonathan; Yang, Xia; Levy, Daniel

    2013-01-01

    Objective: Genetic approaches have identified numerous loci associated with coronary heart disease (CHD). The molecular mechanisms underlying CHD gene-disease associations, however, remain unclear. We hypothesized that genetic variants with both strong and subtle effects drive gene subnetworks that in turn affect CHD. Approach and Results: We surveyed CHD-associated molecular interactions by constructing coexpression networks using whole blood gene expression profiles from 188 CHD cases and 188 age- and sex-matched controls. Twenty-four coexpression modules were identified, including one case-specific and one control-specific differential module (DM). The DMs were enriched for genes involved in B-cell activation, immune response, and ion transport. By integrating the DMs with altered gene expression associated SNPs (eSNPs) and with results of GWAS of CHD and its risk factors, the control-specific DM was implicated as CHD-causal based on its significant enrichment for both CHD and lipid eSNPs. This causal DM was further integrated with tissue-specific Bayesian networks and protein-protein interaction networks to identify regulatory key driver (KD) genes. Multi-tissue KDs (SPIB and TNFRSF13C) and tissue-specific KDs (e.g. EBF1) were identified. Conclusions: Our network-driven integrative analysis not only identified CHD-related genes, but also defined network structure that sheds light on the molecular interactions of genes associated with CHD risk. PMID:23539213

  9. Gene Network for Identifying the Entropy Changes of Different Modules in Pediatric Sepsis.

    PubMed

    Yang, Jing; Zhang, Pingli; Wang, Lumin

    2016-01-01

    Pediatric sepsis is a life-threatening disease in children. The incidence of pediatric sepsis is higher in developing countries due to various reasons, such as insufficient immunization and nutrition, and water and air pollution. Exploring the potential genes via different methods is of significance for the prevention and treatment of pediatric sepsis. This study aimed to identify potential genes associated with pediatric sepsis utilizing analysis of gene network and entropy. The mRNA expression in the blood samples collected from 20 septic children and 30 healthy controls was quantified by using Affymetrix HG-U133A microarray. Two condition-specific protein-protein interaction networks (PINs), one for the healthy controls and the other for the children with sepsis, were deduced by combining the fundamental human PINs with gene expression profiles in the two phenotypes. Subsequently, distinct modules from the two conditional networks were extracted by adopting a maximal clique-merging approach. Delta entropy (ΔS) was calculated between sepsis and control modules. Then, key genes displaying changes in gene composition were identified by matching the control and sepsis modules. Two objective modules were obtained, in which the ribosomal proteins RPL4 and RPL9, as well as TOP2A, were considered the probable key genes differentiating sepsis from healthy controls. According to previous reports and this work, TOP2A is a potential gene therapy target for pediatric sepsis. The relationship between pediatric sepsis and RPL4 and RPL9 needs further investigation. © 2016 The Author(s) Published by S. Karger AG, Basel.

  10. Evolutionary analysis of vision genes identifies potential drivers of visual differences between giraffe and okapi

    PubMed Central

    Ishengoma, Edson; Agaba, Morris; Cavener, Douglas R.

    2017-01-01

    Background: The capacity of visually oriented species to perceive and respond to visual signals is integral to their evolutionary success. Giraffes are closely related to okapi, but the two species have a broad range of phenotypic differences including their visual capacities. Vision studies rank giraffe’s visual acuity higher than all other artiodactyls despite sharing similar vision ecological determinants with many of them. The extent to which the giraffe’s unique visual capacity and its difference with okapi is reflected by changes in their vision genes is not understood. Methods: The recent availability of giraffe and okapi genomes provided an opportunity to identify giraffe and okapi vision genes. Multiple strategies were employed to identify thirty-six candidate mammalian vision genes in giraffe and okapi genomes. Quantification of selection pressure was performed by a combination of branch-site tests of positive selection and clade models of selection divergence through comparing giraffe and okapi vision genes and orthologous sequences from other mammals. Results: Signatures of selection were identified in key genes that could potentially underlie giraffe and okapi visual adaptations. Importantly, some genes that contribute to optical transparency of the eye and those that are critical in the light signaling pathway were found to show signatures of adaptive evolution or selection divergence. Comparison between giraffe and other ruminants identifies significant selection divergence in CRYAA and OPN1LW. Significant selection divergence was identified in SAG while positive selection was detected in LUM when okapi is compared with ruminants and other mammals. Sequence analysis of OPN1LW showed that at least one of the sites known to affect spectral sensitivity of the red pigment is uniquely divergent between giraffe and other ruminants. Discussion: By taking a systemic approach to gene function in vision, the results provide the first molecular clues associated with

  11. Evolutionary analysis of vision genes identifies potential drivers of visual differences between giraffe and okapi.

    PubMed

    Ishengoma, Edson; Agaba, Morris; Cavener, Douglas R

    2017-01-01

    The capacity of visually oriented species to perceive and respond to visual signals is integral to their evolutionary success. Giraffes are closely related to okapi, but the two species have a broad range of phenotypic differences including their visual capacities. Vision studies rank giraffe's visual acuity higher than all other artiodactyls despite sharing similar vision ecological determinants with many of them. The extent to which the giraffe's unique visual capacity and its difference with okapi is reflected by changes in their vision genes is not understood. The recent availability of giraffe and okapi genomes provided an opportunity to identify giraffe and okapi vision genes. Multiple strategies were employed to identify thirty-six candidate mammalian vision genes in giraffe and okapi genomes. Quantification of selection pressure was performed by a combination of branch-site tests of positive selection and clade models of selection divergence through comparing giraffe and okapi vision genes and orthologous sequences from other mammals. Signatures of selection were identified in key genes that could potentially underlie giraffe and okapi visual adaptations. Importantly, some genes that contribute to optical transparency of the eye and those that are critical in the light signaling pathway were found to show signatures of adaptive evolution or selection divergence. Comparison between giraffe and other ruminants identifies significant selection divergence in CRYAA and OPN1LW. Significant selection divergence was identified in SAG while positive selection was detected in LUM when okapi is compared with ruminants and other mammals. Sequence analysis of OPN1LW showed that at least one of the sites known to affect spectral sensitivity of the red pigment is uniquely divergent between giraffe and other ruminants. By taking a systemic approach to gene function in vision, the results provide the first molecular clues associated with giraffe and okapi vision adaptations. At

  12. Genome-Wide and Gene-Based Meta-Analyses Identify Novel Loci Influencing Blood Pressure Response to Hydrochlorothiazide.

    PubMed

    Salvi, Erika; Wang, Zhiying; Rizzi, Federica; Gong, Yan; McDonough, Caitrin W; Padmanabhan, Sandosh; Hiltunen, Timo P; Lanzani, Chiara; Zaninello, Roberta; Chittani, Martina; Bailey, Kent R; Sarin, Antti-Pekka; Barcella, Matteo; Melander, Olle; Chapman, Arlene B; Manunta, Paolo; Kontula, Kimmo K; Glorioso, Nicola; Cusi, Daniele; Dominiczak, Anna F; Johnson, Julie A; Barlassina, Cristina; Boerwinkle, Eric; Cooper-DeHoff, Rhonda M; Turner, Stephen T

    2017-01-01

    This study aimed to identify novel loci influencing the antihypertensive response to hydrochlorothiazide monotherapy. A genome-wide meta-analysis of blood pressure (BP) response to hydrochlorothiazide was performed in 1739 white hypertensives from 6 clinical trials within the International Consortium for Antihypertensive Pharmacogenomics Studies, making it the largest study to date of its kind. No signals reached genome-wide significance (P<5×10⁻⁸), and the suggestive regions (P<10⁻⁵) were cross-validated in 2 black cohorts treated with hydrochlorothiazide. In addition, a gene-based analysis was performed on candidate genes with previous evidence of involvement in diuretic response, in BP regulation, or in hypertension susceptibility. Using the genome-wide meta-analysis approach, with validation in blacks, we identified 2 suggestive regulatory regions linked to gap junction protein α1 gene (GJA1) and forkhead box A1 gene (FOXA1), relevant for cardiovascular and kidney function. With the gene-based approach, we identified hydroxy-delta-5-steroid dehydrogenase, 3β- and steroid δ-isomerase 1 gene (HSD3B1) as significantly associated with BP response (P<2.28×10⁻⁴). HSD3B1 encodes the 3β-hydroxysteroid dehydrogenase enzyme and plays a crucial role in the biosynthesis of aldosterone and endogenous ouabain. By amassing all of the available pharmacogenomic studies of BP response to hydrochlorothiazide, and using 2 different analytic approaches, we identified 3 novel loci influencing BP response to hydrochlorothiazide. The gene-based analysis, never before applied to pharmacogenomics of antihypertensive drugs to our knowledge, provided a powerful strategy to identify a locus of interest, which was not identified in the genome-wide meta-analysis because of high allelic heterogeneity. These data pave the way for future investigations on new pathways and drug targets to enhance the current understanding of personalized antihypertensive treatment. © 2016

  13. A transcriptome-wide association study of 229,000 women identifies new candidate susceptibility genes for breast cancer.

    PubMed

    Wu, Lang; Shi, Wei; Long, Jirong; Guo, Xingyi; Michailidou, Kyriaki; Beesley, Jonathan; Bolla, Manjeet K; Shu, Xiao-Ou; Lu, Yingchang; Cai, Qiuyin; Al-Ejeh, Fares; Rozali, Esdy; Wang, Qin; Dennis, Joe; Li, Bingshan; Zeng, Chenjie; Feng, Helian; Gusev, Alexander; Barfield, Richard T; Andrulis, Irene L; Anton-Culver, Hoda; Arndt, Volker; Aronson, Kristan J; Auer, Paul L; Barrdahl, Myrto; Baynes, Caroline; Beckmann, Matthias W; Benitez, Javier; Bermisheva, Marina; Blomqvist, Carl; Bogdanova, Natalia V; Bojesen, Stig E; Brauch, Hiltrud; Brenner, Hermann; Brinton, Louise; Broberg, Per; Brucker, Sara Y; Burwinkel, Barbara; Caldés, Trinidad; Canzian, Federico; Carter, Brian D; Castelao, J Esteban; Chang-Claude, Jenny; Chen, Xiaoqing; Cheng, Ting-Yuan David; Christiansen, Hans; Clarke, Christine L; Collée, Margriet; Cornelissen, Sten; Couch, Fergus J; Cox, David; Cox, Angela; Cross, Simon S; Cunningham, Julie M; Czene, Kamila; Daly, Mary B; Devilee, Peter; Doheny, Kimberly F; Dörk, Thilo; Dos-Santos-Silva, Isabel; Dumont, Martine; Dwek, Miriam; Eccles, Diana M; Eilber, Ursula; Eliassen, A Heather; Engel, Christoph; Eriksson, Mikael; Fachal, Laura; Fasching, Peter A; Figueroa, Jonine; Flesch-Janys, Dieter; Fletcher, Olivia; Flyger, Henrik; Fritschi, Lin; Gabrielson, Marike; Gago-Dominguez, Manuela; Gapstur, Susan M; García-Closas, Montserrat; Gaudet, Mia M; Ghoussaini, Maya; Giles, Graham G; Goldberg, Mark S; Goldgar, David E; González-Neira, Anna; Guénel, Pascal; Hahnen, Eric; Haiman, Christopher A; Håkansson, Niclas; Hall, Per; Hallberg, Emily; Hamann, Ute; Harrington, Patricia; Hein, Alexander; Hicks, Belynda; Hillemanns, Peter; Hollestelle, Antoinette; Hoover, Robert N; Hopper, John L; Huang, Guanmengqian; Humphreys, Keith; Hunter, David J; Jakubowska, Anna; Janni, Wolfgang; John, Esther M; Johnson, Nichola; Jones, Kristine; Jones, Michael E; Jung, Audrey; Kaaks, Rudolf; Kerin, Michael J; Khusnutdinova, Elza; Kosma, Veli-Matti; Kristensen, Vessela N; Lambrechts, 
Diether; Le Marchand, Loic; Li, Jingmei; Lindström, Sara; Lissowska, Jolanta; Lo, Wing-Yee; Loibl, Sibylle; Lubinski, Jan; Luccarini, Craig; Lux, Michael P; MacInnis, Robert J; Maishman, Tom; Kostovska, Ivana Maleva; Mannermaa, Arto; Manson, JoAnn E; Margolin, Sara; Mavroudis, Dimitrios; Meijers-Heijboer, Hanne; Meindl, Alfons; Menon, Usha; Meyer, Jeffery; Mulligan, Anna Marie; Neuhausen, Susan L; Nevanlinna, Heli; Neven, Patrick; Nielsen, Sune F; Nordestgaard, Børge G; Olopade, Olufunmilayo I; Olson, Janet E; Olsson, Håkan; Peterlongo, Paolo; Peto, Julian; Plaseska-Karanfilska, Dijana; Prentice, Ross; Presneau, Nadege; Pylkäs, Katri; Rack, Brigitte; Radice, Paolo; Rahman, Nazneen; Rennert, Gad; Rennert, Hedy S; Rhenius, Valerie; Romero, Atocha; Romm, Jane; Rudolph, Anja; Saloustros, Emmanouil; Sandler, Dale P; Sawyer, Elinor J; Schmidt, Marjanka K; Schmutzler, Rita K; Schneeweiss, Andreas; Scott, Rodney J; Scott, Christopher G; Seal, Sheila; Shah, Mitul; Shrubsole, Martha J; Smeets, Ann; Southey, Melissa C; Spinelli, John J; Stone, Jennifer; Surowy, Harald; Swerdlow, Anthony J; Tamimi, Rulla M; Tapper, William; Taylor, Jack A; Terry, Mary Beth; Tessier, Daniel C; Thomas, Abigail; Thöne, Kathrin; Tollenaar, Rob A E M; Torres, Diana; Truong, Thérèse; Untch, Michael; Vachon, Celine; Van Den Berg, David; Vincent, Daniel; Waisfisz, Quinten; Weinberg, Clarice R; Wendt, Camilla; Whittemore, Alice S; Wildiers, Hans; Willett, Walter C; Winqvist, Robert; Wolk, Alicja; Xia, Lucy; Yang, Xiaohong R; Ziogas, Argyrios; Ziv, Elad; Dunning, Alison M; Pharoah, Paul D P; Simard, Jacques; Milne, Roger L; Edwards, Stacey L; Kraft, Peter; Easton, Douglas F; Chenevix-Trench, Georgia; Zheng, Wei

    2018-06-18

    The breast cancer risk variants identified in genome-wide association studies explain only a small fraction of the familial relative risk, and the genes responsible for these associations remain largely unknown. To identify novel risk loci and likely causal genes, we performed a transcriptome-wide association study evaluating associations of genetically predicted gene expression with breast cancer risk in 122,977 cases and 105,974 controls of European ancestry. We used data from the Genotype-Tissue Expression Project to establish genetic models to predict gene expression in breast tissue and evaluated model performance using data from The Cancer Genome Atlas. Of the 8,597 genes evaluated, significant associations were identified for 48 at a Bonferroni-corrected threshold of P < 5.82 × 10⁻⁶, including 14 genes at loci not yet reported for breast cancer. We silenced 13 genes and showed an effect for 11 on cell proliferation and/or colony-forming efficiency. Our study provides new insights into breast cancer genetics and biology.
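
    The Bonferroni-corrected threshold quoted in this abstract can be reproduced with a short calculation. The sketch below assumes a family-wise error rate of 0.05, which the abstract does not state; the function name is illustrative and not taken from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Assumption: alpha = 0.05 (not given in the abstract).
# 8,597 is the number of genes evaluated, as reported in the abstract.
threshold = bonferroni_threshold(0.05, 8597)
print(f"{threshold:.2e}")  # prints 5.82e-06
```

    Under that assumption the result agrees with the reported cut-off of 5.82 × 10⁻⁶.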

  14. Whole genome-wide transcript profiling to identify differentially expressed genes associated with seed field emergence in two soybean low phytate mutants.

    PubMed

    Yuan, Fengjie; Yu, Xiaomin; Dong, Dekun; Yang, Qinghua; Fu, Xujun; Zhu, Shenlong; Zhu, Danhua

    2017-01-18

    Seed germination is important to soybean (Glycine max) growth and development, ultimately affecting soybean yield. A lower seed field emergence has been the main hindrance for breeding soybeans low in phytate. Although this reduction could be overcome by additional breeding and selection, the mechanisms of seed germination in different low phytate mutants remain unknown. In this study, we performed a comparative transcript analysis of two low phytate soybean mutants (TW-1 and TW-1-M), which have the same mutation, a 2 bp deletion in GmMIPS1, but show a significant difference in seed field emergence, with that of TW-1-M higher than that of TW-1. Numerous genes analyzed by RNA-Seq showed markedly different expression levels between TW-1-M and TW-1 mutants. Approximately 30,000-35,000 read-mapped genes and ~21,000-25,000 expressed genes were identified for each library. There were ~3,900-9,200 differentially expressed genes (DEGs) in each contrast library, and the number of up-regulated genes was similar to the number of down-regulated genes in the mutants TW-1 and TW-1-M. Gene ontology functional categories of DEGs indicated that the ethylene-mediated signaling pathway, the abscisic acid-mediated signaling pathway, response to hormone, ethylene biosynthetic process, ethylene metabolic process, regulation of hormone levels, oxidation-reduction process, regulation of flavonoid biosynthetic process and regulation of abscisic acid-activated signaling pathway had high correlations with seed germination. In total, 2457 DEGs involved in the above functional categories were identified. Twenty-two genes with 20 biological functions were the most highly up/down-regulated (absolute value Log2FC >5) in the high field emergence mutant TW-1-M and were related to metabolic or signaling pathways. Fifty-seven genes with 36 biological functions had the greatest expression abundance (FRPM >100) in germination-related pathways. Seed germination in the soybean low phytate mutants is a very complex process

  15. Transcriptomic Analysis Using Olive Varieties and Breeding Progenies Identifies Candidate Genes Involved in Plant Architecture

    PubMed Central

    González-Plaza, Juan J.; Ortiz-Martín, Inmaculada; Muñoz-Mérida, Antonio; García-López, Carmen; Sánchez-Sevilla, José F.; Luque, Francisco; Trelles, Oswaldo; Bejarano, Eduardo R.; De La Rosa, Raúl; Valpuesta, Victoriano; Beuzón, Carmen R.

    2016-01-01

    Plant architecture is a critical trait in fruit crops that can significantly influence yield, pruning, planting density and harvesting. Little is known about how plant architecture is genetically determined in olive, where most of the existing varieties are traditional with an architecture poorly suited for modern growing and harvesting systems. In the present study, we have carried out microarray analysis of meristematic tissue to compare expression profiles of olive varieties displaying differences in architecture, as well as seedlings from their cross pooled on the basis of their sharing architecture-related phenotypes. The microarray used, previously developed by our group, has already been applied to identify candidate genes involved in regulating juvenile to adult transition in the shoot apex of seedlings. Varieties with distinct architecture phenotypes and individuals from segregating progenies displaying opposite architecture features were used to link phenotype to expression. Here, we identify 2252 differentially expressed genes (DEGs) associated with differences in plant architecture. Microarray results were validated by quantitative RT-PCR carried out on genes with functional annotation likely related to plant architecture. Twelve of these genes were further analyzed in individual seedlings of the corresponding pool. We also examined Arabidopsis mutants in putative orthologs of these targeted candidate genes, finding altered architecture for most of them. This supports a functional conservation between species and potential biological relevance of the candidate genes identified. This study is the first to identify genes associated with plant architecture in olive, and the results obtained could be of great help in future programs aimed at selecting phenotypes adapted to modern cultivation practices in this species. PMID:26973682

  16. Robust Principal Component Analysis Regularized by Truncated Nuclear Norm for Identifying Differentially Expressed Genes.

    PubMed

    Wang, Ya-Xuan; Gao, Ying-Lian; Liu, Jin-Xing; Kong, Xiang-Zhen; Li, Hai-Jun

    2017-09-01

    Identifying differentially expressed genes among thousands of genes is a challenging task. Robust principal component analysis (RPCA) is an efficient method for identifying differentially expressed genes. The RPCA method uses the nuclear norm to approximate the rank function. However, theoretical studies showed that the nuclear norm minimizes all singular values, so it may not be the best approximation of the rank function. The truncated nuclear norm is defined as the sum of some smaller singular values, which may achieve a better approximation of the rank function than the nuclear norm. In this paper, a novel method is proposed by replacing the nuclear norm of RPCA with the truncated nuclear norm, which is named robust principal component analysis regularized by truncated nuclear norm (TRPCA). The method decomposes the observation matrix of genomic data into a low-rank matrix and a sparse matrix. Because the significant genes can be considered as sparse signals, the differentially expressed genes are viewed as the sparse perturbation signals. Thus, the differentially expressed genes can be identified according to the sparse matrix. The experimental results on The Cancer Genome Atlas data illustrate that the TRPCA method outperforms other state-of-the-art methods in the identification of differentially expressed genes.
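
    The difference between the two norms named in this abstract can be illustrated numerically. The following is a minimal sketch of the definitions only, not the authors' TRPCA solver; the function names and the truncation parameter r (how many of the largest singular values are left out) are illustrative assumptions, since the abstract does not specify them.

```python
import numpy as np


def nuclear_norm(X: np.ndarray) -> float:
    """Sum of all singular values of X."""
    return float(np.linalg.svd(X, compute_uv=False).sum())


def truncated_nuclear_norm(X: np.ndarray, r: int) -> float:
    """Sum of the singular values of X, excluding the r largest ones.

    r is a hypothetical parameter chosen for illustration.
    """
    s = np.linalg.svd(X, compute_uv=False)  # returned in descending order
    if not 0 <= r <= s.size:
        raise ValueError("r must lie between 0 and the number of singular values")
    return float(s[r:].sum())


# Toy matrix whose singular values are 5, 3 and 1.
X = np.diag([5.0, 3.0, 1.0])
print(nuclear_norm(X))               # sum of all three singular values
print(truncated_nuclear_norm(X, 1))  # the largest singular value is left out
```

    With r = 0 the two quantities coincide; a larger r leaves the dominant singular values unpenalized, which is the motivation the abstract gives for a closer approximation of the rank function.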

  17. Genome-Wide Temporal Expression Profiling in Caenorhabditis elegans Identifies a Core Gene Set Related to Long-Term Memory.

    PubMed

    Freytag, Virginie; Probst, Sabine; Hadziselimovic, Nils; Boglari, Csaba; Hauser, Yannick; Peter, Fabian; Gabor Fenyves, Bank; Milnik, Annette; Demougin, Philippe; Vukojevic, Vanja; de Quervain, Dominique J-F; Papassotiropoulos, Andreas; Stetak, Attila

    2017-07-12

    The identification of genes related to encoding, storage, and retrieval of memories is a major interest in neuroscience. In the current study, we analyzed the temporal gene expression changes in a neuronal mRNA pool during an olfactory long-term associative memory (LTAM) in Caenorhabditis elegans hermaphrodites. Here, we identified a core set of 712 (538 upregulated and 174 downregulated) genes that follows three distinct temporal peaks demonstrating multiple gene regulation waves in LTAM. Compared with the previously published positive LTAM gene set (Lakhina et al., 2015), 50% of the upregulated genes identified here overlap with the previous dataset, possibly representing stimulus-independent memory-related genes. On the other hand, the remaining genes were not previously identified in positive associative memory and may specifically regulate aversive LTAM. Our results suggest a multistep gene activation process during the formation and retrieval of long-term memory and define general memory-implicated genes as well as conditioning-type-dependent gene sets. SIGNIFICANCE STATEMENT: The identification of genes regulating different steps of memory is of major interest in neuroscience. Identification of common memory genes across different learning paradigms and the temporal activation of the genes are poorly studied. Here, we investigated the temporal aspects of Caenorhabditis elegans gene expression changes using aversive olfactory associative long-term memory (LTAM) and identified three major gene activation waves. As in previous studies, aversive LTAM is also CREB dependent, and CREB activity is necessary immediately after training. Finally, we define a list of memory paradigm-independent core gene sets as well as conditioning-dependent genes. Copyright © 2017 the authors 0270-6474/17/376661-12$15.00/0.

  18. DNA methylome profiling identifies novel methylated genes in African American patients with colorectal neoplasia.

    PubMed

    Ashktorab, Hassan; Daremipouran, M; Goel, Ajay; Varma, Sudhir; Leavitt, R; Sun, Xueguang; Brim, Hassan

    2014-04-01

    The identification of genes that are differentially methylated in colorectal cancer (CRC) has potential value for both diagnostic and therapeutic interventions, specifically in high-risk populations such as African Americans (AAs). However, DNA methylation patterns in CRC, especially in AAs, have not been systematically explored and remain poorly understood. Here, we performed DNA methylome profiling to identify the methylation status of CpG islands within candidate genes involved in critical pathways important in the initiation and development of CRC. We used reduced representation bisulfite sequencing (RRBS) in colorectal cancer and adenoma tissues that were compared with DNA methylome from a healthy AA subject's colon tissue and peripheral blood DNA. The identified methylation markers were validated in fresh frozen CRC tissues and corresponding normal tissues from AA patients diagnosed with CRC at Howard University Hospital. We identified and validated the methylation status of 355 CpG sites located within 16 gene promoter regions associated with CpG islands. Fifty CpG sites located within CpG islands, in the genes ATXN7L1 (2), BMP3 (7), EID3 (15), GAS7 (1), GPR75 (24), and TNFAIP2 (1), were significantly hypermethylated in tumor vs. normal tissues (P<0.05). The methylation status of BMP3, EID3, GAS7, and GPR75 was confirmed in an independent validation cohort. Ingenuity pathway analysis mapped three of these markers (GAS7, BMP3 and GPR75) to the insulin and TGF-β1 networks, the two key pathways in CRC. In addition to hypermethylated genes, our analysis also revealed that LINE-1 repeat elements were progressively hypomethylated in the normal-adenoma-cancer sequence. We conclude that DNA methylome profiling based on RRBS is an effective method for screening aberrantly methylated genes in CRC. While previous studies focused on the limited identification of hypermethylated genes, ours is the first study to systematically and comprehensively identify novel hypermethylated

  19. Dissecting the Gene Network of Dietary Restriction to Identify Evolutionarily Conserved Pathways and New Functional Genes

    PubMed Central

    Wuttke, Daniel; Connor, Richard; Vora, Chintan; Craig, Thomas; Li, Yang; Wood, Shona; Vasieva, Olga; Shmookler Reis, Robert; Tang, Fusheng; de Magalhães, João Pedro

    2012-01-01

    Dietary restriction (DR), limiting nutrient intake from diet without causing malnutrition, delays the aging process and extends lifespan in multiple organisms. The conserved life-extending effect of DR suggests the involvement of fundamental mechanisms, although these remain a subject of debate. To help decipher the life-extending mechanisms of DR, we first compiled a list of genes that if genetically altered disrupt or prevent the life-extending effects of DR. We called these DR–essential genes and identified more than 100 in model organisms such as yeast, worms, flies, and mice. In order for other researchers to benefit from this first curated list of genes essential for DR, we established an online database called GenDR (http://genomics.senescence.info/diet/). To dissect the interactions of DR–essential genes and discover the underlying lifespan-extending mechanisms, we then used a variety of network and systems biology approaches to analyze the gene network of DR. We show that DR–essential genes are more conserved at the molecular level and have more molecular interactions than expected by chance. Furthermore, we employed a guilt-by-association method to predict novel DR–essential genes. In budding yeast, we predicted nine genes related to vacuolar functions; we show experimentally that mutations deleting eight of those genes prevent the life-extending effects of DR. Three of these mutants (OPT2, FRE6, and RCR2) had extended lifespan under ad libitum, indicating that the lack of further longevity under DR is not caused by a general compromise of fitness. These results demonstrate how network analyses of DR using GenDR can be used to make phenotypically relevant predictions. Moreover, gene-regulatory circuits reveal that the DR–induced transcriptional signature in yeast involves nutrient-sensing, stress responses and meiotic transcription factors. Finally, comparing the influence of gene expression changes during DR on the interactomes of multiple

  20. Identifying Stable Reference Genes for qRT-PCR Normalisation in Gene Expression Studies of Narrow-Leafed Lupin (Lupinus angustifolius L.).

    PubMed

    Taylor, Candy M; Jost, Ricarda; Erskine, William; Nelson, Matthew N

    2016-01-01

    Quantitative Reverse Transcription PCR (qRT-PCR) is currently one of the most popular, high-throughput and sensitive technologies available for quantifying gene expression. Its accurate application depends heavily upon normalisation of gene-of-interest data with reference genes that are uniformly expressed under experimental conditions. The aim of this study was to provide the first validation of reference genes for Lupinus angustifolius (narrow-leafed lupin, a significant grain legume crop) using a selection of seven genes previously trialed as reference genes for the model legume, Medicago truncatula. In a preliminary evaluation, the seven candidate reference genes were assessed on the basis of primer specificity for their respective targeted region, PCR amplification efficiency, and ability to discriminate between cDNA and gDNA. Following this assessment, expression of the three most promising candidates [Ubiquitin C (UBC), Helicase (HEL), and Polypyrimidine tract-binding protein (PTB)] was evaluated using the NormFinder and RefFinder statistical algorithms in two narrow-leafed lupin lines, both with and without vernalisation treatment, and across seven organ types (cotyledons, stem, leaves, shoot apical meristem, flowers, pods and roots) encompassing three developmental stages. UBC was consistently identified as the most stable candidate and has sufficiently uniform expression that it may be used as a sole reference gene under the experimental conditions tested here. However, as organ type and developmental stage were associated with greater variability in relative expression, it is recommended using UBC and HEL as a pair to achieve optimal normalisation. These results highlight the importance of rigorously assessing candidate reference genes for each species across a diverse range of organs and developmental stages. With emerging technologies, such as RNAseq, and the completion of valuable transcriptome data sets, it is possible that other potentially more

  1. Identifying Stable Reference Genes for qRT-PCR Normalisation in Gene Expression Studies of Narrow-Leafed Lupin (Lupinus angustifolius L.)

    PubMed Central

    Taylor, Candy M.; Jost, Ricarda; Erskine, William; Nelson, Matthew N.

    2016-01-01

    Quantitative Reverse Transcription PCR (qRT-PCR) is currently one of the most popular, high-throughput and sensitive technologies available for quantifying gene expression. Its accurate application depends heavily upon normalisation of gene-of-interest data with reference genes that are uniformly expressed under experimental conditions. The aim of this study was to provide the first validation of reference genes for Lupinus angustifolius (narrow-leafed lupin, a significant grain legume crop) using a selection of seven genes previously trialed as reference genes for the model legume, Medicago truncatula. In a preliminary evaluation, the seven candidate reference genes were assessed on the basis of primer specificity for their respective targeted region, PCR amplification efficiency, and ability to discriminate between cDNA and gDNA. Following this assessment, expression of the three most promising candidates [Ubiquitin C (UBC), Helicase (HEL), and Polypyrimidine tract-binding protein (PTB)] was evaluated using the NormFinder and RefFinder statistical algorithms in two narrow-leafed lupin lines, both with and without vernalisation treatment, and across seven organ types (cotyledons, stem, leaves, shoot apical meristem, flowers, pods and roots) encompassing three developmental stages. UBC was consistently identified as the most stable candidate and has sufficiently uniform expression that it may be used as a sole reference gene under the experimental conditions tested here. However, as organ type and developmental stage were associated with greater variability in relative expression, it is recommended using UBC and HEL as a pair to achieve optimal normalisation. These results highlight the importance of rigorously assessing candidate reference genes for each species across a diverse range of organs and developmental stages. With emerging technologies, such as RNAseq, and the completion of valuable transcriptome data sets, it is possible that other potentially more

  2. A stratified transcriptomics analysis of polygenic fat and lean mouse adipose tissues identifies novel candidate obesity genes.

    PubMed

    Morton, Nicholas M; Nelson, Yvonne B; Michailidou, Zoi; Di Rollo, Emma M; Ramage, Lynne; Hadoke, Patrick W F; Seckl, Jonathan R; Bunger, Lutz; Horvat, Simon; Kenyon, Christopher J; Dunbar, Donald R

    2011-01-01

    Obesity and metabolic syndrome result from a complex interaction between genetic and environmental factors. In addition to brain-regulated processes, recent genome wide association studies have indicated that genes highly expressed in adipose tissue affect the distribution and function of fat and thus contribute to obesity. Using a stratified transcriptome gene enrichment approach we attempted to identify adipose tissue-specific obesity genes in the unique polygenic Fat (F) mouse strain generated by selective breeding over 60 generations for divergent adiposity from a comparator Lean (L) strain. To enrich for adipose tissue obesity genes, a 'snap-shot' pooled-sample transcriptome comparison of key fat depots and non-adipose tissues (muscle, liver, kidney) was performed. Known obesity quantitative trait loci (QTL) information for the model allowed us to further filter genes for increased likelihood of being causal or secondary for obesity. This successfully identified several genes previously linked to obesity (C1qr1 and Npr3) as positional QTL candidate genes elevated specifically in F line adipose tissue. A number of novel obesity candidate genes were also identified (Thbs1, Ppp1r3d, Tmepai, Trp53inp2, Ttc7b, Tuba1a, Fgf13, Fmr) that have inferred roles in fat cell function. Quantitative microarray analysis was then applied to the most phenotypically divergent adipose depot after exaggerating F and L strain differences with chronic high fat feeding, which revealed a distinct gene expression profile of line, fat depot and diet-responsive inflammatory, angiogenic and metabolic pathways. Selected candidate genes Npr3 and Thbs1, as well as Gys2, a non-QTL gene that otherwise passed our enrichment criteria, were characterised, revealing novel functional effects consistent with a contribution to obesity. A focussed candidate gene enrichment strategy in the unique F and L model has identified novel adipose tissue-enriched genes contributing to obesity.

  3. A screen to identify Drosophila genes required for integrin-mediated adhesion.

    PubMed Central

    Walsh, E P; Brown, N H

    1998-01-01

    Drosophila integrins have essential adhesive roles during development, including adhesion between the two wing surfaces. Most position-specific integrin mutations cause lethality, and clones of homozygous mutant cells in the wing do not adhere to the apposing surface, causing blisters. We have used FLP-FRT induced mitotic recombination to generate clones of randomly induced mutations in the F1 generation and screened for mutations that cause wing blisters. This phenotype is highly selective, since only 14 lethal complementation groups were identified in screens of the five major chromosome arms. Of the loci identified, 3 are PS integrin genes, 2 are blistered and bloated, and the remaining 9 appear to be newly characterized loci. All 11 nonintegrin loci are required on both sides of the wing, in contrast to integrin alpha subunit genes. Mutations in 8 loci only disrupt adhesion in the wing, similar to integrin mutations, while mutations in the 3 other loci cause additional wing defects. Mutations in 4 loci, like the strongest integrin mutations, cause a "tail-up" embryonic lethal phenotype, and mutant alleles of 1 of these loci strongly enhance an integrin mutation. Thus several of these loci are good candidates for genes encoding cytoplasmic proteins required for integrin function. PMID:9755209

  4. Genomic Analyses Yield Markers for Identifying Agronomically Important Genes in Potato

    USDA-ARS's Scientific Manuscript database

    This study explores the genetic architecture underlying potato evolution through a comprehensive assessment of wild and cultivated potato species based on the re-sequencing of 201 accessions of Solanum section Petota with >12× genome coverage. We identified 450 domesticated genes, which showed e...

  5. Engineering and Functional Characterization of Fusion Genes Identifies Novel Oncogenic Drivers of Cancer.

    PubMed

    Lu, Hengyu; Villafane, Nicole; Dogruluk, Turgut; Grzeskowiak, Caitlin L; Kong, Kathleen; Tsang, Yiu Huen; Zagorodna, Oksana; Pantazi, Angeliki; Yang, Lixing; Neill, Nicholas J; Kim, Young Won; Creighton, Chad J; Verhaak, Roel G; Mills, Gordon B; Park, Peter J; Kucherlapati, Raju; Scott, Kenneth L

    2017-07-01

    Oncogenic gene fusions drive many human cancers, but tools to more quickly unravel their functional contributions are needed. Here we describe methodology permitting fusion gene construction for functional evaluation. Using this strategy, we engineered the known fusion oncogenes, BCR-ABL1, EML4-ALK, and ETV6-NTRK3, as well as 20 previously uncharacterized fusion genes identified in The Cancer Genome Atlas datasets. In addition to confirming oncogenic activity of the known fusion oncogenes engineered by our construction strategy, we validated five novel fusion genes involving MET, NTRK2, and BRAF kinases that exhibited potent transforming activity and conferred sensitivity to FDA-approved kinase inhibitors. Our fusion construction strategy also enabled domain-function studies of BRAF fusion genes. Our results confirmed other reports that the transforming activity of BRAF fusions results from truncation-mediated loss of inhibitory domains within the N-terminus of the BRAF protein. BRAF mutations residing within this inhibitory region may provide a means for BRAF activation in cancer, therefore we leveraged the modular design of our fusion gene construction methodology to screen N-terminal domain mutations discovered in tumors that are wild-type at the BRAF mutation hotspot, V600. We identified an oncogenic mutation, F247L, whose expression robustly activated the MAPK pathway and sensitized cells to BRAF and MEK inhibitors. When applied broadly, these tools will facilitate rapid fusion gene construction for subsequent functional characterization and translation into personalized treatment strategies. Cancer Res; 77(13); 3502-12. ©2017 AACR.

  6. Novel linkage disequilibrium clustering algorithm identifies new lupus genes on meta-analysis of GWAS datasets.

    PubMed

    Saeed, Mohammad

    2017-05-01

    Systemic lupus erythematosus (SLE) is a complex disorder. Genetic association studies of complex disorders suffer from the following three major issues: phenotypic heterogeneity, false positive (type I error), and false negative (type II error) results. Hence, genes with low to moderate effects are missed in standard analyses, especially after statistical corrections. OASIS is a novel linkage disequilibrium clustering algorithm that can potentially address false positives and negatives in genome-wide association studies (GWAS) of complex disorders such as SLE. OASIS was applied to two SLE dbGAP GWAS datasets (6077 subjects; ∼0.75 million single-nucleotide polymorphisms). OASIS identified three known SLE genes viz. IFIH1, TNIP1, and CD44, not previously reported using these GWAS datasets. In addition, 22 novel loci for SLE were identified and the 5 SLE genes previously reported using these datasets were verified. OASIS methodology was validated using single-variant replication and gene-based analysis with GATES. This led to the verification of 60% of OASIS loci. New SLE genes that OASIS identified and were further verified include TNFAIP6, DNAJB3, TTF1, GRIN2B, MON2, LATS2, SNX6, RBFOX1, NCOA3, and CHAF1B. This study presents the OASIS algorithm, software, and the meta-analyses of two publicly available SLE GWAS datasets along with the novel SLE genes. Hence, OASIS is a novel linkage disequilibrium clustering method that can be universally applied to existing GWAS datasets for the identification of new genes.

  7. An unbiased approach to identify genes involved in development in a turtle with temperature-dependent sex determination.

    PubMed

    Chojnowski, Jena L; Braun, Edward L

    2012-07-15

    Many reptiles exhibit temperature-dependent sex determination (TSD). The initial cue in TSD is incubation temperature, unlike genotypic sex determination (GSD) where it is determined by the presence of specific alleles (or genetic loci). We used patterns of gene expression to identify candidates for genes with a role in TSD and other developmental processes without making a priori assumptions about the identity of these genes (ortholog-based approach). We identified genes with sexually dimorphic mRNA accumulation during the temperature sensitive period of development in the Red-eared slider turtle (Trachemys scripta), a turtle with TSD. Genes with differential mRNA accumulation in response to estrogen (estradiol-17β; E(2)) exposure and developmental stages were also identified. Sequencing 767 clones from three suppression-subtractive hybridization libraries yielded a total of 581 unique sequences. Screening a macroarray with a subset of those sequences revealed a total of 26 genes that exhibited differential mRNA accumulation: 16 female biased and 10 male biased. Additional analyses revealed that C16ORF62 (an unknown gene) and MALAT1 (a long noncoding RNA) exhibited increased mRNA accumulation at the male producing temperature relative to the female producing temperature during embryonic sexual development. Finally, we identified four genes (C16ORF62, CCT3, MMP2, and NFIB) that exhibited a stage effect and five genes (C16ORF62, CCT3, MMP2, NFIB and NOTCH2) showed a response to E(2) exposure. Here we report a survey of genes identified using patterns of mRNA accumulation during embryonic development in a turtle with TSD. Many previous studies have focused on examining the turtle orthologs of genes involved in mammalian development. Although valuable, the limitations of this approach are exemplified by our identification of two genes (MALAT1 and C16ORF62) that are sexually dimorphic during embryonic development. 
    MALAT1 is a noncoding RNA that has not been implicated

  8. An ant colony optimization based algorithm for identifying gene regulatory elements.

    PubMed

    Liu, Wei; Chen, Hanwu; Chen, Ling

    2013-08-01

    It is one of the most important tasks in bioinformatics to identify the regulatory elements in gene sequences. Most of the existing algorithms for identifying regulatory elements are inclined to converge into a local optimum, and have high time complexity. Ant Colony Optimization (ACO) is a meta-heuristic method based on swarm intelligence and is derived from a model inspired by the collective foraging behavior of real ants. Taking advantage of the ACO in traits such as self-organization and robustness, this paper designs and implements an ACO based algorithm named ACRI (ant-colony-regulatory-identification) for identifying all possible binding sites of transcription factor from the upstream of co-expressed genes. To accelerate the ants' searching process, a strategy of local optimization is presented to adjust the ants' start positions on the searched sequences. By exploiting the powerful optimization ability of ACO, the algorithm ACRI can not only improve precision of the results, but also achieve a very high speed. Experimental results on real world datasets show that ACRI can outperform other traditional algorithms in the respects of speed and quality of solutions. Copyright © 2013 Elsevier Ltd. All rights reserved.

  9. Similarity of markers identified from cancer gene expression studies: observations from GEO.

    PubMed

    Shi, Xingjie; Shen, Shihao; Liu, Jin; Huang, Jian; Zhou, Yong; Ma, Shuangge

    2014-09-01

    Gene expression profiling has been extensively conducted in cancer research. The analysis of multiple independent cancer gene expression datasets may provide additional information and complement single-dataset analysis. In this study, we conduct multi-dataset analysis and are interested in evaluating the similarity of cancer-associated genes identified from different datasets. The first objective of this study is to briefly review some statistical methods that can be used for such evaluation. Both marginal analysis and joint analysis methods are reviewed. The second objective is to apply those methods to 26 Gene Expression Omnibus (GEO) datasets on five types of cancers. Our analysis suggests that for the same cancer, the marker identification results may vary significantly across datasets, and different datasets share few common genes. In addition, datasets on different cancers share few common genes. The shared genetic basis of datasets on the same or different cancers, which has been suggested in the literature, is not observed in the analysis of GEO data. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
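
    The notion that "different datasets share few common genes" can be made concrete with a simple overlap measure. The sketch below is not one of the marginal or joint analysis methods reviewed in the study; it assumes each dataset has already been reduced to a set of marker gene symbols and uses the Jaccard index as a stand-in similarity score. The dataset names and gene lists are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |a & b| / |a | b| of two gene sets (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(markers):
    """Return {(name1, name2): Jaccard index} for every pair of datasets."""
    return {
        (n1, n2): jaccard(markers[n1], markers[n2])
        for n1, n2 in combinations(sorted(markers), 2)
    }


# Hypothetical marker lists; the gene symbols are for illustration only.
markers = {
    "dataset_A": {"TP53", "KRAS", "MYC", "EGFR"},
    "dataset_B": {"TP53", "BRCA1", "PTEN", "EGFR"},
    "dataset_C": {"CDK4", "RB1"},
}
overlap = pairwise_overlap(markers)
for pair, score in overlap.items():
    print(pair, round(score, 3))
```

    A value near 0 for most pairs would correspond to the low similarity the abstract reports; the index alone says nothing about statistical significance.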

  10. Gene Network Construction from Microarray Data Identifies a Key Network Module and Several Candidate Hub Genes in Age-Associated Spatial Learning Impairment

    PubMed Central

    Uddin, Raihan; Singh, Shiva M.

    2017-01-01

    As humans age many suffer from a decrease in normal brain functions including spatial learning impairments. This study aimed to better understand the molecular mechanisms in age-associated spatial learning impairment (ASLI). We used a mathematical modeling approach implemented in Weighted Gene Co-expression Network Analysis (WGCNA) to create and compare gene network models of young (learning unimpaired) and aged (predominantly learning impaired) brains from a set of exploratory datasets in rats in the context of ASLI. The major goal was to overcome some of the limitations previously observed in the traditional meta- and pathway analysis using these data, and identify novel ASLI related genes and their networks based on co-expression relationship of genes. This analysis identified a set of network modules in the young, each of which is highly enriched with genes functioning in broad but distinct GO functional categories or biological pathways. Interestingly, the analysis pointed to a single module that was highly enriched with genes functioning in “learning and memory” related functions and pathways. Subsequent differential network analysis of this “learning and memory” module in the aged (predominantly learning impaired) rats compared to the young learning unimpaired rats allowed us to identify a set of novel ASLI candidate hub genes. Some of these genes show significant repeatability in networks generated from independent young and aged validation datasets. These hub genes are highly co-expressed with other genes in the network, which not only show differential expression but also differential co-expression and differential connectivity across age and learning impairment. The known functions of these hub genes indicate that they play key roles in critical pathways, including kinase and phosphatase signaling, in functions related to various ion channels, and in maintaining neuronal integrity relating to synaptic plasticity and memory formation. Taken

  11. Gene Network Construction from Microarray Data Identifies a Key Network Module and Several Candidate Hub Genes in Age-Associated Spatial Learning Impairment.

    PubMed

    Uddin, Raihan; Singh, Shiva M

    2017-01-01

    As humans age many suffer from a decrease in normal brain functions including spatial learning impairments. This study aimed to better understand the molecular mechanisms in age-associated spatial learning impairment (ASLI). We used a mathematical modeling approach implemented in Weighted Gene Co-expression Network Analysis (WGCNA) to create and compare gene network models of young (learning unimpaired) and aged (predominantly learning impaired) brains from a set of exploratory datasets in rats in the context of ASLI. The major goal was to overcome some of the limitations previously observed in the traditional meta- and pathway analysis using these data, and identify novel ASLI related genes and their networks based on co-expression relationship of genes. This analysis identified a set of network modules in the young, each of which is highly enriched with genes functioning in broad but distinct GO functional categories or biological pathways. Interestingly, the analysis pointed to a single module that was highly enriched with genes functioning in "learning and memory" related functions and pathways. Subsequent differential network analysis of this "learning and memory" module in the aged (predominantly learning impaired) rats compared to the young learning unimpaired rats allowed us to identify a set of novel ASLI candidate hub genes. Some of these genes show significant repeatability in networks generated from independent young and aged validation datasets. These hub genes are highly co-expressed with other genes in the network, which not only show differential expression but also differential co-expression and differential connectivity across age and learning impairment. The known functions of these hub genes indicate that they play key roles in critical pathways, including kinase and phosphatase signaling, in functions related to various ion channels, and in maintaining neuronal integrity relating to synaptic plasticity and memory formation. 
    Taken together, they

  12. Use of an activated beta-catenin to identify Wnt pathway target genes in Caenorhabditis elegans, including a subset of collagen genes expressed in late larval development.

    PubMed

    Jackson, Belinda M; Abete-Luzi, Patricia; Krause, Michael W; Eisenmann, David M

    2014-04-16

    The Wnt signaling pathway plays a fundamental role during metazoan development, where it regulates diverse processes, including cell fate specification, cell migration, and stem cell renewal. Activation of the beta-catenin-dependent/canonical Wnt pathway up-regulates expression of Wnt target genes to mediate a cellular response. In the nematode Caenorhabditis elegans, a canonical Wnt signaling pathway regulates several processes during larval development; however, few target genes of this pathway have been identified. To address this deficit, we used a novel approach of conditionally activated Wnt signaling during a defined stage of larval life by overexpressing an activated beta-catenin protein, then used microarray analysis to identify genes showing altered expression compared with control animals. We identified 166 differentially expressed genes, of which 104 were up-regulated. A subset of the up-regulated genes was shown to have altered expression in mutants with decreased or increased Wnt signaling; we consider these genes to be bona fide C. elegans Wnt pathway targets. Among these was a group of six genes, including the cuticular collagen genes bli-1, col-38, col-49, and col-71. These genes show a peak of expression in the mid L4 stage during normal development, suggesting a role in adult cuticle formation. Consistent with this finding, reduction of function for several of the genes causes phenotypes suggestive of defects in cuticle function or integrity. Therefore, this work has identified a large number of putative Wnt pathway target genes during larval life, including a small subset of Wnt-regulated collagen genes that may function in synthesis of the adult cuticle.

  13. Genomic characterization of biliary tract cancers identifies driver genes and predisposing mutations.

    PubMed

    Wardell, Christopher P; Fujita, Masashi; Yamada, Toru; Simbolo, Michele; Fassan, Matteo; Karlic, Rosa; Polak, Paz; Kim, Jaegil; Hatanaka, Yutaka; Maejima, Kazuhiro; Lawlor, Rita T; Nakanishi, Yoshitsugu; Mitsuhashi, Tomoko; Fujimoto, Akihiro; Furuta, Mayuko; Ruzzenente, Andrea; Conci, Simone; Oosawa, Ayako; Sasaki-Oku, Aya; Nakano, Kaoru; Tanaka, Hiroko; Yamamoto, Yujiro; Michiaki, Kubo; Kawakami, Yoshiiku; Aikata, Hiroshi; Ueno, Masaki; Hayami, Shinya; Gotoh, Kunihito; Ariizumi, Shun-Ichi; Yamamoto, Masakazu; Yamaue, Hiroki; Chayama, Kazuaki; Miyano, Satoru; Getz, Gad; Scarpa, Aldo; Hirano, Satoshi; Nakamura, Toru; Nakagawa, Hidewaki

    2018-05-01

    Biliary tract cancers (BTCs) are clinically and pathologically heterogeneous and respond poorly to treatment. Genomic profiling can offer a clearer understanding of their carcinogenesis, classification and treatment strategy. We performed large-scale genome sequencing analyses on BTCs to investigate their somatic and germline driver events and characterize their genomic landscape. We analyzed 412 BTC samples from Japanese and Italian populations, 107 by whole-exome sequencing (WES), 39 by whole-genome sequencing (WGS), and a further 266 samples by targeted sequencing. The subtypes were 136 intrahepatic cholangiocarcinomas (ICCs), 101 distal cholangiocarcinomas (DCCs), 109 peri-hilar type cholangiocarcinomas (PHCs), and 66 gallbladder or cystic duct cancers (GBCs/CDCs). We identified somatic alterations and searched for driver genes in BTCs, finding pathogenic germline variants of cancer-predisposing genes. We predicted cell-of-origin for BTCs by combining somatic mutation patterns and epigenetic features. We identified 32 significantly and commonly mutated genes including TP53, KRAS, SMAD4, NF1, ARID1A, PBRM1, and ATR, some of which negatively affected patient prognosis. A novel deletion of MUC17 at 7q22.1 affected patient prognosis. Cell-of-origin predictions using WGS and epigenetic features suggest hepatocyte-origin of hepatitis-related ICCs. Deleterious germline mutations of cancer-predisposing genes such as BRCA1, BRCA2, RAD51D, MLH1, or MSH2 were detected in 11% (16/146) of BTC patients. BTCs have distinct genetic features including somatic events and germline predisposition. These findings could be useful to establish treatment and diagnostic strategies for BTCs based on genetic information. We here analyzed genomic features of 412 BTC samples from Japanese and Italian populations. A total of 32 significantly and commonly mutated genes were identified, some of which negatively affected patient prognosis, including a novel deletion of MUC17 at 7q22.1. Cell

  14. Identifying marker genes in transcription profiling data using a mixture of feature relevance experts.

    PubMed

    Chow, M L; Moler, E J; Mian, I S

    2001-03-08

    Transcription profiling experiments permit the expression levels of many genes to be measured simultaneously. Given profiling data from two types of samples, genes that most distinguish the samples (marker genes) are good candidates for subsequent in-depth experimental studies and developing decision support systems for diagnosis, prognosis, and monitoring. This work proposes a mixture of feature relevance experts as a method for identifying marker genes and illustrates the idea using published data from samples labeled as acute lymphoblastic and myeloid leukemia (ALL, AML). A feature relevance expert implements an algorithm that calculates how well a gene distinguishes samples, reorders genes according to this relevance measure, and uses a supervised learning method [here, support vector machines (SVMs)] to determine the generalization performances of different nested gene subsets. The mixture of three feature relevance experts examined implement two existing and one novel feature relevance measures. For each expert, a gene subset consisting of the top 50 genes distinguished ALL from AML samples as completely as all 7,070 genes. The 125 genes at the union of the top 50s are plausible markers for a prototype decision support system. Chromosomal aberration and other data support the prediction that the three genes at the intersection of the top 50s, cystatin C, azurocidin, and adipsin, are good targets for investigating the basic biology of ALL/AML. The same data were employed to identify markers that distinguish samples based on their labels of T cell/B cell, peripheral blood/bone marrow, and male/female. Selenoprotein W may discriminate T cells from B cells. Results from analysis of transcription profiling data from tumor/nontumor colon adenocarcinoma samples support the general utility of the aforementioned approach. Theoretical issues such as choosing SVM kernels and their parameters, training and evaluating feature relevance experts, and the impact of
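
    The ranking step described above (score each gene by how well it separates two sample classes, keep the top k per expert, then take the union and intersection of the top-k sets) can be sketched as follows. This is a minimal illustration under stated assumptions: the two scoring functions are generic stand-ins rather than the three relevance measures used in the paper, the SVM evaluation of nested subsets is omitted, and the gene names and expression values are invented.

```python
import statistics


def mean_difference(a, b):
    """Absolute difference of class means (assumed relevance measure)."""
    return abs(statistics.mean(a) - statistics.mean(b))


def signal_to_noise(a, b):
    """Mean difference scaled by the summed class standard deviations (assumed measure)."""
    spread = statistics.stdev(a) + statistics.stdev(b)
    return abs(statistics.mean(a) - statistics.mean(b)) / spread if spread else 0.0


def top_k(data, score, k):
    """Return the k genes with the highest score; data maps gene -> (class1 values, class2 values)."""
    ranked = sorted(data, key=lambda gene: score(*data[gene]), reverse=True)
    return set(ranked[:k])


# Hypothetical expression values for two sample classes.
data = {
    "geneA": ([1.0, 1.2, 0.8], [5.0, 5.2, 4.8]),
    "geneB": ([1.0, 4.0, 7.0], [6.0, 9.0, 12.0]),
    "geneC": ([2.0, 2.1, 1.9], [3.0, 3.1, 2.9]),
    "geneD": ([1.0, 1.1, 0.9], [1.0, 1.2, 0.8]),
}

experts = [mean_difference, signal_to_noise]
tops = [top_k(data, expert, k=2) for expert in experts]
union = set.union(*tops)                # candidate markers for a decision support system
intersection = set.intersection(*tops)  # genes every expert agrees on
print(sorted(union), sorted(intersection))
```

    In the abstract's terms, the union plays the role of the 125 plausible markers and the intersection the role of the three genes common to all top-50 lists.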

  15. Integrative strategies to identify candidate genes in rodent models of human alcoholism.

    PubMed

    Treadwell, Julie A

    2006-01-01

    The search for genes underlying alcohol-related behaviours in rodent models of human alcoholism has been ongoing for many years with only limited success. Recently, new strategies that integrate several of the traditional approaches have provided new insights into the molecular mechanisms underlying ethanol's actions in the brain. We have used alcohol-preferring C57BL/6J (B6) and alcohol-avoiding DBA/2J (D2) genetic strains of mice in an integrative strategy combining high-throughput gene expression screening, genetic segregation analysis, and mapping to previously published quantitative trait loci to uncover candidate genes for the ethanol-preference phenotype. In our study, 2 genes, retinaldehyde binding protein 1 (Rlbp1) and syntaxin 12 (Stx12), were found to be strong candidates for ethanol preference. Such experimental approaches have the power and the potential to greatly speed up the laborious process of identifying candidate genes for the animal models of human alcoholism.

  16. Methods for identifying an essential gene in a prokaryotic microorganism

    DOEpatents

    Shizuya, Hiroaki

    2006-01-31

    Methods are provided for the rapid identification of essential or conditionally essential DNA segments in any species of haploid cell (one copy chromosome per cell) that is capable of being transformed by artificial means and is capable of undergoing DNA recombination. This system offers an enhanced means of identifying essential function genes in diploid pathogens, such as gram-negative and gram-positive bacteria.

  17. Large-Scale Gene-Centric Analysis Identifies Novel Variants for Coronary Artery Disease

    PubMed Central

    2011-01-01

    Coronary artery disease (CAD) has a significant genetic contribution that is incompletely characterized. To complement genome-wide association (GWA) studies, we conducted a large and systematic candidate gene study of CAD susceptibility, including analysis of many uncommon and functional variants. We examined 49,094 genetic variants in ∼2,100 genes of cardiovascular relevance, using a customised gene array in 15,596 CAD cases and 34,992 controls (11,202 cases and 30,733 controls of European descent; 4,394 cases and 4,259 controls of South Asian origin). We attempted to replicate putative novel associations in an additional 17,121 CAD cases and 40,473 controls. Potential mechanisms through which the novel variants could affect CAD risk were explored through association tests with vascular risk factors and gene expression. We confirmed associations of several previously known CAD susceptibility loci (eg, 9p21.3: p<10^-33; LPA: p<10^-19; 1p13.3: p<10^-17) as well as three recently discovered loci (COL4A1/COL4A2, ZC3HC1, CYP17A1: p<5×10^-7). However, we found essentially null results for most previously suggested CAD candidate genes. In our replication study of 24 promising common variants, we identified novel associations of variants in or near LIPA, IL5, TRIB1, and ABCG5/ABCG8, with per-allele odds ratios for CAD risk with each of the novel variants ranging from 1.06–1.09. Associations with variants at LIPA, TRIB1, and ABCG5/ABCG8 were supported by gene expression data or effects on lipid levels. Apart from the previously reported variants in LPA, none of the other ∼4,500 low frequency and functional variants showed a strong effect. Associations in South Asians did not differ appreciably from those in Europeans, except for 9p21.3 (per-allele odds ratio: 1.14 versus 1.27 respectively; P for heterogeneity = 0.003). This large-scale gene-centric analysis has identified several novel genes for CAD that relate to diverse biochemical and cellular functions and

  18. Large-scale gene-centric analysis identifies novel variants for coronary artery disease.

    PubMed

    2011-09-01

    Coronary artery disease (CAD) has a significant genetic contribution that is incompletely characterized. To complement genome-wide association (GWA) studies, we conducted a large and systematic candidate gene study of CAD susceptibility, including analysis of many uncommon and functional variants. We examined 49,094 genetic variants in ∼2,100 genes of cardiovascular relevance, using a customised gene array in 15,596 CAD cases and 34,992 controls (11,202 cases and 30,733 controls of European descent; 4,394 cases and 4,259 controls of South Asian origin). We attempted to replicate putative novel associations in an additional 17,121 CAD cases and 40,473 controls. Potential mechanisms through which the novel variants could affect CAD risk were explored through association tests with vascular risk factors and gene expression. We confirmed associations of several previously known CAD susceptibility loci (eg, 9p21.3:p<10(-33); LPA:p<10(-19); 1p13.3:p<10(-17)) as well as three recently discovered loci (COL4A1/COL4A2, ZC3HC1, CYP17A1:p<5×10(-7)). However, we found essentially null results for most previously suggested CAD candidate genes. In our replication study of 24 promising common variants, we identified novel associations of variants in or near LIPA, IL5, TRIB1, and ABCG5/ABCG8, with per-allele odds ratios for CAD risk with each of the novel variants ranging from 1.06-1.09. Associations with variants at LIPA, TRIB1, and ABCG5/ABCG8 were supported by gene expression data or effects on lipid levels. Apart from the previously reported variants in LPA, none of the other ∼4,500 low frequency and functional variants showed a strong effect. Associations in South Asians did not differ appreciably from those in Europeans, except for 9p21.3 (per-allele odds ratio: 1.14 versus 1.27 respectively; P for heterogeneity = 0.003). This large-scale gene-centric analysis has identified several novel genes for CAD that relate to diverse biochemical and cellular functions and

  19. Gene co-expression analysis identifies gene clusters associated with isotropic and polarized growth in Aspergillus fumigatus conidia.

    PubMed

    Baltussen, Tim J H; Coolen, Jordy P M; Zoll, Jan; Verweij, Paul E; Melchers, Willem J G

    2018-04-26

    Aspergillus fumigatus is a saprophytic fungus that extensively produces conidia. These microscopic asexually reproductive structures are small enough to reach the lungs. Germination of conidia followed by hyphal growth inside human lungs is a key step in the establishment of infection in immunocompromised patients. RNA-Seq was used to analyze the transcriptome of dormant and germinating A. fumigatus conidia. Construction of a gene co-expression network revealed four gene clusters (modules) correlated with a growth phase (dormant, isotropic growth, polarized growth). Transcript levels of genes encoding for secondary metabolites were high in dormant conidia. During isotropic growth, transcript levels of genes involved in cell wall modifications increased. Two modules encoding for growth and cell cycle/DNA processing were associated with polarized growth. In addition, the co-expression network was used to identify highly connected intermodular hub genes. These genes may have a pivotal role in the respective module and could therefore be compelling therapeutic targets. Generally, cell wall remodeling is an important process during isotropic and polarized growth, characterized by an increase of transcripts coding for hyphal growth and cell cycle/DNA processing when polarized growth is initiated. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  20. Targeted sequencing identifies 91 neurodevelopmental disorder risk genes with autism and developmental disability biases

    PubMed Central

    Stessman, Holly A. F.; Xiong, Bo; Coe, Bradley P.; Wang, Tianyun; Hoekzema, Kendra; Fenckova, Michaela; Kvarnung, Malin; Gerdts, Jennifer; Trinh, Sandy; Cosemans, Nele; Vives, Laura; Lin, Janice; Turner, Tychele N.; Santen, Gijs; Ruivenkamp, Claudia; Kriek, Marjolein; van Haeringen, Arie; Aten, Emmelien; Friend, Kathryn; Liebelt, Jan; Barnett, Christopher; Haan, Eric; Shaw, Marie; Gecz, Jozef; Anderlid, Britt-Marie; Nordgren, Ann; Lindstrand, Anna; Schwartz, Charles; Kooy, R. Frank; Vandeweyer, Geert; Helsmoortel, Celine; Romano, Corrado; Alberti, Antonino; Vinci, Mirella; Avola, Emanuela; Giusto, Stefania; Courchesne, Eric; Pramparo, Tiziano; Pierce, Karen; Nalabolu, Srinivasa; Amaral, David; Scheffer, Ingrid E.; Delatycki, Martin B.; Lockhart, Paul J.; Hormozdiari, Fereydoun; Harich, Benjamin; Castells-Nobau, Anna; Xia, Kun; Peeters, Hilde; Nordenskjöld, Magnus; Schenck, Annette; Bernier, Raphael A.; Eichler, Evan E.

    2017-01-01

    Gene-disruptive mutations contribute to the biology of neurodevelopmental disorders (NDDs), but most pathogenic genes are not known. We sequenced 208 candidate genes from >11,730 patients and >2,867 controls. We report 91 genes with an excess of de novo mutations or private disruptive mutations in 5.7% of patients, including 38 novel NDD genes. Drosophila functional assays of a subset bolster their involvement in NDDs. We identify 25 genes that show a bias for autism versus intellectual disability and highlight a network associated with high-functioning autism (FSIQ>100). Clinical follow-up for NAA15, KMT5B, and ASH1L reveals novel syndromic and non-syndromic forms of disease. PMID:28191889

  1. Integrative Analysis of DNA Methylation and Gene Expression Data Identifies EPAS1 as a Key Regulator of COPD

    PubMed Central

    Yoo, Seungyeul; Takikawa, Sachiko; Geraghty, Patrick; Argmann, Carmen; Campbell, Joshua; Lin, Luan; Huang, Tao; Tu, Zhidong; Feronjy, Robert; Spira, Avrum; Schadt, Eric E.; Powell, Charles A.; Zhu, Jun

    2015-01-01

    Chronic Obstructive Pulmonary Disease (COPD) is a complex disease. Genetic, epigenetic, and environmental factors are known to contribute to COPD risk and disease progression. Therefore, we developed a systematic approach to identify key regulators of COPD that integrates genome-wide DNA methylation, gene expression, and phenotype data in lung tissue from COPD and control samples. Our integrative analysis identified 126 key regulators of COPD. We identified EPAS1 as the only key regulator whose downstream genes significantly overlapped with multiple gene sets associated with COPD disease severity. EPAS1 is distinct in comparison with other key regulators in terms of methylation profile and downstream target genes. Genes predicted to be regulated by EPAS1 were enriched for biological processes including signaling, cell communications, and system development. We confirmed that EPAS1 protein levels are lower in human COPD lung tissue compared to non-disease controls and that Epas1 gene expression is reduced in mice chronically exposed to cigarette smoke. As EPAS1 downstream genes were significantly enriched for hypoxia responsive genes in endothelial cells, we tested EPAS1 function in human endothelial cells. EPAS1 knockdown by siRNA in endothelial cells impacted genes that significantly overlapped with EPAS1 downstream genes in lung tissue including hypoxia responsive genes, and genes associated with emphysema severity. Our first integrative analysis of genome-wide DNA methylation and gene expression profiles illustrates that not only does DNA methylation play a ‘causal’ role in the molecular pathophysiology of COPD, but it can be leveraged to directly identify novel key mediators of this pathophysiology. PMID:25569234

  2. Co-expression network analysis identified six hub genes in association with metastasis risk and prognosis in hepatocellular carcinoma

    PubMed Central

    Feng, Juerong; Zhou, Rui; Chang, Ying; Liu, Jing; Zhao, Qiu

    2017-01-01

    Hepatocellular carcinoma (HCC) has a high incidence and mortality worldwide, and its carcinogenesis and progression are influenced by a complex network of gene interactions. A weighted gene co-expression network was constructed to identify gene modules associated with the clinical traits in HCC (n = 214). Among the 13 modules, high correlation was only found between the red module and metastasis risk (classified by the HCC metastasis gene signature) (R2 = −0.74). Moreover, in the red module, 34 network hub genes for metastasis risk were identified, six of which (ABAT, AGXT, ALDH6A1, CYP4A11, DAO and EHHADH) were also hub nodes in the protein-protein interaction network of the module genes. Thus, a total of six hub genes were identified. In validation, all hub genes showed a negative correlation with the four-stage HCC progression (P for trend < 0.05) in the test set. Furthermore, in the training set, HCC samples with any hub gene lowly expressed demonstrated a higher recurrence rate and poorer survival rate (hazard ratios with 95% confidence intervals > 1). RNA-sequencing data of 142 HCC samples showed consistent results in the prognosis. Gene set enrichment analysis (GSEA) demonstrated that in the samples with any hub gene highly expressed, a total of 24 functional gene sets were enriched, most of which focused on amino acid metabolism and oxidation. In conclusion, co-expression network analysis identified six hub genes in association with HCC metastasis risk and prognosis, which might improve the prognosis by influencing amino acid metabolism and oxidation. PMID:28430663
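    The weighted co-expression network and hub genes described above can be illustrated with a minimal sketch. It assumes the usual unsigned weighted-network convention, adjacency a_ij = |cor(x_i, x_j)|^beta, and ranks genes by connectivity (the sum of their adjacencies). The value beta = 6, the gene names and the expression values are all hypothetical; the abstract does not report the parameters the authors used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def connectivity(expr, beta=6):
    """Sum of soft-thresholded adjacencies |cor|^beta for each gene."""
    genes = list(expr)
    return {
        g: sum(abs(pearson(expr[g], expr[h])) ** beta for h in genes if h != g)
        for g in genes
    }


# Toy expression profiles across five samples (illustrative values only).
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 4.1, 2.9, 2.0, 1.1],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}
k = connectivity(toy)
hub = max(k, key=k.get)
print(hub, {g: round(v, 3) for g, v in k.items()})
```

    Raising the absolute correlation to a power suppresses weak correlations, so a gene such as geneD, which is only loosely correlated with the others, ends up with near-zero connectivity and is not a hub candidate.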

  3. Integrative analysis of DNA methylation and gene expression data identifies EPAS1 as a key regulator of COPD.

    PubMed

    Yoo, Seungyeul; Takikawa, Sachiko; Geraghty, Patrick; Argmann, Carmen; Campbell, Joshua; Lin, Luan; Huang, Tao; Tu, Zhidong; Foronjy, Robert F; Spira, Avrum; Schadt, Eric E; Powell, Charles A; Zhu, Jun

    2015-01-01

    Chronic Obstructive Pulmonary Disease (COPD) is a complex disease. Genetic, epigenetic, and environmental factors are known to contribute to COPD risk and disease progression. Therefore we developed a systematic approach to identify key regulators of COPD that integrates genome-wide DNA methylation, gene expression, and phenotype data in lung tissue from COPD and control samples. Our integrative analysis identified 126 key regulators of COPD. We identified EPAS1 as the only key regulator whose downstream genes significantly overlapped with multiple genes sets associated with COPD disease severity. EPAS1 is distinct in comparison with other key regulators in terms of methylation profile and downstream target genes. Genes predicted to be regulated by EPAS1 were enriched for biological processes including signaling, cell communications, and system development. We confirmed that EPAS1 protein levels are lower in human COPD lung tissue compared to non-disease controls and that Epas1 gene expression is reduced in mice chronically exposed to cigarette smoke. As EPAS1 downstream genes were significantly enriched for hypoxia responsive genes in endothelial cells, we tested EPAS1 function in human endothelial cells. EPAS1 knockdown by siRNA in endothelial cells impacted genes that significantly overlapped with EPAS1 downstream genes in lung tissue including hypoxia responsive genes, and genes associated with emphysema severity. Our first integrative analysis of genome-wide DNA methylation and gene expression profiles illustrates that not only does DNA methylation play a 'causal' role in the molecular pathophysiology of COPD, but it can be leveraged to directly identify novel key mediators of this pathophysiology.

  4. Microarray analysis identified Puccinia striiformis f. sp. tritici genes involved in infection and sporulation.

    USDA-ARS's Scientific Manuscript database

    Puccinia striiformis f. sp. tritici (Pst) causes stripe rust, one of the most important diseases of wheat worldwide. To identify Pst genes involved in infection and sporulation, a custom oligonucleotide Genechip was made using sequences of 442 genes selected from Pst cDNA libraries. Microarray analy...

  5. Digital gene expression profiling of flax (Linum usitatissimum L.) stem peel identifies genes enriched in fiber-bearing phloem tissue.

    PubMed

    Guo, Yuan; Qiu, Caisheng; Long, Songhua; Chen, Ping; Hao, Dongmei; Preisner, Marta; Wang, Hui; Wang, Yufu

    2017-08-30

    To better understand the molecular mechanisms and gene expression characteristics associated with development of bast fiber cells within flax stem phloem, the gene expression profiles of flax stem peels and leaves were screened using Illumina's Digital Gene Expression (DGE) analysis. Four DGE libraries (2 for stem peel and 2 for leaf), ranging from 6.7 to 9.2 million clean reads, were obtained, which produced 7.0 million and 6.8 million mapped reads for flax stem peel and leaf, respectively. By differential gene expression analysis, a total of 975 genes, of which 708 (73%) have protein-coding annotation, were identified as phloem-enriched genes putatively involved in the processes of polysaccharide and cell wall metabolism. Differentially expressed genes (DEGs) were validated using quantitative RT-PCR; the expression patterns of all nine genes determined by qRT-PCR fitted well with those obtained by sequencing analysis. Cluster and Gene Ontology (GO) analysis revealed that a large number of genes related to metabolic process, catalytic activity and binding category were expressed predominantly in the stem peels. The Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the phloem-enriched genes suggested approximately 111 biological pathways. The large number of genes and pathways produced from DGE sequencing will expand our understanding of the complex molecular and cellular events in flax bast fiber development and provide a foundation for future studies on fiber development in other bast fiber crops. Copyright © 2017 Elsevier B.V. All rights reserved.

  6. Gene expression profiles of breast biopsies from healthy women identify a group with claudin-low features

    PubMed Central

    2011-01-01

    Background: Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Methods: Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published genelists and independent datasets. Results: Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. Conclusion: This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts and identified distinct subtypes of normal breast tissue. Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast

  7. Gene expression profiles of breast biopsies from healthy women identify a group with claudin-low features.

    PubMed

    Haakensen, Vilde D; Lingjaerde, Ole Christian; Lüders, Torben; Riis, Margit; Prat, Aleix; Troester, Melissa A; Holmen, Marit M; Frantzen, Jan Ole; Romundstad, Linda; Navjord, Dina; Bukholm, Ida K; Johannesen, Tom B; Perou, Charles M; Ursin, Giske; Kristensen, Vessela N; Børresen-Dale, Anne-Lise; Helland, Aslaug

    2011-11-01

    Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published genelists and independent datasets. Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts and identified distinct subtypes of normal breast tissue. Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer.

  8. Suppressors of systemin signaling identify genes in the tomato wound response pathway.

    PubMed Central

    Howe, G A; Ryan, C A

    1999-01-01

    In tomato plants, systemic induction of defense genes in response to herbivory or mechanical wounding is regulated by an 18-amino-acid peptide signal called systemin. Transgenic plants that overexpress prosystemin, the systemin precursor, from a 35S::prosystemin (35S::prosys) transgene exhibit constitutive expression of wound-inducible defense proteins including proteinase inhibitors and polyphenol oxidase. To study further the role of (pro)systemin in the wound response pathway, we isolated and characterized mutations that suppress 35S::prosys-mediated phenotypes. Ten recessive, extragenic suppressors were identified. Two of these define new alleles of def-1, a previously identified mutation that blocks both wound- and systemin-induced gene expression and renders plants susceptible to herbivory. The remaining mutants defined four loci designated Spr-1, Spr-2, Spr-3, and Spr-4 (for Suppressed in 35S::prosystemin-mediated responses). spr-3 and spr-4 mutants were not significantly affected in their response to either systemin or mechanical wounding. In contrast, spr-1 and spr-2 plants lacked systemic wound responses and were insensitive to systemin. These results confirm the function of (pro)systemin in the transduction of systemic wound signals and further establish that wounding, systemin, and 35S::prosys induce defensive gene expression through a common signaling pathway defined by at least three genes (Def-1, Spr-1, and Spr-2). PMID:10545469

  9. Integrated microarray and ChIP analysis identifies multiple Foxa2 dependent target genes in the notochord.

    PubMed

    Tamplin, Owen J; Cox, Brian J; Rossant, Janet

    2011-12-15

    The node and notochord are key tissues required for patterning of the vertebrate body plan. Understanding the gene regulatory network that drives their formation and function is therefore important. Foxa2 is a key transcription factor at the top of this genetic hierarchy and finding its targets will help us to better understand node and notochord development. We performed an extensive microarray-based gene expression screen using sorted embryonic notochord cells to identify early notochord-enriched genes. We validated their specificity to the node and notochord by whole mount in situ hybridization. This provides the largest available resource of notochord-expressed genes, and therefore candidate Foxa2 target genes in the notochord. Using existing Foxa2 ChIP-seq data from adult liver, we were able to identify a set of genes expressed in the notochord that had associated regions of Foxa2-bound chromatin. Given that Foxa2 is a pioneer transcription factor, we reasoned that these sites might represent notochord-specific enhancers. Candidate Foxa2-bound regions were tested for notochord specific enhancer function in a zebrafish reporter assay and 7 novel notochord enhancers were identified. Importantly, sequence conservation or predictive models could not have readily identified these regions. Mutation of putative Foxa2 binding elements in two of these novel enhancers abrogated reporter expression and confirmed their Foxa2 dependence. The combination of highly specific gene expression profiling and genome-wide ChIP analysis is a powerful means of understanding developmental pathways, even for small cell populations such as the notochord. Copyright © 2011 Elsevier Inc. All rights reserved.

  10. Systems Biology-Based Investigation of Cellular Antiviral Drug Targets Identified by Gene-Trap Insertional Mutagenesis.

    PubMed

    Cheng, Feixiong; Murray, James L; Zhao, Junfei; Sheng, Jinsong; Zhao, Zhongming; Rubin, Donald H

    2016-09-01

    Viruses require host cellular factors for successful replication. A comprehensive systems-level investigation of the virus-host interactome is critical for understanding the roles of host factors with the end goal of discovering new druggable antiviral targets. Gene-trap insertional mutagenesis is a high-throughput forward genetics approach to randomly disrupt (trap) host genes and discover host genes that are essential for viral replication, but not for host cell survival. In this study, we used libraries of randomly mutagenized cells to discover cellular genes that are essential for the replication of 10 distinct cytotoxic mammalian viruses, 1 gram-negative bacterium, and 5 toxins. We herein reported 712 candidate cellular genes, characterizing distinct topological network and evolutionary signatures, and occupying central hubs in the human interactome. Cell cycle phase-specific network analysis showed that host cell cycle programs played critical roles during viral replication (e.g. MYC and TAF4 regulating G0/1 phase). Moreover, the viral perturbation of host cellular networks reflected disease etiology in that host genes (e.g. CTCF, RHOA, and CDKN1B) identified were frequently essential and significantly associated with Mendelian and orphan diseases, or somatic mutations in cancer. Computational drug repositioning framework via incorporating drug-gene signatures from the Connectivity Map into the virus-host interactome identified 110 putative druggable antiviral targets and prioritized several existing drugs (e.g. ajmaline) that may be potential for antiviral indication (e.g. anti-Ebola). In summary, this work provides a powerful methodology with a tight integration of gene-trap insertional mutagenesis testing and systems biology to identify new antiviral targets and drugs for the development of broadly acting and targeted clinical antiviral therapeutics.

  11. Alu Elements as Novel Regulators of Gene Expression in Type 1 Diabetes Susceptibility Genes?

    PubMed

    Kaur, Simranjeet; Pociot, Flemming

    2015-07-13

    Despite numerous studies implicating Alu repeat elements in various diseases, there is sparse information available with respect to the potential functional and biological roles of the repeat elements in Type 1 diabetes (T1D). Therefore, we performed a genome-wide sequence analysis of T1D candidate genes to identify embedded Alu elements within these genes. We observed significant enrichment of Alu elements within the T1D genes (p-value < 10e-16), which highlights their importance in T1D. Functional annotation of T1D genes harboring Alus revealed significant enrichment for immune-mediated processes (p-value < 10e-6). We also identified eight T1D genes harboring inverted Alus (IRAlus) within their 3' untranslated regions (UTRs) that are known to regulate the expression of host mRNAs by generating double stranded RNA duplexes. Our in silico analysis predicted the formation of duplex structures by IRAlus within the 3'UTRs of T1D genes. We propose that IRAlus might be involved in regulating the expression levels of the host T1D genes.

  12. A Genome-wide CRISPR Screen in Toxoplasma Identifies Essential Apicomplexan Genes.

    PubMed

    Sidik, Saima M; Huet, Diego; Ganesan, Suresh M; Huynh, My-Hang; Wang, Tim; Nasamu, Armiyaw S; Thiru, Prathapan; Saeij, Jeroen P J; Carruthers, Vern B; Niles, Jacquin C; Lourido, Sebastian

    2016-09-08

    Apicomplexan parasites are leading causes of human and livestock diseases such as malaria and toxoplasmosis, yet most of their genes remain uncharacterized. Here, we present the first genome-wide genetic screen of an apicomplexan. We adapted CRISPR/Cas9 to assess the contribution of each gene from the parasite Toxoplasma gondii during infection of human fibroblasts. Our analysis defines ∼200 previously uncharacterized, fitness-conferring genes unique to the phylum, from which 16 were investigated, revealing essential functions during infection of human cells. Secondary screens identify as an invasion factor the claudin-like apicomplexan microneme protein (CLAMP), which resembles mammalian tight-junction proteins and localizes to secretory organelles, making it critical to the initiation of infection. CLAMP is present throughout sequenced apicomplexan genomes and is essential during the asexual stages of the malaria parasite Plasmodium falciparum. These results provide broad-based functional information on T. gondii genes and will facilitate future approaches to expand the horizon of antiparasitic interventions. Copyright © 2016 Elsevier Inc. All rights reserved.

  13. Genetic regulation of gene expression in the lung identifies CST3 and CD22 as potential causal genes for airflow obstruction.

    PubMed

    Lamontagne, Maxime; Timens, Wim; Hao, Ke; Bossé, Yohan; Laviolette, Michel; Steiling, Katrina; Campbell, Joshua D; Couture, Christian; Conti, Massimo; Sherwood, Karen; Hogg, James C; Brandsma, Corry-Anke; van den Berge, Maarten; Sandford, Andrew; Lam, Stephen; Lenburg, Marc E; Spira, Avrum; Paré, Peter D; Nickle, David; Sin, Don D; Postma, Dirkje S

    2014-11-01

    COPD is a complex chronic disease with poorly understood pathogenesis. Integrative genomic approaches have the potential to elucidate the biological networks underlying COPD and lung function. We recently combined genome-wide genotyping and gene expression in 1111 human lung specimens to map expression quantitative trait loci (eQTL). To determine causal associations between COPD and lung function-associated single nucleotide polymorphisms (SNPs) and lung tissue gene expression changes in our lung eQTL dataset. We evaluated causality between SNPs and gene expression for three COPD phenotypes: FEV(1)% predicted, FEV(1)/FVC and COPD as a categorical variable. Different models were assessed in the three cohorts independently and in a meta-analysis. SNPs associated with a COPD phenotype and gene expression were subjected to causal pathway modelling and manual curation. In silico analyses evaluated functional enrichment of biological pathways among newly identified causal genes. Biologically relevant causal genes were validated in two separate gene expression datasets of lung tissues and bronchial airway brushings. High reliability causal relations were found in SNP-mRNA-phenotype triplets for FEV(1)% predicted (n=169) and FEV(1)/FVC (n=80). Several genes of potential biological relevance for COPD were revealed. eQTL-SNPs upregulating cystatin C (CST3) and CD22 were associated with worse lung function. Signalling pathways enriched with causal genes included xenobiotic metabolism, apoptosis, protease-antiprotease and oxidant-antioxidant balance. By using integrative genomics and analysing the relationships of COPD phenotypes with SNPs and gene expression in lung tissue, we identified CST3 and CD22 as potential causal genes for airflow obstruction. This study also augmented the understanding of previously described COPD pathways. Published by the BMJ Publishing Group Limited. 

  14. A functional screen for copper homeostasis genes identifies a pharmacologically tractable cellular system

    PubMed Central

    2014-01-01

    Background: Copper is essential for the survival of aerobic organisms. If copper is not properly regulated in the body, however, it can be extremely cytotoxic, and genetic mutations that compromise copper homeostasis result in severe clinical phenotypes. Understanding how cells maintain optimal copper levels is therefore highly relevant to human health. Results: We found that addition of copper (Cu) to culture medium leads to increased respiratory growth of yeast, a phenotype which we then systematically and quantitatively measured in 5050 homozygous diploid deletion strains. Cu’s positive effect on respiratory growth was quantitatively reduced in deletion strains representing 73 different genes, the functions of which identify increased iron uptake as a cause of the increase in growth rate. Conversely, these effects were enhanced in strains representing 93 genes. Many of these strains exhibited respiratory defects that were specifically rescued by supplementing the growth medium with Cu. Among the genes identified are known and direct regulators of copper homeostasis, genes required to maintain low vacuolar pH, and genes where evidence supporting a functional link with Cu has been heretofore lacking. Roughly half of the genes are conserved in man, and several of these are associated with Mendelian disorders, including the Cu-imbalance syndromes Menkes and Wilson’s disease. We additionally demonstrate that pharmacological agents, including the approved drug disulfiram, can rescue Cu-deficiencies of both environmental and genetic origin. Conclusions: A functional screen in yeast has expanded the list of genes required for Cu-dependent fitness, revealing a complex cellular system with implications for human health. Respiratory fitness defects arising from perturbations in this system can be corrected with pharmacological agents that increase intracellular copper concentrations. PMID:24708151

  15. A novel approach to identify genes that determine grain protein deviation in cereals.

    PubMed

    Mosleth, Ellen F; Wan, Yongfang; Lysenko, Artem; Chope, Gemma A; Penson, Simon P; Shewry, Peter R; Hawkesford, Malcolm J

    2015-06-01

    Grain yield and protein content were determined for six wheat cultivars grown over 3 years at multiple sites and at multiple nitrogen (N) fertilizer inputs. Although grain protein content was negatively correlated with yield, some grain samples had higher protein contents than expected based on their yields, a trait referred to as grain protein deviation (GPD). We used novel statistical approaches to identify gene transcripts significantly related to GPD across environments. The yield and protein content were initially adjusted for nitrogen fertilizer inputs and then adjusted for yield (to remove the negative correlation with protein content), resulting in a parameter termed corrected GPD. Significant genetic variation in corrected GPD was observed for six cultivars grown over a range of environmental conditions (a total of 584 samples). Gene transcript profiles were determined in a subset of 161 samples of developing grain to identify transcripts contributing to GPD. Principal component analysis (PCA), analysis of variance (ANOVA) and means of scores regression (MSR) were used to identify individual principal components (PCs) correlating with GPD alone. Scores of the selected PCs, which were significantly related to GPD and protein content but not to the yield and significantly affected by cultivar, were identified as reflecting a multivariate pattern of gene expression related to genetic variation in GPD. Transcripts with consistent variation along the selected PCs were identified by an approach hereby called one-block means of scores regression (one-block MSR). © 2014 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
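    The idea of a protein content that is "higher than expected based on yield" can be sketched as a regression residual. The example below fits an ordinary least-squares line of protein on yield and reports each sample's deviation from it. This is a simplified illustration only: the authors' corrected GPD also adjusts for nitrogen fertilizer inputs, and the function names and data values here are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx


def protein_deviation(yields, proteins):
    """Residual protein content after removing the linear yield effect."""
    slope, intercept = fit_line(yields, proteins)
    return [p - (intercept + slope * y) for y, p in zip(yields, proteins)]


# Made-up yields (t/ha) and grain protein contents (%), for illustration.
yields = [6.0, 7.0, 8.0, 9.0, 10.0]
proteins = [13.0, 12.6, 11.5, 11.4, 10.5]

slope, intercept = fit_line(yields, proteins)
gpd = protein_deviation(yields, proteins)
print(f"slope = {slope:.2f}")
print([round(d, 2) for d in gpd])
```

    A negative slope reflects the negative yield-protein correlation noted in the abstract; samples with positive residuals carry more protein than their yield alone would predict.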

  16. Immunogenetic mechanisms leading to thyroid autoimmunity: recent advances in identifying susceptibility genes and regions.

    PubMed

    Brand, Oliver J; Gough, Stephen C L

    2011-12-01

    The autoimmune thyroid diseases (AITD) include Graves' disease (GD) and Hashimoto's thyroiditis (HT), which are characterised by a breakdown in immune tolerance to thyroid antigens. Unravelling the genetic architecture of AITD is vital to better understanding of AITD pathogenesis, required to advance therapeutic options in both disease management and prevention. The early whole-genome linkage and candidate gene association studies provided the first evidence that the HLA region and CTLA-4 represented AITD risk loci. Recent improvements in high throughput genotyping technologies, collection of larger disease cohorts and cataloguing of genome-scale variation have facilitated genome-wide association studies and more thorough screening of candidate gene regions. This has allowed identification of many novel AITD risk genes and more detailed association mapping. The growing number of confirmed AITD susceptibility loci implicates a number of putative disease mechanisms, most of which are tightly linked with aspects of immune system function. The unprecedented advances in genetic study will allow future studies to identify further novel disease risk genes and to identify aetiological variants within specific gene regions, which will undoubtedly lead to a better understanding of AITD patho-physiology.

  17. Immunogenetic Mechanisms Leading to Thyroid Autoimmunity: Recent Advances in Identifying Susceptibility Genes and Regions

    PubMed Central

    Brand, Oliver J; Gough, Stephen C.L

    2011-01-01

    The autoimmune thyroid diseases (AITD) include Graves’ disease (GD) and Hashimoto’s thyroiditis (HT), which are characterised by a breakdown in immune tolerance to thyroid antigens. Unravelling the genetic architecture of AITD is vital to better understanding of AITD pathogenesis, required to advance therapeutic options in both disease management and prevention. The early whole-genome linkage and candidate gene association studies provided the first evidence that the HLA region and CTLA-4 represented AITD risk loci. Recent improvements in high throughput genotyping technologies, collection of larger disease cohorts and cataloguing of genome-scale variation have facilitated genome-wide association studies and more thorough screening of candidate gene regions. This has allowed identification of many novel AITD risk genes and more detailed association mapping. The growing number of confirmed AITD susceptibility loci implicates a number of putative disease mechanisms, most of which are tightly linked with aspects of immune system function. The unprecedented advances in genetic study will allow future studies to identify further novel disease risk genes and to identify aetiological variants within specific gene regions, which will undoubtedly lead to a better understanding of AITD patho-physiology. PMID:22654554

  18. Numerical Linear Algebra.

    DTIC Science & Technology

    1980-09-08

    February 1979 through 31 March 1980. Title of Research: NUMERICAL LINEAR ALGEBRA. Principal Investigators: Gene H. Golub, James H. Wilkinson. Research...

  19. A Stratified Transcriptomics Analysis of Polygenic Fat and Lean Mouse Adipose Tissues Identifies Novel Candidate Obesity Genes

    PubMed Central

    Morton, Nicholas M.; Nelson, Yvonne B.; Michailidou, Zoi; Di Rollo, Emma M.; Ramage, Lynne; Hadoke, Patrick W. F.; Seckl, Jonathan R.; Bunger, Lutz; Horvat, Simon; Kenyon, Christopher J.; Dunbar, Donald R.

    2011-01-01

    Background: Obesity and metabolic syndrome result from a complex interaction between genetic and environmental factors. In addition to brain-regulated processes, recent genome wide association studies have indicated that genes highly expressed in adipose tissue affect the distribution and function of fat and thus contribute to obesity. Using a stratified transcriptome gene enrichment approach we attempted to identify adipose tissue-specific obesity genes in the unique polygenic Fat (F) mouse strain generated by selective breeding over 60 generations for divergent adiposity from a comparator Lean (L) strain. Results: To enrich for adipose tissue obesity genes a ‘snap-shot’ pooled-sample transcriptome comparison of key fat depots and non adipose tissues (muscle, liver, kidney) was performed. Known obesity quantitative trait loci (QTL) information for the model allowed us to further filter genes for increased likelihood of being causal or secondary for obesity. This successfully identified several genes previously linked to obesity (C1qr1 and Npr3) as positional QTL candidate genes elevated specifically in F line adipose tissue. A number of novel obesity candidate genes were also identified (Thbs1, Ppp1r3d, Tmepai, Trp53inp2, Ttc7b, Tuba1a, Fgf13, Fmr) that have inferred roles in fat cell function. Quantitative microarray analysis was then applied to the most phenotypically divergent adipose depot after exaggerating F and L strain differences with chronic high fat feeding, which revealed a distinct gene expression profile of line, fat depot and diet-responsive inflammatory, angiogenic and metabolic pathways. Selected candidate genes Npr3 and Thbs1, as well as Gys2, a non-QTL gene that otherwise passed our enrichment criteria, were characterised, revealing novel functional effects consistent with a contribution to obesity. Conclusions: A focussed candidate gene enrichment strategy in the unique F and L model has identified novel adipose tissue-enriched genes

  20. Gene panel sequencing in familial breast/ovarian cancer patients identifies multiple novel mutations also in genes others than BRCA1/2.

    PubMed

    Kraus, Cornelia; Hoyer, Juliane; Vasileiou, Georgia; Wunderle, Marius; Lux, Michael P; Fasching, Peter A; Krumbiegel, Mandy; Uebe, Steffen; Reuter, Miriam; Beckmann, Matthias W; Reis, André

    2017-01-01

    Breast and ovarian cancer (BC/OC) predisposition has been attributed to a number of high- and moderate to low-penetrance susceptibility genes. With the advent of next generation sequencing (NGS) simultaneous testing of these genes has become feasible. In this monocentric study, we report results of panel-based screening of 14 BC/OC susceptibility genes (BRCA1, BRCA2, RAD51C, RAD51D, CHEK2, PALB2, ATM, NBN, CDH1, TP53, MLH1, MSH2, MSH6 and PMS2) in a group of 581 consecutive individuals from a German population with BC and/or OC fulfilling diagnostic criteria for BRCA1 and BRCA2 testing including 179 with a triple-negative tumor. Altogether we identified 106 deleterious mutations in 105 (18%) patients in 10 different genes, including seven different exon deletions. Of these 106 mutations, 16 (15%) were novel and only six were found in BRCA1/2. To further characterize mutations located in or nearby splicing consensus sites we performed RT-PCR analysis which allowed confirmation of pathogenicity in 7 of 9 mutations analyzed. In PALB2, we identified a deleterious variant in six cases. All but one were associated with early onset BC and a positive family history indicating that penetrance for PALB2 mutations is comparable to BRCA2. Overall, extended testing beyond BRCA1/2 identified a deleterious mutation in further 6% of patients. As a downside, 89 variants of uncertain significance were identified highlighting the need for comprehensive variant databases. In conclusion, panel testing yields more accurate information on genetic cancer risk than assessing BRCA1/2 alone and wide-spread testing will help improve penetrance assessment of variants in these risk genes. © 2016 UICC.

  1. Genome-Wide association study identifies candidate genes for Parkinson's disease in an Ashkenazi Jewish population

    PubMed Central

    2011-01-01

    Background To date, nine Parkinson disease (PD) genome-wide association studies in North American, European and Asian populations have been published. The majority of studies have confirmed the association of the previously identified genetic risk factors, SNCA and MAPT, and two studies have identified three new PD susceptibility loci/genes (PARK16, BST1 and HLA-DRB5). In a recent meta-analysis of datasets from five of the published PD GWAS an additional 6 novel candidate genes (SYT11, ACMSD, STK39, MCCC1/LAMP3, GAK and CCDC62/HIP1R) were identified. Collectively the associations identified in these GWAS account for only a small proportion of the estimated total heritability of PD suggesting that an 'unknown' component of the genetic architecture of PD remains to be identified. Methods We applied a GWAS approach to a relatively homogeneous Ashkenazi Jewish (AJ) population from New York to search for both 'rare' and 'common' genetic variants that confer risk of PD by examining any SNPs with allele frequencies exceeding 2%. We have focused on a genetic isolate, the AJ population, as a discovery dataset since this cohort has a higher sharing of genetic background and historically experienced a significant bottleneck. We also conducted a replication study using two publicly available datasets from dbGaP. The joint analysis dataset had a combined sample size of 2,050 cases and 1,836 controls. Results We identified the top 57 SNPs showing the strongest evidence of association in the AJ dataset (p < 9.9 × 10(-5)). Six SNPs located within gene regions had positive signals in at least one other independent dbGaP dataset: LOC100505836 (Chr3p24), LOC153328/SLC25A48 (Chr5q31.1), UNC13B (9p13.3), SLCO3A1 (15q26.1), WNT3 (17q21.3) and NSF (17q21.3). We also replicated published associations for the gene regions SNCA (Chr4q21; rs3775442, p = 0.037), PARK16 (Chr1q32.1; rs823114 (NUCKS1), p = 6.12 × 10(-4)), BST1 (Chr4p15; rs12502586, p = 0.027), STK39 (Chr2q24.3; rs3754775, p = 0

  2. De novo Transcriptome Analysis of Miscanthus lutarioriparius Identifies Candidate Genes in Rhizome Development

    PubMed Central

    Hu, Ruibo; Yu, Changjiang; Wang, Xiaoyu; Jia, Chunlin; Pei, Shengqiang; He, Kang; He, Guo; Kong, Yingzhen; Zhou, Gongke

    2017-01-01

    HIGHLIGHT De novo transcriptome profiling of five tissues reveals candidate genes putatively involved in rhizome development in M. lutarioriparius. Miscanthus lutarioriparius is a promising lignocellulosic feedstock for second-generation bioethanol production. However, the genomic resources for this species are relatively limited, which hampers our understanding of the molecular mechanisms underlying many important biological processes. In this study, we performed the first de novo transcriptome analysis of five tissues (leaf, stem, root, lateral bud and rhizome bud) of M. lutarioriparius with an emphasis on identifying putative genes involved in rhizome development. Approximately 66 gigabases (Gb) of paired-end clean reads were obtained and assembled into 169,064 unigenes with an average length of 759 bp. Among these unigenes, 103,899 (61.5%) were annotated in seven public protein databases. Differential gene expression profiling analysis revealed that 4,609, 3,188, 1,679, 1,218, and 1,077 genes were predominantly expressed in root, leaf, stem, lateral bud, and rhizome bud, respectively. Their expression patterns were further classified into 12 distinct clusters. Pathway enrichment analysis revealed that genes predominantly expressed in rhizome bud were mainly involved in primary metabolism and hormone signaling and transduction pathways. Notably, 19 transcription factors (TFs) and 16 hormone signaling pathway-related genes were identified as predominantly expressed in rhizome bud compared with the other tissues, suggesting putative roles in rhizome formation and development. In addition, a predictive regulatory network was constructed between four TFs and six auxin- and abscisic acid (ABA)-related genes. Furthermore, the expression of 24 rhizome-specific genes was validated by quantitative real-time RT-PCR (qRT-PCR) analysis. Taken together, this study provides a global portrait of gene expression across five different tissues and reveals preliminary insights

  3. Blood pressure loci identified with a gene-centric array.

    PubMed

    Johnson, Toby; Gaunt, Tom R; Newhouse, Stephen J; Padmanabhan, Sandosh; Tomaszewski, Maciej; Kumari, Meena; Morris, Richard W; Tzoulaki, Ioanna; O'Brien, Eoin T; Poulter, Neil R; Sever, Peter; Shields, Denis C; Thom, Simon; Wannamethee, Sasiwarang G; Whincup, Peter H; Brown, Morris J; Connell, John M; Dobson, Richard J; Howard, Philip J; Mein, Charles A; Onipinla, Abiodun; Shaw-Hawkins, Sue; Zhang, Yun; Davey Smith, George; Day, Ian N M; Lawlor, Debbie A; Goodall, Alison H; Fowkes, F Gerald; Abecasis, Gonçalo R; Elliott, Paul; Gateva, Vesela; Braund, Peter S; Burton, Paul R; Nelson, Christopher P; Tobin, Martin D; van der Harst, Pim; Glorioso, Nicola; Neuvrith, Hani; Salvi, Erika; Staessen, Jan A; Stucchi, Andrea; Devos, Nabila; Jeunemaitre, Xavier; Plouin, Pierre-François; Tichet, Jean; Juhanson, Peeter; Org, Elin; Putku, Margus; Sõber, Siim; Veldre, Gudrun; Viigimaa, Margus; Levinsson, Anna; Rosengren, Annika; Thelle, Dag S; Hastie, Claire E; Hedner, Thomas; Lee, Wai K; Melander, Olle; Wahlstrand, Björn; Hardy, Rebecca; Wong, Andrew; Cooper, Jackie A; Palmen, Jutta; Chen, Li; Stewart, Alexandre F R; Wells, George A; Westra, Harm-Jan; Wolfs, Marcel G M; Clarke, Robert; Franzosi, Maria Grazia; Goel, Anuj; Hamsten, Anders; Lathrop, Mark; Peden, John F; Seedorf, Udo; Watkins, Hugh; Ouwehand, Willem H; Sambrook, Jennifer; Stephens, Jonathan; Casas, Juan-Pablo; Drenos, Fotios; Holmes, Michael V; Kivimaki, Mika; Shah, Sonia; Shah, Tina; Talmud, Philippa J; Whittaker, John; Wallace, Chris; Delles, Christian; Laan, Maris; Kuh, Diana; Humphries, Steve E; Nyberg, Fredrik; Cusi, Daniele; Roberts, Robert; Newton-Cheh, Christopher; Franke, Lude; Stanton, Alice V; Dominiczak, Anna F; Farrall, Martin; Hingorani, Aroon D; Samani, Nilesh J; Caulfield, Mark J; Munroe, Patricia B

    2011-12-09

    Raised blood pressure (BP) is a major risk factor for cardiovascular disease. Previous studies have identified 47 distinct genetic variants robustly associated with BP, but collectively these explain only a few percent of the heritability for BP phenotypes. To find additional BP loci, we used a bespoke gene-centric array to genotype an independent discovery sample of 25,118 individuals that combined hypertensive case-control and general population samples. We followed up four SNPs associated with BP at our p < 8.56 × 10(-7) study-specific significance threshold and six suggestively associated SNPs in a further 59,349 individuals. We identified and replicated a SNP at LSP1/TNNT3, a SNP at MTHFR-NPPB independent (r(2) = 0.33) of previous reports, and replicated SNPs at AGT and ATP2B1 reported previously. An analysis of combined discovery and follow-up data identified SNPs significantly associated with BP at p < 8.56 × 10(-7) at four further loci (NPR3, HFE, NOS3, and SOX6). The high number of discoveries made with modest genotyping effort can be attributed to using a large-scale yet targeted genotyping array and to the development of a weighting scheme that maximized power when meta-analyzing results from samples ascertained with extreme phenotypes, in combination with results from nonascertained or population samples. Chromatin immunoprecipitation and transcript expression data highlight potential gene regulatory mechanisms at the MTHFR and NOS3 loci. These results provide candidates for further study to help dissect mechanisms affecting BP and highlight the utility of studying SNPs and samples that are independent of those studied previously even when the sample size is smaller than that in previous studies. Copyright © 2011 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  4. Methods to identify and analyze gene products involved in neuronal intracellular transport using Drosophila

    PubMed Central

    Neisch, Amanda L.; Avery, Adam W.; Machame, James B.; Li, Min-gang; Hays, Thomas S.

    2017-01-01

    Proper neuronal function critically depends on efficient intracellular transport and disruption of transport leads to neurodegeneration. Molecular pathways that support or regulate neuronal transport are not fully understood. A greater understanding of these pathways will help reveal the pathological mechanisms underlying disease. Drosophila melanogaster is the premier model system for performing large-scale genetic functional screens. Here we describe methods to carry out primary and secondary genetic screens in Drosophila aimed at identifying novel gene products and pathways that impact neuronal intracellular transport. These screens are performed using whole animal or live cell imaging of intact neural tissue to ensure integrity of neurons and their cellular environment. The primary screen is used to identify gross defects in neuronal function indicative of a disruption in microtubule-based transport. The secondary screens, conducted in both motoneurons and dendritic arborization neurons, will confirm the function of candidate gene products in intracellular transport. Together, the methodologies described here will support labs interested in identifying and characterizing gene products that alter intracellular transport in Drosophila. PMID:26794520

  5. Discrimination of germline V genes at different sequencing lengths and mutational burdens: A new tool for identifying and evaluating the reliability of V gene assignment.

    PubMed

    Zhang, Bochao; Meng, Wenzhao; Prak, Eline T Luning; Hershberg, Uri

    2015-12-01

    Immune repertoires are collections of lymphocytes that express diverse antigen receptor gene rearrangements consisting of Variable (V), (Diversity (D) in the case of heavy chains) and Joining (J) gene segments. Clonally related cells typically share the same germline gene segments and have highly similar junctional sequences within their third complementarity determining regions. Identifying clonal relatedness of sequences is a key step in the analysis of immune repertoires. The V gene is the most important for clone identification because it has the longest sequence and the greatest number of sequence variants. However, accurate identification of a clone's germline V gene source is challenging because there is a high degree of similarity between different germline V genes. This difficulty is compounded in antibodies, which can undergo somatic hypermutation. Furthermore, high-throughput sequencing experiments often generate partial sequences and have significant error rates. To address these issues, we describe a novel method to estimate which germline V genes (or alleles) cannot be discriminated under different conditions (read lengths, sequencing errors or somatic hypermutation frequencies). Starting with any set of germline V genes, this method measures their similarity using different sequencing lengths and calculates their likelihood of unambiguous assignment under different levels of mutation. Hence, one can identify, under different experimental and biological conditions, the germline V genes (or alleles) that cannot be uniquely identified and bundle them together into groups of specific V genes with highly similar sequences. Copyright © 2015 Elsevier B.V. All rights reserved.
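
    The bundling idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' tool: the allele names and sequences are made up, reads are assumed to cover only the 3' end of the V gene, and two germlines are treated as indistinguishable when their Hamming distance over the covered region is at most a chosen tolerance (a stand-in for sequencing error or somatic hypermutation).

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def ambiguous_groups(germlines: dict, read_len: int, max_mismatch: int = 0) -> list:
    """Bundle germline names that cannot be told apart from the last read_len bases."""
    # Assumption: a read of length read_len covers only the 3' end of the gene.
    tails = {name: seq[-read_len:] for name, seq in germlines.items()}

    # Simple union-find so that indistinguishable pairs merge transitively.
    parent = {name: name for name in tails}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(tails, 2):
        if hamming(tails[a], tails[b]) <= max_mismatch:
            parent[find(a)] = find(b)

    groups = {}
    for name in tails:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical toy germlines, all of the same length.
TOY_GERMLINES = {
    "V_A": "ACGTACGTAAGG",
    "V_B": "ACGAACGTAAGG",  # differs from V_A only near the 5' end
    "V_C": "TTGTACGTAACC",
}

for length in (6, 12):
    print(length, ambiguous_groups(TOY_GERMLINES, length))
```

    In this toy data, V_A and V_B fall into one group at a read length of 6 because their only difference lies outside the covered region, and separate again once the full 12 bases are read.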

  6. Genome-wide transcriptional analysis of flagellar regeneration in Chlamydomonas reinhardtii identifies orthologs of ciliary disease genes

    NASA Technical Reports Server (NTRS)

    Stolc, Viktor; Samanta, Manoj Pratim; Tongprasit, Waraporn; Marshall, Wallace F.

    2005-01-01

    The important role that cilia and flagella play in human disease creates an urgent need to identify genes involved in ciliary assembly and function. The strong and specific induction of flagellar-coding genes during flagellar regeneration in Chlamydomonas reinhardtii suggests that transcriptional profiling of such cells would reveal new flagella-related genes. We have conducted a genome-wide analysis of RNA transcript levels during flagellar regeneration in Chlamydomonas by using DNA oligonucleotide microarrays, produced by maskless photolithography, with unique probe sequences for all exons of the 19,803 predicted genes. This analysis represents a previously uncharacterized whole-genome transcriptional activity profiling study in this important model organism. Analysis of strongly induced genes reveals a large set of known flagellar components and also identifies a number of important disease-related proteins as being involved with cilia and flagella, including the zebrafish polycystic kidney genes Qilin, Reptin, and Pontin, as well as the testis-expressed tubby-like protein TULP2.

  7. A gene expression biomarker identifies in vitro and in vivo ERα modulators in a human gene expression compendium

    EPA Science Inventory

    We propose the use of gene expression profiling to complement the chemical characterization currently based on HTS assay data and present a case study relevant to the Endocrine Disruptor Screening Program. We have developed computational methods to identify estrogen receptor α...

  8. Identifying core gene modules in glioblastoma based on multilayer factor-mediated dysfunctional regulatory networks through integrating multi-dimensional genomic data

    PubMed Central

    Ping, Yanyan; Deng, Yulan; Wang, Li; Zhang, Hongyi; Zhang, Yong; Xu, Chaohan; Zhao, Hongying; Fan, Huihui; Yu, Fulong; Xiao, Yun; Li, Xia

    2015-01-01

    The driver genetic aberrations collectively regulate core cellular processes underlying cancer development. However, identifying the modules of driver genetic alterations and characterizing their functional mechanisms are still major challenges for cancer studies. Here, we developed an integrative multi-omics method CMDD to identify the driver modules and their affecting dysregulated genes through characterizing genetic alteration-induced dysregulated networks. Applied to glioblastoma (GBM), the CMDD identified a core gene module of 17 genes, including seven known GBM drivers, and their dysregulated genes. The module showed significant association with shorter survival of GBM. When classifying driver genes in the module into two gene sets according to their genetic alteration patterns, we found that one gene set directly participated in the glioma pathway, while the other indirectly regulated the glioma pathway, mostly, via their dysregulated genes. Both of the two gene sets were significant contributors to survival and helpful for classifying GBM subtypes, suggesting their critical roles in GBM pathogenesis. Also, by applying the CMDD to other six cancers, we identified some novel core modules associated with overall survival of patients. Together, these results demonstrate integrative multi-omics data can identify driver modules and uncover their dysregulated genes, which is useful for interpreting cancer genome. PMID:25653168

  9. Integration of QTL and bioinformatic tools to identify candidate genes for triglycerides in mice

    PubMed Central

    Leduc, Magalie S.; Hageman, Rachael S.; Verdugo, Ricardo A.; Tsaih, Shirng-Wern; Walsh, Kenneth; Churchill, Gary A.; Paigen, Beverly

    2011-01-01

    To identify genetic loci influencing lipid levels, we performed quantitative trait loci (QTL) analysis between inbred mouse strains MRL/MpJ and SM/J, measuring triglyceride levels at 8 weeks of age in F2 mice fed a chow diet. We identified one significant QTL on chromosome (Chr) 15 and three suggestive QTL on Chrs 2, 7, and 17. We also carried out microarray analysis on the livers of parental strains of 282 F2 mice and used these data to find cis-regulated expression QTL. We then narrowed the list of candidate genes under significant QTL using a “toolbox” of bioinformatic resources, including haplotype analysis; parental strain comparison for gene expression differences and nonsynonymous coding single nucleotide polymorphisms (SNP); cis-regulated eQTL in livers of F2 mice; correlation between gene expression and phenotype; and conditioning of expression on the phenotype. We suggest Slc25a7 as a candidate gene for the Chr 7 QTL and, based on expression differences, five genes (Polr3h, Cyp2d22, Cyp2d26, Tspo, and Ttll12) as candidate genes for Chr 15 QTL. This study shows how bioinformatics can be used effectively to reduce candidate gene lists for QTL related to complex traits. PMID:21622629

  10. Transcriptome and metabolite analysis identifies nitrogen utilization genes in tea plant (Camellia sinensis).

    PubMed

    Li, Wei; Xiang, Fen; Zhong, Micai; Zhou, Lingyun; Liu, Hongyan; Li, Saijun; Wang, Xuewen

    2017-05-10

    Applied nitrogen (N) fertilizer significantly increases leaf yield. However, most N is not utilized by the plant, negatively impacting the environment. To date, little is known regarding N utilization genes and mechanisms in leaf production. To understand this, we investigated transcriptomes using RNA-seq and amino acid levels with N treatment in tea (Camellia sinensis), the most popular beverage crop. We identified 196 and 29 common differentially expressed genes (DEGs) in roots and leaves, respectively, in response to ammonium in two tea varieties. Among those genes, AMT, NRT and AQP for N uptake and GOGAT and GS for N assimilation were the key genes, validated by RT-qPCR, which were expressed in a network manner with tissue specificity. Importantly, only AQP and three novel DEGs associated with stress, manganese binding, and gibberellin-regulated transcription factor were common in N responses across all tissues and varieties. A hypothesized gene regulatory network for N was proposed. A strong statistical correlation between key genes' expression and amino acid content was revealed. The key genes and regulatory network improve our understanding of the molecular mechanism of N usage and offer gene targets for plant improvement.

  11. Mistaken Identifiers: Gene name errors can be introduced inadvertently when using Excel in bioinformatics

    PubMed Central

    Zeeberg, Barry R; Riss, Joseph; Kane, David W; Bussey, Kimberly J; Uchio, Edward; Linehan, W Marston; Barrett, J Carl; Weinstein, John N

    2004-01-01

    Background When processing microarray data sets, we recently noticed that some gene names were being changed inadvertently to non-gene names. Results A little detective work traced the problem to default date format conversions and floating-point format conversions in the very useful Excel program package. The date conversions affect at least 30 gene names; the floating-point conversions affect at least 2,000 if Riken identifiers are included. These conversions are irreversible; the original gene names cannot be recovered. Conclusions Users of Excel for analyses involving gene names should be aware of this problem, which can cause genes, including medically important ones, to be lost from view and which has contaminated even carefully curated public databases. We provide work-arounds and scripts for circumventing the problem. PMID:15214961
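
    The kind of check this abstract motivates can be sketched in a few lines. This is not one of the scripts the authors provide; the two patterns are assumptions about what date-converted and floating-point-converted identifiers typically look like in an exported file, and the sample list is invented for illustration.

```python
import re

# Assumed shapes of Excel's auto-converted output in an exported text file.
DATE_LIKE = re.compile(
    r"^\d{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$",
    re.IGNORECASE,
)
FLOAT_LIKE = re.compile(r"^\d(\.\d+)?E\+\d+$", re.IGNORECASE)


def flag_suspect_identifiers(names):
    """Return (name, reason) pairs for entries that look like converted gene names."""
    flagged = []
    for name in names:
        if DATE_LIKE.match(name):
            flagged.append((name, "date-like"))
        elif FLOAT_LIKE.match(name):
            flagged.append((name, "float-like"))
    return flagged


# Invented sample column: three entries look like conversions, two look like gene symbols.
column = ["TP53", "2-Sep", "1-Dec", "2.31E+19", "BRCA1"]
for name, reason in flag_suspect_identifiers(column):
    print(f"{name}: {reason}")

# Why the floating-point case arises: an identifier made of digits with an
# embedded 'E' is also a valid number in scientific notation.
print(float("2310009E13"))
```

    A check like this only detects damage after the fact. Because the abstract notes that the conversions are irreversible, the safer practice is to import identifier columns as text in the first place.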

  12. Regulatory elements of the floral homeotic gene AGAMOUS identified by phylogenetic footprinting and shadowing.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hong, R. L., Hamaguchi, L., Busch, M. A., and Weigel, D.

    2003-06-01

    OAK-B135 In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3 kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection, but also highlight that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites.

  13. Predicting essential genes for identifying potential drug targets in Aspergillus fumigatus.

    PubMed

    Lu, Yao; Deng, Jingyuan; Rhodes, Judith C; Lu, Hui; Lu, Long Jason

    2014-06-01

    Aspergillus fumigatus (Af) is a ubiquitous and opportunistic pathogen capable of causing acute, invasive pulmonary disease in susceptible hosts. Despite current therapeutic options, mortality associated with invasive Af infections remains unacceptably high, increasing 357% since 1980. Therefore, there is an urgent need for the development of novel therapeutic strategies, including more efficacious drugs acting on new targets. Thus, as noted in a recent review, "the identification of essential genes in fungi represents a crucial step in the development of new antifungal drugs". Expanding the target space by rapidly identifying new essential genes has thus been described as "the most important task of genomics-based target validation". In previous research, we were the first to show that essential gene annotation can be reliably transferred between four distantly related prokaryotic species. In this study, we extend our machine learning approach to the much more complex eukaryotic fungal species. A compendium of essential genes is predicted in Af by transferring known essential gene annotations from another filamentous fungus, Neurospora crassa. This approach predicts essential genes by integrating diverse types of intrinsic and context-dependent genomic features encoded in microbial genomes. The predicted essential datasets contained 1674 genes. We validated our results by comparing our predictions with known essential genes in Af, comparing our predictions with those predicted by homology mapping, and conducting conditional expressed alleles. We applied several layers of filters and selected a set of potential drug targets from the predicted essential genes. Finally, we have conducted wet lab knockout experiments to verify our predictions, which further validates the accuracy and wide applicability of the machine learning approach. The approach presented here significantly extended our ability to predict essential genes beyond orthologs and made it possible to

  14. Genetic Susceptibility to Vitiligo: GWAS Approaches for Identifying Vitiligo Susceptibility Genes and Loci

    PubMed Central

    Shen, Changbing; Gao, Jing; Sheng, Yujun; Dou, Jinfa; Zhou, Fusheng; Zheng, Xiaodong; Ko, Randy; Tang, Xianfa; Zhu, Caihong; Yin, Xianyong; Sun, Liangdan; Cui, Yong; Zhang, Xuejun

    2016-01-01

    Vitiligo is an autoimmune disease with a strong genetic component, characterized by areas of depigmented skin resulting from loss of epidermal melanocytes. Genetic factors are known to play key roles in vitiligo through discoveries in association studies and family studies. Previously, vitiligo susceptibility genes were mainly revealed through linkage analysis and candidate gene studies. Recently, our understanding of the genetic basis of vitiligo has been rapidly advancing through genome-wide association studies (GWAS). More than 40 robust susceptibility loci have been identified and confirmed to be associated with vitiligo by using GWAS. Most of these associated genes participate in important pathways involved in the pathogenesis of vitiligo. Many susceptibility loci with unknown functions in the pathogenesis of vitiligo have also been identified, indicating that additional molecular mechanisms may contribute to the risk of developing vitiligo. In this review, we summarize the key loci that are of genome-wide significance, which have been shown to influence vitiligo risk. These genetic loci may help build the foundation for genetic diagnosis and personalized treatment for patients with vitiligo in the future. However, substantial additional studies, including gene-targeted and functional studies, are required to confirm the causality of the genetic variants and their biological relevance in the development of vitiligo. PMID:26870082

  15. Rare copy number variations in congenital heart disease patients identify unique genes in left-right patterning

    PubMed Central

    Fakhro, Khalid A.; Choi, Murim; Ware, Stephanie M.; Belmont, John W.; Towbin, Jeffrey A.; Lifton, Richard P.; Khokha, Mustafa K.; Brueckner, Martina

    2011-01-01

    Dominant human genetic diseases that impair reproductive fitness and have high locus heterogeneity constitute a problem for gene discovery because the usual criterion of finding more mutations in specific genes than expected by chance may require extremely large populations. Heterotaxy (Htx), a congenital heart disease resulting from abnormalities in left-right (LR) body patterning, has features suggesting that many cases fall into this category. In this setting, appropriate model systems may provide a means to support implication of specific genes. By high-resolution genotyping of 262 Htx subjects and 991 controls, we identify a twofold excess of subjects with rare genic copy number variations in Htx (14.5% vs. 7.4%, P = 1.5 × 10(-4)). Although 7 of 45 Htx copy number variations were large chromosomal abnormalities, 38 smaller copy number variations altered a total of 61 genes, 22 of which had Xenopus orthologs. In situ hybridization identified 7 of these 22 genes with expression in the ciliated LR organizer (gastrocoel roof plate), a marked enrichment compared with 40 of 845 previously studied genes (sevenfold enrichment, P < 10(-6)). Morpholino knockdown in Xenopus of Htx candidates demonstrated that five (NEK2, ROCK2, TGFBR2, GALNT11, and NUP188) strongly disrupted both morphological LR development and expression of pitx2, a molecular marker of LR patterning. These effects were specific, because 0 of 13 control genes from rare Htx or control copy number variations produced significant LR abnormalities (P = 0.001). These findings identify genes not previously implicated in LR patterning. PMID:21282601
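
    The fold-enrichment figures quoted in this abstract can be checked with simple arithmetic. The sketch below uses only the counts and percentages given in the text; the helper name is illustrative, and no attempt is made to reproduce the reported P values, since the abstract does not state which test produced them.

```python
def fold_enrichment(hits_a: int, total_a: int, hits_b: int, total_b: int) -> float:
    """Ratio of two proportions: (hits_a / total_a) / (hits_b / total_b)."""
    return (hits_a / total_a) / (hits_b / total_b)


# Counts from the abstract: 7 of 22 candidate genes vs. 40 of 845 previously studied genes.
organizer_fold = fold_enrichment(7, 22, 40, 845)

# Percentages from the abstract: 14.5% of Htx subjects vs. 7.4% of controls.
cnv_fold = 14.5 / 7.4

print(f"LR organizer expression: {organizer_fold:.2f}-fold")
print(f"Rare genic CNV carriers: {cnv_fold:.2f}-fold")
```

    The two ratios come out at roughly 6.7 and 2.0, consistent with the rounded "sevenfold" and "twofold" in the abstract.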

  16. Rare copy number variations in congenital heart disease patients identify unique genes in left-right patterning.

    PubMed

    Fakhro, Khalid A; Choi, Murim; Ware, Stephanie M; Belmont, John W; Towbin, Jeffrey A; Lifton, Richard P; Khokha, Mustafa K; Brueckner, Martina

    2011-02-15

    Dominant human genetic diseases that impair reproductive fitness and have high locus heterogeneity constitute a problem for gene discovery because the usual criterion of finding more mutations in specific genes than expected by chance may require extremely large populations. Heterotaxy (Htx), a congenital heart disease resulting from abnormalities in left-right (LR) body patterning, has features suggesting that many cases fall into this category. In this setting, appropriate model systems may provide a means to support implication of specific genes. By high-resolution genotyping of 262 Htx subjects and 991 controls, we identify a twofold excess of subjects with rare genic copy number variations in Htx (14.5% vs. 7.4%, P = 1.5 × 10(-4)). Although 7 of 45 Htx copy number variations were large chromosomal abnormalities, 38 smaller copy number variations altered a total of 61 genes, 22 of which had Xenopus orthologs. In situ hybridization identified 7 of these 22 genes with expression in the ciliated LR organizer (gastrocoel roof plate), a marked enrichment compared with 40 of 845 previously studied genes (sevenfold enrichment, P < 10(-6)). Morpholino knockdown in Xenopus of Htx candidates demonstrated that five (NEK2, ROCK2, TGFBR2, GALNT11, and NUP188) strongly disrupted both morphological LR development and expression of pitx2, a molecular marker of LR patterning. These effects were specific, because 0 of 13 control genes from rare Htx or control copy number variations produced significant LR abnormalities (P = 0.001). These findings identify genes not previously implicated in LR patterning.

  17. A framework to identify gene expression profiles in a model of inflammation induced by lipopolysaccharide after treatment with thalidomide

    PubMed Central

    2012-01-01

    Background Thalidomide is an anti-inflammatory and anti-angiogenic drug currently used for the treatment of several diseases, including erythema nodosum leprosum, which occurs in patients with lepromatous leprosy. In this research, we use DNA microarray analysis to identify the impact of thalidomide on gene expression responses in human cells after lipopolysaccharide (LPS) stimulation. We employed a two-stage framework. Initially, we identified 1584 altered genes in response to LPS. Modulation of this set of genes was then analyzed in the LPS stimulated cells treated with thalidomide. Results We identified 64 genes with altered expression induced by thalidomide using the rank product method. In addition, the lists of up-regulated and down-regulated genes were investigated by means of bioinformatics functional analysis, which allowed for the identification of biological processes affected by thalidomide. Confirmatory analysis was done in five of the identified genes using real time PCR. Conclusions The results showed some genes that can further our understanding of the biological mechanisms in the action of thalidomide. Of the five genes evaluated with real time PCR, three were down regulated and two were up regulated confirming the initial results of the microarray analysis. PMID:22695124
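
    The rank product method named in this abstract can be sketched as follows. This is a minimal illustration of the core statistic only (the geometric mean of a gene's rank across replicates), assuming made-up fold-change values. It ignores ties and omits the permutation step normally used to attach significance, and it is not the authors' pipeline.

```python
import math


def rank_products(fold_changes: dict) -> dict:
    """Geometric mean of each gene's per-replicate rank (rank 1 = largest fold change)."""
    genes = list(fold_changes)
    n_reps = len(next(iter(fold_changes.values())))
    log_sum = {g: 0.0 for g in genes}
    for i in range(n_reps):
        # Rank genes within replicate i, most up-regulated first.
        ordered = sorted(genes, key=lambda g: fold_changes[g][i], reverse=True)
        for rank, g in enumerate(ordered, start=1):
            log_sum[g] += math.log(rank)
    # Summing logs and exponentiating avoids overflow with many replicates.
    return {g: math.exp(log_sum[g] / n_reps) for g in genes}


# Hypothetical fold changes for three genes in two replicates.
toy = {
    "geneA": [3.0, 2.5],
    "geneB": [1.0, 2.9],
    "geneC": [0.2, 0.1],
}

for gene, rp in sorted(rank_products(toy).items(), key=lambda kv: kv[1]):
    print(f"{gene}: {rp:.3f}")
```

    Genes with the smallest rank product are the most consistently up-regulated; ranking in the opposite direction gives the down-regulated list.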

  18. A functional cancer genomics screen identifies a druggable synthetic lethal interaction between MSH3 and PRKDC.

    PubMed

    Dietlein, Felix; Thelen, Lisa; Jokic, Mladen; Jachimowicz, Ron D; Ivan, Laura; Knittel, Gero; Leeser, Uschi; van Oers, Johanna; Edelmann, Winfried; Heukamp, Lukas C; Reinhardt, H Christian

    2014-05-01

    Here, we use a large-scale cell line-based approach to identify cancer cell-specific mutations that are associated with DNA-dependent protein kinase catalytic subunit (DNA-PKcs) dependence. For this purpose, we profiled the mutational landscape across 1,319 cancer-associated genes of 67 distinct cell lines and identified numerous genes involved in homologous recombination-mediated DNA repair, including BRCA1, BRCA2, ATM, PAXIP, and RAD50, as being associated with non-oncogene addiction to DNA-PKcs. Mutations in the mismatch repair gene MSH3, which have been reported to occur recurrently in numerous human cancer entities, emerged as the most significant predictors of DNA-PKcs addiction. Concordantly, DNA-PKcs inhibition robustly induced apoptosis in MSH3-mutant cell lines in vitro and displayed remarkable single-agent efficacy against MSH3-mutant tumors in vivo. Thus, we here identify a therapeutically actionable synthetic lethal interaction between MSH3 and the non-homologous end joining kinase DNA-PKcs. Our observations recommend DNA-PKcs inhibition as a therapeutic concept for the treatment of human cancers displaying homologous recombination defects.

  19. A novel gammaretroviral shuttle vector insertional mutagenesis screen identifies SHARPIN as a breast cancer metastasis gene and prognostic biomarker.

    PubMed

    Bii, Victor M; Rae, Dustin T; Trobridge, Grant D

    2015-11-24

    Breast cancer (BC) is the second leading cause of malignancy among U.S. women. Metastasis results in a poor prognosis and increased mortality, but the molecular mechanisms by which metastatic tumors occur are not well understood. Identifying the genes that drive the metastatic process could provide targets for improved therapy and biomarkers to improve BC patient outcomes. Using a forward mutagenesis screen, BC cells mutagenized with a replication-incompetent gammaretroviral vector (γRV) were xenotransplanted into the mammary fat pad of immunodeficient mice. In this approach the vector provirus dysregulates nearby genes, providing a selective advantage to transduced cells to form metastases. Metastatic tumors were analyzed for proviral integration sites to identify nearby candidate metastasis genes. The γRV has a transgene cassette that allows for rescue in bacteria and rapid identification of vector integration sites. Using this approach, we identified the previously described metastasis gene WWTR1 (TAZ), and three other novel candidate metastasis genes including SHARPIN. SHARPIN was independently validated in vivo as a BC metastasis gene. Analysis of patient data showed that SHARPIN expression predicts metastasis-free survival after adjuvant therapy. Our approach has broad potential to identify genes involved in oncogenic processes for BC and other cancers. We show here it can identify both known (WWTR1) and novel (SHARPIN) BC metastasis genes.

  20. A comparative gene analysis with rice identified orthologous group II HKT genes and their association with Na(+) concentration in bread wheat.

    PubMed

    Ariyarathna, H A Chandima K; Oldach, Klaus H; Francki, Michael G

    2016-01-19

    Although the HKT transporter genes ascertain some of the key determinants of crop salt tolerance mechanisms, the diversity and functional role of group II HKT genes are not clearly understood in bread wheat. The advanced knowledge on rice HKT and whole genome sequence was, therefore, used in comparative gene analysis to identify orthologous wheat group II HKT genes and their role in trait variation under different saline environments. The four group II HKTs in rice identified two orthologous gene families from bread wheat, including the known TaHKT2;1 gene family and a new distinctly different gene family designated as TaHKT2;2. A single copy of TaHKT2;2 was found on each homeologous chromosome arm 7AL, 7BL and 7DL and each gene was expressed in leaf blade, sheath and root tissues under non-stressed and at 200 mM salt stressed conditions. The proteins encoded by genes of the TaHKT2;2 family revealed more than 93% amino acid sequence identity but ≤52% amino acid identity compared to the proteins encoded by TaHKT2;1 family. Specifically, variations in known critical domains predicted functional differences between the two protein families. Similar to orthologous rice genes on chromosome 6L, TaHKT2;1 and TaHKT2;2 genes were located approximately 3 kb apart on wheat chromosomes 7AL, 7BL and 7DL, forming a static syntenic block in the two species. The chromosomal region on 7AL containing TaHKT2;1 7AL-1 co-located with QTL for shoot Na(+) concentration and yield in some saline environments. The differences in copy number, genes sequences and encoded proteins between TaHKT2;2 homeologous genes and other group II HKT gene families within and across species likely reflect functional diversity for ion selectivity and transport in plants. Evidence indicated that neither TaHKT2;2 nor TaHKT2;1 were associated with primary root Na(+) uptake but TaHKT2;1 may be associated with trait variation for Na(+) exclusion and yield in some but not all saline environments.

  1. Multiscale mutation clustering algorithm identifies pan-cancer mutational clusters associated with pathway-level changes in gene expression

    PubMed Central

    Poole, William; Leinonen, Kalle; Shmulevich, Ilya

    2017-01-01

    Cancer researchers have long recognized that somatic mutations are not uniformly distributed within genes. However, most approaches for identifying cancer mutations focus on either the entire-gene or single amino-acid level. We have bridged these two methodologies with a multiscale mutation clustering algorithm that identifies variable length mutation clusters in cancer genes. We ran our algorithm on 539 genes using the combined mutation data in 23 cancer types from The Cancer Genome Atlas (TCGA) and identified 1295 mutation clusters. The resulting mutation clusters cover a wide range of scales and often overlap with many kinds of protein features including structured domains, phosphorylation sites, and known single nucleotide variants. We statistically associated these multiscale clusters with gene expression and drug response data to illuminate the functional and clinical consequences of mutations in our clusters. Interestingly, we find multiple clusters within individual genes that have differential functional associations: these include PTEN, FUBP1, and CDH1. This methodology has potential implications in identifying protein regions for drug targets, understanding the biological underpinnings of cancer, and personalizing cancer treatments. Toward this end, we have made the mutation clusters and the clustering algorithm available to the public. Clusters and pathway associations can be interactively browsed at m2c.systemsbiology.net. The multiscale mutation clustering algorithm is available at https://github.com/IlyaLab/M2C. PMID:28170390

  2. Multiscale mutation clustering algorithm identifies pan-cancer mutational clusters associated with pathway-level changes in gene expression.

    PubMed

    Poole, William; Leinonen, Kalle; Shmulevich, Ilya; Knijnenburg, Theo A; Bernard, Brady

    2017-02-01

    Cancer researchers have long recognized that somatic mutations are not uniformly distributed within genes. However, most approaches for identifying cancer mutations focus on either the entire-gene or single amino-acid level. We have bridged these two methodologies with a multiscale mutation clustering algorithm that identifies variable length mutation clusters in cancer genes. We ran our algorithm on 539 genes using the combined mutation data in 23 cancer types from The Cancer Genome Atlas (TCGA) and identified 1295 mutation clusters. The resulting mutation clusters cover a wide range of scales and often overlap with many kinds of protein features including structured domains, phosphorylation sites, and known single nucleotide variants. We statistically associated these multiscale clusters with gene expression and drug response data to illuminate the functional and clinical consequences of mutations in our clusters. Interestingly, we find multiple clusters within individual genes that have differential functional associations: these include PTEN, FUBP1, and CDH1. This methodology has potential implications in identifying protein regions for drug targets, understanding the biological underpinnings of cancer, and personalizing cancer treatments. Toward this end, we have made the mutation clusters and the clustering algorithm available to the public. Clusters and pathway associations can be interactively browsed at m2c.systemsbiology.net. The multiscale mutation clustering algorithm is available at https://github.com/IlyaLab/M2C.

  3. A Morpholino-based screen to identify novel genes involved in craniofacial morphogenesis

    PubMed Central

    Melvin, Vida Senkus; Feng, Weiguo; Hernandez-Lagunas, Laura; Artinger, Kristin Bruk; Williams, Trevor

    2014-01-01

    BACKGROUND The regulatory mechanisms underpinning facial development are conserved between diverse species. Therefore, results from model systems provide insight into the genetic causes of human craniofacial defects. Previously, we generated a comprehensive dataset examining gene expression during development and fusion of the mouse facial prominences. Here, we used this resource to identify genes that have dynamic expression patterns in the facial prominences, but for which only limited information exists concerning developmental function. RESULTS This set of ~80 genes was used for a high throughput functional analysis in the zebrafish system using Morpholino gene knockdown technology. This screen revealed three classes of cranial cartilage phenotypes depending upon whether knockdown of the gene affected the neurocranium, viscerocranium, or both. The targeted genes that produced consistent phenotypes encoded proteins linked to transcription (meis1, meis2a, tshz2, vgll4l), signaling (pkdcc, vlk, macc1, wu:fb16h09), and extracellular matrix function (smoc2). The majority of these phenotypes were not altered by reduction of p53 levels, demonstrating that both p53 dependent and independent mechanisms were involved in the craniofacial abnormalities. CONCLUSIONS This Morpholino-based screen highlights new genes involved in development of the zebrafish craniofacial skeleton with wider relevance to formation of the face in other species, particularly mouse and human. PMID:23559552

  4. Use of deep whole-genome sequencing data to identify structure risk variants in breast cancer susceptibility genes.

    PubMed

    Guo, Xingyi; Shi, Jiajun; Cai, Qiuyin; Shu, Xiao-Ou; He, Jing; Wen, Wanqing; Allen, Jamie; Pharoah, Paul; Dunning, Alison; Hunter, David J; Kraft, Peter; Easton, Douglas F; Zheng, Wei; Long, Jirong

    2018-03-01

    Functional disruptions of susceptibility genes by large germline structural variant (SV) deletions are known to be associated with cancer risk. However, few studies have systematically searched for SV deletions in breast cancer susceptibility genes. We analysed deep (> 30x) whole-genome sequencing (WGS) data generated from blood samples of 128 breast cancer patients of Asian and European descent with either a strong family history of breast cancer or early-onset disease. To identify SV deletions in known or suspected breast cancer susceptibility genes, we used multiple SV calling tools including Genome STRiP, Delly, Manta, BreakDancer and Pindel. SV deletions were detected by at least three of these bioinformatics tools in five genes. Specifically, we identified heterozygous deletions covering a fraction of the coding regions of BRCA1 (approximately 80 kb, in two patients) and TP53 (∼1.6 kb, in two patients), and of intronic regions (∼1 kb) of PALB2 (one patient), PTEN (three patients) and RAD51C (one patient). We confirmed the presence of these deletions using real-time quantitative PCR (qPCR). Our study identified novel SV deletions in breast cancer susceptibility genes, and the identification of such SV deletions may improve clinical testing.
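    The "detected by at least three tools" rule described above can be sketched as a simple consensus filter. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not say how calls from different tools were matched, so a 50% reciprocal overlap is assumed, and the tool names and coordinates below are hypothetical.

```python
from collections import namedtuple

# A deletion call as a half-open interval on one chromosome.
Deletion = namedtuple("Deletion", "chrom start end")


def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions; 0.0 if the calls do not overlap."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def consensus_deletions(calls_by_tool, min_tools=3, min_overlap=0.5):
    """Keep one representative call per deletion supported by >= min_tools tools.

    `min_overlap` is an assumed matching criterion, not taken from the abstract.
    """
    kept = []
    for calls in calls_by_tool.values():
        for call in calls:
            # Tools with at least one matching call (includes the calling tool).
            supporters = {
                tool
                for tool, other_calls in calls_by_tool.items()
                if any(reciprocal_overlap(call, o) >= min_overlap for o in other_calls)
            }
            already_kept = any(reciprocal_overlap(call, k) >= min_overlap for k in kept)
            if len(supporters) >= min_tools and not already_kept:
                kept.append(call)
    return kept


# Hypothetical toy input: four callers, two candidate deletions.
calls = {
    "tool_a": [Deletion("chr17", 1000, 2000), Deletion("chr13", 500, 900)],
    "tool_b": [Deletion("chr17", 1050, 2010)],
    "tool_c": [Deletion("chr17", 990, 1950)],
    "tool_d": [Deletion("chr13", 520, 880)],
}
consensus = consensus_deletions(calls)
print(consensus)
```

    In this toy input only the chr17 deletion has three supporting callers; the chr13 deletion is seen by two and is dropped.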

  5. Novel mutations in the homogentisate 1,2 dioxygenase gene identified in Jordanian patients with alkaptonuria.

    PubMed

    Al-sbou, Mohammed

    2012-06-01

    This study was conducted to identify mutations in the homogentisate 1,2 dioxygenase gene (HGD) in alkaptonuria patients in the Jordanian population. Blood samples were collected from four alkaptonuria patients, four carriers, and two healthy volunteers. DNA was isolated from peripheral blood. All 14 exons of the HGD gene were amplified using the polymerase chain reaction (PCR) technique. The PCR products were then purified and analyzed by sequencing. Five mutations were identified in our samples. Four of them (C1273A, T1046G, 551-552insG, and T533G) were novel and had not been previously reported, and one mutation (T847C) has been described before. The types of mutations identified were two missense mutations, one splice site mutation, one frameshift mutation, and one polymorphism. We present the first molecular study of the HGD gene in Jordanian alkaptonuria patients. This study provides valuable information about the molecular basis of alkaptonuria in the Jordanian population.

  6. Gene interactions in the DNA damage-response pathway identified by genome-wide RNA-interference analysis of synthetic lethality

    PubMed Central

    van Haaften, Gijs; Vastenhouw, Nadine L.; Nollen, Ellen A. A.; Plasterk, Ronald H. A.; Tijsterman, Marcel

    2004-01-01

    Here, we describe a systematic search for synthetic gene interactions in a multicellular organism, the nematode Caenorhabditis elegans. We established a high-throughput method to determine synthetic gene interactions by genome-wide RNA interference and identified genes that are required to protect the germ line against DNA double-strand breaks. Besides known DNA-repair proteins such as the C. elegans orthologs of TopBP1, RPA2, and RAD51, eight genes previously unassociated with a double-strand-break response were identified. Knockdown of these genes increased sensitivity to ionizing radiation and camptothecin and resulted in increased chromosomal nondisjunction. All genes have human orthologs that may play a role in human carcinogenesis. PMID:15326288

  7. Integrated database for identifying candidate genes for Aspergillus flavus resistance in maize

    PubMed Central

    2010-01-01

    Background Aspergillus flavus Link:Fr, an opportunistic fungus that produces aflatoxin, is pathogenic to maize and other oilseed crops. Aflatoxin is a potent carcinogen, and its presence markedly reduces the value of grain. Understanding and enhancing host resistance to A. flavus infection and/or subsequent aflatoxin accumulation is generally considered an efficient means of reducing grain losses to aflatoxin. Different proteomic, genomic and genetic studies of maize (Zea mays L.) have generated large data sets with the goal of identifying genes responsible for conferring resistance to A. flavus, or aflatoxin. Results In order to maximize the usage of different data sets in new studies, including association mapping, we have constructed a relational database with web interface integrating the results of gene expression, proteomic (both gel-based and shotgun), Quantitative Trait Loci (QTL) genetic mapping studies, and sequence data from the literature to facilitate selection of candidate genes for continued investigation. The Corn Fungal Resistance Associated Sequences Database (CFRAS-DB) (http://agbase.msstate.edu/) was created with the main goal of identifying genes important to aflatoxin resistance. CFRAS-DB is implemented using MySQL as the relational database management system running on a Linux server, using an Apache web server, and Perl CGI scripts as the web interface. The database and the associated web-based interface allow researchers to examine many lines of evidence (e.g. microarray, proteomics, QTL studies, SNP data) to assess the potential role of a gene or group of genes in the response of different maize lines to A. flavus infection and subsequent production of aflatoxin by the fungus. Conclusions CFRAS-DB provides the first opportunity to integrate data pertaining to the problem of A. flavus and aflatoxin resistance in maize in one resource and to support queries across different datasets. 
    The web-based interface gives researchers different query

  8. Integrated database for identifying candidate genes for Aspergillus flavus resistance in maize.

    PubMed

    Kelley, Rowena Y; Gresham, Cathy; Harper, Jonathan; Bridges, Susan M; Warburton, Marilyn L; Hawkins, Leigh K; Pechanova, Olga; Peethambaran, Bela; Pechan, Tibor; Luthe, Dawn S; Mylroie, J E; Ankala, Arunkanth; Ozkan, Seval; Henry, W B; Williams, W P

    2010-10-07

    Aspergillus flavus Link:Fr, an opportunistic fungus that produces aflatoxin, is pathogenic to maize and other oilseed crops. Aflatoxin is a potent carcinogen, and its presence markedly reduces the value of grain. Understanding and enhancing host resistance to A. flavus infection and/or subsequent aflatoxin accumulation is generally considered an efficient means of reducing grain losses to aflatoxin. Different proteomic, genomic and genetic studies of maize (Zea mays L.) have generated large data sets with the goal of identifying genes responsible for conferring resistance to A. flavus, or aflatoxin. In order to maximize the usage of different data sets in new studies, including association mapping, we have constructed a relational database with web interface integrating the results of gene expression, proteomic (both gel-based and shotgun), Quantitative Trait Loci (QTL) genetic mapping studies, and sequence data from the literature to facilitate selection of candidate genes for continued investigation. The Corn Fungal Resistance Associated Sequences Database (CFRAS-DB) (http://agbase.msstate.edu/) was created with the main goal of identifying genes important to aflatoxin resistance. CFRAS-DB is implemented using MySQL as the relational database management system running on a Linux server, using an Apache web server, and Perl CGI scripts as the web interface. The database and the associated web-based interface allow researchers to examine many lines of evidence (e.g. microarray, proteomics, QTL studies, SNP data) to assess the potential role of a gene or group of genes in the response of different maize lines to A. flavus infection and subsequent production of aflatoxin by the fungus. CFRAS-DB provides the first opportunity to integrate data pertaining to the problem of A. flavus and aflatoxin resistance in maize in one resource and to support queries across different datasets. 
    The web-based interface gives researchers different query options for mining the database

  9. Genome-wide association meta-analysis of 78,308 individuals identifies new loci and genes influencing human intelligence.

    PubMed

    Sniekers, Suzanne; Stringer, Sven; Watanabe, Kyoko; Jansen, Philip R; Coleman, Jonathan R I; Krapohl, Eva; Taskesen, Erdogan; Hammerschlag, Anke R; Okbay, Aysu; Zabaneh, Delilah; Amin, Najaf; Breen, Gerome; Cesarini, David; Chabris, Christopher F; Iacono, William G; Ikram, M Arfan; Johannesson, Magnus; Koellinger, Philipp; Lee, James J; Magnusson, Patrik K E; McGue, Matt; Miller, Mike B; Ollier, William E R; Payton, Antony; Pendleton, Neil; Plomin, Robert; Rietveld, Cornelius A; Tiemeier, Henning; van Duijn, Cornelia M; Posthuma, Danielle

    2017-07-01

    Intelligence is associated with important economic and health-related life outcomes. Despite intelligence having substantial heritability (0.54) and a confirmed polygenic nature, initial genetic studies were mostly underpowered. Here we report a meta-analysis for intelligence of 78,308 individuals. We identify 336 associated SNPs (METAL P < 5 × 10(-8)) in 18 genomic loci, of which 15 are new. Around half of the SNPs are located inside a gene, implicating 22 genes, of which 11 are new findings. Gene-based analyses identified an additional 30 genes (MAGMA P < 2.73 × 10(-6)), of which all but one had not been implicated previously. We show that the identified genes are predominantly expressed in brain tissue, and pathway analysis indicates the involvement of genes regulating cell development (MAGMA competitive P = 3.5 × 10(-6)). Despite the well-known difference in twin-based heritability for intelligence in childhood (0.45) and adulthood (0.80), we show substantial genetic correlation (r_g = 0.89, LD score regression P = 5.4 × 10(-29)). These findings provide new insight into the genetic architecture of intelligence.

  10. Transcriptional profiling identifies differentially expressed genes in developing turkey skeletal muscle

    PubMed Central

    2011-01-01

    involved in extracellular matrix regulation, cell death/apoptosis, and calcium signaling/muscle function, as well as genes with miscellaneous function was confirmed by qPCR. Conclusions The current study identified gene pathways and uncovered novel genes important in turkey muscle growth and development. Future experiments will focus further on several of these candidate genes and the expression and mechanism of action of their protein products. PMID:21385442

  11. Computational modeling identifies key gene regulatory interactions underlying phenobarbital-mediated tumor promotion

    PubMed Central

    Luisier, Raphaëlle; Unterberger, Elif B.; Goodman, Jay I.; Schwarz, Michael; Moggs, Jonathan; Terranova, Rémi; van Nimwegen, Erik

    2014-01-01

    Gene regulatory interactions underlying the early stages of non-genotoxic carcinogenesis are poorly understood. Here, we have identified key candidate regulators of phenobarbital (PB)-mediated mouse liver tumorigenesis, a well-characterized model of non-genotoxic carcinogenesis, by applying a new computational modeling approach to a comprehensive collection of in vivo gene expression studies. We have combined our previously developed motif activity response analysis (MARA), which models gene expression patterns in terms of computationally predicted transcription factor binding sites with singular value decomposition (SVD) of the inferred motif activities, to disentangle the roles that different transcriptional regulators play in specific biological pathways of tumor promotion. Furthermore, transgenic mouse models enabled us to identify which of these regulatory activities was downstream of constitutive androstane receptor and β-catenin signaling, both crucial components of PB-mediated liver tumorigenesis. We propose novel roles for E2F and ZFP161 in PB-mediated hepatocyte proliferation and suggest that PB-mediated suppression of ESR1 activity contributes to the development of a tumor-prone environment. Our study shows that combining MARA with SVD allows for automated identification of independent transcription regulatory programs within a complex in vivo tissue environment and provides novel mechanistic insights into PB-mediated hepatocarcinogenesis. PMID:24464994

  12. A fast and high performance multiple data integration algorithm for identifying human disease genes

    PubMed Central

    2015-01-01

    Background Integrating multiple data sources is indispensable in improving disease gene identification. This is not only because disease genes associated with similar genetic diseases tend to lie close to each other in various biological networks, but also because gene-disease associations are complex. Although various algorithms have been proposed to identify disease genes, their prediction performance and computational time can still be improved. Results In this study, we propose a fast and high performance multiple data integration algorithm for identifying human disease genes. A posterior probability of each candidate gene associated with individual diseases is calculated by using a Bayesian analysis method and a binary logistic regression model. Two prior probability estimation strategies and two feature vector construction methods are developed to test the performance of the proposed algorithm. Conclusions The proposed algorithm not only generates predictions with high AUC scores but also runs very fast. When only a single PPI network is employed, the AUC score is 0.769 by using F2 as feature vectors. The average running time for each leave-one-out experiment is only around 1.5 seconds. When three biological networks are integrated, the AUC score using F3 as feature vectors increases to 0.830, and the average running time for each leave-one-out experiment is only about 12.54 seconds. This performance is better than that of many existing algorithms. PMID:26399620
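    The AUC scores quoted above summarise how well a method ranks true disease genes above other candidates. As a rough sketch of that metric only (not of the proposed algorithm), AUC can be computed as the fraction of positive/negative pairs in which the positive gene receives the higher score, with ties counted as half. The scores and labels below are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    `labels` uses 1 for known disease genes and 0 for other candidates.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical posterior probabilities for five candidate genes.
example_scores = [0.9, 0.8, 0.4, 0.35, 0.1]
example_labels = [1, 0, 1, 0, 0]
example_auc = auc(example_scores, example_labels)
print(f"AUC = {example_auc:.3f}")
```

    The pairwise form is quadratic in the number of genes, which is acceptable for a small illustration; a rank-based formulation would be the usual choice for large candidate sets.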

  13. Epigenomic Elements Analyses for Promoters Identify ESRRG as a New Susceptibility Gene for Obesity-related Traits

    PubMed Central

    Dong, Shan-Shan; Guo, Yan; Zhu, Dong-Li; Chen, Xiao-Feng; Wu, Xiao-Ming; Shen, Hui; Chen, Xiang-Ding; Tan, Li-Jun; Tian, Qing; Deng, Hong-Wen; Yang, Tie-Lin

    2016-01-01

    OBJECTIVES With ENCODE epigenomic data and results from published genome-wide association studies (GWASs), we aimed to find regulatory signatures of obesity genes and discover novel susceptibility genes. METHODS Obesity genes were obtained from public GWAS databases and their promoters were annotated based on regulatory element information. Significantly enriched or depleted epigenomic elements in the promoters of obesity genes were evaluated, and all human genes were then prioritized according to the presence of the selected elements to predict new candidate genes. Top-ranked genes were subsequently tested for association with obesity-related traits in three independent in-house GWAS samples. RESULTS We identified RAD21 and EZH2 as over-represented and STAT2 and IRF3 as depleted transcription factors. Histone modification of H3K9me3 and chromatin state segmentation of “poised promoter” and “repressed” were over-represented. All genes were prioritized and we selected the top five genes for validation at population level. In the combined results from the three GWAS samples, rs7522101 in ESRRG remained significantly associated with BMI after multiple testing corrections (P = 7.25 × 10(-5)). It was also associated with β-cell function (P = 1.99 × 10(-3)) and fasting glucose level (P < 0.05) in the meta-analyses of glucose and insulin-related traits consortium (MAGIC) dataset. CONCLUSIONS In summary, we identified epigenomic characteristics for obesity genes and suggested ESRRG as a novel obesity susceptibility gene. PMID:27113491

  14. Z2Pack: Numerical implementation of hybrid Wannier centers for identifying topological materials

    NASA Astrophysics Data System (ADS)

    Gresch, Dominik; Autès, Gabriel; Yazyev, Oleg V.; Troyer, Matthias; Vanderbilt, David; Bernevig, B. Andrei; Soluyanov, Alexey A.

    2017-02-01

    The intense theoretical and experimental interest in topological insulators and semimetals has established band structure topology as a fundamental material property. Consequently, identifying band topologies has become an important, but often challenging, problem, with no exhaustive solution at the present time. In this work we compile a series of techniques, some previously known, that allow for a solution to this problem for a large set of the possible band topologies. The method is based on tracking hybrid Wannier charge centers computed for relevant Bloch states, and it works at all levels of materials modeling: continuous k·p models, tight-binding models, and ab initio calculations. We apply the method to compute and identify Chern, Z2, and crystalline topological insulators, as well as topological semimetal phases, using real material examples. Moreover, we provide a numerical implementation of this technique (the Z2Pack software package) that is ideally suited for high-throughput screening of materials databases for compounds with nontrivial topologies. We expect that our work will allow researchers to (a) identify topological materials optimal for experimental probes, (b) classify existing compounds, and (c) reveal materials that host novel, not yet described, topological states.

  15. In-Silico Integration Approach to Identify a Key miRNA Regulating a Gene Network in Aggressive Prostate Cancer

    PubMed Central

    Colaprico, Antonio; Bontempi, Gianluca; Castiglioni, Isabella

    2018-01-01

    Like other cancers, prostate cancer (PC) is caused by the accumulation of genetic alterations in cells that drive malignant growth. These alterations are revealed by gene profiling and copy number alteration (CNA) analysis. Moreover, recent evidence suggests that microRNAs also have an important role in PC development. Despite efforts to profile PC, the alterations (gene, CNA, and miRNA) and biological processes that correlate with disease development and progression remain partially elusive. Many gene signatures proposed as diagnostic or prognostic tools in cancer overlap poorly. The identification of co-expressed genes that are functionally related can reveal a core network of genes associated with PC with better reproducibility. By combining different approaches, including the integration of mRNA expression profiles, CNAs, and miRNA expression levels, we identified a gene signature of four genes overlapping with other published gene signatures and able to distinguish, in silico, high Gleason-scored PC from normal human tissue, which was further enriched to 19 genes by gene co-expression analysis. From the analysis of miRNAs possibly regulating this network, we found that hsa-miR-153 was highly connected to the genes in the network. Our results identify a four-gene signature with diagnostic and prognostic value in PC and suggest an interesting gene network that could play a key regulatory role in PC development and progression. Furthermore, hsa-miR-153, controlling this network, could be a potential biomarker for theranostics in high Gleason-scored PC. PMID:29562723

  16. Genome-wide significant localization for working and spatial memory: Identifying genes for psychosis using models of cognition.

    PubMed

    Knowles, Emma E M; Carless, Melanie A; de Almeida, Marcio A A; Curran, Joanne E; McKay, D Reese; Sprooten, Emma; Dyer, Thomas D; Göring, Harald H; Olvera, Rene; Fox, Peter; Almasy, Laura; Duggirala, Ravi; Kent, Jack W; Blangero, John; Glahn, David C

    2014-01-01

    It is well established that risk for developing psychosis is largely mediated by the influence of genes, but identifying precisely which genes underlie that risk has been problematic. Focusing on endophenotypes, rather than illness risk, is one solution to this problem. Impaired cognition is a well-established endophenotype of psychosis. Here we aimed to characterize the genetic architecture of cognition using phenotypically detailed models as opposed to relying on general IQ or individual neuropsychological measures. In so doing we hoped to identify genes that mediate cognitive ability, which might also contribute to psychosis risk. Hierarchical factor models of genetically clustered cognitive traits were subjected to linkage analysis followed by QTL region-specific association analyses in a sample of 1,269 Mexican American individuals from extended pedigrees. We identified four genome wide significant QTLs, two for working and two for spatial memory, and a number of plausible and interesting candidate genes. The creation of detailed models of cognition seemingly enhanced the power to detect genetic effects on cognition and provided a number of possible candidate genes for psychosis. © 2013 Wiley Periodicals, Inc.

  17. Microarray analysis and scale-free gene networks identify candidate regulators in drought-stressed roots of loblolly pine (P. taeda L.)

    PubMed Central

    2011-01-01

    Background: Global transcriptional analysis of loblolly pine (Pinus taeda L.) is challenging due to limited molecular tools. PtGen2, a 26,496 feature cDNA microarray, was fabricated and used to assess drought-induced gene expression in loblolly pine propagule roots. Statistical analysis of differential expression and weighted gene correlation network analysis were used to identify drought-responsive genes and further characterize the molecular basis of drought tolerance in loblolly pine. Results: Microarrays were used to interrogate root cDNA populations obtained from 12 genotype × treatment combinations (four genotypes, three watering regimes). Comparison of drought-stressed roots with roots from the control treatment identified 2445 genes displaying at least a 1.5-fold expression difference (false discovery rate = 0.01). Genes commonly associated with drought response in pine and other plant species, as well as a number of abiotic and biotic stress-related genes, were up-regulated in drought-stressed roots. Only 76 genes were identified as differentially expressed in drought-recovered roots, indicating that the transcript population can return to the pre-drought state within 48 hours. Gene correlation analysis predicts a scale-free network topology and identifies eleven co-expression modules that ranged in size from 34 to 938 members. Network topological parameters identified a number of central nodes (hubs) including those with significant homology (E-values ≤ 2 × 10^-30) to 9-cis-epoxycarotenoid dioxygenase, zeatin O-glucosyltransferase, and ABA-responsive protein. Identified hubs also include genes that have been associated previously with osmotic stress, phytohormones, enzymes that detoxify reactive oxygen species, and several genes of unknown function. Conclusion: PtGen2 was used to evaluate transcriptome responses in loblolly pine and was leveraged to identify 2445 differentially expressed genes responding to severe drought stress in roots. Many of the

  18. Transcriptome analysis identifies genes involved in ethanol response of Saccharomyces cerevisiae in Agave tequilana juice.

    PubMed

    Ramirez-Córdova, Jesús; Drnevich, Jenny; Madrigal-Pulido, Jaime Alberto; Arrizon, Javier; Allen, Kirk; Martínez-Velázquez, Moisés; Alvarez-Maya, Ikuri

    2012-08-01

    During ethanol fermentation, yeast cells are exposed to stress due to the accumulation of ethanol, cell growth is altered and the output of the target product is reduced. For Agave beverages, like tequila, no reports have been published on the global gene expression under ethanol stress. In this work, we used microarray analysis to identify Saccharomyces cerevisiae genes involved in the ethanol response. Gene expression of a tequila yeast strain of S. cerevisiae (AR5) was explored by comparing global gene expression with that of laboratory strain S288C, both after ethanol exposure. Additionally, we used two different culture conditions, cells grown in Agave tequilana juice as a natural fermentation media or grown in yeast-extract peptone dextrose as artificial media. Of the 6368 S. cerevisiae genes in the microarray, 657 genes were identified that had different expression responses to ethanol stress due to strain and/or media. A cluster of 28 genes was found over-expressed specifically in the AR5 tequila strain that could be involved in the adaptation to tequila yeast fermentation, 14 of which are unknown such as yor343c, ylr162w, ygr182c, ymr265c, yer053c-a or ydr415c. These could be the most suitable genes for transforming tequila yeast to increase ethanol tolerance in the tequila fermentation process. Other genes involved in response to stress (RFC4, TSA1, MLH1, PAU3, RAD53) or transport (CYB2, TIP20, QCR9) were expressed in the same cluster. Unknown genes could be good candidates for the development of recombinant yeasts with ethanol tolerance for use in industrial tequila fermentation.

  19. Parallel analysis of tagged deletion mutants efficiently identifies genes involved in endoplasmic reticulum biogenesis.

    PubMed

    Wright, Robin; Parrish, Mark L; Cadera, Emily; Larson, Lynnelle; Matson, Clinton K; Garrett-Engele, Philip; Armour, Chris; Lum, Pek Yee; Shoemaker, Daniel D

    2003-07-30

    Increased levels of HMG-CoA reductase induce cell type- and isozyme-specific proliferation of the endoplasmic reticulum. In yeast, the ER proliferations induced by Hmg1p consist of nuclear-associated stacks of smooth ER membranes known as karmellae. To identify genes required for karmellae assembly, we compared the composition of populations of homozygous diploid S. cerevisiae deletion mutants following 20 generations of growth with and without karmellae. Using an initial population of 1,557 deletion mutants, 120 potential mutants were identified as a result of three independent experiments. Each experiment produced a largely non-overlapping set of potential mutants, suggesting that differences in specific growth conditions could be used to maximize the comprehensiveness of similar parallel analysis screens. Only two genes, UBC7 and YAL011W, were identified in all three experiments. Subsequent analysis of individual mutant strains confirmed that each experiment was identifying valid mutations, based on the mutant's sensitivity to elevated HMG-CoA reductase and inability to assemble normal karmellae. The largest class of HMG-CoA reductase-sensitive mutations was a subset of genes that are involved in chromatin structure and transcriptional regulation, suggesting that karmellae assembly requires changes in transcription or that the presence of karmellae may interfere with normal transcriptional regulation. Copyright 2003 John Wiley & Sons, Ltd.

  20. Genome-wide association study identifies the SERPINB gene cluster as a susceptibility locus for food allergy.

    PubMed

    Marenholz, Ingo; Grosche, Sarah; Kalb, Birgit; Rüschendorf, Franz; Blümchen, Katharina; Schlags, Rupert; Harandi, Neda; Price, Mareike; Hansen, Gesine; Seidenberg, Jürgen; Röblitz, Holger; Yürek, Songül; Tschirner, Sebastian; Hong, Xiumei; Wang, Xiaobin; Homuth, Georg; Schmidt, Carsten O; Nöthen, Markus M; Hübner, Norbert; Niggemann, Bodo; Beyer, Kirsten; Lee, Young-Ae

    2017-10-20

    Genetic factors and mechanisms underlying food allergy are largely unknown. Due to heterogeneity of symptoms a reliable diagnosis is often difficult to make. Here, we report a genome-wide association study on food allergy diagnosed by oral food challenge in 497 cases and 2387 controls. We identify five loci at genome-wide significance, the clade B serpin (SERPINB) gene cluster at 18q21.3, the cytokine gene cluster at 5q31.1, the filaggrin gene, the C11orf30/LRRC32 locus, and the human leukocyte antigen (HLA) region. Stratifying the results for the causative food demonstrates that association of the HLA locus is peanut allergy-specific whereas the other four loci increase the risk for any food allergy. Variants in the SERPINB gene cluster are associated with SERPINB10 expression in leukocytes. Moreover, SERPINB genes are highly expressed in the esophagus. All identified loci are involved in immunological regulation or epithelial barrier function, emphasizing the role of both mechanisms in food allergy.

  1. Effectively identifying regulatory hotspots while capturing expression heterogeneity in gene expression studies

    PubMed Central

    2014-01-01

    Expression quantitative trait loci (eQTL) mapping is a tool that can systematically identify genetic variation affecting gene expression. eQTL mapping studies have shown that certain genomic locations, referred to as regulatory hotspots, may affect the expression levels of many genes. Recently, studies have shown that various confounding factors may induce spurious regulatory hotspots. Here, we introduce a novel statistical method that effectively eliminates spurious hotspots while retaining genuine hotspots. Applied to simulated and real datasets, we validate that our method achieves greater sensitivity while retaining low false discovery rates compared to previous methods. PMID:24708878

  2. Systems Biology in Animal Breeding: Identifying relationships among markers, genes, and phenotypes

    USDA-ARS's Scientific Manuscript database

    The Breeding and Genetics Symposium titled “Systems Biology in Animal Breeding: Identifying relationships among markers, genes, and phenotypes” was held at the Joint Annual Meeting of the American Dairy Science Association and the American Society of Animal Science in Phoenix, AZ, July 15 to 19, 201...

  3. MethylMix 2.0: an R package for identifying DNA methylation genes. | Office of Cancer Genomics

    Cancer.gov

    DNA methylation is an important mechanism regulating gene transcription, and its role in carcinogenesis has been extensively studied. Hyper and hypomethylation of genes is a major mechanism of gene expression deregulation in a wide range of diseases. At the same time, high-throughput DNA methylation assays have been developed generating vast amounts of genome wide DNA methylation measurements. We developed MethylMix, an algorithm implemented in R to identify disease specific hyper and hypomethylated genes.

  4. Gene Deletions in Mycobacterium bovis BCG Stimulate Increased CD8+ T Cell Responses

    PubMed Central

    Panas, Michael W.; Sixsmith, Jaimie D.; White, KeriAnn; Korioth-Schmitz, Birgit; Shields, Shana T.; Moy, Brian T.; Lee, Sunhee; Schmitz, Joern E.; Jacobs, William R.; Porcelli, Steven A.; Haynes, Barton F.; Letvin, Norman L.

    2014-01-01

    Mycobacteria, the etiological agents of tuberculosis and leprosy, have coevolved with mammals for millions of years and have numerous ways of suppressing their host's immune response. It has been suggested that mycobacteria may contain genes that reduce the host's ability to elicit CD8+ T cell responses. We screened 3,290 mutant Mycobacterium bovis bacillus Calmette Guerin (BCG) strains to identify genes that decrease major histocompatibility complex (MHC) class I presentation of mycobacterium-encoded epitope peptides. Through our analysis, we identified 16 mutant BCG strains that generated increased transgene product-specific CD8+ T cell responses. The genes disrupted in these mutant strains had disparate predicted functions. Reconstruction of strains via targeted deletion of genes identified in the screen recapitulated the enhanced immunogenicity phenotype of the original mutant strains. When we introduced the simian immunodeficiency virus (SIV) gag gene into several of these novel BCG strains, we observed enhanced SIV Gag-specific CD8+ T cell responses in vivo. This study demonstrates that mycobacteria carry numerous genes that act to dampen CD8+ T cell responses and suggests that genetic modification of these genes may generate a novel group of recombinant BCG strains capable of serving as more effective and immunogenic vaccine vectors. PMID:25287928

  5. Gene deletions in Mycobacterium bovis BCG stimulate increased CD8+ T cell responses.

    PubMed

    Panas, Michael W; Sixsmith, Jaimie D; White, KeriAnn; Korioth-Schmitz, Birgit; Shields, Shana T; Moy, Brian T; Lee, Sunhee; Schmitz, Joern E; Jacobs, William R; Porcelli, Steven A; Haynes, Barton F; Letvin, Norman L; Gillard, Geoffrey O

    2014-12-01

    Mycobacteria, the etiological agents of tuberculosis and leprosy, have coevolved with mammals for millions of years and have numerous ways of suppressing their host's immune response. It has been suggested that mycobacteria may contain genes that reduce the host's ability to elicit CD8+ T cell responses. We screened 3,290 mutant Mycobacterium bovis bacillus Calmette Guerin (BCG) strains to identify genes that decrease major histocompatibility complex (MHC) class I presentation of mycobacterium-encoded epitope peptides. Through our analysis, we identified 16 mutant BCG strains that generated increased transgene product-specific CD8+ T cell responses. The genes disrupted in these mutant strains had disparate predicted functions. Reconstruction of strains via targeted deletion of genes identified in the screen recapitulated the enhanced immunogenicity phenotype of the original mutant strains. When we introduced the simian immunodeficiency virus (SIV) gag gene into several of these novel BCG strains, we observed enhanced SIV Gag-specific CD8+ T cell responses in vivo. This study demonstrates that mycobacteria carry numerous genes that act to dampen CD8+ T cell responses and suggests that genetic modification of these genes may generate a novel group of recombinant BCG strains capable of serving as more effective and immunogenic vaccine vectors. Copyright © 2014, American Society for Microbiology. All Rights Reserved.

  6. Co-fuse: a new class discovery analysis tool to identify and prioritize recurrent fusion genes from RNA-sequencing data.

    PubMed

    Paisitkriangkrai, Sakrapee; Quek, Kelly; Nievergall, Eva; Jabbour, Anissa; Zannettino, Andrew; Kok, Chung Hoow

    2018-06-07

    Recurrent oncogenic fusion genes play a critical role in the development of various cancers and diseases and provide, in some cases, excellent therapeutic targets. To date, analysis tools that can identify and compare recurrent fusion genes across multiple samples have not been available to researchers. To address this deficiency, we developed Co-occurrence Fusion (Co-fuse), a new and easy to use software tool that enables biologists to merge RNA-seq information, allowing them to identify recurrent fusion genes, without the need for exhaustive data processing. Notably, Co-fuse is based on pattern mining and statistical analysis which enables the identification of hidden patterns of recurrent fusion genes. In this report, we show that Co-fuse can be used to identify 2 distinct groups within a set of 49 leukemic cell lines based on their recurrent fusion genes: a multiple myeloma (MM) samples-enriched cluster and an acute myeloid leukemia (AML) samples-enriched cluster. Our experimental results further demonstrate that Co-fuse can identify known driver fusion genes (e.g., IGH-MYC, IGH-WHSC1) in MM, when compared to AML samples, indicating the potential of Co-fuse to aid the discovery of yet unknown driver fusion genes through cohort comparisons. Additionally, using a 272 primary glioma sample RNA-seq dataset, Co-fuse was able to validate recurrent fusion genes, further demonstrating the power of this analysis tool to identify recurrent fusion genes. Taken together, Co-fuse is a powerful new analysis tool that can be readily applied to large RNA-seq datasets, and may lead to the discovery of new disease subgroups and potentially new driver genes, for which, targeted therapies could be developed. The Co-fuse R source code is publicly available at https://github.com/sakrapee/co-fuse .
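    The notion of recurrence used above, a fusion gene observed in several samples of a cohort, can be illustrated with a minimal Python sketch. This is not the Co-fuse algorithm, which the abstract describes as pattern mining plus statistical analysis implemented in R. The function name `recurrent_fusions`, the `min_samples` threshold, the sample labels, and the placeholder fusion `GENEA-GENEB` are all assumptions made for illustration, and the toy data are invented.

```python
from collections import Counter


def recurrent_fusions(samples, min_samples=2):
    """Return {fusion: sample_count} for fusions seen in >= min_samples samples.

    `samples` maps a sample label to the list of fusion calls for that sample.
    Hypothetical helper for illustration only; not part of Co-fuse.
    """
    counts = Counter()
    for fusions in samples.values():
        # A fusion called twice in one sample still counts as one sample.
        counts.update(set(fusions))
    return {fusion: n for fusion, n in counts.items() if n >= min_samples}


# Invented toy cohort; only the fusion names IGH-MYC and IGH-WHSC1 come from the abstract.
samples = {
    "MM_1": ["IGH-MYC", "IGH-WHSC1"],
    "MM_2": ["IGH-WHSC1", "IGH-WHSC1"],
    "AML_1": ["GENEA-GENEB"],
}

recurrent = recurrent_fusions(samples)
print(recurrent)
```

    Counting each fusion once per sample keeps duplicate calls within a sample from inflating recurrence; a real tool would additionally test whether the recurrence differs between groups.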

  7. Gene expression in bovine rumen epithelium during weaning identifies molecular regulators of rumen development and growth.

    PubMed

    Connor, Erin E; Baldwin, Ransom L; Li, Cong-jun; Li, Robert W; Chung, Hoyoung

    2013-03-01

    During weaning, epithelial cell function in the rumen transitions in response to conversion from a pre-ruminant to a true ruminant environment to ensure efficient nutrient absorption and metabolism. To identify gene networks affected by weaning in bovine rumen, Holstein bull calves were fed commercial milk replacer only (MRO) until 42 days of age, then were provided diets of either milk + orchardgrass hay (MH) or milk + grain-based calf starter (MG). Rumen epithelial RNA was extracted from calves sacrificed at four time points: day 14 (n = 3) and day 42 (n = 3) of age while fed the MRO diet and day 56 (n = 3/diet) and day 70 (n = 3/diet) while fed the MH and MG diets for transcript profiling by microarray hybridization. Five two-group comparisons were made using Permutation Analysis of Differential Expression® to identify differentially expressed genes over time and developmental stage between days 14 and 42 within the MRO diet, between day 42 on the MRO diet and day 56 on the MG or MH diets, and between the MG and MH diets at days 56 and 70. Ingenuity Pathway Analysis (IPA) of differentially expressed genes during weaning indicated the top 5 gene networks involving molecules participating in lipid metabolism, cell morphology and death, cellular growth and proliferation, molecular transport, and the cell cycle. Putative genes functioning in the establishment of the rumen microbial population and associated rumen epithelial inflammation during weaning were identified. Activation of transcription factor PPAR-α was identified by IPA software as an important regulator of molecular changes in rumen epithelium that function in papillary development and fatty acid oxidation during the transition from pre-rumination to rumination. Thus, molecular markers of rumen development and gene networks regulating differentiation and growth of rumen epithelium were identified for selecting targets and methods for improving and assessing rumen development and

  8. Comparative Transcriptomics to Identify Novel Genes and Pathways in Dinoflagellates

    NASA Astrophysics Data System (ADS)

    Ryan, D.

    2016-02-01

    The unarmored dinoflagellate Karenia brevis is among the most prominent harmful, bloom-forming phytoplankton species in the Gulf of Mexico. During blooms, the polyketides PbTx-1 and PbTx-2 (brevetoxins) are produced by K. brevis. Brevetoxins negatively impact human health and the Gulf shellfish harvest. However, the genes underlying brevetoxin synthesis are currently unknown. Because the K. brevis genome is extremely large (1 × 10^11 base pairs long), and with a high proportion of repetitive, non-coding DNA, it has not been sequenced. In fact, large, repetitive genomes are common among the dinoflagellate group. High-throughput RNA sequencing technology enabled us to assemble Karenia transcriptomes de novo and investigate potential genes in the brevetoxin pathway through comparative transcriptomics. The brevetoxin profile varies among K. brevis clonal cultures. For example, well-documented Wilson-CCFWC268 typically produces 8-10 pg PbTx per cell, whereas SP1 produces < 2 pg PbTx/cell, and the mutant low-toxin Wilson clone produces undetectable to low (<0.05 pg/cell) amounts. Further, PbTx-2 has been measured in Karenia papilionacea but not Karenia mikimotoi. We compared the transcriptomes of four K. brevis clones (Wilson-CCFWC268, SP3, SP1, and mutant low-toxin Wilson) with K. papilionacea and K. mikimotoi to investigate nucleotide-level genetic variations and differences in gene expression. Of the 85,000 transcripts in the K. brevis transcriptome, 4,600 transcripts, including novel unannotated orthologs and putative polyketide synthases (PKSs), were only expressed by brevetoxin-producing K. brevis and K. papilionacea, not K. mikimotoi. Examination of gene expression between the typical- and low-toxin Wilson clones identified about 3,500 genes with significantly different expression levels, including 2 putative PKSs. One of the 2 PKSs was only found in the brevetoxin-producing Karenia species. These transcriptomes could not have been characterized without high

  9. Epigenomic elements analyses for promoters identify ESRRG as a new susceptibility gene for obesity-related traits.

    PubMed

    Dong, S-S; Guo, Y; Zhu, D-L; Chen, X-F; Wu, X-M; Shen, H; Chen, X-D; Tan, L-J; Tian, Q; Deng, H-W; Yang, T-L

    2016-07-01

    With ENCODE epigenomic data and results from published genome-wide association studies (GWASs), we aimed to find regulatory signatures of obesity genes and discover novel susceptibility genes. Obesity genes were obtained from public GWAS databases and their promoters were annotated based on the regulatory element information. Significantly enriched or depleted epigenomic elements in the promoters of obesity genes were evaluated and all human genes were then prioritized according to the existence of the selected elements to predict new candidate genes. Top-ranked genes were subsequently applied to validate their associations with obesity-related traits in three independent in-house GWAS samples. We identified RAD21 and EZH2 as over-represented, and STAT2 (signal transducer and activator of transcription 2) and IRF3 (interferon regulatory transcription factor 3) as depleted transcription factors. Histone modification of H3K9me3 and chromatin state segmentation of 'poised promoter' and 'repressed' were over-represented. All genes were prioritized and we selected the top five genes for validation at the population level. Combining results from the three GWAS samples, rs7522101 in ESRRG (estrogen-related receptor-γ) remained significantly associated with body mass index after multiple testing corrections (P=7.25 × 10^-5). It was also associated with β-cell function (P=1.99 × 10^-3) and fasting glucose level (P<0.05) in the meta-analyses of glucose and insulin-related traits consortium (MAGIC) data set. Conclusions: In summary, we identified epigenomic characteristics for obesity genes and suggested ESRRG as a novel obesity-susceptibility gene.

  10. Pharmacological Validation of Candidate Causal Sleep Genes Identified in an N2 Cross

    PubMed Central

    Brunner, Joseph I.; Gotter, Anthony L.; Millstein, Joshua; Garson, Susan; Binns, Jacquelyn; Fox, Steven V.; Savitz, Alan T.; Yang, He S.; Fitzpatrick, Karrie; Zhou, Lili; Owens, Joseph R.; Webber, Andrea L.; Vitaterna, Martha H.; Kasarskis, Andrew; Uebele, Victor N.; Turek, Fred; Renger, John J.; Winrow, Christopher J.

    2013-01-01

    Despite the substantial impact of sleep disturbances on human health and the many years of study dedicated to understanding sleep pathologies, the underlying genetic mechanisms that govern sleep and wake largely remain unknown. Recently, we completed large scale genetic and gene expression analyses in a segregating inbred mouse cross and identified candidate causal genes that regulate the mammalian sleep-wake cycle, across multiple traits including total sleep time, amounts of REM, non-REM, sleep bout duration and sleep fragmentation. Here we describe a novel approach toward validating candidate causal genes, while also identifying potential targets for sleep-related indications. Select small molecule antagonists and agonists were used to interrogate candidate causal gene function in rodent sleep polysomnography assays to determine impact on overall sleep architecture and to evaluate alignment with associated sleep-wake traits. Significant effects on sleep architecture were observed in validation studies using compounds targeting the muscarinic acetylcholine receptor M3 subunit (Chrm3)(wake promotion), nicotinic acetylcholine receptor alpha4 subunit (Chrna4)(wake promotion), dopamine receptor D5 subunit (Drd5)(sleep induction), serotonin 1D receptor (Htr1d)(altered REM fragmentation), glucagon-like peptide-1 receptor (Glp1r)(light sleep promotion and reduction of deep sleep), and Calcium channel, voltage-dependent, T type, alpha 1I subunit (Cacna1i)(increased bout duration slow wave sleep). Taken together, these results show the complexity of genetic components that regulate sleep-wake traits and highlight the importance of evaluating this complex behavior at a systems level. Pharmacological validation of genetically identified putative targets provides a rapid alternative to generating knock out or transgenic animal models, and may ultimately lead towards new therapeutic opportunities. PMID:22091728

  11. QTL Mapping and CRISPR/Cas9 Editing to Identify a Drug Resistance Gene in Toxoplasma gondii.

    PubMed

    Shen, Bang; Powell, Robin H; Behnke, Michael S

    2017-06-22

    Scientific knowledge is intrinsically linked to available technologies and methods. This article will present two methods that allowed for the identification and verification of a drug resistance gene in the Apicomplexan parasite Toxoplasma gondii, the method of Quantitative Trait Locus (QTL) mapping using a Whole Genome Sequence (WGS)-based genetic map and the method of Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/Cas9-based gene editing. The approach of QTL mapping allows one to test if there is a correlation between a genomic region(s) and a phenotype. Two datasets are required to run a QTL scan, a genetic map based on the progeny of a recombinant cross and a quantifiable phenotype assessed in each of the progeny of that cross. These datasets are then formatted to be compatible with R/qtl software that generates a QTL scan to identify significant loci correlated with the phenotype. Although this can greatly narrow the search window of possible candidates, QTLs span regions containing a number of genes from which the causal gene needs to be identified. Having WGS of the progeny was critical to identify the causal drug resistance mutation at the gene level. Once identified, the candidate mutation can be verified by genetic manipulation of drug sensitive parasites. The most facile and efficient method to genetically modify T. gondii is the CRISPR/Cas9 system. This system comprises just 2 components, both encoded on a single plasmid: a single guide RNA (gRNA) containing a 20 bp sequence complementary to the genomic target and the Cas9 endonuclease that generates a double-strand DNA break (DSB) at the target, repair of which allows for insertion or deletion of sequences around the break site. This article provides detailed protocols to use CRISPR/Cas9 based genome editing tools to verify the gene responsible for sinefungin resistance and to construct transgenic parasites.
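    As a rough illustration of how the two required datasets (a genetic map of the progeny and a per-progeny phenotype) combine in a QTL scan, the following Python sketch computes a single-marker LOD score, LOD = (n/2) · log10(RSS0/RSS1), for each marker in a toy haploid cross. It is a simplified marker-regression stand-in and not the R/qtl procedure used in the article; the function names, marker names, genotype codes, and phenotype values are invented for illustration.

```python
import math


def lod_score(genotypes, phenotypes):
    """LOD score comparing a per-genotype-mean model against a single-mean model."""
    n = len(phenotypes)
    mean_all = sum(phenotypes) / n
    # Residual sum of squares under the null model (no QTL at this marker).
    rss0 = sum((y - mean_all) ** 2 for y in phenotypes)
    if rss0 == 0:
        return 0.0  # no phenotypic variation, nothing to explain
    # Residual sum of squares when each genotype class has its own mean.
    rss1 = 0.0
    for allele in set(genotypes):
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        group_mean = sum(group) / len(group)
        rss1 += sum((y - group_mean) ** 2 for y in group)
    if rss1 == 0:
        return math.inf  # genotype explains the phenotype perfectly
    return (n / 2) * math.log10(rss0 / rss1)


def qtl_scan(genetic_map, phenotypes):
    """Return {marker: LOD} for every marker in the genetic map."""
    return {marker: lod_score(g, phenotypes) for marker, g in genetic_map.items()}


# Invented data: six haploid progeny, genotypes coded 0/1 by parental allele.
genetic_map = {
    "marker_1": [0, 0, 1, 1, 0, 1],
    "marker_2": [0, 1, 0, 1, 0, 1],
}
phenotypes = [1.0, 1.2, 3.1, 2.9, 0.9, 3.0]  # e.g. a drug-resistance measure

scan = qtl_scan(genetic_map, phenotypes)
best_marker = max(scan, key=scan.get)
print(best_marker, scan)
```

    In a real scan the LOD peak only delimits a region; as the abstract notes, the causal gene within that region still has to be identified, and significance thresholds are usually set by permutation rather than read off a single score.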

  12. Phylogenomic analysis of UDP glycosyltransferase 1 multigene family in Linum usitatissimum identified genes with varied expression patterns.

    PubMed

    Barvkar, Vitthal T; Pardeshi, Varsha C; Kale, Sandip M; Kadoo, Narendra Y; Gupta, Vidya S

    2012-05-08

    The glycosylation process, catalyzed by ubiquitous glycosyltransferase (GT) family enzymes, is a prevalent modification of plant secondary metabolites that regulates various functions such as hormone homeostasis, detoxification of xenobiotics and biosynthesis and storage of secondary metabolites. Flax (Linum usitatissimum L.) is a commercially grown oilseed crop, important because of its essential fatty acids and health promoting lignans. Identification and characterization of UDP glycosyltransferase (UGT) genes from flax could provide valuable basic information about this important gene family and help to explain the seed specific glycosylated metabolite accumulation and other processes in plants. Plant genome sequencing projects are useful to discover complexity within this gene family and also pave way for the development of functional genomics approaches. Taking advantage of the newly assembled draft genome sequence of flax, we identified 137 UDP glycosyltransferase (UGT) genes from flax using a conserved signature motif. Phylogenetic analysis of these protein sequences clustered them into 14 major groups (A-N). Expression patterns of these genes were investigated using publicly available expressed sequence tag (EST), microarray data and reverse transcription quantitative real time PCR (RT-qPCR). Seventy-three per cent of these genes (100 out of 137) showed expression evidence in 15 tissues examined and indicated varied expression profiles. The RT-qPCR results of 10 selected genes were also coherent with the digital expression analysis. Interestingly, five duplicated UGT genes were identified, which showed differential expression in various tissues. Of the seven intron loss/gain positions detected, two intron positions were conserved among most of the UGTs, although a clear relationship about the evolution of these genes could not be established. Comparison of the flax UGTs with orthologs from four other sequenced dicot genomes indicated that seven UGTs were

  13. Phylogenetics of Lophotrochozoan bHLH Genes and the Evolution of Lineage-Specific Gene Duplicates.

    PubMed

    Bao, Yongbo; Xu, Fei; Shimeld, Sebastian M

    2017-04-01

    The gain and loss of genes encoding transcription factors is of importance to understanding the evolution of gene regulatory complexity. The basic helix-loop-helix (bHLH) genes encode a large superfamily of transcription factors. We systematically classify the bHLH genes from five mollusc, two annelid and one brachiopod genomes, tracing the pattern of bHLH gene evolution across these poorly studied Phyla. In total, 56-88 bHLH genes were identified in each genome, with most identifiable as members of previously described bilaterian families, or of new families we define. Of such families only one, Mesp, appears lost by all these species. Additional duplications have also played a role in the evolution of the bHLH gene repertoire, with many new lophotrochozoan-, mollusc-, bivalve-, or gastropod-specific genes defined. Using a combination of transcriptome mining, RT-PCR, and in situ hybridization we compared the expression of several of these novel genes in tissues and embryos of the molluscs Crassostrea gigas and Patella vulgata, finding both conserved expression and evidence for neofunctionalization. We also map the positions of the genes across these genomes, identifying numerous gene linkages. Some reflect recent paralog divergence by tandem duplication, others are remnants of ancient tandem duplications dating to the lophotrochozoan or bilaterian common ancestors. These data are built into a model of the evolution of bHLH genes in molluscs, showing formidable evolutionary stasis at the family level but considerable within-family diversification by tandem gene duplication. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  14. The Recently Identified Isoleucine Conjugate of cis-12-Oxo-Phytodienoic Acid Is Partially Active in cis-12-Oxo-Phytodienoic Acid-Specific Gene Expression of Arabidopsis thaliana

    PubMed Central

    Floková, Kristýna; Miersch, Otto; Strnad, Miroslav; Novák, Ondřej; Wasternack, Claus; Hause, Bettina

    2016-01-01

    Oxylipins of the jasmonate family are active as signals in plant responses to biotic and abiotic stresses as well as in development. Jasmonic acid (JA), its precursor cis-12-oxo-phytodienoic acid (OPDA) and the isoleucine conjugate of JA (JA-Ile) are the most prominent members. OPDA and JA-Ile have individual signalling properties in several processes and differ in their pattern of gene expression. JA-Ile, but not OPDA, is perceived by the SCFCOI1-JAZ co-receptor complex. There are, however, numerous processes and genes specifically induced by OPDA. The recently identified OPDA-Ile suggests that OPDA specific responses might be mediated upon formation of OPDA-Ile. Here, we tested OPDA-Ile-induced gene expression in wild type and JA-deficient, JA-insensitive and JA-Ile-deficient mutant background. Tests on putative conversion of OPDA-Ile during treatments revealed only negligible conversion. Expression of two OPDA-inducible genes, GRX480 and ZAT10, by OPDA-Ile could be detected in a JA-independent manner in Arabidopsis seedlings but less in flowering plants. The data suggest a bioactivity in planta of OPDA-Ile. PMID:27611078

  15. Genomic convergence to identify candidate genes for Alzheimer disease on chromosome 10

    PubMed Central

    Liang, Xueying; Slifer, Michael; Martin, Eden R.; Schnetz-Boutaud, Nathalie; Bartlett, Jackie; Anderson, Brent; Züchner, Stephan; Gwirtsman, Harry; Gilbert, John R.; Pericak-Vance, Margaret A.; Haines, Jonathan L.

    2009-01-01

    A broad region of chromosome 10 (chr10) has engendered continued interest in the etiology of late-onset Alzheimer Disease (LOAD) from both linkage and candidate gene studies. However, there is a very extensive heterogeneity on chr10. We converged linkage analysis and gene expression data using the concept of genomic convergence that suggests that genes showing positive results across multiple different data types are more likely to be involved in AD. We identified and examined 28 genes on chr10 for association with AD in a Caucasian case-control dataset of 506 cases and 558 controls with substantial clinical information. The cases were all LOAD (minimum age at onset ≥ 60 years). Both single marker and haplotypic associations were tested in the overall dataset and 8 subsets defined by age, gender, ApoE and clinical status. PTPLA showed allelic, genotypic and haplotypic association in the overall dataset. SORCS1 was significant in the overall data sets (p=0.0025) and most significant in the female subset (allelic association p=0.00002, a 3-locus haplotype had p=0.0005). Odds Ratio of SORCS1 in the female subset was 1.7 (p<0.0001). SORCS1 is an interesting candidate gene involved in the Aβ pathway. Therefore, genetic variations in PTPLA and SORCS1 may be associated with, and have a modest effect on, the risk of AD by affecting the Aβ pathway. The replication of the effect of these genes in different study populations and search for susceptible variants and functional studies of these genes are necessary to get a better understanding of the roles of the genes in Alzheimer disease. PMID:19241460

  16. Gene Unprediction with Spurio: A tool to identify spurious protein sequences.

    PubMed

    Höps, Wolfram; Jeffryes, Matt; Bateman, Alex

    2018-01-01

    We now have access to the sequences of tens of millions of proteins. These protein sequences are essential for modern molecular biology and computational biology. The vast majority of protein sequences are derived from gene prediction tools and have no experimental supporting evidence for their translation. Despite the increasing accuracy of gene prediction tools, there likely exists a large number of spurious protein predictions in the sequence databases. We have developed the Spurio tool to help identify spurious protein predictions in prokaryotes. Spurio searches the query protein sequence against a prokaryotic nucleotide database using tblastn and identifies homologous sequences. The tblastn matches are used to score the query sequence's likelihood of being a spurious protein prediction using a Gaussian process model. The most informative feature is the appearance of stop codons within the presumed translation of homologous DNA sequences. Benchmarking shows that the Spurio tool is able to distinguish spurious from true proteins. However, transposon proteins are prone to be predicted as spurious because of the frequency of degraded homologs found in the DNA sequence databases. Our initial experiments suggest that less than 1% of the proteins in the UniProtKB sequence database are likely to be spurious and that Spurio is able to identify over 60 times more spurious proteins than the AntiFam resource. The Spurio software and source code are available under an MIT license at the following URL: https://bitbucket.org/bateman-group/spurio.
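
    The feature this abstract calls most informative, stop codons inside the presumed translation of homologous DNA, can be illustrated with a minimal sketch. This is not Spurio's code and does not reproduce its tblastn search or Gaussian process scoring; the function name, the frame argument and the example sequences are assumptions made only for illustration.

```python
# Illustrative sketch only: counts in-frame stop codons in a DNA region.
# Not Spurio's scoring code; the function name and inputs are hypothetical.

STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def count_in_frame_stops(dna: str, frame: int = 0) -> int:
    """Count stop codons read in the given frame (0, 1 or 2) of a DNA string."""
    seq = dna.upper()
    count = 0
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            count += 1
    return count


# Made-up example regions: one reads through cleanly, one is interrupted.
intact = "ATGGCTGCTAAAGGT"    # ATG GCT GCT AAA GGT
degraded = "ATGGCTTAGAAATGA"  # ATG GCT TAG AAA TGA
print(count_in_frame_stops(intact), count_in_frame_stops(degraded))
```

    A region homologous to a genuine protein would be expected to show few or no in-frame stops, whereas many interrupted homologs would point toward a spurious prediction; how such counts are weighted is left to the model described in the abstract.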

  17. Enhancer connectome in primary human cells identifies target genes of disease-associated DNA elements

    PubMed Central

    Mumbach, Maxwell R; Satpathy, Ansuman T; Boyle, Evan A; Dai, Chao; Gowen, Benjamin G; Cho, Seung Woo; Nguyen, Michelle L; Rubin, Adam J; Granja, Jeffrey M; Kazane, Katelynn R; Wei, Yuning; Nguyen, Trieu; Greenside, Peyton G; Corces, M Ryan; Tycko, Josh; Simeonov, Dimitre R; Suliman, Nabeela; Li, Rui; Xu, Jin; Flynn, Ryan A; Kundaje, Anshul; Khavari, Paul A; Marson, Alexander; Corn, Jacob E; Quertermous, Thomas; Greenleaf, William J; Chang, Howard Y

    2018-01-01

    The challenge of linking intergenic mutations to target genes has limited molecular understanding of human diseases. Here we show that H3K27ac HiChIP generates high-resolution contact maps of active enhancers and target genes in rare primary human T cell subtypes and coronary artery smooth muscle cells. Differentiation of naive T cells into T helper 17 cells or regulatory T cells creates subtype-specific enhancer–promoter interactions, specifically at regions of shared DNA accessibility. These data provide a principled means of assigning molecular functions to autoimmune and cardiovascular disease risk variants, linking hundreds of noncoding variants to putative gene targets. Target genes identified with HiChIP are further supported by CRISPR interference and activation at linked enhancers, by the presence of expression quantitative trait loci, and by allele-specific enhancer loops in patient-derived primary cells. The majority of disease-associated enhancers contact genes beyond the nearest gene in the linear genome, leading to a fourfold increase in the number of potential target genes for autoimmune and cardiovascular diseases. PMID:28945252

  18. An elm EST database for identifying leaf beetle egg-induced defense genes

    PubMed Central

    2012-01-01

    Background: Plants can defend themselves against herbivorous insects prior to the onset of larval feeding by responding to the eggs laid on their leaves. In the European field elm (Ulmus minor), egg laying by the elm leaf beetle (Xanthogaleruca luteola) activates the emission of volatiles that attract specialised egg parasitoids, which in turn kill the eggs. Little is known about the transcriptional changes that insect eggs trigger in plants and how such indirect defense mechanisms are orchestrated in the context of other biological processes. Results: Here we present the first large scale study of egg-induced changes in the transcriptional profile of a tree. Five cDNA libraries were generated from leaves of (i) untreated control elms, and elms treated with (ii) egg laying and feeding by elm leaf beetles, (iii) feeding, (iv) artificial transfer of egg clutches, and (v) methyl jasmonate. A total of 361,196 expressed sequence tags (ESTs) were identified which clustered into 52,823 unique transcripts (Unitrans) and were stored in a database with a public web interface. Among the analyzed Unitrans, 73% could be annotated by homology to known genes in the UniProt (Plant) database, particularly to those from Vitis, Ricinus, Populus and Arabidopsis. Comparative in silico analysis among the different treatments revealed differences in Gene Ontology term abundances. Defense- and stress-related gene transcripts were present in high abundance in leaves after herbivore egg laying, but transcripts involved in photosynthesis showed decreased abundance. Many pathogen-related genes and genes involved in phytohormone signaling were expressed, indicative of jasmonic acid biosynthesis and activation of jasmonic acid responsive genes. Cross-comparisons between different libraries based on expression profiles allowed the identification of genes with a potential relevance in egg-induced defenses, as well as other biological processes, including signal transduction, transport and

  19. An elm EST database for identifying leaf beetle egg-induced defense genes.

    PubMed

    Büchel, Kerstin; McDowell, Eric; Nelson, Will; Descour, Anne; Gershenzon, Jonathan; Hilker, Monika; Soderlund, Carol; Gang, David R; Fenning, Trevor; Meiners, Torsten

    2012-06-15

    Plants can defend themselves against herbivorous insects prior to the onset of larval feeding by responding to the eggs laid on their leaves. In the European field elm (Ulmus minor), egg laying by the elm leaf beetle (Xanthogaleruca luteola) activates the emission of volatiles that attract specialised egg parasitoids, which in turn kill the eggs. Little is known about the transcriptional changes that insect eggs trigger in plants and how such indirect defense mechanisms are orchestrated in the context of other biological processes. Here we present the first large scale study of egg-induced changes in the transcriptional profile of a tree. Five cDNA libraries were generated from leaves of (i) untreated control elms, and elms treated with (ii) egg laying and feeding by elm leaf beetles, (iii) feeding, (iv) artificial transfer of egg clutches, and (v) methyl jasmonate. A total of 361,196 expressed sequence tags (ESTs) were identified which clustered into 52,823 unique transcripts (Unitrans) and were stored in a database with a public web interface. Among the analyzed Unitrans, 73% could be annotated by homology to known genes in the UniProt (Plant) database, particularly to those from Vitis, Ricinus, Populus and Arabidopsis. Comparative in silico analysis among the different treatments revealed differences in Gene Ontology term abundances. Defense- and stress-related gene transcripts were present in high abundance in leaves after herbivore egg laying, but transcripts involved in photosynthesis showed decreased abundance. Many pathogen-related genes and genes involved in phytohormone signaling were expressed, indicative of jasmonic acid biosynthesis and activation of jasmonic acid responsive genes. Cross-comparisons between different libraries based on expression profiles allowed the identification of genes with a potential relevance in egg-induced defenses, as well as other biological processes, including signal transduction, transport and primary metabolism

  20. Suppression subtractive hybridization identified differentially expressed genes in lung adenocarcinoma: ERGIC3 as a novel lung cancer-related gene

    PubMed Central

    2013-01-01

    Background: To understand the carcinogenesis caused by accumulated genetic and epigenetic alterations and seek novel biomarkers for various cancers, studying differentially expressed genes between cancerous and normal tissues is crucial. In this study, two cDNA libraries of lung cancer were constructed and screened for identification of differentially expressed genes. Methods: Two cDNA libraries of differentially expressed genes were constructed using lung adenocarcinoma tissue and adjacent nonmalignant lung tissue by suppression subtractive hybridization. The data of the cDNA libraries were then analyzed and compared using bioinformatics analysis. Levels of mRNA and protein were measured by quantitative real-time polymerase chain reaction (q-RT-PCR) and western blot, respectively, and the expression and localization of proteins were determined by immunostaining. Gene functions were investigated using proliferation and migration assays after gene silencing and gene over-expression. Results: Two libraries of differentially expressed genes were obtained. The forward-subtracted library (FSL) and the reverse-subtracted library (RSL) contained 177 and 59 genes, respectively. Bioinformatic analysis demonstrated that these genes were involved in a wide range of cellular functions. The vast majority of these genes were newly identified to be abnormally expressed in lung cancer. In the first stage of the screening for 16 genes, we compared lung cancer tissues with their adjacent non-malignant tissues at the mRNA level, and found six genes (ERGIC3, DDR1, HSP90B1, SDC1, RPSA, and LPCAT1) from the FSL were significantly up-regulated while two genes (GPX3 and TIMP3) from the RSL were significantly down-regulated (P < 0.05). The ERGIC3 protein was also over-expressed in lung cancer tissues and cultured cells, and expression of ERGIC3 was correlated with the degree of differentiation and histological type of lung cancer. The up-regulation of ERGIC3 could promote cellular migration

  1. Association Analysis Suggests SOD2 as a Newly Identified Candidate Gene Associated With Leprosy Susceptibility.

    PubMed

    Ramos, Geovana Brotto; Salomão, Heloisa; Francio, Angela Schneider; Fava, Vinícius Medeiros; Werneck, Renata Iani; Mira, Marcelo Távora

    2016-08-01

    Genetic studies have identified several genes and genomic regions contributing to the control of host susceptibility to leprosy. Here, we test variants of the positional and functional candidate gene SOD2 for association with leprosy in 2 independent population samples. Family-based analysis revealed an association between leprosy and allele G of marker rs295340 (P = .042) and borderline evidence of an association between leprosy and alleles C and A of markers rs4880 (P = .077) and rs5746136 (P = .071), respectively. Findings were validated in an independent case-control sample for markers rs295340 (P = .049) and rs4880 (P = .038). These results suggest SOD2 as a newly identified gene conferring susceptibility to leprosy. © The Author 2016. Published by Oxford University Press for the Infectious Diseases Society of America.

  2. Featured Article: Transcriptional landscape analysis identifies differently expressed genes involved in follicle-stimulating hormone induced postmenopausal osteoporosis.

    PubMed

    Maasalu, Katre; Laius, Ott; Zhytnik, Lidiia; Kõks, Sulev; Prans, Ele; Reimann, Ene; Märtson, Aare

    2017-01-01

    Osteoporosis is a disorder associated with bone tissue reorganization, bone mass, and mineral density. Osteoporosis can severely affect postmenopausal women, causing bone fragility and osteoporotic fractures. The aim of the current study was to compare blood mRNA profiles of postmenopausal women with and without osteoporosis, in order to find differences in gene expression and thus targets for future osteoporosis biomarker studies. Our study consisted of transcriptome analysis of whole blood serum from 12 elderly female osteoporotic patients and 12 non-osteoporotic elderly female controls. The transcriptome analysis was performed with RNA sequencing technology. For data analysis, the edgeR package of R Bioconductor was used. Two hundred and fourteen genes were expressed differently in osteoporotic compared with non-osteoporotic patients. Statistical analysis revealed 20 differently expressed genes with a false discovery rate of less than 1.47 × 10^-4 among osteoporotic patients. Ten genes were up-regulated and 10 were down-regulated. Further statistical analysis identified a potential osteoporosis mRNA biomarker pattern consisting of six genes: CACNA1G, ALG13, SBK1, GGT7, MBNL3, and RIOK3. Functional ingenuity pathway analysis identified the strongest candidate genes with regard to potential involvement in a follicle-stimulating hormone activated network of increased osteoclast activity and hypogonadal bone loss. The differentially expressed genes identified in this study may contribute to future research of postmenopausal osteoporosis blood biomarkers.
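
    The false discovery rate threshold quoted in this abstract is the kind of value produced by a multiple-testing adjustment of per-gene p-values. The abstract does not say which adjustment was applied; the sketch below assumes the Benjamini-Hochberg procedure, and the function name and example p-values are hypothetical.

```python
# Minimal Benjamini-Hochberg sketch. The choice of BH is an assumption;
# the abstract does not state which FDR adjustment the authors used.

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

    Genes whose adjusted value falls below a chosen cutoff would then be reported as differentially expressed at that false discovery rate.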

  3. Integrating transcriptome and genome re-sequencing data to identify key genes and mutations affecting chicken eggshell qualities.

    PubMed

    Zhang, Quan; Zhu, Feng; Liu, Long; Zheng, Chuan Wei; Wang, De He; Hou, Zhuo Cheng; Ning, Zhong Hua

    2015-01-01

    Eggshell damage leads to economic losses in the egg production industry and is a threat to human health. We examined 49-wk-old Rhode Island White hens (Gallus gallus) that laid eggs having shells with significantly different strengths and thicknesses. We used HiSeq 2000 (Illumina) sequencing to characterize the chicken transcriptome and whole genome to identify the key genes and genetic mutations associated with eggshell calcification. We identified a total of 14,234 genes expressed in the chicken uterus, representing 89% of all annotated chicken genes. A total of 889 differentially expressed genes (DEGs) were identified by comparing low eggshell strength (LES) and normal eggshell strength (NES) genomes. The DEGs are enriched in calcification-related processes, including calcium ion transport and calcium signaling pathways, as revealed by gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Some important matrix proteins, such as OC-116, LTF and SPP1, were also expressed differentially between the two groups. A total of 3,671,919 single-nucleotide polymorphisms (SNPs) and 508,035 Indels were detected in protein coding genes by whole-genome re-sequencing, including 1775 non-synonymous variations and 19 frame-shift Indels in DEGs. SNPs and Indels found in this study could be further investigated for eggshell traits. This is the first report to integrate transcriptome and genome re-sequencing to target the genetic variations that decrease eggshell quality. These findings further advance our understanding of eggshell calcification in the chicken uterus.

  4. Barcode Sequencing Screen Identifies SUB1 as a Regulator of Yeast Pheromone Inducible Genes

    PubMed Central

    Sliva, Anna; Kuang, Zheng; Meluh, Pamela B.; Boeke, Jef D.

    2016-01-01

    The yeast pheromone response pathway serves as a valuable model of eukaryotic mitogen-activated protein kinase (MAPK) pathways, and transcription of their downstream targets. Here, we describe application of a screening method combining two technologies: fluorescence-activated cell sorting (FACS), and barcode analysis by sequencing (Bar-Seq). Using this screening method, and pFUS1-GFP as a reporter for MAPK pathway activation, we readily identified mutants in known mating pathway components. In this study, we also include a comprehensive analysis of the FUS1 induction properties of known mating pathway mutants by flow cytometry, featuring single cell analysis of each mutant population. We also characterized a new source of false positives resulting from the design of this screen. Additionally, we identified a deletion mutant, sub1Δ, with increased basal expression of pFUS1-GFP. Here, in the first ChIP-Seq of Sub1, our data show that Sub1 binds to the promoters of about half the genes in the genome (tripling the 991 loci previously reported), including the promoters of several pheromone-inducible genes, some of which show an increase upon pheromone induction. Here, we also present the first RNA-Seq of a sub1Δ mutant; the majority of genes have no change in RNA, but, of the small subset that do, most show decreased expression, consistent with biochemical studies implicating Sub1 as a positive transcriptional regulator. The RNA-Seq data also show that certain pheromone-inducible genes are induced less in the sub1Δ mutant relative to the wild type, supporting a role for Sub1 in regulation of mating pathway genes. The sub1Δ mutant has increased basal levels of a small subset of other genes besides FUS1, including IMD2 and FIG1, a gene encoding an integral membrane protein necessary for efficient mating. PMID:26837954

  5. Feature genes in metastatic breast cancer identified by MetaDE and SVM classifier methods.

    PubMed

    Tuo, Youlin; An, Ning; Zhang, Ming

    2018-03-01

    The aim of the present study was to investigate the feature genes in metastatic breast cancer samples. A total of 5 expression profiles of metastatic breast cancer samples were downloaded from the Gene Expression Omnibus database, which were then analyzed using the MetaQC and MetaDE packages in R language. The feature genes between metastasis and non‑metastasis samples were screened under the threshold of P<0.05. Based on the protein‑protein interactions (PPIs) in the Biological General Repository for Interaction Datasets, Human Protein Reference Database and Biomolecular Interaction Network Database, the PPI network of the feature genes was constructed. The feature genes identified by topological characteristics were then used for support vector machine (SVM) classifier training and verification. The accuracy of the SVM classifier was then evaluated using another independent dataset from The Cancer Genome Atlas database. Finally, function and pathway enrichment analyses for genes in the SVM classifier were performed. A total of 541 feature genes were identified between metastatic and non‑metastatic samples. The top 10 genes with the highest betweenness centrality values in the PPI network of feature genes were Nuclear RNA Export Factor 1, cyclin‑dependent kinase 2 (CDK2), myelocytomatosis proto‑oncogene protein (MYC), Cullin 5, SHC Adaptor Protein 1, Clathrin heavy chain, Nucleolin, WD repeat domain 1, proteasome 26S subunit non‑ATPase 2 and telomeric repeat binding factor 2. The cyclin‑dependent kinase inhibitor 1A (CDKN1A), E2F transcription factor 1 (E2F1), and MYC interacted with CDK2. The SVM classifier constructed by the top 30 feature genes was able to distinguish metastatic samples from non‑metastatic samples [correct rate, specificity, positive predictive value and negative predictive value >0.89; sensitivity >0.84; area under the receiver operating characteristic curve (AUROC) >0.96]. The verification of the SVM classifier in an

  6. Transcriptomic meta-analysis identifies gene expression characteristics in various samples of HIV-infected patients with nonprogressive disease.

    PubMed

    Zhang, Le-Le; Zhang, Zi-Ning; Wu, Xian; Jiang, Yong-Jun; Fu, Ya-Jing; Shang, Hong

    2017-09-12

    A small proportion of HIV-infected patients remain clinically and/or immunologically stable for years, including elite controllers (ECs) who have undetectable viremia (<50 copies/ml) and long-term nonprogressors (LTNPs) who maintain normal CD4+ T cell counts for prolonged periods (>10 years). However, the mechanism of nonprogression needs to be further resolved. In this study, a transcriptome meta-analysis was performed on nonprogressor and progressor microarray data to identify differential transcriptome pathways and potential biomarkers. Using the INMEX (integrative meta-analysis of expression data) program, we performed the meta-analysis to identify consistently differentially expressed genes (DEGs) in nonprogressors and further performed functional interpretation (gene ontology analysis and pathway analysis) of the DEGs identified in the meta-analysis. Five microarray datasets (81 cases and 98 controls in total), including whole blood, CD4+ and CD8+ T cells, were collected for meta-analysis. We determined that nonprogressors have reduced expression of important interferon-stimulated genes (ISGs), CD38, and lymphocyte activation gene 3 (LAG-3) in whole blood, CD4+ and CD8+ T cells. Gene ontology (GO) analysis showed a significant enrichment in DEGs that function in the type I interferon signaling pathway. Upregulated pathways, including the PI3K-Akt signaling pathway in whole blood, cytokine-cytokine receptor interaction in CD4+ T cells and the MAPK signaling pathway in CD8+ T cells, were identified in nonprogressors compared with progressors. In each metabolic functional category, the number of downregulated DEGs was more than the upregulated DEGs, and almost all genes were downregulated DEGs in the oxidative phosphorylation (OXPHOS) and tricarboxylic acid (TCA) cycle in the three types of samples. Our transcriptomic meta-analysis provides a comprehensive evaluation of the gene expression profiles in major blood types of nonprogressors, providing new

  7. Global Gene-Expression Analysis to Identify Differentially Expressed Genes Critical for the Heat Stress Response in Brassica rapa

    PubMed Central

    Dong, Xiangshu; Yi, Hankuil; Lee, Jeongyeo; Nou, Ill-Sup; Han, Ching-Tack; Hur, Yoonkang

    2015-01-01

    Genome-wide dissection of the heat stress response (HSR) is necessary to overcome problems in crop production caused by global warming. To identify HSR genes, we profiled gene expression in two Chinese cabbage inbred lines with different thermotolerances, Chiifu and Kenshin. Many genes exhibited >2-fold changes in expression upon exposure to 0.5–4 h at 45°C (high temperature, HT): 5.2% (2,142 genes) in Chiifu and 3.7% (1,535 genes) in Kenshin. The most enriched GO (Gene Ontology) items included ‘response to heat’, ‘response to reactive oxygen species (ROS)’, ‘response to temperature stimulus’, ‘response to abiotic stimulus’, and ‘MAPKKK cascade’. In both lines, the genes most highly induced by HT encoded small heat shock proteins (Hsps) and heat shock factor (Hsf)-like proteins such as HsfB2A (Bra029292), whereas high-molecular weight Hsps were constitutively expressed. Other upstream HSR components were also up-regulated: ROS-scavenging genes like glutathione peroxidase 2 (BrGPX2, Bra022853), protein kinases, and phosphatases. Among heat stress (HS) marker genes in Arabidopsis, only exportin 1A (XPO1A) (Bra008580, Bra006382) can be applied to B. rapa for basal thermotolerance (BT) and short-term acquired thermotolerance (SAT) gene. CYP707A3 (Bra025083, Bra021965), which is involved in the dehydration response in Arabidopsis, was associated with membrane leakage in both lines following HS. Although many transcription factor (TF) genes, including DREB2A (Bra005852), were involved in HS tolerance in both lines, Bra024224 (MYB41) and Bra021735 (a bZIP/AIR1 [Anthocyanin-Impaired-Response-1]) were specific to Kenshin. Several candidate TFs involved in thermotolerance were confirmed as HSR genes by real-time PCR, and these assignments were further supported by promoter analysis. Although some of our findings are similar to those obtained using other plant species, clear differences in Brassica rapa reveal a distinct HSR in this species. Our data

  8. Identifying osteosarcoma metastasis associated genes by weighted gene co-expression network analysis (WGCNA).

    PubMed

    Tian, Honglai; Guan, Donghui; Li, Jianmin

    2018-06-01

    Osteosarcoma (OS), the most common malignant bone tumor, poses a serious health threat to children and adolescents. OS usually correlates with early metastasis and a high death rate. This study aimed to better understand the mechanism of OS metastasis. From the Gene Expression Omnibus (GEO) database, we downloaded 4 expression profile data sets associated with OS metastasis and selected differentially expressed genes. The weighted gene co-expression network analysis (WGCNA) approach allowed us to identify the module most correlated with OS metastasis. Gene Ontology functional and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were used to annotate the selected OS metastasis-associated genes. We selected 897 differentially expressed genes between the OS metastasis and OS non-metastasis groups. Based on these selected genes, WGCNA identified 142 genes in the module most correlated with OS metastasis. Gene Ontology functional and KEGG pathway enrichment analyses showed that the OS metastasis-associated genes were significantly involved in pathways related to insulin-like growth factor binding. Our research identified several molecules that potentially participate in the metastasis process and factors that may act as biomarkers. This study may help to explore the mechanism of OS metastasis and to discover further therapeutic targets.
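
    The abstract names WGCNA without describing it. As background, the first step of an unsigned WGCNA network is usually a soft-thresholded adjacency, the absolute Pearson correlation between two genes raised to a power beta. The sketch below shows only that step, under the assumption of an unsigned network with beta = 6; the gene names and expression values are made up, and module detection and trait correlation are not covered.

```python
# Sketch of the soft-threshold adjacency used in unsigned co-expression
# networks: a_ij = |cor(x_i, x_j)| ** beta. Gene names, expression values
# and beta = 6 are illustrative assumptions, not values from the study.
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(expr, beta=6):
    """Map each ordered gene pair to |correlation| ** beta."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Hypothetical expression profiles across four samples.
expression = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],
    "geneC": [4, 3, 2, 1],
    "geneD": [1, 3, 2, 4],
}
adjacency = soft_threshold_adjacency(expression)
print(round(adjacency[("geneA", "geneD")], 6))
```

    Raising the correlation to a power keeps strong co-expression links close to 1 while pushing weak ones toward 0, which is what lets tightly connected modules stand out.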

  9. Expressed sequences tags of the anther smut fungus, Microbotryum violaceum, identify mating and pathogenicity genes

    PubMed Central

    Yockteng, Roxana; Marthey, Sylvain; Chiapello, Hélène; Gendrault, Annie; Hood, Michael E; Rodolphe, François; Devier, Benjamin; Wincker, Patrick; Dossat, Carole; Giraud, Tatiana

    2007-01-01

    Background: The basidiomycete fungus Microbotryum violaceum is responsible for the anther-smut disease in many plants of the Caryophyllaceae family and is a model in genetics and evolutionary biology. Infection is initiated by dikaryotic hyphae produced after the conjugation of two haploid sporidia of opposite mating type. This study describes M. violaceum ESTs corresponding to nuclear genes expressed during conjugation and early hyphal production. Results: A normalized cDNA library generated 24,128 sequences, which were assembled into 7,765 unique genes; 25.2% of them displayed significant similarity to annotated proteins from other organisms, 74.3% a weak similarity to the same set of known proteins, and 0.5% were orphans. We identified putative pheromone receptors and genes that in other fungi are involved in the mating process. We also identified many sequences similar to genes known to be involved in pathogenicity in other fungi. The M. violaceum EST database, MICROBASE, is available on the Web and provides access to the sequences, assembled contigs, annotations and programs to compare similarities against MICROBASE. Conclusion: This study provides a basis for cloning the mating type locus, for further investigation of pathogenicity genes in the anther smut fungi, and for comparative genomics. PMID:17692127

  10. Determining Semantically Related Significant Genes.

    PubMed

    Taha, Kamal

    2014-01-01

    GO relation embodies some aspects of existence dependency. If GO term x is existence-dependent on GO term y, the presence of y implies the presence of x. Therefore, the genes annotated with the function of the GO term y are usually functionally and semantically related to the genes annotated with the function of the GO term x. A large number of gene set enrichment analysis methods have been developed in recent years for analyzing gene set enrichment. However, most of these methods overlook the structural dependencies between GO terms in the GO graph by not considering the concept of existence dependency. We propose in this paper a biological search engine called RSGSearch that identifies enriched sets of genes annotated with different functions using the concept of existence dependency. We observe that GO term x cannot be existence-dependent on GO term y if x and y have the same specificity (biological characteristics). After encoding into a numeric format the contributions of GO terms annotating target genes to the semantics of their lowest common ancestors (LCAs), RSGSearch uses microarray experiment to identify the most significant LCA that annotates the result genes. We evaluated RSGSearch experimentally and compared it with five gene set enrichment systems. Results showed marked improvement.

  11. Novel statistical framework to identify differentially expressed genes allowing transcriptomic background differences.

    PubMed

    Ling, Zhi-Qiang; Wang, Yi; Mukaisho, Kenichi; Hattori, Takanori; Tatsuta, Takeshi; Ge, Ming-Hua; Jin, Li; Mao, Wei-Min; Sugihara, Hiroyuki

    2010-06-01

    Tests of differentially expressed genes (DEGs) from microarray experiments are based on the null hypothesis that genes that are irrelevant to the phenotype/stimulus are expressed equally in the target and control samples. However, this strict hypothesis is not always true, as there can be several transcriptomic background differences between target and control samples, including different cell/tissue types, different cell cycle stages and different biological donors. These differences lead to increased false positives, which have little biological/medical significance. In this article, we propose a statistical framework to identify DEGs between target and control samples from expression microarray data allowing transcriptomic background differences between these samples by introducing a modified null hypothesis that the gene expression background difference is normally distributed. We use an iterative procedure to perform robust estimation of the null hypothesis and identify DEGs as outliers. We evaluated our method using our own triplicate microarray experiment, followed by validations with reverse transcription-polymerase chain reaction (RT-PCR) and on the MicroArray Quality Control dataset. The evaluations suggest that our technique (i) results in fewer false-positive and false-negative results, as measured by the degree of agreement with RT-PCR of the same samples, (ii) can be applied to different microarray platforms and results in better reproducibility as measured by the degree of DEG identification concordance both intra- and inter-platform and (iii) can be applied efficiently with only a few microarray replicates. Based on these evaluations, we propose that this method not only identifies more reliable and biologically/medically significant DEGs, but also reduces the power-cost tradeoff problem in the microarray field. Source code and binaries are freely available for download at http://comonca.org.cn/fdca/resources/softwares/deg.zip.
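
    The idea of fitting a normally distributed background difference and treating DEGs as outliers can be sketched generically. The code below is not the authors' procedure, whose details are not given in the abstract; it is a simple iterative trimming scheme in which the cutoff k, the convergence rule, the function name and the example log-ratios are all assumptions.

```python
# Generic illustration of "fit a normal null iteratively, call outliers DEGs".
# Not the published algorithm: the trimming rule, the cutoff k and the input
# data are assumptions made for this sketch.
from statistics import mean, pstdev


def fit_null_and_flag(log_ratios, k=3.0, max_iter=50):
    """Return (mu, sigma, outlier_indices) for a list of per-gene log-ratios."""
    values = list(log_ratios)
    inliers = set(range(len(values)))
    mu = sigma = 0.0
    for _ in range(max_iter):
        sample = [values[i] for i in inliers]
        # Re-estimate the null distribution from the current inliers only.
        mu, sigma = mean(sample), pstdev(sample)
        new_inliers = {i for i, v in enumerate(values) if abs(v - mu) <= k * sigma}
        if new_inliers == inliers or not new_inliers:
            break  # converged (or nothing left to fit)
        inliers = new_inliers
    outliers = sorted(set(range(len(values))) - inliers)
    return mu, sigma, outliers


# Made-up data: 44 genes sharing a background shift near 0.5, plus two genes
# with large positive and negative changes.
background = [0.5 + 0.01 * ((i % 11) - 5) for i in range(44)]
mu, sigma, flagged = fit_null_and_flag(background + [4.0, -3.0])
print(round(mu, 3), flagged)
```

    In this toy data the fitted mean settles on the shared background shift rather than on zero, so genes that merely follow the background difference are not flagged; only the two extreme genes are.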

  12. Sarcoidosis Related Novel Candidate Genes Identified by Multi-Omics Integrative Analyses.

    PubMed

    Hočevar, Keli; Maver, Aleš; Kunej, Tanja; Peterlin, Borut

    2018-05-01

    Sarcoidosis is a multifactorial systemic disease characterized by granulomatous inflammation, with a substantial impact on global public health. The etiology and mechanisms of sarcoidosis are not fully understood. Recent high-throughput biological research has generated vast amounts of multi-omics big data on sarcoidosis, but their significance remains to be determined. We sought to identify novel candidate regions and genes consistently altered in heterogeneous omics studies so as to reveal the underlying molecular mechanisms. We conducted a comprehensive integrative literature analysis on global data on sarcoidosis, including genomic, transcriptomic, proteomic, and phenomic studies. We performed positional integration analysis of 38 eligible datasets originating from 17 different biological layers. Using the integration interval length of 50 kb, we identified 54 regions reaching significance value p ≤ 0.0001 and 15 regions with significance value p ≤ 0.00001, when applying more stringent criteria. Secondary literature analysis of the top 20 regions, with the most significant accumulation of signals, revealed several novel candidate genes for which associations with sarcoidosis have not yet been established, but have considerable support for their involvement based on omic data. These new plausible candidate genes include NELFE, CFB, EGFL7, AGPAT2, FKBPL, NRC3, and NEU1. Furthermore, annotated data were prepared to enable custom visualization and browsing of this sarcoidosis-related omics evidence in the University of California Santa Cruz (UCSC) Genome Browser. Further multi-omics approaches are called for to advance sarcoidosis biomarkers and diagnostic and therapeutic innovation. Our approach for harnessing multi-omics data and the findings presented herein reflect important steps toward understanding the etiology and underlying pathological mechanisms of sarcoidosis.
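
    Positional integration with a fixed interval length can be pictured as binning signals from different data layers into 50 kb windows and looking for windows supported by several layers. The sketch below does only that counting step; it does not compute the significance values reported in the abstract, and the layer names, coordinates and function names are hypothetical.

```python
# Sketch of positional binning across omics layers. Only the 50 kb interval
# length comes from the abstract; layers, coordinates and the ranking rule
# are illustrative assumptions, and no p-values are computed here.
from collections import defaultdict

WINDOW = 50_000  # integration interval length in base pairs


def layers_per_window(signals, window=WINDOW):
    """Map (chrom, window_index) to the set of layers with a signal there.

    `signals` is an iterable of (layer, chrom, position) tuples.
    """
    hits = defaultdict(set)
    for layer, chrom, pos in signals:
        hits[(chrom, pos // window)].add(layer)
    return dict(hits)


def top_windows(signals, min_layers=2, window=WINDOW):
    """Return windows supported by at least `min_layers` layers, best first."""
    hits = layers_per_window(signals, window)
    ranked = [(key, sorted(layers)) for key, layers in hits.items()
              if len(layers) >= min_layers]
    ranked.sort(key=lambda item: (-len(item[1]), item[0]))
    return ranked


# Made-up signals from three hypothetical layers.
signals = [
    ("gwas", "chr6", 31_900_000),
    ("expression", "chr6", 31_920_000),
    ("proteome", "chr6", 31_949_999),
    ("gwas", "chr9", 139_500_000),
]
print(top_windows(signals))
```

    A real analysis would also need a statistical model of how often signals coincide by chance, which is where the reported significance thresholds come in.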

  13. Random forests-based differential analysis of gene sets for gene expression data.

    PubMed

    Hsueh, Huey-Miin; Zhou, Da-Wei; Tsai, Chen-An

    2013-04-10

    In DNA microarray studies, gene-set analysis (GSA) has become the focus of gene expression data analysis. GSA utilizes the gene expression profiles of functionally related gene sets in Gene Ontology (GO) categories or a priori defined biological classes to assess the significance of gene sets associated with clinical outcomes or phenotypes. Many statistical approaches have been proposed to determine whether such functionally related gene sets express differentially (enrichment and/or deletion) in variations of phenotypes. However, little attention has been given to the discriminatory power of gene sets and classification of patients. In this study, we propose a method of gene set analysis, in which gene sets are used to develop classifications of patients based on the Random Forest (RF) algorithm. The corresponding empirical p-value of an observed out-of-bag (OOB) error rate of the classifier is introduced to identify differentially expressed gene sets using an adequate resampling method. In addition, we discuss the impacts and correlations of genes within each gene set based on the measures of variable importance in the RF algorithm. Significant classifications are reported and visualized together with the underlying gene sets and their contribution to the phenotypes of interest. Numerical studies using both synthesized data and a series of publicly available gene expression data sets are conducted to evaluate the performance of the proposed methods. Compared with other hypothesis testing approaches, our proposed methods are reliable and successful in identifying enriched gene sets and in discovering the contributions of genes within a gene set. The classification results of identified gene sets can provide a valuable alternative to gene set testing to reveal the unknown, biologically relevant classes of samples or patients. In summary, our proposed method allows one to simultaneously assess the discriminatory ability of gene sets and the importance of genes for
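
    The empirical p-value of an observed error rate mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: label permutation stands in for the "adequate resampling method", the (1 + count) / (1 + B) correction is a common convention rather than something the abstract specifies, and `error_fn` is a hypothetical callback that would, in practice, refit a Random Forest on the permuted labels and return its OOB error rate.

```python
import random


def empirical_p_value(observed_error, null_errors):
    """Share of null error rates that are at least as low as the observed one.

    The (1 + count) / (1 + B) form keeps the estimate away from exactly zero.
    """
    as_extreme = sum(1 for e in null_errors if e <= observed_error)
    return (1 + as_extreme) / (1 + len(null_errors))


def permutation_null(labels, error_fn, n_perm=999, seed=0):
    """Recompute an error rate after shuffling the phenotype labels n_perm times.

    `error_fn` is a hypothetical callback taking a list of labels and returning
    an error rate (for example, the OOB error of a refitted classifier).
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    null_errors = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_errors.append(error_fn(shuffled))
    return null_errors


# Toy stand-in for a classifier: fixed predictions compared against the labels.
predictions = [0, 0, 0, 0, 1, 1, 1, 1]
true_labels = [0, 0, 0, 0, 1, 1, 1, 1]


def toy_error(labels):
    """Misclassification rate of the fixed toy predictions."""
    return sum(p != y for p, y in zip(predictions, labels)) / len(labels)


observed = toy_error(true_labels)
p_value = empirical_p_value(observed, permutation_null(true_labels, toy_error))
```

    A small p-value here means that few label permutations produced an error rate as low as the observed one, which is the sense in which a gene set would be called differentially expressed.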

  14. Integrated Analysis of Mutation Data from Various Sources Identifies Key Genes and Signaling Pathways in Hepatocellular Carcinoma

    PubMed Central

    Wei, Lin; Tang, Ruqi; Lian, Baofeng; Zhao, Yingjun; He, Xianghuo; Xie, Lu

    2014-01-01

    Background Recently, a number of studies have performed genome or exome sequencing of hepatocellular carcinoma (HCC) and identified hundreds or even thousands of mutations in protein-coding genes. However, these studies have only focused on a limited number of candidate genes, and many important mutation resources remain to be explored. Principal Findings In this study, we integrated mutation data obtained from various sources and performed pathway and network analysis. We identified 113 pathways that were significantly mutated in HCC samples and found that the mutated genes included in these pathways contained high percentages of known cancer genes, and damaging genes and also demonstrated high conservation scores, indicating their important roles in liver tumorigenesis. Five classes of pathways that were mutated most frequently included (a) proliferation and apoptosis related pathways, (b) tumor microenvironment related pathways, (c) neural signaling related pathways, (d) metabolic related pathways, and (e) circadian related pathways. Network analysis further revealed that the mutated genes with the highest betweenness coefficients, such as the well-known cancer genes TP53, CTNNB1 and recently identified novel mutated genes GNAL and the ADCY family, may play key roles in these significantly mutated pathways. Finally, we highlight several key genes (e.g., RPS6KA3 and PCLO) and pathways (e.g., axon guidance) in which the mutations were associated with clinical features. Conclusions Our workflow illustrates the increased statistical power of integrating multiple studies of the same subject, which can provide biological insights that would otherwise be masked under individual sample sets. This type of bioinformatics approach is consistent with the necessity of making the best use of the ever increasing data provided in valuable databases, such as TCGA, to enhance the speed of deciphering human cancers. PMID:24988079

  15. Integrated analysis of mutation data from various sources identifies key genes and signaling pathways in hepatocellular carcinoma.

    PubMed

    Zhang, Yuannv; Qiu, Zhaoping; Wei, Lin; Tang, Ruqi; Lian, Baofeng; Zhao, Yingjun; He, Xianghuo; Xie, Lu

    2014-01-01

    Recently, a number of studies have performed genome or exome sequencing of hepatocellular carcinoma (HCC) and identified hundreds or even thousands of mutations in protein-coding genes. However, these studies have only focused on a limited number of candidate genes, and many important mutation resources remain to be explored. In this study, we integrated mutation data obtained from various sources and performed pathway and network analysis. We identified 113 pathways that were significantly mutated in HCC samples and found that the mutated genes included in these pathways contained high percentages of known cancer genes, and damaging genes and also demonstrated high conservation scores, indicating their important roles in liver tumorigenesis. Five classes of pathways that were mutated most frequently included (a) proliferation and apoptosis related pathways, (b) tumor microenvironment related pathways, (c) neural signaling related pathways, (d) metabolic related pathways, and (e) circadian related pathways. Network analysis further revealed that the mutated genes with the highest betweenness coefficients, such as the well-known cancer genes TP53, CTNNB1 and recently identified novel mutated genes GNAL and the ADCY family, may play key roles in these significantly mutated pathways. Finally, we highlight several key genes (e.g., RPS6KA3 and PCLO) and pathways (e.g., axon guidance) in which the mutations were associated with clinical features. Our workflow illustrates the increased statistical power of integrating multiple studies of the same subject, which can provide biological insights that would otherwise be masked under individual sample sets. This type of bioinformatics approach is consistent with the necessity of making the best use of the ever increasing data provided in valuable databases, such as TCGA, to enhance the speed of deciphering human cancers.
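
    The network analysis in this record ranks mutated genes by betweenness. As a minimal sketch of that measure, the code below computes unnormalised betweenness for an unweighted, undirected graph using Brandes' algorithm. The choice of algorithm, the function name, and the toy network are assumptions made for illustration; the abstract does not say how the authors computed their betweenness coefficients.

```python
from collections import deque


def betweenness(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph.

    `adjacency` maps each node to an iterable of its neighbours.
    Returns unnormalised betweenness scores, counting each node pair once.
    """
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        preds = {v: [] for v in adjacency}
        n_paths = {v: 0 for v in adjacency}
        dist = {v: -1 for v in adjacency}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Accumulate pair dependencies, starting from the farthest nodes.
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += n_paths[v] / n_paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Every undirected pair was visited from both of its endpoints.
    return {v: s / 2 for v, s in score.items()}


# Hypothetical toy network; nodes and edges are illustrative, not from the study.
toy_network = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B", "E"],
    "E": ["D"],
}
scores = betweenness(toy_network)
hub = max(scores, key=scores.get)
```

    In the toy network, node B lies on the most shortest paths between other nodes, so it plays the role that high-betweenness genes play in the pathway network described above.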

  16. Genome-wide Analyses Identify KIF5A as a Novel ALS Gene.

    PubMed

    Nicolas, Aude; Kenna, Kevin P; Renton, Alan E; Ticozzi, Nicola; Faghri, Faraz; Chia, Ruth; Dominov, Janice A; Kenna, Brendan J; Nalls, Mike A; Keagle, Pamela; Rivera, Alberto M; van Rheenen, Wouter; Murphy, Natalie A; van Vugt, Joke J F A; Geiger, Joshua T; Van der Spek, Rick A; Pliner, Hannah A; Shankaracharya; Smith, Bradley N; Marangi, Giuseppe; Topp, Simon D; Abramzon, Yevgeniya; Gkazi, Athina Soragia; Eicher, John D; Kenna, Aoife; Mora, Gabriele; Calvo, Andrea; Mazzini, Letizia; Riva, Nilo; Mandrioli, Jessica; Caponnetto, Claudia; Battistini, Stefania; Volanti, Paolo; La Bella, Vincenzo; Conforti, Francesca L; Borghero, Giuseppe; Messina, Sonia; Simone, Isabella L; Trojsi, Francesca; Salvi, Fabrizio; Logullo, Francesco O; D'Alfonso, Sandra; Corrado, Lucia; Capasso, Margherita; Ferrucci, Luigi; Moreno, Cristiane de Araujo Martins; Kamalakaran, Sitharthan; Goldstein, David B; Gitler, Aaron D; Harris, Tim; Myers, Richard M; Phatnani, Hemali; Musunuri, Rajeeva Lochan; Evani, Uday Shankar; Abhyankar, Avinash; Zody, Michael C; Kaye, Julia; Finkbeiner, Steven; Wyman, Stacia K; LeNail, Alex; Lima, Leandro; Fraenkel, Ernest; Svendsen, Clive N; Thompson, Leslie M; Van Eyk, Jennifer E; Berry, James D; Miller, Timothy M; Kolb, Stephen J; Cudkowicz, Merit; Baxi, Emily; Benatar, Michael; Taylor, J Paul; Rampersaud, Evadnie; Wu, Gang; Wuu, Joanne; Lauria, Giuseppe; Verde, Federico; Fogh, Isabella; Tiloca, Cinzia; Comi, Giacomo P; Sorarù, Gianni; Cereda, Cristina; Corcia, Philippe; Laaksovirta, Hannu; Myllykangas, Liisa; Jansson, Lilja; Valori, Miko; Ealing, John; Hamdalla, Hisham; Rollinson, Sara; Pickering-Brown, Stuart; Orrell, Richard W; Sidle, Katie C; Malaspina, Andrea; Hardy, John; Singleton, Andrew B; Johnson, Janel O; Arepalli, Sampath; Sapp, Peter C; McKenna-Yasek, Diane; Polak, Meraida; Asress, Seneshaw; Al-Sarraj, Safa; King, Andrew; Troakes, Claire; Vance, Caroline; de Belleroche, Jacqueline; Baas, Frank; Ten Asbroek, Anneloor L M A; Muñoz-Blanco, José Luis; 
Hernandez, Dena G; Ding, Jinhui; Gibbs, J Raphael; Scholz, Sonja W; Floeter, Mary Kay; Campbell, Roy H; Landi, Francesco; Bowser, Robert; Pulst, Stefan M; Ravits, John M; MacGowan, Daniel J L; Kirby, Janine; Pioro, Erik P; Pamphlett, Roger; Broach, James; Gerhard, Glenn; Dunckley, Travis L; Brady, Christopher B; Kowall, Neil W; Troncoso, Juan C; Le Ber, Isabelle; Mouzat, Kevin; Lumbroso, Serge; Heiman-Patterson, Terry D; Kamel, Freya; Van Den Bosch, Ludo; Baloh, Robert H; Strom, Tim M; Meitinger, Thomas; Shatunov, Aleksey; Van Eijk, Kristel R; de Carvalho, Mamede; Kooyman, Maarten; Middelkoop, Bas; Moisse, Matthieu; McLaughlin, Russell L; Van Es, Michael A; Weber, Markus; Boylan, Kevin B; Van Blitterswijk, Marka; Rademakers, Rosa; Morrison, Karen E; Basak, A Nazli; Mora, Jesús S; Drory, Vivian E; Shaw, Pamela J; Turner, Martin R; Talbot, Kevin; Hardiman, Orla; Williams, Kelly L; Fifita, Jennifer A; Nicholson, Garth A; Blair, Ian P; Rouleau, Guy A; Esteban-Pérez, Jesús; García-Redondo, Alberto; Al-Chalabi, Ammar; Rogaeva, Ekaterina; Zinman, Lorne; Ostrow, Lyle W; Maragakis, Nicholas J; Rothstein, Jeffrey D; Simmons, Zachary; Cooper-Knock, Johnathan; Brice, Alexis; Goutman, Stephen A; Feldman, Eva L; Gibson, Summer B; Taroni, Franco; Ratti, Antonia; Gellera, Cinzia; Van Damme, Philip; Robberecht, Wim; Fratta, Pietro; Sabatelli, Mario; Lunetta, Christian; Ludolph, Albert C; Andersen, Peter M; Weishaupt, Jochen H; Camu, William; Trojanowski, John Q; Van Deerlin, Vivianna M; Brown, Robert H; van den Berg, Leonard H; Veldink, Jan H; Harms, Matthew B; Glass, Jonathan D; Stone, David J; Tienari, Pentti; Silani, Vincenzo; Chiò, Adriano; Shaw, Christopher E; Traynor, Bryan J; Landers, John E

    2018-03-21

    To identify novel genes associated with ALS, we undertook two lines of investigation. We carried out a genome-wide association study comparing 20,806 ALS cases and 59,804 controls. Independently, we performed a rare variant burden analysis comparing 1,138 index familial ALS cases and 19,494 controls. Through both approaches, we identified kinesin family member 5A (KIF5A) as a novel gene associated with ALS. Interestingly, mutations predominantly in the N-terminal motor domain of KIF5A are causative for two neurodegenerative diseases: hereditary spastic paraplegia (SPG10) and Charcot-Marie-Tooth type 2 (CMT2). In contrast, ALS-associated mutations are primarily located at the C-terminal cargo-binding tail domain and patients harboring loss-of-function mutations displayed an extended survival relative to typical ALS cases. Taken together, these results broaden the phenotype spectrum resulting from mutations in KIF5A and strengthen the role of cytoskeletal defects in the pathogenesis of ALS. Copyright © 2018 Elsevier Inc. All rights reserved.

  17. ZCURVE 3.0: identify prokaryotic genes with higher accuracy as well as automatically and accurately select essential genes

    PubMed Central

    Hua, Zhi-Gang; Lin, Yan; Yuan, Ya-Zhou; Yang, De-Chang; Wei, Wen; Guo, Feng-Biao

    2015-01-01

    In 2003, we developed an ab initio program, ZCURVE 1.0, to find genes in bacterial and archaeal genomes. In this work, we present the updated version (i.e. ZCURVE 3.0). Using 422 prokaryotic genomes, the average accuracy was 93.7% with the updated version, compared with 88.7% with the original version. Such results also demonstrate that ZCURVE 3.0 is comparable with Glimmer 3.02 and may provide complementary predictions to it. In fact, the joint application of the two programs generated better results by correctly finding more annotated genes while also containing fewer false-positive predictions. As the exclusive function, ZCURVE 3.0 contains one post-processing program that can identify essential genes with high accuracy (generally >90%). We hope ZCURVE 3.0 will receive wide use with the web-based running mode. The updated ZCURVE can be freely accessed from http://cefg.uestc.edu.cn/zcurve/ or http://tubic.tju.edu.cn/zcurveb/ without any restrictions. PMID:25977299

  18. Phylogenomic analysis of UDP glycosyltransferase 1 multigene family in Linum usitatissimum identified genes with varied expression patterns

    PubMed Central

    2012-01-01

    Background The glycosylation process, catalyzed by ubiquitous glycosyltransferase (GT) family enzymes, is a prevalent modification of plant secondary metabolites that regulates various functions such as hormone homeostasis, detoxification of xenobiotics and biosynthesis and storage of secondary metabolites. Flax (Linum usitatissimum L.) is a commercially grown oilseed crop, important because of its essential fatty acids and health promoting lignans. Identification and characterization of UDP glycosyltransferase (UGT) genes from flax could provide valuable basic information about this important gene family and help to explain the seed specific glycosylated metabolite accumulation and other processes in plants. Plant genome sequencing projects are useful to discover complexity within this gene family and also pave the way for the development of functional genomics approaches. Results Taking advantage of the newly assembled draft genome sequence of flax, we identified 137 UDP glycosyltransferase (UGT) genes from flax using a conserved signature motif. Phylogenetic analysis of these protein sequences clustered them into 14 major groups (A-N). Expression patterns of these genes were investigated using publicly available expressed sequence tag (EST), microarray data and reverse transcription quantitative real time PCR (RT-qPCR). Seventy-three per cent of these genes (100 out of 137) showed expression evidence in 15 tissues examined and indicated varied expression profiles. The RT-qPCR results of 10 selected genes were also coherent with the digital expression analysis. Interestingly, five duplicated UGT genes were identified, which showed differential expression in various tissues. Of the seven intron loss/gain positions detected, two intron positions were conserved among most of the UGTs, although a clear relationship about the evolution of these genes could not be established. Comparison of the flax UGTs with orthologs from four other sequenced dicot genomes indicated that

  19. An Evolutionary Genomic Approach to Identify Genes Involved in Human Birth Timing

    PubMed Central

    Orabona, Guilherme; Morgan, Thomas; Haataja, Ritva; Hallman, Mikko; Puttonen, Hilkka; Menon, Ramkumar; Kuczynski, Edward; Norwitz, Errol; Snegovskikh, Victoria; Palotie, Aarno; Fellman, Vineta; DeFranco, Emily A.; Chaudhari, Bimal P.; McGregor, Tracy L.; McElroy, Jude J.; Oetjens, Matthew T.; Teramo, Kari; Borecki, Ingrid; Fay, Justin; Muglia, Louis

    2011-01-01

    Coordination of fetal maturation with birth timing is essential for mammalian reproduction. In humans, preterm birth is a disorder of profound global health significance. The signals initiating parturition in humans have remained elusive, due to divergence in physiological mechanisms between humans and model organisms typically studied. Because of relatively large human head size and narrow birth canal cross-sectional area compared to other primates, we hypothesized that genes involved in parturition would display accelerated evolution along the human and/or higher primate phylogenetic lineages to decrease the length of gestation and promote delivery of a smaller fetus that transits the birth canal more readily. Further, we tested whether current variation in such accelerated genes contributes to preterm birth risk. Evidence from allometric scaling of gestational age suggests human gestation has been shortened relative to other primates. Consistent with our hypothesis, many genes involved in reproduction show human acceleration in their coding or adjacent noncoding regions. We screened >8,400 SNPs in 150 human accelerated genes in 165 Finnish preterm and 163 control mothers for association with preterm birth. In this cohort, the most significant association was in FSHR, and 8 of the 10 most significant SNPs were in this gene. Further evidence for association of a linkage disequilibrium block of SNPs in FSHR, rs11686474, rs11680730, rs12473870, and rs1247381 was found in African Americans. By considering human acceleration, we identified a novel gene that may be associated with preterm birth, FSHR. We anticipate other human accelerated genes will similarly be associated with preterm birth risk and elucidate essential pathways for human parturition. PMID:21533219

  20. Schistosoma mansoni: resistant specific infection-induced gene expression in Biomphalaria glabrata identified by fluorescent-based differential display.

    PubMed

    Lockyer, Anne E; Noble, Leslie R; Rollinson, David; Jones, Catherine S

    2004-01-01

    The freshwater tropical snail Biomphalaria glabrata is an intermediate host for Schistosoma mansoni, the causative agent of human intestinal schistosomiasis, and strains differ in their susceptibility to parasite infection. Changes in gene expression in response to parasite infection have been simultaneously examined in a susceptible strain (NHM1742) and a resistant strain (NHM1981) using a newly developed fluorescent-based differential display method. Such RNA profiling techniques allow the examination of changes in gene expression in response to parasite infection, without requiring previous sequence knowledge, or selecting candidate genes that may be involved in the complex neuroendocrine or defence systems of the snail. Thus, novel genes may be identified. Ten transcripts were initially identified, present only in the profiles derived from snails of the resistant strain when exposed to infection. The differential expression of five of these genes, including HSP70 and several novel transcripts with one containing at least two globin-like domains, has been confirmed by semi-quantitative RT-PCR.

  1. A CRISPR-Based Screen Identifies Genes Essential for West-Nile-Virus-Induced Cell Death.

    PubMed

    Ma, Hongming; Dang, Ying; Wu, Yonggan; Jia, Gengxiang; Anaya, Edgar; Zhang, Junli; Abraham, Sojan; Choi, Jang-Gi; Shi, Guojun; Qi, Ling; Manjunath, N; Wu, Haoquan

    2015-07-28

    West Nile virus (WNV) causes an acute neurological infection attended by massive neuronal cell death. However, the mechanism(s) behind the virus-induced cell death is poorly understood. Using a library containing 77,406 sgRNAs targeting 20,121 genes, we performed a genome-wide screen followed by a second screen with a sub-library. Among the genes identified, seven genes, EMC2, EMC3, SEL1L, DERL2, UBE2G2, UBE2J1, and HRD1, stood out as having the strongest phenotype, whose knockout conferred strong protection against WNV-induced cell death with two different WNV strains and in three cell lines. Interestingly, knockout of these genes did not block WNV replication. Thus, these appear to be essential genes that link WNV replication to downstream cell death pathway(s). In addition, the fact that all of these genes belong to the ER-associated protein degradation (ERAD) pathway suggests that this might be the primary driver of WNV-induced cell death. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  2. Genomic analysis of human lung fibroblasts exposed to vanadium pentoxide to identify candidate genes for occupational bronchitis

    PubMed Central

    Ingram, Jennifer L; Antao-Menezes, Aurita; Turpin, Elizabeth A; Wallace, Duncan G; Mangum, James B; Pluta, Linda J; Thomas, Russell S; Bonner, James C

    2007-01-01

    Background Exposure to vanadium pentoxide (V2O5) is a cause of occupational bronchitis. We evaluated gene expression profiles in cultured human lung fibroblasts exposed to V2O5 in vitro in order to identify candidate genes that could play a role in inflammation, fibrosis, and repair during the pathogenesis of V2O5-induced bronchitis. Methods Normal human lung fibroblasts were exposed to V2O5 in a time course experiment. Gene expression was measured at various time points over a 24 hr period using the Affymetrix Human Genome U133A 2.0 Array. Selected genes that were significantly changed in the microarray experiment were validated by RT-PCR. Results V2O5 altered more than 1,400 genes, of which ~300 were induced while >1,100 genes were suppressed. Gene ontology (GO) categories unique to induced genes included inflammatory response and immune response, while GO categories unique to suppressed genes included ubiquitin cycle and cell cycle. A dozen genes were validated by RT-PCR, including growth factors (HBEGF, VEGF, CTGF), chemokines (IL8, CXCL9, CXCL10), oxidative stress response genes (SOD2, PIPOX, OXR1), and DNA-binding proteins (GAS1, STAT1). Conclusion Our study identified a variety of genes that could play pivotal roles in inflammation, fibrosis and repair during V2O5-induced bronchitis. The induction of genes that mediate inflammation and immune responses, as well as suppression of genes involved in growth arrest, appears to be important to the lung fibrotic reaction to V2O5. PMID:17459161

  3. A novel APOC2 gene mutation identified in a Chinese patient with severe hypertriglyceridemia and recurrent pancreatitis.

    PubMed

    Jiang, Jingjing; Wang, Yuhui; Ling, Yan; Kayoumu, Abudurexiti; Liu, George; Gao, Xin

    2016-01-16

    The severe forms of hypertriglyceridemia are usually caused by genetic defects. In this study, we described a Chinese female with severe hypertriglyceridemia caused by a novel homozygous mutation in the APOC2 gene. Lipid profiles of the pedigree were studied in detail. LPL and HL activity were also measured. The coding regions of 5 candidate genes (namely LPL, APOC2, APOA5, LMF1, and GPIHBP1) were sequenced using genomic DNA from peripheral leucocytes. The ApoE gene was also genotyped. Serum triglyceride level was extremely high in the proband, compared with other family members. Plasma LPL activity was also significantly reduced in the proband. Serum ApoCII was very low in the proband as well as in the heterozygous mutation carriers. A novel mutation (c.86A > CC) was identified on exon 3 [corrected] of the APOC2 gene, which converted the Asp [corrected] codon at position 29 into Ala, followed by a termination codon (TGA). This study presented the first case of ApoCII deficiency in the Chinese population, with a novel mutation c.86A > CC in the APOC2 gene identified. Serum ApoCII protein might be a useful screening test for identifying mutation carriers.

  4. Integrating Transcriptome and Genome Re-Sequencing Data to Identify Key Genes and Mutations Affecting Chicken Eggshell Qualities

    PubMed Central

    Liu, Long; Zheng, Chuan Wei; Wang, De He; Hou, Zhuo Cheng; Ning, Zhong Hua

    2015-01-01

    Eggshell damage leads to economic losses in the egg production industry and is a threat to human health. We examined 49-wk-old Rhode Island White hens (Gallus gallus) that laid eggs having shells with significantly different strengths and thicknesses. We used HiSeq 2000 (Illumina) sequencing to characterize the chicken transcriptome and whole genome to identify the key genes and genetic mutations associated with eggshell calcification. We identified a total of 14,234 genes expressed in the chicken uterus, representing 89% of all annotated chicken genes. A total of 889 differentially expressed genes (DEGs) were identified by comparing low eggshell strength (LES) and normal eggshell strength (NES) genomes. The DEGs are enriched in calcification-related processes, including calcium ion transport and calcium signaling pathways, as revealed by gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Some important matrix proteins, such as OC-116, LTF and SPP1, were also expressed differentially between the two groups. A total of 3,671,919 single-nucleotide polymorphisms (SNPs) and 508,035 Indels were detected in protein coding genes by whole-genome re-sequencing, including 1775 non-synonymous variations and 19 frame-shift Indels in DEGs. SNPs and Indels found in this study could be further investigated for eggshell traits. This is the first report to integrate the transcriptome and genome re-sequencing to target the genetic variations which decreased the eggshell qualities. These findings further advance our understanding of eggshell calcification in the chicken uterus. PMID:25974068

  5. A novel method to identify pathways associated with renal cell carcinoma based on a gene co-expression network

    PubMed Central

    Ruan, Xiyun; Li, Hongyun; Liu, Bo; Chen, Jie; Zhang, Shibao; Sun, Zeqiang; Liu, Shuangqing; Sun, Fahai; Liu, Qingyong

    2015-01-01

    The aim of the present study was to develop a novel method for identifying pathways associated with renal cell carcinoma (RCC) based on a gene co-expression network. A framework was established where a co-expression network was derived from the database as well as various co-expression approaches. First, the backbone of the network based on differentially expressed (DE) genes between RCC patients and normal controls was constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database. The differentially co-expressed links were detected by Pearson’s correlation, the empirical Bayesian (EB) approach and Weighted Gene Co-expression Network Analysis (WGCNA). The co-expressed gene pairs were merged by a rank-based algorithm. We obtained 842, 371, 2,883 and 1,595 co-expressed gene pairs from the co-expression networks of the STRING database, Pearson’s correlation, the EB method and WGCNA, respectively. Two hundred and eighty-one differentially co-expressed (DC) gene pairs were obtained from the merged network using this novel method. Pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and the network enrichment analysis (NEA) method were performed to verify the feasibility of the merged method. Results of the KEGG and NEA pathway analyses showed that the network was associated with RCC. The suggested method was computationally efficient in identifying pathways associated with RCC and has been identified as a useful complement to traditional co-expression analysis. PMID:26058425
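
    Of the co-expression approaches named in this abstract, Pearson's correlation is the simplest to illustrate. The sketch below computes the correlation for every gene pair and keeps pairs whose absolute correlation reaches a cutoff. The cutoff of 0.9, the function names, and the toy expression values are assumptions for illustration; the EB approach, WGCNA, and the rank-based merging step are not shown because the abstract does not describe them in enough detail.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpressed_pairs(expression, threshold=0.9):
    """Return (gene1, gene2, r) for pairs with |r| at or above the threshold.

    `expression` maps a gene name to its expression values across samples.
    The default threshold is an illustrative assumption, not a published value.
    """
    pairs = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            pairs.append((g1, g2, r))
    return pairs


# Hypothetical expression profiles across four samples.
toy_expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],
    "gene_c": [4.0, 1.0, 3.0, 2.0],
}
pairs = coexpressed_pairs(toy_expression)
```

    Running the same pair search separately on patient and control samples and comparing the results would be one way to look for differentially co-expressed links, though that comparison is left out of this sketch.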

  6. A gene-signature progression approach to identifying candidate small-molecule cancer therapeutics with connectivity mapping.

    PubMed

    Wen, Qing; Kim, Chang-Sik; Hamilton, Peter W; Zhang, Shu-Dong

    2016-05-11

    Gene expression connectivity mapping has gained much popularity recently, with a number of successful applications in biomedical research testifying to its utility and promise. Previously, methodological research in connectivity mapping mainly focused on two of the key components in the framework, namely, the reference gene expression profiles and the connectivity mapping algorithms. The other key component in this framework, the query gene signature, has been left to users to construct without much consensus on how this should be done, although it is the issue most relevant to end users. As a key input to the connectivity mapping process, the gene signature is crucially important in returning biologically meaningful and relevant results. This paper intends to formulate a standardized procedure for constructing high quality gene signatures from a user's perspective. We describe a two-stage process for making quality gene signatures using gene expression data as initial inputs. First, a differential gene expression analysis compares two distinct biological states; only the genes that have passed stringent statistical criteria are considered in the second stage of the process, which involves ranking genes based on statistical as well as biological significance. We introduce a "gene signature progression" method as a standard procedure in connectivity mapping. Starting from the highest ranked gene, we progressively determine the minimum length of the gene signature that allows connections to the reference profiles (drugs) to be established with a preset target false discovery rate. We use a lung cancer dataset and a breast cancer dataset as two case studies to demonstrate how this standardized procedure works, and we show that highly relevant and interesting biological connections are returned. Of particular note is gefitinib, identified as among the candidate therapeutics in our lung cancer case study. Our gene signature was based on gene expression data from Taiwan

  7. Gene expression analysis identifies new candidate genes associated with the development of black skin spots in Corriedale sheep.

    PubMed

    Peñagaricano, Francisco; Zorrilla, Pilar; Naya, Hugo; Robello, Carlos; Urioste, Jorge I

    2012-02-01

    The white coat colour of sheep is an important economic trait. For unknown reasons, some animals are born with, and others develop with time, black skin spots that can also produce pigmented fibres. The presence of pigmented fibres in the white wool significantly decreases the fibre quality. The aim of this work was to study gene expression in black spots (with and without pigmented fibres) and white skin by microarray techniques, in order to identify the possible genes involved in the development of this trait. Five unrelated Corriedale sheep were used and, for each animal, the three possible comparisons (three different hybridisations) between the three samples of interest were performed. Differential gene expression patterns were analysed using different t-test approaches. Most of the major genes with well-known roles in skin pigmentation, e.g. ASIP, MC1R and C-KIT, showed no significant difference in the gene expression between white skin and black spots. On the other hand, many of the differentially expressed genes (raw P-value < 0.005) detected in this study, e.g. C-FOS, KLF4 and UFC1, fulfil biological functions that are plausible to be involved in the formation of black spots. The gene expression of C-FOS and KLF4, transcription factors involved in the cellular response to external factors such as ultraviolet light, was validated by quantitative polymerase chain reaction (PCR). This exploratory study provides a list of candidate genes that could be associated with the development of black skin spots that should be studied in more detail. Characterisation of these genes will enable us to discern the molecular mechanisms involved in the development of this feature and, hence, increase our understanding of melanocyte biology and skin pigmentation. In sheep, understanding this phenomenon is a first step towards developing molecular tools to assist in the selection against the presence of pigmented fibres in white wool.

  8. Identifying driving gene clusters in complex diseases through critical transition theory

    NASA Astrophysics Data System (ADS)

    Wolanyk, Nathaniel; Wang, Xujing; Hessner, Martin; Gao, Shouguo; Chen, Ye; Jia, Shuang

    A novel approach of looking at the human body using critical transition theory has yielded positive results: clusters of genes that act in tandem to drive complex disease progression. Such a cluster of genes can be thought of as the first part of a large genetic force that pushes the body from a curable, but sick, point to an incurable diseased point through a catastrophic bifurcation. The data analyzed are time course microarray blood assay data from 7 individuals at high risk for Type 1 Diabetes who progressed to clinical onset, with an additional larger study requested to be presented at the conference. The normalized data cover 25,000 genes, which were narrowed down based on statistical metrics; finally, a machine learning algorithm using critical transition metrics found the driving network. This approach was created to be repeatable across multiple complex diseases with only progression time course data needed, so that it would be applicable to identifying when an individual is at risk of developing a complex disease. Thus, preventative measures can be enacted, and in the longer term the approach offers a possible way to prevent all Type 1 Diabetes.

  9. Transposon Mutagenesis Identified Chromosomal and Plasmid Genes Essential for Adaptation of the Marine Bacterium Dinoroseobacter shibae to Anaerobic Conditions

    PubMed Central

    Ebert, Matthias; Laaß, Sebastian; Burghartz, Melanie; Petersen, Jörn; Koßmehl, Sebastian; Wöhlbrand, Lars; Rabus, Ralf; Wittmann, Christoph; Jahn, Dieter

    2013-01-01

    Anaerobic growth and survival are integral parts of the life cycle of many marine bacteria. To identify genes essential for the anoxic life of Dinoroseobacter shibae, a transposon library was screened for strains impaired in anaerobic denitrifying growth. Transposon insertions in 35 chromosomal and 18 plasmid genes were detected. The essential contribution of plasmid genes to anaerobic growth was confirmed with plasmid-cured D. shibae strains. A combined transcriptome and proteome approach identified oxygen tension-regulated genes. Transposon insertion sites of a total of 1,527 mutants without an anaerobic growth phenotype were determined to identify anaerobically induced but not essential genes. A surprisingly small overlap of only three genes (napA, phaA, and the Na+/Pi antiporter gene Dshi_0543) between anaerobically essential and induced genes was found. Interestingly, transposon mutations in genes involved in dissimilatory and assimilatory nitrate reduction (napA, nasA) and corresponding cofactor biosynthesis (genomic moaB, moeB, and dsbC and plasmid-carried dsbD and ccmH) were found to cause anaerobic growth defects. In contrast, mutation of anaerobically induced genes encoding proteins required for the later denitrification steps (nirS, nirJ, nosD), dimethyl sulfoxide reduction (dmsA1), and fermentation (pdhB1, arcA, aceE, pta, acs) did not result in decreased anaerobic growth under the conditions tested. Additional essential components (ferredoxin, cccA) of the anaerobic electron transfer chain and central metabolism (pdhB) were identified. Another surprise was the importance of sodium gradient-dependent membrane processes and genomic rearrangements via viruses, transposons, and insertion sequence elements for anaerobic growth. These processes and the observed contributions of cell envelope restructuring (lysM, mipA, fadK), C4-dicarboxylate transport (dctM1, dctM3), and protease functions to anaerobic growth require further investigation to unravel the

  10. Genome-Wide Association Study Identifying Candidate Genes Influencing Important Agronomic Traits of Flax (Linum usitatissimum L.) Using SLAF-seq

    PubMed Central

    Xie, Dongwei; Dai, Zhigang; Yang, Zemao; Sun, Jian; Zhao, Debao; Yang, Xue; Zhang, Liguo; Tang, Qing; Su, Jianguang

    2018-01-01

    Flax (Linum usitatissimum L.) is an important cash crop, and its agronomic traits directly affect yield and quality. Molecular studies on flax remain inadequate because relatively few flax genes have been associated with agronomic traits or have been identified as having potential applications. To identify markers and candidate genes that can potentially be used for genetic improvement of crucial agronomic traits, we examined 224 specimens of core flax germplasm; specifically, phenotypic data for key traits, including plant height, technical length, number of branches, number of fruits, and 1000-grain weight were investigated under three environmental conditions before specific-locus amplified fragment sequencing (SLAF-seq) was employed to perform a genome-wide association study (GWAS) for these five agronomic traits. Subsequently, the results were used to screen single nucleotide polymorphism (SNP) loci and candidate genes that exhibited a significant correlation with the important agronomic traits. Our analyses identified a total of 42 SNP loci that showed significant correlations with the five important agronomic flax traits. Next, candidate genes were screened in the 10 kb zone of each of the 42 SNP loci. These SNP loci were then analyzed by a more stringent screening via co-identification using both a general linear model (GLM) and a mixed linear model (MLM) as well as co-occurrences in at least two of the three environments, whereby 15 final candidate genes were obtained. Based on these results, we determined that UGT and PL are candidate genes for plant height, GRAS and XTH are candidate genes for the number of branches, Contig1437 and LU0019C12 are candidate genes for the number of fruits, and PHO1 is a candidate gene for the 1000-seed weight. We propose that the identified SNP loci and corresponding candidate genes might serve as a biological basis for improving crucial agronomic flax traits. PMID:29375606
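
    The stringent screening step described above can be sketched as a simple set filter. This is a minimal illustration under one possible reading of the abstract: a SNP is kept only if both the GLM and the MLM flag it in the same environment, for at least two of the three environments. The function name, the tuple layout, and the SNP and environment labels are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def co_identified(hits, min_envs=2):
    """Return SNPs flagged by both models in at least `min_envs` environments.

    `hits` is an iterable of (snp, model, environment) tuples, one per
    significant association; this layout is an assumption for illustration.
    """
    envs_by_model = defaultdict(lambda: defaultdict(set))
    for snp, model, env in hits:
        envs_by_model[snp][model].add(env)

    kept = []
    for snp, models in envs_by_model.items():
        # Environments in which both GLM and MLM reported the SNP.
        shared = models.get("GLM", set()) & models.get("MLM", set())
        if len(shared) >= min_envs:
            kept.append(snp)
    return sorted(kept)


# Hypothetical significant associations (not data from the paper).
example_hits = [
    ("snp1", "GLM", "E1"), ("snp1", "MLM", "E1"),
    ("snp1", "GLM", "E2"), ("snp1", "MLM", "E2"),
    ("snp2", "GLM", "E1"), ("snp2", "GLM", "E2"),  # never found by MLM
    ("snp3", "GLM", "E1"), ("snp3", "MLM", "E1"),  # only one environment
]
print(co_identified(example_hits))
```

    Only `snp1` survives here, because it is the only marker that both models report in two environments.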

  11. Genome-Wide Association Study Identifying Candidate Genes Influencing Important Agronomic Traits of Flax (Linum usitatissimum L.) Using SLAF-seq.

    PubMed

    Xie, Dongwei; Dai, Zhigang; Yang, Zemao; Sun, Jian; Zhao, Debao; Yang, Xue; Zhang, Liguo; Tang, Qing; Su, Jianguang

    2017-01-01

    Flax (Linum usitatissimum L.) is an important cash crop, and its agronomic traits directly affect yield and quality. Molecular studies on flax remain inadequate because relatively few flax genes have been associated with agronomic traits or have been identified as having potential applications. To identify markers and candidate genes that can potentially be used for genetic improvement of crucial agronomic traits, we examined 224 specimens of core flax germplasm; specifically, phenotypic data for key traits, including plant height, technical length, number of branches, number of fruits, and 1000-grain weight were investigated under three environmental conditions before specific-locus amplified fragment sequencing (SLAF-seq) was employed to perform a genome-wide association study (GWAS) for these five agronomic traits. Subsequently, the results were used to screen single nucleotide polymorphism (SNP) loci and candidate genes that exhibited a significant correlation with the important agronomic traits. Our analyses identified a total of 42 SNP loci that showed significant correlations with the five important agronomic flax traits. Next, candidate genes were screened in the 10 kb zone of each of the 42 SNP loci. These SNP loci were then analyzed by a more stringent screening via co-identification using both a general linear model (GLM) and a mixed linear model (MLM) as well as co-occurrences in at least two of the three environments, whereby 15 final candidate genes were obtained. Based on these results, we determined that UGT and PL are candidate genes for plant height, GRAS and XTH are candidate genes for the number of branches, Contig1437 and LU0019C12 are candidate genes for the number of fruits, and PHO1 is a candidate gene for the 1000-seed weight. We propose that the identified SNP loci and corresponding candidate genes might serve as a biological basis for improving crucial agronomic flax traits.

  12. VCP gene analyses in Japanese patients with sporadic amyotrophic lateral sclerosis identify a new mutation.

    PubMed

    Hirano, Makito; Nakamura, Yusaku; Saigoh, Kazumasa; Sakamoto, Hikaru; Ueno, Shuichi; Isono, Chiharu; Mitsui, Yoshiyuki; Kusunoki, Susumu

    2015-03-01

    Accumulating evidence has proven that mutations in the VCP gene encoding valosin-containing protein (VCP) cause inclusion body myopathy with Paget disease of the bone and frontotemporal dementia. This gene was later found to be causative for amyotrophic lateral sclerosis (ALS), a fatal neurodegenerative disease, occurring typically in elderly persons. We thus sequenced the VCP gene in 75 Japanese patients with sporadic ALS negative for mutations in other genes causative for ALS and found a novel mutation, p.Arg487His, in 1 patient. The newly identified mutant as well as known mutants rendered neuronal cells susceptible to oxidative stress. The presence of the mutation in the Japanese population extends the geographic region for involvement of the VCP gene in sporadic ALS to East Asia. Copyright © 2015 Elsevier Inc. All rights reserved.

  13. QTL Mapping and CRISPR/Cas9 Editing to Identify a Drug Resistance Gene in Toxoplasma gondii

    PubMed Central

    Shen, Bang; Powell, Robin H.; Behnke, Michael S.

    2017-01-01

    Scientific knowledge is intrinsically linked to available technologies and methods. This article presents two methods that allowed for the identification and verification of a drug resistance gene in the Apicomplexan parasite Toxoplasma gondii: Quantitative Trait Locus (QTL) mapping using a Whole Genome Sequence (WGS)-based genetic map, and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/Cas9-based gene editing. QTL mapping allows one to test whether there is a correlation between one or more genomic regions and a phenotype. Two datasets are required to run a QTL scan: a genetic map based on the progeny of a recombinant cross, and a quantifiable phenotype assessed in each of the progeny of that cross. These datasets are then formatted to be compatible with R/qtl software, which generates a QTL scan to identify significant loci correlated with the phenotype. Although this can greatly narrow the search window of possible candidates, QTLs span regions containing a number of genes from which the causal gene needs to be identified. Having WGS of the progeny was critical to identify the causal drug resistance mutation at the gene level. Once identified, the candidate mutation can be verified by genetic manipulation of drug-sensitive parasites. The most facile and efficient method to genetically modify T. gondii is the CRISPR/Cas9 system. This system comprises just 2 components, both encoded on a single plasmid: a single guide RNA (gRNA) containing a 20 bp sequence complementary to the genomic target, and the Cas9 endonuclease, which generates a double-strand DNA break (DSB) at the target; repair of the break allows for insertion or deletion of sequences around the break site. This article provides detailed protocols to use CRISPR/Cas9-based genome editing tools to verify the gene responsible for sinefungin resistance and to construct transgenic parasites. PMID:28671645
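
    To make the 20 bp gRNA target requirement concrete, the sketch below scans a DNA string for candidate 20-nucleotide target sites. It assumes the commonly used Streptococcus pyogenes Cas9, which requires an NGG motif (the PAM) immediately 3' of the target; the abstract does not state which Cas9 variant was used, so this is an assumption. The function name and the example sequence are hypothetical, and only the forward strand is searched.

```python
import re


def find_protospacers(seq, length=20):
    """List (position, protospacer, PAM) for every site followed by NGG.

    Assumes an SpCas9-style NGG PAM; forward strand only.
    """
    seq = seq.upper()
    # A lookahead keeps the match zero-width, so overlapping sites are found.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical sequence: one 20-nt target followed by a TGG PAM.
example = "ACGTACGTACGTACGTACGT" + "TGG" + "AAA"
for position, target, pam in find_protospacers(example):
    print(position, target, pam)
```

    A practical design tool would also search the reverse complement and score off-target matches; both are omitted here for brevity.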

  14. Genome-wide methylation analysis identifies a core set of hypermethylated genes in CIMP-H colorectal cancer.

    PubMed

    McInnes, Tyler; Zou, Donghui; Rao, Dasari S; Munro, Francesca M; Phillips, Vicky L; McCall, John L; Black, Michael A; Reeve, Anthony E; Guilford, Parry J

    2017-03-28

    Aberrant DNA methylation profiles are a characteristic of all known cancer types, epitomized by the CpG island methylator phenotype (CIMP) in colorectal cancer (CRC). Hypermethylation has been observed at CpG islands throughout the genome, but it is unclear which factors determine whether an individual island becomes methylated in cancer. DNA methylation in CRC was analysed using the Illumina HumanMethylation450K array. Differentially methylated loci were identified using Significance Analysis of Microarrays (SAM) and the Wilcoxon Signed Rank (WSR) test. Unsupervised hierarchical clustering was used to identify methylation subtypes in CRC. In this study we characterized the DNA methylation profiles of 94 CRC tissues and their matched normal counterparts. Consistent with previous studies, unsupervised hierarchical clustering of genome-wide methylation data identified three subtypes within the tumour samples, designated CIMP-H, CIMP-L and CIMP-N, that showed high, low and very low methylation levels, respectively. Differential methylation between normal and tumour samples was analysed at the individual CpG level, and at the gene level. The distribution of hypermethylation in CIMP-N tumours showed high inter-tumour variability and appeared to be highly stochastic in nature, whereas CIMP-H tumours exhibited consistent hypermethylation at a subset of genes, in addition to a highly variable background of hypermethylated genes. EYA4, TFPI2 and TLX1 were hypermethylated in more than 90% of all tumours examined. One hundred thirty-two genes were hypermethylated in 100% of CIMP-H tumours studied and these were highly enriched for functions relating to skeletal system development (Bonferroni adjusted p value = 2.88E-15), segment specification (adjusted p value = 9.62E-11), embryonic development (adjusted p value = 1.52E-04), mesoderm development (adjusted p value = 1.14E-20), and ectoderm development (adjusted p value = 7.94E-16). Our genome-wide characterization of DNA
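
    Because each tumour has a matched normal sample, the Wilcoxon signed-rank test mentioned above works on paired differences. The sketch below computes only the rank sums for one locus, in pure Python; it is an illustration of the statistic, not the authors' pipeline, and it stops short of a p value. The function name and the methylation percentages are hypothetical.

```python
def signed_rank_statistic(tumour, normal):
    """Return (W_plus, W_minus) for paired samples.

    Zero differences are dropped and tied absolute differences share
    their average rank, as in the standard signed-rank procedure.
    """
    diffs = [t - n for t, n in zip(tumour, normal) if t != n]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)

    i = 0
    while i < len(order):
        # Extend j over the run of equal absolute differences.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


# Hypothetical methylation percentages at one CpG for five matched pairs.
tumour_pct = [80, 70, 90, 60, 50]
normal_pct = [20, 30, 10, 65, 50]
print(signed_rank_statistic(tumour_pct, normal_pct))
```

    A large imbalance between the two rank sums suggests a consistent shift between tumour and normal; turning it into a p value needs the null distribution of the statistic, which a statistics library would normally supply.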

  15. Analysis of SOX10 mutations identified in Waardenburg-Hirschsprung patients: Differential effects on target gene regulation.

    PubMed

    Chan, Kwok Keung; Wong, Corinne Kung Yen; Lui, Vincent Chi Hang; Tam, Paul Kwong Hang; Sham, Mai Har

    2003-10-15

    SOX10 is a member of the SOX gene family related by homology to the high-mobility group (HMG) box region of the testis-determining gene SRY. Mutations of the transcription factor gene SOX10 lead to Waardenburg-Hirschsprung syndrome (Waardenburg-Shah syndrome, WS4) in humans. A number of SOX10 mutations have been identified in WS4 patients who suffer from different extents of intestinal aganglionosis, pigmentation, and hearing abnormalities. Some patients also exhibit signs of myelination deficiency in the central and peripheral nervous systems. Although the molecular bases for the wide range of symptoms displayed by the patients are still not clearly understood, a few target genes for SOX10 have been identified. We have analyzed the impact of six different SOX10 mutations on the activation of SOX10 target genes by yeast one-hybrid and mammalian cell transfection assays. To investigate the transactivation activities of the mutant proteins, three different SOX target binding sites were introduced into luciferase reporter gene constructs and examined in our series of transfection assays: consensus HMG domain protein binding sites; SOX10 binding sites identified in the RET promoter; and Sox10 binding sites identified in the P0 promoter. We found that the same mutation could have different transactivation activities when tested with different target binding sites and in different cell lines. The differential transactivation activities of the SOX10 mutants appeared to correlate with the intestinal and/or neurological symptoms presented in the patients. Among the six mutant SOX10 proteins tested, much reduced transactivation activities were observed when tested on the SOX10 binding sites from the RET promoter. Of the two similar mutations X467K and 1400del12, only the 1400del12 mutant protein exhibited an increase of transactivation through the P0 promoter. While the lack of normal SOX10 mediated activation of RET transcription may lead to intestinal aganglionosis

  16. Omics of Brucella: Species-Specific sRNA-Mediated Gene Ontology Regulatory Networks Identified by Computational Biology.

    PubMed

    Vishnu, Udayakumar S; Sankarasubramanian, Jagadesan; Gunasekaran, Paramasamy; Sridhar, Jayavel; Rajendhran, Jeyaprakash

    2016-06-01

    Brucella is an intracellular bacterium that causes the zoonotic infectious disease, brucellosis. Brucella species are currently intensively studied with a view to developing novel global health diagnostics and therapeutics. In this context, small RNAs (sRNAs) are one of the emerging topical areas; they play significant roles in regulating gene expression and cellular processes in bacteria. In the present study, we forecast sRNAs in three Brucella species that infect humans, namely Brucella melitensis, Brucella abortus, and Brucella suis, using a computational biology analysis. We combined two bioinformatic algorithms, SIPHT and sRNAscanner. In B. melitensis 16M, 21 sRNA candidates were identified, of which 14 were novel. Similarly, 14 sRNAs were identified in B. abortus, of which four were novel. In B. suis, 16 sRNAs were identified, and five of them were novel. TargetRNA2 software predicted the putative target genes that could be regulated by the identified sRNAs. The identified mRNA targets are involved in carbohydrate, amino acid, lipid, nucleotide, and coenzyme metabolism and transport, energy production and conversion, replication, recombination, repair, and transcription. Additionally, the Gene Ontology (GO) network analysis revealed the species-specific, sRNA-based regulatory networks in B. melitensis, B. abortus, and B. suis. Taken together, although sRNAs are veritable modulators of gene expression in prokaryotes, there are few reports on the significance of sRNAs in Brucella. This report begins to address this literature gap by offering a series of initial observations based on computational biology to pave the way for future experimental analysis of sRNAs and their targets to explain the complex pathogenesis of Brucella.

  17. Transcriptome-wide selection of a reliable set of reference genes for gene expression studies in potato cyst nematodes (Globodera spp.).

    PubMed

    Sabeh, Michael; Duceppe, Marc-Olivier; St-Arnaud, Marc; Mimee, Benjamin

    2018-01-01

    Relative gene expression analyses by qRT-PCR (quantitative reverse transcription PCR) require an internal control to normalize the expression data of genes of interest and eliminate the unwanted variation introduced by sample preparation. A perfect reference gene should have a constant expression level under all the experimental conditions. However, the same few housekeeping genes selected from the literature or successfully used in previous unrelated experiments are often routinely used in new conditions without proper validation of their stability across treatments. The advent of RNA-Seq and the availability of public datasets for numerous organisms are opening the way to finding better reference genes for expression studies. Globodera rostochiensis is a plant-parasitic nematode that is particularly yield-limiting for potato. The aim of our study was to identify a reliable set of reference genes to study G. rostochiensis gene expression. Gene expression levels from an RNA-Seq database were used to identify putative reference genes and were validated with qRT-PCR analysis. Three genes, GR, PMP-3, and aaRS, were found to be very stable within the experimental conditions of this study and are proposed as reference genes for future work.
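
    One simple way to shortlist putative reference genes from RNA-Seq expression levels is to rank genes by how little they vary across conditions. The sketch below uses the coefficient of variation for that ranking; the abstract does not say which stability measure the authors applied, so this choice is an assumption, and the gene names and values are hypothetical.

```python
from statistics import mean, pstdev


def rank_by_stability(expression):
    """Order genes from most to least stable by coefficient of variation.

    `expression` maps a gene name to its expression level in each sample.
    Genes with a zero mean are skipped because the ratio is undefined.
    """
    cv = {}
    for gene, values in expression.items():
        average = mean(values)
        if average > 0:
            cv[gene] = pstdev(values) / average
    return sorted(cv, key=cv.get)


# Hypothetical normalized expression values across four conditions.
example_expression = {
    "geneA": [100, 102, 98, 101],
    "geneB": [10, 200, 50, 5],
    "geneC": [50, 50, 50, 50],
}
print(rank_by_stability(example_expression))
```

    Candidates ranked this way would still need experimental validation, as the abstract describes for qRT-PCR.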

  18. Targeted next generation sequencing identifies novel NOTCH3 gene mutations in CADASIL diagnostics patients.

    PubMed

    Maksemous, Neven; Smith, Robert A; Haupt, Larisa M; Griffiths, Lyn R

    2016-11-24

    Cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (CADASIL) is a monogenic, hereditary, small vessel disease of the brain causing stroke and vascular dementia in adults. CADASIL has previously been shown to be caused by varying mutations in the NOTCH3 gene. The disorder is often misdiagnosed due to its significant clinical heterogeneic manifestation with familial hemiplegic migraine and several ataxia disorders as well as the location of the currently identified causative mutations. The aim of this study was to develop a new, comprehensive and efficient single assay strategy for complete molecular diagnosis of NOTCH3 mutations through the use of a custom next-generation sequencing (NGS) panel for improved routine clinical molecular diagnostic testing. Our custom NGS panel identified nine genetic variants in NOTCH3 (p.D139V, p.C183R, p.R332C, p.Y465C, p.C597W, p.R607H, p.E813E, p.C977G and p.Y1106C). Six mutations were stereotypical CADASIL mutations leading to an odd number of cysteine residues in one of the 34 NOTCH3 gene epidermal growth factor (EGF)-like repeats, including three new typical cysteine mutations identified in exon 11 (p.C597W; c.1791C>G); exon 18 (p.C977G; c.2929T>G) and exon 20 (p.Y1106C; c.3317A>G). Interestingly, a novel missense mutation in the CACNA1A gene was also identified in one CADASIL patient. All variants identified (novel and known) were further investigated using in silico bioinformatic analyses and confirmed through Sanger sequencing. NGS provides an improved and effective methodology for the diagnosis of CADASIL. The NGS approach reduced time and cost for comprehensive genetic diagnosis, placing genetic diagnostic testing within reach of more patients.
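
    The distinction between stereotypical cysteine-altering variants and the rest can be checked mechanically from the protein-level notation. The sketch below flags a variant when exactly one of the reference and replacement residues is cysteine, which changes the cysteine count by one. It is a simplification: it does not check whether the position falls inside an EGF-like repeat. The function name is hypothetical; the variant list is the one given in the abstract.

```python
import re


def alters_cysteine_count(variant):
    """True if a single-residue substitution gains or loses a cysteine.

    Expects one-letter notation such as 'p.C183R'.
    """
    match = re.fullmatch(r"p\.([A-Z])(\d+)([A-Z])", variant)
    if match is None:
        raise ValueError(f"unsupported variant notation: {variant}")
    reference, _position, replacement = match.groups()
    return (reference == "C") != (replacement == "C")


# The nine NOTCH3 variants listed in the abstract.
variants = [
    "p.D139V", "p.C183R", "p.R332C", "p.Y465C", "p.C597W",
    "p.R607H", "p.E813E", "p.C977G", "p.Y1106C",
]
cysteine_altering = [v for v in variants if alters_cysteine_count(v)]
print(len(cysteine_altering), cysteine_altering)
```

    The filter selects six of the nine variants, in line with the six stereotypical mutations reported above.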

  19. Comparing cancer vs normal gene expression profiles identifies new disease entities and common transcriptional programs in AML patients.

    PubMed

    Rapin, Nicolas; Bagger, Frederik Otzen; Jendholm, Johan; Mora-Jensen, Helena; Krogh, Anders; Kohlmann, Alexander; Thiede, Christian; Borregaard, Niels; Bullinger, Lars; Winther, Ole; Theilgaard-Mönch, Kim; Porse, Bo T

    2014-02-06

    Gene expression profiling has been used extensively to characterize cancer, identify novel subtypes, and improve patient stratification. However, it has largely failed to identify transcriptional programs that differ between cancer and corresponding normal cells and has not been efficient in identifying expression changes fundamental to disease etiology. Here we present a method that facilitates the comparison of any cancer sample to its nearest normal cellular counterpart, using acute myeloid leukemia (AML) as a model. We first generated a gene expression-based landscape of the normal hematopoietic hierarchy, using expression profiles from normal stem/progenitor cells, and next mapped the AML patient samples to this landscape. This allowed us to identify the closest normal counterpart of individual AML samples and determine gene expression changes between cancer and normal. We find the cancer vs normal method (CvN method) to be superior to conventional methods in stratifying AML patients with aberrant karyotype and in identifying common aberrant transcriptional programs with potential importance for AML etiology. Moreover, the CvN method uncovered a novel poor-outcome subtype of normal-karyotype AML, which allowed for the generation of a highly prognostic survival signature. Collectively, our CvN method holds great potential as a tool for the analysis of gene expression profiles of cancer patients.

  20. Network-Based Integration of GWAS and Gene Expression Identifies a HOX-Centric Network Associated with Serous Ovarian Cancer Risk.

    PubMed

    Kar, Siddhartha P; Tyrer, Jonathan P; Li, Qiyuan; Lawrenson, Kate; Aben, Katja K H; Anton-Culver, Hoda; Antonenkova, Natalia; Chenevix-Trench, Georgia; Baker, Helen; Bandera, Elisa V; Bean, Yukie T; Beckmann, Matthias W; Berchuck, Andrew; Bisogna, Maria; Bjørge, Line; Bogdanova, Natalia; Brinton, Louise; Brooks-Wilson, Angela; Butzow, Ralf; Campbell, Ian; Carty, Karen; Chang-Claude, Jenny; Chen, Yian Ann; Chen, Zhihua; Cook, Linda S; Cramer, Daniel; Cunningham, Julie M; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Easton, Douglas F; Edwards, Robert P; Ekici, Arif B; Fasching, Peter A; Fridley, Brooke L; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G; Glasspool, Rosalind; Goode, Ellen L; Goodman, Marc T; Grownwald, Jacek; Harrington, Patricia; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A T; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus K; Hosono, Satoyo; Iversen, Edwin S; Jakubowska, Anna; Paul, James; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Kjaer, Susanne K; Kelemen, Linda E; Kellar, Melissa; Kelley, Joseph; Kiemeney, Lambertus A; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D; Lee, Alice W; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R; McNeish, Iain A; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B; Narod, Steven A; Nedergaard, Lotte; Ness, Roberta B; Nevanlinna, Heli; Odunsi, Kunle; Olson, Sara H; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Pearce, Celeste Leigh; Pejovic, Tanja; Pelttari, Liisa M; Permuth-Wey, Jennifer; Phelan, Catherine M; Pike, Malcolm C; Poole, Elizabeth M; Ramus, Susan J; Risch, Harvey A; Rosen, Barry; Rossing, Mary Anne; Rothstein, Joseph H; Rudolph, Anja; Runnebaum, Ingo B; Rzepecka, Iwona K; Salvesen, Helga B; Schildkraut, Joellen M; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C; Sucheston-Campbell, Lara E; Tangen, Ingvild L; Teo, Soo-Hwang; Terry, Kathryn L; Thompson, Pamela J; Timorek, Agnieszka; Tsai, Ya-Yu; Tworoger, Shelley S; van Altena, Anne M; Van Nieuwenhuysen, Els; Vergote, Ignace; Vierkant, Robert A; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S; Wicklund, Kristine G; Wilkens, Lynne R; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Sellers, Thomas A; Monteiro, Alvaro N A; Freedman, Matthew L; Gayther, Simon A; Pharoah, Paul D P

    2015-10-01

    Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified by coexpression may also be enriched for additional EOC risk associations. We selected TF genes within 1 Mb of the top signal at the 12 genome-wide significant risk loci. Mutual information, a form of correlation, was used to build networks of genes strongly coexpressed with each selected TF gene in the unified microarray dataset of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this dataset were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). Gene set enrichment analysis identified six networks centered on TF genes (HOXB2, HOXB5, HOXB6, HOXB7 at 17q21.32 and HOXD1, HOXD3 at 2q31) that were significantly enriched for genes from the risk-associated end of the ranked list (P < 0.05 and FDR < 0.05). These results were replicated (P < 0.05) using an independent association study (7,035 cases/21,693 controls). Genes underlying enrichment in the six networks were pooled into a combined network. We identified a HOX-centric network associated with serous EOC risk containing several genes with known or emerging roles in serous EOC development. Network analysis integrating large, context-specific datasets has the potential to offer mechanistic insights into cancer susceptibility and prioritize genes for experimental characterization. ©2015 American Association for Cancer Research.
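
    Mutual information, the coexpression measure named above, can be illustrated for a single pair of genes. The sketch below assumes that expression values have already been discretized into bins such as low and high; the abstract does not describe the estimator or binning used, so both are assumptions, and the function name and example vectors are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Mutual information in bits between two equal-length label sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    count_x, count_y = Counter(x), Counter(y)
    count_xy = Counter(zip(x, y))
    total = 0.0
    for (a, b), joint in count_xy.items():
        p_joint = joint / n
        # Compare the joint frequency with what independence would predict.
        total += p_joint * log2(p_joint / ((count_x[a] / n) * (count_y[b] / n)))
    return total


# Hypothetical binned expression (0 = low, 1 = high) across four tumours.
tf_gene = [0, 0, 1, 1]
coexpressed_gene = [0, 0, 1, 1]
unrelated_gene = [0, 1, 0, 1]
print(mutual_information(tf_gene, coexpressed_gene))
print(mutual_information(tf_gene, unrelated_gene))
```

    Unlike a linear correlation coefficient, this score also rises for non-linear dependence, which is one reason it is used for building coexpression networks.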

  1. An EST-based analysis identifies new genes and reveals distinctive gene expression features of Coffea arabica and Coffea canephora

    PubMed Central

    2011-01-01

    Background: Coffee is one of the world's most important crops; it is consumed worldwide and plays a significant role in the economy of producing countries. Coffea arabica and C. canephora are responsible for 70 and 30% of commercial production, respectively. C. arabica is an allotetraploid from a recent hybridization of the diploid species, C. canephora and C. eugenioides. C. arabica has lower genetic diversity and results in a higher quality beverage than C. canephora. Research initiatives have been launched to produce genomic and transcriptomic data about Coffea spp. as a strategy to improve breeding efficiency. Results: Assembling the expressed sequence tags (ESTs) of C. arabica and C. canephora produced by the Brazilian Coffee Genome Project and the Nestlé-Cornell Consortium revealed 32,007 clusters of C. arabica and 16,665 clusters of C. canephora. We detected different GC3 profiles between these species that are related to their genome structure and mating system. BLAST analysis revealed similarities between coffee and grape (Vitis vinifera) genes. Using KA/KS analysis, we identified coffee genes under purifying and positive selection. Protein domain and gene ontology analyses suggested differences between Coffea spp. data, mainly in relation to complex sugar synthases and nucleotide binding proteins. OrthoMCL was used to identify specific and prevalent coffee protein families when compared to five other plant species. Among the interesting families annotated are new cystatins, glycine-rich proteins and RALF-like peptides. Hierarchical clustering was used to independently group C. arabica and C. canephora expression clusters according to expression data extracted from EST libraries, resulting in the identification of differentially expressed genes. Based on these results, we emphasize gene annotation and discuss plant defenses, abiotic stress and cup quality-related functional categories. Conclusion: We present the first comprehensive genome-wide transcript

  2. Leveraging network analytics to infer patient syndrome and identify causal genes in rare disease cases.

    PubMed

    Krämer, Andreas; Shah, Sohela; Rebres, Robert Anthony; Tang, Susan; Richards, Daniel Rene

    2017-08-11

    Next-generation sequencing is widely used to identify disease-causing variants in patients with rare genetic disorders. Identifying those variants from whole-genome or exome data can be both scientifically challenging and time consuming. A significant amount of time is spent on variant annotation, and interpretation. Fully or partly automated solutions are therefore needed to streamline and scale this process. We describe Phenotype Driven Ranking (PDR), an algorithm integrated into Ingenuity Variant Analysis, that uses observed patient phenotypes to prioritize diseases and genes in order to expedite causal-variant discovery. Our method is based on a network of phenotype-disease-gene relationships derived from the QIAGEN Knowledge Base, which allows for efficient computational association of phenotypes to implicated diseases, and also enables scoring and ranking. We have demonstrated the utility and performance of PDR by applying it to a number of clinical rare-disease cases, where the true causal gene was known beforehand. It is also shown that PDR compares favorably to a representative alternative tool.

  3. Host susceptibility to malaria in human and mice: compatible approaches to identify potential resistant genes.

    PubMed

    Hernandez-Valladares, Maria; Rihet, Pascal; Iraqi, Fuad A

    2014-01-01

    There is growing evidence for human genetic factors controlling the outcome of malaria infection, while the molecular basis of this genetic control is still poorly understood. Case-control and family-based studies have been carried out to identify genes underlying host susceptibility to malarial infection. Parasitemia and mild malaria have been genetically linked to human chromosomes 5q31-q33 and 6p21.3, and several immune genes located within those regions have been associated with malaria-related phenotypes. Association and linkage studies of resistance to malaria are not easy to carry out in human populations, because of the difficulty in surveying a significant number of families. Murine models have proven to be an excellent genetic tool for studying host response to malaria; their use allowed the mapping of 14 resistance loci, eight of them controlling parasitic levels and six controlling cerebral malaria. Once quantitative trait loci or genes have been identified, the human ortholog may then be identified. Comparative mapping studies showed that a couple of human and mouse might share similar genetically controlled mechanisms of resistance. In this way, char8, which controls parasitemia, was mapped on chromosome 11; char8 corresponds to human chromosome 5q31-q33 and contains immune genes, such as Il3, Il4, Il5, Il12b, Il13, Irf1, and Csf2. Nevertheless, part of the genetic factors controlling malaria traits might differ in both hosts because of specific host-pathogen interactions. Finally, novel genetic tools including animal models were recently developed and will offer new opportunities for identifying genetic factors underlying host phenotypic response to malaria, which will help in developing better therapeutic strategies, including vaccine and drug development.

  4. Pathways-Driven Sparse Regression Identifies Pathways and Genes Associated with High-Density Lipoprotein Cholesterol in Two Asian Cohorts

    PubMed Central

    Silver, Matt; Chen, Peng; Li, Ruoying; Cheng, Ching-Yu; Wong, Tien-Yin; Tai, E-Shyong; Teo, Yik-Ying; Montana, Giovanni

    2013-01-01

    Standard approaches to data analysis in genome-wide association studies (GWAS) ignore any potential functional relationships between gene variants. In contrast gene pathways analysis uses prior information on functional structure within the genome to identify pathways associated with a trait of interest. In a second step, important single nucleotide polymorphisms (SNPs) or genes may be identified within associated pathways. The pathways approach is motivated by the fact that genes do not act alone, but instead have effects that are likely to be mediated through their interaction in gene pathways. Where this is the case, pathways approaches may reveal aspects of a trait's genetic architecture that would otherwise be missed when considering SNPs in isolation. Most pathways methods begin by testing SNPs one at a time, and so fail to capitalise on the potential advantages inherent in a multi-SNP, joint modelling approach. Here, we describe a dual-level, sparse regression model for the simultaneous identification of pathways and genes associated with a quantitative trait. Our method takes account of various factors specific to the joint modelling of pathways with genome-wide data, including widespread correlation between genetic predictors, and the fact that variants may overlap multiple pathways. We use a resampling strategy that exploits finite sample variability to provide robust rankings for pathways and genes. We test our method through simulation, and use it to perform pathways-driven gene selection in a search for pathways and genes associated with variation in serum high-density lipoprotein cholesterol levels in two separate GWAS cohorts of Asian adults. By comparing results from both cohorts we identify a number of candidate pathways including those associated with cardiomyopathy, and T cell receptor and PPAR signalling. 
    Highlighted genes include those associated with the L-type calcium channel, adenylate cyclase, integrin, laminin, MAPK signalling and immune

  5. Pathways-driven sparse regression identifies pathways and genes associated with high-density lipoprotein cholesterol in two Asian cohorts.

    PubMed

    Silver, Matt; Chen, Peng; Li, Ruoying; Cheng, Ching-Yu; Wong, Tien-Yin; Tai, E-Shyong; Teo, Yik-Ying; Montana, Giovanni

    2013-11-01

    Standard approaches to data analysis in genome-wide association studies (GWAS) ignore any potential functional relationships between gene variants. In contrast gene pathways analysis uses prior information on functional structure within the genome to identify pathways associated with a trait of interest. In a second step, important single nucleotide polymorphisms (SNPs) or genes may be identified within associated pathways. The pathways approach is motivated by the fact that genes do not act alone, but instead have effects that are likely to be mediated through their interaction in gene pathways. Where this is the case, pathways approaches may reveal aspects of a trait's genetic architecture that would otherwise be missed when considering SNPs in isolation. Most pathways methods begin by testing SNPs one at a time, and so fail to capitalise on the potential advantages inherent in a multi-SNP, joint modelling approach. Here, we describe a dual-level, sparse regression model for the simultaneous identification of pathways and genes associated with a quantitative trait. Our method takes account of various factors specific to the joint modelling of pathways with genome-wide data, including widespread correlation between genetic predictors, and the fact that variants may overlap multiple pathways. We use a resampling strategy that exploits finite sample variability to provide robust rankings for pathways and genes. We test our method through simulation, and use it to perform pathways-driven gene selection in a search for pathways and genes associated with variation in serum high-density lipoprotein cholesterol levels in two separate GWAS cohorts of Asian adults. By comparing results from both cohorts we identify a number of candidate pathways including those associated with cardiomyopathy, and T cell receptor and PPAR signalling. 
    Highlighted genes include those associated with the L-type calcium channel, adenylate cyclase, integrin, laminin, MAPK signalling and immune

  6. Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution

    PubMed Central

    Clarke, Thomas H.; Garb, Jessica E.; Hayashi, Cheryl Y.; Arensburger, Peter; Ayoub, Nadia A.

    2015-01-01

    The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). PMID:26058392

  7. Non-parent of Origin Expression of Numerous Effector Genes Indicates a Role of Gene Regulation in Host Adaption of the Hybrid Triticale Powdery Mildew Pathogen.

    PubMed

    Praz, Coraline R; Menardo, Fabrizio; Robinson, Mark D; Müller, Marion C; Wicker, Thomas; Bourras, Salim; Keller, Beat

    2018-01-01

    Powdery mildew is an important disease of cereals. It is caused by one species, Blumeria graminis, which is divided into formae speciales each of which is highly specialized to one host. Recently, a new form capable of growing on triticale (B.g. triticale) has emerged through hybridization between wheat and rye mildews (B.g. tritici and B.g. secalis, respectively). In this work, we used RNA sequencing to study the molecular basis of host adaptation in B.g. triticale. We analyzed gene expression in three B.g. tritici isolates, two B.g. secalis isolates and two B.g. triticale isolates and identified a core set of putative effector genes that are highly expressed in all formae speciales. We also found that the genes differentially expressed between isolates of the same form as well as between different formae speciales were enriched in putative effectors. Their coding genes belong to several families including some which contain known members of mildew avirulence (Avr) and suppressor (Svr) genes. Based on these findings we propose that effectors play an important role in host adaptation that is mechanistically based on Avr-Resistance gene-Svr interactions. We also found that gene expression in the B.g. triticale hybrid is mostly conserved with the parent-of-origin, but some genes inherited from B.g. tritici showed a B.g. secalis-like expression. Finally, we identified 11 unambiguous cases of putative effector genes with hybrid-specific, non-parent of origin gene expression, and we propose that they are possible determinants of host specialization in triticale mildew. These data suggest that altered expression of multiple effector genes, in particular Avr and Svr related factors, might play a role in mildew host adaptation based on hybridization.

  8. Non-parent of Origin Expression of Numerous Effector Genes Indicates a Role of Gene Regulation in Host Adaption of the Hybrid Triticale Powdery Mildew Pathogen

    PubMed Central

    Praz, Coraline R.; Menardo, Fabrizio; Robinson, Mark D.; Müller, Marion C.; Wicker, Thomas; Bourras, Salim; Keller, Beat

    2018-01-01

    Powdery mildew is an important disease of cereals. It is caused by one species, Blumeria graminis, which is divided into formae speciales each of which is highly specialized to one host. Recently, a new form capable of growing on triticale (B.g. triticale) has emerged through hybridization between wheat and rye mildews (B.g. tritici and B.g. secalis, respectively). In this work, we used RNA sequencing to study the molecular basis of host adaptation in B.g. triticale. We analyzed gene expression in three B.g. tritici isolates, two B.g. secalis isolates and two B.g. triticale isolates and identified a core set of putative effector genes that are highly expressed in all formae speciales. We also found that the genes differentially expressed between isolates of the same form as well as between different formae speciales were enriched in putative effectors. Their coding genes belong to several families including some which contain known members of mildew avirulence (Avr) and suppressor (Svr) genes. Based on these findings we propose that effectors play an important role in host adaptation that is mechanistically based on Avr-Resistance gene-Svr interactions. We also found that gene expression in the B.g. triticale hybrid is mostly conserved with the parent-of-origin, but some genes inherited from B.g. tritici showed a B.g. secalis-like expression. Finally, we identified 11 unambiguous cases of putative effector genes with hybrid-specific, non-parent of origin gene expression, and we propose that they are possible determinants of host specialization in triticale mildew. These data suggest that altered expression of multiple effector genes, in particular Avr and Svr related factors, might play a role in mildew host adaptation based on hybridization. PMID:29441081

  9. Genes2WordCloud: a quick way to identify biological themes from gene lists and free text.

    PubMed

    Baroukh, Caroline; Jenkins, Sherry L; Dannenfelser, Ruth; Ma'ayan, Avi

    2011-10-13

    Word-clouds recently emerged on the web as a solution for quickly summarizing text by maximizing the display of most relevant terms about a specific topic in the minimum amount of space. As biologists are faced with the daunting amount of new research data commonly presented in textual formats, word-clouds can be used to summarize and represent biological and/or biomedical content for various applications. Genes2WordCloud is a web application that enables users to quickly identify biological themes from gene lists and research relevant text by constructing and displaying word-clouds. It provides users with several different options and ideas for the sources that can be used to generate a word-cloud. Different options for rendering and coloring the word-clouds give users the flexibility to quickly generate customized word-clouds of their choice. Genes2WordCloud is a word-cloud generator and a word-cloud viewer that is based on WordCram implemented using Java, Processing, AJAX, mySQL, and PHP. Text is fetched from several sources and then processed to extract the most relevant terms with their computed weights based on word frequencies. Genes2WordCloud is freely available for use online; it is open source software and is available for installation on any web-site along with supporting documentation at http://www.maayanlab.net/G2W. Genes2WordCloud provides a useful way to summarize and visualize large amounts of textual biological data or to find biological themes from several different sources. The open source availability of the software enables users to implement customized word-clouds on their own web-sites and desktop applications.
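
    The abstract above says terms are weighted by word frequency but gives no formula. The following is a minimal Python sketch of one way such weights could be computed; it is not the Genes2WordCloud implementation (which the abstract describes as built on WordCram, Java, Processing, AJAX, mySQL, and PHP). The function name, the stop-word list, and the relative-frequency weighting are all illustrative assumptions.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not say which words are filtered.
STOP_WORDS = {"the", "of", "and", "in", "a", "to", "is", "for", "with", "by"}


def term_weights(text, top_n=5):
    """Return the top_n most frequent non-stop-word terms as (term, weight) pairs.

    The weight is the term's share of all retained words (an assumed scheme).
    """
    words = [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]
    counts = Counter(words)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(term, count / total) for term, count in counts.most_common(top_n)]
```

    A word-cloud renderer could then scale each term's font size by its weight; how Genes2WordCloud maps weights to display size is not stated in the abstract.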

  10. Evaluating the evaluation of cancer driver genes

    PubMed Central

    Tokheim, Collin J.; Papadopoulos, Nickolas; Kinzler, Kenneth W.; Vogelstein, Bert; Karchin, Rachel

    2016-01-01

    Sequencing has identified millions of somatic mutations in human cancers, but distinguishing cancer driver genes remains a major challenge. Numerous methods have been developed to identify driver genes, but evaluation of the performance of these methods is hindered by the lack of a gold standard, that is, bona fide driver gene mutations. Here, we establish an evaluation framework that can be applied to driver gene prediction methods. We used this framework to compare the performance of eight such methods. One of these methods, described here, incorporated a machine-learning–based ratiometric approach. We show that the driver genes predicted by each of the eight methods vary widely. Moreover, the P values reported by several of the methods were inconsistent with the uniform values expected, thus calling into question the assumptions that were used to generate them. Finally, we evaluated the potential effects of unexplained variability in mutation rates on false-positive driver gene predictions. Our analysis points to the strengths and weaknesses of each of the currently available methods and offers guidance for improving them in the future. PMID:27911828
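
    The abstract above notes that P values reported by several methods were inconsistent with the uniform values expected. It does not say how that comparison was made. As an illustration only, the sketch below computes the one-sample Kolmogorov-Smirnov distance between a list of P values and the Uniform(0, 1) distribution, a standard check that is swapped in here and is not claimed to be the authors' procedure; the function name is hypothetical.

```python
def ks_uniform_statistic(p_values):
    """Largest gap between the empirical CDF of p_values and the Uniform(0, 1) CDF."""
    xs = sorted(p_values)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one p-value")
    d = 0.0
    for i, x in enumerate(xs):
        # The empirical CDF jumps from i/n to (i+1)/n at x; the uniform CDF at x is x.
        d = max(d, (i + 1) / n - x, x - i / n)
    return d
```

    Under a well-calibrated null model the statistic should be small for a large set of P values; a large value indicates P values piled up near 0 or 1.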

  11. Suppression subtractive hybridization and comparative expression analysis to identify developmentally regulated genes in filamentous fungi.

    PubMed

    Gesing, Stefan; Schindler, Daniel; Nowrousian, Minou

    2013-09-01

    Ascomycetes differentiate four major morphological types of fruiting bodies (apothecia, perithecia, pseudothecia and cleistothecia) that are derived from an ancestral fruiting body. Thus, fruiting body differentiation is most likely controlled by a set of common core genes. One way to identify such genes is to search for genes with evolutionary conserved expression patterns. Using suppression subtractive hybridization (SSH), we selected differentially expressed transcripts in Pyronema confluens (Pezizales) by comparing two cDNA libraries specific for sexual and for vegetative development, respectively. The expression patterns of selected genes from both libraries were verified by quantitative real time PCR. Expression of several corresponding homologous genes was found to be conserved in two members of the Sordariales (Sordaria macrospora and Neurospora crassa), a derived group of ascomycetes that is only distantly related to the Pezizales. Knockout studies with N. crassa orthologues of differentially regulated genes revealed a functional role during fruiting body development for the gene NCU05079, encoding a putative MFS peptide transporter. These data indicate conserved gene expression patterns and a functional role of the corresponding genes during fruiting body development; such genes are candidates of choice for further functional analysis. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  12. Network-based integration of GWAS and gene expression identifies a HOX-centric network associated with serous ovarian cancer risk

    PubMed Central

    Kar, Siddhartha P.; Tyrer, Jonathan P.; Li, Qiyuan; Lawrenson, Kate; Aben, Katja K.H.; Anton-Culver, Hoda; Antonenkova, Natalia; Chenevix-Trench, Georgia; Baker, Helen; Bandera, Elisa V.; Bean, Yukie T.; Beckmann, Matthias W.; Berchuck, Andrew; Bisogna, Maria; Bjørge, Line; Bogdanova, Natalia; Brinton, Louise; Brooks-Wilson, Angela; Butzow, Ralf; Campbell, Ian; Carty, Karen; Chang-Claude, Jenny; Chen, Yian Ann; Chen, Zhihua; Cook, Linda S.; Cramer, Daniel; Cunningham, Julie M.; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A.; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Easton, Douglas F.; Edwards, Robert P.; Ekici, Arif B.; Fasching, Peter A.; Fridley, Brooke L.; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G.; Glasspool, Rosalind; Goode, Ellen L.; Goodman, Marc T.; Grownwald, Jacek; Harrington, Patricia; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A.T.; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus K.; Hosono, Satoyo; Iversen, Edwin S.; Jakubowska, Anna; Paul, James; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Kjaer, Susanne K.; Kelemen, Linda E.; Kellar, Melissa; Kelley, Joseph; Kiemeney, Lambertus A.; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D.; Lee, Alice W.; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A.; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R.; McNeish, Iain A.; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B.; Narod, Steven A.; Nedergaard, Lotte; Ness, Roberta B.; Nevanlinna, Heli; Odunsi, Kunle; Olson, Sara H.; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Pearce, Celeste Leigh; Pejovic, Tanja; Pelttari, Liisa M.; Permuth-Wey, Jennifer; Phelan, Catherine M.; Pike, Malcolm C.; Poole, Elizabeth M.; Ramus, Susan J.; Risch, Harvey A.; Rosen, Barry; Rossing, Mary 
Anne; Rothstein, Joseph H.; Rudolph, Anja; Runnebaum, Ingo B.; Rzepecka, Iwona K.; Salvesen, Helga B.; Schildkraut, Joellen M.; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C.; Sucheston-Campbell, Lara E.; Tangen, Ingvild L.; Teo, Soo-Hwang; Terry, Kathryn L.; Thompson, Pamela J; Timorek, Agnieszka; Tsai, Ya-Yu; Tworoger, Shelley S.; van Altena, Anne M.; Van Nieuwenhuysen, Els; Vergote, Ignace; Vierkant, Robert A.; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S.; Wicklund, Kristine G.; Wilkens, Lynne R.; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Sellers, Thomas A.; Monteiro, Alvaro N. A.; Freedman, Matthew L.; Gayther, Simon A.; Pharoah, Paul D. P.

    2015-01-01

    Background: Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified by co-expression may also be enriched for additional EOC risk associations. Methods: We selected TF genes within 1 Mb of the top signal at the 12 genome-wide significant risk loci. Mutual information, a form of correlation, was used to build networks of genes strongly co-expressed with each selected TF gene in the unified microarray data set of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this data set were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). Results: Gene set enrichment analysis identified six networks centered on TF genes (HOXB2, HOXB5, HOXB6, HOXB7 at 17q21.32 and HOXD1, HOXD3 at 2q31) that were significantly enriched for genes from the risk-associated end of the ranked list (P<0.05 and FDR<0.05). These results were replicated (P<0.05) using an independent association study (7,035 cases/21,693 controls). Genes underlying enrichment in the six networks were pooled into a combined network. Conclusion: We identified a HOX-centric network associated with serous EOC risk containing several genes with known or emerging roles in serous EOC development. Impact: Network analysis integrating large, context-specific data sets has the potential to offer mechanistic insights into cancer susceptibility and prioritize genes for experimental characterization. PMID:26209509
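
    The Methods above use mutual information to find genes strongly co-expressed with each TF gene, without giving the estimator. The sketch below shows a plug-in mutual information estimate between two expression vectors after equal-width binning. The binning scheme, the bin count, and the function names are assumptions for illustration and are not taken from the study.

```python
import math
from collections import Counter


def discretize(values, n_bins=3):
    """Equal-width binning of a numeric vector (bin count is an assumed choice)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # Clamp the maximum value into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=3):
    """Plug-in mutual information estimate (in bits) between two equal-length vectors."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    n = len(bx)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi
```

    A co-expression network in this spirit would link a TF gene to every gene whose mutual information with it exceeds some threshold; the threshold used in the study is not given in the abstract.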

  13. A genome-wide shRNA screen identifies GAS1 as a novel melanoma metastasis suppressor gene.

    PubMed

    Gobeil, Stephane; Zhu, Xiaochun; Doillon, Charles J; Green, Michael R

    2008-11-01

    Metastasis suppressor genes inhibit one or more steps required for metastasis without affecting primary tumor formation. Due to the complexity of the metastatic process, the development of experimental approaches for identifying genes involved in metastasis prevention has been challenging. Here we describe a genome-wide RNAi screening strategy to identify candidate metastasis suppressor genes. Following expression in weakly metastatic B16-F0 mouse melanoma cells, shRNAs were selected based upon enhanced satellite colony formation in a three-dimensional cell culture system and confirmed in a mouse experimental metastasis assay. Using this approach we discovered 22 genes whose knockdown increased metastasis without affecting primary tumor growth. We focused on one of these genes, Gas1 (Growth arrest-specific 1), because we found that it was substantially down-regulated in highly metastatic B16-F10 melanoma cells, which contributed to the high metastatic potential of this mouse cell line. We further demonstrated that Gas1 has all the expected properties of a melanoma tumor suppressor including: suppression of metastasis in a spontaneous metastasis assay, promotion of apoptosis following dissemination of cells to secondary sites, and frequent down-regulation in human melanoma metastasis-derived cell lines and metastatic tumor samples. Thus, we developed a genome-wide shRNA screening strategy that enables the discovery of new metastasis suppressor genes.

  14. Comparative transcriptional profiling identifies takeout as a gene that regulates life span

    PubMed Central

    Bauer, Johannes; Antosh, Michael; Chang, Chengyi; Schorl, Christoph; Kolli, Santharam; Neretti, Nicola; Helfand, Stephen L.

    2010-01-01

    A major challenge in translating the positive effects of dietary restriction (DR) for the improvement of human health is the development of therapeutic mimics. One approach to finding DR mimics is based upon identification of the proximal effectors of DR life span extension. Whole genome profiling of DR in Drosophila shows a large number of changes in gene expression, making it difficult to establish which changes are involved in life span determination as opposed to other unrelated physiological changes. We used comparative whole genome expression profiling to discover genes whose change in expression is shared between DR and two molecular genetic life span extending interventions related to DR, increased dSir2 and decreased Dmp53 activity. We find twenty-one genes shared among the three related life span extending interventions. One of these genes, takeout, thought to be involved in circadian rhythms, feeding behavior and juvenile hormone binding is also increased in four other life span extending conditions: Rpd3, Indy, chico and methuselah. We demonstrate takeout is involved in longevity determination by specifically increasing adult takeout expression and extending life span. These studies demonstrate the power of comparative whole genome transcriptional profiling for identifying specific downstream elements of the DR life span extending pathway. PMID:20519778

  15. Identifying and exploiting genes that potentiate the evolution of antibiotic resistance.

    PubMed

    Gifford, Danna R; Furió, Victoria; Papkou, Andrei; Vogwill, Tom; Oliver, Antonio; MacLean, R Craig

    2018-06-01

    There is an urgent need to develop novel approaches for predicting and preventing the evolution of antibiotic resistance. Here, we show that the ability to evolve de novo resistance to a clinically important β-lactam antibiotic, ceftazidime, varies drastically across the genus Pseudomonas. This variation arises because strains possessing the ampR global transcriptional regulator evolve resistance at a high rate. This does not arise because of mutations in ampR. Instead, this regulator potentiates evolution by allowing mutations in conserved peptidoglycan biosynthesis genes to induce high levels of β-lactamase expression. Crucially, blocking this evolutionary pathway by co-administering ceftazidime with the β-lactamase inhibitor avibactam can be used to eliminate pathogenic P. aeruginosa populations before they can evolve resistance. In summary, our study shows that identifying potentiator genes that act as evolutionary catalysts can be used to both predict and prevent the evolution of antibiotic resistance.

  16. Engineering and Functional Characterization of Fusion Genes Identifies Novel Oncogenic Drivers of Cancer.

    Cancer.gov

    Oncogenic gene fusions drive many human cancers, but tools to more quickly unravel their functional contributions are needed. Here we describe methodology permitting fusion gene construction for functional evaluation. Using this strategy, we engineered the known fusion oncogenes, BCR-ABL1, EML4-ALK, and ETV6-NTRK3, as well as 20 previously uncharacterized fusion genes identified in TCGA datasets.

  17. Genes for seed longevity in barley identified by genomic analysis on Near Isogenic Lines.

    PubMed

    Wozny, Dorothee; Kramer, Katharina; Finkemeier, Iris; Acosta, Ivan F; Koornneef, Maarten

    2018-05-09

    Genes controlling differences in seed longevity between two barley (Hordeum vulgare) accessions were identified by combining quantitative genetics 'omics' technologies in Near Isogenic Lines (NILs). The NILs were derived from crosses between the spring barley landraces L94 from Ethiopia and Cebada Capa from Argentina. A combined transcriptome and proteome analysis on mature, non-aged seeds of the two parental lines and the L94 NILs by RNA-sequencing and total seed proteomic profiling identified the UDP-glycosyltransferase MLOC_11661.1 as candidate gene for the QTL on 2H, and the NADP-dependent malic enzyme (NADP-ME) MLOC_35785.1 as possible downstream target gene. To validate these candidates, they were expressed in Arabidopsis under the control of constitutive promoters to attempt complementing the T-DNA knock-out line nadp-me1. Both the NADP-ME MLOC_35785.1 and the UDP-glycosyltransferase MLOC_11661.1 were able to rescue the nadp-me1 seed longevity phenotype. In the case of the UDP-glycosyltransferase, with high accumulation in NILs, only the coding sequence of Cebada Capa had a rescue effect.

  18. Yeast Two-Hybrid and One-Hybrid Screenings Identify Regulators of hsp70 Gene Expression.

    PubMed

    Saito, Youhei; Nakagawa, Takanobu; Kakihana, Ayana; Nakamura, Yoshia; Nabika, Tomomi; Kasai, Michihiro; Takamori, Mai; Yamagishi, Nobuyuki; Kuga, Takahisa; Hatayama, Takumi; Nakayama, Yuji

    2016-09-01

    The mammalian stress protein Hsp105β, which is specifically expressed during mild heat shock and localizes to the nucleus, induces the major stress protein Hsp70. In the present study, we performed yeast two-hybrid and one-hybrid screenings to identify the regulators of Hsp105β-mediated hsp70 gene expression. Six and two proteins were detected as Hsp105β- and hsp70 promoter-binding proteins, respectively. A luciferase reporter gene assay revealed that hsp70 promoter activation is enhanced by the transcriptional co-activator AF9 and splicing mediator SNRPE, but suppressed by the coiled-coil domain-containing protein CCDC127. Of these proteins, the knockdown of SNRPE suppressed the expression of Hsp70 irrespective of the presence of Hsp105β, indicating that SNRPE essentially functions as a transcriptional activator of hsp70 gene expression. The overexpression of HSP70 in tumor cells has been associated with cell survival and drug resistance. We here identified novel regulators of Hsp70 expression in stress signaling and also provided important insights into Hsp70-targeted anti-cancer therapy. J. Cell. Biochem. 117: 2109-2117, 2016. © 2016 Wiley Periodicals, Inc.

  19. ZCURVE 3.0: identify prokaryotic genes with higher accuracy as well as automatically and accurately select essential genes.

    PubMed

    Hua, Zhi-Gang; Lin, Yan; Yuan, Ya-Zhou; Yang, De-Chang; Wei, Wen; Guo, Feng-Biao

    2015-07-01

    In 2003, we developed an ab initio program, ZCURVE 1.0, to find genes in bacterial and archaeal genomes. In this work, we present the updated version (i.e. ZCURVE 3.0). Using 422 prokaryotic genomes, the average accuracy was 93.7% with the updated version, compared with 88.7% with the original version. Such results also demonstrate that ZCURVE 3.0 is comparable with Glimmer 3.02 and may provide complementary predictions to it. In fact, the joint application of the two programs generated better results by correctly finding more annotated genes while also containing fewer false-positive predictions. As the exclusive function, ZCURVE 3.0 contains one post-processing program that can identify essential genes with high accuracy (generally >90%). We hope ZCURVE 3.0 will receive wide use with the web-based running mode. The updated ZCURVE can be freely accessed from http://cefg.uestc.edu.cn/zcurve/ or http://tubic.tju.edu.cn/zcurveb/ without any restrictions. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Fatigue-Related Gene Networks Identified in CD14+ Cells Isolated From HIV-Infected Patients—Part I: Research Findings

    PubMed Central

    Voss, Joachim G.; Dobra, Adrian; Morse, Caryn; Kovacs, Joseph A.; Danner, Robert L.; Munson, Peter J.; Logan, Carolea; Rangel, Zoila; Adelsberger, Joseph W.; McLaughlin, Mary; Adams, Larry D.; Raju, Raghavan; Dalakas, Marinos C.

    2016-01-01

    Purpose: Human immunodeficiency virus (HIV)–related fatigue (HRF) is multicausal and potentially related to mitochondrial dysfunction caused by antiretroviral therapy with nucleoside reverse transcriptase inhibitors (NRTIs). Methodology: The authors compared gene expression profiles of CD14+ cells of low versus high fatigued, NRTI-treated HIV patients to healthy controls (n = 5/group). The authors identified 32 genes predictive of low versus high fatigue and 33 genes predictive of healthy versus HIV infection. The authors constructed genetic networks to further elucidate the possible biological pathways in which these genes are involved. Relevance for nursing practice: Genes including the actin cytoskeletal regulatory proteins Prokineticin 2 and Cofilin 2 along with mitochondrial inner membrane proteins are involved in multiple pathways and were predictors of fatigue status. Previously identified inflammatory and signaling genes were predictive of HIV status, clearly confirming our results and suggesting a possible further connection between mitochondrial function and HIV. Isolated CD14+ cells are easily accessible cells that could be used for further study of the connection between fatigue and mitochondrial function of HIV patients. Implication for Practice: The findings from this pilot study take us one step closer to identifying biomarker targets for fatigue status and mitochondrial dysfunction. Specific biomarkers will be pertinent to the development of methodologies to diagnose, monitor, and treat fatigue and mitochondrial dysfunction. PMID:23324479

  1. Contig Maps and Genomic Sequencing Identify Candidate Genes in the Usher 1C Locus

    PubMed Central

    Higgins, Michael J.; Day, Colleen D.; Smilinich, Nancy J.; Ni, L.; Cooper, Paul R.; Nowak, Norma J.; Davies, Chris; de Jong, Pieter J.; Hejtmancik, Fielding; Evans, Glen A.; Smith, Richard J.H.; Shows, Thomas B.

    1998-01-01

    Usher syndrome 1C (USH1C) is a congenital condition manifesting profound hearing loss, the absence of vestibular function, and eventual retinal degeneration. The USH1C locus has been mapped genetically to a 2- to 3-cM interval in 11p14–15.1 between D11S899 and D11S861. In an effort to identify the USH1C disease gene we have isolated the region between these markers in yeast artificial chromosomes (YACs) using a combination of STS content mapping and Alu–PCR hybridization. The YAC contig is ∼3.5 Mb and has located several other loci within this interval, resulting in the order CEN-LDHA-SAA1-TPH-D11S1310-(D11S1888/KCNC1)-MYOD1-D11S902-D11S921-D11S1890-TEL. Subsequent haplotyping and homozygosity analysis refined the location of the disease gene to a 400-kb interval between D11S902 and D11S1890 with all affected individuals being homozygous for the internal marker D11S921. To facilitate gene identification, the critical region has been converted into P1 artificial chromosome (PAC) clones using sequence-tagged sites (STSs) mapped to the YAC contig, Alu–PCR products generated from the YACs, and PAC end probes. A contig of >50 PAC clones has been assembled between D11S1310 and D11S1890, confirming the order of markers used in haplotyping. Three PAC clones representing nearly two-thirds of the USH1C critical region have been sequenced. PowerBLAST analysis identified six clusters of expressed sequence tags (ESTs), two known genes (BIR, SUR1) mapped previously to this region, and a previously characterized but unmapped gene NEFA (DNA binding/EF hand/acidic amino-acid-rich). GRAIL analysis identified 11 CpG islands and 73 exons of excellent quality. These data allowed the construction of a transcription map for the USH1C critical region, consisting of three known genes and six or more novel transcripts. Based on their map location, these loci represent candidate disease loci for USH1C. The NEFA gene was assessed as the USH1C locus by the sequencing of an amplified NEFA

  2. Whole Wiskott‑Aldrich syndrome protein gene deletion identified by high throughput sequencing.

    PubMed

    He, Xiangling; Zou, Runying; Zhang, Bing; You, Yalan; Yang, Yang; Tian, Xin

    2017-11-01

    Wiskott‑Aldrich syndrome (WAS) is a rare X‑linked recessive immunodeficiency disorder, characterized by thrombocytopenia, small platelets, eczema and recurrent infections associated with increased risk of autoimmunity and malignancy disorders. Mutations in the WAS protein (WASP) gene are responsible for WAS. To date, WASP mutations, including missense/nonsense, splicing, small deletions, small insertions, gross deletions, and gross insertions have been identified in patients with WAS. In addition, WASP‑interacting proteins are suspected in patients with clinical features of WAS, in whom the WASP gene sequence and mRNA levels are normal. The present study aimed to investigate the application of next generation sequencing in definitive diagnosis and clinical therapy for WAS. A 5 month‑old child with WAS who displayed symptoms of thrombocytopenia was examined. Whole exome sequence analysis of genomic DNA showed that the coverage and depth of WASP were extremely low. Quantitative polymerase chain reaction indicated total WASP gene deletion in the proband. In conclusion, high throughput sequencing is useful for the verification of WAS on the genetic profile, and has implications for family planning guidance and establishment of clinical programs.

  3. PhiSiGns: an online tool to identify signature genes in phages and design PCR primers for examining phage diversity.

    PubMed

    Dwivedi, Bhakti; Schmieder, Robert; Goldsmith, Dawn B; Edwards, Robert A; Breitbart, Mya

    2012-03-04

    Phages (viruses that infect bacteria) have gained significant attention because of their abundance, diversity and important ecological roles. However, the lack of a universal gene shared by all phages presents a challenge for phage identification and characterization, especially in environmental samples where it is difficult to culture phage-host systems. Homologous conserved genes (or "signature genes") present in groups of closely-related phages can be used to explore phage diversity and define evolutionary relationships amongst these phages. Bioinformatic approaches are needed to identify candidate signature genes and design PCR primers to amplify those genes from environmental samples; however, there is currently no existing computational tool that biologists can use for this purpose. Here we present PhiSiGns, a web-based and standalone application that performs a pairwise comparison of each gene present in user-selected phage genomes, identifies signature genes, generates alignments of these genes, and designs potential PCR primer pairs. PhiSiGns is available at (http://www.phantome.org/phisigns/; http://phisigns.sourceforge.net/) with a link to the source code. Here we describe the specifications of PhiSiGns and demonstrate its application with a case study. PhiSiGns provides phage biologists with a user-friendly tool to identify signature genes and design PCR primers to amplify related genes from uncultured phages in environmental samples. This bioinformatics tool will facilitate the development of novel signature genes for use as molecular markers in studies of phage diversity, phylogeny, and evolution.

  4. A numerical identifiability test for state-space models--application to optimal experimental design.

    PubMed

    Hidalgo, M E; Ayesa, E

    2001-01-01

    This paper describes a mathematical tool for identifiability analysis, easily applicable to high order non-linear systems modelled in state-space and implementable in simulators with a time-discrete approach. This procedure also permits a rigorous analysis of the expected estimation errors (average and maximum) in calibration experiments. The methodology is based on the recursive numerical evaluation of the information matrix during the simulation of a calibration experiment and in the setting-up of a group of information parameters based on geometric interpretations of this matrix. As an example of the utility of the proposed test, the paper presents its application to an optimal experimental design of ASM Model No. 1 calibration, in order to estimate the maximum specific growth rate μH and the concentration of heterotrophic biomass XBH.
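
    The recursive evaluation of an information matrix mentioned above can be illustrated with a small sketch. This is not the authors' procedure: the two-parameter model y(t) = a * exp(-b * t), the sampling times, the noise variance sigma2, and every function name below are assumptions chosen only to show how the matrix grows sample by sample and how scalar summaries (here the determinant and the two eigenvalues) can be read from it.

```python
import math

def sensitivities(t, a, b):
    """Partial derivatives of the assumed model y(t) = a * exp(-b * t) w.r.t. (a, b)."""
    e = math.exp(-b * t)
    return [e, -a * t * e]

def accumulate_information(times, a, b, sigma2=1.0):
    """Build the 2x2 information matrix recursively, one sample at a time.

    Each sample adds the rank-one term s s^T / sigma2, where s is the
    sensitivity vector at that sampling time.
    """
    fim = [[0.0, 0.0], [0.0, 0.0]]
    for t in times:
        s = sensitivities(t, a, b)
        for i in range(2):
            for j in range(2):
                fim[i][j] += s[i] * s[j] / sigma2
    return fim

def summarize(fim):
    """Return (determinant, largest eigenvalue, smallest eigenvalue) of a symmetric 2x2 matrix."""
    det = fim[0][0] * fim[1][1] - fim[0][1] * fim[1][0]
    half_trace = (fim[0][0] + fim[1][1]) / 2.0
    disc = math.sqrt(max(half_trace * half_trace - det, 0.0))
    return det, half_trace + disc, half_trace - disc

# Hypothetical parameter values and sampling schedule.
det, lam_max, lam_min = summarize(accumulate_information([0.0, 1.0, 2.0, 4.0], a=2.0, b=0.5))
print(det, lam_max, lam_min)
```

    In this toy setting a single sampling time gives a singular matrix (the two parameters cannot be separated), while several distinct times give a positive determinant; a very small smallest eigenvalue would point to a poorly determined parameter combination.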

  5. Machine Learning Leveraging Genomes from Metagenomes Identifies Influential Antibiotic Resistance Genes in the Infant Gut Microbiome

    PubMed Central

    Olm, Matthew R.; Morowitz, Michael J.

    2018-01-01

    ABSTRACT Antibiotic resistance in pathogens is extensively studied, and yet little is known about how antibiotic resistance genes of typical gut bacteria influence microbiome dynamics. Here, we leveraged genomes from metagenomes to investigate how genes of the premature infant gut resistome correspond to the ability of bacteria to survive under certain environmental and clinical conditions. We found that formula feeding impacts the resistome. Random forest models corroborated by statistical tests revealed that the gut resistome of formula-fed infants is enriched in class D beta-lactamase genes. Interestingly, Clostridium difficile strains harboring this gene are at higher abundance in formula-fed infants than C. difficile strains lacking this gene. Organisms with genes for major facilitator superfamily drug efflux pumps have higher replication rates under all conditions, even in the absence of antibiotic therapy. Using a machine learning approach, we identified genes that are predictive of an organism’s direction of change in relative abundance after administration of vancomycin and cephalosporin antibiotics. The most accurate results were obtained by reducing annotated genomic data to five principal components classified by boosted decision trees. Among the genes involved in predicting whether an organism increased in relative abundance after treatment are those that encode subclass B2 beta-lactamases and transcriptional regulators of vancomycin resistance. This demonstrates that machine learning applied to genome-resolved metagenomics data can identify key genes for survival after antibiotics treatment and predict how organisms in the gut microbiome will respond to antibiotic administration. IMPORTANCE The process of reconstructing genomes from environmental sequence data (genome-resolved metagenomics) allows unique insight into microbial systems. We apply this technique to investigate how the antibiotic resistance genes of bacteria affect their ability to

  6. Live-cell monitoring of periodic gene expression in synchronous human cells identifies Forkhead genes involved in cell cycle control

    PubMed Central

    Grant, Gavin D.; Gamsby, Joshua; Martyanov, Viktor; Brooks, Lionel; George, Lacy K.; Mahoney, J. Matthew; Loros, Jennifer J.; Dunlap, Jay C.; Whitfield, Michael L.

    2012-01-01

    We developed a system to monitor periodic luciferase activity from cell cycle–regulated promoters in synchronous cells. Reporters were driven by a minimal human E2F1 promoter with peak expression in G1/S or a basal promoter with six Forkhead DNA-binding sites with peak expression at G2/M. After cell cycle synchronization, luciferase activity was measured in live cells at 10-min intervals across three to four synchronous cell cycles, allowing unprecedented resolution of cell cycle–regulated gene expression. We used this assay to screen Forkhead transcription factors for control of periodic gene expression. We confirmed a role for FOXM1 and identified two novel cell cycle regulators, FOXJ3 and FOXK1. Knockdown of FOXJ3 and FOXK1 eliminated cell cycle–dependent oscillations and resulted in decreased cell proliferation rates. Analysis of genes regulated by FOXJ3 and FOXK1 showed that FOXJ3 may regulate a network of zinc finger proteins and that FOXK1 binds to the promoter and regulates DHFR, TYMS, GSDMD, and the E2F binding partner TFDP1. Chromatin immunoprecipitation followed by high-throughput sequencing analysis identified 4329 genomic loci bound by FOXK1, 83% of which contained a FOXK1-binding motif. We verified that a subset of these loci are activated by wild-type FOXK1 but not by a FOXK1 (H355A) DNA-binding mutant. PMID:22740631

  7. Transcriptome profiling of two maize inbreds with distinct responses to Gibberella ear rot disease to identify candidate resistance genes.

    PubMed

    Kebede, Aida Z; Johnston, Anne; Schneiderman, Danielle; Bosnich, Whynn; Harris, Linda J

    2018-02-09

    Gibberella ear rot (GER) is one of the most economically important fungal diseases of maize in the temperate zone due to moldy grain contaminated with health-threatening mycotoxins. To develop resistant genotypes and control the disease, understanding the host-pathogen interaction is essential. RNA-Seq-derived transcriptome profiles of fungal- and mock-inoculated developing kernel tissues of two maize inbred lines were used to identify differentially expressed transcripts and propose candidate genes mapping within GER resistance quantitative trait loci (QTL). A total of 1255 transcripts were significantly (P ≤ 0.05) up-regulated due to fungal infection in both susceptible and resistant inbreds. A greater number of transcripts were up-regulated in the former (1174) than in the latter (497), and the number increased as the infection progressed from 1 to 2 days after inoculation. Focusing on differentially expressed genes located within QTL regions for GER resistance, we identified 81 genes involved in membrane transport, hormone regulation, cell wall modification, cell detoxification, and biosynthesis of pathogenesis-related proteins and phytoalexins as candidate genes contributing to resistance. Applying droplet digital PCR, we validated the expression profiles of a subset of these candidate genes from QTL regions contributed by the resistant inbred on chromosomes 1, 2 and 9. By screening global gene expression profiles for differentially expressed genes mapping within resistance QTL regions, we have identified candidate genes for Gibberella ear rot resistance on several maize chromosomes which could potentially lead to a better understanding of Fusarium resistance mechanisms.

  8. Mango (Mangifera indica L.) cv. Kent fruit mesocarp de novo transcriptome assembly identifies gene families important for ripening.

    PubMed

    Dautt-Castro, Mitzuko; Ochoa-Leyva, Adrian; Contreras-Vergara, Carmen A; Pacheco-Sanchez, Magda A; Casas-Flores, Sergio; Sanchez-Flores, Alejandro; Kuhn, David N; Islas-Osuna, Maria A

    2015-01-01

    Fruit ripening is a physiological and biochemical process genetically programmed to regulate fruit quality parameters like firmness, flavor, odor and color, as well as production of ethylene in climacteric fruit. In this study, a transcriptomic analysis of mango (Mangifera indica L.) mesocarp cv. "Kent" was done to identify key genes associated with fruit ripening. Using the Illumina sequencing platform, 67,682,269 clean reads and a transcriptome of 4.8 Gb were obtained. A total of 33,142 coding sequences were predicted and, after functional annotation, 25,154 protein sequences were assigned a product according to the Swiss-Prot database and 32,560 according to the non-redundant database. Differential expression analysis identified 2,306 genes with significant differences in expression between mature-green and ripe mango [1,178 up-regulated and 1,128 down-regulated (FDR ≤ 0.05)]. The expression of 10 genes evaluated by both qRT-PCR and RNA-seq data was highly correlated (R = 0.97), validating the differential expression data from RNA-seq alone. Gene Ontology enrichment analysis showed significantly represented terms associated with fruit ripening like "cell wall," "carbohydrate catabolic process" and "starch and sucrose metabolic process" among others. Mango genes were assigned to 327 metabolic pathways according to the Kyoto Encyclopedia of Genes and Genomes database, among them those involved in fruit ripening such as plant hormone signal transduction, starch and sucrose metabolism, galactose metabolism, terpenoid backbone, and carotenoid biosynthesis. This study provides a mango transcriptome that will be very helpful to identify genes for expression studies in early and late flowering mangos during fruit ripening.
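
    The abstract reports an FDR ≤ 0.05 cut-off but does not say which correction was used. As a hedged illustration only, the sketch below applies the Benjamini-Hochberg adjustment, one common way to control the false discovery rate; the function names and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def call_significant(p_values, fdr=0.05):
    """Flag each test whose adjusted p-value is at or below the FDR threshold."""
    return [q <= fdr for q in benjamini_hochberg(p_values)]

# Hypothetical per-gene p-values from a differential expression test.
example = [0.01, 0.04, 0.03, 0.20]
print(benjamini_hochberg(example))
print(call_significant(example))
```

    Genes flagged this way would then be split into up- and down-regulated sets by the sign of their expression change.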

  9. Heterogeneous activation of the TGFβ pathway in glioblastomas identified by gene expression-based classification using TGFβ-responsive genes

    PubMed Central

    Xu, Xie L; Kapoun, Ann M

    2009-01-01

    Background TGFβ has emerged as an attractive target for the therapeutic intervention of glioblastomas. Aberrant TGFβ overproduction in glioblastoma and other high-grade gliomas has been reported; however, to date, none of these reports has systematically examined the components of TGFβ signaling to gain a comprehensive view of TGFβ activation in large cohorts of human glioma patients. Methods TGFβ activation in mammalian cells leads to a transcriptional program that typically affects 5–10% of the genes in the genome. To systematically examine the status of TGFβ activation in high-grade glial tumors, we compiled a gene set of transcriptional response to TGFβ stimulation from tissue culture and in vivo animal studies. These genes were used to examine the status of TGFβ activation in high-grade gliomas including a large cohort of glioblastomas. Unsupervised and supervised classification analysis was performed in two independent, publicly available glioma microarray datasets. Results Unsupervised and supervised classification using the TGFβ-responsive gene list in two independent glial tumor gene expression data sets revealed various levels of TGFβ activation in these tumors. Among glioblastomas, one of the most devastating human cancers, two subgroups were identified that showed distinct TGFβ activation patterns as measured from transcriptional responses. Approximately 62% of glioblastoma samples analyzed showed strong TGFβ activation, while the rest showed a weak TGFβ transcriptional response. Conclusion Our findings suggest heterogeneous TGFβ activation in glioblastomas, which may cause potential differences in responses to anti-TGFβ therapies in these two distinct subgroups of glioblastoma patients. PMID:19192267

  10. Phenotypes of Recessive Pediatric Cataract in a Cohort of Children with Identified Homozygous Gene Mutations (An American Ophthalmological Society Thesis)

    PubMed Central

    Khan, Arif O.; Aldahmesh, Mohammed A.; Alkuraya, Fowzan S.

    2015-01-01

    Purpose: To assess for phenotype-genotype correlations in families with recessive pediatric cataract and identified gene mutations. Methods: Retrospective review (2004 through 2013) of 26 Saudi Arabian apparently nonsyndromic pediatric cataract families referred to one of the authors (A.O.K.) and for which recessive gene mutations were identified. Results: Fifteen different homozygous recessive gene mutations were identified in the 26 consanguineous families; two genes and five families are novel to this study. Ten families had a founder CRYBB1 deletion (all with bilateral central pulverulent cataract), two had the same missense mutation in CRYAB (both with bilateral juvenile cataract with marked variable expressivity), and two had different mutations in FYCO1 (both with bilateral posterior capsular abnormality). The remaining 12 families each had mutations in 12 different genes (CRYAA, CRYBA1, AKR1E2, AGK, BFSP2, CYP27A1, CYP51A1, EPHA2, GCNT2, LONP1, RNLS, WDR87) with unique phenotypes noted for CYP27A1 (bilateral juvenile fleck with anterior and/or posterior capsular cataract and later cerebrotendinous xanthomatosis), EPHA2 (bilateral anterior persistent fetal vasculature), and BFSP2 (bilateral flecklike with cloudy cortex). Potential carrier signs were documented for several families. Conclusions: In this recessive pediatric cataract case series most identified genes are noncrystallin. Recessive pediatric cataract phenotypes are generally nonspecific, but some notable phenotypes are distinct and associated with specific gene mutations. Marked variable expressivity can occur from a recessive missense CRYAB mutation. Genetic analysis of apparently isolated pediatric cataract can sometimes uncover mutations in a syndromic gene. Some gene mutations seem to be associated with apparent heterozygous carrier signs. PMID:26622071

  11. Genes2WordCloud: a quick way to identify biological themes from gene lists and free text

    PubMed Central

    2011-01-01

    Background Word-clouds recently emerged on the web as a solution for quickly summarizing text by maximizing the display of most relevant terms about a specific topic in the minimum amount of space. As biologists are faced with the daunting amount of new research data commonly presented in textual formats, word-clouds can be used to summarize and represent biological and/or biomedical content for various applications. Results Genes2WordCloud is a web application that enables users to quickly identify biological themes from gene lists and research-relevant text by constructing and displaying word-clouds. It provides users with several different options and ideas for the sources that can be used to generate a word-cloud. Different options for rendering and coloring the word-clouds give users the flexibility to quickly generate customized word-clouds of their choice. Methods Genes2WordCloud is a word-cloud generator and a word-cloud viewer that is based on WordCram implemented using Java, Processing, AJAX, MySQL, and PHP. Text is fetched from several sources and then processed to extract the most relevant terms with their computed weights based on word frequencies. Genes2WordCloud is freely available for use online; it is open source software and is available for installation on any web-site along with supporting documentation at http://www.maayanlab.net/G2W. Conclusions Genes2WordCloud provides a useful way to summarize and visualize large amounts of textual biological data or to find biological themes from several different sources. The open source availability of the software enables users to implement customized word-clouds on their own web-sites and desktop applications. PMID:21995939
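
    The frequency-based term weighting described in the Methods can be sketched as follows. This is a minimal illustration and not Genes2WordCloud's own code (which the abstract says is built on Java, Processing, AJAX, MySQL, and PHP): the stop-word list, the tokenization rule, the scaling of weights to the most frequent term, and the function name are all assumptions.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"the", "of", "and", "in", "a", "to", "is", "for", "with"}

def term_weights(text, top_n=5):
    """Return the top_n terms with weights scaled so the most frequent term is 1.0."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    if not counts:
        return []
    peak = max(counts.values())
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(term, count / peak) for term, count in ranked[:top_n]]

# Hypothetical input text standing in for fetched abstracts or annotations.
sample = "Kinase signaling and kinase activity in the cell; kinase inhibitors block signaling."
print(term_weights(sample, top_n=3))
```

    A word-cloud renderer would then map each weight to a font size, so that the highest-weighted terms dominate the display.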

  12. Candidate chemosensory genes identified in the endoparasitoid Meteorus pulchricornis (Hymenoptera: Braconidae) by antennal transcriptome analysis.

    PubMed

    Sheng, Sheng; Liao, Cheng-Wu; Zheng, Yu; Zhou, Yu; Xu, Yan; Song, Wen-Miao; He, Peng; Zhang, Jian; Wu, Fu-An

    2017-06-01

    Meteorus pulchricornis is an endoparasitoid wasp which attacks the larvae of various lepidopteran pests. We present the first antennal transcriptome dataset for M. pulchricornis. A total of 48,845,072 clean reads were obtained and 34,967 unigenes were assembled. Of these, 15,458 unigenes showed a significant similarity (E-value < 10⁻⁵) to known proteins in the NCBI non-redundant protein database. Gene ontology (GO) and cluster of orthologous groups (COG) analyses were used to classify the functions of M. pulchricornis antennae genes. We identified 16 putative odorant-binding protein (OBP) genes, eight chemosensory protein (CSP) genes, 99 olfactory receptor (OR) genes, 19 ionotropic receptor (IR) genes and one sensory neuron membrane protein (SNMP) gene. BLASTx best hit results and phylogenetic analysis both indicated that these chemosensory genes were most closely related to those found in other hymenopteran species. Real-time quantitative PCR assays showed that 14 MpulOBP genes were antennae-specific. Of these, MpulOBP6, MpulOBP9, MpulOBP10, MpulOBP12, MpulOBP15 and MpulOBP16 were found to have greater expression in the antennae than in other body parts, while MpulOBP2 and MpulOBP3 were expressed predominately in the legs and abdomens, respectively. These results might provide a foundation for future studies of olfactory genes and chemoreception in M. pulchricornis. Copyright © 2017 Elsevier Inc. All rights reserved.

  13. A genome-wide inducible phenotypic screen identifies antisense RNA constructs silencing Escherichia coli essential genes

    PubMed Central

    Meng, Jia; Kanzaki, Gregory; Meas, Diane; Lam, Christopher K.; Crummer, Heather; Tain, Justina; Xu, H. Howard

    2013-01-01

    Regulated antisense RNA (asRNA) expression has been employed successfully in Gram-positive bacteria for genome-wide essential gene identification and drug target determination. However, there have been no published reports describing the application of asRNA gene silencing for comprehensive analyses of essential genes in Gram-negative bacteria. In this study, we report the first genome-wide identification of asRNA constructs for essential genes in Escherichia coli. We screened 250,000 library transformants for conditional growth-inhibitory recombinant clones from two shotgun genomic libraries of E. coli using a paired-termini expression vector (pHN678). After sequencing plasmid inserts of 675 confirmed inducer-sensitive cell clones, we identified 152 separate asRNA constructs, of which 134 inserts came from essential genes while 18 originated from non-essential genes (but share operons with essential genes). Among the 79 individual essential genes silenced by these asRNA constructs, 61 genes (77%) engage in processes related to protein synthesis. The cell-based assays of an asRNA clone targeting fusA (encoding elongation factor G) showed that the induced cells were sensitized 12-fold to fusidic acid, a known specific inhibitor. Our results demonstrate the utility of the paired-termini expression vector and feasibility of large-scale gene silencing in E. coli using regulated asRNA expression. PMID:22268863

  14. RUBIC identifies driver genes by detecting recurrent DNA copy number breaks

    PubMed Central

    van Dyk, Ewald; Hoogstraat, Marlous; ten Hoeve, Jelle; Reinders, Marcel J. T.; Wessels, Lodewyk F. A.

    2016-01-01

    The frequent recurrence of copy number aberrations across tumour samples is a reliable hallmark of certain cancer driver genes. However, state-of-the-art algorithms for detecting recurrent aberrations fail to detect several known drivers. In this study, we propose RUBIC, an approach that detects recurrent copy number breaks, rather than recurrently amplified or deleted regions. This change of perspective allows for a simplified approach as recursive peak splitting procedures and repeated re-estimation of the background model are avoided. Furthermore, we control the false discovery rate on the level of called regions, rather than at the probe level, as in competing algorithms. We benchmark RUBIC against GISTIC2 (a state-of-the-art approach) and RAIG (a recently proposed approach) on simulated copy number data and on three SNP6 and NGS copy number data sets from TCGA. We show that RUBIC calls more focal recurrent regions and identifies a much larger fraction of known cancer genes. PMID:27396759

  15. Genome-wide histone state profiling of fibroblasts from the opossum, Monodelphis domestica, identifies the first marsupial-specific imprinted gene

    PubMed Central

    2014-01-01

    Background Imprinted genes have been extensively documented in eutherian mammals and found to exhibit significant interspecific variation in the suites of genes that are imprinted and in their regulation between tissues and developmental stages. Much less is known about imprinted loci in metatherian (marsupial) mammals, wherein studies have been limited to a small number of genes previously known to be imprinted in eutherians. We describe the first ab initio search for imprinted marsupial genes, in fibroblasts from the opossum, Monodelphis domestica, based on a genome-wide ChIP-seq strategy to identify promoters that are simultaneously marked by mutually exclusive, transcriptionally opposing histone modifications. Results We identified a novel imprinted gene (Meis1) and two additional monoallelically expressed genes, one of which (Cstb) showed allele-specific, but non-imprinted expression. Imprinted vs. allele-specific expression could not be resolved for the third monoallelically expressed gene (Rpl17). Transcriptionally opposing histone modifications H3K4me3, H3K9Ac, and H3K9me3 were found at the promoters of all three genes, but differential DNA methylation was not detected at CpG islands at any of these promoters. Conclusions In generating the first genome-wide histone modification profiles for a marsupial, we identified the first gene that is imprinted in a marsupial but not in eutherian mammals. This outcome demonstrates the practicality of an ab initio discovery strategy and implicates histone modification, but not differential DNA methylation, as a conserved mechanism for marking imprinted genes in all therian mammals. Our findings suggest that marsupials use multiple epigenetic mechanisms for imprinting and support the concept that lineage-specific selective forces can produce sets of imprinted genes that differ between metatherian and eutherian lines. PMID:24484454

  16. Guided genetic screen to identify genes essential in the regeneration of hair cells and other tissues.

    PubMed

    Pei, Wuhong; Xu, Lisha; Huang, Sunny C; Pettie, Kade; Idol, Jennifer; Rissone, Alberto; Jimenez, Erin; Sinclair, Jason W; Slevin, Claire; Varshney, Gaurav K; Jones, MaryPat; Carrington, Blake; Bishop, Kevin; Huang, Haigen; Sood, Raman; Lin, Shuo; Burgess, Shawn M

    2018-01-01

    Regenerative medicine holds great promise for both degenerative diseases and traumatic tissue injury, which represent significant challenges to the health care system. Hearing loss, which affects hundreds of millions of people worldwide, is caused primarily by a permanent loss of the mechanosensory receptors of the inner ear known as hair cells. This failure to regenerate hair cells after loss is limited to mammals, while all non-mammalian vertebrates tested were able to completely regenerate these mechanosensory receptors after injury. To understand the mechanism of hair cell regeneration and its association with regeneration of other tissues, we performed a guided mutagenesis screen using zebrafish lateral line hair cells as a screening platform to identify genes that are essential for hair cell regeneration, and further investigated how genes essential for hair cell regeneration were involved in the regeneration of other tissues. We created genetic mutations either by retroviral insertion or CRISPR/Cas9 approaches, and developed a high-throughput screening pipeline for analyzing hair cell development and regeneration. We screened 254 gene mutations and identified 7 genes specifically affecting hair cell regeneration. These hair cell regeneration genes fell into distinct and somewhat surprising functional categories. By examining the regeneration of caudal fin and liver, we found these hair cell regeneration genes often also affected other types of tissue regeneration. Therefore, our results demonstrate guided screening is an effective approach to discover regeneration candidates, and hair cell regeneration is associated with other tissue regeneration.

  17. Integrating Genetic, Transcriptional, and Functional Analyses to Identify Five Novel Genes for Atrial Fibrillation

    PubMed Central

    Sinner, Moritz F.; Tucker, Nathan R.; Lunetta, Kathryn L.; Ozaki, Kouichi; Smith, J. Gustav; Trompet, Stella; Bis, Joshua C.; Lin, Honghuang; Chung, Mina K.; Nielsen, Jonas B.; Lubitz, Steven A.; Krijthe, Bouwe P.; Magnani, Jared W.; Ye, Jiangchuan; Gollob, Michael H.; Tsunoda, Tatsuhiko; Müller-Nurasyid, Martina; Lichtner, Peter; Peters, Annette; Dolmatova, Elena; Kubo, Michiaki; Smith, Jonathan D.; Psaty, Bruce M.; Smith, Nicholas L.; Jukema, J. Wouter; Chasman, Daniel I.; Albert, Christine M.; Ebana, Yusuke; Furukawa, Tetsushi; MacFarlane, Peter; Harris, Tamara B.; Darbar, Dawood; Dörr, Marcus; Holst, Anders G.; Svendsen, Jesper H.; Hofman, Albert; Uitterlinden, Andre G.; Gudnason, Vilmundur; Isobe, Mitsuaki; Malik, Rainer; Dichgans, Martin; Rosand, Jonathan; Van Wagoner, David R.; Benjamin, Emelia J.; Milan, David J.; Melander, Olle; Heckbert, Susan R.; Ford, Ian; Liu, Yongmei; Barnard, John; Olesen, Morten S.; Stricker, Bruno H.C.; Tanaka, Toshihiro; Kääb, Stefan; Ellinor, Patrick T.

    2014-01-01

    Background Atrial fibrillation (AF) affects over 30 million individuals worldwide and is associated with an increased risk of stroke, heart failure, and death. AF is highly heritable, yet the genetic basis for the arrhythmia remains incompletely understood. Methods & Results To identify new AF-related genes, we utilized a multifaceted approach, combining large-scale genotyping in two ethnically distinct populations, cis-eQTL mapping, and functional validation. Four novel loci were identified in individuals of European descent near the genes NEURL (rs12415501, RR=1.18, 95% CI 1.13–1.23, p=6.5×10⁻¹⁶), GJA1 (rs13216675, RR=1.10, 95% CI 1.06–1.14, p=2.2×10⁻⁸), TBX5 (rs10507248, RR=1.12, 95% CI 1.08–1.16, p=5.7×10⁻¹¹), and CAND2 (rs4642101, RR=1.10, 95% CI 1.06–1.14, p=9.8×10⁻⁹). In Japanese, novel loci were identified near NEURL (rs6584555, RR=1.32, 95% CI 1.26–1.39, p=2.0×10⁻²⁵) and CUX2 (rs6490029, RR=1.12, 95% CI 1.08–1.16, p=3.9×10⁻⁹). The top SNPs or their proxies were identified as cis-eQTLs for the genes CAND2 (p=2.6×10⁻¹⁹), GJA1 (p=2.66×10⁻⁶), and TBX5 (p=1.36×10⁻⁵). Knockdown of the zebrafish orthologs of NEURL and CAND2 resulted in prolongation of the atrial action potential duration (17% and 45%, respectively). Conclusions We have identified five novel loci for AF. Our results further expand the diversity of genetic pathways implicated in AF and provide novel molecular targets for future biological and pharmacological investigation. PMID:25124494

  18. Integrative multi-platform meta-analysis of gene expression profiles in pancreatic ductal adenocarcinoma patients for identifying novel diagnostic biomarkers.

    PubMed

    Irigoyen, Antonio; Jimenez-Luna, Cristina; Benavides, Manuel; Caba, Octavio; Gallego, Javier; Ortuño, Francisco Manuel; Guillen-Ponce, Carmen; Rojas, Ignacio; Aranda, Enrique; Torres, Carolina; Prados, Jose

    2018-01-01

    Applying differentially expressed genes (DEGs) to identify feasible biomarkers in diseases can be a hard task when working with heterogeneous datasets. Expression data are strongly influenced by technology, sample preparation processes, and/or labeling methods. The proliferation of different microarray platforms for measuring gene expression increases the need to develop models able to compare their results, especially when different technologies can lead to signal values that vary greatly. Integrative meta-analysis can significantly improve the reliability and robustness of DEG detection. The objective of this work was to develop an integrative approach for identifying potential cancer biomarkers by integrating gene expression data from two different platforms. Pancreatic ductal adenocarcinoma (PDAC), where there is an urgent need to find new biomarkers due to its late diagnosis, is an ideal candidate for testing this technology. Expression data from two different datasets, namely Affymetrix and Illumina (18 and 36 PDAC patients, respectively), as well as from 18 healthy controls, were used for this study. A meta-analysis based on an empirical Bayesian methodology (ComBat) was then proposed to integrate these datasets. DEGs were finally identified from the integrated data by using the statistical programming language R. After our integrative meta-analysis, 5 genes were commonly identified within the individual analyses of the independent datasets. In addition, 28 novel genes that were not reported by the individual analyses ('gained' genes) were discovered. Several of these gained genes have already been related to other gastroenterological tumors. The proposed integrative meta-analysis has revealed novel DEGs that may play an important role in PDAC and could be potential biomarkers for diagnosing the disease.

  19. Meta-analysis identifies gene-by-environment interactions as demonstrated in a study of 4,965 mice.

    PubMed

    Kang, Eun Yong; Han, Buhm; Furlotte, Nicholas; Joo, Jong Wha J; Shih, Diana; Davis, Richard C; Lusis, Aldons J; Eskin, Eleazar

    2014-01-01

    Identifying environmentally-specific genetic effects is a key challenge in understanding the structure of complex traits. Model organisms play a crucial role in the identification of such gene-by-environment interactions, as a result of the unique ability to observe genetically similar individuals across multiple distinct environments. Many model organism studies examine the same traits but under varying environmental conditions. For example, knock-out or diet-controlled studies are often used to examine cholesterol in mice. These studies, when examined in aggregate, provide an opportunity to identify genomic loci exhibiting environmentally-dependent effects. However, the straightforward application of traditional methodologies to aggregate separate studies suffers from several problems. First, environmental conditions are often variable and do not fit the standard univariate model for interactions. Additionally, applying a multivariate model results in increased degrees of freedom and low statistical power. In this paper, we jointly analyze multiple studies with varying environmental conditions using a meta-analytic approach based on a random effects model to identify loci involved in gene-by-environment interactions. Our approach is motivated by the observation that methods for discovering gene-by-environment interactions are closely related to random effects models for meta-analysis. We show that interactions can be interpreted as heterogeneity and can be detected without utilizing the traditional uni- or multi-variate approaches for discovery of gene-by-environment interactions. We apply our new method to combine 17 mouse studies containing in aggregate 4,965 distinct animals. We identify 26 significant loci involved in High-density lipoprotein (HDL) cholesterol, many of which are consistent with previous findings. Several of these loci show significant evidence of involvement in gene-by-environment interactions. An additional advantage of our meta
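
    The idea that interactions show up as heterogeneity across studies can be illustrated with textbook meta-analysis quantities. The sketch below computes the inverse-variance pooled effect, Cochran's Q, and the DerSimonian-Laird between-study variance for one locus. These standard estimators are used here for illustration only; the statistic actually used in the paper may differ. The effect sizes and standard errors are hypothetical.

```python
def heterogeneity(effects, std_errors):
    """Return (pooled effect, Cochran's Q, DerSimonian-Laird tau^2).

    effects:    per-study effect size estimates for one locus
    std_errors: their standard errors
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    w_sum = sum(weights)
    # Fixed-effects (inverse-variance weighted) pooled estimate.
    pooled = sum(w * b for w, b in zip(weights, effects)) / w_sum
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))
    k = len(effects)
    c = w_sum - sum(w * w for w in weights) / w_sum
    # Method-of-moments estimate of between-study variance, floored at 0.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    return pooled, q, tau2

# Hypothetical locus: the third study (say, a different diet) disagrees.
pooled, q, tau2 = heterogeneity([0.10, 0.12, 0.50], [0.05, 0.05, 0.05])

# Hypothetical locus with the same effect in every environment.
_, q_same, tau2_same = heterogeneity([0.2, 0.2, 0.2], [0.05, 0.05, 0.05])
```

    A large Q (and a positive tau^2) flags a locus whose effect varies between studies run under different conditions, which is the signal the abstract interprets as a gene-by-environment interaction.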

  20. Meta-Analysis Identifies Gene-by-Environment Interactions as Demonstrated in a Study of 4,965 Mice

    PubMed Central

    Joo, Jong Wha J.; Shih, Diana; Davis, Richard C.; Lusis, Aldons J.; Eskin, Eleazar

    2014-01-01

    Identifying environmentally-specific genetic effects is a key challenge in understanding the structure of complex traits. Model organisms play a crucial role in the identification of such gene-by-environment interactions, as a result of the unique ability to observe genetically similar individuals across multiple distinct environments. Many model organism studies examine the same traits but under varying environmental conditions. For example, knock-out or diet-controlled studies are often used to examine cholesterol in mice. These studies, when examined in aggregate, provide an opportunity to identify genomic loci exhibiting environmentally-dependent effects. However, the straightforward application of traditional methodologies to aggregate separate studies suffers from several problems. First, environmental conditions are often variable and do not fit the standard univariate model for interactions. Additionally, applying a multivariate model results in increased degrees of freedom and low statistical power. In this paper, we jointly analyze multiple studies with varying environmental conditions using a meta-analytic approach based on a random effects model to identify loci involved in gene-by-environment interactions. Our approach is motivated by the observation that methods for discovering gene-by-environment interactions are closely related to random effects models for meta-analysis. We show that interactions can be interpreted as heterogeneity and can be detected without utilizing the traditional uni- or multi-variate approaches for discovery of gene-by-environment interactions. We apply our new method to combine 17 mouse studies containing in aggregate 4,965 distinct animals. We identify 26 significant loci involved in High-density lipoprotein (HDL) cholesterol, many of which are consistent with previous findings. Several of these loci show significant evidence of involvement in gene-by-environment interactions. An additional advantage of our meta

  1. Cross-species microarray hybridization to identify developmentally regulated genes in the filamentous fungus Sordaria macrospora.

    PubMed

    Nowrousian, Minou; Ringelberg, Carol; Dunlap, Jay C; Loros, Jennifer J; Kück, Ulrich

    2005-04-01

    The filamentous fungus Sordaria macrospora forms complex three-dimensional fruiting bodies that protect the developing ascospores and ensure their proper discharge. Several regulatory genes essential for fruiting body development were previously isolated by complementation of the sterile mutants pro1, pro11 and pro22. To establish the genetic relationships between these genes and to identify downstream targets, we have conducted cross-species microarray hybridizations using cDNA arrays derived from the closely related fungus Neurospora crassa and RNA probes prepared from wild-type S. macrospora and the three developmental mutants. Of the 1,420 genes which gave a signal with the probes from all the strains used, 172 (12%) were regulated differently in at least one of the three mutants compared to the wild type, and 17 (1.2%) were regulated differently in all three mutant strains. Microarray data were verified by Northern analysis or quantitative real time PCR. Among the genes that are up- or down-regulated in the mutant strains are genes encoding the pheromone precursors, enzymes involved in melanin biosynthesis and a lectin-like protein. Analysis of gene expression in double mutants revealed a complex network of interaction between the pro gene products.

  2. PhiSiGns: an online tool to identify signature genes in phages and design PCR primers for examining phage diversity

    PubMed Central

    2012-01-01

    Background Phages (viruses that infect bacteria) have gained significant attention because of their abundance, diversity and important ecological roles. However, the lack of a universal gene shared by all phages presents a challenge for phage identification and characterization, especially in environmental samples where it is difficult to culture phage-host systems. Homologous conserved genes (or "signature genes") present in groups of closely-related phages can be used to explore phage diversity and define evolutionary relationships amongst these phages. Bioinformatic approaches are needed to identify candidate signature genes and design PCR primers to amplify those genes from environmental samples; however, there is currently no existing computational tool that biologists can use for this purpose. Results Here we present PhiSiGns, a web-based and standalone application that performs a pairwise comparison of each gene present in user-selected phage genomes, identifies signature genes, generates alignments of these genes, and designs potential PCR primer pairs. PhiSiGns is available at (http://www.phantome.org/phisigns/; http://phisigns.sourceforge.net/) with a link to the source code. Here we describe the specifications of PhiSiGns and demonstrate its application with a case study. Conclusions PhiSiGns provides phage biologists with a user-friendly tool to identify signature genes and design PCR primers to amplify related genes from uncultured phages in environmental samples. This bioinformatics tool will facilitate the development of novel signature genes for use as molecular markers in studies of phage diversity, phylogeny, and evolution. PMID:22385976

  3. Comparison of Expression Profiles in Ovarian Epithelium In Vivo and Ovarian Cancer Identifies Novel Candidate Genes Involved in Disease Pathogenesis

    PubMed Central

    Emmanuel, Catherine; Gava, Natalie; Kennedy, Catherine; Balleine, Rosemary L.; Sharma, Raghwa; Wain, Gerard; Brand, Alison; Hogg, Russell; Etemadmoghadam, Dariush; George, Joshy; Birrer, Michael J.; Clarke, Christine L.; Chenevix-Trench, Georgia; Bowtell, David D. L.; Harnett, Paul R.; deFazio, Anna

    2011-01-01

    Molecular events leading to epithelial ovarian cancer are poorly understood, but ovulatory hormones and a high number of life-time ovulations, with concomitant proliferation, apoptosis, and inflammation, increase risk. We identified genes that are regulated during the estrous cycle in murine ovarian surface epithelium and analysed these profiles to identify genes dysregulated in human ovarian cancer, using publicly available datasets. We identified 338 genes that are regulated in murine ovarian surface epithelium during the estrous cycle and dysregulated in ovarian cancer. Six of seven candidates selected for immunohistochemical validation were expressed in serous ovarian cancer, inclusion cysts, ovarian surface epithelium and in fallopian tube epithelium. Most were overexpressed in ovarian cancer compared with ovarian surface epithelium and/or inclusion cysts (EpCAM, EZH2, BIRC5) although BIRC5 and EZH2 were expressed as highly in fallopian tube epithelium as in ovarian cancer. We prioritised the 338 genes for those likely to be important for ovarian cancer development by in silico analyses of copy number aberration and mutation using publicly available datasets and identified genes with established roles in ovarian cancer as well as novel genes for which we have evidence for involvement in ovarian cancer. Chromosome segregation emerged as an important process in which genes from our list of 338 were over-represented including two (BUB1, NCAPD2) for which there is evidence of amplification and mutation. NUAK2, upregulated in ovarian surface epithelium in proestrus and predicted to have a driver mutation in ovarian cancer, was examined in a larger cohort of serous ovarian cancer where patients with lower NUAK2 expression had shorter overall survival. In conclusion, defining genes that are activated in normal epithelium in the course of ovulation that are also dysregulated in cancer has identified a number of pathways and novel candidate genes that may contribute

  4. Comparative Transcriptome Analysis Identifies Putative Genes Involved in the Biosynthesis of Xanthanolides in Xanthium strumarium L.

    PubMed

    Li, Yuanjun; Gou, Junbo; Chen, Fangfang; Li, Changfu; Zhang, Yansheng

    2016-01-01

    Xanthium strumarium L. is a traditional Chinese herb belonging to the Asteraceae family. The major bioactive components of this plant are sesquiterpene lactones (STLs), which include the xanthanolides. To date, the biogenesis of xanthanolides, especially their downstream pathway, remains largely unknown. In X. strumarium, xanthanolides primarily accumulate in its glandular trichomes. To identify putative gene candidates involved in the biosynthesis of xanthanolides, three X. strumarium transcriptomes, which were derived from the young leaves of two different cultivars and the purified glandular trichomes from one of the cultivars, were constructed in this study. In total, 157 million clean reads were generated and assembled into 91,861 unigenes, of which 59,858 unigenes were successfully annotated. All the genes coding for known enzymes in the upstream pathway to the biosynthesis of xanthanolides were present in the X. strumarium transcriptomes. From a comparative analysis of the X. strumarium transcriptomes, this study identified a number of gene candidates that are putatively involved in the downstream pathway to the synthesis of xanthanolides, such as four unigenes encoding CYP71 P450s, 50 unigenes for dehydrogenases, and 27 genes for acetyltransferases. The possible functions of these four CYP71 candidates are extensively discussed. In addition, 116 transcription factors that are highly expressed in X. strumarium glandular trichomes were also identified. Their possible regulatory roles in the biosynthesis of STLs are discussed. The global transcriptomic data for X. strumarium should provide a valuable resource for further research into the biosynthesis of xanthanolides.

  5. A Systems Approach Identifies Networks and Genes Linking Sleep and Stress: Implications for Neuropsychiatric Disorders

    PubMed Central

    Jiang, Peng; Scarpa, Joseph R.; Fitzpatrick, Karrie; Losic, Bojan; Gao, Vance D.; Hao, Ke; Summa, Keith C.; Yang, He S.; Zhang, Bin; Allada, Ravi; Vitaterna, Martha H.; Turek, Fred W.; Kasarskis, Andrew

    2016-01-01

    SUMMARY Sleep dysfunction and stress susceptibility are co-morbid complex traits, which often precede and predispose patients to a variety of neuropsychiatric diseases. Here, we demonstrate multi-level organizations of genetic landscape, candidate genes, and molecular networks associated with 328 stress and sleep traits in a chronically stressed population of 338 (C57BL/6J×A/J) F2 mice. We constructed striatal gene co-expression networks, revealing functionally and cell-type specific gene co-regulations important for stress and sleep. Using a composite ranking system, we identified network modules most relevant for 15 independent phenotypic categories, highlighting a mitochondria/synaptic module that links sleep and stress. The key network regulators of this module are overrepresented with genes implicated in neuropsychiatric diseases. Our work suggests the interplay between sleep, stress, and neuropathology emerge from genetic influences on gene expression and their collective organization through complex molecular networks, providing a framework to interrogate the mechanisms underlying sleep, stress susceptibility, and related neuropsychiatric disorders. PMID:25921536

  6. Gene Manipulation Strategies to Identify Molecular Regulators of Axon Regeneration in the Central Nervous System

    PubMed Central

    Ribas, Vinicius T.; Costa, Marcos R.

    2017-01-01

    Limited axon regeneration in the injured adult mammalian central nervous system (CNS) usually results in irreversible functional deficits. Both the presence of extrinsic inhibitory molecules at the injury site and the intrinsically low capacity of adult neurons to grow axons are responsible for the diminished capacity of regeneration in the adult CNS. Conversely, in the embryonic CNS, neurons show a high regenerative capacity, mostly due to the expression of genes that positively control axon growth and downregulation of genes that inhibit axon growth. A better understanding of the role of these key genes controlling pro-regenerative mechanisms is pivotal to develop strategies to promote robust axon regeneration following adult CNS injury. Genetic manipulation techniques have been widely used to investigate the role of specific genes or a combination of different genes in axon regrowth. This review summarizes a myriad of studies that used genetic manipulations to promote axon growth in the injured CNS. We also review the roles of some of these genes during CNS development and suggest possible approaches to identify new candidate genes. Finally, we critically address the main advantages and pitfalls of gene-manipulation techniques, and discuss new strategies to promote robust axon regeneration in the mature CNS. PMID:28824380

  7. Novel genes identified in a high-density genome wide association study for nicotine dependence.

    PubMed

    Bierut, Laura Jean; Madden, Pamela A F; Breslau, Naomi; Johnson, Eric O; Hatsukami, Dorothy; Pomerleau, Ovide F; Swan, Gary E; Rutter, Joni; Bertelsen, Sarah; Fox, Louis; Fugman, Douglas; Goate, Alison M; Hinrichs, Anthony L; Konvicka, Karel; Martin, Nicholas G; Montgomery, Grant W; Saccone, Nancy L; Saccone, Scott F; Wang, Jen C; Chase, Gary A; Rice, John P; Ballinger, Dennis G

    2007-01-01

    Tobacco use is a leading contributor to disability and death worldwide, and genetic factors contribute in part to the development of nicotine dependence. To identify novel genes for which natural variation contributes to the development of nicotine dependence, we performed a comprehensive genome wide association study using nicotine dependent smokers as cases and non-dependent smokers as controls. To allow the efficient, rapid, and cost effective screen of the genome, the study was carried out using a two-stage design. In the first stage, genotyping of over 2.4 million single nucleotide polymorphisms (SNPs) was completed in case and control pools. In the second stage, we selected SNPs for individual genotyping based on the most significant allele frequency differences between cases and controls from the pooled results. Individual genotyping was performed in 1050 cases and 879 controls using 31 960 selected SNPs. The primary analysis, a logistic regression model with covariates of age, gender, genotype and gender by genotype interaction, identified 35 SNPs with P-values less than 10^-4 (minimum P-value 1.53 x 10^-6). Although none of the individual findings is statistically significant after correcting for multiple tests, additional statistical analyses support the existence of true findings in this group. Our study nominates several novel genes, such as Neurexin 1 (NRXN1), in the development of nicotine dependence while also identifying a known candidate gene, the beta3 nicotinic cholinergic receptor. This work anticipates the future directions of large-scale genome wide association studies with state-of-the-art methodological approaches and sharing of data with the scientific community.

  8. G-NEST: A gene neighborhood scoring tool to identify co-conserved, co-expressed genes

    USDA-ARS's Scientific Manuscript database

    In previous studies, gene neighborhoods--spatial clusters of co-expressed genes in the genome--have been defined using arbitrary rules such as requiring adjacency, a minimum number of genes, a fixed window size, or a minimum expression level. In the current study, we developed a Gene Neighborhood Sc...

  9. Expression profiling during ocular development identifies 2 Nlz genes with a critical role in optic fissure closure.

    PubMed

    Brown, Jacob D; Dutta, Sunit; Bharti, Kapil; Bonner, Robert F; Munson, Peter J; Dawid, Igor B; Akhtar, Amana L; Onojafe, Ighovie F; Alur, Ramakrishna P; Gross, Jeffrey M; Hejtmancik, J Fielding; Jiao, Xiaodong; Chan, Wai-Yee; Brooks, Brian P

    2009-02-03

    The gene networks underlying closure of the optic fissure during vertebrate eye development are poorly understood. Here, we profile global gene expression during optic fissure closure using laser capture microdissected (LCM) tissue from the margins of the fissure. From these data, we identify a unique role for the C2H2 zinc finger proteins Nlz1 and Nlz2 in normal fissure closure. Gene knockdown of nlz1 and/or nlz2 in zebrafish leads to a failure of the optic fissure to close, a phenotype which closely resembles that seen in human uveal coloboma. We also identify misregulation of pax2 in the developing eye of morphant fish, suggesting that Nlz1 and Nlz2 act upstream of the Pax2 pathway in directing proper closure of the optic fissure.

  10. Identifying biomarkers of papillary renal cell carcinoma associated with pathological stage by weighted gene co-expression network analysis.

    PubMed

    He, Zhongshi; Sun, Min; Ke, Yuan; Lin, Rongjie; Xiao, Youde; Zhou, Shuliang; Zhao, Hong; Wang, Yan; Zhou, Fuxiang; Zhou, Yunfeng

    2017-04-25

    Although papillary renal cell carcinoma (PRCC) accounts for 10%-15% of renal cell carcinoma (RCC), no predictive molecular biomarker is currently applicable to guiding disease stage of PRCC patients. The mRNASeq data of PRCC and adjacent normal tissue in The Cancer Genome Atlas was analyzed to identify 1148 differentially expressed genes, on which weighted gene co-expression network analysis was performed. Then 11 co-expressed gene modules were identified. The highest association was found between blue module and pathological stage (r = 0.45) by Pearson's correlation analysis. Functional enrichment analysis revealed that biological processes of blue module focused on nuclear division, cell cycle phase, and spindle (all P < 1e-10). All 40 hub genes in blue module can distinguish localized (pathological stage I, II) from non-localized (pathological stage III, IV) PRCC (P < 0.01). A good molecular biomarker for pathological stage of RCC must be a prognostic gene in clinical practice. Survival analysis was performed to reversely validate if hub genes were associated with pathological stage. Survival analysis unveiled that all hub genes were associated with patient prognosis (P < 0.01). The validation cohort GSE2748 verified that 30 hub genes can differentiate localized from non-localized PRCC (P < 0.01), and 18 hub genes are prognosis-associated (P < 0.01). ROC curve indicated that the 17 hub genes exhibited excellent diagnostic efficiency for localized and non-localized PRCC (AUC > 0.7). These hub genes may serve as a biomarker and help to distinguish different pathological stages for PRCC patients.

  11. Genexpi: a toolset for identifying regulons and validating gene regulatory networks using time-course expression data.

    PubMed

    Modrák, Martin; Vohradský, Jiří

    2018-04-13

    Identifying regulons of sigma factors is a vital subtask of gene network inference. Integrating multiple sources of data is essential for correct identification of regulons and complete gene regulatory networks. Time series of expression data measured with microarrays or RNA-seq combined with static binding experiments (e.g., ChIP-seq) or literature mining may be used for inference of sigma factor regulatory networks. We introduce Genexpi: a tool to identify sigma factors by combining candidates obtained from ChIP experiments or literature mining with time-course gene expression data. While Genexpi can be used to infer other types of regulatory interactions, it was designed and validated on real biological data from bacterial regulons. In this paper, we put primary focus on CyGenexpi: a plugin integrating Genexpi with the Cytoscape software for ease of use. As a part of this effort, a plugin for handling time series data in Cytoscape called CyDataseries has been developed and made available. Genexpi is also available as a standalone command line tool and an R package. Genexpi is a useful part of gene network inference toolbox. It provides meaningful information about the composition of regulons and delivers biologically interpretable results.

  12. Numerical classification of coding sequences

    NASA Technical Reports Server (NTRS)

    Collins, D. W.; Liu, C. C.; Jukes, T. H.

    1992-01-01

    DNA sequences coding for protein may be represented by counts of nucleotides or codons. A complete reading frame may be abbreviated by its base count, e.g. A76C158G121T74, or with the corresponding codon table, e.g. (AAA)0(AAC)1(AAG)9 ... (TTT)0. We propose that these numerical designations be used to augment current methods of sequence annotation. Because base counts and codon tables do not require revision as knowledge of function evolves, they are well-suited to act as cross-references, for example to identify redundant GenBank entries. These descriptors may be compared, in place of DNA sequences, to extract homologous genes from large databases. This approach permits rapid searching with good selectivity.
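
    The two numerical designations described above are easy to compute. The sketch below builds a base-count label in the A..C..G..T.. form quoted in the abstract and a 64-entry codon table for a reading frame. The function names and the short example sequence are hypothetical, and the code assumes the input contains only the letters A, C, G, and T and starts in frame.

```python
from collections import Counter
from itertools import product

def base_count_label(seq):
    """Abbreviate a sequence by its base counts, e.g. 'A76C158G121T74'."""
    counts = Counter(seq.upper())
    return "".join(f"{base}{counts[base]}" for base in "ACGT")

def codon_table(seq):
    """Count every one of the 64 codons in the reading frame starting at 0."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    all_codons = ("".join(c) for c in product("ACGT", repeat=3))
    return {codon: counts[codon] for codon in all_codons}

# Hypothetical four-codon reading frame.
orf = "ATGGCCAAGTAA"
label = base_count_label(orf)
table = codon_table(orf)
```

    Because two entries for the same coding sequence yield identical labels and tables, comparing these compact descriptors is a cheap first pass for spotting redundant database entries before any full sequence comparison.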

  13. Cry-Bt identifier: a biological database for PCR detection of Cry genes present in transgenic plants.

    PubMed

    Singh, Vinay Kumar; Ambwani, Sonu; Marla, Soma; Kumar, Anil

    2009-10-23

    We describe the development of a user friendly tool that would assist in the retrieval of information relating to Cry genes in transgenic crops. The tool also helps in detection of transformed Cry genes from Bacillus thuringiensis present in transgenic plants by providing suitable designed primers for PCR identification of these genes. The tool designed based on relational database model enables easy retrieval of information from the database with simple user queries. The tool also enables users to access related information about Cry genes present in various databases by interacting with different sources (nucleotide sequences, protein sequence, sequence comparison tools, published literature, conserved domains, evolutionary and structural data). http://insilicogenomics.in/Cry-btIdentifier/welcome.html.

  14. Network-Based Method for Identifying Co-Regeneration Genes in Bone, Dentin, Nerve and Vessel Tissues

    PubMed Central

    Pan, Hongying; Zhang, Yu-Hang; Feng, Kaiyan; Kong, XiangYin; Cai, Yu-Dong

    2017-01-01

    Bone and dental diseases are serious public health problems. Most current clinical treatments for these diseases can produce side effects. Regeneration is a promising therapy for bone and dental diseases, yielding natural tissue recovery with few side effects. Because soft tissues inside the bone and dentin are densely populated with nerves and vessels, the study of bone and dentin regeneration should also consider the co-regeneration of nerves and vessels. In this study, a network-based method to identify co-regeneration genes for bone, dentin, nerve and vessel was constructed based on an extensive network of protein–protein interactions. Three procedures were applied in the network-based method. The first procedure, searching, sought the shortest paths connecting regeneration genes of one tissue type with regeneration genes of other tissues, thereby extracting possible co-regeneration genes. The second procedure, testing, employed a permutation test to evaluate whether possible genes were false discoveries; these genes were excluded by the testing procedure. The last procedure, screening, employed two rules, the betweenness ratio rule and interaction score rule, to select the most essential genes. A total of seventeen genes were inferred by the method, which were deemed to contribute to co-regeneration of at least two tissues. All these seventeen genes were extensively discussed to validate the utility of the method. PMID:28974058
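
    The searching procedure described above can be sketched with a breadth-first search on a small interaction graph. The example below collects the interior genes on a shortest path between known regeneration genes of two tissues. It is a minimal sketch under stated assumptions: the network and gene names are hypothetical, only one shortest path per gene pair is followed, and the permutation test and the two screening rules are not implemented.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Return one shortest path from start to goal via breadth-first search."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None  # goal not reachable from start

def candidate_genes(graph, genes_a, genes_b):
    """Collect interior genes on shortest paths between two gene sets."""
    known = set(genes_a) | set(genes_b)
    candidates = set()
    for a in genes_a:
        for b in genes_b:
            path = shortest_path(graph, a, b)
            if path:
                candidates.update(g for g in path[1:-1] if g not in known)
    return candidates

# Hypothetical toy network: B1 is a bone gene, N1 a nerve gene.
ppi = {
    "B1": ["X", "Y"],
    "X": ["B1", "N1"],
    "Y": ["B1", "Z"],
    "Z": ["Y", "N1"],
    "N1": ["X", "Z"],
}
found = candidate_genes(ppi, ["B1"], ["N1"])
```

    In this toy network the route through X is shorter than the one through Y and Z, so only X is returned as a possible co-regeneration gene. A real analysis would then apply the testing and screening procedures to filter such candidates.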

  15. Network-Based Method for Identifying Co-Regeneration Genes in Bone, Dentin, Nerve and Vessel Tissues.

    PubMed

    Chen, Lei; Pan, Hongying; Zhang, Yu-Hang; Feng, Kaiyan; Kong, XiangYin; Huang, Tao; Cai, Yu-Dong

    2017-10-02

    Bone and dental diseases are serious public health problems. Most current clinical treatments for these diseases can produce side effects. Regeneration is a promising therapy for bone and dental diseases, yielding natural tissue recovery with few side effects. Because soft tissues inside the bone and dentin are densely populated with nerves and vessels, the study of bone and dentin regeneration should also consider the co-regeneration of nerves and vessels. In this study, a network-based method to identify co-regeneration genes for bone, dentin, nerve and vessel was constructed based on an extensive network of protein-protein interactions. Three procedures were applied in the network-based method. The first procedure, searching, sought the shortest paths connecting regeneration genes of one tissue type with regeneration genes of other tissues, thereby extracting possible co-regeneration genes. The second procedure, testing, employed a permutation test to evaluate whether possible genes were false discoveries; these genes were excluded by the testing procedure. The last procedure, screening, employed two rules, the betweenness ratio rule and interaction score rule, to select the most essential genes. A total of seventeen genes were inferred by the method, which were deemed to contribute to co-regeneration of at least two tissues. All these seventeen genes were extensively discussed to validate the utility of the method.

  16. De Novo Assembly of the Japanese Flounder (Paralichthys olivaceus) Spleen Transcriptome to Identify Putative Genes Involved in Immunity

    PubMed Central

    Huang, Lin; Li, Guiyang; Mo, Zhaolan; Xiao, Peng; Li, Jie; Huang, Jie

    2015-01-01

    Background Japanese flounder (Paralichthys olivaceus) is an economically important marine fish in Asia and has suffered from disease outbreaks caused by various pathogens, which creates a need for more genome-level information on immune-relevant genes. However, genomic and transcriptomic data for Japanese flounder remain scarce, which limits studies on the immune system of this species. In this study, we characterized the Japanese flounder spleen transcriptome using an Illumina paired-end sequencing platform to identify putative genes involved in immunity. Methodology/Principal Findings A cDNA library from the spleen of P. olivaceus was constructed and randomly sequenced using an Illumina technique. The removal of low quality reads generated 12,196,968 trimmed reads, which assembled into 96,627 unigenes. A total of 21,391 unigenes (22.14%) were annotated in the NCBI Nr database, and only 1.1% of the BLASTx top-hits matched P. olivaceus protein sequences. Approximately 12,503 (58.45%) unigenes were categorized into three Gene Ontology groups, 19,547 (91.38%) were classified into 26 Cluster of Orthologous Groups, and 10,649 (49.78%) were assigned to six Kyoto Encyclopedia of Genes and Genomes pathways. Furthermore, 40,928 putative simple sequence repeats and 47,362 putative single nucleotide polymorphisms were identified. Importantly, we identified 1,563 putative immune-associated unigenes that mapped to 15 immune signaling pathways. Conclusions/Significance The P. olivaceus transcriptome data provide a rich source to discover and identify new genes, and the immune-relevant sequences identified here will facilitate our understanding of the mechanisms involved in the immune response. Furthermore, the plentiful potential SSRs and SNPs found in this study are important resources with respect to future development of a linkage map or marker assisted breeding programs for the flounder. PMID:25723398

  17. A recessive contiguous gene deletion causing infantile hyperinsulinism, enteropathy and deafness identifies the Usher type 1C gene.

    PubMed

    Bitner-Glindzicz, M; Lindley, K J; Rutland, P; Blaydon, D; Smith, V V; Milla, P J; Hussain, K; Furth-Lavi, J; Cosgrove, K E; Shepherd, R M; Barnes, P D; O'Brien, R E; Farndon, P A; Sowden, J; Liu, X Z; Scanlan, M J; Malcolm, S; Dunne, M J; Aynsley-Green, A; Glaser, B

    2000-09-01

    Usher syndrome type 1 describes the association of profound, congenital sensorineural deafness, vestibular hypofunction and childhood onset retinitis pigmentosa. It is an autosomal recessive condition and is subdivided on the basis of linkage analysis into types 1A through 1E. Usher type 1C maps to the region containing the genes ABCC8 and KCNJ11 (encoding components of ATP-sensitive K+ (KATP) channels), which may be mutated in patients with hyperinsulinism. We identified three individuals from two consanguineous families with severe hyperinsulinism, profound congenital sensorineural deafness, enteropathy and renal tubular dysfunction. The molecular basis of the disorder is a homozygous 122-kb deletion of 11p14-15, which includes part of ABCC8 and overlaps with the locus for Usher syndrome type 1C and DFNB18. The centromeric boundary of this deletion includes part of a gene shown to be mutated in families with type 1C Usher syndrome, and is hence assigned the name USH1C. The pattern of expression of the USH1C protein is consistent with the clinical features exhibited by individuals with the contiguous gene deletion and with isolated Usher type 1C.

  18. Correlational analysis for identifying genes whose regulation contributes to chronic neuropathic pain

    PubMed Central

    Persson, Anna-Karin; Gebauer, Mathias; Jordan, Suzana; Metz-Weidmann, Christiane; Schulte, Anke M; Schneider, Hans-Christoph; Ding-Pfennigdorff, Danping; Thun, Jonas; Xu, Xiao-Jun; Wiesenfeld-Hallin, Zsuzsanna; Darvasi, Ariel; Fried, Kaj; Devor, Marshall

    2009-01-01

    Background Nerve injury-triggered hyperexcitability in primary sensory neurons is considered a major source of chronic neuropathic pain. The hyperexcitability, in turn, is thought to be related to transcriptional switching in afferent cell somata. Analysis using expression microarrays has revealed that many genes are regulated in the dorsal root ganglion (DRG) following axotomy. But which contribute to pain phenotype versus other nerve injury-evoked processes such as nerve regeneration? Using the L5 spinal nerve ligation model of neuropathy we examined differential changes in gene expression in the L5 (and L4) DRGs in five mouse strains with contrasting susceptibility to neuropathic pain. We sought genes for which the degree of regulation correlates with strain-specific pain phenotype. Results In an initial experiment six candidate genes previously identified as important in pain physiology were selected for in situ hybridization to DRG sections. Among these, regulation of the Na+ channel α subunit Scn11a correlated with levels of spontaneous pain behavior, and regulation of the cool receptor Trpm8 correlated with heat hypersensibility. In a larger scale experiment, mRNA extracted from individual mouse DRGs was processed on Affymetrix whole-genome expression microarrays. Overall, 2552 ± 477 transcripts were significantly regulated in the axotomized L5DRG 3 days postoperatively. However, in only a small fraction of these was the degree of regulation correlated with pain behavior across strains. Very few genes in the "uninjured" L4DRG showed altered expression (24 ± 28). Conclusion Correlational analysis based on in situ hybridization provided evidence that differential regulation of Scn11a and Trpm8 contributes to across-strain variability in pain phenotype. This does not, of course, constitute evidence that the others are unrelated to pain. Correlational analysis based on microarray data yielded a larger "look-up table" of genes whose regulation likely

  19. Regulatory network analysis of Epstein-Barr virus identifies functional modules and hub genes involved in infectious mononucleosis.

    PubMed

    Poorebrahim, Mansour; Salarian, Ali; Najafi, Saeideh; Abazari, Mohammad Foad; Aleagha, Maryam Nouri; Dadras, Mohammad Nasr; Jazayeri, Seyed Mohammad; Ataei, Atousa; Poortahmasebi, Vahdat

    2017-05-01

    Epstein-Barr virus (EBV) is the most common cause of infectious mononucleosis (IM) and establishes lifetime infection associated with a variety of cancers and autoimmune diseases. The aim of this study was to develop an integrative gene regulatory network (GRN) approach and overlying gene expression data to identify the representative subnetworks for IM and EBV latent infection (LI). After identifying differentially expressed genes (DEGs) in both IM and LI gene expression profiles, functional annotations were applied using gene ontology (GO) and BiNGO tools, and construction of GRNs, topological analysis and identification of modules were carried out using several plugins of Cytoscape. In parallel, a human-EBV GRN was generated using the Hu-Vir database for further analyses. Our analysis revealed that the majority of DEGs in both IM and LI were involved in cell-cycle and DNA repair processes. However, these genes showed a significant negative correlation in the IM and LI states. Furthermore, cyclin-dependent kinase 2 (CDK2) - a hub gene with the highest centrality score - appeared to be the key player in cell cycle regulation in IM disease. The most significant functional modules in the IM and LI states were involved in the regulation of the cell cycle and apoptosis, respectively. Human-EBV network analysis revealed several direct targets of EBV proteins during IM disease. Our study provides an important first report on the response to IM/LI EBV infection in humans. An important aspect of our data was the upregulation of genes associated with cell cycle progression and proliferation.

  20. Relating genes to function: identifying enriched transcription factors using the ENCODE ChIP-Seq significance tool.

    PubMed

    Auerbach, Raymond K; Chen, Bin; Butte, Atul J

    2013-08-01

    Biological analysis has shifted from identifying genes and transcripts to mapping these genes and transcripts to biological functions. The ENCODE Project has generated hundreds of ChIP-Seq experiments spanning multiple transcription factors and cell lines for public use, but tools for a biomedical scientist to analyze these data are either non-existent or tailored to narrow biological questions. We present the ENCODE ChIP-Seq Significance Tool, a flexible web application leveraging public ENCODE data to identify enriched transcription factors in a gene or transcript list for comparative analyses. The ENCODE ChIP-Seq Significance Tool is written in JavaScript on the client side and has been tested on Google Chrome, Apple Safari and Mozilla Firefox browsers. Server-side scripts are written in PHP and leverage R and a MySQL database. The tool is available at http://encodeqt.stanford.edu. Contact: abutte@stanford.edu. Supplementary material is available at Bioinformatics online.
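
    The abstract does not say which statistical test the tool applies. As a rough illustration of what "enriched transcription factors in a gene list" can mean, the sketch below scores one factor with a one-sided hypergeometric test, a common choice for this kind of question. The function name `hypergeom_enrichment_p` and all numbers are hypothetical and are not taken from the tool or from ENCODE data.

```python
from math import comb


def hypergeom_enrichment_p(background, bound_in_background, list_size, bound_in_list):
    """One-sided hypergeometric p-value P(X >= bound_in_list).

    background          -- total number of genes considered (assumed value)
    bound_in_background -- genes in the background bound by the factor
    list_size           -- size of the user's gene list
    bound_in_list       -- genes in the list bound by the factor
    """
    max_hits = min(list_size, bound_in_background)
    total = comb(background, list_size)
    # Sum the probability of observing k hits for every k at least as
    # extreme as the observed count. math.comb returns 0 when k > n,
    # so impossible configurations contribute nothing.
    tail = sum(
        comb(bound_in_background, k)
        * comb(background - bound_in_background, list_size - k)
        for k in range(bound_in_list, max_hits + 1)
    )
    return tail / total


# Hypothetical example: 20,000 background genes, 1,000 of them bound by the
# factor, and a 50-gene list in which 10 genes are bound.
p_value = hypergeom_enrichment_p(20000, 1000, 50, 10)
print(f"enrichment p-value: {p_value:.3g}")
```

    In practice one such p-value would be computed per transcription factor and then corrected for multiple testing; that step is omitted here.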

  1. Fine Mapping of a Clubroot Resistance Gene in Chinese Cabbage Using SNP Markers Identified from Bulked Segregant RNA Sequencing

    PubMed Central

    Huang, Zhen; Peng, Gary; Liu, Xunjia; Deora, Abhinandan; Falk, Kevin C.; Gossen, Bruce D.; McDonald, Mary R.; Yu, Fengqun

    2017-01-01

    Clubroot, caused by Plasmodiophora brassicae, is an important disease of canola (Brassica napus) in western Canada and worldwide. In this study, a clubroot resistance gene (Rcr2) was identified and fine mapped in Chinese cabbage cv. “Jazz” using single-nucleotide polymorphism (SNP) markers identified from bulked segregant RNA sequencing (BSR-Seq), and molecular markers were developed for use in marker-assisted selection. In total, 203.9 million raw reads were generated from one pooled resistant (R) and one pooled susceptible (S) sample, and >173,000 polymorphic SNP sites were identified between the R and S samples. One significant peak was observed between 22 and 26 Mb of chromosome A03, which had been predicted by BSR-Seq to contain the causal gene Rcr2. There were 490 polymorphic SNP sites identified in the region. A segregating population consisting of 675 plants was analyzed with 15 SNP sites in the region using the Kompetitive Allele Specific PCR method, and Rcr2 was fine mapped between two SNP markers, SNP_A03_32 and SNP_A03_67, located 0.1 and 0.3 cM from Rcr2, respectively. Five SNP markers co-segregated with Rcr2 in this region. Variants were identified in 14 of 36 genes annotated in the Rcr2 target region. The numbers of polymorphic variants differed among the genes. Four genes encode TIR-NBS-LRR proteins, and two of them, Bra019410 and Bra019413, had high numbers of polymorphic variants and so are the most likely candidates for Rcr2. PMID:28894454

  2. Mango (Mangifera indica L.) cv. Kent fruit mesocarp de novo transcriptome assembly identifies gene families important for ripening

    PubMed Central

    Dautt-Castro, Mitzuko; Ochoa-Leyva, Adrian; Contreras-Vergara, Carmen A.; Pacheco-Sanchez, Magda A.; Casas-Flores, Sergio; Sanchez-Flores, Alejandro; Kuhn, David N.; Islas-Osuna, Maria A.

    2015-01-01

    Fruit ripening is a physiological and biochemical process genetically programmed to regulate fruit quality parameters like firmness, flavor, odor and color, as well as production of ethylene in climacteric fruit. In this study, a transcriptomic analysis of mango (Mangifera indica L.) mesocarp cv. “Kent” was done to identify key genes associated with fruit ripening. Using the Illumina sequencing platform, 67,682,269 clean reads and a transcriptome of 4.8 Gb were obtained. A total of 33,142 coding sequences were predicted and, after functional annotation, 25,154 protein sequences were assigned a product according to the Swiss-Prot database and 32,560 according to the non-redundant database. Differential expression analysis identified 2,306 genes with significant differences in expression between mature-green and ripe mango [1,178 up-regulated and 1,128 down-regulated (FDR ≤ 0.05)]. The expression of 10 genes evaluated by both qRT-PCR and RNA-seq data was highly correlated (R = 0.97), validating the differential expression data from RNA-seq alone. Gene Ontology enrichment analysis showed significantly represented terms associated with fruit ripening, such as “cell wall,” “carbohydrate catabolic process” and “starch and sucrose metabolic process,” among others. Mango genes were assigned to 327 metabolic pathways according to the Kyoto Encyclopedia of Genes and Genomes database, among them those involved in fruit ripening such as plant hormone signal transduction, starch and sucrose metabolism, galactose metabolism, terpenoid backbone, and carotenoid biosynthesis. This study provides a mango transcriptome that will be very helpful to identify genes for expression studies in early and late flowering mangos during fruit ripening. PMID:25741352

  3. A genome-wide inducible phenotypic screen identifies antisense RNA constructs silencing Escherichia coli essential genes.

    PubMed

    Meng, Jia; Kanzaki, Gregory; Meas, Diane; Lam, Christopher K; Crummer, Heather; Tain, Justina; Xu, H Howard

    2012-04-01

    Regulated antisense RNA (asRNA) expression has been employed successfully in Gram-positive bacteria for genome-wide essential gene identification and drug target determination. However, there have been no published reports describing the application of asRNA gene silencing for comprehensive analyses of essential genes in Gram-negative bacteria. In this study, we report the first genome-wide identification of asRNA constructs for essential genes in Escherichia coli. We screened 250 000 library transformants for conditional growth inhibitory recombinant clones from two shotgun genomic libraries of E. coli using a paired-termini expression vector (pHN678). After sequencing plasmid inserts of 675 confirmed inducer sensitive cell clones, we identified 152 separate asRNA constructs of which 134 inserts came from essential genes, while 18 originated from nonessential genes (but share operons with essential genes). Among the 79 individual essential genes silenced by these asRNA constructs, 61 genes (77%) engage in processes related to protein synthesis. The cell-based assays of an asRNA clone targeting fusA (encoding elongation factor G) showed that the induced cells were sensitized 12-fold to fusidic acid, a known specific inhibitor. Our results demonstrate the utility of the paired-termini expression vector and feasibility of large-scale gene silencing in E. coli using regulated asRNA expression. © 2012 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.

  4. Integration of mouse and human genome-wide association data identifies KCNIP4 as an asthma gene.

    PubMed

    Himes, Blanca E; Sheppard, Keith; Berndt, Annerose; Leme, Adriana S; Myers, Rachel A; Gignoux, Christopher R; Levin, Albert M; Gauderman, W James; Yang, James J; Mathias, Rasika A; Romieu, Isabelle; Torgerson, Dara G; Roth, Lindsey A; Huntsman, Scott; Eng, Celeste; Klanderman, Barbara; Ziniti, John; Senter-Sylvia, Jody; Szefler, Stanley J; Lemanske, Robert F; Zeiger, Robert S; Strunk, Robert C; Martinez, Fernando D; Boushey, Homer; Chinchilli, Vernon M; Israel, Elliot; Mauger, David; Koppelman, Gerard H; Postma, Dirkje S; Nieuwenhuis, Maartje A E; Vonk, Judith M; Lima, John J; Irvin, Charles G; Peters, Stephen P; Kubo, Michiaki; Tamari, Mayumi; Nakamura, Yusuke; Litonjua, Augusto A; Tantisira, Kelan G; Raby, Benjamin A; Bleecker, Eugene R; Meyers, Deborah A; London, Stephanie J; Barnes, Kathleen C; Gilliland, Frank D; Williams, L Keoki; Burchard, Esteban G; Nicolae, Dan L; Ober, Carole; DeMeo, Dawn L; Silverman, Edwin K; Paigen, Beverly; Churchill, Gary; Shapiro, Steve D; Weiss, Scott T

    2013-01-01

    Asthma is a common chronic respiratory disease characterized by airway hyperresponsiveness (AHR). The genetics of asthma have been widely studied in mouse and human, and homologous genomic regions have been associated with mouse AHR and human asthma-related phenotypes. Our goal was to identify asthma-related genes by integrating AHR associations in mouse with human genome-wide association study (GWAS) data. We used Efficient Mixed Model Association (EMMA) analysis to conduct a GWAS of baseline AHR measures from males and females of 31 mouse strains. Genes near or containing SNPs with EMMA p-values <0.001 were selected for further study in human GWAS. The results of the previously reported EVE consortium asthma GWAS meta-analysis consisting of 12,958 diverse North American subjects from 9 study centers were used to select a subset of homologous genes with evidence of association with asthma in humans. Following validation attempts in three human asthma GWAS (i.e., Sepracor/LOCCS/LODO/Illumina, GABRIEL, DAG) and two human AHR GWAS (i.e., SHARP, DAG), the Kv channel interacting protein 4 (KCNIP4) gene was identified as nominally associated with both asthma and AHR at a gene- and SNP-level. In EVE, the smallest KCNIP4 association was at rs6833065 (P-value 2.9e-04), while the strongest associations for Sepracor/LOCCS/LODO/Illumina, GABRIEL, DAG were 1.5e-03, 1.0e-03, 3.1e-03 at rs7664617, rs4697177, rs4696975, respectively. At a SNP level, the strongest association across all asthma GWAS was at rs4697177 (P-value 1.1e-04). The smallest P-values for association with AHR were 2.3e-03 at rs11947661 in SHARP and 2.1e-03 at rs402802 in DAG. Functional studies are required to validate the potential involvement of KCNIP4 in modulating asthma susceptibility and/or AHR. Our results suggest that a useful approach to identify genes associated with human asthma is to leverage mouse AHR association data.

  5. Unique Trichomonas vaginalis gene sequences identified in multinational regions of Northwest China.

    PubMed

    Liu, Jun; Feng, Meng; Wang, Xiaolan; Fu, Yongfeng; Ma, Cailing; Cheng, Xunjia

    2017-07-24

    Trichomonas vaginalis (T. vaginalis) is a flagellated protozoan parasite that infects humans worldwide. This study determined the sequence of the 18S ribosomal RNA gene of T. vaginalis infecting both females and males in Xinjiang, China. Samples from 73 females and 28 males were collected and confirmed for infection with T. vaginalis. A total of 110 sequences were identified when the T. vaginalis 18S ribosomal RNA gene was sequenced. These sequences were used to prepare a phylogenetic network. The rooted network comprised three large clades and several independent branches. Most of the Xinjiang sequences were in one group. Preliminary results suggest that Xinjiang T. vaginalis isolates might be genetically unique, as indicated by the sequence of their 18S ribosomal RNA gene. The low migration rate of local people in this province may contribute to the genetic conservativeness of T. vaginalis. The unique genetic feature of our isolates may suggest a different clinical presentation of trichomoniasis, including metronidazole susceptibility, T. vaginalis virus or Mycoplasma co-infection characteristics. The transmission and evolution of Xinjiang T. vaginalis is of interest and should be studied further. More attention should be given to T. vaginalis infection in both females and males in Xinjiang.

  6. SpeCond: a method to detect condition-specific gene expression

    PubMed Central

    2011-01-01

    Transcriptomic studies routinely measure expression levels across numerous conditions. These datasets allow identification of genes that are specifically expressed in a small number of conditions. However, there are currently no statistically robust methods for identifying such genes. Here we present SpeCond, a method to detect condition-specific genes that outperforms alternative approaches. We apply the method to a dataset of 32 human tissues to determine 2,673 specifically expressed genes. An implementation of SpeCond is freely available as a Bioconductor package at http://www.bioconductor.org/packages/release/bioc/html/SpeCond.html. PMID:22008066

  7. Machine-learning approach identifies a pattern of gene expression in peripheral blood that can accurately detect ischaemic stroke

    PubMed Central

    O’Connell, Grant C; Petrone, Ashley B; Treadway, Madison B; Tennant, Connie S; Lucke-Wold, Noelle; Chantler, Paul D; Barr, Taura L

    2016-01-01

    Early and accurate diagnosis of stroke improves the probability of positive outcome. The objective of this study was to identify a pattern of gene expression in peripheral blood that could potentially be optimised to expedite the diagnosis of acute ischaemic stroke (AIS). A discovery cohort was recruited consisting of 39 AIS patients and 24 neurologically asymptomatic controls. Peripheral blood was sampled at emergency department admission, and genome-wide expression profiling was performed via microarray. A machine-learning technique known as genetic algorithm k-nearest neighbours (GA/kNN) was then used to identify a pattern of gene expression that could optimally discriminate between groups. This pattern of expression was then assessed via qRT-PCR in an independent validation cohort, where it was evaluated for its ability to discriminate between an additional 39 AIS patients and 30 neurologically asymptomatic controls, as well as 20 acute stroke mimics. GA/kNN identified 10 genes (ANTXR2, STK3, PDK4, CD163, MAL, GRAP, ID3, CTSZ, KIF1B and PLXDC2) whose coordinate pattern of expression was able to identify 98.4% of discovery cohort subjects correctly (97.4% sensitive, 100% specific). In the validation cohort, the expression levels of the same 10 genes were able to identify 95.6% of subjects correctly when comparing AIS patients to asymptomatic controls (92.3% sensitive, 100% specific), and 94.9% of subjects correctly when comparing AIS patients with stroke mimics (97.4% sensitive, 90.0% specific). The transcriptional pattern identified in this study shows strong diagnostic potential, and warrants further evaluation to determine its true clinical efficacy. PMID:29263821
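
    The reported percentages can be cross-checked with a short calculation. The sketch below is illustrative only: the confusion-matrix counts are back-calculated from the cohort sizes and percentages in the abstract (an assumption, not figures stated by the authors), and the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as fractions."""
    sensitivity = tp / (tp + fn)   # share of true cases detected
    specificity = tn / (tn + fp)   # share of non-cases correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts, inferred from the abstract rather than reported in it.
# Discovery cohort: 39 AIS patients (38 detected), 24 controls (all correct).
discovery = diagnostic_metrics(tp=38, fn=1, tn=24, fp=0)
# Validation, AIS vs. stroke mimics: 39 AIS patients (38 detected),
# 20 mimics (18 correctly rejected).
mimics = diagnostic_metrics(tp=38, fn=1, tn=18, fp=2)

for name, (sens, spec, acc) in [("discovery", discovery), ("mimics", mimics)]:
    print(f"{name}: sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

    Under these assumed counts the discovery cohort gives 38/39 sensitivity, 24/24 specificity and 62/63 overall accuracy, which round to the 97.4%, 100% and 98.4% quoted above.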

  8. A hierarchical approach employing metabolic and gene expression profiles to identify the pathways that confer cytotoxicity in HepG2 cells

    PubMed Central

    Li, Zheng; Srivastava, Shireesh; Yang, Xuerui; Mittal, Sheenu; Norton, Paul; Resau, James; Haab, Brian; Chan, Christina

    2007-01-01

    Background Free fatty acids (FFA) and tumor necrosis factor alpha (TNF-α) have been implicated in the pathogenesis of many obesity-related metabolic disorders. When human hepatoblastoma cells (HepG2) were exposed to different types of FFA and TNF-α, saturated fatty acid was found to be cytotoxic and its toxicity was exacerbated by TNF-α. In order to identify the processes associated with the toxicity of saturated FFA and TNF-α, the metabolic and gene expression profiles were measured to characterize the cellular states. A computational model was developed to integrate these disparate data to reveal the underlying pathways and mechanisms involved in saturated fatty acid toxicity. Results A hierarchical framework consisting of three stages was developed to identify the processes and genes that regulate the toxicity. First, discriminant analysis identified that fatty acid oxidation and intracellular triglyceride accumulation were the most relevant in differentiating the cytotoxic phenotype. Second, gene set enrichment analysis (GSEA) was applied to the cDNA microarray data to identify the transcriptionally altered pathways and processes. Finally, the genes and gene sets that regulate the metabolic responses identified in step 1 were identified by integrating the expression of the enriched gene sets and the metabolic profiles with a multi-block partial least squares (MBPLS) regression model. Conclusion The hierarchical approach suggested potential mechanisms involved in mediating the cytotoxic and cytoprotective pathways, and identified novel targets, such as NADH dehydrogenases, aldehyde dehydrogenase 1A1 (ALDH1A1) and endothelial membrane protein 3 (EMP3), as modulators of the toxic phenotypes. These predictions, as well as some specific targets suggested by the analysis, were experimentally validated. PMID:17498300

  9. Identification and Evolutionary Analysis of Potential Candidate Genes in a Human Eating Disorder.

    PubMed

    Sabbagh, Ubadah; Mullegama, Saman; Wyckoff, Gerald J

    2016-01-01

    The purpose of this study was to find genes linked with eating disorders and associated with both metabolic and neural systems. Our operating hypothesis was that there are genetic factors underlying some eating disorders resting in both those pathways. Specifically, we are interested in disorders that may rest in both sleep and metabolic function, generally called Night Eating Syndrome (NES). A meta-analysis of the Gene Expression Omnibus targeting the mammalian nervous system, sleep, and obesity studies was performed, yielding numerous genes of interest. Through a text-based analysis of the results, a number of potential candidate genes were identified. VGF, in particular, appeared to be relevant both to obesity and, broadly, to brain or neural development. VGF is a highly connected protein that interacts with numerous targets via proteolytically digested peptides. We examined VGF from an evolutionary perspective to determine whether other available evidence supported a role for the gene in human disease. We conclude that some of the already identified variants in VGF from human polymorphism studies may contribute to eating disorders and obesity. Our data suggest that there is enough evidence to warrant eGWAS and GWAS analysis of these genes in NES patients in a case-control study.

  10. Genetic study of congenital bile-duct dilatation identifies de novo and inherited variants in functionally related genes.

    PubMed

    Wong, John K L; Campbell, Desmond; Ngo, Ngoc Diem; Yeung, Fanny; Cheng, Guo; Tang, Clara S M; Chung, Patrick H Y; Tran, Ngoc Son; So, Man-Ting; Cherny, Stacey S; Sham, Pak C; Tam, Paul K; Garcia-Barcelo, Maria-Mercè

    2016-12-12

    Congenital dilatation of the bile-duct (CDD) is a rare, mostly sporadic, disorder that results in bile retention with severe associated complications. CDD affects mainly Asians. To our knowledge, no genetic study has ever been conducted. We aim to identify genetic risk factors by a "trio-based" exome-sequencing approach, whereby 31 CDD probands and their unaffected parents were exome-sequenced. Seven-hundred controls from the local population were used to detect gene-sets significantly enriched with rare variants in CDD patients. Twenty-one predicted damaging de novo variants (DNVs; 4 protein truncating and 17 missense) were identified in several evolutionarily constrained genes (p < 0.01). Six genes carrying DNVs were associated with human developmental disorders involving epithelial, connective or bone morphologies (PXDN, RTEL1, ANKRD11, MAP2K1, CYLD, ACAN) and four linked with cholangio- and hepatocellular carcinomas (PIK3CA, TLN1, CYLD, MAP2K1). Importantly, CDD patients have an excess of DNVs in cancer-related genes (p < 0.025). Thirteen genes were recurrently mutated at different sites, forming compound heterozygotes or functionally related complexes within patients. Our data support a strong genetic basis for CDD and show that CDD is not only genetically heterogeneous but also non-monogenic, requiring mutations in more than one gene for the disease to develop. The data are consistent with the rarity and sporadic presentation of CDD.

  11. Frameshift mutational target gene analysis identifies similarities and differences in constitutional mismatch repair-deficiency and Lynch syndrome.

    PubMed

    Maletzki, Claudia; Huehns, Maja; Bauer, Ingrid; Ripperger, Tim; Mork, Maureen M; Vilar, Eduardo; Klöcking, Sabine; Zettl, Heike; Prall, Friedrich; Linnebacher, Michael

    2017-07-01

    Mismatch-repair deficient (MMR-D) malignancies include Lynch Syndrome (LS), which is secondary to germline mutations in one of the MMR genes, and the rare childhood form of constitutional mismatch repair-deficiency (CMMR-D), caused by bi-allelic MMR gene mutations. A hallmark of LS-associated cancers is microsatellite instability (MSI), characterized by coding frameshift mutations (cFSM) in target genes. By contrast, tumors arising in CMMR-D patients are thought to display a somatic mutation pattern differing from LS. The main goal of this study was to identify cFSM in MSI target genes relevant in CMMR-D and to compare the spectrum of common somatic mutations, including alterations in DNA polymerases POLE and POLD1, between LS and CMMR-D. CMMR-D-associated tumors harbored more somatic mutations compared to LS cases, especially in the TP53 gene and in POLE and POLD1, where novel mutations were additionally identified. Strikingly, MSI in classical mononucleotide markers BAT40 and CAT25 was frequent in CMMR-D cases. MSI-target gene analysis revealed mutations in CMMR-D-associated tumors, some of them known to be frequently hit in LS, such as RNaseT2, HT001, and TGFβR2. Our results imply a general role for these cFSM as potential new drivers of MMR-D tumorigenesis. © 2017 Wiley Periodicals, Inc.

  12. Transcriptome analysis identifies genes involved in sex determination and development of Xenopus laevis gonads.

    PubMed

    Piprek, Rafal P; Damulewicz, Milena; Kloc, Malgorzata; Kubiak, Jacek Z

    Development of the gonads is a complex process, which starts with a period of undifferentiated, bipotential gonads. During this period the expression of sex-determining genes is initiated. Sex determination is a process triggering differentiation of the gonads into the testis or ovary. Sex determination period is followed by sexual differentiation, i.e. appearance of the first testis- and ovary-specific features. In Xenopus laevis W-linked DM-domain gene (DM-W) had been described as a master determinant of the gonadal female sex. However, the data on the expression and function of other genes participating in gonad development in X. laevis, and in anurans, in general, are very limited. We applied microarray technique to analyze the expression pattern of a subset of X. laevis genes previously identified to be involved in gonad development in several vertebrate species. We also analyzed the localization and the expression level of proteins encoded by these genes in developing X. laevis gonads. These analyses pointed to the set of genes differentially expressed in developing testes and ovaries. Gata4, Sox9, Dmrt1, Amh, Fgf9, Ptgds, Pdgf, Fshr, and Cyp17a1 expression was upregulated in developing testes, while DM-W, Fst, Foxl2, and Cyp19a1 were upregulated in developing ovaries. We discuss the possible roles of these genes in development of X. laevis gonads. Copyright © 2018 International Society of Differentiation. Published by Elsevier B.V. All rights reserved.

  13. High-Throughput Genetic Screens Identify a Large and Diverse Collection of New Sporulation Genes in Bacillus subtilis.

    PubMed

    Meeske, Alexander J; Rodrigues, Christopher D A; Brady, Jacqueline; Lim, Hoong Chuin; Bernhardt, Thomas G; Rudner, David Z

    2016-01-01

    The differentiation of the bacterium Bacillus subtilis into a dormant spore is among the most well-characterized developmental pathways in biology. Classical genetic screens performed over the past half century identified scores of factors involved in every step of this morphological process. More recently, transcriptional profiling uncovered additional sporulation-induced genes required for successful spore development. Here, we used transposon-sequencing (Tn-seq) to assess whether there were any sporulation genes left to be discovered. Our screen identified 133 out of the 148 genes with known sporulation defects. Surprisingly, we discovered 24 additional genes that had not been previously implicated in spore formation. To investigate their functions, we used fluorescence microscopy to survey early, middle, and late stages of differentiation of null mutants from the B. subtilis ordered knockout collection. This analysis identified mutants that are delayed in the initiation of sporulation, defective in membrane remodeling, and impaired in spore maturation. Several mutants had novel sporulation phenotypes. We performed in-depth characterization of two new factors that participate in cell-cell signaling pathways during sporulation. One (SpoIIT) functions in the activation of σE in the mother cell; the other (SpoIIIL) is required for σG activity in the forespore. Our analysis also revealed that as many as 36 sporulation-induced genes with no previously reported mutant phenotypes are required for timely spore maturation. Finally, we discovered a large set of transposon insertions that trigger premature initiation of sporulation. Our results highlight the power of Tn-seq for the discovery of new genes and novel pathways in sporulation and, combined with the recently completed null mutant collection, open the door for similar screens in other, less well-characterized processes.

  14. High-Throughput Genetic Screens Identify a Large and Diverse Collection of New Sporulation Genes in Bacillus subtilis

    PubMed Central

    Brady, Jacqueline; Lim, Hoong Chuin; Bernhardt, Thomas G.; Rudner, David Z.

    2016-01-01

    The differentiation of the bacterium Bacillus subtilis into a dormant spore is among the most well-characterized developmental pathways in biology. Classical genetic screens performed over the past half century identified scores of factors involved in every step of this morphological process. More recently, transcriptional profiling uncovered additional sporulation-induced genes required for successful spore development. Here, we used transposon-sequencing (Tn-seq) to assess whether there were any sporulation genes left to be discovered. Our screen identified 133 out of the 148 genes with known sporulation defects. Surprisingly, we discovered 24 additional genes that had not been previously implicated in spore formation. To investigate their functions, we used fluorescence microscopy to survey early, middle, and late stages of differentiation of null mutants from the B. subtilis ordered knockout collection. This analysis identified mutants that are delayed in the initiation of sporulation, defective in membrane remodeling, and impaired in spore maturation. Several mutants had novel sporulation phenotypes. We performed in-depth characterization of two new factors that participate in cell–cell signaling pathways during sporulation. One (SpoIIT) functions in the activation of σE in the mother cell; the other (SpoIIIL) is required for σG activity in the forespore. Our analysis also revealed that as many as 36 sporulation-induced genes with no previously reported mutant phenotypes are required for timely spore maturation. Finally, we discovered a large set of transposon insertions that trigger premature initiation of sporulation. Our results highlight the power of Tn-seq for the discovery of new genes and novel pathways in sporulation and, combined with the recently completed null mutant collection, open the door for similar screens in other, less well-characterized processes. PMID:26735940

  15. Mapping autosomal recessive intellectual disability: combined microarray and exome sequencing identifies 26 novel candidate genes in 192 consanguineous families.

    PubMed

    Harripaul, R; Vasli, N; Mikhailov, A; Rafiq, M A; Mittal, K; Windpassinger, C; Sheikh, T I; Noor, A; Mahmood, H; Downey, S; Johnson, M; Vleuten, K; Bell, L; Ilyas, M; Khan, F S; Khan, V; Moradi, M; Ayaz, M; Naeem, F; Heidari, A; Ahmed, I; Ghadami, S; Agha, Z; Zeinali, S; Qamar, R; Mozhdehipanah, H; John, P; Mir, A; Ansar, M; French, L; Ayub, M; Vincent, J B

    2018-04-01

    Approximately 1% of the global population is affected by intellectual disability (ID), and the majority receive no molecular diagnosis. Previous studies have indicated high levels of genetic heterogeneity, with estimates of more than 2500 autosomal ID genes, the majority of which are autosomal recessive (AR). Here, we combined microarray genotyping, homozygosity-by-descent (HBD) mapping, copy number variation (CNV) analysis, and whole exome sequencing (WES) to identify disease genes/mutations in 192 multiplex Pakistani and Iranian consanguineous families with non-syndromic ID. We identified definite or candidate mutations (or CNVs) in 51% of families in 72 different genes, including 26 not previously reported for ARID. The new ARID genes include nine with loss-of-function mutations (ABI2, MAPK8, MPDZ, PIDD1, SLAIN1, TBC1D23, TRAPPC6B, UBA7 and USP44), and missense mutations include the first reports of variants in BDNF or TET1 associated with ID. The genes identified also showed overlap with de novo gene sets for other neuropsychiatric disorders. Transcriptional studies showed prominent expression in the prenatal brain. The high yield of AR mutations for ID indicated that this approach has excellent clinical potential and should inform clinical diagnostics, including clinical whole exome and genome sequencing, for populations in which consanguinity is common. As with other AR disorders, the relevance will also apply to outbred populations.

  16. ChIP-Seq Analysis for Identifying Genome-Wide Histone Modifications Associated with Stress-Responsive Genes in Plants.

    PubMed

    Li, Guosheng; Jagadeeswaran, Guru; Mort, Andrew; Sunkar, Ramanjulu

    2017-01-01

    Histone modifications represent the crux of epigenetic gene regulation essential for most biological processes including abiotic stress responses in plants. Thus, identification of histone modifications at the genome scale can provide clues for how some genes are "turned on" while others are "turned off" in response to stress. This chapter details a step-by-step protocol for identifying genome-wide histone modifications associated with stress-responsive gene regulation using chromatin immunoprecipitation (ChIP) followed by sequencing of the DNA (ChIP-seq).

  17. Identifying the Viral Genes Encoding Envelope Glycoproteins for Differentiation of Cyprinid herpesvirus 3 Isolates

    PubMed Central

    Han, Jee Eun; Kim, Ji Hyung; Renault, Tristan; Choresca, Casiano; Shin, Sang Phil; Jun, Jin Woo; Park, Se Chang

    2013-01-01

    Cyprinid herpesvirus 3 (CyHV-3) diseases have been reported around the world and are associated with high mortalities of koi (Cyprinus carpio). Although little work has been conducted on the molecular analysis of this virus, glycoprotein genes identified in the present study seem to be valuable targets for genetic comparison of this virus. Three envelope glycoprotein genes (ORF25, 65 and 116) of the CyHV-3 isolates from the USA, Israel, Japan and Korea were compared, and interestingly, sequence insertions or deletions were observed in these target regions. In addition, polymorphisms were present in microsatellite zones from two glycoprotein genes (ORF65 and 116). In phylogenetic tree analysis, the Korean isolate was clearly distinguished from the USA, Israel and Japan isolates. These findings may be suitable for many applications including isolate differentiation and phylogeny studies. PMID:23435236

  18. Identifying the viral genes encoding envelope glycoproteins for differentiation of Cyprinid herpesvirus 3 isolates.

    PubMed

    Han, Jee Eun; Kim, Ji Hyung; Renault, Tristan; Choresca, Casiano; Shin, Sang Phil; Jun, Jin Woo; Park, Se Chang

    2013-01-31

    Cyprinid herpesvirus 3 (CyHV-3) diseases have been reported around the world and are associated with high mortalities of koi (Cyprinus carpio). Although little work has been conducted on the molecular analysis of this virus, glycoprotein genes identified in the present study seem to be valuable targets for genetic comparison of this virus. Three envelope glycoprotein genes (ORF25, 65 and 116) of the CyHV-3 isolates from the USA, Israel, Japan and Korea were compared, and interestingly, sequence insertions or deletions were observed in these target regions. In addition, polymorphisms were present in microsatellite zones from two glycoprotein genes (ORF65 and 116). In phylogenetic tree analysis, the Korean isolate was clearly distinguished from the USA, Israel and Japan isolates. These findings may be suitable for many applications including isolate differentiation and phylogeny studies.

  19. Comparative Transcriptome Analysis Identifies Putative Genes Involved in the Biosynthesis of Xanthanolides in Xanthium strumarium L.

    PubMed Central

    Li, Yuanjun; Gou, Junbo; Chen, Fangfang; Li, Changfu; Zhang, Yansheng

    2016-01-01

    Xanthium strumarium L. is a traditional Chinese herb belonging to the Asteraceae family. The major bioactive components of this plant are sesquiterpene lactones (STLs), which include the xanthanolides. To date, the biogenesis of xanthanolides, especially their downstream pathway, remains largely unknown. In X. strumarium, xanthanolides primarily accumulate in its glandular trichomes. To identify putative gene candidates involved in the biosynthesis of xanthanolides, three X. strumarium transcriptomes, which were derived from the young leaves of two different cultivars and the purified glandular trichomes from one of the cultivars, were constructed in this study. In total, 157 million clean reads were generated and assembled into 91,861 unigenes, of which 59,858 unigenes were successfully annotated. All the genes coding for known enzymes in the upstream pathway to the biosynthesis of xanthanolides were present in the X. strumarium transcriptomes. From a comparative analysis of the X. strumarium transcriptomes, this study identified a number of gene candidates that are putatively involved in the downstream pathway to the synthesis of xanthanolides, such as four unigenes encoding CYP71 P450s, 50 unigenes for dehydrogenases, and 27 genes for acetyltransferases. The possible functions of these four CYP71 candidates are extensively discussed. In addition, 116 transcription factors that are highly expressed in X. strumarium glandular trichomes were also identified. Their possible regulatory roles in the biosynthesis of STLs are discussed. The global transcriptomic data for X. strumarium should provide a valuable resource for further research into the biosynthesis of xanthanolides. PMID:27625674

  20. Comparative genomics identifies candidate genes for infectious salmon anemia (ISA) resistance in Atlantic salmon (Salmo salar).

    PubMed

    Li, Jieying; Boroevich, Keith A; Koop, Ben F; Davidson, William S

    2011-04-01

    Infectious salmon anemia (ISA) has been described as the hoof and mouth disease of salmon farming. ISA is caused by a lethal and highly communicable virus, which can have a major impact on salmon aquaculture, as demonstrated by an outbreak in Chile in 2007. A quantitative trait locus (QTL) for ISA resistance has been mapped to three microsatellite markers on linkage group (LG) 8 (Chr 15) on the Atlantic salmon genetic map. We identified bacterial artificial chromosome (BAC) clones and three fingerprint contigs from the Atlantic salmon physical map that contain these markers. We made use of the extensive BAC end sequence database to extend these contigs by chromosome walking and identified two additional markers in this region. The BAC end sequences were used to search for conserved synteny between this segment of LG8 and the fish genomes that have been sequenced. An examination of the genes in the syntenic segments of the tetraodon and medaka genomes identified candidates for association with ISA resistance in Atlantic salmon based on differential expression profiles from ISA challenges or on the putative biological functions of the proteins they encode. One gene in particular, HIV-EP2/MBP-2, caught our attention as it may influence the expression of several genes that have been implicated in the response to infection by infectious salmon anemia virus (ISAV). Therefore, we suggest that HIV-EP2/MBP-2 is a very strong candidate for the gene associated with the ISAV resistance QTL in Atlantic salmon and is worthy of further study.

  1. Whole genome co-expression analysis of soybean cytochrome P450 genes identifies nodulation-specific P450 monooxygenases

    PubMed Central

    2010-01-01

    Background: Cytochrome P450 monooxygenases (P450s) catalyze oxidation of various substrates using oxygen and NAD(P)H. Plant P450s are involved in the biosynthesis of primary and secondary metabolites performing diverse biological functions. The recent availability of the soybean genome sequence allows us to identify and analyze soybean putative P450s at a genome scale. Co-expression analysis using an available soybean microarray and Illumina sequencing data provides clues for functional annotation of these enzymes. This approach is based on the assumption that genes that have similar expression patterns across a set of conditions may have a functional relationship. Results: We have identified a total of 332 full-length P450 genes and 378 pseudogenes from the soybean genome. Of the full-length sequences, 195 genes belong to A-type, which could be further divided into 20 families. The remaining 137 genes belong to non-A type P450s and are classified into 28 families. A total of 178 probe sets were found to correspond to P450 genes on the Affymetrix soybean array. Out of these probe sets, 108 represented single genes. Using the 28 publicly available microarray libraries that contain organ-specific information, some tissue-specific P450s were identified. Similarly, stress responsive soybean P450s were retrieved from 99 microarray soybean libraries. We also utilized Illumina transcriptome sequencing technology to analyze the expression of all 332 soybean P450 genes. This dataset contains total RNAs isolated from nodules, roots, root tips, leaves, flowers, green pods, apical meristem, mock-inoculated and Bradyrhizobium japonicum-infected root hair cells. The tissue-specific expression patterns of these P450 genes were analyzed and the expression of a representative set of genes was confirmed by qRT-PCR. We performed the co-expression analysis on many of the 108 P450 genes on the Affymetrix arrays. First we confirmed that CYP93C5 (an isoflavone synthase gene) is

  2. Genome-Wide siRNA-Based Functional Genomics of Pigmentation Identifies Novel Genes and Pathways That Impact Melanogenesis in Human Cells

    PubMed Central

    Bodemann, Brian; Petersen, Sean; Aruri, Jayavani; Koshy, Shiney; Richardson, Zachary; Le, Lu Q.; Krasieva, Tatiana; Roth, Michael G.; Farmer, Pat; White, Michael A.

    2008-01-01

    Melanin protects the skin and eyes from the harmful effects of UV irradiation, protects neural cells from toxic insults, and is required for sound conduction in the inner ear. Aberrant regulation of melanogenesis underlies skin disorders (melasma and vitiligo), neurologic disorders (Parkinson's disease), auditory disorders (Waardenburg's syndrome), and ophthalmologic disorders (age related macular degeneration). Much of the core synthetic machinery driving melanin production has been identified; however, the spectrum of gene products participating in melanogenesis in different physiological niches is poorly understood. Functional genomics based on RNA-mediated interference (RNAi) provides the opportunity to derive unbiased comprehensive collections of pharmaceutically tractable single gene targets supporting melanin production. In this study, we have combined a high-throughput, cell-based, one-well/one-gene screening platform with a genome-wide arrayed synthetic library of chemically synthesized, small interfering RNAs to identify novel biological pathways that govern melanin biogenesis in human melanocytes. Ninety-two novel genes that support pigment production were identified with a low false discovery rate. Secondary validation and preliminary mechanistic studies identified a large panel of targets that converge on tyrosinase expression and stability. Small molecule inhibition of a family of gene products in this class was sufficient to impair chronic tyrosinase expression in pigmented melanoma cells and UV-induced tyrosinase expression in primary melanocytes. Isolation of molecular machinery known to support autophagosome biosynthesis from this screen, together with in vitro and in vivo validation, exposed a close functional relationship between melanogenesis and autophagy. In summary, these studies illustrate the power of RNAi-based functional genomics to identify novel genes, pathways, and pharmacologic agents that impact a biological phenotype and operate

  3. Distinct ontogenic and regional expressions of newly identified Cajal-Retzius cell-specific genes during neocorticogenesis.

    PubMed

    Yamazaki, Hiroshi; Sekiguchi, Mariko; Takamatsu, Masako; Tanabe, Yasuto; Nakanishi, Shigetada

    2004-10-05

    Cajal-Retzius (CR) cells are early-generated transient neurons and are important in the regulation of cortical neuronal migration and cortical laminar formation. Molecular entities characterizing the CR cell identity, however, remain largely elusive. We purified mouse cortical CR cells expressing GFP to homogeneity by fluorescence-activated cell sorting and examined a genome-wide expression profile of cortical CR cells at embryonic and postnatal periods. We identified 49 genes that exceeded hybridization signals by >10-fold in CR cells compared with non-CR cells at embryonic day 13.5, postnatal day 2, or both. Among these CR cell-specific genes, 25 genes, including the CR cell marker genes such as the reelin and calretinin genes, are selectively and highly expressed in both embryonic and postnatal CR cells. These genes, which encode generic properties of CR cell specificity, are eminently characterized as modulatory composites of voltage-dependent calcium channels and sets of functionally related cellular components involved in cell migration, adhesion, and neurite extension. Five genes are highly expressed in CR cells at the early embryonic period and are rapidly down-regulated thereafter. Furthermore, some of these genes have been shown to mark two distinctly different focal regions corresponding to the CR cell origins. At the late prenatal and postnatal periods, 19 genes are selectively up-regulated in CR cells. These genes include functional molecules implicated in synaptic transmission and modulation. CR cells thus strikingly change their cellular phenotypes during cortical development and play a pivotal role in both corticogenesis and cortical circuit maturation.

  4. A systems approach identifies networks and genes linking sleep and stress: implications for neuropsychiatric disorders.

    PubMed

    Jiang, Peng; Scarpa, Joseph R; Fitzpatrick, Karrie; Losic, Bojan; Gao, Vance D; Hao, Ke; Summa, Keith C; Yang, He S; Zhang, Bin; Allada, Ravi; Vitaterna, Martha H; Turek, Fred W; Kasarskis, Andrew

    2015-05-05

    Sleep dysfunction and stress susceptibility are comorbid complex traits that often precede and predispose patients to a variety of neuropsychiatric diseases. Here, we demonstrate multilevel organizations of genetic landscape, candidate genes, and molecular networks associated with 328 stress and sleep traits in a chronically stressed population of 338 (C57BL/6J × A/J) F2 mice. We constructed striatal gene co-expression networks, revealing functionally and cell-type-specific gene co-regulations important for stress and sleep. Using a composite ranking system, we identified network modules most relevant for 15 independent phenotypic categories, highlighting a mitochondria/synaptic module that links sleep and stress. The key network regulators of this module are overrepresented with genes implicated in neuropsychiatric diseases. Our work suggests that the interplay among sleep, stress, and neuropathology emerges from genetic influences on gene expression and their collective organization through complex molecular networks, providing a framework for interrogating the mechanisms underlying sleep, stress susceptibility, and related neuropsychiatric disorders. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  5. A Heterogeneous Network Based Method for Identifying GBM-Related Genes by Integrating Multi-Dimensional Data.

    PubMed

    Chen Peng; Ao Li

    2017-01-01

    The emergence of multi-dimensional data offers opportunities for more comprehensive analysis of the molecular characteristics of human diseases and therefore for improving diagnosis, treatment, and prevention. In this study, we proposed a heterogeneous network based method that integrates multi-dimensional data (HNMD) to identify GBM-related genes. The novelty of the method is that multi-dimensional GBM data from the TCGA dataset, which provide comprehensive information on genes, are combined with protein-protein interactions to construct a weighted heterogeneous network that reflects both the general and disease-specific relationships between genes. In addition, a propagation algorithm with resistance is introduced to precisely score and rank GBM-related genes. The results of comprehensive performance evaluation show that the proposed method significantly outperforms the network based methods with single-dimensional data and other existing approaches. Subsequent analysis of the top ranked genes suggests they may be functionally implicated in GBM, which further corroborates the superiority of the proposed method. The source code and the results of HNMD can be downloaded from the following URL: http://bioinformatics.ustc.edu.cn/hnmd/.

  6. Spider Transcriptomes Identify Ancient Large-Scale Gene Duplication Event Potentially Important in Silk Gland Evolution.

    PubMed

    Clarke, Thomas H; Garb, Jessica E; Hayashi, Cheryl Y; Arensburger, Peter; Ayoub, Nadia A

    2015-06-08

    The evolution of specialized tissues with novel functions, such as the silk synthesizing glands in spiders, is likely an influential driver of adaptive success. Large-scale gene duplication events and subsequent paralog divergence are thought to be required for generating evolutionary novelty. Such an event has been proposed for spiders, but not tested. We de novo assembled transcriptomes from three cobweb weaving spider species. Based on phylogenetic analyses of gene families with representatives from each of the three species, we found numerous duplication events indicative of a whole genome or segmental duplication. We estimated the age of the gene duplications relative to several speciation events within spiders and arachnids and found that the duplications likely occurred after the divergence of scorpions (order Scorpionida) and spiders (order Araneae), but before the divergence of the spider suborders Mygalomorphae and Araneomorphae, near the evolutionary origin of spider silk glands. Transcripts that are expressed exclusively or primarily within black widow silk glands are more likely to have a paralog descended from the ancient duplication event and have elevated amino acid replacement rates compared with other transcripts. Thus, an ancient large-scale gene duplication event within the spider lineage was likely an important source of molecular novelty during the evolution of silk gland-specific expression. This duplication event may have provided genetic material for subsequent silk gland diversification in the true spiders (Araneomorphae). © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  7. Gene expression profiling and candidate gene resequencing identifies pathways and mutations important for malignant transformation caused by leukemogenic fusion genes.

    PubMed

    Novak, Rachel L; Harper, David P; Caudell, David; Slape, Christopher; Beachy, Sarah H; Aplan, Peter D

    2012-12-01

    NUP98-HOXD13 (NHD13) and CALM-AF10 (CA10) are oncogenic fusion proteins produced by recurrent chromosomal translocations in patients with acute myeloid leukemia (AML). Transgenic mice that express these fusions develop AML with a long latency and incomplete penetrance, suggesting that collaborating genetic events are required for leukemic transformation. We employed genetic techniques to identify both preleukemic abnormalities in healthy transgenic mice as well as collaborating events leading to leukemic transformation. Candidate gene resequencing revealed that 6 of 27 (22%) CA10 AMLs spontaneously acquired a Ras pathway mutation and 8 of 27 (30%) acquired an Flt3 mutation. Two CA10 AMLs acquired an Flt3 internal-tandem duplication, demonstrating that these mutations can be acquired in murine as well as human AML. Gene expression profiles revealed a marked upregulation of Hox genes, particularly Hoxa5, Hoxa9, and Hoxa10 in both NHD13 and CA10 mice. Furthermore, mir196b, which is embedded within the Hoxa locus, was overexpressed in both CA10 and NHD13 samples. In contrast, the Hox cofactors Meis1 and Pbx3 were differentially expressed; Meis1 was increased in CA10 AMLs but not NHD13 AMLs, whereas Pbx3 was consistently increased in NHD13 but not CA10 AMLs. Silencing of Pbx3 in NHD13 cells led to decreased proliferation, increased apoptosis, and decreased colony formation in vitro, suggesting a previously unexpected role for Pbx3 in leukemic transformation. Published by Elsevier Inc.

  8. Automated update, revision, and quality control of the maize genome annotations using MAKER-P improves the B73 RefGen_v3 gene models and identifies new genes.

    PubMed

    Law, MeiYee; Childs, Kevin L; Campbell, Michael S; Stein, Joshua C; Olson, Andrew J; Holt, Carson; Panchy, Nicholas; Lei, Jikai; Jiao, Dian; Andorf, Carson M; Lawrence, Carolyn J; Ware, Doreen; Shiu, Shin-Han; Sun, Yanni; Jiang, Ning; Yandell, Mark

    2015-01-01

    The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-P to update and revise the maize (Zea mays) B73 RefGen_v3 annotation build (5b+) in less than 3 h using the iPlant Cyberinfrastructure. MAKER-P identified and annotated 4,466 additional, well-supported protein-coding genes not present in the 5b+ annotation build, added additional untranslated regions to 1,393 5b+ gene models, identified 2,647 5b+ gene models that lack any supporting evidence (despite the use of large and diverse evidence data sets), identified 104,215 pseudogene fragments, and created an additional 2,522 noncoding gene annotations. We also describe a method for de novo training of MAKER-P for the annotation of newly sequenced grass genomes. Collectively, these results lead to the 6a maize genome annotation and demonstrate the utility of MAKER-P for rapid annotation, management, and quality control of grasses and other difficult-to-annotate plant genomes. © 2015 American Society of Plant Biologists. All Rights Reserved.

  9. Gene networks associated with conditional fear in mice identified using a systems genetics approach

    PubMed Central

    2011-01-01

    Background: Our understanding of the genetic basis of learning and memory remains shrouded in mystery. To explore the genetic networks governing the biology of conditional fear, we used a systems genetics approach to analyze a hybrid mouse diversity panel (HMDP) with high mapping resolution. Results: A total of 27 behavioral quantitative trait loci were mapped with a false discovery rate of 5%. By integrating fear phenotypes, transcript profiling data from hippocampus and striatum and also genotype information, two gene co-expression networks correlated with context-dependent immobility were identified. We prioritized the key markers and genes in these pathways using intramodular connectivity measures and structural equation modeling. Highly connected genes in the context fear modules included Psmd6, Ube2a and Usp33, suggesting an important role for ubiquitination in learning and memory. In addition, we surveyed the architecture of brain transcript regulation and demonstrated preservation of gene co-expression modules in hippocampus and striatum, while also highlighting important differences. Rps15a, Kif3a, Stard7, 6330503K22RIK, and Plvap were among the individual genes whose transcript abundance was strongly associated with fear phenotypes. Conclusion: Application of our multi-faceted mapping strategy permits an increasingly detailed characterization of the genetic networks underlying behavior. PMID:21410935

  10. QTL and gene expression analyses identify genes affecting carcass weight and marbling on BTA14 in Hanwoo (Korean Cattle).

    PubMed

    Lee, Seung Hwan; van der Werf, J H J; Kim, Nam Kuk; Lee, Sang Hong; Gondro, C; Park, Eung Woo; Oh, Sung Jong; Gibson, J P; Thompson, J M

    2011-10-01

    Causal mutations affecting quantitative trait variation can be good targets for marker-assisted selection for carcass traits in beef cattle. In this study, linkage and linkage disequilibrium analysis (LDLA) for four carcass traits was undertaken using 19 markers on bovine chromosome 14. The LDLA detected quantitative trait loci (QTL) for carcass weight (CWT) and eye muscle area (EMA) at the same position at around 50 cM and surrounded by the markers FABP4SNP2774C>G and FABP4_μsat3237. The QTL for marbling (MAR) was identified at the midpoint of markers BMS4513 and RM137 in a 3.5-cM marker interval. The most likely position for a second QTL for CWT was found at the midpoint of the tenth marker bracket (FABP4SNP2774C>G and FABP4_μsat3237). For this marker bracket, the total number of haplotypes was 34 with a most common frequency of 0.118. Effects of haplotypes on CWT varied from a -5-kg deviation for haplotype 6 to +8 kg for haplotype 23. To determine which genes contribute to the QTL effect, gene expression analysis was performed in muscle for a wide range of phenotypes. The results demonstrate that two genes, LOC781182 (p = 0.002) and TRPS1 (p = 0.006), were upregulated with increasing CWT and EMA, whereas only LOC614744 (p = 0.04) had a significant effect on intramuscular fat (IMF) content. Two genetic markers detected in FABP4 were the most likely QTL position in this QTL study, but FABP4 did not show a significant effect on either trait (CWT or EMA) in gene expression analysis. We conclude that three genes could be potential causal genes affecting carcass traits CWT, EMA, and IMF in Hanwoo.

  11. GeneSigDB: a manually curated database and resource for analysis of gene expression signatures

    PubMed Central

    Culhane, Aedín C.; Schröder, Markus S.; Sultana, Razvan; Picard, Shaita C.; Martinelli, Enzo N.; Kelly, Caroline; Haibe-Kains, Benjamin; Kapushesky, Misha; St Pierre, Anne-Alyssa; Flahive, William; Picard, Kermshlise C.; Gusenleitner, Daniel; Papenhausen, Gerald; O'Connor, Niall; Correll, Mick; Quackenbush, John

    2012-01-01

    GeneSigDB (http://www.genesigdb.org or http://compbio.dfci.harvard.edu/genesigdb/) is a database of gene signatures that have been extracted and manually curated from the published literature. It provides a standardized resource of published prognostic, diagnostic and other gene signatures of cancer and related disease to the community so they can compare the predictive power of gene signatures or use these in gene set enrichment analysis. Since GeneSigDB release 1.0, we have expanded from 575 to 3515 gene signatures, which were collected and transcribed from 1604 published articles largely focused on gene expression in cancer, stem cells, immune cells, development and lung disease. We have made substantial upgrades to the GeneSigDB website to improve accessibility and usability, including adding a tag cloud browse function, facetted navigation and a ‘basket’ feature to store genes or gene signatures of interest. Users can analyze GeneSigDB gene signatures, or upload their own gene list, to identify gene signatures with significant gene overlap, and results can be viewed on a dynamic editable heatmap that can be downloaded as a publication quality image. All data in GeneSigDB can be downloaded in numerous formats including .gmt file format for gene set enrichment analysis or as an R/Bioconductor data file. GeneSigDB is available from http://www.genesigdb.org. PMID:22110038

  12. Using gene chips to identify organ-specific, smooth muscle responses to experimental diabetes: potential applications to urological diseases.

    PubMed

    Hipp, Jason D; Davies, Kelvin P; Tar, Moses; Valcic, Mira; Knoll, Abraham; Melman, Arnold; Christ, George J

    2007-02-01

    To identify early diabetes-related alterations in gene expression in bladder and erectile tissue that would provide novel diagnostic and therapeutic treatment targets to prevent, delay or ameliorate the ensuing bladder and erectile dysfunction. The RG-U34A rat GeneChip (Affymetrix Inc., Sunnyvale, CA, USA) oligonucleotide microarray (containing approximately 8799 genes) was used to evaluate gene expression in corporal and male bladder tissue excised from rats 1 week after confirmation of a diabetic state, but before demonstrable changes in organ function in vivo. A conservative analytical approach was used to detect alterations in gene expression, and gene ontology (GO) classifications were used to identify biological themes/pathways involved in the aetiology of the organ dysfunction. In all, 320 and 313 genes were differentially expressed in bladder and corporal tissue, respectively. GO analysis in bladder tissue showed prominent increases in biological pathways involved in cell proliferation, metabolism, actin cytoskeleton and myosin, as well as decreases in cell motility, and regulation of muscle contraction. GO analysis in corpora showed increases in pathways related to ion channel transport and ion channel activity, while there were decreases in collagen I and actin genes. The changes in gene expression in these initial experiments are consistent with the pathophysiological characteristics of the bladder and erectile dysfunction seen later in the diabetic disease process. Thus, the observed changes in gene expression might be harbingers or biomarkers of impending organ dysfunction, and could provide useful diagnostic and therapeutic targets for a variety of progressive urological diseases/conditions (i.e. lower urinary tract symptoms related to benign prostatic hyperplasia, erectile dysfunction, etc.).

  13. A shell regeneration assay to identify biomineralization candidate genes in mytilid mussels.

    PubMed

    Hüning, Anne K; Lange, Skadi M; Ramesh, Kirti; Jacob, Dorrit E; Jackson, Daniel J; Panknin, Ulrike; Gutowska, Magdalena A; Philipp, Eva E R; Rosenstiel, Philip; Lucassen, Magnus; Melzner, Frank

    2016-06-01

    Biomineralization processes in bivalve molluscs are still poorly understood. Here we provide an analysis of specifically expressed sequences from a mantle transcriptome of the blue mussel, Mytilus edulis. We then developed a novel, integrative shell injury assay to test whether biomineralization candidate genes highly expressed in marginal and pallial mantle could be induced in central mantle tissue underlying the damaged shell areas. This experimental approach makes it possible to identify gene products that control the chemical micro-environment during calcification as well as organic matrix components. This is unlike existing methodological approaches that work retroactively to characterize calcification relevant molecules and are just able to examine organic matrix components that are present in completed shells. In our assay an orthogonal array of nine 1 mm holes was drilled into the left valve, and mussels were suspended in net cages for 20, 29 and 36 days to regenerate. Structural observations using stereo-microscopy, SEM and Raman spectroscopy revealed organic sheet synthesis (day 20) as the first step of shell-repair followed by the deposition of calcite crystals (days 20 and 29) and aragonite tablets (day 36). The regeneration period was characterized by time-dependent shifts in gene expression in left central mantle tissue underlying the injured shell: (i) increased expression of two tyrosinase isoforms (TYR3: 29-fold and TYR6: 5-fold) at day 20 with a decline thereafter, (ii) an increase in expression of a gene encoding a nacrein-like protein (max. 100-fold) on day 29. The expression of an acidic Asp-Ser-rich protein was enhanced during the entire regeneration process. This proof-of-principle study demonstrates that genes that are specifically expressed in pallial and marginal mantle tissue can be induced (4 out of 10 genes) in central mantle following experimental injury of the overlying shell. Our findings suggest that regeneration assays can be used

  14. Using the Developmental Gene Bicoid to Identify Species of Forensically Important Blowflies (Diptera: Calliphoridae)

    PubMed Central

    Park, Seong Hwan; Park, Chung Hyun; Zhang, Yong; Piao, Huguo; Chung, Ukhee; Kim, Seong Yoon; Ko, Kwang Soo; Yi, Cheong-Ho; Jo, Tae-Ho; Hwang, Juck-Joon

    2013-01-01

    Identifying species of insects used to estimate postmortem interval (PMI) is a major subject in forensic entomology. Because forensic insect specimens are morphologically uniform and are obtained at various developmental stages, DNA markers are greatly needed. To develop new autosomal DNA markers to identify species, partial genomic sequences of the bicoid (bcd) genes, containing the homeobox and its flanking sequences, from 12 blowfly species (Aldrichina grahami, Calliphora vicina, Calliphora lata, Triceratopyga calliphoroides, Chrysomya megacephala, Chrysomya pinguis, Phormia regina, Lucilia ampullacea, Lucilia caesar, Lucilia illustris, Hemipyrellia ligurriens and Lucilia sericata; Calliphoridae: Diptera) were determined and analyzed. This study first sequenced the ten blowfly species other than C. vicina and L. sericata. Based on the bcd sequences of these 12 blowfly species, a phylogenetic tree was constructed that discriminates the subfamilies of Calliphoridae (Luciliinae, Chrysomyinae, and Calliphorinae) and most blowfly species. Even partial genomic sequences of about 500 bp can distinguish most blowfly species. The short intron 2 and coding sequences downstream of the bcd homeobox in exon 3 could be utilized to develop DNA markers for forensic applications. These gene sequences are important in the evolution of insect developmental biology and are potentially useful for identifying insect species in forensic science. PMID:23586044

  15. Two novel mutations in the homogentisate-1,2-dioxygenase gene identified in Chinese Han Child with Alkaptonuria.

    PubMed

    Li, Hongying; Zhang, Kaihui; Xu, Qun; Ma, Lixia; Lv, Xin; Sun, Ruopeng

    2015-03-01

    Alkaptonuria (AKU) is an autosomal recessive disorder of tyrosine metabolism, which is caused by a defect in the enzyme homogentisate 1,2-dioxygenase (HGD) with subsequent accumulation of homogentisic acid. Presently, more than 100 HGD mutations have been identified as the cause of the inborn error of metabolism across different populations worldwide. However, the HGD mutation is very rarely reported in Asia, especially China. In this study, we present mutational analyses of HGD gene in one Chinese Han child with AKU, which had been identified by gas chromatography-mass spectrometry detection of organic acids in urine samples. PCR and DNA sequencing of the entire coding region as well as exon-intron boundaries of HGD have been performed. Two novel mutations were identified in the HGD gene in this AKU case, a frameshift mutation of c.115delG in exon 3 and the splicing mutation of IVS5+3 A>C, a donor splice site of the exon 5 and exon-intron junction. The identification of these mutations in this study further expands the spectrum of known HGD gene mutations and contributes to prenatal molecular diagnosis of AKU.

  16. Type 2 diabetes mellitus disease risk genes identified by genome wide copy number variation scan in normal populations.

    PubMed

    Prabhanjan, Manasa; Suresh, Raviraj V; Murthy, Megha N; Ramachandra, Nallur B

    2016-03-01

    To identify the role of copy number variations (CNVs) on disease risk genes and its effect on disease phenotypes in type 2 diabetes mellitus (T2DM) in 12 random populations using high throughput arrays. CNV analysis was carried out on a total of 1715 individuals from 12 populations, from ArrayExpress Archive of the European Bioinformatics Institute along with our subjects using Affymetrix Genome Wide SNP 6.0 array. CNV effects on T2DM genes were analyzed using several bioinformatics tools and a molecular protein interaction network was constructed to identify the disease mechanism altered by the CNVs. Analysis showed 34.4% of the total population to be under CNV burden for T2DM, with 83 disease causal and associated genes being under CNV influence. Hotspots were identified on chromosomes 22, 12, 6, 19 and 11. Overlap studies with case cohorts revealed significant disease risk genes such as EGFR, E2F1, PPP1R3A, HLA and TSPAN8. CNVs play a significant role in predisposing T2DM in normal cohorts and contribute to the phenotypic effects. Thus, CNVs should be considered as one of the major contributors in predisposition of the disease. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  17. A high-throughput virus-induced gene silencing protocol identifies genes involved in multi-stress tolerance

    PubMed Central

    2013-01-01

    Background: Understanding the function of a particular gene under various stresses is important for engineering plants for broad-spectrum stress tolerance. Although virus-induced gene silencing (VIGS) has been used to characterize genes involved in abiotic stress tolerance, currently available gene silencing and stress imposition methodology at the whole plant level is not suitable for high-throughput functional analyses of genes. This demands a robust and reliable methodology for characterizing genes involved in abiotic and multi-stress tolerance. Results: Our methodology employs VIGS-based gene silencing in leaf disks combined with simple stress imposition and effect quantification methodologies for easy and faster characterization of genes involved in abiotic and multi-stress tolerance. By subjecting leaf disks from gene-silenced plants to various abiotic stresses and inoculating silenced plants with various pathogens, we show the involvement of several genes for multi-stress tolerance. In addition, we demonstrate that VIGS can be used to characterize genes involved in thermotolerance. Our results also showed the functional relevance of NtEDS1 in abiotic stress, NbRBX1 and NbCTR1 in oxidative stress; NtRAR1 and NtNPR1 in salinity stress; NbSOS1 and NbHSP101 in biotic stress; and NtEDS1, NbETR1, NbWRKY2 and NbMYC2 in thermotolerance. Conclusions: In addition to widening the application of VIGS, we developed a robust, easy and high-throughput methodology for functional characterization of genes involved in multi-stress tolerance. PMID:24289810

  18. Current Status of Gene Therapy for Inherited Lung Diseases

    PubMed Central

    Driskell, Ryan R.; Engelhardt, John F.

    2007-01-01

    Gene therapy as a treatment modality for pulmonary disorders has attracted significant interest over the past decade. Since the initiation of the first clinical trials for cystic fibrosis lung disease using recombinant adenovirus in the early 1990s, the field has encountered numerous obstacles including vector inflammation, inefficient delivery, and vector production. Despite these obstacles, enthusiasm for lung gene therapy remains high. In part, this enthusiasm is fueled through the diligence of numerous researchers whose studies continue to reveal great potential of new gene transfer vectors that demonstrate increased tropism for airway epithelia. Several newly identified serotypes of adeno-associated virus have demonstrated substantial promise in animal models and will likely surface soon in clinical trials. Furthermore, an increased understanding of vector biology has also led to the development of new technologies to enhance the efficiency and selectivity of gene delivery to the lung. Although the promise of gene therapy to the lung has yet to be realized, the recent concentrated efforts in the field that focus on the basic virology of vector development will undoubtedly reap great rewards over the next decade in treating lung diseases. PMID:12524461

  19. Identifying RNA splicing factors using IFT genes in Chlamydomonas reinhardtii.

    PubMed

    Lin, Huawen; Zhang, Zhengyan; Iomini, Carlo; Dutcher, Susan K

    2018-03-01

    Intraflagellar transport moves proteins in and out of flagella/cilia and it is essential for the assembly of these organelles. Using whole-genome sequencing, we identified splice site mutations in two IFT genes, IFT81 (fla9) and IFT121 (ift121-2), which lead to flagellar assembly defects in the unicellular green alga Chlamydomonas reinhardtii. The splicing defects in these ift mutants are partially corrected by mutations in two conserved spliceosome proteins, DGR14 and FRA10. We identified a dgr14 deletion mutant, which suppresses the 3' splice site mutation in IFT81, and a frameshift mutant of FRA10, which suppresses the 5' splice site mutation in IFT121. Surprisingly, we found dgr14-1 and fra10 mutations suppress both splice site mutations. We suggest these two proteins are involved in facilitating splice site recognition/interaction; in their absence some splice site mutations are tolerated. Nonsense mutations in SMG1, which is involved in nonsense-mediated decay, lead to accumulation of aberrant transcripts and partial restoration of flagellar assembly in the ift mutants. The high density of introns and the conservation of noncore splicing factors, together with the ease of scoring the ift mutant phenotype, make Chlamydomonas an attractive organism to identify new proteins involved in splicing through suppressor screening. © 2018 The Authors.

  20. Transcriptome meta-analysis reveals common differential and global gene expression profiles in cystic fibrosis and other respiratory disorders and identifies CFTR regulators.

    PubMed

    Clarke, Luka A; Botelho, Hugo M; Sousa, Lisete; Falcao, Andre O; Amaral, Margarida D

    2015-11-01

    A meta-analysis of 13 independent microarray data sets was performed and gene expression profiles from cystic fibrosis (CF), similar disorders (COPD: chronic obstructive pulmonary disease, IPF: idiopathic pulmonary fibrosis, asthma), environmental conditions (smoking, epithelial injury), related cellular processes (epithelial differentiation/regeneration), and non-respiratory "control" conditions (schizophrenia, dieting), were compared. Similarity among differentially expressed (DE) gene lists was assessed using a permutation test, and a clustergram was constructed, identifying common gene markers. Global gene expression values were standardized using a novel approach, revealing that similarities between independent data sets run deeper than shared DE genes. Correlation of gene expression values identified putative gene regulators of the CF transmembrane conductance regulator (CFTR) gene, of potential therapeutic significance. Our study provides a novel perspective on CF epithelial gene expression in the context of other lung disorders and conditions, and highlights the contribution of differentiation/EMT and injury to gene signatures of respiratory disease. Copyright © 2015 Elsevier Inc. All rights reserved.
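
    The abstract above mentions a permutation test for similarity among DE gene lists but gives no details. The following is only a minimal sketch of one common way to set up such a test, not the authors' procedure: it assumes similarity is measured as the size of the overlap between two lists, and that the null distribution comes from drawing random lists of the same sizes from a shared gene universe. The function name, gene labels and parameters are hypothetical.

```python
import random


def overlap_permutation_test(list_a, list_b, universe, n_perm=2000, seed=0):
    """Return (observed_overlap, p_value) for two gene lists.

    Assumption (not from the abstract): similarity is the number of shared
    genes, and the null model draws random lists of matching size from
    `universe` without replacement.
    """
    rng = random.Random(seed)
    set_a, set_b = set(list_a), set(list_b)
    observed = len(set_a & set_b)
    pool = sorted(universe)  # fixed order so the seed gives repeatable draws
    hits = 0
    for _ in range(n_perm):
        rand_a = set(rng.sample(pool, len(set_a)))
        rand_b = set(rng.sample(pool, len(set_b)))
        if len(rand_a & rand_b) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical toy data: 1000 genes, two lists of 50 sharing 25 genes.
universe = [f"gene{i}" for i in range(1000)]
list_a = universe[:50]
list_b = universe[25:75]
observed, p_value = overlap_permutation_test(list_a, list_b, universe)
print(observed, p_value)
```

    With lists this size the expected chance overlap is about 2.5 genes, so an observed overlap of 25 yields a very small p-value. Other similarity measures (for example a Jaccard index) could be substituted without changing the structure of the test.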

  1. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python).

    PubMed

    Irizarry, Kristopher J L; Rutllant, Josep

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism's genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism's genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1) production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2) enhanced assisted reproduction technology for endangered and captive reptiles; and (3) novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value.

  2. Targeted next generation sequencing identifies functionally deleterious germline mutations in novel genes in early-onset/familial prostate cancer.

    PubMed

    Paulo, Paula; Maia, Sofia; Pinto, Carla; Pinto, Pedro; Monteiro, Augusta; Peixoto, Ana; Teixeira, Manuel R

    2018-04-01

    Considering that mutations in known prostate cancer (PrCa) predisposition genes, including those responsible for hereditary breast/ovarian cancer and Lynch syndromes, explain less than 5% of early-onset/familial PrCa, we have sequenced 94 genes associated with cancer predisposition using next generation sequencing (NGS) in a series of 121 PrCa patients. We found monoallelic truncating/functionally deleterious mutations in seven genes, including ATM and CHEK2, which have previously been associated with PrCa predisposition, and five new candidate PrCa associated genes involved in cancer predisposing recessive disorders, namely RAD51C, FANCD2, FANCI, CEP57 and RECQL4. Furthermore, using in silico pathogenicity prediction of missense variants among 18 genes associated with breast/ovarian cancer and/or Lynch syndrome, followed by KASP genotyping in 710 healthy controls, we identified "likely pathogenic" missense variants in ATM, BRIP1, CHEK2 and TP53. In conclusion, this study has identified putative PrCa predisposing germline mutations in 14.9% of early-onset/familial PrCa patients. Further data will be necessary to confirm the genetic heterogeneity of inherited PrCa predisposition hinted in this study.

  3. Quantitative analysis of bristle number in Drosophila mutants identifies genes involved in neural development

    NASA Technical Reports Server (NTRS)

    Norga, Koenraad K.; Gurganus, Marjorie C.; Dilda, Christy L.; Yamamoto, Akihiko; Lyman, Richard F.; Patel, Prajal H.; Rubin, Gerald M.; Hoskins, Roger A.; Mackay, Trudy F.; Bellen, Hugo J.

    2003-01-01

    BACKGROUND: The identification of the function of all genes that contribute to specific biological processes and complex traits is one of the major challenges in the postgenomic era. One approach is to employ forward genetic screens in genetically tractable model organisms. In Drosophila melanogaster, P element-mediated insertional mutagenesis is a versatile tool for the dissection of molecular pathways, and there is an ongoing effort to tag every gene with a P element insertion. However, the vast majority of P element insertion lines are viable and fertile as homozygotes and do not exhibit obvious phenotypic defects, perhaps because of the tendency for P elements to insert 5' of transcription units. Quantitative genetic analysis of subtle effects of P element mutations that have been induced in an isogenic background may be a highly efficient method for functional genome annotation. RESULTS: Here, we have tested the efficacy of this strategy by assessing the extent to which screening for quantitative effects of P elements on sensory bristle number can identify genes affecting neural development. We find that such quantitative screens uncover an unusually large number of genes that are known to function in neural development, as well as genes with yet uncharacterized effects on neural development, and novel loci. CONCLUSIONS: Our findings establish the use of quantitative trait analysis for functional genome annotation through forward genetics. Similar analyses of quantitative effects of P element insertions will facilitate our understanding of the genes affecting many other complex traits in Drosophila.

  4. A Systems Biology Approach To Identify the Combination Effects of Human Herpesvirus 8 Genes on NF-κB Activation

    PubMed Central

    Konrad, Andreas; Wies, Effi; Thurau, Mathias; Marquardt, Gaby; Naschberger, Elisabeth; Hentschel, Sonja; Jochmann, Ramona; Schulz, Thomas F.; Erfle, Holger; Brors, Benedikt; Lausen, Berthold; Neipel, Frank; Stürzl, Michael

    2009-01-01

    Human herpesvirus 8 (HHV-8) is the etiologic agent of Kaposi's sarcoma and primary effusion lymphoma. Activation of the cellular transcription factor nuclear factor-kappa B (NF-κB) is essential for latent persistence of HHV-8, survival of HHV-8-infected cells, and disease progression. We used reverse-transfected cell microarrays (RTCM) as an unbiased systems biology approach to systematically analyze the effects of HHV-8 genes on the NF-κB signaling pathway. All HHV-8 genes individually (n = 86) and, additionally, all K and latent genes in pairwise combinations (n = 231) were investigated. Statistical analyses of more than 14,000 transfections identified ORF75 as a novel and confirmed K13 as a known HHV-8 activator of NF-κB. K13 and ORF75 showed cooperative NF-κB activation. Small interfering RNA-mediated knockdown of ORF75 expression demonstrated that this gene contributes significantly to NF-κB activation in HHV-8-infected cells. Furthermore, our approach confirmed K10.5 as an NF-κB inhibitor and newly identified K1 as an inhibitor of both K13- and ORF75-mediated NF-κB activation. All results obtained with RTCM were confirmed with classical transfection experiments. Our work describes the first successful application of RTCM for the systematic analysis of pathofunctions of genes of an infectious agent. With this approach, ORF75 and K1 were identified as novel HHV-8 regulatory molecules on the NF-κB signal transduction pathway. The genes identified may be involved in fine-tuning of the balance between latency and lytic replication, since this depends critically on the state of NF-κB activity. PMID:19129458
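
    The pairwise count in the abstract above (n = 231) can be related to a number of genes by simple combinatorics. The sketch below assumes, which the abstract does not state, that each unordered pair of distinct genes was tested exactly once; under that assumption 231 pairs corresponds to 22 genes, since 22 × 21 / 2 = 231. The gene labels are placeholders, not actual HHV-8 gene names.

```python
from itertools import combinations
from math import comb

n_pairs_reported = 231  # pairwise combinations reported in the abstract

# Find the gene count n for which "n choose 2" equals the reported number.
# Assumption: unordered pairs, no gene paired with itself.
n_genes = next(n for n in range(2, 100) if comb(n, 2) == n_pairs_reported)

# Enumerate the pairs explicitly with placeholder gene labels.
genes = [f"gene_{i}" for i in range(1, n_genes + 1)]
pairs = list(combinations(genes, 2))

print(n_genes, len(pairs))
```

    Enumerating the pairs with `itertools.combinations` rather than a nested loop avoids generating both (A, B) and (B, A) for the same combination.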

  5. Analysis of genomic aberrations and gene expression profiling identifies novel lesions and pathways in myeloproliferative neoplasms

    PubMed Central

    Rice, K L; Lin, X; Wolniak, K; Ebert, B L; Berkofsky-Fessler, W; Buzzai, M; Sun, Y; Xi, C; Elkin, P; Levine, R; Golub, T; Gilliland, D G; Crispino, J D; Licht, J D; Zhang, W

    2011-01-01

    Polycythemia vera (PV), essential thrombocythemia and primary myelofibrosis, are myeloproliferative neoplasms (MPNs) with distinct clinical features and are associated with the JAK2V617F mutation. To identify genomic anomalies involved in the pathogenesis of these disorders, we profiled 87 MPN patients using Affymetrix 250K single-nucleotide polymorphism (SNP) arrays. Aberrations affecting chr9 were the most frequently observed and included 9pLOH (n=16), trisomy 9 (n=6) and amplifications of 9p13.3–23.3 (n=1), 9q33.1–34.13 (n=1) and 9q34.13 (n=6). Patients with trisomy 9 were associated with elevated JAK2V617F mutant allele burden, suggesting that gain of chr9 represents an alternative mechanism for increasing JAK2V617F dosage. Gene expression profiling of patients with and without chr9 abnormalities (+9, 9pLOH), identified genes potentially involved in disease pathogenesis including JAK2, STAT5B and MAPK14. We also observed recurrent gains of 1p36.31–36.33 (n=6), 17q21.2–q21.31 (n=5) and 17q25.1–25.3 (n=5) and deletions affecting 18p11.31–11.32 (n=8). Combined SNP and gene expression analysis identified aberrations affecting components of a non-canonical PRC2 complex (EZH1, SUZ12 and JARID2) and genes comprising a ‘HSC signature’ (MLLT3, SMARCA2 and PBX1). We show that NFIB, which is amplified in 7/87 MPN patients and upregulated in PV CD34+ cells, protects cells from apoptosis induced by cytokine withdrawal. PMID:22829077

  6. Transcriptome profiling identified differentially expressed genes and pathways associated with tamoxifen resistance in human breast cancer

    PubMed Central

    Men, Xin; Ma, Jun; Wu, Tong; Pu, Junyi; Wen, Shaojia; Shen, Jianfeng; Wang, Xun; Wang, Yamin; Chen, Chao; Dai, Penggao

    2018-01-01

    Tamoxifen (TAM) resistance is an important clinical problem in the treatment of breast cancer. In order to identify the mechanism of TAM resistance for estrogen receptor (ER)-positive breast cancer, we screened the transcriptome using RNA-seq and compared the gene expression profiles between the MCF-7 mamma carcinoma cell line and the TAM-resistant cell line TAMR/MCF-7. A total of 52 significantly differentially expressed genes (DEGs) were identified, including SLIT2, ROBO, LHX, KLF, VEGFC, BAMBI, LAMA1, FLT4, PNMT, DHRS2, MAOA and ALDH. The DEGs were annotated in the GO, COG and KEGG databases. Annotation of the function of the DEGs in the KEGG database revealed the top three pathways enriched with the most DEGs: pathways in cancer, the PI3K-AKT pathway, and focal adhesion. We then compared the gene expression profiles between the clinical progressive disease (PD) and the complete response (CR) groups from The Cancer Genome Atlas (TCGA). Ten common DEGs were identified by combining the clinical and cellular analysis results. A protein-protein interaction network was applied to analyze the association of the ER signaling pathway with the 10 DEGs. Three significant genes (GFRA3, NPY1R and PTPRN2) were closely related to ER-related pathways. These significant DEGs regulated many biological activities such as cell proliferation and survival, motility and migration, and tumor cell invasion. The interactions between these DEGs and the drug resistance phenomenon need to be elucidated at a functional level in further studies. Based on our findings, we believe that these DEGs could be therapeutic targets, which can be explored to develop new treatment options. PMID:29423105

  7. Computational Gene Expression Modeling Identifies Salivary Biomarker Analysis that Predict Oral Feeding Readiness in the Newborn

    PubMed Central

    Maron, Jill L.; Hwang, Jooyeon S.; Pathak, Subash; Ruthazer, Robin; Russell, Ruby L.; Alterovitz, Gil

    2014-01-01

    Objective: To combine mathematical modeling of salivary gene expression microarray data and systems biology annotation with RT-qPCR amplification to identify (phase I) and validate (phase II) salivary biomarker analysis for the prediction of oral feeding readiness in preterm infants. Study design: Comparative whole transcriptome microarray analysis from 12 preterm newborns pre- and post-oral feeding success was used for computational modeling and systems biology analysis to identify potential salivary transcripts associated with oral feeding success (phase I). Selected gene expression biomarkers (15 from computational modeling; 6 evidence-based; and 3 reference) were evaluated by RT-qPCR amplification on 400 salivary samples from successful (n=200) and unsuccessful (n=200) oral feeders (phase II). Genes, alone and in combination, were evaluated by a multivariate analysis controlling for sex and post-conceptional age (PCA) to determine the probability that newborns achieved successful oral feeding. Results: Advancing post-conceptional age (p < 0.001) and female sex (p = 0.05) positively predicted an infant’s ability to feed orally. A combination of five genes, NPY2R (hunger signaling), AMPK (energy homeostasis), PLXNA1 (olfactory neurogenesis), NPHP4 (visual behavior) and WNT3 (facial development), in addition to PCA and sex, demonstrated good accuracy for determining feeding success (AUROC = 0.78). Conclusions: We have identified objective and biologically relevant salivary biomarkers that noninvasively assess a newborn’s developing brain, sensory and facial development as they relate to oral feeding success. Understanding the mechanisms that underlie the development of oral feeding readiness through translational and computational methods may improve clinical decision making while decreasing morbidities and health care costs. PMID:25620512

  8. Using variable rate models to identify genes under selection in sequence pairs: their validity and limitations for EST sequences.

    PubMed

    Church, Sheri A; Livingstone, Kevin; Lai, Zhao; Kozik, Alexander; Knapp, Steven J; Michelmore, Richard W; Rieseberg, Loren H

    2007-02-01

    Using likelihood-based variable selection models, we determined if positive selection was acting on 523 EST sequence pairs from two lineages of sunflower and lettuce. Variable rate models are generally not used for comparisons of sequence pairs due to the limited information and the inaccuracy of estimates of specific substitution rates. However, previous studies have shown that the likelihood ratio test (LRT) is reliable for detecting positive selection, even with low numbers of sequences. These analyses identified 56 genes that show a signature of selection, of which 75% were not identified by simpler models that average selection across codons. Subsequent mapping studies in sunflower show four of five of the positively selected genes identified by these methods mapped to domestication QTLs. We discuss the validity and limitations of using variable rate models for comparisons of sequence pairs, as well as the limitations of using ESTs for identification of positively selected genes.
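
    The likelihood ratio test (LRT) named in the abstract above compares the maximized log-likelihoods of a null model and a nested alternative. The sketch below shows only the generic calculation, assuming the usual asymptotic chi-square approximation; the abstract does not say which models were compared or how many degrees of freedom were used, so the log-likelihood values and `df` here are hypothetical. Closed-form chi-square tail probabilities are used for 1 and 2 degrees of freedom so that no external library is needed.

```python
import math


def lrt_p_value(lnl_null, lnl_alt, df):
    """Return (statistic, p_value) for a likelihood ratio test.

    statistic = 2 * (lnL_alt - lnL_null), compared to a chi-square
    distribution with `df` degrees of freedom (asymptotic approximation).
    Only df = 1 and df = 2 are supported in this sketch.
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)  # clamp small negative noise
    if df == 1:
        # Chi-square(1) survival function: P(Z^2 > x) = erfc(sqrt(x / 2)).
        p = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        # Chi-square(2) survival function: exp(-x / 2).
        p = math.exp(-stat / 2.0)
    else:
        raise ValueError("this sketch only handles df = 1 or df = 2")
    return stat, p


# Hypothetical log-likelihoods for one sequence pair.
stat, p = lrt_p_value(lnl_null=-1234.5, lnl_alt=-1230.0, df=2)
print(stat, p)
```

    For these made-up values the statistic is 9.0 and the p-value is roughly 0.011. With only two sequences per comparison, as the abstract notes, such asymptotic p-values should be read cautiously.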

  9. Current Status and Challenges in Identifying Disease Resistance Genes in Brassica napus

    PubMed Central

    Neik, Ting Xiang; Barbetti, Martin J.; Batley, Jacqueline

    2017-01-01

    Brassica napus is an economically important crop across different continents including temperate and subtropical regions in Europe, Canada, South Asia, China and Australia. Its widespread cultivation also brings setbacks as it plays host to fungal, oomycete and chytrid pathogens that can lead to serious yield loss. For sustainable crop production, identification of resistance (R) genes in B. napus has become of critical importance. In this review, we discuss four key pathogens affecting Brassica crops: Clubroot (Plasmodiophora brassicae), Blackleg (Leptosphaeria maculans and L. biglobosa), Sclerotinia Stem Rot (Sclerotinia sclerotiorum), and Downy Mildew (Hyaloperonospora parasitica). We first review current studies covering prevalence of these pathogens on Brassica crops and highlight the R genes and QTL that have been identified from Brassica species against these pathogens. Insights into the relationships between the pathogen and its Brassica host, the unique host resistance mechanisms and how these affect resistance outcomes are also presented. We discuss challenges in identification and deployment of R genes in B. napus in relation to highly specific genetic interactions between host subpopulations and pathogen pathotypes and emphasize the need for common or shared techniques and research materials or tighter collaboration between researchers to reconcile the inconsistencies in the research outcomes. Using current genomics tools, we provide examples of how characterization and cloning of R genes in B. napus can be carried out more effectively. Lastly, we put forward strategies to breed resistant cultivars through introgressions supported by genomic approaches and suggest prospects that can be implemented in the future for a better, pathogen-resistant B. napus. PMID:29163558

  10. Transcriptomic profiling in muscle and adipose tissue identifies genes related to growth and lipid deposition

    PubMed Central

    Pang, Jianhui; Zhong, Zhijun; Chen, Xiaohui; Yang, Yuekui; Zeng, Kai; Kang, Runming; Lei, Yunfeng; Ying, Sancheng; Gong, Jianjun; Gu, Yiren

    2017-01-01

    Growth performance and meat quality are important traits for the pig industry and consumers. Adipose tissue is the main site at which fat storage and fatty acid synthesis occur. Therefore, we combined high-throughput transcriptomic sequencing in adipose and muscle tissues with the quantification of corresponding phenotypic features using seven Chinese indigenous pig breeds and one Western commercial breed (Yorkshire). We obtained data on 101 phenotypic traits, from which principal component analysis distinguished two groups: one associated with the Chinese breeds and one with Yorkshire. The numbers of differentially expressed genes between all Chinese breeds and Yorkshire were shown to be 673 and 1056 in adipose and muscle tissues, respectively. Functional enrichment analysis revealed that these genes are associated with biological functions and canonical pathways related to oxidoreductase activity, immune response, and metabolic process. Weighted gene coexpression network analysis found more coexpression modules significantly correlated with the measured phenotypic traits in adipose than in muscle, indicating that adipose regulates meat and carcass quality. Using the combination of differential expression, QTL information, gene significance, and module hub genes, we identified a large number of candidate genes potentially related to economically important traits in pig, which should help us improve meat production and quality. PMID:28877211

  11. Transcriptomic profiling in muscle and adipose tissue identifies genes related to growth and lipid deposition.

    PubMed

    Tao, Xuan; Liang, Yan; Yang, Xuemei; Pang, Jianhui; Zhong, Zhijun; Chen, Xiaohui; Yang, Yuekui; Zeng, Kai; Kang, Runming; Lei, Yunfeng; Ying, Sancheng; Gong, Jianjun; Gu, Yiren; Lv, Xuebin

    2017-01-01

    Growth performance and meat quality are important traits for the pig industry and consumers. Adipose tissue is the main site at which fat storage and fatty acid synthesis occur. Therefore, we combined high-throughput transcriptomic sequencing in adipose and muscle tissues with the quantification of corresponding phenotypic features using seven Chinese indigenous pig breeds and one Western commercial breed (Yorkshire). We obtained data on 101 phenotypic traits, from which principal component analysis distinguished two groups: one associated with the Chinese breeds and one with Yorkshire. The numbers of differentially expressed genes between all Chinese breeds and Yorkshire were shown to be 673 and 1056 in adipose and muscle tissues, respectively. Functional enrichment analysis revealed that these genes are associated with biological functions and canonical pathways related to oxidoreductase activity, immune response, and metabolic process. Weighted gene coexpression network analysis found more coexpression modules significantly correlated with the measured phenotypic traits in adipose than in muscle, indicating that adipose regulates meat and carcass quality. Using the combination of differential expression, QTL information, gene significance, and module hub genes, we identified a large number of candidate genes potentially related to economically important traits in pig, which should help us improve meat production and quality.

  12. Identifying new susceptibility genes on dopaminergic and serotonergic pathways for the framing effect in decision-making.

    PubMed

    Gao, Xiaoxue; Liu, Jinting; Gong, Pingyuan; Wang, Junhui; Fang, Wan; Yan, Hongming; Zhu, Lusha; Zhou, Xiaolin

    2017-09-01

    The framing effect refers to the tendency to be risk-averse when options are presented positively but be risk-seeking when the same options are presented negatively during decision-making. This effect has been found to be modulated by the serotonin transporter gene (SLC6A4) and the catechol-o-methyltransferase gene (COMT) polymorphisms, which are on the dopaminergic and serotonergic pathways and which are associated with affective processing. The current study aimed to identify new genetic variations of genes on dopaminergic and serotonergic pathways that may contribute to individual differences in the susceptibility to framing. Using genome-wide association data and the gene-based principal components regression method, we examined genetic variations of 26 genes on the pathways in 1317 Chinese Han participants. Consistent with previous studies, we found that the genetic variations of the SLC6A4 gene and the COMT gene were associated with the framing effect. More importantly, we demonstrated that the genetic variations of the aromatic-L-amino-acid decarboxylase (DDC) gene, which is involved in the synthesis of both dopamine and serotonin, contributed to individual differences in the susceptibility to framing. Our findings shed light on the understanding of the genetic basis of affective decision-making. © The Author (2017). Published by Oxford University Press.

  13. Identifying new susceptibility genes on dopaminergic and serotonergic pathways for the framing effect in decision-making

    PubMed Central

    Gao, Xiaoxue; Liu, Jinting; Gong, Pingyuan; Wang, Junhui; Fang, Wan; Yan, Hongming; Zhu, Lusha

    2017-01-01

    The framing effect refers to the tendency to be risk-averse when options are presented positively but be risk-seeking when the same options are presented negatively during decision-making. This effect has been found to be modulated by the serotonin transporter gene (SLC6A4) and the catechol-o-methyltransferase gene (COMT) polymorphisms, which are on the dopaminergic and serotonergic pathways and which are associated with affective processing. The current study aimed to identify new genetic variations of genes on dopaminergic and serotonergic pathways that may contribute to individual differences in the susceptibility to framing. Using genome-wide association data and the gene-based principal components regression method, we examined genetic variations of 26 genes on the pathways in 1317 Chinese Han participants. Consistent with previous studies, we found that the genetic variations of the SLC6A4 gene and the COMT gene were associated with the framing effect. More importantly, we demonstrated that the genetic variations of the aromatic-L-amino-acid decarboxylase (DDC) gene, which is involved in the synthesis of both dopamine and serotonin, contributed to individual differences in the susceptibility to framing. Our findings shed light on the understanding of the genetic basis of affective decision-making. PMID:28431168

  14. Examination of tetrahydrobiopterin pathway genes in autism.

    PubMed

    Schnetz-Boutaud, N C; Anderson, B M; Brown, K D; Wright, H H; Abramson, R K; Cuccaro, M L; Gilbert, J R; Pericak-Vance, M A; Haines, J L

    2009-11-01

    Autism is a complex disorder with a high degree of heritability and significant phenotypic and genotypic heterogeneity. Although candidate gene studies and genome-wide screens have failed to identify major causal loci associated with autism, numerous studies have proposed association with several variations in genes in the dopaminergic and serotonergic pathways. Because tetrahydrobiopterin (BH4) is the essential cofactor in the synthesis of these two neurotransmitters, we genotyped 25 SNPs in nine genes of the BH4 pathway in a total of 403 families. Significant nominal association was detected in the gene for 6-pyruvoyl-tetrahydropterin synthase, PTS (chromosome 11), with P = 0.009; this result was not restricted to an affected male-only subset. Multilocus interaction was detected in the BH4 pathway alone, but not across the serotonin, dopamine and BH4 pathways.

  15. Identifying genetic loci affecting antidepressant drug response in depression using drug–gene interaction models

    PubMed Central

    Noordam, Raymond; Avery, Christy L; Visser, Loes E; Stricker, Bruno H

    2016-01-01

    Antidepressants are often only moderately successful in decreasing the severity of depressive symptoms. In part, antidepressant treatment response in patients with depression is genetically determined. However, although a large number of studies have been conducted aiming to identify genetic variants associated with antidepressant drug response in depression, only a few variants have been repeatedly identified. Within the present review, we will discuss the methodological challenges and limitations of the studies that have been conducted on this topic to date (e.g., ‘treated-only design’, statistical power) and we will discuss how specifically drug–gene interaction models can be used to be better able to identify genetic variants associated with antidepressant drug response in depression. PMID:27248517

  16. Gene panel sequencing improves the diagnostic work-up of patients with idiopathic erythrocytosis and identifies new mutations

    PubMed Central

    Camps, Carme; Petousi, Nayia; Bento, Celeste; Cario, Holger; Copley, Richard R.; McMullin, Mary Frances; van Wijk, Richard; Ratcliffe, Peter J.; Robbins, Peter A.; Taylor, Jenny C.

    2016-01-01

    Erythrocytosis is a rare disorder characterized by increased red cell mass and elevated hemoglobin concentration and hematocrit. Several genetic variants have been identified as causes for erythrocytosis in genes belonging to different pathways including oxygen sensing, erythropoiesis and oxygen transport. However, despite clinical investigation and screening for these mutations, the cause of disease cannot be found in a considerable number of patients, who are classified as having idiopathic erythrocytosis. In this study, we developed a targeted next-generation sequencing panel encompassing the exonic regions of 21 genes from relevant pathways (~79 Kb) and sequenced 125 patients with idiopathic erythrocytosis. The panel effectively screened 97% of coding regions of these genes, with an average coverage of 450×. It identified 51 different rare variants, all leading to alterations of protein sequence, with 57 out of 125 cases (45.6%) having at least one of these variants. Ten of these were known erythrocytosis-causing variants, which had been missed following existing diagnostic algorithms. Twenty-two were novel variants in erythrocytosis-associated genes (EGLN1, EPAS1, VHL, BPGM, JAK2, SH2B3) and in novel genes included in the panel (e.g. EPO, EGLN2, HIF3A, OS9), some with a high likelihood of functionality, for which future segregation, functional and replication studies will be useful to provide further evidence for causality. The rest were classified as polymorphisms. Overall, these results demonstrate the benefits of using a gene panel rather than existing methods in which focused genetic screening is performed depending on biochemical measurements: the gene panel improves diagnostic accuracy and provides the opportunity for discovery of novel variants. PMID:27651169
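    The variant counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated above; the variable names are illustrative, and the polymorphism count is derived here as "the rest" rather than quoted from the record.

```python
# Figures quoted in the abstract above; names are illustrative only.
total_rare_variants = 51     # distinct rare protein-altering variants
known_causal = 10            # known erythrocytosis-causing variants
novel_candidates = 22        # novel variants in panel genes
patients = 125               # sequenced idiopathic erythrocytosis cases
cases_with_variant = 57      # cases carrying at least one rare variant

# "The rest were classified as polymorphisms": derived, not quoted.
polymorphisms = total_rare_variants - known_causal - novel_candidates

# Share of cases with at least one of the 51 variants.
share_of_cases = cases_with_variant / patients

print(f"polymorphisms: {polymorphisms}")
print(f"cases with a rare variant: {share_of_cases:.1%}")
```

    The derived share matches the 45.6% stated in the abstract.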

  17. Gene panel sequencing improves the diagnostic work-up of patients with idiopathic erythrocytosis and identifies new mutations.

    PubMed

    Camps, Carme; Petousi, Nayia; Bento, Celeste; Cario, Holger; Copley, Richard R; McMullin, Mary Frances; van Wijk, Richard; Ratcliffe, Peter J; Robbins, Peter A; Taylor, Jenny C

    2016-11-01

    Erythrocytosis is a rare disorder characterized by increased red cell mass and elevated hemoglobin concentration and hematocrit. Several genetic variants have been identified as causes for erythrocytosis in genes belonging to different pathways including oxygen sensing, erythropoiesis and oxygen transport. However, despite clinical investigation and screening for these mutations, the cause of disease cannot be found in a considerable number of patients, who are classified as having idiopathic erythrocytosis. In this study, we developed a targeted next-generation sequencing panel encompassing the exonic regions of 21 genes from relevant pathways (~79 Kb) and sequenced 125 patients with idiopathic erythrocytosis. The panel effectively screened 97% of coding regions of these genes, with an average coverage of 450×. It identified 51 different rare variants, all leading to alterations of protein sequence, with 57 out of 125 cases (45.6%) having at least one of these variants. Ten of these were known erythrocytosis-causing variants, which had been missed following existing diagnostic algorithms. Twenty-two were novel variants in erythrocytosis-associated genes (EGLN1, EPAS1, VHL, BPGM, JAK2, SH2B3) and in novel genes included in the panel (e.g. EPO, EGLN2, HIF3A, OS9), some with a high likelihood of functionality, for which future segregation, functional and replication studies will be useful to provide further evidence for causality. The rest were classified as polymorphisms. Overall, these results demonstrate the benefits of using a gene panel rather than existing methods in which focused genetic screening is performed depending on biochemical measurements: the gene panel improves diagnostic accuracy and provides the opportunity for discovery of novel variants. Copyright© Ferrata Storti Foundation.

  18. Comparative Analysis of the Full Genome of Helicobacter pylori Isolate Sahul64 Identifies Genes of High Divergence

    PubMed Central

    Lu, Wei; Wise, Michael J.; Tay, Chin Yen; Windsor, Helen M.; Marshall, Barry J.; Peacock, Christopher

    2014-01-01

    Isolates of Helicobacter pylori can be classified phylogeographically. High genetic diversity and rapid microevolution are a hallmark of H. pylori genomes, a phenomenon that is proposed to play a functional role in persistence and colonization of diverse human populations. To provide further genomic evidence in the lineage of H. pylori and to further characterize diverse strains of this pathogen in different human populations, we report the finished genome sequence of Sahul64, an H. pylori strain isolated from an indigenous Australian. Our analysis identified genes that were highly divergent compared to the 38 publicly available genomes, which include genes involved in the biosynthesis and modification of lipopolysaccharide, putative prophage genes, restriction modification components, and hypothetical genes. Furthermore, the virulence-associated vacA locus is a pseudogene and the cag pathogenicity island (cagPAI) is not present. However, the genome does contain a gene cluster associated with pathogenicity, including dupA. Our analysis found that with the addition of Sahul64 to the 38 genomes, the core genome content of H. pylori is reduced by approximately 14% (∼170 genes) and the pan-genome has expanded from 2,070 to 2,238 genes. We have identified three putative horizontally acquired regions, including one that is likely to have been acquired from the closely related Helicobacter cetorum prior to speciation. Our results suggest that Sahul64, with the absence of cagPAI, highly divergent cell envelope proteins, and a predicted nontransportable VacA protein, could be more highly adapted to ancient indigenous Australian people but with lower virulence potential compared to other sequenced and cagPAI-positive H. pylori strains. PMID:24375107
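    The core-genome and pan-genome figures above follow from two set operations: the core genome is the intersection of the gene sets of all strains, and the pan-genome is their union. The sketch below illustrates this on invented toy data; the strain and gene names are hypothetical, and a real analysis would first have to cluster orthologous genes, which is not shown.

```python
def core_genome(genomes):
    """Genes present in every genome (set intersection)."""
    return set.intersection(*genomes.values())


def pan_genome(genomes):
    """Genes present in at least one genome (set union)."""
    return set.union(*genomes.values())


# Hypothetical toy gene sets, not real H. pylori data.
genomes = {
    "strain_A": {"g1", "g2", "g3", "g4"},
    "strain_B": {"g1", "g2", "g3", "g5"},
}
core_before = core_genome(genomes)
pan_before = pan_genome(genomes)

# Adding a divergent strain can only shrink the core and grow the pan-genome.
genomes["divergent_strain"] = {"g1", "g2", "g6", "g7"}
core_after = core_genome(genomes)
pan_after = pan_genome(genomes)

print(len(core_before), "->", len(core_after), "core genes")
print(len(pan_before), "->", len(pan_after), "pan-genome genes")

# The pan-genome growth reported in the abstract, as plain arithmetic.
reported_new_genes = 2238 - 2070
print("genes added to the reported pan-genome:", reported_new_genes)
```

    In the toy data the divergent strain lacks one formerly core gene and contributes two new ones, mirroring the direction of the changes reported for Sahul64.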

  19. Comparative analysis of the full genome of Helicobacter pylori isolate Sahul64 identifies genes of high divergence.

    PubMed

    Lu, Wei; Wise, Michael J; Tay, Chin Yen; Windsor, Helen M; Marshall, Barry J; Peacock, Christopher; Perkins, Tim

    2014-03-01

    Isolates of Helicobacter pylori can be classified phylogeographically. High genetic diversity and rapid microevolution are a hallmark of H. pylori genomes, a phenomenon that is proposed to play a functional role in persistence and colonization of diverse human populations. To provide further genomic evidence in the lineage of H. pylori and to further characterize diverse strains of this pathogen in different human populations, we report the finished genome sequence of Sahul64, an H. pylori strain isolated from an indigenous Australian. Our analysis identified genes that were highly divergent compared to the 38 publicly available genomes, which include genes involved in the biosynthesis and modification of lipopolysaccharide, putative prophage genes, restriction modification components, and hypothetical genes. Furthermore, the virulence-associated vacA locus is a pseudogene and the cag pathogenicity island (cagPAI) is not present. However, the genome does contain a gene cluster associated with pathogenicity, including dupA. Our analysis found that with the addition of Sahul64 to the 38 genomes, the core genome content of H. pylori is reduced by approximately 14% (∼170 genes) and the pan-genome has expanded from 2,070 to 2,238 genes. We have identified three putative horizontally acquired regions, including one that is likely to have been acquired from the closely related Helicobacter cetorum prior to speciation. Our results suggest that Sahul64, with the absence of cagPAI, highly divergent cell envelope proteins, and a predicted nontransportable VacA protein, could be more highly adapted to ancient indigenous Australian people but with lower virulence potential compared to other sequenced and cagPAI-positive H. pylori strains.

  20. DNMT1-interacting RNAs block gene specific DNA methylation

    PubMed Central

    Di Ruscio, Annalisa; Ebralidze, Alexander K.; Benoukraf, Touati; Amabile, Giovanni; Goff, Loyal A.; Terragni, Joylon; Figueroa, Maria Eugenia; De Figureido Pontes, Lorena Lobo; Alberich-Jorda, Meritxell; Zhang, Pu; Wu, Mengchu; D’Alò, Francesco; Melnick, Ari; Leone, Giuseppe; Ebralidze, Konstantin K.; Pradhan, Sriharsa; Rinn, John L.; Tenen, Daniel G.

    2013-01-01

    DNA methylation was described almost a century ago. However, the rules governing its establishment and maintenance remain elusive. Here, we present data demonstrating that active transcription regulates levels of genomic methylation. We identified a novel RNA arising from the CEBPA gene locus critical in regulating the local DNA methylation profile. This RNA binds to DNMT1 and prevents CEBPA gene locus methylation. Deep sequencing of transcripts associated with DNMT1 combined with genome-scale methylation and expression profiling extended the generality of this finding to numerous gene loci. Collectively, these results delineate the nature of DNMT1-RNA interactions and suggest strategies for gene selective demethylation of therapeutic targets in disease. PMID:24107992

  1. Comparison of genome-wide selection strategies to identify furfural tolerance genes in Escherichia coli.

    PubMed

    Glebes, Tirzah Y; Sandoval, Nicholas R; Gillis, Jacob H; Gill, Ryan T

    2015-01-01

    Engineering both feedstock and product tolerance is important for transitioning towards next-generation biofuels derived from renewable sources. Tolerance to chemical inhibitors typically results in complex phenotypes, for which multiple genetic changes must often be made to confer tolerance. Here, we performed a genome-wide search for furfural-tolerant alleles using the TRackable Multiplex Recombineering (TRMR) method (Warner et al. (2010), Nature Biotechnology), which uses chromosomally integrated mutations directed towards increased or decreased expression of virtually every gene in Escherichia coli. We employed various growth selection strategies to assess the role of selection design towards growth enrichments. We also compared genes with increased fitness from our TRMR selection to those from a previously reported genome-wide identification study of furfural tolerance genes using a plasmid-based genomic library approach (Glebes et al. (2014) PLOS ONE). In several cases, growth improvements were observed for the chromosomally integrated promoter/RBS mutations but not for the plasmid-based overexpression constructs. Through this assessment, four novel tolerance genes, ahpC, yhjH, rna, and dicA, were identified and confirmed for their effect on improving growth in the presence of furfural. © 2014 Wiley Periodicals, Inc.

  2. Pla2g12b and Hpn Are Genes Identified by Mouse ENU Mutagenesis That Affect HDL Cholesterol

    PubMed Central

    Aljakna, Aleksandra; Choi, Seungbum; Savage, Holly; Hageman Blair, Rachael; Gu, Tongjun; Svenson, Karen L.; Churchill, Gary A.; Hibbs, Matt; Korstanje, Ron

    2012-01-01

    Despite considerable progress in understanding genes that affect the HDL particle, its function, and cholesterol content, genes identified to date explain only a small percentage of the genetic variation. We used N-ethyl-N-nitrosourea mutagenesis in mice to discover novel genes that affect HDL cholesterol levels. Two mutant lines (Hlb218 and Hlb320) with low HDL cholesterol levels were established. Causal mutations in these lines were mapped using linkage analysis: for line Hlb218 within a 12 Mbp region on Chr 10; and for line Hlb320 within a 21 Mbp region on Chr 7. High-throughput sequencing of Hlb218 liver RNA identified a mutation in Pla2g12b. The transition of G to A leads to a cysteine to tyrosine change and most likely causes a loss of a disulfide bridge. Microarray analysis of Hlb320 liver RNA showed a 7-fold downregulation of Hpn; sequencing identified a mutation in the 3′ splice site of exon 8. Northern blot confirmed lower mRNA expression level in Hlb320 and did not show a difference in splicing, suggesting that the mutation only affects the splicing rate. In addition to affecting HDL cholesterol, the mutated genes also lead to reduction in serum non-HDL cholesterol and triglyceride levels. Despite low HDL cholesterol levels, the mice from both mutant lines show similar atherosclerotic lesion sizes compared to control mice. These new mutant mouse models are valuable tools to further study the role of these genes, their effect on HDL cholesterol levels, and metabolism. PMID:22912808

  3. Genetic screening of non-classic CAH females with hyperandrogenemia identifies a novel CYP11B1 gene mutation.

    PubMed

    Shammas, Christos; Byrou, Stefania; Phelan, Marie M; Toumba, Meropi; Stylianou, Charilaos; Skordis, Nicos; Neocleous, Vassos; Phylactou, Leonidas A

    2016-04-01

    Congenital adrenal hyperplasia (CAH) is an endocrine autosomal recessive disorder with various symptoms of diverse severity. Mild hyperandrogenemia is the most common clinical feature in non-classic CAH patients and 95% of the cases are identified by mutations in the CYP21A2 gene. In the present study, the second most common cause for non-classic CAH (NC-CAH), 11β-hydroxylase deficiency due to mutations in the CYP11B1 gene, is investigated. Screening of the CYP21A2 and CYP11B1 genes by direct sequencing was carried out for the detection of possible genetic defects in patients with suspected CAH. It was observed that CYP11B1 variants co-exist only in rare cases along with mutations in CYP21A2 in patients clinically diagnosed with CAH. A total of 23 NC-CAH female patients out of 75 were identified with only one mutation in the CYP21A2 gene. The novel CYP11B1 gene mutation, p.Val484Asp, was identified in a patient with CAH in the heterozygous state. The structural characterization of the novel p.Val484Asp was found to likely cause distortion of the surrounding beta sheet and indirect destabilization of the cavity that occurs on the opposite face of the structural elements, leading to partial impairment of the enzymatic activity. CYP21A2 gene mutations are the most frequent genetic defects in cases of NC-CAH even when these patients are in the heterozygous state. These mutations have a diverse phenotype giving rise to a variable extent of cortisol synthesis impairment; it is also clear that CYP11B1 mutants are a rare type of defects causing CAH.

  4. Integrating Genetic and Gene Co-expression Analysis Identifies Gene Networks Involved in Alcohol and Stress Responses

    PubMed Central

    Luo, Jie; Xu, Pei; Cao, Peijian; Wan, Hongjian; Lv, Xiaonan; Xu, Shengchun; Wang, Gangjun; Cook, Melloni N.; Jones, Byron C.; Lu, Lu; Wang, Xusheng

    2018-01-01

    Although the link between stress and alcohol is well recognized, the underlying mechanisms of how they interplay at the molecular level remain unclear. The purpose of this study is to identify molecular networks underlying the effects of alcohol and stress responses, as well as their interaction on anxiety behaviors in the hippocampus of mice using a systems genetics approach. Here, we applied a gene co-expression network approach to transcriptomes of 41 BXD mouse strains under four conditions: stress, alcohol, stress-induced alcohol and control. The co-expression analysis identified 14 modules and characterized four expression patterns across the four conditions. The four expression patterns include up-regulation in no restraint stress and given an ethanol injection (NOE) but restoration in restraint stress followed by an ethanol injection (RSE; pattern 1), down-regulation in NOE but rescue in RSE (pattern 2), up-regulation in both restraint stress followed by a saline injection (RSS) and NOE, and further amplification in RSE (pattern 3), and up-regulation in RSS but reduction in both NOE and RSE (pattern 4). We further identified four functional subnetworks by superimposing protein-protein interactions (PPIs) to the 14 co-expression modules, including γ-aminobutyric acid receptor (GABA) signaling, glutamate signaling, neuropeptide signaling, cAMP-dependent signaling. We further performed module specificity analysis to identify modules that are specific to stress, alcohol, or stress-induced alcohol responses. Finally, we conducted causality analysis to link genetic variation to these identified modules, and anxiety behaviors after stress and alcohol treatments. This study underscores the importance of integrative analysis and offers new insights into the molecular networks underlying stress and alcohol responses. PMID:29674951
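    The co-expression step described above rests on a simple idea: genes whose expression profiles are strongly correlated across samples are linked in a network. The sketch below shows that idea with a hard correlation cutoff on invented data. It is a simplification under stated assumptions, not the pipeline used in the study; the gene names, expression values, and the 0.9 threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical expression values for three genes across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 1.0, 3.5, 2.0],
}

THRESHOLD = 0.9  # assumed cutoff; real analyses choose this more carefully

# Keep an edge between two genes when |r| meets the cutoff.
edges = [
    (g1, g2)
    for g1, g2 in combinations(sorted(expression), 2)
    if abs(pearson(expression[g1], expression[g2])) >= THRESHOLD
]
print(edges)
```

    Modules would then be groups of densely connected genes; methods in common use typically replace the hard cutoff with a weighted adjacency and cluster the result.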

  5. Novel Mutations in the ZEB1 Gene Identified in Czech and British Patients With Posterior Polymorphous Corneal Dystrophy

    PubMed Central

    Liskova, Petra; Tuft, Stephen J.; Gwilliam, Rhian; Ebenezer, Neil D.; Jirsova, Katerina; Prescott, Quincy; Martincova, Radka; Pretorius, Marike; Sinclair, Neil; Boase, David L.; Jeffrey, Margaret J.; Deloukas, Panos; Hardcastle, Alison J.; Filipec, Martin; Bhattacharya, Shomi S.

    2009-01-01

    We describe the search for mutations in six unrelated Czech and four unrelated British families with posterior polymorphous corneal dystrophy (PPCD), a relatively rare eye disorder. Coding exons and intron/exon boundaries of all three genes (VSX1, COL8A2, and ZEB1/TCF8) previously reported to be implicated in the pathogenesis of this disorder were screened by DNA sequencing. Four novel pathogenic mutations were identified in four families: two deletions, one nonsense, and one duplication within exon 7 in the ZEB1 gene located at 10p11.2. We also genotyped the Czech patients to test for a founder haplotype and lack of disease segregation with the 20p11.2 locus we previously described. Although a systematic clinical examination was not performed, our investigation does not support an association between ZEB1 changes and self-reported non-ocular anomalies. In the remaining six families no disease-causing mutations were identified, indicating that as yet unidentified gene(s) are likely to be responsible for PPCD. PMID:17437275

  6. Molecular assays in detecting EGFR gene aberrations: an updated HER2-dependent algorithm for interpreting gene signals; a short technical report.

    PubMed

    Tsiambas, Evangelos; Ragos, Vasileios; Lefas, Alicia Y; Georgiannos, Stavros N; Rigopoulos, Dimitrios N; Georgakopoulos, Georgios; Stamatelopoulos, Athanasios; Grapsa, Dimitra; Syrigos, Konstantinos

    2016-01-01

    Purpose: Among oncogenes that have already been identified and cloned, Epidermal Growth Factor Receptor (EGFR) remains one of the most significant. Understanding its deregulation mechanisms critically improves patient selection for personalized therapies based on modern molecular biology and oncology guidelines. Anti-EGFR targeted therapeutic strategies have been developed based on specific genetic profiles and applied in subgroups of patients suffering from solid cancers of different histogenetic origin. Detection of specific EGFR somatic mutations leads to tyrosine kinase inhibitors (TKIs) application in subsets of them. Concerning EGFR gene numerical imbalances, identification of pure gene amplification is critical for targeting the molecule via monoclonal antibodies (mAbs). In the current technical paper we demonstrate the main molecular methods applied in EGFR analyses focused also on new data in interpreting numerical imbalances based on ASCO/ACAP guidelines for HER2 in situ hybridization (ISH) clarifications.

  7. Large-scale functional RNAi screen in C. elegans identifies genes that regulate the dysfunction of mutant polyglutamine neurons

    PubMed Central

    2012-01-01

    Background: A central goal in Huntington's disease (HD) research is to identify and prioritize candidate targets for neuroprotective intervention, which requires genome-scale information on the modifiers of early-stage neuron injury in HD. Results: Here, we performed a large-scale RNA interference screen in C. elegans strains that express N-terminal huntingtin (htt) in touch receptor neurons. These neurons control the response to light touch. Their function is strongly impaired by expanded polyglutamines (128Q) as shown by the nearly complete loss of touch response in adult animals, providing an in vivo model in which to manipulate the early phases of expanded-polyQ neurotoxicity. In total, 6034 genes were examined, revealing 662 gene inactivations that either reduce or aggravate defective touch response in 128Q animals. Several genes were previously implicated in HD or neurodegenerative disease, suggesting that this screen has effectively identified candidate targets for HD. Network-based analysis emphasized a subset of high-confidence modifier genes in pathways of interest in HD including metabolic, neurodevelopmental and pro-survival pathways. Finally, 49 modifiers of 128Q-neuron dysfunction that are dysregulated in the striatum of either R/2 or CHL2 HD mice, or both, were identified. Conclusions: Collectively, these results highlight the relevance to HD pathogenesis, providing novel information on the potential therapeutic targets for neuroprotection in HD. PMID:22413862

  8. Large-scale functional RNAi screen in C. elegans identifies genes that regulate the dysfunction of mutant polyglutamine neurons.

    PubMed

    Lejeune, François-Xavier; Mesrob, Lilia; Parmentier, Frédéric; Bicep, Cedric; Vazquez-Manrique, Rafael P; Parker, J Alex; Vert, Jean-Philippe; Tourette, Cendrine; Neri, Christian

    2012-03-13

    A central goal in Huntington's disease (HD) research is to identify and prioritize candidate targets for neuroprotective intervention, which requires genome-scale information on the modifiers of early-stage neuron injury in HD. Here, we performed a large-scale RNA interference screen in C. elegans strains that express N-terminal huntingtin (htt) in touch receptor neurons. These neurons control the response to light touch. Their function is strongly impaired by expanded polyglutamines (128Q) as shown by the nearly complete loss of touch response in adult animals, providing an in vivo model in which to manipulate the early phases of expanded-polyQ neurotoxicity. In total, 6034 genes were examined, revealing 662 gene inactivations that either reduce or aggravate defective touch response in 128Q animals. Several genes were previously implicated in HD or neurodegenerative disease, suggesting that this screen has effectively identified candidate targets for HD. Network-based analysis emphasized a subset of high-confidence modifier genes in pathways of interest in HD including metabolic, neurodevelopmental and pro-survival pathways. Finally, 49 modifiers of 128Q-neuron dysfunction that are dysregulated in the striatum of either R/2 or CHL2 HD mice, or both, were identified. Collectively, these results highlight the relevance to HD pathogenesis, providing novel information on the potential therapeutic targets for neuroprotection in HD. © 2012 Lejeune et al; licensee BioMed Central Ltd.

  9. An automated procedure to identify biomedical articles that contain cancer-associated gene variants.

    PubMed

    McDonald, Ryan; Scott Winters, R; Ankuda, Claire K; Murphy, Joan A; Rogers, Amy E; Pereira, Fernando; Greenblatt, Marc S; White, Peter S

    2006-09-01

    The proliferation of biomedical literature makes it increasingly difficult for researchers to find and manage relevant information. However, identifying research articles containing mutation data, a requisite first step in integrating large and complex mutation data sets, is currently tedious, time-consuming and imprecise. More effective mechanisms for identifying articles containing mutation information would be beneficial both for the curation of mutation databases and for individual researchers. We developed an automated method that uses information extraction, classifier, and relevance ranking techniques to determine the likelihood of MEDLINE abstracts containing information regarding genomic variation data suitable for inclusion in mutation databases. We targeted the CDKN2A (p16) gene and the procedure for document identification currently used by CDKN2A Database curators as a measure of feasibility. A set of abstracts was manually identified from a MEDLINE search as potentially containing specific CDKN2A mutation events. A subset of these abstracts was used as a training set for a maximum entropy classifier to identify text features distinguishing "relevant" from "not relevant" abstracts. Each document was represented as a set of indicative word, word pair, and entity tagger-derived genomic variation features. When applied to a test set of 200 candidate abstracts, the classifier predicted 88 articles as being relevant; of these, 29 of 32 manuscripts in which manual curation found CDKN2A sequence variants were positively predicted. Thus, the set of potentially useful articles that a manual curator would have to review was reduced by 56%, maintaining 91% recall (sensitivity) and more than doubling precision (positive predictive value). Subsequent expansion of the training set to 494 articles yielded similar precision and recall rates, and comparison of the original and expanded trials demonstrated that the average precision improved with the larger data set.
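    The recall, precision, and workload figures quoted above can be reproduced from the four counts given in the abstract. The sketch below is a worked check only; the variable names are illustrative, and the baseline precision assumes that without the classifier a curator would review all 200 candidates.

```python
# Counts quoted in the abstract above; names are illustrative only.
candidates = 200          # abstracts in the test set
predicted_relevant = 88   # abstracts the classifier flagged as relevant
truly_relevant = 32       # abstracts where curation found CDKN2A variants
true_positives = 29       # truly relevant abstracts that were flagged

recall = true_positives / truly_relevant          # sensitivity
precision = true_positives / predicted_relevant   # positive predictive value

# Assumed baseline: reviewing every candidate with no classifier.
baseline_precision = truly_relevant / candidates

# Fraction of abstracts the curator no longer has to read.
workload_reduction = 1 - predicted_relevant / candidates

print(f"recall: {recall:.1%}")
print(f"precision: {precision:.1%} (baseline {baseline_precision:.1%})")
print(f"workload reduction: {workload_reduction:.0%}")
```

    Under that baseline assumption, precision rises from 16% to about 33%, which is consistent with the "more than doubling" stated in the abstract.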

  10. An integrative framework for Bayesian variable selection with informative priors for identifying genes and pathways.

    PubMed

    Peng, Bin; Zhu, Dianwen; Ander, Bradley P; Zhang, Xiaoshuai; Xue, Fuzhong; Sharp, Frank R; Yang, Xiaowei

    2013-01-01

    The discovery of genetic or genomic markers plays a central role in the development of personalized medicine. A notable challenge exists when dealing with the high dimensionality of the data sets, as thousands of genes or millions of genetic variants are collected on a relatively small number of subjects. Traditional gene-wise selection methods using univariate analyses have difficulty incorporating correlational, structural, or functional structures amongst the molecular measures. For microarray gene expression data, we first summarize solutions in dealing with 'large p, small n' problems, and then propose an integrative Bayesian variable selection (iBVS) framework for simultaneously identifying causal or marker genes and regulatory pathways. A novel partial least squares (PLS) g-prior for iBVS is developed to allow the incorporation of prior knowledge on gene-gene interactions or functional relationships. From the point of view of systems biology, iBVS enables users to directly target the joint effects of multiple genes and pathways in a hierarchical modeling diagram to predict disease status or phenotype. The estimated posterior selection probabilities offer probabilistic and biological interpretations. Both simulated data and a set of microarray data in predicting stroke status are used in validating the performance of iBVS in a Probit model with binary outcomes. iBVS offers a general framework for effective discovery of various molecular biomarkers by combining data-based statistics and knowledge-based priors. Guidelines on making posterior inferences, determining Bayesian significance levels, and improving computational efficiencies are also discussed.

  11. Combining suppressive subtractive hybridization and cDNA microarrays to identify dietary phosphorus-responsive genes of the rainbow trout (Oncorhynchus mykiss) kidney.

    PubMed

    Lake, Jennifer; Gravel, Catherine; Koko, Gabriel Koffi D; Robert, Claude; Vandenberg, Grant W

    2010-03-01

    Phosphorus (P)-responsive genes and how they regulate renal adaptation to phosphorus-deficient diets in animals, including fish, are not well understood. RNA abundance profiling using cDNA microarrays is an efficient approach to study nutrient-gene interactions and identify these dietary P-responsive genes. To test the hypothesis that dietary P-responsive genes are differentially expressed in fish fed varying P levels, rainbow trout were fed a practical high-P diet (R20: 0.96% P) or a low-P diet (R0: 0.38% P) for 7 weeks. The differentially-expressed genes between dietary groups were identified and compared from the kidney by combining suppressive subtractive hybridization (SSH) with cDNA microarray analysis. A number of genes were confirmed by real-time PCR, and correlated with plasma and bone P concentrations. Approximately 54 genes were identified as potential dietary P-responsive after 7 weeks on a diet deficient in P according to cDNA microarray analysis. Of 18 selected genes, 13 genes were confirmed to be P-responsive at 7 weeks by real-time PCR analysis, including: iNOS, cytochrome b, cytochrome c oxidase subunit II, alpha-globin I, beta-globin, ATP synthase, hyperosmotic protein 21, COL1A3, Nkef, NDPK, glucose phosphate isomerase 1, Na+/H+ exchange protein and GDP dissociation inhibitor 2. Many of these dietary P-responsive genes responded in a moderate way (R0/R20 ratio: <2-3 or >0.5) and in a transient manner to dietary P limitation. In summary, renal adaptation to dietary P deficiency in trout involves changes in the expression of several genes, suggesting a profile of metabolic stress, since many of these differentially-expressed candidates are associated with the cellular adaptive responses. Crown Copyright 2009. Published by Elsevier Inc. All rights reserved.

  12. Transcriptome profiling of equine vitamin E deficient neuroaxonal dystrophy identifies upregulation of liver X receptor target genes

    PubMed Central

    Finno, Carrie J.; Bordbari, Matthew H.; Valberg, Stephanie J.; Lee, David; Herron, Josi; Hines, Kelly; Monsour, Tamer; Scott, Erica; Bannasch, Danika L.; Mickelson, James; Xu, Libin

    2016-01-01

    Specific spontaneous heritable neurodegenerative diseases have been associated with lower serum and cerebrospinal fluid α-tocopherol (α-TOH) concentrations. Equine neuroaxonal dystrophy (eNAD) has similar histologic lesions to human ataxia with vitamin E deficiency caused by mutations in the α-TOH transfer protein gene (TTPA). Mutations in TTPA are not present with eNAD and the molecular basis remains unknown. Given the neuropathologic phenotypic similarity of the conditions, we assessed the molecular basis of eNAD by global transcriptome sequencing of the cervical spinal cord. Differential gene expression analysis identified 157 significantly (FDR<0.05) dysregulated transcripts within the spinal cord of eNAD-affected horses. Statistical enrichment analysis identified significant downregulation of the ionotropic and metabotropic group III glutamate receptor, synaptic vesicle trafficking and cholesterol biosynthesis pathways. Gene co-expression analysis identified one module of upregulated genes significantly associated with the eNAD phenotype that included the liver X receptor (LXR) targets CYP7A1, APOE, PLTP and ABCA1. Validation of CYP7A1 and APOE dysregulation was performed in an independent biologic group and CYP7A1 was found to be additionally upregulated in the medulla oblongata of eNAD horses. Evidence of LXR activation supports a role for modulation of oxysterol-dependent LXR transcription factor activity by tocopherols. We hypothesize that the protective role of α-TOH in eNAD may reside in its ability to prevent oxysterol accumulation and subsequent activation of the LXR in order to decrease lipid peroxidation associated neurodegeneration. PMID:27751910

  13. Gene expression profiling of prostate tissue identifies chromatin regulation as a potential link between obesity and lethal prostate cancer.

    PubMed

    Ebot, Ericka M; Gerke, Travis; Labbé, David P; Sinnott, Jennifer A; Zadra, Giorgia; Rider, Jennifer R; Tyekucheva, Svitlana; Wilson, Kathryn M; Kelly, Rachel S; Shui, Irene M; Loda, Massimo; Kantoff, Philip W; Finn, Stephen; Vander Heiden, Matthew G; Brown, Myles; Giovannucci, Edward L; Mucci, Lorelei A

    2017-11-01

    Obese men are at higher risk of advanced prostate cancer and cancer-specific mortality; however, the biology underlying this association remains unclear. This study examined gene expression profiles of prostate tissue to identify biological processes differentially expressed by obesity status and lethal prostate cancer. Gene expression profiling was performed on tumor (n = 402) and adjacent normal (n = 200) prostate tissue from participants in 2 prospective cohorts who had been diagnosed with prostate cancer from 1982 to 2005. Body mass index (BMI) was calculated from the questionnaire immediately preceding cancer diagnosis. Men were followed for metastases or prostate cancer-specific death (lethal disease) through 2011. Gene Ontology biological processes differentially expressed by BMI were identified using gene set enrichment analysis. Pathway scores were computed by averaging the signal intensities of member genes. Odds ratios (ORs) for lethal prostate cancer were estimated with logistic regression. Among 402 men, 48% were healthy weight, 31% were overweight, and 21% were very overweight/obese. Fifteen gene sets were enriched in tumor tissue, but not normal tissue, of very overweight/obese men versus healthy-weight men; 5 of these were related to chromatin modification and remodeling (false-discovery rate < 0.25). Patients with high tumor expression of chromatin-related genes had worse clinical characteristics (Gleason grade > 7, 41% vs 17%; P = 2 × 10^-4) and an increased risk of lethal disease that was independent of grade and stage (OR, 5.26; 95% confidence interval, 2.37-12.25). This study improves our understanding of the biology of aggressive prostate cancer and identifies a potential mechanistic link between obesity and prostate cancer death that warrants further study. Cancer 2017;123:4130-4138. © 2017 American Cancer Society.
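
    The abstract above states that pathway scores were computed by averaging the signal intensities of member genes. As a rough illustration only, that averaging step might look like the sketch below. The gene names, intensity values, and the function name pathway_score are hypothetical and are not taken from the study; the authors' actual pipeline is not described in this record.

```python
from statistics import mean


def pathway_score(expression, gene_set):
    """Average the signal intensities of the gene-set members found in one sample.

    expression: dict mapping gene symbol -> signal intensity (assumed already normalized)
    gene_set:   iterable of gene symbols belonging to the pathway
    """
    values = [expression[gene] for gene in gene_set if gene in expression]
    if not values:
        raise ValueError("no gene-set members found in the expression profile")
    return mean(values)


# Hypothetical intensities for a single sample (illustrative values only).
sample = {"GENE_A": 8.0, "GENE_B": 6.0, "GENE_C": 10.0, "GENE_D": 3.0}

# Hypothetical pathway membership; GENE_X is absent from the sample and is skipped.
chromatin_set = ["GENE_A", "GENE_B", "GENE_C", "GENE_X"]

score = pathway_score(sample, chromatin_set)
print(score)
```

    Skipping genes that are missing from the profile is a design choice made for this sketch; a real analysis might instead impute missing values or drop the pathway.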

  14. A gene co-expression network model identifies yield-related vicinity networks in Jatropha curcas shoot system.

    PubMed

    Govender, Nisha; Senan, Siju; Mohamed-Hussein, Zeti-Azura; Wickneswari, Ratnam

    2018-06-15

    The plant shoot system consists of reproductive organs such as inflorescences, buds and fruits, and the vegetative leaves and stems. In this study, the reproductive part of the Jatropha curcas shoot system, which includes the aerial shoots, shoots bearing the inflorescence and inflorescence, was investigated in regard to gene-to-gene interactions underpinning yield-related biological processes. An RNA-seq based sequencing of shoot tissues performed on an Illumina HiSeq 2500 platform generated 18 transcriptomes. Using the reference genome-based mapping approach, a total of 64 361 genes was identified in all samples and the data was annotated against the non-redundant database by the BLAST2GO Pro suite. After removing the outlier genes and samples, a total of 12 734 genes across 17 samples were subjected to gene co-expression network construction using petal, an R library. A gene co-expression network model built with scale-free and small-world properties extracted four vicinity networks (VNs) with putative involvement in yield-related biological processes as follows: heat stress tolerance, floral and shoot meristem differentiation, biosynthesis of chlorophyll molecules and laticifers, cell wall metabolism and epigenetic regulations. Our VNs revealed putative key players that could be adapted in breeding strategies for J. curcas shoot system improvements.

  15. Identifying gene coexpression networks underlying the dynamic regulation of wood-forming tissues in Populus under diverse environmental conditions.

    PubMed

    Zinkgraf, Matthew; Liu, Lijun; Groover, Andrew; Filkov, Vladimir

    2017-06-01

    Trees modify wood formation through integration of environmental and developmental signals in complex but poorly defined transcriptional networks, allowing trees to produce woody tissues appropriate to diverse environmental conditions. In order to identify relationships among genes expressed during wood formation, we integrated data from new and publicly available datasets in Populus. These datasets were generated from woody tissue and include transcriptome profiling, transcription factor binding, DNA accessibility and genome-wide association mapping experiments. Coexpression modules were calculated, each of which contains genes showing similar expression patterns across experimental conditions, genotypes and treatments. Conserved gene coexpression modules (four modules totaling 8398 genes) were identified that were highly preserved across diverse environmental conditions and genetic backgrounds. Functional annotations as well as correlations with specific experimental treatments associated individual conserved modules with distinct biological processes underlying wood formation, such as cell-wall biosynthesis, meristem development and epigenetic pathways. Module genes were also enriched for DNase I hypersensitivity footprints and binding from four transcription factors associated with wood formation. The conserved modules are excellent candidates for modeling core developmental pathways common to wood formation in diverse environments and genotypes, and serve as testbeds for hypothesis generation and testing for future studies. No claim to original US government works. New Phytologist © 2017 New Phytologist Trust.

  16. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python)

    PubMed Central

    Rutllant, Josep

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism's genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism's genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1) production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2) enhanced assisted reproduction technology for endangered and captive reptiles; and (3) novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value. PMID:27200191

  17. Metagenomic approaches to identify and isolate bioactive natural products from microbiota of marine sponges.

    PubMed

    Gurgui, Cristian; Piel, Jörn

    2010-01-01

    Many marine sponges harbor massive consortia of symbiotic bacteria belonging to diverse phyla. Sponges are also an unusually rich source of biologically active natural products, and evidence is accumulating that these compounds might often be synthesized by the symbionts. Since the study of sponge-associated bacteria is generally hampered by very low cultivation rates, cultivation-independent, metagenomic methods have recently been applied to sponges. These methods allow for the isolation of biosynthetic gene clusters that can ultimately be exploited to develop sustainable natural product sources by heterologous expression. However, general challenges encountered in sponge metagenomic research are the poor quality of the isolated DNA with respect to size and yield, the difficulty to identify genes of interest among numerous homologs, insufficient clone numbers in metagenomic libraries, and time-consuming screening procedures to identify and isolate rare positive clones. Here, we give an overview of methods that address these problems and can be used to streamline isolation of biosynthetic and other genes of interest.

  18. Back to the sea twice: identifying candidate plant genes for molecular evolution to marine life.

    PubMed

    Wissler, Lothar; Codoñer, Francisco M; Gu, Jenny; Reusch, Thorsten B H; Olsen, Jeanine L; Procaccini, Gabriele; Bornberg-Bauer, Erich

    2011-01-12

    Seagrasses are a polyphyletic group of monocotyledonous angiosperms that have adapted to a completely submerged lifestyle in marine waters. Here, we exploit two collections of expressed sequence tags (ESTs) of two wide-spread and ecologically important seagrass species, the Mediterranean seagrass Posidonia oceanica (L.) Delile and the eelgrass Zostera marina L., which have independently evolved from aquatic ancestors. This replicated, yet independent evolutionary history facilitates the identification of traits that may have evolved in parallel and are possible instrumental candidates for adaptation to a marine habitat. In our study, we provide the first quantitative perspective on molecular adaptations in two seagrass species. By constructing orthologous gene clusters shared between two seagrasses (Z. marina and P. oceanica) and eight distantly related terrestrial angiosperm species, 51 genes could be identified with detection of positive selection along the seagrass branches of the phylogenetic tree. Characterization of these positively selected genes using KEGG pathways and the Gene Ontology uncovered that these genes are mostly involved in translation, metabolism, and photosynthesis. These results provide first insights into which seagrass genes have diverged from their terrestrial counterparts via an initial aquatic stage characteristic of the order and to the derived fully-marine stage characteristic of seagrasses. We discuss how adaptive changes in these processes may have contributed to the evolution towards an aquatic and marine existence.

  19. Identification and characterization of amelogenin genes in monotremes, reptiles, and amphibians

    PubMed Central

    Toyosawa, Satoru; O’hUigin, Colm; Figueroa, Felipe; Tichy, Herbert; Klein, Jan

    1998-01-01

    Two features make the tooth an excellent model in the study of evolutionary innovations: the relative simplicity of its structure and the fact that the major tooth-forming genes have been identified in eutherian mammals. To understand the nature of the innovation at the molecular level, it is necessary to identify the homologs of tooth-forming genes in other vertebrates. As a first step toward this goal, homologs of the eutherian amelogenin gene have been cloned and characterized in selected species of monotremes (platypus and echidna), reptiles (caiman), and amphibians (African clawed toad). Comparisons of the homologs reveal that the amelogenin gene evolves quickly in the repeat region, in which numerous insertions and deletions have obliterated any similarity among the genes, and slowly in other regions. The gene organization, the distribution of hydrophobic and hydrophilic segments in the encoded protein, and several other features have been conserved throughout the evolution of the tetrapod amelogenin gene. Clones corresponding to one locus only were found in caiman, whereas the clawed toad possesses at least two amelogenin-encoding loci. PMID:9789040

  20. Biomphalaria glabrata transcriptome: cDNA microarray profiling identifies resistant- and susceptible-specific gene expression in haemocytes from snail strains exposed to Schistosoma mansoni

    PubMed Central

    Lockyer, Anne E; Spinks, Jenny; Kane, Richard A; Hoffmann, Karl F; Fitzpatrick, Jennifer M; Rollinson, David; Noble, Leslie R; Jones, Catherine S

    2008-01-01

    Background: Biomphalaria glabrata is an intermediate snail host for Schistosoma mansoni, one of the important schistosomes infecting man. B. glabrata/S. mansoni provides a useful model system for investigating the intimate interactions between host and parasite. Examining differential gene expression between S. mansoni-exposed schistosome-resistant and susceptible snail lines will identify genes and pathways that may be involved in snail defences. Results: We have developed a 2053 element cDNA microarray for B. glabrata containing clones from ORESTES (Open Reading frame ESTs) libraries, suppression subtractive hybridization (SSH) libraries and clones identified in previous expression studies. Snail haemocyte RNA, extracted from parasite-challenged resistant and susceptible snails, 2 to 24 h post-exposure to S. mansoni, was hybridized to the custom made cDNA microarray and 98 differentially expressed genes or gene clusters were identified, 94 resistant-associated and 4 susceptible-associated. Quantitative PCR analysis verified the cDNA microarray results for representative transcripts. Differentially expressed genes were annotated and clustered using gene ontology (GO) terminology and Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway analysis. 61% of the identified differentially expressed genes have no known function including the 4 susceptible strain-specific transcripts. Resistant strain-specific expression of genes implicated in innate immunity of invertebrates was identified, including hydrolytic enzymes such as cathepsin L, a cysteine proteinase involved in lysis of phagocytosed particles; metabolic enzymes such as ornithine decarboxylase, the rate-limiting enzyme in the production of polyamines, important in inflammation and infection processes, as well as scavenging damaging free radicals produced during production of reactive oxygen species; stress response genes such as HSP70; proteins involved in signalling, such as importin 7 and copine 1

  1. Biomphalaria glabrata transcriptome: cDNA microarray profiling identifies resistant- and susceptible-specific gene expression in haemocytes from snail strains exposed to Schistosoma mansoni.

    PubMed

    Lockyer, Anne E; Spinks, Jenny; Kane, Richard A; Hoffmann, Karl F; Fitzpatrick, Jennifer M; Rollinson, David; Noble, Leslie R; Jones, Catherine S

    2008-12-29

    Biomphalaria glabrata is an intermediate snail host for Schistosoma mansoni, one of the important schistosomes infecting man. B. glabrata/S. mansoni provides a useful model system for investigating the intimate interactions between host and parasite. Examining differential gene expression between S. mansoni-exposed schistosome-resistant and susceptible snail lines will identify genes and pathways that may be involved in snail defences. We have developed a 2053 element cDNA microarray for B. glabrata containing clones from ORESTES (Open Reading frame ESTs) libraries, suppression subtractive hybridization (SSH) libraries and clones identified in previous expression studies. Snail haemocyte RNA, extracted from parasite-challenged resistant and susceptible snails, 2 to 24 h post-exposure to S. mansoni, was hybridized to the custom made cDNA microarray and 98 differentially expressed genes or gene clusters were identified, 94 resistant-associated and 4 susceptible-associated. Quantitative PCR analysis verified the cDNA microarray results for representative transcripts. Differentially expressed genes were annotated and clustered using gene ontology (GO) terminology and Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway analysis. 61% of the identified differentially expressed genes have no known function including the 4 susceptible strain-specific transcripts. Resistant strain-specific expression of genes implicated in innate immunity of invertebrates was identified, including hydrolytic enzymes such as cathepsin L, a cysteine proteinase involved in lysis of phagocytosed particles; metabolic enzymes such as ornithine decarboxylase, the rate-limiting enzyme in the production of polyamines, important in inflammation and infection processes, as well as scavenging damaging free radicals produced during production of reactive oxygen species; stress response genes such as HSP70; proteins involved in signalling, such as importin 7 and copine 1, cytoplasmic intermediate

  2. The human cumulus-oocyte complex gene-expression profile

    PubMed Central

    Assou, Said; Anahory, Tal; Pantesco, Véronique; Le Carrour, Tanguy; Pellestor, Franck; Klein, Bernard; Reyftmann, Lionel; Dechaud, Hervé; De Vos, John; Hamamah, Samir

    2006-01-01

    BACKGROUND: The understanding of the mechanisms regulating human oocyte maturation is still rudimentary. We have identified transcripts differentially expressed between immature and mature oocytes, and cumulus cells. METHODS: Using oligonucleotide microarrays, genome-wide gene expression was studied in pooled immature and mature oocytes or cumulus cells from patients who underwent IVF. RESULTS: In addition to known genes such as DAZL, BMP15 or GDF9, oocytes upregulated 1514 genes. We show that PTTG3 and AURKC are respectively the securin and the Aurora kinase preferentially expressed during oocyte meiosis. Strikingly, oocytes overexpressed previously unreported growth factors such as TNFSF13/APRIL, FGF9, FGF14, and IL4, and transcription factors including OTX2, SOX15 and SOX30. Conversely, cumulus cells, in addition to known genes such as LHCGR or BMPR2, overexpressed cell-to-cell signaling genes including TNFSF11/RANKL, numerous complement components, semaphorins (SEMA3A, SEMA6A, SEMA6D) and CD genes such as CD200. We also identified 52 genes progressively increasing during oocyte maturation, comprising CDC25A and SOCS7. CONCLUSION: The identification of genes up and down regulated during oocyte maturation greatly improves our understanding of oocyte biology and will provide new markers that signal viable and competent oocytes. Furthermore, genes found expressed in cumulus cells are potential markers of granulosa cell tumors. PMID:16571642

  3. Developmental regulation of diacylglycerol acyltransferase family gene expression in tung tree tissues

    USDA-ARS's Scientific Manuscript database

    Diacylglycerol acyltransferases (DGAT) are responsible for the final and rate-limiting step of triacylglycerol (TAG) biosynthesis in eukaryotic organisms. DGAT genes have been identified in numerous organisms. Multiple isoforms of DGAT are present in eukaryotes, including DGAT1 and DGAT2 of tung tre...

  4. Gene-centric Meta-analysis in 87,736 Individuals of European Ancestry Identifies Multiple Blood-Pressure-Related Loci

    PubMed Central

    Tragante, Vinicius; Barnes, Michael R.; Ganesh, Santhi K.; Lanktree, Matthew B.; Guo, Wei; Franceschini, Nora; Smith, Erin N.; Johnson, Toby; Holmes, Michael V.; Padmanabhan, Sandosh; Karczewski, Konrad J.; Almoguera, Berta; Barnard, John; Baumert, Jens; Chang, Yen-Pei Christy; Elbers, Clara C.; Farrall, Martin; Fischer, Mary E.; Gaunt, Tom R.; Gho, Johannes M.I.H.; Gieger, Christian; Goel, Anuj; Gong, Yan; Isaacs, Aaron; Kleber, Marcus E.; Leach, Irene Mateo; McDonough, Caitrin W.; Meijs, Matthijs F.L.; Melander, Olle; Nelson, Christopher P.; Nolte, Ilja M.; Pankratz, Nathan; Price, Tom S.; Shaffer, Jonathan; Shah, Sonia; Tomaszewski, Maciej; van der Most, Peter J.; Van Iperen, Erik P.A.; Vonk, Judith M.; Witkowska, Kate; Wong, Caroline O.L.; Zhang, Li; Beitelshees, Amber L.; Berenson, Gerald S.; Bhatt, Deepak L.; Brown, Morris; Burt, Amber; Cooper-DeHoff, Rhonda M.; Connell, John M.; Cruickshanks, Karen J.; Curtis, Sean P.; Davey-Smith, George; Delles, Christian; Gansevoort, Ron T.; Guo, Xiuqing; Haiqing, Shen; Hastie, Claire E.; Hofker, Marten H.; Hovingh, G. Kees; Kim, Daniel S.; Kirkland, Susan A.; Klein, Barbara E.; Klein, Ronald; Li, Yun R.; Maiwald, Steffi; Newton-Cheh, Christopher; O’Brien, Eoin T.; Onland-Moret, N. Charlotte; Palmas, Walter; Parsa, Afshin; Penninx, Brenda W.; Pettinger, Mary; Vasan, Ramachandran S.; Ranchalis, Jane E.; M Ridker, Paul; Rose, Lynda M.; Sever, Peter; Shimbo, Daichi; Steele, Laura; Stolk, Ronald P.; Thorand, Barbara; Trip, Mieke D.; van Duijn, Cornelia M.; Verschuren, W. Monique; Wijmenga, Cisca; Wyatt, Sharon; Young, J. Hunter; Zwinderman, Aeilko H.; Bezzina, Connie R.; Boerwinkle, Eric; Casas, Juan P.; Caulfield, Mark J.; Chakravarti, Aravinda; Chasman, Daniel I.; Davidson, Karina W.; Doevendans, Pieter A.; Dominiczak, Anna F.; FitzGerald, Garret A.; Gums, John G.; Fornage, Myriam; Hakonarson, Hakon; Halder, Indrani; Hillege, Hans L.; Illig, Thomas; Jarvik, Gail P.; Johnson, Julie A.; Kastelein, John J.P.; Koenig, Wolfgang; Kumari, Meena; März, Winfried; Murray, Sarah S.; O’Connell, Jeffery R.; Oldehinkel, Albertine J.; Pankow, James S.; Rader, Daniel J.; Redline, Susan; Reilly, Muredach P.; Schadt, Eric E.; Kottke-Marchant, Kandice; Snieder, Harold; Snyder, Michael; Stanton, Alice V.; Tobin, Martin D.; Uitterlinden, André G.; van der Harst, Pim; van der Schouw, Yvonne T.; Samani, Nilesh J.; Watkins, Hugh; Johnson, Andrew D.; Reiner, Alex P.; Zhu, Xiaofeng; de Bakker, Paul I.W.; Levy, Daniel; Asselbergs, Folkert W.; Munroe, Patricia B.; Keating, Brendan J.

    2014-01-01

    Blood pressure (BP) is a heritable risk factor for cardiovascular disease. To investigate genetic associations with systolic BP (SBP), diastolic BP (DBP), mean arterial pressure (MAP), and pulse pressure (PP), we genotyped ∼50,000 SNPs in up to 87,736 individuals of European ancestry and combined these in a meta-analysis. We replicated findings in an independent set of 68,368 individuals of European ancestry. Our analyses identified 11 previously undescribed associations in independent loci containing 31 genes including PDE1A, HLA-DQB1, CDK6, PRKAG2, VCL, H19, NUCB2, RELA, HOXC@ complex, FBN1, and NFAT5 at the Bonferroni-corrected array-wide significance threshold (p < 6 × 10^-7) and confirmed 27 previously reported associations. Bioinformatic analysis of the 11 loci provided support for a putative role in hypertension of several genes, such as CDK6 and NUCB2. Analysis of potential pharmacological targets in databases of small molecules showed that ten of the genes are predicted to be a target for small molecules. In summary, we identified previously unknown loci associated with BP. Our findings extend our understanding of genes involved in BP regulation, which may provide new targets for therapeutic intervention or drug response stratification. PMID:24560520

  5. Pathway-based analysis of GWAs data identifies association of sex determination genes with susceptibility to testicular germ cell tumors.

    PubMed

    Koster, Roelof; Mitra, Nandita; D'Andrea, Kurt; Vardhanabhuti, Saran; Chung, Charles C; Wang, Zhaoming; Loren Erickson, R; Vaughn, David J; Litchfield, Kevin; Rahman, Nazneen; Greene, Mark H; McGlynn, Katherine A; Turnbull, Clare; Chanock, Stephen J; Nathanson, Katherine L; Kanetsky, Peter A

    2014-11-15

    Genome-wide association (GWA) studies of testicular germ cell tumor (TGCT) have identified 18 susceptibility loci, some containing genes encoding proteins important in male germ cell development. Deletions of one of these genes, DMRT1, lead to male-to-female sex reversal and are associated with development of gonadoblastoma. To further explore genetic association with TGCT, we undertook a pathway-based analysis of SNP marker associations in the Penn GWAs (349 TGCT cases and 919 controls). We analyzed a custom-built sex determination gene set consisting of 32 genes using three different methods of pathway-based analysis. The sex determination gene set ranked highly compared with canonical gene sets, and it was associated with TGCT (FDRG = 2.28 × 10^-5, FDRM = 0.014 and FDRI = 0.008 for Gene Set Analysis-SNP (GSA-SNP), Meta-Analysis Gene Set Enrichment of Variant Associations (MAGENTA) and Improved Gene Set Enrichment Analysis for Genome-wide Association Study (i-GSEA4GWAS) analysis, respectively). The association remained after removal of DMRT1 from the gene set (FDRG = 0.0002, FDRM = 0.055 and FDRI = 0.009). Using data from the NCI GWA scan (582 TGCT cases and 1056 controls) and UK scan (986 TGCT cases and 4946 controls), we replicated these findings (NCI: FDRG = 0.006, FDRM = 0.014, FDRI = 0.033, and UK: FDRG = 1.04 × 10^-6, FDRM = 0.016, FDRI = 0.025). After removal of DMRT1 from the gene set, the sex determination gene set remains associated with TGCT in the NCI (FDRG = 0.039, FDRM = 0.050 and FDRI = 0.055) and UK scans (FDRG = 3.00 × 10^-5, FDRM = 0.056 and FDRI = 0.044). With the exception of DMRT1, genes in the sex determination gene set have not previously been identified as TGCT susceptibility loci in these GWA scans, demonstrating the complementary nature of a pathway-based approach for genome-wide analysis of TGCT. © The Author 2014. Published by Oxford University Press. All rights reserved.

  6. Transposon mutagenesis identifies chromatin modifiers cooperating with Ras in thyroid tumorigenesis and detects ATXN7 as a cancer gene.

    PubMed

    Montero-Conde, Cristina; Leandro-Garcia, Luis J; Chen, Xu; Oler, Gisele; Ruiz-Llorente, Sergio; Ryder, Mabel; Landa, Iñigo; Sanchez-Vega, Francisco; La, Konnor; Ghossein, Ronald A; Bajorin, Dean F; Knauf, Jeffrey A; Riordan, Jesse D; Dupuy, Adam J; Fagin, James A

    2017-06-20

    Oncogenic RAS mutations are present in 15-30% of thyroid carcinomas. Endogenous expression of mutant Ras is insufficient to initiate thyroid tumorigenesis in murine models, indicating that additional genetic alterations are required. We used Sleeping Beauty (SB) transposon mutagenesis to identify events that cooperate with Hras G12V in thyroid tumor development. Random genomic integration of SB transposons primarily generated loss-of-function events that significantly increased thyroid tumor penetrance in Tpo-Cre/homozygous FR-Hras G12V mice. The thyroid tumors closely phenocopied the histological features of human RAS-driven, poorly differentiated thyroid cancers. Characterization of transposon insertion sites in the SB-induced tumors identified 45 recurrently mutated candidate cancer genes. These mutation profiles were remarkably concordant with mutated cancer genes identified in a large series of human poorly differentiated and anaplastic thyroid cancers screened by next-generation sequencing using the MSK-IMPACT panel of cancer genes, which we modified to include all SB candidates. The disrupted genes primarily clustered in chromatin remodeling functional nodes and in the PI3K pathway. ATXN7, a component of a multiprotein complex with histone acetylase activity, scored as a significant SB hit. It was recurrently mutated in advanced human cancers and significantly co-occurred with RAS or NF1 mutations. Expression of ATXN7 mutants cooperated with oncogenic RAS to induce thyroid cell proliferation, pointing to ATXN7 as a previously unrecognized cancer gene.

  7. Transposon mutagenesis identifies chromatin modifiers cooperating with Ras in thyroid tumorigenesis and detects ATXN7 as a cancer gene

    PubMed Central

    Montero-Conde, Cristina; Leandro-Garcia, Luis J.; Chen, Xu; Oler, Gisele; Ruiz-Llorente, Sergio; Ryder, Mabel; Landa, Iñigo; Sanchez-Vega, Francisco; La, Konnor; Ghossein, Ronald A.; Bajorin, Dean F.; Knauf, Jeffrey A.; Riordan, Jesse D.; Dupuy, Adam J.; Fagin, James A.

    2017-01-01

    Oncogenic RAS mutations are present in 15–30% of thyroid carcinomas. Endogenous expression of mutant Ras is insufficient to initiate thyroid tumorigenesis in murine models, indicating that additional genetic alterations are required. We used Sleeping Beauty (SB) transposon mutagenesis to identify events that cooperate with HrasG12V in thyroid tumor development. Random genomic integration of SB transposons primarily generated loss-of-function events that significantly increased thyroid tumor penetrance in Tpo-Cre/homozygous FR-HrasG12V mice. The thyroid tumors closely phenocopied the histological features of human RAS-driven, poorly differentiated thyroid cancers. Characterization of transposon insertion sites in the SB-induced tumors identified 45 recurrently mutated candidate cancer genes. These mutation profiles were remarkably concordant with mutated cancer genes identified in a large series of human poorly differentiated and anaplastic thyroid cancers screened by next-generation sequencing using the MSK-IMPACT panel of cancer genes, which we modified to include all SB candidates. The disrupted genes primarily clustered in chromatin remodeling functional nodes and in the PI3K pathway. ATXN7, a component of a multiprotein complex with histone acetylase activity, scored as a significant SB hit. It was recurrently mutated in advanced human cancers and significantly co-occurred with RAS or NF1 mutations. Expression of ATXN7 mutants cooperated with oncogenic RAS to induce thyroid cell proliferation, pointing to ATXN7 as a previously unrecognized cancer gene. PMID:28584132

  8. Comparative transcriptome analysis of stylar canal cells identifies novel candidate genes implicated in the self-incompatibility response of Citrus clementina

    PubMed Central

    2012-01-01

    Background: Reproductive biology in citrus is still poorly understood. Although in recent years several efforts have been made to study pollen-pistil interaction and self-incompatibility, little information is available about the molecular mechanisms regulating these processes. Here we report the identification of candidate genes involved in pollen-pistil interaction and self-incompatibility in clementine (Citrus clementina Hort. ex Tan.). These genes have been identified by comparing the transcriptomes of laser-microdissected stylar canal cells (SCC) isolated from two genotypes differing in self-incompatibility response ('Comune', a self-incompatible cultivar, and 'Monreal', a self-compatible mutation of 'Comune'). Results: The transcriptome profiling of SCC indicated that the differential regulation of a few specific, mostly uncharacterized transcripts is associated with the breakdown of self-incompatibility in 'Monreal'. Among them, a novel F-box gene showed a drastic up-regulation both in laser-microdissected stylar canal cells and in self-pollinated whole styles with stigmas of 'Comune' in concomitance with the arrest of pollen tube growth. Moreover, we identify a non-characterized gene family as closely associated with the self-incompatibility genetic program activated in 'Comune'. Three different aspartic-acid rich (Asp-rich) protein genes, located in tandem in the clementine genome, were over-represented in the transcriptome of 'Comune'. These genes are tightly linked to a DELLA gene, previously found to be up-regulated in the self-incompatible genotype during pollen-pistil interaction. Conclusion: The highly specific transcriptome survey of the stylar canal cells identified novel genes which have not been previously associated with self-pollen rejection in citrus and in other plant species. Bioinformatic and transcriptional analyses suggested that the mutation leading to self-compatibility in 'Monreal' affected the expression of non-homologous genes located in a

  9. integIRTy: a method to identify genes altered in cancer by accounting for multiple mechanisms of regulation using item response theory.

    PubMed

    Tong, Pan; Coombes, Kevin R

    2012-11-15

    Identifying genes altered in cancer plays a crucial role in both understanding the mechanism of carcinogenesis and developing novel therapeutics. It is known that there are various mechanisms of regulation that can lead to gene dysfunction, including copy number change, methylation, abnormal expression, mutation and so on. Nowadays, all these types of alterations can be simultaneously interrogated by different types of assays. Although many methods have been proposed to identify altered genes from a single assay, there is no method that can deal with multiple assays accounting for different alteration types systematically. In this article, we propose a novel method, integration using item response theory (integIRTy), to identify altered genes by using item response theory that allows integrated analysis of multiple high-throughput assays. When applied to a single assay, the proposed method is more robust and reliable than conventional methods such as Student's t-test or the Wilcoxon rank-sum test. When used to integrate multiple assays, integIRTy can identify novel altered genes that cannot be found by looking at individual assays separately. We applied integIRTy to three public cancer datasets (ovarian carcinoma, breast cancer, glioblastoma) for cross-assay type integration, all of which show encouraging results. Availability: The R package integIRTy is available at the web site http://bioinformatics.mdanderson.org/main/OOMPA:Overview. Contact: kcoombes@mdanderson.org. Supplementary information: Supplementary data are available at Bioinformatics online.

  10. Genome-wide association study identified genetic variations and candidate genes for plant architecture component traits in Chinese upland cotton.

    PubMed

    Su, Junji; Li, Libei; Zhang, Chi; Wang, Caixiang; Gu, Lijiao; Wang, Hantao; Wei, Hengling; Liu, Qibao; Huang, Long; Yu, Shuxun

    2018-06-01

    Thirty significant associations between 22 SNPs and five plant architecture component traits in Chinese upland cotton were identified via GWAS. Four peak SNP loci located on chromosome D03 were simultaneously associated with more plant architecture component traits. A candidate gene, Gh_D03G0922, might be responsible for plant height (PH) in upland cotton. A compact plant architecture is increasingly required for mechanized harvesting processes in China. Therefore, cotton plant architecture is an important trait, and its components, such as plant height, fruit branch length and fruit branch angle, affect the suitability of a cultivar for mechanized harvesting. To determine the genetic basis of cotton plant architecture, a genome-wide association study (GWAS) was performed using a panel composed of 355 accessions and 93,250 single nucleotide polymorphisms (SNPs) identified using the specific-locus amplified fragment sequencing method. Thirty significant associations between 22 SNPs and five plant architecture component traits were identified via GWAS. Most importantly, four peak SNP loci located on chromosome D03 were simultaneously associated with more plant architecture component traits, and these SNPs were harbored in one linkage disequilibrium block. Furthermore, 21 candidate genes for plant architecture were predicted in a 0.95-Mb region including the four peak SNPs. One of these genes (Gh_D03G0922) was near the significant SNP D03_31584163 (8.40 kb), and its Arabidopsis homologs contain MADS-box domains that might be involved in plant growth and development. qRT-PCR showed that the expression of Gh_D03G0922 was upregulated in the apical buds and young leaves of the short and compact cotton varieties, and virus-induced gene silencing (VIGS) proved that the silenced plants exhibited increased PH. These results indicate that Gh_D03G0922 is likely the candidate gene for PH in cotton. The genetic variations and candidate genes identified in this study lay a foundation

  11. Identifying spatially similar gene expression patterns in early stage fruit fly embryo images: binary feature versus invariant moment digital representations

    PubMed Central

    Gurunathan, Rajalakshmi; Van Emden, Bernard; Panchanathan, Sethuraman; Kumar, Sudhir

    2004-01-01

    Background Modern developmental biology relies heavily on the analysis of embryonic gene expression patterns. Investigators manually inspect hundreds or thousands of expression patterns to identify those that are spatially similar and to ultimately infer potential gene interactions. However, the rapid accumulation of gene expression pattern data over the last two decades, facilitated by high-throughput techniques, has produced a need for the development of efficient approaches for direct comparison of images, rather than their textual descriptions, to identify spatially similar expression patterns. Results The effectiveness of the Binary Feature Vector (BFV) and Invariant Moment Vector (IMV) based digital representations of the gene expression patterns in finding biologically meaningful patterns was compared for a small (226 images) and a large (1819 images) dataset. For each dataset, an ordered list of images, with respect to a query image, was generated to identify overlapping and similar gene expression patterns, in a manner comparable to what a developmental biologist might do. The results showed that the BFV representation consistently outperforms the IMV representation in finding biologically meaningful matches when spatial overlap of the gene expression pattern and the genes involved are considered. Furthermore, we explored the value of conducting image-content based searches in a dataset where individual expression components (or domains) of multi-domain expression patterns were also included separately. We found that this technique improves performance of both IMV and BFV based searches. Conclusions We conclude that the BFV representation consistently produces a more extensive and better list of biologically useful patterns than the IMV representation. The high quality of results obtained scales well as the search database becomes larger, which encourages efforts to build automated image query and retrieval systems for spatial gene expression patterns
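    The query-and-rank step described in this abstract (producing an ordered list of images with respect to a query image) can be illustrated with a small sketch. It assumes that each expression pattern has already been reduced to a binary feature vector and that similarity is scored with a Jaccard-style overlap; the abstract does not state the measure the authors used, so `overlap_score`, `rank_by_similarity` and the toy image names are hypothetical.

```python
from typing import Dict, List, Tuple


def overlap_score(a: List[int], b: List[int]) -> float:
    """Jaccard-style overlap of two equal-length binary vectors (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)    # positions expressed in both
    either = sum(1 for x, y in zip(a, b) if x or y)   # positions expressed in at least one
    return both / either if either else 0.0


def rank_by_similarity(
    query: List[int], database: Dict[str, List[int]]
) -> List[Tuple[str, float]]:
    """Return (image name, score) pairs ordered from most to least similar."""
    scored = [(name, overlap_score(query, vec)) for name, vec in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data: 1 marks a region where expression is present (hypothetical images).
query = [1, 1, 0, 0, 1, 0]
database = {
    "img_a": [1, 1, 0, 0, 1, 0],
    "img_b": [1, 0, 0, 0, 1, 1],
    "img_c": [0, 0, 1, 1, 0, 0],
}
ranking = rank_by_similarity(query, database)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

    An overlap-based score rewards patterns that share expressed regions, which is one plausible reading of "spatial overlap" in the abstract; an invariant-moment representation would instead compare real-valued shape descriptors.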

  12. Microaspiration of esophageal gland cells and cDNA library construction for identifying parasitism genes of plant-parasitic nematodes.

    PubMed

    Hussey, Richard S; Huang, Guozhong; Allen, Rex

    2011-01-01

    Identifying parasitism genes encoding proteins secreted from a plant-parasitic nematode's esophageal gland cells and injected through its stylet into plant tissue is the key to understanding the molecular basis of nematode parasitism of plants. Parasitism genes have been cloned by directly microaspirating the cytoplasm from the esophageal gland cells of different parasitic stages of cyst or root-knot nematodes to provide mRNA to create a gland cell-specific cDNA library by long-distance reverse-transcriptase polymerase chain reaction. cDNA clones are sequenced and deduced protein sequences with a signal peptide for secretion are identified for high-throughput in situ hybridization to confirm gland-specific expression.

  13. Combined gene expression analysis of whole-tissue and microdissected pancreatic ductal adenocarcinoma identifies genes specifically overexpressed in tumor epithelia.

    PubMed

    Badea, Liviu; Herlea, Vlad; Dima, Simona Olimpia; Dumitrascu, Traian; Popescu, Irinel

    2008-01-01

    The precise details of pancreatic ductal adenocarcinoma (PDAC) pathogenesis are still insufficiently known, requiring the use of high-throughput methods. However, PDAC is especially difficult to study using microarrays due to its strong desmoplastic reaction, which involves a hyperproliferating stroma that effectively "masks" the contribution of the minority neoplastic epithelial cells. Thus it is not clear which of the genes that have been found differentially expressed between normal and whole tumor tissues are due to the tumor epithelia and which simply reflect the differences in cellular composition. To address this problem, laser microdissection studies have been performed, but these have to deal with much smaller tissue sample quantities and therefore have significantly higher experimental noise. In this paper we combine our own large sample whole-tissue study with a previously published smaller sample microdissection study by Grützmann et al. to identify the genes that are specifically overexpressed in PDAC tumor epithelia. The overlap of this list of genes with other microarray studies of pancreatic cancer as well as with the published literature is impressive. Moreover, we find a number of genes whose over-expression appears to be inversely correlated with patient survival: keratin 7, laminin gamma 2, stratifin, platelet phosphofructokinase, annexin A2, MAP4K4 and OACT2 (MBOAT2), which are all specifically upregulated in the neoplastic epithelia, rather than the tumor stroma. We improve on other microarray studies of PDAC by putting together the higher statistical power due to a larger number of samples with information about cell-type specific expression and patient survival.

  14. Loci influencing blood pressure identified using a cardiovascular gene-centric array.

    PubMed

    Ganesh, Santhi K; Tragante, Vinicius; Guo, Wei; Guo, Yiran; Lanktree, Matthew B; Smith, Erin N; Johnson, Toby; Castillo, Berta Almoguera; Barnard, John; Baumert, Jens; Chang, Yen-Pei Christy; Elbers, Clara C; Farrall, Martin; Fischer, Mary E; Franceschini, Nora; Gaunt, Tom R; Gho, Johannes M I H; Gieger, Christian; Gong, Yan; Isaacs, Aaron; Kleber, Marcus E; Mateo Leach, Irene; McDonough, Caitrin W; Meijs, Matthijs F L; Mellander, Olle; Molony, Cliona M; Nolte, Ilja M; Padmanabhan, Sandosh; Price, Tom S; Rajagopalan, Ramakrishnan; Shaffer, Jonathan; Shah, Sonia; Shen, Haiqing; Soranzo, Nicole; van der Most, Peter J; Van Iperen, Erik P A; Van Setten, Jessica; Van Setten, Jessic A; Vonk, Judith M; Zhang, Li; Beitelshees, Amber L; Berenson, Gerald S; Bhatt, Deepak L; Boer, Jolanda M A; Boerwinkle, Eric; Burkley, Ben; Burt, Amber; Chakravarti, Aravinda; Chen, Wei; Cooper-Dehoff, Rhonda M; Curtis, Sean P; Dreisbach, Albert; Duggan, David; Ehret, Georg B; Fabsitz, Richard R; Fornage, Myriam; Fox, Ervin; Furlong, Clement E; Gansevoort, Ron T; Hofker, Marten H; Hovingh, G Kees; Kirkland, Susan A; Kottke-Marchant, Kandice; Kutlar, Abdullah; Lacroix, Andrea Z; Langaee, Taimour Y; Li, Yun R; Lin, Honghuang; Liu, Kiang; Maiwald, Steffi; Malik, Rainer; Murugesan, Gurunathan; Newton-Cheh, Christopher; O'Connell, Jeffery R; Onland-Moret, N Charlotte; Ouwehand, Willem H; Palmas, Walter; Penninx, Brenda W; Pepine, Carl J; Pettinger, Mary; Polak, Joseph F; Ramachandran, Vasan S; Ranchalis, Jane; Redline, Susan; Ridker, Paul M; Rose, Lynda M; Scharnag, Hubert; Schork, Nicholas J; Shimbo, Daichi; Shuldiner, Alan R; Srinivasan, Sathanur R; Stolk, Ronald P; Taylor, Herman A; Thorand, Barbara; Trip, Mieke D; van Duijn, Cornelia M; Verschuren, W Monique; Wijmenga, Cisca; Winkelmann, Bernhard R; Wyatt, Sharon; Young, J Hunter; Boehm, Bernhard O; Caulfield, Mark J; Chasman, Daniel I; Davidson, Karina W; Doevendans, Pieter A; Fitzgerald, Garret A; Gums, John G; Hakonarson, Hakon; Hillege, Hans L; Illig, Thomas; Jarvik, Gail P; Johnson, Julie A; Kastelein, John J P; Koenig, Wolfgang; März, Winfried; Mitchell, Braxton D; Murray, Sarah S; Oldehinkel, Albertine J; Rader, Daniel J; Reilly, Muredach P; Reiner, Alex P; Schadt, Eric E; Silverstein, Roy L; Snieder, Harold; Stanton, Alice V; Uitterlinden, André G; van der Harst, Pim; van der Schouw, Yvonne T; Samani, Nilesh J; Johnson, Andrew D; Munroe, Patricia B; de Bakker, Paul I W; Zhu, Xiaofeng; Levy, Daniel; Keating, Brendan J; Asselbergs, Folkert W

    2013-04-15

    Blood pressure (BP) is a heritable determinant of risk for cardiovascular disease (CVD). To investigate genetic associations with systolic BP (SBP), diastolic BP (DBP), mean arterial pressure (MAP) and pulse pressure (PP), we genotyped ∼50 000 single-nucleotide polymorphisms (SNPs) that capture variation in ∼2100 candidate genes for cardiovascular phenotypes in 61 619 individuals of European ancestry from cohort studies in the USA and Europe. We identified novel associations between rs347591 and SBP (chromosome 3p25.3, in an intron of HRH1), between rs2169137 and DBP (chromosome 1q32.1, in an intron of MDM4) and between rs2014408 and SBP (chromosome 11p15, in an intron of SOX6), previously reported to be associated with MAP. We also confirmed 10 previously known loci associated with SBP, DBP, MAP or PP (ADRB1, ATP2B1, SH2B3/ATXN2, CSK, CYP17A1, FURIN, HFE, LSP1, MTHFR, SOX6) at array-wide significance (P < 2.4 × 10^-6). We then replicated these associations in an independent set of 65 886 individuals of European ancestry. The findings from expression QTL (eQTL) analysis showed associations of SNPs in the MDM4 region with MDM4 expression. We did not find any evidence of association of the two novel SNPs in MDM4 and HRH1 with sequelae of high BP including coronary artery disease (CAD), left ventricular hypertrophy (LVH) or stroke. In summary, we identified two novel loci associated with BP and confirmed multiple previously reported associations. Our findings extend our understanding of genes involved in BP regulation, some of which may eventually provide new targets for therapeutic intervention.

  15. Mutations in the Kinase Domain of the HER2/ERBB2 Gene Identified in a Wide Variety of Human Cancers.

    PubMed

    Wen, Wenhsiang; Chen, Wangjuh Sting; Xiao, Nick; Bender, Ryan; Ghazalpour, Anatole; Tan, Zheng; Swensen, Jeffrey; Millis, Sherri Z; Basu, Gargi; Gatalica, Zoran; Press, Michael F

    2015-09-01

    The HER2 (official name ERBB2) gene encodes a membrane receptor in the epidermal growth factor receptor family amplified and overexpressed in adenocarcinoma. Activating mutations also occur in several cancers. We report mutation analyses of the HER2 kinase domain in 7497 histologically diverse cancers. Forty-five genes, including the kinase domain of HER2 with HER2 IHC and dual in situ hybridization, were analyzed in tumors from 7497 patients with cancer, including 850 breast, 770 colorectal, 910 non-small cell lung, 823 uterine or cervical, 1372 ovarian, and 297 pancreatic cancers, as well as 323 melanomas and 2152 other solid tumors. Sixty-nine HER2 kinase domain mutations were identified in tumors from 68 patients (approximately 1% of all cases, ranging from absent in sarcomas to 4% in urothelial cancers), which included previously published activating mutations and 13 novel mutations. Fourteen cases with coexisting HER2 mutation and amplification and/or overexpression were identified. Fifty-two of 68 patients had additional mutations in other analyzed genes, whereas 16 patients (23%) had HER2 mutations identified as the sole driver mutation. HER2 mutations coexisted with HER2 gene amplification and overexpression and with mutations in other functionally important genes. HER2 mutations were identified as the only driver mutation in a significant proportion of solid cancers. Evaluation of anti-HER2 therapies in nonamplified, HER2-mutated cancers is warranted. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
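    The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the variable names are illustrative, and the percentages are recomputed here rather than taken from the paper.

```python
# Tumor counts by type, as listed in the abstract.
cohort = {
    "breast": 850,
    "colorectal": 770,
    "non-small cell lung": 910,
    "uterine or cervical": 823,
    "ovarian": 1372,
    "pancreatic": 297,
    "melanoma": 323,
    "other solid tumors": 2152,
}

total_patients = sum(cohort.values())       # should match the stated 7497
patients_with_mutation = 68                 # patients carrying a kinase-domain mutation
sole_driver = 16                            # HER2 mutation as the only driver mutation
with_other_mutations = patients_with_mutation - sole_driver

mutation_rate = patients_with_mutation / total_patients
sole_driver_share = sole_driver / patients_with_mutation

print(f"total patients: {total_patients}")
print(f"patients with HER2 kinase-domain mutation: {mutation_rate:.2%}")
print(f"sole-driver share among mutated cases: {sole_driver_share:.1%}")
print(f"patients with additional mutations: {with_other_mutations}")
```

    The per-type counts sum to the stated 7497 patients, 68 of 7497 is roughly 0.9% (reported as approximately 1%), and 16 of 68 is about 23.5% (reported as 23%).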

  16. Automated Update, Revision, and Quality Control of the Maize Genome Annotations Using MAKER-P Improves the B73 RefGen_v3 Gene Models and Identifies New Genes

    PubMed Central

    Law, MeiYee; Childs, Kevin L.; Campbell, Michael S.; Stein, Joshua C.; Olson, Andrew J.; Holt, Carson; Panchy, Nicholas; Lei, Jikai; Jiao, Dian; Andorf, Carson M.; Lawrence, Carolyn J.; Ware, Doreen; Shiu, Shin-Han; Sun, Yanni; Jiang, Ning; Yandell, Mark

    2015-01-01

    The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-P to update and revise the maize (Zea mays) B73 RefGen_v3 annotation build (5b+) in less than 3 h using the iPlant Cyberinfrastructure. MAKER-P identified and annotated 4,466 additional, well-supported protein-coding genes not present in the 5b+ annotation build, added additional untranslated regions to 1,393 5b+ gene models, identified 2,647 5b+ gene models that lack any supporting evidence (despite the use of large and diverse evidence data sets), identified 104,215 pseudogene fragments, and created an additional 2,522 noncoding gene annotations. We also describe a method for de novo training of MAKER-P for the annotation of newly sequenced grass genomes. Collectively, these results lead to the 6a maize genome annotation and demonstrate the utility of MAKER-P for rapid annotation, management, and quality control of grasses and other difficult-to-annotate plant genomes. PMID:25384563

  17. Potential Susceptibility Loci Identified for Renal Cell Carcinoma by Targeting Obesity-Related Genes.

    PubMed

    Shu, Xiang; Purdue, Mark P; Ye, Yuanqing; Tu, Huakang; Wood, Christopher G; Tannir, Nizar M; Wang, Zhaoming; Albanes, Demetrius; Gapstur, Susan M; Stevens, Victoria L; Rothman, Nathaniel; Chanock, Stephen J; Wu, Xifeng

    2017-09-01

    Background: Obesity is an established risk factor for renal cell carcinoma (RCC). Although genome-wide association studies (GWAS) of RCC have identified several susceptibility loci, additional variants might be missed due to the highly conservative selection. Methods: We conducted a multiphase study utilizing three independent genome-wide scans at MD Anderson Cancer Center (MDA RCC GWAS and MDA RCC OncoArray) and National Cancer Institute (NCI RCC GWAS), which consisted of a total of 3,530 cases and 5,714 controls, to investigate genetic variations in obesity-related genes and RCC risk. Results: In the discovery phase, 32,946 SNPs located at ±10 kb of 2,001 obesity-related genes were extracted from MDA RCC GWAS and analyzed using multivariable logistic regression. Proxies (R^2 > 0.8) were searched or imputation was performed if SNPs were not directly genotyped in the validation sets. Twenty-one SNPs with P < 0.05 in both MDA RCC GWAS and NCI RCC GWAS were subsequently evaluated in MDA RCC OncoArray. In the overall meta-analysis, significant (P < 0.05) associations with RCC risk were observed for SNPs mapping to IL1RAPL2 [rs10521506-G: OR_meta = 0.87 (0.81-0.93), P_meta = 2.33 × 10^-5], PLIN2 [rs2229536-A: OR_meta = 0.87 (0.81-0.93), P_meta = 2.33 × 10^-5], SMAD3 [rs4601989-A: OR_meta = 0.86 (0.80-0.93), P_meta = 2.71 × 10^-4], MED13L [rs10850596-A: OR_meta = 1.14 (1.07-1.23), P_meta = 1.50 × 10^-4], and TSC1 [rs3761840-G: OR_meta = 0.90 (0.85-0.97), P_meta = 2.47 × 10^-3]. We did not observe any significant cis-expression quantitative trait loci effect for these SNPs in the TCGA KIRC data. Conclusions: Taken together, we found that genetic variation of obesity-related genes could influence RCC susceptibility. Impact: The five identified loci may provide new insights into disease etiology that reveal importance of obesity-related genes in RCC development. Cancer Epidemiol Biomarkers Prev; 26(9); 1436-42. ©2017 AACR. ©2017 American Association for
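    The relationship between a reported odds ratio, its confidence interval and a P value can be sketched as follows. This assumes that the intervals in the abstract are 95% Wald intervals on the log-odds scale, which the abstract does not state; `approx_p_from_or_ci` is a hypothetical helper. Because the published ORs and interval limits are rounded to two decimals, the result only roughly approximates the reported P values.

```python
import math


def approx_p_from_or_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Back-calculate an approximate z statistic and two-sided P value.

    Assumes the interval is a symmetric Wald interval on the log-odds scale:
    log(upper) - log(lower) = 2 * z_crit * SE.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Rounded values from the abstract for two of the reported loci.
z_il1rapl2, p_il1rapl2 = approx_p_from_or_ci(0.87, 0.81, 0.93)
z_med13l, p_med13l = approx_p_from_or_ci(1.14, 1.07, 1.23)

print(f"IL1RAPL2: z = {z_il1rapl2:.2f}, approximate P = {p_il1rapl2:.1e}")
print(f"MED13L:   z = {z_med13l:.2f}, approximate P = {p_med13l:.1e}")
```

    A protective allele (OR below 1) gives a negative z and a risk allele (OR above 1) a positive z; both back-calculated P values fall in the same general range as those reported, which is as much as rounded inputs allow.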

  18. Genome-wide association study identifies Loci and candidate genes for body composition and meat quality traits in Beijing-You chickens.

    PubMed

    Liu, Ranran; Sun, Yanfa; Zhao, Guiping; Wang, Fangjie; Wu, Dan; Zheng, Maiqing; Chen, Jilan; Zhang, Lei; Hu, Yaodong; Wen, Jie

    2013-01-01

    Body composition and meat quality traits are important economic traits of chickens. The development of high-throughput genotyping platforms and relevant statistical methods has enabled genome-wide association studies in chickens. In order to identify molecular markers and candidate genes associated with body composition and meat quality traits, genome-wide association studies were conducted using the Illumina 60 K SNP Beadchip to genotype 724 Beijing-You chickens. For each bird, a total of 16 traits were measured, including carcass weight (CW), eviscerated weight (EW), dressing percentage, breast muscle weight (BrW) and percentage (BrP), thigh muscle weight and percentage, abdominal fat weight and percentage, dry matter and intramuscular fat contents of breast and thigh muscle, ultimate pH, and shear force of the pectoralis major muscle at 100 d of age. The SNPs that were significantly associated with the phenotypic traits were identified using both simple (GLM) and compressed mixed linear (MLM) models. For nine of ten body composition traits studied, SNPs showing genome-wide significance (P<2.59E-6) were identified. A consistent region on chicken (Gallus gallus) chromosome 4 (GGA4), including seven significant SNPs and four candidate genes (LCORL, LAP3, LDB2, TAPT1), was found to be associated with CW and EW. Another 0.65 Mb region on GGA3 for BrW and BrP was identified. After measuring the mRNA content in breast muscle for five genes located in this region, the changes in GJA1 expression were found to be consistent with those of breast muscle weight across development. It is highly possible that GJA1 is a functional gene for breast muscle development in chickens. For meat quality traits, several SNPs reaching suggestive association were identified and possible candidate genes with their functions were discussed.

  19. Mapping Adipose and Muscle Tissue Expression Quantitative Trait Loci in African Americans to Identify Genes for Type 2 Diabetes and Obesity

    PubMed Central

    Sajuthi, Satria P.; Sharma, Neeraj K.; Chou, Jeff W.; Palmer, Nicholette D.; McWilliams, David R.; Beal, John; Comeau, Mary E.; Ma, Lijun; Calles-Escandon, Jorge; Demons, Jamehl; Rogers, Samantha; Cherry, Kristina; Menon, Lata; Kouba, Ethel; Davis, Donna; Burris, Marcie; Byerly, Sara J.; Ng, Maggie C.Y.; Maruthur, Nisa M.; Patel, Sanjay R.; Bielak, Lawrence F.; Lange, Leslie; Guo, Xiuqing; Sale, Michèle M.; Chan, Kei Hang; Monda, Keri L.; Chen, Gary K.; Taylor, Kira; Palmer, Cameron; Edwards, Todd L; North, Kari E.; Haiman, Christopher A.; Bowden, Donald W.; Freedman, Barry I.; Langefeld, Carl D.; Das, Swapan K.

    2016-01-01

    Relative to European Americans, type 2 diabetes (T2D) is more prevalent in African Americans (AAs). Genetic variation may modulate transcript abundance in insulin-responsive tissues and contribute to risk; yet published studies identifying expression quantitative trait loci (eQTLs) in African ancestry populations are restricted to blood cells. This study aims to develop a map of genetically regulated transcripts expressed in tissues important for glucose homeostasis in AAs, critical for identifying the genetic etiology of T2D and related traits. Quantitative measures of adipose and muscle gene expression and genotypic data were integrated in 260 non-diabetic AAs to identify expression regulatory variants. Their roles in genetic susceptibility to T2D and related metabolic phenotypes were evaluated by mining GWAS datasets. eQTL analysis identified 1,971 and 2,078 cis-eGenes in adipose and muscle, respectively. Cis-eQTLs for 885 transcripts, including top cis-eGenes CHURC1, USMG5, and ERAP2, were identified in both tissues. 62.1% of top cis-eSNPs were within ±50kb of transcription start sites and cis-eGenes were enriched for mitochondrial transcripts. Mining GWAS databases revealed association of cis-eSNPs for more than 50 genes with T2D (e.g. PIK3C2A, RBMS1, UFSP1), gluco-metabolic phenotypes (e.g. INPP5E, SNX17, ERAP2, FN3KRP), and obesity (e.g. POMC, CPEB4). Integration of GWAS meta-analysis data from AA cohorts revealed the most significant association for cis-eSNPs of ATP5SL and MCCC1 genes, with T2D and BMI, respectively. This study developed the first comprehensive map of adipose and muscle tissue eQTLs in AAs (publicly accessible at https://mdsetaa.phs.wakehealth.edu) and identified genetically regulated transcripts for delineating genetic causes of T2D and related metabolic phenotypes. PMID:27193597

  20. Gene expression profiling to identify the toxicities and potentially relevant human disease outcomes associated with environmental heavy metal exposure.

    PubMed

    Korashy, Hesham M; Attafi, Ibraheem M; Famulski, Konrad S; Bakheet, Saleh A; Hafez, Mohammed M; Alsaad, Abdulaziz M S; Al-Ghadeer, Abdul Rahman M

    2017-02-01

    Heavy metals are the most commonly encountered toxic substances that increase susceptibility to various diseases after prolonged exposure. We have previously shown that healthy volunteers living near a mining area had significant contamination with heavy metals associated with significant changes in the expression of some detoxifying genes, xenobiotic metabolizing enzymes, and DNA repair genes. However, alterations of most of the molecular target genes associated with diseases are still unknown. Thus, the aims of this study were to (a) evaluate the gene expression profile and (b) identify the toxicities and potentially relevant human disease outcomes associated with long-term human exposure to environmental heavy metals in a mining area using microarray analysis. For this purpose, 40 healthy male volunteers who were residents of a heavy metal-polluted area (Mahd Al-Dhahab city, Saudi Arabia) and 20 healthy male volunteers who were residents of a non-heavy metal-polluted area were included in the study. Total RNA was isolated from whole blood using PAXgene Blood RNA tubes and then reverse transcribed and hybridized to the gene array using the Affymetrix U219 GeneChip. Microarray analysis showed that about 2129 genes were identified and differentially altered, among which a shared set of 425 genes was differentially expressed in the heavy metal-exposed groups. Ingenuity pathway analysis revealed that the most altered gene-regulated diseases in heavy metal-exposed groups included hematological and developmental disorders and mostly renal and urological diseases. Quantitative real-time polymerase chain reaction closely matched the microarray data for some genes tested. Importantly, changes in gene-related diseases were attributed to alterations in the genes encoded for protein synthesis. Renal and urological diseases were the diseases that were most frequently associated with the heavy metal-exposed group. Therefore, there is a need for further studies to validate these

  1. An Integrative Framework for Bayesian Variable Selection with Informative Priors for Identifying Genes and Pathways

    PubMed Central

    Ander, Bradley P.; Zhang, Xiaoshuai; Xue, Fuzhong; Sharp, Frank R.; Yang, Xiaowei

    2013-01-01

    The discovery of genetic or genomic markers plays a central role in the development of personalized medicine. A notable challenge exists when dealing with the high dimensionality of the data sets, as thousands of genes or millions of genetic variants are collected on a relatively small number of subjects. Traditional gene-wise selection methods using univariate analyses have difficulty incorporating correlational, structural, or functional structures amongst the molecular measures. For microarray gene expression data, we first summarize solutions in dealing with ‘large p, small n’ problems, and then propose an integrative Bayesian variable selection (iBVS) framework for simultaneously identifying causal or marker genes and regulatory pathways. A novel partial least squares (PLS) g-prior for iBVS is developed to allow the incorporation of prior knowledge on gene-gene interactions or functional relationships. From the point of view of systems biology, iBVS enables users to directly target the joint effects of multiple genes and pathways in a hierarchical modeling diagram to predict disease status or phenotype. The estimated posterior selection probabilities offer probabilistic and biological interpretations. Both simulated data and a set of microarray data in predicting stroke status are used in validating the performance of iBVS in a Probit model with binary outcomes. iBVS offers a general framework for effective discovery of various molecular biomarkers by combining data-based statistics and knowledge-based priors. Guidelines on making posterior inferences, determining Bayesian significance levels, and improving computational efficiencies are also discussed. PMID:23844055

  2. Exome sequencing of Pakistani consanguineous families identifies 30 novel candidate genes for recessive intellectual disability

    PubMed Central

    Riazuddin, S; Hussain, M; Razzaq, A; Iqbal, Z; Shahzad, M; Polla, D L; Song, Y; van Beusekom, E; Khan, A A; Tomas-Roca, L; Rashid, M; Zahoor, M Y; Wissink-Lindhout, W M; Basra, M A R; Ansar, M; Agha, Z; van Heeswijk, K; Rasheed, F; Van de Vorst, M; Veltman, J A; Gilissen, C; Akram, J; Kleefstra, T; Assir, M Z; Grozeva, D; Carss, K; Raymond, F L; O'Connor, T D; Riazuddin, S A; Khan, S N; Ahmed, Z M; de Brouwer, A P M; van Bokhoven, H; Riazuddin, S

    2017-01-01

    Intellectual disability (ID) is a clinically and genetically heterogeneous disorder, affecting 1–3% of the general population. Although research into the genetic causes of ID has recently gained momentum, identification of pathogenic mutations that cause autosomal recessive ID (ARID) has lagged behind, predominantly due to non-availability of sizeable families. Here we present the results of exome sequencing in 121 large consanguineous Pakistani ID families. In 60 families, we identified homozygous or compound heterozygous DNA variants in a single gene, 30 affecting reported ID genes and 30 affecting novel candidate ID genes. Potential pathogenicity of these alleles was supported by co-segregation with the phenotype, low frequency in control populations and the application of stringent bioinformatics analyses. In another eight families segregation of multiple pathogenic variants was observed, affecting 19 genes that were either known or are novel candidates for ID. Transcriptome profiles of normal human brain tissues showed that the novel candidate ID genes formed a network significantly enriched for transcriptional co-expression (P<0.0001) in the frontal cortex during fetal development and in the temporal–parietal and sub-cortex during infancy through adulthood. In addition, proteins encoded by 12 novel ID genes directly interact with previously reported ID proteins in six known pathways essential for cognitive function (P<0.0001). These results suggest that disruptions of temporal parietal and sub-cortical neurogenesis during infancy are critical to the pathophysiology of ID. These findings further expand the existing repertoire of genes involved in ARID, and provide new insights into the molecular mechanisms and the transcriptome map of ID. PMID:27457812

  3. Exome sequencing of Pakistani consanguineous families identifies 30 novel candidate genes for recessive intellectual disability.

    PubMed

    Riazuddin, S; Hussain, M; Razzaq, A; Iqbal, Z; Shahzad, M; Polla, D L; Song, Y; van Beusekom, E; Khan, A A; Tomas-Roca, L; Rashid, M; Zahoor, M Y; Wissink-Lindhout, W M; Basra, M A R; Ansar, M; Agha, Z; van Heeswijk, K; Rasheed, F; Van de Vorst, M; Veltman, J A; Gilissen, C; Akram, J; Kleefstra, T; Assir, M Z; Grozeva, D; Carss, K; Raymond, F L; O'Connor, T D; Riazuddin, S A; Khan, S N; Ahmed, Z M; de Brouwer, A P M; van Bokhoven, H; Riazuddin, S

    2017-11-01

    Intellectual disability (ID) is a clinically and genetically heterogeneous disorder, affecting 1-3% of the general population. Although research into the genetic causes of ID has recently gained momentum, identification of pathogenic mutations that cause autosomal recessive ID (ARID) has lagged behind, predominantly due to non-availability of sizeable families. Here we present the results of exome sequencing in 121 large consanguineous Pakistani ID families. In 60 families, we identified homozygous or compound heterozygous DNA variants in a single gene, 30 affecting reported ID genes and 30 affecting novel candidate ID genes. Potential pathogenicity of these alleles was supported by co-segregation with the phenotype, low frequency in control populations and the application of stringent bioinformatics analyses. In another eight families segregation of multiple pathogenic variants was observed, affecting 19 genes that were either known or are novel candidates for ID. Transcriptome profiles of normal human brain tissues showed that the novel candidate ID genes formed a network significantly enriched for transcriptional co-expression (P<0.0001) in the frontal cortex during fetal development and in the temporal-parietal and sub-cortex during infancy through adulthood. In addition, proteins encoded by 12 novel ID genes directly interact with previously reported ID proteins in six known pathways essential for cognitive function (P<0.0001). These results suggest that disruptions of temporal parietal and sub-cortical neurogenesis during infancy are critical to the pathophysiology of ID. These findings further expand the existing repertoire of genes involved in ARID, and provide new insights into the molecular mechanisms and the transcriptome map of ID.

  4. Novel β-catenin target genes identified in thalamic neurons encode modulators of neuronal excitability

    PubMed Central

    2012-01-01

    Background LEF1/TCF transcription factors and their activator β-catenin are effectors of the canonical Wnt pathway. Although Wnt/β-catenin signaling has been implicated in neurodegenerative and psychiatric disorders, its possible role in the adult brain remains enigmatic. To address this issue, we sought to identify the genetic program activated by β-catenin in neurons. We recently showed that β-catenin accumulates specifically in thalamic neurons where it activates Cacna1g gene expression. In the present study, we combined bioinformatics and experimental approaches to find new β-catenin targets in the adult thalamus. Results We first selected the genes with at least two conserved LEF/TCF motifs within the regulatory elements. The resulting list of 428 putative LEF1/TCF targets was significantly enriched in known Wnt targets, validating our approach. Functional annotation of the presumed targets also revealed a group of 41 genes, heretofore not associated with Wnt pathway activity, that encode proteins involved in neuronal signal transmission. Using custom polymerase chain reaction arrays, we profiled the expression of these genes in the rat forebrain. We found that nine of the analyzed genes were highly expressed in the thalamus compared with the cortex and hippocampus. Removal of nuclear β-catenin from thalamic neurons in vitro by introducing its negative regulator Axin2 reduced the expression of six of the nine genes. Immunoprecipitation of chromatin from the brain tissues confirmed the interaction between β-catenin and some of the predicted LEF1/TCF motifs. The results of these experiments validated four genes as authentic and direct targets of β-catenin: Gabra3 for the receptor of GABA neurotransmitter, Calb2 for the Ca2+-binding protein calretinin, and the Cacna1g and Kcna6 genes for voltage-gated ion channels. Two other genes from the latter cluster, Cacna2d2 and Kcnh8, appeared to be regulated by β-catenin, although the binding of β-catenin to the

  5. Identifying Dynamic Protein Complexes Based on Gene Expression Profiles and PPI Networks

    PubMed Central

    Li, Min; Chen, Weijie; Wang, Jianxin; Pan, Yi

    2014-01-01

    Identification of protein complexes from protein-protein interaction networks has become a key problem for understanding cellular life in the postgenomic era. Many computational methods have been proposed for identifying protein complexes. Up to now, the existing computational methods have mostly been applied to static PPI networks. However, proteins and their interactions are dynamic in reality. Identifying dynamic protein complexes is more meaningful and challenging. In this paper, a novel algorithm, named DPC, is proposed to identify dynamic protein complexes by integrating PPI data and gene expression profiles. According to the Core-Attachment assumption, proteins that are always active in the molecular cycle are regarded as core proteins. The protein-complex cores are identified from these always-active proteins by detecting dense subgraphs. Final protein complexes are extended from the protein-complex cores by adding attachments based on a topological character of “closeness” and dynamic meaning. The protein complexes produced by our algorithm DPC contain two parts: a static core expressed throughout the molecular cycle and short-lived dynamic attachments. The proposed algorithm DPC was applied to the data of Saccharomyces cerevisiae, and the experimental results show that DPC outperforms CMC, MCL, SPICi, HC-PIN, COACH, and Core-Attachment based on the validation of matching with known complexes and hF-measures. PMID:24963481

  6. Evolutionary Distance of Amino Acid Sequence Orthologs across Macaque Subspecies: Identifying Candidate Genes for SIV Resistance in Chinese Rhesus Macaques

    PubMed Central

    Ross, Cody T.; Roodgar, Morteza; Smith, David Glenn

    2015-01-01

    We use the Reciprocal Smallest Distance (RSD) algorithm to identify amino acid sequence orthologs in the Chinese and Indian rhesus macaque draft sequences and estimate the evolutionary distance between such orthologs. We then use GOanna to map gene function annotations and human gene identifiers to the rhesus macaque amino acid sequences. We conclude methodologically by cross-tabulating a list of amino acid orthologs with large divergence scores with a list of genes known to be involved in SIV or HIV pathogenesis. We find that many of the amino acid sequences with large evolutionary divergence scores, as calculated by the RSD algorithm, have been shown to be related to HIV pathogenesis in previous laboratory studies. Four of the strongest candidate genes for SIVmac resistance in Chinese rhesus macaques identified in this study are CDK9, CXCL12, TRIM21, and TRIM32. Additionally, ANKRD30A, CTSZ, GORASP2, GTF2H1, IL13RA1, MUC16, NMDAR1, Notch1, NT5M, PDCD5, RAD50, and TM9SF2 were identified as possible candidates, among others. We failed to find many laboratory experiments contrasting the effects of Indian and Chinese orthologs at these sites on SIVmac pathogenesis, but future comparative studies might hold fertile ground for research into the biological mechanisms underlying innate resistance to SIVmac in Chinese rhesus macaques. PMID:25884674
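
    The reciprocity test at the heart of an RSD-style ortholog search can be sketched as follows. This is a minimal illustration, not the published RSD implementation: it assumes pairwise evolutionary distances have already been estimated, and the function name, the dictionary layout, and the toy sequence identifiers are all hypothetical.

```python
def reciprocal_smallest_distance(dist):
    """Return (a, b, distance) triples that are mutual closest matches.

    `dist` maps (a, b) pairs to a precomputed distance, where `a` is a
    sequence from genome A and `b` a sequence from genome B. How the
    distances are estimated is outside the scope of this sketch.
    Ties are resolved in favour of the first pair encountered.
    """
    best_for_a = {}  # a -> (closest b, distance)
    best_for_b = {}  # b -> (closest a, distance)
    for (a, b), d in dist.items():
        if a not in best_for_a or d < best_for_a[a][1]:
            best_for_a[a] = (b, d)
        if b not in best_for_b or d < best_for_b[b][1]:
            best_for_b[b] = (a, d)
    # Keep a pair only if each sequence is the other's closest match.
    return sorted(
        (a, b, d)
        for a, (b, d) in best_for_a.items()
        if best_for_b[b][0] == a
    )


# Hypothetical toy distances between three A sequences and two B sequences.
toy_dist = {
    ("a1", "b1"): 0.1,
    ("a1", "b2"): 0.5,
    ("a2", "b1"): 0.3,
    ("a2", "b2"): 0.2,
    ("a3", "b2"): 0.4,
}
pairs = reciprocal_smallest_distance(toy_dist)
```

    In the toy data, a3 is closest to b2, but b2 is closer to a2, so a3 is left without an ortholog call. Sorting the surviving pairs by distance would then give the kind of divergence ranking the abstract describes.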

  7. Gene Expression Profiling of Multiple Sclerosis Pathology Identifies Early Patterns of Demyelination Surrounding Chronic Active Lesions

    PubMed Central

    Hendrickx, Debbie A. E.; van Scheppingen, Jackelien; van der Poel, Marlijn; Bossers, Koen; Schuurman, Karianne G.; van Eden, Corbert G.; Hol, Elly M.; Hamann, Jörg; Huitinga, Inge

    2017-01-01

    In multiple sclerosis (MS), activated microglia and infiltrating macrophages phagocytose myelin focally in (chronic) active lesions. These demyelinating sites expand in time, but at some point turn inactive into a sclerotic scar. To identify molecular mechanisms underlying lesion activity and halt, we analyzed genome-wide gene expression in rim and peri-lesional regions of chronic active and inactive MS lesions, as well as in control tissue. Gene clustering revealed patterns of gene expression specifically associated with MS and with the presumed, subsequent stages of lesion development. Next to genes involved in immune functions, we found regulation of novel genes in and around the rim of chronic active lesions, such as NPY, KANK4, NCAN, TKTL1, and ANO4. Of note, the presence of many foamy macrophages in active rims was accompanied by a congruent upregulation of genes related to lipid binding, such as MSR1, CD68, CXCL16, and OLR1, and lipid uptake, such as CHIT1, GPNMB, and CCL18. Except CCL18, these genes were already upregulated in regions around active MS lesions, showing that such lesions are indeed expanding. In vitro downregulation of the scavenger receptors MSR1 and CXCL16 reduced myelin uptake. In conclusion, this study provides the gene expression profile of different aspects of MS pathology and indicates that early demyelination, mediated by scavenger receptors, is already present in regions around active MS lesions. Genes involved in early demyelination events in regions surrounding chronic active MS lesions might be promising therapeutic targets to stop lesion expansion. PMID:29312322

  8. Gene Expression Profiling of Multiple Sclerosis Pathology Identifies Early Patterns of Demyelination Surrounding Chronic Active Lesions.

    PubMed

    Hendrickx, Debbie A E; van Scheppingen, Jackelien; van der Poel, Marlijn; Bossers, Koen; Schuurman, Karianne G; van Eden, Corbert G; Hol, Elly M; Hamann, Jörg; Huitinga, Inge

    2017-01-01

    In multiple sclerosis (MS), activated microglia and infiltrating macrophages phagocytose myelin focally in (chronic) active lesions. These demyelinating sites expand in time, but at some point turn inactive into a sclerotic scar. To identify molecular mechanisms underlying lesion activity and halt, we analyzed genome-wide gene expression in rim and peri-lesional regions of chronic active and inactive MS lesions, as well as in control tissue. Gene clustering revealed patterns of gene expression specifically associated with MS and with the presumed, subsequent stages of lesion development. Next to genes involved in immune functions, we found regulation of novel genes in and around the rim of chronic active lesions, such as NPY, KANK4, NCAN, TKTL1, and ANO4. Of note, the presence of many foamy macrophages in active rims was accompanied by a congruent upregulation of genes related to lipid binding, such as MSR1, CD68, CXCL16, and OLR1, and lipid uptake, such as CHIT1, GPNMB, and CCL18. Except CCL18, these genes were already upregulated in regions around active MS lesions, showing that such lesions are indeed expanding. In vitro downregulation of the scavenger receptors MSR1 and CXCL16 reduced myelin uptake. In conclusion, this study provides the gene expression profile of different aspects of MS pathology and indicates that early demyelination, mediated by scavenger receptors, is already present in regions around active MS lesions. Genes involved in early demyelination events in regions surrounding chronic active MS lesions might be promising therapeutic targets to stop lesion expansion.

  9. Analysis of pan-genome to identify the core genes and essential genes of Brucella spp.

    PubMed

    Yang, Xiaowen; Li, Yajie; Zang, Juan; Li, Yexia; Bie, Pengfei; Lu, Yanli; Wu, Qingmin

    2016-04-01

    Brucella spp. are facultative intracellular pathogens that cause a contagious zoonotic disease, which can result in such outcomes as abortion or sterility in susceptible animal hosts and grave, debilitating illness in humans. For deciphering the survival mechanism of Brucella spp. in vivo, 42 Brucella complete genomes from NCBI were analyzed for the pan-genome and core genome by identification of the composition and function of Brucella genomes. The results showed that the total 132,143 protein-coding genes in these genomes were divided into 5369 clusters. Among these, 1710 clusters were associated with the core genome, 1182 clusters with strain-specific genes and 2477 clusters with dispensable genomes. COG analysis indicated that 44% of the core genes were devoted to metabolism, which were mainly responsible for energy production and conversion (COG category C), and amino acid transport and metabolism (COG category E). Meanwhile, approximately 35% of the core genes were in positive selection. In addition, 1252 potential essential genes were predicted in the core genome by comparison with a prokaryote database of essential genes. The results suggested that the core genes in Brucella genomes are relatively conserved, and that energy and amino acid metabolism play a more important role in the process of growth and reproduction in Brucella spp. This study might help us to better understand the mechanisms of Brucella persistent infection and provide some clues for further exploring the gene modules of the intracellular survival in Brucella spp.
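
    The three-way split of gene clusters reported above can be illustrated with a short sketch. It assumes the common pan-genome convention that a core cluster is present in every genome, a strain-specific cluster in exactly one, and a dispensable cluster in some intermediate number; the abstract does not state the exact criteria used, and the function name and toy data are hypothetical.

```python
def classify_clusters(cluster_to_genomes, n_genomes):
    """Split gene clusters into core, strain-specific and dispensable sets.

    `cluster_to_genomes` maps a cluster id to the genomes in which at
    least one member gene was found. The presence thresholds are an
    assumed convention, not taken from the study.
    """
    groups = {"core": [], "strain_specific": [], "dispensable": []}
    for cluster_id, genomes in cluster_to_genomes.items():
        n_present = len(set(genomes))
        if n_present == n_genomes:
            groups["core"].append(cluster_id)         # found in every genome
        elif n_present == 1:
            groups["strain_specific"].append(cluster_id)  # found in one genome
        else:
            groups["dispensable"].append(cluster_id)  # found in some genomes
    return groups


# Hypothetical toy input with three genomes.
toy_clusters = {
    "c1": ["g1", "g2", "g3"],
    "c2": ["g2"],
    "c3": ["g1", "g3"],
}
groups = classify_clusters(toy_clusters, n_genomes=3)

# The counts reported in the abstract partition the full set of clusters.
reported_total = 1710 + 1182 + 2477
```

    The last line is only a consistency check on the published numbers: the three categories sum to the 5369 clusters reported, so every cluster falls into exactly one category.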

  10. Identifying the candidate genes involved in the calyx abscission process of 'Kuerlexiangli' (Pyrus sinkiangensis Yu) by digital transcript abundance measurements.

    PubMed

    Qi, Xiaoxiao; Wu, Jun; Wang, Lifen; Li, Leiting; Cao, Yufen; Tian, Luming; Dong, Xingguang; Zhang, Shaoling

    2013-10-23

    'Kuerlexiangli' (Pyrus sinkiangensis Yu), a native pear of Xinjiang, China, is an important agricultural fruit and primary export to the international market. However, fruit with persistent calyxes affect fruit shape and quality. Although several studies have looked into the physiological aspects of the calyx abscission process, the underlying molecular mechanisms remain unknown. In order to better understand the molecular basis of the process of calyx abscission, materials at three critical stages of regulation, with 6000 × Flusilazole plus 300 × PBO treatment (calyx abscising treatment) and 50 mg·L(-1) GA3 treatment (calyx persisting treatment), were collected and cDNA fragments were sequenced using digital transcript abundance measurements to identify candidate genes. Digital transcript abundance measurements were performed using high-throughput Illumina GAII sequencing on seven samples that were collected at three important stages of the calyx abscission process with chemical agent treatments promoting calyx abscission and persistence. Altogether more than 251,123,845 high quality reads were obtained with approximately 8.0 M raw data for each library. The values of 69.85%-71.90% of clean data in the digital transcript abundance measurements could be mapped to the pear genome database. There were 12,054 differentially expressed genes having Gene Ontology (GO) terms and associating with 251 Kyoto Encyclopedia of Genes and Genomes (KEGG) defined pathways. The differentially expressed genes correlated with calyx abscission were mainly involved in photosynthesis, plant hormone signal transduction, cell wall modification, transcriptional regulation, and carbohydrate metabolism. Furthermore, candidate calyx abscission-specific genes, e.g. Inflorescence deficient in abscission gene, were identified. Quantitative real-time PCR was used to confirm the digital transcript abundance measurements results. We identified candidate genes that showed highly dynamic changes in

  11. An internal regulatory element controls troponin I gene expression.

    PubMed Central

    Yutzey, K E; Kline, R L; Konieczny, S F

    1989-01-01

    During skeletal myogenesis, approximately 20 contractile proteins and related gene products temporally accumulate as the cells fuse to form multinucleated muscle fibers. In most instances, the contractile protein genes are regulated transcriptionally, which suggests that a common molecular mechanism may coordinate the expression of this diverse and evolutionarily unrelated gene set. Recent studies have examined the muscle-specific cis-acting elements associated with numerous contractile protein genes. All of the identified regulatory elements are positioned in the 5'-flanking regions, usually within 1,500 base pairs of the transcription start site. Surprisingly, a DNA consensus sequence that is common to each contractile protein gene has not been identified. In contrast to the results of these earlier studies, we have found that the 5'-flanking region of the quail troponin I (TnI) gene is not sufficient to permit the normal myofiber transcriptional activation of the gene. Instead, the TnI gene utilizes a unique internal regulatory element that is responsible for the correct myofiber-specific expression pattern associated with the TnI gene. This is the first example in which a contractile protein gene has been shown to rely primarily on an internal regulatory element to elicit transcriptional activation during myogenesis. The diversity of regulatory elements associated with the contractile protein genes suggests that the temporal expression of the genes may involve individual cis-trans regulatory components specific for each gene. PMID:2725509

  12. Linking gene regulation and the exo-metabolome: A comparative transcriptomics approach to identify genes that impact on the production of volatile aroma compounds in yeast

    PubMed Central

    Rossouw, Debra; Næs, Tormod; Bauer, Florian F

    2008-01-01

    Background 'Omics' tools provide novel opportunities for system-wide analysis of complex cellular functions. Secondary metabolism is an example of a complex network of biochemical pathways, which, although well mapped from a biochemical point of view, is not well understood with regards to its physiological roles and genetic and biochemical regulation. Many of the metabolites produced by this network such as higher alcohols and esters are significant aroma impact compounds in fermentation products, and different yeast strains are known to produce highly divergent aroma profiles. Here, we investigated whether we can predict the impact of specific genes of known or unknown function on this metabolic network by combining whole transcriptome and partial exo-metabolome analysis. Results For this purpose, the gene expression levels of five different industrial wine yeast strains that produce divergent aroma profiles were established at three different time points of alcoholic fermentation in synthetic wine must. A matrix of gene expression data was generated and integrated with the concentrations of volatile aroma compounds measured at the same time points. This relatively unbiased approach to the study of volatile aroma compounds enabled us to identify candidate genes for aroma profile modification. Five of these genes, namely YMR210W, BAT1, AAD10, AAD14 and ACS1 were selected for overexpression in commercial wine yeast, VIN13. Analysis of the data shows a statistically significant correlation between the changes in the exo-metabolome of the overexpressing strains and the changes that were predicted based on the unbiased alignment of transcriptomic and exo-metabolomic data. Conclusion The data suggest that a comparative transcriptomics and metabolomics approach can be used to identify the metabolic impacts of the expression of individual genes in complex systems, and the amenability of transcriptomic data to direct applications of biotechnological relevance. PMID:18990252

  13. Defended to the Nines: 25 Years of Resistance Gene Cloning Identifies Nine Mechanisms for R Protein Function.

    PubMed

    Kourelis, Jiorgos; van der Hoorn, Renier A L

    2018-02-01

    Plants have many, highly variable resistance (R) gene loci, which provide resistance to a variety of pathogens. The first R gene to be cloned, maize (Zea mays) Hm1, was published over 25 years ago, and since then, many different R genes have been identified and isolated. The encoded proteins have provided clues to the diverse molecular mechanisms underlying immunity. Here, we present a meta-analysis of 314 cloned R genes. The majority of R genes encode cell surface or intracellular receptors, and we distinguish nine molecular mechanisms by which R proteins can elevate or trigger disease resistance: direct (1) or indirect (2) perception of pathogen-derived molecules on the cell surface by receptor-like proteins and receptor-like kinases; direct (3) or indirect (4) intracellular detection of pathogen-derived molecules by nucleotide binding, leucine-rich repeat receptors, or detection through integrated domains (5); perception of transcription activator-like effectors through activation of executor genes (6); and active (7), passive (8), or host reprogramming-mediated (9) loss of susceptibility. Although the molecular mechanisms underlying the functions of R genes are only understood for a small proportion of known R genes, a clearer understanding of mechanisms is emerging and will be crucial for rational engineering and deployment of novel R genes. © 2018 American Society of Plant Biologists. All rights reserved.

  14. Nuclear Factor Kappa B Activation Occurs in the Amnion Prior to Labour Onset and Modulates the Expression of Numerous Labour Associated Genes

    PubMed Central

    Lim, Sheri; MacIntyre, David A.; Lee, Yun S.; Khanjani, Shirin; Terzidou, Vasso; Teoh, T. G.; Bennett, Phillip R.

    2012-01-01

    Background Prior to the onset of human labour there is an increase in the synthesis of prostaglandins, cytokines and chemokines in the fetal membranes, particularly the amnion. This is associated with activation of the transcription factor nuclear factor kappa B (NFκB). In this study we characterised the level of NFκB activity in amnion epithelial cells as a measure of amnion activation in samples collected from women undergoing caesarean section at 39 weeks gestation prior to the onset of labour. Methodology/Principal Findings We found that a proportion of women exhibit low or moderate NFκB activity while other women exhibit high levels of NFκB activity (n = 12). This activation process does not appear to involve classical pathways of NFκB activation but rather is correlated with an increase in nuclear p65-Rel-B dimers. To identify the full range of genes upregulated in association with amnion activation, microarray analysis was performed on carefully characterised non-activated amnion (n = 3) samples and compared to activated samples (n = 3). A total of 919 genes were upregulated in response to amnion activation including numerous inflammatory genes such as cyclooxygenase-2 (COX-2, 44-fold), interleukin 8 (IL-8, 6-fold), IL-1 receptor accessory protein (IL-1RAP, 4.5-fold), thrombospondin 1 (TSP-1, 3-fold) and, unexpectedly, oxytocin receptor (OTR, 24-fold). Ingenuity Pathway Analysis of the microarray data reveals that the two main gene networks activated concurrently with amnion activation are i) cell death, cancer and morphology and ii) cell cycle, embryonic development and tissue development. Conclusions/Significance Our results indicate that assessment of amnion NFκB activation is critical for accurate sample classification and subsequent interpretation of data. Collectively, our data suggest amnion activation is largely an inflammatory event that occurs in the amnion epithelial layer as a prelude to the onset of labour. PMID:22485186

  15. A genome-wide survey of CD4+ lymphocyte regulatory genetic variants identifies novel asthma genes

    PubMed Central

    Sharma, Sunita; Zhou, Xiaobo; Thibault, Derek M.; Himes, Blanca E.; Liu, Andy; Szefler, Stanley J.; Strunk, Robert; Castro, Mario; Hansel, Nadia N.; Diette, Gregory B.; Vonakis, Becky M.; Adkinson, N. Franklin; Avila, Lydiana; Soto-Quiros, Manuel; Barraza-Villareal, Albino; Lemanske, Robert F.; Solway, Julian; Krishnan, Jerry; White, Steven R.; Cheadle, Chris; Berger, Alan E.; Fan, Jinshui; Boorgula, Meher Preethi; Nicolae, Dan; Gilliland, Frank; Barnes, Kathleen; London, Stephanie J.; Martinez, Fernando; Ober, Carole; Celedón, Juan C.; Carey, Vincent J.; Weiss, Scott T.; Raby, Benjamin A.

    2014-01-01

    Background Genome-wide association studies have yet to identify the majority of genetic variants involved in asthma. We hypothesized that expression quantitative trait locus (eQTL) mapping can identify novel asthma genes by enabling prioritization of putative functional variants for association testing. Objective We evaluated 6,706 cis-acting expression-associated variants (eSNP) identified through a genome-wide eQTL survey of CD4+ lymphocytes for association with asthma. Methods eSNP were tested for association with asthma in 359 asthma cases and 846 controls from the Childhood Asthma Management Program, with verification using family-based testing. Significant associations were tested for replication in 579 parent-child trios with asthma from Costa Rica. Further functional validation was performed by Formaldehyde Assisted Isolation of Regulatory Elements (FAIRE)-qPCR and Chromatin-Immunoprecipitation (ChIP)-PCR in lung derived epithelial cell lines (Beas-2B and A549) and Jurkat cells, a leukemia cell line derived from T lymphocytes. Results Cis-acting eSNP demonstrated associations with asthma in both cohorts. We confirmed the previously-reported association of ORMDL3/GSDMB variants with asthma (combined p=2.9 × 10(-8)). Reproducible associations were also observed for eSNP in three additional genes: FADS2 (p=0.002), NAGA (p=0.0002), and F13A1 (p=0.0001). We subsequently demonstrated that FADS2 mRNA is increased in CD4+ lymphocytes in asthmatics, and that the associated eSNPs reside within DNA segments with histone modifications that denote open chromatin status and confer enhancer activity. Conclusions Our results demonstrate the utility of eQTL mapping in the identification of novel asthma genes, and provide evidence for the importance of FADS2, NAGA, and F13A1 in the pathogenesis of asthma. PMID:24934276

  16. Genome-Wide Association Identifies SLC2A9 and NLN Gene Regions as Associated with Entropion in Domestic Sheep

    PubMed Central

    Mousel, Michelle R.; Reynolds, James O.; White, Stephen N.

    2015-01-01

    Entropion is an inward rolling of the eyelid allowing contact between the eyelashes and cornea that may lead to blindness if not corrected. Although many mammalian species, including humans and dogs, are afflicted by congenital entropion, no specific genes or gene regions related to development of entropion have been reported in any mammalian species to date. Entropion in domestic sheep is known to have a genetic component; therefore, we used domestic sheep as a model system to identify genomic regions containing genes associated with entropion. A genome-wide association was conducted with congenital entropion in 998 Columbia, Polypay, and Rambouillet sheep genotyped with 50,000 SNP markers. Prevalence of entropion was 6.01%, with all breeds represented. Logistic regression was performed in PLINK with additive allelic, recessive, dominant, and genotypic inheritance models. Two genome-wide significant (empirical P<0.05) SNP were identified, specifically markers in SLC2A9 (empirical P = 0.007; genotypic model) and near NLN (empirical P = 0.026; dominance model). Six additional genome-wide suggestive SNP (nominal P<1x10(-5)) were identified including markers in or near PIK3CB (P = 2.22x10(-6); additive model), KCNB1 (P = 2.93x10(-6); dominance model), ZC3H12C (P = 3.25x10(-6); genotypic model), JPH1 (P = 4.68x10(-6); genotypic model), and MYO3B (P = 5.74x10(-6); recessive model). This is the first report of specific gene regions associated with congenital entropion in any mammalian species, to our knowledge. Further, none of these genes have previously been associated with any eyelid traits. These results represent the first genome-wide analysis of gene regions associated with entropion and provide target regions for the development of sheep genetic markers for marker-assisted selection. PMID:26098909
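
    The four inheritance models named above differ only in how a genotype is turned into a predictor for the logistic regression. The sketch below shows the conventional encodings in terms of the number of minor alleles an animal carries; it is an illustration of the general idea, not PLINK's internal code, and the function name and dictionary keys are hypothetical.

```python
def encode_genotype(minor_allele_count):
    """Encode one SNP genotype under four common inheritance models.

    `minor_allele_count` is 0, 1 or 2 copies of the minor allele.
    The encodings follow the usual textbook definitions (an assumption
    here, since the abstract does not spell them out).
    """
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    return {
        # Additive: each extra copy shifts the log-odds by the same amount.
        "additive": minor_allele_count,
        # Dominant: one copy is enough for the full effect.
        "dominant": int(minor_allele_count >= 1),
        # Recessive: the effect requires two copies.
        "recessive": int(minor_allele_count == 2),
        # Genotypic: separate indicators for heterozygotes and
        # minor-allele homozygotes, giving a 2-degree-of-freedom test.
        "genotypic": (int(minor_allele_count == 1), int(minor_allele_count == 2)),
    }


heterozygote = encode_genotype(1)
homozygote = encode_genotype(2)
```

    A heterozygote therefore counts as a carrier under the dominant model but not under the recessive one, which is why the same SNP can reach significance under one model and not another.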

  17. Genome-Wide Association Identifies SLC2A9 and NLN Gene Regions as Associated with Entropion in Domestic Sheep.

    PubMed

    Mousel, Michelle R; Reynolds, James O; White, Stephen N

    2015-01-01

    Entropion is an inward rolling of the eyelid allowing contact between the eyelashes and cornea that may lead to blindness if not corrected. Although many mammalian species, including humans and dogs, are afflicted by congenital entropion, no specific genes or gene regions related to development of entropion have been reported in any mammalian species to date. Entropion in domestic sheep is known to have a genetic component; therefore, we used domestic sheep as a model system to identify genomic regions containing genes associated with entropion. A genome-wide association was conducted with congenital entropion in 998 Columbia, Polypay, and Rambouillet sheep genotyped with 50,000 SNP markers. Prevalence of entropion was 6.01%, with all breeds represented. Logistic regression was performed in PLINK with additive allelic, recessive, dominant, and genotypic inheritance models. Two genome-wide significant (empirical P<0.05) SNP were identified, specifically markers in SLC2A9 (empirical P = 0.007; genotypic model) and near NLN (empirical P = 0.026; dominance model). Six additional genome-wide suggestive SNP (nominal P<1x10(-5)) were identified including markers in or near PIK3CB (P = 2.22x10(-6); additive model), KCNB1 (P = 2.93x10(-6); dominance model), ZC3H12C (P = 3.25x10(-6); genotypic model), JPH1 (P = 4.68x10(-6); genotypic model), and MYO3B (P = 5.74x10(-6); recessive model). This is the first report of specific gene regions associated with congenital entropion in any mammalian species, to our knowledge. Further, none of these genes have previously been associated with any eyelid traits. These results represent the first genome-wide analysis of gene regions associated with entropion and provide target regions for the development of sheep genetic markers for marker-assisted selection.

  18. Early and long-standing rheumatoid arthritis: distinct molecular signatures identified by gene-expression profiling in synovia

    PubMed Central

    Lequerré, Thierry; Bansard, Carine; Vittecoq, Olivier; Derambure, Céline; Hiron, Martine; Daveau, Maryvonne; Tron, François; Ayral, Xavier; Biga, Norman; Auquit-Auckbur, Isabelle; Chiocchia, Gilles; Le Loët, Xavier; Salier, Jean-Philippe

    2009-01-01

    Introduction Rheumatoid arthritis (RA) is a heterogeneous disease and its underlying molecular mechanisms are still poorly understood. Because previous microarray studies have only focused on long-standing (LS) RA compared to osteoarthritis, we aimed to compare the molecular profiles of early and LS RA versus control synovia. Methods Synovial biopsies were obtained by arthroscopy from 15 patients (4 early untreated RA, 4 treated LS RA and 7 controls, who had traumatic or mechanical lesions). Extracted mRNAs were used for large-scale gene-expression profiling. The different gene-expression combinations identified by comparison of profiles of early, LS RA and healthy synovia were linked to the biological processes involved in each situation. Results Three combinations of 719, 116 and 52 transcripts discriminated, respectively, early from LS RA, and early or LS RA from healthy synovia. We identified several gene clusters and distinct molecular signatures specifically expressed during early or LS RA, thereby suggesting the involvement of different pathophysiological mechanisms during the course of RA. Conclusions Early and LS RA have distinct molecular signatures with different biological processes participating at different times during the course of the disease. These results suggest that better knowledge of the main biological processes involved at a given RA stage might help to choose the most appropriate treatment. PMID:19563633

  19. Gene-centric meta-analysis in 87,736 individuals of European ancestry identifies multiple blood-pressure-related loci.

    PubMed

    Tragante, Vinicius; Barnes, Michael R; Ganesh, Santhi K; Lanktree, Matthew B; Guo, Wei; Franceschini, Nora; Smith, Erin N; Johnson, Toby; Holmes, Michael V; Padmanabhan, Sandosh; Karczewski, Konrad J; Almoguera, Berta; Barnard, John; Baumert, Jens; Chang, Yen-Pei Christy; Elbers, Clara C; Farrall, Martin; Fischer, Mary E; Gaunt, Tom R; Gho, Johannes M I H; Gieger, Christian; Goel, Anuj; Gong, Yan; Isaacs, Aaron; Kleber, Marcus E; Mateo Leach, Irene; McDonough, Caitrin W; Meijs, Matthijs F L; Melander, Olle; Nelson, Christopher P; Nolte, Ilja M; Pankratz, Nathan; Price, Tom S; Shaffer, Jonathan; Shah, Sonia; Tomaszewski, Maciej; van der Most, Peter J; Van Iperen, Erik P A; Vonk, Judith M; Witkowska, Kate; Wong, Caroline O L; Zhang, Li; Beitelshees, Amber L; Berenson, Gerald S; Bhatt, Deepak L; Brown, Morris; Burt, Amber; Cooper-DeHoff, Rhonda M; Connell, John M; Cruickshanks, Karen J; Curtis, Sean P; Davey-Smith, George; Delles, Christian; Gansevoort, Ron T; Guo, Xiuqing; Haiqing, Shen; Hastie, Claire E; Hofker, Marten H; Hovingh, G Kees; Kim, Daniel S; Kirkland, Susan A; Klein, Barbara E; Klein, Ronald; Li, Yun R; Maiwald, Steffi; Newton-Cheh, Christopher; O'Brien, Eoin T; Onland-Moret, N Charlotte; Palmas, Walter; Parsa, Afshin; Penninx, Brenda W; Pettinger, Mary; Vasan, Ramachandran S; Ranchalis, Jane E; Ridker, Paul M; Rose, Lynda M; Sever, Peter; Shimbo, Daichi; Steele, Laura; Stolk, Ronald P; Thorand, Barbara; Trip, Mieke D; van Duijn, Cornelia M; Verschuren, W Monique; Wijmenga, Cisca; Wyatt, Sharon; Young, J Hunter; Zwinderman, Aeilko H; Bezzina, Connie R; Boerwinkle, Eric; Casas, Juan P; Caulfield, Mark J; Chakravarti, Aravinda; Chasman, Daniel I; Davidson, Karina W; Doevendans, Pieter A; Dominiczak, Anna F; FitzGerald, Garret A; Gums, John G; Fornage, Myriam; Hakonarson, Hakon; Halder, Indrani; Hillege, Hans L; Illig, Thomas; Jarvik, Gail P; Johnson, Julie A; Kastelein, John J P; Koenig, Wolfgang; Kumari, Meena; März, Winfried; Murray, Sarah S; O'Connell, Jeffery R; Oldehinkel, Albertine J; Pankow, James S; Rader, Daniel J; Redline, Susan; Reilly, Muredach P; Schadt, Eric E; Kottke-Marchant, Kandice; Snieder, Harold; Snyder, Michael; Stanton, Alice V; Tobin, Martin D; Uitterlinden, André G; van der Harst, Pim; van der Schouw, Yvonne T; Samani, Nilesh J; Watkins, Hugh; Johnson, Andrew D; Reiner, Alex P; Zhu, Xiaofeng; de Bakker, Paul I W; Levy, Daniel; Asselbergs, Folkert W; Munroe, Patricia B; Keating, Brendan J

    2014-03-06

    Blood pressure (BP) is a heritable risk factor for cardiovascular disease. To investigate genetic associations with systolic BP (SBP), diastolic BP (DBP), mean arterial pressure (MAP), and pulse pressure (PP), we genotyped ~50,000 SNPs in up to 87,736 individuals of European ancestry and combined these in a meta-analysis. We replicated findings in an independent set of 68,368 individuals of European ancestry. Our analyses identified 11 previously undescribed associations in independent loci containing 31 genes including PDE1A, HLA-DQB1, CDK6, PRKAG2, VCL, H19, NUCB2, RELA, HOXC@ complex, FBN1, and NFAT5 at the Bonferroni-corrected array-wide significance threshold (p < 6 × 10^-7) and confirmed 27 previously reported associations. Bioinformatic analysis of the 11 loci provided support for a putative role in hypertension of several genes, such as CDK6 and NUCB2. Analysis of potential pharmacological targets in databases of small molecules showed that ten of the genes are predicted to be targets for small molecules. In summary, we identified previously unknown loci associated with BP. Our findings extend our understanding of genes involved in BP regulation, which may provide new targets for therapeutic intervention or drug response stratification. Copyright © 2014 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  20. Integration of genome-wide association studies with biological knowledge identifies six novel genes related to kidney function.

    PubMed

    Chasman, Daniel I; Fuchsberger, Christian; Pattaro, Cristian; Teumer, Alexander; Böger, Carsten A; Endlich, Karlhans; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Taliun, Daniel; Li, Man; Gao, Xiaoyi; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C; O'Seaghdha, Conall M; Glazer, Nicole; Isaacs, Aaron; Liu, Ching-Ti; Smith, Albert V; O'Connell, Jeffrey R; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Johnson, Andrew D; Gierman, Hinco J; Feitosa, Mary F; Hwang, Shih-Jen; Atkinson, Elizabeth J; Lohman, Kurt; Cornelis, Marilyn C; Johansson, Asa; Tönjes, Anke; Dehghan, Abbas; Lambert, Jean-Charles; Holliday, Elizabeth G; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y; Murgia, Federico; Trompet, Stella; Imboden, Medea; Coassin, Stefan; Pistis, Giorgio; Harris, Tamara B; Launer, Lenore J; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D; Boerwinkle, Eric; Schmidt, Helena; Cavalieri, Margherita; Rao, Madhumathi; Hu, Frank; Demirkan, Ayse; Oostra, Ben A; de Andrade, Mariza; Turner, Stephen T; Ding, Jingzhong; Andrews, Jeanette S; Freedman, Barry I; Giulianini, Franco; Koenig, Wolfgang; Illig, Thomas; Meisinger, Christa; Gieger, Christian; Zgaga, Lina; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H; Wright, Alan F; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G; Rivadeneira, Fernando; Aulchenko, Yurii S; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Stengel, Bénédicte; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Ketkar, Shamika; Colhoun, Helen; Doney, Alex; Robino, Antonietta; Krämer, Bernhard K; Portas, Laura; Ford, Ian; Buckley, Brendan M; Adam, Martin; Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Mitchell, Paul; Ciullo, Marina; Kim, Stuart K; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J Wouter; Probst-Hensch, Nicole M; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; Siscovick, David S; van Duijn, Cornelia M; Borecki, Ingrid B; Kardia, Sharon L R; Liu, Yongmei; Curhan, Gary C; Rudan, Igor; Gyllensten, Ulf; Wilson, James F; Franke, Andre; Pramstaller, Peter P; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline; Hayward, Caroline; Ridker, Paul M; Parsa, Afshin; Bochud, Murielle; Heid, Iris M; Kao, W H Linda; Fox, Caroline S; Köttgen, Anna

    2012-12-15

    In conducting genome-wide association studies (GWAS), analytical approaches leveraging biological information may further understanding of the pathophysiology of clinical traits. To discover novel associations with estimated glomerular filtration rate (eGFR), a measure of kidney function, we developed a strategy for integrating prior biological knowledge into the existing GWAS data for eGFR from the CKDGen Consortium. Our strategy focuses on single nucleotide polymorphisms (SNPs) in genes that are connected by functional evidence, determined by literature mining and gene ontology (GO) hierarchies, to genes near previously validated eGFR associations. It then requires association thresholds consistent with multiple testing, and finally evaluates novel candidates by independent replication. Among the samples of European ancestry, we identified a genome-wide significant SNP in FBXL20 (P = 5.6 × 10^-9) in meta-analysis of all available data, and additional SNPs at the INHBC, LRP2, PLEKHA1, SLC3A2 and SLC7A6 genes meeting multiple-testing corrected significance for replication and overall P-values of 4.5 × 10^-4 to 2.2 × 10^-7. Neither the novel PLEKHA1 nor FBXL20 associations, both further supported by association with eGFR among African Americans and with transcript abundance, would have been implicated by eGFR candidate gene approaches. LRP2, encoding the megalin receptor, was identified through connection with the previously known eGFR gene DAB2 and extends understanding of the megalin system in kidney function. These findings highlight integration of existing genome-wide association data with independent biological knowledge to uncover novel candidate eGFR associations, including candidates lacking known connections to kidney-specific pathways. The strategy may also be applicable to other clinical phenotypes, although more testing will be needed to assess its potential for discovery in general.
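
    The three-step strategy described in this abstract (restrict to SNPs in functionally connected genes, apply a multiple-testing threshold, then require independent replication) could be sketched roughly as follows. The abstract does not specify the correction used, so Bonferroni is an assumption here, and the `SnpResult` type, the `select_candidates` function, and all gene and SNP labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpResult:
    snp_id: str            # placeholder label, not a real variant ID
    gene: str              # gene the SNP maps to
    p_discovery: float     # p-value in the discovery GWAS data
    p_replication: float   # p-value in the independent replication data


def select_candidates(results: list[SnpResult], connected_genes: set[str],
                      alpha: float = 0.05) -> list[SnpResult]:
    """Filter SNPs through the three steps, using Bonferroni as an assumed correction."""
    # Step 1: keep only SNPs in genes connected to known trait genes.
    in_scope = [r for r in results if r.gene in connected_genes]
    if not in_scope:
        return []
    # Step 2: discovery threshold corrected for the number of in-scope SNPs.
    discovery_threshold = alpha / len(in_scope)
    passed = [r for r in in_scope if r.p_discovery < discovery_threshold]
    if not passed:
        return []
    # Step 3: replication threshold corrected for the SNPs carried forward.
    replication_threshold = alpha / len(passed)
    return [r for r in passed if r.p_replication < replication_threshold]


if __name__ == "__main__":
    results = [
        SnpResult("snp_1", "GENE_A", 1e-4, 0.01),
        SnpResult("snp_2", "GENE_B", 0.2, 0.001),
        SnpResult("snp_3", "GENE_C", 1e-10, 1e-5),  # strong, but gene not connected
    ]
    for hit in select_candidates(results, {"GENE_A", "GENE_B"}):
        print(hit.snp_id, hit.gene)
```

    Restricting the search space before testing is what relaxes the threshold: the correction is applied over the in-scope SNPs only, not over every SNP in the genome-wide data.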