Sample records for gene variant databases

  1. Mutation databases for inherited renal disease: are they complete, accurate, clinically relevant, and freely available?

    PubMed

    Savige, Judy; Dagher, Hayat; Povey, Sue

    2014-07-01

    This study examined whether gene-specific DNA variant databases for inherited diseases of the kidney fulfilled the Human Variome Project recommendations of being complete, accurate, clinically relevant and freely available. A recent review identified 60 inherited renal diseases caused by mutations in 132 genes. The disease name, MIM number, gene name, together with "mutation" or "database," were used to identify web-based databases. Fifty-nine diseases (98%) due to mutations in 128 genes had a variant database. Altogether there were 349 databases (a median of 3 per gene, range 0-6), but no gene had two databases with the same number of variants, and 165 (50%) databases included fewer than 10 variants. About half the databases (180, 54%) had been updated in the previous year. Few (77, 23%) were curated by "experts" but these included nine of the 11 with the most variants. Even fewer databases (41, 12%) included clinical features apart from the name of the associated disease. Most (223, 67%) could be accessed without charge, including those for 50 genes (40%) with the maximum number of variants. Future efforts should focus on encouraging experts to collaborate on a single database for each gene affected in inherited renal disease, including both unpublished variants, and clinical phenotypes. © 2014 WILEY PERIODICALS, INC.

  2. The Clinical Next-Generation Sequencing Database: A Tool for the Unified Management of Clinical Information and Genetic Variants to Accelerate Variant Pathogenicity Classification.

    PubMed

    Nishio, Shin-Ya; Usami, Shin-Ichi

    2017-03-01

    Recent advances in next-generation sequencing (NGS) have given rise to new challenges due to the difficulties in variant pathogenicity interpretation and large dataset management, including many kinds of public population databases as well as public or commercial disease-specific databases. Here, we report a new database development tool, named the "Clinical NGS Database," for improving clinical NGS workflow through the unified management of variant information and clinical information. This database software offers a two-feature approach to variant pathogenicity classification. The first of these approaches is a phenotype similarity-based approach. This database allows the easy comparison of the detailed phenotype of each patient with the average phenotype of the same gene mutation at the variant or gene level. It is also possible to browse patients with the same gene mutation quickly. The other approach is a statistical approach to variant pathogenicity classification based on the use of the odds ratio for comparisons between the case and the control for each inheritance mode (families with apparently autosomal dominant inheritance vs. control, and families with apparently autosomal recessive inheritance vs. control). A number of case studies are also presented to illustrate the utility of this database. © 2016 The Authors. Human Mutation published by Wiley Periodicals, Inc.
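
    The odds-ratio comparison described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the function name, the 0.5 continuity correction (Haldane-Anscombe), and the Wald 95% confidence interval are all assumptions, since the abstract does not say how the odds ratio is computed.

```python
import math


def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Return (odds ratio, lower 95% bound, upper 95% bound).

    Assumption: 0.5 is added to every cell (Haldane-Anscombe correction)
    so that a zero count does not cause a division by zero.
    """
    a = case_carriers + 0.5                      # cases carrying the variant
    b = case_total - case_carriers + 0.5         # cases without the variant
    c = control_carriers + 0.5                   # controls carrying the variant
    d = control_total - control_carriers + 0.5   # controls without the variant
    ratio = (a * d) / (b * c)
    # Standard error of log(OR), used for a Wald-type interval.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - 1.96 * se)
    upper = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, lower, upper
```

    In this sketch the function would be called once per inheritance mode, for example with counts from apparently autosomal dominant families versus controls, and again with counts from apparently autosomal recessive families versus controls.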

  3. The Finnish disease heritage database (FinDis) update-a database for the genes mutated in the Finnish disease heritage brought to the next-generation sequencing era.

    PubMed

    Polvi, Anne; Linturi, Henna; Varilo, Teppo; Anttonen, Anna-Kaisa; Byrne, Myles; Fokkema, Ivo F A C; Almusa, Henrikki; Metzidis, Anthony; Avela, Kristiina; Aula, Pertti; Kestilä, Marjo; Muilu, Juha

    2013-11-01

    The Finnish Disease Heritage Database (FinDis) (http://findis.org) was originally published in 2004 as a centralized information resource for rare monogenic diseases enriched in the Finnish population. The FinDis database originally contained 405 causative variants for 30 diseases. At the time, the FinDis database was a comprehensive collection of data, but since then, a large amount of new information has emerged, making the need to update the database evident. We collected information and updated the database to contain genes and causative variants for 35 diseases, including six more genes and more than 1,400 additional disease-causing variants. Information for causative variants for each gene is collected under the LOVD 3.0 platform, enabling easy updating. The FinDis portal provides a centralized resource and user interface to link information on each disease and gene with variant data in the LOVD 3.0 platform. The software written to achieve this has been open-sourced and made available on GitHub (http://github.com/findis-db), allowing biomedical institutions in other countries to present their national data in a similar way, and to both contribute to, and benefit from, standardized variation data. The updated FinDis portal provides a unique resource to assist patient diagnosis, research, and the development of new cures. © 2013 WILEY PERIODICALS, INC.

  4. GETPrime: a gene- or transcript-specific primer database for quantitative real-time PCR.

    PubMed

    Gubelmann, Carine; Gattiker, Alexandre; Massouras, Andreas; Hens, Korneel; David, Fabrice; Decouttere, Frederik; Rougemont, Jacques; Deplancke, Bart

    2011-01-01

    The vast majority of genes in humans and other organisms undergo alternative splicing, yet the biological function of splice variants is still very poorly understood in large part because of the lack of simple tools that can map the expression profiles and patterns of these variants with high sensitivity. High-throughput quantitative real-time polymerase chain reaction (qPCR) is an ideal technique to accurately quantify nucleic acid sequences including splice variants. However, currently available primer design programs do not distinguish between splice variants and also differ substantially in overall quality, functionality or throughput mode. Here, we present GETPrime, a primer database supported by a novel platform that uniquely combines and automates several features critical for optimal qPCR primer design. These include the consideration of all gene splice variants to enable either gene-specific (covering the majority of splice variants) or transcript-specific (covering one splice variant) expression profiling, primer specificity validation, automated best primer pair selection according to strict criteria and graphical visualization of the latter primer pairs within their genomic context. GETPrime primers have been extensively validated experimentally, demonstrating high transcript specificity in complex samples. Thus, the free-access, user-friendly GETPrime database allows fast primer retrieval and visualization for genes or groups of genes of most common model organisms, and is available at http://updepla1srv1.epfl.ch/getprime/. Database URL: http://deplanckelab.epfl.ch.

  5. GETPrime: a gene- or transcript-specific primer database for quantitative real-time PCR

    PubMed Central

    Gubelmann, Carine; Gattiker, Alexandre; Massouras, Andreas; Hens, Korneel; David, Fabrice; Decouttere, Frederik; Rougemont, Jacques; Deplancke, Bart

    2011-01-01

    The vast majority of genes in humans and other organisms undergo alternative splicing, yet the biological function of splice variants is still very poorly understood in large part because of the lack of simple tools that can map the expression profiles and patterns of these variants with high sensitivity. High-throughput quantitative real-time polymerase chain reaction (qPCR) is an ideal technique to accurately quantify nucleic acid sequences including splice variants. However, currently available primer design programs do not distinguish between splice variants and also differ substantially in overall quality, functionality or throughput mode. Here, we present GETPrime, a primer database supported by a novel platform that uniquely combines and automates several features critical for optimal qPCR primer design. These include the consideration of all gene splice variants to enable either gene-specific (covering the majority of splice variants) or transcript-specific (covering one splice variant) expression profiling, primer specificity validation, automated best primer pair selection according to strict criteria and graphical visualization of the latter primer pairs within their genomic context. GETPrime primers have been extensively validated experimentally, demonstrating high transcript specificity in complex samples. Thus, the free-access, user-friendly GETPrime database allows fast primer retrieval and visualization for genes or groups of genes of most common model organisms, and is available at http://updepla1srv1.epfl.ch/getprime/. Database URL: http://deplanckelab.epfl.ch. PMID:21917859

  6. HbVar: A relational database of human hemoglobin variants and thalassemia mutations at the globin gene server.

    PubMed

    Hardison, Ross C; Chui, David H K; Giardine, Belinda; Riemer, Cathy; Patrinos, George P; Anagnou, Nicholas; Miller, Webb; Wajcman, Henri

    2002-03-01

    We have constructed a relational database of hemoglobin variants and thalassemia mutations, called HbVar, which can be accessed on the web at http://globin.cse.psu.edu. Extensive information is recorded for each variant and mutation, including a description of the variant and associated pathology, hematology, electrophoretic mobility, methods of isolation, stability information, ethnic occurrence, structure studies, functional studies, and references. The initial information was derived from books by Dr. Titus Huisman and colleagues [Huisman et al., 1996, 1997, 1998]. The current database is updated regularly with the addition of new data and corrections to previous data. Queries can be formulated based on fields in the database. Tables of common categories of variants, such as all those involving the alpha1-globin gene (HBA1) or all those that result in high oxygen affinity, are maintained by automated queries on the database. Users can formulate more precise queries, such as identifying "all beta-globin variants associated with instability and found in Scottish populations." This new database should be useful for clinical diagnosis as well as in fundamental studies of hemoglobin biochemistry, globin gene regulation, and human sequence variation at these loci. Copyright 2002 Wiley-Liss, Inc.
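
    The kind of field-based query quoted above can be sketched against a toy relational table. This is only an illustration under stated assumptions: the table name, column names, and placeholder rows are hypothetical and do not reflect the real HbVar schema or its contents. The only fact relied on is that HBB is the beta-globin gene.

```python
import sqlite3


def find_variants(conn, gene, stability, population):
    """Return variant names matching all three fields (hypothetical schema)."""
    rows = conn.execute(
        "SELECT name FROM variants "
        "WHERE gene = ? AND stability = ? AND population = ? "
        "ORDER BY name",
        (gene, stability, population),
    ).fetchall()
    return [row[0] for row in rows]


# In-memory database with placeholder rows; none of this is real HbVar data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE variants (name TEXT, gene TEXT, stability TEXT, population TEXT)"
)
conn.executemany(
    "INSERT INTO variants VALUES (?, ?, ?, ?)",
    [
        ("placeholder_1", "HBB", "unstable", "Scottish"),
        ("placeholder_2", "HBB", "stable", "Scottish"),
        ("placeholder_3", "HBA1", "unstable", "Scottish"),
        ("placeholder_4", "HBB", "unstable", "other"),
    ],
)

# "All beta-globin variants associated with instability and found in
# Scottish populations", expressed as a parameterized query.
hits = find_variants(conn, "HBB", "unstable", "Scottish")
```

    Parameter placeholders (`?`) are used rather than string formatting so that user-supplied field values cannot alter the SQL statement.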

  7. Genetic variants of the DNA repair genes from Exome Aggregation Consortium (ExAC) database: significance in cancer.

    PubMed

    Das, Raima; Ghosh, Sankar Kumar

    2017-04-01

    The DNA repair pathway is a primary defense system that eliminates a wide variety of DNA damage. Any deficiency in it is likely to cause the chromosomal instability that leads to cell malfunction and tumorigenesis. Genetic polymorphisms in DNA repair genes have demonstrated a significant association with cancer risk. Our study attempts to give an overview of the germline polymorphisms in the DNA repair genes, using the Exome Aggregation Consortium (ExAC) database as well as the Human Gene Mutation Database (HGMD) to evaluate the disease link, particularly in cancer. The ExAC DNA repair dataset (which consists of 228 DNA repair genes) was found to comprise 30.4% missense, 12.5% dbSNP-reported and 3.2% ClinVar-significant variants. Of all the missense variants, 27% have the deleterious SIFT score of 0.00 and 6% carry the most damaging PolyPhen-2 score of 1.00, thus affecting protein structure and function. However, as per HGMD, only a fraction (1.2%) of ExAC DNA repair variants was found to be cancer-related, indicating that the remaining variants reported in both databases need further analysis. This, in turn, may broaden the spectrum of reported cancer-linked variants in the DNA repair genes present in the ExAC database. Moreover, further in silico functional assays of the identified cancer-associated variants, which are essential to establish their actual biological significance, may shed some light on targeted drug development in the near future. Copyright © 2017. Published by Elsevier B.V.

  8. Novel LOVD databases for hereditary breast cancer and colorectal cancer genes in the Chinese population.

    PubMed

    Pan, Min; Cong, Peikuan; Wang, Yue; Lin, Changsong; Yuan, Ying; Dong, Jian; Banerjee, Santasree; Zhang, Tao; Chen, Yanling; Zhang, Ting; Chen, Mingqing; Hu, Peter; Zheng, Shu; Zhang, Jin; Qi, Ming

    2011-12-01

    The Human Variome Project (HVP) is an international consortium of clinicians, geneticists, and researchers from over 30 countries, aiming to facilitate the establishment and maintenance of standards, systems, and infrastructure for the worldwide collection and sharing of all genetic variations affecting human disease. The HVP-China Node will build new and supplement existing databases of genetic diseases. As the first effort, we have created a novel variant database of BRCA1 and BRCA2, mismatch repair genes (MMR), and APC genes for breast cancer, Lynch syndrome, and familial adenomatous polyposis (FAP), respectively, in the Chinese population using the Leiden Open Variation Database (LOVD) format. We searched PubMed and some Chinese search engines to collect all the variants of these genes in the Chinese population that have already been detected and reported. There are some differences in the gene variants between the Chinese population and that of other ethnicities. The database is available online at http://www.genomed.org/LOVD/. Our database will appear to users who survey other LOVD databases (e.g., by Google search, or by NCBI GeneTests search). Remote submissions are accepted, and the information is updated monthly. © 2011 Wiley Periodicals, Inc.

  9. Human Variome Project Quality Assessment Criteria for Variation Databases.

    PubMed

    Vihinen, Mauno; Hancock, John M; Maglott, Donna R; Landrum, Melissa J; Schaafsma, Gerard C P; Taschner, Peter

    2016-06-01

    Numerous databases containing information about DNA, RNA, and protein variations are available. Gene-specific variant databases (locus-specific variation databases, LSDBs) are typically curated and maintained for single genes or groups of genes for a certain disease(s). These databases are widely considered as the most reliable information source for a particular gene/protein/disease, but it should also be made clear they may have widely varying contents, infrastructure, and quality. Quality is very important to evaluate because these databases may affect health decision-making, research, and clinical practice. The Human Variome Project (HVP) established a Working Group for Variant Database Quality Assessment. The basic principle was to develop a simple system that nevertheless provides a good overview of the quality of a database. The HVP quality evaluation criteria that resulted are divided into four main components: data quality, technical quality, accessibility, and timeliness. This report elaborates on the developed quality criteria and how implementation of the quality scheme can be achieved. Examples are provided for the current status of the quality items in two different databases, BTKbase, an LSDB, and ClinVar, a central archive of submissions about variants and their clinical significance. © 2016 WILEY PERIODICALS, INC.

  10. Application of a 5-tiered scheme for standardized classification of 2,360 unique mismatch repair gene variants in the InSiGHT locus-specific database.

    PubMed

    Thompson, Bryony A; Spurdle, Amanda B; Plazzer, John-Paul; Greenblatt, Marc S; Akagi, Kiwamu; Al-Mulla, Fahd; Bapat, Bharati; Bernstein, Inge; Capellá, Gabriel; den Dunnen, Johan T; du Sart, Desiree; Fabre, Aurelie; Farrell, Michael P; Farrington, Susan M; Frayling, Ian M; Frebourg, Thierry; Goldgar, David E; Heinen, Christopher D; Holinski-Feder, Elke; Kohonen-Corish, Maija; Robinson, Kristina Lagerstedt; Leung, Suet Yi; Martins, Alexandra; Moller, Pal; Morak, Monika; Nystrom, Minna; Peltomaki, Paivi; Pineda, Marta; Qi, Ming; Ramesar, Rajkumar; Rasmussen, Lene Juel; Royer-Pokora, Brigitte; Scott, Rodney J; Sijmons, Rolf; Tavtigian, Sean V; Tops, Carli M; Weber, Thomas; Wijnen, Juul; Woods, Michael O; Macrae, Finlay; Genuardi, Maurizio

    2014-02-01

    The clinical classification of hereditary sequence variants identified in disease-related genes directly affects clinical management of patients and their relatives. The International Society for Gastrointestinal Hereditary Tumours (InSiGHT) undertook a collaborative effort to develop, test and apply a standardized classification scheme to constitutional variants in the Lynch syndrome-associated genes MLH1, MSH2, MSH6 and PMS2. Unpublished data submission was encouraged to assist in variant classification and was recognized through microattribution. The scheme was refined by multidisciplinary expert committee review of the clinical and functional data available for variants, applied to 2,360 sequence alterations, and disseminated online. Assessment using validated criteria altered classifications for 66% of 12,006 database entries. Clinical recommendations based on transparent evaluation are now possible for 1,370 variants that were not obviously protein truncating from nomenclature. This large-scale endeavor will facilitate the consistent management of families suspected to have Lynch syndrome and demonstrates the value of multidisciplinary collaboration in the curation and classification of variants in public locus-specific databases.

  11. Application of a five-tiered scheme for standardized classification of 2,360 unique mismatch repair gene variants lodged on the InSiGHT locus-specific database

    PubMed Central

    Plazzer, John-Paul; Greenblatt, Marc S.; Akagi, Kiwamu; Al-Mulla, Fahd; Bapat, Bharati; Bernstein, Inge; Capellá, Gabriel; den Dunnen, Johan T.; du Sart, Desiree; Fabre, Aurelie; Farrell, Michael P.; Farrington, Susan M.; Frayling, Ian M.; Frebourg, Thierry; Goldgar, David E.; Heinen, Christopher D.; Holinski-Feder, Elke; Kohonen-Corish, Maija; Robinson, Kristina Lagerstedt; Leung, Suet Yi; Martins, Alexandra; Moller, Pal; Morak, Monika; Nystrom, Minna; Peltomaki, Paivi; Pineda, Marta; Qi, Ming; Ramesar, Rajkumar; Rasmussen, Lene Juel; Royer-Pokora, Brigitte; Scott, Rodney J.; Sijmons, Rolf; Tavtigian, Sean V.; Tops, Carli M.; Weber, Thomas; Wijnen, Juul; Woods, Michael O.; Macrae, Finlay; Genuardi, Maurizio

    2015-01-01

    Clinical classification of sequence variants identified in hereditary disease genes directly affects clinical management of patients and their relatives. The International Society for Gastrointestinal Hereditary Tumours (InSiGHT) undertook a collaborative effort to develop, test and apply a standardized classification scheme to constitutional variants in the Lynch Syndrome genes MLH1, MSH2, MSH6 and PMS2. Unpublished data submission was encouraged to assist variant classification, and recognized by microattribution. The scheme was refined by multidisciplinary expert committee review of clinical and functional data available for variants, applied to 2,360 sequence alterations, and disseminated online. Assessment using validated criteria altered classifications for 66% of 12,006 database entries. Clinical recommendations based on transparent evaluation are now possible for 1,370 variants not obviously protein-truncating from nomenclature. This large-scale endeavor will facilitate consistent management of suspected Lynch Syndrome families, and demonstrates the value of multidisciplinary collaboration for curation and classification of variants in public locus-specific databases. PMID:24362816

  12. BRCA Share: A Collection of Clinical BRCA Gene Variants.

    PubMed

    Béroud, Christophe; Letovsky, Stanley I; Braastad, Corey D; Caputo, Sandrine M; Beaudoux, Olivia; Bignon, Yves Jean; Bressac-De Paillerets, Brigitte; Bronner, Myriam; Buell, Crystal M; Collod-Béroud, Gwenaëlle; Coulet, Florence; Derive, Nicolas; Divincenzo, Christina; Elzinga, Christopher D; Garrec, Céline; Houdayer, Claude; Karbassi, Izabela; Lizard, Sarab; Love, Angela; Muller, Danièle; Nagan, Narasimhan; Nery, Camille R; Rai, Ghadi; Revillion, Françoise; Salgado, David; Sévenet, Nicolas; Sinilnikova, Olga; Sobol, Hagay; Stoppa-Lyonnet, Dominique; Toulas, Christine; Trautman, Edwin; Vaur, Dominique; Vilquin, Paul; Weymouth, Katelyn S; Willis, Alecia; Eisenberg, Marcia; Strom, Charles M

    2016-12-01

    As next-generation sequencing increases access to human genetic variation, the challenge of determining clinical significance of variants becomes ever more acute. Germline variants in the BRCA1 and BRCA2 genes can confer substantial lifetime risk of breast and ovarian cancer. Assessment of variant pathogenicity is a vital part of clinical genetic testing for these genes. A database of clinical observations of BRCA variants is a critical resource in that process. This article describes BRCA Share™, a database created by a unique international alliance of academic centers and commercial testing laboratories. By integrating the content of the Universal Mutation Database generated by the French Unicancer Genetic Group with the testing results of two large commercial laboratories, Quest Diagnostics and Laboratory Corporation of America (LabCorp), BRCA Share™ has assembled one of the largest publicly accessible collections of BRCA variants currently available. Although access is available to academic researchers without charge, commercial participants in the project are required to pay a support fee and contribute their data. The fees fund the ongoing curation effort, as well as planned experiments to functionally characterize variants of uncertain significance. BRCA Share™ databases can therefore be considered as models of successful data sharing between private companies and the academic world. © 2016 WILEY PERIODICALS, INC.

  13. Monogenic diabetes syndromes: Locus‐specific databases for Alström, Wolfram, and Thiamine‐responsive megaloblastic anemia

    PubMed Central

    Astuti, Dewi; Sabir, Ataf; Fulton, Piers; Zatyka, Malgorzata; Williams, Denise; Hardy, Carol; Milan, Gabriella; Favaretto, Francesca; Yu‐Wai‐Man, Patrick; Rohayem, Julia; López de Heredia, Miguel; Hershey, Tamara; Tranebjaerg, Lisbeth; Chen, Jian‐Hua; Chaussenot, Annabel; Nunes, Virginia; Marshall, Bess; McAfferty, Susan; Tillmann, Vallo; Maffei, Pietro; Paquis‐Flucklinger, Veronique; Geberhiwot, Tarekign; Mlynarski, Wojciech; Parkinson, Kay; Picard, Virginie; Bueno, Gema Esteban; Dias, Renuka; Arnold, Amy; Richens, Caitlin; Paisey, Richard; Urano, Fumihiko; Semple, Robert; Sinnott, Richard

    2017-01-01

    We developed a variant database for diabetes syndrome genes, using the Leiden Open Variation Database platform, containing observed phenotypes matched to the genetic variations. We populated it with 628 published disease‐associated variants (December 2016) for: WFS1 (n = 309), CISD2 (n = 3), ALMS1 (n = 268), and SLC19A2 (n = 48) for Wolfram type 1, Wolfram type 2, Alström, and Thiamine‐responsive megaloblastic anemia syndromes, respectively; and included 23 previously unpublished novel germline variants in WFS1 and 17 variants in ALMS1. We then investigated genotype–phenotype relations for the WFS1 gene. The presence of biallelic loss‐of‐function variants predicted Wolfram syndrome defined by insulin‐dependent diabetes and optic atrophy, with a sensitivity of 79% (95% CI 75%–83%) and specificity of 92% (83%–97%). The presence of minor loss‐of‐function variants in WFS1 predicted isolated diabetes, isolated deafness, or isolated congenital cataracts without development of the full syndrome (sensitivity 100% [93%–100%]; specificity 78% [73%–82%]). The ability to provide a prognostic prediction based on genotype will lead to improvements in patient care and counseling. The development of the database as a repository for monogenic diabetes gene variants will allow prognostic predictions for other diabetes syndromes as next‐generation sequencing expands the repertoire of genotypes and phenotypes. The database is publicly available online at https://lovd.euro-wabb.org. PMID:28432734
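
    Sensitivity and specificity figures of the kind reported above are derived from a two-by-two table of predicted versus observed phenotype. The sketch below shows the arithmetic only. The function names are illustrative, and the Wilson score interval is an assumption, because the abstract does not state which confidence-interval method was used or give the underlying counts.

```python
import math


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed method, z=1.96 for 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half
```

    With hypothetical counts of 8 true positives and 2 false negatives, `sensitivity_specificity(8, 2, 9, 1)` gives a sensitivity of 0.8, and `wilson_interval(8, 10)` gives an interval around that value.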

  14. MARRVEL: Integration of Human and Model Organism Genetic Resources to Facilitate Functional Annotation of the Human Genome.

    PubMed

    Wang, Julia; Al-Ouran, Rami; Hu, Yanhui; Kim, Seon-Young; Wan, Ying-Wooi; Wangler, Michael F; Yamamoto, Shinya; Chao, Hsiao-Tuan; Comjean, Aram; Mohr, Stephanie E; Perrimon, Norbert; Liu, Zhandong; Bellen, Hugo J

    2017-06-01

    One major challenge encountered with interpreting human genetic variants is the limited understanding of the functional impact of genetic alterations on biological processes. Furthermore, there remains an unmet demand for an efficient survey of the wealth of information on human homologs in model organisms across numerous databases. To efficiently assess the large volume of publicly available information, it is important to provide a concise summary of the most relevant information in a rapid user-friendly format. To this end, we created MARRVEL (model organism aggregated resources for rare variant exploration). MARRVEL is a publicly available website that integrates information from six human genetic databases and seven model organism databases. For any given variant or gene, MARRVEL displays information from OMIM, ExAC, ClinVar, Geno2MP, DGV, and DECIPHER. Importantly, it curates model organism-specific databases to concurrently display a concise summary regarding the human gene homologs in budding and fission yeast, worm, fly, fish, mouse, and rat on a single webpage. Experiment-based information on tissue expression, protein subcellular localization, biological process, and molecular function for the human gene and homologs in the seven model organisms are arranged into a concise output. Hence, rather than visiting multiple separate databases for variant and gene analysis, users can obtain important information by searching once through MARRVEL. Altogether, MARRVEL dramatically improves efficiency and accessibility to data collection and facilitates analysis of human genes and variants by cross-disciplinary integration of 18 million records available in public databases to facilitate clinical diagnosis and basic research. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  15. Identification of Inherited Retinal Disease-Associated Genetic Variants in 11 Candidate Genes.

    PubMed

    Astuti, Galuh D N; van den Born, L Ingeborgh; Khan, M Imran; Hamel, Christian P; Bocquet, Béatrice; Manes, Gaël; Quinodoz, Mathieu; Ali, Manir; Toomes, Carmel; McKibbin, Martin; El-Asrag, Mohammed E; Haer-Wigman, Lonneke; Inglehearn, Chris F; Black, Graeme C M; Hoyng, Carel B; Cremers, Frans P M; Roosing, Susanne

    2018-01-10

    Inherited retinal diseases (IRDs) display an enormous genetic heterogeneity. Whole exome sequencing (WES) recently identified genes that were mutated in a small proportion of IRD cases. Consequently, finding a second case or family carrying pathogenic variants in the same candidate gene often is challenging. In this study, we searched for novel candidate IRD gene-associated variants in isolated IRD families, assessed their causality, and searched for novel genotype-phenotype correlations. Whole exome sequencing was performed in 11 probands affected with IRDs. Homozygosity mapping data was available for five cases. Variants with minor allele frequencies ≤ 0.5% in public databases were selected as candidate disease-causing variants. These variants were ranked based on their: (a) presence in a gene that was previously implicated in IRD; (b) minor allele frequency in the Exome Aggregation Consortium database (ExAC); (c) in silico pathogenicity assessment using the combined annotation dependent depletion (CADD) score; and (d) interaction of the corresponding protein with known IRD-associated proteins. Twelve unique variants were found in 11 different genes in 11 IRD probands. Novel autosomal recessive and dominant inheritance patterns were found for variants in Small Nuclear Ribonucleoprotein U5 Subunit 200 (SNRNP200) and Zinc Finger Protein 513 (ZNF513), respectively. Using our pathogenicity assessment, a variant in DEAH-Box Helicase 32 (DHX32) was the top ranked novel candidate gene to be associated with IRDs, followed by eight medium and lower ranked candidate genes. The identification of candidate disease-associated sequence variants in 11 single families underscores the notion that the previously identified IRD-associated genes collectively carry > 90% of the defects implicated in IRDs. To identify multiple patients or families with variants in the same gene and thereby provide extra proof for pathogenicity, worldwide data sharing is needed.
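
    The filter-then-rank procedure described above might be sketched as follows. The 0.5% minor allele frequency cutoff and the four ranking criteria come from the abstract. How the criteria are weighted against each other is not stated, so the lexicographic sort order used here is an assumption, and the `Variant` class and its field names are hypothetical.

```python
from dataclasses import dataclass

MAF_CUTOFF = 0.005  # minor allele frequency <= 0.5%, as stated in the abstract


@dataclass
class Variant:
    gene: str                         # placeholder gene label
    maf: float                        # minor allele frequency in a public database
    cadd: float                       # CADD score (higher = more likely deleterious)
    known_ird_gene: bool              # criterion (a)
    interacts_with_ird_protein: bool  # criterion (d)


def rank_candidates(variants):
    """Keep rare variants, then sort best candidate first.

    Assumed priority: known IRD gene, then lower MAF, then higher CADD,
    then interaction with a known IRD-associated protein.
    """
    rare = [v for v in variants if v.maf <= MAF_CUTOFF]
    return sorted(
        rare,
        key=lambda v: (
            not v.known_ird_gene,
            v.maf,
            -v.cadd,
            not v.interacts_with_ird_protein,
        ),
    )
```

    A weighted score could replace the lexicographic key if the criteria should trade off against one another rather than act as strict tie-breakers.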

  16. The UCL low-density lipoprotein receptor gene variant database: pathogenicity update

    PubMed Central

    Futema, Marta; Whittall, Ros; Taylor-Beadling, Alison; Williams, Maggie; den Dunnen, Johan T; Humphries, Steve E

    2017-01-01

    Background: Familial hypercholesterolaemia (OMIM 143890) is most frequently caused by variations in the low-density lipoprotein receptor (LDLR) gene. Predicting whether novel variants are pathogenic may not be straightforward, especially for missense and synonymous variants. In 2013, the Association of Clinical Genetic Scientists published guidelines for the classification of variants, with categories 1 and 2 representing clearly not or unlikely pathogenic, respectively, 3 representing variants of unknown significance (VUS), and 4 and 5 representing likely to be or clearly pathogenic, respectively. Here, we update the University College London (UCL) LDLR variant database according to these guidelines. Methods: PubMed searches and alerts were used to identify novel LDLR variants for inclusion in the database. Standard in silico tools were used to predict potential pathogenicity. Variants were designated as class 4/5 only when the predictions from the different programs were concordant and as class 3 when predictions were discordant. Results: The updated database (http://www.lovd.nl/LDLR) now includes 2925 curated variants, representing 1707 independent events. All 129 nonsense variants, 337 small frame-shifting and 117/118 large rearrangements were classified as 4 or 5. Of the 795 missense variants, 115 were in classes 1 and 2, 605 in class 4 and 75 in class 3. 111/181 intronic variants, 4/34 synonymous variants and 14/37 promoter variants were assigned to classes 4 or 5. Overall, 112 (7%) of reported variants were class 3. Conclusions: This study updates the LDLR variant database and identifies a number of reported VUS where additional family and in vitro studies will be required to confirm or refute their pathogenicity. PMID:27821657
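
    The concordance rule from the Methods can be illustrated with a short sketch. The abstract states only that class 4/5 required concordant predictions and that discordant predictions gave class 3. Mapping concordant benign predictions to class 1/2, the label strings, and the function name are assumptions made for illustration.

```python
def classify(predictions):
    """Map in silico tool calls to a coarse class label.

    predictions: iterable of "damaging" or "benign" strings, one per tool
    (hypothetical labels; real tools report scores that need thresholds).
    """
    calls = set(predictions)
    if not calls:
        raise ValueError("at least one prediction is required")
    if calls == {"damaging"}:
        return "4/5"  # all tools agree the variant is damaging
    if calls == {"benign"}:
        return "1/2"  # assumption: concordant benign calls map to class 1/2
    return "3"        # discordant calls: variant of unknown significance
```

    For example, `classify(["damaging", "benign", "damaging"])` returns `"3"` because the tools disagree.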

  17. Description and analysis of genetic variants in French hereditary breast and ovarian cancer families recorded in the UMD-BRCA1/BRCA2 databases.

    PubMed

    Caputo, Sandrine; Benboudjema, Louisa; Sinilnikova, Olga; Rouleau, Etienne; Béroud, Christophe; Lidereau, Rosette

    2012-01-01

    BRCA1 and BRCA2 are the two main genes responsible for predisposition to breast and ovarian cancers, as a result of protein-inactivating monoallelic mutations. It remains to be established whether many of the variants identified in these two genes, so-called unclassified/unknown variants (UVs), contribute to the disease phenotype or are simply neutral variants (or polymorphisms). Given the clinical importance of establishing their status, a nationwide effort to annotate these UVs was launched by laboratories belonging to the French GGC consortium (Groupe Génétique et Cancer), leading to the creation of the UMD-BRCA1/BRCA2 databases (http://www.umd.be/BRCA1/ and http://www.umd.be/BRCA2/). These databases have been endorsed by the French National Cancer Institute (INCa) and are designed to collect all variants detected in France, whether causal, neutral or UV. They differ from other BRCA databases in that they contain co-occurrence data for all variants. Using these data, the GGC French consortium has been able to classify certain UVs also contained in other databases. In this article, we report some novel UVs not contained in the BIC database and explore their impact in cancer predisposition based on a structural approach.

  18. DRUMS: a human disease related unique gene mutation search engine.

    PubMed

    Li, Zuofeng; Liu, Xingnan; Wen, Jingran; Xu, Ye; Zhao, Xin; Li, Xuan; Liu, Lei; Zhang, Xiaoyan

    2011-10-01

    With the completion of the human genome project and the development of new methods for gene variant detection, the integration of mutation data and its phenotypic consequences has become more important than ever. Among all available resources, locus-specific databases (LSDBs) curate mutation data for one or more specific genes along with high-quality phenotypes. Although some genotype-phenotype data from LSDBs have been integrated into central databases, little effort has been made to integrate all these data by a search engine approach. In this work, we have developed the disease related unique gene mutation search engine (DRUMS), a search engine for human disease-related unique gene mutations, as a convenient tool for biologists or physicians to retrieve gene variant and related phenotype information. Gene variant and phenotype information were stored in a gene-centred relational database. Moreover, the relationships between mutations and diseases were indexed by the uniform resource identifier from the LSDB or another central database. By querying DRUMS, users can access the most popular mutation databases under one interface. DRUMS could be treated as a domain-specific search engine. By using web crawling, indexing, and searching technologies, it provides a competitively efficient interface for searching and retrieving mutation data and their relationships to diseases. The present system is freely accessible at http://www.scbit.org/glif/new/drums/index.html. © 2011 Wiley-Liss, Inc.

  19. Detection of alternative splice variants at the proteome level in Aspergillus flavus.

    PubMed

    Chang, Kung-Yen; Georgianna, D Ryan; Heber, Steffen; Payne, Gary A; Muddiman, David C

    2010-03-05

    Identification of proteins from proteolytic peptides or intact proteins plays an essential role in proteomics. Researchers use search engines to match the acquired peptide sequences to the target proteins. However, search engines depend on protein databases to provide candidates for consideration. Alternative splicing (AS), the mechanism by which the exons of pre-mRNAs can be spliced and rearranged to generate distinct mRNA and therefore protein variants, enables higher eukaryotic organisms, with only a limited number of genes, to have the requisite complexity and diversity at the proteome level. Multiple alternative isoforms from one gene often share common segments of sequence. However, many protein databases include only a limited number of isoforms to keep redundancy minimal. As a result, the database search might not identify a target protein even with high-quality tandem MS data and accurate intact precursor ion mass. We computationally predicted an exhaustive list of putative isoforms of Aspergillus flavus proteins from 20 371 expressed sequence tags to investigate whether an alternative splicing protein database can assign a greater proportion of mass spectrometry data. The newly constructed AS database provided 9807 new alternatively spliced variants in addition to 12 832 previously annotated proteins. The searches of the existing tandem MS spectra data set using the AS database identified 29 new proteins encoded by 26 genes. Nine fungal genes appeared to have multiple protein isoforms. In addition to the discovery of splice variants, the AS database also showed potential to improve genome annotation. In summary, the introduction of an alternative splicing database helps identify more proteins and unveils more information about a proteome.

  20. A comprehensive global genotype-phenotype database for rare diseases.

    PubMed

    Trujillano, Daniel; Oprea, Gabriela-Elena; Schmitz, Yvonne; Bertoli-Avella, Aida M; Abou Jamra, Rami; Rolfs, Arndt

    2017-01-01

    The ability to discover genetic variants in a patient runs far ahead of the ability to interpret them. Databases with accurate descriptions of the causal relationship between the variants and the phenotype are valuable since these are critical tools in clinical genetic diagnostics. Here, we introduce a comprehensive and global genotype-phenotype database focusing on rare diseases. This database (CentoMD®) is a browser-based tool that enables access to a comprehensive, independently curated system utilizing stringent high-quality criteria and a quickly growing repository of genetic and human phenotype ontology (HPO)-based clinical information. Its main goals are to aid the evaluation of genetic variants, to enhance the validity of the genetic analytical workflow, to increase the quality of genetic diagnoses, and to improve evaluation of treatment options for patients with hereditary diseases. The database software correlates clinical information from consented patients and probands of different geographical backgrounds with a large dataset of genetic variants and, when available, biomarker information. An automated follow-up tool is incorporated that informs all users whenever a variant classification has changed. These unique features fully embedded in a CLIA/CAP-accredited quality management system allow appropriate data quality and enhanced patient safety. More than 100,000 genetically screened individuals are documented in the database, resulting in more than 470 million variant detections. Approximately 57% of the clinically relevant and uncertain variants in the database are novel. Notably, 3% of the genetic variants identified and previously reported in the literature as being associated with a particular rare disease were reclassified, based on internal evidence, as clinically irrelevant. The database offers a comprehensive summary of the clinical validity and causality of detected gene variants with their associated phenotypes, and is a valuable tool for identifying new disease genes through the correlation of novel genetic variants with specific, well-defined phenotypes.

  1. UMD-USHbases: a comprehensive set of databases to record and analyse pathogenic mutations and unclassified variants in seven Usher syndrome causing genes.

    PubMed

    Baux, David; Faugère, Valérie; Larrieu, Lise; Le Guédard-Méreuze, Sandie; Hamroun, Dalil; Béroud, Christophe; Malcolm, Sue; Claustres, Mireille; Roux, Anne-Françoise

    2008-08-01

    Using the Universal Mutation Database (UMD) software, we have constructed "UMD-USHbases", a set of relational databases of nucleotide variations for seven genes involved in Usher syndrome (MYO7A, CDH23, PCDH15, USH1C, USH1G, USH3A and USH2A). Mutations in the Usher syndrome type I causing genes are also recorded in non-syndromic hearing loss cases and mutations in USH2A in non-syndromic retinitis pigmentosa. Usher syndrome provides a particular challenge for molecular diagnostics because of the clinical and molecular heterogeneity. As many mutations are missense changes, and all the genes also contain apparently non-pathogenic polymorphisms, well-curated databases are crucial for accurate interpretation of pathogenicity. Tools are provided to assess the pathogenicity of mutations, including conservation of amino acids and analysis of splice-sites. Reference amino acid alignments are provided. Apparently non-pathogenic variants in patients with Usher syndrome, at both the nucleotide and amino acid level, are included. The UMD-USHbases currently contain more than 2,830 entries including disease causing mutations, unclassified variants or non-pathogenic polymorphisms identified in over 938 patients. In addition to data collected from 89 publications, 15 novel mutations identified in our laboratory are recorded in MYO7A (6), CDH23 (8), or PCDH15 (1) genes. Information is given on the relative involvement of the seven genes, the number and distribution of variants in each gene. UMD-USHbases give access to a software package that provides specific routines and optimized multicriteria research and sorting tools. These databases should assist clinicians and geneticists seeking information about mutations responsible for Usher syndrome.

  2. Evaluating the quality of Marfan genotype-phenotype correlations in existing FBN1 databases.

    PubMed

    Groth, Kristian A; Von Kodolitsch, Yskert; Kutsche, Kerstin; Gaustadnes, Mette; Thorsen, Kasper; Andersen, Niels H; Gravholt, Claus H

    2017-07-01

    Genetic FBN1 testing is pivotal for confirming the clinical diagnosis of Marfan syndrome. In an effort to evaluate variant causality, FBN1 databases are often used. We evaluated the current databases regarding FBN1 variants and validated associated phenotype records with a new Marfan syndrome geno-phenotyping tool called the Marfan score. We evaluated four databases (UMD-FBN1, ClinVar, the Human Gene Mutation Database (HGMD), and Uniprot) containing 2,250 FBN1 variants supported by 4,904 records presented in 307 references. The Marfan score calculated for phenotype data from the records quantified variant associations with Marfan syndrome phenotype. We calculated a Marfan score for 1,283 variants, of which we confirmed the database diagnosis of Marfan syndrome in 77.1%. This represented only 35.8% of the total registered variants; 18.5-33.3% (UMD-FBN1 versus HGMD) of variants associated with Marfan syndrome in the databases could not be confirmed by the recorded phenotype. FBN1 databases can be imprecise and incomplete. Data should be used with caution when evaluating FBN1 variants. At present, the UMD-FBN1 database seems to be the biggest and best curated; therefore, it is the most comprehensive database. However, the need for better genotype-phenotype curated databases is evident, and we hereby present such a database. Genet Med advance online publication 01 December 2016.

  3. New workflow for classification of genetic variants' pathogenicity applied to hereditary recurrent fevers by the International Study Group for Systemic Autoinflammatory Diseases (INSAID).

    PubMed

    Van Gijn, Marielle E; Ceccherini, Isabella; Shinar, Yael; Carbo, Ellen C; Slofstra, Mariska; Arostegui, Juan I; Sarrabay, Guillaume; Rowczenio, Dorota; Omoyımnı, Ebun; Balci-Peynircioglu, Banu; Hoffman, Hal M; Milhavet, Florian; Swertz, Morris A; Touitou, Isabelle

    2018-03-29

    Hereditary recurrent fevers (HRFs) are rare inflammatory diseases sharing similar clinical symptoms and effectively treated with anti-inflammatory biological drugs. Accurate diagnosis of HRF relies heavily on genetic testing. This study aimed to obtain an experts' consensus on the clinical significance of gene variants in four well-known HRF genes: MEFV, TNFRSF1A, NLRP3 and MVK. We configured a MOLGENIS web platform to share and analyse pathogenicity classifications of the variants and to manage a consensus-based classification process. Four experts in HRF genetics submitted independent classifications of 858 variants. Classifications were driven to consensus by recruiting four more expert opinions and by targeting discordant classifications in five iterative rounds. Consensus classification was reached for 804/858 variants (94%). None of the unsolved variants (6%) remained with opposite classifications (eg, pathogenic vs benign). New mutational hotspots were found in all genes. We noted a lower pathogenic variant load and a higher fraction of variants with unknown or unsolved clinical significance in the MEFV gene. Applying a consensus-driven process on the pathogenicity assessment of experts yielded rapid classification of almost all variants of four HRF genes. The high-throughput database will profoundly assist clinicians and geneticists in the diagnosis of HRFs. The configured MOLGENIS platform and consensus evolution protocol are usable for assembly of other variant pathogenicity databases. The MOLGENIS software is available for reuse at http://github.com/molgenis/molgenis; the specific HRF configuration is available at http://molgenis.org/said/. The HRF pathogenicity classifications will be published on the INFEVERS database at https://fmf.igh.cnrs.fr/ISSAID/infevers/. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  4. Identification of Candidate Gene Variants in Korean MODY Families by Whole-Exome Sequencing.

    PubMed

    Shim, Ye Jee; Kim, Jung Eun; Hwang, Su-Kyeong; Choi, Bong Seok; Choi, Byung Ho; Cho, Eun-Mi; Jang, Kyoung Mi; Ko, Cheol Woo

    2015-01-01

    To date, 13 genes causing maturity-onset diabetes of the young (MODY) have been identified. However, there is a big discrepancy in the genetic locus between Asian and Caucasian patients with MODY. Thus, we conducted whole-exome sequencing in Korean MODY families to identify causative gene variants. Six MODY probands and their family members were included. Variants in the dbSNP135 and TIARA databases for Koreans and the variants with minor allele frequencies >0.5% of the 1000 Genomes database were excluded. We selected only the functional variants (gain of stop codon, frameshifts and nonsynonymous single-nucleotide variants) and conducted a case-control comparison in the family members. The selected variants were scanned for the previously introduced gene set implicated in glucose metabolism. Three variants c.620C>T:p.Thr207Ile in PTPRD, c.559C>G:p.Gln187Glu in SYT9, and c.1526T>G:p.Val509Gly in WFS1 were respectively identified in 3 families. We could not find any disease-causative alleles of known MODY 1-13 genes. Based on the predictive program, Thr207Ile in PTPRD was considered pathogenic. Whole-exome sequencing is a valuable method for the genetic diagnosis of MODY. Further evaluation is necessary about the role of PTPRD, SYT9 and WFS1 in normal insulin release from pancreatic beta cells. © 2015 S. Karger AG, Basel.
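
    The filtering steps described above (drop variants already present in the reference databases, drop variants with a 1000 Genomes minor allele frequency above 0.5%, keep only stop-gain, frameshift and nonsynonymous changes) can be sketched as below. This is an illustrative sketch, not the authors' pipeline: the `Variant` record, its field names, the consequence labels and the example variants are all hypothetical.

```python
from dataclasses import dataclass

# Consequence labels are hypothetical; real annotators use their own vocabularies.
FUNCTIONAL = {"stopgain", "frameshift", "nonsynonymous_snv"}
MAF_CUTOFF = 0.005  # 0.5%, as stated in the abstract


@dataclass
class Variant:
    gene: str
    consequence: str
    in_reference_db: bool   # present in dbSNP135 or TIARA (assumed flag)
    maf_1000g: float        # minor allele frequency in 1000 Genomes


def is_candidate(v: Variant) -> bool:
    """Return True if the variant survives all three filters."""
    return (
        not v.in_reference_db
        and v.maf_1000g <= MAF_CUTOFF
        and v.consequence in FUNCTIONAL
    )


# Hypothetical variants, for illustration only.
variants = [
    Variant("GENE_A", "nonsynonymous_snv", False, 0.0),
    Variant("GENE_B", "synonymous_snv", False, 0.0),
    Variant("GENE_C", "frameshift", True, 0.0),
    Variant("GENE_D", "stopgain", False, 0.02),
]
candidates = [v.gene for v in variants if is_candidate(v)]
print(candidates)
```

    Only the first hypothetical variant passes; the others fail the consequence, database and frequency filters respectively.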

  5. GAVIN: Gene-Aware Variant INterpretation for medical sequencing.

    PubMed

    van der Velde, K Joeri; de Boer, Eddy N; van Diemen, Cleo C; Sikkema-Raddatz, Birgit; Abbott, Kristin M; Knopperts, Alain; Franke, Lude; Sijmons, Rolf H; de Koning, Tom J; Wijmenga, Cisca; Sinke, Richard J; Swertz, Morris A

    2017-01-16

    We present Gene-Aware Variant INterpretation (GAVIN), a new method that accurately classifies variants for clinical diagnostic purposes. Classifications are based on gene-specific calibrations of allele frequencies from the ExAC database, likely variant impact using SnpEff, and estimated deleteriousness based on CADD scores for >3000 genes. In a benchmark on 18 clinical gene sets, we achieve a sensitivity of 91.4% and a specificity of 76.9%. This accuracy is unmatched by 12 other tools. We provide GAVIN as an online MOLGENIS service to annotate VCF files and as an open source executable for use in bioinformatic pipelines. It can be found at http://molgenis.org/gavin.
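
    For readers unfamiliar with the two benchmark metrics quoted above, the sketch below shows how sensitivity and specificity are computed from a confusion matrix. The counts are hypothetical and are not taken from the GAVIN benchmark.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly pathogenic variants that were called pathogenic."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of truly benign variants that were called benign."""
    return tn / (tn + fp)


# Hypothetical counts, for illustration only.
tp, fn = 90, 10   # pathogenic variants: correctly called / missed
tn, fp = 70, 30   # benign variants: correctly called / wrongly flagged

print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"specificity = {specificity(tn, fp):.1%}")
```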

  6. Identification of Alternative Splice Variants Using Unique Tryptic Peptide Sequences for Database Searches.

    PubMed

    Tran, Trung T; Bollineni, Ravi C; Strozynski, Margarita; Koehler, Christian J; Thiede, Bernd

    2017-07-07

    Alternative splicing is a mechanism in eukaryotes by which different forms of mRNAs are generated from the same gene. Identification of alternative splice variants requires the identification of peptides specific for alternative splice forms. For this purpose, we generated a human database that contains only unique tryptic peptides specific for alternative splice forms from Swiss-Prot entries. Using this database allows an easy access to splice variant-specific peptide sequences that match to MS data. Furthermore, we combined this database without alternative splice variant-1-specific peptides with human Swiss-Prot. This combined database can be used as a general database for searching of LC-MS data. LC-MS data derived from in-solution digests of two different cell lines (LNCaP, HeLa) and phosphoproteomics studies were analyzed using these two databases. Several nonalternative splice variant-1-specific peptides were found in both cell lines, and some of them seemed to be cell-line-specific. Control and apoptotic phosphoproteomes from Jurkat T cells revealed several nonalternative splice variant-1-specific peptides, and some of them showed clear quantitative differences between the two states.
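
    The idea of isoform-specific tryptic peptides can be illustrated with a small in silico digest. The sketch below assumes the common trypsin rule (cleave after K or R unless the next residue is P) and ignores missed cleavages and peptide-length limits; the two sequences are toy examples, not Swiss-Prot entries, and this is not the authors' database-building code. It relies on `re.split` accepting zero-width matches, which requires Python 3.7 or later.

```python
import re


def tryptic_peptides(sequence: str) -> list[str]:
    """Digest a protein sequence: cut after K or R, but not before P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def isoform_specific(isoform: str, reference: str) -> set[str]:
    """Peptides produced by `isoform` that `reference` does not produce."""
    return set(tryptic_peptides(isoform)) - set(tryptic_peptides(reference))


# Toy sequences (hypothetical): variant 2 differs from variant 1 internally.
variant_1 = "MAKTLLRGGSKDE"
variant_2 = "MAKTLLRPPWKDE"

print(tryptic_peptides(variant_1))
print(tryptic_peptides(variant_2))
print(isoform_specific(variant_2, variant_1))
```

    Peptides shared by both isoforms cannot distinguish them, which is why only the set difference is kept.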

  7. Comparison of locus-specific databases for BRCA1 and BRCA2 variants reveals disparity in variant classification within and among databases.

    PubMed

    Vail, Paris J; Morris, Brian; van Kan, Aric; Burdett, Brianna C; Moyes, Kelsey; Theisen, Aaron; Kerr, Iain D; Wenstrup, Richard J; Eggington, Julie M

    2015-10-01

    Genetic variants of uncertain clinical significance (VUSs) are a common outcome of clinical genetic testing. Locus-specific variant databases (LSDBs) have been established for numerous disease-associated genes as a research tool for the interpretation of genetic sequence variants to facilitate variant interpretation via aggregated data. If LSDBs are to be used for clinical practice, consistent and transparent criteria regarding the deposition and interpretation of variants are vital, as variant classifications are often used to make important and irreversible clinical decisions. In this study, we performed a retrospective analysis of 2017 consecutive BRCA1 and BRCA2 genetic variants identified from 24,650 consecutive patient samples referred to our laboratory to establish an unbiased dataset representative of the types of variants seen in the US patient population, submitted by clinicians and researchers for BRCA1 and BRCA2 testing. We compared the clinical classifications of these variants among five publicly accessible BRCA1 and BRCA2 variant databases: BIC, ClinVar, HGMD (paid version), LOVD, and the UMD databases. Our results show substantial disparity of variant classifications among publicly accessible databases. Furthermore, it appears that discrepant classifications are not the result of a single outlier but widespread disagreement among databases. This study also shows that databases sometimes favor a clinical classification when current best practice guidelines (ACMG/AMP/CAP) would suggest an uncertain classification. Although LSDBs have been well established for research applications, our results suggest several challenges preclude their wider use in clinical practice.
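
    A cross-database comparison of this kind amounts to collecting each variant's classification per database and flagging the variants on which the databases disagree. The sketch below is a minimal illustration under stated assumptions: the database names come from the abstract, but the variant identifiers and classifications are hypothetical, and `None` is used here to mean "variant not listed".

```python
def discordant_variants(classifications: dict) -> list:
    """Return variants whose listed classifications are not all identical."""
    flagged = []
    for variant, calls in classifications.items():
        # Ignore databases that do not list the variant at all.
        distinct = {c for c in calls.values() if c is not None}
        if len(distinct) > 1:
            flagged.append(variant)
    return sorted(flagged)


# Hypothetical data, for illustration only.
classifications = {
    "variant_1": {"BIC": "pathogenic", "ClinVar": "pathogenic", "LOVD": None},
    "variant_2": {"BIC": "pathogenic", "ClinVar": "uncertain", "UMD": "benign"},
    "variant_3": {"HGMD": "pathogenic", "ClinVar": "uncertain"},
}
print(discordant_variants(classifications))
```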

  8. Screening of Variations in CD22 Gene in Children with B-Precursor Acute Lymphoblastic Leukemia.

    PubMed

    Aslar Oner, Deniz; Akin, Dilara Fatma; Sipahi, Kadir; Mumcuoglu, Mine; Ezer, Ustun; Kürekci, A Emin; Akar, Nejat

    2016-09-01

    CD22 is expressed on the surface of B-cell lineage cells from the early progenitor stage of pro-B cell until terminal differentiation to mature B cells. It plays a role in signal transduction and as a regulator of B-cell receptor signaling in B-cell development. We aimed to screen exons 9-14 of the CD22 gene, which is a mutational hot spot region in B-precursor acute lymphoblastic leukemia (pre-B ALL) patients, to find possible genetic variants that could play a role in the pathogenesis of pre-B ALL in Turkish children. This study included 109 Turkish children with pre-B ALL who were diagnosed at Losante Hospital for Children with Leukemia. Genomic DNA was extracted from both peripheral blood and bone marrow leukocytes. Gene amplification was performed with PCR, and all samples were screened for the variants by single strand conformation polymorphism. Samples showing band shifts were sequenced on an automated sequencer. In our patient group a total of 9 variants were identified in the CD22 gene by sequencing: a novel variant in intron 10 (T2199G); a missense variant in exon 12; 5 intronic variants between exon 12 and intron 13; a novel intronic variant (C2424T); and a synonymous variant in exon 13. Thirteen of 109 children (11.9%) carried the T2199G novel intronic variant located in intron 10, and 17 of 109 children (15.6%) carried the C2424T novel intronic variant. Novel variants in the CD22 gene that are not present in the Human Gene Mutation Database or NCBI SNP database were found in children with pre-B ALL in Turkey.

  9. Difficulties in diagnosing Marfan syndrome using current FBN1 databases.

    PubMed

    Groth, Kristian A; Gaustadnes, Mette; Thorsen, Kasper; Østergaard, John R; Jensen, Uffe Birk; Gravholt, Claus H; Andersen, Niels H

    2016-01-01

    The diagnostic criteria of Marfan syndrome (MFS) highlight the importance of a FBN1 mutation test in diagnosing MFS. As genetic sequencing becomes better, cheaper, and more accessible, the expected increase in the number of genetic tests will become evident, resulting in numerous genetic variants that need to be evaluated for disease-causing effects based on database information. The aim of this study was to evaluate genetic variants in four databases and review the relevant literature. We assessed background data on 23 common variants registered in ESP6500 and classified as causing MFS in the Human Gene Mutation Database (HGMD). We evaluated data in four variant databases (HGMD, UMD-FBN1, ClinVar, and UniProt) according to the diagnostic criteria for MFS and compared the results with the classification of each variant in the four databases. None of the 23 variants was clearly associated with MFS, even though all classifications in the databases stated otherwise. A genetic diagnosis of MFS cannot reliably be based on current variant databases because they contain incorrectly interpreted conclusions on variants. Variants must be evaluated by time-consuming review of the background material in the databases and by combining these data with expert knowledge on MFS. This is a major problem because we expect even more genetic test results in the near future as a result of the reduced cost and process time for next-generation sequencing. Genet Med 18 1, 98-102.

  10. DNA variant databases improve test accuracy and phenotype prediction in Alport syndrome.

    PubMed

    Savige, Judy; Ars, Elisabet; Cotton, Richard G H; Crockett, David; Dagher, Hayat; Deltas, Constantinos; Ding, Jie; Flinter, Frances; Pont-Kingdon, Genevieve; Smaoui, Nizar; Torra, Roser; Storey, Helen

    2014-06-01

    X-linked Alport syndrome is a form of progressive renal failure caused by pathogenic variants in the COL4A5 gene. More than 700 variants have been described and a further 400 are estimated to be known to individual laboratories but are unpublished. The major genetic testing laboratories for X-linked Alport syndrome worldwide have established a Web-based database for published and unpublished COL4A5 variants (https://grenada.lumc.nl/LOVD2/COL4A/home.php?select_db=COL4A5). This conforms with the recommendations of the Human Variome Project: it uses the Leiden Open Variation Database (LOVD) format, describes variants according to the human reference sequence with standardized nomenclature, indicates likely pathogenicity and associated clinical features, and credits the submitting laboratory. The database includes non-pathogenic and recurrent variants, and is linked to another COL4A5 mutation database and relevant bioinformatics sites. Access is free. Increasing the number of COL4A5 variants in the public domain helps patients, diagnostic laboratories, clinicians, and researchers. The database improves the accuracy and efficiency of genetic testing because its variants are already categorized for pathogenicity. The description of further COL4A5 variants and clinical associations will improve our ability to predict phenotype and our understanding of collagen IV biochemistry. The database for X-linked Alport syndrome represents a model for databases in other inherited renal diseases.

  11. Common variants in Mendelian kidney disease genes and their association with renal function.

    PubMed

    Parsa, Afshin; Fuchsberger, Christian; Köttgen, Anna; O'Seaghdha, Conall M; Pattaro, Cristian; de Andrade, Mariza; Chasman, Daniel I; Teumer, Alexander; Endlich, Karlhans; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Kim, Young J; Taliun, Daniel; Li, Man; Feitosa, Mary; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C; Glazer, Nicole; Isaacs, Aaron; Rao, Madhumathi; Smith, Albert V; O'Connell, Jeffrey R; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Hwang, Shih-Jen; Atkinson, Elizabeth J; Lohman, Kurt; Cornelis, Marilyn C; Johansson, Asa; Tönjes, Anke; Dehghan, Abbas; Couraki, Vincent; Holliday, Elizabeth G; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y; Murgia, Federico; Trompet, Stella; Imboden, Medea; Kollerits, Barbara; Pistis, Giorgio; Harris, Tamara B; Launer, Lenore J; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D; Boerwinkle, Eric; Schmidt, Helena; Hofer, Edith; Hu, Frank; Demirkan, Ayse; Oostra, Ben A; Turner, Stephen T; Ding, Jingzhong; Andrews, Jeanette S; Freedman, Barry I; Giulianini, Franco; Koenig, Wolfgang; Illig, Thomas; Döring, Angela; Wichmann, H-Erich; Zgaga, Lina; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H; Wright, Alan F; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G; Rivadeneira, Fernando; Aulchenko, Yurii S; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Stengel, Bénédicte; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Colhoun, Helen; Doney, Alex; Robino, Antonietta; Krämer, Bernhard K; Portas, Laura; Ford, Ian; Buckley, Brendan M; Adam, Martin; Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Mitchell, Paul; Ciullo, Marina; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J Wouter; Probst-Hensch, Nicole M; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; van Duijn, Cornelia M; Borecki, Ingrid; Kardia, Sharon L R; Liu, Yongmei; Curhan, Gary C; Rudan, Igor; Gyllensten, Ulf; Wilson, James F; Franke, Andre; Pramstaller, Peter P; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline; Hayward, Caroline; Ridker, Paul M; Bochud, Murielle; Heid, Iris M; Siscovick, David S; Fox, Caroline S; Kao, W Linda; Böger, Carsten A

    2013-12-01

    Many common genetic variants identified by genome-wide association studies for complex traits map to genes previously linked to rare inherited Mendelian disorders. A systematic analysis of common single-nucleotide polymorphisms (SNPs) in genes responsible for Mendelian diseases with kidney phenotypes has not been performed. We thus developed a comprehensive database of genes for Mendelian kidney conditions and evaluated the association between common genetic variants within these genes and kidney function in the general population. Using the Online Mendelian Inheritance in Man database, we identified 731 unique disease entries related to specific renal search terms and confirmed a kidney phenotype in 218 of these entries, corresponding to mutations in 258 genes. We interrogated common SNPs (minor allele frequency >5%) within these genes for association with the estimated GFR in 74,354 European-ancestry participants from the CKDGen Consortium. However, the top four candidate SNPs (rs6433115 at LRP2, rs1050700 at TSC1, rs249942 at PALB2, and rs9827843 at ROBO2) did not achieve significance in a stage 2 meta-analysis performed in 56,246 additional independent individuals, indicating that these common SNPs are not associated with estimated GFR. The effect of less common or rare variants in these genes on kidney function in the general population and disease-specific cohorts requires further research.

  12. STOPGAP: a database for systematic target opportunity assessment by genetic association predictions.

    PubMed

    Shen, Judong; Song, Kijoung; Slater, Andrew J; Ferrero, Enrico; Nelson, Matthew R

    2017-09-01

    We developed the STOPGAP (Systematic Target OPportunity assessment by Genetic Association Predictions) database, an extensive catalog of human genetic associations mapped to effector gene candidates. STOPGAP draws on a variety of publicly available GWAS associations, linkage disequilibrium (LD) measures, functional genomic and variant annotation sources. Algorithms were developed to merge the association data, partition associations into non-overlapping LD clusters, map variants to genes and produce a variant-to-gene score used to rank the relative confidence among potential effector genes. This database can be used for a multitude of investigations into the genes and genetic mechanisms underlying inter-individual variation in human traits, as well as supporting drug discovery applications. Shell, R, Perl and Python scripts and STOPGAP R data files (version 2.5.1 at publication) are available at https://github.com/StatGenPRD/STOPGAP. Some of the most useful STOPGAP fields can be queried through an R Shiny web application at http://stopgapwebapp.com. Contact: matthew.r.nelson@gsk.com. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved.

  13. Mutation update of transcription factor genes FOXE3, HSF4, MAF, and PITX3 causing cataracts and other developmental ocular defects.

    PubMed

    Anand, Deepti; Agrawal, Smriti A; Slavotinek, Anne; Lachke, Salil A

    2018-04-01

    Mutations in the transcription factor genes FOXE3, HSF4, MAF, and PITX3 cause congenital lens defects including cataracts that may be accompanied by defects in other components of the eye or in nonocular tissues. We comprehensively describe here all the variants in FOXE3, HSF4, MAF, and PITX3 genes linked to human developmental defects. A total of 52 variants for FOXE3, 18 variants for HSF4, 20 variants for MAF, and 19 variants for PITX3 identified so far in isolated cases or within families are documented. This effort reveals FOXE3, HSF4, MAF, and PITX3 to have 33, 16, 18, and 7 unique causal mutations, respectively. Loss-of-function mutant animals for these genes have served to model the pathobiology of the associated human defects, and we discuss the currently known molecular function of these genes, particularly with emphasis on their role in ocular development. Finally, we make the detailed FOXE3, HSF4, MAF, and PITX3 variant information available in the Leiden Online Variation Database (LOVD) platform at https://www.LOVD.nl/FOXE3, https://www.LOVD.nl/HSF4, https://www.LOVD.nl/MAF, and https://www.LOVD.nl/PITX3. Thus, this article informs on key variants in transcription factor genes linked to cataract, aphakia, corneal opacity, glaucoma, microcornea, microphthalmia, anterior segment mesenchymal dysgenesis, and Ayme-Gripp syndrome, and facilitates their access through Web-based databases. © 2018 Wiley Periodicals, Inc.

  14. Comparison and optimization of in silico algorithms for predicting the pathogenicity of sodium channel variants in epilepsy.

    PubMed

    Holland, Katherine D; Bouley, Thomas M; Horn, Paul S

    2017-07-01

    Variants in the neuronal voltage-gated sodium channel α-subunit genes SCN1A, SCN2A, and SCN8A are common in early onset epileptic encephalopathies and other autosomal dominant childhood epilepsy syndromes. However, in clinical practice, missense variants are often classified as variants of uncertain significance when heritability cannot be determined. Genetic testing reports often include results of computational tests to estimate pathogenicity and the frequency of that variant in population-based databases. The objective of this work was to enhance clinicians' understanding of results by (1) determining how effectively computational algorithms predict epileptogenicity of sodium channel (SCN) missense variants; (2) optimizing their predictive capabilities; and (3) determining if epilepsy-associated SCN variants are present in population-based databases. This will help clinicians better understand indeterminate SCN test results in people with epilepsy. Pathogenic, likely pathogenic, and benign variants in SCNs were identified using databases of sodium channel variants. Benign variants were also identified from population-based databases. Eight algorithms commonly used to predict pathogenicity were compared. In addition, logistic regression was used to determine if a combination of algorithms could better predict pathogenicity. Based on American College of Medical Genetics criteria, 440 variants were classified as pathogenic or likely pathogenic and 84 were classified as benign or likely benign. Twenty-eight variants previously associated with epilepsy were present in population-based gene databases. The output provided by most computational algorithms had a high sensitivity but low specificity with an accuracy of 0.52-0.77. Accuracy could be improved by adjusting the threshold for pathogenicity. Using this adjustment, the Mendelian Clinically Applicable Pathogenicity (M-CAP) algorithm had an accuracy of 0.90 and a combination of algorithms increased the accuracy to 0.92. Potentially pathogenic variants are present in population-based sources. Most computational algorithms overestimate pathogenicity; however, a weighted combination of several algorithms increased classification accuracy to >0.90. Wiley Periodicals, Inc. © 2017 International League Against Epilepsy.
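
    The threshold adjustment described in this abstract can be illustrated with a minimal Python sketch. The scores and labels below are invented for illustration and are not data from the study, and `confusion_metrics` and `best_threshold` are hypothetical helper names. The sketch assumes each algorithm emits a numeric score in which higher values mean "more likely pathogenic".

```python
from typing import List, Tuple


def confusion_metrics(scores: List[float], labels: List[int],
                      threshold: float) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, accuracy).

    A variant is called pathogenic when its score is >= threshold.
    labels use 1 for pathogenic and 0 for benign.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    tn = sum(1 for s, y in pairs if s < threshold and y == 0)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


def best_threshold(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Try every observed score as a cut-off and return the
    (threshold, accuracy) pair with the highest accuracy."""
    candidates = sorted(set(scores))
    return max(
        ((t, confusion_metrics(scores, labels, t)[2]) for t in candidates),
        key=lambda pair: pair[1],
    )


# Hypothetical toy data: four benign (0) and four pathogenic (1) variants.
toy_scores = [0.05, 0.30, 0.40, 0.55, 0.60, 0.80, 0.90, 0.95]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]

# A permissive cut-off gives high sensitivity but low specificity.
print(confusion_metrics(toy_scores, toy_labels, 0.25))
# Re-tuning the cut-off on labelled variants raises accuracy.
print(best_threshold(toy_scores, toy_labels))
```

    In practice a cut-off tuned this way should be checked on variants that were not used to choose it, since accuracy measured on the tuning set is optimistic.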

  15. Harmonizing the interpretation of genetic variants across the world: the Malaysian experience.

    PubMed

    Hassan, Nik Norliza Nik; Plazzer, John-Paul; Smith, Timothy D; Halim-Fikri, Hashim; Macrae, Finlay; Zubaidi, A A L; Zilfalil, Bin Alwi

    2016-02-26

    Databases for gene variants are very useful for sharing genetic data and for facilitating the understanding of the genetic basis of diseases. This report summarises the issues surrounding the development of the Malaysian Human Variome Project Country Node. The focus is on human germline variants. Somatic variants, mitochondrial variants and other types of genetic variation have corresponding databases which are not covered here, as they have specific issues that do not necessarily apply to germline variations. Ethical, legal and social issues, intellectual property, ownership of the data, information technology implementation, and efforts to improve the standards and systems used in data sharing are discussed. An overarching framework such as that provided by the Human Variome Project to co-ordinate activities is invaluable. Country Nodes, such as MyHVP, enable human gene variation associated with human diseases to be collected, stored and shared by all disciplines (clinicians, molecular biologists, pathologists, bioinformaticians) for a consistent interpretation of genetic variants locally and across the world.

  16. Multiple endocrine neoplasia type 1 (MEN1): An update of 208 new germline variants reported in the last nine years.

    PubMed

    Concolino, Paola; Costella, Alessandra; Capoluongo, Ettore

    2016-01-01

    This review focuses on the germline MEN1 mutations reported in patients with MEN1 and other hereditary endocrine disorders from 2007 to September 2015. A comprehensive review analysing the 1336 MEN1 mutations reported in the first decade following the gene's identification was published by Lemos and Thakker in 2008. No other similar papers are available in the literature. We also checked the list of Locus-Specific DataBases (LSDBs) and found five free online MEN1 mutation databases. We read and evaluated 151 articles from the NCBI PubMed literature database and found a total of 75 MEN1 variants. In addition, 67, 22 and 44 novel MEN1 variants were obtained from the ClinVar, MEN1 at Café Variome and HGMD (The Human Gene Mutation Database) databases, respectively. A final careful analysis of MEN1 mutations affecting the coding region was performed. Copyright © 2016 Elsevier Inc. All rights reserved.

  17. Variability of Creatine Metabolism Genes in Children with Autism Spectrum Disorder.

    PubMed

    Cameron, Jessie M; Levandovskiy, Valeriy; Roberts, Wendy; Anagnostou, Evdokia; Scherer, Stephen; Loh, Alvin; Schulze, Andreas

    2017-07-31

    Creatine deficiency syndrome (CDS) comprises three separate enzyme deficiencies with overlapping clinical presentations: arginine:glycine amidinotransferase (GATM gene, glycine amidinotransferase), guanidinoacetate methyltransferase (GAMT gene), and creatine transporter deficiency (SLC6A8 gene, solute carrier family 6 member 8). CDS presents with developmental delays/regression, intellectual disability, speech and language impairment, autistic behaviour, epileptic seizures, treatment-refractory epilepsy, and extrapyramidal movement disorders; symptoms that are also evident in children with autism. The objective of the study was to test the hypothesis that genetic variability in creatine metabolism genes is associated with autism. We sequenced the GATM, GAMT and SLC6A8 genes in 166 patients with autism (coding sequence, introns and adjacent untranslated regions). A total of 29, 16 and 25 variants were identified in each gene, respectively. Four variants were novel in GATM, and 5 in SLC6A8 (not present in the 1000 Genomes, Exome Sequencing Project (ESP) or Exome Aggregation Consortium (ExAC) databases). A single variant in each gene was identified as non-synonymous, and computationally predicted to be potentially damaging. Nine variants in GATM were shown to have a lower minor allele frequency (MAF) in the autism population than in the 1000 Genomes database, specifically in the East Asian population (Fisher's exact test). Two variants also had lower MAFs in the European population. In summary, there were no apparent associations of variants in the GAMT and SLC6A8 genes with autism. The data implying a lower association of some specific GATM gene variants with autism represent an observation that would need to be corroborated in a larger group of autism patients, and with sub-populations of Asian ethnicities. Overall, our findings suggest that the genetic variability of creatine synthesis/transport is unlikely to play a part in the pathogenesis of autism spectrum disorder (ASD) in children.
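
    The minor allele frequency comparison mentioned in this abstract relies on Fisher's exact test. The following is a minimal Python sketch of a two-sided test on a 2x2 table of allele counts, using only the standard library. The function name `fisher_exact_two_sided` and the allele counts in the usage lines are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Example layout for allele counts:
        a = minor alleles in cases,     b = major alleles in cases
        c = minor alleles in reference, d = major alleles in reference
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    low = max(0, col1 - row2)
    high = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the
    # observed one; the small tolerance guards against rounding ties.
    return sum(
        p for p in (prob(x) for x in range(low, high + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: 166 patients carry 332 alleles at an autosomal site.
cases_minor, cases_major = 4, 328
reference_minor, reference_major = 40, 968
p_value = fisher_exact_two_sided(cases_minor, cases_major,
                                 reference_minor, reference_major)
print(f"two-sided p = {p_value:.4g}")
```

    When many variants are tested in this way, the resulting p-values would normally need a correction for multiple comparisons before being interpreted.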

  18. Common Variants in Mendelian Kidney Disease Genes and Their Association with Renal Function

    PubMed Central

    Fuchsberger, Christian; Köttgen, Anna; O’Seaghdha, Conall M.; Pattaro, Cristian; de Andrade, Mariza; Chasman, Daniel I.; Teumer, Alexander; Endlich, Karlhans; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Kim, Young J.; Taliun, Daniel; Li, Man; Feitosa, Mary; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C.; Glazer, Nicole; Isaacs, Aaron; Rao, Madhumathi; Smith, Albert V.; O’Connell, Jeffrey R.; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Hwang, Shih-Jen; Atkinson, Elizabeth J.; Lohman, Kurt; Cornelis, Marilyn C.; Johansson, Åsa; Tönjes, Anke; Dehghan, Abbas; Couraki, Vincent; Holliday, Elizabeth G.; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y.; Murgia, Federico; Trompet, Stella; Imboden, Medea; Kollerits, Barbara; Pistis, Giorgio; Harris, Tamara B.; Launer, Lenore J.; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D.; Boerwinkle, Eric; Schmidt, Helena; Hofer, Edith; Hu, Frank; Demirkan, Ayse; Oostra, Ben A.; Turner, Stephen T.; Ding, Jingzhong; Andrews, Jeanette S.; Freedman, Barry I.; Giulianini, Franco; Koenig, Wolfgang; Illig, Thomas; Döring, Angela; Wichmann, H.-Erich; Zgaga, Lina; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E.; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H.; Wright, Alan F.; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K.; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G.; Rivadeneira, Fernando; Aulchenko, Yurii S.; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Stengel, Bénédicte; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Colhoun, Helen; Doney, Alex; Robino, Antonietta; Krämer, Bernhard K.; Portas, Laura; Ford, Ian; Buckley, Brendan M.; 
Adam, Martin; Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Mitchell, Paul; Ciullo, Marina; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J. Wouter; Probst-Hensch, Nicole M.; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R.; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; van Duijn, Cornelia M.; Borecki, Ingrid; Kardia, Sharon L.R.; Liu, Yongmei; Curhan, Gary C.; Rudan, Igor; Gyllensten, Ulf; Wilson, James F.; Franke, Andre; Pramstaller, Peter P.; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline; Hayward, Caroline; Ridker, Paul M.; Bochud, Murielle; Heid, Iris M.; Siscovick, David S.; Fox, Caroline S.; Kao, W. Linda; Böger, Carsten A.

    2013-01-01

    Many common genetic variants identified by genome-wide association studies for complex traits map to genes previously linked to rare inherited Mendelian disorders. A systematic analysis of common single-nucleotide polymorphisms (SNPs) in genes responsible for Mendelian diseases with kidney phenotypes has not been performed. We thus developed a comprehensive database of genes for Mendelian kidney conditions and evaluated the association between common genetic variants within these genes and kidney function in the general population. Using the Online Mendelian Inheritance in Man database, we identified 731 unique disease entries related to specific renal search terms and confirmed a kidney phenotype in 218 of these entries, corresponding to mutations in 258 genes. We interrogated common SNPs (minor allele frequency >5%) within these genes for association with the estimated GFR in 74,354 European-ancestry participants from the CKDGen Consortium. However, the top four candidate SNPs (rs6433115 at LRP2, rs1050700 at TSC1, rs249942 at PALB2, and rs9827843 at ROBO2) did not achieve significance in a stage 2 meta-analysis performed in 56,246 additional independent individuals, indicating that these common SNPs are not associated with estimated GFR. The effect of less common or rare variants in these genes on kidney function in the general population and disease-specific cohorts requires further research. PMID:24029420

  19. Diversity and impact of rare variants in genes encoding the platelet G protein-coupled receptors.

    PubMed

    Jones, Matthew L; Norman, Jane E; Morgan, Neil V; Mundell, Stuart J; Lordkipanidzé, Marie; Lowe, Gillian C; Daly, Martina E; Simpson, Michael A; Drake, Sian; Watson, Steve P; Mumford, Andrew D

    2015-04-01

    Platelet responses to activating agonists are influenced by common population variants within or near G protein-coupled receptor (GPCR) genes that affect receptor activity. However, the impact of rare GPCR gene variants is unknown. We describe the rare single nucleotide variants (SNVs) in the coding and splice regions of 18 GPCR genes in 7,595 exomes from the 1000 Genomes and Exome Sequencing Project databases and in 31 cases with inherited platelet function disorders (IPFDs). In the population databases, the GPCR gene target regions contained 740 SNVs (318 synonymous, 410 missense, 7 stop gain and 6 splice region) of which 70% had global minor allele frequency (MAF) < 0.05%. Functional annotation using six computational algorithms, experimental evidence and structural data identified 156/740 (21%) SNVs as potentially damaging to GPCR function, most commonly in regions encoding the transmembrane and C-terminal intracellular receptor domains. In 31 index cases with IPFDs (Gi-pathway defect n=15; secretion defect n=11; thromboxane pathway defect n=3 and complex defect n=2) there were 256 SNVs in the target regions of 15 stimulatory platelet GPCRs (34 unique; 12 with MAF < 1% and 22 with MAF ≥ 1%). These included rare variants predicting R122H, P258T and V207A substitutions in the P2Y12 receptor that were annotated as potentially damaging, but only partially explained the platelet function defects in each case. Our data highlight that potentially damaging variants in platelet GPCR genes have low individual frequencies, but are collectively abundant in the population. Potentially damaging variants are also present in pedigrees with IPFDs and may contribute to complex laboratory phenotypes.

  20. CYP21A2 mutation update: Comprehensive analysis of databases and published genetic variants.

    PubMed

    Simonetti, Leandro; Bruque, Carlos D; Fernández, Cecilia S; Benavides-Mori, Belén; Delea, Marisol; Kolomenski, Jorge E; Espeche, Lucía D; Buzzalino, Noemí D; Nadra, Alejandro D; Dain, Liliana

    2018-01-01

    Congenital adrenal hyperplasia (CAH) is a group of autosomal recessive disorders of adrenal steroidogenesis. Disorders in steroid 21-hydroxylation account for over 95% of patients with CAH. Clinically, 21-hydroxylase deficiency has been classified into a broad spectrum of clinical forms, ranging from severe or classical, to mild late onset or non-classical. Known allelic variants in the disease-causing CYP21A2 gene are spread among different sources. Until recently, most variants reported have been identified in the clinical setting, which presumably biases described variants towards pathogenic ones, such as those found in the CYPAlleles database. Nevertheless, a large number of variants are being described in massive genome projects, many of which are found in dbSNP, but lack functional implications and/or their phenotypic effect. In this work, we gathered a total of 1,340 genetic variants (GVs) in the CYP21A2 gene, of which 899 were unique and 230 have an effect on human health, and compiled all this information in an integrated database. We also connected CYP21A2 sequence information to phenotypic effects for all available mutations, including double mutants in cis. Data compiled in the present work could help physicians in the genetic counseling of families affected with 21-hydroxylase deficiency. © 2017 Wiley Periodicals, Inc.

  1. Mutation screening in the Greek population and evaluation of NLGN3 and NLGN4X genes causal factors for autism.

    PubMed

    Volaki, Konstantina; Pampanos, Andreas; Kitsiou-Tzeli, Sophia; Vrettou, Christina; Oikonomakis, Vasilis; Sofocleous, Christalena; Kanavakis, Emmanuel

    2013-10-01

    Molecular and neurobiological evidence for the involvement of neuroligins (particularly NLGN3 and NLGN4X genes) in autistic disorder is accumulating. However, previous mutation screening studies on these two genes have yielded controversial results. The present study explores, for the first time, the contribution of NLGN3 and NLGN4X genetic variants in Greek patients with autistic disorder. We analyzed the full exonic sequence of NLGN3 and NLGN4X genes in 40 patients strictly fulfilling the Diagnostic and Statistical Manual of Mental Disorders, 4th ed. criteria for autistic disorder. We identified nine nucleotide changes in NLGN4X--one probable causative mutation (p.K378R) previously reported by our research group, one novel variant (c.-206G>C), one nonvalidated single nucleotide polymorphism (SNP, rs111953947), and six known human SNPs reported in the SNP database--and one known human SNP in NLGN3 also reported in the SNP database. The variants identified are expected to be benign. However, they should be investigated in the context of variants in interacting cellular pathways to assess their contribution to the etiology of autism.

  2. IMPACT web portal: oncology database integrating molecular profiles with actionable therapeutics.

    PubMed

    Hintzsche, Jennifer D; Yoo, Minjae; Kim, Jihye; Amato, Carol M; Robinson, William A; Tan, Aik Choon

    2018-04-20

    With the advancement of next generation sequencing technology, researchers are now able to identify important variants and structural changes in DNA and RNA in cancer patient samples. With this information, we can now correlate specific variants and/or structural changes with actionable therapeutics known to inhibit these variants. We introduce the creation of the IMPACT Web Portal, a new online resource that connects molecular profiles of tumors to approved drugs, investigational therapeutics and pharmacogenetics associated drugs. IMPACT Web Portal contains a total of 776 drugs connected to 1326 target genes and 435 target variants, fusion, and copy number alterations. The online IMPACT Web Portal allows users to search for various genetic alterations and connects them to three levels of actionable therapeutics. The results are categorized into 3 levels: Level 1 contains approved drugs separated into two groups; Level 1A contains approved drugs with variant specific information while Level 1B contains approved drugs with gene level information. Level 2 contains drugs currently in oncology clinical trials. Level 3 provides pharmacogenetic associations between approved drugs and genes. IMPACT Web Portal allows for sequencing data to be linked to actionable therapeutics for translational and drug repurposing research. The IMPACT Web Portal online resource allows users to query genes and variants to approved and investigational drugs. We envision that this resource will be a valuable database for personalized medicine and drug repurposing. IMPACT Web Portal is freely available for non-commercial use at http://tanlab.ucdenver.edu/IMPACT .
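
    The three-level categorisation described in this abstract can be sketched as a small Python data structure. This is an illustrative assumption and not the portal's actual schema or code: the `DrugLink` record, its field names, the `actionability_level` function and the placeholder drug and gene names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Literal

# Hypothetical evidence categories mirroring the three levels in the abstract.
Evidence = Literal["approved", "trial", "pharmacogenetic"]


@dataclass(frozen=True)
class DrugLink:
    """One drug-to-target association (field names are assumptions)."""
    drug: str
    target_gene: str
    evidence: Evidence
    variant_specific: bool = False  # only meaningful for approved drugs


def actionability_level(link: DrugLink) -> str:
    """Map a drug-target association to Level 1A, 1B, 2 or 3."""
    if link.evidence == "approved":
        # Level 1A: variant-specific information; Level 1B: gene level only.
        return "1A" if link.variant_specific else "1B"
    if link.evidence == "trial":
        return "2"  # drug currently in oncology clinical trials
    if link.evidence == "pharmacogenetic":
        return "3"  # pharmacogenetic association with an approved drug
    raise ValueError(f"unknown evidence category: {link.evidence!r}")


# Placeholder records, not real drug-gene pairs.
links = [
    DrugLink("drug_A", "GENE1", "approved", variant_specific=True),
    DrugLink("drug_B", "GENE1", "approved"),
    DrugLink("drug_C", "GENE2", "trial"),
    DrugLink("drug_D", "GENE3", "pharmacogenetic"),
]
for link in links:
    print(link.drug, link.target_gene, "Level", actionability_level(link))
```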

  3. Rare missense variants in POT1 predispose to familial cutaneous malignant melanoma

    PubMed Central

    Shi, Jianxin; Yang, Xiaohong R.; Ballew, Bari; Rotunno, Melissa; Calista, Donato; Fargnoli, Maria Concetta; Ghiorzo, Paola; Paillerets, Brigitte Bressac-de; Nagore, Eduardo; Avril, Marie Francoise; Caporaso, Neil E.; McMaster, Mary L.; Cullen, Michael; Wang, Zhaoming; Zhang, Xijun; Bruno, William; Pastorino, Lorenza; Queirolo, Paola; Banuls-Roca, Jose; Garcia-Casado, Zaida; Vaysse, Amaury; Mohamdi, Hamida; Riazalhosseini, Yasser; Foglio, Mario; Jouenne, Fanélie; Hua, Xing; Hyland, Paula L.; Yin, Jinhu; Vallabhaneni, Haritha; Chai, Weihang; Minghetti, Paola; Pellegrini, Cristina; Ravichandran, Sarangan; Eggermont, Alexander; Lathrop, Mark; Peris, Ketty; Scarra, Giovanna Bianchi; Landi, Giorgio; Savage, Sharon A.; Sampson, Joshua N.; He, Ji; Yeager, Meredith; Goldin, Lynn R.; Demenais, Florence; Chanock, Stephen J.; Tucker, Margaret A.; Goldstein, Alisa M.; Liu, Yie; Landi, Maria Teresa

    2014-01-01

    Although CDKN2A is the most frequent high-risk melanoma susceptibility gene, the underlying genetic factors for most melanoma-prone families remain unknown. Using whole exome sequencing, we identified a rare variant that arose as a founder mutation in the telomere shelterin POT1 gene (g.7:124493086 C>T, Ser270Asn) in five unrelated melanoma-prone families from Romagna, Italy. Carriers of this variant had increased telomere length and elevated fragile telomeres suggesting that this variant perturbs telomere maintenance. Two additional rare POT1 variants were identified in all cases sequenced in two other Italian families, yielding a frequency of POT1 variants comparable to that of CDKN2A mutations in this population. These variants were not found in public databases or in 2,038 genotyped Italian controls. We also identified two rare recurrent POT1 variants in American and French familial melanoma cases. Our findings suggest that POT1 is a major susceptibility gene for familial melanoma in several populations. PMID:24686846

  4. TransAtlasDB: an integrated database connecting expression data, metadata and variants

    PubMed Central

    Adetunji, Modupeore O; Lamont, Susan J; Schmidt, Carl J

    2018-01-01

    High-throughput transcriptome sequencing (RNAseq) is the universally applied method for target-free transcript identification and gene expression quantification, generating huge amounts of data. The constraint of accessing such data and interpreting results can be a major impediment in postulating suitable hypotheses; thus an innovative storage solution that addresses these limitations, such as hard disk storage requirements, efficiency and reproducibility, is paramount. By offering a uniform data storage and retrieval mechanism, various data can be compared and easily investigated. We present a sophisticated system, TransAtlasDB, which incorporates a hybrid architecture of both relational and NoSQL databases for fast and efficient data storage, processing and querying of large datasets from transcript expression analysis with corresponding metadata, as well as gene-associated variants (such as SNPs) and their predicted gene effects. TransAtlasDB provides a data model for accurate storage of the large amount of data derived from RNAseq analysis and also methods of interacting with the database, either via command-line data management workflows, written in Perl, with useful functionalities that simplify the storage and manipulation of the massive amounts of data generated from RNAseq analysis, or through the web interface. The database application is currently modeled to handle analysis data from agricultural species, and will be expanded to include more species groups. Overall TransAtlasDB aims to serve as an accessible repository for the large complex results data files derived from RNAseq gene expression profiling and variant analysis. Database URL: https://modupeore.github.io/TransAtlasDB/ PMID:29688361

  5. An integrated database-pipeline system for studying single nucleotide polymorphisms and diseases.

    PubMed

    Yang, Jin Ok; Hwang, Sohyun; Oh, Jeongsu; Bhak, Jong; Sohn, Tae-Kwon

    2008-12-12

    Studies on the relationship between disease and genetic variations such as single nucleotide polymorphisms (SNPs) are important. Genetic variations can cause disease by influencing important biological regulation processes. Despite the need to analyze correlations between SNPs and disease, most existing databases provide information only on functional variants at specific locations on the genome, or deal with only a few genes associated with disease. There is no combined resource to widely support gene-, SNP-, and disease-related information, and to capture relationships among such data. Therefore, we developed an integrated database-pipeline system for studying SNPs and diseases. To implement the pipeline system for the integrated database, we first unified complicated and redundant disease terms and gene names using the Unified Medical Language System (UMLS) for classification and noun modification, and the HUGO Gene Nomenclature Committee (HGNC) and NCBI gene databases. Next, we collected and integrated representative databases for three categories of information. For genes and proteins, we examined the NCBI mRNA, UniProt, UCSC Table Track and MitoDat databases. For genetic variants we used the dbSNP, JSNP, ALFRED, and HGVbase databases. For disease, we employed the OMIM, GAD, and HGMD databases. The database-pipeline system provides a disease thesaurus, including genes and SNPs associated with disease. The search results for these categories are available on the web page http://diseasome.kobic.re.kr/, and a genome browser is also available to highlight findings, as well as to permit the convenient review of potentially deleterious SNPs among genes strongly associated with specific diseases and clinical phenotypes. Our system is designed to capture the relationships between SNPs associated with disease and disease-causing genes. The integrated database-pipeline provides a list of candidate genes and SNP markers for evaluation in both epidemiological and molecular biological approaches to disease-gene association studies. Furthermore, researchers can then semi-automatically select the data set for association studies while considering the relationships between genetic variation and diseases. The database can also make disease-association studies more economical and facilitate an understanding of the processes which cause disease. Currently, the database contains 14,674 SNP records and 109,715 gene records associated with human diseases and it is updated at regular intervals.

  6. CFTR-France, a national relational patient database for sharing genetic and phenotypic data associated with rare CFTR variants.

    PubMed

    Claustres, Mireille; Thèze, Corinne; des Georges, Marie; Baux, David; Girodon, Emmanuelle; Bienvenu, Thierry; Audrezet, Marie-Pierre; Dugueperoux, Ingrid; Férec, Claude; Lalau, Guy; Pagin, Adrien; Kitzis, Alain; Thoreau, Vincent; Gaston, Véronique; Bieth, Eric; Malinge, Marie-Claire; Reboul, Marie-Pierre; Fergelot, Patricia; Lemonnier, Lydie; Mekki, Chadia; Fanen, Pascale; Bergougnoux, Anne; Sasorith, Souphatta; Raynal, Caroline; Bareil, Corinne

    2017-10-01

    Most of the 2,000 variants identified in the CFTR (cystic fibrosis transmembrane conductance regulator) gene are rare or private. Their interpretation is hampered by the lack of available data and resources, making patient care and genetic counseling challenging. We developed a patient-based database dedicated to the annotations of rare CFTR variants in the context of their cis- and trans-allelic combinations. Based on almost 30 years of experience of CFTR testing, CFTR-France (https://cftr.iurc.montp.inserm.fr/cftr) currently compiles 16,819 variant records from 4,615 individuals with cystic fibrosis (CF) or CFTR-RD (related disorders), fetuses with ultrasound bowel anomalies, newborns awaiting clinical diagnosis, and asymptomatic compound heterozygotes. For each of the 736 different variants reported in the database, patient characteristics and genetic information (other variations in cis or in trans) have been thoroughly checked by a dedicated curator. Combining updated clinical, epidemiological, in silico, or in vitro functional data aids the interpretation of unclassified variants and the reassessment of misclassified variants. This comprehensive CFTR database is now an invaluable tool for diagnostic laboratories gathering information on rare variants, especially in the context of genetic counseling, prenatal and preimplantation genetic diagnosis. CFTR-France is thus highly complementary to the international database CFTR2, which has so far focused on the most common CF-causing alleles. © 2017 Wiley Periodicals, Inc.

  7. Rapid functional analysis of computationally complex rare human IRF6 gene variants using a novel zebrafish model.

    PubMed

    Li, Edward B; Truong, Dawn; Hallett, Shawn A; Mukherjee, Kusumika; Schutte, Brian C; Liao, Eric C

    2017-09-01

    Large-scale sequencing efforts have captured a rapidly growing catalogue of genetic variations. However, the accurate establishment of gene variant pathogenicity remains a central challenge in translating personal genomics information to clinical decisions. Interferon Regulatory Factor 6 (IRF6) gene variants are significant genetic contributors to orofacial clefts. Although approximately three hundred IRF6 gene variants have been documented, their effects on protein functions remain difficult to interpret. Here, we demonstrate the protein functions of human IRF6 missense gene variants could be rapidly assessed in detail by their abilities to rescue the irf6 -/- phenotype in zebrafish through variant mRNA microinjections at the one-cell stage. The results revealed many missense variants previously predicted by traditional statistical and computational tools to be loss-of-function and pathogenic retained partial or full protein function and rescued the zebrafish irf6 -/- periderm rupture phenotype. Through mRNA dosage titration and analysis of the Exome Aggregation Consortium (ExAC) database, IRF6 missense variants were grouped by their abilities to rescue at various dosages into three functional categories: wild type function, reduced function, and complete loss-of-function. This sensitive and specific biological assay was able to address the nuanced functional significances of IRF6 missense gene variants and overcome many limitations faced by current statistical and computational tools in assigning variant protein function and pathogenicity. Furthermore, it unlocked the possibility for characterizing yet undiscovered human IRF6 missense gene variants from orofacial cleft patients, and illustrated a generalizable functional genomics paradigm in personalized medicine.

  8. Meta-analysis of CHEK2 1100delC variant and colorectal cancer susceptibility.

    PubMed

    Xiang, He-ping; Geng, Xiao-ping; Ge, Wei-wei; Li, He

    2011-11-01

    The cell cycle checkpoint kinase 2 (CHEK2) gene, particularly its 1100delC variant, has been inconsistently associated with colorectal cancer (CRC). To generate large-scale evidence on whether the CHEK2 1100delC variant is associated with CRC susceptibility, we conducted a meta-analysis. Data were collected from the following electronic databases: PubMed, Excerpta Medica Database and Chinese Biomedical Literature Database, with the last report up to November 2010. The odds ratio (OR) and its 95% confidence interval (95% CI) were used to assess the strength of association. We evaluated the contrast of carriers versus non-carriers. Meta-analysis was performed in a fixed/random effect model by using the software Review Manager 4.2. A total of six studies including 4194 cases and 10,010 controls based on the search criteria were involved in this meta-analysis. A significant association of the CHEK2 1100delC variant with unselected CRC was found (OR=2.11, 95% CI=1.41-3.16, P=0.0003). We also found an association of the CHEK2 1100delC variant with familial CRC (OR=2.80, 95% CI=1.74-4.51, P<0.0001). However, the association was not established for sporadic CRC (OR=1.45, 95% CI=0.49-4.30, P=0.50). This meta-analysis demonstrates that the CHEK2 1100delC variant may be an important CRC-predisposing variant that increases CRC risk. Copyright © 2011. Published by Elsevier Ltd.
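
    The pooled odds ratio and 95% confidence interval reported in this abstract can be illustrated with a minimal Python sketch of fixed-effect pooling. The sketch uses inverse-variance weighting of log odds ratios, which is a simplification chosen for brevity and is not necessarily the method the authors applied in Review Manager. The function names and the carrier counts are hypothetical and are not the data from the six studies.

```python
from math import exp, log, sqrt
from typing import Iterable, Tuple

# (carrier cases, non-carrier cases, carrier controls, non-carrier controls)
Table = Tuple[int, int, int, int]


def log_odds_ratio(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Return (log OR, variance of log OR) for one study.

    Assumes no cell is zero; real analyses usually add a continuity
    correction to tables that contain a zero cell.
    """
    return log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pooled_odds_ratio(tables: Iterable[Table],
                      z: float = 1.96) -> Tuple[float, float, float]:
    """Fixed-effect, inverse-variance pooled OR with a 95% CI by default."""
    weight_sum = 0.0
    weighted_log_or = 0.0
    for a, b, c, d in tables:
        log_or, variance = log_odds_ratio(a, b, c, d)
        weight = 1.0 / variance  # more precise studies get more weight
        weight_sum += weight
        weighted_log_or += weight * log_or
    pooled = weighted_log_or / weight_sum
    standard_error = sqrt(1.0 / weight_sum)
    return (exp(pooled),
            exp(pooled - z * standard_error),
            exp(pooled + z * standard_error))


# Hypothetical counts for three studies.
studies = [(10, 990, 5, 1995), (8, 492, 6, 994), (12, 788, 9, 1491)]
odds_ratio, ci_low, ci_high = pooled_odds_ratio(studies)
print(f"pooled OR = {odds_ratio:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

    A random-effects model would additionally estimate the between-study variance and add it to each study's variance before weighting, which widens the interval when the studies disagree.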

  9. Mutation Update of ARSA and PSAP Genes Causing Metachromatic Leukodystrophy.

    PubMed

    Cesani, Martina; Lorioli, Laura; Grossi, Serena; Amico, Giulia; Fumagalli, Francesca; Spiga, Ivana; Filocamo, Mirella; Biffi, Alessandra

    2016-01-01

    Metachromatic leukodystrophy is a neurodegenerative disorder characterized by progressive demyelination. The disease is caused by variants in the ARSA gene, which codes for the lysosomal enzyme arylsulfatase A, or, more rarely, in the PSAP gene, which codes for the activator protein saposin B. In this Mutation Update, an extensive review of all the ARSA- and PSAP-causative variants published in the literature to date, accounting for a total of 200 ARSA and 10 PSAP allele types, is presented. The detailed ARSA and PSAP variant lists are freely available on the Leiden Online Variation Database (LOVD) platform at http://www.LOVD.nl/ARSA and http://www.LOVD.nl/PSAP, respectively. © 2015 WILEY PERIODICALS, INC.

  10. Integrating 400 million variants from 80,000 human samples with extensive annotations: towards a knowledge base to analyze disease cohorts.

    PubMed

    Hakenberg, Jörg; Cheng, Wei-Yi; Thomas, Philippe; Wang, Ying-Chih; Uzilov, Andrew V; Chen, Rong

    2016-01-08

    Data from a plethora of high-throughput sequencing studies is readily available to researchers, providing genetic variants detected in a variety of healthy and disease populations. While each individual cohort helps gain insights into polymorphic and disease-associated variants, a joint perspective can be more powerful in identifying polymorphisms, rare variants, disease-associations, genetic burden, somatic variants, and disease mechanisms. We have set up a Reference Variant Store (RVS) containing variants observed in a number of large-scale sequencing efforts, such as 1000 Genomes, ExAC, Scripps Wellderly, UK10K; various genotyping studies; and disease association databases. RVS holds extensive annotations pertaining to affected genes, functional impacts, disease associations, and population frequencies. RVS currently stores 400 million distinct variants observed in more than 80,000 human samples. RVS facilitates cross-study analysis to discover novel genetic risk factors, gene-disease associations, potential disease mechanisms, and actionable variants. Due to its large reference populations, RVS can also be employed for variant filtration and gene prioritization. A web interface to public datasets and annotations in RVS is available at https://rvs.u.hpc.mssm.edu/.

  11. iMETHYL: an integrative database of human DNA methylation, gene expression, and genomic variation.

    PubMed

    Komaki, Shohei; Shiwa, Yuh; Furukawa, Ryohei; Hachiya, Tsuyoshi; Ohmomo, Hideki; Otomo, Ryo; Satoh, Mamoru; Hitomi, Jiro; Sobue, Kenji; Sasaki, Makoto; Shimizu, Atsushi

    2018-01-01

    We launched an integrative multi-omics database, iMETHYL (http://imethyl.iwate-megabank.org). iMETHYL provides whole-DNA methylation (~24 million autosomal CpG sites), whole-genome (~9 million single-nucleotide variants), and whole-transcriptome (>14 000 genes) data for CD4+ T-lymphocytes, monocytes, and neutrophils collected from approximately 100 subjects. These data were obtained from whole-genome bisulfite sequencing, whole-genome sequencing, and whole-transcriptome sequencing, making iMETHYL a comprehensive database.

  12. Genetic investigation of 100 heart genes in sudden unexplained death victims in a forensic setting

    PubMed Central

    Christiansen, Sofie Lindgren; Hertz, Christin Løth; Ferrero-Miliani, Laura; Dahl, Morten; Weeke, Peter Ejvin; LuCamp; Ottesen, Gyda Lolk; Frank-Hansen, Rune; Bundgaard, Henning; Morling, Niels

    2016-01-01

    In forensic medicine, one-third of the sudden deaths remain unexplained after medico-legal autopsy. A major proportion of these sudden unexplained deaths (SUD) are considered to be caused by inherited cardiac diseases. Sudden cardiac death (SCD) may be the first manifestation of these diseases. The purpose of this study was to explore the yield of next-generation sequencing of genes associated with SCD in a cohort of SUD victims. We investigated 100 genes associated with cardiac diseases in 61 young (1–50 years) SUD cases. DNA was captured with the Haloplex target enrichment system and sequenced using an Illumina MiSeq. The identified genetic variants were evaluated and classified as likely, unknown or unlikely to have a functional effect. The criteria for this classification were based on the literature, databases, conservation and prediction of the effect of the variant. We found that 21 (34%) individuals carried variants with a likely functional effect. Ten (40%) of these variants were located in genes associated with cardiomyopathies and 15 (60%) of the variants in genes associated with cardiac channelopathies. Nineteen individuals carried variants with unknown functional effect. Our findings indicate that broad genetic investigation of SUD victims increases the diagnostic outcome, and the investigation should comprise genes involved in both cardiomyopathies and cardiac channelopathies. PMID:27650965

  13. Genetic investigation of 100 heart genes in sudden unexplained death victims in a forensic setting.

    PubMed

    Christiansen, Sofie Lindgren; Hertz, Christin Løth; Ferrero-Miliani, Laura; Dahl, Morten; Weeke, Peter Ejvin; LuCamp; Ottesen, Gyda Lolk; Frank-Hansen, Rune; Bundgaard, Henning; Morling, Niels

    2016-12-01

    In forensic medicine, one-third of the sudden deaths remain unexplained after medico-legal autopsy. A major proportion of these sudden unexplained deaths (SUD) are considered to be caused by inherited cardiac diseases. Sudden cardiac death (SCD) may be the first manifestation of these diseases. The purpose of this study was to explore the yield of next-generation sequencing of genes associated with SCD in a cohort of SUD victims. We investigated 100 genes associated with cardiac diseases in 61 young (1-50 years) SUD cases. DNA was captured with the Haloplex target enrichment system and sequenced using an Illumina MiSeq. The identified genetic variants were evaluated and classified as likely, unknown or unlikely to have a functional effect. The criteria for this classification were based on the literature, databases, conservation and prediction of the effect of the variant. We found that 21 (34%) individuals carried variants with a likely functional effect. Ten (40%) of these variants were located in genes associated with cardiomyopathies and 15 (60%) of the variants in genes associated with cardiac channelopathies. Nineteen individuals carried variants with unknown functional effect. Our findings indicate that broad genetic investigation of SUD victims increases the diagnostic outcome, and the investigation should comprise genes involved in both cardiomyopathies and cardiac channelopathies.

  14. Identifying Mendelian disease genes with the Variant Effect Scoring Tool

    PubMed Central

    2013-01-01

    Background: Whole exome sequencing studies identify hundreds to thousands of rare protein-coding variants of ambiguous significance for human health. Computational tools are needed to accelerate the identification of specific variants and genes that contribute to human disease. Results: We have developed the Variant Effect Scoring Tool (VEST), a supervised machine learning-based classifier, to prioritize rare missense variants with likely involvement in human disease. The VEST classifier training set comprised ~45,000 disease mutations from the latest Human Gene Mutation Database release and another ~45,000 high-frequency (allele frequency >1%) putatively neutral missense variants from the Exome Sequencing Project. VEST outperforms some of the most popular methods for prioritizing missense variants in carefully designed holdout benchmarking experiments (VEST ROC AUC = 0.91, PolyPhen2 ROC AUC = 0.86, SIFT4.0 ROC AUC = 0.84). VEST estimates variant score p-values against a null distribution of VEST scores for neutral variants not included in the VEST training set. These p-values can be aggregated at the gene level across multiple disease exomes to rank genes for probable disease involvement. We tested the ability of an aggregate VEST gene score to identify candidate Mendelian disease genes, based on whole-exome sequencing of a small number of disease cases. We used whole-exome data for two Mendelian disorders for which the causal gene is known. Considering only genes that contained variants in all cases, the VEST gene score ranked dihydroorotate dehydrogenase (DHODH) number 2 of 2253 genes in four cases of Miller syndrome, and myosin-3 (MYH3) number 2 of 2313 genes in three cases of Freeman-Sheldon syndrome. Conclusions: Our results demonstrate the potential power gain of aggregating bioinformatics variant scores into gene-level scores and the general utility of bioinformatics in assisting the search for disease genes in large-scale exome sequencing studies.
    VEST is available as a stand-alone software package at http://wiki.chasmsoftware.org and is hosted by the CRAVAT web server at http://www.cravat.us. PMID:23819870
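
    The abstract says that variant p-values can be aggregated at the gene level to rank genes, but it does not name the aggregation method. The sketch below assumes Fisher's method for combining independent p-values, which is one common choice and not necessarily what VEST uses. The function names, gene labels, and p-values are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method (assumed, not VEST's confirmed method)."""
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    # Fisher statistic X = -2 * sum(ln p); we work with X / 2.
    half_stat = -sum(math.log(p) for p in p_values)
    # Chi-square survival function with 2k degrees of freedom has a closed
    # form for even df: exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def rank_genes(gene_to_pvalues):
    """Return gene names ordered from smallest to largest combined p-value."""
    combined = {gene: fisher_combined_p(ps) for gene, ps in gene_to_pvalues.items()}
    return sorted(combined, key=combined.get)


# Hypothetical per-case variant p-values for two placeholder genes.
example = {"GENE_A": [0.001, 0.002, 0.01], "GENE_B": [0.4, 0.6, 0.2]}
ranking = rank_genes(example)
print(ranking)
```

    Fisher's method assumes the combined p-values are independent, which is plausible across unrelated cases but should be checked before the ranking is trusted.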

  15. Meta-analysis of gene-level associations for rare variants based on single-variant statistics.

    PubMed

    Hu, Yi-Juan; Berndt, Sonja I; Gustafsson, Stefan; Ganna, Andrea; Hirschhorn, Joel; North, Kari E; Ingelsson, Erik; Lin, Dan-Yu

    2013-08-08

    Meta-analysis of genome-wide association studies (GWASs) has led to the discoveries of many common variants associated with complex human diseases. There is a growing recognition that identifying "causal" rare variants also requires large-scale meta-analysis. The fact that association tests with rare variants are performed at the gene level rather than at the variant level poses unprecedented challenges in the meta-analysis. First, different studies may adopt different gene-level tests, so the results are not compatible. Second, gene-level tests require multivariate statistics (i.e., components of the test statistic and their covariance matrix), which are difficult to obtain. To overcome these challenges, we propose to perform gene-level tests for rare variants by combining the results of single-variant analysis (i.e., p values of association tests and effect estimates) from participating studies. This simple strategy is possible because of an insight that multivariate statistics can be recovered from single-variant statistics, together with the correlation matrix of the single-variant test statistics, which can be estimated from one of the participating studies or from a publicly available database. We show both theoretically and numerically that the proposed meta-analysis approach provides accurate control of the type I error and is as powerful as joint analysis of individual participant data. This approach accommodates any disease phenotype and any study design and produces all commonly used gene-level tests. An application to the GWAS summary results of the Genetic Investigation of ANthropometric Traits (GIANT) consortium reveals rare and low-frequency variants associated with human height. The relevant software is freely available. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
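
    The recovery of gene-level statistics from single-variant results can be sketched as follows. The abstract gives no formulas, so this is an illustration under stated assumptions: score statistics are approximated as U_j = beta_j / se_j^2, their covariance as V_jk = R_jk / (se_j * se_k), and a weighted burden test is formed as (w'U)^2 / (w'Vw) with one degree of freedom. The function name and inputs are hypothetical.

```python
import math


def burden_test_from_summary(betas, ses, corr, weights=None):
    """Burden-type gene-level test built from single-variant summary statistics.

    betas, ses: per-variant effect estimates and standard errors.
    corr: correlation matrix of the single-variant test statistics.
    Returns (chi-square statistic with 1 df, p-value).
    """
    m = len(betas)
    if len(ses) != m or len(corr) != m or any(len(row) != m for row in corr):
        raise ValueError("inputs must have matching dimensions")
    if weights is None:
        weights = [1.0] * m
    # Approximate score statistics recovered from effect estimates.
    u = [b / (s * s) for b, s in zip(betas, ses)]
    # Approximate covariance of the score statistics.
    v = [[corr[j][k] / (ses[j] * ses[k]) for k in range(m)] for j in range(m)]
    numerator = sum(w * x for w, x in zip(weights, u))
    denominator = sum(
        weights[j] * weights[k] * v[j][k] for j in range(m) for k in range(m)
    )
    q = numerator * numerator / denominator
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(q / 2.0))
    return q, p_value


# Hypothetical summary results for two variants in one gene.
independent = [[1.0, 0.0], [0.0, 1.0]]
q, p = burden_test_from_summary([0.2, 0.2], [0.1, 0.1], independent)
print(q, p)
```

    With a single variant the statistic reduces to the squared z-score, which is a quick sanity check on the recovery step.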

  16. Mutation Update for GNE Gene Variants Associated with GNE Myopathy

    PubMed Central

    Celeste, Frank V.; Vilboux, Thierry; Ciccone, Carla; de Dios, John Karl; Malicdan, May Christine V.; Leoyklang, Petcharat; McKew, John C.; Gahl, William A.; Carrillo-Carrasco, Nuria; Huizing, Marjan

    2014-01-01

    The GNE gene encodes the rate-limiting, bifunctional enzyme of sialic acid biosynthesis, UDP-N-acetylglucosamine 2-epimerase/N-acetylmannosamine kinase (GNE). Biallelic GNE mutations underlie GNE myopathy, an adult-onset progressive myopathy. GNE myopathy-associated GNE mutations are predominantly missense, resulting in reduced, but not absent, GNE enzyme activities. The exact pathomechanism of GNE myopathy remains unknown, but likely involves aberrant (muscle) sialylation. Here we summarize 154 reported and novel GNE variants associated with GNE myopathy, including 122 missense, 11 nonsense, 14 insertion/deletions and 7 intronic variants. All variants were deposited in the online GNE variation database (http://www.dmd.nl/nmdb2/home.php?select_db=GNE). We report the predicted effects on protein function of all variants as well as the predicted effects on epimerase and/or kinase enzymatic activities of selected variants. By analyzing exome sequence databases, we identified three frequently occurring, unreported GNE missense variants/polymorphisms, important for future sequence interpretations. Based on allele frequencies, we estimate the world-wide prevalence of GNE myopathy to be ~4–21/1,000,000. This previously unrecognized high prevalence confirms suspicions that many patients may escape diagnosis. Awareness of GNE myopathy among physicians is essential for the identification of new patients, which is required for better understanding of the disorder’s pathomechanism and for the success of ongoing treatment trials. PMID:24796702
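
    A prevalence estimate from allele frequencies can be sketched with the Hardy-Weinberg relation for a recessive (biallelic) disorder. The abstract does not describe the authors' calculation, so this is an assumption: expected prevalence is taken as q^2, where q is the summed frequency of pathogenic alleles. The function names and the frequencies below are illustrative and not taken from the paper.

```python
def recessive_prevalence(pathogenic_allele_freqs):
    """Expected fraction of affected individuals under Hardy-Weinberg equilibrium."""
    q = sum(pathogenic_allele_freqs)
    if not 0.0 <= q <= 1.0:
        raise ValueError("summed allele frequency must lie in [0, 1]")
    # Affected individuals carry two pathogenic alleles (homozygous or
    # compound heterozygous), which happens with probability q * q.
    return q * q


def per_million(fraction):
    """Express a population fraction as cases per 1,000,000 people."""
    return fraction * 1_000_000


# Made-up frequencies for three pathogenic alleles (summed q = 0.002).
example_freqs = [0.001, 0.0005, 0.0005]
cases_per_million = per_million(recessive_prevalence(example_freqs))
print(round(cases_per_million, 2))
```

    The estimate ignores incomplete penetrance, population structure, and consanguinity, any of which can shift the real figure.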

  17. NCI-60 Whole Exome Sequencing and Pharmacological CellMiner Analyses

    PubMed Central

    Reinhold, William C.; Varma, Sudhir; Sousa, Fabricio; Sunshine, Margot; Abaan, Ogan D.; Davis, Sean R.; Reinhold, Spencer W.; Kohn, Kurt W.; Morris, Joel; Meltzer, Paul S.; Doroshow, James H.; Pommier, Yves

    2014-01-01

    Exome sequencing provides unprecedented insights into cancer biology and pharmacological response. Here we assess these two parameters for the NCI-60, which is among the richest genomic and pharmacological publicly available cancer cell line databases. Homozygous genetic variants that putatively affect protein function were identified in 1,199 genes (approximately 6% of all genes). Variants that are either enriched or depleted compared to non-cancerous genomes, and thus may be influential in cancer progression and differential drug response were identified for 2,546 genes. Potential gene knockouts are made available. Assessment of cell line response to 19,940 compounds, including 110 FDA-approved drugs, reveals an ≈80-fold range in resistance versus sensitivity response across cell lines. 103,422 gene variants were significantly correlated with at least one compound (at p<0.0002). These include genes of known pharmacological importance such as IGF1R, BRAF, RAD52, MTOR, STAT2 and TSC2 as well as a large number of candidate genes such as NOM1, TLL2, and XDH. We introduce two new web-based CellMiner applications that enable exploration of variant-to-compound relationships for a broad range of researchers, especially those without bioinformatics support. The first tool, “Genetic variant versus drug visualization”, provides a visualization of significant correlations between drug activity-gene variant combinations. Examples are given for the known vemurafenib-BRAF, and novel ifosfamide-RAD52 pairings. The second, “Genetic variant summation”, allows an assessment of cumulative genetic variations for up to 150 genes combined and is designed to identify the variant burden for molecular pathways or functional groupings of genes. An example of its use is provided for the EGFR-ERBB2 pathway gene variant data and the identification of correlated EGFR, ERBB2, MTOR, BRAF, MEK and ERK inhibitors.
    The new tools are implemented as an updated web-based CellMiner version, for which the present publication serves as a compendium. PMID:25032700

  18. MERRF Classification: Implications for Diagnosis and Clinical Trials.

    PubMed

    Finsterer, Josef; Zarrouk-Mahjoub, Sinda; Shoffner, John M

    2018-03-01

    Given the etiologic heterogeneity of disease classification using clinical phenomenology, we employed contemporary criteria to classify variants associated with myoclonic epilepsy with ragged-red fibers (MERRF) syndrome and to assess the strength of evidence of gene-disease associations. Standardized approaches are used to clarify the definition of MERRF, which is essential for patient diagnosis, patient classification, and clinical trial design. Systematic literature and database search with application of standardized assessment of gene-disease relationships using modified Smith criteria and of variants reported to be associated with MERRF using modified Yarham criteria. Review of available evidence supports a gene-disease association for two MT-tRNAs and for POLG. Using modified Smith criteria, definitive evidence of a MERRF gene-disease association is identified for MT-TK. Strong gene-disease evidence is present for MT-TL1 and POLG. Functional assays that directly associate variants with oxidative phosphorylation impairment were critical to mtDNA variant classification. In silico analysis was of limited utility to the assessment of individual MT-tRNA variants. With the use of contemporary classification criteria, several mtDNA variants previously reported as pathogenic or possibly pathogenic are reclassified as neutral variants. MERRF is primarily an MT-TK disease, with pathogenic variants in this gene accounting for ~90% of MERRF patients. Although MERRF is phenotypically and genotypically heterogeneous, myoclonic epilepsy is the clinical feature that distinguishes MERRF from other categories of mitochondrial disorders. Given its low frequency in mitochondrial disorders, myoclonic epilepsy is not explained simply by an impairment of cellular energetics. Although MERRF phenocopies can occur in other genes, additional data are needed to establish a MERRF disease-gene association. 
    This approach to MERRF emphasizes standardized classification rather than clinical phenomenology, thus improving patient diagnosis and clinical trial design. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. Systematic documentation and analysis of human genetic variation in hemoglobinopathies using the microattribution approach.

    PubMed

    Giardine, Belinda; Borg, Joseph; Higgs, Douglas R; Peterson, Kenneth R; Philipsen, Sjaak; Maglott, Donna; Singleton, Belinda K; Anstee, David J; Basak, A Nazli; Clark, Barnaby; Costa, Flavia C; Faustino, Paula; Fedosyuk, Halyna; Felice, Alex E; Francina, Alain; Galanello, Renzo; Gallivan, Monica V E; Georgitsi, Marianthi; Gibbons, Richard J; Giordano, Piero C; Harteveld, Cornelis L; Hoyer, James D; Jarvis, Martin; Joly, Philippe; Kanavakis, Emmanuel; Kollia, Panagoula; Menzel, Stephan; Miller, Webb; Moradkhani, Kamran; Old, John; Papachatzopoulou, Adamantia; Papadakis, Manoussos N; Papadopoulos, Petros; Pavlovic, Sonja; Perseu, Lucia; Radmilovic, Milena; Riemer, Cathy; Satta, Stefania; Schrijver, Iris; Stojiljkovic, Maja; Thein, Swee Lay; Traeger-Synodinos, Jan; Tully, Ray; Wada, Takahito; Waye, John S; Wiemann, Claudia; Zukic, Branka; Chui, David H K; Wajcman, Henri; Hardison, Ross C; Patrinos, George P

    2011-03-20

    We developed a series of interrelated locus-specific databases to store all published and unpublished genetic variation related to hemoglobinopathies and thalassemia and implemented microattribution to encourage submission of unpublished observations of genetic variation to these public repositories. A total of 1,941 unique genetic variants in 37 genes, encoding globins and other erythroid proteins, are currently documented in these databases, with reciprocal attribution of microcitations to data contributors. Our project provides the first example of implementing microattribution to incentivise submission of all known genetic variation in a defined system. It has demonstrably increased the reporting of human variants, leading to a comprehensive online resource for systematically describing human genetic variation in the globin genes and other genes contributing to hemoglobinopathies and thalassemias. The principles established here will serve as a model for other systems and for the analysis of other common and/or complex human genetic diseases.

  20. SSTAR, a Stand-Alone Easy-To-Use Antimicrobial Resistance Gene Predictor.

    PubMed

    de Man, Tom J B; Limbago, Brandi M

    2016-01-01

    We present the easy-to-use Sequence Search Tool for Antimicrobial Resistance, SSTAR. It combines a locally executed BLASTN search against a customizable database with an intuitive graphical user interface for identifying antimicrobial resistance (AR) genes from genomic data. Although the database is initially populated from a public repository of acquired resistance determinants (i.e., ARG-ANNOT), it can be customized for particular pathogen groups and resistance mechanisms. For instance, outer membrane porin sequences associated with carbapenem resistance phenotypes can be added, and known intrinsic mechanisms can be included. Unique about this tool is the ability to easily detect putative new alleles and truncated versions of existing AR genes. Variants and potential new alleles are brought to the attention of the user for further investigation. For instance, SSTAR is able to identify modified or truncated versions of porins, which may be of great importance in carbapenemase-negative carbapenem-resistant Enterobacteriaceae. SSTAR is written in Java and is therefore platform independent and compatible with both Windows and Unix operating systems. SSTAR and its manual, which includes a simple installation guide, are freely available from https://github.com/tomdeman-bio/Sequence-Search-Tool-for-Antimicrobial-Resistance-SSTAR-. IMPORTANCE Whole-genome sequencing (WGS) is quickly becoming a routine method for identifying genes associated with antimicrobial resistance (AR). However, for many microbiologists, the use and analysis of WGS data present a substantial challenge. We developed SSTAR, software with a graphical user interface that enables the identification of known AR genes from WGS and has the unique capacity to easily detect new variants of known AR genes, including truncated protein variants. Current software solutions do not notify the user when genes are truncated and, therefore, likely nonfunctional, which makes phenotype predictions less accurate. 
    SSTAR users can apply any AR database of interest as a reference comparator and can manually add genes that impact resistance, even if such genes are not resistance determinants per se (e.g., porins and efflux pumps).

  1. GENEASE: Real time bioinformatics tool for multi-omics and disease ontology exploration, analysis and visualization.

    PubMed

    Ghandikota, Sudhir; Hershey, Gurjit K Khurana; Mersha, Tesfaye B

    2018-03-24

    Advances in high-throughput sequencing technologies have made it possible to generate multiple omics data at an unprecedented rate and scale. The accumulation of these omics data far outpaces the rate at which biologists can mine and generate new hypotheses to test experimentally. There is an urgent need to develop a myriad of powerful tools to efficiently and effectively search and filter these resources to address specific post-GWAS functional genomics questions. However, to date, these resources are scattered across several databases and often lack a unified portal for data annotation and analytics. In addition, existing tools to analyze and visualize these databases are highly fragmented, forcing researchers to use multiple applications and intervene manually for each gene or variant in an ad hoc fashion until all the questions are answered. In this study, we present GENEASE, a web-based one-stop bioinformatics tool designed to not only query and explore multi-omics and phenotype databases (e.g., GTEx, ClinVar, dbGaP, GWAS Catalog, ENCODE, Roadmap Epigenomics, KEGG, Reactome, Gene and Phenotype Ontology) in a single web interface but also to perform seamless post genome-wide association downstream functional and overlap analysis for non-coding regulatory variants. GENEASE accesses over 50 different databases in the public domain, including model organism-specific databases, to facilitate gene/variant and disease exploration, enrichment and overlap analysis in real time. It is a user-friendly tool with a point-and-click interface containing links for support information, including a user manual and examples. GENEASE can be accessed freely at http://research.cchmc.org/mershalab/genease_new/login.html. Contact: Tesfaye.Mersha@cchmc.org, Sudhir.Ghandikota@cchmc.org. Supplementary data are available at Bioinformatics online.

  2. Hb Mozhaisk [β92(F8)His→Arg; HBB: c.278A>G] as a De Novo Mutation in a Child of Mixed Ethnic Origins.

    PubMed

    Benzoni, Elena; Giannone, Valentina; Michetti, Laura; Seia, Manuela; Cavalleri, Laura; Curcio, Cristina

    Approximately 150 variants described in the HbVar database have been found to be unstable and about 80.0% of these are on the β-globin gene. We describe the case of a 3-year-old child who presented at the emergency room with fever and asthenia. Hematological data suggested severe hemolytic anemia. Sequencing of the β-globin gene revealed the mutation HBB: c.278A>G at codon 92 in a heterozygous state, reported as Hb Mozhaisk in the HbVar database. Other family members did not have Hb Mozhaisk, thus, this variant is due to a de novo mutation. Because of the rarity of this globin variant, we believe it is important to report similar cases, to have a more complete phenotype description of the pathology and define an adequate reproductive risk for couples, considering the dominant inheritance pattern (hence an inheritance risk of 50.0%).

  3. Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine.

    PubMed

    Singhal, Ayush; Simmons, Michael; Lu, Zhiyong

    2016-11-01

    The practice of precision medicine will ultimately require databases of genes and mutations for healthcare providers to reference in order to understand the clinical implications of each patient's genetic makeup. Although the highest quality databases require manual curation, text mining tools can facilitate the curation process, increasing accuracy, coverage, and productivity. However, to date there are no available text mining tools that offer high-accuracy performance for extracting such triplets from biomedical literature. In this paper we propose a high-performance machine learning approach to automate the extraction of disease-gene-variant triplets from biomedical literature. Our approach is unique because we identify the genes and protein products associated with each mutation from not just the local text content, but from a global context as well (from the Internet and from all literature in PubMed). Our approach also incorporates protein sequence validation and disease association using a novel text-mining-based machine learning approach. We extract disease-gene-variant triplets from all abstracts in PubMed related to a set of ten important diseases (breast cancer, prostate cancer, pancreatic cancer, lung cancer, acute myeloid leukemia, Alzheimer's disease, hemochromatosis, age-related macular degeneration (AMD), diabetes mellitus, and cystic fibrosis). We then evaluate our approach in two ways: (1) a direct comparison with the state of the art using benchmark datasets; (2) a validation study comparing the results of our approach with entries in a popular human-curated database (UniProt) for each of the previously mentioned diseases. In the benchmark comparison, our full approach achieves a 28% improvement in F1-measure (from 0.62 to 0.79) over the state-of-the-art results. For the validation study with UniProt Knowledgebase (KB), we present a thorough analysis of the results and errors. 
    Across all diseases, our approach returned 272 triplets (disease-gene-variant) that overlapped with entries in UniProt and 5,384 triplets without overlap in UniProt. Analysis of the overlapping triplets and of a stratified sample of the non-overlapping triplets revealed accuracies of 93% and 80% for the respective categories (cumulative accuracy, 77%). We conclude that our process represents an important and broadly applicable improvement to the state of the art for curation of disease-gene-variant relationships.
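
    The benchmark figures above are F1-measures, the harmonic mean of precision and recall. The sketch below shows how the score and a relative improvement are computed. The function names are hypothetical, and only the two F1 values (0.62 and 0.79) come from the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_improvement(baseline, new):
    """Fractional gain of `new` over `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline


# F1 values quoted in the abstract: prior state of the art vs. the full approach.
gain = relative_improvement(0.62, 0.79)
print(f"{gain:.1%}")
```

    Computed from the rounded scores the gain comes out slightly below the 28% quoted in the abstract, which presumably reflects unrounded values.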

  4. APADB: a database for alternative polyadenylation and microRNA regulation events

    PubMed Central

    Müller, Sören; Rycak, Lukas; Afonso-Grunz, Fabian; Winter, Peter; Zawada, Adam M.; Damrath, Ewa; Scheider, Jessica; Schmäh, Juliane; Koch, Ina; Kahl, Günter; Rotter, Björn

    2014-01-01

    Alternative polyadenylation (APA) is a widespread mechanism that contributes to the sophisticated dynamics of gene regulation. Approximately 50% of all protein-coding human genes harbor multiple polyadenylation (PA) sites; their selective and combinatorial use gives rise to transcript variants that differ in the length of their 3′ untranslated region (3′UTR). Shortened variants escape UTR-mediated regulation by microRNAs (miRNAs), especially in cancer, where global 3′UTR shortening accelerates disease progression, dedifferentiation and proliferation. Here we present APADB, a database of vertebrate PA sites determined by 3′ end sequencing, using massive analysis of complementary DNA ends. APADB provides (A)PA sites for coding and non-coding transcripts of human, mouse and chicken genes. For human and mouse, several tissue types, including different cancer specimens, are available. APADB records the loss of predicted miRNA binding sites and visualizes next-generation sequencing reads that support each PA site in a genome browser. The database tables can either be browsed according to organism and tissue or alternatively searched for a gene of interest. APADB is the largest database of APA in human, chicken and mouse. The stored information provides experimental evidence for thousands of PA sites and APA events. APADB combines 3′ end sequencing data with prediction algorithms of miRNA binding sites, allowing further improvement of these algorithms. Current databases lack correct information about 3′UTR lengths, especially for chicken, and APADB provides necessary information to close this gap. Database URL: http://tools.genxpro.net/apadb/ PMID:25052703

  5. Bedside Back to Bench: Building Bridges between Basic and Clinical Genomic Research.

    PubMed

    Manolio, Teri A; Fowler, Douglas M; Starita, Lea M; Haendel, Melissa A; MacArthur, Daniel G; Biesecker, Leslie G; Worthey, Elizabeth; Chisholm, Rex L; Green, Eric D; Jacob, Howard J; McLeod, Howard L; Roden, Dan; Rodriguez, Laura Lyman; Williams, Marc S; Cooper, Gregory M; Cox, Nancy J; Herman, Gail E; Kingsmore, Stephen; Lo, Cecilia; Lutz, Cathleen; MacRae, Calum A; Nussbaum, Robert L; Ordovas, Jose M; Ramos, Erin M; Robinson, Peter N; Rubinstein, Wendy S; Seidman, Christine; Stranger, Barbara E; Wang, Haoyi; Westerfield, Monte; Bult, Carol

    2017-03-23

    Genome sequencing has revolutionized the diagnosis of genetic diseases. Close collaborations between basic scientists and clinical genomicists are now needed to link genetic variants with disease causation. To facilitate such collaborations, we recommend prioritizing clinically relevant genes for functional studies, developing reference variant-phenotype databases, adopting phenotype description standards, and promoting data sharing. Published by Elsevier Inc.

  6. Bedside Back to Bench: Building Bridges between Basic and Clinical Genomic Research

    PubMed Central

    Manolio, Teri A.; Fowler, Douglas M.; Starita, Lea M.; Haendel, Melissa A.; MacArthur, Daniel G.; Biesecker, Leslie G.; Worthey, Elizabeth; Chisholm, Rex L.; Green, Eric D.; Jacob, Howard J.; McLeod, Howard L.; Roden, Dan; Rodriguez, Laura Lyman; Williams, Marc S.; Cooper, Gregory M.; Cox, Nancy J.; Herman, Gail E.; Kingsmore, Stephen; Lo, Cecilia; Lutz, Cathleen; MacRae, Calum A.; Nussbaum, Robert L.; Ordovas, Jose M.; Ramos, Erin M.; Robinson, Peter N.; Rubinstein, Wendy S.; Seidman, Christine; Stranger, Barbara E.; Wang, Haoyi; Westerfield, Monte; Bult, Carol

    2017-01-01

    Summary: Genome sequencing has revolutionized the diagnosis of genetic diseases. Close collaborations between basic scientists and clinical genomicists are now needed to link genetic variants with disease causation. To facilitate such collaborations, we recommend prioritizing clinically relevant genes for functional studies, developing reference variant-phenotype databases, adopting phenotype description standards, and promoting data sharing. PMID:28340351

  7. Using diverse U.S. beef cattle genomes to identify missense mutations in EPAS1, a gene associated with pulmonary hypertension

    USDA-ARS's Scientific Manuscript database

    The availability of whole genome sequence (WGS) data has made it possible to discover protein variants in silico. However, existing bovine WGS databases do not show data in a form conducive to protein variant analysis, and tend to underrepresent the breadth of genetic diversity in U.S. beef cattle...

  8. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease.

    PubMed

    Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T; van Oven, Mannis; Wallace, Douglas C; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick F; Attimonelli, Marcella; Zuchner, Stephan; Falk, Marni J; Gai, Xiaowu

    2016-06-01

    MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR also functions as a centralized application server for Web-based tools to analyze data across both mitochondrial and nuclear DNA, including investigator-driven whole exome or genome dataset analyses through MSeqDR-Genesis. MSeqDR-GBrowse genome browser supports interactive genomic data exploration and visualization with custom tracks relevant to mtDNA variation and mitochondrial disease. MSeqDR-LSDB is a locus-specific database that currently manages 178 mitochondrial diseases, 1,363 genes associated with mitochondrial biology or disease, and 3,711 pathogenic variants in those genes. MSeqDR Disease Portal allows hierarchical tree-style disease exploration to evaluate their unique descriptions, phenotypes, and causative variants. Automated genomic data submission tools are provided that capture ClinVar compliant variant annotations. PhenoTips will be used for phenotypic data submission on deidentified patients using human phenotype ontology terminology. The development of a dynamic informed patient consent process to guide data access is underway to realize the full potential of these resources. © 2016 WILEY PERIODICALS, INC.

  9. MSeqDR: A Centralized Knowledge Repository and Bioinformatics Web Resource to Facilitate Genomic Investigations in Mitochondrial Disease

    PubMed Central

    Shen, Lishuang; Diroma, Maria Angela; Gonzalez, Michael; Navarro-Gomez, Daniel; Leipzig, Jeremy; Lott, Marie T.; van Oven, Mannis; Wallace, Douglas C.; Muraresku, Colleen Clarke; Zolkipli-Cunningham, Zarazuela; Chinnery, Patrick F.; Attimonelli, Marcella; Zuchner, Stephan

    2016-01-01

    MSeqDR is the Mitochondrial Disease Sequence Data Resource, a centralized and comprehensive genome and phenome bioinformatics resource built by the mitochondrial disease community to facilitate clinical diagnosis and research investigations of individual patient phenotypes, genomes, genes, and variants. A central Web portal (https://mseqdr.org) integrates community knowledge from expert-curated databases with genomic and phenotype data shared by clinicians and researchers. MSeqDR also functions as a centralized application server for Web-based tools to analyze data across both mitochondrial and nuclear DNA, including investigator-driven whole exome or genome dataset analyses through MSeqDR-Genesis. MSeqDR-GBrowse supports interactive genomic data exploration and visualization with custom tracks relevant to mtDNA variation and disease. MSeqDR-LSDB is a locus specific database that currently manages 178 mitochondrial diseases, 1,363 genes associated with mitochondrial biology or disease, and 3,711 pathogenic variants in those genes. MSeqDR Disease Portal allows hierarchical tree-style disease exploration to evaluate their unique descriptions, phenotypes, and causative variants. Automated genomic data submission tools are provided that capture ClinVar-compliant variant annotations. PhenoTips is used for phenotypic data submission on de-identified patients using human phenotype ontology terminology. Development of a dynamic informed patient consent process to guide data access is underway to realize the full potential of these resources. PMID:26919060

  10. A systematic review and meta-analysis of genetic association studies for the role of inflammation and the immune system in diabetic nephropathy

    PubMed Central

    Tziastoudi, Maria; Hadjigeorgiou, Georgios M.; Stravodimos, Konstantinos; Zintzaras, Elias

    2017-01-01

    Abstract Background: Despite the certain contribution of metabolic and haemodynamic factors in diabetic nephropathy (DN), many lines of evidence highlight the role of immunologic and inflammatory mechanisms. To elucidate the contribution of the immune system in the development of DN, we explored the contribution of gene variants (polymorphisms) in relevant pathophysiologic pathways. Methods: We selected six major pathways related to immune response from the Kyoto Encyclopaedia of Genes and Genomes database and thereafter we traced all available genetic association studies (GASs) involving gene variants in these pathways from PubMed and HuGE Navigator. Finally, we used meta-analytic methods for synthesizing the results of the GASs. Results: One hundred three GASs were retrieved that included 443 variants from 75 genes. Of those variants, 138 were meta-analysed and 61 produced significant results; seven variants were investigated in single GASs and showed significant association. Variants in CCL2, CCR5, IL6, IL8, EPO, IL1A, IL1B, IL100, IL1RN, GHRL, MMP9, TGFB1, VEGFA, MMP3, MMP12, IL12RB1, PRKCE, TNF and TNFRSF19 genes were associated with an increased risk of DN. Conclusions: There is evidence that variants related with immunologic response affect the course of DN. However, the present results should be interpreted with caution since the current number of available GASs is limited. PMID:28616206

  11. The PBII gene of the human salivary proline-rich protein P-B produces another protein, Q504X8, with an opiorphin homolog, QRGPR.

    PubMed

    Saitoh, Eiichi; Sega, Takuya; Imai, Akane; Isemura, Satoko; Kato, Tetsuo; Ochiai, Akihito; Taniguchi, Masayuki

    2018-04-01

    The NCBI gene database and human-transcriptome database for alternative splicing were used to determine the expression of mRNAs for P-B (SMR3B) and a variant form of P-B. The translational product from the former mRNA was identified as the protein named P-B, whereas that from the latter has not yet been elucidated. In the present study, we investigated the expression of P-B and its variant form at the protein level. To identify the variant protein of P-B, (1) cationic proteins with a higher isoelectric point in human pooled whole saliva were purified by two-dimensional liquid chromatography; (2) the peptide fragments generated by in-solution trypsin digestion of all proteins were separated and analyzed by MALDI-TOF-MS; and (3) the presence or absence of P-B in individual saliva was examined by 15% SDS-PAGE. The peptide sequences (I³⁷PPPYSCTPNMNNCSR⁵², C⁵³HHHHKRHHYPCNYCFCYPK⁷², R⁵⁹HHYPCNYCFCYPK⁷² and H⁶⁰HYPCNYCFCYPK⁷²) present in the variant protein of P-B were identified. The peptide sequence (G⁶PYPPGPLAPPQPFGPGFVPPPPPPPYGPGR³⁶) in P-B (or the variant) and sequence (I³⁷PPPPPAPYGPGIFPPPPPQP⁵⁷) in P-B were identified. The sum of the sequences identified indicated a 91.23% sequence identity for P-B and 79.76% for the variant. P-B was present in the saliva of some individuals but absent from that of others. The variant protein is produced by excising a non-canonical intron (CC-AC pair) from the 3'-noncoding sequence of the PBII gene. Both P-B and the variant are subject to proteolysis in the oral cavity. Copyright © 2018 Elsevier Ltd. All rights reserved.

  12. Systematic meta-analyses and field synopsis of genetic association studies in colorectal adenomas

    PubMed Central

    Montazeri, Zahra; Theodoratou, Evropi; Nyiraneza, Christine; Timofeeva, Maria; Chen, Wanjing; Svinti, Victoria; Sivakumaran, Shanya; Gresham, Gillian; Cubitt, Laura; Carvajal-Carmona, Luis; Bertagnolli, Monica M; Zauber, Ann G; Tomlinson, Ian; Farrington, Susan M; Dunlop, Malcolm G; Campbell, Harry; Little, Julian

    2018-01-01

    Background Low penetrance genetic variants, primarily single nucleotide polymorphisms, have substantial influence on colorectal cancer (CRC) susceptibility. Most CRCs develop from colorectal adenomas (CRA). Here, we report the first comprehensive field synopsis that catalogues all genetic association studies on CRA, with a parallel online database (http://www.chs.med.ed.ac.uk/CRAgene/). Methods We performed a systematic review, reviewing 9750 titles and then extracted data from 130 publications reporting on 181 polymorphisms in 74 genes. We conducted meta-analyses to derive summary effect estimates for 37 polymorphisms in 26 genes. We applied the Venice criteria and Bayesian False Discovery Probability (BFDP) to assess the levels of the credibility of associations. Results We considered the association with the rs6983267 variant at 8q24 as “highly credible”, reaching genome wide statistical significance in at least one meta-analysis model. We identified “less credible” associations (higher heterogeneity, lower statistical power, BFDP>0.02) with a further four variants of four independent genes: MTHFR c.677C>T p.A222V (rs1801133), TP53 c.215C>G p.R72P (rs1042522), NQO1 c.559C>T p.P187S (rs1800566), and NAT1 alleles imputed as fast acetylator genotypes. For the remaining 32 variants of 22 genes for which positive associations with CRA risk have been previously reported, the meta-analyses revealed no credible evidence to support these as true associations. Conclusions The limited number of credible associations between low penetrance genetic variants and CRA reflects the lower volume of evidence and associated lack of statistical power to detect associations of the magnitude typically observed for genetic variants and chronic diseases. The CRAgene database provides context for CRA genetic association data and will help inform future research directions. PMID:26451011

  13. Pleiotropic Effects of Variants in Dementia Genes in Parkinson Disease.

    PubMed

    Ibanez, Laura; Dube, Umber; Davis, Albert A; Fernandez, Maria V; Budde, John; Cooper, Breanna; Diez-Fairen, Monica; Ortega-Cubero, Sara; Pastor, Pau; Perlmutter, Joel S; Cruchaga, Carlos; Benitez, Bruno A

    2018-01-01

    Background: The prevalence of dementia in Parkinson disease (PD) increases dramatically with advancing age, approaching 80% in patients who survive 20 years with the disease. Increasing evidence suggests clinical, pathological and genetic overlap between Alzheimer disease, dementia with Lewy bodies and frontotemporal dementia with PD. However, the contribution of the dementia-causing genes to PD risk, cognitive impairment and dementia in PD is not fully established. Objective: To assess the contribution of coding variants in Mendelian dementia-causing genes on the risk of developing PD and the effect on cognitive performance of PD patients. Methods: We analyzed the coding regions of the amyloid-beta precursor protein (APP), Presenilin 1 and 2 (PSEN1, PSEN2), and Granulin (GRN) genes from 1,374 PD cases and 973 controls using pooled-DNA targeted sequence, human exome-chip and whole-exome sequencing (WES) data by single-variant and gene-based (SKAT-O and burden tests) analyses. Global cognitive function was assessed using the Mini-Mental State Examination (MMSE) or the Montreal Cognitive Assessment (MoCA). The effect of coding variants in dementia-causing genes on cognitive performance was tested by multiple regression analysis adjusting for gender, disease duration, age at dementia assessment, study site and APOE carrier status. Results: Known AD pathogenic mutations in the PSEN1 (p.A79V) and PSEN2 (p.V148I) genes were found in 0.3% of all PD patients. There was a significant burden of rare, likely damaging variants in the GRN and PSEN1 genes in PD patients when compared with frequencies in the European population from the ExAC database. Multiple regression analysis revealed that PD patients carrying rare variants in the APP, PSEN1, PSEN2, and GRN genes exhibit lower cognitive test scores than non-carrier PD patients (p = 2.0 × 10⁻⁴), independent of age at PD diagnosis, age at evaluation, APOE status or recruitment site. Conclusions: Pathogenic mutations in the Alzheimer disease-causing genes (PSEN1 and PSEN2) are found in sporadic PD patients. PD patients with cognitive decline carry rare variants in dementia-causing genes. Variants in genes causing Mendelian neurodegenerative diseases exhibit pleiotropic effects.

  14. A searchable, whole genome resource designed for protein variant analysis in diverse lineages of U.S. beef cattle

    USDA-ARS's Scientific Manuscript database

    A key feature of a gene's function is the variety of protein isoforms it encodes in a population. However, the genetic diversity in bovine whole genome databases tends to be underrepresented because these databases contain an abundance of sequence from the most influential sires. Our first aim was ...

  15. Spectrum of PAH gene variants among a population of Han Chinese patients with phenylketonuria from northern China.

    PubMed

    Liu, Ning; Huang, Qiuying; Li, Qingge; Zhao, Dehua; Li, Xiaole; Cui, Lixia; Bai, Ying; Feng, Yin; Kong, Xiangdong

    2017-10-05

    Phenylketonuria (PKU), which primarily results from a deficiency of phenylalanine hydroxylase (PAH), is one of the most common inherited inborn errors of metabolism that impairs postnatal cognitive development. The incidence of various PAH variations differs by race and ethnicity. The aim of the present study was to characterize the PAH gene variants of a Han population from Northern China. In total, 655 PKU patients and their families were recruited for this study; each proband was diagnosed both clinically and biochemically with phenylketonuria. Subjects were sequentially screened for single-base variants and exon deletions or duplications within PAH via direct Sanger sequencing and multiplex ligation-dependent probe amplification (MLPA). A spectrum of 174 distinct PAH variants was identified: 152 previously documented variants and 22 novel variants. While single-base variants were distributed throughout the 13 exons, they were particularly concentrated in exons 7 (33.3%), 11 (14.2%), 6 (13.2%), 12 (11.0%), 3 (10.4%), and 5 (4.4%). The predominant variant was p.Arg243Gln (17.7%), followed by Ex6-96A>G (8.3%), p.Val399= (6.4%), p.Arg53His (4.7%), p.Tyr356* (4.7%), p.Arg241Cys (4.6%), p.Arg413Pro (4.6%), p.Arg111* (4.4%), and c.442-1G>A (3.4%). Notably, two patients were also identified as carrying de novo variants. The composition of PAH gene variants in this Han population from Northern China was distinct from those of other ethnic groups. As such, the construction of a PAH gene variant database for Northern China is necessary to lay a foundation for genetic-based diagnoses, prenatal diagnoses, and population screening.

  16. A functional promoter variant of the human formimidoyltransferase cyclodeaminase (FTCD) gene is associated with working memory performance in young but not older adults.

    PubMed

    Greenwood, Pamela M; Schmidt, Kevin; Lin, Ming-Kuan; Lipsky, Robert; Parasuraman, Raja; Jankord, Ryan

    2018-06-21

    The central role of working memory in IQ and the high heritability of working memory performance motivated interest in identifying the specific genes underlying this heritability. The FTCD (formimidoyltransferase cyclodeaminase) gene was identified as a candidate gene for allelic association with working memory in part from genetic mapping studies of mouse Morris water maze performance. The present study tested variants of this gene for effects on a delayed match-to-sample task of a large sample of younger and older participants. The rs914246 variant, but not the rs914245 variant, of the FTCD gene modulated accuracy in the task for younger, but not older, people under high working memory load. The interaction of haplotype × distance × load had a partial eta squared effect size of 0.015. Analysis of simple main effects had partial eta squared effect sizes ranging from 0.012 to 0.040. A reporter gene assay revealed that the C allele of the rs914246 genotype is functional and a main factor regulating FTCD gene expression. This study extends previous work on the genetics of working memory by revealing that a gene in the glutamatergic pathway modulates working memory in young people but not in older people. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  17. The Israeli National Genetic database: a 10-year experience.

    PubMed

    Zlotogora, Joël; Patrinos, George P

    2017-03-16

    The Israeli National and Ethnic Mutation database (http://server.goldenhelix.org/israeli) was launched in September 2006 on the ETHNOS software to include clinically relevant genomic variants reported among Jewish and Arab Israeli patients. In 2016, the database was reviewed and corrected according to ClinVar (https://www.ncbi.nlm.nih.gov/clinvar) and ExAC (http://exac.broadinstitute.org) database entries. The present article summarizes some key aspects from the development and continuous update of the database over a 10-year period, which could serve as a paradigm of successful database curation for other similar resources. In September 2016, there were 2444 entries in the database, 890 among Jews, 1376 among Israeli Arabs, and 178 entries among Palestinian Arabs, corresponding to an ~4× data content increase compared to when originally launched. While the Israeli Arab population is much smaller than the Jewish population, the number of pathogenic variants causing recessive disorders reported in the database is higher among Arabs (934) than among Jews (648). Nevertheless, the number of pathogenic variants classified as founder mutations in the database is smaller among Arabs (175) than among Jews (192). In 2016, the entire database content was compared to that of other databases such as ClinVar and ExAC. We show a significant difference between the Jewish population (31.8%) and the Israeli Arab population (20.6%) in the percentage of pathogenic variants from the Israeli genetic database that were present in ExAC. The Israeli genetic database was launched in 2006 on the ETHNOS software and has been available online ever since. It allows querying the database according to the disorder and the ethnicity; however, many other features are not available, in particular the possibility to search according to the name of the gene. In addition, due to the technical limitations of the previous ETHNOS software, new features and data are not included in the present online version of the database, and an upgrade is currently ongoing.

  18. Efficient analysis of mouse genome sequences reveal many nonsense variants

    PubMed Central

    Steeland, Sophie; Timmermans, Steven; Van Ryckeghem, Sara; Hulpiau, Paco; Saeys, Yvan; Van Montagu, Marc; Vandenbroucke, Roosmarijn E.; Libert, Claude

    2016-01-01

    Genetic polymorphisms in coding genes play an important role when using mouse inbred strains as research models. They have been shown to influence research results, explain phenotypical differences between inbred strains, and increase the amount of interesting gene variants present in the many available inbred lines. SPRET/Ei is an inbred strain derived from Mus spretus that has ∼1% sequence difference with the C57BL/6J reference genome. We obtained a listing of all SNPs and insertions/deletions (indels) present in SPRET/Ei from the Mouse Genomes Project (Wellcome Trust Sanger Institute) and processed these data to obtain an overview of all transcripts having nonsynonymous coding sequence variants. We identified 8,883 unique variants affecting 10,096 different transcripts from 6,328 protein-coding genes, which is about 28% of all coding genes. Because only a subset of these variants results in drastic changes in proteins, we focused on variations that are nonsense mutations that ultimately resulted in a gain of a stop codon. These genes were identified by in silico changing the C57BL/6J coding sequences to the SPRET/Ei sequences, converting them to amino acid (AA) sequences, and comparing the AA sequences. All variants and transcripts affected were also stored in a database, which can be browsed using a SPRET/Ei M. spretus variants web tool (www.spretus.org), including a manual. We validated the tool by demonstrating the loss of function of three proteins predicted to be severely truncated, namely Fas, IRAK2, and IFNγR1. PMID:27147605
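
    The in silico step in this abstract (substituting strain-specific bases into the reference coding sequence, translating both versions, and comparing the amino acid sequences) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the 0-based SNP position format, and the toy sequence are assumptions made for illustration, and indels are not handled.

```python
from itertools import product

# Standard genetic code, built in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def apply_snps(cds, snps):
    """Return a copy of `cds` with SNPs applied.

    `snps` maps a 0-based position to the alternate base (hypothetical format).
    """
    seq = list(cds)
    for pos, alt in snps.items():
        seq[pos] = alt
    return "".join(seq)


def translate(cds):
    """Translate every complete codon of an uppercase CDS, stops included."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def premature_stop(ref_cds, alt_cds):
    """Return the 1-based codon number of a gained stop, or None."""
    ref_stop = translate(ref_cds).find("*")
    alt_stop = translate(alt_cds).find("*")
    if alt_stop != -1 and (ref_stop == -1 or alt_stop < ref_stop):
        return alt_stop + 1
    return None


reference = "ATGCAGTGGTAA"                # toy CDS: Met-Gln-Trp-stop
strain = apply_snps(reference, {3: "T"})  # CAG (Gln) becomes TAG (stop)
print(translate(reference), translate(strain), premature_stop(reference, strain))
```

    Comparing the position of the first stop codon, rather than the full protein strings, keeps the check focused on stop-gain events; a fuller comparison would also need to report missense changes and frame shifts.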

  19. Targeted Analysis of Whole Genome Sequence Data to Diagnose Genetic Cardiomyopathy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Golbus, Jessica R.; Puckelwartz, Megan J.; Dellefave-Castillo, Lisa

    Background: Cardiomyopathy is highly heritable but genetically diverse. At present, genetic testing for cardiomyopathy uses targeted sequencing to simultaneously assess the coding regions of more than 50 genes. New genes are routinely added to panels to improve the diagnostic yield. With the anticipated $1000 genome, it is expected that genetic testing will shift towards comprehensive genome sequencing accompanied by targeted gene analysis. Therefore, we assessed the reliability of whole genome sequencing and targeted analysis to identify cardiomyopathy variants in 11 subjects with cardiomyopathy. Methods and Results: Whole genome sequencing with an average of 37× coverage was combined with targeted analysis focused on 204 genes linked to cardiomyopathy. Genetic variants were scored using multiple prediction algorithms combined with frequency data from public databases. This pipeline yielded 1-14 potentially pathogenic variants per individual. Variants were further analyzed using clinical criteria and/or segregation analysis. Three of three previously identified primary mutations were detected by this analysis. In six subjects for whom the primary mutation was previously unknown, we identified mutations that segregated with disease, had clinical correlates, and/or had additional pathological correlation to provide evidence for causality. For two subjects with previously known primary mutations, we identified additional variants that may act as modifiers of disease severity. In total, we identified the likely pathological mutation in 9 of 11 (82%) subjects. We conclude that these pilot data demonstrate that ~30-40× coverage whole genome sequencing combined with targeted analysis is feasible and sensitive to identify rare variants in cardiomyopathy-associated genes.

  20. Characterization of Novel Missense Variants of SERPINA1 Gene Causing Alpha-1 Antitrypsin Deficiency.

    PubMed

    Matamala, Nerea; Lara, Beatriz; Gomez-Mariano, Gema; Martínez, Selene; Retana, Diana; Fernandez, Taiomara; Silvestre, Ramona Angeles; Belmonte, Irene; Rodriguez-Frias, Francisco; Vilar, Marçal; Sáez, Raquel; Iturbe, Igor; Castillo, Silvia; Molina-Molina, María; Texido, Anna; Tirado-Conde, Gema; Lopez-Campos, Jose Luis; Posada, Manuel; Blanco, Ignacio; Janciauskiene, Sabina; Martinez-Delgado, Beatriz

    2018-06-01

    The SERPINA1 gene is highly polymorphic, with more than 100 variants described in databases. SERPINA1 encodes the alpha-1 antitrypsin (AAT) protein, and severe deficiency of AAT is a major contributor to pulmonary emphysema and liver diseases. In Spanish patients with AAT deficiency, we identified seven new variants of the SERPINA1 gene involving amino acid substitutions in different exons: PiSDonosti (S+Ser14Phe), PiTijarafe (Ile50Asn), PiSevilla (Ala58Asp), PiCadiz (Glu151Lys), PiTarragona (Phe227Cys), PiPuerto Real (Thr249Ala), and PiValencia (Lys328Glu). We examined the characteristics of these variants and the putative association with the disease. Mutant proteins were overexpressed in HEK293T cells, and AAT expression, polymerization, degradation, and secretion, as well as antielastase activity, were analyzed by periodic acid-Schiff staining, Western blotting, pulse-chase, and elastase inhibition assays. When overexpressed, S+S14F, I50N, A58D, F227C, and T249A variants formed intracellular polymers and did not secrete AAT protein. Both the E151K and K328E variants secreted AAT protein and did not form polymers, although K328E showed intracellular retention and reduced antielastase activity. We conclude that deficient variants may be more frequent than previously thought and that their discovery is possible only by the complete sequencing of the gene and subsequent functional characterization. Better knowledge of SERPINA1 variants would improve diagnosis and management of individuals with AAT deficiency.

  1. A database of gene-environment interactions pertaining to blood lipid traits, cardiovascular disease and type 2 diabetes

    USDA-ARS's Scientific Manuscript database

    As the role of the environment – diet, exercise, alcohol and tobacco use and sleep among others – is accorded a more prominent role in modifying the relationship between genetic variants and clinical measures of disease, consideration of gene-environment (GxE) interactions is a must. To facilitate i...

  2. Databases in the Area of Pharmacogenetics

    PubMed Central

    Sim, Sarah C.; Altman, Russ B.; Ingelman-Sundberg, Magnus

    2012-01-01

    In the area of pharmacogenetics and personalized health care it is obvious that databases, providing important information of the occurrence and consequences of variant genes encoding drug metabolizing enzymes, drug transporters, drug targets, and other proteins of importance for drug response or toxicity, are of critical value for scientists, physicians, and industry. The primary outcome of the pharmacogenomic field is the identification of biomarkers that can predict drug toxicity and drug response, thereby individualizing and improving drug treatment of patients. The drug in question and the polymorphic gene exerting the impact are the main issues to be searched for in the databases. Here, we review the databases that provide useful information in this respect, of benefit for the development of the pharmacogenomic field. PMID:21309040

  3. Analysis of prostate-specific antigen transcripts in chimpanzees, cynomolgus monkeys, baboons, and African green monkeys.

    PubMed

    Mubiru, James N; Yang, Alice S; Olsen, Christian; Nayak, Sudhir; Livi, Carolina B; Dick, Edward J; Owston, Michael; Garcia-Forey, Magdalena; Shade, Robert E; Rogers, Jeffrey

    2014-01-01

    The function of prostate-specific antigen (PSA) is to liquefy the semen coagulum so that the released sperm can fuse with the ovum. Fifteen spliced variants of the PSA gene have been reported in humans, but little is known about alternative splicing in nonhuman primates. Positive selection has been reported in sex- and reproductive-related genes from sea urchins to Drosophila to humans; however, there are few studies of adaptive evolution of the PSA gene. Here, using polymerase chain reaction (PCR) product cloning and sequencing, we study PSA transcript variant heterogeneity in the prostates of chimpanzees (Pan troglodytes), cynomolgus monkeys (Macaca fascicularis), baboons (Papio hamadryas anubis), and African green monkeys (Chlorocebus aethiops). Six PSA variants were identified in the chimpanzee prostate, but only two variants were found in cynomolgus monkeys, baboons, and African green monkeys. In the chimpanzee the full-length transcript is expressed at the same magnitude as the transcripts that retain intron 3. We have found previously unidentified splice variants of the PSA gene, some of which might be linked to disease conditions. Selection on the PSA gene was studied in 11 primate species by computational methods using the sequences reported here for African green monkey, cynomolgus monkey, baboon, and chimpanzee and other sequences available in public databases. A codon-based analysis (dN/dS) of the PSA gene identified potential adaptive evolution at five residue sites (Arg45, Lys70, Gln144, Pro189, and Thr203).

  4. An ensemble rank learning approach for gene prioritization.

    PubMed

    Lee, Po-Feng; Soo, Von-Wun

    2013-01-01

    Several different computational approaches have been developed to solve the gene prioritization problem. We intend to use the ensemble boosting learning techniques to combine variant computational approaches for gene prioritization in order to improve the overall performance. In particular we add a heuristic weighting function to the Rankboost algorithm according to: 1) the absolute ranks generated by the adopted methods for a certain gene, and 2) the ranking relationship between all gene-pairs from each prioritization result. We select 13 known prostate cancer genes in OMIM database as training set and protein coding gene data in HGNC database as test set. We adopt the leave-one-out strategy for the ensemble rank boosting learning. The experimental results show that our ensemble learning approach outperforms the four gene-prioritization methods in ToppGene suite in the ranking results of the 13 known genes in terms of mean average precision, ROC and AUC measures.
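
    The evaluation measure named in this abstract, mean average precision, can be sketched in a few lines of Python. This illustrates the metric only, not the authors' RankBoost ensemble; the function names and the placeholder gene labels are assumptions.

```python
def average_precision(ranked_genes, known_genes):
    """Average precision of one ranking against a set of known genes.

    Known genes that never appear in the ranking contribute zero.
    """
    known = set(known_genes)
    if not known:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, gene in enumerate(ranked_genes, start=1):
        if gene in known:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / len(known)


def mean_average_precision(runs):
    """Mean of average precision over (ranking, known_genes) pairs."""
    scores = [average_precision(ranking, known) for ranking, known in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Placeholder labels, not real gene symbols.
ranking = ["g1", "g2", "g3", "g4"]
known = {"g1", "g3"}
print(round(average_precision(ranking, known), 4))
```

    In a leave-one-out setting each run has a single held-out gene, so its average precision reduces to the reciprocal of the rank at which that gene is recovered.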

  5. Targeted Analysis of Whole Genome Sequence Data to Diagnose Genetic Cardiomyopathy

    DOE PAGES

    Golbus, Jessica R.; Puckelwartz, Megan J.; Dellefave-Castillo, Lisa; ...

    2014-09-01

    Background: Cardiomyopathy is highly heritable but genetically diverse. At present, genetic testing for cardiomyopathy uses targeted sequencing to simultaneously assess the coding regions of more than 50 genes. New genes are routinely added to panels to improve the diagnostic yield. With the anticipated $1000 genome, it is expected that genetic testing will shift towards comprehensive genome sequencing accompanied by targeted gene analysis. Therefore, we assessed the reliability of whole genome sequencing and targeted analysis to identify cardiomyopathy variants in 11 subjects with cardiomyopathy. Methods and Results: Whole genome sequencing with an average of 37× coverage was combined with targeted analysis focused on 204 genes linked to cardiomyopathy. Genetic variants were scored using multiple prediction algorithms combined with frequency data from public databases. This pipeline yielded 1-14 potentially pathogenic variants per individual. Variants were further analyzed using clinical criteria and/or segregation analysis. Three of three previously identified primary mutations were detected by this analysis. In six subjects for whom the primary mutation was previously unknown, we identified mutations that segregated with disease, had clinical correlates, and/or had additional pathological correlation to provide evidence for causality. For two subjects with previously known primary mutations, we identified additional variants that may act as modifiers of disease severity. In total, we identified the likely pathological mutation in 9 of 11 (82%) subjects. We conclude that these pilot data demonstrate that ~30-40× coverage whole genome sequencing combined with targeted analysis is feasible and sensitive to identify rare variants in cardiomyopathy-associated genes.

  6. Complex phenotype of dyskeratosis congenita and mood dysregulation with novel homozygous RTEL1 and TPH1 variants.

    PubMed

    Ungar, Rachel A; Giri, Neelam; Pao, Maryland; Khincha, Payal P; Zhou, Weiyin; Alter, Blanche P; Savage, Sharon A

    2018-06-01

    Dyskeratosis congenita (DC) is an inherited bone marrow failure syndrome caused by germline mutations in telomere biology genes. Patients have extremely short telomeres for their age and a complex phenotype including oral leukoplakia, abnormal skin pigmentation, and dysplastic nails in addition to bone marrow failure, pulmonary fibrosis, stenosis of the esophagus, lacrimal ducts and urethra, developmental anomalies, and high risk of cancer. We evaluated a patient with features of DC, mood dysregulation, diabetes, and lack of pubertal development. Family history was not available but genome-wide genotyping was consistent with consanguinity. Whole exome sequencing identified 82 variants of interest in 80 genes based on the following criteria: homozygous, <0.1% minor allele frequency in public and in-house databases, nonsynonymous, and predicted deleterious by multiple in silico prediction programs. Six genes were identified likely contributory to the clinical presentation. The cause of DC is likely due to homozygous splice site variants in regulator of telomere elongation helicase 1, a known DC and telomere biology gene. A homozygous, missense variant in tryptophan hydroxylase 1 may be clinically important as this gene encodes the rate limiting step in serotonin biosynthesis, a biologic pathway connected with mood disorders. Four additional genes (SCN4A, LRP4, GDAP1L1, and SPTBN5) had rare, missense homozygous variants that we speculate may contribute to portions of the clinical phenotype. This case illustrates the value of conducting detailed clinical and genomic evaluations on rare patients in order to identify new areas of research into the functional consequences of rare variants and their contribution to human disease. © 2018 Wiley Periodicals, Inc.
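
    The four filtering criteria listed in this abstract (homozygous, minor allele frequency below 0.1%, nonsynonymous, predicted deleterious by multiple in silico programs) can be expressed as a simple predicate. The sketch below is an illustration under stated assumptions: the record fields, the set of consequence labels treated as nonsynonymous, and the threshold of two prediction tools for "multiple" are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str               # placeholder label
    zygosity: str           # "hom" or "het"
    max_maf: float          # highest minor allele frequency across databases
    consequence: str        # e.g. "missense", "synonymous", "splice_site"
    deleterious_calls: int  # tools predicting a damaging effect


# Assumed grouping; the study's exact definition is not given in the abstract.
NONSYNONYMOUS = {"missense", "nonsense", "frameshift", "splice_site"}


def passes_filter(v, maf_cutoff=0.001, min_tools=2):
    """Apply the four criteria: zygosity, rarity, consequence, prediction."""
    return (
        v.zygosity == "hom"
        and v.max_maf < maf_cutoff
        and v.consequence in NONSYNONYMOUS
        and v.deleterious_calls >= min_tools
    )


variants = [
    Variant("GENE_A", "hom", 0.0002, "splice_site", 3),  # kept
    Variant("GENE_B", "het", 0.0001, "missense", 4),     # heterozygous
    Variant("GENE_C", "hom", 0.0200, "missense", 4),     # too common
    Variant("GENE_D", "hom", 0.0000, "synonymous", 0),   # synonymous
    Variant("GENE_E", "hom", 0.0005, "missense", 1),     # one tool only
]
candidates = [v.gene for v in variants if passes_filter(v)]
print(candidates)
```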

  7. Mutation analysis of the COL1A1 and COL1A2 genes in Vietnamese patients with osteogenesis imperfecta.

    PubMed

    Ho Duy, Binh; Zhytnik, Lidiia; Maasalu, Katre; Kändla, Ivo; Prans, Ele; Reimann, Ene; Märtson, Aare; Kõks, Sulev

    2016-08-12

    The genetics of osteogenesis imperfecta (OI) have not been studied in a Vietnamese population before. We performed mutational analysis of the COL1A1 and COL1A2 genes in 91 unrelated OI patients of Vietnamese origin. We then systematically characterized the mutation profiles of these two genes which are most commonly related to OI. Genomic DNA was extracted from EDTA-preserved blood according to standard high-salt extraction methods. Sequence analysis and pathogenic variant identification was performed with Mutation Surveyor DNA variant analysis software. Prediction of the pathogenicity of mutations was conducted using Alamut Visual software. The presence of variants was checked against Dalgleish's osteogenesis imperfecta mutation database. The sample consisted of 91 unrelated osteogenesis imperfecta patients. We identified 54 patients with COL1A1/2 pathogenic variants; 33 with COL1A1 and 21 with COL1A2. Two patients had multiple pathogenic variants. Seventeen novel COL1A1 and 10 novel COL1A2 variants were identified. The majority of identified COL1A1/2 pathogenic variants occurred in a glycine substitution (36/56, 64.3%), usually serine (23/36, 63.9%). We found two pathogenic variants of the COL1A1 gene c.2461G>A (p.Gly821Ser) in four unrelated patients and one, c.2005G>A (p.Ala669Thr), in two unrelated patients. Our data showed a lower number of collagen OI pathogenic variants in Vietnamese patients compared to reported rates for Asian populations. The OI mutational profile of the Vietnamese population is unique and related to the presence of a high number of recessive mutations in non-collagenous OI genes. Further analysis of OI patients negative for collagen mutations is required.

  8. BISQUE: locus- and variant-specific conversion of genomic, transcriptomic and proteomic database identifiers.

    PubMed

    Meyer, Michael J; Geske, Philip; Yu, Haiyuan

    2016-05-15

    Biological sequence databases are integral to efforts to characterize and understand biological molecules and share biological data. However, when analyzing these data, scientists are often left holding disparate biological currency: molecular identifiers from different databases. For downstream applications that require converting the identifiers themselves, there are many resources available, but analyzing associated loci and variants can be cumbersome if data is not given in a form amenable to particular analyses. Here we present BISQUE, a web server and customizable command-line tool for converting molecular identifiers and their contained loci and variants between different database conventions. BISQUE uses a graph traversal algorithm to generalize the conversion process for residues in the human genome, genes, transcripts and proteins, allowing for conversion across classes of molecules and in all directions through an intuitive web interface and a URL-based web service. BISQUE is freely available via the web using any major web browser (http://bisque.yulab.org/). Source code is available in a public GitHub repository (https://github.com/hyulab/BISQUE). Contact: haiyuan.yu@cornell.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
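
    The abstract says only that BISQUE uses "a graph traversal algorithm"; it does not describe the graph. As a rough sketch of the general idea, the molecule classes can be treated as nodes and the available direct conversions as edges, with a breadth-first search finding the shortest chain of conversions. The edge set and function name below are assumptions for illustration and are not taken from the BISQUE source code.

```python
from collections import deque

# Assumed direct conversions between molecule classes (hypothetical layout).
EDGES = {
    "genome": ["gene"],
    "gene": ["genome", "transcript"],
    "transcript": ["gene", "protein"],
    "protein": ["transcript"],
}

def conversion_path(source, target):
    """Breadth-first search for the shortest chain of conversions."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in EDGES.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no route between the two classes

path = conversion_path("protein", "genome")
```

    Because the edges are listed in both directions, the same search handles conversion "in all directions", which is the property the abstract emphasizes.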

  9. De novo mutations in genes of mediator complex causing syndromic intellectual disability: mediatorpathy or transcriptomopathy?

    PubMed

    Caro-Llopis, Alfonso; Rosello, Monica; Orellana, Carmen; Oltra, Silvestre; Monfort, Sandra; Mayo, Sonia; Martinez, Francisco

    2016-12-01

    Mutations in the X-linked gene MED12 cause at least three different, but closely related, entities of syndromic intellectual disability. Recently, a new syndrome caused by MED13L deleterious variants has been described, which shows similar clinical manifestations including intellectual disability, hypotonia, and other congenital anomalies. Genotyping of 1,256 genes related with neurodevelopment was performed by next-generation sequencing in three unrelated patients and their healthy parents. Clinically relevant findings were confirmed by conventional sequencing. Each patient showed one de novo variant not previously reported in the literature or databases. Two different missense variants were found in the MED12 or MED13L genes and one nonsense mutation was found in the MED13L gene. The phenotypic consequences of these mutations are closely related and/or have been previously reported in one or other gene. Additionally, MED12 and MED13L code for two closely related partners of the mediator kinase module. Consequently, we propose the concept of a common MED12/MED13L clinical spectrum, encompassing Opitz-Kaveggia syndrome, Lujan-Fryns syndrome, Ohdo syndrome, MED13L haploinsufficiency syndrome, and others.

  10. Mutation databases and other online sites as a resource for transfusion medicine: history and attributes.

    PubMed

    Blumenfeld, Olga O

    2002-04-01

    Recent advances in molecular biology and technology have provided evidence, at a molecular level, for long-known observations that the human genome is not unique but is characterized by individual sequence variation. At the present time, documentation of genetic variation occurring in a large number of genes is increasing exponentially. The characterization of alleles that encode a variety of blood group antigens has been particularly fruitful for transfusion medicine. Phenotypic variation, as identified by the serologic study of blood group variants, is required to identify the presence of a variant allele. Many of the other alleles currently recorded have been selected and identified on the basis of inherited disease traits. New approaches document single nucleotide polymorphisms that occur throughout the genome and best show how the DNA sequence varies in the human population. The primary data dealing with variant alleles or more general genomic variation are scattered throughout the scientific literature and only within the last few years has information begun to be organized into databases. This article provides guidance on how to access those databases online as a source of information about genetic variation for purposes of molecular, clinical, and diagnostic medicine, research, and teaching. The attributes of the sites are described. A more detailed view of the database dealing specifically with alleles of genes encoding the blood group antigens includes a brief preliminary analysis of the molecular basis for observed polymorphisms. Other online sites that may be particularly useful to the transfusion medicine readership as well as a brief historical account are also presented. Copyright 2002, Elsevier Science (USA). All rights reserved.

  11. Gamma-aminobutyric acid A receptor, α-2 (GABRA2) variants as individual markers for alcoholism: a meta-analysis.

    PubMed

    Zintzaras, Elias

    2012-08-01

    The available evidence from the genetic association studies (GAS) published to date on the association between variants in the GABRA2 gene and alcoholism has produced inconclusive results. To interpret these results, a meticulous meta-analysis of all available studies was carried out. The PubMed database and the HuGE Navigator were searched for published GAS-related variants in the GABRA2 gene with susceptibility to alcoholism. Then, the GAS were synthesized to decrease the uncertainty of estimated genetic risk effects. The risk effects were estimated on the basis of the odds ratio (OR) of the allele contrast and the generalized odds ratio (OR(G)), a model-free approach. Cumulative and recursive cumulative meta-analyses (CMA) were also carried out to investigate the trend and stability of effect sizes as evidence accumulates. Fourteen variants investigated in eight studies were analyzed. Significant associations were derived for four variants either for the allele contrast or for the OR(G). In particular, the variants rs279858 and rs279845 showed marginal significance for OR(G): OR(G)=1.27 (1.01-1.60) and OR(G)=1.49 (1.02-2.19), respectively. Also, the variants rs567926 and rs279844 showed significance for the allele contrast: OR=1.24 (1.06-1.46) and OR=1.23 (1.08-1.43), respectively; the OR(G) produced similar results. The variant rs279858 produced a large heterogeneity between studies. CMA showed a trend of an association only for the variant rs567926. Recursive CMA indicated that more evidence is needed to conclude on the status of significance of all variants. There is evidence that variants in the GABRA2 gene are associated with alcoholism. However, the present findings should be interpreted with caution.
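
    For readers unfamiliar with the allele contrast, the odds ratio and its confidence interval for a single study can be computed from a 2×2 table of allele counts. The sketch below uses the standard log-based (Woolf) interval and made-up counts; it does not reproduce the pooled meta-analytic estimates or the generalized odds ratio reported in the abstract.

```python
import math

def allele_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other, z=1.96):
    """Odds ratio of the risk allele in cases vs. controls with a Woolf-type CI.

    Arguments are allele counts, not genotype counts. z=1.96 gives a 95% interval.
    """
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / case_risk + 1 / case_other + 1 / ctrl_risk + 1 / ctrl_other)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts, chosen only to illustrate the arithmetic.
odds_ratio, lower, upper = allele_odds_ratio(600, 400, 500, 500)
```

    An interval whose lower bound lies above 1, as in several of the results quoted above, is what the abstract means by a significant association.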

  12. Analysis of RNA-Seq datasets reveals enrichment of tissue-specific splice variants for nuclear envelope proteins.

    PubMed

    Capitanchik, Charlotte; Dixon, Charles; Swanson, Selene K; Florens, Laurence; Kerr, Alastair R W; Schirmer, Eric C

    2018-06-18

    Nuclear envelopathies/laminopathies yield tissue-specific pathologies, yet arise from mutation of ubiquitously-expressed genes. One possible explanation of this tissue specificity is that tissue-specific partners become disrupted from larger complexes, but a little investigated alternate hypothesis is that the mutated proteins themselves have tissue-specific splice variants. Here, we analyze RNA-Seq datasets to identify muscle-specific splice variants of nuclear envelope genes that could be relevant to the study of laminopathies, particularly muscular dystrophies, that are not currently annotated in sequence databases. Notably, we found novel isoforms or tissue-specificity of isoforms for: Lap2, linked to cardiomyopathy; Nesprin 2, linked to Emery-Dreifuss muscular dystrophy and Lmo7, a regulator of the emerin gene that is linked to Emery-Dreifuss muscular dystrophy. Interestingly, the muscle-specific exon in Lmo7 is rich in serine phosphorylation motifs, suggesting an important regulatory function. Evidence for muscle-specific splice variants in non-nuclear envelope proteins linked to other muscular dystrophies was also found. Tissue-specific variants were also indicated for several nucleoporins including Nup54, Nup133, Nup153 and Nup358/RanBP2. We confirmed expression of novel Lmo7 and RanBP2 variants with RT-PCR and found that specific knockdown of the Lmo7 variant caused a reduction in myogenic index during mouse C2C12 myogenesis. Global analysis revealed an enrichment of tissue-specific splice variants for nuclear envelope proteins in general compared to the rest of the genome, suggesting that splice variants contribute to regulating its tissue-specific functions.

  13. Identification of a novel valosin-containing protein polymorphism in late-onset Alzheimer's disease.

    PubMed

    Kaleem, M; Zhao, A; Hamshere, M; Myers, A J

    2007-01-01

    Recently, mutations in the valosin-containing protein gene (VCP) were found to be causative for a rare form of dementia [Watts GDJ, et al.: Nat Genet 2004;36:377-381]. This gene lies within a region on the genome that has been linked to late onset Alzheimer's disease (LOAD) [Myers A, et al.: Am J Med Genet 2002;114:233-242]. In this study, we investigated whether variation within VCP could account for the LOAD linkage peak on chromosome 9. We sequenced 188 individuals from the set of sibling pairs we had used to obtain the linkage results for chromosome 9 to look for novel polymorphisms that could explain the linkage signal. Any variant that was found was then typed in 2 additional sets of neuropathologically confirmed samples to look for associations with Alzheimer's disease. We found 2 variants when we sequenced VCP. One was a novel rare variant (R92H) and the other is already reported within the publicly available databases (rs10972300). Neither explained the chromosome 9 linkage signal for LOAD. We have found a novel rare variant within the VCP gene, but we did not find a variant that could explain the linkage signal for LOAD on chromosome 9. Copyright (c) 2007 S. Karger AG, Basel.

  14. Integrated sequence analysis pipeline provides one-stop solution for identifying disease-causing mutations.

    PubMed

    Hu, Hao; Wienker, Thomas F; Musante, Luciana; Kalscheuer, Vera M; Kahrizi, Kimia; Najmabadi, Hossein; Ropers, H Hilger

    2014-12-01

    Next-generation sequencing has greatly accelerated the search for disease-causing defects, but even for experts the data analysis can be a major challenge. To facilitate the data processing in a clinical setting, we have developed a novel medical resequencing analysis pipeline (MERAP). MERAP assesses the quality of sequencing, and has optimized capacity for calling variants, including single-nucleotide variants, insertions and deletions, copy-number variation, and other structural variants. MERAP identifies polymorphic and known causal variants by filtering against public domain databases, and flags nonsynonymous and splice-site changes. MERAP uses a logistic model to estimate the causal likelihood of a given missense variant. MERAP considers the relevant information such as phenotype and interaction with known disease-causing genes. MERAP compares favorably with GATK, one of the widely used tools, because of its higher sensitivity for detecting indels, its easy installation, and its economical use of computational resources. Upon testing more than 1,200 individuals with mutations in known and novel disease genes, MERAP proved highly reliable, as illustrated here for five families with disease-causing variants. We believe that the clinical implementation of MERAP will expedite the diagnostic process of many disease-causing defects. © 2014 WILEY PERIODICALS, INC.

  15. A Bioinformatics Approach to the Identification of Variants Associated with Type 1 and Type 2 Diabetes Mellitus that Reside in Functionally Validated miRNAs Binding Sites.

    PubMed

    Ghaedi, Hamid; Bastami, Milad; Jahani, Mohammad Mehdi; Alipoor, Behnam; Tabasinezhad, Maryam; Ghaderi, Omar; Nariman-Saleh-Fam, Ziba; Mirfakhraie, Reza; Movafagh, Abolfazl; Omrani, Mir Davood; Masotti, Andrea

    2016-06-01

    The present work is aimed at finding variants associated with Type 1 and Type 2 diabetes mellitus (DM) that reside in functionally validated miRNAs binding sites and that can have a functional role in determining diabetes and related pathologies. Using bioinformatics analyses we obtained a database of validated polymorphic miRNA binding sites which has been intersected with genes related to DM or to variants associated and/or in linkage disequilibrium (LD) with it and is reported in genome-wide association studies (GWAS). The workflow we followed allowed us to find variants associated with DM that also reside in functional miRNA binding sites. These data have been demonstrated to have a functional role by impairing the functions of genes implicated in biological processes linked to DM. In conclusion, our work emphasized the importance of SNPs located in miRNA binding sites. The results discussed in this work may constitute the basis of further works aimed at finding functional candidates and variants affecting protein structure and function, transcription factor binding sites, and non-coding epigenetic variants, contributing to widen the knowledge about the pathogenesis of this important disease.

  16. Identification of Susceptibility Loci and Genes for Colorectal Cancer Risk

    PubMed Central

    Zeng, Chenjie; Matsuda, Koichi; Jia, Wei-Hua; Chang, Jiang; Kweon, Sun-Seog; Xiang, Yong-Bing; Shin, Aesun; Jee, Sun Ha; Kim, Dong-Hyun; Zhang, Ben; Cai, Qiuyin; Guo, Xingyi; Long, Jirong; Wang, Nan; Courtney, Regina; Pan, Zhi-Zhong; Wu, Chen; Takahashi, Atsushi; Shin, Min-Ho; Matsuo, Keitaro; Matsuda, Fumihiko; Gao, Yu-Tang; Oh, Jae Hwan; Kim, Soriul; Jung, Keum Ji; Ahn, Yoon-Ok; Ren, Zefang; Li, Hong-Lan; Wu, Jie; Shi, Jiajun; Wen, Wanqing; Yang, Gong; Li, Bingshan; Ji, Bu-Tian; Brenner, Hermann; Schoen, Robert E.; Küry, Sébastien; Gruber, Stephen B.; Schumacher, Fredrick R.; Stenzel, Stephanie L.; Casey, Graham; Hopper, John L.; Jenkins, Mark A.; Kim, Hyeong-Rok; Jeong, Jin-Young; Park, Ji Won; Tajima, Kazuo; Cho, Sang-Hee; Kubo, Michiaki; Shu, Xiao-Ou; Lin, Dongxin; Zeng, Yi-Xin; Zheng, Wei

    2016-01-01

    Background & Aims: Known genetic factors explain only a small fraction of genetic variation in colorectal cancer (CRC). We conducted a genome-wide association study (GWAS) to identify risk loci for CRC. Methods: This discovery stage included 8027 cases and 22577 controls of East-Asian ancestry. Promising variants were evaluated in studies including as many as 11044 cases and 12047 controls. Tumor-adjacent normal tissues from 188 patients were analyzed to evaluate correlations of risk variants with expression levels of nearby genes. Potential functionality of risk variants was evaluated using public genomic and epigenomic databases. Results: We identified 4 loci associated with CRC risk; P values for the most significant variant in each locus ranged from 3.92×10^-8 to 1.24×10^-12: 6p21.1 (rs4711689), 8q23.3 (rs2450115, rs6469656), 10q24.3 (rs4919687), and 12p13.3 (rs11064437). We also identified 2 risk variants at loci previously associated with CRC: 10q25.2 (rs10506868) and 20q13.3 (rs6061231). These risk variants, conferring an approximate 10%–18% increase in risk per allele, are located either inside or near protein-coding genes that include TFEB (lysosome biogenesis and autophagy), EIF3H (initiation of translation), CYP17A1 (steroidogenesis), SPSB2 (proteasome degradation), and RPS21 (ribosome biogenesis). Gene expression analyses showed a significant association (P < .05) for rs4711689 with TFEB, rs6469656 with EIF3H, rs11064437 with SPSB2, and rs6061231 with RPS21. Conclusions: We identified susceptibility loci and genes associated with CRC risk, linking CRC predisposition to steroid hormone, protein synthesis and degradation, and autophagy pathways and providing added insight into the mechanism of CRC pathogenesis. PMID:26965516

  17. Comparative transcriptome analysis of three color variants of the sea cucumber Apostichopus japonicus.

    PubMed

    Jo, Jihoon; Park, Jongsun; Lee, Hyun-Gwan; Kern, Elizabeth M A; Cheon, Seongmin; Jin, Soyeong; Park, Joong-Ki; Cho, Sung-Jin; Park, Chungoo

    2016-08-01

    The sea cucumber Apostichopus japonicus Selenka 1867 represents an important resource in biomedical research, traditional medicine, and the seafood industry. Much of the commercial value of A. japonicus is determined by dorsal/ventral color variation (red, green, and black), yet the taxonomic relationships between these color variants are not clearly understood. We performed the first comparative analysis of de novo assembled transcriptome data from three color variants of A. japonicus. Using the Illumina platform, we sequenced nearly 177,596,774 clean reads representing a total of 18.2Gbp of sea cucumber transcriptome. A comparison of over 0.3 million transcript scaffolds against the Uniprot/Swiss-Prot database yielded 8513, 8602, and 8588 positive matches for green, red, and black body color transcriptomes, respectively. Using the Panther gene classification system, we assessed an extensive and diverse set of expressed genes in three color variants and found that (1) among the three color variants of A. japonicus, genes associated with RNA binding protein, oxidoreductase, nucleic acid binding, transferase, and KRAB box transcription factor were most commonly expressed; and (2) the main protein functional classes are differently regulated in all three color variants (extracellular matrix protein and phosphatase for green color, transporter and potassium channel for red color, and G-protein modulator and enzyme modulator for black color). This work will assist in the discovery and annotation of novel genes that play significant morphological and physiological roles in color variants of A. japonicus, and these sequence data will provide a useful set of resources for the rapidly growing sea cucumber aquaculture industry. Copyright © 2016 Elsevier B.V. All rights reserved.

  18. Integrated Enrichment Analysis of Variants and Pathways in Genome-Wide Association Studies Indicates Central Role for IL-2 Signaling Genes in Type 1 Diabetes, and Cytokine Signaling Genes in Crohn's Disease

    PubMed Central

    Carbonetto, Peter; Stephens, Matthew

    2013-01-01

    Pathway analyses of genome-wide association studies aggregate information over sets of related genes, such as genes in common pathways, to identify gene sets that are enriched for variants associated with disease. We develop a model-based approach to pathway analysis, and apply this approach to data from the Wellcome Trust Case Control Consortium (WTCCC) studies. Our method offers several benefits over existing approaches. First, our method not only interrogates pathways for enrichment of disease associations, but also estimates the level of enrichment, which yields a coherent way to promote variants in enriched pathways, enhancing discovery of genes underlying disease. Second, our approach allows for multiple enriched pathways, a feature that leads to novel findings in two diseases where the major histocompatibility complex (MHC) is a major determinant of disease susceptibility. Third, by modeling disease as the combined effect of multiple markers, our method automatically accounts for linkage disequilibrium among variants. Interrogation of pathways from eight pathway databases yields strong support for enriched pathways, indicating links between Crohn's disease (CD) and cytokine-driven networks that modulate immune responses; between rheumatoid arthritis (RA) and “Measles” pathway genes involved in immune responses triggered by measles infection; and between type 1 diabetes (T1D) and IL2-mediated signaling genes. Prioritizing variants in these enriched pathways yields many additional putative disease associations compared to analyses without enrichment. For CD and RA, 7 of 8 additional non-MHC associations are corroborated by other studies, providing validation for our approach. For T1D, prioritization of IL-2 signaling genes yields strong evidence for 7 additional non-MHC candidate disease loci, as well as suggestive evidence for several more. Of the 7 strongest associations, 4 are validated by other studies, and 3 (near IL-2 signaling genes RAF1, MAPK14, and FYN) constitute novel putative T1D loci for further study. PMID:24098138

  19. CNVinspector: a web-based tool for the interactive evaluation of copy number variations in single patients and in cohorts.

    PubMed

    Knierim, Ellen; Schwarz, Jana Marie; Schuelke, Markus; Seelow, Dominik

    2013-08-01

    Many genetic disorders are caused by copy number variations (CNVs) in the human genome. However, the large number of benign CNV polymorphisms makes it difficult to delineate causative variants for a certain disease phenotype. Hence, we set out to create software that accumulates and visualises locus-specific knowledge and enables clinicians to study their own CNVs in the context of known polymorphisms and disease variants. CNV data from healthy cohorts (Database of Genomic Variants) and from disease-related databases (DECIPHER) were integrated into a joint resource. Data are presented in an interactive web-based application that allows inspection, evaluation and filtering of CNVs in single individuals or in entire cohorts. CNVinspector provides simple interfaces to upload CNV data, compare them with own or published control data and visualise the results in graphical interfaces. Beyond choosing control data from different public studies, platforms and methods, dedicated filter options allow the detection of CNVs that are either enriched in patients or depleted in controls. Alternatively, a search can be restricted to those CNVs that appear in individuals of similar clinical phenotype. For each gene of interest within a CNV, we provide a link to NCBI, ENSEMBL and the GeneDistiller search engine to browse for potential disease-associated genes. With its user-friendly handling, the integration of control data and the filtering options, CNVinspector will facilitate the daily work of clinical geneticists and accelerate the delineation of new syndromes and gene functions. CNVinspector is freely accessible under http://www.cnvinspector.org.
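
    The idea of screening a patient's CNVs against control data can be illustrated with a small interval comparison. The sketch below keeps only patient CNVs that have no sufficiently similar CNV in a control set, using a reciprocal-overlap threshold. The 50% threshold, the tuple layout, and the coordinates are assumptions for illustration; the abstract does not state how CNVinspector matches CNVs.

```python
def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions for CNVs given as (chrom, start, end)."""
    if a[0] != b[0]:
        return 0.0
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return 0.0
    return min(shared / (a[2] - a[1]), shared / (b[2] - b[1]))

def not_in_controls(patient_cnvs, control_cnvs, threshold=0.5):
    """Keep patient CNVs with no control CNV reaching the overlap threshold."""
    return [
        p for p in patient_cnvs
        if all(reciprocal_overlap(p, c) < threshold for c in control_cnvs)
    ]

# Hypothetical coordinates.
patient = [("chr1", 1000, 5000), ("chr2", 200, 900)]
controls = [("chr1", 1200, 5200), ("chr2", 5000, 6000)]

rare = not_in_controls(patient, controls)
```

    Here the chr1 CNV is dropped because a control CNV covers 95% of it reciprocally, while the chr2 CNV has no overlapping control and is retained.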

  20. Comprehensive splicing functional analysis of DNA variants of the BRCA2 gene by hybrid minigenes

    PubMed Central

    2012-01-01

    Introduction: The underlying pathogenic mechanism of a large fraction of DNA variants of disease-causing genes is the disruption of the splicing process. We aimed to investigate the effect on splicing of the BRCA2 variants c.8488-1G > A (exon 20) and c.9026_9030del (exon 23), as well as 41 BRCA2 variants reported in the Breast Cancer Information Core (BIC) mutation database. Methods: DNA variants were analyzed with the splicing prediction programs NNSPLICE and Human Splicing Finder. Functional analyses of candidate variants were performed by lymphocyte RT-PCR and/or hybrid minigene assays. Forty-one BIC variants of exons 19, 20, 23 and 24 were bioinformatically selected and generated by PCR-mutagenesis of the wild type minigenes. Results: Lymphocyte RT-PCR of c.8488-1G > A showed intron 19 retention and a 12-nucleotide deletion in exon 20, whereas c.9026_9030del did not show any splicing anomaly. Minigene analysis of c.8488-1G > A displayed the aforementioned aberrant isoforms but also exon 20 skipping. We further evaluated the splicing outcomes of 41 variants of four BRCA2 exons by minigene analysis. Eighteen variants presented splicing aberrations. Most variants (78.9%) disrupted the natural splice sites, whereas four altered putative enhancers/silencers and had a weak effect. Fluorescent RT-PCR of minigenes accurately detected 14 RNA isoforms generated by cryptic site usage, exon skipping and intron retention events. Fourteen variants showed total splicing disruptions and were predicted to truncate or eliminate essential domains of BRCA2. Conclusions: A relevant proportion of BRCA2 variants are correlated with splicing disruptions, indicating that RNA analysis is a valuable tool to assess the pathogenicity of a particular DNA change. The minigene system is a straightforward and robust approach to detect variants with an impact on splicing and contributes to a better knowledge of this gene expression step. PMID:22632462

  1. MitBASE: a comprehensive and integrated mitochondrial DNA database. The present status

    PubMed Central

    Attimonelli, M.; Altamura, N.; Benne, R.; Brennicke, A.; Cooper, J. M.; D’Elia, D.; Montalvo, A. de; Pinto, B. de; De Robertis, M.; Golik, P.; Knoop, V.; Lanave, C.; Lazowska, J.; Licciulli, F.; Malladi, B. S.; Memeo, F.; Monnerot, M.; Pasimeni, R.; Pilbout, S.; Schapira, A. H. V.; Sloof, P.; Saccone, C.

    2000-01-01

    MitBASE is an integrated and comprehensive database of mitochondrial DNA data which collects, under a single interface, databases for Plant, Vertebrate, Invertebrate, Human, Protist and Fungal mtDNA and a Pilot database on nuclear genes involved in mitochondrial biogenesis in Saccharomyces cerevisiae. MitBASE reports all available information from different organisms and from intraspecies variants and mutants. Data have been drawn from the primary databases and from the literature; value-adding information has been structured, e.g., editing information on protist mtDNA genomes, pathological information for human mtDNA variants, etc. The different databases, some of which are structured using commercial packages (Microsoft Access, File Maker Pro) while others use a flat-file format, have been integrated under ORACLE. Ad hoc retrieval systems have been devised for some of the above listed databases, taking into account their peculiarities. The database is resident at the EBI and is available at the following site: http://www3.ebi.ac.uk/Research/Mitbase/mitbase.pl. The project is intended to serve both basic and applied research. The study of mitochondrial genetic diseases and mitochondrial DNA intraspecies diversity are key topics in several biotechnological fields. The database has been funded within the EU Biotechnology programme. PMID:10592207

  2. Factors influencing success of clinical genome sequencing across a broad spectrum of disorders

    PubMed Central

    Lise, Stefano; Broxholme, John; Cazier, Jean-Baptiste; Rimmer, Andy; Kanapin, Alexander; Lunter, Gerton; Fiddy, Simon; Allan, Chris; Aricescu, A. Radu; Attar, Moustafa; Babbs, Christian; Becq, Jennifer; Beeson, David; Bento, Celeste; Bignell, Patricia; Blair, Edward; Buckle, Veronica J; Bull, Katherine; Cais, Ondrej; Cario, Holger; Chapel, Helen; Copley, Richard R; Cornall, Richard; Craft, Jude; Dahan, Karin; Davenport, Emma E; Dendrou, Calliope; Devuyst, Olivier; Fenwick, Aimée L; Flint, Jonathan; Fugger, Lars; Gilbert, Rodney D; Goriely, Anne; Green, Angie; Greger, Ingo H.; Grocock, Russell; Gruszczyk, Anja V; Hastings, Robert; Hatton, Edouard; Higgs, Doug; Hill, Adrian; Holmes, Chris; Howard, Malcolm; Hughes, Linda; Humburg, Peter; Johnson, David; Karpe, Fredrik; Kingsbury, Zoya; Kini, Usha; Knight, Julian C; Krohn, Jonathan; Lamble, Sarah; Langman, Craig; Lonie, Lorne; Luck, Joshua; McCarthy, Davis; McGowan, Simon J; McMullin, Mary Frances; Miller, Kerry A; Murray, Lisa; Németh, Andrea H; Nesbit, M Andrew; Nutt, David; Ormondroyd, Elizabeth; Oturai, Annette Bang; Pagnamenta, Alistair; Patel, Smita Y; Percy, Melanie; Petousi, Nayia; Piazza, Paolo; Piret, Sian E; Polanco-Echeverry, Guadalupe; Popitsch, Niko; Powrie, Fiona; Pugh, Chris; Quek, Lynn; Robbins, Peter A; Robson, Kathryn; Russo, Alexandra; Sahgal, Natasha; van Schouwenburg, Pauline A; Schuh, Anna; Silverman, Earl; Simmons, Alison; Sørensen, Per Soelberg; Sweeney, Elizabeth; Taylor, John; Thakker, Rajesh V; Tomlinson, Ian; Trebes, Amy; Twigg, Stephen RF; Uhlig, Holm H; Vyas, Paresh; Vyse, Tim; Wall, Steven A; Watkins, Hugh; Whyte, Michael P; Witty, Lorna; Wright, Ben; Yau, Chris; Buck, David; Humphray, Sean; Ratcliffe, Peter J; Bell, John I; Wilkie, Andrew OM; Bentley, David; Donnelly, Peter; McVean, Gilean

    2015-01-01

    To assess factors influencing the success of whole genome sequencing for mainstream clinical diagnosis, we sequenced 217 individuals from 156 independent cases across a broad spectrum of disorders in whom prior screening had identified no pathogenic variants. We quantified the number of candidate variants identified using different strategies for variant calling, filtering, annotation and prioritisation. We found that jointly calling variants across samples, filtering against both local and external databases, deploying multiple annotation tools and using familial transmission above biological plausibility contributed to accuracy. Overall, we identified disease causing variants in 21% of cases, rising to 34% (23/68) for Mendelian disorders and 57% (8/14) in trios. We also discovered 32 potentially clinically actionable variants in 18 genes unrelated to the referral disorder, though only four were ultimately considered reportable. Our results demonstrate the value of genome sequencing for routine clinical diagnosis, but also highlight many outstanding challenges. PMID:25985138

  3. Public variant databases: liability?

    PubMed

    Thorogood, Adrian; Cook-Deegan, Robert; Knoppers, Bartha Maria

    2017-07-01

    Public variant databases support the curation, clinical interpretation, and sharing of genomic data, thus reducing harmful errors or delays in diagnosis. As variant databases are increasingly relied on in the clinical context, there is concern that negligent variant interpretation will harm patients and attract liability. This article explores the evolving legal duties of laboratories, public variant databases, and physicians in clinical genomics and recommends a governance framework for databases to promote responsible data sharing. Genet Med advance online publication 15 December 2016.

  4. Report of a Novel SHOX Missense Variant in a Boy With Short Stature and His Mother With Leri–Weill Dyschondrosteosis

    PubMed Central

    Lucchetti, Laura; Prontera, Paolo; Mencarelli, Amedea; Sallicandro, Ester; Mencarelli, Annalisa; Cofini, Marta; Leonardi, Alberto; Stangoni, Gabriela; Penta, Laura; Esposito, Susanna

    2018-01-01

    Heterozygous mutations in the SHOX gene or in the upstream and downstream enhancer elements are associated with 2–22% of cases of idiopathic short stature (OMIM #300582) and with 60% of cases of Leri–Weill dyschondrosteosis (OMIM #127300) with which female subjects are generally more severely affected. Approximately 80–90% of SHOX pathogenic variants are deletions or duplications, and the remaining 10–20% are point mutations that primarily give rise to missense variants. The clinical interpretation of novel variants, particularly missense variants, can be challenging and can remain of uncertain significance. Here, we describe a novel missense variant (c.1044 G>T, p.Arg118Met) in a Moroccan boy with a disproportionately short stature and without any radiological traits or bone deformities and in his mother, who had a disproportionately short stature and a Madelung deformity. This variant has not been reported to date in the updated SHOX allelic variant or Human Gene Mutation Databases nor is it listed as a polymorphism in the ExAC browser, dbSNP, or 1000G. This mutation was predicted to be deleterious by three different bioinformatics tools since it modifies an amino acid in a highly conserved DNA-binding domain of the SHOX protein. Based on this evidence, the patient was treated with recombinant human growth hormone. PMID:29692759

  5. Changes in classification of genetic variants in BRCA1 and BRCA2.

    PubMed

    Kast, Karin; Wimberger, Pauline; Arnold, Norbert

    2018-02-01

    Classification of variants of unknown significance (VUS) in the breast cancer genes BRCA1 and BRCA2 changes with accumulating evidence for clinical relevance. In most cases down-staging towards neutral variants without clinical significance is possible. We searched the database of the German Consortium for Hereditary Breast and Ovarian Cancer (GC-HBOC) for changes in classification of genetic variants as an update to our earlier publication on genetic variants in the Centre of Dresden. Changes between 2015 and 2017 were recorded. In the group of variants of unclassified significance (VUS, Class 3, uncertain), only changes of classification towards neutral genetic variants were noted. In BRCA1, 25% of the Class 3 variants (n = 2/8) changed to Class 2 (likely benign) and Class 1 (benign). In BRCA2, in 50% of the Class 3 variants (n = 16/32), a change to Class 2 (n = 10/16) or Class 1 (n = 6/16) was observed. No change in classification was noted in Class 4 (likely pathogenic) and Class 5 (pathogenic) genetic variants in both genes. No up-staging from Class 1, Class 2 or Class 3 to more clinical significance was observed. All variants with a change in classification in our cohort were down-staged towards no clinical significance by a panel of experts of the German Consortium for Hereditary Breast and Ovarian Cancer (GC-HBOC). Prevention in families with Class 3 variants should be based on pedigree based risks and should not be guided by the presence of a VUS.

  6. Public variant databases: liability?

    PubMed Central

    Thorogood, Adrian; Cook-Deegan, Robert; Knoppers, Bartha Maria

    2017-01-01

    Public variant databases support the curation, clinical interpretation, and sharing of genomic data, thus reducing harmful errors or delays in diagnosis. As variant databases are increasingly relied on in the clinical context, there is concern that negligent variant interpretation will harm patients and attract liability. This article explores the evolving legal duties of laboratories, public variant databases, and physicians in clinical genomics and recommends a governance framework for databases to promote responsible data sharing. Genet Med advance online publication 15 December 2016. PMID:27977006

  7. Germline mutations in candidate predisposition genes in individuals with cutaneous melanoma and at least two independent additional primary cancers.

    PubMed

    Pritchard, Antonia L; Johansson, Peter A; Nathan, Vaishnavi; Howlie, Madeleine; Symmons, Judith; Palmer, Jane M; Hayward, Nicholas K

    2018-01-01

    While a number of autosomal dominant and autosomal recessive cancer syndromes have an associated spectrum of cancers, the prevalence and variety of cancer predisposition mutations in patients with multiple primary cancers have not been extensively investigated. An understanding of the variants predisposing to more than one cancer type could improve patient care, including screening and genetic counselling, as well as advancing the understanding of tumour development. A cohort of 57 patients ascertained due to their cutaneous melanoma (CM) diagnosis and with a history of two or more additional non-cutaneous independent primary cancer types were recruited for this study. Patient blood samples were assessed by whole exome or whole genome sequencing. We focussed on variants in 525 pre-selected genes, including 65 autosomal dominant and 31 autosomal recessive cancer predisposition genes, 116 genes involved in the DNA repair pathway, and 313 commonly somatically mutated in cancer. The same genes were analysed in exome sequence data from 1358 control individuals collected as part of non-cancer studies (UK10K). The identified variants were classified for pathogenicity using online databases, literature and in silico prediction tools. No known pathogenic autosomal dominant or previously described compound heterozygous mutations in autosomal recessive genes were observed in the multiple cancer cohort. Variants typically found somatically in haematological malignancies (in JAK1, JAK2, SF3B1, SRSF2, TET2 and TYK2) were present in lymphocyte DNA of patients with multiple primary cancers, all of whom had a history of haematological malignancy and cutaneous melanoma, as well as colorectal cancer and/or prostate cancer. Other potentially pathogenic variants were discovered in BUB1B, POLE2, ROS1 and DNMT3A. 
Compared to controls, multiple cancer cases had significantly more likely damaging mutations (nonsense, frameshift ins/del) in tumour suppressor and tyrosine kinase genes and higher overall burden of mutations in all cancer genes. We identified several pathogenic variants that likely predispose to at least one of the tumours in patients with multiple cancers. We additionally present evidence that there may be a higher burden of variants of unknown significance in 'cancer genes' in patients with multiple cancer types. Further screens of this nature need to be carried out to build evidence to show if the cancers observed in these patients form part of a cancer spectrum associated with single germline variants in these genes, whether multiple layers of susceptibility exist (oligogenic or polygenic), or if the occurrence of multiple different cancers is due to random chance.

  8. Broad phenotypes in heterozygous NR5A1 46,XY patients with a disorder of sex development: an oligogenic origin?

    PubMed

    Camats, Núria; Fernández-Cancio, Mónica; Audí, Laura; Schaller, André; Flück, Christa E

    2018-06-11

    SF-1/NR5A1 is a transcriptional regulator of adrenal and gonadal development. NR5A1 disease-causing variants cause disorders of sex development (DSD) and adrenal failure, but most affected individuals show a broad DSD/reproductive phenotype only. Most NR5A1 variants show in vitro pathogenic effects, but not when tested in heterozygote state together with wild-type NR5A1 as usually seen in patients. Thus, the genotype-phenotype correlation for NR5A1 variants remains an unsolved question. We analyzed heterozygous 46,XY SF-1/NR5A1 patients by whole exome sequencing and used an algorithm for data analysis based on selected project-specific DSD- and SF-1-related genes. The variants detected were evaluated for their significance in literature, databases and checked in silico using webtools. We identified 19 potentially deleterious variants (one to seven per patient) in 18 genes in four 46,XY DSD subjects carrying heterozygous NR5A1 disease-causing variants. We constructed a scheme of all these hits within the landscape of currently known genes involved in male sex determination and differentiation. Our results suggest that the broad phenotype in these heterozygous NR5A1 46,XY DSD subjects may well be explained by an oligogenic mode of inheritance, in which multiple hits, individually non-deleterious, may contribute to a DSD phenotype unique to each heterozygous SF-1/NR5A1 individual.

  9. Genic insights from integrated human proteomics in GeneCards.

    PubMed

    Fishilevich, Simon; Zimmerman, Shahar; Kohn, Asher; Iny Stein, Tsippi; Olender, Tsviya; Kolker, Eugene; Safran, Marilyn; Lancet, Doron

    2016-01-01

    GeneCards is a one-stop shop for searchable human gene annotations (http://www.genecards.org/). Data are automatically mined from ∼120 sources and presented in an integrated web card for every human gene. We report the application of recent advances in proteomics to enhance gene annotation and classification in GeneCards. First, we constructed the Human Integrated Protein Expression Database (HIPED), a unified database of protein abundance in human tissues, based on the publically available mass spectrometry (MS)-based proteomics sources ProteomicsDB, Multi-Omics Profiling Expression Database, Protein Abundance Across Organisms and The MaxQuant DataBase. The integrated database, residing within GeneCards, compares favourably with its individual sources, covering nearly 90% of human protein-coding genes. For gene annotation and comparisons, we first defined a protein expression vector for each gene, based on normalized abundances in 69 normal human tissues. This vector is portrayed in the GeneCards expression section as a bar graph, allowing visual inspection and comparison. These data are juxtaposed with transcriptome bar graphs. Using the protein expression vectors, we further defined a pairwise metric that helps assess expression-based pairwise proximity. This new metric for finding functional partners complements eight others, including sharing of pathways, gene ontology (GO) terms and domains, implemented in the GeneCards Suite. In parallel, we calculated proteome-based differential expression, highlighting a subset of tissues that overexpress a gene and subserving gene classification. This textual annotation allows users of VarElect, the suite's next-generation phenotyper, to more effectively discover causative disease variants. Finally, we define the protein-RNA expression ratio and correlation as yet another attribute of every gene in each tissue, adding further annotative information. 
The results constitute a significant enhancement of several GeneCards sections and help promote and organize the genome-wide structural and functional knowledge of the human proteome. Database URL:http://www.genecards.org/. © The Author(s) 2016. Published by Oxford University Press.

  10. CRIMEtoYHU: a new web tool to develop yeast-based functional assays for characterizing cancer-associated missense variants.

    PubMed

    Mercatanti, Alberto; Lodovichi, Samuele; Cervelli, Tiziana; Galli, Alvaro

    2017-12-01

    Evaluation of the functional impact of cancer-associated missense variants is more difficult than for protein-truncating mutations and consequently standard guidelines for the interpretation of sequence variants have been recently proposed. A number of algorithms and software products were developed to predict the impact of cancer-associated missense mutations on protein structure and function. Importantly, direct assessment of the variants using high-throughput functional assays using simple genetic systems can help in speeding up the functional evaluation of newly identified cancer-associated variants. We developed the web tool CRIMEtoYHU (CTY) to help geneticists in the evaluation of the functional impact of cancer-associated missense variants. Humans and the yeast Saccharomyces cerevisiae share thousands of protein-coding genes although they have diverged for a billion years. Therefore, yeast humanization can be helpful in deciphering the functional consequences of human genetic variants found in cancer and give information on the pathogenicity of missense variants. To humanize specific positions within yeast genes, human and yeast genes have to share functional homology. If a mutation in a specific residue is associated with a particular phenotype in humans, a similar substitution in the yeast counterpart may reveal its effect at the organism level. CTY simultaneously finds yeast homologous genes, identifies the corresponding variants and determines the transferability of human variants to yeast counterparts by assigning a reliability score (RS) that may be predictive for the validity of a functional assay. CTY analyzes newly identified mutations or retrieves mutations reported in the COSMIC database, provides information about the functional conservation between yeast and human and shows the mutation distribution in human genes. CTY analyzes also newly found mutations and aborts when no yeast homologue is found. 
Then, on the basis of the protein domain localization and functional conservation between yeast and human, the selected variants are ranked by the RS. The RS is assigned by an algorithm that computes functional data, type of mutation, chemistry of amino acid substitution and the degree of mutation transferability between human and yeast protein. Mutations giving a positive RS are highly transferable to yeast and, therefore, yeast functional assays will be more predictable. To validate the web application, we have analyzed 8078 cancer-associated variants located in 31 genes that have a yeast homologue. More than 50% of variants are transferable to yeast. Incidentally, 88% of all transferable mutations have a reliability score >0. Moreover, we analyzed by CTY 72 functionally validated missense variants located in yeast genes at positions corresponding to the human cancer-associated variants. All these variants gave a positive RS. To further validate CTY, we analyzed 3949 protein variants (with positive RS) by the predictive algorithm PROVEAN. This analysis shows that yeast-based functional assays will be more predictable for the variants with positive RS. We believe that CTY could be an important resource for the cancer research community by providing information concerning the functional impact of specific mutations, as well as for the design of functional assays useful for decision support in precision medicine. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  11. X-Linked and Autosomal Recessive Alport Syndrome: Pathogenic Variant Features and Further Genotype-Phenotype Correlations

    PubMed Central

    Savige, Judith; Storey, Helen; Il Cheong, Hae; Gyung Kang, Hee; Park, Eujin; Hilbert, Pascale; Persikov, Anton; Torres-Fernandez, Carmen; Ars, Elisabet; Torra, Roser; Hertz, Jens Michael; Thomassen, Mads; Shagam, Lev; Wang, Dongmao; Wang, Yanyan; Flinter, Frances; Nagel, Mato

    2016-01-01

    Alport syndrome results from mutations in the COL4A5 (X-linked) or COL4A3/COL4A4 (recessive) genes. This study examined 754 previously unpublished variants in these genes from individuals referred for genetic testing in 12 accredited diagnostic laboratories worldwide, in addition to all published COL4A5, COL4A3 and COL4A4 variants in the LOVD databases. It also determined genotype-phenotype correlations for variants where clinical data were available. Individuals were referred for genetic testing where Alport syndrome was suspected clinically or on biopsy (renal failure, hearing loss, retinopathy, lamellated glomerular basement membrane), variant pathogenicity was assessed using currently-accepted criteria, and variants were examined for gene location, and age at renal failure onset. Results were compared using Fisher’s exact test (DNA Stata). Altogether 754 new DNA variants were identified, an increase of 25%, predominantly in people of European background. Of the 1168 COL4A5 variants, 504 (43%) were missense mutations, 273 (23%) splicing variants, 73 (6%) nonsense mutations, 169 (14%) short deletions and 76 (7%) complex or large deletions. Only 135 of the 432 Gly residues in the collagenous sequence were substituted (31%), which means that fewer than 10% of all possible variants have been identified. Both missense and nonsense mutations in COL4A5 were not randomly distributed but more common at the 70 CpG sequences (p<10^-41 and p<0.001 respectively). Gly>Ala substitutions were underrepresented in all three genes (p<0.0001) probably because of an association with a milder phenotype. The average age at end-stage renal failure was the same for all mutations in COL4A5 (24.4 ± 7.8 years), COL4A3 (23.3 ± 9.3) and COL4A4 (25.4 ± 10.3) (COL4A5 and COL4A3, p = 0.45; COL4A5 and COL4A4, p = 0.55; COL4A3 and COL4A4, p = 0.41). For COL4A5, renal failure occurred sooner with non-missense than missense variants (p<0.01). For the COL4A3 and COL4A4 genes, age at renal failure occurred sooner with two non-missense variants (p = 0.08, and p = 0.01 respectively). Thus DNA variant characteristics that predict age at renal failure appeared to be the same for all three Alport genes. Founder mutations (with the pathogenic variant in at least 5 apparently unrelated individuals) were not necessarily associated with a milder phenotype. This study illustrates the benefits when routine diagnostic laboratories share and analyse their data. PMID:27627812
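    The percentages quoted in the abstract above can be rechecked from the counts it gives. The short Python sketch below does only that arithmetic; the variable and function names are illustrative and not taken from the study.

```python
# Counts quoted in the abstract above (COL4A5 variants by type).
TOTAL_COL4A5 = 1168
counts = {
    "missense": 504,
    "splicing": 273,
    "nonsense": 73,
    "short deletion": 169,
    "complex or large deletion": 76,
}


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Share of each variant type among all COL4A5 variants.
shares = {name: percent(n, TOTAL_COL4A5) for name, n in counts.items()}

# Share of collagenous-domain Gly residues with at least one substitution.
gly_share = percent(135, 432)

# Variants not itemised by type in the abstract.
unlisted = TOTAL_COL4A5 - sum(counts.values())

for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
print(f"Gly residues substituted: {gly_share:.1f}%")
print(f"variants not itemised by type: {unlisted}")
```

    The five itemised categories sum to 1095 of the 1168 variants, so 73 variants are not broken down by type in the abstract.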

  12. X-Linked and Autosomal Recessive Alport Syndrome: Pathogenic Variant Features and Further Genotype-Phenotype Correlations.

    PubMed

    Savige, Judith; Storey, Helen; Il Cheong, Hae; Gyung Kang, Hee; Park, Eujin; Hilbert, Pascale; Persikov, Anton; Torres-Fernandez, Carmen; Ars, Elisabet; Torra, Roser; Hertz, Jens Michael; Thomassen, Mads; Shagam, Lev; Wang, Dongmao; Wang, Yanyan; Flinter, Frances; Nagel, Mato

    2016-01-01

    Alport syndrome results from mutations in the COL4A5 (X-linked) or COL4A3/COL4A4 (recessive) genes. This study examined 754 previously unpublished variants in these genes from individuals referred for genetic testing in 12 accredited diagnostic laboratories worldwide, in addition to all published COL4A5, COL4A3 and COL4A4 variants in the LOVD databases. It also determined genotype-phenotype correlations for variants where clinical data were available. Individuals were referred for genetic testing where Alport syndrome was suspected clinically or on biopsy (renal failure, hearing loss, retinopathy, lamellated glomerular basement membrane), variant pathogenicity was assessed using currently-accepted criteria, and variants were examined for gene location, and age at renal failure onset. Results were compared using Fisher's exact test (DNA Stata). Altogether 754 new DNA variants were identified, an increase of 25%, predominantly in people of European background. Of the 1168 COL4A5 variants, 504 (43%) were missense mutations, 273 (23%) splicing variants, 73 (6%) nonsense mutations, 169 (14%) short deletions and 76 (7%) complex or large deletions. Only 135 of the 432 Gly residues in the collagenous sequence were substituted (31%), which means that fewer than 10% of all possible variants have been identified. Both missense and nonsense mutations in COL4A5 were not randomly distributed but more common at the 70 CpG sequences (p<10^-41 and p<0.001 respectively). Gly>Ala substitutions were underrepresented in all three genes (p<0.0001) probably because of an association with a milder phenotype. The average age at end-stage renal failure was the same for all mutations in COL4A5 (24.4 ± 7.8 years), COL4A3 (23.3 ± 9.3) and COL4A4 (25.4 ± 10.3) (COL4A5 and COL4A3, p = 0.45; COL4A5 and COL4A4, p = 0.55; COL4A3 and COL4A4, p = 0.41). For COL4A5, renal failure occurred sooner with non-missense than missense variants (p<0.01). For the COL4A3 and COL4A4 genes, age at renal failure occurred sooner with two non-missense variants (p = 0.08, and p = 0.01 respectively). Thus DNA variant characteristics that predict age at renal failure appeared to be the same for all three Alport genes. Founder mutations (with the pathogenic variant in at least 5 apparently unrelated individuals) were not necessarily associated with a milder phenotype. This study illustrates the benefits when routine diagnostic laboratories share and analyse their data.

  13. [Phenotypic and genotypic spectra of patients with glucose-6-phosphate dehydrogenase deficiency gene known pathogenic variants: a single-center study].

    PubMed

    Chen, X; Yang, L; Wang, H J; Wu, B B; Lu, Y L; Dong, X R; Zhou, W H

    2018-05-02

    Objective: To analyze the hotspots of known pathogenic disease-causing variants of glucose-6-phosphate dehydrogenase (G6PD) and the phenotype spectrum of neonatal patients with known pathogenic disease-causing variants of G6PD. Methods: The known pathogenic disease-causing variants of G6PD were collected from the Human Gene Mutation Database. Screening was performed for these variants among the 7,966 cases (2,357 neonatal, 5,609 non-neonatal) in the database of sequencing at Molecular Diagnosis Center, Children's Hospital of Fudan University. All these samples were from patients with a suspected genetic disorder. The database contained Whole Exon Sequencing data and Clinical Exon Sequencing data. We screened out the patients with known pathogenic disease-causing variants of G6PD, analyzed the hotspot of G6PD and the phenotype spectrum of neonatal patients with known pathogenic disease-causing variants of G6PD. Results: (1) Among the next generation sequencing data of the 7,966 samples, 86 samples (1.1%) were detected as positive for the known pathogenic disease-causing variants of G6PD (positive sample set). In the positive sample set, 51 patients (33 males, 18 females) were newborn babies. Forty-three patients (26 males, 17 females) had the enzyme activity data of G6PD. (2) Among the 86 samples, Arg463His, Arg459Leu, Leu342Phe, Val291Met were the leading 4 disease-causing variants found in 72 samples (84%). (3) Male neonatal patients with the same variants had the statistically significant differences in enzyme activity: among 13 patients with Arg463His, enzyme activity of 9 patients was ranked as grade Ⅲ, 1 case ranked as Ⅳ, 3 cases had no activity data; among 10 patients with Arg459Leu, enzyme activity of 4 patients was ranked as Ⅱ, 4 cases ranked as Ⅲ, 2 cases had no activity data; among 2 patients with His32Arg, enzyme activity of one patient was ranked as Ⅱ, another was Ⅲ. Male neonatal patients with the same mutation and enzyme activity also had the statistically significant differences in phenotype spectrum: among 9 patients with Arg463His and level Ⅲ enzyme activity, 6 presented hyperbilirubinemia, 2 met the criteria for exchange transfusion therapy, 2 showed hemolysis; among 4 patients with Arg459Leu and level Ⅱ enzyme activity, 3 presented hyperbilirubinemia; among 4 patients with Arg459Leu and level Ⅲ enzyme activity, 2 presented hyperbilirubinemia, 1 met the standard of exchange transfusion therapy; among 3 patients with Val291Met and level Ⅲ enzyme activity, 1 presented hyperbilirubinemia. Conclusions: Arg463His, Arg459Leu, Leu342Phe, Val291Met were the hotspot variants of G6PD. Patients with the same G6PD variants and sex present different phenotypes; patients with the same G6PD variants, sex and enzyme activity also present different phenotypes.

  14. The functional spectrum of low-frequency coding variation.

    PubMed

    Marth, Gabor T; Yu, Fuli; Indap, Amit R; Garimella, Kiran; Gravel, Simon; Leong, Wen Fung; Tyler-Smith, Chris; Bainbridge, Matthew; Blackwell, Tom; Zheng-Bradley, Xiangqun; Chen, Yuan; Challis, Danny; Clarke, Laura; Ball, Edward V; Cibulskis, Kristian; Cooper, David N; Fulton, Bob; Hartl, Chris; Koboldt, Dan; Muzny, Donna; Smith, Richard; Sougnez, Carrie; Stewart, Chip; Ward, Alistair; Yu, Jin; Xue, Yali; Altshuler, David; Bustamante, Carlos D; Clark, Andrew G; Daly, Mark; DePristo, Mark; Flicek, Paul; Gabriel, Stacey; Mardis, Elaine; Palotie, Aarno; Gibbs, Richard

    2011-09-14

    Rare coding variants constitute an important class of human genetic variation, but are underrepresented in current databases that are based on small population samples. Recent studies show that variants altering amino acid sequence and protein function are enriched at low variant allele frequency, 2 to 5%, but because of insufficient sample size it is not clear if the same trend holds for rare variants below 1% allele frequency. The 1000 Genomes Exon Pilot Project has collected deep-coverage exon-capture data in roughly 1,000 human genes, for nearly 700 samples. Although medical whole-exome projects are currently afoot, this is still the deepest reported sampling of a large number of human genes with next-generation technologies. According to the goals of the 1000 Genomes Project, we created effective informatics pipelines to process and analyze the data, and discovered 12,758 exonic SNPs, 70% of them novel, and 74% below 1% allele frequency in the seven population samples we examined. Our analysis confirms that coding variants below 1% allele frequency show increased population-specificity and are enriched for functional variants. This study represents a large step toward detecting and interpreting low frequency coding variation, clearly lays out technical steps for effective analysis of DNA capture data, and articulates functional and population properties of this important class of genetic variation.

  15. regSNPs: a strategy for prioritizing regulatory single nucleotide substitutions

    PubMed Central

    Teng, Mingxiang; Ichikawa, Shoji; Padgett, Leah R.; Wang, Yadong; Mort, Matthew; Cooper, David N.; Koller, Daniel L.; Foroud, Tatiana; Edenberg, Howard J.; Econs, Michael J.; Liu, Yunlong

    2012-01-01

    Motivation: One of the fundamental questions in genetic studies is to identify functional DNA variants that are responsible for a disease or phenotype of interest. Results from large-scale genetics studies, such as genome-wide association studies (GWAS), and the availability of high-throughput sequencing technologies provide opportunities in identifying causal variants. Despite the technical advances, informatics methodologies need to be developed to prioritize thousands of variants for potential causative effects. Results: We present regSNPs, an informatics strategy that integrates several established bioinformatics tools, for prioritizing regulatory SNPs, i.e. the SNPs in the promoter regions that potentially affect phenotype through changing transcription of downstream genes. Compared with existing tools, regSNPs has two distinct features. It considers degenerative features of binding motifs by calculating the differences on the binding affinity caused by the candidate variants and integrates potential phenotypic effects of various transcription factors. When tested by using the disease-causing variants documented in the Human Gene Mutation Database, regSNPs showed mixed performance on various diseases. regSNPs predicted three SNPs that can potentially affect bone density in a region detected in an earlier linkage study. Potential effects of one of the variants were validated using luciferase reporter assay. Contact: yunliu@iupui.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22611130

  16. Exome analysis of a family with Wolff-Parkinson-White syndrome identifies a novel disease locus.

    PubMed

    Bowles, Neil E; Jou, Chuanchau J; Arrington, Cammon B; Kennedy, Brett J; Earl, Aubree; Matsunami, Norisada; Meyers, Lindsay L; Etheridge, Susan P; Saarel, Elizabeth V; Bleyl, Steven B; Yost, H Joseph; Yandell, Mark; Leppert, Mark F; Tristani-Firouzi, Martin; Gruber, Peter J

    2015-12-01

    Wolff-Parkinson-White (WPW) syndrome is a common cause of supraventricular tachycardia that carries a risk of sudden cardiac death. To date, mutations in only one gene, PRKAG2, which encodes the 5'-AMP-activated protein kinase subunit γ-2, have been identified as causative for WPW. DNA samples from five members of a family with WPW were analyzed by exome sequencing. We applied recently designed prioritization strategies (VAAST/pedigree VAAST) coupled with an ontology-based algorithm (Phevor) that reduced the number of potentially damaging variants to 10: a variant in KCNE2 previously associated with Long QT syndrome was also identified. Of these 11 variants, only MYH6 p.E1885K segregated with the WPW phenotype in all affected individuals and was absent in 10 unaffected family members. This variant was predicted to be damaging by in silico methods and is not present in the 1,000 genome and NHLBI exome sequencing project databases. Screening of a replication cohort of 47 unrelated WPW patients did not identify other likely causative variants in PRKAG2 or MYH6. MYH6 variants have been identified in patients with atrial septal defects, cardiomyopathies, and sick sinus syndrome. Our data highlight the pleiotropic nature of phenotypes associated with defects in this gene. © 2015 Wiley Periodicals, Inc.

  17. Exome Analysis of a Family with Wolff–Parkinson–White Syndrome Identifies a Novel Disease Locus

    PubMed Central

    Bowles, Neil E.; Jou, Chuanchau J.; Arrington, Cammon B.; Kennedy, Brett J.; Earl, Aubree; Matsunami, Norisada; Meyers, Lindsay L.; Etheridge, Susan P.; Saarel, Elizabeth V.; Bleyl, Steven B.; Yost, H. Joseph; Yandell, Mark; Leppert, Mark F.; Tristani-Firouzi, Martin; Gruber, Peter J.

    2016-01-01

    Wolff–Parkinson–White (WPW) syndrome is a common cause of supraventricular tachycardia that carries a risk of sudden cardiac death. To date, mutations in only one gene, PRKAG2, which encodes the 5'-AMP-activated protein kinase subunit γ-2, have been identified as causative for WPW. DNA samples from five members of a family with WPW were analyzed by exome sequencing. We applied recently designed prioritization strategies (VAAST/pedigree VAAST) coupled with an ontology-based algorithm (Phevor) that reduced the number of potentially damaging variants to 10: a variant in KCNE2 previously associated with Long QT syndrome was also identified. Of these 11 variants, only MYH6 p.E1885K segregated with the WPW phenotype in all affected individuals and was absent in 10 unaffected family members. This variant was predicted to be damaging by in silico methods and is not present in the 1,000 genome and NHLBI exome sequencing project databases. Screening of a replication cohort of 47 unrelated WPW patients did not identify other likely causative variants in PRKAG2 or MYH6. MYH6 variants have been identified in patients with atrial septal defects, cardiomyopathies, and sick sinus syndrome. Our data highlight the pleiotropic nature of phenotypes associated with defects in this gene. PMID:26284702

  18. Assessing the impact of copy number variants on miRNA genes in autism by Monte Carlo simulation.

    PubMed

    Marrale, Maurizio; Albanese, Nadia Ninfa; Calì, Francesco; Romano, Valentino

    2014-01-01

    Autism Spectrum Disorders (ASDs) are childhood neurodevelopmental disorders with complex genetic origins. Previous studies have investigated the role of de novo Copy Number Variants (CNVs) and microRNAs as important but distinct etiological factors in ASD. We developed a novel computational procedure to assess the potential pathogenic role of microRNA genes overlapping de novo CNVs in ASD patients. Here we show that for chromosomes 1, 2 and 22 the actual number of miRNA loci affected by de novo CNVs in patients was significantly higher than that estimated by Monte Carlo simulation of random CNV events. Out of 24 miRNA genes over-represented in CNVs from these three chromosomes, only hsa-mir-4436b-1 and hsa-mir-4436b-2 have not been detected in CNVs from non-autistic subjects as reported in the Database of Genomic Variants. Altogether the results reported in this study represent a first step towards a full understanding of how dysregulated expression of the 24 miRNA genes affects neurodevelopment in autism. We also propose that the procedure used in this study can be effectively applied to CNVs/miRNA genes association data in other genomic disorders beyond autism.
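    The abstract above does not spell out its simulation procedure. The Python sketch below shows one generic way a Monte Carlo comparison of this kind can be set up: CNVs of given lengths are placed uniformly at random on a chromosome, the miRNA loci they cover are counted, and the observed count is compared with the simulated distribution. The uniform placement, the function names and every parameter are assumptions made for illustration, not the authors' method.

```python
import random


def count_hits(cnvs, mirna_positions):
    """Count miRNA loci that fall inside at least one CNV interval [start, end)."""
    return sum(
        any(start <= pos < end for start, end in cnvs)
        for pos in mirna_positions
    )


def simulate_p_value(chrom_length, cnv_lengths, mirna_positions,
                     observed_hits, n_iter=10000, seed=0):
    """Empirical one-sided p-value for observing at least `observed_hits` loci.

    Assumption: each simulated CNV keeps its real length but gets a start
    position drawn uniformly along the chromosome.
    """
    rng = random.Random(seed)
    at_least_as_extreme = 0
    for _ in range(n_iter):
        cnvs = []
        for length in cnv_lengths:
            start = rng.randrange(0, chrom_length - length + 1)
            cnvs.append((start, start + length))
        if count_hits(cnvs, mirna_positions) >= observed_hits:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_iter + 1)


if __name__ == "__main__":
    # Hypothetical toy inputs, chosen only to make the sketch runnable.
    p = simulate_p_value(
        chrom_length=1_000_000,
        cnv_lengths=[50_000, 20_000, 80_000],
        mirna_positions=[10_000, 250_000, 400_000, 900_000],
        observed_hits=3,
        n_iter=2000,
    )
    print(f"empirical p-value: {p:.4f}")
```

    A real analysis would also have to respect features such as centromeres, assembly gaps and the non-uniform distribution of CNV breakpoints, which this sketch ignores.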

  19. Clinical Interpretation and Implications of Whole-Genome Sequencing

    PubMed Central

    Dewey, Frederick E.; Grove, Megan E.; Pan, Cuiping; Goldstein, Benjamin A.; Bernstein, Jonathan A.; Chaib, Hassan; Merker, Jason D.; Goldfeder, Rachel L.; Enns, Gregory M.; David, Sean P.; Pakdaman, Neda; Ormond, Kelly E.; Caleshu, Colleen; Kingham, Kerry; Klein, Teri E.; Whirl-Carrillo, Michelle; Sakamoto, Kenneth; Wheeler, Matthew T.; Butte, Atul J.; Ford, James M.; Boxer, Linda; Ioannidis, John P. A.; Yeung, Alan C.; Altman, Russ B.; Assimes, Themistocles L.; Snyder, Michael; Ashley, Euan A.; Quertermous, Thomas

    2014-01-01

    IMPORTANCE Whole-genome sequencing (WGS) is increasingly applied in clinical medicine and is expected to uncover clinically significant findings regardless of sequencing indication. OBJECTIVES To examine coverage and concordance of clinically relevant genetic variation provided by WGS technologies; to quantitate inherited disease risk and pharmacogenomic findings in WGS data and resources required for their discovery and interpretation; and to evaluate clinical action prompted by WGS findings. DESIGN, SETTING, AND PARTICIPANTS An exploratory study of 12 adult participants recruited at Stanford University Medical Center who underwent WGS between November 2011 and March 2012. A multidisciplinary team reviewed all potentially reportable genetic findings. Five physicians proposed initial clinical follow-up based on the genetic findings. MAIN OUTCOMES AND MEASURES Genome coverage and sequencing platform concordance in different categories of genetic disease risk, person-hours spent curating candidate disease-risk variants, interpretation agreement between trained curators and disease genetics databases, burden of inherited disease risk and pharmacogenomic findings, and burden and interrater agreement of proposed clinical follow-up. RESULTS Depending on sequencing platform, 10% to 19% of inherited disease genes were not covered to accepted standards for single nucleotide variant discovery. Genotype concordance was high for previously described single nucleotide genetic variants (99%-100%) but low for small insertion/deletion variants (53%-59%). Curation of 90 to 127 genetic variants in each participant required a median of 54 minutes (range, 5-223 minutes) per genetic variant, resulted in moderate classification agreement between professionals (Gross κ, 0.52; 95%CI, 0.40-0.64), and reclassified 69%of genetic variants cataloged as disease causing in mutation databases to variants of uncertain or lesser significance. 
Two to 6 personal disease-risk findings were discovered in each participant, including 1 frameshift deletion in the BRCA1 gene implicated in hereditary breast and ovarian cancer. Physician review of sequencing findings prompted consideration of a median of 1 to 3 initial diagnostic tests and referrals per participant, with fair interrater agreement about the suitability of WGS findings for clinical follow-up (Fleiss κ, 0.24; P < .001). CONCLUSIONS AND RELEVANCE In this exploratory study of 12 volunteer adults, the use of WGS was associated with incomplete coverage of inherited disease genes, low reproducibility of detection of genetic variation with the highest potential clinical effects, and uncertainty about clinically reportable findings. In certain cases, WGS will identify clinically actionable genetic variants warranting early medical intervention. These issues should be considered when determining the role of WGS in clinical medicine. PMID:24618965
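
    The abstract above summarizes agreement among five physicians with a Fleiss κ. As a rough illustration of how that statistic is computed, here is a minimal sketch that assumes every item is rated by the same number of raters. The function name and the ratings are hypothetical and are not the study's data.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of rater labels.

    Assumes every item was rated by the same number of raters (at least 2).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    category_totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Proportion of agreeing rater pairs for this item.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )

    observed = agreement_sum / n_items
    total_ratings = n_items * n_raters
    # Agreement expected by chance from the overall category frequencies.
    expected = sum((t / total_ratings) ** 2 for t in category_totals.values())
    if expected == 1.0:
        return float("nan")  # only one category was ever used; kappa is undefined
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical example: 4 findings, 5 raters, "follow" vs "none".
    example = [
        ["follow", "follow", "follow", "none", "none"],
        ["none", "none", "none", "none", "follow"],
        ["follow", "none", "follow", "none", "follow"],
        ["none", "none", "none", "none", "none"],
    ]
    print(f"Fleiss kappa = {fleiss_kappa(example):.3f}")
```

    Values near 0 indicate agreement close to chance and values near 1 indicate near-perfect agreement, which is why the reported 0.24 is described as fair.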

  20. Estimated carrier frequency of creatine transporter deficiency in females in the general population using functional characterization of novel missense variants in the SLC6A8 gene.

    PubMed

    DesRoches, Caro-Lyne; Patel, Jaina; Wang, Peixiang; Minassian, Berge; Salomons, Gajja S; Marshall, Christian R; Mercimek-Mahmutoglu, Saadet

    2015-07-10

    Creatine transporter deficiency (CRTR-D) is an X-linked inherited disorder of creatine transport. All males and about 50% of females have intellectual disability or cognitive dysfunction. Creatine deficiency on brain proton magnetic resonance spectroscopy and elevated urinary creatine to creatinine ratio are important biomarkers. Mutations in the SLC6A8 gene occur de novo in 30% of males. Despite reports of high prevalence of CRTR-D in males with intellectual disability, there are no true prevalence studies in the general population. To determine carrier frequency of CRTR-D in the general population we studied the variants in the SLC6A8 gene reported in the Exome Variant Server database and performed functional characterization of missense variants. We also analyzed synonymous and intronic variants for their predicted pathogenicity using in silico analysis tools. Nine missense variants were functionally analyzed using transient transfection by site-directed mutagenesis with In-Fusion HD Cloning in HeLa cells. Creatine uptake was measured by liquid chromatography tandem mass spectrometry. The c.1654G>T (p.Val552Leu) variant showed low residual creatine uptake activity of 35% of wild type transfected HeLa cells and was classified as pathogenic. Three variants (c.808G>A; p.Val270Met, c.942C>G; p.Phe314Leu and c.952G>A; p.Ala318Thr) were predicted to be pathogenic based on in silico analysis, but proved to be non-pathogenic by our functional analysis. The estimated carrier frequency of CRTR-D was 0.024% in females in the general population. We recommend functional studies for all novel missense variants by transient transfection followed by creatine uptake measurement by liquid chromatography tandem mass spectrometry as a fast and cost-effective method for the functional analysis of missense variants in the SLC6A8 gene. Crown Copyright © 2015. Published by Elsevier B.V. All rights reserved.
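
    The variants above are written in HGVS-style notation, such as c.1654G>T and p.Val552Leu. The sketch below shows one way such strings might be split into position, reference and alternate fields before looking them up in a variant database. It is a hypothetical helper that handles only simple coding substitutions and three-letter missense changes; intronic offsets, deletions, insertions and delins forms are deliberately left unparsed.

```python
import re

# Simple coding-DNA substitution, e.g. "c.1654G>T" (no intronic offsets).
CDNA_SUBSTITUTION = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")
# Three-letter missense change, e.g. "p.Val552Leu".
PROTEIN_MISSENSE = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_cdna_substitution(text):
    """Return (position, ref_base, alt_base), or None if the form is unsupported."""
    match = CDNA_SUBSTITUTION.match(text.strip())
    if match is None:
        return None
    position, ref, alt = match.groups()
    return int(position), ref, alt


def parse_protein_missense(text):
    """Return (ref_residue, position, alt_residue), or None if unsupported."""
    match = PROTEIN_MISSENSE.match(text.strip())
    if match is None:
        return None
    ref, position, alt = match.groups()
    return ref, int(position), alt


if __name__ == "__main__":
    for variant in ("c.1654G>T", "c.808G>A", "c.2437-5C>A"):
        print(variant, "->", parse_cdna_substitution(variant))
    print("p.Val552Leu ->", parse_protein_missense("p.Val552Leu"))
```

    Returning None for unsupported forms keeps the helper from silently misreading variants it was not written for.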

  1. Splice Site Variants in the KCNQ1 and SCN5A Genes: Transcript Analysis as a Tool in Supporting Pathogenicity

    PubMed Central

    Leong, Ivone U.S.; Dryland, Philippa A.; Prosser, Debra O.; Lai, Stella W.-S.; Graham, Mandy; Stiles, Martin; Crawford, Jackie; Skinner, Jonathan R.; Love, Donald R.

    2017-01-01

    Background Approximately 75% of clinically definite long QT syndrome (LQTS) cases are caused by mutations in the KCNQ1, KCNH2 and SCN5A genes. Of these mutations, a small proportion (3.2-9.2%) are predicted to affect splicing. These mutations present a particular challenge in ascribing pathogenicity. Methods Here we report an analysis of the transcriptional consequences of two mutations, one in the KCNQ1 gene (c.781_782delinsTC) and one in the SCN5A gene (c.2437-5C>A), which are predicted to affect splicing. We isolated RNA from lymphocytes and used a directed PCR amplification strategy of cDNA to show mis-spliced transcripts in mutation-positive patients. Results The loss of an exon in each mis-spliced transcript had no deduced effect on the translational reading frame. The clinical phenotype corresponded closely with genotypic status in family members carrying the KCNQ1 splice variant, but not in family members with the SCN5A splice variant. These results are put in the context of a literature review, where only 20% of all splice variants reported in the KCNQ1, KCNH2 and SCN5A gene entries in the HGMDPro 2015.4 database have been evaluated using transcriptional assays. Conclusions Prediction programmes play a strong role in most diagnostic laboratories in classifying variants located at splice sites; however, transcriptional analysis should be considered critical to confirm mis-splicing. Critically, this study shows that genuine mis-splicing may not always imply clinical significance, and genotype/phenotype cosegregation remains important even when mis-splicing is confirmed. PMID:28725320

  2. Targeted next generation sequencing of the entire vitamin D receptor gene reveals polymorphisms correlated with vitamin D deficiency among older Filipino women with and without fragility fracture.

    PubMed

    Zumaraga, Mark Pretzel; Medina, Paul Julius; Recto, Juan Miguel; Abrahan, Lauro; Azurin, Edelyn; Tanchoco, Celeste C; Jimeno, Cecilia A; Palmes-Saloma, Cynthia

    2017-03-01

    This study aimed to discover genetic variants in the entire 101-kb vitamin D receptor (VDR) gene for vitamin D deficiency in a group of postmenopausal Filipino women using a targeted next-generation sequencing (TNGS) approach in a case-control study design. A total of 50 women with and without osteoporotic fracture seen at the Philippine Orthopedic Center were included. Blood samples were collected for determination of serum vitamin D, calcium, phosphorus, glucose, blood urea nitrogen, creatinine, aspartate aminotransferase, alanine aminotransferase and as the primary source for targeted VDR gene sequencing using the Ion Torrent Personal Genome Machine. The variant calling was based on the GATK best practice workflow and annotated using the ANNOVAR tool. A total of 1496 unique variants in the whole 101-kb VDR gene were identified. Novel sequence variations not registered in the dbSNP database were found among cases and controls at a rate of 23.1% and 16.6% of total discovered variants, respectively. One disease-associated enhancer showed statistically significant association to low serum 25-hydroxy vitamin D levels (Pearson chi-square P-value=0.009). The transcription factor binding site prediction program PROMO predicted the disruption of three transcription factor binding sites in this enhancer region. These findings show the power of TNGS in identifying sequence variations in a very large gene and the surprising results obtained in this study greatly expand the catalog of known VDR sequence variants that may represent an important clue in the emergence of vitamin D deficiency. Such information will also provide the additional guidance necessary toward personalized nutritional advice to reach sufficient vitamin D status. Copyright © 2016 Elsevier Inc. All rights reserved.

  3. A benchmark study of scoring methods for non-coding mutations.

    PubMed

    Drubay, Damien; Gautheret, Daniel; Michiels, Stefan

    2018-05-15

    Detailed knowledge of coding sequences has led to different candidate models for pathogenic variant prioritization. Several deleteriousness scores have been proposed for the non-coding part of the genome, but no large-scale comparison has been realized to date to assess their performance. We compared the leading scoring tools (CADD, FATHMM-MKL, Funseq2 and GWAVA) and some recent competitors (DANN, SNP and SOM scores) for their ability to discriminate assumed pathogenic variants from assumed benign variants (using the ClinVar, COSMIC and 1000 genomes project databases). Using the ClinVar benchmark, CADD was the best tool for detecting the pathogenic variants that are mainly located in protein coding gene regions. Using the COSMIC benchmark, FATHMM-MKL, GWAVA and SOMliver outperformed the other tools for pathogenic variants that are typically located in lincRNAs, pseudogenes and other parts of the non-coding genome. However, all tools had low precision, which could potentially be improved by future non-coding genome feature discoveries. These results may have been influenced by the presence of potential benign variants in the COSMIC database. The development of a gold standard as consistent as ClinVar for these regions will be necessary to confirm our tool ranking. The Snakemake, C++ and R codes are freely available from https://github.com/Oncostat/BenchmarkNCVTools and supported on Linux. Contact: damien.drubay@gustaveroussy.fr or stefan.michiels@gustaveroussy.fr. Supplementary data are available at Bioinformatics online.

  4. Genetic polymorphisms associated with heart failure: A literature review.

    PubMed

    Guo, Mengqi; Guo, Guanlun; Ji, Xiaoping

    2016-02-01

    To review possible associations reported between genetic variants and the risk, therapeutic response and prognosis of heart failure. Electronic databases (PubMed, Web of Science and CNKI) were systematically searched for relevant papers, published between January 1995 and February 2015. Eighty-two articles covering 29 genes and 39 polymorphisms were identified. Genetic association studies of heart failure have been highly controversial. There may be interaction or synergism of several genetic variants that together result in the ultimate pathological phenotype for heart failure. © The Author(s) 2016.

  5. Genomic Approach to Understand the Association of DNA Repair with Longevity and Healthy Aging Using Genomic Databases of Oldest-Old Population

    PubMed Central

    Kim, Hyun Soo

    2018-01-01

    Aged population is increasing worldwide due to the aging process that is inevitable. Accordingly, longevity and healthy aging have been spotlighted to promote social contribution of aged population. Many studies in the past few decades have reported the process of aging and longevity, emphasizing the importance of maintaining genomic stability in exceptionally long-lived population. Underlying reason of longevity remains unclear due to its complexity involving multiple factors. With advances in sequencing technology and human genome-associated approaches, studies based on population-based genomic studies are increasing. In this review, we summarize recent longevity and healthy aging studies of human population focusing on DNA repair as a major factor in maintaining genome integrity. To keep pace with recent growth in genomic research, aging- and longevity-associated genomic databases are also briefly introduced. To suggest novel approaches to investigate longevity-associated genetic variants related to DNA repair using genomic databases, gene set analysis was conducted, focusing on DNA repair- and longevity-associated genes. Their biological networks were additionally analyzed to grasp major factors containing genetic variants of human longevity and healthy aging in DNA repair mechanisms. In summary, this review emphasizes DNA repair activity in human longevity and suggests approach to conduct DNA repair-associated genomic study on human healthy aging.

  6. Human Chromosome Y and Haplogroups; introducing YDHS Database.

    PubMed

    Tiirikka, Timo; Moilanen, Jukka S

    2015-12-01

    As high-throughput sequencing efforts generate more biological information, scientists from different disciplines are interpreting the polymorphisms that make us unique. In addition, there is an increasing trend among the general public to research their own genealogy, find distant relatives and know more about their biological background. Commercial vendors are providing analyses of mitochondrial and Y-chromosomal markers for such purposes. Clearly, an easy-to-use free interface to the existing data on the identified variants would be in the interest of the general public and professionals less familiar with the field. Here we introduce a novel metadatabase YDHS that aims to provide such an interface for Y-chromosomal DNA (Y-DNA) haplogroups and sequence variants. The database uses the ISOGG Y-DNA tree as the source of mutations and haplogroups and, by using genomic positions of the mutations, links them to genes and other biological entities. YDHS contains analysis tools for deeper Y-SNP analysis. YDHS addresses the shortage of Y-DNA related databases. We have tested our database using a set of different cases from literature ranging from infertility to autism. The database is at http://www.semanticgen.net/ydhs. Y-chromosomal DNA (Y-DNA) haplogroups and sequence variants have not been in the scientific limelight, excluding certain specialized fields like forensics, mainly because there is not much freely available information or it is scattered in different sources. However, as we have demonstrated, Y-SNPs do play a role in various cases on the haplogroup level and it is possible to create a free Y-DNA dedicated bioinformatics resource.

  7. SZDB: A Database for Schizophrenia Genetic Research

    PubMed Central

    Wu, Yong; Yao, Yong-Gang

    2017-01-01

    Schizophrenia (SZ) is a debilitating brain disorder with a complex genetic architecture. Genetic studies, especially recent genome-wide association studies (GWAS), have identified multiple variants (loci) conferring risk to SZ. However, how to efficiently extract meaningful biological information from bulk genetic findings of SZ remains a major challenge. There is a pressing need to integrate multiple layers of data from various sources, e.g., genetic findings from GWAS, copy number variations (CNVs), association and linkage studies, gene expression, protein–protein interaction (PPI), co-expression, expression quantitative trait loci (eQTL), and Encyclopedia of DNA Elements (ENCODE) data, to provide a comprehensive resource to facilitate the translation of genetic findings into SZ molecular diagnosis and mechanism study. Here we developed the SZDB database (http://www.szdb.org/), a comprehensive resource for SZ research. SZ genetic data, gene expression data, network-based data, brain eQTL data, and SNP function annotation information were systematically extracted, curated and deposited in SZDB. In-depth analyses and systematic integration were performed to identify top prioritized SZ genes and enriched pathways. Multiple types of data from various layers of SZ research were systematically integrated and deposited in SZDB. In-depth data analyses and integration identified top prioritized SZ genes and enriched pathways. We further showed that genes implicated in SZ are highly co-expressed in human brain and proteins encoded by the prioritized SZ risk genes are significantly interacted. The user-friendly SZDB provides high-confidence candidate variants and genes for further functional characterization. More important, SZDB provides convenient online tools for data search and browse, data integration, and customized data analyses. PMID:27451428

  8. Gene panel sequencing in familial breast/ovarian cancer patients identifies multiple novel mutations also in genes others than BRCA1/2.

    PubMed

    Kraus, Cornelia; Hoyer, Juliane; Vasileiou, Georgia; Wunderle, Marius; Lux, Michael P; Fasching, Peter A; Krumbiegel, Mandy; Uebe, Steffen; Reuter, Miriam; Beckmann, Matthias W; Reis, André

    2017-01-01

    Breast and ovarian cancer (BC/OC) predisposition has been attributed to a number of high- and moderate to low-penetrance susceptibility genes. With the advent of next generation sequencing (NGS) simultaneous testing of these genes has become feasible. In this monocentric study, we report results of panel-based screening of 14 BC/OC susceptibility genes (BRCA1, BRCA2, RAD51C, RAD51D, CHEK2, PALB2, ATM, NBN, CDH1, TP53, MLH1, MSH2, MSH6 and PMS2) in a group of 581 consecutive individuals from a German population with BC and/or OC fulfilling diagnostic criteria for BRCA1 and BRCA2 testing including 179 with a triple-negative tumor. Altogether we identified 106 deleterious mutations in 105 (18%) patients in 10 different genes, including seven different exon deletions. Of these 106 mutations, 16 (15%) were novel and only six were found in BRCA1/2. To further characterize mutations located in or nearby splicing consensus sites we performed RT-PCR analysis which allowed confirmation of pathogenicity in 7 of 9 mutations analyzed. In PALB2, we identified a deleterious variant in six cases. All but one were associated with early onset BC and a positive family history indicating that penetrance for PALB2 mutations is comparable to BRCA2. Overall, extended testing beyond BRCA1/2 identified a deleterious mutation in further 6% of patients. As a downside, 89 variants of uncertain significance were identified highlighting the need for comprehensive variant databases. In conclusion, panel testing yields more accurate information on genetic cancer risk than assessing BRCA1/2 alone and wide-spread testing will help improve penetrance assessment of variants in these risk genes. © 2016 UICC.

  9. Improved genetic counseling in Alport syndrome by new variants of COL4A5 gene.

    PubMed

    Fernandez-Rosado, Francisco; Campos, Ana; Alvarez-Cubero, Maria Jesus; Ruiz, Ana; Entrala-Bernal, Carmen

    2015-07-01

    Genetic databases are needed to offer better genetic assistance to patients with certain syndromes, especially those with X-linked inheritance patterns (such as Alport syndrome), because of the high probability of having descendants affected by the disease. We describe the first reported case of the COL4A5 gene missense c.1499G>T mutation in a 16-year-old girl confirmed to be affected by Alport syndrome after genetic counseling. Next-generation sequencing made it possible to discover this mutation and to offer this patient accurate clinical treatment. Current scientific understanding of genetic syndromes underlines the importance of updated databases and of including variants of unknown significance related to clinical cases. Such updating could give patients a better chance of diagnosis and of receiving genetic and clinical counseling. This is even more important for women planning to start a family, who need correct genetic counseling regarding the risk posed to offspring and to inform the decision to undergo prenatal testing. © 2015 Asian Pacific Society of Nephrology.

  10. Deciphering the colon cancer genes--report of the InSiGHT-Human Variome Project Workshop, UNESCO, Paris 2010.

    PubMed

    Kohonen-Corish, Maija R J; Macrae, Finlay; Genuardi, Maurizio; Aretz, Stefan; Bapat, Bharati; Bernstein, Inge T; Burn, John; Cotton, Richard G H; den Dunnen, Johan T; Frebourg, Thierry; Greenblatt, Marc S; Hofstra, Robert; Holinski-Feder, Elke; Lappalainen, Ilkka; Lindblom, Annika; Maglott, Donna; Møller, Pål; Morreau, Hans; Möslein, Gabriela; Sijmons, Rolf; Spurdle, Amanda B; Tavtigian, Sean; Tops, Carli M J; Weber, Thomas K; de Wind, Niels; Woods, Michael O

    2011-04-01

    The Human Variome Project (HVP) has established a pilot program with the International Society for Gastrointestinal Hereditary Tumours (InSiGHT) to compile all inherited variation affecting colon cancer susceptibility genes. An HVP-InSiGHT Workshop was held on May 10, 2010, prior to the HVP Integration and Implementation Meeting at UNESCO in Paris, to review the progress of this pilot program. A wide range of topics were covered, including issues relating to genotype-phenotype data submission to the InSiGHT Colon Cancer Gene Variant Databases (chromium.liacs.nl/LOVD2/colon_cancer/home.php). The meeting also canvassed the recent exciting developments in models to evaluate the pathogenicity of unclassified variants using in silico data, tumor pathology information, and functional assays, and made further plans for the future progress and sustainability of the pilot program. © 2011 Wiley-Liss, Inc.

  11. Association of the S267F variant on NTCP gene and treatment response to pegylated interferon in patients with chronic hepatitis B: a multicentre study.

    PubMed

    Thanapirom, Kessarin; Suksawatamnuay, Sirinporn; Sukeepaisarnjaroen, Wattana; Treeprasertsuk, Sombat; Tanwandee, Tawesak; Charatcharoenwitthaya, Phunchai; Thongsawat, Satawat; Leerapun, Apinya; Piratvisuth, Teerha; Boonsirichan, Rattana; Bunchorntavakul, Chalermrat; Pattanasirigool, Chaowalit; Pornthisarn, Bubpha; Tuntipanichteerakul, Supoj; Sripariwuth, Ekawee; Jeamsripong, Woramon; Sanpajit, Theeranun; Poovorawan, Yong; Komolmit, Piyawat

    2018-01-01

    Sodium taurocholate co-transporting polypeptide (NTCP) is a cell receptor for HBV. The S267F variant on the NTCP gene is inversely associated with the chronicity of HBV infection, progression to cirrhosis and hepatocellular carcinoma in East Asian populations. The aim of this study was to determine whether the S267F variant was associated with response to pegylated interferon (PEG-IFN) in patients with chronic HBV infection. A total of 257 patients with chronic HBV, treated with PEG-IFN for 48 weeks, were identified from 13 tertiary hospitals included in the hepatitis B database of the Thai Association for the Study of the Liver (THASL). Of these, 202 patients were infected with HBV genotype C (84.9%); 146 patients were hepatitis B e antigen (HBeAg)-positive (56.8%). Genotypic frequencies of the S267F polymorphism were 85.2%, 14.8% and 0% for the GG, GA and AA genotypes, respectively. S267F GA was associated with sustained alanine aminotransferase (ALT) normalization (OR = 3.25, 95% CI 1.23, 8.61; P=0.02) in HBeAg-positive patients. Patients with S267F variant tended to have more virological response, sustained response with hepatitis B surface antigen (HBsAg) loss at 24 weeks following PEG-IFN treatment. There was no association between the S267F variant and improved patient outcomes in HBeAg-negative patients. The S267F variant on the NTCP gene is independently associated with sustained normalization of ALT following treatment with PEG-IFN in patients with HBV infection who are HBeAg-positive. The findings of this study provide additional support for the clinical significance of the S267F variant of NTCP beyond HBV entry.
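
    The abstract above reports an odds ratio with its 95% confidence interval and a P value (OR = 3.25, 95% CI 1.23, 8.61; P=0.02). A quick way to check that such figures are mutually consistent is to recover the standard error on the log scale. The sketch below assumes a Wald interval symmetric in log(OR), which the abstract does not state, so treat it as a rough sanity check rather than a reconstruction of the authors' analysis.

```python
import math


def wald_check(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE, z and a two-sided P value from an OR and its 95% CI.

    Assumes the interval is exp(log(OR) +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    # For a log-symmetric interval the OR is the geometric mean of the bounds.
    geometric_mid = math.sqrt(ci_low * ci_high)
    return se, z, p_value, geometric_mid


if __name__ == "__main__":
    se, z, p_value, mid = wald_check(3.25, 1.23, 8.61)
    print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, P = {p_value:.3f}")
    print(f"geometric mean of CI bounds = {mid:.2f}")
```

    If the geometric mean of the bounds is far from the reported odds ratio, the interval was probably not computed on the log scale and this check does not apply.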

  12. CancerDR: cancer drug resistance database.

    PubMed

    Kumar, Rahul; Chaudhary, Kumardeep; Gupta, Sudheer; Singh, Harinder; Kumar, Shailesh; Gautam, Ankur; Kapoor, Pallavi; Raghava, Gajendra P S

    2013-01-01

    Cancer therapies are limited by the development of drug resistance, and mutations in drug targets are one of the main reasons for developing acquired resistance. Adequate knowledge of these mutations in drug targets would help to design effective personalized therapies. Keeping this in mind, we have developed a database, "CancerDR", which provides information on 148 anti-cancer drugs and their pharmacological profiling across 952 cancer cell lines. CancerDR provides comprehensive information about each drug target that includes: (i) sequence of natural variants, (ii) mutations, (iii) tertiary structure, and (iv) alignment profile of mutants/variants. A number of web-based tools have been integrated in CancerDR. This database will be very useful for identification of genetic alterations in genes encoding drug targets, and in turn the residues responsible for drug resistance. CancerDR allows users to identify promiscuous drug molecules that can kill a wide range of cancer cells. CancerDR is freely accessible at http://crdd.osdd.net/raghava/cancerdr/

  13. The GENCODE exome: sequencing the complete human exome

    PubMed Central

    Coffey, Alison J; Kokocinski, Felix; Calafato, Maria S; Scott, Carol E; Palta, Priit; Drury, Eleanor; Joyce, Christopher J; LeProust, Emily M; Harrow, Jen; Hunt, Sarah; Lehesjoki, Anna-Elina; Turner, Daniel J; Hubbard, Tim J; Palotie, Aarno

    2011-01-01

    Sequencing the coding regions, the exome, of the human genome is one of the major current strategies to identify low frequency and rare variants associated with human disease traits. So far, the most widely used commercial exome capture reagents have mainly targeted the consensus coding sequence (CCDS) database. We report the design of an extended set of targets for capturing the complete human exome, based on annotation from the GENCODE consortium. The extended set covers an additional 5594 genes and 10.3 Mb compared with the current CCDS-based sets. The additional regions include potential disease genes previously inaccessible to exome resequencing studies, such as 43 genes linked to ion channel activity and 70 genes linked to protein kinase activity. In total, the new GENCODE exome set developed here covers 47.9 Mb and performed well in sequence capture experiments. In the sample set used in this study, we identified over 5000 SNP variants more in the GENCODE exome target (24%) than in the CCDS-based exome sequencing. PMID:21364695

  14. Identification of Medically Actionable Secondary Findings in the 1000 Genomes

    PubMed Central

    Olfson, Emily; Cottrell, Catherine E.; Davidson, Nicholas O.; Gurnett, Christina A.; Heusel, Jonathan W.; Stitziel, Nathan O.; Chen, Li-Shiun; Hartz, Sarah; Nagarajan, Rakesh; Saccone, Nancy L.; Bierut, Laura J.

    2015-01-01

    The American College of Medical Genetics and Genomics (ACMG) recommends that clinical sequencing laboratories return secondary findings in 56 genes associated with medically actionable conditions. Our goal was to apply a systematic, stringent approach consistent with clinical standards to estimate the prevalence of pathogenic variants associated with such conditions using a diverse sequencing reference sample. Candidate variants in the 56 ACMG genes were selected from Phase 1 of the 1000 Genomes dataset, which contains sequencing information on 1,092 unrelated individuals from across the world. These variants were filtered using the Human Gene Mutation Database (HGMD) Professional version and defined parameters, appraised through literature review, and examined by a clinical laboratory specialist and expert physician. Over 70,000 genetic variants were extracted from the 56 genes, and filtering identified 237 variants annotated as disease causing by HGMD Professional. Literature review and expert evaluation determined that 7 of these variants were pathogenic or likely pathogenic. Furthermore, 5 additional truncating variants not listed as disease causing in HGMD Professional were identified as likely pathogenic. These 12 secondary findings are associated with diseases that could inform medical follow-up, including cancer predisposition syndromes, cardiac conditions, and familial hypercholesterolemia. The majority of the identified medically actionable findings were in individuals from the European (5/379) and Americas (4/181) ancestry groups, with fewer findings in Asian (2/286) and African (1/246) ancestry groups. Our results suggest that medically relevant secondary findings can be identified in approximately 1% (12/1092) of individuals in a diverse reference sample. 
As clinical sequencing laboratories continue to implement the ACMG recommendations, our results highlight that at least a small number of potentially important secondary findings can be selected for return. Our results also confirm that understudied populations will not reap proportionate benefits of genomic medicine, highlighting the need for continued research efforts on genetic diseases in these populations. PMID:26332594
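
    The per-ancestry counts quoted above (5/379, 4/181, 2/286 and 1/246) can be turned into proportions with a short calculation. The sketch below uses only the counts given in the abstract and adds Wilson score intervals, an analysis choice made here for illustration and not taken from the paper, to show how wide the uncertainty is when the numerators are this small.

```python
import math

# (findings, individuals) per ancestry group, as quoted in the abstract.
GROUPS = {
    "European": (5, 379),
    "Americas": (4, 181),
    "Asian": (2, 286),
    "African": (1, 246),
}


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (about 95% by default)."""
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return centre - half, centre + half


def summarise(groups):
    """Return (name, findings, n, proportion, ci_low, ci_high) per group."""
    rows = []
    for name, (k, n) in groups.items():
        low, high = wilson_interval(k, n)
        rows.append((name, k, n, k / n, low, high))
    return rows


if __name__ == "__main__":
    for name, k, n, p, low, high in summarise(GROUPS):
        print(f"{name:9s} {k}/{n} = {p:.2%} (Wilson interval {low:.2%} to {high:.2%})")
    total_k = sum(k for k, _ in GROUPS.values())
    total_n = sum(n for _, n in GROUPS.values())
    print(f"Overall   {total_k}/{total_n} = {total_k / total_n:.2%}")
```

    The group totals add up to the 12 findings in 1,092 individuals reported in the abstract, which is the source of the approximately 1% overall figure.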

  15. Pathogenic Variants in Complement Genes and Risk of Atypical Hemolytic Uremic Syndrome Relapse after Eculizumab Discontinuation.

    PubMed

    Fakhouri, Fadi; Fila, Marc; Provôt, François; Delmas, Yahsou; Barbet, Christelle; Châtelet, Valérie; Rafat, Cédric; Cailliez, Mathilde; Hogan, Julien; Servais, Aude; Karras, Alexandre; Makdassi, Raifah; Louillet, Feriell; Coindre, Jean-Philippe; Rondeau, Eric; Loirat, Chantal; Frémeaux-Bacchi, Véronique

    2017-01-06

    The complement inhibitor eculizumab has dramatically improved the outcome of atypical hemolytic uremic syndrome. However, the optimal duration of eculizumab treatment in atypical hemolytic uremic syndrome remains debated. We report on the French atypical hemolytic uremic syndrome working group's first 2-year experience with eculizumab discontinuation in patients with atypical hemolytic uremic syndrome. Using the French atypical hemolytic uremic syndrome registry database, we retrospectively identified all dialysis-free patients with atypical hemolytic uremic syndrome who discontinued eculizumab between 2010 and 2014 and reviewed their relevant clinical and biologic data. The decision to discontinue eculizumab was made by the clinician in charge of the patient. All patients were closely monitored by regular urine dipsticks and blood tests. Eculizumab was rapidly (24-48 hours) restarted in case of relapse. Among 108 patients treated with eculizumab, 38 patients (nine children and 29 adults) discontinued eculizumab (median treatment duration of 17.5 months). Twenty-one patients (55%) carried novel or rare complement genes variants. Renal recovery under eculizumab was equally good in patients with and those without complement gene variants detected. After a median follow-up of 22 months, 12 patients (31%) experienced atypical hemolytic uremic syndrome relapse. Eight of 11 patients (72%) with complement factor H variants, four of eight patients (50%) with membrane cofactor protein variants, and zero of 16 patients with no rare variant detected relapsed. In relapsing patients, early reintroduction (≤48 hours) of eculizumab led to rapid (<7 days) hematologic remission and a return of serum creatinine to baseline level in a median time of 26 days. At last follow-up, renal function remained unchanged in nonrelapsing and relapsing patients compared with baseline values before eculizumab discontinuation. 
Pathogenic variants in complement genes were associated with higher risk of atypical hemolytic uremic syndrome relapse after eculizumab discontinuation. Prospective studies are needed to identify biomarkers predictive of relapse and determine the best strategy of retreatment in relapsing patients. Copyright © 2016 by the American Society of Nephrology.

  16. Breeding and Genetics Symposium: networks and pathways to guide genomic selection.

    PubMed

    Snelling, W M; Cushman, R A; Keele, J W; Maltecca, C; Thomas, M G; Fortes, M R S; Reverter, A

    2013-02-01

    Many traits affecting profitability and sustainability of meat, milk, and fiber production are polygenic, with no single gene having an overwhelming influence on observed variation. No knowledge of the specific genes controlling these traits has been needed to make substantial improvement through selection. Significant gains have been made through phenotypic selection enhanced by pedigree relationships and continually improving statistical methodology. Genomic selection, recently enabled by assays for dense SNP located throughout the genome, promises to increase selection accuracy and accelerate genetic improvement by emphasizing the SNP most strongly correlated to phenotype although the genes and sequence variants affecting phenotype remain largely unknown. These genomic predictions theoretically rely on linkage disequilibrium (LD) between genotyped SNP and unknown functional variants, but familial linkage may increase effectiveness when predicting individuals related to those in the training data. Genomic selection with functional SNP genotypes should be less reliant on LD patterns shared by training and target populations, possibly allowing robust prediction across unrelated populations. Although the specific variants causing polygenic variation may never be known with certainty, a number of tools and resources can be used to identify those most likely to affect phenotype. Associations of dense SNP genotypes with phenotype provide a 1-dimensional approach for identifying genes affecting specific traits; in contrast, associations with multiple traits allow defining networks of genes interacting to affect correlated traits. Such networks are especially compelling when corroborated by existing functional annotation and established molecular pathways. The SNP occurring within network genes, obtained from public databases or derived from genome and transcriptome sequences, may be classified according to expected effects on gene products. 
As illustrated by functionally informed genomic predictions being more accurate than naive whole-genome predictions of beef tenderness, coupling evidence from livestock genotypes, phenotypes, gene expression, and genomic variants with existing knowledge of gene functions and interactions may provide greater insight into the genes and genomic mechanisms affecting polygenic traits and facilitate functional genomic selection for economically important traits.

  17. Screening of whole genome sequences identified high-impact variants for stallion fertility.

    PubMed

    Schrimpf, Rahel; Gottschalk, Maren; Metzger, Julia; Martinsson, Gunilla; Sieme, Harald; Distl, Ottmar

    2016-04-14

    Stallion fertility is an economically important trait due to the increase of artificial insemination in horses. The availability of whole genome sequence data facilitates identification of rare high-impact variants contributing to stallion fertility. The aim of our study was to genotype rare high-impact variants retrieved from next-generation sequencing (NGS) data of 11 horses in order to unravel harmful genetic variants in large samples of stallions. Gene ontology (GO) terms and search results from public databases were used to obtain a comprehensive list of human and mouse genes predicted to participate in the regulation of male reproduction. The corresponding equine orthologous genes were searched in whole genome sequence data of seven stallions and four mares and filtered for high-impact genetic variants using SnpEff, SIFT and PolyPhen-2 software. All genetic variants with the missing homozygous mutant genotype were genotyped on 337 fertile stallions of 19 breeds using KASP genotyping assays or PCR-RFLP. Mixed linear model analysis was employed for an association analysis with de-regressed estimated breeding values of the paternal component of the pregnancy rate per estrus (EBV-PAT). We screened next-generation sequencing data of whole genomes from 11 horses for equine genetic variants in 1194 human and mouse genes involved in male fertility and linked through common gene ontology (GO) with male reproductive processes. Variants were filtered for high impact on protein structure and validated through SIFT and PolyPhen-2. Only genetic variants for which the homozygous mutant genotype was missing in the detection sample of 11 horses were followed up. After this filtering process, 17 single nucleotide polymorphisms (SNPs) were left. These SNPs were genotyped in 337 fertile stallions of 19 breeds using KASP genotyping assays or PCR-RFLP.
An association analysis in 216 Hanoverian stallions revealed a significant association of the splice-site disruption variant g.37455302G>A in NOTCH1 with the de-regressed estimated breeding values of the paternal component of the pregnancy rate per estrus (EBV-PAT). For 9 high-impact variants within the genes CFTR, OVGP1, FBXO43, TSSK6, PKD1, FOXP1, TCP11, SPATA31E1 and NOTCH1 (g.37453246G>C) absence of the homozygous mutant genotype in the validation sample of all 337 fertile stallions was obvious. Therefore, these variants were considered as potentially deleterious factors for stallion fertility. In conclusion, this study revealed 17 genetic variants with a predicted high damaging effect on protein structure and missing homozygous mutant genotype. The g.37455302G>A NOTCH1 variant was identified as a significant stallion fertility locus in Hanoverian stallions and further 9 candidate fertility loci with missing homozygous mutant genotypes were validated in a panel including 19 horse breeds. To our knowledge this is the first study in horses using next generation sequencing data to uncover strong candidate factors for stallion fertility.
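
The filtering rule described in this abstract (follow up only variants whose homozygous mutant genotype is missing from the sample) can be sketched as below. This is a minimal illustration, not the authors' pipeline: the variant names and genotype calls are invented, and the extra requirement that the mutant allele is seen at least once in heterozygous state is an assumption made here.

```python
from typing import Dict, List

# Hypothetical genotype calls, coded as alternate-allele counts per horse:
# 0 = homozygous reference, 1 = heterozygous, 2 = homozygous mutant.
# Variant names and values are invented for illustration only.
GENOTYPES: Dict[str, List[int]] = {
    "variant_A": [0, 1, 0, 0, 1, 0],  # mutant allele present, never homozygous
    "variant_B": [0, 2, 1, 0, 0, 1],  # one homozygous mutant horse
    "variant_C": [0, 0, 0, 0, 0, 0],  # mutant allele not observed at all
}


def lacks_homozygous_mutant(calls: List[int]) -> bool:
    """True if the mutant allele is observed but never in homozygous state."""
    seen_heterozygous = any(call == 1 for call in calls)
    no_homozygous_mutant = all(call != 2 for call in calls)
    return seen_heterozygous and no_homozygous_mutant


def candidate_variants(genotypes: Dict[str, List[int]]) -> List[str]:
    """Return the sorted names of variants that pass the filter."""
    return sorted(
        name for name, calls in genotypes.items() if lacks_homozygous_mutant(calls)
    )


print(candidate_variants(GENOTYPES))
```

With the toy data above, only `variant_A` passes: `variant_B` has a homozygous mutant carrier and `variant_C` never shows the mutant allele.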

  18. Association of Arrhythmia-Related Genetic Variants With Phenotypes Documented in Electronic Medical Records.

    PubMed

    Van Driest, Sara L; Wells, Quinn S; Stallings, Sarah; Bush, William S; Gordon, Adam; Nickerson, Deborah A; Kim, Jerry H; Crosslin, David R; Jarvik, Gail P; Carrell, David S; Ralston, James D; Larson, Eric B; Bielinski, Suzette J; Olson, Janet E; Ye, Zi; Kullo, Iftikhar J; Abul-Husn, Noura S; Scott, Stuart A; Bottinger, Erwin; Almoguera, Berta; Connolly, John; Chiavacci, Rosetta; Hakonarson, Hakon; Rasmussen-Torvik, Laura J; Pan, Vivian; Persell, Stephen D; Smith, Maureen; Chisholm, Rex L; Kitchner, Terrie E; He, Max M; Brilliant, Murray H; Wallace, John R; Doheny, Kimberly F; Shoemaker, M Benjamin; Li, Rongling; Manolio, Teri A; Callis, Thomas E; Macaya, Daniela; Williams, Marc S; Carey, David; Kapplinger, Jamie D; Ackerman, Michael J; Ritchie, Marylyn D; Denny, Joshua C; Roden, Dan M

    2016-01-05

    Large-scale DNA sequencing identifies incidental rare variants in established Mendelian disease genes, but the frequency of related clinical phenotypes in unselected patient populations is not well established. Phenotype data from electronic medical records (EMRs) may provide a resource to assess the clinical relevance of rare variants. To determine the clinical phenotypes from EMRs for individuals with variants designated as pathogenic by expert review in arrhythmia susceptibility genes. This prospective cohort study included 2022 individuals recruited for nonantiarrhythmic drug exposure phenotypes from October 5, 2012, to September 30, 2013, for the Electronic Medical Records and Genomics Network Pharmacogenomics project from 7 US academic medical centers. Variants in SCN5A and KCNH2, disease genes for long QT and Brugada syndromes, were assessed for potential pathogenicity by 3 laboratories with ion channel expertise and by comparison with the ClinVar database. Relevant phenotypes were determined from EMRs, with data available from 2002 (or earlier for some sites) through September 10, 2014. One or more variants designated as pathogenic in SCN5A or KCNH2. Arrhythmia or electrocardiographic (ECG) phenotypes defined by International Classification of Diseases, Ninth Revision (ICD-9) codes, ECG data, and manual EMR review. Among 2022 study participants (median age, 61 years [interquartile range, 56-65 years]; 1118 [55%] female; 1491 [74%] white), a total of 122 rare (minor allele frequency <0.5%) nonsynonymous and splice-site variants in 2 arrhythmia susceptibility genes were identified in 223 individuals (11% of the study cohort). Forty-two variants in 63 participants were designated potentially pathogenic by at least 1 laboratory or ClinVar, with low concordance across laboratories (Cohen κ = 0.26). 
An ICD-9 code for arrhythmia was found in 11 of 63 (17%) variant carriers vs 264 of 1959 (13%) of those without variants (difference, +4%; 95% CI, -5% to +13%; P = .35). In the 1270 (63%) with ECGs, corrected QT intervals were not different in variant carriers vs those without (median, 429 vs 439 milliseconds; difference, -10 milliseconds; 95% CI, -16 to +3 milliseconds; P = .17). After manual review, 22 of 63 participants (35%) with designated variants had any ECG or arrhythmia phenotype, and only 2 had corrected QT interval longer than 500 milliseconds. Among laboratories experienced in genetic testing for cardiac arrhythmia disorders, there was low concordance in designating SCN5A and KCNH2 variants as pathogenic. In an unselected population, the putatively pathogenic genetic variants were not associated with an abnormal phenotype. These findings raise questions about the implications of notifying patients of incidental genetic findings.
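
The carrier versus non-carrier comparison can be reproduced from the counts given in the abstract (11 of 63 and 264 of 1959). The sketch below computes the two proportions and their difference. The Wald-type 95% interval is an assumption made here, since the abstract does not say which interval method was used, so its bounds need not match the published ones exactly.

```python
import math
from typing import Tuple


def proportion_difference(
    x1: int, n1: int, x2: int, n2: int, z: float = 1.96
) -> Tuple[float, float, float, Tuple[float, float]]:
    """Return (p1, p2, p1 - p2, Wald confidence interval for the difference)."""
    p1 = x1 / n1
    p2 = x2 / n2
    diff = p1 - p2
    # Standard error of a difference between two independent proportions.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return p1, p2, diff, (diff - z * se, diff + z * se)


# Counts taken from the abstract: arrhythmia ICD-9 code in variant carriers
# (11 of 63) and in participants without designated variants (264 of 1959).
p_carriers, p_noncarriers, diff, (low, high) = proportion_difference(11, 63, 264, 1959)
print(f"carriers: {p_carriers:.0%}, non-carriers: {p_noncarriers:.0%}, difference: {diff:+.0%}")
print(f"approximate 95% interval (Wald, assumed method): {low:+.3f} to {high:+.3f}")
```

The interval spans zero, which is consistent with the abstract's conclusion that carriers did not differ detectably from non-carriers.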

  19. A Bioinformatics Workflow for Variant Peptide Detection in Shotgun Proteomics

    PubMed Central

    Li, Jing; Su, Zengliu; Ma, Ze-Qiang; Slebos, Robbert J. C.; Halvey, Patrick; Tabb, David L.; Liebler, Daniel C.; Pao, William; Zhang, Bing

    2011-01-01

    Shotgun proteomics data analysis usually relies on database search. However, commonly used protein sequence databases do not contain information on protein variants and thus prevent variant peptides and proteins from being identified. Including known coding variations in protein sequence databases could help alleviate this problem. Based on our recently published human Cancer Proteome Variation Database, we have created a protein sequence database that comprehensively annotates thousands of cancer-related coding variants collected in the Cancer Proteome Variation Database as well as noncancer-specific ones from the Single Nucleotide Polymorphism Database (dbSNP). Using this database, we then developed a data analysis workflow for variant peptide identification in shotgun proteomics. The high risk of false positive variant identifications was addressed by a modified false discovery rate estimation method. Analysis of colorectal cancer cell lines SW480, RKO, and HCT-116 revealed a total of 81 peptides that contain either noncancer-specific or cancer-related variations. Twenty-three out of 26 variants randomly selected from the 81 were confirmed by genomic sequencing. We further applied the workflow to data sets from three individual colorectal tumor specimens. A total of 204 distinct variant peptides were detected, and five carried known cancer-related mutations. Each individual showed a specific pattern of cancer-related mutations, suggesting potential use of this type of information for personalized medicine. Compatibility of the workflow has been tested with four popular database search engines including Sequest, Mascot, X!Tandem, and MyriMatch. In summary, we have developed a workflow that effectively uses existing genomic data to enable variant peptide detection in proteomics. PMID:21389108

  20. Systematic review and meta-analysis of candidate gene association studies of lower urinary tract symptoms in men.

    PubMed

    Cartwright, Rufus; Mangera, Altaf; Tikkinen, Kari A O; Rajan, Prabhakar; Pesonen, Jori; Kirby, Anna C; Thiagamoorthy, Ganesh; Ambrose, Chris; Gonzalez-Maffe, Juan; Bennett, Phillip R; Palmer, Tom; Walley, Andrew; Järvelin, Marjo-Riitta; Khullar, Vik; Chapple, Chris

    2014-10-01

    Although family studies have shown that male lower urinary tract symptoms (LUTS) are highly heritable, no systematic review exists of genetic polymorphisms tested for association with LUTS. Our objective was to systematically review and meta-analyze studies assessing candidate polymorphisms/genes tested for an association with LUTS, and to assess the strength, consistency, and potential for bias among pooled associations. A systematic search of the PubMed and HuGE databases as well as abstracts of major urologic meetings was performed through to January 2013. Case-control studies reporting genetic associations in men with LUTS were included. Reviewers independently and in duplicate screened titles, abstracts, and full texts to determine eligibility, abstracted data, and assessed the credibility of pooled associations according to the interim Venice criteria. Authors were contacted for clarifications if needed. Meta-analyses were performed for variants assessed in more than two studies. We identified 74 eligible studies containing data on 70 different genes. A total of 35 meta-analyses were performed with statistical significance in five (ACE, ELAC2, GSTM1, TERT, and VDR). The heterogeneity was high in three of these meta-analyses. The rs731236 variant of the vitamin D receptor had a protective effect for LUTS (odds ratio: 0.64; 95% confidence interval, 0.49-0.83) with moderate heterogeneity (I² = 27.2%). No evidence for publication bias was identified. Limitations include wide-ranging phenotype definitions for LUTS and limited power in most meta-analyses to detect smaller effect sizes. Few putative genetic risk variants have been reliably replicated across populations. We found consistent evidence of a reduced risk of LUTS associated with the common rs731236 variant of the vitamin D receptor gene in our meta-analyses. Combining the results from all previous studies of genetic variants that may cause urinary symptoms in men, we found significant variants in five genes. Only one, a variant of the vitamin D receptor, was consistently protective across different populations. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.
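
As a rough illustration of how a pooled odds ratio and the I² heterogeneity statistic quoted in this abstract are obtained, the sketch below uses inverse-variance fixed-effect pooling on the log odds ratio scale. The per-study values are hypothetical, and the choice of a fixed-effect model is an assumption, since the abstract does not state which model the authors used.

```python
import math
from typing import List, Tuple


def pool_fixed_effect(log_ors: List[float], ses: List[float]) -> Tuple[float, float, float]:
    """Return (pooled odds ratio, Cochran's Q, I-squared in percent)."""
    # Inverse-variance weights: more precise studies count for more.
    weights = [1.0 / se ** 2 for se in ses]
    pooled_log_or = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    # Cochran's Q measures the weighted spread of studies around the pooled value.
    q = sum(w * (y - pooled_log_or) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I-squared: share of variation beyond what chance alone would give.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return math.exp(pooled_log_or), q, i_squared


# Hypothetical per-study odds ratios and standard errors of the log odds ratio.
study_log_ors = [math.log(0.55), math.log(0.70), math.log(0.80)]
study_ses = [0.20, 0.25, 0.30]

pooled_or, q_stat, i2 = pool_fixed_effect(study_log_ors, study_ses)
print(f"pooled OR = {pooled_or:.2f}, Q = {q_stat:.2f}, I^2 = {i2:.1f}%")
```

When all studies report the same effect, Q is zero and I² is 0%; I² rises toward 100% as between-study spread exceeds what the standard errors would predict.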

  1. Distribution of gene mutations in sporadic congenital cataract in a Han Chinese population

    PubMed Central

    Li, Dan; Wang, Siying; Ye, Hongfei; Tang, Yating; Qiu, Xiaodi; Fan, Qi; Rong, Xianfang; Liu, Xin; Chen, Yuhong; Yang, Jin

    2016-01-01

    Purpose This study aimed to investigate the genetic effects underlying non-familial sporadic congenital cataract (SCC). Methods We collected DNA samples from 74 patients with SCC and 20 patients with traumatic cataract (TC) in an age-matched group and performed genomic sequencing of 61 lens-related genes with target region capture and next-generation sequencing (NGS). The suspected SCC variants were validated with MassARRAY and Sanger sequencing. DNA samples from 103 healthy subjects were used as additional controls in the confirmation examination. Results By filtering against common variants in public databases and those associated with TC cases, we identified 23 SCC-specific variants in 17 genes from 19 patients, which were predicted to be functional. These mutations were further confirmed by examination of the 103 healthy controls. Among the mutated genes, CRYBB3 had the highest mutation frequency, with mutations detected four times in four patients, followed by EPHA2, NHS, and WDR36, mutations of which were each detected two times in two patients. We observed that the four patients with CRYBB3 mutations had three different cataract phenotypes. Conclusions From this study, we conclude that SCC is clinically and genetically heterogeneous. This is the first study to report broad spectrum genotyping for patients with SCC. PMID:27307692

  2. VarioML framework for comprehensive variation data representation and exchange.

    PubMed

    Byrne, Myles; Fokkema, Ivo Fac; Lancaster, Owen; Adamusiak, Tomasz; Ahonen-Bishopp, Anni; Atlan, David; Béroud, Christophe; Cornell, Michael; Dalgleish, Raymond; Devereau, Andrew; Patrinos, George P; Swertz, Morris A; Taschner, Peter Em; Thorisson, Gudmundur A; Vihinen, Mauno; Brookes, Anthony J; Muilu, Juha

    2012-10-03

    Sharing of data about variation and the associated phenotypes is a critical need, yet variant information can be arbitrarily complex, making a single standard vocabulary elusive and re-formatting difficult. Complex standards have proven too time-consuming to implement. The GEN2PHEN project addressed these difficulties by developing a comprehensive data model for capturing biomedical observations, Observ-OM, and building the VarioML format around it. VarioML pairs a simplified open specification for describing variants, with a toolkit for adapting the specification into one's own research workflow. Straightforward variant data can be captured, federated, and exchanged with no overhead; more complex data can be described, without loss of compatibility. The open specification enables push-button submission to gene variant databases (LSDBs) e.g., the Leiden Open Variation Database, using the Cafe Variome data publishing service, while VarioML bidirectionally transforms data between XML and web-application code formats, opening up new possibilities for open source web applications building on shared data. A Java implementation toolkit makes VarioML easily integrated into biomedical applications. VarioML is designed primarily for LSDB data submission and transfer scenarios, but can also be used as a standard variation data format for JSON and XML document databases and user interface components. VarioML is a set of tools and practices improving the availability, quality, and comprehensibility of human variation information. It enables researchers, diagnostic laboratories, and clinics to share that information with ease, clarity, and without ambiguity.

  3. VarioML framework for comprehensive variation data representation and exchange

    PubMed Central

    2012-01-01

    Background Sharing of data about variation and the associated phenotypes is a critical need, yet variant information can be arbitrarily complex, making a single standard vocabulary elusive and re-formatting difficult. Complex standards have proven too time-consuming to implement. Results The GEN2PHEN project addressed these difficulties by developing a comprehensive data model for capturing biomedical observations, Observ-OM, and building the VarioML format around it. VarioML pairs a simplified open specification for describing variants, with a toolkit for adapting the specification into one's own research workflow. Straightforward variant data can be captured, federated, and exchanged with no overhead; more complex data can be described, without loss of compatibility. The open specification enables push-button submission to gene variant databases (LSDBs) e.g., the Leiden Open Variation Database, using the Cafe Variome data publishing service, while VarioML bidirectionally transforms data between XML and web-application code formats, opening up new possibilities for open source web applications building on shared data. A Java implementation toolkit makes VarioML easily integrated into biomedical applications. VarioML is designed primarily for LSDB data submission and transfer scenarios, but can also be used as a standard variation data format for JSON and XML document databases and user interface components. Conclusions VarioML is a set of tools and practices improving the availability, quality, and comprehensibility of human variation information. It enables researchers, diagnostic laboratories, and clinics to share that information with ease, clarity, and without ambiguity. PMID:23031277

  4. Assessment of epithelial sodium channel variants in nonwhite cystic fibrosis patients with non-diagnostic CFTR genotypes.

    PubMed

    Brennan, Marie-Luise; Pique, Lynn M; Schrijver, Iris

    2016-01-01

    Several lines of evidence suggest a role for the epithelial sodium channel (ENaC) in cystic fibrosis (CF). The purpose of our study was to assess the contribution of genetic variants in the ENaC subunits (α, β, γ) in nonwhite CF patients in whom CFTR molecular testing has been non-diagnostic. Samples were obtained from patients who were nonwhite and whose molecular CFTR testing did not identify two mutations. Sequencing of the SCNN1A, B, and G genes was performed and variants assessed for pathogenicity and association with CF using databases, protein and splice site mutation analysis software, and literature review. We identified four nonsynonymous amino acid variants in SCNN1A, three in SCNN1B and one in SCNN1G. There was no convincing evidence of pathogenicity. Whereas all have been reported in the dbSNP database, only p.Ala334Thr, p.Val573Ile, and p.Thr663Ala in SCNN1A, p.Gly442Val in SCNN1B and p.Gly183Ser in SCNN1G were previously reported in ENaC genetic studies of CF or CF-like patients. Synonymous substitutions were also observed but novel synonymous variants were not detected. There is no conclusive association of ENaC genetic variants with CF in nonwhite CF patients. Copyright © 2015 European Cystic Fibrosis Society. Published by Elsevier B.V. All rights reserved.

  5. Novel GREM1 Variations in Sub-Saharan African Patients With Cleft Lip and/or Cleft Palate.

    PubMed

    Gowans, Lord Jephthah Joojo; Oseni, Ganiyu; Mossey, Peter A; Adeyemo, Wasiu Lanre; Eshete, Mekonen A; Busch, Tamara D; Donkor, Peter; Obiri-Yeboah, Solomon; Plange-Rhule, Gyikua; Oti, Alexander A; Owais, Arwa; Olaitan, Peter B; Aregbesola, Babatunde S; Oginni, Fadekemi O; Bello, Seidu A; Audu, Rosemary; Onwuamah, Chika; Agbenorku, Pius; Ogunlewe, Mobolanle O; Abdur-Rahman, Lukman O; Marazita, Mary L; Adeyemo, A A; Murray, Jeffrey C; Butali, Azeez

    2018-05-01

    Cleft lip and/or cleft palate (CL/P) are congenital anomalies of the face and have multifactorial etiology, with both environmental and genetic risk factors playing crucial roles. Though at least 40 loci have attained genomewide significant association with nonsyndromic CL/P, these loci largely reside in noncoding regions of the human genome, and subsequent resequencing studies of neighboring candidate genes have revealed only a limited number of etiologic coding variants. The present study was conducted to identify etiologic coding variants in GREM1, a locus that has been shown to be largely associated with cleft of both lip and soft palate. We resequenced DNA from 397 sub-Saharan Africans with CL/P and 192 controls using Sanger sequencing. Following analyses of the sequence data, we observed 2 novel coding variants in GREM1. These variants were not found in the 192 African controls and have never been previously reported in any public genetic variant database that includes more than 5000 combined African and African American controls or from the CL/P literature. The novel variants include p.Pro164Ser in an individual with soft palate cleft only and p.Gly61Asp in an individual with bilateral cleft lip and palate. The proband with the p.Gly61Asp GREM1 variant is a van der Woude syndrome (VWS) case who also has an etiologic variant in the IRF6 gene. Our study demonstrated that there is a low number of etiologic coding variants in GREM1, confirming earlier suggestions that variants in regulatory elements may largely account for the association between this locus and CL/P.

  6. Identification of genomic variants putatively targeted by selection during dog domestication.

    PubMed

    Cagan, Alex; Blass, Torsten

    2016-01-12

    Dogs [Canis lupus familiaris] were the first animal species to be domesticated and continue to occupy an important place in human societies. Recent studies have begun to reveal when and where dog domestication occurred. While much progress has been made in identifying the genetic basis of phenotypic differences between dog breeds we still know relatively little about the genetic changes underlying the phenotypes that differentiate all dogs from their wild progenitors, wolves [Canis lupus]. In particular, dogs generally show reduced aggression and fear towards humans compared to wolves. Therefore, selection for tameness was likely a necessary prerequisite for dog domestication. With the increasing availability of whole-genome sequence data it is possible to try and directly identify the genetic variants contributing to the phenotypic differences between dogs and wolves. We analyse the largest available database of genome-wide polymorphism data in a global sample of dogs 69 and wolves 7. We perform a scan to identify regions of the genome that are highly differentiated between dogs and wolves. We identify putatively functional genomic variants that are segregating or at high frequency [Fst ≥ 0.75] for alternative alleles between dogs and wolves. A biological pathways analysis of the genes containing these variants suggests that there has been selection on the 'adrenaline and noradrenaline biosynthesis pathway', well known for its involvement in the fight-or-flight response. We identify 11 genes with putatively functional variants fixed for alternative alleles between dogs and wolves. The segregating variants in these genes are strong candidates for having been targets of selection during early dog domestication. We present the first genome-wide analysis of the different categories of putatively functional variants that are fixed or segregating at high frequency between a global sampling of dogs and wolves.
We find evidence that selection has been strongest around non-synonymous variants. Strong selection in the initial stages of dog domestication appears to have occurred on multiple genes involved in the fight-or-flight response, particularly in the catecholamine synthesis pathway. Different alleles in some of these genes have been associated with behavioral differences between modern dog breeds, suggesting an important role for this pathway at multiple stages in the domestication process.
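
The Fst threshold of 0.75 mentioned in this abstract can be made concrete with a small sketch. It uses the textbook heterozygosity-based definition of Fst for one biallelic site in two equally weighted populations, which is an assumption here and not necessarily the estimator the authors applied. The allele frequencies are invented for illustration.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Fst = (H_T - H_S) / H_T for one biallelic site and two equal-sized populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)                      # expected heterozygosity, pooled
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population value
    if h_total == 0:
        # The site is monomorphic in both populations, so there is no differentiation.
        return 0.0
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies (dogs, wolves) for three sites.
sites = {
    "fixed_difference": (1.0, 0.0),
    "high_frequency_difference": (0.95, 0.05),
    "shared_polymorphism": (0.5, 0.5),
}

for name, (dog_freq, wolf_freq) in sites.items():
    fst = fst_two_populations(dog_freq, wolf_freq)
    print(f"{name}: Fst = {fst:.2f}, passes 0.75 threshold: {fst >= 0.75}")
```

A site fixed for alternative alleles gives Fst = 1, while a polymorphism shared at equal frequency gives Fst = 0.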

  7. Reconstruction of a Functional Human Gene Network, with an Application for Prioritizing Positional Candidate Genes

    PubMed Central

    Franke, Lude; Bakel, Harm van; Fokkens, Like; de Jong, Edwin D.; Egmont-Petersen, Michael; Wijmenga, Cisca

    2006-01-01

    Most common genetic disorders have a complex inheritance and may result from variants in many genes, each contributing only weak effects to the disease. Pinpointing these disease genes within the myriad of susceptibility loci identified in linkage studies is difficult because these loci may contain hundreds of genes. However, in any disorder, most of the disease genes will be involved in only a few different molecular pathways. If we know something about the relationships between the genes, we can assess whether some genes (which may reside in different loci) functionally interact with each other, indicating a joint basis for the disease etiology. There are various repositories of information on pathway relationships. To consolidate this information, we developed a functional human gene network that integrates information on genes and the functional relationships between genes, based on data from the Kyoto Encyclopedia of Genes and Genomes, the Biomolecular Interaction Network Database, Reactome, the Human Protein Reference Database, the Gene Ontology database, predicted protein-protein interactions, human yeast two-hybrid interactions, and microarray coexpressions. We applied this network to interrelate positional candidate genes from different disease loci and then tested 96 heritable disorders for which the Online Mendelian Inheritance in Man database reported at least three disease genes. Artificial susceptibility loci, each containing 100 genes, were constructed around each disease gene, and we used the network to rank these genes on the basis of their functional interactions. By following up the top five genes per artificial locus, we were able to detect at least one known disease gene in 54% of the loci studied, representing a 2.8-fold increase over random selection. 
This suggests that our method can significantly reduce the cost and effort of pinpointing true disease genes in analyses of disorders for which numerous loci have been reported but for which most of the genes are unknown. PMID:16685651

  8. Examining rare and low-frequency genetic variants previously associated with lone or familial forms of atrial fibrillation in an electronic medical record system: a cautionary note.

    PubMed

    Weeke, Peter; Denny, Joshua C; Basterache, Lisa; Shaffer, Christian; Bowton, Erica; Ingram, Christie; Darbar, Dawood; Roden, Dan M

    2015-02-01

    Studies in individuals or small kindreds have implicated rare variants in 25 different genes in lone and familial atrial fibrillation (AF) using linkage and segregation analysis, functional characterization, and rarity in public databases. Here, we used a cohort of 20 204 patients of European or African ancestry with electronic medical records and exome chip data to compare the frequency of AF among carriers and noncarriers of these rare variants. The exome chip included 19 of 115 rare variants, in 9 genes, previously associated with lone or familial AF. Using validated algorithms querying a combination of clinical notes, structured billing codes, ECG reports, and procedure codes, we identified 1056 AF cases (>18 years) and 19 148 non-AF controls (>50 years) with available genotype data on the Illumina HumanExome BeadChip v.1.0 in the Vanderbilt electronic medical record-linked DNA repository, BioVU. Known correlations between AF and common variants at 4q25 were replicated. None of the 19 variants previously associated with AF were over-represented among AF cases (P>0.1 for all), and the frequency of variant carriers among non-AF controls was >0.1% for 14 of 19. Repeat analyses using non-AF controls aged >60 (n=14 904), >70 (n=9670), and >80 (n=4729) years did not influence these findings. Rare variants previously implicated in lone or familial forms of AF present on the exome chip are detected at low frequencies in a general population but are not associated with AF. These findings emphasize the need for caution when ascribing variants as pathogenic or causative. © 2014 American Heart Association, Inc.

  9. Functional analysis of regulatory single-nucleotide polymorphisms.

    PubMed

    Pampín, Sandra; Rodríguez-Rey, José C

    2007-04-01

    The identification of regulatory polymorphisms has become a key problem in human genetics. In the past few years there has been a conceptual change in the way in which regulatory single-nucleotide polymorphisms are studied. We review the new approaches and discuss how gene expression studies can contribute to a better knowledge of the genetics of common diseases. New techniques for the association of single-nucleotide polymorphisms with changes in gene expression have been recently developed. This, together with a more comprehensive use of the old in-vitro methods, has produced a great amount of genetic information. When added to current databases, it will help to design better tools for the detection of regulatory single-nucleotide polymorphisms. The identification of functional regulatory single-nucleotide polymorphisms cannot be done by the simple inspection of DNA sequence. In-vivo techniques, based on primer-extension, and the more recently developed 'haploChIP' allow the association of gene variants to changes in gene expression. Gene expression analysis by conventional in-vitro techniques is the only way to identify the functional consequences of regulatory single-nucleotide polymorphisms. The amount of information produced in the last few years will help to refine the tools for the future analysis of regulatory gene variants.

  10. Using diverse U.S. beef cattle genomes to identify missense mutations in EPAS1, a gene associated with high-altitude pulmonary hypertension

    USDA-ARS's Scientific Manuscript database

    The availability of whole genome sequence (WGS) data has made it possible to discover protein variants in silico. However, bovine WGS databases comprised of related influential sires from relatively few breeds tend to under represent the breadth of genetic diversity in U.S. beef cattle. Thus, our ...

  11. Whole exome sequencing of rare variants in EIF4G1 and VPS35 in Parkinson disease

    PubMed Central

    Nuytemans, Karen; Bademci, Guney; Inchausti, Vanessa; Dressen, Amy; Kinnamon, Daniel D.; Mehta, Arpit; Wang, Liyong; Züchner, Stephan; Beecham, Gary W.; Martin, Eden R.; Scott, William K.

    2013-01-01

    Objective: Recently, vacuolar protein sorting 35 (VPS35) and eukaryotic translation initiation factor 4 gamma 1 (EIF4G1) have been identified as 2 causal Parkinson disease (PD) genes. We used whole exome sequencing for rapid, parallel analysis of variations in these 2 genes. Methods: We performed whole exome sequencing in 213 patients with PD and 272 control individuals. Those rare variants (RVs) with <5% frequency in the exome variant server database and our own control data were considered for analysis. We performed joint gene-based tests for association using RVASSOC and SKAT (Sequence Kernel Association Test) as well as single-variant test statistics. Results: We identified 3 novel VPS35 variations that changed the coded amino acid (nonsynonymous) in 3 cases. Two variations were in multiplex families and neither segregated with PD. In EIF4G1, we identified 11 (9 nonsynonymous and 2 small indels) RVs including the reported pathogenic mutation p.R1205H, which segregated in all affected members of a large family, but also in 1 unaffected 86-year-old family member. Two additional RVs were found in isolated patients only. Whereas initial association studies suggested an association (p = 0.04) with all RVs in EIF4G1, subsequent testing in a second dataset for the driving variant (p.F1461) suggested no association between RVs in the gene and PD. Conclusions: We confirm that the specific EIF4G1 variation p.R1205H seems to be a strong PD risk factor, but is nonpenetrant in at least one 86-year-old. A few other select RVs in both genes could not be ruled out as causal. However, there was no evidence for an overall contribution of genetic variability in VPS35 or EIF4G1 to PD development in our dataset. PMID:23408866

  12. APPRIS 2017: principal isoforms for multiple gene sets

    PubMed Central

    Rodriguez-Rivas, Juan; Di Domenico, Tomás; Vázquez, Jesús; Valencia, Alfonso

    2018-01-01

    The APPRIS database (http://appris-tools.org) uses protein structural and functional features and information from cross-species conservation to annotate splice isoforms in protein-coding genes. APPRIS selects a single protein isoform, the ‘principal’ isoform, as the reference for each gene based on these annotations. A single main splice isoform reflects the biological reality for most protein-coding genes, and APPRIS principal isoforms are the best predictors of these main protein isoforms. Here, we present the updates to the database, new developments that include the addition of three new species (chimpanzee, Drosophila melanogaster and Caenorhabditis elegans), the expansion of APPRIS to cover the RefSeq gene set and the UniProtKB proteome for six species, and refinements in the core methods that make up the annotation pipeline. In addition, APPRIS now provides a measure of reliability for individual principal isoforms and updates with each release of the GENCODE/Ensembl and RefSeq reference sets. The individual GENCODE/Ensembl, RefSeq and UniProtKB reference gene sets for six organisms have been merged to produce common sets of splice variants. PMID:29069475

  13. MECP2 variation in Rett syndrome-An overview of current coverage of genetic and phenotype data within existing databases.

    PubMed

    Townend, Gillian S; Ehrhart, Friederike; van Kranen, Henk J; Wilkinson, Mark; Jacobsen, Annika; Roos, Marco; Willighagen, Egon L; van Enckevort, David; Evelo, Chris T; Curfs, Leopold M G

    2018-04-27

    Rett syndrome (RTT) is a monogenic rare disorder that causes severe neurological problems. In most cases, it results from a loss-of-function mutation in the gene encoding methyl-CpG-binding protein 2 (MECP2). Currently, about 900 unique MECP2 variations (benign and pathogenic) have been identified and it is suspected that the different mutations contribute to different levels of disease severity. For researchers and clinicians, it is important that genotype-phenotype information is available to identify disease-causing mutations for diagnosis, to aid in clinical management of the disorder, and to provide counseling for parents. In this study, 13 genotype-phenotype databases were surveyed for their general functionality and availability of RTT-specific MECP2 variation data. For each database, we investigated findability and interoperability alongside practical user functionality, and type and amount of genetic and phenotype data. The main conclusions are that these databases, and the specific MECP2 variants held within them, are challenging to find, and that interoperability is as yet poorly developed, so searching across databases requires effort. Nevertheless, we found several thousand online database entries for MECP2 variations and their associated phenotypes, diagnosis, or predicted variant effects, which is a good starting point for researchers and clinicians who want to provide, annotate, and use the data. © 2018 The Authors. Human Mutation published by Wiley Periodicals, Inc.

  14. Genetics of antipsychotic-induced weight gain: update and current perspectives.

    PubMed

    Kao, Amy C C; Müller, Daniel J

    2013-12-01

    Antipsychotic medications are used to effectively treat various symptoms for different psychiatric conditions. Unfortunately, antipsychotic-induced weight gain (AIWG) is a common side effect that frequently results in obesity and secondary medical conditions. Twin and sibling studies have indicated that genetic factors are likely to be highly involved in AIWG. Over recent years, there has been considerable progress in this area, with several consistently replicated findings, as well as the identification of new genes and implicated pathways. Here, we will review the most recent genetic studies related to AIWG using the Medline database (PubMed) and Google Scholar. Among the steadiest findings associated with AIWG are serotonin 2C receptors (HTR2C) and leptin promoter gene variants, with more recent studies implicating MTHFR and, in particular, MC4R genes. Additional support was reported for the HRH1, BDNF, NPY, CNR1, GHRL, FTO and AMPK genes. Notably, some of the reported variants appear to have relatively large effect sizes. These findings have provided insights into the mechanisms involved in AIWG and will help to develop predictive genetic tests in the near future.

  15. Challenges and disparities in the application of personalized genomic medicine to populations with African ancestry

    PubMed Central

    Kessler, Michael D.; Yerges-Armstrong, Laura; Taub, Margaret A.; Shetty, Amol C.; Maloney, Kristin; Jeng, Linda Jo Bone; Ruczinski, Ingo; Levin, Albert M.; Williams, L. Keoki; Beaty, Terri H.; Mathias, Rasika A.; Barnes, Kathleen C.; Boorgula, Meher Preethi; Campbell, Monica; Chavan, Sameer; Ford, Jean G.; Foster, Cassandra; Gao, Li; Hansel, Nadia N.; Horowitz, Edward; Huang, Lili; Ortiz, Romina; Potee, Joseph; Rafaels, Nicholas; Scott, Alan F.; Vergara, Candelaria; Gao, Jingjing; Hu, Yijuan; Johnston, Henry Richard; Qin, Zhaohui S.; Padhukasahasram, Badri; Dunston, Georgia M.; Faruque, Mezbah U.; Kenny, Eimear E.; Gietzen, Kimberly; Hansen, Mark; Genuario, Rob; Bullis, Dave; Lawley, Cindy; Deshpande, Aniket; Grus, Wendy E.; Locke, Devin P.; Foreman, Marilyn G.; Avila, Pedro C.; Grammer, Leslie; Kim, Kwang-Youn A.; Kumar, Rajesh; Schleimer, Robert; Bustamante, Carlos; De La Vega, Francisco M.; Gignoux, Chris R.; Shringarpure, Suyash S.; Musharoff, Shaila; Wojcik, Genevieve; Burchard, Esteban G.; Eng, Celeste; Gourraud, Pierre-Antoine; Hernandez, Ryan D.; Lizee, Antoine; Pino-Yanes, Maria; Torgerson, Dara G.; Szpiech, Zachary A.; Torres, Raul; Nicolae, Dan L.; Ober, Carole; Olopade, Christopher O.; Olopade, Olufunmilayo; Oluwole, Oluwafemi; Arinola, Ganiyu; Song, Wei; Abecasis, Goncalo; Correa, Adolfo; Musani, Solomon; Wilson, James G.; Lange, Leslie A.; Akey, Joshua; Bamshad, Michael; Chong, Jessica; Fu, Wenqing; Nickerson, Deborah; Reiner, Alexander; Hartert, Tina; Ware, Lorraine B.; Bleecker, Eugene; Meyers, Deborah; Ortega, Victor E.; Pissamai, Maul R. N.; Trevor, Maul R. N.; Watson, Harold; Araujo, Maria Ilma; Oliveira, Ricardo Riccio; Caraballo, Luis; Marrugo, Javier; Martinez, Beatriz; Meza, Catherine; Ayestas, Gerardo; Herrera-Paz, Edwin Francisco; Landaverde-Torres, Pamela; Erazo, Said Omar Leiva; Martinez, Rosella; Mayorga, Alvaro; Mayorga, Luis F.; Mejia-Mejia, Delmy-Aracely; Ramos, Hector; Saenz, Allan; Varela, Gloria; Vasquez, Olga Marina; Ferguson, Trevor; Knight-Madden, Jennifer; Samms-Vaughan, Maureen; Wilks, Rainford J.; Adegnika, Akim; Ateba-Ngoa, Ulysse; Yazdanbakhsh, Maria; O'Connor, Timothy D.

    2016-01-01

    To characterize the extent and impact of ancestry-related biases in precision genomic medicine, we use 642 whole-genome sequences from the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) project to evaluate typical filters and databases. We find significant correlations between estimated African ancestry proportions and the number of variants per individual in all variant classification sets but one. The source of these correlations is highlighted in more detail by looking at the interaction between filtering criteria and the ClinVar and Human Gene Mutation databases. ClinVar's correlation, representing African ancestry-related bias, has changed over time amidst monthly updates, with the most extreme switch happening between March and April of 2014 (r=0.733 to r=−0.683). We identify 68 SNPs as the major drivers of this change in correlation. As long as ancestry-related bias when using these clinical databases is minimally recognized, the genetics community will face challenges with implementation, interpretation and cost-effectiveness when treating minority populations. PMID:27725664

  16. Identification of Susceptibility Loci and Genes for Colorectal Cancer Risk.

    PubMed

    Zeng, Chenjie; Matsuda, Koichi; Jia, Wei-Hua; Chang, Jiang; Kweon, Sun-Seog; Xiang, Yong-Bing; Shin, Aesun; Jee, Sun Ha; Kim, Dong-Hyun; Zhang, Ben; Cai, Qiuyin; Guo, Xingyi; Long, Jirong; Wang, Nan; Courtney, Regina; Pan, Zhi-Zhong; Wu, Chen; Takahashi, Atsushi; Shin, Min-Ho; Matsuo, Keitaro; Matsuda, Fumihiko; Gao, Yu-Tang; Oh, Jae Hwan; Kim, Soriul; Jung, Keum Ji; Ahn, Yoon-Ok; Ren, Zefang; Li, Hong-Lan; Wu, Jie; Shi, Jiajun; Wen, Wanqing; Yang, Gong; Li, Bingshan; Ji, Bu-Tian; Brenner, Hermann; Schoen, Robert E; Küry, Sébastien; Gruber, Stephen B; Schumacher, Fredrick R; Stenzel, Stephanie L; Casey, Graham; Hopper, John L; Jenkins, Mark A; Kim, Hyeong-Rok; Jeong, Jin-Young; Park, Ji Won; Tajima, Kazuo; Cho, Sang-Hee; Kubo, Michiaki; Shu, Xiao-Ou; Lin, Dongxin; Zeng, Yi-Xin; Zheng, Wei

    2016-06-01

    Known genetic factors explain only a small fraction of genetic variation in colorectal cancer (CRC). We conducted a genome-wide association study to identify risk loci for CRC. This discovery stage included 8027 cases and 22,577 controls of East-Asian ancestry. Promising variants were evaluated in studies including as many as 11,044 cases and 12,047 controls. Tumor-adjacent normal tissues from 188 patients were analyzed to evaluate correlations of risk variants with expression levels of nearby genes. Potential functionality of risk variants was evaluated using public genomic and epigenomic databases. We identified 4 loci associated with CRC risk; P values for the most significant variant in each locus ranged from 3.92 × 10^-8 to 1.24 × 10^-12: 6p21.1 (rs4711689), 8q23.3 (rs2450115, rs6469656), 10q24.3 (rs4919687), and 12p13.3 (rs11064437). We also identified 2 risk variants at loci previously associated with CRC: 10q25.2 (rs10506868) and 20q13.3 (rs6061231). These risk variants, conferring an approximate 10%-18% increase in risk per allele, are located either inside or near protein-coding genes that include transcription factor EB (lysosome biogenesis and autophagy), eukaryotic translation initiation factor 3, subunit H (initiation of translation), cytochrome P450, family 17, subfamily A, polypeptide 1 (steroidogenesis), splA/ryanodine receptor domain and SOCS box containing 2 (proteasome degradation), and ribosomal protein S2 (ribosome biogenesis). Gene expression analyses showed a significant association (P < .05) for rs4711689 with transcription factor EB, rs6469656 with eukaryotic translation initiation factor 3, subunit H, rs11064437 with splA/ryanodine receptor domain and SOCS box containing 2, and rs6061231 with ribosomal protein S2. We identified susceptibility loci and genes associated with CRC risk, linking CRC predisposition to steroid hormone, protein synthesis and degradation, and autophagy pathways and providing added insight into the mechanism of CRC pathogenesis. Copyright © 2016 AGA Institute. Published by Elsevier Inc. All rights reserved.

  17. Identification of Rare Variants in TNNI3 with Atrial Fibrillation in a Chinese GeneID Population

    PubMed Central

    Wang, Chuchu; Wu, Manman; Qian, Jin; Li, Bin; Tu, Xin; Xu, Chengqi; Li, Sisi; Chen, Shanshan; Zhao, Yuanyuan; Huang, Yufeng; Shi, Lisong; Cheng, Xiang; Liao, Yuhua; Chen, Qiuyun; Xia, Yunlong; Yao, Wei; Wu, Gang; Cheng, Mian; Wang, Qing K.

    2015-01-01

    Despite advances by genome-wide association studies (GWAS), much of the heritability of common human diseases remains missing, a phenomenon referred to as ‘missing heritability’. One potential cause of ‘missing heritability’ is the rare susceptibility variants overlooked by GWAS. Atrial fibrillation (AF) is the most common arrhythmia seen at hospitals; it increases the risk of stroke 5-fold and doubles the risk of heart failure and sudden death. Here we studied one large Chinese family with AF and hypertrophic cardiomyopathy (HCM). Whole-exome sequencing analysis identified a mutation in TNNI3, R186Q, that co-segregated with the disease in the family, but did not exist in >1,583 controls, suggesting that R186Q causes AF and HCM. High-resolution melting curve analysis and direct DNA sequence analysis were then used to screen mutations in all exons and exon-intron boundaries of TNNI3 in a panel of 1,127 unrelated AF patients and 1,583 non-AF subjects. Four novel missense variants were identified in TNNI3, including E64G, M154L, E187G and D196G in four independent AF patients, but no variant was found in 1,583 non-AF subjects. None of the variants was found in public databases, including the ExAC Browser database with 60,706 exomes. These data suggest that rare TNNI3 variants are associated with AF (P=0.03). TNNI3 encodes troponin I, a key regulator of the contraction-relaxation function of cardiac muscle, and was not previously implicated in AF. Thus, this study may identify a new biological pathway for the pathogenesis of AF and provides evidence to support the rare variant hypothesis for missing heritability. PMID:26169204
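
    The abstract does not state which test produced P=0.03. As a hedged worked example, assuming a Fisher's exact test on the counts given above (4 variant carriers among 1,127 AF patients and 0 among 1,583 non-AF subjects), the probability that all four carriers fall among the patients by chance follows from the hypergeometric distribution. The function name is illustrative only.

```python
from math import comb


def prob_all_carriers_in_cases(carriers, n_cases, n_controls):
    """Hypergeometric probability that every carrier is drawn from the cases.

    With zero carriers among controls, this is the most extreme table in the
    direction of enrichment in cases, so it equals the one-sided Fisher's
    exact p-value for that table.
    """
    total = n_cases + n_controls
    return comb(n_cases, carriers) / comb(total, carriers)


# Counts taken from the abstract; the choice of test is an assumption.
p_value = prob_all_carriers_in_cases(carriers=4, n_cases=1127, n_controls=1583)
print(f"p = {p_value:.4f}")
```

    Under this assumption the result is approximately 0.03, which is consistent with the reported value, although the authors may have used a different procedure.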

  18. Population genetics of chronic kidney disease: the evolving story of APOL1.

    PubMed

    Wasser, Walter G; Tzur, Shay; Wolday, Dawit; Adu, Dwomoa; Baumstein, Donald; Rosset, Saharon; Skorecki, Karl

    2012-01-01

    Advances in human genome sequencing and generation of public databases of genomic diversity enable nephrologists to re-examine the genetics of common, complex kidney diseases. Non-diabetic kidney diseases prevalent in African ancestry populations and the allelic variation described in chromosome 22q12.3 is one such illustrative example. Newly available genomic database information enabled research groups to discover common functional DNA sequence risk variants in the APOL1 gene. These variants (termed G1 and G2) evolved to confer protection from a species of trypanosomal infection and thus achieved high prominence in many geographic regions of Africa and have been carried over to African diaspora communities worldwide. Since these discoveries two years ago, new insights have been gained: localization of APOL1 in normal and disease kidney tissues; influence of the APOL1 variants on the histopathology of HIV kidney disease; possible association with kidney transplant durability; onset of kidney failure at a younger age; association with blood lipid concentrations; more precise geographic localization of individuals with these variants to western and southern African ancestry; and the absence of the variants and kidney disease predisposition in Ethiopians. The definition of APOL1 nephropathy also confirms the long-held assumption by many clinicians that kidney disease attributed to hypertension in African populations represents an underlying glomerulopathy. Still awaited is the delineation of the biologic mechanisms of cellular injury related to these variants, to provide biologic proof of the APOL1 association and to provide potential targets for preventive and therapeutic intervention.

  19. ICO amplicon NGS data analysis: a Web tool for variant detection in common high-risk hereditary cancer genes analyzed by amplicon GS Junior next-generation sequencing.

    PubMed

    Lopez-Doriga, Adriana; Feliubadaló, Lídia; Menéndez, Mireia; Lopez-Doriga, Sergio; Morón-Duran, Francisco D; del Valle, Jesús; Tornero, Eva; Montes, Eva; Cuesta, Raquel; Campos, Olga; Gómez, Carolina; Pineda, Marta; González, Sara; Moreno, Victor; Capellá, Gabriel; Lázaro, Conxi

    2014-03-01

    Next-generation sequencing (NGS) has revolutionized genomic research and is set to have a major impact on genetic diagnostics thanks to the advent of benchtop sequencers and flexible kits for targeted libraries. Among the main hurdles in NGS are the difficulty of performing bioinformatic analysis of the huge volume of data generated and the high number of false positive calls that could be obtained, depending on the NGS technology and the analysis pipeline. Here, we present the development of a free and user-friendly Web data analysis tool that detects and filters sequence variants, provides coverage information, and allows the user to customize some basic parameters. The tool has been developed to provide accurate genetic analysis of targeted sequencing of common high-risk hereditary cancer genes using amplicon libraries run in a GS Junior System. The Web resource is linked to our own mutation database, to assist in the clinical classification of identified variants. We believe that this tool will greatly facilitate the use of the NGS approach in routine laboratories.

  20. Database of cattle candidate genes and genetic markers for milk production and mastitis

    PubMed Central

    Ogorevc, J; Kunej, T; Razpet, A; Dovc, P

    2009-01-01

    A cattle database of candidate genes and genetic markers for milk production and mastitis has been developed to provide an integrated research tool incorporating different types of information supporting a genomic approach to study lactation, udder development and health. The database contains 943 genes and genetic markers involved in mammary gland development and function, representing candidates for further functional studies. The candidate loci were drawn on a genetic map to reveal positional overlaps. For identification of candidate loci, data from seven different research approaches were exploited: (i) gene knockouts or transgenes in mice that result in specific phenotypes associated with mammary gland (143 loci); (ii) cattle QTL for milk production (344) and mastitis related traits (71); (iii) loci with sequence variations that show specific allele-phenotype interactions associated with milk production (24) or mastitis (10) in cattle; (iv) genes with expression profiles associated with milk production (207) or mastitis (107) in cattle or mouse; (v) cattle milk protein genes that exist in different genetic variants (9); (vi) miRNAs expressed in bovine mammary gland (32) and (vii) epigenetically regulated cattle genes associated with mammary gland function (1). Forty-four genes found by multiple independent analyses were suggested as the most promising candidates and were further analysed in silico for expression levels in lactating mammary gland, genetic variability and top biological functions in functional networks. A miRNA target search for mammary gland expressed miRNAs identified 359 putative binding sites in 3′UTRs of candidate genes. PMID:19508288

  1. AgdbNet – antigen sequence database software for bacterial typing

    PubMed Central

    Jolley, Keith A; Maiden, Martin CJ

    2006-01-01

    Background: Bacterial typing schemes based on the sequences of genes encoding surface antigens require databases that provide a uniform, curated, and widely accepted nomenclature of the variants identified. Due to the differences in typing schemes, imposed by the diversity of genes targeted, creating these databases has typically required the writing of one-off code to link the database to a web interface. Here we describe agdbNet, widely applicable web database software that facilitates simultaneous BLAST querying of multiple loci using either nucleotide or peptide sequences. Results: Databases are described by XML files that are parsed by a Perl CGI script. Each database can have any number of loci, which may be defined by nucleotide and/or peptide sequences. The software is currently in use on at least five public databases for the typing of Neisseria meningitidis, Campylobacter jejuni and Streptococcus equi and can be set up to query internal isolate tables or suitably-configured external isolate databases, such as those used for multilocus sequence typing. The style of the resulting website can be fully configured by modifying stylesheets and through the use of customised header and footer files that surround the output of the script. Conclusion: The software provides a rapid means of setting up customised Internet antigen sequence databases. The flexible configuration options enable typing schemes with differing requirements to be accommodated. PMID:16790057

  2. Houston Methodist Variant Viewer: An Application to Support Clinical Laboratory Interpretation of Next-generation Sequencing Data for Cancer

    PubMed Central

    Christensen, Paul A.; Ni, Yunyun; Bao, Feifei; Hendrickson, Heather L.; Greenwood, Michael; Thomas, Jessica S.; Long, S. Wesley; Olsen, Randall J.

    2017-01-01

    Introduction: Next-generation-sequencing (NGS) is increasingly used in clinical and research protocols for patients with cancer. NGS assays are routinely used in clinical laboratories to detect mutations bearing on cancer diagnosis, prognosis and personalized therapy. A typical assay may interrogate 50 or more gene targets that encompass many thousands of possible gene variants. Analysis of NGS data in cancer is a labor-intensive process that can become overwhelming to the molecular pathologist or research scientist. Although commercial tools for NGS data analysis and interpretation are available, they are often costly, lack key functionality or cannot be customized by the end user. Methods: To facilitate NGS data analysis in our clinical molecular diagnostics laboratory, we created a custom bioinformatics tool termed Houston Methodist Variant Viewer (HMVV). HMVV is a Java-based solution that integrates sequencing instrument output, bioinformatics analysis, storage resources and end user interface. Results: Compared to the predicate method used in our clinical laboratory, HMVV markedly simplifies the bioinformatics workflow for the molecular technologist and facilitates the variant review by the molecular pathologist. Importantly, HMVV reduces time spent researching the biological significance of the variants detected, standardizes the online resources used to perform the variant investigation and assists generation of the annotated report for the electronic medical record. HMVV also maintains a searchable variant database, including the variant annotations generated by the pathologist, which is useful for downstream quality improvement and research projects. Conclusions: HMVV is a clinical grade, low-cost, feature-rich, highly customizable platform that we have made available for continued development by the pathology informatics community. PMID:29226007

  3. Low Frequency Variants, Collapsed Based on Biological Knowledge, Uncover Complexity of Population Stratification in 1000 Genomes Project Data

    PubMed Central

    Moore, Carrie B.; Wallace, John R.; Wolfe, Daniel J.; Frase, Alex T.; Pendergrass, Sarah A.; Weiss, Kenneth M.; Ritchie, Marylyn D.

    2013-01-01

    Analyses investigating low frequency variants have the potential for explaining additional genetic heritability of many complex human traits. However, the natural frequencies of rare variation between human populations strongly confound genetic analyses. We have applied a novel collapsing method to identify biological features with low frequency variant burden differences in thirteen populations sequenced by the 1000 Genomes Project. Our flexible collapsing tool utilizes expert biological knowledge from multiple publicly available database sources to direct feature selection. Variants were collapsed according to genetically driven features, such as evolutionary conserved regions, regulatory regions, genes, and pathways. We have conducted an extensive comparison of low frequency variant burden differences (MAF<0.03) between populations from 1000 Genomes Project Phase I data. We found that on average 26.87% of gene bins, 35.47% of intergenic bins, 42.85% of pathway bins, 14.86% of ORegAnno regulatory bins, and 5.97% of evolutionary conserved regions show statistically significant differences in low frequency variant burden across populations from the 1000 Genomes Project. The proportion of bins with significant differences in low frequency burden depends on the ancestral similarity of the two populations compared and types of features tested. Even closely related populations had notable differences in low frequency burden, but fewer differences than populations from different continents. Furthermore, conserved or functionally relevant regions had fewer significant differences in low frequency burden than regions under less evolutionary constraint. This degree of low frequency variant differentiation across diverse populations and feature elements highlights the critical importance of considering population stratification in the new era of DNA sequencing and low frequency variant genomic analyses. PMID:24385916
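
    The idea of collapsing low frequency variants into feature bins can be sketched as follows. This is a simplified illustration, not the authors' tool: the bin names, variant identifiers, and frequencies are hypothetical, and the statistical comparison between populations is left out. Only the MAF cutoff of 0.03 comes from the abstract.

```python
from collections import Counter

MAF_CUTOFF = 0.03  # low frequency threshold quoted in the abstract

# Hypothetical records: (variant id, bin name, MAF in population A, MAF in population B)
variants = [
    ("v1", "GENE_X", 0.010, 0.000),
    ("v2", "GENE_X", 0.020, 0.015),
    ("v3", "GENE_Y", 0.200, 0.250),  # common in both populations, so never counted
    ("v4", "GENE_Y", 0.000, 0.005),
]


def collapse(records, pop_index):
    """Count, per bin, the variants that are present but below the MAF cutoff."""
    burden = Counter()
    for record in records:
        bin_name = record[1]
        maf = record[2 + pop_index]
        if 0 < maf < MAF_CUTOFF:
            burden[bin_name] += 1
    return burden


burden_a = collapse(variants, 0)
burden_b = collapse(variants, 1)
print(dict(burden_a), dict(burden_b))
```

    Comparing the per-bin counts between two populations is the step at which a significance test would be applied; which test the authors used is not stated in the abstract.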

  4. Association between cytochrome CYP17A1, CYP3A4, and CYP3A43 polymorphisms and prostate cancer risk and aggressiveness in a Korean study population

    PubMed Central

    Han, Jun Hyun; Lee, Yong Seong; Kim, Hae Jong; Lee, Shin Young; Myung, Soon Chul

    2015-01-01

    In this study, we evaluated genetic variants of the androgen metabolism genes CYP17A1, CYP3A4, and CYP3A43 to determine whether they play a role in the development of prostate cancer (PCa) in Korean men. The study population included 240 pathologically diagnosed cases of PCa and 223 age-matched controls. Among the 789 single-nucleotide polymorphism (SNP) database variants detected, 129 were reported in two Asian groups (Han Chinese and Japanese) in the HapMap database. Only 21 polymorphisms of CYP17A1, CYP3A4, and CYP3A43 were selected based on linkage disequilibrium in Asians (r2 = 1), locations (SNPs in exons were preferred), and amino acid changes and were assessed. In addition, we performed haplotype analysis for the 21 SNPs in CYP17A1, CYP3A4, and CYP3A43 genes. To determine the association between genotype and haplotype distributions of patients and controls, logistic analyses were carried out, controlling for age. Twelve sequence variants and five major haplotypes were identified in CYP17A1. Five sequence variants and two major haplotypes were identified in CYP3A4. Four sequence variants and four major haplotypes were observed in CYP3A43. CYP17A1 haplotype-2 (Ht-2) (odds ratio [OR], 1.51; 95% confidence interval [CI], 1.04–2.18) was associated with PCa susceptibility. CYP3A4 Ht-2 (OR: 1.87; 95% CI: 1.02–3.43) was associated with PCa metastatic potential according to tumor stage. rs17115149 (OR: 1.96; 95% CI: 1.04–3.68) and CYP17A1 Ht-4 (OR: 2.01; 95% CI: 1.07–4.11) showed a significant association with histologic aggressiveness according to Gleason score. Genetic variants of CYP17A1 and CYP3A4 may play a role in the development of PCa in Korean men. PMID:25337833

  5. Association between cytochrome CYP17A1, CYP3A4, and CYP3A43 polymorphisms and prostate cancer risk and aggressiveness in a Korean study population.

    PubMed

    Han, Jun Hyun; Lee, Yong Seong; Kim, Hae Jong; Lee, Shin Young; Myung, Soon Chul

    2015-01-01

    In this study, we evaluated genetic variants of the androgen metabolism genes CYP17A1, CYP3A4, and CYP3A43 to determine whether they play a role in the development of prostate cancer (PCa) in Korean men. The study population included 240 pathologically diagnosed cases of PCa and 223 age-matched controls. Among the 789 single-nucleotide polymorphism (SNP) database variants detected, 129 were reported in two Asian groups (Han Chinese and Japanese) in the HapMap database. Only 21 polymorphisms of CYP17A1, CYP3A4, and CYP3A43 were selected based on linkage disequilibrium in Asians (r2 = 1), locations (SNPs in exons were preferred), and amino acid changes and were assessed. In addition, we performed haplotype analysis for the 21 SNPs in CYP17A1, CYP3A4, and CYP3A43 genes. To determine the association between genotype and haplotype distributions of patients and controls, logistic analyses were carried out, controlling for age. Twelve sequence variants and five major haplotypes were identified in CYP17A1. Five sequence variants and two major haplotypes were identified in CYP3A4. Four sequence variants and four major haplotypes were observed in CYP3A43. CYP17A1 haplotype-2 (Ht-2) (odds ratio [OR], 1.51; 95% confidence interval [CI], 1.04-2.18) was associated with PCa susceptibility. CYP3A4 Ht-2 (OR: 1.87; 95% CI: 1.02-3.43) was associated with PCa metastatic potential according to tumor stage. rs17115149 (OR: 1.96; 95% CI: 1.04-3.68) and CYP17A1 Ht-4 (OR: 2.01; 95% CI: 1.07-4.11) showed a significant association with histologic aggressiveness according to Gleason score. Genetic variants of CYP17A1 and CYP3A4 may play a role in the development of PCa in Korean men.

  6. Three-dimensional spatial analysis of missense variants in RTEL1 identifies pathogenic variants in patients with Familial Interstitial Pneumonia.

    PubMed

    Sivley, R Michael; Sheehan, Jonathan H; Kropski, Jonathan A; Cogan, Joy; Blackwell, Timothy S; Phillips, John A; Bush, William S; Meiler, Jens; Capra, John A

    2018-01-23

    Next-generation sequencing of individuals with genetic diseases often detects candidate rare variants in numerous genes, but determining which are causal remains challenging. We hypothesized that the spatial distribution of missense variants in protein structures contains information about function and pathogenicity that can help prioritize variants of unknown significance (VUS) and elucidate the structural mechanisms leading to disease. To illustrate this approach in a clinical application, we analyzed 13 candidate missense variants in regulator of telomere elongation helicase 1 (RTEL1) identified in patients with Familial Interstitial Pneumonia (FIP). We curated pathogenic and neutral RTEL1 variants from the literature and public databases. We then used homology modeling to construct a 3D structural model of RTEL1 and mapped known variants into this structure. We next developed a pathogenicity prediction algorithm based on proximity to known disease causing and neutral variants and evaluated its performance with leave-one-out cross-validation. We further validated our predictions with segregation analyses, telomere lengths, and mutagenesis data from the homologous XPD protein. Our algorithm for classifying RTEL1 VUS based on spatial proximity to pathogenic and neutral variation accurately distinguished 7 known pathogenic from 29 neutral variants (ROC AUC = 0.85) in the N-terminal domains of RTEL1. Pathogenic proximity scores were also significantly correlated with effects on ATPase activity (Pearson r = -0.65, p = 0.0004) in XPD, a related helicase. Applying the algorithm to 13 VUS identified from sequencing of RTEL1 from patients predicted five out of six disease-segregating VUS to be pathogenic. We provide structural hypotheses regarding how these mutations may disrupt RTEL1 ATPase and helicase function. Spatial analysis of missense variation accurately classified candidate VUS in RTEL1 and suggests how such variants cause disease. Incorporating spatial proximity analyses into other pathogenicity prediction tools may improve accuracy for other genes and genetic diseases.
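
    The general shape of a proximity-based score evaluated with leave-one-out cross-validation can be sketched as below. This is a deliberately simplified stand-in, not the authors' algorithm: the score used here (mean distance to neutral variants minus mean distance to pathogenic variants) and the toy 3D coordinates are assumptions chosen only to make the procedure concrete.

```python
from math import dist
from statistics import mean


def proximity_score(point, pathogenic, neutral):
    """Higher when the point lies closer to pathogenic than to neutral variants."""
    return mean(dist(point, q) for q in neutral) - mean(dist(point, q) for q in pathogenic)


def leave_one_out_scores(pathogenic, neutral):
    """Score every known variant with itself removed from the reference sets."""
    scored = []
    for i, p in enumerate(pathogenic):
        rest = pathogenic[:i] + pathogenic[i + 1:]
        scored.append((proximity_score(p, rest, neutral), 1))
    for i, n in enumerate(neutral):
        rest = neutral[:i] + neutral[i + 1:]
        scored.append((proximity_score(n, pathogenic, rest), 0))
    return scored


def roc_auc(scored):
    """Fraction of (pathogenic, neutral) pairs ranked correctly; ties count half."""
    pos = [s for s, label in scored if label == 1]
    neg = [s for s, label in scored if label == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical coordinates: two well-separated clusters, for illustration only.
pathogenic_xyz = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
neutral_xyz = [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0), (10.0, 1.0, 0.0)]

auc = roc_auc(leave_one_out_scores(pathogenic_xyz, neutral_xyz))
print(auc)
```

    Removing each variant from its own reference set before scoring is what keeps the evaluation from rewarding a variant for being close to itself.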

  7. The 24th annual Nucleic Acids Research database issue: a look back and upcoming changes

    PubMed Central

    Rigden, Daniel J

    2017-01-01

    This year's Database Issue of Nucleic Acids Research contains 152 papers that include descriptions of 54 new databases and update papers on 98 databases, of which 16 have not been previously featured in NAR. As always, these databases cover a broad range of molecular biology subjects, including genome structure, gene expression and its regulation, proteins, protein domains, and protein–protein interactions. Following the recent trend, an increasing number of new and established databases deal with the issues of human health, from cancer-causing mutations to drugs and drug targets. In accordance with this trend, three recently compiled databases that have been selected by NAR reviewers and editors as ‘breakthrough’ contributions, denovo-db, the Monarch Initiative, and Open Targets, cover human de novo gene variants, disease-related phenotypes in model organisms, and a bioinformatics platform for therapeutic target identification and validation, respectively. We expect these databases to attract the attention of numerous researchers working in various areas of genetics and genomics. Looking back at the past 12 years, we present here the ‘golden set’ of databases that have consistently served as authoritative, comprehensive, and convenient data resources widely used by the entire community and offer some lessons on what makes a successful database. The Database Issue is freely available online at the https://academic.oup.com/nar web site. An updated version of the NAR Molecular Biology Database Collection is available at http://www.oxfordjournals.org/nar/database/a/. PMID:28053160

  8. Clinical Variant Classification: A Comparison of Public Databases and a Commercial Testing Laboratory.

    PubMed

    Gradishar, William; Johnson, KariAnne; Brown, Krystal; Mundt, Erin; Manley, Susan

    2017-07-01

    There is a growing move to consult public databases following receipt of a genetic test result from a clinical laboratory; however, the well-documented limitations of these databases call into question how often clinicians will encounter discordant variant classifications that may introduce uncertainty into patient management. Here, we evaluate discordance in BRCA1 and BRCA2 variant classifications between a single commercial testing laboratory and a public database commonly consulted in clinical practice. BRCA1 and BRCA2 variant classifications were obtained from ClinVar and compared with the classifications from a reference laboratory. Full concordance and discordance were determined for variants whose ClinVar entries were of the same pathogenicity (pathogenic, benign, or uncertain). Variants with conflicting ClinVar classifications were considered partially concordant if ≥1 of the listed classifications agreed with the reference laboratory classification. Four thousand two hundred and fifty unique BRCA1 and BRCA2 variants were available for analysis. Overall, 73.2% of classifications were fully concordant and 12.3% were partially concordant. The remaining 14.5% of variants had discordant classifications, most of which had a definitive classification (pathogenic or benign) from the reference laboratory compared with an uncertain classification in ClinVar (14.0%). Here, we show that discrepant classifications between a public database and single reference laboratory potentially account for 26.7% of variants in BRCA1 and BRCA2. The time and expertise required of clinicians to research these discordant classifications call into question the practicality of checking all test results against a database and suggest that discordant classifications should be interpreted with these limitations in mind. With the increasing use of clinical genetic testing for hereditary cancer risk, accurate variant classification is vital to ensuring appropriate medical management.
There is a growing move to consult public databases following receipt of a genetic test result from a clinical laboratory; however, we show that up to 26.7% of variants in BRCA1 and BRCA2 have discordant classifications between ClinVar and a reference laboratory. The findings presented in this paper serve as a note of caution regarding the utility of database consultation. © AlphaMed Press 2017.
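
The concordance rules stated in the abstract above can be sketched in a few lines. This is an illustrative reading rather than the study's code: the function names and toy variants are hypothetical, and treating a variant with conflicting ClinVar entries, none of which match the laboratory, as discordant is an assumption the abstract does not spell out.

```python
from collections import Counter

def concordance(clinvar_classes, lab_class):
    """Categorise one variant from its ClinVar classifications and the lab call."""
    distinct = set(clinvar_classes)
    if not distinct:
        raise ValueError("at least one ClinVar classification is required")
    if len(distinct) == 1:
        # All ClinVar entries agree with each other.
        return "full" if lab_class in distinct else "discordant"
    # Conflicting ClinVar entries: partial if at least one agrees with the lab.
    # Assumption: no agreement at all is counted as discordant.
    return "partial" if lab_class in distinct else "discordant"

def summarise(variants):
    """Percentage of variants per category, rounded to one decimal place."""
    counts = Counter(concordance(entries, lab) for entries, lab in variants)
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}

# Toy input (invented): (ClinVar classifications, laboratory classification).
toy_variants = [
    (["pathogenic", "pathogenic"], "pathogenic"),
    (["benign"], "benign"),
    (["pathogenic", "uncertain"], "pathogenic"),
    (["uncertain"], "benign"),
]
print(summarise(toy_variants))
```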

  9. DR-GAS: a database of functional genetic variants and their phosphorylation states in human DNA repair systems.

    PubMed

    Sehgal, Manika; Singh, Tiratha Raj

    2014-04-01

    We present DR-GAS, a unique, consolidated and comprehensive DNA repair genetic association studies database of the human DNA repair system. It presents information on repair genes, assorted mechanisms of DNA repair, linkage disequilibrium, haplotype blocks, nsSNPs, phosphorylation sites, associated diseases, and pathways involved in repair systems. DNA repair is an intricate process which plays an essential role in maintaining the integrity of the genome by eradicating the damaging effect of internal and external changes in the genome. Hence, it is crucial to extensively understand the intact process of DNA repair, the genes involved, the non-synonymous SNPs which may affect function, phosphorylated residues and other related genetic parameters. All the corresponding entries for DNA repair genes, such as proteins, OMIM IDs, literature references and pathways, are cross-referenced to their respective primary databases. DNA repair genes and their associated parameters are represented either in tabular form or in graphical form through images derived from computational and statistical analyses. It is believed that the database will assist molecular biologists, biotechnologists, therapeutic developers and others in the scientific community to encounter biologically meaningful information, and meticulous contribution of genetic level information towards treacherous diseases in human DNA repair systems. DR-GAS is freely available for academic and research purposes at: http://www.bioinfoindia.org/drgas. Copyright © 2014 Elsevier B.V. All rights reserved.

  10. Association of Rare and Common Variation in the Lipoprotein Lipase Gene with Coronary Artery Disease

    PubMed Central

    Khera, Amit V.; Won, Hong-Hee; Peloso, Gina M.; O’Dushlaine, Colm; Liu, Dajiang; Stitziel, Nathan O.; Natarajan, Pradeep; Nomura, Akihiro; Emdin, Connor A.; Gupta, Namrata; Borecki, Ingrid B.; Asselta, Rosanna; Duga, Stefano; Merlini, Piera Angelica; Correa, Adolfo; Kessler, Thorsten; Wilson, James G.; Bown, Matthew J.; Hall, Alistair S.; Braund, Peter S.; Carey, David J.; Murray, Michael F.; Kirchner, H. Lester; Leader, Joseph B.; Lavage, Daniel R.; Manus, J. Neil; Hartzel, Dustin N.; Samani, Nilesh J.; Schunkert, Heribert; Marrugat, Jaume; Elosua, Roberto; McPherson, Ruth; Farrall, Martin; Watkins, Hugh; Lander, Eric S.; Rader, Daniel J.; Danesh, John; Ardissino, Diego; Gabriel, Stacey; Willer, Cristen; Abecasis, Gonçalo R.; Saleheen, Danish; Dewey, Frederick E.; Kathiresan, Sekar

    2017-01-01

    Importance The activity of lipoprotein lipase (LPL) is the rate-determining step in clearing triglyceride-rich lipoproteins from the circulation. Mutations that damage the LPL gene lead to lifelong deficiency in enzymatic activity and can provide insight into the relationship of LPL to human disease. Objective Determine if rare and/or common variants in the LPL gene are associated with early-onset coronary artery disease (CAD). Design, Setting, and Participants Cross-sectional study. The LPL gene was sequenced in 10 CAD case-control cohorts of the multinational Myocardial Infarction Genetics Consortium and a nested CAD case-control cohort of the Geisinger Health System DiscovEHR cohort between 2010 and 2015. Common variants were genotyped in up to 305,699 individuals of the Global Lipids Genetics Consortium and up to 120,600 individuals of the CARDIoGRAM Exome Consortium between 2012 and 2014. Study-specific estimates were pooled via meta-analysis. Exposure Rare damaging mutations in LPL included loss-of-function variants and missense variants annotated as pathogenic in a human genetics database or predicted to be damaging by computer prediction algorithms trained to identify mutations that impair protein function. Common variants in the LPL gene region included those independently associated with circulating triglyceride levels. Main Outcomes and Measures Circulating lipid levels and CAD. Results Among 46,891 individuals with LPL gene sequencing data available, mean age was 50 years (SD 12.6) and 51% were female. 188 participants (0.40%; 95%CI 0.35–0.46) carried a damaging mutation in the LPL gene – 105 of 32,646 control participants (0.32%) and 83 of 14,245 (0.58%) early-onset CAD cases. Compared to 46,703 non-carriers, the 188 heterozygous carriers of a LPL damaging mutation displayed higher plasma triglycerides (Beta coefficient= +19.6 mg/dL; 95%CI 4.6–34.6) and higher odds of CAD (odds ratio 1.84; 95%CI 1.35–2.51; P<0.001). 
An analysis of 6 common LPL variants noted an odds ratio for CAD of 1.51 (95%CI 1.39–1.64; P=1.1×10^-22) per standard deviation increase in triglycerides. Conclusions and Relevance The presence of rare damaging mutations in the LPL gene was significantly associated with higher triglyceride levels and presence of CAD. However, further research is needed to assess causal mechanisms by which heterozygous LPL deficiency could lead to CAD. PMID:28267856
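
As a worked check on the carrier counts reported above, a crude (unadjusted) odds ratio can be computed directly from the 2×2 table. This sketch is illustrative only: the study pooled study-specific estimates by meta-analysis, so its reported odds ratio of 1.84 is not expected to equal the crude value, and the Woolf (log-scale) confidence interval used here is an assumed choice, not the paper's method.

```python
import math

def crude_odds_ratio(case_carriers, case_total, control_carriers, control_total, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval."""
    a = case_carriers                      # carriers among cases
    b = case_total - case_carriers         # non-carriers among cases
    c = control_carriers                   # carriers among controls
    d = control_total - control_carriers   # non-carriers among controls
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Counts taken from the abstract: 83 of 14,245 cases and 105 of 32,646 controls.
odds_ratio, lower, upper = crude_odds_ratio(83, 14245, 105, 32646)
print(f"crude OR = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The crude estimate comes out at roughly 1.82, close to but not identical with the pooled figure.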

  11. Constraints on Biological Mechanism from Disease Comorbidity Using Electronic Medical Records and Database of Genetic Variants

    PubMed Central

    Bagley, Steven C.; Sirota, Marina; Chen, Richard; Butte, Atul J.; Altman, Russ B.

    2016-01-01

    Patterns of disease co-occurrence that deviate from statistical independence may represent important constraints on biological mechanism, which sometimes can be explained by shared genetics. In this work we study the relationship between disease co-occurrence and commonly shared genetic architecture of disease. Records of pairs of diseases were combined from two different electronic medical systems (Columbia, Stanford), and compared to a large database of published disease-associated genetic variants (VARIMED); data on 35 disorders were available across all three sources, which include medical records for over 1.2 million patients and variants from over 17,000 publications. Based on the sources in which they appeared, disease pairs were categorized as having predominant clinical, genetic, or both kinds of manifestations. Confounding effects of age on disease incidence were controlled for by only comparing diseases when they fall in the same cluster of similarly shaped incidence patterns. We find that disease pairs that are overrepresented in both electronic medical record systems and in VARIMED come from two main disease classes, autoimmune and neuropsychiatric. We furthermore identify specific genes that are shared within these disease groups. PMID:27115429

  12. Constraints on Biological Mechanism from Disease Comorbidity Using Electronic Medical Records and Database of Genetic Variants.

    PubMed

    Bagley, Steven C; Sirota, Marina; Chen, Richard; Butte, Atul J; Altman, Russ B

    2016-04-01

    Patterns of disease co-occurrence that deviate from statistical independence may represent important constraints on biological mechanism, which sometimes can be explained by shared genetics. In this work we study the relationship between disease co-occurrence and commonly shared genetic architecture of disease. Records of pairs of diseases were combined from two different electronic medical systems (Columbia, Stanford), and compared to a large database of published disease-associated genetic variants (VARIMED); data on 35 disorders were available across all three sources, which include medical records for over 1.2 million patients and variants from over 17,000 publications. Based on the sources in which they appeared, disease pairs were categorized as having predominant clinical, genetic, or both kinds of manifestations. Confounding effects of age on disease incidence were controlled for by only comparing diseases when they fall in the same cluster of similarly shaped incidence patterns. We find that disease pairs that are overrepresented in both electronic medical record systems and in VARIMED come from two main disease classes, autoimmune and neuropsychiatric. We furthermore identify specific genes that are shared within these disease groups.

  13. MoonProt: a database for proteins that are known to moonlight

    PubMed Central

    Mani, Mathew; Chen, Chang; Amblee, Vaishak; Liu, Haipeng; Mathur, Tanu; Zwicke, Grant; Zabad, Shadi; Patel, Bansi; Thakkar, Jagravi; Jeffery, Constance J.

    2015-01-01

    Moonlighting proteins comprise a class of multifunctional proteins in which a single polypeptide chain performs multiple biochemical functions that are not due to gene fusions, multiple RNA splice variants or pleiotropic effects. The known moonlighting proteins perform a variety of diverse functions in many different cell types and species, and information about their structures and functions is scattered in many publications. We have constructed the manually curated, searchable, internet-based MoonProt Database (http://www.moonlightingproteins.org) with information about the over 200 proteins that have been experimentally verified to be moonlighting proteins. The availability of this organized information provides a more complete picture of what is currently known about moonlighting proteins. The database will also aid researchers in other fields, including determining the functions of genes identified in genome sequencing projects, interpreting data from proteomics projects and annotating protein sequence and structural databases. In addition, information about the structures and functions of moonlighting proteins can be helpful in understanding how novel protein functional sites evolved on an ancient protein scaffold, which can also help in the design of proteins with novel functions. PMID:25324305

  14. Genetic Variants in the Bone Morphogenic Protein Gene Family Modify the Association between Residential Exposure to Traffic and Peripheral Arterial Disease

    PubMed Central

    Ward-Caviness, Cavin K.; Neas, Lucas M.; Blach, Colette; Haynes, Carol S.; LaRocque-Abramson, Karen; Grass, Elizabeth; Dowdy, Elaine; Devlin, Robert B.; Diaz-Sanchez, David; Cascio, Wayne E.; Lynn Miranda, Marie; Gregory, Simon G.; Shah, Svati H.; Kraus, William E.; Hauser, Elizabeth R.

    2016-01-01

    There is a growing literature indicating that genetic variants modify many of the associations between environmental exposures and clinical outcomes, potentially by increasing susceptibility to these exposures. However, genome-scale investigations of these interactions have rarely been performed, particularly in the case of air pollution exposures. We performed race-stratified genome-wide gene-environment interaction association studies on European-American (EA, N = 1623) and African-American (AA, N = 554) cohorts to investigate the joint influence of common single nucleotide polymorphisms (SNPs) and residential exposure to traffic (“traffic exposure”), a recognized vascular disease risk factor, on peripheral arterial disease (PAD). Traffic exposure was estimated via the distance from the primary residence to the nearest major roadway, defined as the nearest limited-access highway or major arterial. The rs755249-traffic exposure interaction was associated with PAD at a genome-wide significant level (P = 2.29×10^-8) in European-Americans. Rs755249 is located in the 3’ untranslated region of BMP8A, a member of the bone morphogenic protein (BMP) gene family. Further investigation revealed several variants in BMP genes associated with PAD via an interaction with traffic exposure in both the EA and AA cohorts; this included interactions with non-synonymous variants in BMP2, which is regulated by air pollution exposure. The BMP family of genes is linked to vascular growth and calcification and is a novel gene family for the study of PAD pathophysiology. Further investigation of BMP8A using the Genotype Tissue Expression Database revealed that multiple variants with nominally significant (P < 0.05) interaction P-values in our EA cohort were significant BMP8A eQTLs in tissue types highly relevant for PAD, such as rs755249 (tibial nerve, eQTL P = 3.6×10^-6) and rs1180341 (tibial artery, eQTL P = 5.3×10^-6).
Together these results reveal a novel gene, and possibly gene family, associated with PAD via an interaction with traffic air pollution exposure. These results also highlight the potential for interactions studies, particularly at the genome scale, to reveal novel biology linking environmental exposures to clinical outcomes. PMID:27082954

  15. Management of Gene Variants of Unknown Significance: Analysis Method and Risk Assessment of the VHL Mutation p.P81S (c.241C>T).

    PubMed

    Alosi, Daniela; Bisgaard, Marie Luise; Hemmingsen, Sophie Nowak; Krogh, Lotte Nylandsted; Mikkelsen, Hanne Birte; Binderup, Marie Louise Mølgaard

    2017-02-01

    Evaluation of the pathogenicity of a gene variant of unknown significance (VUS) is crucial for molecular diagnosis and genetic counseling, but can be challenging. This is especially so in phenotypically variable diseases, such as von Hippel-Lindau disease (vHL). vHL is caused by germline mutations in the VHL gene, which predispose to the development of multiple tumors such as central nervous system hemangioblastomas and renal cell carcinoma (RCC). We propose a method for the evaluation of VUS pathogenicity through our experience with the VHL missense mutation c.241C>T (p.P81S). 1) Clinical evaluation of known variant carriers: We evaluated a family of five VHL p.P81S carriers, as well as the clinical characteristics of all the p.P81S carriers reported in the literature; 2) Evaluation of tumor tissue via genetic analysis, histology, and immunohistochemistry (IHC); 3) Assessment of the variant's impact on protein structure and function, using multiple databases, in silico algorithms, and reports of functional studies. Only one family member had clinical signs of vHL with early-onset RCC. IHC analysis showed no VHL protein expressed in the tumor, consistent with biallelic VHL inactivation. The majority of in silico algorithms reported p.P81S as possibly pathogenic in relation to vHL or RCC, but there were discrepancies. Functional studies suggest that p.P81S impairs the VHL protein's function. The VHL p.P81S mutation is most likely a low-penetrant pathogenic variant predisposing to RCC development. We suggest the above-mentioned method for VUS evaluation with use of different methods, especially a variety of in silico methods and tumor tissue analysis.

  16. The Yak genome database: an integrative database for studying yak biology and high-altitude adaption

    PubMed Central

    2012-01-01

    Background The yak (Bos grunniens) is a long-haired bovine that lives at high altitudes and is an important source of milk, meat, fiber and fuel. The recent sequencing, assembly and annotation of its genome are expected to further our understanding of the means by which it has adapted to life at high altitudes and its ecologically important traits. Description The Yak Genome Database (YGD) is an internet-based resource that provides access to genomic sequence data and predicted functional information concerning the genes and proteins of Bos grunniens. The curated data stored in the YGD includes genome sequences, predicted genes and associated annotations, non-coding RNA sequences, transposable elements, single nucleotide variants, and three-way whole-genome alignments between human, cattle and yak. YGD offers useful searching and data mining tools, including the ability to search for genes by name or using function keywords as well as GBrowse genome browsers and/or BLAST servers, which can be used to visualize genome regions and identify similar sequences. Sequence data from the YGD can also be downloaded to perform local searches. Conclusions A new yak genome database (YGD) has been developed to facilitate studies on high-altitude adaption and bovine genomics. The database will be continuously updated to incorporate new information such as transcriptome data and population resequencing data. The YGD can be accessed at http://me.lzu.edu.cn/yak. PMID:23134687

  17. APASdb: a database describing alternative poly(A) sites and selection of heterogeneous cleavage sites downstream of poly(A) signals

    PubMed Central

    You, Leiming; Wu, Jiexin; Feng, Yuchao; Fu, Yonggui; Guo, Yanan; Long, Liyuan; Zhang, Hui; Luan, Yijie; Tian, Peng; Chen, Liangfu; Huang, Guangrui; Huang, Shengfeng; Li, Yuxin; Li, Jie; Chen, Chengyong; Zhang, Yaqing; Chen, Shangwu; Xu, Anlong

    2015-01-01

    An increasing number of genes have been shown to utilize alternative polyadenylation (APA) 3′-processing sites depending on the cell and tissue type and/or physiological and pathological conditions at the time of processing, and the construction of a genome-wide database regarding APA is urgently needed to better understand poly(A) site selection and APA-directed gene expression regulation in a given biological context. Here we present a web-accessible database, named APASdb (http://mosas.sysu.edu.cn/utr), which can visualize the precise map and usage quantification of different APA isoforms for all genes. The datasets are deeply profiled by the sequencing alternative polyadenylation sites (SAPAS) method, which is capable of high-throughput sequencing of the 3′-ends of polyadenylated transcripts. Thus, APASdb details all the heterogeneous cleavage sites downstream of poly(A) signals, and maintains near complete coverage for APA sites, much better than the previous databases using conventional methods. Furthermore, APASdb provides the quantification of a given APA variant among transcripts with different APA sites by computing their corresponding normalized reads, making our database more useful. In addition, APASdb supports URL-based retrieval, browsing and display of exon-intron structure, poly(A) signals, poly(A) site locations and usage reads, and 3′-untranslated regions (3′-UTRs). Currently, APASdb covers APA in various biological processes and diseases in human, mouse and zebrafish. PMID:25378337

  18. Novel Myopia Genes and Pathways Identified From Syndromic Forms of Myopia

    PubMed Central

    Loughman, James; Wildsoet, Christine F.; Williams, Cathy; Guggenheim, Jeremy A.

    2018-01-01

    Purpose To test the hypothesis that genes known to cause clinical syndromes featuring myopia also harbor polymorphisms contributing to nonsyndromic refractive errors. Methods Clinical phenotypes and syndromes that have refractive errors as a recognized feature were identified using the Online Mendelian Inheritance in Man (OMIM) database. One hundred fifty-four unique causative genes were identified, of which 119 were specifically linked with myopia and 114 represented syndromic myopia (i.e., myopia and at least one other clinical feature). Myopia was the only refractive error listed for 98 genes and hyperopia the only refractive error noted for 28 genes, with the remaining 28 genes linked to phenotypes with multiple forms of refractive error. Pathway analysis was carried out to find biological processes overrepresented within these sets of genes. Genetic variants located within 50 kb of the 119 myopia-related genes were evaluated for involvement in refractive error by analysis of summary statistics from genome-wide association studies (GWAS) conducted by the CREAM Consortium and 23andMe, using both single-marker and gene-based tests. Results Pathway analysis identified several biological processes already implicated in refractive error development through prior GWAS analyses and animal studies, including extracellular matrix remodeling, focal adhesion, and axon guidance, supporting the research hypothesis. Novel pathways also implicated in myopia development included mannosylation, glycosylation, lens development, gliogenesis, and Schwann cell differentiation. Hyperopia was found to be linked to a different pattern of biological processes, mostly related to organogenesis. Comparison with GWAS findings further confirmed that syndromic myopia genes were enriched for genetic variants that influence refractive errors in the general population.
Gene-based analyses implicated 21 novel candidate myopia genes (ADAMTS18, ADAMTS2, ADAMTSL4, AGK, ALDH18A1, ASXL1, COL4A1, COL9A2, ERBB3, FBN1, GJA1, GNPTG, IFIH1, KIF11, LTBP2, OCA2, POLR3B, POMT1, PTPN11, TFAP2A, ZNF469). Conclusions Common genetic variants within or nearby genes that cause syndromic myopia are enriched for variants that cause nonsyndromic, common myopia. Analysis of syndromic forms of refractive errors can provide new insights into the etiology of myopia and additional potential targets for therapeutic interventions. PMID:29346494

  19. Genetics of Combined Pituitary Hormone Deficiency: Roadmap into the Genome Era.

    PubMed

    Fang, Qing; George, Akima S; Brinkmeier, Michelle L; Mortensen, Amanda H; Gergics, Peter; Cheung, Leonard Y M; Daly, Alexandre Z; Ajmal, Adnan; Pérez Millán, María Ines; Ozel, A Bilge; Kitzman, Jacob O; Mills, Ryan E; Li, Jun Z; Camper, Sally A

    2016-12-01

    The genetic basis for combined pituitary hormone deficiency (CPHD) is complex, involving 30 genes in a variety of syndromic and nonsyndromic presentations. Molecular diagnosis of this disorder is valuable for predicting disease progression, avoiding unnecessary surgery, and family planning. We expect that the application of high throughput sequencing will uncover additional contributing genes and eventually become a valuable tool for molecular diagnosis. For example, in the last 3 years, six new genes have been implicated in CPHD using whole-exome sequencing. In this review, we present a historical perspective on gene discovery for CPHD and predict approaches that may facilitate future gene identification projects conducted by clinicians and basic scientists. Guidelines for systematic reporting of genetic variants and assigning causality are emerging. We apply these guidelines retrospectively to reports of the genetic basis of CPHD and summarize modes of inheritance and penetrance for each of the known genes. In recent years, there have been great improvements in databases of genetic information for diverse populations. Some issues remain that make molecular diagnosis challenging in some cases. These include the inherent genetic complexity of this disorder, technical challenges like uneven coverage, differing results from variant calling and interpretation pipelines, the number of tolerated genetic alterations, and imperfect methods for predicting pathogenicity. We discuss approaches for future research in the genetics of CPHD.

  20. Variant translocation partners of the anaplastic lymphoma kinase (ALK) gene in two cases of anaplastic large cell lymphoma, identified by inverse cDNA polymerase chain reaction.

    PubMed

    Takeoka, Kayo; Okumura, Atsuko; Honjo, Gen; Ohno, Hitoshi

    2014-01-01

    In anaplastic large cell lymphoma (ALCL), the anaplastic lymphoma kinase (ALK) gene is rearranged with diverse partners due to variant translocations/inversions. Case 1 was a 39-year-old man who developed multiple tumors in the mediastinum, psoas muscle, lung, and lymph nodes. A biopsy specimen of the inguinal node was effaced by large tumor cells expressing CD30, epithelial membrane antigen, and cytoplasmic ALK, which led to a diagnosis of ALK(+) ALCL. Case 2 was a 51-year-old man who was initially diagnosed with undifferentiated carcinoma. He developed multiple skin tumors eight years after his initial presentation, and was finally diagnosed with ALK(+) ALCL. He died of therapy-related acute myeloid leukemia. G-banding and fluorescence in situ hybridization using an ALK break-apart probe revealed the rearrangement of ALK and suggested variant translocation in both cases. We applied an inverse cDNA polymerase chain reaction (PCR) strategy to identify the partner of ALK. Nucleotide sequencing of the PCR products and a database search revealed that the sequences of ATIC in case 1 and TRAF1 in case 2 appeared to follow those of ALK. We subsequently confirmed ATIC-ALK and TRAF1-ALK fusions by reverse transcriptase PCR and nucleotide sequencing. We successfully determined the partner gene of ALK in two cases of ALK(+) ALCL. ATIC is the second most common partner of variant ALK rearrangements, while the TRAF1-ALK fusion gene was first reported in 2013, and this is the second reported case of ALK(+) ALCL carrying TRAF1-ALK.

  1. GETPrime 2.0: gene- and transcript-specific qPCR primers for 13 species including polymorphisms

    PubMed Central

    David, Fabrice P.A.; Rougemont, Jacques; Deplancke, Bart

    2017-01-01

    GETPrime (http://bbcftools.epfl.ch/getprime) is a database with a web frontend providing gene- and transcript-specific, pre-computed qPCR primer pairs. The primers have been optimized for genome-wide specificity and for allowing the selective amplification of one or several splice variants of most known genes. To ease selection, primers have also been ranked according to defined criteria such as genome-wide specificity (with BLAST), amplicon size, and isoform coverage. Here, we report a major upgrade (2.0) of the database: eight new species (yeast, chicken, macaque, chimpanzee, rat, platypus, pufferfish, and Anolis carolinensis) now complement the five already included in the previous version (human, mouse, zebrafish, fly, and worm). Furthermore, the genomic reference has been updated to Ensembl v81 (while keeping earlier versions for backward compatibility) as a result of re-designing the back-end database and automating the import of relevant sections of the Ensembl database in species-independent fashion. This also allowed us to map known polymorphisms to the primers (on average three per primer for human), with the aim of reducing experimental error when targeting specific strains or individuals. Another consequence is that the inclusion of future Ensembl releases and other species has now become a relatively straightforward task. PMID:28053161

  2. amamutdb.no: A relational database for MAN2B1 allelic variants that compiles genotypes, clinical phenotypes, and biochemical and structural data of mutant MAN2B1 in α-mannosidosis.

    PubMed

    Riise Stensland, Hilde Monica Frostad; Frantzen, Gabrio; Kuokkanen, Elina; Buvang, Elisabeth Kjeldsen; Klenow, Helle Bagterp; Heikinheimo, Pirkko; Malm, Dag; Nilssen, Øivind

    2015-06-01

    α-Mannosidosis is an autosomal recessive lysosomal storage disorder caused by mutations in the MAN2B1 gene, encoding lysosomal α-mannosidase. The disorder is characterized by a range of clinical phenotypes of which the major manifestations are mental impairment, hearing impairment, skeletal changes, and immunodeficiency. Here, we report an α-mannosidosis mutation database, amamutdb.no, which has been constructed as a publicly accessible online resource for recording and analyzing MAN2B1 variants (http://amamutdb.no). Our aim has been to offer structured and relational information on MAN2B1 mutations and genotypes along with associated clinical phenotypes. Classifying missense mutations, as pathogenic or benign, is a challenge. Therefore, they have been given special attention as we have compiled all available data that relate to their biochemical, functional, and structural properties. The α-mannosidosis mutation database is comprehensive and relational in the sense that information can be retrieved and compiled across datasets; hence, it will facilitate diagnostics and increase our understanding of the clinical and molecular aspects of α-mannosidosis. We believe that the amamutdb.no structure and architecture will be applicable for the development of databases for any monogenic disorder. © 2015 WILEY PERIODICALS, INC.

  3. Lack of association between the P413L variant of chromogranin B and ALS risk or age at onset: a meta-analysis.

    PubMed

    Yang, Xinglong; Li, Shimei; Xing, Dongmei; Li, Peiyun; Li, Ci; Qi, Ling; Xu, Yanming; Ren, Hui

    2018-02-01

    Amyotrophic lateral sclerosis (ALS), the most common motor neuron disease, is thought to result from interaction of genetic and environmental risk factors. Whether the potentially functional exonic P413L variant in the chromogranin B gene influences ALS risk and age at onset is controversial. We meta-analysed studies assessing the association between the P413L variant and ALS risk or age at ALS onset that were indexed in the Web of Science, PubMed, Embase, Chinese National Knowledge Infrastructure, Wanfang, and SinoMed databases. Five case-control studies were analysed, involving 2639 patients with sporadic ALS, 201 with familial ALS and 3381 controls. No association was detected between risk of either ALS type and the CT + TT genotype or T-allele of the P413L variant. Age at ALS onset was similar between carriers and non-carriers of the T-allele. The available evidence suggests that the P413L variant of chromogranin B is not associated with ALS risk or age at ALS onset. These results should be validated in large, well-designed studies.

  4. Candidate genetic pathways for attention-deficit/hyperactivity disorder (ADHD) show association to hyperactive/impulsive symptoms in children with ADHD.

    PubMed

    Bralten, Janita; Franke, Barbara; Waldman, Irwin; Rommelse, Nanda; Hartman, Catharina; Asherson, Philip; Banaschewski, Tobias; Ebstein, Richard P; Gill, Michael; Miranda, Ana; Oades, Robert D; Roeyers, Herbert; Rothenberger, Aribert; Sergeant, Joseph A; Oosterlaan, Jaap; Sonuga-Barke, Edmund; Steinhausen, Hans-Christoph; Faraone, Stephen V; Buitelaar, Jan K; Arias-Vásquez, Alejandro

    2013-11-01

    Because multiple genes with small effect sizes are assumed to play a role in attention-deficit/hyperactivity disorder (ADHD) etiology, considering multiple variants within the same analysis likely increases the total explained phenotypic variance, thereby boosting the power of genetic studies. This study investigated whether pathway-based analysis could bring scientists closer to unraveling the biology of ADHD. The pathway was described as a predefined gene selection based on a well-established database or literature data. Common genetic variants in pathways involved in dopamine/norepinephrine and serotonin neurotransmission and genes involved in neuritic outgrowth were investigated in cases from the International Multicentre ADHD Genetics (IMAGE) study. Multivariable analysis was performed to combine the effects of single genetic variants within the pathway genes. Phenotypes were DSM-IV symptom counts for inattention and hyperactivity/impulsivity (n = 871) and symptom severity measured with the Conners Parent (n = 930) and Teacher (n = 916) Rating Scales. Summing genetic effects of common genetic variants within the pathways showed a significant association with hyperactive/impulsive symptoms (empirical p = .007) but not with inattentive symptoms (empirical p = .73). Analysis of parent-rated Conners hyperactive/impulsive symptom scores validated this result (empirical p = .0018). Teacher-rated Conners scores were not associated. Post hoc analyses showed a significant contribution of all pathways to the hyperactive/impulsive symptom domain (dopamine/norepinephrine, empirical p = .0004; serotonin, empirical p = .0149; neuritic outgrowth, empirical p = .0452). The present analysis shows an association between common variants in 3 genetic pathways and the hyperactive/impulsive component of ADHD. This study demonstrates that pathway-based association analyses, using quantitative measurements of ADHD symptom domains, can increase the power of genetic analyses to identify biological risk factors involved in this disorder. Copyright © 2013 American Academy of Child and Adolescent Psychiatry. Published by Elsevier Inc. All rights reserved.

  5. Canadian Open Genetics Repository (COGR): a unified clinical genomics database as a community resource for standardising and sharing genetic interpretations.

    PubMed

    Lerner-Ellis, Jordan; Wang, Marina; White, Shana; Lebo, Matthew S

    2015-07-01

    The Canadian Open Genetics Repository is a collaborative effort for the collection, storage, sharing and robust analysis of variants reported by medical diagnostics laboratories across Canada. As clinical laboratories adopt modern genomics technologies, the need for this type of collaborative framework is increasingly important. A survey to assess existing protocols for variant classification and reporting was delivered to clinical genetics laboratories across Canada. Based on feedback from this survey, a variant assessment tool was made available to all laboratories. Each participating laboratory was provided with an instance of GeneInsight, a software featuring versioning and approval processes for variant assessments and interpretations and allowing for variant data to be shared between instances. Guidelines were established for sharing data among clinical laboratories and in the final outreach phase, data will be made readily available to patient advocacy groups for general use. The survey demonstrated the need for improved standardisation and data sharing across the country. A variant assessment template was made available to the community to aid with standardisation. Instances of the GeneInsight tool were provided to clinical diagnostic laboratories across Canada for the purpose of uploading, transferring, accessing and sharing variant data. As an ongoing endeavour and a permanent resource, the Canadian Open Genetics Repository aims to serve as a focal point for the collaboration of Canadian laboratories with other countries in the development of tools that take full advantage of laboratory data in diagnosing, managing and treating genetic diseases. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  6. Genetic variability in ABCB1, occupational pesticide exposure, and Parkinson's disease.

    PubMed

    Narayan, Shilpa; Sinsheimer, Janet S; Paul, Kimberly C; Liew, Zeyan; Cockburn, Myles; Bronstein, Jeff M; Ritz, Beate

    2015-11-01

    Studies have suggested that variants in the ABCB1 gene encoding P-glycoprotein, a xenobiotic transporter, may increase susceptibility to pesticide exposures linked to Parkinson's Disease (PD) risk. We aimed to investigate the joint impact of two ABCB1 polymorphisms and pesticide exposures on PD risk. In a population-based case control study, we genotyped ABCB1 gene variants at rs1045642 (c.3435C/T) and rs2032582 (c.2677G/T/A) and assessed occupational exposures to organochlorine (OC) and organophosphorus (OP) pesticides based on self-reported occupational use and record-based ambient workplace exposures for 282 PD cases and 514 controls of European ancestry. We identified active ingredients in self-reported occupational use pesticides from a California database and estimated ambient workplace exposures between 1974 and 1999 employing a geographic information system together with records for state pesticide and land use. With unconditional logistic regression, we estimated marginal and joint contributions for occupational pesticide exposures and ABCB1 variants in PD. For occupationally exposed carriers of homozygous ABCB1 variant genotypes, we estimated odds ratios of 1.89 [95% confidence interval (CI): (0.87, 4.07)] to 3.71 [95% CI: (1.96, 7.02)], with the highest odds ratios estimated for occupationally exposed carriers of homozygous ABCB1 variant genotypes at both SNPs; but we found no multiplicative scale interactions. This study lends support to a previous report that commonly used pesticides, specifically OCs and OPs, and variant ABCB1 genotypes at two polymorphic sites jointly increase risk of PD. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. Whole-exome sequencing identifies novel candidate predisposition genes for familial polycythemia vera.

    PubMed

    Hirvonen, Elina A M; Pitkänen, Esa; Hemminki, Kari; Aaltonen, Lauri A; Kilpivaara, Outi

    2017-04-20

    Polycythemia vera (PV), characterized by massive production of erythrocytes, is one of the myeloproliferative neoplasms. Most patients carry a somatic gain-of-function mutation in JAK2, c.1849G > T (p.Val617Phe), leading to constitutive activation of JAK-STAT signaling pathway. Familial clustering is also observed occasionally, but high-penetrance predisposition genes to PV have remained unidentified. We studied the predisposition to PV by exome sequencing (three cases) in a Finnish PV family with four patients. The 12 shared variants (maximum allowed minor allele frequency <0.001 in Finnish population in ExAC database) predicted damaging in silico and absent in an additional control set of over 500 Finns were further validated by Sanger sequencing in a fourth affected family member. Three novel predisposition candidate variants were identified: c.1254C > G (p.Phe418Leu) in ZXDC, c.1931C > G (p.Pro644Arg) in ATN1, and c.701G > A (p.Arg234Gln) in LRRC3. We also observed a rare, predicted benign germline variant c.2912C > G (p.Ala971Gly) in BCORL1 in all four patients. Somatic mutations in BCORL1 have been reported in myeloid malignancies. We further screened the variants in eight PV patients in six other Finnish families, but no other carriers were found. Exome sequencing provides a powerful tool for the identification of novel variants, and understanding the familial predisposition of diseases. This is the first report on Finnish familial PV cases, and we identified three novel candidate variants that may predispose to the disease.

  8. Investigation of mutations in the HBB gene using the 1,000 genomes database.

    PubMed

    Carlice-Dos-Reis, Tânia; Viana, Jaime; Moreira, Fabiano Cordeiro; Cardoso, Greice de Lemos; Guerreiro, João; Santos, Sidney; Ribeiro-Dos-Santos, Ândrea

    2017-01-01

    Mutations in the HBB gene are responsible for several serious hemoglobinopathies, such as sickle cell anemia and β-thalassemia. Sickle cell anemia is one of the most common monogenic diseases worldwide. Due to its prevalence, diverse strategies have been developed for a better understanding of its molecular mechanisms. In silico analysis has been increasingly used to investigate the genotype-phenotype relationship of many diseases, and the sequences of healthy individuals deposited in the 1,000 Genomes database appear to be an excellent tool for such analysis. The objective of this study is to analyze the variations in the HBB gene in the 1,000 Genomes database, to describe the mutation frequencies in the different population groups, and to investigate the pattern of pathogenicity. The computational tool SNPEFF was used to align the data from 2,504 samples of the 1,000 Genomes database with the HG19 genome reference. The pathogenicity of each amino acid change was investigated using the databases CLINVAR, dbSNP and HbVar and five different predictors. Twenty different mutations were found in 209 healthy individuals. The African group had the highest number of individuals with mutations, and the European group had the lowest number. Thus, it is concluded that approximately 8.3% of phenotypically healthy individuals from the 1,000 Genomes database have some mutation in the HBB gene. The frequency of mutated genes was estimated at 0.042, so that the expected frequency of being homozygous or compound heterozygous for these variants in the next generation is approximately 0.002. In total, 193 subjects had a non-synonymous mutation, of whom 186 (7.4%) had a deleterious mutation. Considering that the 1,000 Genomes database is representative of the world's population, it can be estimated that fourteen out of every 10,000 individuals in the world will have a hemoglobinopathy in the next generation.
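
    The carrier fraction and the expected two-allele frequency quoted in this abstract can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes random mating (Hardy-Weinberg proportions), which the abstract implies but does not state, and the variable names are hypothetical.

```python
# Figures quoted in the abstract.
carriers = 209        # healthy individuals with at least one HBB mutation
samples = 2504        # 1,000 Genomes samples analysed
allele_freq = 0.042   # reported frequency of mutated HBB alleles

# Share of sampled individuals carrying some HBB mutation.
carrier_fraction = carriers / samples

# Assumption: under random mating, the chance of inheriting two mutated
# alleles (homozygous or compound heterozygous) is the allele frequency squared.
two_mutated_alleles = allele_freq ** 2

print(f"carrier fraction: {carrier_fraction:.1%}")
print(f"expected two-allele frequency: {two_mutated_alleles:.4f}")
```

    Under these assumptions the carrier fraction rounds to the reported 8.3%, and the squared allele frequency rounds to the reported 0.002.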

  9. Whole-exome sequencing in amyotrophic lateral sclerosis suggests NEK1 is a risk gene in Chinese.

    PubMed

    Gratten, Jacob; Zhao, Qiongyi; Benyamin, Beben; Garton, Fleur; He, Ji; Leo, Paul J; Mangelsdorf, Marie; Anderson, Lisa; Zhang, Zong-Hong; Chen, Lu; Chen, Xiang-Ding; Cremin, Katie; Deng, Hong-Weng; Edson, Janette; Han, Ying-Ying; Harris, Jessica; Henders, Anjali K; Jin, Zi-Bing; Li, Zhongshan; Lin, Yong; Liu, Xiaolu; Marshall, Mhairi; Mowry, Bryan J; Ran, Shu; Reutens, David C; Song, Sharon; Tan, Li-Jun; Tang, Lu; Wallace, Robyn H; Wheeler, Lawrie; Wu, Jinyu; Yang, Jian; Xu, Huji; Visscher, Peter M; Bartlett, Perry F; Brown, Matthew A; Wray, Naomi R; Fan, Dongsheng

    2017-11-17

    Amyotrophic lateral sclerosis (ALS) is a progressive neurological disease characterised by the degeneration of motor neurons, which are responsible for voluntary movement. There remains limited understanding of disease aetiology, with median survival of ALS of three years and no effective treatment. Identifying genes that contribute to ALS susceptibility is an important step towards understanding aetiology. The vast majority of published human genetic studies, including for ALS, have used samples of European ancestry. The importance of trans-ethnic studies in human genetic studies is widely recognised, yet a dearth of studies of non-European ancestries remains. Here, we report analyses of novel whole-exome sequencing (WES) data from Chinese ALS and control individuals. WES data were generated for 610 ALS cases and 460 controls drawn from Chinese populations. We assessed evidence for an excess of rare damaging mutations at the gene level and the gene set level, considering only singleton variants filtered to have allele frequency less than 5 × 10⁻⁵ in reference databases. To meta-analyse our results with a published study of European ancestry, we used a Cochran-Mantel-Haenszel test to compare gene-level variant counts in cases vs controls. No gene passed the genome-wide significance threshold with ALS in Chinese samples alone. Combining rare variant counts in Chinese with those from the largest WES study of European ancestry resulted in three genes surpassing genome-wide significance: TBK1 (p = 8.3 × 10⁻¹²), SOD1 (p = 8.9 × 10⁻⁹) and NEK1 (p = 1.1 × 10⁻⁹). In the Chinese data alone, SOD1 and NEK1 were nominally significantly associated with ALS (p = 0.04 and p = 7 × 10⁻³, respectively) and the case/control frequencies of rare coding variants in these genes were similar in Chinese and Europeans (SOD1: 1.5%/0.2% vs 0.9%/0.1%, NEK1 1.8%/0.4% vs 1.9%/0.8%). This was also true for TBK1 (1.2%/0.2% vs 1.4%/0.4%), but the association with ALS in Chinese was not significant (p = 0.14). While SOD1 is already recognised as an ALS-associated gene in Chinese, we provide novel evidence for association of NEK1 with ALS in Chinese, reporting variants in these genes not previously found in Europeans.
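
    A Cochran-Mantel-Haenszel test of the kind named in this abstract combines one 2x2 carrier table per ancestry stratum into a single statistic. The sketch below is a minimal textbook version without continuity correction; the function name and the table layout are hypothetical, and it is not the authors' pipeline.

```python
import math


def cmh_test(strata):
    """Cochran-Mantel-Haenszel test over a list of 2x2 tables.

    Each stratum is a tuple (case_carriers, case_noncarriers,
    control_carriers, control_noncarriers).
    Returns (pooled_odds_ratio, chi_square, p_value); no continuity correction.
    """
    diff_sum = var_sum = or_num = or_den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        # Expected carrier count among cases if carrier status and disease are independent.
        expected = (a + b) * (a + c) / n
        # Hypergeometric variance of the case-carrier cell.
        variance = (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
        diff_sum += a - expected
        var_sum += variance
        # Mantel-Haenszel pooled odds ratio terms.
        or_num += a * d / n
        or_den += b * c / n
    chi_square = diff_sum ** 2 / var_sum
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return or_num / or_den, chi_square, p_value


# Hypothetical counts for two ancestry strata; NOT taken from the study.
example_strata = [(12, 588, 3, 597), (40, 2000, 15, 2500)]
pooled_or, chi_square, p_value = cmh_test(example_strata)
print(pooled_or, chi_square, p_value)
```

    Stratifying by ancestry in this way keeps differences in baseline carrier frequency between populations from being mistaken for a case-control difference.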

  10. Using diverse U.S. beef cattle genomes to identify missense mutations in EPAS1, a gene associated with pulmonary hypertension

    PubMed Central

    Heaton, Michael P.; Smith, Timothy P.L.; Carnahan, Jacky K.; Basnayake, Veronica; Qiu, Jiansheng; Simpson, Barry; Kalbfleisch, Theodore S.

    2016-01-01

    The availability of whole genome sequence (WGS) data has made it possible to discover protein variants in silico. However, existing bovine WGS databases do not show data in a form conducive to protein variant analysis, and tend to underrepresent the breadth of genetic diversity in global beef cattle. Thus, our first aim was to use 96 beef sires, sharing minimal pedigree relationships, to create a searchable and publicly viewable set of mapped genomes relevant for 19 popular breeds of U.S. cattle. Our second aim was to identify protein variants encoded by the bovine endothelial PAS domain-containing protein 1 gene (EPAS1), a gene associated with pulmonary hypertension in Angus cattle. The identity and quality of genomic sequences were verified by comparing WGS genotypes to those derived from other methods. The average read depth, genotype scoring rate, and genotype accuracy exceeded 14, 99%, and 99%, respectively. The 96 genomes were used to discover four amino acid variants encoded by EPAS1 (E270Q, P362L, A671G, and L701F) and confirm two variants previously associated with disease (A606T and G610S). The six EPAS1 missense mutations were verified with matrix-assisted laser desorption/ionization time-of-flight mass spectrometry assays, and their frequencies were estimated in a separate collection of 1154 U.S. cattle representing 46 breeds. A rooted phylogenetic tree of eight polypeptide sequences provided a framework for evaluating the likely order of mutations and potential impact of EPAS1 alleles on the adaptive response to chronic hypoxia in U.S. cattle. This public, whole genome resource facilitates in silico identification of protein variants in diverse types of U.S. beef cattle, and provides a means of translating WGS data into a practical biological and evolutionary context for generating and testing hypotheses. PMID:27746904

  11. The Plant Structure Ontology, a Unified Vocabulary of Anatomy and Morphology of a Flowering Plant

    PubMed Central

    Ilic, Katica; Kellogg, Elizabeth A.; Jaiswal, Pankaj; Zapata, Felipe; Stevens, Peter F.; Vincent, Leszek P.; Avraham, Shulamit; Reiser, Leonore; Pujar, Anuradha; Sachs, Martin M.; Whitman, Noah T.; McCouch, Susan R.; Schaeffer, Mary L.; Ware, Doreen H.; Stein, Lincoln D.; Rhee, Seung Y.

    2007-01-01

    Formal description of plant phenotypes and standardized annotation of gene expression and protein localization data require uniform terminology that accurately describes plant anatomy and morphology. This facilitates cross species comparative studies and quantitative comparison of phenotypes and expression patterns. A major drawback is variable terminology that is used to describe plant anatomy and morphology in publications and genomic databases for different species. The same terms are sometimes applied to different plant structures in different taxonomic groups. Conversely, similar structures are named by their species-specific terms. To address this problem, we created the Plant Structure Ontology (PSO), the first generic ontological representation of anatomy and morphology of a flowering plant. The PSO is intended for a broad plant research community, including bench scientists, curators in genomic databases, and bioinformaticians. The initial releases of the PSO integrated existing ontologies for Arabidopsis (Arabidopsis thaliana), maize (Zea mays), and rice (Oryza sativa); more recent versions of the ontology encompass terms relevant to Fabaceae, Solanaceae, additional cereal crops, and poplar (Populus spp.). Databases such as The Arabidopsis Information Resource, Nottingham Arabidopsis Stock Centre, Gramene, MaizeGDB, and SOL Genomics Network are using the PSO to describe expression patterns of genes and phenotypes of mutants and natural variants and are regularly contributing new annotations to the Plant Ontology database. The PSO is also used in specialized public databases, such as BRENDA, GENEVESTIGATOR, NASCArrays, and others. Over 10,000 gene annotations and phenotype descriptions from participating databases can be queried and retrieved using the Plant Ontology browser. The PSO, as well as contributed gene associations, can be obtained at www.plantontology.org. PMID:17142475

  12. Patterns of Novel Alleles and Genotype/Phenotype Correlations Resulting from the Analysis of 108 Previously Undetected Mutations in Patients Affected by Neurofibromatosis Type I

    PubMed Central

    Bonatti, Francesco; Adorni, Alessia; Matichecchia, Annalisa; Mozzoni, Paola; Uliana, Vera; Pisani, Francesco; Garavelli, Livia; Graziano, Claudio; Gnoli, Maria; Bigoni, Stefania; Boschi, Elena; Martorana, Davide; Percesepe, Antonio

    2017-01-01

    Neurofibromatosis type I, a genetic disorder due to mutations in the NF1 gene, is characterized by a high mutation rate (about 50% of the cases are de novo) but, with the exception of whole gene deletions associated with a more severe phenotype, no specific hotspots and few solid genotype/phenotype correlations. After retrospectively re-evaluating all NF1 gene variants found in the diagnostic activity, we studied 108 patients affected by neurofibromatosis type I who harbored mutations that had not been previously reported in the international databases, with the aim of analyzing their type and distribution along the gene and of correlating them with the phenotypic features of the affected patients. Out of the 108 previously unreported variants, 14 were inherited from one of the affected parents and 94 were de novo. Twenty-nine (26.9%) mutations were of uncertain significance, whereas 79 (73.2%) were predicted as pathogenic or probably pathogenic. No differential distribution in the exons or in the protein domains was observed and no statistically significant genotype/phenotype correlation was found, confirming previous evidence. PMID:28961165

  13. Germline PARP4 mutations in patients with primary thyroid and breast cancers.

    PubMed

    Ikeda, Yuji; Kiyotani, Kazuma; Yew, Poh Yin; Kato, Taigo; Tamura, Kenji; Yap, Kai Lee; Nielsen, Sarah M; Mester, Jessica L; Eng, Charis; Nakamura, Yusuke; Grogan, Raymon H

    2016-03-01

    Germline mutations in the PTEN gene, which cause Cowden syndrome, are known to be one of the genetic factors for primary thyroid and breast cancers; however, PTEN mutations are found in only a small subset of research participants with non-syndrome breast and thyroid cancers. In this study, we aimed to identify germline variants that may be related to genetic risk of primary thyroid and breast cancers. Genomic DNAs extracted from peripheral blood of 14 PTEN WT female research participants with primary thyroid and breast cancers were analyzed by whole-exome sequencing. Gene-based case-control association analysis using the information of 406 Europeans obtained from the 1000 Genomes Project database identified 34 genes possibly associated with the phenotype with P < 1.0 × 10⁻³. Among them, rare variants in the PARP4 gene were detected at significantly high frequency (odds ratio = 5.2; P = 1.0 × 10⁻⁵). The variants, G496V and T1170I, were found in six of the 14 study participants (43%) while their frequencies were only 0.5% in controls. Functional analysis using HCC1143 cell line showed that knockdown of PARP4 with siRNA significantly enhanced the cell proliferation, compared with the cells transfected with siControl (P = 0.02). Kaplan-Meier analysis using Gene Expression Omnibus (GEO), European Genome-phenome Archive (EGA) and The Cancer Genome Atlas (TCGA) datasets showed poor relapse-free survival (P < 0.001, Hazard ratio 1.27) and overall survival (P = 0.006, Hazard ratio 1.41) in a PARP4 low-expression group, suggesting that PARP4 may function as a tumor suppressor. In conclusion, we identified PARP4 as a possible susceptibility gene of primary thyroid and breast cancer. © 2016 Society for Endocrinology.

  14. Genetic variant rs17225178 in the ARNT2 gene is associated with Asperger Syndrome.

    PubMed

    Di Napoli, Agnese; Warrier, Varun; Baron-Cohen, Simon; Chakrabarti, Bhismadev

    2015-01-01

    Autism Spectrum Conditions (ASC) are neurodevelopmental conditions characterized by difficulties in communication and social interaction, alongside unusually repetitive behaviours and narrow interests. Asperger Syndrome (AS) is one subgroup of ASC and differs from classic autism in that in AS there is no language or general cognitive delay. Genetic, epigenetic and environmental factors are implicated in ASC and genes involved in neural connectivity and neurodevelopment are good candidates for studying the susceptibility to ASC. The aryl-hydrocarbon receptor nuclear translocator 2 (ARNT2) gene encodes a transcription factor involved in neurodevelopmental processes, neuronal connectivity and cellular responses to hypoxia. A mutation in this gene has been identified in individuals with ASC and single nucleotide polymorphisms (SNPs) have been nominally associated with AS and autistic traits in previous studies. In this study, we tested 34 SNPs in ARNT2 for association with AS in 118 cases and 412 controls of Caucasian origin. P values were adjusted for multiple comparisons, and linkage disequilibrium (LD) among the SNPs analysed was calculated in our sample. Finally, SNP annotation allowed functional and structural analyses of the genetic variants in ARNT2. We tested the replicability of our result using the genome-wide association studies (GWAS) database of the Psychiatric Genomics Consortium (PGC). We report statistically significant association of rs17225178 with AS. This SNP modifies transcription factor binding sites and regions that regulate the chromatin state in neural cell lines. It is also included in a LD block in our sample, alongside other genetic variants that alter chromatin regulatory regions in neural cells. These findings demonstrate that rs17225178 in the ARNT2 gene is associated with AS and support previous studies that pointed out an involvement of this gene in the predisposition to ASC.

  15. Regulators of Androgen Action Resource: a one-stop shop for the comprehensive study of androgen receptor action.

    PubMed

    DePriest, Adam D; Fiandalo, Michael V; Schlanger, Simon; Heemers, Frederike; Mohler, James L; Liu, Song; Heemers, Hannelore V

    2016-01-01

    Androgen receptor (AR) is a ligand-activated transcription factor that is the main target for treatment of non-organ-confined prostate cancer (CaP). Failure of life-prolonging AR-targeting androgen deprivation therapy is due to flexibility in steroidogenic pathways that control intracrine androgen levels and variability in the AR transcriptional output. Androgen biosynthesis enzymes, androgen transporters and AR-associated coregulators are attractive novel CaP treatment targets. These proteins, however, are characterized by multiple transcript variants and isoforms, are subject to genomic alterations, and are differentially expressed among CaPs. Determining their therapeutic potential requires evaluation of extensive, diverse datasets that are dispersed over multiple databases, websites and literature reports. Mining and integrating these datasets are cumbersome, time-consuming tasks and provide only snapshots of relevant information. To overcome this impediment to effective, efficient study of AR and potential drug targets, we developed the Regulators of Androgen Action Resource (RAAR), a non-redundant, curated and user-friendly searchable web interface. RAAR centralizes information on gene function, clinical relevance, and resources for 55 genes that encode proteins involved in biosynthesis, metabolism and transport of androgens and for 274 AR-associated coregulator genes. Data in RAAR are organized in two levels: (i) Information pertaining to production of androgens is contained in a 'pre-receptor level' database, and coregulator gene information is provided in a 'post-receptor level' database, and (ii) an 'other resources' database contains links to additional databases that are complementary to and useful to pursue further the information provided in RAAR. For each of its 329 entries, RAAR provides access to more than 20 well-curated publicly available databases, and thus, access to thousands of data points. Hyperlinks provide direct access to gene-specific entries in the respective database(s). RAAR is a novel, freely available resource that provides fast, reliable and easy access to integrated information that is needed to develop alternative CaP therapies. Database URL: http://www.lerner.ccf.org/cancerbio/heemers/RAAR/search/. © The Author(s) 2016. Published by Oxford University Press.

  16. Characterization of pathogenic SORL1 genetic variants for association with Alzheimer’s disease: a clinical interpretation strategy

    PubMed Central

    Holstege, Henne; van der Lee, Sven J; Hulsman, Marc; Wong, Tsz Hang; van Rooij, Jeroen GJ; Weiss, Marjan; Louwersheimer, Eva; Wolters, Frank J; Amin, Najaf; Uitterlinden, André G; Hofman, Albert; Ikram, M Arfan; van Swieten, John C; Meijers-Heijboer, Hanne; van der Flier, Wiesje M; Reinders, Marcel JT; van Duijn, Cornelia M; Scheltens, Philip

    2017-01-01

    Accumulating evidence suggests that genetic variants in the SORL1 gene are associated with Alzheimer disease (AD), but a strategy to identify which variants are pathogenic is lacking. In a discovery sample of 115 SORL1 variants detected in 1908 Dutch AD cases and controls, we identified the variant characteristics associated with SORL1 variant pathogenicity. Findings were replicated in an independent sample of 103 SORL1 variants detected in 3193 AD cases and controls. In a combined sample of the discovery and replication samples, comprising 181 unique SORL1 variants, we developed a strategy to classify SORL1 variants into five subtypes ranging from pathogenic to benign. We tested this pathogenicity screen in SORL1 variants reported in two independent published studies. SORL1 variant pathogenicity is defined by the Combined Annotation Dependent Depletion (CADD) score and the minor allele frequency (MAF) reported by the Exome Aggregation Consortium (ExAC) database. Variants predicted strongly damaging (CADD score >30), which are extremely rare (ExAC-MAF < 1 × 10⁻⁵) increased AD risk by 12-fold (95% CI 4.2–34.3; P = 5 × 10⁻⁹). Protein-truncating SORL1 mutations were all unknown to ExAC and occurred exclusively in AD cases. More common SORL1 variants (ExAC-MAF ≥ 1 × 10⁻⁵) were not associated with increased AD risk, even when predicted strongly damaging. Findings were independent of gender and the APOE-ε4 allele. High-risk SORL1 variants were observed in a substantial proportion of the AD cases analyzed (2%). Based on their effect size, we propose to consider high-risk SORL1 variants next to variants in APOE, PSEN1, PSEN2 and APP for personalized risk assessments in clinical practice. PMID:28537274
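
    The two cut-offs quoted in this abstract (CADD score above 30 and ExAC MAF below 1 × 10⁻⁵) can be expressed as a simple screen. The sketch below is illustrative only: the function name is hypothetical, treating a variant absent from ExAC as extremely rare is an assumption, and the full five-subtype classification is not described in the abstract and is not reproduced here.

```python
def is_high_risk_sorl1(cadd_score, exac_maf):
    """Apply the two thresholds quoted in the abstract to one variant.

    cadd_score: CADD score of the variant.
    exac_maf:   ExAC minor allele frequency, or None if the variant is
                absent from ExAC (assumption: absence counts as extremely rare).
    """
    strongly_damaging = cadd_score > 30
    extremely_rare = exac_maf is None or exac_maf < 1e-5
    return strongly_damaging and extremely_rare


# Hypothetical variants, for illustration only.
print(is_high_risk_sorl1(35.0, 5e-6))   # damaging and extremely rare
print(is_high_risk_sorl1(35.0, 2e-4))   # damaging but more common
print(is_high_risk_sorl1(22.0, None))   # absent from ExAC but not strongly damaging
```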

  17. EFHC1 variants in juvenile myoclonic epilepsy: reanalysis according to NHGRI and ACMG guidelines for assigning disease causality.

    PubMed

    Bailey, Julia N; Patterson, Christopher; de Nijs, Laurence; Durón, Reyna M; Nguyen, Viet-Huong; Tanaka, Miyabi; Medina, Marco T; Jara-Prado, Aurelio; Martínez-Juárez, Iris E; Ochoa, Adriana; Molina, Yolli; Suzuki, Toshimitsu; Alonso, María E; Wight, Jenny E; Lin, Yu-Chen; Guilhoto, Laura; Targas Yacubian, Elza Marcia; Machado-Salas, Jesús; Daga, Andrea; Yamakawa, Kazuhiro; Grisar, Thierry M; Lakaye, Bernard; Delgado-Escueta, Antonio V

    2017-02-01

    EFHC1 variants are the most common mutations in inherited myoclonic and grand mal clonic-tonic-clonic (CTC) convulsions of juvenile myoclonic epilepsy (JME). We reanalyzed 54 EFHC1 variants associated with epilepsy from 17 cohorts based on National Human Genome Research Institute (NHGRI) and American College of Medical Genetics and Genomics (ACMG) guidelines for interpretation of sequence variants. We calculated Bayesian LOD scores for variants in coinheritance, unconditional exact tests and odds ratios (OR) in case-control associations, allele frequencies in genome databases, and predictions for conservation/pathogenicity. We reviewed whether variants damage EFHC1 functions, whether efhc1-/- KO mice recapitulate CTC convulsions and "microdysgenesis" neuropathology, and whether supernumerary synaptic and dendritic phenotypes can be rescued in the fly model when EFHC1 is overexpressed. We rated strengths of evidence and applied ACMG combinatorial criteria for classifying variants. Nine variants were classified as "pathogenic," 14 as "likely pathogenic," 9 as "benign," and 2 as "likely benign." Twenty variants of unknown significance had an insufficient number of ancestry-matched controls, but ORs exceeded 5 when compared with racial/ethnic-matched Exome Aggregation Consortium (ExAC) controls. NHGRI gene-level evidence and variant-level evidence establish EFHC1 as the first non-ion channel microtubule-associated protein whose mutations disturb R-type VDCC and TRPM2 calcium currents in overgrown synapses and dendrites within abnormally migrated dislocated neurons, thus explaining CTC convulsions and "microdysgenesis" neuropathology of JME. Genet Med 19(2), 144-156.

  18. Next generation sequencing gives an insight into the characteristics of highly selected breeds versus non-breed horses in the course of domestication.

    PubMed

    Metzger, Julia; Tonda, Raul; Beltran, Sergi; Agueda, Lídia; Gut, Marta; Distl, Ottmar

    2014-07-04

    Domestication has shaped the horse and led to a group of many different types. Some have been under strong human selection while others developed in close relationship with nature. The aim of our study was to perform next generation sequencing of breed and non-breed horses to provide an insight into genetic influences on selective forces. Whole genome sequencing of five horses of four different populations revealed 10,193,421 single nucleotide polymorphisms (SNPs) and 1,361,948 insertion/deletion polymorphisms (indels). In comparison to horse variant databases and previous reports, we were able to identify 3,394,883 novel SNPs and 868,525 novel indels. We analyzed the distribution of individual variants and found significant enrichment of private mutations in coding regions of genes involved in primary metabolic processes, anatomical structures, morphogenesis and cellular components in non-breed horses and, in contrast, of private mutations in genes affecting cell communication, lipid metabolic process, neurological system process, muscle contraction, ion transport, developmental processes of the nervous system and ectoderm in breed horses. Our next generation sequencing data constitute an important first step for the characterization of non-breed in comparison to breed horses and provide a large number of novel variants for future analyses. Functional annotations suggest specific variants that could play a role for the characterization of breed or non-breed horses.

  19. A SImplified method for Segregation Analysis (SISA) to determine penetrance and expression of a genetic variant in a family.

    PubMed

    Møller, Pål; Clark, Neal; Mæhle, Lovise

    2011-05-01

    A method for SImplified rapid Segregation Analysis (SISA) to assess penetrance and expression of genetic variants in pedigrees of any complexity is presented. For this purpose the probability for recombination between the variant and the gene is zero. An assumption is that the variant of undetermined significance (VUS) is introduced into the family once only. If so, all family members in between two members demonstrated to carry a VUS, are obligate carriers. Probabilities for cosegregation of disease and VUS by chance, penetrance, and expression, may be calculated. SISA return values do not include person identifiers and need no explicit informed consent. There will be no ethical complications in submitting SISA return values to central databases. Values for several families may be combined. Values for a family may be updated by the contributor. SISA is used to consider penetrance whenever sequencing demonstrates a VUS in the known cancer-predisposing genes. Any family structure at hand in a genetic clinic may be used. One may include an extended lineage in a family through demonstrating the same VUS in a distant relative, and thereby identifying all obligate carriers in between. Such extension is a way to escape the selection biases through expanding the families outside the clusters used to select the families. © 2011 Wiley-Liss, Inc.

  20. Clinical testing of BRCA1 and BRCA2: a worldwide snapshot of technological practices.

    PubMed

    Toland, Amanda Ewart; Forman, Andrea; Couch, Fergus J; Culver, Julie O; Eccles, Diana M; Foulkes, William D; Hogervorst, Frans B L; Houdayer, Claude; Levy-Lahad, Ephrat; Monteiro, Alvaro N; Neuhausen, Susan L; Plon, Sharon E; Sharan, Shyam K; Spurdle, Amanda B; Szabo, Csilla; Brody, Lawrence C

    2018-01-01

    Clinical testing of BRCA1 and BRCA2 began over 20 years ago. With the expiration and overturning of the BRCA patents, limitations on which laboratories could offer commercial testing were lifted. These legal changes occurred at approximately the same time as the widespread adoption of massively parallel sequencing (MPS) technologies. Little is known about how these changes impacted laboratory practices for detecting genetic alterations in hereditary breast and ovarian cancer genes. Therefore, we sought to examine current laboratory genetic testing practices for BRCA1/BRCA2. We employed an online survey of 65 questions covering four areas: laboratory characteristics, details on technological methods, variant classification, and client-support information. Eight United States (US) laboratories and 78 non-US laboratories completed the survey. Most laboratories (93%; 80/86) used MPS platforms to identify variants. Laboratories differed widely on: (1) technologies used for large rearrangement detection; (2) criteria for minimum read depths; (3) non-coding regions sequenced; (4) variant classification criteria and approaches; (5) testing volume ranging from 2 to 2.5 × 10^5 tests annually; and (6) deposition of variants into public databases. These data may be useful for national and international agencies to set recommendations for quality standards for BRCA1/BRCA2 clinical testing. These standards could also be applied to testing of other disease genes.

  1. Hb Midnapore [β53(D4)Ala→Val; HBB: c.161C>T]: A Novel Hemoglobin Variant with a Structural Abnormality Associated with IVS-I-5 (G>C) (HBB: c.92+5G>C) Found in a Bengali Indian Family.

    PubMed

    Panja, Amrita; Chowdhury, Prosanto; Basu, Anupam

    2016-09-01

    We describe a novel C>T substitution at codon 53 of the HBB gene (HBB: c.161C>T). The proband was a transfusion-dependent β-thalassemia major (β-TM) patient. DNA was extracted, and DNA sequencing was subsequently done to detect the mutations in the HBB gene. Capillary zone electrophoresis (CZE) revealed the presence of an unknown peak. She inherited this mutation from her grandmother through her mother. This mutation exists in cis with the common β⁰ mutation IVS-I-5 (G>C) (HBB: c.92+5G>C). The proband is homozygous for HBB: c.92+5G>C and needs monthly transfusions. On the other hand, her grandmother, mother and sister all possess this novel mutation in cis with the heterozygous HBB: c.92+5G>C. They are carriers, not thalassemic. This mutation produces the substitution β53(D4)Ala→Val; HBB: c.161C>T, a new structural hemoglobin (Hb) variant. As this variant was identified in a Bengali family from Paschim Midnapore district of West Bengal, India, it has been designated as Hb Midnapore. This variant has now been reported to the HbVar database.

  2. Principles and Recommendations for Standardizing the Use of the Next-Generation Sequencing Variant File in Clinical Settings.

    PubMed

    Lubin, Ira M; Aziz, Nazneen; Babb, Lawrence J; Ballinger, Dennis; Bisht, Himani; Church, Deanna M; Cordes, Shaun; Eilbeck, Karen; Hyland, Fiona; Kalman, Lisa; Landrum, Melissa; Lockhart, Edward R; Maglott, Donna; Marth, Gabor; Pfeifer, John D; Rehm, Heidi L; Roy, Somak; Tezak, Zivana; Truty, Rebecca; Ullman-Cullere, Mollie; Voelkerding, Karl V; Worthey, Elizabeth A; Zaranek, Alexander W; Zook, Justin M

    2017-05-01

    A national workgroup convened by the Centers for Disease Control and Prevention identified principles and made recommendations for standardizing the description of sequence data contained within the variant file generated during the course of clinical next-generation sequence analysis for diagnosing human heritable conditions. The specifications for variant files were initially developed to be flexible with regard to content representation to support a variety of research applications. This flexibility permits variation with regard to how sequence findings are described and this depends, in part, on the conventions used. For clinical laboratory testing, this poses a problem because these differences can compromise the capability to compare sequence findings among laboratories to confirm results and to query databases to identify clinically relevant variants. To provide for a more consistent representation of sequence findings described within variant files, the workgroup made several recommendations that considered alignment to a common reference sequence, variant caller settings, use of genomic coordinates, and gene and variant naming conventions. These recommendations were considered with regard to the existing variant file specifications presently used in the clinical setting. Adoption of these recommendations is anticipated to reduce the potential for ambiguity in describing sequence findings and facilitate the sharing of genomic data among clinical laboratories and other entities. Copyright © 2017 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  3. A chromosome-centric human proteome project (C-HPP) to characterize the sets of proteins encoded in chromosome 17.

    PubMed

    Liu, Suli; Im, Hogune; Bairoch, Amos; Cristofanilli, Massimo; Chen, Rui; Deutsch, Eric W; Dalton, Stephen; Fenyo, David; Fanayan, Susan; Gates, Chris; Gaudet, Pascale; Hincapie, Marina; Hanash, Samir; Kim, Hoguen; Jeong, Seul-Ki; Lundberg, Emma; Mias, George; Menon, Rajasree; Mu, Zhaomei; Nice, Edouard; Paik, Young-Ki; Uhlen, Mathias; Wells, Lance; Wu, Shiaw-Lin; Yan, Fangfei; Zhang, Fan; Zhang, Yue; Snyder, Michael; Omenn, Gilbert S; Beavis, Ronald C; Hancock, William S

    2013-01-04

    We report progress assembling the parts list for chromosome 17 and illustrate the various processes that we have developed to integrate available data from diverse genomic and proteomic knowledge bases. As primary resources, we have used GPMDB, neXtProt, PeptideAtlas, Human Protein Atlas (HPA), and GeneCards. All sites share the common resource of Ensembl for the genome modeling information. We have defined the chromosome 17 parts list with the following information: 1169 protein-coding genes, the numbers of proteins confidently identified by various experimental approaches as documented in GPMDB, neXtProt, PeptideAtlas, and HPA, examples of typical data sets obtained by RNASeq and proteomic studies of epithelial derived tumor cell lines (disease proteome) and a normal proteome (peripheral mononuclear cells), reported evidence of post-translational modifications, and examples of alternative splice variants (ASVs). We have constructed a list of the 59 "missing" proteins as well as 201 proteins that have inconclusive mass spectrometric (MS) identifications. In this report we have defined a process to establish a baseline for the incorporation of new evidence on protein identification and characterization as well as related information from transcriptome analyses. This initial list of "missing" proteins will guide the selection of appropriate samples for discovery studies as well as antibody reagents. Also we have illustrated the significant diversity of protein variants (including post-translational modifications, PTMs) using regions on chromosome 17 that contain important oncogenes. We emphasize the need for mandated deposition of proteomics data in public databases, the further development of improved PTM, ASV, and single nucleotide variant (SNV) databases, and the construction of Web sites that can integrate and regularly update such information.
    In addition, we describe the distribution of both clustered and scattered sets of protein families on the chromosome. Since chromosome 17 is rich in cancer-associated genes, we have focused on the clustering of cancer-associated genes in such genomic regions and have used the ERBB2 amplicon as an example of the value of a proteogenomic approach in which one integrates transcriptomic with proteomic information and captures evidence of coexpression through coordinated regulation.

  4. htsint: a Python library for sequencing pipelines that combines data through gene set generation.

    PubMed

    Richards, Adam J; Herrel, Anthony; Bonneaud, Camille

    2015-09-24

    Sequencing technologies provide a wealth of details in terms of genes, expression, splice variants, polymorphisms, and other features. A standard for sequencing analysis pipelines is to put genomic or transcriptomic features into a context of known functional information, but the relationships between ontology terms are often ignored. For RNA-Seq, considering genes and their genetic variants at the group level enables a convenient way to both integrate annotation data and detect small coordinated changes between experimental conditions, a known caveat of gene level analyses. We introduce the high throughput data integration tool, htsint, as an extension to the commonly used gene set enrichment frameworks. The central aim of htsint is to compile annotation information from one or more taxa in order to calculate functional distances among all genes in a specified gene space. Spectral clustering is then used to partition the genes, thereby generating functional modules. The gene space can range from a targeted list of genes, like a specific pathway, all the way to an ensemble of genomes. Given a collection of gene sets and a count matrix of transcriptomic features (e.g. expression, polymorphisms), the gene sets produced by htsint can be tested for 'enrichment' or conditional differences using one of a number of commonly available packages. The database and bundled tools to generate functional modules were designed with sequencing pipelines in mind, but the toolkit nature of htsint allows it to also be used in other areas of genomics. The software is freely available as a Python library through GitHub at https://github.com/ajrichards/htsint.

  5. interPopula: a Python API to access the HapMap Project dataset

    PubMed Central

    2010-01-01

    Background The HapMap project is a publicly available catalogue of common genetic variants that occur in humans, currently including several million SNPs across 1115 individuals spanning 11 different populations. This important database does not provide any programmatic access to the dataset; furthermore, no standard relational database interface is provided. Results interPopula is a Python API to access the HapMap dataset. interPopula provides integration facilities with both the Python software ecosystem (e.g. Biopython and matplotlib) and other relevant human population datasets (e.g. Ensembl gene annotation and UCSC Known Genes). A set of guidelines and code examples to address possible inconsistencies across heterogeneous data sources is also provided. Conclusions interPopula is a straightforward and flexible Python API that facilitates the construction of scripts and applications that require access to the HapMap dataset. PMID:21210977

  6. Twelve novel HGD gene variants identified in 99 alkaptonuria patients: focus on 'black bone disease' in Italy.

    PubMed

    Nemethova, Martina; Radvanszky, Jan; Kadasi, Ludevit; Ascher, David B; Pires, Douglas E V; Blundell, Tom L; Porfirio, Berardino; Mannoni, Alessandro; Santucci, Annalisa; Milucci, Lia; Sestini, Silvia; Biolcati, Gianfranco; Sorge, Fiammetta; Aurizi, Caterina; Aquaron, Robert; Alsbou, Mohammed; Lourenço, Charles Marques; Ramadevi, Kanakasabapathi; Ranganath, Lakshminarayan R; Gallagher, James A; van Kan, Christa; Hall, Anthony K; Olsson, Birgitta; Sireau, Nicolas; Ayoob, Hana; Timmis, Oliver G; Sang, Kim-Hanh Le Quan; Genovese, Federica; Imrich, Richard; Rovensky, Jozef; Srinivasaraghavan, Rangan; Bharadwaj, Shruthi K; Spiegel, Ronen; Zatkova, Andrea

    2016-01-01

    Alkaptonuria (AKU) is an autosomal recessive disorder caused by mutations in the homogentisate-1,2-dioxygenase (HGD) gene leading to the deficiency of HGD enzyme activity. The DevelopAKUre project is underway to test nitisinone as a specific treatment to counteract this derangement of the phenylalanine-tyrosine catabolic pathway. We analysed DNA of 40 AKU patients enrolled for SONIA1, the first study in DevelopAKUre, and of 59 other AKU patients sent to our laboratory for molecular diagnostics. We identified 12 novel DNA variants: one each in patients from Brazil (c.557T>A), Slovakia (c.500C>T) and France (c.440T>C), three in patients from India (c.469+6T>C, c.650-85A>G, c.158G>A), and six in patients from Italy (c.742A>G, c.614G>A, c.1057A>C, c.752G>A, c.119A>C, c.926G>T). Thus, the total number of potential AKU-causing variants found in 380 patients reported in the HGD mutation database is now 129. Using mCSM and DUET, computational approaches based on the protein 3D structure, the novel missense variants are predicted to affect the activity of the enzyme by three mechanisms: decrease of stability of individual protomers, disruption of protomer-protomer interactions or modification of residues in the region of the active site. We also present an overview of AKU in Italy, where so far about 60 AKU cases are known and DNA analysis has been reported for 34 of them. In this rather small group, 26 different HGD variants affecting function were described, indicating rather high heterogeneity. Twelve of these variants seem to be specific for Italy.

  7. Twelve novel HGD gene variants identified in 99 alkaptonuria patients: focus on 'black bone disease' in Italy

    PubMed Central

    Nemethova, Martina; Radvanszky, Jan; Kadasi, Ludevit; Ascher, David B; Pires, Douglas E V; Blundell, Tom L; Porfirio, Berardino; Mannoni, Alessandro; Santucci, Annalisa; Milucci, Lia; Sestini, Silvia; Biolcati, Gianfranco; Sorge, Fiammetta; Aurizi, Caterina; Aquaron, Robert; Alsbou, Mohammed; Marques Lourenço, Charles; Ramadevi, Kanakasabapathi; Ranganath, Lakshminarayan R; Gallagher, James A; van Kan, Christa; Hall, Anthony K; Olsson, Birgitta; Sireau, Nicolas; Ayoob, Hana; Timmis, Oliver G; Le Quan Sang, Kim-Hanh; Genovese, Federica; Imrich, Richard; Rovensky, Jozef; Srinivasaraghavan, Rangan; Bharadwaj, Shruthi K; Spiegel, Ronen; Zatkova, Andrea

    2016-01-01

    Alkaptonuria (AKU) is an autosomal recessive disorder caused by mutations in the homogentisate-1,2-dioxygenase (HGD) gene leading to the deficiency of HGD enzyme activity. The DevelopAKUre project is underway to test nitisinone as a specific treatment to counteract this derangement of the phenylalanine-tyrosine catabolic pathway. We analysed DNA of 40 AKU patients enrolled for SONIA1, the first study in DevelopAKUre, and of 59 other AKU patients sent to our laboratory for molecular diagnostics. We identified 12 novel DNA variants: one each in patients from Brazil (c.557T>A), Slovakia (c.500C>T) and France (c.440T>C), three in patients from India (c.469+6T>C, c.650-85A>G, c.158G>A), and six in patients from Italy (c.742A>G, c.614G>A, c.1057A>C, c.752G>A, c.119A>C, c.926G>T). Thus, the total number of potential AKU-causing variants found in 380 patients reported in the HGD mutation database is now 129. Using mCSM and DUET, computational approaches based on the protein 3D structure, the novel missense variants are predicted to affect the activity of the enzyme by three mechanisms: decrease of stability of individual protomers, disruption of protomer-protomer interactions or modification of residues in the region of the active site. We also present an overview of AKU in Italy, where so far about 60 AKU cases are known and DNA analysis has been reported for 34 of them. In this rather small group, 26 different HGD variants affecting function were described, indicating rather high heterogeneity. Twelve of these variants seem to be specific for Italy. PMID:25804398

  8. Genetics of Combined Pituitary Hormone Deficiency: Roadmap into the Genome Era

    PubMed Central

    Fang, Qing; George, Akima S.; Brinkmeier, Michelle L.; Mortensen, Amanda H.; Gergics, Peter; Cheung, Leonard Y. M.; Daly, Alexandre Z.; Ajmal, Adnan; Pérez Millán, María Ines; Ozel, A. Bilge; Kitzman, Jacob O.; Mills, Ryan E.; Li, Jun Z.

    2016-01-01

    The genetic basis for combined pituitary hormone deficiency (CPHD) is complex, involving 30 genes in a variety of syndromic and nonsyndromic presentations. Molecular diagnosis of this disorder is valuable for predicting disease progression, avoiding unnecessary surgery, and family planning. We expect that the application of high throughput sequencing will uncover additional contributing genes and eventually become a valuable tool for molecular diagnosis. For example, in the last 3 years, six new genes have been implicated in CPHD using whole-exome sequencing. In this review, we present a historical perspective on gene discovery for CPHD and predict approaches that may facilitate future gene identification projects conducted by clinicians and basic scientists. Guidelines for systematic reporting of genetic variants and assigning causality are emerging. We apply these guidelines retrospectively to reports of the genetic basis of CPHD and summarize modes of inheritance and penetrance for each of the known genes. In recent years, there have been great improvements in databases of genetic information for diverse populations. Some issues remain that make molecular diagnosis challenging in some cases. These include the inherent genetic complexity of this disorder, technical challenges like uneven coverage, differing results from variant calling and interpretation pipelines, the number of tolerated genetic alterations, and imperfect methods for predicting pathogenicity. We discuss approaches for future research in the genetics of CPHD. PMID:27828722

  9. BTKbase, mutation database for X-linked agammaglobulinemia (XLA).

    PubMed Central

    Vihinen, M; Brandau, O; Brandén, L J; Kwan, S P; Lappalainen, I; Lester, T; Noordzij, J G; Ochs, H D; Ollila, J; Pienaar, S M; Riikonen, P; Saha, B K; Smith, C I

    1998-01-01

    X-linked agammaglobulinemia (XLA) is an immunodeficiency caused by mutations in the gene coding for Bruton's agammaglobulinemia tyrosine kinase (BTK). A database (BTKbase) of BTK mutations has been compiled and the recent update lists 463 mutation entries from 406 unrelated families showing 303 unique molecular events. In addition to mutations, the database also lists variants or polymorphisms. Each patient is given a unique patient identity number (PIN). Information is included regarding the phenotype including symptoms. Mutations in all five domains of BTK have been found to cause the disease, the most common event being missense mutations. The mutations appear almost uniformly throughout the molecule and frequently affect CpG sites that code for arginine residues. The putative structural implications of all the missense mutations are given in the database. The improved version of the registry having a number of new features is available at http://www.helsinki.fi/science/signal/btkbase.html PMID:9399844

  10. GETPrime 2.0: gene- and transcript-specific qPCR primers for 13 species including polymorphisms.

    PubMed

    David, Fabrice P A; Rougemont, Jacques; Deplancke, Bart

    2017-01-04

    GETPrime (http://bbcftools.epfl.ch/getprime) is a database with a web frontend providing gene- and transcript-specific, pre-computed qPCR primer pairs. The primers have been optimized for genome-wide specificity and for allowing the selective amplification of one or several splice variants of most known genes. To ease selection, primers have also been ranked according to defined criteria such as genome-wide specificity (with BLAST), amplicon size, and isoform coverage. Here, we report a major upgrade (2.0) of the database: eight new species (yeast, chicken, macaque, chimpanzee, rat, platypus, pufferfish, and Anolis carolinensis) now complement the five already included in the previous version (human, mouse, zebrafish, fly, and worm). Furthermore, the genomic reference has been updated to Ensembl v81 (while keeping earlier versions for backward compatibility) as a result of re-designing the back-end database and automating the import of relevant sections of the Ensembl database in species-independent fashion. This also allowed us to map known polymorphisms to the primers (on average three per primer for human), with the aim of reducing experimental error when targeting specific strains or individuals. Another consequence is that the inclusion of future Ensembl releases and other species has now become a relatively straightforward task. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. Patterns of population differentiation of candidate genes for cardiovascular disease.

    PubMed

    Kullo, Iftikhar J; Ding, Keyue

    2007-07-12

    The basis for ethnic differences in cardiovascular disease (CVD) susceptibility is not fully understood. We investigated patterns of population differentiation (FST) of a set of genes in etiologic pathways of CVD among 3 ethnic groups: Yoruba in Nigeria (YRI), Utah residents with European ancestry (CEU), and Han Chinese (CHB) + Japanese (JPT). We identified 37 pathways implicated in CVD based on the PANTHER classification and 416 genes in these pathways were further studied; these genes belonged to 6 biological processes (apoptosis, blood circulation and gas exchange, blood clotting, homeostasis, immune response, and lipoprotein metabolism). Genotype data were obtained from the HapMap database. We calculated FST for 15,559 common SNPs (minor allele frequency ≥ 0.10 in at least one population) in genes that co-segregated among the populations, as well as an average-weighted FST for each gene. SNPs were classified as putatively functional (non-synonymous and untranslated regions) or non-functional (intronic and synonymous sites). Mean FST values for common putatively functional variants were significantly higher than FST values for nonfunctional variants. A significant variation in FST was also seen based on biological processes; the processes of 'apoptosis' and 'lipoprotein metabolism' showed an excess of genes with high FST. Thus, putative functional SNPs in genes in etiologic pathways for CVD show greater population differentiation than non-functional SNPs and a significant variance of FST values was noted among pairwise population comparisons for different biological processes. These results suggest a possible basis for varying susceptibility to CVD among ethnic groups.
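
    The per-SNP FST calculation and the minor-allele-frequency filter described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the textbook heterozygosity-based definition, FST = (H_T - H_S) / H_T, with equally weighted populations, which may differ from the estimator the authors actually used, and the allele frequencies and SNP names are hypothetical.

```python
def fst(freqs):
    """Heterozygosity-based F_ST for one biallelic SNP.

    freqs holds the frequency of one allele in each population;
    populations are weighted equally (an assumption of this sketch).
    """
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity in the pooled population
    if h_t == 0:
        return 0.0                 # allele fixed everywhere: no differentiation
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)  # mean within-population value
    return (h_t - h_s) / h_t

def is_common(freqs, threshold=0.10):
    """True if the minor allele frequency reaches the threshold in at least one population."""
    return any(min(p, 1 - p) >= threshold for p in freqs)

# Hypothetical allele frequencies in three populations.
snps = {
    "snp_a": [0.10, 0.50, 0.90],  # strongly differentiated
    "snp_b": [0.30, 0.30, 0.30],  # identical frequencies
    "snp_c": [0.02, 0.03, 0.01],  # rare everywhere, filtered out
}
results = {name: fst(f) for name, f in snps.items() if is_common(f)}
for name, value in results.items():
    print(name, round(value, 3))
```

    A gene-level summary like the weighted average mentioned in the abstract would combine these per-SNP values; the weighting scheme is not given in the record, so it is left out here.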

  12. Whole-exome sequencing of a rare case of familial childhood acute lymphoblastic leukemia reveals putative predisposing mutations in Fanconi anemia genes.

    PubMed

    Spinella, Jean-François; Healy, Jasmine; Saillour, Virginie; Richer, Chantal; Cassart, Pauline; Ouimet, Manon; Sinnett, Daniel

    2015-07-23

    Acute lymphoblastic leukemia (ALL) is the most common pediatric cancer. While the multi-step model of pediatric leukemogenesis suggests interplay between constitutional and somatic genomes, the role of inherited genetic variability remains largely undescribed. Nonsyndromic familial ALL, although extremely rare, provides the ideal setting to study inherited contributions to ALL. Toward this goal, we sequenced the exomes of a childhood ALL family consisting of mother, father and two non-twinned siblings diagnosed with concordant pre-B hyperdiploid ALL and previously shown to have inherited a rare form of PRDM9, a histone H3 methyltransferase involved in crossing-over at recombination hotspots and Holliday junctions. We postulated that inheritance of additional rare disadvantaging variants in predisposing cancer genes could affect genomic stability and lead to increased risk of hyperdiploid ALL within this family. Whole exomes were captured using Agilent's SureSelect kit and sequenced on the Life Technologies SOLiD System. We applied a data reduction strategy to identify candidate variants shared by both affected siblings. Under a recessive disease model, we focused on rare non-synonymous or frame-shift variants in leukemia predisposing pathways. Though the family was nonsyndromic, we identified a combination of rare variants in Fanconi anemia (FA) genes FANCP/SLX4 (compound heterozygote - rs137976282/rs79842542) and FANCA (rs61753269) and a rare homozygous variant in the Holliday junction resolvase GEN1 (rs16981869). These variants, predicted to affect protein function, were previously identified in familial breast cancer cases. Based on our in-house database of 369 childhood ALL exomes, the sibs were the only patients to carry this particularly rare combination and only a single hyperdiploid patient was heterozygote at both FANCP/SLX4 positions, while no FANCA variant allele carriers were identified. 
    FANCA is the most commonly mutated gene in FA and is essential for resolving DNA interstrand cross-links during replication. FANCP/SLX4 and GEN1 are involved in the cleavage of Holliday junctions and their mutated forms, in combination with the rare allele of PRDM9, could alter Holliday junction resolution leading to nondisjunction of chromosomes and segregation defects. Taken together, these results suggest that concomitant inheritance of rare variants in FANCA, FANCP/SLX4 and GEN1 on the specific genetic background of this familial case could lead to increased genomic instability, hematopoietic dysfunction, and higher risk of childhood leukemia.

  13. Korean Variant Archive (KOVA): a reference database of genetic variations in the Korean population.

    PubMed

    Lee, Sangmoon; Seo, Jihae; Park, Jinman; Nam, Jae-Yong; Choi, Ahyoung; Ignatius, Jason S; Bjornson, Robert D; Chae, Jong-Hee; Jang, In-Jin; Lee, Sanghyuk; Park, Woong-Yang; Baek, Daehyun; Choi, Murim

    2017-06-27

    Despite efforts to interrogate human genome variation through large-scale databases, systematic preference toward populations of Caucasian descendants has resulted in unintended reduction of power in studying non-Caucasians. Here we report a compilation of coding variants from 1,055 healthy Korean individuals (KOVA; Korean Variant Archive). The samples were sequenced to a mean depth of 75x, yielding 101 singleton variants per individual. Population genetics analysis demonstrates that the Korean population is a distinct ethnic group comparable to other discrete ethnic groups in Africa and Europe, providing a rationale for such independent genomic datasets. Indeed, KOVA conferred 22.8% increased variant filtering power in addition to Exome Aggregation Consortium (ExAC) when used on Korean exomes. Functional assessment of nonsynonymous variants supported the presence of purifying selection in Koreans. Analysis of copy number variants detected 5.2 deletions and 10.3 amplifications per individual with an increased fraction of novel variants among smaller and rarer copy number variable segments. We also report a list of germline variants that are associated with increased tumor susceptibility. This catalog can function as a critical addition to the pre-existing variant databases in pursuing genetic studies of Korean individuals.

  14. ActiveDriverDB: human disease mutations and genome variation in post-translational modification sites of proteins

    PubMed Central

    Krassowski, Michal; Paczkowska, Marta; Cullion, Kim; Huang, Tina; Dzneladze, Irakli; Ouellette, B F Francis; Yamada, Joseph T; Fradet-Turcotte, Amelie

    2018-01-01

    Interpretation of genetic variation is needed for deciphering genotype-phenotype associations, mechanisms of inherited disease, and cancer driver mutations. Millions of single nucleotide variants (SNVs) in human genomes are known and thousands are associated with disease. An estimated 21% of disease-associated amino acid substitutions corresponding to missense SNVs are located in protein sites of post-translational modifications (PTMs), chemical modifications of amino acids that extend protein function. ActiveDriverDB is a comprehensive human proteo-genomics database that annotates disease mutations and population variants through the lens of PTMs. We integrated >385,000 published PTM sites with ∼3.6 million substitutions from The Cancer Genome Atlas (TCGA), the ClinVar database of disease genes, and human genome sequencing projects. The database includes site-specific interaction networks of proteins, upstream enzymes such as kinases, and drugs targeting these enzymes. We also predicted network-rewiring impact of mutations by analyzing gains and losses of kinase-bound sequence motifs. ActiveDriverDB provides detailed visualization, filtering, browsing and searching options for studying PTM-associated mutations. Users can upload mutation datasets interactively and use our application programming interface in pipelines. Integrative analysis of mutations and PTMs may help decipher molecular mechanisms of phenotypes and disease, as exemplified by case studies of TP53, BRCA2 and VHL. The open-source database is available at https://www.ActiveDriverDB.org. PMID:29126202

  15. Rare copy number variants in a population-based investigation of hypoplastic right heart syndrome.

    PubMed

    Dimopoulos, Aggeliki; Sicko, Robert J; Kay, Denise M; Rigler, Shannon L; Druschel, Charlotte M; Caggana, Michele; Browne, Marilyn L; Fan, Ruzong; Romitti, Paul A; Brody, Lawrence C; Mills, James L

    2017-01-20

    Hypoplastic right heart syndrome (HRHS) is a rare congenital defect characterized by underdevelopment of the right heart structures commonly accompanied by an atrial septal defect. Familial HRHS reports suggest genetic factor involvement. We examined the role of copy number variants (CNVs) in HRHS. We genotyped 32 HRHS cases identified from all New York State live births (1998-2005) using Illumina HumanOmni2.5 microarrays. CNVs were called with PennCNV and prioritized if they were ≥20 Kb, contained ≥10 SNPs and had minimal overlap with CNVs from in-house controls, the Database of Genomic Variants, HapMap3, and Children's Hospital of Philadelphia database. We identified 28 CNVs in 17 cases; several encompassed genes important for right heart development. One case had a 2p16-2p23 duplication spanning LBH, a limb and heart development transcription factor. Lbh mis-expression results in right ventricular hypoplasia and pulmonary valve defects. This duplication also encompassed SOS1, a factor associated with pulmonary valve stenosis in Noonan syndrome. Sos1 -/- mice display thin and poorly trabeculated ventricles. In another case, we identified a 1.5 Mb deletion associated with Williams-Beuren syndrome, a disorder that includes valvular malformations. A third case had a 24 Kb deletion upstream of the TGFβ ligand ITGB8. Embryos genetically null for Itgb8, and its intracellular interactant Band 4.1B, display lethal cardiac phenotypes. To our knowledge, this is the first study of CNVs in HRHS. We identified several rare CNVs that overlap genes related to right ventricular wall and valve development, suggesting that genetics plays a role in HRHS and providing clues for further investigation. Birth Defects Research 109:16-26, 2017. © 2016 Wiley Periodicals, Inc.
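
    The CNV prioritization rule in this abstract (at least 20 Kb, at least 10 SNPs, minimal overlap with control CNVs) can be sketched as a simple filter. The record does not say how "minimal overlap" was quantified, so the 20% cutoff below is a hypothetical placeholder, as are the `CNV` class, the half-open coordinate convention, and all example calls. This is not PennCNV output handling.

```python
from dataclasses import dataclass

@dataclass
class CNV:
    chrom: str
    start: int   # 0-based start (assumed half-open interval)
    end: int
    n_snps: int

    @property
    def length(self):
        return self.end - self.start

def overlap_fraction(cnv, controls):
    """Fraction of cnv covered by the single most-overlapping control CNV."""
    best = 0
    for ctrl in controls:
        if ctrl.chrom != cnv.chrom:
            continue
        shared = min(cnv.end, ctrl.end) - max(cnv.start, ctrl.start)
        best = max(best, shared)  # negative values (no overlap) never exceed 0
    return best / cnv.length if cnv.length > 0 else 0.0

def prioritize(cnvs, controls, min_length=20_000, min_snps=10, max_overlap=0.2):
    """Keep CNVs that pass the size, SNP-count and control-overlap criteria.

    max_overlap is a hypothetical cutoff; the source gives no number.
    """
    return [
        c for c in cnvs
        if c.length >= min_length
        and c.n_snps >= min_snps
        and overlap_fraction(c, controls) <= max_overlap
    ]

# Hypothetical calls and control CNVs.
calls = [
    CNV("chr7", 1_000_000, 1_024_000, 15),  # passes all criteria
    CNV("chr2", 500_000, 510_000, 12),      # shorter than 20 Kb
    CNV("chr1", 200_000, 260_000, 6),       # fewer than 10 SNPs
    CNV("chr3", 100_000, 150_000, 30),      # fully covered by a control CNV
]
controls = [CNV("chr3", 90_000, 160_000, 40)]
kept = prioritize(calls, controls)
print([c.chrom for c in kept])
```

    In practice the control set would combine in-house controls with public catalogues such as those named in the abstract, and overlap might be measured reciprocally; both choices are outside what the record specifies.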

  16. Rare Copy Number Variants in a Population Based Investigation of Hypoplastic Right Heart Syndrome

    PubMed Central

    Dimopoulos, Aggeliki; Sicko, Robert J.; Kay, Denise M.; Rigler, Shannon L.; Druschel, Charlotte M.; Caggana, Michele; Browne, Marilyn L.; Fan, Ruzong; Romitti, Paul A.; Brody, Lawrence C.; Mills, James L.

    2016-01-01

    Background: Hypoplastic right heart syndrome (HRHS) is a rare congenital defect characterized by underdevelopment of the right heart structures commonly accompanied by an atrial septal defect. Familial HRHS reports suggest genetic factor involvement. We examined the role of copy number variants (CNVs) in HRHS. Methods: We genotyped 32 HRHS cases identified from all New York State live births (1998–2005) using Illumina HumanOmni2.5 microarrays. CNVs were called with PennCNV and prioritized if they were ≥20 Kb, contained ≥10 SNPs and had minimal overlap with CNVs from in-house controls, the Database of Genomic Variants, HapMap3 and CHOP database. Results: We identified 28 CNVs in 17 cases; several encompassed genes important for right heart development. One case had a 2p16–2p23 duplication spanning LBH, a limb and heart development transcription factor. Lbh mis-expression results in right ventricular hypoplasia and pulmonary valve defects. This duplication also encompassed SOS1, a factor associated with pulmonary valve stenosis in Noonan syndrome. Sos1−/− mice display thin and poorly trabeculated ventricles. In another case, we identified a 1.5 Mb deletion associated with Williams Beuren syndrome, a disorder that includes valvular malformations. A third case had a 24 Kb deletion upstream of the TGFβ ligand ITGB8. Embryos genetically null for Itgb8, and its intracellular interactant Band 4.1B, display lethal cardiac phenotypes. Conclusions: To our knowledge, this is the first study of CNVs in HRHS. We identified several rare CNVs that overlap genes related to right ventricular wall and valve development, suggesting that genetics plays a role in HRHS and providing clues for further investigation. PMID:28009100

  17. Germline PARP4 mutations in patients with primary thyroid and breast cancers

    PubMed Central

    Ikeda, Yuji; Kiyotani, Kazuma; Yew, Poh Yin; Kato, Taigo; Tamura, Kenji; Yap, Kai-Lee; Nielsen, Sarah M.; Mester, Jessica L; Eng, Charis; Nakamura, Yusuke; Grogan, Raymon H.

    2016-01-01

    Germline mutations in the PTEN gene, which cause Cowden syndrome (CS), are known to be one of the genetic factors for primary thyroid and breast cancers; however, PTEN mutations are found in only a small subset of research participants with non-syndromic breast and thyroid cancers. In this study, we aimed to identify germline variants that may be related to genetic risk of primary thyroid and breast cancers. Genomic DNAs extracted from peripheral blood of 14 PTEN-wild-type female research participants with primary thyroid and breast cancers were analyzed by whole-exome sequencing. Gene-based case-control association analysis using the information of 406 Europeans obtained from the 1000 Genomes Project database identified 34 genes possibly associated with the phenotype with P < 1.0×10⁻³. Among them, rare variants in the PARP4 gene were detected at significantly high frequency (odds ratio = 5.2, P = 1.0×10⁻⁵). The variants, G496V and T1170I, were found in 6 of the 14 study participants (43%) while their frequencies were only 0.5% in controls. Functional analysis using the HCC1143 cell line showed that knockdown of PARP4 with siRNA significantly enhanced cell proliferation, compared with the cells transfected with siControl (P = 0.02). Kaplan-Meier analysis using GEO, EGA and TCGA datasets showed poor progression-free survival (P = 0.006, Hazard ratio 0.71) and overall survival (P < 0.0001, Hazard ratio 0.79) in a PARP4 low-expression group, suggesting that PARP4 may function as a tumor suppressor. In conclusion, we identified PARP4 as a possible susceptibility gene of primary thyroid and breast cancer. PMID:26699384

  18. A Genomic and Protein-Protein Interaction Analyses of Nonsyndromic Hearing Impairment in Cameroon Using Targeted Genomic Enrichment and Massively Parallel Sequencing.

    PubMed

    Lebeko, Kamogelo; Manyisa, Noluthando; Chimusa, Emile R; Mulder, Nicola; Dandara, Collet; Wonkam, Ambroise

    2017-02-01

    Hearing impairment (HI) is one of the leading causes of disability in the world, impacting the social, economic, and psychological well-being of the affected individual. This is particularly true in sub-Saharan Africa, which carries one of the highest burdens of this condition. Despite this, there are limited data on the most prevalent genes or mutations that cause HI among sub-Saharan Africans. Next-generation technologies, such as targeted genomic enrichment and massively parallel sequencing, offer new promise in this context. This study reports, for the first time to the best of our knowledge, on the prevalence of novel mutations identified through a platform of 116 HI genes (OtoSCOPE ® ), among 82 African probands with HI. Only variants OTOF NM_194248.2:c.766-2A>G and MYO7A NM_000260.3:c.1996C>T, p.Arg666Stop were found in 3 (3.7%) and 5 (6.1%) patients, respectively. In addition and uniquely, the analysis of protein-protein interactions (PPI), through interrogation of gene subnetworks, using a custom script and two databases (Enrichr and PANTHER), and an algorithm in the igraph package of R, identified the enrichment of sensory perception and mechanical stimulus biological processes, and the most significant molecular functions of these variants pertained to binding or structural activity. Furthermore, 10 genes (MYO7A, MYO6, KCTD3, NUMA1, MYH9, KCNQ1, UBC, DIAPH1, PSMC2, and RDX) were identified as significant hubs within the subnetworks. Results reveal that the novel variants identified among familial cases of HI in Cameroon are not common, and PPI analysis has highlighted the role of 10 genes, potentially important in understanding HI genomics among Africans.

  19. Interaction between early-life stress and FKBP5 gene variants in major depressive disorder and post-traumatic stress disorder: A systematic review and meta-analysis.

    PubMed

    Wang, Qingzhong; Shelton, Richard C; Dwivedi, Yogesh

    2018-01-01

    Gene-environment interaction contributes to the risks of psychiatric disorders. Interactions between FKBP5 gene variants and early-life stress may enhance the risk not only for mood disorder, but also for a number of other behavioral phenotypes. The aim of the present study was to review and conduct a meta-analysis on the results from published studies examining interaction between FKBP5 gene variants and early-life stress and their associations with stress-related disorders such as major depression and PTSD. A literature search was conducted using PsycINFO and PubMed databases until May 2017. A total of 14 studies with a pooled total of 15109 participants met the inclusion criteria, the results of which were combined and a meta-analysis was performed using the differences in correlations as the effect measure. Based on the literature, the rs1360780, rs3800373, and rs9470080 SNPs were selected within the FKBP5 gene and a systematic review was conducted. Based on the Comprehensive Meta-Analysis software, no publication bias was detected. Sensitivity analysis and credibility of meta-analysis results also indicated that the analyses were stable. The meta-analysis showed that individuals who carry the T allele of rs1360780, the C allele of rs3800373 or the T allele of rs9470080 and were exposed to early-life trauma had higher risks for depression or PTSD. The effects of ethnicity, age, sex, and different stress measures were not examined due to limited sample size. These results provide strong evidence of interactions between FKBP5 genotypes and early-life stress, which could pose a significant risk factor for stress-associated disorders such as major depression and PTSD. Copyright © 2017 Elsevier B.V. All rights reserved.

  20. Characteristics of MUTYH variants in Japanese colorectal polyposis patients.

    PubMed

    Takao, Misato; Yamaguchi, Tatsuro; Eguchi, Hidetaka; Tada, Yuhki; Kohda, Masakazu; Koizumi, Koichi; Horiguchi, Shin-Ichiro; Okazaki, Yasushi; Ishida, Hideyuki

    2018-06-01

    The base excision repair gene MUTYH is the causative gene of colorectal polyposis syndrome, which is an autosomal recessive disorder associated with a high risk of colorectal cancer. Since few studies have investigated the genotype-phenotype association in Japanese patients with MUTYH variants, the aim of this study was to clarify the clinicopathological findings in Japanese patients with MUTYH gene variants who were detected by screening causative genes associated with hereditary colorectal polyposis. After obtaining informed consent, genetic testing was performed using target enrichment sequencing of 26 genes, including MUTYH. Of the 31 Japanese patients with suspected hereditary colorectal polyposis, eight MUTYH variants were detected in five patients. MUTYH hotspot variants known for Caucasians, namely p.G396D and p.Y179D, were not among the detected variants. Of the five patients, two with biallelic MUTYH variants were diagnosed with MUTYH-associated polyposis, while two others had monoallelic MUTYH variants. One patient had the p.P18L and p.G25D variants on the same allele; however, supportive data for considering these two variants 'pathogenic' were lacking. Two patients with biallelic MUTYH variants and two others with monoallelic MUTYH variants were identified among Japanese colorectal polyposis patients. Hotspot variants of the MUTYH gene for Caucasians were not hotspots for Japanese patients.

  1. MYO7A and USH2A gene sequence variants in Italian patients with Usher syndrome.

    PubMed

    Sodi, Andrea; Mariottini, Alessandro; Passerini, Ilaria; Murro, Vittoria; Tachyla, Iryna; Bianchi, Benedetta; Menchini, Ugo; Torricelli, Francesca

    2014-01-01

    To analyze the spectrum of sequence variants in the MYO7A and USH2A genes in a group of Italian patients affected by Usher syndrome (USH). Thirty-six Italian patients with a diagnosis of USH were recruited. They received a standard ophthalmologic examination, visual field testing, optical coherence tomography (OCT) scan, and electrophysiological tests. Fluorescein angiography and fundus autofluorescence imaging were performed in selected cases. All the patients underwent an audiologic examination for the 0.25-8,000 Hz frequencies. Vestibular function was evaluated with specific tests. DNA samples were analyzed for sequence variants of the MYO7A gene (for USH1) and the USH2A gene (for USH2) with direct sequencing techniques. A few patients were analyzed for both genes. In the MYO7A gene, ten missense variants were found; three patients were compound heterozygous, and two were homozygous. Thirty-four USH2A gene variants were detected, including eight missense variants, nine nonsense variants, six splicing variants, and 11 duplications/deletions; 19 patients were compound heterozygous, and three were homozygous. Four MYO7A and 17 USH2A variants have already been described in the literature. Among the novel mutations there are four USH2A large deletions, detected with multiplex ligation dependent probe amplification (MLPA) technology. Two potentially pathogenic variants were found in 27 patients (75%). Affected patients showed variable clinical pictures without a clear genotype-phenotype correlation. Ten variants in the MYO7A gene and 34 variants in the USH2A gene were detected in Italian patients with USH at a high detection rate. A selective analysis of these genes may be valuable for molecular analysis, combining diagnostic efficiency with little time wastage and less resource consumption.

  2. Differences in Transcriptional Activity of Human Papillomavirus Type 6 Molecular Variants in Recurrent Respiratory Papillomatosis

    PubMed Central

    Measso do Bonfim, Caroline; Simão Sobrinho, João; Lacerda Nogueira, Rodrigo; Salgado Kupper, Daniel; Cardoso Pereira Valera, Fabiana; Lacerda Nogueira, Maurício; Villa, Luisa Lina; Rahal, Paula; Sichero, Laura

    2015-01-01

    A significant proportion of recurrent respiratory papillomatosis (RRP) is caused by human papillomavirus type 6 (HPV-6). The long control region (LCR) contains cis-elements for regulation of transcription. Our aim was to characterize LCR HPV-6 variants in RRP cases, compare promoter activity of these isolates and search for cellular transcription factors (TFs) that could explain the differences observed. The complete LCR from 13 RRP was analyzed. Transcriptional activity of 5 variants was compared using luciferase assays. Differences in putative TFs binding sites among variants were revealed using the TRANSFAC database. Chromatin immunoprecipitation (ChIP) and luciferase assays were used to evaluate TF binding and impact upon transcription, respectively. Juvenile-onset RRP cases harbored exclusively HPV-6vc related variants, whereas among adult-onset cases HPV-6a variants were more prevalent. The HPV-6vc reference was more transcriptionally active than the HPV-6a reference. Active FOXA1, ELF1 and GATA1 binding sites overlap variable nucleotide positions among isolates and influenced LCR activity. Furthermore, our results support a crucial role for ELF1 on transcriptional downregulation. We identified TFs implicated in the regulation of HPV-6 early gene expression. Many of these factors are mutated in cancer or are putative cancer biomarkers, and must be further studied. PMID:26151558

  3. Amyotrophic lateral sclerosis onset is influenced by the burden of rare variants in known amyotrophic lateral sclerosis genes.

    PubMed

    Cady, Janet; Allred, Peggy; Bali, Taha; Pestronk, Alan; Goate, Alison; Miller, Timothy M; Mitra, Robi D; Ravits, John; Harms, Matthew B; Baloh, Robert H

    2015-01-01

    To define the genetic landscape of amyotrophic lateral sclerosis (ALS) and assess the contribution of possible oligogenic inheritance, we aimed to comprehensively sequence 17 known ALS genes in 391 ALS patients from the United States. Targeted pooled-sample sequencing was used to identify variants in 17 ALS genes. Fragment size analysis was used to define ATXN2 and C9ORF72 expansion sizes. Genotype-phenotype correlations were made with individual variants and total burden of variants. Rare variant associations for risk of ALS were investigated at both the single variant and gene level. A total of 64.3% of familial and 27.8% of sporadic subjects carried potentially pathogenic novel or rare coding variants identified by sequencing or an expanded repeat in C9ORF72 or ATXN2; 3.8% of subjects had variants in >1 ALS gene, and these individuals had disease onset 10 years earlier (p = 0.0046) than subjects with variants in a single gene. The number of potentially pathogenic coding variants did not influence disease duration or site of onset. Rare and potentially pathogenic variants in known ALS genes are present in >25% of apparently sporadic and 64% of familial patients, significantly higher than previous reports using less comprehensive sequencing approaches. A significant number of subjects carried variants in >1 gene, which influenced the age of symptom onset and supports oligogenic inheritance as relevant to disease pathogenesis. © 2014 American Neurological Association.

  4. Exome Pool-Seq in neurodevelopmental disorders.

    PubMed

    Popp, Bernt; Ekici, Arif B; Thiel, Christian T; Hoyer, Juliane; Wiesener, Antje; Kraus, Cornelia; Reis, André; Zweier, Christiane

    2017-12-01

    High throughput sequencing has greatly advanced disease gene identification, especially in heterogeneous entities. Despite falling costs this is still an expensive and laborious technique, particularly when studying large cohorts. To address this problem we applied Exome Pool-Seq as an economic and fast screening technology in neurodevelopmental disorders (NDDs). Sequencing of 96 individuals can be performed in eight pools of 12 samples on less than one Illumina sequencer lane. In a pilot study with 96 cases we identified 27 variants, likely or possibly affecting function. Twenty five of these were identified in 923 established NDD genes (based on SysID database, status November 2016) (ACTB, AHDC1, ANKRD11, ATP6V1B2, ATRX, CASK, CHD8, GNAS, IFIH1, KCNQ2, KMT2A, KRAS, MAOA, MED12, MED13L, RIT1, SETD5, SIN3A, TCF4, TRAPPC11, TUBA1A, WAC, ZBTB18, ZMYND11), two in 543 (SysID) candidate genes (ZNF292, BPTF), and additionally a de novo loss-of-function variant in LRRC7, not previously implicated in NDDs. Most of them were confirmed to be de novo, but we also identified X-linked or autosomal-dominantly or autosomal-recessively inherited variants. With a detection rate of 28%, Exome Pool-Seq achieves comparable results to individual exome analyses but reduces costs by >85%. Compared with other large scale approaches using Molecular Inversion Probes (MIP) or gene panels, it allows flexible re-analysis of data. Exome Pool-Seq is thus well suited for large-scale, cost-efficient and flexible screening in characterized but heterogeneous entities like NDDs.
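
The pooling arithmetic in this abstract can be checked with a few lines. The sketch below is not from the paper: it assumes equimolar pooling of diploid samples, so a single heterozygous variant in a pool of 12 is expected on 1 of 24 alleles, and it reads the reported 28% detection rate as 27 variants over 96 cases. The function name is hypothetical.

```python
from fractions import Fraction


def expected_het_allele_fraction(samples_per_pool: int, ploidy: int = 2) -> Fraction:
    """Expected read fraction for one heterozygous variant carried by one
    individual in the pool, assuming every sample contributes equal DNA."""
    return Fraction(1, samples_per_pool * ploidy)


pools = 8                 # from the abstract
samples_per_pool = 12     # from the abstract
total_individuals = pools * samples_per_pool

variants_found = 27       # from the abstract
detection_rate = variants_found / total_individuals

het_fraction = expected_het_allele_fraction(samples_per_pool)

print(f"individuals sequenced: {total_individuals}")
print(f"detection rate: {detection_rate:.1%}")
print(f"expected allele fraction of a single het variant: {het_fraction} "
      f"(about {float(het_fraction):.1%})")
```

The low expected allele fraction is why pooled designs need enough sequencing depth per pool for a variant caller to separate a true singleton from sequencing error.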

  5. NGS Catalog: A Database of Next Generation Sequencing Studies in Humans

    PubMed Central

    Xia, Junfeng; Wang, Qingguo; Jia, Peilin; Wang, Bing; Pao, William; Zhao, Zhongming

    2015-01-01

    Next generation sequencing (NGS) technologies have been rapidly applied in biomedical and biological research since its advent only a few years ago, and they are expected to advance at an unprecedented pace in the following years. To provide the research community with a comprehensive NGS resource, we have developed the database Next Generation Sequencing Catalog (NGS Catalog, http://bioinfo.mc.vanderbilt.edu/NGS/index.html), a continually updated database that collects, curates and manages available human NGS data obtained from published literature. NGS Catalog deposits publication information of NGS studies and their mutation characteristics (SNVs, small insertions/deletions, copy number variations, and structural variants), as well as mutated genes and gene fusions detected by NGS. Other functions include user data upload, NGS general analysis pipelines, and NGS software. NGS Catalog is particularly useful for investigators who are new to NGS but would like to take advantage of these powerful technologies for their own research. Finally, based on the data deposited in NGS Catalog, we summarized features and findings from whole exome sequencing, whole genome sequencing, and transcriptome sequencing studies for human diseases or traits. PMID:22517761

  6. Role of GLI2 in hypopituitarism phenotype.

    PubMed

    Arnhold, Ivo J P; França, Marcela M; Carvalho, Luciani R; Mendonca, Berenice B; Jorge, Alexander A L

    2015-06-01

    GLI2 is a zinc-finger transcription factor involved in the Sonic Hedgehog pathway. Gli2 mutant mice have hypoplastic anterior and absent posterior pituitary glands. We reviewed the literature for patients with hypopituitarism and alterations in GLI2. Twenty-five patients (16 families) had heterozygous truncating mutations, and the phenotype frequently included GH deficiency, a small anterior pituitary lobe and an ectopic/undescended posterior pituitary lobe on magnetic resonance imaging and postaxial polydactyly. The inheritance pattern was autosomal dominant with incomplete penetrance and variable expressivity. The mutation was frequently inherited from an asymptomatic parent. Eleven patients had heterozygous non-synonymous GLI2 variants that were classified as variants of unknown significance, because they were either absent from or had a frequency lower than 0.001 in the databases. In these patients, the posterior pituitary was also ectopic, but none had polydactyly. A third group of variants found in patients with hypopituitarism were considered benign because their frequency was ≥ 0.001 in the databases. GLI2 is a large and polymorphic gene, and sequencing may identify variants whose interpretation may be difficult. Incomplete penetrance implies the participation of other genetic and/or environmental factors. An interaction between Gli2 mutations and prenatal ethanol exposure has been demonstrated in mouse dysmorphology. In conclusion, a relatively high frequency of GLI2 mutations and variants was identified in patients with congenital GH deficiency without other brain defects, and most of these patients presented with combined pituitary hormone deficiency and an ectopic posterior pituitary lobe. Future studies may clarify the relative role and frequency of GLI2 alterations in the aetiology of hypopituitarism. © 2015 Society for Endocrinology.
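
The three variant groups described in this abstract follow a simple frequency rule, sketched below. This is an illustration, not the reviewers' procedure: the function name and group labels are hypothetical, and placing truncating mutations in their own group regardless of database frequency is an assumption, since the abstract does not say how frequency was applied to them.

```python
from typing import Optional

FREQ_THRESHOLD = 0.001  # cut-off stated in the abstract


def classify_gli2_variant(is_truncating: bool, db_frequency: Optional[float]) -> str:
    """Assign a variant to one of the three groups named in the abstract.

    db_frequency is the population database frequency, or None if the
    variant is absent from the databases.
    """
    if is_truncating:
        # Assumption: truncating mutations are grouped without a frequency test.
        return "truncating mutation"
    if db_frequency is None or db_frequency < FREQ_THRESHOLD:
        return "variant of unknown significance"
    return "benign"


# Made-up inputs covering each branch.
examples = [
    (True, None),
    (False, None),
    (False, 0.0004),
    (False, 0.001),
    (False, 0.02),
]
for is_truncating, freq in examples:
    print(is_truncating, freq, "->", classify_gli2_variant(is_truncating, freq))
```

Note that the boundary value 0.001 itself falls in the benign group, matching the "≥ 0.001" wording.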

  7. SIBIS: a Bayesian model for inconsistent protein sequence estimation.

    PubMed

    Khenoussi, Walyd; Vanhoutrève, Renaud; Poch, Olivier; Thompson, Julie D

    2014-09-01

    The prediction of protein coding genes is a major challenge that depends on the quality of genome sequencing, the accuracy of the model used to elucidate the exonic structure of the genes and the complexity of the gene splicing process leading to different protein variants. As a consequence, today's protein databases contain a huge amount of inconsistency, due to both natural variants and sequence prediction errors. We have developed a new method, called SIBIS, to detect such inconsistencies based on the evolutionary information in multiple sequence alignments. A Bayesian framework, combined with Dirichlet mixture models, is used to estimate the probability of observing specific amino acids and to detect inconsistent or erroneous sequence segments. We evaluated the performance of SIBIS on a reference set of protein sequences with experimentally validated errors and showed that the sensitivity is significantly higher than previous methods, with only a small loss of specificity. We also assessed a large set of human sequences from the UniProt database and found evidence of inconsistency in 48% of the previously uncharacterized sequences. We conclude that the integration of quality control methods like SIBIS in automatic analysis pipelines will be critical for the robust inference of structural, functional and phylogenetic information from these sequences. Source code, implemented in C on a linux system, and the datasets of protein sequences are freely available for download at http://www.lbgi.fr/∼julie/SIBIS. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  8. Genetic diagnosis of developmental disorders in the DDD study: a scalable analysis of genome-wide research data.

    PubMed

    Wright, Caroline F; Fitzgerald, Tomas W; Jones, Wendy D; Clayton, Stephen; McRae, Jeremy F; van Kogelenberg, Margriet; King, Daniel A; Ambridge, Kirsty; Barrett, Daniel M; Bayzetinova, Tanya; Bevan, A Paul; Bragin, Eugene; Chatzimichali, Eleni A; Gribble, Susan; Jones, Philip; Krishnappa, Netravathi; Mason, Laura E; Miller, Ray; Morley, Katherine I; Parthiban, Vijaya; Prigmore, Elena; Rajan, Diana; Sifrim, Alejandro; Swaminathan, G Jawahar; Tivey, Adrian R; Middleton, Anna; Parker, Michael; Carter, Nigel P; Barrett, Jeffrey C; Hurles, Matthew E; FitzPatrick, David R; Firth, Helen V

    2015-04-04

    Human genome sequencing has transformed our understanding of genomic variation and its relevance to health and disease, and is now starting to enter clinical practice for the diagnosis of rare diseases. The question of whether and how some categories of genomic findings should be shared with individual research participants is currently a topic of international debate, and development of robust analytical workflows to identify and communicate clinically relevant variants is paramount. The Deciphering Developmental Disorders (DDD) study has developed a UK-wide patient recruitment network involving over 180 clinicians across all 24 regional genetics services, and has performed genome-wide microarray and whole exome sequencing on children with undiagnosed developmental disorders and their parents. After data analysis, pertinent genomic variants were returned to individual research participants via their local clinical genetics team. Around 80,000 genomic variants were identified from exome sequencing and microarray analysis in each individual, of which on average 400 were rare and predicted to be protein altering. By focusing only on de novo and segregating variants in known developmental disorder genes, we achieved a diagnostic yield of 27% among 1133 previously investigated yet undiagnosed children with developmental disorders, whilst minimising incidental findings. In families with developmentally normal parents, whole exome sequencing of the child and both parents resulted in a 10-fold reduction in the number of potential causal variants that needed clinical evaluation compared to sequencing only the child. Most diagnostic variants identified in known genes were novel and not present in current databases of known disease variation. Implementation of a robust translational genomics workflow is achievable within a large-scale rare disease research study to allow feedback of potentially diagnostic findings to clinicians and research participants. 
    Systematic recording of relevant clinical data, curation of a gene-phenotype knowledge base, and development of clinical decision support software are needed in addition to automated exclusion of almost all variants, which is crucial for scalable prioritisation and review of possible diagnostic variants. However, the resource requirements of development and maintenance of a clinical reporting system within a research setting are substantial. Funding: Health Innovation Challenge Fund, a parallel funding partnership between the Wellcome Trust and the UK Department of Health. Copyright © 2015 Wright et al. Open Access article distributed under the terms of CC BY. Published by Elsevier Ltd. All rights reserved.
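
The benefit of trio sequencing reported in this abstract comes from being able to discard variants a child shares with an unaffected parent. A minimal sketch of that idea follows. It is not the DDD workflow: the `Variant` tuple, the function name, and the gene and variant labels are all hypothetical, and a real pipeline would also handle genotype quality, inheritance models, and segregating (not only de novo) variants.

```python
from typing import Set, Tuple

# A variant is identified here by (gene, change); both fields are placeholder labels.
Variant = Tuple[str, str]


def candidate_de_novo(
    child: Set[Variant],
    mother: Set[Variant],
    father: Set[Variant],
    known_genes: Set[str],
) -> Set[Variant]:
    """Variants seen in the child, in neither parent, within known disorder genes."""
    inherited = mother | father
    return {v for v in child if v not in inherited and v[0] in known_genes}


# Toy data with made-up labels.
known_genes = {"GENE_A", "GENE_B"}
child = {("GENE_A", "var1"), ("GENE_B", "var2"), ("GENE_C", "var3"), ("GENE_A", "var4")}
mother = {("GENE_B", "var2")}
father = {("GENE_A", "var4")}

child_only = {v for v in child if v[0] in known_genes}          # proband-only analysis
trio = candidate_de_novo(child, mother, father, known_genes)    # trio analysis

print("candidates without parents:", sorted(child_only))
print("candidates with parents:", sorted(trio))
```

In this toy case parental data cuts three candidates down to one; the 10-fold reduction in the abstract is the study's own measurement and is not reproduced by the sketch.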

  9. An Integrated Tool to Study MHC Region: Accurate SNV Detection and HLA Genes Typing in Human MHC Region Using Targeted High-Throughput Sequencing

    PubMed Central

    Liu, Xiao; Xu, Yinyin; Liang, Dequan; Gao, Peng; Sun, Yepeng; Gifford, Benjamin; D’Ascenzo, Mark; Liu, Xiaomin; Tellier, Laurent C. A. M.; Yang, Fang; Tong, Xin; Chen, Dan; Zheng, Jing; Li, Weiyang; Richmond, Todd; Xu, Xun; Wang, Jun; Li, Yingrui

    2013-01-01

    The major histocompatibility complex (MHC) is one of the most variable and gene-dense regions of the human genome. Most studies of the MHC, and associated regions, focus on minor variants and HLA typing, many of which have been demonstrated to be associated with human disease susceptibility and metabolic pathways. However, the detection of variants in the MHC region, and diagnostic HLA typing, still lacks a coherent, standardized, cost-effective and high-coverage protocol of clinical quality and reliability. In this paper, we present such a method for the accurate detection of minor variants and HLA types in the human MHC region, using high-throughput, high-coverage sequencing of target regions. A probe set was designed to template upon the 8 annotated human MHC haplotypes, and to encompass the 5 megabases (Mb) of the extended MHC region. We deployed our probes upon three genetically diverse human samples for probe set evaluation, and sequencing data show that ∼97% of the MHC region, and over 99% of the genes in the MHC region, are covered with sufficient depth and good evenness. 98% of genotypes called by this capture sequencing prove consistent with established HapMap genotypes. We have concurrently developed a one-step pipeline for calling any HLA type referenced in the IMGT/HLA database from this target capture sequencing data, which shows over 96% typing accuracy when deployed at 4-digit resolution. This cost-effective and highly accurate approach for variant detection and HLA typing in the MHC region may lend further insight into immune-mediated disease studies, and may find clinical utility in transplantation medicine research. This one-step pipeline is released for general evaluation and use by the scientific community. PMID:23894464

  10. Novel Nine-Exon AR Transcripts (Exon 1/Exon 1b/Exons 2-8) in Normal and Cancerous Breast and Prostate Cells.

    PubMed

    Hu, Dong Gui; McKinnon, Ross A; Hulin, Julie-Ann; Mackenzie, Peter I; Meech, Robyn

    2016-12-27

    Nearly 20 different transcripts of the human androgen receptor (AR) are reported with two currently listed as Refseq isoforms in the NCBI database. Isoform 1 encodes wild-type AR (type 1 AR) and isoform 2 encodes the variant AR45 (type 2 AR). Both variants contain eight exons: they share common exons 2-8 but differ in exon 1 with the canonical exon 1 in isoform 1 and the variant exon 1b in isoform 2. Splicing of exon 1 or exon 1b is reported to be mutually exclusive. In this study, we identified a novel exon 1b (1b/TAG) that contains an additional TAG trinucleotide upstream of exon 1b. Moreover, we identified AR transcripts in both normal and cancerous breast and prostate cells that contained either exon 1b or 1b/TAG spliced between the canonical exon 1 and exon 2, generating nine-exon AR transcripts that we have named isoforms 3a and 3b. The proteins encoded by these new AR variants could regulate androgen-responsive reporters in breast and prostate cancer cells under androgen-depleted conditions. Analysis of type 3 AR-GFP fusion proteins showed partial nuclear localization in PC3 cells under androgen-depleted conditions, supporting androgen-independent activation of the AR. Type 3 AR proteins inhibited androgen-induced growth of LNCaP cells. Microarray analysis identified a small set of type 3a AR target genes in LNCaP cells, including genes known to modulate growth and proliferation of prostate cancer (PCGEM1, PEG3, EPHA3, and EFNB2) or other types of human cancers (TOX3, ST8SIA4, and SLITRK3), and genes that are diagnostic/prognostic biomarkers of prostate cancer (GRINA3 and BCHE).

  11. Genetic abnormalities in bicuspid aortic valve root phenotype: preliminary results.

    PubMed

    Girdauskas, Evaldas; Geist, Lisa; Disha, Kushtrim; Kazakbaev, Iliaz; Groß, Tatiana; Schulz, Solveig; Ungelenk, Martin; Kuntze, Thomas; Reichenspurner, Hermann; Kurth, Ingo

    2017-07-01

    Genetic defects associated with bicuspid aortopathy have been infrequently analysed. Our goal was to examine the prevalence of rare genetic variants in patients with a bicuspid aortic valve (BAV) with a root phenotype using next-generation sequencing technology. We investigated a total of 124 patients with BAV with a root dilatation phenotype who underwent aortic valve ± proximal aortic surgery at a single institution (BAV database, n = 812) during a 20-year period (1995-2015). Cross-sectional follow-up revealed 63 (51%) patients who were still alive and willing to participate. Systematic follow-up visits were scheduled from March to December 2015 and included aortic imaging as well as peripheral blood sampling for genetic testing. Next-generation sequencing libraries were prepared using a custom-made HaloPlex HS gene panel and included 20 candidate genes known to be associated with aortopathy and BAV. The primary end-point was the prevalence of genetic defects in our study cohort. A total of 63 patients (mean age 46 ± 10 years, 92% men) with BAV root phenotype and mean post-aortic valve replacement follow-up of 10.3 ± 4.9 years were included. Our genetic analysis yielded a wide spectrum of rare, potentially or likely pathogenic variants in 19 (30%) patients, with NOTCH1 variants being the most common (n = 6). Moreover, deleterious variants were revealed in AXIN1 (n = 3), NOS3 (n = 3), ELN (n = 2), FBN1 (n = 2), FN1 (n = 2) and rarely in other candidate genes. Our preliminary study demonstrates a high prevalence and a wide spectrum of rare genetic variants in patients with the BAV root phenotype, indicative of the potentially congenital origin of associated aortopathy in this specific BAV cohort. © The Author 2017. Published by Oxford University Press on behalf of the European Association for Cardio-Thoracic Surgery. All rights reserved.

  12. Complex Landscape of Germline Variants in Brazilian Patients With Hereditary and Early Onset Breast Cancer.

    PubMed

    Torrezan, Giovana T; de Almeida, Fernanda G Dos Santos R; Figueiredo, Márcia C P; Barros, Bruna D de Figueiredo; de Paula, Cláudia A A; Valieris, Renan; de Souza, Jorge E S; Ramalho, Rodrigo F; da Silva, Felipe C C; Ferreira, Elisa N; de Nóbrega, Amanda F; Felicio, Paula S; Achatz, Maria I; de Souza, Sandro J; Palmero, Edenir I; Carraro, Dirce M

    2018-01-01

    Pathogenic variants in known breast cancer (BC) predisposing genes explain only about 30% of Hereditary Breast Cancer (HBC) cases, whereas the underlying genetic factors for most families remain unknown. Here, we used whole-exome sequencing (WES) to identify genetic variants associated with HBC in 17 Brazilian patients with familial BC who were negative for causal variants in major BC risk genes (BRCA1/2, TP53, and CHEK2 c.1100delC). First, we searched for rare variants in 27 known HBC genes and identified two patients harboring truncating pathogenic variants in ATM and BARD1. For the remaining 15 negative patients, we found a substantial number of rare genetic variants. To select the most promising variants, we used functional-based variant prioritization, followed by NGS validation, analysis in a control group, cosegregation analysis in one family and comparison with previous WES studies, shrinking our list to 23 novel BC candidate genes, which were evaluated in an independent cohort of 42 high-risk BC patients. Rare and possibly damaging variants were identified in 12 candidate genes in this cohort, including variants in DNA repair genes (ERCC1 and SLX4) and other cancer-related genes (NOTCH2, ERBB2, MST1R, and RAF1). Overall, this is the first WES study applied to identifying novel genes associated with HBC in Brazilian patients, in which we provide a set of putative BC predisposing genes. We also underscore the value of using WES for assessing the complex landscape of HBC susceptibility, especially in less characterized populations.

  13. Evaluating Reported Candidate Gene Associations with Polycystic Ovary Syndrome

    PubMed Central

    Pau, Cindy; Saxena, Richa; Welt, Corrine Kolka

    2013-01-01

    Objective: To replicate variants in candidate genes associated with PCOS in a population of European PCOS and control subjects. Design: Case-control association analysis and meta-analysis. Setting: Major academic hospital. Patients: Women of European ancestry with PCOS (n=525) and controls (n=472), aged 18 to 45 years. Intervention: Variants previously associated with PCOS in candidate gene studies were genotyped (n=39). Metabolic, reproductive and anthropometric parameters were examined as a function of the candidate variants. All genetic association analyses were adjusted for age, BMI and ancestry and were reported after correction for multiple testing. Main Outcome Measure: Association of candidate gene variants with PCOS. Results: Three variants, rs3797179 (SRD5A1), rs12473543 (POMC), and rs1501299 (ADIPOQ), were nominally associated with PCOS. However, they did not remain significant after correction for multiple testing and none of the variants replicated in a sufficiently powered meta-analysis. Variants in the FBN3 gene (rs17202517 and rs73503752) were associated with smaller waist circumferences and variant rs727428 in the SHBG gene was associated with lower SHBG levels. Conclusion: Previously identified variants in candidate genes do not appear to be associated with PCOS risk. PMID:23375202
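
    The multiple-testing step in the record above can be illustrated with a minimal sketch. The abstract does not name the correction used, so Bonferroni is assumed here purely for illustration, and the p-values are hypothetical placeholders (only the count of 39 genotyped variants comes from the abstract).

```python
def bonferroni_significant(p_values, n_tests, alpha=0.05):
    """Return, per variant, whether its p-value survives Bonferroni correction.

    Assumption: Bonferroni is used only as an example; the abstract does not
    say which correction the authors applied.
    """
    threshold = alpha / n_tests  # per-test significance threshold
    return {name: p < threshold for name, p in p_values.items()}


# Hypothetical nominal p-values for the three variants named in the abstract.
nominal = {"rs3797179": 0.01, "rs12473543": 0.02, "rs1501299": 0.04}

# 39 variants were genotyped, so the per-test threshold is 0.05 / 39.
result = bonferroni_significant(nominal, n_tests=39)
print(result)
```

    With these placeholder values every variant is nominally below 0.05 but above the corrected threshold, which mirrors the pattern the abstract describes.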

  14. Genetic basis of congenital erythrocytosis: mutation update and online databases.

    PubMed

    Bento, Celeste; Percy, Melanie J; Gardie, Betty; Maia, Tabita Magalhães; van Wijk, Richard; Perrotta, Silverio; Della Ragione, Fulvio; Almeida, Helena; Rossi, Cedric; Girodon, François; Aström, Maria; Neumann, Drorit; Schnittger, Susanne; Landin, Britta; Minkov, Milen; Randi, Maria Luigia; Richard, Stéphane; Casadevall, Nicole; Vainchenker, William; Rives, Susana; Hermouet, Sylvie; Ribeiro, M Leticia; McMullin, Mary Frances; Cario, Holger; Chauveau, Aurelie; Gimenez-Roqueplo, Anne-Paule; Bressac-de-Paillerets, Brigitte; Altindirek, Didem; Lorenzo, Felipe; Lambert, Frederic; Dan, Harlev; Gad-Lapiteau, Sophie; Catarina Oliveira, Ana; Rossi, Cédric; Fraga, Cristina; Taradin, Gennadiy; Martin-Nuñez, Guillermo; Vitória, Helena; Diaz Aguado, Herrera; Palmblad, Jan; Vidán, Julia; Relvas, Luis; Ribeiro, Maria Leticia; Luigi Larocca, Maria; Luigia Randi, Maria; Pedro Silveira, Maria; Percy, Melanie; Gross, Mor; Marques da Costa, Ricardo; Beshara, Soheir; Ben-Ami, Tal; Ugo, Valérie

    2014-01-01

    Congenital erythrocytosis (CE), or congenital polycythemia, represents a rare and heterogeneous clinical entity. It is caused by deregulated red blood cell production where erythrocyte overproduction results in elevated hemoglobin and hematocrit levels. Primary congenital familial erythrocytosis is associated with low erythropoietin (Epo) levels and results from mutations in the Epo receptor gene (EPOR). Secondary CE arises from conditions causing tissue hypoxia and results in increased Epo production. These include hemoglobin variants with increased affinity for oxygen (HBB, HBA mutations), decreased production of 2,3-bisphosphoglycerate due to BPGM mutations, or mutations in the genes involved in the hypoxia sensing pathway (VHL, EPAS1, and EGLN1). Depending on the affected gene, CE can be inherited either in an autosomal dominant or recessive mode, with sporadic cases arising de novo. Despite recent important discoveries in the molecular pathogenesis of CE, the molecular causes remain to be identified in about 70% of the patients. With the objective of collecting all the published and unpublished cases of CE the COST action MPN&MPNr-Euronet developed a comprehensive Internet-based database focusing on the registration of clinical history, hematological, biochemical, and molecular data (http://www.erythrocytosis.org/). In addition, unreported mutations are also curated in the corresponding Leiden Open Variation Database. © 2013 WILEY PERIODICALS, INC.

  15. Multi-variant study of obesity risk genes in African Americans: The Jackson Heart Study.

    PubMed

    Liu, Shijian; Wilson, James G; Jiang, Fan; Griswold, Michael; Correa, Adolfo; Mei, Hao

    2016-11-30

    Genome-wide association studies (GWAS) have been successful in identifying obesity risk genes by single-variant association analysis. For this study, we designed a stepwise analysis strategy to identify multi-variant effects on obesity risk among candidate genes. Our analyses were focused on 2137 African American participants with body mass index measured in the Jackson Heart Study and 657 common single nucleotide polymorphisms (SNPs) genotyped at 8 GWAS-identified obesity risk genes. Single-variant association tests showed that no SNPs reached significance after multiple testing adjustment. The subsequent gene-gene interaction analysis, which was focused on SNPs with unadjusted p-value<0.10, identified 6 significant multi-variant associations. Logistic regression showed that SNPs in these associations did not have significant linear interactions; examination of genetic risk scores showed that 4 multi-variant associations had significant additive effects of risk SNPs; and haplotype association tests showed that all multi-variant associations contained one or several combinations of particular alleles or haplotypes associated with increased obesity risk. Our study shows that obesity risk genes have multi-variant effects, which can be additive or non-linear interactions, and that multi-variant study is an important supplement to existing GWAS for understanding genetic effects of obesity risk genes. Copyright © 2016 Elsevier B.V. All rights reserved.
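
    The additive genetic risk score mentioned in the record above can be sketched as a (optionally weighted) count of risk alleles. This is a generic illustration, not the authors' implementation; the SNP identifiers, allele counts and weights are hypothetical.

```python
def genetic_risk_score(risk_allele_counts, weights=None):
    """Sum risk-allele counts (0, 1 or 2 per SNP), optionally weighted per SNP.

    With weights=None every SNP contributes equally (an unweighted score).
    """
    score = 0.0
    for snp, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: risk-allele count must be 0, 1 or 2")
        weight = 1.0 if weights is None else weights[snp]
        score += weight * count
    return score


# Hypothetical participant carrying 2, 1 and 0 risk alleles at three SNPs.
participant = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
print(genetic_risk_score(participant))
```

    A score of this kind treats each risk allele as contributing independently, which is why it is suited to detecting additive rather than interaction effects.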

  16. BRCA1/2 missense mutations and the value of in-silico analyses.

    PubMed

    Sadowski, Carolin E; Kohlstedt, Daniela; Meisel, Cornelia; Keller, Katja; Becker, Kerstin; Mackenroth, Luisa; Rump, Andreas; Schröck, Evelin; Wimberger, Pauline; Kast, Karin

    2017-11-01

    The clinical implications of genetic variants in BRCA1/2 in healthy and affected individuals are considerable. Variant interpretation, however, is especially challenging for missense variants. The majority of them are classified as variants of unknown clinical significance (VUS). Computational (in-silico) predictive programs are easy to access, but represent only one tool out of a wide range of complementary approaches to classify VUS. With this single-center study, we aimed to evaluate the impact of in-silico analyses in a spectrum of different BRCA1/2 missense variants. We conducted mutation analysis of BRCA1/2 in 523 index patients with suspected hereditary breast and ovarian cancer (HBOC). Classification of the genetic variants was performed according to the German Consortium (GC)-HBOC database. Additionally, all missense variants were classified by the following three in-silico prediction tools: SIFT, Mutation Taster (MT2) and PolyPhen2 (PPH2). Overall, 201 different variants, 68 of which were missense variants, were ranked as pathogenic, neutral, or unknown. The classification of missense variants by in-silico tools resulted in a higher proportion of pathogenic mutations (25% vs. 13.2%) compared to the GC-HBOC classification. Altogether, more than fifty percent (38/68, 55.9%) of missense variants were ranked differently. Sensitivity of in-silico tools for mutation prediction was 88.9% (PPH2), 100% (SIFT) and 100% (MT2). We found a relevant discrepancy in variant classification by using in-silico prediction tools, resulting in potential overestimation and/or underestimation of cancer risk. More reliable, notably gene-specific, prediction tools and functional tests are needed to improve clinical counseling. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
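
    The proportions in the record above can be checked with a short sketch. The discordance counts (38 of 68) come from the abstract; the counts used for sensitivity are an assumption, since the abstract reports only the percentage and 8 of 9 is merely one pair consistent with 88.9%.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def sensitivity(true_positives, false_negatives):
    """Fraction of truly pathogenic variants that a tool calls pathogenic."""
    return true_positives / (true_positives + false_negatives)


# Discordant classifications reported in the abstract: 38 of 68 missense variants.
discordant_pct = percent(38, 68)

# Assumed counts: 8 detected and 1 missed pathogenic variant (not given in the abstract).
assumed_sensitivity_pct = round(100 * sensitivity(8, 1), 1)

print(discordant_pct, assumed_sensitivity_pct)
```

    Sensitivity alone says nothing about false positives, which is consistent with the abstract's point that high sensitivity can coexist with overestimation of risk.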

  17. First detection of canine parvovirus type 2b from diarrheic dogs in Himachal Pradesh.

    PubMed

    Sharma, Shalini; Dhar, Prasenjit; Thakur, Aneesh; Sharma, Vivek; Sharma, Mandeep

    2016-09-01

    The present study was conducted to detect the presence of canine parvovirus (CPV) among diarrheic dogs in Himachal Pradesh and to identify the most prevalent antigenic variant of CPV based on molecular typing and sequence analysis of VP2 gene. A total of 102 fecal samples were collected from clinical cases of diarrhea or hemorrhagic gastroenteritis from CPV vaccinated or non-vaccinated dogs. Samples were tested using CPV-specific polymerase chain reaction (PCR) targeting VP2 gene, multiplex PCR for detection of CPV-2a and CPV-2b antigenic variants, and a PCR for the detection of CPV-2c. CPV-2b isolate was cultured on Madin-Darby canine kidney (MDCK) cell lines and sequenced using VP2 structural protein gene. Multiple alignment and phylogenetic analysis were done using ClustalW and MEGA6, with the phylogeny inferred using the Neighbor-Joining method. No sample was found positive for the original CPV strain usually present in the vaccine. However, about 50% (52 out of 102) of the samples were found to be positive with CPV-2ab PCR assay that detects newer variants of CPV circulating in the field. In addition, multiplex PCR assay that identifies both CPV-2ab and CPV-2b revealed that CPV-2b was the major antigenic variant present in the affected dogs. A PCR positive isolate of CPV-2b was adapted to grow in MDCK cells and produced characteristic cytopathic effect after the 5th passage. In multiple sequence alignment, the VP2 structural gene of the CPV-2b isolate used in the study (accession number HG004610) was similar to other sequenced isolates in the NCBI sequence database, showing 98-99% homology. This study reports the first detection of CPV-2b in dogs with hemorrhagic gastroenteritis in Himachal Pradesh and absence of other antigenic types of CPV. Further, CPV-specific PCR assay can be used for rapid confirmation of circulating virus strains under field conditions.

  18. Genetic variants in IL-6/JAK/STAT3 pathway and the risk of CRC.

    PubMed

    Wang, Shuwei; Zhang, Weidong

    2016-05-01

    Interleukin (IL)-6 and the downstream Janus kinase (JAK)/signal transducer and activator of transcription (STAT) pathway have previously been reported to be important in the development of colorectal cancer (CRC), and several studies have shown the relationship between the polymorphisms of related genes in this pathway with the risk of CRC. However, the findings of these related studies are inconsistent. Moreover, there has been no systematic review and meta-analysis to evaluate the relationship between genetic variants in IL-6/JAK/STAT3 pathway and CRC susceptibility. Hence, we conducted a meta-analysis to explore the relationship between polymorphisms in IL-6/JAK/STAT3 pathway genes and CRC risk. Eighteen eligible studies with a total of 13,795 CRC cases and 18,043 controls were identified by searching PubMed, Web of Science, Embase, and the Cochrane Library databases for the period up to September 15, 2015. Odds ratios (ORs) and their 95% confidence intervals (CIs) were used to calculate the strength of the association. Our results indicated that IL-6 genetic variants in allele additive model (OR = 1.05, 95% CI = 1.00, 1.09) and JAK2 genetic variants (OR = 1.40, 95% CI = 1.15, 1.65) in genotype recessive model were significantly associated with CRC risk. Moreover, the pooled data revealed that IL-6 rs1800795 polymorphism significantly increased the risk of CRC in allele additive model in Europe (OR = 1.07, 95% CI = 1.01, 1.14). In conclusion, the present findings indicate that IL-6 and JAK2 genetic variants are associated with the increased risk of CRC whereas STAT3 genetic variants are not. We need more well-designed clinical studies covering more countries and populations to definitively establish the association between genetic variants in IL-6/JAK/STAT3 pathway and CRC susceptibility.
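
    The odds ratios and 95% confidence intervals quoted in the record above are the standard effect measure for case-control data. The sketch below computes an odds ratio with a Wald-type interval for a single 2x2 table; it does not reproduce the pooling across studies that a meta-analysis performs, and the counts are hypothetical.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio and Wald confidence interval for one 2x2 table.

    z=1.96 gives an approximate 95% interval.
    """
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                          + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: carriers and non-carriers among cases and controls.
or_value, ci_low, ci_high = odds_ratio_ci(120, 880, 100, 900)
print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f}, {ci_high:.2f}")
```

    An interval that excludes 1 is conventionally read as a significant association at the 5% level; an interval whose bound sits at 1.00 after rounding is borderline.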

  19. Establishment of an international database for genetic variants in esophageal cancer.

    PubMed

    Vihinen, Mauno

    2016-10-01

    The establishment of a database has been suggested in order to collect, organize, and distribute genetic information about esophageal cancer. The World Organization for Specialized Studies on Diseases of the Esophagus and the Human Variome Project will be in charge of a central database of information about esophageal cancer-related variations from publications, databases, and laboratories; in addition to genetic details, clinical parameters will also be included. The aim will be to get all the central players in research, clinical, and commercial laboratories to contribute. The database will follow established recommendations and guidelines. The database will require a team of dedicated curators with different backgrounds. Numerous layers of systematics will be applied to facilitate computational analyses. The data items will be extensively integrated with other information sources. The database will be distributed as open access to ensure exchange of the data with other databases. Variations will be reported in relation to reference sequences on three levels (DNA, RNA, and protein) whenever applicable. In the first phase, the database will concentrate on genetic variations including both somatic and germline variations for susceptibility genes. Additional types of information can be integrated at a later stage. © 2016 New York Academy of Sciences.

  20. Chromosomal DNA Deletions Explain Phenotypic Characteristics of Two Antigenic Variants, Phase II and RSA 514 (Crazy), of the Coxiella burnetii Nine Mile Strain

    PubMed Central

    Hoover, T. A.; Culp, D. W.; Vodkin, M. H.; Williams, J. C.; Thompson, H. A.

    2002-01-01

    After repeated passages through embryonated eggs, the Nine Mile strain of Coxiella burnetii exhibits antigenic variation, a loss of virulence characteristics, and transition to a truncated lipopolysaccharide (LPS) structure. In two independently derived strains, Nine Mile phase II and RSA 514, these phenotypic changes were accompanied by a large chromosomal deletion (M. H. Vodkin and J. C. Williams, J. Gen. Microbiol. 132:2587-2594, 1986). In the work reported here, additional screening of a cosmid bank prepared from the wild-type strain was used to map the deletion termini of both mutant strains and to accumulate all the segments of DNA that comprise the two deletions. The corresponding DNAs were then sequenced and annotated. The Nine Mile phase II deletion was completely nested within the deletion of the RSA 514 strain. Basic alignment and homology studies indicated that a large group of LPS biosynthetic genes, arranged in an apparent O-antigen cluster, was deleted in both variants. Database homologies identified, in particular, mannose pathway genes and genes encoding sugar methylases and nucleotide sugar epimerase-dehydratase proteins. Candidate genes for addition of sugar units to the core oligosaccharide for synthesis of the rare sugar 6-deoxy-3-C-methylgulose (virenose) were identified in the deleted region. Repeats, redundancies, paralogous genes, and two regions with reduced G+C contents were found within the deletions. PMID:12438347

  1. A rat RNA-Seq transcriptomic BodyMap across 11 organs and 4 developmental stages

    PubMed Central

    Yu, Ying; Fuscoe, James C.; Zhao, Chen; Guo, Chao; Jia, Meiwen; Qing, Tao; Bannon, Desmond I.; Lancashire, Lee; Bao, Wenjun; Du, Tingting; Luo, Heng; Su, Zhenqiang; Jones, Wendell D.; Moland, Carrie L.; Branham, William S.; Qian, Feng; Ning, Baitang; Li, Yan; Hong, Huixiao; Guo, Lei; Mei, Nan; Shi, Tieliu; Wang, Kevin Y.; Wolfinger, Russell D.; Nikolsky, Yuri; Walker, Stephen J.; Duerksen-Hughes, Penelope; Mason, Christopher E.; Tong, Weida; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Shi, Leming; Wang, Charles

    2014-01-01

    The rat has been used extensively as a model for evaluating chemical toxicities and for understanding drug mechanisms. However, its transcriptome across multiple organs, or developmental stages, has not yet been reported. Here we show, as part of the SEQC consortium efforts, a comprehensive rat transcriptomic BodyMap created by performing RNA-Seq on 320 samples from 11 organs of both sexes of juvenile, adolescent, adult and aged Fischer 344 rats. We catalogue the expression profiles of 40,064 genes, 65,167 transcripts, 31,909 alternatively spliced transcript variants and 2,367 non-coding genes/non-coding RNAs (ncRNAs) annotated in AceView. We find that organ-enriched, differentially expressed genes reflect the known organ-specific biological activities. A large number of transcripts show organ-specific, age-dependent or sex-specific differential expression patterns. We create a web-based, open-access rat BodyMap database of expression profiles with crosslinks to other widely used databases, anticipating that it will serve as a primary resource for biomedical research using the rat model. PMID:24510058

  2. Disease-associated mitochondrial mutations and the evolution of primate mitogenomes

    PubMed Central

    Tavares, William Corrêa

    2017-01-01

    Several human diseases have been associated with mutations in mitochondrial genes comprising a set of confirmed and reported mutations according to the MITOMAP database. An analysis of complete mitogenomes across 139 primate species showed that most confirmed disease-associated mutations occurred in aligned codon positions and gene regions under strong purifying selection resulting in a strong evolutionary conservation. Only two confirmed variants (7.1%), coding for the same amino acids accounting for severe human diseases, were identified without apparent pathogenicity in non-human primates, like the closely related Bornean orangutan. Conversely, reported disease-associated mutations were not especially concentrated in conserved codon positions, and a large fraction of them occurred in highly variable ones. Additionally, 88 (45.8%) of reported mutations showed similar variants in several non-human primates and some of them have been present in extinct species of the genus Homo. Considering that recurrent mutations leading to persistent variants throughout the evolutionary diversification of primates are less likely to be severely damaging to fitness, we suggest that these 88 mutations are less likely to be pathogenic. Conversely, 69 (35.9%) of reported disease-associated mutations occurred in extremely conserved aligned codon positions which makes them more likely to damage the primate mitochondrial physiology. PMID:28510580

  3. dbWGFP: a database and web server of human whole-genome single nucleotide variants and their functional predictions.

    PubMed

    Wu, Jiaxin; Wu, Mengmeng; Li, Lianshuo; Liu, Zhuo; Zeng, Wanwen; Jiang, Rui

    2016-01-01

    The recent advancement of next-generation sequencing technology has enabled the fast and low-cost detection of genetic variants across the entire human genome, making whole-genome sequencing an increasingly common approach in the study of disease-causing genetic variants. Nevertheless, a repository that collects predictions of functionally damaging effects of human genetic variants is still lacking, though it has been well recognized that such predictions play a central role in the analysis of whole-genome sequencing data. To fill this gap, we developed a database named dbWGFP (a database and web server of human whole-genome single nucleotide variants and their functional predictions) that contains functional predictions and annotations of nearly 8.58 billion possible human whole-genome single nucleotide variants. Specifically, this database integrates 48 functional predictions calculated by 17 popular computational methods and 44 valuable annotations obtained from various data sources. Standalone software, user-friendly query services and free downloads of this database are available at http://bioinfo.au.tsinghua.edu.cn/dbwgfp. dbWGFP provides a valuable resource for the analysis of whole-genome sequencing, exome sequencing and SNP array data, thereby complementing existing data sources and computational resources in deciphering genetic bases of human inherited diseases. © The Author(s) 2016. Published by Oxford University Press.

  4. Exome Sequence Analysis of 14 Families With High Myopia.

    PubMed

    Kloss, Bethany A; Tompson, Stuart W; Whisenhunt, Kristina N; Quow, Krystina L; Huang, Samuel J; Pavelec, Derek M; Rosenberg, Thomas; Young, Terri L

    2017-04-01

    To identify causal gene mutations in 14 families with autosomal dominant (AD) high myopia using exome sequencing. Select individuals from 14 large Caucasian families with high myopia were exome sequenced. Gene variants were filtered to identify potential pathogenic changes. Sanger sequencing was used to confirm variants in original DNA, and to test for disease cosegregation in additional family members. Candidate genes and chromosomal loci previously associated with myopic refractive error and its endophenotypes were comprehensively screened. In 14 high myopia families, we identified 73 rare and 31 novel gene variants as candidates for pathogenicity. In seven of these families, two of the novel and eight of the rare variants were within known myopia loci. A total of 104 heterozygous nonsynonymous rare variants in 104 genes were identified in 10 out of 14 probands. Each variant cosegregated with affection status. No rare variants were identified in genes known to cause myopia or in genes closest to published genome-wide association study association signals for refractive error or its endophenotypes. Whole exome sequencing was performed to determine gene variants implicated in the pathogenesis of AD high myopia. This study provides new genes for consideration in the pathogenesis of high myopia, and may aid in the development of genetic profiling of those at greatest risk for attendant ocular morbidities of this disorder.

  5. Germline pathogenic variants in PALB2 and other cancer-predisposing genes in families with hereditary diffuse gastric cancer without CDH1 mutation: a whole-exome sequencing study.

    PubMed

    Fewings, Eleanor; Larionov, Alexey; Redman, James; Goldgraben, Mae A; Scarth, James; Richardson, Susan; Brewer, Carole; Davidson, Rosemarie; Ellis, Ian; Evans, D Gareth; Halliday, Dorothy; Izatt, Louise; Marks, Peter; McConnell, Vivienne; Verbist, Louis; Mayes, Rebecca; Clark, Graeme R; Hadfield, James; Chin, Suet-Feung; Teixeira, Manuel R; Giger, Olivier T; Hardwick, Richard; di Pietro, Massimiliano; O'Donovan, Maria; Pharoah, Paul; Caldas, Carlos; Fitzgerald, Rebecca C; Tischkowitz, Marc

    2018-04-26

    Germline pathogenic variants in the E-cadherin gene (CDH1) are strongly associated with the development of hereditary diffuse gastric cancer. There is a paucity of data to guide risk assessment and management of families with hereditary diffuse gastric cancer that do not carry a CDH1 pathogenic variant, making it difficult to make informed decisions about surveillance and risk-reducing surgery. We aimed to identify new candidate genes associated with predisposition to hereditary diffuse gastric cancer in affected families without pathogenic CDH1 variants. We did whole-exome sequencing on DNA extracted from the blood of 39 individuals (28 individuals diagnosed with hereditary diffuse gastric cancer and 11 unaffected first-degree relatives) in 22 families without pathogenic CDH1 variants. Genes with loss-of-function variants were prioritised using gene-interaction analysis to identify clusters of genes that could be involved in predisposition to hereditary diffuse gastric cancer. Protein-affecting germline variants were identified in probands from six families with hereditary diffuse gastric cancer; variants were found in genes known to predispose to cancer and in lesser-studied DNA repair genes. A frameshift deletion in PALB2 was found in one member of a family with a history of gastric and breast cancer. Two different MSH2 variants were identified in two unrelated affected individuals, including one frameshift insertion and one previously described start-codon loss. One family had a unique combination of variants in the DNA repair genes ATR and NBN. Two variants in the DNA repair gene RECQL5 were identified in two unrelated families: one missense variant and a splice-acceptor variant. The results of this study suggest a role for the known cancer predisposition gene PALB2 in families with hereditary diffuse gastric cancer and no detected pathogenic CDH1 variants. We also identified new candidate genes associated with disease risk in these families. 
    Funding: UK Medical Research Council (Sackler programme), European Research Council under the European Union's Seventh Framework Programme (2007-13), National Institute for Health Research Cambridge Biomedical Research Centre, Experimental Cancer Medicine Centres, and Cancer Research UK. Copyright © 2018 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY 4.0 license. All rights reserved.

  6. Analysis of predicted loss-of-function variants in UK Biobank identifies variants protective for disease.

    PubMed

    Emdin, Connor A; Khera, Amit V; Chaffin, Mark; Klarin, Derek; Natarajan, Pradeep; Aragam, Krishna; Haas, Mary; Bick, Alexander; Zekavat, Seyedeh M; Nomura, Akihiro; Ardissino, Diego; Wilson, James G; Schunkert, Heribert; McPherson, Ruth; Watkins, Hugh; Elosua, Roberto; Bown, Matthew J; Samani, Nilesh J; Baber, Usman; Erdmann, Jeanette; Gupta, Namrata; Danesh, John; Chasman, Daniel; Ridker, Paul; Denny, Joshua; Bastarache, Lisa; Lichtman, Judith H; D'Onofrio, Gail; Mattera, Jennifer; Spertus, John A; Sheu, Wayne H-H; Taylor, Kent D; Psaty, Bruce M; Rich, Stephen S; Post, Wendy; Rotter, Jerome I; Chen, Yii-Der Ida; Krumholz, Harlan; Saleheen, Danish; Gabriel, Stacey; Kathiresan, Sekar

    2018-04-24

    Less than 3% of protein-coding genetic variants are predicted to result in loss of protein function through the introduction of a stop codon, frameshift, or the disruption of an essential splice site; however, such predicted loss-of-function (pLOF) variants provide insight into effector transcript and direction of biological effect. In >400,000 UK Biobank participants, we conduct association analyses of 3759 pLOF variants with six metabolic traits, six cardiometabolic diseases, and twelve additional diseases. We identified 18 new low-frequency or rare (allele frequency < 5%) pLOF variant-phenotype associations. pLOF variants in the gene GPR151 protect against obesity and type 2 diabetes, in the gene IL33 against asthma and allergic disease, and in the gene IFIH1 against hypothyroidism. In the gene PDE3B, pLOF variants associate with elevated height, improved body fat distribution and protection from coronary artery disease. Our findings prioritize genes for which pharmacologic mimics of pLOF variants may lower risk for disease.

  7. Evolutionary conservation analysis increases the colocalization of predicted exonic splicing enhancers in the BRCA1 gene with missense sequence changes and in-frame deletions, but not polymorphisms

    PubMed Central

    Pettigrew, Christopher; Wayte, Nicola; Lovelock, Paul K; Tavtigian, Sean V; Chenevix-Trench, Georgia; Spurdle, Amanda B; Brown, Melissa A

    2005-01-01

    Introduction: Aberrant pre-mRNA splicing can be more detrimental to the function of a gene than changes in the length or nature of the encoded amino acid sequence. Although predicting the effects of changes in consensus 5' and 3' splice sites near intron:exon boundaries is relatively straightforward, predicting the possible effects of changes in exonic splicing enhancers (ESEs) remains a challenge. Methods: As an initial step toward determining which ESEs predicted by the web-based tool ESEfinder in the breast cancer susceptibility gene BRCA1 are likely to be functional, we have determined their evolutionary conservation and compared their location with known BRCA1 sequence variants. Results: Using the default settings of ESEfinder, we initially detected 669 potential ESEs in the coding region of the BRCA1 gene. Increasing the threshold score reduced the total number to 464, while taking into consideration the proximity to splice donor and acceptor sites reduced the number to 211. Approximately 11% of these ESEs (23/211) either are identical at the nucleotide level in human, primates, mouse, cow, dog and opossum Brca1 (conserved) or are detectable by ESEfinder in the same position in the Brca1 sequence (shared). The frequency of conserved and shared predicted ESEs between human and mouse is higher in BRCA1 exons (2.8 per 100 nucleotides) than in introns (0.6 per 100 nucleotides). Of conserved or shared putative ESEs, 61% (14/23) were predicted to be affected by sequence variants reported in the Breast Cancer Information Core database. Applying the filters described above increased the colocalization of predicted ESEs with missense changes, in-frame deletions and unclassified variants predicted to be deleterious to protein function, whereas they decreased the colocalization with known polymorphisms or unclassified variants predicted to be neutral. Conclusion: In this report we show that evolutionary conservation analysis may be used to improve the specificity of an ESE prediction tool. This is the first report on the prediction of the frequency and distribution of ESEs in the BRCA1 gene, and it is the first reported attempt to predict which ESEs are most likely to be functional and therefore which sequence variants in ESEs are most likely to be pathogenic. PMID:16280041

  8. Standards for Clinical Grade Genomic Databases.

    PubMed

    Yohe, Sophia L; Carter, Alexis B; Pfeifer, John D; Crawford, James M; Cushman-Vokoun, Allison; Caughron, Samuel; Leonard, Debra G B

    2015-11-01

    Next-generation sequencing performed in a clinical environment must meet clinical standards, which requires reproducibility of all aspects of the testing. Clinical-grade genomic databases (CGGDs) are required to classify a variant and to assist in the professional interpretation of clinical next-generation sequencing. Applying quality laboratory standards to the reference databases used for sequence-variant interpretation presents a new challenge for validation and curation. The objective was to define CGGD and the categories of information contained in CGGDs and to frame recommendations for the structure and use of these databases in clinical patient care. Members of the College of American Pathologists Personalized Health Care Committee reviewed the literature and existing state of genomic databases and developed a framework for guiding CGGD development in the future. Clinical-grade genomic databases may provide different types of information. This work group defined 3 layers of information in CGGDs: clinical genomic variant repositories, genomic medical data repositories, and genomic medicine evidence databases. The layers are differentiated by the types of genomic and medical information contained and the utility in assisting with clinical interpretation of genomic variants. Clinical-grade genomic databases must meet specific standards regarding submission, curation, and retrieval of data, as well as the maintenance of privacy and security. These organizing principles for CGGDs should serve as a foundation for future development of specific standards that support the use of such databases for patient care.

  9. LitVar: a semantic search engine for linking genomic variant data in PubMed and PMC.

    PubMed

    Allot, Alexis; Peng, Yifan; Wei, Chih-Hsuan; Lee, Kyubum; Phan, Lon; Lu, Zhiyong

    2018-05-14

    The identification and interpretation of genomic variants play a key role in the diagnosis of genetic diseases and related research. These tasks increasingly rely on accessing relevant manually curated information from domain databases (e.g. SwissProt or ClinVar). However, due to the sheer volume of medical literature and high cost of expert curation, curated variant information in existing databases is often incomplete and out-of-date. In addition, the same genetic variant can be mentioned in publications with various names (e.g. 'A146T' versus 'c.436G>A' versus 'rs121913527'). A search in PubMed using only one name usually cannot retrieve all relevant articles for the variant of interest. Hence, to help scientists, healthcare professionals, and database curators find the most up-to-date published variant research, we have developed LitVar for the search and retrieval of standardized variant information. In addition, LitVar uses advanced text mining techniques to compute and extract relationships between variants and other associated entities such as diseases and chemicals/drugs. LitVar is publicly available at https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/LitVar.
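
    The synonym problem described in this abstract can be illustrated with a toy lookup. This is only a sketch and not LitVar's implementation or API: the three variant names come from the abstract, while the table, the function names, and the choice of the rsID as the canonical key are assumptions made for illustration.

```python
# Hypothetical synonym table. The three names are taken from the abstract;
# using the rsID as the canonical key is an assumption of this sketch.
SYNONYMS = {
    "a146t": "rs121913527",
    "c.436g>a": "rs121913527",
    "rs121913527": "rs121913527",
}


def normalize_variant(mention):
    """Return the canonical ID for a variant mention, or None if unknown."""
    return SYNONYMS.get(mention.strip().lower())


def group_mentions(mentions):
    """Group raw mentions by canonical ID so one query covers all spellings."""
    groups = {}
    for mention in mentions:
        key = normalize_variant(mention)
        if key is not None:
            groups.setdefault(key, []).append(mention)
    return groups


if __name__ == "__main__":
    print(group_mentions(["A146T", "c.436G>A", "V600E"]))
```

    A bare protein-level name such as 'A146T' is only unambiguous within a known gene, so a realistic table would presumably key on the gene as well.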

  10. Pooled Resequencing of 122 Ulcerative Colitis Genes in a Large Dutch Cohort Suggests Population-Specific Associations of Rare Variants in MUC2.

    PubMed

    Visschedijk, Marijn C; Alberts, Rudi; Mucha, Soren; Deelen, Patrick; de Jong, Dirk J; Pierik, Marieke; Spekhorst, Lieke M; Imhann, Floris; van der Meulen-de Jong, Andrea E; van der Woude, C Janneke; van Bodegraven, Adriaan A; Oldenburg, Bas; Löwenberg, Mark; Dijkstra, Gerard; Ellinghaus, David; Schreiber, Stefan; Wijmenga, Cisca; Rivas, Manuel A; Franke, Andre; van Diemen, Cleo C; Weersma, Rinse K

    2016-01-01

    Genome-wide association studies have revealed several common genetic risk variants for ulcerative colitis (UC). However, little is known about the contribution of rare, large effect genetic variants to UC susceptibility. In this study, we performed a deep targeted re-sequencing of 122 genes in Dutch UC patients in order to investigate the contribution of rare variants to the genetic susceptibility to UC. The selection of genes consists of 111 established human UC susceptibility genes and 11 genes that lead to spontaneous colitis when knocked-out in mice. In addition, we sequenced the promoter regions of 45 genes where known variants exert cis-eQTL-effects. Targeted pooled re-sequencing was performed on DNA of 790 Dutch UC cases. The Genome of the Netherlands project provided sequence data of 500 healthy controls. After quality control and prioritization based on allele frequency and pathogenicity probability, follow-up genotyping of 171 rare variants was performed on 1021 Dutch UC cases and 1166 Dutch controls. Single-variant association and gene-based analyses identified an association of rare variants in the MUC2 gene with UC. The associated variants in the Dutch population could not be replicated in a German replication cohort (1026 UC cases, 3532 controls). In conclusion, this study has identified a putative role for MUC2 on UC susceptibility in the Dutch population and suggests a population-specific contribution of rare variants to UC.

  11. Illustrating, Quantifying, and Correcting for Bias in Post-hoc Analysis of Gene-Based Rare Variant Tests of Association

    PubMed Central

    Grinde, Kelsey E.; Arbet, Jaron; Green, Alden; O'Connell, Michael; Valcarcel, Alessandra; Westra, Jason; Tintle, Nathan

    2017-01-01

    To date, gene-based rare variant testing approaches have focused on aggregating information across sets of variants to maximize statistical power in identifying genes showing significant association with diseases. Beyond identifying genes that are associated with diseases, the identification of causal variant(s) in those genes and estimation of their effect is crucial for planning replication studies and characterizing the genetic architecture of the locus. However, we illustrate that straightforward single-marker association statistics can suffer from substantial bias introduced by conditioning on gene-based test significance, due to the phenomenon often referred to as “winner's curse.” We illustrate the ramifications of this bias on variant effect size estimation and variant prioritization/ranking approaches, outline parameters of genetic architecture that affect this bias, and propose a bootstrap resampling method to correct for this bias. We find that our correction method significantly reduces the bias due to winner's curse (average two-fold decrease in bias, p < 2.2 × 10^−6) and, consequently, substantially improves mean squared error and variant prioritization/ranking. The method is particularly helpful in adjustment for winner's curse effects when the initial gene-based test has low power and for relatively more common, non-causal variants. Adjustment for winner's curse is recommended for all post-hoc estimation and ranking of variants after a gene-based test. Further work is necessary to continue seeking ways to reduce bias and improve inference in post-hoc analysis of gene-based tests under a wide variety of genetic architectures. PMID:28959274
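
    The abstract does not spell out the resampling scheme, so the following is only a generic sketch of one way a bootstrap can estimate selection bias, not the authors' procedure. It assumes the effect estimate is a carrier versus non-carrier mean difference and that `is_significant` stands in for the gene-based test; both, like all function names here, are hypothetical. For each resample that passes the test, the in-sample estimate is compared with the estimate from the individuals left out of that resample, and the average gap is subtracted from the naive estimate.

```python
import random
from statistics import mean


def mean_difference(samples):
    """Mean trait in carriers minus mean trait in non-carriers.

    `samples` is a list of (is_carrier, trait) pairs. Returns None when
    either group is empty, because the difference is then not estimable.
    """
    carriers = [trait for carrier, trait in samples if carrier]
    others = [trait for carrier, trait in samples if not carrier]
    if not carriers or not others:
        return None
    return mean(carriers) - mean(others)


def bootstrap_corrected(samples, is_significant, n_boot=500, seed=0):
    """Naive estimate minus the average in-bag/out-of-bag gap.

    Only resamples that pass `is_significant` contribute, which mimics
    conditioning on a significant test result.
    """
    rng = random.Random(seed)
    naive = mean_difference(samples)
    n = len(samples)
    biases = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        drawn = set(idx)
        in_bag = [samples[i] for i in idx]
        out_bag = [samples[i] for i in range(n) if i not in drawn]
        est_in = mean_difference(in_bag)
        est_out = mean_difference(out_bag)
        if est_in is None or est_out is None:
            continue  # effect not estimable in this resample
        if not is_significant(in_bag):
            continue  # resample would not have been "selected"
        biases.append(est_in - est_out)
    if naive is None or not biases:
        return naive
    return naive - mean(biases)
```

    When no resample passes the test, the sketch falls back to the naive estimate rather than guessing a correction.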

  12. Intrahaplotypic Variants Differentiate Complex Linkage Disequilibrium within Human MHC Haplotypes

    PubMed Central

    Lam, Tze Hau; Tay, Matthew Zirui; Wang, Bei; Xiao, Ziwei; Ren, Ee Chee

    2015-01-01

    Distinct regions of long-range genetic fixation in the human MHC region, known as conserved extended haplotypes (CEHs), possess unique genomic characteristics and are strongly associated with numerous diseases. While CEHs appear to be homogeneous by SNP analysis, the nature of fine variations within their genomic structure is unknown. Using multiple, MHC-homozygous cell lines, we demonstrate extensive sequence conservation in two common Asian MHC haplotypes: A33-B58-DR3 and A2-B46-DR9. However, characterization of phase-resolved MHC haplotypes revealed unique intra-CEH patterns of variation and uncovered 127 single nucleotide variants (SNVs) which are missing from public databases. We further show that the strong linkage disequilibrium structure within the human MHC that typically confounds precise identification of genetic features can be resolved using intra-CEH variants, as evidenced by rs3129063 and rs448489, which affect expression of ZFP57, a gene important in methylation and epigenetic regulation. This study demonstrates an improved strategy that can be used towards genetic dissection of diseases. PMID:26593880

  13. EvoSNP-DB: A database of genetic diversity in East Asian populations.

    PubMed

    Kim, Young Uk; Kim, Young Jin; Lee, Jong-Young; Park, Kiejung

    2013-08-01

    Genome-wide association studies (GWAS) have become popular as an approach for the identification of large numbers of phenotype-associated variants. However, differences in genetic architecture and environmental factors mean that the effect of variants can vary across populations. Understanding population genetic diversity is valuable for the investigation of possible population specific and independent effects of variants. EvoSNP-DB aims to provide information regarding genetic diversity among East Asian populations, including Chinese, Japanese, and Korean. Non-redundant SNPs (1.6 million) were genotyped in 54 Korean trios (162 samples) and were compared with 4 million SNPs from HapMap phase II populations. EvoSNP-DB provides two user interfaces for data query and visualization, and integrates scores of genetic diversity (Fst and VarLD) at the level of SNPs, genes, and chromosome regions. EvoSNP-DB is a web-based application that allows users to navigate and visualize measurements of population genetic differences in an interactive manner, and is available online at [http://biomi.cdc.go.kr/EvoSNP/].

  14. Patterns of population differentiation of candidate genes for cardiovascular disease

    PubMed Central

    Kullo, Iftikhar J; Ding, Keyue

    2007-01-01

    Background The basis for ethnic differences in cardiovascular disease (CVD) susceptibility is not fully understood. We investigated patterns of population differentiation (FST) of a set of genes in etiologic pathways of CVD among 3 ethnic groups: Yoruba in Nigeria (YRI), Utah residents with European ancestry (CEU), and Han Chinese (CHB) + Japanese (JPT). We identified 37 pathways implicated in CVD based on the PANTHER classification and 416 genes in these pathways were further studied; these genes belonged to 6 biological processes (apoptosis, blood circulation and gas exchange, blood clotting, homeostasis, immune response, and lipoprotein metabolism). Genotype data were obtained from the HapMap database. Results We calculated FST for 15,559 common SNPs (minor allele frequency ≥ 0.10 in at least one population) in genes that co-segregated among the populations, as well as an average-weighted FST for each gene. SNPs were classified as putatively functional (non-synonymous and untranslated regions) or non-functional (intronic and synonymous sites). Mean FST values for common putatively functional variants were significantly higher than FST values for nonfunctional variants. A significant variation in FST was also seen based on biological processes; the processes of 'apoptosis' and 'lipoprotein metabolism' showed an excess of genes with high FST. Thus, putative functional SNPs in genes in etiologic pathways for CVD show greater population differentiation than non-functional SNPs and a significant variance of FST values was noted among pairwise population comparisons for different biological processes. Conclusion These results suggest a possible basis for varying susceptibility to CVD among ethnic groups. PMID:17626638
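
    As a rough illustration of the quantities in this abstract, the sketch below computes a Wright-style FST for a single biallelic SNP from per-population allele frequencies, together with the "common SNP" filter (minor allele frequency ≥ 0.10 in at least one population). The abstract does not say which FST estimator the authors used, so the formula FST = (HT − HS) / HT with equal population weights, and the function names, are assumptions.

```python
from statistics import mean


def fst_biallelic(freqs):
    """Wright-style FST from allele frequencies, one value per population.

    HT is the expected heterozygosity at the pooled mean frequency and HS
    is the mean within-population heterozygosity (equal weights assumed).
    """
    p_bar = mean(freqs)
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # monomorphic everywhere: no differentiation to measure
    h_s = mean(2 * p * (1 - p) for p in freqs)
    return (h_t - h_s) / h_t


def is_common(freqs, threshold=0.10):
    """True if the minor allele frequency reaches the threshold somewhere."""
    return any(min(p, 1 - p) >= threshold for p in freqs)


if __name__ == "__main__":
    # Hypothetical frequencies for three populations.
    print(fst_biallelic([0.1, 0.5, 0.9]), is_common([0.1, 0.5, 0.9]))
```

    Identical frequencies give FST = 0 and fixed opposite alleles give FST = 1, which brackets the values one would expect for real SNPs.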

  15. Database for Parkinson Disease Mutations and Rare Variants

    DTIC Science & Technology

    2016-09-01

    AWARD NUMBER: W81XWH-14-1-0097 TITLE: “ Database for Parkinson Disease Mutations and Rare Variants” PRINCIPAL INVESTIGATOR: JEFFERY M. VANCE...TO THE ABOVE ADDRESS. 1. REPORT DATE September 2016 2. REPORT TYPE FINAL 3. DATES COVERED 1 Jul 2014 – 30 Jun 2016 4. TITLE AND SUBTITLE Database ...For Parkinson Disease (PD) specifically, the variant databases currently available are incomplete, don’t assess impact and/or are not equipped to

  16. Real-world clinical applicability of pathogenicity predictors assessed on SERPINA1 mutations in alpha-1-antitrypsin deficiency.

    PubMed

    Giacopuzzi, Edoardo; Laffranchi, Mattia; Berardelli, Romina; Ravasio, Viola; Ferrarotti, Ilaria; Gooptu, Bibek; Borsani, Giuseppe; Fra, Annamaria

    2018-06-07

    The growth of publicly available data informing upon genetic variations, mechanisms of disease and disease sub-phenotypes offers great potential for personalised medicine. Computational approaches are likely required to assess large numbers of novel genetic variants. However, the integration of genetic, structural and pathophysiological data still represents a challenge for computational predictions and their clinical use. We addressed these issues for alpha-1-antitrypsin deficiency, a disease mediated by mutations in the SERPINA1 gene encoding alpha-1-antitrypsin. We compiled a comprehensive database of SERPINA1 coding mutations and assigned them apparent pathological relevance based upon available data. 'Benign' and 'Pathogenic' mutations were used to assess performance of 31 pathogenicity predictors. Well-performing algorithms clustered the subset of variants known to be severely pathogenic with high scores. Eight new mutations identified in the ExAC database and achieving high scores were selected for characterisation in cell models and showed secretory deficiency and polymer formation, supporting the predictive power of our computational approach. The behaviour of the pathogenic new variants and consistent outliers were rationalised by considering the protein structural context and residue conservation. These findings highlight the potential of computational methods to provide meaningful predictions of the pathogenic significance of novel mutations and identify areas for further investigation. This article is protected by copyright. All rights reserved.

  17. Coffin-Siris syndrome and the BAF complex: genotype-phenotype study in 63 patients.

    PubMed

    Santen, Gijs W E; Aten, Emmelien; Vulto-van Silfhout, Anneke T; Pottinger, Caroline; van Bon, Bregje W M; van Minderhout, Ivonne J H M; Snowdowne, Ronelle; van der Lans, Christian A C; Boogaard, Merel; Linssen, Margot M L; Vijfhuizen, Linda; van der Wielen, Michiel J R; Vollebregt, M J Ellen; Breuning, Martijn H; Kriek, Marjolein; van Haeringen, Arie; den Dunnen, Johan T; Hoischen, Alexander; Clayton-Smith, Jill; de Vries, Bert B A; Hennekam, Raoul C M; van Belzen, Martine J

    2013-11-01

    De novo germline variants in several components of the SWI/SNF-like BAF complex can cause Coffin-Siris syndrome (CSS), Nicolaides-Baraitser syndrome (NCBRS), and nonsyndromic intellectual disability. We screened 63 patients with a clinical diagnosis of CSS for these genes (ARID1A, ARID1B, SMARCA2, SMARCA4, SMARCB1, and SMARCE1) and identified pathogenic variants in 45 (71%) patients. We found a high proportion of variants in ARID1B (68%). All four pathogenic variants in ARID1A appeared to be mosaic. By using all variants from the Exome Variant Server as test data, we were able to classify variants in ARID1A, ARID1B, and SMARCB1 reliably as being pathogenic or nonpathogenic. For SMARCA2, SMARCA4, and SMARCE1 several variants in the EVS remained unclassified, underlining the importance of parental testing. We have entered all variant and clinical information in LOVD-powered databases to facilitate further genotype-phenotype correlations, as these will become increasingly important because of the uptake of targeted and untargeted next generation sequencing in diagnostics. The emerging phenotype-genotype correlation is that SMARCB1 patients have the most marked physical phenotype and severe cognitive and growth delay. The variability in phenotype seems most marked in ARID1A and ARID1B patients. Distal limbs anomalies are most marked in ARID1A patients and least in SMARCB1 patients. Numbers are small however, and larger series are needed to confirm this correlation. © 2013 WILEY PERIODICALS, INC.

  18. Gene profiling-based phenotyping for identification of cellular parameters that contribute to fitness, stress-tolerance and virulence of Listeria monocytogenes variants.

    PubMed

    Koomen, Jeroen; den Besten, Heidy M W; Metselaar, Karin I; Tempelaars, Marcel H; Wijnands, Lucas M; Zwietering, Marcel H; Abee, Tjakko

    2018-06-07

    Microbial population heterogeneity allows for a differential microbial response to environmental stresses and can lead to the selection of stress resistant variants. In this study, we have used two different stress resistant variants of Listeria monocytogenes LO28 with mutations in the rpsU gene encoding ribosomal protein S21, to elucidate features that can contribute to fitness, stress-tolerance and host interaction using a comparative gene profiling and phenotyping approach. Transcriptome analysis showed that 116 genes were upregulated and 114 genes were downregulated in both rpsU variants. Upregulated genes included a major contribution of SigB-controlled genes such as intracellular acid resistance-associated glutamate decarboxylase (GAD) (gad3), genes involved in compatible solute uptake (opuC), glycerol metabolism (glpF, glpK, glpD), and virulence (inlA, inlB). Downregulated genes in the two variants involved mainly genes involved in flagella synthesis and motility. Phenotyping results of the two rpsU variants matched the gene profiling data including enhanced freezing resistance conceivably linked to compatible solute accumulation, higher glycerol utilisation rates, and better adhesion to Caco 2 cells presumably linked to higher expression of internalins. Also, bright field and electron microscopy analysis confirmed reduced flagellation of the variants. The activation of SigB-mediated stress defence offers an explanation for the multiple-stress resistant phenotype in rpsU variants. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.

  19. Polygenic overlap between schizophrenia risk and antipsychotic response: a genomic medicine approach

    PubMed Central

    Ruderfer, Douglas M; Charney, Alexander W; Readhead, Ben; Kidd, Brian A; Kähler, Anna K; Kenny, Paul J; Keiser, Michael J; Moran, Jennifer L; Hultman, Christina M; Scott, Stuart A; Sullivan, Patrick F; Purcell, Shaun M; Dudley, Joel T; Sklar, Pamela

    2016-01-01

    Summary Background Therapeutic treatments for schizophrenia do not alleviate symptoms for all patients and efficacy is limited by common, often severe, side-effects. Genetic studies of disease can identify novel drug targets, and drugs for which the mechanism has direct genetic support have increased likelihood of clinical success. Large-scale genetic studies of schizophrenia have increased the number of genes and gene sets associated with risk. We aimed to examine the overlap between schizophrenia risk loci and gene targets of a comprehensive set of medications to potentially inform and improve treatment of schizophrenia. Methods We defined schizophrenia risk loci as genomic regions reaching genome-wide significance in the latest Psychiatric Genomics Consortium schizophrenia genome-wide association study (GWAS) of 36 989 cases and 113 075 controls and loss of function variants observed only once among 5079 individuals in an exome-sequencing study of 2536 schizophrenia cases and 2543 controls (Swedish Schizophrenia Study). Using two large and orthogonally created databases, we collated drug targets into 167 gene sets targeted by pharmacologically similar drugs and examined enrichment of schizophrenia risk loci in these sets. We further linked the exome-sequenced data with a national drug registry (the Swedish Prescribed Drug Register) to assess the contribution of rare variants to treatment response, using clozapine prescription as a proxy for treatment resistance. Findings We combined results from testing rare and common variation and, after correction for multiple testing, two gene sets were associated with schizophrenia risk: agents against amoebiasis and other protozoal diseases (106 genes, p=0·00046, pcorrected=0·024) and antipsychotics (347 genes, p=0·00078, pcorrected=0·046). Further analysis pointed to antipsychotics as having independent enrichment after removing genes that overlapped these two target sets. We noted significant enrichment both in known targets of antipsychotics (70 genes, p=0·0078) and novel predicted targets (277 genes, p=0·019). Patients with treatment-resistant schizophrenia had an excess of rare disruptive variants in gene targets of antipsychotics (347 genes, p=0·0067) and in genes with evidence for a role in antipsychotic efficacy (91 genes, p=0·0029). Interpretation Our results support genetic overlap between schizophrenia pathogenesis and antipsychotic mechanism of action. This finding is consistent with treatment efficacy being polygenic and suggests that single-target therapeutics might be insufficient. We provide evidence of a role for rare functional variants in antipsychotic treatment response, pointing to a subset of patients where their genetic information could inform treatment. Finally, we present a novel framework for identifying treatments from genetic data and improving our understanding of therapeutic mechanism. PMID:26915512

  20. Resequencing of the CETP gene in American whites and African blacks: Association of rare and common variants with HDL-cholesterol levels

    PubMed Central

    Pirim, Dilek; Wang, Xingbin; Niemsiri, Vipavee; Radwan, Zaheda H.; Bunker, Clareann H.; Hokanson, John E.; Hamman, Richard F.; Barmada, M. Michael; Demirci, F. Yesim; Kamboh, M. Ilyas

    2015-01-01

    Background Cholesteryl ester transfer protein (CETP) plays a crucial role in lipid metabolism. Associations of common CETP variants with variation in plasma lipid levels, and/or CETP mass/activity have been extensively studied and well-documented; however, the effects of uncommon/rare CETP variants on plasma lipid profile remain undefined. Hence, resequencing of the gene in extreme phenotypes and follow-up rare-variant association analyses are essential to fill this gap. Objective To identify common and uncommon/rare variants in the CETP gene by resequencing the entire gene and test the effects of both common and uncommon/rare CETP variants on plasma lipid traits in two genetically distinct populations. Methods and Results The entire CETP gene plus flanking regions were resequenced in 190 individuals comprising 95 non-Hispanic Whites (NHWs) and 95 African blacks with extreme HDL-C levels. A total of 279 sequence variants were identified, of which 25 were novel. Selected variants were genotyped in the entire samples of 623 NHWs and 788 African blacks and 184 QC-passed variants were tested in relation to plasma lipid traits by using gene-based, single-site, haplotype and rare variant association analyses (SKAT-O). Two novel and independent associations of rs1968905 and rs289740 with HDL-C were identified in African blacks. Using SKAT-O analysis, we also identified rare variants with minor allele frequency <0.01 to be associated with HDL-C in both NHWs (P=0.024) and African blacks (P=0.009). Conclusions Our results point out that in addition to the common CETP variants, rare genetic variants in the CETP gene also contribute to the phenotypic variation of HDL-C in the general population. PMID:26683795

  1. Mutation of ATF6 causes autosomal recessive achromatopsia.

    PubMed

    Ansar, Muhammad; Santos-Cortez, Regie Lyn P; Saqib, Muhammad Arif Nadeem; Zulfiqar, Fareeha; Lee, Kwanghyuk; Ashraf, Naeem Mahmood; Ullah, Ehsan; Wang, Xin; Sajid, Sundus; Khan, Falak Sher; Amin-ud-Din, Muhammad; Smith, Joshua D; Shendure, Jay; Bamshad, Michael J; Nickerson, Deborah A; Hameed, Abdul; Riazuddin, Saima; Ahmed, Zubair M; Ahmad, Wasim; Leal, Suzanne M

    2015-09-01

    Achromatopsia (ACHM) is an early-onset retinal dystrophy characterized by photophobia, nystagmus, color blindness and severely reduced visual acuity. Currently mutations in five genes CNGA3, CNGB3, GNAT2, PDE6C and PDE6H have been implicated in ACHM. We performed homozygosity mapping and linkage analysis in a consanguineous Pakistani ACHM family and mapped the locus to a 15.12-Mb region on chromosome 1q23.1-q24.3 with a maximum LOD score of 3.6. A DNA sample from an affected family member underwent exome sequencing. Within the ATF6 gene, a single-base insertion variant c.355_356dupG (p.Glu119Glyfs*8) was identified, which completely segregates with the ACHM phenotype within the family. The frameshift variant was absent in public variant databases, in 130 exomes from unrelated Pakistani individuals, and in 235 ethnically matched controls. The variant is predicted to result in a truncated protein that lacks the DNA binding and transmembrane domains and therefore affects the function of ATF6 as a transcription factor that initiates the unfolded protein response during endoplasmic reticulum (ER) stress. Immunolabeling with anti-ATF6 antibodies showed localization throughout the mouse neuronal retina, including retinal pigment epithelium, photoreceptor cells, inner nuclear layer, inner and outer plexiform layers, with a more prominent signal in retinal ganglion cells. In contrast to cytoplasmic expression of wild-type protein, in heterologous cells ATF6 protein with the p.Glu119Glyfs*8 variant is mainly confined to the nucleus. Our results imply that response to ER stress as mediated by the ATF6 pathway is essential for color vision in humans.

  2. A global evolutionary and metabolic analysis of human obesity gene risk variants.

    PubMed

    Castillo, Joseph J; Hazlett, Zachary S; Orlando, Robert A; Garver, William S

    2017-09-05

    It is generally accepted that the selection of gene variants during human evolution optimized energy metabolism that now interacts with our obesogenic environment to increase the prevalence of obesity. The purpose of this study was to perform a global evolutionary and metabolic analysis of human obesity gene risk variants (110 human obesity genes with 127 nearest gene risk variants) identified using genome-wide association studies (GWAS) to enhance our knowledge of early and late genotypes. As a result of determining the mean frequency of these obesity gene risk variants in 13 available populations from around the world our results provide evidence for the early selection of ancestral risk variants (defined as selection before migration from Africa) and late selection of derived risk variants (defined as selection after migration from Africa). Our results also provide novel information for association of these obesity genes or encoded proteins with diverse metabolic pathways and other human diseases. The overall results indicate a significant differential evolutionary pattern for the selection of obesity gene ancestral and derived risk variants proposed to optimize energy metabolism in varying global environments and complex association with metabolic pathways and other human diseases. These results are consistent with obesity genes that encode proteins possessing a fundamental role in maintaining energy metabolism and survival during the course of human evolution. Copyright © 2017. Published by Elsevier B.V.

  3. Identification of possible genetic polymorphisms involved in cancer cachexia: a systematic review.

    PubMed

    Tan, Benjamin H L; Ross, James A; Kaasa, Stein; Skorpen, Frank; Fearon, Kenneth C H

    2011-04-01

    Cancer cachexia is a polygenic and complex syndrome. Genetic variations in regulation of the inflammatory response, muscle and fat metabolic pathways, and pathways in appetite regulation are likely to contribute to the susceptibility or resistance to developing cancer cachexia. A systematic search of Medline and EmBase databases, covering 1986-2008 was performed for potential candidate genes/genetic polymorphisms relating to cancer cachexia. Related genes were then identified using pathway functional analysis software. All candidate genes were reviewed for functional polymorphisms or clinically significant polymorphisms associated with cachexia using the OMIM and GeneRIF databases. Genes with variants which had functional or clinical associations with cachexia and replicated in at least one study were entered into pathway analysis software to reveal possible network associations between genes. A total of 184 polymorphisms with functional or clinical relevance to cancer cachexia were identified in 92 candidate genes. Of these, 42 polymorphisms (in 33 genes) were replicated in more than one study with 13 polymorphisms found to influence two or more hallmarks of cachexia (i.e. inflammation, loss of fat mass and/or lean mass and reduced survival). Thirty-three genes were found to be significantly interconnected in two major networks with four genes (ADIPOQ, IL6, NFKB1 and TLR4) interlinking both networks. Selection of candidate genes and polymorphisms is a key element of multigene study design. The present study provides an initial framework to select genes/polymorphisms for further study in cancer cachexia, and to develop their potential as susceptibility biomarkers of developing cachexia.

  4. Pathogenic variants for Mendelian and complex traits in exomes of 6,517 European and African Americans: implications for the return of incidental results.

    PubMed

    Tabor, Holly K; Auer, Paul L; Jamal, Seema M; Chong, Jessica X; Yu, Joon-Ho; Gordon, Adam S; Graubert, Timothy A; O'Donnell, Christopher J; Rich, Stephen S; Nickerson, Deborah A; Bamshad, Michael J

    2014-08-07

    Exome sequencing (ES) is rapidly being deployed for use in clinical settings despite limited empirical data about the number and types of incidental results (with potential clinical utility) that could be offered for return to an individual. We analyzed deidentified ES data from 6,517 participants (2,204 African Americans and 4,313 European Americans) from the National Heart, Lung, and Blood Institute Exome Sequencing Project. We characterized the frequencies of pathogenic alleles in genes underlying Mendelian conditions commonly assessed by newborn-screening (NBS, n = 39) programs, genes associated with age-related macular degeneration (ARMD, n = 17), and genes known to influence drug response (PGx, n = 14). From these 70 genes, we identified 10,789 variants and curated them by manual review of OMIM, HGMD, locus-specific databases, or primary literature to a total of 399 validated pathogenic variants. The mean number of risk alleles per individual was 15.3. Every individual had at least five known PGx alleles, 99% of individuals had at least one ARMD risk allele, and 45% of individuals were carriers for at least one pathogenic NBS allele. The carrier burden for severe recessive childhood disorders was 0.57. Our results demonstrate that risk alleles of potential clinical utility for both Mendelian and complex traits are detectable in every individual. These findings highlight the necessity of developing guidelines and policies that consider the return of results to all individuals and underscore the need to develop innovative approaches and tools that enable individuals to exercise their choice about the return of incidental results. Copyright © 2014 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  5. Rare high-impact disease variants: properties and identifications.

    PubMed

    Park, Leeyoung; Kim, Ju Han

    2016-03-21

    Although many genome-wide association studies have been performed, the identification of disease polymorphisms remains important. It is now suspected that many rare disease variants induce the association signal of common variants in linkage disequilibrium (LD). Based on recent development of genetic models, the current study provides explanations of the existence of rare variants with high impacts and common variants with low impacts. Disease variants are neither necessary nor sufficient due to gene-gene or gene-environment interactions. A new method was developed based on theoretical aspects to identify both rare and common disease variants by their genotypes. Common disease variants were identified with relatively small odds ratios and relatively small sample sizes, except for specific situations in which the disease variants were in strong LD with a variant with a higher frequency. Rare disease variants with small impacts were difficult to identify without increasing sample sizes; however, the method was reasonably accurate for rare disease variants with high impacts. For rare variants, dominant variants generally showed better Type II error rates than recessive variants; however, the trend was reversed for common variants. Type II error rates increased in gene regions containing more than two disease variants because the more common variant, rather than both disease variants, was usually identified. The proposed method would be useful for identifying common disease variants with small impacts and rare disease variants with large impacts when disease variants have the same effects on disease presentation.

  6. ClinGen--the Clinical Genome Resource.

    PubMed

    Rehm, Heidi L; Berg, Jonathan S; Brooks, Lisa D; Bustamante, Carlos D; Evans, James P; Landrum, Melissa J; Ledbetter, David H; Maglott, Donna R; Martin, Christa Lese; Nussbaum, Robert L; Plon, Sharon E; Ramos, Erin M; Sherry, Stephen T; Watson, Michael S

    2015-06-04

    On autopsy, a patient is found to have hypertrophic cardiomyopathy. The patient’s family pursues genetic testing that shows a “likely pathogenic” variant for the condition on the basis of a study in an original research publication. Given the dominant inheritance of the condition and the risk of sudden cardiac death, other family members are tested for the genetic variant to determine their risk. Several family members test negative and are told that they are not at risk for hypertrophic cardiomyopathy and sudden cardiac death, and those who test positive are told that they need to be regularly monitored for cardiomyopathy on echocardiography. Five years later, during a routine clinic visit of one of the genotype-positive family members, the cardiologist queries a database for current knowledge on the genetic variant and discovers that the variant is now interpreted as “likely benign” by another laboratory that uses more recently derived population-frequency data. A newly available testing panel for additional genes that are implicated in hypertrophic cardiomyopathy is initiated on an affected family member, and a different variant is found that is determined to be pathogenic. Family members are retested, and one member who previously tested negative is now found to be positive for this new variant. An immediate clinical workup detects evidence of cardiomyopathy, and an intracardiac defibrillator is implanted to reduce the risk of sudden cardiac death.

  7. A power set-based statistical selection procedure to locate susceptible rare variants associated with complex traits with sequencing data.

    PubMed

    Sun, Hokeun; Wang, Shuang

    2014-08-15

    Existing association methods for rare variants from sequencing data have focused on aggregating variants in a gene or a genetic region because analysing individual rare variants is underpowered. However, these existing rare variant detection methods are not able to identify which rare variants in a gene or a genetic region of all variants are associated with the complex diseases or traits. Once phenotypic associations of a gene or a genetic region are identified, the natural next step in the association study with sequencing data is to locate the susceptible rare variants within the gene or the genetic region. In this article, we propose a power set-based statistical selection procedure that is able to identify the locations of the potentially susceptible rare variants within a disease-related gene or a genetic region. The selection performance of the proposed selection procedure was evaluated through simulation studies, where we demonstrated the feasibility and superior power over several comparable existing methods. In particular, the proposed method is able to handle the mixed effects when both risk and protective variants are present in a gene or a genetic region. The proposed selection procedure was also applied to the sequence data on the ANGPTL gene family from the Dallas Heart Study to identify potentially susceptible rare variants within the trait-related genes. An R package 'rvsel' can be downloaded from http://www.columbia.edu/~sw2206/ and http://statsun.pusan.ac.kr. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
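
    To make the idea of searching a power set of rare variants concrete, the following Python sketch enumerates every non-empty subset of a small variant list and keeps the subset with the highest score. It is only an illustration under stated assumptions: the variant names, the carrier data, and the toy score (absolute difference in mean trait between carriers and non-carriers) are all hypothetical and are not the statistic or the code used by the 'rvsel' package.

```python
from itertools import chain, combinations
from statistics import mean


def non_empty_subsets(items):
    """Return every non-empty subset of items (the power set minus the empty set)."""
    items = list(items)
    return chain.from_iterable(
        combinations(items, k) for k in range(1, len(items) + 1)
    )


def carrier_gap(subset, carriers, trait):
    """Toy score (an assumption): |mean trait of carriers - mean trait of non-carriers|."""
    carried = [any(carriers[v][i] for v in subset) for i in range(len(trait))]
    inside = [t for t, c in zip(trait, carried) if c]
    outside = [t for t, c in zip(trait, carried) if not c]
    if not inside or not outside:
        return 0.0  # score is undefined when one group is empty
    return abs(mean(inside) - mean(outside))


def best_subset(carriers, trait):
    """Exhaustively score all non-empty subsets and return the best one."""
    return max(
        non_empty_subsets(carriers),
        key=lambda s: carrier_gap(s, carriers, trait),
    )


# Hypothetical data: 3 rare variants, 6 individuals (1 = carrier, 0 = non-carrier).
carriers = {
    "v1": [1, 1, 0, 0, 0, 0],
    "v2": [0, 0, 1, 0, 0, 0],
    "v3": [0, 0, 0, 1, 1, 0],
}
trait = [5.0, 5.0, 5.0, 1.0, 1.0, 1.0]

selected = best_subset(carriers, trait)
```

    Exhaustive enumeration visits 2^n - 1 subsets, so a sketch like this is only practical for a handful of variants within a single gene or region.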

  8. The impact of rare variation on gene expression across tissues.

    PubMed

    Li, Xin; Kim, Yungil; Tsang, Emily K; Davis, Joe R; Damani, Farhan N; Chiang, Colby; Hess, Gaelen T; Zappala, Zachary; Strober, Benjamin J; Scott, Alexandra J; Li, Amy; Ganna, Andrea; Bassik, Michael C; Merker, Jason D; Hall, Ira M; Battle, Alexis; Montgomery, Stephen B

    2017-10-11

    Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles, but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues, but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.

  9. DaMold: A data-mining platform for variant annotation and visualization in molecular diagnostics research.

    PubMed

    Pandey, Ram Vinay; Pabinger, Stephan; Kriegner, Albert; Weinhäusel, Andreas

    2017-07-01

    Next-generation sequencing (NGS) has become a powerful and efficient tool for routine mutation screening in clinical research. As each NGS test yields hundreds of variants, the current challenge is to meaningfully interpret the data and select potential candidates. Analyzing each variant while manually investigating several relevant databases to collect specific information is a cumbersome and time-consuming process, and it requires expertise and familiarity with these databases. Thus, a tool that can seamlessly annotate variants with clinically relevant databases under one common interface would be of great help for variant annotation, cross-referencing, and visualization. This tool would allow variants to be processed in an automated and high-throughput manner and facilitate the investigation of variants in several genome browsers. Several analysis tools are available for raw sequencing-read processing and variant identification, but an automated variant filtering, annotation, cross-referencing, and visualization tool is still lacking. To fulfill these requirements, we developed DaMold, a Web-based, user-friendly tool that can filter and annotate variants and can access and compile information from 37 resources. It is easy to use, provides flexible input options, and accepts variants from NGS and Sanger sequencing as well as hotspots in VCF and BED formats. DaMold is available as an online application at http://damold.platomics.com/index.html, and as a Docker container and virtual machine at https://sourceforge.net/projects/damold/. © 2017 Wiley Periodicals, Inc.

  10. NGS testing for cardiomyopathy: Utility of adding RASopathy-associated genes.

    PubMed

    Ceyhan-Birsoy, Ozge; Miatkowski, Maya M; Hynes, Elizabeth; Funke, Birgit H; Mason-Suares, Heather

    2018-04-25

    RASopathies include a group of syndromes caused by pathogenic germline variants in RAS-MAPK pathway genes and typically present with facial dysmorphology, cardiovascular disease, and musculoskeletal anomalies. Recently, variants in RASopathy-associated genes have been reported in individuals with apparently nonsyndromic cardiomyopathy, suggesting that subtle features may be overlooked. To determine the utility and burden of adding RASopathy-associated genes to cardiomyopathy panels, we tested 11 RASopathy-associated genes by next-generation sequencing (NGS), including NGS-based copy number variant assessment, in 1,111 individuals referred for genetic testing for hypertrophic cardiomyopathy (HCM) or dilated cardiomyopathy (DCM). Disease-causing variants were identified in 0.6% (four of 692) of individuals with HCM, including three missense variants in the PTPN11, SOS1, and BRAF genes. Overall, 36 variants of uncertain significance (VUSs) were identified, averaging ~3 VUSs/100 cases. This study demonstrates that adding a subset of the RASopathy-associated genes to cardiomyopathy panels will increase clinical diagnoses without significantly increasing the number of VUSs/case. © 2018 Wiley Periodicals, Inc.
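
    The two rates quoted in this abstract can be reproduced from the counts it reports. The short Python sketch below simply redoes that arithmetic; the variable names are illustrative, and the VUS rate assumes the denominator is the full cohort of 1,111 referred individuals.

```python
# Counts taken from the abstract above.
hcm_positive, hcm_total = 4, 692      # disease-causing variants among HCM referrals
vus_total, cohort_total = 36, 1111    # VUSs across all referred individuals (assumed denominator)

# Diagnostic yield in HCM, as a percentage.
diagnostic_yield_pct = 100 * hcm_positive / hcm_total

# Variants of uncertain significance per 100 cases.
vus_per_100_cases = 100 * vus_total / cohort_total

print(f"Diagnostic yield in HCM: {diagnostic_yield_pct:.1f}%")
print(f"VUSs per 100 cases: {vus_per_100_cases:.1f}")
```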

  11. Biallelic SCN10A mutations in neuromuscular disease and epileptic encephalopathy.

    PubMed

    Kambouris, Marios; Thevenon, Julien; Soldatos, Ariane; Cox, Allison; Stephen, Joshi; Ben-Omran, Tawfeg; Al-Sarraj, Yasser; Boulos, Hala; Bone, William; Mullikin, James C; Masurel-Paulet, Alice; St-Onge, Judith; Dufford, Yannis; Chantegret, Corrine; Thauvin-Robinet, Christel; Al-Alami, Jamil; Faivre, Laurence; Riviere, Jean Baptiste; Gahl, William A; Bassuk, Alexander G; Malicdan, May Christine V; El-Shanti, Hatem

    2017-01-01

    Two consanguineous families, one of Sudanese ethnicity presenting progressive neuromuscular disease, severe cognitive impairment, muscle weakness, upper motor neuron lesion, anhydrosis, facial dysmorphism, and recurrent seizures and the other of Egyptian ethnicity presenting with neonatal hypotonia, bradycardia, and recurrent seizures, were evaluated for the causative gene mutation. Homozygosity mapping and whole exome sequencing (WES) identified damaging homozygous variants in SCN10A, namely c.4514C>T; p.Thr1505Met in the first family and c.4735C>T; p.Arg1579* in the second family. A third family, of Western European descent, included a child with febrile infection-related epilepsy syndrome (FIRES) who also had compound heterozygous missense mutations in SCN10A, namely, c.3482T>C; p.Met1161Thr and c.4709C>A; p.Thr1570Lys. A search for SCN10A variants in three consortia datasets (EuroEPINOMICS, Epi4K/EPGP, Autism/dbGaP) identified an additional five individuals with compound heterozygous variants. A Hispanic male with infantile spasms [c.2842G>C; p.Val948Leu and c.1453C>T; p.Arg485Cys], and a Caucasian female with Lennox-Gastaut syndrome [c.1529C>T; p.Pro510Leu and c.4984G>A; p.Gly1662Ser] in the epilepsy databases and three in the autism databases with [c.4009T>A; p.Ser1337Thr and c.1141A>G; p.Ile381Val], [c.2972C>T; p.Pro991Leu and c.2470C>T; p.His824Tyr], and [c.4009T>A; p.Ser1337Thr and c.2052G>A; p.Met684Ile]. SCN10A is a member of the voltage-gated sodium channel (VGSC) gene family. Sodium channels are responsible for the instigation and proliferation of action potentials in central and peripheral nervous systems. Heterozygous mutations in VGSC genes cause a wide range of epileptic and peripheral nervous system disorders. This report presents autosomal recessive mutations in SCN10A that may be linked to epilepsy-related phenotypes, Lennox-Gastaut syndrome, infantile spasms, and Autism Spectrum Disorder.

  12. Spanish personal name variations in national and international biomedical databases: implications for information retrieval and bibliometric studies

    PubMed Central

    Ruiz-Pérez, R.; López-Cózar, E. Delgado; Jiménez-Contreras, E.

    2002-01-01

    Objectives: The study sought to investigate how Spanish names are handled by national and international databases and to identify mistakes that can undermine the usefulness of these databases for locating and retrieving works by Spanish authors. Methods: The authors sampled 172 articles published by authors from the University of Granada Medical School between 1987 and 1996 and analyzed the variations in how each of their names was indexed in Science Citation Index (SCI), MEDLINE, and Índice Médico Español (IME). The number and types of variants that appeared for each author's name were recorded and compared across databases to identify inconsistencies in indexing practices. We analyzed the relationship between variability (number of variants of an author's name) and productivity (number of items the name was associated with as an author), the consequences for retrieval of information, and the most frequent indexing structures used for Spanish names. Results: The proportion of authors who appeared under more than one name was 48.1% in SCI, 50.7% in MEDLINE, and 69.0% in IME. Productivity correlated directly with variability: more than 50% of the authors listed on five to ten items appeared under more than one name in any given database, and close to 100% of the authors listed on more than ten items appeared under two or more variants. Productivity correlated inversely with retrievability: as the number of variants for a name increased, the number of items retrieved under each variant decreased. For the most highly productive authors, the number of items retrieved under each variant tended toward one. The most frequent indexing methods varied between databases. In MEDLINE and IME, names were indexed correctly as “first surname second surname, first name initial middle name initial” (if present) in 41.7% and 49.5% of the records, respectively. However, in SCI, the most frequent method was “first surname, first name initial second name initial” (48.0% of the records) and first surname and second surname run together, first name initial (18.3%). Conclusions: Retrievability on the basis of author's name was poor in all three databases. Each database uses accurate indexing methods, but these methods fail to result in consistency or coherence for specific entries. The likely causes of inconsistency are: (1) use by authors of variants of their names during their publication careers, (2) lack of authority control in all three databases, (3) the use of an inappropriate indexing method for Spanish names in SCI, (4) authors' inconsistent behaviors, and (5) possible editorial interventions by some journals. We offer some suggestions as to how to avert the proliferation of author name variants in the databases. PMID:12398248

  13. High-performance web services for querying gene and variant annotation.

    PubMed

    Xin, Jiwen; Mark, Adam; Afrasiabi, Cyrus; Tsueng, Ginger; Juchler, Moritz; Gopal, Nikhil; Stupp, Gregory S; Putman, Timothy E; Ainscough, Benjamin J; Griffith, Obi L; Torkamani, Ali; Whetzel, Patricia L; Mungall, Christopher J; Mooney, Sean D; Su, Andrew I; Wu, Chunlei

    2016-05-06

    Efficient tools for data management and integration are essential for many aspects of high-throughput biology. In particular, annotations of genes and human genetic variants are commonly used but highly fragmented across many resources. Here, we describe MyGene.info and MyVariant.info, high-performance web services for querying gene and variant annotation information. These web services are currently accessed more than three million times per month. They also demonstrate a generalizable cloud-based model for organizing and querying biological annotation information. MyGene.info and MyVariant.info are provided as high-performance web services, accessible at http://mygene.info and http://myvariant.info. Both are offered free of charge to the research community.
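
    As a rough sketch of how such a web service is addressed, the Python snippet below builds a request URL for a single variant without contacting the network. The abstract only gives the site address; the `/v1/variant/<id>` path, the `fields` parameter, the use of https, and the example HGVS-style identifier are assumptions based on the service's public documentation and should be checked against it before use.

```python
from urllib.parse import quote, urlencode

# Assumed API root; the abstract itself only names http://myvariant.info.
BASE_URL = "https://myvariant.info/v1"


def variant_url(hgvs_id, fields=None):
    """Build (but do not send) a lookup URL for one variant identifier.

    hgvs_id: an HGVS-style genomic identifier, e.g. "chr7:g.140453134T>C".
    fields:  optional list of annotation fields to request (assumed parameter).
    """
    # Percent-encode characters such as ':' and '>' so the id is URL-safe.
    url = f"{BASE_URL}/variant/{quote(hgvs_id, safe='')}"
    if fields:
        url += "?" + urlencode({"fields": ",".join(fields)})
    return url


# Illustrative identifier only; any valid variant id could be substituted.
example = variant_url("chr7:g.140453134T>C", fields=["clinvar", "dbsnp"])
print(example)
```

    The resulting URL could then be fetched with any HTTP client; keeping URL construction separate from the request makes the encoding step easy to inspect and test offline.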

  14. Cloning and characterization of two novel DNases from Streptococcus pyogenes.

    PubMed

    Hasegawa, Tadao; Torii, Keizo; Hashikawa, Shinnosuke; Iinuma, Yoshitsugu; Ohta, Michio

    2002-06-01

    The proteins in the culture supernatant (exoproteins) from Streptococcus pyogenes serotype M1 were separated by two-dimensional gel electrophoresis, and their N-terminal amino acid sequences were determined. The amino acid sequences were compared to sequences in the S. pyogenes genome database. The coding sequence showed similarity to sequences of two genes, mf2-v (mf2 variant) and mf3, which had sequence similarity to genes encoding mitogenic factor (MF); MF has DNase activity. The recombinant genes were expressed in Escherichia coli and the proteins were synthesized. Mf2-v and Mf3 had DNase activity. The activity of Mf2-v was localized to the C-terminal half of the protein. The mf3 gene was shown to be present in most clinically isolated strains of S. pyogenes tested, and the mf2 gene was detected in 20% of the isolates. The products of the mf2 and mf3 genes in clinically isolated S. pyogenes strains were thus shown to be DNases.

  15. Changes in global gene expression profiles induced by HPV 16 E6 oncoprotein variants in cervical carcinoma C33-A cells

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zacapala-Gómez, Ana Elvira, E-mail: zak_ana@yahoo.com.mx; Del Moral-Hernández, Oscar, E-mail: odelmoralh@gmail.com; Villegas-Sepúlveda, Nicolás, E-mail: nvillega@cinvestav.mx

    We analyzed the effects of the expression of HPV 16 E6 oncoprotein variants (AA-a, AA-c, E-A176/G350, E-C188/G350, E-G350), and the E-Prototype in global gene expression profiles in an in vitro model. E6 gene was cloned into an expression vector fused to GFP and was transfected in C33-A cells. Affymetrix GeneChip Human Transcriptome Array 2.0 platform was used to analyze the expression of over 245,000 coding transcripts. We found that HPV16 E6 variants altered the expression of 387 different genes in comparison with E-Prototype. The altered genes are involved in cellular processes related to the development of cervical carcinoma, such as adhesion, angiogenesis, apoptosis, differentiation, cell cycle, proliferation, transcription and protein translation. Our results show that polymorphic changes in HPV16 E6 natural variants are sufficient to alter the overall gene expression profile in C33-A cells, explaining in part the observed differences in oncogenic potential of HPV16 variants. - Highlights: • Amino acid changes in HPV16 E6 variants modulate the transcription of specific genes. • This is the first comparison of global gene expression profile of HPV 16 E6 variants. • Each HPV 16 E6 variant appears to have its own molecular signature.

  16. Reconciling newborn screening and a novel splice variant in BTD associated with partial biotinidase deficiency: A BabySeq Project case report.

    PubMed

    Murry, Jaclyn B; Machini, Kalotina; Ceyhan-Birsoy, Ozge; Kritzer, Amy; Krier, Joel B; Lebo, Matthew S; Fayer, Shawn; Genetti, Casie A; Vannoy, Grace E; Yu, Timothy W; Agrawal, Pankaj B; Parad, Richard B; Holm, Ingrid A; McGuire, Amy L; Green, Robert C; Beggs, Alan H; Rehm, Heidi L; Project, The BabySeq

    2018-05-04

    Here, we report a newborn female infant from the well-baby cohort of the BabySeq Project who was identified with compound heterozygous BTD gene variants. The two identified variants included a well-established pathogenic variant (c.1612C>T, p.Arg538Cys) that causes profound biotinidase deficiency (BTD) in homozygosity. In addition, a novel splice variant (c.44+1G>A, p.?) was identified in the invariant splice donor region of intron 1, potentially predictive of loss of function. The novel variant was predicted to impact splicing of exon 1; however, given the absence of any reported pathogenic variants in exon 1 and the presence of alternative splicing with exon 1 absent in most tissues in the GTEx database, we assigned an initial classification of uncertain significance. Follow-up medical record review of state mandated newborn screen (NBS) results revealed an initial out-of-range biotinidase activity level. Levels from a repeat NBS sample barely passed cut-off into the normal range. To determine whether the infant was biotinidase deficient, subsequent diagnostic enzyme activity testing was performed, confirming partial BTD, and resulted in a change of management for this patient. This led to reclassification of the novel splice variant based on these results. In conclusion, combining the genetic and NBS results together prompted clinical follow-up that confirmed partial biotinidase deficiency, and informed this novel splice site's reclassification emphasizing the importance of combining iterative genetic and phenotypic evaluations. Cold Spring Harbor Laboratory Press.

  17. Identification of novel genetic causes of Rett syndrome-like phenotypes.

    PubMed

    Lopes, Fátima; Barbosa, Mafalda; Ameur, Adam; Soares, Gabriela; de Sá, Joaquim; Dias, Ana Isabel; Oliveira, Guiomar; Cabral, Pedro; Temudo, Teresa; Calado, Eulália; Cruz, Isabel Fineza; Vieira, José Pedro; Oliveira, Renata; Esteves, Sofia; Sauer, Sascha; Jonasson, Inger; Syvänen, Ann-Christine; Gyllensten, Ulf; Pinto, Dalila; Maciel, Patrícia

    2016-03-01

    The aim of this work was to identify new genetic causes of Rett-like phenotypes using array comparative genomic hybridisation and a whole exome sequencing approach. We studied a cohort of 19 Portuguese patients (16 girls, 3 boys) with a clinical presentation significantly overlapping Rett syndrome (RTT). Genetic analysis included filtering of the single nucleotide variants and indels with preference for de novo, homozygous/compound heterozygous, or maternally inherited X linked variants. Examination by MRI and muscle biopsies was also performed. Pathogenic genomic imbalances were found in two patients (10.5%): an 18q21.2 deletion encompassing four exons of the TCF4 gene and a mosaic UPD of chromosome 3. Variants in genes previously implicated in neurodevelopmental disorders (NDD) were identified in six patients (32%): de novo variants in EEF1A2, STXBP1 and ZNF238 were found in three patients, maternally inherited X linked variants in SLC35A2, ZFX and SHROOM4 were detected in two male patients and one homozygous variant in EIF2B2 was detected in one patient. Variants were also detected in five novel NDD candidate genes (26%): we identified de novo variants in the RHOBTB2, SMARCA1 and GABBR2 genes; a homozygous variant in EIF4G1; compound heterozygous variant in HTT. Network analysis reveals that these genes interact by means of protein interactions with each other and with the known RTT genes. These findings expand the phenotypical spectrum of previously known NDD genes to encompass RTT-like clinical presentations and identify new candidate genes for RTT-like phenotypes. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/

  18. SNPdbe: constructing an nsSNP functional impacts database.

    PubMed

    Schaefer, Christian; Meier, Alice; Rost, Burkhard; Bromberg, Yana

    2012-02-15

    Many existing databases annotate experimentally characterized single nucleotide polymorphisms (SNPs). Each non-synonymous SNP (nsSNP) changes one amino acid in the gene product (single amino acid substitution; SAAS). This change can either affect protein function or be neutral in that respect. Most polymorphisms lack experimental annotation of their functional impact. Here, we introduce SNPdbe, the SNP database of effects, with predictions of computationally annotated functional impacts of SNPs. Database entries represent nsSNPs in dbSNP and 1000 Genomes collection, as well as variants from UniProt and PMD. SAASs come from >2600 organisms; 'human' being the most prevalent. The impact of each SAAS on protein function is predicted using the SNAP and SIFT algorithms and augmented with experimentally derived function/structure information and disease associations from PMD, OMIM and UniProt. SNPdbe is consistently updated and easily augmented with new sources of information. The database is available as a MySQL dump and via a web front end that allows searches with any combination of organism names, sequences and mutation IDs. http://www.rostlab.org/services/snpdbe.

  19. Systematic detection of positive selection in the human-pathogen interactome and lasting effects on infectious disease susceptibility.

    PubMed

    Corona, Erik; Wang, Liuyang; Ko, Dennis; Patel, Chirag J

    2018-01-01

    Infectious disease has shaped the natural genetic diversity of humans throughout the world. A new approach to capture positive selection driven by pathogens would provide information regarding pathogen exposure in distinct human populations and the constantly evolving arms race between host and disease-causing agents. We created a human pathogen interaction database and used the integrated haplotype score (iHS) to detect recent positive selection in genes that interact with proteins from 26 different pathogens. We used the Human Genome Diversity Panel to identify specific populations harboring pathogen-interacting genes that have undergone positive selection. We found that human genes that interact with 9 pathogen species show evidence of recent positive selection. These pathogens are Yersinia pestis, human immunodeficiency virus (HIV) 1, Zaire ebolavirus, Francisella tularensis, dengue virus, human respiratory syncytial virus, measles virus, Rubella virus, and Bacillus anthracis. For HIV-1, GWAS demonstrate that some naturally selected variants in the host-pathogen protein interaction networks continue to have functional consequences for susceptibility to these pathogens. We show that selected human genes were enriched for HIV susceptibility variants (identified through GWAS), providing further support for the hypothesis that ancient humans were exposed to lentivirus pandemics. Human genes in the Italian, Miao, and Biaka Pygmy populations that interact with Y. pestis show significant signs of selection. These results reveal some of the genetic footprints created by pathogens in the human genome that may have left lasting marks on susceptibility to infectious disease.

  20. The evolving genetic risk for sporadic ALS.

    PubMed

    Gibson, Summer B; Downie, Jonathan M; Tsetsou, Spyridoula; Feusier, Julie E; Figueroa, Karla P; Bromberg, Mark B; Jorde, Lynn B; Pulst, Stefan M

    2017-07-18

    To estimate the genetic risk conferred by known amyotrophic lateral sclerosis (ALS)-associated genes to the pathogenesis of sporadic ALS (SALS) using variant allele frequencies combined with predicted variant pathogenicity. Whole exome sequencing and repeat expansion PCR of C9orf72 and ATXN2 were performed on 87 patients of European ancestry with SALS seen at the University of Utah. DNA variants that change the protein coding sequence of 31 ALS-associated genes were annotated to determine which were rare and deleterious as predicted by MetaSVM. The percentage of patients with SALS with a rare and deleterious variant or repeat expansion in an ALS-associated gene was calculated. An odds ratio analysis was performed comparing the burden of ALS-associated genes in patients with SALS vs 324 normal controls. Nineteen rare nonsynonymous variants in an ALS-associated gene, 2 of which were found in 2 different individuals, were identified in 21 patients with SALS. Further, 5 deleterious C9orf72 and 2 ATXN2 repeat expansions were identified. A total of 17.2% of patients with SALS had a rare and deleterious variant or repeat expansion in an ALS-associated gene. The genetic burden of ALS-associated genes in patients with SALS as predicted by MetaSVM was significantly higher than in normal controls. Previous analyses have identified SALS-predisposing variants only in terms of their rarity in normal control populations. By incorporating variant pathogenicity as well as variant frequency, we demonstrated that the genetic risk contributed by these genes for SALS is substantially lower than previous estimates. © 2017 American Academy of Neurology.

  1. Novel mutation in the CHST6 gene causes macular corneal dystrophy in a black South African family.

    PubMed

    Carstens, Nadia; Williams, Susan; Goolam, Saadiah; Carmichael, Trevor; Cheung, Ming Sin; Büchmann-Møller, Stine; Sultan, Marc; Staedtler, Frank; Zou, Chao; Swart, Peter; Rice, Dennis S; Lacoste, Arnaud; Paes, Kim; Ramsay, Michèle

    2016-07-20

    Macular corneal dystrophy (MCD) is a rare autosomal recessive disorder that is characterized by progressive corneal opacity that starts in early childhood and ultimately progresses to blindness in early adulthood. The aim of this study was to identify the cause of MCD in a black South African family with two affected sisters. A multigenerational South African Sotho-speaking family with type I MCD was studied using whole exome sequencing. Variant filtering to identify the MCD-causal mutation included the disease inheritance pattern, variant minor allele frequency and potential functional impact. Ophthalmologic evaluation of the cases revealed a typical MCD phenotype and none of the other family members were affected. An average of 127 713 variants per individual was identified following exome sequencing and approximately 1.2 % were not present in any of the investigated public databases. Variant filtering identified a homozygous E71Q mutation in CHST6, a known MCD-causing gene encoding corneal N-acetyl glucosamine-6-O-sulfotransferase. This E71Q mutation results in a non-conservative amino acid change in a highly conserved functional domain of the human CHST6 that is essential for enzyme activity. We identified a novel E71Q mutation in CHST6 as the MCD-causal mutation in a black South African family with type I MCD. This is the first description of MCD in a black Sub-Saharan African family and therefore contributes valuable insights into the genetic aetiology of this disease, while improving genetic counselling for this and potentially other MCD families.

  2. Protein-altering variants associated with body mass index implicate pathways that control energy intake and expenditure in obesity.

    PubMed

    Turcot, Valérie; Lu, Yingchang; Highland, Heather M; Schurmann, Claudia; Justice, Anne E; Fine, Rebecca S; Bradfield, Jonathan P; Esko, Tõnu; Giri, Ayush; Graff, Mariaelisa; Guo, Xiuqing; Hendricks, Audrey E; Karaderi, Tugce; Lempradl, Adelheid; Locke, Adam E; Mahajan, Anubha; Marouli, Eirini; Sivapalaratnam, Suthesh; Young, Kristin L; Alfred, Tamuno; Feitosa, Mary F; Masca, Nicholas G D; Manning, Alisa K; Medina-Gomez, Carolina; Mudgal, Poorva; Ng, Maggie C Y; Reiner, Alex P; Vedantam, Sailaja; Willems, Sara M; Winkler, Thomas W; Abecasis, Gonçalo; Aben, Katja K; Alam, Dewan S; Alharthi, Sameer E; Allison, Matthew; Amouyel, Philippe; Asselbergs, Folkert W; Auer, Paul L; Balkau, Beverley; Bang, Lia E; Barroso, Inês; Bastarache, Lisa; Benn, Marianne; Bergmann, Sven; Bielak, Lawrence F; Blüher, Matthias; Boehnke, Michael; Boeing, Heiner; Boerwinkle, Eric; Böger, Carsten A; Bork-Jensen, Jette; Bots, Michiel L; Bottinger, Erwin P; Bowden, Donald W; Brandslund, Ivan; Breen, Gerome; Brilliant, Murray H; Broer, Linda; Brumat, Marco; Burt, Amber A; Butterworth, Adam S; Campbell, Peter T; Cappellani, Stefania; Carey, David J; Catamo, Eulalia; Caulfield, Mark J; Chambers, John C; Chasman, Daniel I; Chen, Yii-Der I; Chowdhury, Rajiv; Christensen, Cramer; Chu, Audrey Y; Cocca, Massimiliano; Collins, Francis S; Cook, James P; Corley, Janie; Corominas Galbany, Jordi; Cox, Amanda J; Crosslin, David S; Cuellar-Partida, Gabriel; D'Eustacchio, Angela; Danesh, John; Davies, Gail; Bakker, Paul I W; Groot, Mark C H; Mutsert, Renée; Deary, Ian J; Dedoussis, George; Demerath, Ellen W; Heijer, Martin; Hollander, Anneke I; Ruijter, Hester M; Dennis, Joe G; Denny, Josh C; Di Angelantonio, Emanuele; Drenos, Fotios; Du, Mengmeng; Dubé, Marie-Pierre; Dunning, Alison M; Easton, Douglas F; Edwards, Todd L; Ellinghaus, David; Ellinor, Patrick T; Elliott, Paul; Evangelou, Evangelos; Farmaki, Aliki-Eleni; Farooqi, I Sadaf; Faul, Jessica D; Fauser, Sascha; Feng, Shuang; Ferrannini, Ele; 
Ferrieres, Jean; Florez, Jose C; Ford, Ian; Fornage, Myriam; Franco, Oscar H; Franke, Andre; Franks, Paul W; Friedrich, Nele; Frikke-Schmidt, Ruth; Galesloot, Tessel E; Gan, Wei; Gandin, Ilaria; Gasparini, Paolo; Gibson, Jane; Giedraitis, Vilmantas; Gjesing, Anette P; Gordon-Larsen, Penny; Gorski, Mathias; Grabe, Hans-Jörgen; Grant, Struan F A; Grarup, Niels; Griffiths, Helen L; Grove, Megan L; Gudnason, Vilmundur; Gustafsson, Stefan; Haessler, Jeff; Hakonarson, Hakon; Hammerschlag, Anke R; Hansen, Torben; Harris, Kathleen Mullan; Harris, Tamara B; Hattersley, Andrew T; Have, Christian T; Hayward, Caroline; He, Liang; Heard-Costa, Nancy L; Heath, Andrew C; Heid, Iris M; Helgeland, Øyvind; Hernesniemi, Jussi; Hewitt, Alex W; Holmen, Oddgeir L; Hovingh, G Kees; Howson, Joanna M M; Hu, Yao; Huang, Paul L; Huffman, Jennifer E; Ikram, M Arfan; Ingelsson, Erik; Jackson, Anne U; Jansson, Jan-Håkan; Jarvik, Gail P; Jensen, Gorm B; Jia, Yucheng; Johansson, Stefan; Jørgensen, Marit E; Jørgensen, Torben; Jukema, J Wouter; Kahali, Bratati; Kahn, René S; Kähönen, Mika; Kamstrup, Pia R; Kanoni, Stavroula; Kaprio, Jaakko; Karaleftheri, Maria; Kardia, Sharon L R; Karpe, Fredrik; Kathiresan, Sekar; Kee, Frank; Kiemeney, Lambertus A; Kim, Eric; Kitajima, Hidetoshi; Komulainen, Pirjo; Kooner, Jaspal S; Kooperberg, Charles; Korhonen, Tellervo; Kovacs, Peter; Kuivaniemi, Helena; Kutalik, Zoltán; Kuulasmaa, Kari; Kuusisto, Johanna; Laakso, Markku; Lakka, Timo A; Lamparter, David; Lange, Ethan M; Lange, Leslie A; Langenberg, Claudia; Larson, Eric B; Lee, Nanette R; Lehtimäki, Terho; Lewis, Cora E; Li, Huaixing; Li, Jin; Li-Gao, Ruifang; Lin, Honghuang; Lin, Keng-Hung; Lin, Li-An; Lin, Xu; Lind, Lars; Lindström, Jaana; Linneberg, Allan; Liu, Ching-Ti; Liu, Dajiang J; Liu, Yongmei; Lo, Ken S; Lophatananon, Artitaya; Lotery, Andrew J; Loukola, Anu; Luan, Jian'an; Lubitz, Steven A; Lyytikäinen, Leo-Pekka; Männistö, Satu; Marenne, Gaëlle; Mazul, Angela L; McCarthy, Mark I; McKean-Cowdin, 
Roberta; Medland, Sarah E; Meidtner, Karina; Milani, Lili; Mistry, Vanisha; Mitchell, Paul; Mohlke, Karen L; Moilanen, Leena; Moitry, Marie; Montgomery, Grant W; Mook-Kanamori, Dennis O; Moore, Carmel; Mori, Trevor A; Morris, Andrew D; Morris, Andrew P; Müller-Nurasyid, Martina; Munroe, Patricia B; Nalls, Mike A; Narisu, Narisu; Nelson, Christopher P; Neville, Matt; Nielsen, Sune F; Nikus, Kjell; Njølstad, Pål R; Nordestgaard, Børge G; Nyholt, Dale R; O'Connel, Jeffrey R; O'Donoghue, Michelle L; Olde Loohuis, Loes M; Ophoff, Roel A; Owen, Katharine R; Packard, Chris J; Padmanabhan, Sandosh; Palmer, Colin N A; Palmer, Nicholette D; Pasterkamp, Gerard; Patel, Aniruddh P; Pattie, Alison; Pedersen, Oluf; Peissig, Peggy L; Peloso, Gina M; Pennell, Craig E; Perola, Markus; Perry, James A; Perry, John R B; Pers, Tune H; Person, Thomas N; Peters, Annette; Petersen, Eva R B; Peyser, Patricia A; Pirie, Ailith; Polasek, Ozren; Polderman, Tinca J; Puolijoki, Hannu; Raitakari, Olli T; Rasheed, Asif; Rauramaa, Rainer; Reilly, Dermot F; Renström, Frida; Rheinberger, Myriam; Ridker, Paul M; Rioux, John D; Rivas, Manuel A; Roberts, David J; Robertson, Neil R; Robino, Antonietta; Rolandsson, Olov; Rudan, Igor; Ruth, Katherine S; Saleheen, Danish; Salomaa, Veikko; Samani, Nilesh J; Sapkota, Yadav; Sattar, Naveed; Schoen, Robert E; Schreiner, Pamela J; Schulze, Matthias B; Scott, Robert A; Segura-Lepe, Marcelo P; Shah, Svati H; Sheu, Wayne H-H; Sim, Xueling; Slater, Andrew J; Small, Kerrin S; Smith, Albert V; Southam, Lorraine; Spector, Timothy D; Speliotes, Elizabeth K; Starr, John M; Stefansson, Kari; Steinthorsdottir, Valgerdur; Stirrups, Kathleen E; Strauch, Konstantin; Stringham, Heather M; Stumvoll, Michael; Sun, Liang; Surendran, Praveen; Swift, Amy J; Tada, Hayato; Tansey, Katherine E; Tardif, Jean-Claude; Taylor, Kent D; Teumer, Alexander; Thompson, Deborah J; Thorleifsson, Gudmar; Thorsteinsdottir, Unnur; Thuesen, Betina H; Tönjes, Anke; Tromp, Gerard; Trompet, Stella; 
Tsafantakis, Emmanouil; Tuomilehto, Jaakko; Tybjaerg-Hansen, Anne; Tyrer, Jonathan P; Uher, Rudolf; Uitterlinden, André G; Uusitupa, Matti; Laan, Sander W; Duijn, Cornelia M; Leeuwen, Nienke; van Setten, Jessica; Vanhala, Mauno; Varbo, Anette; Varga, Tibor V; Varma, Rohit; Velez Edwards, Digna R; Vermeulen, Sita H; Veronesi, Giovanni; Vestergaard, Henrik; Vitart, Veronique; Vogt, Thomas F; Völker, Uwe; Vuckovic, Dragana; Wagenknecht, Lynne E; Walker, Mark; Wallentin, Lars; Wang, Feijie; Wang, Carol A; Wang, Shuai; Wang, Yiqin; Ware, Erin B; Wareham, Nicholas J; Warren, Helen R; Waterworth, Dawn M; Wessel, Jennifer; White, Harvey D; Willer, Cristen J; Wilson, James G; Witte, Daniel R; Wood, Andrew R; Wu, Ying; Yaghootkar, Hanieh; Yao, Jie; Yao, Pang; Yerges-Armstrong, Laura M; Young, Robin; Zeggini, Eleftheria; Zhan, Xiaowei; Zhang, Weihua; Zhao, Jing Hua; Zhao, Wei; Zhao, Wei; Zhou, Wei; Zondervan, Krina T; Rotter, Jerome I; Pospisilik, John A; Rivadeneira, Fernando; Borecki, Ingrid B; Deloukas, Panos; Frayling, Timothy M; Lettre, Guillaume; North, Kari E; Lindgren, Cecilia M; Hirschhorn, Joel N; Loos, Ruth J F

    2018-01-01

    Genome-wide association studies (GWAS) have identified >250 loci for body mass index (BMI), implicating pathways related to neuronal biology. Most GWAS loci represent clusters of common, noncoding variants from which pinpointing causal genes remains challenging. Here we combined data from 718,734 individuals to discover rare and low-frequency (minor allele frequency (MAF) < 5%) coding variants associated with BMI. We identified 14 coding variants in 13 genes, of which 8 variants were in genes (ZBTB7B, ACHE, RAPGEF3, RAB21, ZFHX3, ENTPD6, ZFR2 and ZNF169) newly implicated in human obesity, 2 variants were in genes (MC4R and KSR2) previously observed to be mutated in extreme obesity and 2 variants were in GIPR. The effect sizes of rare variants are ~10 times larger than those of common variants, with the largest effect observed in carriers of an MC4R mutation introducing a stop codon (p.Tyr35Ter, MAF = 0.01%), who weighed ~7 kg more than non-carriers. Pathway analyses based on the variants associated with BMI confirm enrichment of neuronal genes and provide new evidence for adipocyte and energy expenditure biology, widening the potential of genetically supported therapeutic targets in obesity.

  3. DBATE: database of alternative transcripts expression.

    PubMed

    Bianchi, Valerio; Colantoni, Alessio; Calderone, Alberto; Ausiello, Gabriele; Ferrè, Fabrizio; Helmer-Citterich, Manuela

    2013-01-01

    The use of high-throughput RNA sequencing technology (RNA-seq) allows whole transcriptome analysis, providing an unbiased and unabridged view of alternative transcript expression. Coupling splicing variant-specific expression with its functional inference is still an open and difficult issue for which we created the DataBase of Alternative Transcripts Expression (DBATE), a web-based repository storing expression values and functional annotation of alternative splicing variants. We processed 13 large RNA-seq panels from human healthy tissues and in disease conditions, reporting expression levels and functional annotations gathered and integrated from different sources for each splicing variant, using a variant-specific annotation transfer pipeline. The possibility to perform complex queries by cross-referencing different functional annotations permits the retrieval of desired subsets of splicing variant expression values that can be visualized in several ways, from simple to more informative. DBATE is intended as a novel tool to help appreciate how, and possibly why, the transcriptome expression is shaped. DATABASE URL: http://bioinformatica.uniroma2.it/DBATE/.

  4. Mutation analysis of genes within the dynactin complex in a cohort of hereditary peripheral neuropathies.

    PubMed

    Tey, S; Ahmad-Annuar, A; Drew, A P; Shahrizaila, N; Nicholson, G A; Kennerson, M L

    2016-08-01

    The cytoplasmic dynein-dynactin genes are attractive candidates for neurodegenerative disorders given their functional role in retrograde transport along neurons. The cytoplasmic dynein heavy chain (DYNC1H1) gene has been implicated in various neurodegenerative disorders, and dynactin 1 (DCTN1) genes have been implicated in a wide spectrum of disorders including motor neuron disease, Parkinson's disease, spinobulbar muscular atrophy and hereditary spastic paraplegia. However, the involvement of other dynactin genes in inherited peripheral neuropathies (IPN), namely hereditary sensory neuropathy, hereditary motor neuropathy and Charcot-Marie-Tooth disease, is underreported. We screened eight genes (DCTN1-6, ACTR1A and ACTR1B) in 136 IPN patients using whole-exome sequencing and high-resolution melt (HRM) analysis. Eight non-synonymous variants (including one novel variant) and three synonymous variants were identified. Four variants have been reported previously in other studies; however, segregation analysis within family members excluded them from causing IPN in these families. No variants of disease significance were identified in this study, suggesting the dynactin genes are unlikely to be a common cause of IPNs. However, with the ease of querying gene variants from exome data, these genes remain worthwhile candidates to assess unsolved IPN families for variants that may affect the function of the proteins. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  5. Resequencing three candidate genes discovers seven potentially deleterious variants susceptibility to major depressive disorder and suicide attempts in Chinese.

    PubMed

    Rao, Shitao; Leung, Cherry She Ting; Lam, Macro Hb; Wing, Yun Kwok; Waye, Mary Miu Yee; Tsui, Stephen Kwok Wing

    2017-03-01

    To date almost 200 genes have been found to be associated with major depressive disorder (MDD) or suicide attempts (SA), but molecular mechanisms have been reported for very few of them. This study aimed to find out whether there were common or rare variants in three candidate genes altering the risk for MDD and SA in Chinese. Three candidate genes (HOMER1, SLC6A4 and TEF) were chosen for resequencing analysis and association studies as they were reported to be involved in the etiology of MDD and SA. Following that, bioinformatics analyses were applied to the variants of interest. After resequencing analysis and alignment of the amplicons, a total of 34 common or rare variants were found in 36 randomly selected Hong Kong Chinese patients with both MDD and SA. Among those, seven variants show potentially deleterious features. Rs60029191 and a rare variant located in the regulatory region of the HOMER1 gene may affect promoter activity by interacting with predicted transcription factors. Two missense mutations in the SLC6A4 coding regions were reported for the first time in Hong Kong Chinese MDD and SA patients, and both of them could affect the efficiency with which SLC6A4 transports serotonin. Moreover, a common variant, rs6354, located in the untranslated region of this gene may affect the expression level or exonic splicing of the serotonin transporter. In addition, both the well-studied polymorphism rs738499 and a low-frequency variant in the promoter region of the TEF gene were found to be located in potential transcription factor binding sites, which may allow the two variants to influence the promoter activity of the gene. This study elucidated the potential molecular mechanisms by which the three candidate genes alter the risk for MDD and SA. These findings imply that not only common variants but also rare variants could contribute to the genetic susceptibility to MDD and SA in Chinese. Copyright © 2016 Elsevier B.V. All rights reserved.

  6. Clinical diagnosis and typing of systemic amyloidosis in subcutaneous fat aspirates by mass spectrometry-based proteomics

    PubMed Central

    Vrana, Julie A.; Theis, Jason D.; Dasari, Surendra; Mereuta, Oana M.; Dispenzieri, Angela; Zeldenrust, Steven R.; Gertz, Morie A.; Kurtin, Paul J.; Grogg, Karen L.; Dogan, Ahmet

    2014-01-01

    Examination of abdominal subcutaneous fat aspirates is a practical, sensitive and specific method for the diagnosis of systemic amyloidosis. Here we describe the development and implementation of a clinical assay using mass spectrometry-based proteomics to type amyloidosis in subcutaneous fat aspirates. First, we validated the assay comparing amyloid-positive (n=43) and -negative (n=26) subcutaneous fat aspirates. The assay classified amyloidosis with 88% sensitivity and 96% specificity. We then implemented the assay as a clinical test, and analyzed 366 amyloid-positive subcutaneous fat aspirates in a 4-year period as part of routine clinical care. The assay had a sensitivity of 90%, and diverse amyloid types, including immunoglobulin light chain (74%), transthyretin (13%), serum amyloid A (1%), gelsolin (1%), and lysozyme (1%), were identified. Using bioinformatics, we identified a universal amyloid proteome signature, which has high sensitivity and specificity for amyloidosis similar to that of Congo red staining. We curated proteome databases which included variant proteins associated with systemic amyloidosis, and identified clonotypic immunoglobulin variable gene usage in immunoglobulin light chain amyloidosis, and the variant peptides in hereditary transthyretin amyloidosis. In conclusion, mass spectrometry-based proteomic analysis of subcutaneous fat aspirates offers a powerful tool for the diagnosis and typing of systemic amyloidosis. The assay reveals the underlying pathogenesis by identifying variable gene usage in immunoglobulin light chains and the variant peptides in hereditary amyloidosis. PMID:24747948

  7. X-exome sequencing identifies a HDAC8 variant in a large pedigree with X-linked intellectual disability, truncal obesity, gynaecomastia, hypogonadism and unusual face.

    PubMed

    Harakalova, Magdalena; van den Boogaard, Marie-Jose; Sinke, Richard; van Lieshout, Stef; van Tuil, Marc C; Duran, Karen; Renkens, Ivo; Terhal, Paulien A; de Kovel, Carolien; Nijman, Ies J; van Haelst, Mieke; Knoers, Nine V A M; van Haaften, Gijs; Kloosterman, Wigard; Hennekam, Raoul C M; Cuppen, Edwin; Ploos van Amstel, Hans Kristian

    2012-08-01

    We present a large Dutch family with seven males affected by a novel syndrome of X-linked intellectual disability, hypogonadism, gynaecomastia, truncal obesity, short stature and recognisable craniofacial manifestations resembling but not identical to Wilson-Turner syndrome. Seven female relatives show a much milder expression of the phenotype. We performed X chromosome exome (X-exome) sequencing in five individuals from this family and identified a novel intronic variant in the histone deacetylase 8 gene (HDAC8), c.164+5G>A, which disturbs the normal splicing of exon 2 resulting in exon skipping, and introduces a premature stop at the beginning of the histone deacetylase catalytic domain. The identified variant completely segregates in this family and was absent in 96 Dutch controls and available databases. Affected female carriers showed a notably skewed X-inactivation pattern in lymphocytes in which the mutated X-chromosome was completely inactivated. HDAC8 is a member of the protein family of histone deacetylases that play a major role in epigenetic gene silencing during development. HDAC8 specifically controls the patterning of the skull with the mouse HDAC8 knock-out showing craniofacial deformities of the skull. The present family provides the first evidence for involvement of HDAC8 in a syndromic form of intellectual disability.

  8. Utility of whole-genome sequencing for detection of newborn screening disorders in a population cohort of 1,696 neonates.

    PubMed

    Bodian, Dale L; Klein, Elisabeth; Iyer, Ramaswamy K; Wong, Wendy S W; Kothiyal, Prachi; Stauffer, Daniel; Huddleston, Kathi C; Gaither, Amber D; Remsburg, Irina; Khromykh, Alina; Baker, Robin L; Maxwell, George L; Vockley, Joseph G; Niederhuber, John E; Solomon, Benjamin D

    2016-03-01

    To assess the potential of whole-genome sequencing (WGS) to replicate and augment results from conventional blood-based newborn screening (NBS). Research-generated WGS data from an ancestrally diverse cohort of 1,696 infants and both parents of each infant were analyzed for variants in 163 genes involved in disorders included or under discussion for inclusion in US NBS programs. WGS results were compared with results from state NBS and related follow-up testing. NBS genes are generally well covered by WGS. There is a median of one (range: 0-6) database-annotated pathogenic variant in the NBS genes per infant. Results of WGS and NBS in detecting 28 state-screened disorders and four hemoglobin traits were concordant for 88.6% of true positives (n = 35) and 98.9% of true negatives (n = 45,757). Of the five infants affected with a state-screened disorder, WGS identified two whereas NBS detected four. WGS yielded fewer false positives than NBS (0.037 vs. 0.17%) but more results of uncertain significance (0.90 vs. 0.013%). WGS may help rule in and rule out NBS disorders, pinpoint molecular diagnoses, and detect conditions not amenable to current NBS assays.

  9. An integrative computational approach for prioritization of genomic variants

    DOE PAGES

    Dubchak, Inna; Balasubramanian, Sandhya; Wang, Sheng; ...

    2014-12-15

    An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. This study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.

  10. Identification of pathogenic gene mutations in LMNA and MYBPC3 that alter RNA splicing.

    PubMed

    Ito, Kaoru; Patel, Parth N; Gorham, Joshua M; McDonough, Barbara; DePalma, Steven R; Adler, Emily E; Lam, Lien; MacRae, Calum A; Mohiuddin, Syed M; Fatkin, Diane; Seidman, Christine E; Seidman, J G

    2017-07-18

    Genetic variants that cause haploinsufficiency account for many autosomal dominant (AD) disorders. Gene-based diagnosis classifies variants that alter canonical splice signals as pathogenic, but due to imperfect understanding of RNA splice signals other variants that may create or eliminate splice sites are often clinically classified as variants of unknown significance (VUS). To improve recognition of pathogenic splice-altering variants in AD disorders, we used computational tools to prioritize VUS and developed a cell-based minigene splicing assay to confirm aberrant splicing. Using this two-step procedure we evaluated all rare variants in two AD cardiomyopathy genes, lamin A/C (LMNA) and myosin binding protein C (MYBPC3). We demonstrate that 13 LMNA and 35 MYBPC3 variants identified in cardiomyopathy patients alter RNA splicing, representing a 50% increase in the numbers of established damaging splice variants in these genes. Over half of these variants are annotated as VUS by clinical diagnostic laboratories. Familial analyses of one variant, a synonymous LMNA VUS, demonstrated segregation with cardiomyopathy affection status and altered cardiac LMNA splicing. Application of this strategy should improve diagnostic accuracy and variant classification in other haploinsufficient AD disorders.

  11. Mutational Landscape of Candidate Genes in Familial Prostate Cancer

    PubMed Central

    Johnson, Anna M.; Zuhlke, Kimberly A.; Plotts, Chris; McDonnell, Shannon K.; Middha, Sumit; Riska, Shaun M.; Thibodeau, Stephen N.; Douglas, Julie A.; Cooney, Kathleen A.

    2014-01-01

    Background: Family history is a major risk factor for prostate cancer (PCa), suggesting a genetic component to the disease. However, traditional linkage and association studies have failed to fully elucidate the underlying genetic basis of familial PCa. Methods: Here we use a candidate gene approach to identify potential PCa susceptibility variants in whole exome sequencing data from familial PCa cases. Six hundred ninety-seven candidate genes were identified based on function, location near a known chromosome 17 linkage signal, and/or previous association with prostate or other cancers. Single nucleotide variants (SNVs) in these candidate genes were identified in whole exome sequence data from 33 PCa cases from 11 multiplex PCa families (3 cases/family). Results: Overall, 4856 candidate gene SNVs were identified, including 1052 missense and 10 nonsense variants. Twenty missense variants were shared by all 3 family members in each family in which they were observed. Additionally, 15 missense variants were shared by 2 of 3 family members and predicted to be deleterious by 5 different algorithms. Four missense variants, BLM Gln123Arg, PARP2 Arg283Gln, LRCC46 Ala295Thr and KIF2B Pro91Leu, and 1 nonsense variant, CYP3A43 Arg441Ter, showed complete co-segregation with PCa status. Twelve additional variants displayed partial co-segregation with PCa. Conclusions: Forty-three nonsense and shared, missense variants were identified in our candidate genes. Further research is needed to determine the contribution of these variants to PCa susceptibility. PMID:25111073

  12. Detection of Allelic Variants of the POLE and POLD1 Genes in Colorectal Cancer Patients

    PubMed Central

    LA, Pätzold; D, Bērziņa; Z, Daneberga; J, Gardovskis; E, Miklaševičs

    2017-01-01

    Incidence of colorectal cancer is high worldwide and it mostly occurs as an accumulation of environmental factors and genetic alterations. Hereditary colorectal cancer can develop as a part of a hereditary syndrome. There is a suspected correlation between colorectal cancer and allelic variants of the POLE and POLD1 genes. The aim of the present study was to look for associations between the allelic variants in the POLE and POLD1 genes and colorectal cancer. One thousand, seven hundred and forty-nine DNA samples from colorectal cancer patients were collected from 2002 to 2013. Samples were divided in three groups: hereditary colorectal cancer patients, patients with different hereditary cancer syndromes in their families and patients with no cancer history in their families. The DNA samples were screened for allelic variants of POLE rs483352909 and POLD1 rs397514632 using denaturing high performance liquid chromatography (DHPLC). All patients were negative for allelic variants rs483352909 of the POLE gene and rs397514632 of the POLD1 gene. One allelic variant rs373243003 in the POLE gene and one novel duplication of four nucleotides at the excision site between intron and exon (c.1384-5dupCCTA) in the POLD1 gene were found. We could not detect or confirm the connection between the genetic variants in the POLD1 and POLE genes and colorectal cancer patients, but we detected a novel genetic variant with an unknown significance. PMID:29876237

  13. Next generation sequencing to identify novel genetic variants causative of autosomal dominant familial hypercholesterolemia associated with increased risk of coronary heart disease.

    PubMed

    Al-Allaf, Faisal A; Athar, Mohammad; Abduljaleel, Zainularifeen; Taher, Mohiuddin M; Khan, Wajahatullah; Ba-Hammam, Faisal A; Abalkhail, Hala; Alashwal, Abdullah

    2015-07-01

    Familial hypercholesterolemia (FH) is an autosomal dominant inherited disease characterized by elevated plasma low-density lipoprotein cholesterol (LDL-C). It is caused by variants in Ldlr, ApoB or Pcsk9, which result in high levels of LDL-C leading to early coronary heart disease. Whole-genome sequencing is not suitable for screening FH variants because of its high cost. Hence, in this study we performed targeted customized sequencing of 12 FH genes (Ldlr, ApoB, Pcsk9, Abca1, Apoa2, Apoc3, Apon2, Arh, Ldlrap1, Apoc2, ApoE, and Lpl) that have been implicated in the homozygous phenotype of a proband pedigree to identify candidate variants by NGS (Ion Torrent PGM). Only three genes (Ldlr, ApoB, and Pcsk9) were found to be highly associated with FH based on the variant rate. The results showed that seven deleterious variants in the Ldlr, ApoB, and Pcsk9 genes were pathological and clinically significant based on predictions by SIFT and PolyPhen. Targeted customized sequencing is an efficient technique for screening variants among targeted FH genes. Final validation of the seven deleterious variants by capillary sequencing yielded only one novel variant, found in exon 14 of the Ldlr gene (c.2026delG, p.Gly676fs). The variant found in the Ldlr gene was a novel heterozygous variant derived from a male in the proband. Copyright © 2015 Elsevier B.V. All rights reserved.

  14. Variants at serotonin transporter and 2A receptor genes predict cooperative behavior differentially according to presence of punishment.

    PubMed

    Schroeder, Kari B; McElreath, Richard; Nettle, Daniel

    2013-03-05

    Punishment of free-riding has been implicated in the evolution of cooperation in humans, and yet mechanisms for punishment avoidance remain largely uninvestigated. Individual variation in these mechanisms may stem from variation in the serotonergic system, which modulates processing of aversive stimuli. Functional serotonin gene variants have been associated with variation in the processing of aversive stimuli and widely studied as risk factors for psychiatric disorders. We show that variants at the serotonin transporter gene (SLC6A4) and serotonin 2A receptor gene (HTR2A) predict contributions to the public good in economic games, dependent upon whether contribution behavior can be punished. Participants with a variant at the serotonin transporter gene contribute more, leading to group-level differences in cooperation, but this effect dissipates in the presence of punishment. When contribution behavior can be punished, those with a variant at the serotonin 2A receptor gene contribute more than those without it. This variant also predicts a more stressful experience of the games. The diversity of institutions (including norms) that govern cooperation and punishment may create selective pressures for punishment avoidance that change rapidly across time and space. Variant-specific epigenetic regulation of these genes, as well as population-level variation in the frequencies of these variants, may facilitate adaptation to local norms of cooperation and punishment.

  15. Variants at serotonin transporter and 2A receptor genes predict cooperative behavior differentially according to presence of punishment

    PubMed Central

    Schroeder, Kari B.; McElreath, Richard; Nettle, Daniel

    2013-01-01

    Punishment of free-riding has been implicated in the evolution of cooperation in humans, and yet mechanisms for punishment avoidance remain largely uninvestigated. Individual variation in these mechanisms may stem from variation in the serotonergic system, which modulates processing of aversive stimuli. Functional serotonin gene variants have been associated with variation in the processing of aversive stimuli and widely studied as risk factors for psychiatric disorders. We show that variants at the serotonin transporter gene (SLC6A4) and serotonin 2A receptor gene (HTR2A) predict contributions to the public good in economic games, dependent upon whether contribution behavior can be punished. Participants with a variant at the serotonin transporter gene contribute more, leading to group-level differences in cooperation, but this effect dissipates in the presence of punishment. When contribution behavior can be punished, those with a variant at the serotonin 2A receptor gene contribute more than those without it. This variant also predicts a more stressful experience of the games. The diversity of institutions (including norms) that govern cooperation and punishment may create selective pressures for punishment avoidance that change rapidly across time and space. Variant-specific epigenetic regulation of these genes, as well as population-level variation in the frequencies of these variants, may facilitate adaptation to local norms of cooperation and punishment. PMID:23431136

  16. Whole-Exome Sequencing Identifies Novel Variants for Tooth Agenesis.

    PubMed

    Dinckan, N; Du, R; Petty, L E; Coban-Akdemir, Z; Jhangiani, S N; Paine, I; Baugh, E H; Erdem, A P; Kayserili, H; Doddapaneni, H; Hu, J; Muzny, D M; Boerwinkle, E; Gibbs, R A; Lupski, J R; Uyguner, Z O; Below, J E; Letra, A

    2018-01-01

    Tooth agenesis is a common craniofacial abnormality in humans and represents failure to develop 1 or more permanent teeth. Tooth agenesis is complex, and variations in about a dozen genes have been reported as contributing to the etiology. Here, we combined whole-exome sequencing, array-based genotyping, and linkage analysis to identify putative pathogenic variants in candidate disease genes for tooth agenesis in 10 multiplex Turkish families. Novel homozygous and heterozygous variants in LRP6, DKK1, LAMA3, and COL17A1 genes, as well as known variants in WNT10A, were identified as likely pathogenic in isolated tooth agenesis. Novel variants in KREMEN1 were identified as likely pathogenic in 2 families with suspected syndromic tooth agenesis. Variants in more than 1 gene were identified segregating with tooth agenesis in 2 families, suggesting oligogenic inheritance. Structural modeling of missense variants suggests deleterious effects to the encoded proteins. Functional analysis of an indel variant (c.3607+3_6del) in LRP6 suggested that the predicted resulting mRNA is subject to nonsense-mediated decay. Our results support a major role for WNT pathways genes in the etiology of tooth agenesis while revealing new candidate genes. Moreover, oligogenic cosegregation was suggestive for complex inheritance and potentially complex gene product interactions during development, contributing to improved understanding of the genetic etiology of familial tooth agenesis.

  17. Single nucleotide polymorphisms in the neuropeptide Y2 receptor (NPY2R) gene and association with severe obesity in French white subjects.

    PubMed

    Siddiq, A; Gueorguiev, M; Samson, C; Hercberg, S; Heude, B; Levy-Marchal, C; Jouret, B; Weill, J; Meyre, D; Walley, A; Froguel, P

    2007-03-01

    Genetic variants of genes for peptide YY (PYY), neuropeptide Y2 receptor (NPY2R) and pancreatic polypeptide (PPY) were investigated for association with severe obesity. The initial screening of the genes for variants was performed by sequencing in a group of severely obese subjects (n=161). Case-control analysis of the common variants was then carried out in 557 severely obese adults, 515 severely obese children and 1,163 non-obese/non-diabetic control subjects. Rare variants were genotyped in 700 obese children and the non-obese/non-diabetic control subjects (n=1,163). Significant association was found for a 5' variant (rs6857715) in the NPY2R gene with both severe adult obesity (p=0.002) and childhood obesity (p=0.02). This significant association was further supported by a pooled allelic analysis of all obese cases (adults and children, n=928) vs the control subjects (n=938) (p=0.0004, odds ratio=1.3, 95% CI 1.1-1.5). Quantitative trait analysis of BMI and WHR was performed and significant association was observed for SNP rs1047214 in NPY2R with an increase in WHR in the severely obese children (co-dominant model p=0.005, recessive model p=0.001). Association was also observed for an intron 3 variant (rs162430) in the PYY gene with childhood obesity (p=0.04). No significant associations were observed for PPY variants. Only one rare variant in the NPY2R gene (C-5641T) was not found in lean individuals and this was found to co-segregate with obesity in one family. These results provide evidence of association for NPY2R and PYY gene variants with obesity and none for PPY variants. A rare variant of the NPY2R gene showed evidence of co-segregation with obesity and its contribution to obesity should be investigated further.

  18. Prevalence of pathogenic germline variants detected by multigene sequencing in unselected Japanese patients with ovarian cancer

    PubMed Central

    Hirasawa, Akira; Imoto, Issei; Naruto, Takuya; Akahane, Tomoko; Yamagami, Wataru; Nomura, Hiroyuki; Masuda, Kiyoshi; Susumu, Nobuyuki; Tsuda, Hitoshi; Aoki, Daisuke

    2017-01-01

    Pathogenic germline BRCA1, BRCA2 (BRCA1/2), and several other gene variants predispose women to primary ovarian, fallopian tube, and peritoneal carcinoma (OC), although variant frequency and relevance information is scarce in Japanese women with OC. Using targeted panel sequencing, we screened 230 unselected Japanese women with OC from our hospital-based cohort for pathogenic germline variants in 75 or 79 OC-associated genes. Pathogenic variants of 11 genes were identified in 41 (17.8%) women: 19 (8.3%; BRCA1), 8 (3.5%; BRCA2), 6 (2.6%; mismatch repair genes), 3 (1.3%; RAD51D), 2 (0.9%; ATM), 1 (0.4%; MRE11A), 1 (FANCC), and 1 (GABRA6). Carriers of BRCA1/2 or any other tested gene pathogenic variants were more likely to be diagnosed younger, have first or second-degree relatives with OC, and have OC classified as high-grade serous carcinoma (HGSC). After adjustment for these variables, all 3 features were independent predictive factors for pathogenic variants in any tested genes whereas only the latter two remained for variants in BRCA1/2. Our data indicate similar variant prevalence in Japanese patients with OC and other ethnic groups and suggest that HGSC and OC family history may facilitate genetic predisposition prediction in Japanese patients with OC and referring high-risk patients for genetic counseling and testing. PMID:29348823
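    The per-gene carrier counts and percentages quoted in the abstract above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the counts are copied from the abstract, while the variable names and the one-decimal rounding are assumptions made for this check and are not part of the original study.

```python
# Cohort size and per-gene carrier counts as quoted in the abstract.
cohort_size = 230
carriers = {
    "BRCA1": 19,
    "BRCA2": 8,
    "mismatch repair genes": 6,
    "RAD51D": 3,
    "ATM": 2,
    "MRE11A": 1,
    "FANCC": 1,
    "GABRA6": 1,
}

# Total carriers should match the 41 women reported overall.
total_carriers = sum(carriers.values())

# Percentage of the cohort per gene, rounded to one decimal (assumed rounding).
percentages = {
    gene: round(100 * count / cohort_size, 1) for gene, count in carriers.items()
}
overall_percentage = round(100 * total_carriers / cohort_size, 1)

for gene, pct in percentages.items():
    print(f"{gene}: {carriers[gene]} carriers ({pct}%)")
print(f"Any gene: {total_carriers} carriers ({overall_percentage}%)")
```

    Under these assumptions the tally reproduces the 41 carriers (17.8%) stated in the abstract, and single-carrier genes work out to 0.4% each.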

  19. Prevalence of pathogenic germline variants detected by multigene sequencing in unselected Japanese patients with ovarian cancer.

    PubMed

    Hirasawa, Akira; Imoto, Issei; Naruto, Takuya; Akahane, Tomoko; Yamagami, Wataru; Nomura, Hiroyuki; Masuda, Kiyoshi; Susumu, Nobuyuki; Tsuda, Hitoshi; Aoki, Daisuke

    2017-12-22

    Pathogenic germline BRCA1, BRCA2 (BRCA1/2), and several other gene variants predispose women to primary ovarian, fallopian tube, and peritoneal carcinoma (OC), although variant frequency and relevance information is scarce in Japanese women with OC. Using targeted panel sequencing, we screened 230 unselected Japanese women with OC from our hospital-based cohort for pathogenic germline variants in 75 or 79 OC-associated genes. Pathogenic variants of 11 genes were identified in 41 (17.8%) women: 19 (8.3%; BRCA1), 8 (3.5%; BRCA2), 6 (2.6%; mismatch repair genes), 3 (1.3%; RAD51D), 2 (0.9%; ATM), 1 (0.4%; MRE11A), 1 (FANCC), and 1 (GABRA6). Carriers of BRCA1/2 or any other tested gene pathogenic variants were more likely to be diagnosed younger, have first or second-degree relatives with OC, and have OC classified as high-grade serous carcinoma (HGSC). After adjustment for these variables, all 3 features were independent predictive factors for pathogenic variants in any tested genes whereas only the latter two remained for variants in BRCA1/2. Our data indicate similar variant prevalence in Japanese patients with OC and other ethnic groups and suggest that HGSC and OC family history may facilitate genetic predisposition prediction in Japanese patients with OC and referring high-risk patients for genetic counseling and testing.

  20. Mutation spectrum of genes associated with steroid-resistant nephrotic syndrome in Chinese children.

    PubMed

    Wang, Ying; Dang, Xiqiang; He, Qingnan; Zhen, Yan; He, Xiaoxie; Yi, Zhuwen; Zhu, Kuichun

    2017-08-20

    Approximately 20% of children with idiopathic nephrotic syndrome do not respond to steroid therapy. More than 30 genes have been identified as disease-causing genes for steroid-resistant nephrotic syndrome (SRNS), but few reports have come from the Chinese population. The coding regions of genes commonly associated with SRNS were analyzed to characterize the gene mutation spectrum in children with SRNS in central China. The first phase of the study screened five genes (NPHS1, NPHS2, PLCE1, WT1, and TRPC6) in 38 children by Sanger sequencing. The second phase screened 17 genes in 33 children by next-generation DNA sequencing (NGS): 22 new patients and 11 patients from the first phase without positive findings. Overall, deleterious or putatively deleterious gene variants were identified in 19 patients (31.7%), including four NPHS1 variants among five patients and three PLCE1 variants among four other patients. Variants in COL4A3, COL4A4, or COL4A5 were found in six patients. Eight novel variants were identified: two in NPHS1, two in PLCE1, and one each in NPHS2, LAMB2, COL4A3, and COL4A4. Of the children with variants, 55.6% failed to respond to immunosuppressive agent therapy, while the resistance rate in children without variants was 44.4%. Our results show that screening for deleterious variants in some common genes in children clinically suspected of having SRNS might be helpful for disease diagnosis as well as prediction of treatment efficacy and prognosis. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Implication of common and disease specific variants in CLU, CR1, and PICALM.

    PubMed

    Ferrari, Raffaele; Moreno, Jorge H; Minhajuddin, Abu T; O'Bryant, Sid E; Reisch, Joan S; Barber, Robert C; Momeni, Parastoo

    2012-08-01

    Two recent genome-wide association studies (GWAS) for late onset Alzheimer's disease (LOAD) revealed 3 new genes: clusterin (CLU), phosphatidylinositol binding clathrin assembly protein (PICALM), and complement receptor 1 (CR1). In order to evaluate association with these genome-wide association study-identified genes and to isolate the variants contributing to the pathogenesis of LOAD, we genotyped the top single nucleotide polymorphisms (SNPs), rs11136000 (CLU), rs3818361 (CR1), and rs3851179 (PICALM), and sequenced the entire coding regions of these genes in our cohort of 342 LOAD patients and 277 control subjects. We confirmed the association of rs3851179 (PICALM) (p = 7.4 × 10^-3) with the disease status. Through sequencing we identified 18 variants in CLU, 3 of which were found exclusively in patients; 8 variants (out of 65) in CR1 gene were only found in patients and the 16 variants identified in PICALM gene were present in both patients and controls. In silico analysis of the variants in PICALM did not predict any damaging effect on the protein. The haplotype analysis of the variants in each gene predicted a common haplotype when the 3 single nucleotide polymorphisms rs11136000 (CLU), rs3818361 (CR1), and rs3851179 (PICALM), respectively, were included. For each gene the haplotype structure and size differed between patients and controls. In conclusion, we confirmed association of CLU, CR1, and PICALM genes with the disease status in our cohort through identification of a number of disease-specific variants among patients through the sequencing of the coding region of these genes. Published by Elsevier Inc.

  2. Variant pathogenicity evaluation in the community-driven Inherited Neuropathy Variant Browser.

    PubMed

    Saghira, Cima; Bis, Dana M; Stanek, David; Strickland, Alleene; Herrmann, David N; Reilly, Mary M; Scherer, Steven S; Shy, Michael E; Züchner, Stephan

    2018-05-01

    Charcot-Marie-Tooth disease (CMT) is an umbrella term for inherited neuropathies affecting an estimated one in 2,500 people. Over 120 CMT and related genes have been identified and clinical gene panels often contain more than 100 genes. Such a large genomic space will invariably yield variants of uncertain clinical significance (VUS) in nearly any person tested. This rise in number of VUS creates major challenges for genetic counseling. Additionally, fewer individual variants in known genes are being published as the academic merit is decreasing, and most testing now happens in clinical laboratories, which typically do not correlate their variants with clinical phenotypes. For CMT, we aim to encourage and facilitate the global capture of variant data to gain a large collection of alleles in CMT genes, ideally in conjunction with phenotypic information. The Inherited Neuropathy Variant Browser provides user-friendly open access to currently reported variation in CMT genes. Geneticists, physicians, and genetic counselors can enter variants detected by clinical tests or in research studies in addition to genetic variation gathered from published literature, which are then submitted to ClinVar biannually. Active participation of the broader CMT community will provide an advance over existing resources for interpretation of CMT genetic variation. © 2018 Wiley Periodicals, Inc.

  3. Human AZU-1 gene, variants thereof and expressed gene products

    DOEpatents

    Chen, Huei-Mei; Bissell, Mina

    2004-06-22

    A human AZU-1 gene, mutants, variants and fragments thereof. Protein products encoded by the AZU-1 gene and homologs encoded by the variants of AZU-1 gene acting as tumor suppressors or markers of malignancy progression and tumorigenicity reversion. Identification, isolation and characterization of AZU-1 and AZU-2 genes localized to a tumor suppressive locus at chromosome 10q26, highly expressed in nonmalignant and premalignant cells derived from a human breast tumor progression model. Recombinant full-length protein sequences encoded by the AZU-1 gene, and nucleotide sequences of the AZU-1 and AZU-2 genes and variants and fragments thereof. Monoclonal or polyclonal antibodies specific to AZU-1- or AZU-2-encoded protein and to AZU-1- or AZU-2-encoded protein homologs.

  4. Exome-based analysis of cardiac arrhythmia, respiratory control, and epilepsy genes in sudden unexpected death in epilepsy.

    PubMed

    Bagnall, Richard D; Crompton, Douglas E; Petrovski, Slavé; Lam, Lien; Cutmore, Carina; Garry, Sarah I; Sadleir, Lynette G; Dibbens, Leanne M; Cairns, Anita; Kivity, Sara; Afawi, Zaid; Regan, Brigid M; Duflou, Johan; Berkovic, Samuel F; Scheffer, Ingrid E; Semsarian, Christopher

    2016-04-01

    The leading cause of epilepsy-related premature mortality is sudden unexpected death in epilepsy (SUDEP). The cause of SUDEP remains unknown. To search for genetic risk factors in SUDEP cases, we performed an exome-based analysis of rare variants. Demographic and clinical information of 61 SUDEP cases were collected. Exome sequencing and rare variant collapsing analysis with 2,936 control exomes were performed to test for genes enriched with damaging variants. Additionally, cardiac arrhythmia, respiratory control, and epilepsy genes were screened for variants with frequency of <0.1% and predicted to be pathogenic with multiple in silico tools. The 61 SUDEP cases were categorized as definite SUDEP (n = 54), probable SUDEP (n = 5), and definite SUDEP plus (n = 2). We identified de novo mutations, previously reported pathogenic mutations, or candidate pathogenic variants in 28 of 61 (46%) cases. Four SUDEP cases (7%) had mutations in common genes responsible for the cardiac arrhythmia disease, long QT syndrome (LQTS). Nine cases (15%) had candidate pathogenic variants in dominant cardiac arrhythmia genes. Fifteen cases (25%) had mutations or candidate pathogenic variants in dominant epilepsy genes. No gene reached genome-wide significance with rare variant collapsing analysis; however, DEPDC5 (p = 0.00015) and KCNH2 (p = 0.0037) were among the top 30 genes, genome-wide. A sizeable proportion of SUDEP cases have clinically relevant mutations in cardiac arrhythmia and epilepsy genes. In cases with an LQTS gene mutation, SUDEP may occur as a result of a predictable and preventable cause. Understanding the genetic basis of SUDEP may inform cascade testing of at-risk family members. © 2016 American Neurological Association.

  5. Incorporating gene-environment interaction in testing for association with rare genetic variants.

    PubMed

    Chen, Han; Meigs, James B; Dupuis, Josée

    2014-01-01

    The incorporation of gene-environment interactions could improve the ability to detect genetic associations with complex traits. For common genetic variants, single-marker interaction tests and joint tests of genetic main effects and gene-environment interaction have been well-established and used to identify novel association loci for complex diseases and continuous traits. For rare genetic variants, however, single-marker tests are severely underpowered due to the low minor allele frequency, and only a few gene-environment interaction tests have been developed. We aimed to develop powerful and computationally efficient tests for gene-environment interaction with rare variants. In this paper, we propose interaction and joint tests for testing gene-environment interaction of rare genetic variants. Our approach is a generalization of existing gene-environment interaction tests for multiple genetic variants under certain conditions. We show in our simulation studies that our interaction and joint tests have correct type I errors, and that the joint test is a powerful approach for testing genetic association, allowing for gene-environment interaction. We also illustrate our approach in a real data example from the Framingham Heart Study. Our approach can be applied to both binary and continuous traits, and it is powerful and computationally efficient.

  6. Integrated rare variant-based risk gene prioritization in disease case-control sequencing studies.

    PubMed

    Lin, Jhih-Rong; Zhang, Quanwei; Cai, Ying; Morrow, Bernice E; Zhang, Zhengdong D

    2017-12-01

    Rare variants of major effect play an important role in human complex diseases and can be discovered by sequencing-based genome-wide association studies. Here, we introduce an integrated approach that combines the rare variant association test with gene network and phenotype information to identify risk genes implicated by rare variants for human complex diseases. Our data integration method follows a 'discovery-driven' strategy without relying on prior knowledge about the disease and thus maintains the unbiased character of genome-wide association studies. Simulations reveal that our method can outperform a widely-used rare variant association test method by 2 to 3 times. In a case study of a small disease cohort, we uncovered putative risk genes and the corresponding rare variants that may act as genetic modifiers of congenital heart disease in 22q11.2 deletion syndrome patients. These variants were missed by a conventional approach that relied on the rare variant association test alone.

  7. Variants of human papillomavirus type 16 predispose toward persistent infection

    PubMed Central

    Zhang, Lei; Liao, Hong; Yang, Binlie; Geffre, Christopher P; Zhang, Ai; Zhou, Aizhi; Cao, Huimin; Wang, Jieru; Zhang, Zhenbo; Zheng, Wenxin

    2015-01-01

    A cohort study of 292 Chinese women was conducted to determine the relationship between human papillomavirus (HPV) type 16 variants and persistent viral infection. Enrolled patients were HPV16 positive and had both normal cytology and histology. Flow-through hybridization and gene chip technology were used to identify the HPV type. A PCR sequencing assay was performed to find HPV16 E2, E6 and E7 gene variants. The associations between these variants and HPV16 persistent infection were analyzed by Fisher’s exact test. It was found that the variants T178G, T350G and A442C in the E6 gene, as well as the C3158A and G3248A variants in the E2 gene, were associated with persistent HPV16 infection. No link was observed between E7 variants and persistent viral infection. Our findings suggest that detection of specific HPV variants would help identify patients who are at high risk for viral persistence and development of cervical neoplasia. PMID:26339417

  8. Expression of drought tolerance genes in tropical upland rice cultivars (Oryza sativa).

    PubMed

    Silveira, R D D; Abreu, F R M; Mamidi, S; McClean, P E; Vianello, R P; Lanna, A C; Carneiro, N P; Brondani, C

    2015-07-27

    Gene expression related to drought response in the leaf tissues of two Brazilian upland cultivars, the drought-tolerant Douradão and the drought-sensitive Primavera, was analyzed. RNA-seq identified 27,618 transcripts in the Douradão cultivar, with 24,090 (87.2%) homologous to the rice database, and 27,221 transcripts in the Primavera cultivar, with 23,663 (86.9%) homologous to the rice database. Gene-expression analysis between control and water-deficient treatments revealed 493 and 1154 differentially expressed genes in Douradão and Primavera cultivars, respectively. Genes exclusively expressed under drought were identified for Douradão, including two genes of particular interest coding for the protein peroxidase precursor, which is involved in three distinct metabolic pathways. Comparisons between the two drought-exposed cultivars revealed 2314 genes were differentially expressed (978 upregulated, 1336 downregulated in Douradão). Six genes distributed across 4 different transcription factor families (bHLH, MYB, NAC, and WRKY) were identified, all of which were upregulated in Douradão compared to Primavera during drought. Most of the genes identified in Douradão activate metabolic pathways responsible for production of secondary metabolites and genes coding for enzymatically active signaling receptors. Quantitative PCR validation showed that most gene expression was in agreement with computational prediction of these transcripts. The transcripts identified here will define molecular markers for identification of Cis-acting elements to search for allelic variants of these genes through analysis of polymorphic SNPs in GenBank accessions of upland rice, aiming to develop cultivars with the best combination of these alleles, resulting in materials with high yield potential in the event of drought during the reproductive phase.

  9. Characterization of the two intra-individual sequence variants in the 18S rRNA gene in the plant parasitic nematode, Rotylenchulus reniformis.

    PubMed

    Nyaku, Seloame T; Sripathi, Venkateswara R; Kantety, Ramesh V; Gu, Yong Q; Lawrence, Kathy; Sharma, Govind C

    2013-01-01

    The 18S rRNA gene is fundamental to cellular and organismal protein synthesis and because of its stable persistence through generations it is also used in phylogenetic analysis among taxa. Sequence variation in this gene within a single species is rare, but it has been observed in few metazoan organisms. More frequently it has mostly been reported in the non-transcribed spacer region. Here, we have identified two sequence variants within the near full coding region of 18S rRNA gene from a single reniform nematode (RN) Rotylenchulus reniformis labeled as reniform nematode variant 1 (RN_VAR1) and variant 2 (RN_VAR2). All sequences from three of the four isolates had both RN variants in their sequences; however, isolate 13B had only RN variant 2 sequence. Specific variable base sites (96 or 5.5%) were found within the 18S rRNA gene that can clearly distinguish the two 18S rDNA variants of RN, in 11 (25.0%) and 33 (75.0%) of the 44 RN clones, for RN_VAR1 and RN_VAR2, respectively. Neighbor-joining trees show that the RN_VAR1 is very similar to the previously existing R. reniformis sequence in GenBank, while the RN_VAR2 sequence is more divergent. This is the first report of the identification of two major variants of the 18S rRNA gene in the same single RN, and documents the specific base variation between the two variants, and hypothesizes on simultaneous co-existence of these two variants for this gene.

  10. Characterization of the Two Intra-Individual Sequence Variants in the 18S rRNA Gene in the Plant Parasitic Nematode, Rotylenchulus reniformis

    PubMed Central

    Nyaku, Seloame T.; Sripathi, Venkateswara R.; Kantety, Ramesh V.; Gu, Yong Q.; Lawrence, Kathy; Sharma, Govind C.

    2013-01-01

    The 18S rRNA gene is fundamental to cellular and organismal protein synthesis and because of its stable persistence through generations it is also used in phylogenetic analysis among taxa. Sequence variation in this gene within a single species is rare, but it has been observed in few metazoan organisms. More frequently it has mostly been reported in the non-transcribed spacer region. Here, we have identified two sequence variants within the near full coding region of 18S rRNA gene from a single reniform nematode (RN) Rotylenchulus reniformis labeled as reniform nematode variant 1 (RN_VAR1) and variant 2 (RN_VAR2). All sequences from three of the four isolates had both RN variants in their sequences; however, isolate 13B had only RN variant 2 sequence. Specific variable base sites (96 or 5.5%) were found within the 18S rRNA gene that can clearly distinguish the two 18S rDNA variants of RN, in 11 (25.0%) and 33 (75.0%) of the 44 RN clones, for RN_VAR1 and RN_VAR2, respectively. Neighbor-joining trees show that the RN_VAR1 is very similar to the previously existing R. reniformis sequence in GenBank, while the RN_VAR2 sequence is more divergent. This is the first report of the identification of two major variants of the 18S rRNA gene in the same single RN, and documents the specific base variation between the two variants, and hypothesizes on simultaneous co-existence of these two variants for this gene. PMID:23593343

  11. Identification of causal genes for complex traits.

    PubMed

    Hormozdiari, Farhad; Kichaev, Gleb; Yang, Wen-Yun; Pasaniuc, Bogdan; Eskin, Eleazar

    2015-06-15

    Although genome-wide association studies (GWAS) have identified thousands of variants associated with common diseases and complex traits, only a handful of these variants are validated to be causal. We consider 'causal variants' as variants which are responsible for the association signal at a locus. As opposed to association studies that benefit from linkage disequilibrium (LD), the main challenge in identifying causal variants at associated loci lies in distinguishing among the many closely correlated variants due to LD. This is particularly important for model organisms such as inbred mice, where LD extends much further than in human populations, resulting in large stretches of the genome with significantly associated variants. Furthermore, these model organisms are highly structured and require correction for population structure to remove potential spurious associations. In this work, we propose CAVIAR-Gene (CAusal Variants Identification in Associated Regions), a novel method that is able to operate across large LD regions of the genome while also correcting for population structure. A key feature of our approach is that it provides as output a minimally sized set of genes that captures the genes which harbor causal variants with probability ρ. Through extensive simulations, we demonstrate that our method not only speeds up computation, but also has an average of 10% higher recall rate compared with the existing approaches. We validate our method using real mouse high-density lipoprotein (HDL) data and show that CAVIAR-Gene is able to identify Apoa2 (a gene known to harbor causal variants for HDL), while reducing the number of genes that need to be tested for functionality by a factor of 2. Software is freely available for download at genetics.cs.ucla.edu/caviar. © The Author 2015. Published by Oxford University Press.
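    The idea of a "minimally sized set of genes" that captures the causal gene with probability ρ can be illustrated with a greatly simplified sketch. This is not the CAVIAR-Gene algorithm, which works from association statistics and LD. It assumes exactly one causal gene and that a posterior probability per gene is already available. The function name `rho_gene_set`, the gene labels, and the posterior values are all hypothetical.

```python
def rho_gene_set(posteriors, rho=0.95):
    """Pick the fewest genes whose summed posterior mass reaches rho.

    Simplifying assumption: a single causal gene, so the per-gene
    posteriors sum to 1 and adding genes in descending order of
    posterior yields the smallest qualifying set.
    """
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, mass = [], 0.0
    for gene, prob in ranked:
        if mass >= rho:
            break
        chosen.append(gene)
        mass += prob
    return chosen, mass


# Hypothetical posteriors for illustration only (not taken from the paper).
example = {
    "gene_a": 0.62,
    "gene_b": 0.20,
    "gene_c": 0.10,
    "gene_d": 0.05,
    "gene_e": 0.03,
}
genes, mass = rho_gene_set(example, rho=0.90)
print(genes, round(mass, 2))
```

    In this toy input, three of five genes are enough to reach ρ = 0.90, which mirrors the stated goal of reducing how many genes need functional follow-up.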

  12. Whole-Exome Sequencing in Age-Related Macular Degeneration Identifies Rare Variants in COL8A1, a Component of Bruch's Membrane.

    PubMed

    Corominas, Jordi; Colijn, Johanna M; Geerlings, Maartje J; Pauper, Marc; Bakker, Bjorn; Amin, Najaf; Lores Motta, Laura; Kersten, Eveline; Garanto, Alejandro; Verlouw, Joost A M; van Rooij, Jeroen G J; Kraaij, Robert; de Jong, Paulus T V M; Hofman, Albert; Vingerling, Johannes R; Schick, Tina; Fauser, Sascha; de Jong, Eiko K; van Duijn, Cornelia M; Hoyng, Carel B; Klaver, Caroline C W; den Hollander, Anneke I

    2018-04-26

    Genome-wide association studies and targeted sequencing studies of candidate genes have identified common and rare variants that are associated with age-related macular degeneration (AMD). Whole-exome sequencing (WES) studies allow a more comprehensive analysis of rare coding variants across all genes of the genome and will contribute to a better understanding of the underlying disease mechanisms. To date, the number of WES studies in AMD case-control cohorts remains scarce and sample sizes are limited. To scrutinize the role of rare protein-altering variants in AMD cause, we performed the largest WES study in AMD to date in a large European cohort consisting of 1125 AMD patients and 1361 control participants. Genome-wide case-control association study of WES data. One thousand one hundred twenty-five AMD patients and 1361 control participants. A single variant association test of WES data was performed to detect variants that are associated individually with AMD. The cumulative effect of multiple rare variants within 1 gene was analyzed using a gene-based CMC burden test. Immunohistochemistry was performed to determine the localization of the Col8a1 protein in mouse eyes. Genetic variants associated with AMD. We detected significantly more rare protein-altering variants in the COL8A1 gene in patients (22/2250 alleles [1.0%]) than in control participants (11/2722 alleles [0.4%]; P = 7.07 × 10^-5). The association of rare variants in the COL8A1 gene is independent of the common intergenic variant (rs140647181) near the COL8A1 gene previously associated with AMD. We demonstrated that the Col8a1 protein localizes at Bruch's membrane. This study supported a role for protein-altering variants in the COL8A1 gene in AMD pathogenesis. We demonstrated the presence of Col8a1 in Bruch's membrane, further supporting the role of COL8A1 variants in AMD pathogenesis. Protein-altering variants in COL8A1 may alter the integrity of Bruch's membrane, contributing to the accumulation of drusen and the development of AMD. Copyright © 2018 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.
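    The allele frequencies quoted for COL8A1 can be checked from the reported counts; the allele totals equal twice the participant numbers (2 × 1125 and 2 × 1361). The sketch below also derives a crude allelic odds ratio. That ratio is not reported in the abstract, and the published P value comes from the gene-based burden test, not from this calculation. The function name `allele_summary` is illustrative.

```python
def allele_summary(case_alt, case_total, ctrl_alt, ctrl_total):
    """Return case frequency, control frequency, and allelic odds ratio."""
    case_freq = case_alt / case_total
    ctrl_freq = ctrl_alt / ctrl_total
    # Odds = alleles carrying a rare variant / alleles not carrying one.
    case_odds = case_alt / (case_total - case_alt)
    ctrl_odds = ctrl_alt / (ctrl_total - ctrl_alt)
    return case_freq, ctrl_freq, case_odds / ctrl_odds


# Allele counts reported in the abstract for COL8A1.
case_freq, ctrl_freq, odds_ratio = allele_summary(22, 2250, 11, 2722)
print(f"cases: {case_freq:.1%}, controls: {ctrl_freq:.1%}, OR ~ {odds_ratio:.2f}")
```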

  13. Investigation of exomic variants associated with overall survival in ovarian cancer

    PubMed Central

    Ann Chen, Yian; Larson, Melissa C; Fogarty, Zachary C; Earp, Madalene A; Anton-Culver, Hoda; Bandera, Elisa V; Cramer, Daniel; Doherty, Jennifer A; Goodman, Marc T; Gronwald, Jacek; Karlan, Beth Y; Kjaer, Susanne K; Levine, Douglas A; Menon, Usha; Ness, Roberta B; Pearce, Celeste L; Pejovic, Tanja; Rossing, Mary Anne; Wentzensen, Nicolas; Bean, Yukie T; Bisogna, Maria; Brinton, Louise A; Carney, Michael E; Cunningham, Julie M; Cybulski, Cezary; deFazio, Anna; Dicks, Ed M; Edwards, Robert P; Gayther, Simon A; Gentry-Maharaj, Aleksandra; Gore, Martin; Iversen, Edwin S; Jensen, Allan; Johnatty, Sharon E; Lester, Jenny; Lin, Hui-Yi; Lissowska, Jolanta; Lubinski, Jan; Menkiszak, Janusz; Modugno, Francesmary; Moysich, Kirsten B; Orlow, Irene; Pike, Malcolm C; Ramus, Susan J; Song, Honglin; Terry, Kathryn L; Thompson, Pamela J; Tyrer, Jonathan P; van den Berg, David J; Vierkant, Robert A; Vitonis, Allison F; Walsh, Christine; Wilkens, Lynne R; Wu, Anna H; Yang, Hannah; Ziogas, Argyrios; Berchuck, Andrew; Chenevix-Trench, Georgia; Schildkraut, Joellen M; Permuth-Wey, Jennifer; Phelan, Catherine M; Pharoah, Paul D P; Fridley, Brooke L

    2016-01-01

    Background While numerous susceptibility loci for epithelial ovarian cancer (EOC) have been identified, few associations have been reported with overall survival. In the absence of common prognostic genetic markers, we hypothesize that rare coding variants may be associated with overall EOC survival and assessed their contribution in two exome-based genotyping projects of the Ovarian Cancer Association Consortium (OCAC). Methods The primary patient set (Set 1) included 14 independent EOC studies (4293 patients) and 227,892 variants, and a secondary patient set (Set 2) included six additional EOC studies (1744 patients) and 114,620 variants. Because power to detect rare variants individually is reduced, gene-level tests were conducted. Sets were analyzed separately at individual variants and by gene, and then combined with meta-analyses (73,203 variants and 13,163 genes overlapped). Results No individual variant reached genome-wide statistical significance. A SNP previously implicated to be associated with EOC risk and, to a lesser extent, survival, rs8170, showed the strongest evidence of association with survival and similar effect size estimates across sets (Pmeta=1.1E-6, HRSet1=1.17, HRSet2=1.14). Rare variants in ATG2B, an autophagy gene important for apoptosis, were significantly associated with survival after multiple testing correction (Pmeta=1.1E-6; Pcorrected=0.01). Conclusions Common variant rs8170 and rare variants in ATG2B may be associated with EOC overall survival, although further study is needed. Impact This study represents the first exome-wide association study of EOC survival to include rare variant analyses, and suggests that complementary single variant and gene-level analyses in large studies are needed to identify rare variants that warrant follow-up study. PMID:26747452

  14. An automated procedure to identify biomedical articles that contain cancer-associated gene variants.

    PubMed

    McDonald, Ryan; Scott Winters, R; Ankuda, Claire K; Murphy, Joan A; Rogers, Amy E; Pereira, Fernando; Greenblatt, Marc S; White, Peter S

    2006-09-01

    The proliferation of biomedical literature makes it increasingly difficult for researchers to find and manage relevant information. However, identifying research articles containing mutation data, a requisite first step in integrating large and complex mutation data sets, is currently tedious, time-consuming and imprecise. More effective mechanisms for identifying articles containing mutation information would be beneficial both for the curation of mutation databases and for individual researchers. We developed an automated method that uses information extraction, classifier, and relevance ranking techniques to determine the likelihood of MEDLINE abstracts containing information regarding genomic variation data suitable for inclusion in mutation databases. We targeted the CDKN2A (p16) gene and the procedure for document identification currently used by CDKN2A Database curators as a measure of feasibility. A set of abstracts was manually identified from a MEDLINE search as potentially containing specific CDKN2A mutation events. A subset of these abstracts was used as a training set for a maximum entropy classifier to identify text features distinguishing "relevant" from "not relevant" abstracts. Each document was represented as a set of indicative word, word pair, and entity tagger-derived genomic variation features. When applied to a test set of 200 candidate abstracts, the classifier predicted 88 articles as being relevant; of these, 29 of 32 manuscripts in which manual curation found CDKN2A sequence variants were positively predicted. Thus, the set of potentially useful articles that a manual curator would have to review was reduced by 56%, maintaining 91% recall (sensitivity) and more than doubling precision (positive predictive value). Subsequent expansion of the training set to 494 articles yielded similar precision and recall rates, and comparison of the original and expanded trials demonstrated that the average precision improved with the larger data set. Our results show that automated systems can effectively identify article subsets relevant to a given task and may prove to be powerful tools for the broader research community. This procedure can be readily adapted to any or all genes, organisms, or sets of documents. Published 2006 Wiley-Liss, Inc.
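    The recall, precision, and workload figures in this abstract can be reproduced from the four counts it reports. The sketch below assumes that the baseline precision is what a curator would get by reviewing all 200 candidates; the variable names are illustrative.

```python
# Counts reported in the abstract for the 200-abstract test set.
candidates = 200          # abstracts a curator would otherwise review
predicted_relevant = 88   # abstracts the classifier flagged as relevant
truly_relevant = 32       # abstracts with CDKN2A variants per manual curation
true_positives = 29       # truly relevant abstracts that were flagged

recall = true_positives / truly_relevant
precision = true_positives / predicted_relevant
# Assumed baseline: precision when every candidate abstract is reviewed.
baseline_precision = truly_relevant / candidates
workload_reduction = 1 - predicted_relevant / candidates

print(f"recall: {recall:.0%}")
print(f"precision: {precision:.0%} (baseline {baseline_precision:.0%})")
print(f"precision gain: {precision / baseline_precision:.2f}x")
print(f"review workload reduced by {workload_reduction:.0%}")
```

    Recall and workload reduction match the reported 91% and 56%, and the ratio of classifier precision to baseline precision is slightly above 2, consistent with "more than doubling precision".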

  15. Novel Nine-Exon AR Transcripts (Exon 1/Exon 1b/Exons 2–8) in Normal and Cancerous Breast and Prostate Cells

    PubMed Central

    Hu, Dong Gui; McKinnon, Ross A.; Hulin, Julie-Ann; Mackenzie, Peter I.; Meech, Robyn

    2016-01-01

    Nearly 20 different transcripts of the human androgen receptor (AR) are reported with two currently listed as Refseq isoforms in the NCBI database. Isoform 1 encodes wild-type AR (type 1 AR) and isoform 2 encodes the variant AR45 (type 2 AR). Both variants contain eight exons: they share common exons 2–8 but differ in exon 1 with the canonical exon 1 in isoform 1 and the variant exon 1b in isoform 2. Splicing of exon 1 or exon 1b is reported to be mutually exclusive. In this study, we identified a novel exon 1b (1b/TAG) that contains an additional TAG trinucleotide upstream of exon 1b. Moreover, we identified AR transcripts in both normal and cancerous breast and prostate cells that contained either exon 1b or 1b/TAG spliced between the canonical exon 1 and exon 2, generating nine-exon AR transcripts that we have named isoforms 3a and 3b. The proteins encoded by these new AR variants could regulate androgen-responsive reporters in breast and prostate cancer cells under androgen-depleted conditions. Analysis of type 3 AR-GFP fusion proteins showed partial nuclear localization in PC3 cells under androgen-depleted conditions, supporting androgen-independent activation of the AR. Type 3 AR proteins inhibited androgen-induced growth of LNCaP cells. Microarray analysis identified a small set of type 3a AR target genes in LNCaP cells, including genes known to modulate growth and proliferation of prostate cancer (PCGEM1, PEG3, EPHA3, and EFNB2) or other types of human cancers (TOX3, ST8SIA4, and SLITRK3), and genes that are diagnostic/prognostic biomarkers of prostate cancer (GRINA3, and BCHE). PMID:28035996

  16. Introduction to Deep Sequencing and Its Application to Drug Addiction Research with a Focus on Rare Variants

    PubMed Central

    Wang, Shaolin; Yang, Zhongli; Ma, Jennie Z.; Payne, Thomas J.; Li, Ming D

    2013-01-01

    Through linkage analysis, candidate gene approach, and genome-wide association studies (GWAS), many genetic susceptibility factors for substance dependence have been discovered, such as the aldehyde dehydrogenase gene (ALDH2) for alcohol dependence (AD) and nicotinic acetylcholine receptor (nAChR) subunit variants on chromosomes 8 and 15 for nicotine dependence (ND). However, these confirmed genetic factors contribute only a small portion of the heritability responsible for each addiction. Among many potential factors, rare variants in those identified and unidentified susceptibility genes are thought to contribute greatly to the missing heritability. Several studies focusing on rare variants have been conducted by taking advantage of next-generation sequencing technologies, which revealed that some rare variants of nAChR subunits are associated with ND in both genetic and functional studies. However, these studies investigated variants for only a small number of genes and need to be expanded to broad regions/genes in a larger population. This review presents an update on recently developed methods for rare-variant identification and association analysis and on studies focused on rare-variant discovery and function related to addictions. PMID:23990377

  17. Next-generation sequencing for genetic testing of familial colorectal cancer syndromes.

    PubMed

    Simbolo, Michele; Mafficini, Andrea; Agostini, Marco; Pedrazzani, Corrado; Bedin, Chiara; Urso, Emanuele D; Nitti, Donato; Turri, Giona; Scardoni, Maria; Fassan, Matteo; Scarpa, Aldo

    2015-01-01

    Genetic screening in families with high risk to develop colorectal cancer (CRC) prevents incurable disease and permits personalized therapeutic and follow-up strategies. The advancement of next-generation sequencing (NGS) technologies has revolutionized the throughput of DNA sequencing. A series of 16 probands for either familial adenomatous polyposis (FAP; 8 cases) or hereditary nonpolyposis colorectal cancer (HNPCC; 8 cases) were investigated for intragenic mutations in five CRC familial syndromes-associated genes (APC, MUTYH, MLH1, MSH2, MSH6) applying both a custom multigene Ion AmpliSeq NGS panel and conventional Sanger sequencing. Fourteen pathogenic variants were detected in 13/16 FAP/HNPCC probands (81.3 %); one FAP proband presented two co-existing pathogenic variants, one in APC and one in MUTYH. Thirteen of these 14 pathogenic variants were detected by both NGS and Sanger, while one MSH2 mutation (L280FfsX3) was identified only by Sanger sequencing. This is due to a limitation of the NGS approach in resolving sequences close or within homopolymeric stretches of DNA. To evaluate the performance of our NGS custom panel we assessed its capability to resolve the DNA sequences corresponding to 2225 pathogenic variants reported in the COSMIC database for APC, MUTYH, MLH1, MSH2, MSH6. Our NGS custom panel resolves the sequences where 2108 (94.7 %) of these variants occur. The remaining 117 mutations reside inside or in close proximity to homopolymer stretches; of these 27 (1.2 %) are imprecisely identified by the software but can be resolved by visual inspection of the region, while the remaining 90 variants (4.0 %) are blind spots. In summary, our custom panel would miss 4 % (90/2225) of pathogenic variants that would need a small set of Sanger sequencing reactions to be solved. 
    The multiplex NGS approach has the advantage of analyzing multiple genes in multiple samples simultaneously, requiring only a reduced number of Sanger sequences to resolve homopolymeric DNA regions not adequately assessed by NGS. The implementation of NGS approaches in routine diagnostics of familial CRC is cost-effective and significantly reduces diagnostic turnaround times.
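
    The coverage figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is only a consistency check on the numbers reported above (2225 COSMIC variants, of which 2108 are resolved, 27 are imprecisely identified, and 90 are blind spots); the variable names are illustrative and are not taken from the authors' pipeline.

```python
# Counts as reported in the abstract (names are illustrative assumptions).
TOTAL_VARIANTS = 2225
counts = {
    "resolved": 2108,    # sequences resolved by the NGS panel
    "imprecise": 27,     # imprecisely called, resolvable by visual inspection
    "blind_spot": 90,    # not resolvable by the panel
}

# The three categories should partition the full variant set.
assert sum(counts.values()) == TOTAL_VARIANTS

# Percentage of the total for each category, rounded to one decimal place.
percentages = {
    name: round(100 * n / TOTAL_VARIANTS, 1) for name, n in counts.items()
}

for name, pct in percentages.items():
    print(f"{name}: {counts[name]} / {TOTAL_VARIANTS} = {pct} %")
```

    The rounded values agree with the 94.7 %, 1.2 %, and 4.0 % given in the abstract.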

  18. X-Linked Glomerulopathy Due to COL4A5 Founder Variant.

    PubMed

    Barua, Moumita; John, Rohan; Stella, Lorenzo; Li, Weili; Roslin, Nicole M; Sharif, Bedra; Hack, Saidah; Lajoie-Starkell, Ginette; Schwaderer, Andrew L; Becknell, Brian; Wuttke, Matthias; Köttgen, Anna; Cattran, Daniel; Paterson, Andrew D; Pei, York

    2018-03-01

    Alport syndrome is a rare hereditary disorder caused by rare variants in 1 of 3 genes encoding for type IV collagen. Rare variants in COL4A5 on chromosome Xq22 cause X-linked Alport syndrome, which accounts for ∼80% of the cases. Alport syndrome has a variable clinical presentation, including progressive kidney failure, hearing loss, and ocular defects. Exome sequencing performed in 2 affected related males with an undefined X-linked glomerulopathy characterized by global and segmental glomerulosclerosis, mesangial hypercellularity, and vague basement membrane immune complex deposition revealed a COL4A5 sequence variant, a substitution of a thymine by a guanine at nucleotide 665 (c.T665G; rs281874761) of the coding DNA predicted to lead to a cysteine to phenylalanine substitution at amino acid 222, which was not seen in databases cataloguing natural human genetic variation, including dbSNP138, 1000 Genomes Project release version 01-11-2004, Exome Sequencing Project 21-06-2014, or ExAC 01-11-2014. Review of the literature identified 2 additional families with the same COL4A5 variant leading to similar atypical histopathologic features, suggesting a unique pathologic mechanism initiated by this specific rare variant. Homology modeling suggests that the substitution alters the structural and dynamic properties of the type IV collagen trimer. Genetic analysis comparing members of the 3 families indicated a distant relationship with a shared haplotype, implying a founder effect. Crown Copyright © 2017. Published by Elsevier Inc. All rights reserved.

  19. Large-scale gene-centric analysis identifies novel variants for coronary artery disease.

    PubMed

    2011-09-01

    Coronary artery disease (CAD) has a significant genetic contribution that is incompletely characterized. To complement genome-wide association (GWA) studies, we conducted a large and systematic candidate gene study of CAD susceptibility, including analysis of many uncommon and functional variants. We examined 49,094 genetic variants in ∼2,100 genes of cardiovascular relevance, using a customised gene array in 15,596 CAD cases and 34,992 controls (11,202 cases and 30,733 controls of European descent; 4,394 cases and 4,259 controls of South Asian origin). We attempted to replicate putative novel associations in an additional 17,121 CAD cases and 40,473 controls. Potential mechanisms through which the novel variants could affect CAD risk were explored through association tests with vascular risk factors and gene expression. We confirmed associations of several previously known CAD susceptibility loci (eg, 9p21.3:p<10(-33); LPA:p<10(-19); 1p13.3:p<10(-17)) as well as three recently discovered loci (COL4A1/COL4A2, ZC3HC1, CYP17A1:p<5×10(-7)). However, we found essentially null results for most previously suggested CAD candidate genes. In our replication study of 24 promising common variants, we identified novel associations of variants in or near LIPA, IL5, TRIB1, and ABCG5/ABCG8, with per-allele odds ratios for CAD risk with each of the novel variants ranging from 1.06-1.09. Associations with variants at LIPA, TRIB1, and ABCG5/ABCG8 were supported by gene expression data or effects on lipid levels. Apart from the previously reported variants in LPA, none of the other ∼4,500 low frequency and functional variants showed a strong effect. Associations in South Asians did not differ appreciably from those in Europeans, except for 9p21.3 (per-allele odds ratio: 1.14 versus 1.27 respectively; P for heterogeneity = 0.003). 
    This large-scale gene-centric analysis has identified several novel genes for CAD that relate to diverse biochemical and cellular functions and clarified the literature with regard to many previously suggested genes.

  20. A Protein Domain and Family Based Approach to Rare Variant Association Analysis.

    PubMed

    Richardson, Tom G; Shihab, Hashem A; Rivas, Manuel A; McCarthy, Mark I; Campbell, Colin; Timpson, Nicholas J; Gaunt, Tom R

    2016-01-01

    It has become common practice to analyse large scale sequencing data with statistical approaches based around the aggregation of rare variants within the same gene. We applied a novel approach to rare variant analysis by collapsing variants together using protein domain and family coordinates, regarded to be a more discrete definition of a biologically functional unit. Using Pfam definitions, we collapsed rare variants (Minor Allele Frequency ≤ 1%) together in three different ways 1) variants within single genomic regions which map to individual protein domains 2) variants within two individual protein domain regions which are predicted to be responsible for a protein-protein interaction 3) all variants within combined regions from multiple genes responsible for coding the same protein domain (i.e. protein families). A conventional collapsing analysis using gene coordinates was also undertaken for comparison. We used UK10K sequence data and investigated associations between regions of variants and lipid traits using the sequence kernel association test (SKAT). We observed no strong evidence of association between regions of variants based on Pfam domain definitions and lipid traits. Quantile-Quantile plots illustrated that the overall distributions of p-values from the protein domain analyses were comparable to that of a conventional gene-based approach. Deviations from this distribution suggested that collapsing by either protein domain or gene definitions may be favourable depending on the trait analysed. We have collapsed rare variants together using protein domain and family coordinates to present an alternative approach over collapsing across conventionally used gene-based regions. Although no strong evidence of association was detected in these analyses, future studies may still find value in adopting these approaches to detect previously unidentified association signals.
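
    The first collapsing scheme described above (rare variants grouped by the protein domain region they fall in) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tuple layouts, the function name `collapse_by_region`, and the toy coordinates are all hypothetical, and the association step (SKAT in the abstract) is not shown.

```python
from collections import defaultdict

# Rare-variant cut-off from the abstract: minor allele frequency <= 1%.
MAF_THRESHOLD = 0.01


def collapse_by_region(variants, regions):
    """Group rare variants by the region(s) that contain them.

    variants: iterable of (chrom, pos, maf) tuples (hypothetical layout).
    regions:  iterable of (name, chrom, start, end) tuples with inclusive
              coordinates, e.g. genomic intervals mapped to protein domains.
    Returns a dict mapping region name -> list of (chrom, pos).
    """
    collapsed = defaultdict(list)
    for chrom, pos, maf in variants:
        if maf > MAF_THRESHOLD:
            continue  # skip common variants
        for name, r_chrom, start, end in regions:
            if chrom == r_chrom and start <= pos <= end:
                collapsed[name].append((chrom, pos))
    return dict(collapsed)


# Toy data, invented purely for illustration.
toy_variants = [("1", 100, 0.005), ("1", 150, 0.02), ("1", 210, 0.01), ("2", 100, 0.001)]
toy_regions = [("domain_A", "1", 90, 160), ("domain_B", "1", 200, 250)]

groups = collapse_by_region(toy_variants, toy_regions)
print(groups)
```

    Swapping the region list for gene coordinates would give the conventional gene-based collapsing that the abstract uses as its comparison.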

  1. Whole Gene Capture Analysis of 15 CRC Susceptibility Genes in Suspected Lynch Syndrome Patients.

    PubMed

    Jansen, Anne M L; Geilenkirchen, Marije A; van Wezel, Tom; Jagmohan-Changur, Shantie C; Ruano, Dina; van der Klift, Heleen M; van den Akker, Brendy E W M; Laros, Jeroen F J; van Galen, Michiel; Wagner, Anja; Letteboer, Tom G W; Gómez-García, Encarna B; Tops, Carli M J; Vasen, Hans F; Devilee, Peter; Hes, Frederik J; Morreau, Hans; Wijnen, Juul T

    2016-01-01

    Lynch Syndrome (LS) is caused by pathogenic germline variants in one of the mismatch repair (MMR) genes. However, up to 60% of MMR-deficient colorectal cancer cases are categorized as suspected Lynch Syndrome (sLS) because no pathogenic MMR germline variant can be identified, which leads to difficulties in clinical management. We therefore analyzed the genomic regions of 15 CRC susceptibility genes in leukocyte DNA of 34 unrelated sLS patients and 11 patients with MLH1 hypermethylated tumors with a clear family history. Using targeted next-generation sequencing, we analyzed the entire non-repetitive genomic sequence, including intronic and regulatory sequences, of 15 CRC susceptibility genes. In addition, tumor DNA from 28 sLS patients was analyzed for somatic MMR variants. Of 1979 germline variants found in the leukocyte DNA of 34 sLS patients, one was a pathogenic variant (MLH1 c.1667+1delG). Leukocyte DNA of 11 patients with MLH1 hypermethylated tumors was negative for pathogenic germline variants in the tested CRC susceptibility genes and for germline MLH1 hypermethylation. Somatic DNA analysis of 28 sLS tumors identified eight (29%) cases with two pathogenic somatic variants, one with a VUS predicted to be pathogenic and LOH, and nine cases (32%) with one pathogenic somatic variant (n = 8) or one VUS predicted to be pathogenic (n = 1). This is the first study in sLS patients to include the entire genomic sequence of CRC susceptibility genes. An underlying somatic or germline MMR gene defect was identified in ten of 34 sLS patients (29%). In the remaining sLS patients, the underlying genetic defect explaining the MMR deficiency in their tumors might be found outside the genomic regions harboring the MMR and other known CRC susceptibility genes.

  2. Extreme obesity is associated with variation in genes related to the circadian rhythm of food intake and hypothalamic signaling.

    PubMed

    Mariman, Edwin C M; Bouwman, Freek G; Aller, Erik E J G; van Baak, Marleen A; Wang, Ping

    2015-06-01

    The hypothalamus is important for regulation of energy intake. Mutations in genes involved in the function of the hypothalamus can lead to early-onset severe obesity. To look further into this, we have followed a strategy that allowed us to identify rare and common gene variants as candidates for the background of extreme obesity from a relatively small cohort. For that we focused on subjects with a well-selected phenotype and on a defined gene set and used a rich source of genetic data with stringent cut-off values. A list of 166 genes functionally related to the hypothalamus was generated. In those genes complete exome sequence data from 30 extremely obese subjects (60 genomes) were screened for novel rare indel, nonsense, and missense variants with a predicted negative impact on protein function. In addition, (moderately) common variants in those genes were analyzed for allelic association using the general population as reference (false discovery rate<0.05). Six novel rare deleterious missense variants were found in the genes for BAIAP3, NBEA, PRRC2A, RYR1, SIM1, and TRH, and a novel indel variant in LEPR. Common variants in the six genes for MBOAT4, NPC1, NPW, NUCB2, PER1, and PRRC2A showed significant allelic association with extreme obesity. Our findings underscore the complexity of the genetic background of extreme obesity involving rare and common variants of genes from defined metabolic and physiologic processes, in particular regulation of the circadian rhythm of food intake and hypothalamic signaling. Copyright © 2015 the American Physiological Society.

  3. HFE gene variants affect iron in the brain.

    PubMed

    Nandar, Wint; Connor, James R

    2011-04-01

    Iron accumulation in the brain and increased oxidative stress are consistent observations in many neurodegenerative diseases. Thus, we have begun examination into gene mutations or allelic variants that could be associated with loss of iron homeostasis. One of the mechanisms leading to iron overload is a mutation in the HFE gene, which is involved in iron metabolism. The 2 most common HFE gene variants are C282Y (1.9%) and H63D (8.9%). The C282Y HFE variant is more commonly associated with hereditary hemochromatosis, which is an autosomal recessive disorder, characterized by iron overload in a number of systemic organs. The H63D HFE variant appears less frequently associated with hemochromatosis, but its role in the neurodegenerative diseases has received more attention. At the cellular level, the HFE mutant protein resulting from the H63D HFE gene variant is associated with iron dyshomeostasis, increased oxidative stress, glutamate release, tau phosphorylation, and alteration in inflammatory response, each of which is under investigation as a contributing factor to neurodegenerative diseases. Therefore, the HFE gene variants are proposed to be genetic modifiers or a risk factor for neurodegenerative diseases by establishing an enabling milieu for pathogenic agents. This review will discuss the current knowledge of the association of the HFE gene variants with neurodegenerative diseases: amyotrophic lateral sclerosis, Alzheimer's disease, Parkinson's disease, and ischemic stroke. Importantly, the data herein also begin to dispel the long-held view that the brain is protected from iron accumulation associated with the HFE mutations.

  4. Gene-based meta-analysis of genome-wide association study data identifies independent single-nucleotide polymorphisms in ANXA6 as being associated with systemic lupus erythematosus in Asian populations.

    PubMed

    Zhang, Jing; Zhang, Lu; Zhang, Yan; Yang, Jing; Guo, Mengbiao; Sun, Liangdan; Pan, Hai-Feng; Hirankarn, Nattiya; Ying, Dingge; Zeng, Shuai; Lee, Tsz Leung; Lau, Chak Sing; Chan, Tak Mao; Leung, Alexander Moon Ho; Mok, Chi Chiu; Wong, Sik Nin; Lee, Ka Wing; Ho, Marco Hok Kung; Lee, Pamela Pui Wah; Chung, Brian Hon-Yin; Chong, Chun Yin; Wong, Raymond Woon Sing; Mok, Mo Yin; Wong, Wilfred Hing Sang; Tong, Kwok Lung; Tse, Niko Kei Chiu; Li, Xiang-Pei; Avihingsanon, Yingyos; Rianthavorn, Pornpimol; Deekajorndej, Thavatchai; Suphapeetiporn, Kanya; Shotelersuk, Vorasuk; Ying, Shirley King Yee; Fung, Samuel Ka Shun; Lai, Wai Ming; Garcia-Barceló, Maria-Mercè; Cherny, Stacey S; Sham, Pak Chung; Cui, Yong; Yang, Sen; Ye, Dong Qing; Zhang, Xue-Jun; Lau, Yu Lung; Yang, Wanling

    2015-11-01

    Previous genome-wide association studies (GWAS), which were mainly based on single-variant analysis, have identified many systemic lupus erythematosus (SLE) susceptibility loci. However, the genetic architecture of this complex disease is far from being understood. The aim of this study was to investigate whether using a gene-based analysis may help to identify novel loci, by considering global evidence of association from a gene or a genomic region rather than focusing on evidence for individual variants. Based on the results of a meta-analysis of 2 GWAS of SLE conducted in 2 Asian cohorts, we performed an in-depth gene-based analysis followed by replication in a total of 4,626 patients and 7,466 control subjects of Asian ancestry. Differential allelic expression was measured by pyrosequencing. More than one-half of the reported SLE susceptibility loci showed evidence of independent effects, and this finding is important for understanding the mechanisms of association and explaining disease heritability. ANXA6 was detected as a novel SLE susceptibility gene, with several single-nucleotide polymorphisms (SNPs) contributing independently to the association with disease. The risk allele of rs11960458 correlated significantly with increased expression of ANXA6 in peripheral blood mononuclear cells from heterozygous healthy control subjects. Several other associated SNPs may also regulate ANXA6 expression, according to data obtained from public databases. Higher expression of ANXA6 in patients with SLE was also reported previously. Our study demonstrated the merit of using gene-based analysis to identify novel susceptibility loci, especially those with independent effects, and also demonstrated the widespread presence of loci with independent effects in SLE susceptibility genes. © 2015, American College of Rheumatology.

  5. GTRAC: fast retrieval from compressed collections of genomic variants

    PubMed Central

    Tatwawadi, Kedar; Hernaez, Mikel; Ochoa, Idoia; Weissman, Tsachy

    2016-01-01

    Motivation: The dramatic decrease in the cost of sequencing has resulted in the generation of huge amounts of genomic data, as evidenced by projects such as the UK10K and the Million Veteran Project, with the number of sequenced genomes ranging in the order of 10 K to 1 M. Due to the large redundancies among genomic sequences of individuals from the same species, most of the medical research deals with the variants in the sequences as compared with a reference sequence, rather than with the complete genomic sequences. Consequently, millions of genomes represented as variants are stored in databases. These databases are constantly updated and queried to extract information such as the common variants among individuals or groups of individuals. Previous algorithms for compression of this type of databases lack efficient random access capabilities, rendering querying the database for particular variants and/or individuals extremely inefficient, to the point where compression is often relinquished altogether. Results: We present a new algorithm for this task, called GTRAC, that achieves significant compression ratios while allowing fast random access over the compressed database. For example, GTRAC is able to compress a Homo sapiens dataset containing 1092 samples in 1.1 GB (compression ratio of 160), while allowing for decompression of specific samples in less than a second and decompression of specific variants in 17 ms. GTRAC uses and adapts techniques from information theory, such as a specialized Lempel-Ziv compressor, and tailored succinct data structures. Availability and Implementation: The GTRAC algorithm is available for download at: https://github.com/kedartatwawadi/GTRAC Contact: kedart@stanford.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27587665
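
    As a quick worked example of the compression figures quoted above, the sketch below derives the implied uncompressed size and the average compressed footprint per sample. It assumes that "compression ratio" means uncompressed size divided by compressed size and that 1 GB = 1000 MB; neither convention is stated in the abstract, and the variable names are illustrative.

```python
# Figures quoted in the abstract.
COMPRESSED_GB = 1.1        # size of the compressed dataset
COMPRESSION_RATIO = 160    # assumed to be uncompressed / compressed
N_SAMPLES = 1092           # samples in the Homo sapiens dataset

# Implied size of the dataset before compression.
uncompressed_gb = COMPRESSED_GB * COMPRESSION_RATIO

# Average compressed storage per sample, in MB (assuming 1 GB = 1000 MB).
compressed_mb_per_sample = COMPRESSED_GB * 1000 / N_SAMPLES

print(f"implied uncompressed size: {uncompressed_gb:.0f} GB")
print(f"compressed size per sample: {compressed_mb_per_sample:.2f} MB")
```

    Under these assumptions the uncompressed dataset would be roughly 176 GB, or about 1 MB per sample once compressed.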

  6. GTRAC: fast retrieval from compressed collections of genomic variants.

    PubMed

    Tatwawadi, Kedar; Hernaez, Mikel; Ochoa, Idoia; Weissman, Tsachy

    2016-09-01

    The dramatic decrease in the cost of sequencing has resulted in the generation of huge amounts of genomic data, as evidenced by projects such as the UK10K and the Million Veteran Project, with the number of sequenced genomes ranging in the order of 10 K to 1 M. Due to the large redundancies among genomic sequences of individuals from the same species, most of the medical research deals with the variants in the sequences as compared with a reference sequence, rather than with the complete genomic sequences. Consequently, millions of genomes represented as variants are stored in databases. These databases are constantly updated and queried to extract information such as the common variants among individuals or groups of individuals. Previous algorithms for compression of this type of databases lack efficient random access capabilities, rendering querying the database for particular variants and/or individuals extremely inefficient, to the point where compression is often relinquished altogether. We present a new algorithm for this task, called GTRAC, that achieves significant compression ratios while allowing fast random access over the compressed database. For example, GTRAC is able to compress a Homo sapiens dataset containing 1092 samples in 1.1 GB (compression ratio of 160), while allowing for decompression of specific samples in less than a second and decompression of specific variants in 17 ms. GTRAC uses and adapts techniques from information theory, such as a specialized Lempel-Ziv compressor, and tailored succinct data structures. The GTRAC algorithm is available for download at: https://github.com/kedartatwawadi/GTRAC Contact: kedart@stanford.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  7. No Evidence That Schizophrenia Candidate Genes Are More Associated With Schizophrenia Than Noncandidate Genes.

    PubMed

    Johnson, Emma C; Border, Richard; Melroy-Greif, Whitney E; de Leeuw, Christiaan A; Ehringer, Marissa A; Keller, Matthew C

    2017-11-15

    A recent analysis of 25 historical candidate gene polymorphisms for schizophrenia in the largest genome-wide association study conducted to date suggested that these commonly studied variants were no more associated with the disorder than would be expected by chance. However, the same study identified other variants within those candidate genes that demonstrated genome-wide significant associations with schizophrenia. As such, it is possible that variants within historic schizophrenia candidate genes are associated with schizophrenia at levels above those expected by chance, even if the most-studied specific polymorphisms are not. The present study used association statistics from the largest schizophrenia genome-wide association study conducted to date as input to a gene set analysis to investigate whether variants within schizophrenia candidate genes are enriched for association with schizophrenia. As a group, variants in the most-studied candidate genes were no more associated with schizophrenia than were variants in control sets of noncandidate genes. While a small subset of candidate genes did appear to be significantly associated with schizophrenia, these genes were not particularly noteworthy given the large number of more strongly associated noncandidate genes. The history of schizophrenia research should serve as a cautionary tale to candidate gene investigators examining other phenotypes: our findings indicate that the most investigated candidate gene hypotheses of schizophrenia are not well supported by genome-wide association studies, and it is likely that this will be the case for other complex traits as well. Copyright © 2017 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.

  8. The Histone Database: an integrated resource for histones and histone fold-containing proteins

    PubMed Central

    Mariño-Ramírez, Leonardo; Levine, Kevin M.; Morales, Mario; Zhang, Suiyuan; Moreland, R. Travis; Baxevanis, Andreas D.; Landsman, David

    2011-01-01

    Eukaryotic chromatin is composed of DNA and protein components—core histones—that act to compactly pack the DNA into nucleosomes, the fundamental building blocks of chromatin. These nucleosomes are connected to adjacent nucleosomes by linker histones. Nucleosomes are highly dynamic and, through various core histone post-translational modifications and incorporation of diverse histone variants, can serve as epigenetic marks to control processes such as gene expression and recombination. The Histone Sequence Database is a curated collection of sequences and structures of histones and non-histone proteins containing histone folds, assembled from major public databases. Here, we report a substantial increase in the number of sequences and taxonomic coverage for histone and histone fold-containing proteins available in the database. Additionally, the database now contains an expanded dataset that includes archaeal histone sequences. The database also provides comprehensive multiple sequence alignments for each of the four core histones (H2A, H2B, H3 and H4), the linker histones (H1/H5) and the archaeal histones. The database also includes current information on solved histone fold-containing structures. The Histone Sequence Database is an inclusive resource for the analysis of chromatin structure and function focused on histones and histone fold-containing proteins. Database URL: The Histone Sequence Database is freely available and can be accessed at http://research.nhgri.nih.gov/histones/. PMID:22025671

  9. Molecular Background of Colorectal Tumors From Patients with Lynch Syndrome Associated With Germline Variants in PMS2.

    PubMed

    Ten Broeke, S W; van Bavel, T C; Jansen, A M L; Gómez-García, E; Hes, F J; van Hest, L P; Letteboer, T G W; Olderode-Berends, M J W; Ruano, D; Spruijt, L; Suerink, M; Tops, C M; van Eijk, R; Morreau, H; van Wezel, T; Nielsen, M

    2018-05-11

    Germline variants in the mismatch repair genes MLH1, MSH2 (EPCAM), MSH6, or PMS2 cause Lynch syndrome. Patients with these variants have an increased risk of developing colorectal cancers (CRCs) that differ from sporadic CRCs in genetic and histologic features. It has been a challenge to study CRCs associated with PMS2 variants (PMS2-associated CRCs) because these develop less frequently and in patients of older ages than colorectal tumors with variants in the other mismatch repair genes. We analyzed 20 CRCs associated with germline variants in PMS2, 22 sporadic CRCs, 18 CRCs with germline variants in MSH2, and 24 CRCs from patients with germline variants in MLH1. Tumor tissue blocks were collected from Dutch pathology departments in 2017. After extraction of tumor DNA, we used a platform designed to detect approximately 3000 somatic hotspot variants in 55 genes (including KRAS, APC, CTNNB1, and TP53). Somatic variant frequencies were compared using the Fisher's exact test. None of the PMS2-associated CRCs contained any somatic variants in the catenin beta 1 gene (CTNNB1), which encodes β-catenin, whereas 14/24 MLH1-associated CRCs (58%) contained variants in CTNNB1. Half of PMS2-associated CRCs contained KRAS variants, but only 20% of these were in hotspots that encoded G12D or G13D. These hotspot variants occurred more frequently in CRCs associated with variants in MLH1 (37.5%, P=.44) and MSH2 (71.4%, P=.035) than with variants in PMS2. In a genetic analysis of 84 colorectal tumors, we found tumors from patients with PMS2-associated Lynch syndrome to be distinct from colorectal tumors associated with defects in other mismatch repair genes. This might account for differences in development and less frequent occurrence. Copyright © 2018 AGA Institute. Published by Elsevier Inc. All rights reserved.
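
    The abstract states that somatic variant frequencies were compared with Fisher's exact test. As a hedged illustration of that test, the sketch below applies a standard-library implementation to the CTNNB1 counts given above (0 of 20 PMS2-associated CRCs versus 14 of 24 MLH1-associated CRCs). The function `fisher_exact_two_sided` is a hypothetical helper written for this example; the abstract does not report a P value for this particular comparison, so the result is illustrative rather than a reproduction of the authors' analysis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x "successes" in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Rows: PMS2-associated, MLH1-associated; columns: CTNNB1 variant, no variant.
p_value = fisher_exact_two_sided(0, 20, 14, 10)
print(f"two-sided P = {p_value:.2g}")
```

    A P value this far below conventional thresholds is consistent with the abstract's conclusion that PMS2-associated tumors differ from MLH1-associated tumors in CTNNB1 status.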

  10. Hundreds of variants clustered in genomic loci and biological pathways affect human height

    PubMed Central

    Lango Allen, Hana; Estrada, Karol; Lettre, Guillaume; Berndt, Sonja I.; Weedon, Michael N.; Rivadeneira, Fernando; Willer, Cristen J.; Jackson, Anne U.; Vedantam, Sailaja; Raychaudhuri, Soumya; Ferreira, Teresa; Wood, Andrew R.; Weyant, Robert J.; Segrè, Ayellet V.; Speliotes, Elizabeth K.; Wheeler, Eleanor; Soranzo, Nicole; Park, Ju-Hyun; Yang, Jian; Gudbjartsson, Daniel; Heard-Costa, Nancy L.; Randall, Joshua C.; Qi, Lu; Smith, Albert Vernon; Mägi, Reedik; Pastinen, Tomi; Liang, Liming; Heid, Iris M.; Luan, Jian'an; Thorleifsson, Gudmar; Winkler, Thomas W.; Goddard, Michael E.; Lo, Ken Sin; Palmer, Cameron; Workalemahu, Tsegaselassie; Aulchenko, Yurii S.; Johansson, Åsa; Zillikens, M.Carola; Feitosa, Mary F.; Esko, Tõnu; Johnson, Toby; Ketkar, Shamika; Kraft, Peter; Mangino, Massimo; Prokopenko, Inga; Absher, Devin; Albrecht, Eva; Ernst, Florian; Glazer, Nicole L.; Hayward, Caroline; Hottenga, Jouke-Jan; Jacobs, Kevin B.; Knowles, Joshua W.; Kutalik, Zoltán; Monda, Keri L.; Polasek, Ozren; Preuss, Michael; Rayner, Nigel W.; Robertson, Neil R.; Steinthorsdottir, Valgerdur; Tyrer, Jonathan P.; Voight, Benjamin F.; Wiklund, Fredrik; Xu, Jianfeng; Zhao, Jing Hua; Nyholt, Dale R.; Pellikka, Niina; Perola, Markus; Perry, John R.B.; Surakka, Ida; Tammesoo, Mari-Liis; Altmaier, Elizabeth L.; Amin, Najaf; Aspelund, Thor; Bhangale, Tushar; Boucher, Gabrielle; Chasman, Daniel I.; Chen, Constance; Coin, Lachlan; Cooper, Matthew N.; Dixon, Anna L.; Gibson, Quince; Grundberg, Elin; Hao, Ke; Junttila, M. Juhani; Kaplan, Lee M.; Kettunen, Johannes; König, Inke R.; Kwan, Tony; Lawrence, Robert W.; Levinson, Douglas F.; Lorentzon, Mattias; McKnight, Barbara; Morris, Andrew P.; Müller, Martina; Ngwa, Julius Suh; Purcell, Shaun; Rafelt, Suzanne; Salem, Rany M.; Salvi, Erika; Sanna, Serena; Shi, Jianxin; Sovio, Ulla; Thompson, John R.; Turchin, Michael C.; Vandenput, Liesbeth; Verlaan, Dominique J.; Vitart, Veronique; White, Charles C.; Ziegler, Andreas; Almgren, Peter; Balmforth, Anthony J.; Campbell, Harry; Citterio, Lorena; De Grandi, Alessandro; Dominiczak, Anna; Duan, Jubao; Elliott, Paul; Elosua, Roberto; Eriksson, Johan G.; Freimer, Nelson B.; Geus, Eco J.C.; Glorioso, Nicola; Haiqing, Shen; Hartikainen, Anna-Liisa; Havulinna, Aki S.; Hicks, Andrew A.; Hui, Jennie; Igl, Wilmar; Illig, Thomas; Jula, Antti; Kajantie, Eero; Kilpeläinen, Tuomas O.; Koiranen, Markku; Kolcic, Ivana; Koskinen, Seppo; Kovacs, Peter; Laitinen, Jaana; Liu, Jianjun; Lokki, Marja-Liisa; Marusic, Ana; Maschio, Andrea; Meitinger, Thomas; Mulas, Antonella; Paré, Guillaume; Parker, Alex N.; Peden, John F.; Petersmann, Astrid; Pichler, Irene; Pietiläinen, Kirsi H.; Pouta, Anneli; Ridderstråle, Martin; Rotter, Jerome I.; Sambrook, Jennifer G.; Sanders, Alan R.; Schmidt, Carsten Oliver; Sinisalo, Juha; Smit, Jan H.; Stringham, Heather M.; Walters, G.Bragi; Widen, Elisabeth; Wild, Sarah H.; Willemsen, Gonneke; Zagato, Laura; Zgaga, Lina; Zitting, Paavo; Alavere, Helene; Farrall, Martin; McArdle, Wendy L.; Nelis, Mari; Peters, Marjolein J.; Ripatti, Samuli; van Meurs, Joyce B.J.; Aben, Katja K.; Ardlie, Kristin G; Beckmann, Jacques S.; Beilby, John P.; Bergman, Richard N.; Bergmann, Sven; Collins, Francis S.; Cusi, Daniele; den Heijer, Martin; Eiriksdottir, Gudny; Gejman, Pablo V.; Hall, Alistair S.; Hamsten, Anders; Huikuri, Heikki V.; Iribarren, Carlos; Kähönen, Mika; Kaprio, Jaakko; Kathiresan, Sekar; Kiemeney, Lambertus; Kocher, Thomas; Launer, Lenore J.; Lehtimäki, Terho; Melander, Olle; Mosley, Tom H.; Musk, Arthur W.; Nieminen, Markku S.; O'Donnell, Christopher J.; Ohlsson, Claes; Oostra, Ben; Palmer, Lyle J.; Raitakari, Olli; Ridker, Paul M.; Rioux, John D.; Rissanen, Aila; Rivolta, Carlo; Schunkert, Heribert; Shuldiner, Alan R.; Siscovick, David S.; Stumvoll, Michael; Tönjes, Anke; Tuomilehto, Jaakko; van Ommen, Gert-Jan; Viikari, Jorma; Heath, Andrew C.; Martin, Nicholas G.; Montgomery, Grant W.; Province, Michael A.; Kayser, Manfred; Arnold, Alice M.; Atwood, Larry D.; Boerwinkle, Eric; Chanock, Stephen J.; Deloukas, Panos; Gieger, Christian; Grönberg, Henrik; Hall, Per; Hattersley, Andrew T.; Hengstenberg, Christian; Hoffman, Wolfgang; Lathrop, G.Mark; Salomaa, Veikko; Schreiber, Stefan; Uda, Manuela; Waterworth, Dawn; Wright, Alan F.; Assimes, Themistocles L.; Barroso, Inês; Hofman, Albert; Mohlke, Karen L.; Boomsma, Dorret I.; Caulfield, Mark J.; Cupples, L.Adrienne; Erdmann, Jeanette; Fox, Caroline S.; Gudnason, Vilmundur; Gyllensten, Ulf; Harris, Tamara B.; Hayes, Richard B.; Jarvelin, Marjo-Riitta; Mooser, Vincent; Munroe, Patricia B.; Ouwehand, Willem H.; Penninx, Brenda W.; Pramstaller, Peter P.; Quertermous, Thomas; Rudan, Igor; Samani, Nilesh J.; Spector, Timothy D.; Völzke, Henry; Watkins, Hugh; Wilson, James F.; Groop, Leif C.; Haritunians, Talin; Hu, Frank B.; Kaplan, Robert C.; Metspalu, Andres; North, Kari E.; Schlessinger, David; Wareham, Nicholas J.; Hunter, David J.; O'Connell, Jeffrey R.; Strachan, David P.; Wichmann, H.-Erich; Borecki, Ingrid B.; van Duijn, Cornelia M.; Schadt, Eric E.; Thorsteinsdottir, Unnur; Peltonen, Leena; Uitterlinden, André; Visscher, Peter M.; Chatterjee, Nilanjan; Loos, Ruth J.F.; Boehnke, Michael; McCarthy, Mark I.; Ingelsson, Erik; Lindgren, Cecilia M.; Abecasis, Gonçalo R.; Stefansson, Kari; Frayling, Timothy M.; Hirschhorn, Joel N

    2010-01-01

    Most common human traits and diseases have a polygenic pattern of inheritance: DNA sequence variants at many genetic loci influence phenotype. Genome-wide association (GWA) studies have identified >600 variants associated with human traits [1], but these typically explain small fractions of phenotypic variation, raising questions about the utility of further studies. Here, using 183,727 individuals, we show that hundreds of genetic variants, in at least 180 loci, influence adult height, a highly heritable and classic polygenic trait [2,3]. The large number of loci reveals patterns with important implications for genetic studies of common human diseases and traits. First, the 180 loci are not random, but instead are enriched for genes that are connected in biological pathways (P=0.016), and that underlie skeletal growth defects (P<0.001). Second, the likely causal gene is often located near the most strongly associated variant: in 13 of 21 loci containing a known skeletal growth gene, that gene was closest to the associated variant. Third, at least 19 loci have multiple independently associated variants, suggesting that allelic heterogeneity is a frequent feature of polygenic traits, that comprehensive explorations of already-discovered loci should discover additional variants, and that an appreciable fraction of associated loci may have been identified. Fourth, associated variants are enriched for likely functional effects on genes, being over-represented amongst variants that alter amino acid structure of proteins and expression levels of nearby genes. Our data explain ∼10% of the phenotypic variation in height, and we estimate that unidentified common variants of similar effect sizes would increase this figure to ∼16% of phenotypic variation (∼20% of heritable variation). Although additional approaches are needed to fully dissect the genetic architecture of polygenic human traits, our findings indicate that GWA studies can identify large numbers of loci that implicate biologically relevant genes and pathways. PMID:20881960
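
    The two projected figures in this abstract (∼16% of phenotypic variation, ∼20% of heritable variation) together imply a heritability for height. The short Python sketch below works through that arithmetic. The variable names are illustrative, and the result is only as precise as the rounded percentages quoted in the abstract.

```python
# Figures quoted in the abstract (all approximate, as marked by "~").
explained_now = 0.10          # share of phenotypic variance explained today
explained_projected = 0.16    # projected share of phenotypic variance
heritable_projected = 0.20    # the same projection as a share of heritable variance

# If 16% of phenotypic variance equals 20% of heritable variance,
# the implied heritability is their ratio.
implied_heritability = explained_projected / heritable_projected

# Re-express the currently explained 10% as a share of heritable variance.
current_share_of_heritable = explained_now / implied_heritability

print(f"implied heritability: {implied_heritability:.2f}")
print(f"current share of heritable variance: {current_share_of_heritable:.3f}")
```

    Under these rounded inputs the implied heritability is about 0.8, so the ∼10% of phenotypic variation explained so far corresponds to roughly one eighth of heritable variation.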

  11. Mining the LIPG Allelic Spectrum Reveals the Contribution of Rare and Common Regulatory Variants to HDL Cholesterol

    PubMed Central

    Raghavan, Avanthi; Neeli, Hemanth; Jin, Weijun; Badellino, Karen O.; Demissie, Serkalem; Manning, Alisa K.; DerOhannessian, Stephanie L.; Wolfe, Megan L.; Cupples, L. Adrienne; Li, Mingyao; Kathiresan, Sekar; Rader, Daniel J.

    2011-01-01

    Genome-wide association studies (GWAS) have successfully identified loci associated with quantitative traits, such as blood lipids. Deep resequencing studies are being utilized to catalogue the allelic spectrum at GWAS loci. The goal of these studies is to identify causative variants and missing heritability, including heritability due to low frequency and rare alleles with large phenotypic impact. Whereas rare variant efforts have primarily focused on nonsynonymous coding variants, we hypothesized that noncoding variants in these loci are also functionally important. Using the HDL-C gene LIPG as an example, we explored the effect of regulatory variants identified through resequencing of subjects at HDL-C extremes on gene expression, protein levels, and phenotype. Resequencing a portion of the LIPG promoter and 5′ UTR in human subjects with extreme HDL-C, we identified several rare variants in individuals from both extremes. Luciferase reporter assays were used to measure the effect of these rare variants on LIPG expression. Variants conferring opposing effects on gene expression were enriched in opposite extremes of the phenotypic distribution. Minor alleles of a common regulatory haplotype and noncoding GWAS SNPs were associated with reduced plasma levels of the LIPG gene product endothelial lipase (EL), consistent with its role in HDL-C catabolism. Additionally, we found that a common nonfunctional coding variant associated with HDL-C (rs2000813) is in linkage disequilibrium with a 5′ UTR variant (rs34474737) that decreases LIPG promoter activity. We attribute the gene regulatory role of rs34474737 to the observed association of the coding variant with plasma EL levels and HDL-C. Taken together, the findings show that both rare and common noncoding regulatory variants are important contributors to the allelic spectrum in complex trait loci. PMID:22174694

  12. The germline variants in DNA repair genes in pediatric medulloblastoma: a challenge for current therapeutic strategies.

    PubMed

    Trubicka, Joanna; Żemojtel, Tomasz; Hecht, Jochen; Falana, Katarzyna; Piekutowska-Abramczuk, Dorota; Płoski, Rafał; Perek-Polnik, Marta; Drogosiewicz, Monika; Grajkowska, Wiesława; Ciara, Elżbieta; Moszczyńska, Elżbieta; Dembowska-Bagińska, Bożenna; Perek, Danuta; Chrzanowska, Krystyna H; Krajewska-Walasek, Małgorzata; Łastowska, Maria

    2017-04-04

    Defects in DNA repair genes are potentially linked to development and response to therapy in medulloblastoma. Therefore, the purpose of this study was to establish the spectrum and frequency of germline variants in selected DNA repair genes and their impact on response to chemotherapy in medulloblastoma patients. The following genes were investigated in 102 paediatric patients: MSH2 and RAD50 using targeted gene panel sequencing, and NBN variants (p.I171V and p.K219fs*19) by Sanger sequencing. In three patients with rare life-threatening adverse events (AE) and no detected variants in the analyzed genes, whole exome sequencing was performed. Based on a combination of molecular and immunohistochemical evaluations, tumors were divided into molecular subgroups. Presence of variants was tested for potential association with the occurrence of rare life-threatening AE and other clinical features. In total, we identified six new potentially pathogenic variants in MSH2 (p.A733T and p.V606I), RAD50 (p.R1093*), FANCM (p.L694*), ERCC2 (p.R695C) and EXO1 (p.V738L), in addition to two known NBN variants. Five out of twelve patients with defects in any of the MSH2, RAD50 and NBN genes suffered from rare life-threatening AE, more frequently than in the control group (p = 0.0005). When all detected variants were taken into account, the majority of patients (8 out of 15) suffered from life-threatening toxicity during chemotherapy. Our results, based on the largest systematic study performed in a clinical setting, provide preliminary evidence for a link between defects in DNA repair genes and treatment-related toxicity in children with medulloblastoma. The data suggest that patients with DNA repair gene variants may need special vigilance during and after courses of chemotherapy.

  13. High-resolution melting (HRM) re-analysis of a polyposis patients cohort reveals previously undetected heterozygous and mosaic APC gene mutations.

    PubMed

    Out, Astrid A; van Minderhout, Ivonne J H M; van der Stoep, Nienke; van Bommel, Lysette S R; Kluijt, Irma; Aalfs, Cora; Voorendt, Marsha; Vossen, Rolf H A M; Nielsen, Maartje; Vasen, Hans F A; Morreau, Hans; Devilee, Peter; Tops, Carli M J; Hes, Frederik J

    2015-06-01

    Familial adenomatous polyposis is most frequently caused by pathogenic variants in either the APC gene or the MUTYH gene. The detection rate of pathogenic variants depends on the severity of the phenotype and sensitivity of the screening method, including sensitivity for mosaic variants. For 171 patients with multiple colorectal polyps without previously detectable pathogenic variant, APC was reanalyzed in leukocyte DNA by one uniform technique: high-resolution melting (HRM) analysis. Serial dilution of heterozygous DNA resulted in a lowest detectable allelic fraction of 6% for the majority of variants. HRM analysis and subsequent sequencing detected pathogenic fully heterozygous APC variants in 10 (6%) of the patients and pathogenic mosaic variants in 2 (1%). All these variants were previously missed by various conventional scanning methods. In parallel, HRM APC scanning was applied to DNA isolated from polyp tissue of two additional patients with apparently sporadic polyposis and without detectable pathogenic APC variant in leukocyte DNA. In both patients a pathogenic mosaic APC variant was present in multiple polyps. The detection of pathogenic APC variants in 7% of the patients, including mosaics, illustrates the usefulness of a complete APC gene reanalysis of previously tested patients, by a supplementary scanning method. HRM is a sensitive and fast pre-screening method for reliable detection of heterozygous and mosaic variants, which can be applied to leukocyte and polyp derived DNA.

  14. Novel variant in the TP63 gene associated to ankyloblepharon-ectodermal dysplasia-cleft lip/palate (AEC) syndrome.

    PubMed

    Gonzalez, Francisco; Loidi, Lourdes; Abalo-Lojo, Jose M

    2017-01-01

    Ankyloblepharon-ectodermal dysplasia-cleft lip/palate (AEC) syndrome is a disorder resulting from anomalous embryonic development of ectodermal tissues. There is evidence that AEC syndrome is caused by mutations in the TP63 gene, which encodes the p63 protein. This is an important regulatory protein involved in epidermal proliferation and differentiation. Genome sequencing was performed in DNA from peripheral blood leukocytes of a newborn with AEC syndrome and her parents. Variants were searched for in all coding exons and intron-exon boundaries of the TP63 gene. A heterozygous missense variant, NM_003722.4:c.1063G>C (p.Asp355His), was found in the newborn patient. No variants were found in either of the parents. We identified a previously unreported variant in the TP63 gene which seems to be involved in the somatic malformations found in AEC syndrome. The absence of this variant in both parents suggests that the variant appeared de novo.

  15. Identification of causal genes for complex traits

    PubMed Central

    Hormozdiari, Farhad; Kichaev, Gleb; Yang, Wen-Yun; Pasaniuc, Bogdan; Eskin, Eleazar

    2015-01-01

    Motivation: Although genome-wide association studies (GWAS) have identified thousands of variants associated with common diseases and complex traits, only a handful of these variants are validated to be causal. We consider ‘causal variants’ as variants which are responsible for the association signal at a locus. As opposed to association studies that benefit from linkage disequilibrium (LD), the main challenge in identifying causal variants at associated loci lies in distinguishing among the many closely correlated variants due to LD. This is particularly important for model organisms such as inbred mice, where LD extends much further than in human populations, resulting in large stretches of the genome with significantly associated variants. Furthermore, these model organisms are highly structured and require correction for population structure to remove potential spurious associations. Results: In this work, we propose CAVIAR-Gene (CAusal Variants Identification in Associated Regions), a novel method that is able to operate across large LD regions of the genome while also correcting for population structure. A key feature of our approach is that it provides as output a minimally sized set of genes that captures the genes which harbor causal variants with probability ρ. Through extensive simulations, we demonstrate that our method not only speeds up computation, but also has, on average, a 10% higher recall rate compared with the existing approaches. We validate our method using real mouse high-density lipoprotein (HDL) data and show that CAVIAR-Gene is able to identify Apoa2 (a gene known to harbor causal variants for HDL), while reducing the number of genes that need to be tested for functionality by a factor of 2. Availability and implementation: Software is freely available for download at genetics.cs.ucla.edu/caviar. Contact: eeskin@cs.ucla.edu PMID:26072484
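
    The idea of a "minimally sized set of genes" that captures the causal gene with probability ρ can be illustrated with a small Python sketch. This is not the CAVIAR-Gene algorithm. It assumes each gene already has a posterior probability of harboring the causal variant and that these probabilities sum to 1 (a single causal gene per locus). Apart from Apoa2, the gene names are placeholders, and all numeric values are hypothetical.

```python
def rho_gene_set(posteriors, rho):
    """Return the smallest gene set whose summed posterior reaches rho.

    ASSUMPTION (not from the paper): `posteriors` maps gene name to the
    posterior probability that the gene harbors the causal variant, and
    the values sum to 1. Under that assumption, taking genes in order of
    decreasing posterior yields a minimum-size set.
    """
    if not 0 < rho <= 1:
        raise ValueError("rho must be in (0, 1]")
    selected, total = [], 0.0
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    for gene, prob in ranked:
        if total >= rho:
            break
        selected.append(gene)
        total += prob
    return selected, total


# Hypothetical posteriors for four genes in one associated region.
example = {"Apoa2": 0.60, "GeneB": 0.25, "GeneC": 0.10, "GeneD": 0.05}
genes, mass = rho_gene_set(example, rho=0.80)
print(genes, round(mass, 2))
```

    With these made-up numbers, two of the four genes suffice to reach ρ = 0.80, which mirrors the abstract's point that a ρ-level set can substantially reduce the number of genes taken forward for functional testing.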

  16. Natural rpoS mutations contribute to population heterogeneity in Escherichia coli O157:H7 strains linked to the 2006 US spinach-associated outbreak.

    PubMed

    Carter, Michelle Qiu; Louie, Jacqueline W; Huynh, Steven; Parker, Craig T

    2014-12-01

    We previously reported significantly different acid resistance between curli variants derived from the same Escherichia coli O157:H7 strain, although the curli fimbriae were not associated with this phenotypic divergence. Here we investigated the underlying molecular mechanism by examining the genes encoding the common transcriptional regulators of curli biogenesis and acid resistance. rpoS null mutations were detected in all curli-expressing variants of the 2006 spinach-associated outbreak strains, whereas a wild-type rpoS was present in all curli-deficient variants. Consequently curli-expressing variants were much more sensitive to various stress challenges than curli-deficient variants. This loss of general stress fitness appeared solely to be the result of rpoS mutation since the stress resistances could be restored in curli-expressing variants by a functional rpoS. Comparative transcriptomic analyses between the curli variants revealed a large number of differentially expressed genes, characterized by the enhanced expression of metabolic genes in curli-expressing variants, but a marked decrease in transcription of genes related to stress resistances. Unlike the curli-expressing variants of the 1993 US hamburger-associated outbreak strains (Applied Environmental Microbiology 78: 7706-7719), all curli-expressing variants of the 2006 spinach-associated outbreak strains carry a functional rcsB gene, suggesting an alternative mechanism governing intra-strain phenotypic divergence in E. coli O157:H7. Published by Elsevier Ltd.

  17. Filtering genetic variants and placing informative priors based on putative biological function.

    PubMed

    Friedrichs, Stefanie; Malzahn, Dörthe; Pugh, Elizabeth W; Almeida, Marcio; Liu, Xiao Qing; Bailey, Julia N

    2016-02-03

    High-density genetic marker data, especially sequence data, imply an immense multiple testing burden. This can be ameliorated by filtering genetic variants, exploiting or accounting for correlations between variants, jointly testing variants, and by incorporating informative priors. Priors can be based on biological knowledge or predicted variant function, or even be used to integrate gene expression or other omics data. Based on Genetic Analysis Workshop (GAW) 19 data, this article discusses diversity and usefulness of functional variant scores provided, for example, by PolyPhen2, SIFT, or RegulomeDB annotations. Incorporating functional scores into variant filters or weights and adjusting the significance level for correlations between variants yielded significant associations with blood pressure traits in a large family study of Mexican Americans (GAW19 data set). Marker rs218966 in gene PHF14 and rs9836027 in MAP4 significantly associated with hypertension; additionally, rare variants in SNUPN significantly associated with systolic blood pressure. Variant weights strongly influenced the power of kernel methods and burden tests. Apart from variant weights in test statistics, prior weights may also be used when combining test statistics or to informatively weight p values while controlling false discovery rate (FDR). Indeed, power improved when gene expression data for FDR-controlled informative weighting of association test p values of genes was used. Finally, approaches exploiting variant correlations included identity-by-descent mapping and the optimal strategy for joint testing rare and common variants, which was observed to depend on linkage disequilibrium structure.
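
    One standard way to "informatively weight p values while controlling false discovery rate" is a weighted Benjamini-Hochberg procedure: each p value is divided by a prior weight (weights rescaled to average 1), and the usual step-up rule is applied. The Python sketch below shows that general technique. Whether the GAW19 contributions used exactly this procedure is an assumption, and the p values and weights are hypothetical.

```python
def weighted_bh(pvalues, weights, alpha=0.05):
    """Weighted Benjamini-Hochberg: return a list of rejection flags.

    Weights must be positive; they are rescaled to have mean 1 so that
    up-weighting some tests is paid for by down-weighting others.
    """
    m = len(pvalues)
    if m == 0 or len(weights) != m:
        raise ValueError("pvalues and weights must be non-empty and equal length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    mean_w = sum(weights) / m
    # Weighted p value: p divided by the normalised weight, capped at 1.
    adjusted = [min(1.0, p / (w / mean_w)) for p, w in zip(pvalues, weights)]
    order = sorted(range(m), key=lambda i: adjusted[i])
    # Step-up rule: find the largest rank k with adjusted_(k) <= k * alpha / m.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if adjusted[idx] <= rank * alpha / m:
            cutoff = rank
    rejected = set(order[:cutoff])
    return [i in rejected for i in range(m)]


# Hypothetical p values for four genes, with and without prior weights.
pvals = [0.01, 0.04, 0.03, 0.5]
unweighted = weighted_bh(pvals, [1, 1, 1, 1])
weighted = weighted_bh(pvals, [1, 2, 0.5, 0.5])
print(unweighted)
print(weighted)
```

    In this toy example the second test is rejected only when it carries a larger prior weight, which is the kind of power gain the abstract attributes to informative weighting.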

  18. Arrhythmogenic KCNE gene variants: current knowledge and future challenges

    PubMed Central

    Crump, Shawn M.; Abbott, Geoffrey W.

    2014-01-01

    There are twenty-five known inherited cardiac arrhythmia susceptibility genes, all of which encode either ion channel pore-forming subunits or proteins that regulate aspects of ion channel biology such as function, trafficking, and localization. The human KCNE gene family comprises five potassium channel regulatory subunits, sequence variants in each of which are associated with cardiac arrhythmias. KCNE gene products exhibit promiscuous partnering and in some cases ubiquitous expression, hampering efforts to unequivocally correlate each gene to specific native potassium currents. Likewise, deducing the molecular etiology of cardiac arrhythmias in individuals harboring rare KCNE gene variants, or more common KCNE polymorphisms, can be challenging. In this review we provide an update on putative arrhythmia-causing KCNE gene variants, and discuss current thinking and future challenges in the study of molecular mechanisms of KCNE-associated cardiac rhythm disturbances. PMID:24478792

  19. Novel approach to genetic analysis and results in 3000 hemophilia patients enrolled in the My Life, Our Future initiative

    PubMed Central

    Johnsen, Jill M.; Fletcher, Shelley N.; Huston, Haley; Roberge, Sarah; Martin, Beth K.; Kircher, Martin; Josephson, Neil C.; Shendure, Jay; Ruuska, Sarah; Koerper, Marion A.; Morales, Jaime; Pierce, Glenn F.; Aschman, Diane J.

    2017-01-01

    Hemophilia A and B are rare, X-linked bleeding disorders. My Life, Our Future (MLOF) is a collaborative project established to genotype and study hemophilia. Patients were enrolled at US hemophilia treatment centers (HTCs). Genotyping was performed centrally using next-generation sequencing (NGS) with an approach that detected common F8 gene inversions simultaneously with F8 and F9 gene sequencing followed by confirmation using standard genotyping methods. Sixty-nine HTCs enrolled the first 3000 patients in under 3 years. Clinically reportable DNA variants were detected in 98.1% (2357/2401) of hemophilia A and 99.3% (595/599) of hemophilia B patients. Of the 924 unique variants found, 285 were novel. Predicted gene-disrupting variants were common in severe disease; missense variants predominated in mild–moderate disease. Novel DNA variants accounted for ∼30% of variants found and were detected continuously throughout the project, indicating that additional variation likely remains undiscovered. The NGS approach detected >1 reportable variants in 36 patients (10 females), a finding with potential clinical implications. NGS also detected incidental variants unlikely to cause disease, including 11 variants previously reported in hemophilia. Although these genes are thought to be conserved, our findings support caution in interpretation of new variants. In summary, MLOF has contributed significantly toward variant annotation in the F8 and F9 genes. In the near future, investigators will be able to access MLOF data and repository samples for research to advance our understanding of hemophilia. PMID:29296726
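
    The counts in this abstract can be cross-checked against its percentages with a few lines of Python. The variable names are illustrative. The comparison assumes conventional rounding to one decimal place, which the abstract does not state.

```python
# Counts as reported in the abstract.
hem_a_detected, hem_a_total = 2357, 2401
hem_b_detected, hem_b_total = 595, 599
unique_variants, novel_variants = 924, 285


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


cohort_size = hem_a_total + hem_b_total
hem_a_pct = percent(hem_a_detected, hem_a_total)
hem_b_pct = percent(hem_b_detected, hem_b_total)
novel_pct = percent(novel_variants, unique_variants)

print(cohort_size, hem_a_pct, hem_b_pct, novel_pct)
```

    The cohort size and the hemophilia B and novel-variant figures match the abstract. The hemophilia A rate rounds to 98.2% rather than the reported 98.1%, which suggests the published figure was truncated rather than rounded.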

  20. Combined mismatch repair and POLE/POLD1 defects explain unresolved suspected Lynch syndrome cancers

    PubMed Central

    Jansen, Anne ML; van Wezel, Tom; van den Akker, Brendy EWM; Ventayol Garcia, Marina; Ruano, Dina; Tops, Carli MJ; Wagner, Anja; Letteboer, Tom GW; Gómez-García, Encarna B; Devilee, Peter; Wijnen, Juul T; Hes, Frederik J; Morreau, Hans

    2016-01-01

    Many suspected Lynch Syndrome (sLS) patients who lack mismatch repair (MMR) germline gene variants and MLH1 or MSH2 hypermethylation are currently explained by somatic MMR gene variants or, occasionally, by germline POLE variants. To further investigate unexplained sLS patients, we analyzed leukocyte and tumor DNA of 62 sLS patients using gene panel sequencing including the POLE, POLD1 and MMR genes. Forty tumors showed either one, two or more somatic MMR variants predicted to affect function. Nine sLS tumors showed a likely ultramutated phenotype and were found to carry germline (n=2) or somatic variants (n=7) in the POLE/POLD1 exonuclease domain (EDM). Six of these POLE/POLD1-EDM mutated tumors also carried somatic MMR variants. Our findings suggest that faulty proofreading may result in loss of MMR and thereby in microsatellite instability. PMID:26648449

  1. Targeted Deep Resequencing Identifies Coding Variants in the PEAR1 Gene That Play a Role in Platelet Aggregation

    PubMed Central

    Kim, Yoonhee; Suktitipat, Bhoom; Yanek, Lisa R.; Faraday, Nauder; Wilson, Alexander F.; Becker, Diane M.; Becker, Lewis C.; Mathias, Rasika A.

    2013-01-01

    Platelet aggregation is heritable, and genome-wide association studies have detected strong associations with a common intronic variant of the platelet endothelial aggregation receptor1 (PEAR1) gene both in African American and European American individuals. In this study, we used a sequencing approach to identify additional exonic variants in PEAR1 that may also determine variability in platelet aggregation in the GeneSTAR Study. A 0.3 Mb targeted region on chromosome 1q23.1 including the entire PEAR1 gene was Sanger sequenced in 104 subjects (45% male, 49% African American, age = 52±13) selected on the basis of hyper- and hypo- aggregation across three different agonists (collagen, epinephrine, and adenosine diphosphate). Single-variant and multi-variant burden tests for association were performed. Of the 235 variants identified through sequencing, 61 were novel, and three of these were missense variants. More rare variants (MAF<5%) were noted in African Americans compared to European Americans (108 vs. 45). The common intronic GWAS-identified variant (rs12041331) demonstrated the most significant association signal in African Americans (p = 4.020×10⁻⁴); no association was seen for additional exonic variants in this group. In contrast, multi-variant burden tests indicated that exonic variants play a more significant role in European Americans (p = 0.0099 for the collective coding variants compared to p = 0.0565 for intronic variant rs12041331). Imputation of the individual exonic variants in the rest of the GeneSTAR European American cohort (N = 1,965) supports the results noted in the sequenced discovery sample: p = 3.56×10⁻⁴, 2.27×10⁻⁷, 5.20×10⁻⁵ for coding synonymous variant rs56260937 and collagen, epinephrine and adenosine diphosphate induced platelet aggregation, respectively. Sequencing approaches confirm that a common intronic variant has the strongest association with platelet aggregation in African Americans, and show that exonic variants play an additional role in platelet aggregation in European Americans. PMID:23704978

  2. LBH Gene Transcription Regulation by the Interplay of an Enhancer Risk Allele and DNA Methylation in Rheumatoid Arthritis.

    PubMed

    Hammaker, Deepa; Whitaker, John W; Maeshima, Keisuke; Boyle, David L; Ekwall, Anna-Karin H; Wang, Wei; Firestein, Gary S

    2016-11-01

    To identify nonobvious therapeutic targets for rheumatoid arthritis (RA), we performed an integrative analysis incorporating multiple "omics" data and the Encyclopedia of DNA Elements (ENCODE) database for potential regulatory regions. This analysis identified the limb bud and heart development (LBH) gene, which has risk alleles associated with RA/celiac disease and lupus, and can regulate cell proliferation in RA. We identified a novel LBH transcription enhancer with an RA risk allele (rs906868 G [Ref]/T) 6 kb upstream of the LBH gene with a differentially methylated locus. The confluence of 3 regulatory elements, rs906868, an RA differentially methylated locus, and a putative enhancer, led us to investigate their effects on LBH regulation in fibroblast-like synoviocytes (FLS). We cloned the 1.4-kb putative enhancer with either the rs906868 Ref allele or single-nucleotide polymorphism (SNP) variant into reporter constructs. The constructs were methylated in vitro and transfected into cultured FLS by nucleofection. We found that both variants increased transcription, thereby confirming the region's enhancer function. Unexpectedly, the transcriptional activity of the Ref risk allele was significantly lower than that of the SNP variant and is consistent with low LBH levels as a risk factor for aggressive FLS behavior. Using RA FLS lines with a homozygous Ref or SNP allele, we confirmed that homozygous Ref lines expressed lower LBH messenger RNA levels than did the SNP lines. Methylation significantly reduced enhancer activity for both alleles, indicating that enhancer function is dependent on its methylation status. This study shows how the interplay between genetics and epigenetics can affect expression of LBH in RA. © 2016, American College of Rheumatology.

  3. In silico prediction of splice-altering single nucleotide variants in the human genome.

    PubMed

    Jian, Xueqiu; Boerwinkle, Eric; Liu, Xiaoming

    2014-12-16

    In silico tools have been developed to predict variants that may have an impact on pre-mRNA splicing. The major limitation of the application of these tools to basic research and clinical practice is the difficulty in interpreting the output. Most tools only predict potential splice sites given a DNA sequence without measuring splicing signal changes caused by a variant. Another limitation is the lack of large-scale evaluation studies of these tools. We compared eight in silico tools on 2959 single nucleotide variants within splicing consensus regions (scSNVs) using receiver operating characteristic analysis. The Position Weight Matrix model and MaxEntScan outperformed other methods. Two ensemble learning methods, adaptive boosting and random forests, were used to construct models that take advantage of individual methods. Both models further improved prediction, with outputs of directly interpretable prediction scores. We applied our ensemble scores to scSNVs from the Catalogue of Somatic Mutations in Cancer database. Analysis showed that predicted splice-altering scSNVs are enriched in recurrent scSNVs and known cancer genes. We pre-computed our ensemble scores for all potential scSNVs across the human genome, providing a whole genome level resource for identifying splice-altering scSNVs discovered from large-scale sequencing studies.

  4. ToTem: a tool for variant calling pipeline optimization.

    PubMed

    Tom, Nikola; Tom, Ondrej; Malcikova, Jitka; Pavlova, Sarka; Kubesova, Blanka; Rausch, Tobias; Kolarik, Miroslav; Benes, Vladimir; Bystry, Vojtech; Pospisilova, Sarka

    2018-06-26

    High-throughput bioinformatics analyses of next generation sequencing (NGS) data often require challenging pipeline optimization. The key problem is choosing appropriate tools and selecting the best parameters for optimal precision and recall. Here we introduce ToTem, a tool for automated pipeline optimization. ToTem is a stand-alone web application with a comprehensive graphical user interface (GUI). ToTem is written in Java and PHP with an underlying connection to a MySQL database. Its primary role is to automatically generate, execute and benchmark different variant calling pipeline settings. Our tool allows an analysis to be started from any level of the process and with the possibility of plugging almost any tool or code. To prevent an over-fitting of pipeline parameters, ToTem ensures the reproducibility of these by using cross validation techniques that penalize the final precision, recall and F-measure. The results are interpreted as interactive graphs and tables allowing an optimal pipeline to be selected, based on the user's priorities. Using ToTem, we were able to optimize somatic variant calling from ultra-deep targeted gene sequencing (TGS) data and germline variant detection in whole genome sequencing (WGS) data. ToTem is a tool for automated pipeline optimization which is freely available as a web application at https://totem.software.

  5. Early-Onset Progressive Retinal Atrophy Associated with an IQCB1 Variant in African Black-Footed Cats (Felis nigripes)

    PubMed Central

    Oh, Annie; Pearce, Jacqueline W.; Gandolfi, Barbara; Creighton, Erica K.; Suedmeyer, William K.; Selig, Michael; Bosiack, Ann P.; Castaner, Leilani J.; Whiting, Rebecca E. H.; Belknap, Ellen B.; Lyons, Leslie A.; Aderdein, Danielle; Alves, Paulo C.; Barsh, Gregory S.; Beale, Holly C.; Boyko, Adam R.; Castelhano, Marta G.; Chan, Patricia; Ellinwood, N. Matthew; Garrick, Dorian J.; Helps, Christopher R.; Kaelin, Christopher B.; Leeb, Tosso; Lohi, Hannes; Longeri, Maria; Malik, Richard; Montague, Michael J.; Munday, John S.; Murphy, William J.; Pedersen, Niels C.; Rothschild, Max F.; Swanson, William F.; Terio, Karen A.; Todhunter, Rory J.; Warren, Wesley C.

    2017-01-01

    African black-footed cats (Felis nigripes) are endangered wild felids. One male and full-sibling female African black-footed cat developed vision deficits and mydriasis as early as 3 months of age. The diagnosis of early-onset progressive retinal atrophy (PRA) was supported by reduced direct and consensual pupillary light reflexes, phenotypic presence of retinal degeneration, and a non-recordable electroretinogram with negligible amplitudes in both eyes. Whole genome sequencing, conducted on two unaffected parents and one affected offspring and compared to a variant database from 51 domestic cats and a Pallas cat, revealed 50 candidate variants that segregated concordantly with the PRA phenotype. Testing in additional affected cats confirmed that cats homozygous for a 2 base pair (bp) deletion within IQ calmodulin-binding motif-containing protein-1 (IQCB1), the gene that encodes nephrocystin-5 (NPHP5), had vision loss. The variant segregated concordantly in other related individuals within the pedigree, supporting the identification of a recessively inherited early-onset feline PRA. Analysis of the black-footed cat studbook suggests additional captive cats are at risk. Genetic testing for IQCB1 and avoidance of matings between carriers should be added to the species survival plan for captive management. PMID:28322220

  6. Cis-Regulatory Variants Affect CHRNA5 mRNA Expression in Populations of African and European Ancestry

    PubMed Central

    Wang, Jen-Chyong; Spiegel, Noah; Bertelsen, Sarah; Le, Nhung; McKenna, Nicholas; Budde, John P.; Harari, Oscar; Kapoor, Manav; Brooks, Andrew; Hancock, Dana; Tischfield, Jay; Foroud, Tatiana; Bierut, Laura J.; Steinbach, Joe Henry; Edenberg, Howard J.; Traynor, Bryan J.; Goate, Alison M.

    2013-01-01

    Variants within the gene cluster encoding α3, α5, and β4 nicotinic receptor subunits are major risk factors for substance dependence. The strongest impact on risk is associated with variation in the CHRNA5 gene, where at least two mechanisms are at work: amino acid variation and altered mRNA expression levels. The risk allele of the non-synonymous variant (rs16969968; D398N) primarily occurs on the haplotype containing the low mRNA expression allele. In populations of European ancestry, there are approximately 50 highly correlated variants in the CHRNA5-CHRNA3-CHRNB4 gene cluster and the adjacent PSMA4 gene region that are associated with CHRNA5 mRNA levels. It is not clear which of these variants contribute to the changes in CHRNA5 transcript level. Because populations of African ancestry have reduced linkage disequilibrium among variants spanning this gene cluster, eQTL mapping in subjects of African ancestry could potentially aid in defining the functional variants that affect CHRNA5 mRNA levels. We performed quantitative allele specific gene expression using frontal cortices derived from 49 subjects of African ancestry and 111 subjects of European ancestry. This method measures allele-specific transcript levels in the same individual, which eliminates other biological variation that occurs when comparing expression levels between different samples. This analysis confirmed that substance dependence associated variants have a direct cis-regulatory effect on CHRNA5 transcript levels in human frontal cortices of African and European ancestry and identified 10 highly correlated variants, located in a 9 kb region, that are potential functional variants modifying CHRNA5 mRNA expression levels. PMID:24303001

  7. Association study of genetic variants in estrogen metabolic pathway genes and colorectal cancer risk and survival.

    PubMed

    Li, Shuwei; Xie, Lisheng; Du, Mulong; Xu, Kaili; Zhu, Lingjun; Chu, Haiyan; Chen, Jinfei; Wang, Meilin; Zhang, Zhengdong; Gu, Dongying

    2018-05-16

    Although studies have investigated the association of genetic variants and the abnormal expression of estrogen-related genes with colorectal cancer risk, the evidence remains inconsistent. We clarified the relationship of genetic variants in estrogen metabolic pathway genes with colorectal cancer risk and survival. A case-control study was performed to assess the association of single-nucleotide polymorphisms (SNPs) in ten candidate genes with colorectal cancer risk in a Chinese population. A logistic regression model and Cox regression model were used to calculate SNP effects on colorectal cancer susceptibility and survival, respectively. Expression quantitative trait loci (eQTL) analysis was conducted using the Genotype-Tissue Expression (GTEx) project dataset. The sequence kernel association test (SKAT) was used to perform gene-set analysis. Colorectal cancer risk and rs3760806 in SULT2B1 were significantly associated in both genders [male: OR = 1.38 (1.15-1.66); female: OR = 1.38 (1.13-1.68)]. Two SNPs in SULT1E1 were related to progression-free survival (PFS) [rs1238574: HR = 1.24 (1.02-1.50), P = 2.79 × 10⁻²; rs3822172: HR = 1.30 (1.07-1.57), P = 8.44 × 10⁻³] and overall survival (OS) [rs1238574: HR = 1.51 (1.16-1.97), P = 2.30 × 10⁻³; rs3822172: HR = 1.53 (1.67-2.00), P = 2.03 × 10⁻³]. Moreover, rs3760806 was an eQTL for SULT2B1 in colon samples (transverse: P = 3.6 × 10⁻³; sigmoid: P = 1.0 × 10⁻³). SULT2B1 expression was significantly higher in colorectal tumor tissues than in normal tissues in the Cancer Genome Atlas (TCGA) database (P < 1.0 × 10⁻⁴). Our results indicated that SNPs in estrogen metabolic pathway genes confer colorectal cancer susceptibility and survival.

  8. Autism Linked to Increased Oncogene Mutations but Decreased Cancer Rate

    PubMed Central

    Zimmerman, M. Bridget; Mahajan, Vinit B.; Bassuk, Alexander G.

    2016-01-01

    Autism spectrum disorder (ASD) is one phenotypic aspect of many monogenic, hereditary cancer syndromes. Pleiotropic effects of cancer genes on the autism phenotype could lead to repurposing of oncology medications to treat this increasingly prevalent neurodevelopmental condition for which there is currently no treatment. To explore this hypothesis we sought to discover whether autistic patients more often have rare coding, single-nucleotide variants within tumor suppressor and oncogenes and whether autistic patients are more often diagnosed with neoplasms. Exome-sequencing data from the ARRA Autism Sequencing Collaboration was compared to that of a control cohort from the Exome Variant Server database, revealing that rare, coding variants within oncogenes were enriched in the ARRA ASD cohort (p < 1.0 × 10⁻⁸). In contrast, variants were not significantly enriched in tumor suppressor genes. Phenotypically, children and adults with ASD exhibited a protective effect against cancer, with a frequency of 1.3% vs. 3.9% (p<0.001), but the protective effect decreased with age. The odds ratio of neoplasm for those with ASD relative to controls was 0.06 (95% CI: 0.02, 0.19; p<0.0001) in the 0 to 14 age group; 0.35 (95% CI: 0.14, 0.87; p = 0.024) in the 15 to 29 age group; 0.41 (95% CI: 0.15, 1.17; p = 0.095) in the 30 to 54 age group; and 0.49 (95% CI: 0.14, 1.74; p = 0.267) in those 55 and older. Both males and females demonstrated the protective effect. These findings suggest that defects in cellular proliferation, and potentially senescence, might influence both autism and neoplasm, and already approved drugs targeting oncogenic pathways might also have therapeutic value for treating autism. PMID:26934580
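
    The age-stratified odds ratios and 95% confidence intervals quoted above are the kind of quantity that can be computed from a 2×2 table. The following is a minimal sketch, assuming a simple Wald interval on the log odds ratio; the abstract does not state which interval method the authors used, and the function name `odds_ratio_ci` and the counts in the usage note are invented for illustration and are not study data.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

    For example, `odds_ratio_ci(13, 987, 39, 961)` (made-up counts) returns an odds ratio below 1 whose interval also lies entirely below 1, which is how a "protective" association would appear in this framing.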

  9. Meta-analysis of Clear Cell Renal Cell Carcinoma Gene Expression Defines a Variant Subgroup and Identifies Gender Influences on Tumor Biology

    PubMed Central

    Brannon, A. Rose; Haake, Scott M.; Hacker, Kathryn E.; Pruthi, Raj S.; Wallen, Eric M.; Nielsen, Matthew E.; Rathmell, W. Kimryn

    2011-01-01

    Background: Clear cell renal cell carcinoma (ccRCC) displays molecular and histologic heterogeneity. Previously described subsets of this disease, ccA and ccB, were defined based on multigene expression profiles, but it is unclear whether these subgroupings reflect the full spectrum of disease or how these molecular subtypes relate to histologic descriptions or gender. Objective: Determine whether additional subtypes of ccRCC exist and whether these subtypes are related to von Hippel-Lindau (VHL) inactivation, hypoxia-inducible factor (HIF) 1 and 2 expression, tumor histology, or gender. Design, setting, and participants: Six large, publicly available ccRCC gene expression databases were identified that cumulatively provided data for 480 tumors for meta-analysis via meta-array compilation. Measurements: Unsupervised consensus clustering was performed on the meta-arrays. Tumors were examined for the relationship of multigene-defined consensus subtypes and expression signatures of VHL mutation and HIF status, tumor histology, and gender. Results and limitations: Two dominant subsets of ccRCC were observed. However, a minor third cluster was revealed that correlated strongly with a wild type (WT) VHL expression profile and indications of variant histologies. When variant histologies were removed, ccA tumors naturally divided by gender. This technique is limited by the potential for persistent batch effect, tumor sampling bias, and restrictions of annotated information. Conclusions: The ccA and ccB subsets of ccRCC are robust in meta-analysis among histologically conventional ccRCC tumors. A third group of tumors was identified that may represent a new variant of ccRCC. Within definitively clear cell tumors, gender may delineate tumors in such a way that it could have implications regarding current treatments and future drug development. PMID:22030119

  10. APOL1 Nephropathy: A Population Genetics and Evolutionary Medicine Detective Story.

    PubMed

    Kruzel-Davila, Etty; Wasser, Walter G; Skorecki, Karl

    2017-11-01

    Common DNA sequence variants rarely have a high-risk association with a common disease. When such associations do occur, evolutionary forces must be sought, such as in the association of apolipoprotein L1 (APOL1) gene risk variants with nondiabetic kidney diseases in populations of African ancestry. The variants originated in West Africa and provided pathogenic resistance in the heterozygous state that led to high allele frequencies owing to an adaptive evolutionary selective sweep. However, the homozygous state is disadvantageous and is associated with a markedly increased risk of a spectrum of kidney diseases encompassing hypertension-attributed kidney disease, focal segmental glomerulosclerosis, human immunodeficiency virus nephropathy, sickle cell nephropathy, and progressive lupus nephritis. This scientific success story emerged with the help of the tools developed over the past 2 decades in human genome sequencing and population genomic databases. In this introductory article to a timely issue dedicated to illuminating progress in this area, we describe this unique population genetics and evolutionary medicine detective story. We emphasize the paradox of the inheritance mode, the missing heritability, and unresolved associations, including cardiovascular risk and diabetic nephropathy. We also highlight how genetic epidemiology elucidates mechanisms and how the principles of evolution can be used to unravel conserved pathways affected by APOL1 that may lead to novel therapies. The APOL1 gene provides a compelling example of a common variant association with common forms of nondiabetic kidney disease occurring in a continental population isolate with subsequent global admixture. Scientific collaboration using multiple experimental model systems and approaches should further clarify pathomechanisms, leading to novel therapies. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. Sensitivity of BRCA1/2 testing in high-risk breast/ovarian/male breast cancer families: little contribution of comprehensive RNA/NGS panel testing.

    PubMed

    Byers, Helen; Wallis, Yvonne; van Veen, Elke M; Lalloo, Fiona; Reay, Kim; Smith, Philip; Wallace, Andrew J; Bowers, Naomi; Newman, William G; Evans, D Gareth

    2016-11-01

    The sensitivity of testing BRCA1 and BRCA2 remains unresolved as the frequency of deep intronic splicing variants has not been defined in high-risk familial breast/ovarian cancer families. This variant category is reported at significant frequency in other tumour predisposition genes, including NF1 and MSH2. We carried out comprehensive whole gene RNA analysis on 45 high-risk breast/ovary and male breast cancer families with no identified pathogenic variant on exonic sequencing and copy number analysis of BRCA1/2. In addition, we undertook variant screening of a 10-gene high/moderate risk breast/ovarian cancer panel by next-generation sequencing. DNA testing identified the causative variant in 50/56 (89%) breast/ovarian/male breast cancer families with Manchester scores of ≥50 with two variants being confirmed to affect splicing on RNA analysis. RNA sequencing of BRCA1/BRCA2 on 45 individuals from high-risk families identified no deep intronic variants and did not suggest loss of RNA expression as a cause of lost sensitivity. Panel testing in 42 samples identified a known RAD51D variant, a high-risk ATM variant in another breast ovary family and a truncating CHEK2 mutation. Current exonic sequencing and copy number analysis variant detection methods of BRCA1/2 have high sensitivity in high-risk breast/ovarian cancer families. Sequence analysis of RNA does not identify any variants undetected by current analysis of BRCA1/2. However, RNA analysis clarified the pathogenicity of variants of unknown significance detected by current methods. The low diagnostic uplift achieved through sequence analysis of the other known breast/ovarian cancer susceptibility genes indicates that further high-risk genes remain to be identified.

  12. Variant allele frequency enrichment analysis in vitro reveals sonic hedgehog pathway to impede sustained temozolomide response in GBM.

    PubMed

    Biswas, Nidhan K; Chandra, Vikas; Sarkar-Roy, Neeta; Das, Tapojyoti; Bhattacharya, Rabindra N; Tripathy, Laxmi N; Basu, Sunandan K; Kumar, Shantanu; Das, Subrata; Chatterjee, Ankita; Mukherjee, Ankur; Basu, Pryiadarshi; Maitra, Arindam; Chattopadhyay, Ansuman; Basu, Analabha; Dhara, Surajit

    2015-01-21

    Neoplastic cells of Glioblastoma multiforme (GBM) may or may not show sustained response to temozolomide (TMZ) chemotherapy. We hypothesize that TMZ chemotherapy response in GBM is predetermined in its neoplastic clones via a specific set of mutations that alter relevant pathways. We describe exome-wide enrichment of variant allele frequencies (VAFs) in neurospheres displaying contrasting phenotypes of sustained versus reversible TMZ-responses in vitro. Enrichment of VAFs was found on genes ST5, RP6KA1 and PRKDC in cells showing sustained TMZ-effect whereas on genes FREM2, AASDH and STK36, in cells showing reversible TMZ-effect. Ingenuity pathway analysis (IPA) revealed that these genes alter cell-cycle, G2/M-checkpoint-regulation and NHEJ pathways in sustained TMZ-effect cells whereas the lysine-II&V/phenylalanine degradation and sonic hedgehog (Hh) pathways in reversible TMZ-effect cells. Next, we validated the likely involvement of the Hh-pathway in TMZ-response on additional GBM neurospheres as well as on GBM patients, by extracting RNA-sequencing-based gene expression data from the TCGA-GBM database. Finally, we demonstrated TMZ-sensitization of a TMZ non-responder neurosphere in vitro by treating them with the FDA-approved pharmacological Hh-pathway inhibitor vismodegib. Altogether, our results indicate that the Hh-pathway impedes sustained TMZ-response in GBM and could be a potential therapeutic target to enhance TMZ-response in this malignancy.

  13. Association between Genetic Variants and Diabetes Mellitus in Iranian Populations: A Systematic Review of Observational Studies

    PubMed Central

    Khodaeian, Mehrnoosh; Enayati, Samaneh; Tabatabaei-Malazy, Ozra; Amoli, Mahsa M.

    2015-01-01

    Introduction. Diabetes mellitus as the most prevalent metabolic disease is a multifactorial disease which is influenced by environmental and genetic factors. In this systematic review, we assessed the association between genetic variants and diabetes/its complications in studies with Iranian populations. Methods. Google Scholar, PubMed, Scopus, and Persian web databases were systematically searched up to January 2014. The search terms were “gene,” “polymorphism,” “diabetes,” and “diabetic complications”; nephropathy, retinopathy, neuropathy, foot ulcer, and CAD (coronary artery diseases); and Persian equivalents. Animal studies, letters to editor, and in vitro studies were excluded. Results. Out of overall 3029 eligible articles, 88 articles were included. We found significant association between CTLA-4, IL-18, VDR, TAP2, IL-12, and CD4 genes and T1DM, HNFα and MODY, haptoglobin, paraoxonase, leptin, TCF7L2, calreticulin, ERα, PPAR-γ2, CXCL5, calpain-10, IRS-1 and 2, GSTM1, KCNJ11, eNOS, VDR, INSR, ACE, apoA-I, apo E, adiponectin, PTPN1, CETP, AT1R, resistin, MMP-3, BChE K, AT2R, SUMO4, IL-10, VEGF, MTHFR, and GSTM1 with T2DM or its complications. Discussion. We found some controversial results due to heterogeneity in ethnicity and genetic background. We thought genome wide association studies on large number of samples will be helpful in identifying diabetes susceptible genes as an alternative to studying individual candidate genes in Iranian populations. PMID:26587547

  14. Variants in the interleukin 8 gene and the response to inhaled bronchodilators in cystic fibrosis.

    PubMed

    Furlan, Larissa Lazzarini; Ribeiro, José Dirceu; Bertuzzo, Carmen Sílvia; Salomão Junior, João Batista; Souza, Dorotéia Rossi Silva; Marson, Fernando Augusto Lima

    Interleukin 8 protein promotes inflammatory responses, even in airways. The presence of interleukin 8 gene variants causes altered inflammatory responses and possibly varied responses to inhaled bronchodilators. Thus, this study analyzed the interleukin 8 variants (rs4073, rs2227306, and rs2227307) and their association with the response to inhaled bronchodilators in cystic fibrosis patients. Analysis of interleukin 8 gene variants was performed by restriction fragment length polymorphism of polymerase chain reaction. The association between spirometry markers and the response to inhaled bronchodilators was evaluated by Mann-Whitney and Kruskal-Wallis tests. The analysis included all cystic fibrosis patients, and subsequently patients with two mutations in the cystic fibrosis transmembrane conductance regulator gene belonging to classes I to III. This study included 186 cystic fibrosis patients. There was no association of the rs2227307 variant with the response to inhaled bronchodilators. The rs2227306 variant was associated with FEF50% in the dominant group and in the group with two identified mutations in the cystic fibrosis transmembrane conductance regulator gene. The rs4073 variant was associated with spirometry markers in four genetic models: co-dominant (FEF25-75% and FEF75%), dominant (FEV1, FEF50%, FEF75%, and FEF25-75%), recessive (FEF75% and FEF25-75%), and over-dominant (FEV1/FVC). This study highlighted the importance of the rs4073 variant of the interleukin 8 gene, regarding response to inhaled bronchodilators, and of the assessment of mutations in the cystic fibrosis transmembrane conductance regulator gene. Copyright © 2017 Sociedade Brasileira de Pediatria. Published by Elsevier Editora Ltda. All rights reserved.

  15. Network perturbation by recurrent regulatory variants in cancer

    PubMed Central

    Cho, Ara; Lee, Insuk; Choi, Jung Kyoon

    2017-01-01

    Cancer driving genes have been identified as recurrently affected by variants that alter protein-coding sequences. However, a majority of cancer variants arise in noncoding regions, and some of them are thought to play a critical role through transcriptional perturbation. Here we identified putative transcriptional driver genes based on combinatorial variant recurrence in cis-regulatory regions. The identified genes showed high connectivity in the cancer type-specific transcription regulatory network, with high outdegree and many downstream genes, highlighting their causative role during tumorigenesis. In the protein interactome, the identified transcriptional drivers were not as highly connected as coding driver genes but appeared to form a network module centered on the coding drivers. The coding and regulatory variants associated via these interactions between the coding and transcriptional drivers showed exclusive and complementary occurrence patterns across tumor samples. Transcriptional cancer drivers may act through an extensive perturbation of the regulatory network and by altering protein network modules through interactions with coding driver genes. PMID:28333928

  16. Oncodomains: A protein domain-centric framework for analyzing rare variants in tumor samples

    PubMed Central

    Peterson, Thomas A.; Park, Junyong

    2017-01-01

    The fight against cancer is hindered by its highly heterogeneous nature. Genome-wide sequencing studies have shown that individual malignancies contain many mutations that range from those commonly found in tumor genomes to rare somatic variants present only in a small fraction of lesions. Such rare somatic variants dominate the landscape of genomic mutations in cancer, yet efforts to correlate somatic mutations found in one or few individuals with functional roles have been largely unsuccessful. Traditional methods for identifying somatic variants that drive cancer are ‘gene-centric’ in that they consider only somatic variants within a particular gene and make no comparison to other similar genes in the same family that may play a similar role in cancer. In this work, we present oncodomain hotspots, a new ‘domain-centric’ method for identifying clusters of somatic mutations across entire gene families using protein domain models. Our analysis confirms that our approach creates a framework for leveraging structural and functional information encapsulated by protein domains into the analysis of somatic variants in cancer, enabling the assessment of even rare somatic variants by comparison to similar genes. Our results reveal a vast landscape of somatic variants that act at the level of domain families altering pathways known to be involved with cancer such as protein phosphorylation, signaling, gene regulation, and cell metabolism. Due to oncodomain hotspots’ unique ability to assess rare variants, we expect our method to become an important tool for the analysis of sequenced tumor genomes, complementing existing methods. PMID:28426665

  17. Genetic diagnosis of developmental disorders in the DDD study: a scalable analysis of genome-wide research data

    PubMed Central

    Wright, Caroline F; Fitzgerald, Tomas W; Jones, Wendy D; Clayton, Stephen; McRae, Jeremy F; van Kogelenberg, Margriet; King, Daniel A; Ambridge, Kirsty; Barrett, Daniel M; Bayzetinova, Tanya; Bevan, A Paul; Bragin, Eugene; Chatzimichali, Eleni A; Gribble, Susan; Jones, Philip; Krishnappa, Netravathi; Mason, Laura E; Miller, Ray; Morley, Katherine I; Parthiban, Vijaya; Prigmore, Elena; Rajan, Diana; Sifrim, Alejandro; Swaminathan, G Jawahar; Tivey, Adrian R; Middleton, Anna; Parker, Michael; Carter, Nigel P; Barrett, Jeffrey C; Hurles, Matthew E; FitzPatrick, David R; Firth, Helen V

    2015-01-01

    Background: Human genome sequencing has transformed our understanding of genomic variation and its relevance to health and disease, and is now starting to enter clinical practice for the diagnosis of rare diseases. The question of whether and how some categories of genomic findings should be shared with individual research participants is currently a topic of international debate, and development of robust analytical workflows to identify and communicate clinically relevant variants is paramount. Methods: The Deciphering Developmental Disorders (DDD) study has developed a UK-wide patient recruitment network involving over 180 clinicians across all 24 regional genetics services, and has performed genome-wide microarray and whole exome sequencing on children with undiagnosed developmental disorders and their parents. After data analysis, pertinent genomic variants were returned to individual research participants via their local clinical genetics team. Findings: Around 80 000 genomic variants were identified from exome sequencing and microarray analysis in each individual, of which on average 400 were rare and predicted to be protein altering. By focusing only on de novo and segregating variants in known developmental disorder genes, we achieved a diagnostic yield of 27% among 1133 previously investigated yet undiagnosed children with developmental disorders, whilst minimising incidental findings. In families with developmentally normal parents, whole exome sequencing of the child and both parents resulted in a 10-fold reduction in the number of potential causal variants that needed clinical evaluation compared to sequencing only the child. Most diagnostic variants identified in known genes were novel and not present in current databases of known disease variation. Interpretation: Implementation of a robust translational genomics workflow is achievable within a large-scale rare disease research study to allow feedback of potentially diagnostic findings to clinicians and research participants. Systematic recording of relevant clinical data, curation of a gene–phenotype knowledge base, and development of clinical decision support software are needed in addition to automated exclusion of almost all variants, which is crucial for scalable prioritisation and review of possible diagnostic variants. However, the resource requirements of development and maintenance of a clinical reporting system within a research setting are substantial. Funding: Health Innovation Challenge Fund, a parallel funding partnership between the Wellcome Trust and the UK Department of Health. PMID:25529582
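
    The trio-based reduction described above (keeping only rare, protein-altering variants in known genes that are absent from both parents) can be illustrated with a small filter. This is a hedged sketch and not the DDD pipeline: the `Variant` record, the `candidate_de_novo` function, the `max_freq` cutoff of 0.01, and the gene names used in any example are all assumptions made for illustration, and only the de novo case is covered, not other segregation patterns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    gene: str
    pop_freq: float          # population allele frequency (hypothetical field)
    protein_altering: bool   # predicted consequence flag (hypothetical field)

    @property
    def key(self):
        """Identity of the allele, independent of annotation."""
        return (self.chrom, self.pos, self.ref, self.alt)


def candidate_de_novo(child, mother, father, known_genes, max_freq=0.01):
    """Return child variants that are absent from both parents, rare,
    predicted protein altering, and located in a known disease gene."""
    parental = {v.key for v in mother} | {v.key for v in father}
    return [
        v for v in child
        if v.key not in parental
        and v.gene in known_genes
        and v.protein_altering
        and v.pop_freq < max_freq
    ]
```

    Sequencing both parents is what makes the `parental` set available; with the child alone, the first condition cannot be applied and far more variants survive the filter.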

  18. Establishing the role of rare coding variants in known Parkinson's disease risk loci.

    PubMed

    Jansen, Iris E; Gibbs, J Raphael; Nalls, Mike A; Price, T Ryan; Lubbe, Steven; van Rooij, Jeroen; Uitterlinden, André G; Kraaij, Robert; Williams, Nigel M; Brice, Alexis; Hardy, John; Wood, Nicholas W; Morris, Huw R; Gasser, Thomas; Singleton, Andrew B; Heutink, Peter; Sharma, Manu

    2017-11-01

    Many common genetic factors have been identified to contribute to Parkinson's disease (PD) susceptibility, improving our understanding of the related underlying biological mechanisms. The involvement of rarer variants in these loci has been poorly studied. Using International Parkinson's Disease Genomics Consortium data sets, we performed a comprehensive study to determine the impact of rare variants in 23 previously published genome-wide association studies (GWAS) loci in PD. We applied Prix fixe to select the putative causal genes underneath the GWAS peaks, which was based on underlying functional similarities. The Sequence Kernel Association Test was used to analyze the joint effect of rare, common, or both types of variants on PD susceptibility. All genes were tested simultaneously as a gene set and each gene individually. We observed a moderate association of common variants, confirming the involvement of the known PD risk loci within our genetic data sets. Focusing on rare variants, we identified additional association signals for LRRK2, STBD1, and SPATA19. Our study suggests an involvement of rare variants within several putatively causal genes underneath previously identified PD GWAS peaks. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. FARVATX: FAmily-based Rare Variant Association Test for X-linked genes

    PubMed Central

    Choi, Sungkyoung; Lee, Sungyoung; Qiao, Dandi; Hardin, Megan; Cho, Michael H.; Silverman, Edwin K; Park, Taesung; Won, Sungho

    2016-01-01

    Although the X chromosome has many genes that are functionally related to human diseases, the complicated biological properties of the X chromosome have prevented efficient genetic association analyses, and only a few significantly associated X-linked variants have been reported for complex traits. For instance, dosage compensation of X-linked genes is often achieved via the inactivation of one allele in each X-linked variant in females; however, some X-linked variants can escape this X chromosome inactivation. Efficient genetic analyses cannot be conducted without prior knowledge about the gene expression process of X-linked variants, and misspecified information can lead to power loss. In this report, we propose new statistical methods for rare X-linked variant genetic association analysis of dichotomous phenotypes with family-based samples. The proposed methods are computationally efficient and can complete X-linked analyses within a few hours. Simulation studies demonstrate the statistical efficiency of the proposed methods, which were then applied to rare-variant association analysis of the X chromosome in chronic obstructive pulmonary disease (COPD). Some promising significant X-linked genes were identified, illustrating the practical importance of the proposed methods. PMID:27325607

  20. FARVATX: Family-Based Rare Variant Association Test for X-Linked Genes.

    PubMed

    Choi, Sungkyoung; Lee, Sungyoung; Qiao, Dandi; Hardin, Megan; Cho, Michael H; Silverman, Edwin K; Park, Taesung; Won, Sungho

    2016-09-01

    Although the X chromosome has many genes that are functionally related to human diseases, the complicated biological properties of the X chromosome have prevented efficient genetic association analyses, and only a few significantly associated X-linked variants have been reported for complex traits. For instance, dosage compensation of X-linked genes is often achieved via the inactivation of one allele in each X-linked variant in females; however, some X-linked variants can escape this X chromosome inactivation. Efficient genetic analyses cannot be conducted without prior knowledge about the gene expression process of X-linked variants, and misspecified information can lead to power loss. In this report, we propose new statistical methods for rare X-linked variant genetic association analysis of dichotomous phenotypes with family-based samples. The proposed methods are computationally efficient and can complete X-linked analyses within a few hours. Simulation studies demonstrate the statistical efficiency of the proposed methods, which were then applied to rare-variant association analysis of the X chromosome in chronic obstructive pulmonary disease. Some promising significant X-linked genes were identified, illustrating the practical importance of the proposed methods. © 2016 WILEY PERIODICALS, INC.

  1. An informatics approach to analyzing the incidentalome.

    PubMed

    Berg, Jonathan S; Adams, Michael; Nassar, Nassib; Bizon, Chris; Lee, Kristy; Schmitt, Charles P; Wilhelmsen, Kirk C; Evans, James P

    2013-01-01

    Next-generation sequencing has transformed genetic research and is poised to revolutionize clinical diagnosis. However, the vast amount of data and inevitable discovery of incidental findings require novel analytic approaches. We therefore implemented for the first time a strategy that utilizes an a priori structured framework and a conservative threshold for selecting clinically relevant incidental findings. We categorized 2,016 genes linked with Mendelian diseases into "bins" based on clinical utility and validity, and used a computational algorithm to analyze 80 whole-genome sequences in order to explore the use of such an approach in a simulated real-world setting. The algorithm effectively reduced the number of variants requiring human review and identified incidental variants with likely clinical relevance. Incorporation of the Human Gene Mutation Database improved the yield for missense mutations but also revealed that a substantial proportion of purported disease-causing mutations were misleading. This approach is adaptable to any clinically relevant bin structure, scalable to the demands of a clinical laboratory workflow, and flexible with respect to advances in genomics. We anticipate that application of this strategy will facilitate pretest informed consent, laboratory analysis, and posttest return of results in a clinical context.
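
    The binning strategy described above can be pictured as a lookup from gene to bin followed by a conservative variant filter. The sketch below is illustrative only: the bin labels, gene names, dictionary fields (`id`, `gene`, `freq`, `truncating`), the `max_freq` threshold, and the rule "rare and truncating" are assumptions, since the abstract does not specify the published algorithm's criteria.

```python
from collections import defaultdict

# Hypothetical gene-to-bin assignments; real bins would be curated by experts.
GENE_BINS = {
    "GENE_A": "actionable",
    "GENE_B": "valid_not_actionable",
    "GENE_C": "actionable",
}


def variants_for_review(variants, gene_bins, max_freq=0.01):
    """Group variants that pass a conservative filter by the bin of their gene.

    variants: iterable of dicts with keys 'id', 'gene', 'freq', 'truncating'.
    Variants in unbinned genes, or failing the filter, are dropped so that
    they never reach human review.
    """
    by_bin = defaultdict(list)
    for v in variants:
        bin_name = gene_bins.get(v["gene"])
        if bin_name is None:
            continue  # gene was not assigned to any bin
        if v["freq"] <= max_freq and v["truncating"]:
            by_bin[bin_name].append(v["id"])
    return dict(by_bin)
```

    Because the bin structure is just data, swapping in a different set of bins changes what is reported without touching the filter itself.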

  2. Deep Sequencing of 71 Candidate Genes to Characterize Variation Associated with Alcohol Dependence.

    PubMed

    Clark, Shaunna L; McClay, Joseph L; Adkins, Daniel E; Kumar, Gaurav; Aberg, Karolina A; Nerella, Srilaxmi; Xie, Linying; Collins, Ann L; Crowley, James J; Quackenbush, Corey R; Hilliard, Christopher E; Shabalin, Andrey A; Vrieze, Scott I; Peterson, Roseann E; Copeland, William E; Silberg, Judy L; McGue, Matt; Maes, Hermine; Iacono, William G; Sullivan, Patrick F; Costello, Elizabeth J; van den Oord, Edwin J

    2017-04-01

    Previous genomewide association studies (GWASs) have identified a number of putative risk loci for alcohol dependence (AD). However, only a few loci have replicated and these replicated variants only explain a small proportion of AD risk. Using an innovative approach, the goal of this study was to generate hypotheses about potentially causal variants for AD that can be explored further through functional studies. We employed targeted capture of 71 candidate loci and flanking regions followed by next-generation deep sequencing (mean coverage 78X) in 806 European Americans. Regions included in our targeted capture library were genes identified through published GWAS of alcohol, all human alcohol and aldehyde dehydrogenases, reward system genes including dopaminergic and opioid receptors, prioritized candidate genes based on previous associations, and genes involved in the absorption, distribution, metabolism, and excretion of drugs. We performed single-locus tests to determine if any single variant was associated with AD symptom count. Sets of variants that overlapped with biologically meaningful annotations were tested for association in aggregate. No single, common variant was significantly associated with AD in our study. We did, however, find evidence for association with several variant sets. Two variant sets were significant at the q-value <0.10 level: a genic enhancer for ADHFE1 (p = 1.47 × 10⁻⁵; q = 0.019), an alcohol dehydrogenase, and ADORA1 (p = 5.29 × 10⁻⁵; q = 0.035), an adenosine receptor that belongs to a G-protein-coupled receptor gene family. To our knowledge, this is the first sequencing study of AD to examine variants in entire genes, including flanking and regulatory regions. We found that in addition to protein coding variant sets, regulatory variant sets may play a role in AD. From these findings, we have generated initial functional hypotheses about how these sets may influence AD. Copyright © 2017 by the Research Society on Alcoholism.
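
    The q-value threshold mentioned above controls the false discovery rate across many variant-set tests. The abstract does not say which procedure produced its q-values, so the sketch below uses Benjamini-Hochberg adjusted p-values as one common stand-in; the function name `bh_adjust` and the p-values in the usage note are illustrative assumptions, not values from the study.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p), then a running
    minimum from the largest rank downward enforces monotonicity.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

    With `bh_adjust([0.01, 0.04, 0.03, 0.005])`, the two smallest p-values share one adjusted value and the two largest share another, so a threshold such as 0.10 would be applied to the adjusted values rather than to the raw p-values.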

  3. Clinical relevance of rare germline sequence variants in cancer genes: evolution and application of classification models.

    PubMed

    Spurdle, Amanda B

    2010-06-01

    Multifactorial models developed for BRCA1/2 variant classification have proved very useful for delineating BRCA1/2 variants associated with very high risk of cancer, or with little clinical significance. Recent linkage of this quantitative assessment of risk to clinical management guidelines has provided a basis to standardize variant reporting, variant classification and management of families with such variants, and can theoretically be applied to any disease gene. As proof of principle, the multifactorial approach already shows great promise for application to the evaluation of mismatch repair gene variants identified in families with suspected Lynch syndrome. However, there is a need to be cautious about the noted limitations and caveats of the current model, some of which may be exacerbated by differences in ascertainment and biological pathways to disease for different cancer syndromes.

  4. An abundance of rare functional variants in 202 drug target genes sequenced in 14,002 people

    PubMed Central

    Nelson, Matthew R.; Wegmann, Daniel; Ehm, Margaret G.; Kessner, Darren; St. Jean, Pamela; Verzilli, Claudio; Shen, Judong; Tang, Zhengzheng; Bacanu, Silviu-Alin; Fraser, Dana; Warren, Liling; Aponte, Jennifer; Zawistowski, Matthew; Liu, Xiao; Zhang, Hao; Zhang, Yong; Li, Jun; Li, Yun; Li, Li; Woollard, Peter; Topp, Simon; Hall, Matthew D.; Nangle, Keith; Wang, Jun; Abecasis, Gonçalo; Cardon, Lon R.; Zöllner, Sebastian; Whittaker, John C.; Chissoe, Stephanie L.; Novembre, John; Mooser, Vincent

    2015-01-01

    Rare genetic variants contribute to complex disease risk; however, the abundance of rare variants in human populations remains unknown. We explored this spectrum of variation by sequencing 202 genes encoding drug targets in 14,002 individuals. We find rare variants are abundant (one every 17 bases) and geographically localized, such that even with large sample sizes, rare variant catalogs will be largely incomplete. We used the observed patterns of variation to estimate population growth parameters, the proportion of variants in a given frequency class that are putatively deleterious, and mutation rates for each gene. Overall we conclude that, due to rapid population growth and weak purifying selection, human populations harbor an abundance of rare variants, many of which are deleterious and have relevance to understanding disease risk. PMID:22604722

  5. Large-Scale Gene-Centric Analysis Identifies Novel Variants for Coronary Artery Disease

    PubMed Central

    2011-01-01

    Coronary artery disease (CAD) has a significant genetic contribution that is incompletely characterized. To complement genome-wide association (GWA) studies, we conducted a large and systematic candidate gene study of CAD susceptibility, including analysis of many uncommon and functional variants. We examined 49,094 genetic variants in ∼2,100 genes of cardiovascular relevance, using a customised gene array in 15,596 CAD cases and 34,992 controls (11,202 cases and 30,733 controls of European descent; 4,394 cases and 4,259 controls of South Asian origin). We attempted to replicate putative novel associations in an additional 17,121 CAD cases and 40,473 controls. Potential mechanisms through which the novel variants could affect CAD risk were explored through association tests with vascular risk factors and gene expression. We confirmed associations of several previously known CAD susceptibility loci (eg, 9p21.3: p<10^-33; LPA: p<10^-19; 1p13.3: p<10^-17) as well as three recently discovered loci (COL4A1/COL4A2, ZC3HC1, CYP17A1: p<5×10^-7). However, we found essentially null results for most previously suggested CAD candidate genes. In our replication study of 24 promising common variants, we identified novel associations of variants in or near LIPA, IL5, TRIB1, and ABCG5/ABCG8, with per-allele odds ratios for CAD risk with each of the novel variants ranging from 1.06–1.09. Associations with variants at LIPA, TRIB1, and ABCG5/ABCG8 were supported by gene expression data or effects on lipid levels. Apart from the previously reported variants in LPA, none of the other ∼4,500 low frequency and functional variants showed a strong effect. Associations in South Asians did not differ appreciably from those in Europeans, except for 9p21.3 (per-allele odds ratio: 1.14 versus 1.27 respectively; P for heterogeneity = 0.003). This large-scale gene-centric analysis has identified several novel genes for CAD that relate to diverse biochemical and cellular functions and clarified the literature with regard to many previously suggested genes. PMID:21966275

  6. G6PDdb, an integrated database of glucose-6-phosphate dehydrogenase (G6PD) mutations.

    PubMed

    Kwok, Colin J; Martin, Andrew C R; Au, Shannon W N; Lam, Veronica M S

    2002-03-01

    G6PDdb (http://www.rubic.rdg.ac.uk/g6pd/ or http://www.bioinf.org.uk/g6pd/) is a newly created web-accessible locus-specific mutation database for the human Glucose-6-phosphate dehydrogenase (G6PD) gene. The relational database integrates up-to-date mutational and structural data from various databanks (GenBank, Protein Data Bank, etc.) with biochemically characterized variants and their associated phenotypes obtained from published literature and the Favism website. An automated analysis of the mutations likely to have a significant impact on the structure of the protein has been performed using a recently developed procedure. The database may be queried online and the full results of the analysis of the structural impact of mutations are available. The web page provides a form for submitting additional mutation data and is linked to resources such as the Favism website, OMIM, HGMD, HGVBASE, and the PDB. This database provides insights into the molecular aspects and clinical significance of G6PD deficiency for researchers and clinicians and the web page functions as a knowledge base relevant to the understanding of G6PD deficiency and its management. Copyright 2002 Wiley-Liss, Inc.

  7. Analysis of PAC1 receptor gene variants in Caucasian and African American infants dying of sudden infant death syndrome.

    PubMed

    Barrett, Karlene T; Rodikova, Ekaterina; Weese-Mayer, Debra E; Rand, Casey M; Marazita, Mary L; Cooper, Margaret E; Berry-Kravis, Elizabeth M; Bech-Hansen, N Torben; Wilson, Richard J A

    2013-12-01

    Stress peptide, pituitary adenylate cyclase-activating polypeptide (PACAP), has been implicated in sudden infant death syndrome (SIDS). The aim of this exploratory study was to determine whether variants in the gene encoding the PACAP-specific receptor, PAC1, are associated with SIDS in Caucasian and African American infants. Polymerase chain reaction and Sanger DNA sequencing were used to compare variants in the 5'-untranslated region, exons and intron-exon boundaries of the PAC1 gene in 96 SIDS cases and 96 race- and gender-matched controls. The intron 3 variant, A/G: rs758995 (variant 'h'), and the intron 6 variant, C/T: rs10081254 (variant 'n'), were significantly associated with SIDS in Caucasians and African Americans, respectively (p < 0.05). Also associated with SIDS were interactions between the variants rs2302475 (variant 'i') in PAC1 and rs8192597 and rs2856966 in PACAP among Caucasians (p < 0.02) and rs2267734 (variant 'q') in PAC1 and rs1893154 in PACAP among African Americans (p < 0.01). However, none of these differences survived post hoc analysis. Overall, this study does not support a strong association between variants in the PAC1 gene and SIDS; however, a number of potential associations between race-specific variants and SIDS were identified that warrant targeted investigations in future studies. ©2013 Foundation Acta Paediatrica. Published by John Wiley & Sons Ltd.

  8. A systematic review of genetic variants associated with metabolic syndrome in patients with schizophrenia.

    PubMed

    Malan-Müller, Stefanie; Kilian, Sanja; van den Heuvel, Leigh L; Bardien, Soraya; Asmal, Laila; Warnich, Louise; Emsley, Robin A; Hemmings, Sîan M J; Seedat, Soraya

    2016-01-01

    Metabolic syndrome (MetS) is a cluster of factors that increases the risk of cardiovascular disease (CVD), one of the leading causes of mortality in patients with schizophrenia. Incidence rates of MetS are significantly higher in patients with schizophrenia compared to the general population. Several factors contribute to this high comorbidity. This systematic review focuses on genetic factors and interrogates data from association studies of genes implicated in the development of MetS in patients with schizophrenia. We aimed to identify variants that potentially contribute to the high comorbidity between these disorders. PubMed, Web of Science and Scopus databases were accessed and a systematic review of published studies was conducted. Several genes showed strong evidence for an association with MetS in patients with schizophrenia, including the fat mass and obesity associated gene (FTO), leptin and leptin receptor genes (LEP, LEPR), methylenetetrahydrofolate reductase (MTHFR) gene and the serotonin receptor 2C gene (HTR2C). Genetic association studies in complex disorders are convoluted by the multifactorial nature of these disorders, further complicating investigations of comorbidity. Recommendations for future studies include assessment of larger samples, inclusion of healthy controls, longitudinal rather than cross-sectional study designs, detailed capturing of data on confounding variables for both disorders and verification of significant findings in other populations. In future, big genomic datasets may allow for the calculation of polygenic risk scores in risk prediction of MetS in patients with schizophrenia. This could ultimately facilitate early, precise, and patient-specific pharmacological and non-pharmacological interventions to minimise CVD associated morbidity and mortality. Copyright © 2015 Elsevier B.V. All rights reserved.

  9. CanvasDB: a local database infrastructure for analysis of targeted- and whole genome re-sequencing projects

    PubMed Central

    Ameur, Adam; Bunikis, Ignas; Enroth, Stefan; Gyllensten, Ulf

    2014-01-01

    CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database, designed to handle very large datasets, to allow for rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human exome- (WES) or whole-genome sequencing (WGS) projects. The system has a built-in filtering function implemented to simultaneously take into account variant calls from all individual samples. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses are executed within just a matter of seconds, even when there are several hundreds of samples and millions of variants in the database. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals present in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server. Database URL: https://github.com/UppsalaGenomeCenter/CanvasDB PMID:25281234

  10. CanvasDB: a local database infrastructure for analysis of targeted- and whole genome re-sequencing projects.

    PubMed

    Ameur, Adam; Bunikis, Ignas; Enroth, Stefan; Gyllensten, Ulf

    2014-01-01

    CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database, designed to handle very large datasets, to allow for rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human exome- (WES) or whole-genome sequencing (WGS) projects. The system has a built-in filtering function implemented to simultaneously take into account variant calls from all individual samples. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses are executed within just a matter of seconds, even when there are several hundreds of samples and millions of variants in the database. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals present in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server. Database URL: https://github.com/UppsalaGenomeCenter/CanvasDB. © The Author(s) 2014. Published by Oxford University Press.

  11. Genetics and Genomics of Single-Gene Cardiovascular Diseases: Common Hereditary Cardiomyopathies as Prototypes of Single-Gene Disorders

    PubMed Central

    Marian, Ali J.; van Rooij, Eva; Roberts, Robert

    2016-01-01

    This is the first of 2 review papers on genetics and genomics appearing as part of the series on “omics.” Genomics pertains to all components of an organism’s genes, whereas genetics involves analysis of a specific gene(s) in the context of heredity. The paper provides introductory comments, describes the basis of human genetic diversity, and addresses the phenotypic consequences of genetic variants. Rare variants with large effect sizes are responsible for single-gene disorders, whereas complex polygenic diseases are typically due to multiple genetic variants, each exerting a modest effect size. To illustrate the clinical implications of genetic variants with large effect sizes, 3 common forms of hereditary cardiomyopathies are discussed as prototypic examples of single-gene disorders, including their genetics, clinical manifestations, pathogenesis, and treatment. The genetic basis of complex traits is discussed in a separate paper. PMID:28007145

  12. A RESTful application programming interface for the PubMLST molecular typing and genome databases

    PubMed Central

    Bray, James E.; Maiden, Martin C. J.

    2017-01-01

    Molecular typing is used to differentiate microorganisms at the subspecies or strain level for epidemiological investigations, infection control, public health and environmental sampling. DNA sequence-based typing methods require authoritative databases that link sequence variants to nomenclature in order to facilitate communication and comparison of identified types in national or global settings. The PubMLST website (https://pubmlst.org/) fulfils this role for over a hundred microorganisms for which it hosts curated molecular sequence typing data, providing sequence and allelic profile definitions for multi-locus sequence typing (MLST) and single-gene typing approaches. In recent years, these have expanded to cover the whole genome with schemes such as core genome MLST (cgMLST) and whole genome MLST (wgMLST) which catalogue the allelic diversity found in hundreds to thousands of genes. These approaches provide a common nomenclature for high-resolution strain characterization and comparison. Molecular typing information is linked to isolate provenance, phenotype, and increasingly genome assemblies, providing a resource for outbreak investigation and research into population structure, gene association, global epidemiology and vaccine coverage. A Representational State Transfer (REST) Application Programming Interface (API) has been developed for the PubMLST website to make these large quantities of structured molecular typing and whole genome sequence data available for programmatic access by any third party application. The API is an integral component of the Bacterial Isolate Genome Sequence Database (BIGSdb) platform that is used to host PubMLST resources, and exposes all public data within the site. In addition to data browsing, searching and download, the API supports authentication and submission of new data to curator queues. Database URL: http://rest.pubmlst.org/ PMID:29220452
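
    As a rough illustration of what "programmatic access" to such a REST API can look like, the sketch below builds resource URLs on top of the Database URL quoted in the abstract and parses a JSON response with the Python standard library. The helper names `resource_url` and `fetch_json` are hypothetical, and the abstract does not describe the route layout or the response format, so both the path segments used and the JSON assumption should be checked against the API's own documentation.

```python
import json
from urllib.request import urlopen

# Base address taken from the "Database URL" given in the abstract.
BASE_URL = "http://rest.pubmlst.org"


def resource_url(*segments: str) -> str:
    """Join path segments onto the API base URL (hypothetical helper)."""
    path = "/".join(s.strip("/") for s in segments if s)
    return f"{BASE_URL}/{path}" if path else f"{BASE_URL}/"


def fetch_json(url: str, timeout: float = 30.0):
    """GET a URL and parse the body as JSON.

    Assumption: the API answers with JSON; the abstract does not state this.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

    Calling `fetch_json(resource_url())` would request the API root over the network; any further path segments, such as `resource_url("db", "example_db")`, are placeholders rather than documented routes.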

  13. Towards understanding the low prevalence of Helicobacter pylori in Malays: genetic variants among Helicobacter pylori-negative ethnic Malays in the north-eastern region of Peninsular Malaysia and Han Chinese and South Indians.

    PubMed

    Maran, Sathiya; Lee, Yeong Yeh; Xu, Shu Hua; Raj, Mahendra Sundramoorthy; Abdul Majid, Noorizan; Choo, Keng Ee; Zilfalil, Bin Alwi; Graham, David Y

    2013-04-01

    To identify gene polymorphisms that differ between Malays, Han Chinese and South Indians, and to identify candidate genes for the investigation of their role in protecting Malays from Helicobacter pylori (H. pylori) infection. Malay participants born and residing in Kelantan with a documented absence of H. pylori infection were studied. Venous blood was used for genotyping using the Affymetrix 50K Xba I kit. CEL files from 141 Han Chinese and 76 South Indians were analyzed to compare their allele frequency with that of the Malays using fixation index (FST) calculation. The single nucleotide polymorphisms (SNPs) with the highest allele frequency (outliers) were then examined for their functional characteristics using F-SNP software and the Entrez Gene database. In all, 37 Malays were enrolled in the study; of whom 7 were excluded for low genotyping call rates. The average FST estimated from the genome-wide data were 0.038 (Malays in Kelantan vs the South Indians), 0.015 (Malays in Kelantan vs Han Chinese) and 0.066 (Han Chinese vs South Indians), respectively. The outlier gene variants present in Malays with functional characteristics were C7orf10 (FST 0.29988), TSTD2 (FST 0.43278), SMG7 (FST 0.29877) and XPA (FST 0.43393 and 0.43644). Genetic variants possibly related to protection against H. pylori infection in ethnic Malays from the north-eastern region of Peninsular Malaysia were identified for testing in subsequent trials among infected and uninfected Malays. © 2012 The Authors. Journal of Digestive Diseases © 2012 Chinese Medical Association Shanghai Branch, Chinese Society of Gastroenterology, Renji Hospital Affiliated to Shanghai Jiaotong University School of Medicine and Wiley Publishing Asia Pty Ltd.
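
    For readers unfamiliar with the fixation index, the sketch below computes Wright's FST for a single biallelic SNP from the allele frequencies in two populations, as (H_T - H_S) / H_T, where H_S is the mean within-population expected heterozygosity and H_T is the expected heterozygosity of the pooled population. This is a textbook form with equal population weights; the abstract does not say which FST estimator the authors used, and the function name `wright_fst` and the example frequencies are illustrative only.

```python
def wright_fst(p1: float, p2: float) -> float:
    """Wright's F_ST for one biallelic SNP in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")

    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity of the pooled population.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within the two populations.
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0

    if h_t == 0.0:
        # The SNP is monomorphic in both populations: no differentiation.
        return 0.0
    return (h_t - h_s) / h_t


# Made-up frequencies: identical populations give 0, fixed differences give 1.
same = wright_fst(0.5, 0.5)
fixed = wright_fst(0.0, 1.0)
```

    Under this definition, SNPs with the largest per-SNP FST between two groups are the "outliers" in the sense of being the most differentiated markers.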

  14. [Homozygous ectonucleotide pyrophosphatase/phosphodiesterase 1 variants in a girl with hypophosphatemic rickets and literature review].

    PubMed

    Liu, Z Q; Chen, X B; Song, F Y; Gao, K; Qiu, M F; Qian, Y; Du, M

    2017-11-02

    Objective: To investigate the clinical features and genetic characteristics of patients with ectonucleotide pyrophosphatase/phosphodiesterase 1 (ENPP1) gene variants. Method: The clinical data of a patient with homozygous ENPP1 variants from the Capital Institute of Pediatrics were collected, and the related literature was searched in China National Knowledge Infrastructure, Wanfang Data Knowledge Service Platform, National Center for Biotechnology Information and PubMed using the search terms "ENPP1" and "hypophosphatemic rickets". The literature retrieval was confined to the period from 1980 to February 2017. The clinical manifestations, bone metabolism examinations, X-ray findings and genotypes were reviewed. Result: Our patient was an 11-year-old girl with a 7-year history of lower limb malformation. She showed significant valgus deformity of the knee (genu valgum). Metabolic examination revealed a reduced level of plasma phosphate (0.86 mmol/L), a normal level of plasma calcium (2.30 mmol/L) and an elevated alkaline phosphatase level of 688 IU/L. The calcium-phosphorus product was 25.9. A homozygous nonsense variant of the ENPP1 gene, c.783C>G (p.Tyr261X) in exon 7, was identified in the patient. Both parents were heterozygous carriers. Literature review identified 3 Chinese patients from one publication and 17 cases from twenty-one publications around the world. None of the patients was found to have PHEX variants, which are the most common variants among hypophosphatemic rickets patients. The disease onset age was 11 months to 10 years. Eight patients had short stature, and five patients had a history of generalized arterial calcification of infancy. Four suffered from deafness, three showed localized calcifications of arteries, three patients manifested pseudoxanthoma elasticum and two suffered from ossification of the posterior longitudinal ligament. Nine missense variants, six splicing variants and 4 nonsense variants were reported among these twenty patients. c.783C>G was found in two Chinese patients. Conclusion: ENPP1 gene mutation is a cause of hypophosphatemic rickets. Comorbid features included generalized arterial calcification of infancy, early onset hearing loss, pseudoxanthoma and ossification of the posterior longitudinal ligament. ENPP1 gene testing should be performed on hypophosphatemic rickets patients without PHEX gene variants. Long-term follow up is recommended. The most common types of ENPP1 gene variants were nonsense/splicing variants. The c.783C>G variant was the most common variant in Chinese patients.

  15. The curation of genetic variants: difficulties and possible solutions.

    PubMed

    Pandey, Kapil Raj; Maden, Narendra; Poudel, Barsha; Pradhananga, Sailendra; Sharma, Amit Kumar

    2012-12-01

    The curation of genetic variants from biomedical articles is required for various clinical and research purposes. Nowadays, establishment of variant databases that include overall information about variants is becoming quite popular. These databases have immense utility, serving as a user-friendly information storehouse of variants for information seekers. While manual curation is the gold standard method for curation of variants, it can turn out to be time-consuming on a large scale, thus necessitating automation. Curation of variants described in biomedical literature may not be straightforward mainly due to various nomenclature and expression issues. Though current trends in paper writing on variants are inclined toward the standard nomenclature such that variants can easily be retrieved, we have a massive store of variants in the literature that are present as non-standard names and the online search engines that are predominantly used may not be capable of finding them. For effective curation of variants, knowledge about the overall process of curation, nature and types of difficulties in curation, and ways to tackle the difficulties during the task are crucial. Only by effective curation can variants be correctly interpreted. This paper presents the process and difficulties of curation of genetic variants with possible solutions and suggestions from our work experience in the field including literature support. The paper also highlights aspects of interpretation of genetic variants and the importance of writing papers on variants following standard and retrievable methods. Copyright © 2012. Published by Elsevier Ltd.

  16. The Curation of Genetic Variants: Difficulties and Possible Solutions

    PubMed Central

    Pandey, Kapil Raj; Maden, Narendra; Poudel, Barsha; Pradhananga, Sailendra; Sharma, Amit Kumar

    2012-01-01

    The curation of genetic variants from biomedical articles is required for various clinical and research purposes. Nowadays, establishment of variant databases that include overall information about variants is becoming quite popular. These databases have immense utility, serving as a user-friendly information storehouse of variants for information seekers. While manual curation is the gold standard method for curation of variants, it can turn out to be time-consuming on a large scale, thus necessitating automation. Curation of variants described in biomedical literature may not be straightforward mainly due to various nomenclature and expression issues. Though current trends in paper writing on variants are inclined toward the standard nomenclature such that variants can easily be retrieved, we have a massive store of variants in the literature that are present as non-standard names and the online search engines that are predominantly used may not be capable of finding them. For effective curation of variants, knowledge about the overall process of curation, nature and types of difficulties in curation, and ways to tackle the difficulties during the task are crucial. Only by effective curation can variants be correctly interpreted. This paper presents the process and difficulties of curation of genetic variants with possible solutions and suggestions from our work experience in the field including literature support. The paper also highlights aspects of interpretation of genetic variants and the importance of writing papers on variants following standard and retrievable methods. PMID:23317699

  17. Novel sequence variants in the TMIE gene in families with autosomal recessive nonsyndromic hearing impairment

    PubMed Central

    Santos, Regie Lyn P.; El-Shanti, Hatem; Sikandar, Shaheen; Lee, Kwanghyuk; Bhatti, Attya; Yan, Kai; Chahrour, Maria H.; McArthur, Nathan; Pham, Thanh L.; Mahasneh, Amjad Abdullah; Ahmad, Wasim

    2010-01-01

    To date, 37 genes have been identified for nonsyndromic hearing impairment (NSHI). Identifying the functional sequence variants within these genes and knowing their population-specific frequencies is of public health value, in particular for genetic screening for NSHI. To determine putatively functional sequence variants in the transmembrane inner ear (TMIE) gene in Pakistani and Jordanian families with autosomal recessive (AR) NSHI, four Jordanian and 168 Pakistani families with ARNSHI that is not due to GJB2 (CX26) were submitted to a genome scan. Two-point and multipoint parametric linkage analyses were performed, and families with logarithmic odds (LOD) scores of 1.0 or greater within the TMIE region underwent further DNA sequencing. The evolutionary conservation and location in predicted protein domains of amino acid residues where sequence variants occurred were studied to elucidate the possible effects of these sequence variants on function. Of seven families that were screened for TMIE, putatively functional sequence variants were found to segregate with hearing impairment in four families but were not seen in at least 110 ethnically matched control chromosomes. The previously reported c.241C>T (p.R81C) variant was observed in two Pakistani families. Two novel variants, c.92A>G (p.E31G) and the splice site mutation c.212–2A>C, were identified in one Pakistani and one Jordanian family, respectively. The c.92A>G (p.E31G) variant occurred at a residue that is conserved in the mouse and is predicted to be extracellular. Conservation and potential functionality of previously published mutations were also examined. The prevalence of functional TMIE variants in Pakistani families is 1.7% [95% confidence interval (CI) 0.3–4.8]. Further studies on the spectrum, prevalence rates, and functional effect of sequence variants in the TMIE gene in other populations should demonstrate the true importance of this gene as a cause of hearing impairment. PMID:16389551

  18. Mutation analysis in 129 genes associated with other forms of retinal dystrophy in 157 families with retinitis pigmentosa based on exome sequencing.

    PubMed

    Xu, Yan; Guan, Liping; Xiao, Xueshan; Zhang, Jianguo; Li, Shiqiang; Jiang, Hui; Jia, Xiaoyun; Yang, Jianhua; Guo, Xiangming; Yin, Ye; Wang, Jun; Zhang, Qingjiong

    2015-01-01

    Mutations in 60 known genes were previously identified by exome sequencing in 79 of 157 families with retinitis pigmentosa (RP). This study analyzed variants in 129 genes associated with other forms of hereditary retinal dystrophy in the same cohort. Apart from the 73 genes previously analyzed, a further 129 genes responsible for other forms of hereditary retinal dystrophy were selected based on RetNet. Variants in the 129 genes determined by whole exome sequencing were selected and filtered by bioinformatics analysis. Candidate variants were confirmed by Sanger sequencing and validated by analysis of available family members and controls. A total of 90 candidate variants were present in the 129 genes. Sanger sequencing confirmed 83 of the 90 variants. Analysis of family members and controls excluded 76 of these 83 variants. The remaining seven variants were considered to be potential pathogenic mutations; these were c.899A>G, c.1814C>G, and c.2107C>T in BBS2; c.1073C>T and c.1669C>T in INPP5E; and c.3582C>G and c.5704-5C>G in CACNA1F. Six of these seven mutations were novel. The mutations were detected in five unrelated patients without a family history, including three patients with homozygous or compound heterozygous mutations in BBS2 and INPP5E, and two patients with hemizygous mutations in CACNA1F. None of the patients had mutations in the genes associated with autosomal dominant retinal dystrophy. Only a small portion of patients with RP, about 3% (5/157), had causative mutations in the 129 genes associated with other forms of hereditary retinal dystrophy.

  19. Genetic evidence for role of integration of fast and slow neurotransmission in schizophrenia.

    PubMed

    Devor, A; Andreassen, O A; Wang, Y; Mäki-Marttunen, T; Smeland, O B; Fan, C-C; Schork, A J; Holland, D; Thompson, W K; Witoelar, A; Chen, C-H; Desikan, R S; McEvoy, L K; Djurovic, S; Greengard, P; Svenningsson, P; Einevoll, G T; Dale, A M

    2017-06-01

    The most recent genome-wide association studies (GWAS) of schizophrenia (SCZ) identified hundreds of risk variants potentially implicated in the disease. Further, novel statistical methodology designed for polygenic architecture revealed more potential risk variants. This can provide a link between individual genetic factors and the mechanistic underpinnings of SCZ. Intriguingly, a large number of genes coding for ionotropic and metabotropic receptors for various neurotransmitters (glutamate, γ-aminobutyric acid (GABA), dopamine, serotonin, acetylcholine and opioids) and numerous ion channels were associated with SCZ. Here, we review these findings from the standpoint of classical neurobiological knowledge of neuronal synaptic transmission and regulation of electrical excitability. We show that a substantial proportion of the identified genes are involved in intracellular cascades known to integrate 'slow' (G-protein-coupled receptors) and 'fast' (ionotropic receptors) neurotransmission converging on the protein DARPP-32. Inspection of the Human Brain Transcriptome Project database confirms that these genes are indeed expressed in the brain, with the expression profile following specific developmental trajectories, underscoring their relevance to brain organization and function. These findings extend the existing pathophysiology hypothesis by suggesting a unifying role of dysregulation in neuronal excitability and synaptic integration in SCZ. This emergent model supports the concept of SCZ as an 'associative' disorder (a breakdown in the communication across different slow and fast neurotransmitter systems through intracellular signaling pathways) and may unify a number of currently competing hypotheses of SCZ pathophysiology.

  20. A clinically driven variant prioritization framework outperforms purely computational approaches for the diagnostic analysis of singleton WES data.

    PubMed

    Stark, Zornitza; Dashnow, Harriet; Lunke, Sebastian; Tan, Tiong Y; Yeung, Alison; Sadedin, Simon; Thorne, Natalie; Macciocca, Ivan; Gaff, Clara; Oshlack, Alicia; White, Susan M; James, Paul A

    2017-11-01

    Rapid identification of clinically significant variants is key to the successful application of next generation sequencing technologies in clinical practice. The Melbourne Genomics Health Alliance (MGHA) variant prioritization framework employs a gene prioritization index based on clinician-generated a priori gene lists, and a variant prioritization index (VPI) based on rarity, conservation and protein effect. We used data from 80 patients who underwent singleton whole exome sequencing (WES) to test the ability of the framework to rank causative variants highly, and compared it against the performance of other gene and variant prioritization tools. Causative variants were identified in 49 of the patients. Using the MGHA prioritization framework the average rank of the causative variant was 2.24, with 76% ranked as the top priority variant, and 90% ranked within the top five. Using clinician-generated gene lists resulted in ranking causative variants an average of 8.2 positions higher than prioritization based on variant properties alone. This clinically driven prioritization approach significantly outperformed purely computational tools, placing a greater proportion of causative variants top or in the top 5 (permutation P-value=0.001). Clinicians included 40 of the 49 WES diagnoses in their a priori list of differential diagnoses (81%). The lists generated by PhenoTips and Phenomizer contained 14 (29%) and 18 (37%) of these diagnoses respectively. These results highlight the benefits of clinically led variant prioritization in increasing the efficiency of singleton WES data analysis and have important implications for developing models for the funding and delivery of genomic services.
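
    The rank statistics quoted in this abstract (average rank, share ranked first, share within the top five) can be computed from a list of per-patient ranks. The following is a minimal sketch, not the MGHA implementation; the function name `summarize_ranks` and the toy ranks are hypothetical and are not data from the study.

```python
from statistics import mean

def summarize_ranks(ranks):
    """Summarize 1-based ranks of the causative variant, one rank per patient."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    n = len(ranks)
    return {
        "mean_rank": mean(ranks),                            # average position
        "top1_fraction": sum(r == 1 for r in ranks) / n,     # ranked first
        "top5_fraction": sum(r <= 5 for r in ranks) / n,     # ranked in top five
    }

# Hypothetical ranks for ten patients (illustrative only).
toy_ranks = [1, 1, 1, 2, 1, 7, 1, 3, 1, 12]
summary = summarize_ranks(toy_ranks)
print(summary)
```

    Comparing two prioritization methods then amounts to calling the same function on the ranks each method assigns to the same set of causative variants.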

  1. Regularized rare variant enrichment analysis for case-control exome sequencing data.

    PubMed

    Larson, Nicholas B; Schaid, Daniel J

    2014-02-01

    Rare variants have recently garnered an immense amount of attention in genetic association analysis. However, unlike methods traditionally used for single marker analysis in GWAS, rare variant analysis often requires some method of aggregation, since single marker approaches are poorly powered for typical sequencing study sample sizes. Advancements in sequencing technologies have rendered next-generation sequencing platforms a realistic alternative to traditional genotyping arrays. Exome sequencing in particular not only provides base-level resolution of genetic coding regions, but also a natural paradigm for aggregation via genes and exons. Here, we propose the use of penalized regression in combination with variant aggregation measures to identify rare variant enrichment in exome sequencing data. In contrast to marginal gene-level testing, we simultaneously evaluate the effects of rare variants in multiple genes, focusing on gene-based least absolute shrinkage and selection operator (LASSO) and exon-based sparse group LASSO models. By using gene membership as a grouping variable, the sparse group LASSO can be used as a gene-centric analysis of rare variants while also providing a penalized approach toward identifying specific regions of interest. We apply extensive simulations to evaluate the performance of these approaches with respect to specificity and sensitivity, comparing these results to multiple competing marginal testing methods. Finally, we discuss our findings and outline future research. © 2013 WILEY PERIODICALS, INC.
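
    The abstract does not state which aggregation measure is used before the penalized regression step. As an illustration only, the sketch below assumes the simplest one: a per-gene count of rare alternate alleles for each sample. The function name `gene_burden`, the 1% frequency cutoff, and the toy identifiers are all hypothetical.

```python
from collections import defaultdict

def gene_burden(genotypes, variant_gene, maf, maf_cutoff=0.01):
    """Sum rare alternate-allele counts per gene for each sample.

    genotypes:    {sample: {variant_id: alt_allele_count}}
    variant_gene: {variant_id: gene_name}
    maf:          {variant_id: minor allele frequency}
    Variants missing from `maf` are treated as rare (frequency 0.0).
    """
    burden = {}
    for sample, calls in genotypes.items():
        per_gene = defaultdict(int)
        for variant, count in calls.items():
            if maf.get(variant, 0.0) < maf_cutoff:   # keep rare variants only
                per_gene[variant_gene[variant]] += count
        burden[sample] = dict(per_gene)
    return burden

# Toy data (hypothetical): v2 is common and is therefore excluded.
genotypes = {
    "s1": {"v1": 1, "v2": 0, "v3": 2},
    "s2": {"v1": 0, "v2": 1, "v3": 0},
}
variant_gene = {"v1": "GENE_A", "v2": "GENE_A", "v3": "GENE_B"}
maf = {"v1": 0.002, "v2": 0.2, "v3": 0.005}

burden = gene_burden(genotypes, variant_gene, maf)
print(burden)
```

    The resulting per-gene scores are the kind of aggregated predictors that a LASSO or sparse group LASSO model could take as input, with gene or exon membership defining the groups.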

  2. Exome analysis of Smith-Magenis-like syndrome cohort identifies de novo likely pathogenic variants.

    PubMed

    Berger, Seth I; Ciccone, Carla; Simon, Karen L; Malicdan, May Christine; Vilboux, Thierry; Billington, Charles; Fischer, Roxanne; Introne, Wendy J; Gropman, Andrea; Blancato, Jan K; Mullikin, James C; Gahl, William A; Huizing, Marjan; Smith, Ann C M

    2017-04-01

    Smith-Magenis syndrome (SMS), a neurodevelopmental disorder characterized by dysmorphic features, intellectual disability (ID), and sleep disturbances, results from a 17p11.2 microdeletion or a mutation in the RAI1 gene. We performed exome sequencing on 6 patients with SMS-like phenotypes but without chromosomal abnormalities or RAI1 variants. We identified pathogenic de novo variants in two cases, a nonsense variant in IQSEC2 and a missense variant in the SAND domain of DEAF1, and candidate de novo missense variants in an additional two cases. One candidate variant was located in an alpha helix of Necdin (NDN), phased to the paternally inherited allele. NDN is maternally imprinted within the 15q11.2 Prader-Willi Syndrome (PWS) region. This can help clarify NDN's role in the PWS phenotype. No definitive pathogenic gene variants were detected in the remaining SMS-like cases, but we report our findings for future comparison. This study provides information about the inheritance pattern and recurrence risk for patients with identified variants and demonstrates clinical and genetic overlap of neurodevelopmental disorders. Identification and characterization of ID-related genes that assist in development of common developmental pathways and/or gene-networks may inform disease mechanism and treatment strategies.

  3. Mitochondrial targeting sequence variants of the CHCHD2 gene are a risk for Lewy body disorders

    PubMed Central

    Ogaki, Kotaro; Koga, Shunsuke; Heckman, Michael G.; Fiesel, Fabienne C.; Ando, Maya; Labbé, Catherine; Lorenzo-Betancor, Oswaldo; Moussaud-Lamodière, Elisabeth L.; Soto-Ortolaza, Alexandra I.; Walton, Ronald L.; Strongosky, Audrey J.; Uitti, Ryan J.; McCarthy, Allan; Lynch, Timothy; Siuda, Joanna; Opala, Grzegorz; Rudzinska, Monika; Krygowska-Wajs, Anna; Barcikowska, Maria; Czyzewski, Krzysztof; Puschmann, Andreas; Nishioka, Kenya; Funayama, Manabu; Hattori, Nobutaka; Parisi, Joseph E.; Petersen, Ronald C.; Graff-Radford, Neill R.; Boeve, Bradley F.; Springer, Wolfdieter; Wszolek, Zbigniew K.; Dickson, Dennis W.

    2015-01-01

    Objective: To assess the role of CHCHD2 variants in patients with Parkinson disease (PD) and Lewy body disease (LBD) in Caucasian populations. Methods: All exons of the CHCHD2 gene were sequenced in a US Caucasian patient-control series (878 PD, 610 LBD, and 717 controls). Subsequently, exons 1 and 2 were sequenced in an Irish series (355 PD and 365 controls) and a Polish series (394 PD and 350 controls). Immunohistochemistry and immunofluorescence studies were performed on pathologic LBD cases with rare CHCHD2 variants. Results: We identified 9 rare exonic variants of unknown significance. These variants were more frequent in the combined group of PD and LBD patients compared to controls (0.6% vs 0.1%, p = 0.013). In addition, the presence of any rare variant was more common in patients with LBD (2.5% vs 1.0%, p = 0.050) compared to controls. Eight of these 9 variants were located within the gene's mitochondrial targeting sequence. Conclusions: Although the role of variants of the CHCHD2 gene in PD and LBD remains to be further elucidated, the rare variants in the mitochondrial targeting sequence may be a risk factor for Lewy body disorders, which may link CHCHD2 to other genetic forms of parkinsonism with mitochondrial dysfunction. PMID:26561290

  4. Novel Lethal Form of Congenital Hypopituitarism Associated With the First Recessive LHX4 Mutation

    PubMed Central

    Gregory, L. C.; Humayun, K. N.; Turton, J. P. G.; McCabe, M. J.; Rhodes, S. J.

    2015-01-01

    Background: LHX4 encodes a member of the LIM-homeodomain family of transcription factors that is required for normal development of the pituitary gland. To date, only incompletely penetrant heterozygous mutations in LHX4 have been described in patients with variable combined pituitary hormone deficiencies. Objective/Hypothesis: To report a unique family with a novel recessive variant in LHX4 associated with a lethal form of congenital hypopituitarism that was identified through screening a total of 97 patients. Method: We screened 97 unrelated patients with combined pituitary hormone deficiency, including 65% with an ectopic posterior pituitary, for variants in the LHX4 gene using Sanger sequencing. Control databases (1000 Genomes, dbSNP, Exome Variant Server, ExAC Browser) were consulted upon identification of variants. Results: We identified the first novel homozygous missense variant (c.377C>T, p.T126M) in two deceased male patients of Pakistani origin with severe panhypopituitarism associated with anterior pituitary aplasia and posterior pituitary ectopia. Both were born small for gestational age with a small phallus, undescended testes, and mid-facial hypoplasia. The parents' first-born child was a female with mid-facial hypoplasia (DNA was unavailable). Despite rapid commencement of hydrocortisone and T4 in the brothers, all three children died within the first week of life. The LHX4(p.T126M) variant is located within the LIM2 domain, in a highly conserved location. The absence of homozygosity for the variant in over 65 000 controls suggests that it is likely to be responsible for the phenotype. Conclusion: We report, for the first time to our knowledge, a novel homozygous mutation in LHX4 associated with a lethal phenotype, implying that recessive mutations in LHX4 may be incompatible with life. PMID:25871839

  5. Mapping genetic variations to three-dimensional protein structures to enhance variant interpretation: a proposed framework.

    PubMed

    Glusman, Gustavo; Rose, Peter W; Prlić, Andreas; Dougherty, Jennifer; Duarte, José M; Hoffman, Andrew S; Barton, Geoffrey J; Bendixen, Emøke; Bergquist, Timothy; Bock, Christian; Brunk, Elizabeth; Buljan, Marija; Burley, Stephen K; Cai, Binghuang; Carter, Hannah; Gao, JianJiong; Godzik, Adam; Heuer, Michael; Hicks, Michael; Hrabe, Thomas; Karchin, Rachel; Leman, Julia Koehler; Lane, Lydie; Masica, David L; Mooney, Sean D; Moult, John; Omenn, Gilbert S; Pearl, Frances; Pejaver, Vikas; Reynolds, Sheila M; Rokem, Ariel; Schwede, Torsten; Song, Sicheng; Tilgner, Hagen; Valasatava, Yana; Zhang, Yang; Deutsch, Eric W

    2017-12-18

    The translation of personal genomics to precision medicine depends on the accurate interpretation of the multitude of genetic variants observed for each individual. However, even when genetic variants are predicted to modify a protein, their functional implications may be unclear. Many diseases are caused by genetic variants affecting important protein features, such as enzyme active sites or interaction interfaces. The scientific community has catalogued millions of genetic variants in genomic databases and thousands of protein structures in the Protein Data Bank. Mapping mutations onto three-dimensional (3D) structures enables atomic-level analyses of protein positions that may be important for the stability or formation of interactions; these may explain the effect of mutations and in some cases even open a path for targeted drug development. To accelerate progress in the integration of these data types, we held a two-day Gene Variation to 3D (GVto3D) workshop to report on the latest advances and to discuss unmet needs. The overarching goal of the workshop was to address the question: what can be done together as a community to advance the integration of genetic variants and 3D protein structures that could not be done by a single investigator or laboratory? Here we describe the workshop outcomes, review the state of the field, and propose the development of a framework with which to promote progress in this arena. The framework will include a set of standard formats, common ontologies, a common application programming interface to enable interoperation of the resources, and a Tool Registry to make it easy to find and apply the tools to specific analysis problems. Interoperability will enable integration of diverse data sources and tools and collaborative development of variant effect prediction methods.

  6. Sodium taurocholate cotransporting polypeptide (NTCP) deficiency: Identification of a novel SLC10A1 mutation in two unrelated infants presenting with neonatal indirect hyperbilirubinemia and remarkable hypercholanemia

    PubMed Central

    Qiu, Jian-Wu; Deng, Mei; Cheng, Ying; Atif, Raza-Muhammad; Lin, Wei-Xia; Guo, Li; Li, Hua; Song, Yuan-Zong

    2017-01-01

    Sodium taurocholate cotransporting polypeptide (NTCP) is encoded by the gene SLC10A1 and expressed in the basolateral membrane of the hepatocyte, functioning to uptake bile acids from plasma. Although SLC10A1 has been cloned and NTCP function studied intensively for years, clinical description of NTCP deficiency remains rather limited. This study reported the genotypic and phenotypic features of two neonatal patients with NTCP deficiency. They both presented with neonatal indirect hyperbilirubinemia and remarkable hypercholanemia, and harbored the SLC10A1 variants c.800C>T (p.S267F) and c.263T>C (p.I88T). On genetic analysis of the two family trios, the latter missense variant was detected in trans with the former, a reported loss-of-function variant. Having not been reported in any databases, the c.263T>C (p.I88T) variant demonstrated an allele frequency of 0.67% (1/150) in healthy controls. Moreover, this variant involved a relatively conservative amino acid, and was predicted to be pathogenic or deleterious by changing the conformation of the NTCP molecule. In conclusion, the novel variant c.263T>C (p.I88T) in this study enriched the SLC10A1 mutation spectrum; the clinical findings lent support to the primary role of NTCP in hepatic bile acid clearance, and suggested that NTCP deficiency might be a contributing factor for the development of neonatal indirect hyperbilirubinemia. PMID:29290974
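
    The control allele frequency quoted in this abstract, 0.67% (1/150), is a simple ratio. The sketch below reproduces the arithmetic under the assumption that the denominator of 150 counts alleles rather than individuals; the abstract does not say which, and the function name `allele_frequency` is hypothetical.

```python
def allele_frequency(alt_alleles, total_alleles):
    """Return the alternate-allele frequency as a fraction of all alleles counted."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    if not 0 <= alt_alleles <= total_alleles:
        raise ValueError("alt_alleles must lie between 0 and total_alleles")
    return alt_alleles / total_alleles

# Figures from the abstract: 1 variant allele out of 150 (assumed to be alleles).
freq = allele_frequency(1, 150)
print(f"{freq:.2%}")
```

    If the 150 instead referred to diploid individuals with one heterozygous carrier, the denominator would be 300 alleles and the frequency would halve.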

  7. Sodium taurocholate cotransporting polypeptide (NTCP) deficiency: Identification of a novel SLC10A1 mutation in two unrelated infants presenting with neonatal indirect hyperbilirubinemia and remarkable hypercholanemia.

    PubMed

    Qiu, Jian-Wu; Deng, Mei; Cheng, Ying; Atif, Raza-Muhammad; Lin, Wei-Xia; Guo, Li; Li, Hua; Song, Yuan-Zong

    2017-12-05

    Sodium taurocholate cotransporting polypeptide (NTCP) is encoded by the gene SLC10A1 and expressed in the basolateral membrane of the hepatocyte, functioning to uptake bile acids from plasma. Although SLC10A1 has been cloned and NTCP function studied intensively for years, clinical description of NTCP deficiency remains rather limited. This study reported the genotypic and phenotypic features of two neonatal patients with NTCP deficiency. They both presented with neonatal indirect hyperbilirubinemia and remarkable hypercholanemia, and harbored the SLC10A1 variants c.800C>T (p.S267F) and c.263T>C (p.I88T). On genetic analysis of the two family trios, the latter missense variant was detected in trans with the former, a reported loss-of-function variant. Having not been reported in any databases, the c.263T>C (p.I88T) variant demonstrated an allele frequency of 0.67% (1/150) in healthy controls. Moreover, this variant involved a relatively conservative amino acid, and was predicted to be pathogenic or deleterious by changing the conformation of the NTCP molecule. In conclusion, the novel variant c.263T>C (p.I88T) in this study enriched the SLC10A1 mutation spectrum; the clinical findings lent support to the primary role of NTCP in hepatic bile acid clearance, and suggested that NTCP deficiency might be a contributing factor for the development of neonatal indirect hyperbilirubinemia.

  8. Small intragenic deletion in FOXP2 associated with childhood apraxia of speech and dysarthria.

    PubMed

    Turner, Samantha J; Hildebrand, Michael S; Block, Susan; Damiano, John; Fahey, Michael; Reilly, Sheena; Bahlo, Melanie; Scheffer, Ingrid E; Morgan, Angela T

    2013-09-01

    Relatively little is known about the neurobiological basis of speech disorders although genetic determinants are increasingly recognized. The first gene for primary speech disorder was FOXP2, identified in a large, informative family with verbal and oral dyspraxia. Subsequently, many de novo and familial cases with a severe speech disorder associated with FOXP2 mutations have been reported. These mutations include sequencing alterations, translocations, uniparental disomy, and genomic copy number variants. We studied eight probands with speech disorder and their families. Family members were phenotyped using a comprehensive assessment of speech, oral motor function, language, literacy skills, and cognition. Coding regions of FOXP2 were screened to identify novel variants. Segregation of the variant was determined in the probands' families. Variants were identified in two probands. One child with severe motor speech disorder had a small de novo intragenic FOXP2 deletion. His phenotype included features of childhood apraxia of speech and dysarthria, oral motor dyspraxia, receptive and expressive language disorder, and literacy difficulties. The other variant was found in a family in two of three family members with stuttering, and also in the mother with oral motor impairment. This variant was considered a benign polymorphism as it was predicted to be non-pathogenic with in silico tools and found in database controls. This is the first report of a small intragenic deletion of FOXP2 that is likely to be the cause of severe motor speech disorder associated with language and literacy problems. Copyright © 2013 Wiley Periodicals, Inc.

  9. Protein-altering variants associated with body mass index implicate pathways that control energy intake and expenditure underpinning obesity

    PubMed Central

    Turcot, Valérie; Lu, Yingchang; Highland, Heather M; Schurmann, Claudia; Justice, Anne E; Fine, Rebecca S; Bradfield, Jonathan P; Esko, Tõnu; Giri, Ayush; Graff, Mariaelisa; Guo, Xiuqing; Hendricks, Audrey E; Karaderi, Tugce; Lempradl, Adelheid; Locke, Adam E; Mahajan, Anubha; Marouli, Eirini; Sivapalaratnam, Suthesh; Young, Kristin L; Alfred, Tamuno; Feitosa, Mary F; Masca, Nicholas GD; Manning, Alisa K; Medina-Gomez, Carolina; Mudgal, Poorva; Ng, Maggie CY; Reiner, Alex P; Vedantam, Sailaja; Willems, Sara M; Winkler, Thomas W; Abecasis, Goncalo; Aben, Katja K; Alam, Dewan S; Alharthi, Sameer E; Allison, Matthew; Amouyel, Philippe; Asselbergs, Folkert W; Auer, Paul L; Balkau, Beverley; Bang, Lia E; Barroso, Inês; Bastarache, Lisa; Benn, Marianne; Bergmann, Sven; Bielak, Lawrence F; Blüher, Matthias; Boehnke, Michael; Boeing, Heiner; Boerwinkle, Eric; Böger, Carsten A; Bork-Jensen, Jette; Bots, Michiel L; Bottinger, Erwin P; Bowden, Donald W; Brandslund, Ivan; Breen, Gerome; Brilliant, Murray H; Broer, Linda; Brumat, Marco; Burt, Amber A; Butterworth, Adam S; Campbell, Peter T; Cappellani, Stefania; Carey, David J; Catamo, Eulalia; Caulfield, Mark J; Chambers, John C; Chasman, Daniel I; Chen, Yii-Der Ida; Chowdhury, Rajiv; Christensen, Cramer; Chu, Audrey Y; Cocca, Massimiliano; Collins, Francis S; Cook, James P; Corley, Janie; Galbany, Jordi Corominas; Cox, Amanda J; Crosslin, David S; Cuellar-Partida, Gabriel; D'Eustacchio, Angela; Danesh, John; Davies, Gail; de Bakker, Paul IW; de Groot, Mark CH; de Mutsert, Renée; Deary, Ian J; Dedoussis, George; Demerath, Ellen W; den Heijer, Martin; den Hollander, Anneke I; den Ruijter, Hester M; Dennis, Joe G; Denny, Josh C; Di Angelantonio, Emanuele; Drenos, Fotios; Du, Mengmeng; Dubé, Marie-Pierre; Dunning, Alison M; Easton, Douglas F; Edwards, Todd L; Ellinghaus, David; Ellinor, Patrick T; Elliott, Paul; Evangelou, Evangelos; Farmaki, Aliki-Eleni; Farooqi, I. 
Sadaf; Faul, Jessica D; Fauser, Sascha; Feng, Shuang; Ferrannini, Ele; Ferrieres, Jean; Florez, Jose C; Ford, Ian; Fornage, Myriam; Franco, Oscar H; Franke, Andre; Franks, Paul W; Friedrich, Nele; Frikke-Schmidt, Ruth; Galesloot, Tessel E.; Gan, Wei; Gandin, Ilaria; Gasparini, Paolo; Gibson, Jane; Giedraitis, Vilmantas; Gjesing, Anette P; Gordon-Larsen, Penny; Gorski, Mathias; Grabe, Hans-Jörgen; Grant, Struan FA; Grarup, Niels; Griffiths, Helen L; Grove, Megan L; Gudnason, Vilmundur; Gustafsson, Stefan; Haessler, Jeff; Hakonarson, Hakon; Hammerschlag, Anke R; Hansen, Torben; Harris, Kathleen Mullan; Harris, Tamara B; Hattersley, Andrew T; Have, Christian T; Hayward, Caroline; He, Liang; Heard-Costa, Nancy L; Heath, Andrew C; Heid, Iris M; Helgeland, Øyvind; Hernesniemi, Jussi; Hewitt, Alex W; Holmen, Oddgeir L; Hovingh, G Kees; Howson, Joanna MM; Hu, Yao; Huang, Paul L; Huffman, Jennifer E; Ikram, M Arfan; Ingelsson, Erik; Jackson, Anne U; Jansson, Jan-Håkan; Jarvik, Gail P; Jensen, Gorm B; Jia, Yucheng; Johansson, Stefan; Jørgensen, Marit E; Jørgensen, Torben; Jukema, J Wouter; Kahali, Bratati; Kahn, René S; Kähönen, Mika; Kamstrup, Pia R; Kanoni, Stavroula; Kaprio, Jaakko; Karaleftheri, Maria; Kardia, Sharon LR; Karpe, Fredrik; Kathiresan, Sekar; Kee, Frank; Kiemeney, Lambertus A; Kim, Eric; Kitajima, Hidetoshi; Komulainen, Pirjo; Kooner, Jaspal S; Kooperberg, Charles; Korhonen, Tellervo; Kovacs, Peter; Kuivaniemi, Helena; Kutalik, Zoltán; Kuulasmaa, Kari; Kuusisto, Johanna; Laakso, Markku; Lakka, Timo A; Lamparter, David; Lange, Ethan M; Lange, Leslie A; Langenberg, Claudia; Larson, Eric B; Lee, Nanette R; Lehtimäki, Terho; Lewis, Cora E; Li, Huaixing; Li, Jin; Li-Gao, Ruifang; Lin, Honghuang; Lin, Keng-Hung; Lin, Li-An; Lin, Xu; Lind, Lars; Lindström, Jaana; Linneberg, Allan; Liu, Ching-Ti; Liu, Dajiang J; Liu, Yongmei; Lo, Ken Sin; Lophatananon, Artitaya; Lotery, Andrew J; Loukola, Anu; Luan, Jian'an; Lubitz, Steven A; Lyytikäinen, Leo-Pekka; Männistö, Satu; 
Marenne, Gaëlle; Mazul, Angela L; McCarthy, Mark I; McKean-Cowdin, Roberta; Medland, Sarah E; Meidtner, Karina; Milani, Lili; Mistry, Vanisha; Mitchell, Paul; Mohlke, Karen L; Moilanen, Leena; Moitry, Marie; Montgomery, Grant W; Mook-Kanamori, Dennis O; Moore, Carmel; Mori, Trevor A; Morris, Andrew D; Morris, Andrew P; Müller-Nurasyid, Martina; Munroe, Patricia B; Nalls, Mike A; Narisu, Narisu; Nelson, Christopher P; Neville, Matt; Nielsen, Sune F; Nikus, Kjell; Njølstad, Pål R; Nordestgaard, Børge G; Nyholt, Dale R; O'Connel, Jeffrey R; O’Donoghue, Michelle L.; Olde Loohuis, Loes M; Ophoff, Roel A; Owen, Katharine R; Packard, Chris J; Padmanabhan, Sandosh; Palmer, Colin NA; Palmer, Nicholette D; Pasterkamp, Gerard; Patel, Aniruddh P; Pattie, Alison; Pedersen, Oluf; Peissig, Peggy L; Peloso, Gina M; Pennell, Craig E; Perola, Markus; Perry, James A; Perry, John RB; Pers, Tune H; Person, Thomas N; Peters, Annette; Petersen, Eva RB; Peyser, Patricia A; Pirie, Ailith; Polasek, Ozren; Polderman, Tinca J; Puolijoki, Hannu; Raitakari, Olli T; Rasheed, Asif; Rauramaa, Rainer; Reilly, Dermot F; Renström, Frida; Rheinberger, Myriam; Ridker, Paul M; Rioux, John D; Rivas, Manuel A; Roberts, David J; Robertson, Neil R; Robino, Antonietta; Rolandsson, Olov; Rudan, Igor; Ruth, Katherine S; Saleheen, Danish; Salomaa, Veikko; Samani, Nilesh J; Sapkota, Yadav; Sattar, Naveed; Schoen, Robert E; Schreiner, Pamela J; Schulze, Matthias B; Scott, Robert A; Segura-Lepe, Marcelo P; Shah, Svati H; Sheu, Wayne H-H; Sim, Xueling; Slater, Andrew J; Small, Kerrin S; Smith, Albert Vernon; Southam, Lorraine; Spector, Timothy D; Speliotes, Elizabeth K; Starr, John M; Stefansson, Kari; Steinthorsdottir, Valgerdur; Stirrups, Kathleen E; Strauch, Konstantin; Stringham, Heather M; Stumvoll, Michael; Sun, Liang; Surendran, Praveen; Swift, Amy J; Tada, Hayato; Tansey, Katherine E; Tardif, Jean-Claude; Taylor, Kent D; Teumer, Alexander; Thompson, Deborah J; Thorleifsson, Gudmar; Thorsteinsdottir, Unnur; 
Thuesen, Betina H; Tönjes, Anke; Tromp, Gerard; Trompet, Stella; Tsafantakis, Emmanouil; Tuomilehto, Jaakko; Tybjaerg-Hansen, Anne; Tyrer, Jonathan P; Uher, Rudolf; Uitterlinden, André G; Uusitupa, Matti; van der Laan, Sander W; van Duijn, Cornelia M; van Leeuwen, Nienke; van Setten, Jessica; Vanhala, Mauno; Varbo, Anette; Varga, Tibor V; Varma, Rohit; Velez Edwards, Digna R; Vermeulen, Sita H; Veronesi, Giovanni; Vestergaard, Henrik; Vitart, Veronique; Vogt, Thomas F; Völker, Uwe; Vuckovic, Dragana; Wagenknecht, Lynne E; Walker, Mark; Wallentin, Lars; Wang, Feijie; Wang, Carol A; Wang, Shuai; Wang, Yiqin; Ware, Erin B; Wareham, Nicholas J; Warren, Helen R; Waterworth, Dawn M; Wessel, Jennifer; White, Harvey D; Willer, Cristen J; Wilson, James G; Witte, Daniel R; Wood, Andrew R; Wu, Ying; Yaghootkar, Hanieh; Yao, Jie; Yao, Pang; Yerges-Armstrong, Laura M; Young, Robin; Zeggini, Eleftheria; Zhan, Xiaowei; Zhang, Weihua; Zhao, Jing Hua; Zhao, Wei; Zhao, Wei; Zhou, Wei; Zondervan, Krina T; Rotter, Jerome I; Pospisilik, John A; Rivadeneira, Fernando; Borecki, Ingrid B; Deloukas, Panos; Frayling, Timothy M; Lettre, Guillaume; North, Kari E; Lindgren, Cecilia M; Hirschhorn, Joel N; Loos, Ruth JF

    2018-01-01

    Genome-wide association studies (GWAS) have identified >250 loci for body mass index (BMI), implicating pathways related to neuronal biology. Most GWAS loci represent clusters of common, non-coding variants from which pinpointing causal genes remains challenging. Here, we combined data from 718,734 individuals to discover rare and low-frequency (MAF<5%) coding variants associated with BMI. We identified 14 coding variants in 13 genes, of which eight were in genes (ZBTB7B, ACHE, RAPGEF3, RAB21, ZFHX3, ENTPD6, ZFR2, ZNF169) newly implicated in human obesity, two were in genes (MC4R, KSR2) previously observed in extreme obesity, and two were in GIPR. Effect sizes of rare variants are ~10 times larger than those of common variants, with the largest effect observed in carriers of an MC4R stop-codon (p.Tyr35Ter, MAF=0.01%), weighing ~7kg more than non-carriers. Pathway analyses confirmed enrichment of neuronal genes and provide new evidence for adipocyte and energy expenditure biology, widening the potential of genetically-supported therapeutic targets to treat obesity. PMID:29273807

  10. Multi-species sequence comparison reveals conservation of ghrelin gene-derived splice variants encoding a truncated ghrelin peptide.

    PubMed

    Seim, Inge; Jeffery, Penny L; Thomas, Patrick B; Walpole, Carina M; Maugham, Michelle; Fung, Jenny N T; Yap, Pei-Yi; O'Keeffe, Angela J; Lai, John; Whiteside, Eliza J; Herington, Adrian C; Chopin, Lisa K

    2016-06-01

    The peptide hormone ghrelin is a potent orexigen produced predominantly in the stomach. It has a number of other biological actions, including roles in appetite stimulation, energy balance, the stimulation of growth hormone release and the regulation of cell proliferation. Recently, several ghrelin gene splice variants have been described. Here, we attempted to identify conserved alternative splicing of the ghrelin gene by cross-species sequence comparisons. We identified a novel human exon 2-deleted variant and provide preliminary evidence that this splice variant and in1-ghrelin encode a C-terminally truncated form of the ghrelin peptide, termed minighrelin. These variants are expressed in humans and mice, demonstrating conservation of alternative splicing spanning 90 million years. Minighrelin appears to have similar actions to full-length ghrelin, as treatment with exogenous minighrelin peptide stimulates appetite and feeding in mice. Forced expression of the exon 2-deleted preproghrelin variant mirrors the effect of the canonical preproghrelin, stimulating cell proliferation and migration in the PC3 prostate cancer cell line. This is the first study to characterise an exon 2-deleted preproghrelin variant and to demonstrate sequence conservation of ghrelin gene-derived splice variants that encode a truncated ghrelin peptide. This adds further impetus for studies into the alternative splicing of the ghrelin gene and the function of novel ghrelin peptides in vertebrates.

  11. Anaplasma phagocytophilum in questing Ixodes ricinus ticks: comparison of prevalences and partial 16S rRNA gene variants in urban, pasture, and natural habitats.

    PubMed

    Overzier, Evelyn; Pfister, Kurt; Thiel, Claudia; Herb, Ingrid; Mahling, Monia; Silaghi, Cornelia

    2013-03-01

    Urban, natural, and pasture areas were investigated for prevalences and 16S rRNA gene variants of Anaplasma phagocytophilum in questing Ixodes ricinus ticks. The prevalences differed significantly between habitat types, and year-to-year variations in prevalence and habitat-dependent occurrence of 16S rRNA gene variants were detected.

  12. Next-generation sequencing using a pre-designed gene panel for the molecular diagnosis of congenital disorders in pediatric patients.

    PubMed

    Lim, Eileen C P; Brett, Maggie; Lai, Angeline H M; Lee, Siew-Peng; Tan, Ee-Shien; Jamuar, Saumya S; Ng, Ivy S L; Tan, Ene-Choo

    2015-12-14

    Next-generation sequencing (NGS) has revolutionized genetic research and offers enormous potential for clinical application. Sequencing the exome has the advantage of casting the net wide for all known coding regions while targeted gene panel sequencing provides enhanced sequencing depths and can be designed to avoid incidental findings in adult-onset conditions. A HaloPlex panel consisting of 180 genes within commonly altered chromosomal regions is available for use on both the Ion Personal Genome Machine (PGM) and MiSeq platforms to screen for causative mutations in these genes. We used this HaloPlex ICCG panel for targeted sequencing of 15 patients with clinical presentations indicative of an abnormality in one of the 180 genes. Sequencing runs were done using the Ion 318 Chips on the Ion Torrent PGM. Variants were filtered for known polymorphisms and analysis was done to identify possible disease-causing variants before validation by Sanger sequencing. When possible, segregation of variants with phenotype in family members was performed to ascertain the pathogenicity of the variant. More than 97% of the target bases were covered at >20×. There was an average of 9.6 novel variants per patient. Pathogenic mutations were identified in five genes for six patients, with two novel variants. There were another five likely pathogenic variants, some of which were unreported novel variants. In a cohort of 15 patients, we were able to identify a likely genetic etiology in six patients (40%). Another five patients had candidate variants for which further evaluation and segregation analysis are ongoing. Our results indicate that the HaloPlex ICCG panel is useful as a rapid, high-throughput and cost-effective screening tool for 170 of the 180 genes. There is low coverage for some regions in several genes which might have to be supplemented by Sanger sequencing. However, comparing the cost, ease of analysis, and shorter turnaround time, it is a good alternative to exome sequencing for patients whose features are suggestive of a genetic etiology involving one of the genes in the panel.
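
    A coverage figure such as "more than 97% of the target bases were covered at >20×" is the fraction of target positions whose read depth exceeds a threshold. The following is a minimal sketch of that calculation, assuming per-base depths are already available as a list of integers; the function name `fraction_covered` and the toy depths are hypothetical and do not come from the study.

```python
def fraction_covered(depths, min_depth=20):
    """Fraction of target bases with read depth strictly above min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(d > min_depth for d in depths) / len(depths)

# Hypothetical per-base depths for ten target positions.
toy_depths = [35, 50, 12, 21, 20, 100, 64, 48, 33, 27]
covered = fraction_covered(toy_depths)
print(f"{covered:.0%} of target bases covered at >20x")
```

    Positions that fall at or below the threshold are the candidates for supplementary Sanger sequencing mentioned in the abstract.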

  13. Interactions among variants in TXA2R, P2Y12 and GPIIIa are associated with carotid plaque vulnerability in Chinese population.

    PubMed

    Yi, Xingyang; Lin, Jing; Luo, Hua; Zhou, Ju; Zhou, Qiang; Wang, Yanfen; Wang, Chun

    2018-04-03

    The associations between variants in platelet activation-relevant genes and carotid plaque vulnerability are not fully understood. The aim of the present study was to investigate the associations of the variants in platelet activation-relevant genes and interactions among these variants with carotid plaque vulnerability. There were no significant differences in the frequencies of genotypes of the 11 variants between patients and controls. Among 396 patients, 102 patients had no carotid plaque, 106 had VP, and 188 had SP. The 11 variants were not independently associated with risk of carotid plaque vulnerability after adjusting for potential confounding variables. However, the GMDR analysis showed that there were synergistic effects of gene-gene interactions among TXA2R rs1131882, GPIIIa rs2317676 and P2Y12 rs16863323 on carotid plaque vulnerability. The high-risk interactions among the three variants were associated with high platelet activation, and independently associated with the risk of carotid plaque vulnerability. Eleven variants in platelet activation-relevant genes were examined using mass spectrometry methods in 396 ischemic stroke patients and 291 controls. Platelet-leukocyte aggregates and platelet aggregation were also measured. Carotid plaques were assessed by B-mode ultrasound. According to the results of ultrasound, the patients were stratified into three groups: non-plaque group, vulnerable plaque (VP) group and stable plaque (SP) group. Furthermore, gene-gene interactions were analyzed using generalized multifactor dimensionality reduction (GMDR) methods. The rs1131882, rs2317676, and rs16863323 three-loci interactions may confer a higher risk of carotid plaque vulnerability, and might be potential markers for plaque instability.

  14. Comprehensive analysis of the mutation spectrum in 301 German ALS families.

    PubMed

    Müller, Kathrin; Brenner, David; Weydt, Patrick; Meyer, Thomas; Grehl, Torsten; Petri, Susanne; Grosskreutz, Julian; Schuster, Joachim; Volk, Alexander E; Borck, Guntram; Kubisch, Christian; Klopstock, Thomas; Zeller, Daniel; Jablonka, Sibylle; Sendtner, Michael; Klebe, Stephan; Knehr, Antje; Günther, Kornelia; Weis, Joachim; Claeys, Kristl G; Schrank, Berthold; Sperfeld, Anne-Dorte; Hübers, Annemarie; Otto, Markus; Dorst, Johannes; Meitinger, Thomas; Strom, Tim M; Andersen, Peter M; Ludolph, Albert C; Weishaupt, Jochen H

    2018-04-12

    Recent advances in amyotrophic lateral sclerosis (ALS) genetics have revealed that mutations in any of more than 25 genes can cause ALS, mostly as an autosomal-dominant Mendelian trait. Detailed knowledge about the genetic architecture of ALS in a specific population will be important for genetic counselling but also for genotype-specific therapeutic interventions. Here we combined fragment length analysis, repeat-primed PCR, Southern blotting, Sanger sequencing and whole exome sequencing to obtain a comprehensive profile of genetic variants in ALS disease genes in 301 German pedigrees with familial ALS. We report C9orf72 mutations as well as variants in consensus splice sites and non-synonymous variants in protein-coding regions of ALS genes. We furthermore estimate their pathogenicity by taking into account type and frequency of the respective variant as well as segregation within the families. 49% of our German ALS families carried a likely pathogenic variant in at least one of the earlier identified ALS genes. In 45% of the ALS families, likely pathogenic variants were detected in C9orf72, SOD1, FUS, TARDBP or TBK1, whereas the relative contribution of the other ALS genes in this familial ALS cohort was 4%. We identified several previously unreported rare variants and demonstrated the absence of likely pathogenic variants in some of the recently described ALS disease genes. We here present a comprehensive genetic characterisation of German familial ALS. The present findings are of importance for genetic counselling in clinical practice, for molecular research and for the design of diagnostic gene panels or genotype-specific therapeutic interventions in Europe. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  15. Multi-gene panel testing in Korean patients with common genetic generalized epilepsy syndromes.

    PubMed

    Lee, Cha Gon; Lee, Jeehun; Lee, Munhyang

    2018-01-01

    Genetic heterogeneity of common genetic generalized epilepsy syndromes is frequently considered. The present study conducted a focused analysis of potential candidate or susceptibility genes for common genetic generalized epilepsy syndromes using multi-gene panel testing with next-generation sequencing. This study included patients with juvenile myoclonic epilepsy, juvenile absence epilepsy, and epilepsy with generalized tonic-clonic seizures alone. We identified pathogenic variants according to the American College of Medical Genetics and Genomics guidelines and identified susceptibility variants using case-control association analyses and family analyses for familial cases. A total of 57 patients were enrolled, including 51 sporadic cases and 6 familial cases. Twenty-two pathogenic and likely pathogenic variants of 16 different genes were identified. CACNA1H was the most frequently observed single gene. Variants of voltage-gated Ca2+ channel genes, including CACNA1A, CACNA1G, and CACNA1H were observed in 32% of variants (n = 7/22). Analyses to identify susceptibility variants using case-control association analysis indicated that KCNMA1 c.400G>C was associated with common genetic generalized epilepsy syndromes. Only 1 family (family A) exhibited a candidate pathogenic variant p.(Arg788His) on CACNA1H, as determined via family analyses. This study identified candidate genetic variants in about a quarter of patients (n = 16/57) and an average of 2.8 variants was identified in each patient. The results reinforced the polygenic disorder with very high locus and allelic heterogeneity of common GGE syndromes. Further, voltage-gated Ca2+ channels are suggested as important contributors to common genetic generalized epilepsy syndromes. This study extends our comprehensive understanding of common genetic generalized epilepsy syndromes.

  16. Uptake, Results, and Outcomes of Germline Multiple-Gene Sequencing After Diagnosis of Breast Cancer.

    PubMed

    Kurian, Allison W; Ward, Kevin C; Hamilton, Ann S; Deapen, Dennis M; Abrahamse, Paul; Bondarenko, Irina; Li, Yun; Hawley, Sarah T; Morrow, Monica; Jagsi, Reshma; Katz, Steven J

    2018-05-10

    Low-cost sequencing of multiple genes is increasingly available for cancer risk assessment. Little is known about uptake or outcomes of multiple-gene sequencing after breast cancer diagnosis in community practice. To examine the effect of multiple-gene sequencing on the experience and treatment outcomes for patients with breast cancer. For this population-based retrospective cohort study, patients with breast cancer diagnosed from January 2013 to December 2015 and accrued from SEER registries across Georgia and in Los Angeles, California, were surveyed (n = 5080, response rate = 70%). Responses were merged with SEER data and results of clinical genetic tests, either BRCA1 and BRCA2 (BRCA1/2) sequencing only or including additional other genes (multiple-gene sequencing), provided by 4 laboratories. Type of testing (multiple-gene sequencing vs BRCA1/2-only sequencing), test results (negative, variant of unknown significance, or pathogenic variant), patient experiences with testing (timing of testing, who discussed results), and treatment (strength of patient consideration of, and surgeon recommendation for, prophylactic mastectomy), and prophylactic mastectomy receipt. We defined a patient subgroup with higher pretest risk of carrying a pathogenic variant according to practice guidelines. Among 5026 patients (mean [SD] age, 59.9 [10.7]), 1316 (26.2%) were linked to genetic results from any laboratory. Multiple-gene sequencing increasingly replaced BRCA1/2-only testing over time: in 2013, the rate of multiple-gene sequencing was 25.6% and BRCA1/2-only testing, 74.4%; in 2015 the rate of multiple-gene sequencing was 66.5% and BRCA1/2-only testing, 33.5%. Multiple-gene sequencing was more often ordered by genetic counselors (multiple-gene sequencing, 25.5% and BRCA1/2-only testing, 15.3%) and delayed until after surgery (multiple-gene sequencing, 32.5% and BRCA1/2-only testing, 19.9%). Multiple-gene sequencing substantially increased rate of detection of any pathogenic variant (multiple-gene sequencing: higher-risk patients, 12%; average-risk patients, 4.2% and BRCA1/2-only testing: higher-risk patients, 7.8%; average-risk patients, 2.2%) and variants of uncertain significance, especially in minorities (multiple-gene sequencing: white patients, 23.7%; black patients, 44.5%; and Asian patients, 50.9% and BRCA1/2-only testing: white patients, 2.2%; black patients, 5.6%; and Asian patients, 0%). Multiple-gene sequencing was not associated with an increase in the rate of prophylactic mastectomy use, which was highest with pathogenic variants in BRCA1/2 (BRCA1/2, 79.0%; other pathogenic variant, 37.6%; variant of uncertain significance, 30.2%; negative, 35.3%). Multiple-gene sequencing rapidly replaced BRCA1/2-only testing for patients with breast cancer in the community and enabled 2-fold higher detection of clinically relevant pathogenic variants without an associated increase in prophylactic mastectomy. However, important targets for improvement in the clinical utility of multiple-gene sequencing include postsurgical delay and racial/ethnic disparity in variants of uncertain significance.

  17. Whole exome sequencing identifies novel candidate genes that modify chronic obstructive pulmonary disease susceptibility.

    PubMed

    Bruse, Shannon; Moreau, Michael; Bromberg, Yana; Jang, Jun-Ho; Wang, Nan; Ha, Hongseok; Picchi, Maria; Lin, Yong; Langley, Raymond J; Qualls, Clifford; Klensney-Tait, Julia; Zabner, Joseph; Leng, Shuguang; Mao, Jenny; Belinsky, Steven A; Xing, Jinchuan; Nyunoya, Toru

    2016-01-07

    Chronic obstructive pulmonary disease (COPD) is characterized by an irreversible airflow limitation in response to inhalation of noxious stimuli, such as cigarette smoke. However, only 15-20 % smokers manifest COPD, suggesting a role for genetic predisposition. Although genome-wide association studies have identified common genetic variants that are associated with susceptibility to COPD, effect sizes of the identified variants are modest, as is the total heritability accounted for by these variants. In this study, an extreme phenotype exome sequencing study was combined with in vitro modeling to identify COPD candidate genes. We performed whole exome sequencing of 62 highly susceptible smokers and 30 exceptionally resistant smokers to identify rare variants that may contribute to disease risk or resistance to COPD. This was a cross-sectional case-control study without therapeutic intervention or longitudinal follow-up information. We identified candidate genes based on rare variant analyses and evaluated exonic variants to pinpoint individual genes whose function was computationally established to be significantly different between susceptible and resistant smokers. Top scoring candidate genes from these analyses were further filtered by requiring that each gene be expressed in human bronchial epithelial cells (HBECs). A total of 81 candidate genes were thus selected for in vitro functional testing in cigarette smoke extract (CSE)-exposed HBECs. Using small interfering RNA (siRNA)-mediated gene silencing experiments, we showed that silencing of several candidate genes augmented CSE-induced cytotoxicity in vitro. Our integrative analysis through both genetic and functional approaches identified two candidate genes (TACC2 and MYO1E) that augment cigarette smoke (CS)-induced cytotoxicity and, potentially, COPD susceptibility.

  18. GALT protein database: querying structural and functional features of GALT enzyme.

    PubMed

    d'Acierno, Antonio; Facchiano, Angelo; Marabotti, Anna

    2014-09-01

    Knowledge of the impact of variations on protein structure can enhance the comprehension of the mechanisms of genetic diseases related to that protein. Here, we present a new version of GALT Protein Database, a Web-accessible data repository for the storage and interrogation of structural effects of variations of the enzyme galactose-1-phosphate uridylyltransferase (GALT), the impairment of which leads to classic Galactosemia, a rare genetic disease. This new version of this database now contains the models of 201 missense variants of GALT enzyme, including heterozygous variants, and it allows users not only to retrieve information about the missense variations affecting this protein, but also to investigate their impact on substrate binding, intersubunit interactions, stability, and other structural features. In addition, it allows the interactive visualization of the models of variants collected into the database. We have developed additional tools to improve the use of the database by nonspecialized users. This Web-accessible database (http://bioinformatica.isa.cnr.it/GALT/GALT2.0) represents a model of tools potentially suitable for application to other proteins that are involved in human pathologies and that are subjected to genetic variations. © 2014 WILEY PERIODICALS, INC.

  19. Genetic Variation in Cardiomyopathy and Cardiovascular Disorders.

    PubMed

    McNally, Elizabeth M; Puckelwartz, Megan J

    2015-01-01

    With the wider deployment of massively-parallel, next-generation sequencing, it is now possible to survey human genome data for research and clinical purposes. The reduced cost of producing short-read sequencing has now shifted the burden to data analysis. Analysis of genome sequencing remains challenged by the complexity of the human genome, including redundancy and the repetitive nature of genome elements and the large amount of variation in individual genomes. Public databases of human genome sequences greatly facilitate interpretation of common and rare genetic variation, although linking database sequence information to detailed clinical information is limited by privacy and practical issues. Genetic variation is a rich source of knowledge for cardiovascular disease because many, if not all, cardiovascular disorders are highly heritable. The role of rare genetic variation in predicting risk and complications of cardiovascular diseases has been well established for hypertrophic and dilated cardiomyopathy, where the number of genes that are linked to these disorders is growing. Bolstered by family data, where genetic variants segregate with disease, rare variation can be linked to specific genetic variation that offers profound diagnostic information. Understanding genetic variation in cardiomyopathy is likely to help stratify forms of heart failure and guide therapy. Ultimately, genetic variation may be amenable to gene correction and gene editing strategies.

  20. Building a protein name dictionary from full text: a machine learning term extraction approach.

    PubMed

    Shi, Lei; Campagne, Fabien

    2005-04-07

    The majority of information in the biological literature resides in full text articles, instead of abstracts. Yet, abstracts remain the focus of many publicly available literature data mining tools. Most literature mining tools rely on pre-existing lexicons of biological names, often extracted from curated gene or protein databases. This is a limitation, because such databases have low coverage of the many name variants which are used to refer to biological entities in the literature. We present an approach to recognize named entities in full text. The approach collects high frequency terms in an article, and uses support vector machines (SVM) to identify biological entity names. It is also computationally efficient and robust to noise commonly found in full text material. We use the method to create a protein name dictionary from a set of 80,528 full text articles. Only 8.3% of the names in this dictionary match SwissProt description lines. We assess the quality of the dictionary by studying its protein name recognition performance in full text. This dictionary term lookup method compares favourably to other published methods, supporting the significance of our direct extraction approach. The method is strong in recognizing name variants not found in SwissProt.
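
    The first and last stages described in this abstract (collecting high-frequency terms in an article, then looking up dictionary terms in text) can be sketched roughly as below. This is only an illustration: the tokenisation rule, the `min_count` threshold, and the function names are assumptions, and the SVM classification step that the authors use to decide which candidate terms are protein names is deliberately left out.

```python
import re
from collections import Counter

# Assumed tokenisation rule: a letter followed by letters, digits or hyphens.
# The paper does not specify this; it is chosen only to keep the sketch small.
TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def high_frequency_terms(text, min_count=3):
    """Return the terms that occur at least min_count times in one article."""
    counts = Counter(token.lower() for token in TOKEN.findall(text))
    return {term: n for term, n in counts.items() if n >= min_count}


def lookup_names(text, dictionary):
    """Dictionary term lookup: list the dictionary entries present in the text."""
    tokens = {token.lower() for token in TOKEN.findall(text)}
    return sorted(tokens & {name.lower() for name in dictionary})


# Toy article text, invented for illustration only.
article = (
    "BRCA1 binds BARD1. BRCA1 and BARD1 form a dimer; "
    "BRCA1 is mutated in cancer."
)

candidates = high_frequency_terms(article, min_count=2)
found = lookup_names(article, ["BARD1", "TP53"])
print(candidates)
print(found)
```

    In a fuller pipeline the candidate terms would be turned into feature vectors and passed to a trained classifier before being added to the dictionary; that step depends on training data not described here.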

  1. Building a protein name dictionary from full text: a machine learning term extraction approach

    PubMed Central

    Shi, Lei; Campagne, Fabien

    2005-01-01

    Background The majority of information in the biological literature resides in full text articles, instead of abstracts. Yet, abstracts remain the focus of many publicly available literature data mining tools. Most literature mining tools rely on pre-existing lexicons of biological names, often extracted from curated gene or protein databases. This is a limitation, because such databases have low coverage of the many name variants which are used to refer to biological entities in the literature. Results We present an approach to recognize named entities in full text. The approach collects high frequency terms in an article, and uses support vector machines (SVM) to identify biological entity names. It is also computationally efficient and robust to noise commonly found in full text material. We use the method to create a protein name dictionary from a set of 80,528 full text articles. Only 8.3% of the names in this dictionary match SwissProt description lines. We assess the quality of the dictionary by studying its protein name recognition performance in full text. Conclusion This dictionary term lookup method compares favourably to other published methods, supporting the significance of our direct extraction approach. The method is strong in recognizing name variants not found in SwissProt. PMID:15817129

  2. Genome-wide copy number variant analysis for congenital ventricular septal defects in Chinese Han population.

    PubMed

    An, Yu; Duan, Wenyuan; Huang, Guoying; Chen, Xiaoli; Li, Li; Nie, Chenxia; Hou, Jia; Gui, Yonghao; Wu, Yiming; Zhang, Feng; Shen, Yiping; Wu, Bailin; Wang, Hongyan

    2016-01-08

    Ventricular septal defects (VSDs), the most prevalent congenital heart disease (CHD), occur either in isolation (isolated VSD) or in combination with other cardiac defects (complex VSD). Copy number variation (CNV) has been highlighted as a possible contributing factor to the etiology of many congenital diseases. However, little is known concerning the involvement of CNVs in either isolated or complex VSDs. We analyzed 154 unrelated Chinese individuals with VSD by chromosomal microarray analysis. The subjects were recruited from four hospitals across China. Each case underwent clinical assessment to define the type of VSD, either isolated or complex VSD. CNVs detected were categorized into syndrome-related CNVs, recurrent CNVs and rare CNVs. Genes encompassed by the CNVs were analyzed using enrichment and pathway analysis. Among 154 probands, we identified 29 rare CNVs in 26 VSD patients (16.9 %, 26/154) and 8 syndrome-related CNVs in 8 VSD patients (5.2 %, 8/154). 12 of the detected 29 rare CNVs (41.3 %) were recurrently reported in DECIPHER or ISCA database as associated with either VSD or general heart disease. Fifteen genes (5 %, 15/285) within CNVs were associated with a broad spectrum of complicated CHD. Among these 15 genes, 7 genes were in "abnormal interventricular septum morphology" derived from the MGI (mouse genome informatics) database, and nine genes were associated with cardiovascular system development (GO:0072538). We also found that these VSD-related candidate genes are enriched in chromatin binding and transcription regulation, which are the biological processes underlying heart development. Our study demonstrates the potential clinical diagnostic utility of genomic imbalance profiling in VSD patients. Additionally, gene enrichment and pathway analysis helped us to implicate VSD related candidate genes.
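
    The proportions quoted in this abstract can be recomputed from the counts it gives. The short sketch below does so; the helper name `percent` and the one-decimal rounding are assumptions made for illustration, and small differences from the printed values may simply reflect rounding versus truncation in the original.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts taken from the abstract; the labels are paraphrased.
reported = {
    "patients with rare CNVs": (26, 154),
    "patients with syndrome-related CNVs": (8, 154),
    "rare CNVs previously reported in DECIPHER or ISCA": (12, 29),
    "genes within CNVs associated with complicated CHD": (15, 285),
}

for label, (part, whole) in reported.items():
    print(f"{label}: {part}/{whole} = {percent(part, whole)} %")
```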

  3. A rare variant of the mtDNA HVS1 sequence in the hairs of Napoléon's family.

    PubMed

    Lucotte, Gérard

    2010-10-04

    This paper describes the finding of a rare variant in the sequence of the hypervariable segment (HVS1) of mitochondrial DNA (mtDNA) extracted from two preserved hairs, authenticated as belonging to the French Emperor Napoléon I (Napoléon Bonaparte). This rare variant is a mutation that changes the base C to T at position 16,184 (16184C→T), and it constitutes the only mutation found in this HVS1 sequence. This mutation is rare, because it was not found in a reference database (P < 0.05). In a personal database (M. Pala) comprising 37,000 different sequences, the 16184C→T mutation was found in only three samples, thus in this database the mutation frequency was 0.00008%. This mutation 16184C→T was also the only variant found subsequently in the HVS1 sequences of mtDNAs extracted from Napoléon's mother (Letizia) and from his youngest sister (Caroline), confirming that this mutation is maternally inherited. This 16184C→T variant could be used for genetic verification to authenticate any doubtful material and determine whether it should indeed be attributed to Napoléon.
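
    The database frequency mentioned in this abstract follows from the two counts it gives (three carriers among 37,000 sequences). A minimal sketch of that arithmetic is shown below; the function name is hypothetical. Computed this way, the quoted figure of 0.00008 matches the proportion, which corresponds to roughly 0.0081 when expressed as a percentage.

```python
def carrier_frequency(carriers, total):
    """Proportion of sequences in a database that carry a given variant."""
    if total <= 0:
        raise ValueError("total must be positive")
    return carriers / total


# Counts taken from the abstract: 3 samples out of 37,000 sequences.
freq = carrier_frequency(3, 37_000)

print(f"proportion = {freq:.5f}")
print(f"percentage = {freq:.4%}")
```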

  4. A rare variant of the mtDNA HVS1 sequence in the hairs of Napoléon's family

    PubMed Central

    2010-01-01

    This paper describes the finding of a rare variant in the sequence of the hypervariable segment (HVS1) of mitochondrial DNA (mtDNA) extracted from two preserved hairs, authenticated as belonging to the French Emperor Napoléon I (Napoléon Bonaparte). This rare variant is a mutation that changes the base C to T at position 16,184 (16184C→T), and it constitutes the only mutation found in this HVS1 sequence. This mutation is rare, because it was not found in a reference database (P < 0.05). In a personal database (M. Pala) comprising 37,000 different sequences, the 16184C→T mutation was found in only three samples, thus in this database the mutation frequency was 0.00008%. This mutation 16184C→T was also the only variant found subsequently in the HVS1 sequences of mtDNAs extracted from Napoléon's mother (Letizia) and from his youngest sister (Caroline), confirming that this mutation is maternally inherited. This 16184C→T variant could be used for genetic verification to authenticate any doubtful material and determine whether it should indeed be attributed to Napoléon. PMID:21092341

  5. Enhancer Variants Synergistically Drive Dysfunction of a Gene Regulatory Network In Hirschsprung Disease

    DOE PAGES

    Chatterjee, Sumantra; Kapoor, Ashish; Akiyama, Jennifer A.; ...

    2016-09-29

    Common sequence variants in cis-regulatory elements (CREs) are suspected etiological causes of complex disorders. We previously identified an intronic enhancer variant in the RET gene disrupting SOX10 binding and increasing Hirschsprung disease (HSCR) risk 4-fold. We now show that two other functionally independent CRE variants, one binding Gata2 and the other binding Rarb, also reduce Ret expression and increase risk 2- and 1.7-fold. By studying human and mouse fetal gut tissues and cell lines, we demonstrate that reduced RET expression propagates throughout its gene regulatory network, exerting effects on both its positive and negative feedback components. We also provide evidence that the presence of a combination of CRE variants synergistically reduces RET expression and its effects throughout the GRN. These studies show how the effects of functionally independent non-coding variants in a coordinated gene regulatory network amplify their individually small effects, providing a model for complex disorders.

  6. Enhancer Variants Synergistically Drive Dysfunction of a Gene Regulatory Network In Hirschsprung Disease

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chatterjee, Sumantra; Kapoor, Ashish; Akiyama, Jennifer A.

    Common sequence variants in cis-regulatory elements (CREs) are suspected etiological causes of complex disorders. We previously identified an intronic enhancer variant in the RET gene disrupting SOX10 binding and increasing Hirschsprung disease (HSCR) risk 4-fold. We now show that two other functionally independent CRE variants, one binding Gata2 and the other binding Rarb, also reduce Ret expression and increase risk 2- and 1.7-fold. By studying human and mouse fetal gut tissues and cell lines, we demonstrate that reduced RET expression propagates throughout its gene regulatory network, exerting effects on both its positive and negative feedback components. We also provide evidence that the presence of a combination of CRE variants synergistically reduces RET expression and its effects throughout the GRN. These studies show how the effects of functionally independent non-coding variants in a coordinated gene regulatory network amplify their individually small effects, providing a model for complex disorders.

  7. De Novo Coding Variants Are Strongly Associated with Tourette Disorder

    PubMed Central

    Willsey, A. Jeremy; Fernandez, Thomas V.; Yu, Dongmei; King, Robert A.; Dietrich, Andrea; Xing, Jinchuan; Sanders, Stephan J.; Mandell, Jeffrey D.; Huang, Alden Y.; Richer, Petra; Smith, Louw; Dong, Shan; Samocha, Kaitlin E.; Neale, Benjamin M.; Coppola, Giovanni; Mathews, Carol A.; Tischfield, Jay A.; Scharf, Jeremiah M.; State, Matthew W.; Heiman, Gary A.

    2017-01-01

    SUMMARY Whole-exome sequencing (WES) and de novo variant detection have proven a powerful approach to gene discovery in complex neurodevelopmental disorders. We have completed WES of 325 Tourette disorder trios from the Tourette International Collaborative Genetics cohort and a replication sample of 186 trios from the Tourette Syndrome Association International Consortium on Genetics (511 total). We observe strong and consistent evidence for the contribution of de novo likely gene-disrupting (LGD) variants (rate ratio [RR] 2.32, p = 0.002). Additionally, de novo damaging variants (LGD and probably damaging missense) are overrepresented in probands (RR 1.37, p = 0.003). We identify four likely risk genes with multiple de novo damaging variants in unrelated probands: WWC1 (WW and C2 domain containing 1), CELSR3 (Cadherin EGF LAG seven-pass G-type receptor 3), NIPBL (Nipped-B-like), and FN1 (fibronectin 1). Overall, we estimate that de novo damaging variants in approximately 400 genes contribute risk in 12% of clinical cases. PMID:28472652

  8. Pathogenic Germline Variants in 10,389 Adult Cancers.

    PubMed

    Huang, Kuan-Lin; Mashl, R Jay; Wu, Yige; Ritter, Deborah I; Wang, Jiayin; Oh, Clara; Paczkowska, Marta; Reynolds, Sheila; Wyczalkowski, Matthew A; Oak, Ninad; Scott, Adam D; Krassowski, Michal; Cherniack, Andrew D; Houlahan, Kathleen E; Jayasinghe, Reyka; Wang, Liang-Bo; Zhou, Daniel Cui; Liu, Di; Cao, Song; Kim, Young Won; Koire, Amanda; McMichael, Joshua F; Hucthagowder, Vishwanathan; Kim, Tae-Beom; Hahn, Abigail; Wang, Chen; McLellan, Michael D; Al-Mulla, Fahd; Johnson, Kimberly J; Lichtarge, Olivier; Boutros, Paul C; Raphael, Benjamin; Lazar, Alexander J; Zhang, Wei; Wendl, Michael C; Govindan, Ramaswamy; Jain, Sanjay; Wheeler, David; Kulkarni, Shashikant; Dipersio, John F; Reimand, Jüri; Meric-Bernstam, Funda; Chen, Ken; Shmulevich, Ilya; Plon, Sharon E; Chen, Feng; Ding, Li

    2018-04-05

    We conducted the largest investigation of predisposition variants in cancer to date, discovering 853 pathogenic or likely pathogenic variants in 8% of 10,389 cases from 33 cancer types. Twenty-one genes showed single or cross-cancer associations, including novel associations of SDHA in melanoma and PALB2 in stomach adenocarcinoma. The 659 predisposition variants and 18 additional large deletions in tumor suppressors, including ATM, BRCA1, and NF1, showed low gene expression and frequent (43%) loss of heterozygosity or biallelic two-hit events. We also discovered 33 such variants in oncogenes, including missenses in MET, RET, and PTPN11 associated with high gene expression. We nominated 47 additional predisposition variants from prioritized VUSs supported by multiple evidences involving case-control frequency, loss of heterozygosity, expression effect, and co-localization with mutations and modified residues. Our integrative approach links rare predisposition variants to functional consequences, informing future guidelines of variant classification and germline genetic testing in cancer. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  9. An integrated map of structural variation in 2,504 human genomes.

    PubMed

    Sudmant, Peter H; Rausch, Tobias; Gardner, Eugene J; Handsaker, Robert E; Abyzov, Alexej; Huddleston, John; Zhang, Yan; Ye, Kai; Jun, Goo; Fritz, Markus Hsi-Yang; Konkel, Miriam K; Malhotra, Ankit; Stütz, Adrian M; Shi, Xinghua; Casale, Francesco Paolo; Chen, Jieming; Hormozdiari, Fereydoun; Dayama, Gargi; Chen, Ken; Malig, Maika; Chaisson, Mark J P; Walter, Klaudia; Meiers, Sascha; Kashin, Seva; Garrison, Erik; Auton, Adam; Lam, Hugo Y K; Mu, Xinmeng Jasmine; Alkan, Can; Antaki, Danny; Bae, Taejeong; Cerveira, Eliza; Chines, Peter; Chong, Zechen; Clarke, Laura; Dal, Elif; Ding, Li; Emery, Sarah; Fan, Xian; Gujral, Madhusudan; Kahveci, Fatma; Kidd, Jeffrey M; Kong, Yu; Lameijer, Eric-Wubbo; McCarthy, Shane; Flicek, Paul; Gibbs, Richard A; Marth, Gabor; Mason, Christopher E; Menelaou, Androniki; Muzny, Donna M; Nelson, Bradley J; Noor, Amina; Parrish, Nicholas F; Pendleton, Matthew; Quitadamo, Andrew; Raeder, Benjamin; Schadt, Eric E; Romanovitch, Mallory; Schlattl, Andreas; Sebra, Robert; Shabalin, Andrey A; Untergasser, Andreas; Walker, Jerilyn A; Wang, Min; Yu, Fuli; Zhang, Chengsheng; Zhang, Jing; Zheng-Bradley, Xiangqun; Zhou, Wanding; Zichner, Thomas; Sebat, Jonathan; Batzer, Mark A; McCarroll, Steven A; Mills, Ryan E; Gerstein, Mark B; Bashir, Ali; Stegle, Oliver; Devine, Scott E; Lee, Charles; Eichler, Evan E; Korbel, Jan O

    2015-10-01

    Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.

  10. Mutation Spectrum of GNE Myopathy in the Indian Sub-Continent.

    PubMed

    Bhattacharya, Sudha; Khadilkar, Satish V; Nalini, Atchayaram; Ganapathy, Aparna; Mannan, Ashraf U; Majumder, Partha P; Bhattacharya, Alok

    GNE myopathy is an adult onset recessive genetic disorder that affects distal muscles sparing the quadriceps. GNE gene mutations have been identified in GNE myopathy patients all over the world. Homozygosity is a common feature in GNE myopathy patients worldwide. The major objective of this study was to investigate the mutation spectrum of GNE myopathy in India in relation to the population diversity in the country. We have collated GNE mutation data of Indian GNE myopathy patients from published literature and from recently identified patients. We also used data of people of Indian subcontinent from 1000 genomes database, South Asian Genome database and Strand Life Science database to determine frequency of GNE mutations in the general population. A total of 67 GNE myopathy patients were studied, of whom 21% were homozygous for GNE variants, while the rest were compound heterozygous. Thirty-five different mutations in the GNE gene were recorded, of which 5 have not been reported earlier. The most frequent mutation was p.Val727Met (65%) found mainly in the heterozygous form. Another mutation, p.Ile618Thr was also common (16%) but was found mainly in patients from Rajasthan, while p.Val727Met was more widely distributed. The latter was also seen at a high frequency in general population of Indian subcontinent in all the databases. It was also present in Thailand but was absent in general population elsewhere in the world. p.Val727Met is likely to be a founder mutation of Indian subcontinent.

  11. The genomic landscape shaped by selection on transposable elements across 18 mouse strains.

    PubMed

    Nellåker, Christoffer; Keane, Thomas M; Yalcin, Binnaz; Wong, Kim; Agam, Avigail; Belgard, T Grant; Flint, Jonathan; Adams, David J; Frankel, Wayne N; Ponting, Chris P

    2012-06-15

    Transposable element (TE)-derived sequence dominates the landscape of mammalian genomes and can modulate gene function by dysregulating transcription and translation. Our current knowledge of TEs in laboratory mouse strains is limited primarily to those present in the C57BL/6J reference genome, with most mouse TEs being drawn from three distinct classes, namely short interspersed nuclear elements (SINEs), long interspersed nuclear elements (LINEs) and the endogenous retrovirus (ERV) superfamily. Despite their high prevalence, the different genomic and gene properties controlling whether TEs are preferentially purged from, or are retained by, genetic drift or positive selection in mammalian genomes remain poorly defined. Using whole genome sequencing data from 13 classical laboratory and 4 wild-derived mouse inbred strains, we developed a comprehensive catalogue of 103,798 polymorphic TE variants. We employ this extensive data set to characterize TE variants across the Mus lineage, and to infer neutral and selective processes that have acted over 2 million years. Our results indicate that the majority of TE variants are introduced through the male germline and that only a minority of TE variants exert detectable changes in gene expression. However, among genes with differential expression across the strains there are twice as many TE variants identified as being putative causal variants as expected. Most TE variants that cause gene expression changes appear to be purged rapidly by purifying selection. Our findings demonstrate that past TE insertions have often been highly deleterious, and help to prioritize TE variants according to their likely contribution to gene expression or phenotype variation.

  12. Polymorphisms in adenosine receptor genes are associated with infarct size in patients with ischemic cardiomyopathy.

    PubMed

    Tang, Z; Diamond, M A; Chen, J-M; Holly, T A; Bonow, R O; Dasgupta, A; Hyslop, T; Purzycki, A; Wagner, J; McNamara, D M; Kukulski, T; Wos, S; Velazquez, E J; Ardlie, K; Feldman, A M

    2007-10-01

    The goal of this experiment was to identify the presence of genetic variants in the adenosine receptor genes and assess their relationship to infarct size in a population of patients with ischemic cardiomyopathy. Adenosine receptors play an important role in protecting the heart during ischemia and in mediating the effects of ischemic preconditioning. We sequenced DNA samples from 273 individuals with ischemic cardiomyopathy and from 203 normal controls to identify the presence of genetic variants in the adenosine receptor genes. Subsequently, we analyzed the relationship between the identified genetic variants and infarct size, left ventricular size, and left ventricular function. Three variants in the 3'-untranslated region of the A1-adenosine gene (nt 1689 C/A, nt 2206 Tdel, nt 2683del36) and an informative polymorphism in the coding region of the A3-adenosine gene (nt 1509 A/C I248L) were associated with changes in infarct size. These results suggest that genetic variants in the adenosine receptor genes may predict the heart's response to ischemia or injury and might also influence an individual's response to adenosine therapy.

  13. Breast and Prostate Cancer and Hormone-Related Gene Variant Study

    Cancer.gov

    The Breast and Prostate Cancer and Hormone-Related Gene Variant Study allows large-scale analyses of breast and prostate cancer risk in relation to genetic polymorphisms and gene-environment interactions that affect hormone metabolism.

  14. Renin-Angiotensin System Gene Variants and Type 2 Diabetes Mellitus: Influence of Angiotensinogen

    PubMed Central

    Joyce-Tan, Siew Mei; Zain, Shamsul Mohd; Abdul Sattar, Munavvar Zubaid; Abdullah, Nor Azizan

    2016-01-01

    Genome-wide association studies (GWAS) have been successfully used to identify variants associated with diseases including type 2 diabetes mellitus (T2DM). However, some variants are not included in GWAS to avoid the penalty for multiple hypothesis testing. Thus, the candidate gene approach is still useful even in the GWAS era. This study attempted to assess whether genetic variations in the renin-angiotensin system (RAS) and their gene interactions are associated with T2DM risk. We genotyped 290 T2DM patients and 267 controls for variants in three genes of the RAS, namely, angiotensin converting enzyme (ACE), angiotensinogen (AGT), and angiotensin II type 1 receptor (AGTR1). There were significant differences in allele frequencies between cases and controls for AGT variants (P = 0.05) but not for ACE and AGTR1. Haplotype TCG of the AGT was associated with increased risk of T2DM (OR 1.92, 95% CI 1.15–3.20, permuted P = 0.012); however, no evidence of significant gene-gene interactions was seen. Nonetheless, our analysis revealed that the AGT variants were independently associated with T2DM. Thus, this study suggests that genetic variants of the RAS can modestly influence the T2DM risk. PMID:26682227
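
    The odds ratio and 95% confidence interval quoted in this record follow a standard form. The sketch below shows one common way to compute them from a 2x2 table, using the log-odds (Woolf) standard error. It is an illustration only: the function name and the counts in the usage note are hypothetical, and the study's own haplotype-based estimate with a permuted P value would have used a different procedure.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald-type confidence interval from a 2x2 table.

    a: carriers among cases       b: non-carriers among cases
    c: carriers among controls    d: non-carriers among controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

    For example, with made-up counts, odds_ratio_ci(20, 10, 10, 20) gives an odds ratio of 4.0 with an interval that excludes 1.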

  15. Benchmarking distributed data warehouse solutions for storing genomic variant information

    PubMed Central

    Wiewiórka, Marek S.; Wysakowicz, Dawid P.; Okoniewski, Michał J.

    2017-01-01

    Genomic-based personalized medicine encompasses storing, analysing and interpreting genomic variants as its central issues. At a time when thousands of patients' sequenced exomes and genomes are becoming available, there is a growing need for efficient database storage and querying. The answer could be the application of modern distributed storage systems and query engines. However, the application of large genomic variant databases to this problem has not yet been sufficiently explored in the literature. To investigate the effectiveness of modern columnar storage [column-oriented Database Management System (DBMS)] and query engines, we have developed a prototypic genomic variant data warehouse, populated with large generated content of genomic variants and phenotypic data. Next, we have benchmarked performance of a number of combinations of distributed storages and query engines on a set of SQL queries that address biological questions essential for both research and medical applications. In addition, a non-distributed, analytical database (MonetDB) has been used as a baseline. Comparison of query execution times confirms that distributed data warehousing solutions outperform classic relational DBMSs. Moreover, pre-aggregation and further denormalization of data, which reduce the number of distributed join operations, significantly improve query performance by several orders of magnitude. Most of the distributed back-ends offer good performance for complex analytical queries, while the Optimized Row Columnar (ORC) format paired with Presto and Parquet with Spark 2 query engines provide, on average, the lowest execution times. Apache Kudu, on the other hand, is the only solution that guarantees sub-second performance for simple genome range queries returning a small subset of data, where low-latency response is expected, while still offering decent performance for running analytical queries. In summary, research and clinical applications that require the storage and analysis of variants from thousands of samples can benefit from the scalability and performance of distributed data warehouse solutions. Database URL: https://github.com/ZSI-Bio/variantsdwh PMID:29220442
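
    The two ideas highlighted in this record, pre-aggregation to avoid repeated grouping or joins, and a simple genome range query, can be sketched on a toy scale. The example below uses Python's built-in sqlite3 module purely as a stand-in; it is not one of the benchmarked engines, and the table names, columns, and rows are hypothetical rather than taken from the variantsdwh project.

```python
import sqlite3


def build_demo_warehouse():
    """Create an in-memory database with a few made-up genotype calls."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE genotype_calls (
            sample_id TEXT, chrom TEXT, pos INTEGER, ref TEXT, alt TEXT
        );
        CREATE INDEX idx_calls_region ON genotype_calls (chrom, pos);
        """
    )
    rows = [
        ("S1", "1", 1000, "A", "G"),
        ("S2", "1", 1000, "A", "G"),
        ("S3", "1", 1500, "C", "T"),
        ("S1", "1", 9000, "G", "A"),
        ("S2", "2", 1200, "T", "C"),
    ]
    conn.executemany("INSERT INTO genotype_calls VALUES (?, ?, ?, ?, ?)", rows)
    # Pre-aggregation: store one row per variant with its carrier count,
    # so later queries do not need to group the per-sample calls again.
    conn.execute(
        """
        CREATE TABLE variant_carriers AS
        SELECT chrom, pos, ref, alt, COUNT(DISTINCT sample_id) AS n_carriers
        FROM genotype_calls
        GROUP BY chrom, pos, ref, alt
        """
    )
    conn.commit()
    return conn


def carriers_in_region(conn, chrom, start, end):
    """Genome range query against the pre-aggregated table."""
    cursor = conn.execute(
        "SELECT pos, ref, alt, n_carriers FROM variant_carriers "
        "WHERE chrom = ? AND pos BETWEEN ? AND ? ORDER BY pos",
        (chrom, start, end),
    )
    return cursor.fetchall()
```

    Calling carriers_in_region(build_demo_warehouse(), "1", 900, 2000) returns the two variants on chromosome 1 that fall inside the interval, each with its carrier count. In a distributed setting the same trade-off applies at much larger scale: the aggregate table costs storage and load time but removes work from every query.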

  16. Common Variants within Oxidative Phosphorylation Genes Influence Risk of Ischemic Stroke and Intracerebral Hemorrhage

    PubMed Central

    Anderson, Christopher D.; Biffi, Alessandro; Nalls, Michael A.; Devan, William J.; Schwab, Kristin; Ayres, Alison M.; Valant, Valerie; Ross, Owen A.; Rost, Natalia S.; Saxena, Richa; Viswanathan, Anand; Worrall, Bradford B.; Brott, Thomas G.; Goldstein, Joshua N.; Brown, Devin; Broderick, Joseph P.; Norrving, Bo; Greenberg, Steven M.; Silliman, Scott L.; Hansen, Björn M.; Tirschwell, David L.; Lindgren, Arne; Slowik, Agnieszka; Schmidt, Reinhold; Selim, Magdy; Roquer, Jaume; Montaner, Joan; Singleton, Andrew B.; Kidwell, Chelsea S.; Woo, Daniel; Furie, Karen L.; Meschia, James F.; Rosand, Jonathan

    2013-01-01

    Background and Purpose: Prior studies demonstrated association between mitochondrial DNA variants and ischemic stroke (IS). We investigated whether variants within a larger set of oxidative phosphorylation (OXPHOS) genes encoded by both autosomal and mitochondrial DNA were associated with risk of IS and, based on our results, extended our investigation to intracerebral hemorrhage (ICH). Methods: This association study employed a discovery cohort of 1643 individuals, a validation cohort of 2432 individuals for IS, and an extension cohort of 1476 individuals for ICH. Gene-set enrichment analysis (GSEA) was performed on all structural OXPHOS genes, as well as genes contributing to individual respiratory complexes. Gene-sets passing GSEA were tested by constructing genetic scores using common variants residing within each gene. Associations between each variant and IS that emerged in the discovery cohort were examined in validation and extension cohorts. Results: IS was associated with genetic risk scores in OXPHOS as a whole (odds ratio (OR)=1.17, p=0.008) and Complex I (OR=1.06, p=0.050). Among IS subtypes, small vessel (SV) stroke showed association with OXPHOS (OR=1.16, p=0.007), Complex I (OR=1.13, p=0.027) and Complex IV (OR=1.14, p=0.018). To further explore this SV association, we extended our analysis to ICH, revealing association between deep hemispheric ICH and Complex IV (OR=1.08, p=0.008). Conclusions: This pathway analysis demonstrates association between common genetic variants within OXPHOS genes and stroke. The associations for SV stroke and deep ICH suggest that genetic variation in OXPHOS influences small vessel pathobiology. Further studies are needed to identify culprit genetic variants and assess their functional consequences. PMID:23362085

  17. A sex-specific association of common variants of neuroligin genes (NLGN3 and NLGN4X) with autism spectrum disorders in a Chinese Han cohort.

    PubMed

    Yu, Jindan; He, Xue; Yao, Dan; Li, Zhongyue; Li, Hui; Zhao, Zhengyan

    2011-05-14

    The synaptic genes NLGN3 and NLGN4X, two homologous members of the neuroligin family, have been proposed as predisposition loci for autism spectrum disorders (ASDs), and defects in these two genes have been identified in a small fraction of individuals with ASDs. However, no such rare variant in these two genes has yet been adequately replicated in the Chinese population, and no common variant has been further investigated for association with ASDs. Seven known ASD-related rare variants in the NLGN3 and NLGN4X genes were screened for replication of the initial findings, and 12 intronic tagging single nucleotide polymorphisms (SNPs) were genotyped for case-control association analysis in a total of 229 ASD cases and 184 control individuals in a Chinese Han cohort, using matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry. We found that a common intronic variant, SNP rs4844285 in the NLGN3 gene, and a specific 3-marker haplotype XA-XG-XT (rs11795613-rs4844285-rs4844286) containing this individual SNP were associated with ASDs and showed a male bias, even after correction for multiple testing (SNP allele: P = 0.048, haplotype: P = 0.032). At the same time, none of these 7 known rare mutations of the NLGN3 and NLGN4X genes was identified in either our patients with ASDs or the controls, giving further evidence that these known rare variants might not be enriched in the Chinese Han cohort. The present study provides initial evidence that a common variant in the NLGN3 gene may play a role in the etiology of ASDs among affected males in the Chinese Han population, and further supports the hypothesis that synaptic defects might be involved in the pathophysiology of ASDs.

  18. Complex nature of SNP genotype effects on gene expression in primary human leucocytes.

    PubMed

    Heap, Graham A; Trynka, Gosia; Jansen, Ritsert C; Bruinenberg, Marcel; Swertz, Morris A; Dinesen, Lotte C; Hunt, Karen A; Wijmenga, Cisca; Vanheel, David A; Franke, Lude

    2009-01-07

    Genome wide association studies have been hugely successful in identifying disease risk variants, yet most variants do not lead to coding changes and how variants influence biological function is usually unknown. We correlated gene expression and genetic variation in untouched primary leucocytes (n = 110) from individuals with celiac disease - a common condition with multiple risk variants identified. We compared our observations with an EBV-transformed HapMap B cell line dataset (n = 90), and performed a meta-analysis to increase power to detect non-tissue specific effects. In celiac peripheral blood, 2,315 SNP variants influenced gene expression at 765 different transcripts (< 250 kb from SNP, at FDR = 0.05, cis expression quantitative trait loci, eQTLs). 135 of the detected SNP-probe effects (reflecting 51 unique probes) were also detected in a HapMap B cell line published dataset, all with effects in the same allelic direction. Overall gene expression differences within the two datasets predominantly explain the limited overlap in observed cis-eQTLs. Celiac associated risk variants from two regions, containing genes IL18RAP and CCR3, showed significant cis genotype-expression correlations in the peripheral blood but not in the B cell line datasets. We identified 14 genes where a SNP affected the expression of different probes within the same gene, but in opposite allelic directions. By incorporating genetic variation in co-expression analyses, functional relationships between genes can be more significantly detected. In conclusion, the complex nature of genotypic effects in human populations makes the use of a relevant tissue, large datasets, and analysis of different exons essential to enable the identification of the function for many genetic risk variants in common diseases.

  19. Rare variants analysis of cutaneous malignant melanoma genes in Parkinson's disease.

    PubMed

    Lubbe, S J; Escott-Price, V; Brice, A; Gasser, T; Pittman, A M; Bras, J; Hardy, J; Heutink, P; Wood, N M; Singleton, A B; Grosset, D G; Carroll, C B; Law, M H; Demenais, F; Iles, M M; Bishop, D T; Newton-Bishop, J; Williams, N M; Morris, H R

    2016-12-01

    A shared genetic susceptibility between cutaneous malignant melanoma (CMM) and Parkinson's disease (PD) has been suggested. We investigated this by assessing the contribution of rare variants in genes involved in CMM to PD risk. We studied rare variation across 29 CMM risk genes using high-quality genotype data in 6875 PD cases and 6065 controls and sought to replicate findings using whole-exome sequencing data from a second independent cohort totaling 1255 PD cases and 473 controls. No statistically significant enrichment of rare variants across all genes, per gene, or for any individual variant was detected in either cohort. There were nonsignificant trends toward different carrier frequencies between PD cases and controls, under different inheritance models, in the following CMM risk genes: BAP1, DCC, ERBB4, KIT, MAPK2, MITF, PTEN, and TP53. The very rare TYR p.V275F variant, which is a pathogenic allele for recessive albinism, was more common in PD cases than controls in 3 independent cohorts. Tyrosinase, encoded by TYR, is the rate-limiting enzyme for the production of neuromelanin, and has a role in the production of dopamine. These results suggest a possible role for another gene in the dopamine-biosynthetic pathway in susceptibility to neurodegenerative Parkinsonism, but further studies in larger PD cohorts are needed to accurately determine the role of these genes/variants in disease pathogenesis. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.

  20. TRAM (Transcriptome Mapper): database-driven creation and analysis of transcriptome maps from multiple sources

    PubMed Central

    2011-01-01

    Background: Several tools have been developed to perform global gene expression profile data analysis, to search for specific chromosomal regions whose features meet defined criteria as well as to study neighbouring gene expression. However, most of these tools are tailored for a specific use in a particular context (e.g. they are species-specific, or limited to a particular data format) and they typically accept only gene lists as input. Results: TRAM (Transcriptome Mapper) is a new general tool that allows the simple generation and analysis of quantitative transcriptome maps, starting from any source listing gene expression values for a given gene set (e.g. expression microarrays), implemented as a relational database. It includes a parser able to assign univocal and updated gene symbols to gene identifiers from different data sources. Moreover, TRAM is able to perform intra-sample and inter-sample data normalization, including an original variant of quantile normalization (scaled quantile), useful to normalize data from platforms with highly different numbers of investigated genes. When in 'Map' mode, the software generates a quantitative representation of the transcriptome of a sample (or of a pool of samples) and identifies if segments of defined lengths are over/under-expressed compared to the desired threshold. When in 'Cluster' mode, the software searches for a set of over/under-expressed consecutive genes. Statistical significance for all results is calculated with respect to genes localized on the same chromosome or to all genome genes. Transcriptome maps, showing differential expression between two sample groups, relative to two different biological conditions, may be easily generated. We present the results of a biological model test, based on a meta-analysis comparison between a sample pool of human CD34+ hematopoietic progenitor cells and a sample pool of megakaryocytic cells. Biologically relevant chromosomal segments and gene clusters with differential expression during the differentiation toward megakaryocytes were identified. Conclusions: TRAM is designed to create, and statistically analyze, quantitative transcriptome maps, based on gene expression data from multiple sources. The release includes a FileMaker Pro database management runtime application and is freely available at http://apollo11.isto.unibo.it/software/, along with preconfigured implementations for mapping of human, mouse and zebrafish transcriptomes. PMID:21333005
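
    For orientation, the sketch below shows ordinary quantile normalization, the technique that TRAM's "scaled quantile" method is described as a variant of. It is not TRAM's algorithm: the abstract gives no details of the scaled variant, and this plain version assumes every sample reports the same number of genes, which is exactly the restriction the scaled variant is said to relax. The function name is hypothetical.

```python
def quantile_normalize(samples):
    """Standard quantile normalization of equal-length expression vectors.

    samples: list of lists, one inner list of expression values per sample.
    Returns new lists in which every sample shares the same distribution:
    the value of rank r in each sample is replaced by the mean of the
    rank-r values across all samples. Ties are broken by position.
    """
    n_values = len(samples[0])
    if any(len(sample) != n_values for sample in samples):
        raise ValueError("all samples must contain the same number of values")

    # Mean of the r-th smallest value across samples, for each rank r.
    sorted_samples = [sorted(sample) for sample in samples]
    rank_means = [
        sum(s[rank] for s in sorted_samples) / len(samples)
        for rank in range(n_values)
    ]

    normalized = []
    for sample in samples:
        # Indices of this sample's values, ordered from smallest to largest.
        order = sorted(range(n_values), key=lambda i: sample[i])
        result = [0.0] * n_values
        for rank, index in enumerate(order):
            result[index] = rank_means[rank]
        normalized.append(result)
    return normalized
```

    Given two samples [5, 2, 3] and [4, 1, 6], the rank means are 1.5, 3.5 and 5.5, so the normalized samples become [5.5, 1.5, 3.5] and [3.5, 1.5, 5.5].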

  1. The 2014 Nucleic Acids Research Database Issue and an updated NAR online Molecular Biology Database Collection.

    PubMed

    Fernández-Suárez, Xosé M; Rigden, Daniel J; Galperin, Michael Y

    2014-01-01

    The 2014 Nucleic Acids Research Database Issue includes descriptions of 58 new molecular biology databases and recent updates to 123 databases previously featured in NAR or other journals. For convenience, the issue is now divided into eight sections that reflect major subject categories. Among the highlights of this issue are six databases of the transcription factor binding sites in various organisms and updates on such popular databases as CAZy, Database of Genomic Variants (DGV), dbGaP, DrugBank, KEGG, miRBase, Pfam, Reactome, SEED, TCDB and UniProt. There is a strong block of structural databases, which includes, among others, the new RNA Bricks database, updates on PDBe, PDBsum, ArchDB, Gene3D, ModBase, Nucleic Acid Database and the recently revived iPfam database. An update on the NCBI's MMDB describes VAST+, an improved tool for protein structure comparison. Two articles highlight the development of the Structural Classification of Proteins (SCOP) database: one describes SCOPe, which automates assignment of new structures to the existing SCOP hierarchy; the other one describes the first version of SCOP2, with its more flexible approach to classifying protein structures. This issue also includes a collection of articles on bacterial taxonomy and metagenomics, which includes updates on the List of Prokaryotic Names with Standing in Nomenclature (LPSN), Ribosomal Database Project (RDP), the Silva/LTP project and several new metagenomics resources. The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been expanded to 1552 databases. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/).

  2. Three new genetic loci (R1210C in CFH, variants in COL8A1 and RAD51B) are independently related to progression to advanced macular degeneration.

    PubMed

    Seddon, Johanna M; Reynolds, Robyn; Yu, Yi; Rosner, Bernard

    2014-01-01

    To assess the independent impact of new genetic variants on conversion to advanced stages of AMD, controlling for established risk factors, and to determine the contribution of genes in predictive models. In this prospective longitudinal study of 2765 individuals, 777 subjects progressed to neovascular disease (NV) or geographic atrophy (GA) in either eye over 12 years. Recently reported genetic loci were assessed for their independent effects on incident advanced AMD after controlling for 6 established loci in 5 genes, and demographic, behavioral, and macular characteristics. New variants which remained significantly related to progression were then added to a final multivariate model to assess their independent effects. The contribution of genes to risk models was assessed using reclassification tables by determining risk within cross-classified quintiles for alternative models. Three new genetic variants were significantly related to progression: rare variant R1210C in CFH (hazard ratio (HR) 2.5, 95% confidence interval [CI] 1.2-5.3, P = 0.01), and common variants in genes COL8A1 (HR 2.0, 95% CI 1.1-3.5, P = 0.02) and RAD51B (HR 0.8, 95% CI 0.60-0.97, P = 0.03). The area under the curve statistic (AUC) was significantly higher for the 9 gene model (.884) vs the 0 gene model (.873), P = .01. AUCs for the 9 vs 6 gene models were not significantly different, but reclassification analyses indicated significant added information for more genes, with adjusted odds ratios (OR) for progression within 5 years per one quintile increase in risk score of 2.7, P<0.001 for the 9 vs 6 loci model, and OR 3.5, P<0.001 for the 9 vs. 0 gene model. Similar results were seen for NV and GA. Rare variant CFH R1210C and common variants in COL8A1 and RAD51B plus six genes in previous models contribute additional predictive information for advanced AMD beyond macular and behavioral phenotypes.

  3. Three New Genetic Loci (R1210C in CFH, Variants in COL8A1 and RAD51B) Are Independently Related to Progression to Advanced Macular Degeneration

    PubMed Central

    Seddon, Johanna M.; Reynolds, Robyn; Yu, Yi; Rosner, Bernard

    2014-01-01

    Objectives: To assess the independent impact of new genetic variants on conversion to advanced stages of AMD, controlling for established risk factors, and to determine the contribution of genes in predictive models. Methods: In this prospective longitudinal study of 2765 individuals, 777 subjects progressed to neovascular disease (NV) or geographic atrophy (GA) in either eye over 12 years. Recently reported genetic loci were assessed for their independent effects on incident advanced AMD after controlling for 6 established loci in 5 genes, and demographic, behavioral, and macular characteristics. New variants which remained significantly related to progression were then added to a final multivariate model to assess their independent effects. The contribution of genes to risk models was assessed using reclassification tables by determining risk within cross-classified quintiles for alternative models. Results: Three new genetic variants were significantly related to progression: rare variant R1210C in CFH (hazard ratio (HR) 2.5, 95% confidence interval [CI] 1.2–5.3, P = 0.01), and common variants in genes COL8A1 (HR 2.0, 95% CI 1.1–3.5, P = 0.02) and RAD51B (HR 0.8, 95% CI 0.60–0.97, P = 0.03). The area under the curve statistic (AUC) was significantly higher for the 9 gene model (.884) vs the 0 gene model (.873), P = .01. AUCs for the 9 vs 6 gene models were not significantly different, but reclassification analyses indicated significant added information for more genes, with adjusted odds ratios (OR) for progression within 5 years per one quintile increase in risk score of 2.7, P<0.001 for the 9 vs 6 loci model, and OR 3.5, P<0.001 for the 9 vs. 0 gene model. Similar results were seen for NV and GA. Conclusions: Rare variant CFH R1210C and common variants in COL8A1 and RAD51B plus six genes in previous models contribute additional predictive information for advanced AMD beyond macular and behavioral phenotypes. PMID:24498017

  4. Nominal ISOMERs (Incorrect Spellings Of Medicines Eluding Researchers)-variants in the spellings of drug names in PubMed: a database review.

    PubMed

    Ferner, Robin E; Aronson, Jeffrey K

    2016-12-14

    To examine how misspellings of drug names could impede searches for published literature. Database review. PubMed. The study included 30 drug names that are commonly misspelt on prescription charts in hospitals in Birmingham, UK (test set), and 30 control names randomly chosen from a hospital formulary (control set). The following definitions were used: standard names-the international non-proprietary names, variant names-deviations in spelling from standard names that are not themselves standard names in English language nomenclature, and hidden reference variants-variant spellings that identified publications in textword (tw) searches of PubMed or other databases, and which were not identified by textword searches for the standard names. Variant names were generated from standard names by applying letter substitutions, omissions, additions, transpositions, duplications, deduplications, and combinations of these. Searches were carried out in PubMed (30 June 2016) for "standard name[tw]" and "variant name[tw] NOT standard name[tw]." The 30 standard names of drugs in the test set gave 325 979 hits in total, and 160 hidden reference variants gave 3872 hits (1.17%). The standard names of the control set gave 470 064 hits, and 79 hidden reference variants gave 766 hits (0.16%). Letter substitutions (particularly i to y and vice versa) and omissions together accounted for 2924 (74%) of the variants. Amitriptyline (8530 hits) yielded 18 hidden reference variants (179 (2.1%) hits). Names ending in "in," "ine," or "micin" were commonly misspelt. Failing to search for hidden reference variants of "gentamicin," "amitriptyline," "mirtazapine," and "trazodone" would miss at least 19 systematic reviews. A hidden reference variant related to Christmas, "No-el", was rare; variants of "X-miss" were rarer. When performing searches, researchers should include misspellings of drug names among their search terms. Published by the BMJ Publishing Group Limited.
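
    The variant-generation step described in this record can be sketched as follows. This is a simplified illustration, not the authors' procedure: it applies only single edits (i/y substitution, omission of one letter, duplication of one letter, and transposition of adjacent letters), it omits arbitrary additions and combinations of edits, and the function names are hypothetical. The query string follows the "variant name[tw] NOT standard name[tw]" form quoted in the abstract.

```python
def spelling_variants(standard_name):
    """Generate single-edit misspellings of a drug name.

    Deduplication of a doubled letter is covered by the omission step,
    since dropping one of two identical adjacent letters is an omission.
    """
    name = standard_name.lower()
    swap = {"i": "y", "y": "i"}
    variants = set()
    for k, letter in enumerate(name):
        # Substitution: i to y and vice versa.
        if letter in swap:
            variants.add(name[:k] + swap[letter] + name[k + 1:])
        # Omission of one letter.
        variants.add(name[:k] + name[k + 1:])
        # Duplication of one letter.
        variants.add(name[:k] + letter + name[k:])
        # Transposition of this letter with the next one.
        if k + 1 < len(name):
            variants.add(name[:k] + name[k + 1] + letter + name[k + 2:])
    # Swapping two identical letters reproduces the standard name.
    variants.discard(name)
    return sorted(variants)


def hidden_variant_query(variant, standard_name):
    """Textword query for records that use the variant but not the standard name."""
    return f"{variant}[tw] NOT {standard_name}[tw]"
```

    For "amitriptyline" the generated list includes "amitryptyline" and "amitriptiline", and hidden_variant_query("amitryptyline", "amitriptyline") produces the corresponding search string. Whether a generated spelling is a hidden reference variant still has to be established by running the search.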

  5. Assessing the Spectrum of Germline Variation in Fanconi Anemia Genes among Patients with Head and Neck Carcinoma before Age 50

    PubMed Central

    Chandrasekharappa, Settara C.; Chinn, Steven B.; Donovan, Frank X.; Chowdhury, Naweed I.; Kamat, Aparna; Adeyemo, Adebowale A.; Thomas, James W.; Vemulapalli, Meghana; Hussey, Caroline S.; Reid, Holly H.; Mullikin, James C.; Wei, Qingyi; Sturgis, Erich M.

    2018-01-01

    Patients with Fanconi anemia (FA) have increased risk for head and neck squamous cell carcinoma (HNSCC). We sought to determine the prevalence of undiagnosed FA and FA carriers in patients with HNSCC and an age cutoff for FA genetic screening. Screening germline DNA from 417 HNSCC patients under age 50 revealed 194 FA gene variants in 185 patients (44%). The variant spectrum comprised 183 nonsynonymous point mutations, nine indels, one large deletion, and one synonymous variant predicted to affect splicing. 108 patients (26%) had at least one rare variant predicted to be damaging, and 57 (14%) had at least one rare variant predicted to be damaging and previously reported. Fifteen patients carried two rare variants, or an X-linked variant, in an FA gene. Overall, we did not identify an age cutoff for FA screening among young HNSCC patients, as there were no significant differences in mutation rates when patients were stratified by age, tumor site, ethnicity, smoking status, or human papillomavirus status. However, we observed an increased burden, or mutation load, of FA gene variants in FANCD2, FANCE, and FANCL in our HNSCC patient cohort relative to the 1000 Genomes population. FANCE and FANCL, components of the core complex, are known to be responsible for the recruitment and ubiquitination, respectively, of FANCD2, a critical step in the FA DNA repair pathway. FA germline functional variants offer a novel area of study in HNSCC tumorigenesis, and the increased mutation burden of critical genes indicates the importance of the FA pathway in HNSCC. PMID:28678401

  6. Comprehensive Maturity Onset Diabetes of the Young (MODY) Gene Screening in Pregnant Women with Diabetes in India.

    PubMed

    Doddabelavangala Mruthyunjaya, Mahesh; Chapla, Aaron; Hesarghatta Shyamasunder, Asha; Varghese, Deny; Varshney, Manika; Paul, Johan; Inbakumari, Mercy; Christina, Flory; Varghese, Ron Thomas; Kuruvilla, Kurien Anil; V Paul, Thomas; Jose, Ruby; Regi, Annie; Lionel, Jessie; Jeyaseelan, L; Mathew, Jiji; Thomas, Nihal

    2017-01-01

    Pregnant women with diabetes may have underlying beta cell dysfunction due to mutations/rare variants in genes associated with Maturity Onset Diabetes of the Young (MODY). MODY gene screening would reveal those women genetically predisposed and previously unrecognized with a monogenic form of diabetes for further clinical management, family screening and genetic counselling. However, there are minimal data available on MODY gene variants in pregnant women with diabetes from India. In this study, fifty subjects were screened for variants in a panel of thirteen MODY genes using a next-generation sequencing (NGS) based protocol. Of these subjects 18% (9/50) were positive for definite or likely pathogenic or uncertain MODY variants. The majority of these variants were identified in subjects with autosomal dominant family history, of whom five were in women with pre-GDM and four with overt-GDM. The identified variants included one patient with HNF1A Ser3Cys, two PDX1 Glu224Lys, His94Gln, two NEUROD1 Glu59Gln, Phe318Ser, one INS Gly44Arg, one GCK, one ABCC8 Arg620Cys and one BLK Val418Met variants. In addition, three of the seven offspring screened were positive for the identified variant. These identified variants were further confirmed by Sanger sequencing. In conclusion, these findings in pregnant women with diabetes imply that a proportion of GDM patients with autosomal dominant family history may have MODY. Further NGS based comprehensive studies with larger samples are required to confirm these findings.

  7. Comprehensive Maturity Onset Diabetes of the Young (MODY) Gene Screening in Pregnant Women with Diabetes in India

    PubMed Central

    Hesarghatta Shyamasunder, Asha; Varghese, Deny; Varshney, Manika; Paul, Johan; Inbakumari, Mercy; Christina, Flory; Varghese, Ron Thomas; Kuruvilla, Kurien Anil; V. Paul, Thomas; Jose, Ruby; Regi, Annie; Lionel, Jessie; Jeyaseelan, L.; Mathew, Jiji; Thomas, Nihal

    2017-01-01

    Pregnant women with diabetes may have underlying beta cell dysfunction due to mutations/rare variants in genes associated with Maturity Onset Diabetes of the Young (MODY). MODY gene screening would reveal those women genetically predisposed and previously unrecognized with a monogenic form of diabetes for further clinical management, family screening and genetic counselling. However, there are minimal data available on MODY gene variants in pregnant women with diabetes from India. In this study, fifty subjects were screened for variants in a panel of thirteen MODY genes using a next-generation sequencing (NGS) based protocol. Of these subjects 18% (9/50) were positive for definite or likely pathogenic or uncertain MODY variants. The majority of these variants were identified in subjects with autosomal dominant family history, of whom five were in women with pre-GDM and four with overt-GDM. The identified variants included one patient with HNF1A Ser3Cys, two PDX1 Glu224Lys, His94Gln, two NEUROD1 Glu59Gln, Phe318Ser, one INS Gly44Arg, one GCK, one ABCC8 Arg620Cys and one BLK Val418Met variants. In addition, three of the seven offspring screened were positive for the identified variant. These identified variants were further confirmed by Sanger sequencing. In conclusion, these findings in pregnant women with diabetes imply that a proportion of GDM patients with autosomal dominant family history may have MODY. Further NGS based comprehensive studies with larger samples are required to confirm these findings. PMID:28095440

  8. Joint associations between genetic variants and reproductive factors in glioma risk among women.

    PubMed

    Wang, Sophia S; Hartge, Patricia; Yeager, Meredith; Carreón, Tania; Ruder, Avima M; Linet, Martha; Inskip, Peter D; Black, Amanda; Hsing, Ann W; Alavanja, Michael; Beane-Freeman, Laura; Safaiean, Mahboobeh; Chanock, Stephen J; Rajaraman, Preetha

    2011-10-15

    In a pooled analysis of 4 US epidemiologic studies (1993-2001), the authors evaluated the role of 5 female reproductive factors in 357 women with glioma and 822 controls. The authors further evaluated the independent association between 5 implicated gene variants and glioma risk among the study population, as well as the joint associations of female reproductive factors (ages at menarche and menopause, menopausal status, use of oral contraceptives, and menopausal hormone therapy) and these gene variants on glioma risk. Risk estimates were calculated as odds ratios and 95% confidence intervals that were adjusted for age, race, and study. Three of the gene variants (rs4295627, a variant of CCDC26; rs4977756, a variant of CDKN2A and CDKN2B; and rs6010620, a variant of RTEL1) were statistically significantly associated with glioma risk in the present population. Compared with women who had an early age at menarche (<12 years of age), those who reported menarche at 12-13 years of age or at 14 years of age or older had a 1.7-fold higher risk and a 1.9-fold higher risk of glioma, respectively (P for trend = 0.009). Postmenopausal women and women who reported ever having used oral contraceptives had a decreased risk of glioma. The authors did not observe joint associations between these reproductive characteristics and the implicated glioma gene variants. These results require replication, but if confirmed, they would suggest that the gene variants that have previously been implicated in the development of glioma are unlikely to act through the same hormonal mechanisms in women.

  9. The role of ghrelin and ghrelin-receptor gene variants and promoter activity in type 2 diabetes.

    PubMed

    Garcia, Edwin A; King, Peter; Sidhu, Kally; Ohgusu, Hideko; Walley, Andrew; Lecoeur, Cecile; Gueorguiev, Maria; Khalaf, Sahira; Davies, Derek; Grossman, Ashley B; Kojima, Masayasu; Petersenn, Stephan; Froguel, Phillipe; Korbonits, Márta

    2009-08-01

    Ghrelin and its receptor play an important role in glucose metabolism and energy homeostasis, and therefore they are functional candidates for genes carrying susceptibility alleles for type 2 diabetes. We assessed common genetic variation of the ghrelin (GHRL; five single nucleotide polymorphisms (SNP)) and the ghrelin-receptor (GHSR) genes (four SNPs) in 610 Caucasian patients with type 2 diabetes and 820 controls. In addition, promoter reporter assays were conducted to model the regulatory regions of both genes. Neither GHRL nor GHSR gene SNPs were associated with type 2 diabetes. One of the ghrelin haplotypes showed a marginal protective role in type 2 diabetes. We observed profound differences in the regulation of the GHRL gene according to promoter sequence variants. There are three different GHRL promoter haplotypes represented in the studied cohort causing up to 45% difference in the level of gene expression, while the promoter region of GHSR gene is primarily represented by a single haplotype. The GHRL and GHSR gene variants are not associated with type 2 diabetes, although GHRL promoter variants have significantly different activities.

  10. Effect of polymorphic variants of GH, Pit-1, and beta-LG genes on milk production of Holstein cows.

    PubMed

    Heidari, M; Azari, M A; Hasani, S; Khanahmadi, A; Zerehdaran, S

    2012-04-01

    Effect of polymorphic variants of growth hormone (GH), beta-lactoglobulin (beta-LG), and Pit-1 genes on milk yield was analyzed in a Holstein herd. Genotypes of the cows for these genes were determined by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) method. Allele frequencies were 0.884 and 0.116 for L and V variants of GH, 0.170 and 0.830 for A and B variants of Pit-1, and 0.529 and 0.471 for A and B variants of beta-LG, respectively. GLM procedure of SAS software was used to test the effects of these genes on milk yield. Results indicated significant effects of these genes on milk yield (P < 0.05). Cows with LL genotype of GH produced more milk than cows with LV genotype (P < 0.05). Also, for Pit-1 gene, animals with AB genotype produced more milk than BB genotype (P < 0.05). In the case of beta-LG gene, milk yield of animals with AA genotype was more than BB genotype (P < 0.01). Therefore, it might be concluded that homozygote genotypes of GH (LL) and beta-LG (AA) were superior compared to heterozygote genotypes, whereas, the heterozygote genotype of Pit-1 gene (AB) was desirable.
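
Allele frequencies such as those reported above are usually obtained by counting alleles across genotyped animals. The sketch below shows that counting step for a biallelic locus. The function name and the genotype counts are hypothetical, since the abstract gives only the resulting frequencies and not the underlying counts.

```python
def allele_frequencies(n_hom_a, n_het, n_hom_b):
    """Frequencies of alleles A and B from genotype counts AA, AB, BB."""
    total_alleles = 2 * (n_hom_a + n_het + n_hom_b)
    # Each AA animal carries two A alleles; each AB animal carries one.
    freq_a = (2 * n_hom_a + n_het) / total_alleles
    return freq_a, 1.0 - freq_a

# Hypothetical herd: 60 AA, 30 AB, 10 BB (illustrative, not the study's data).
freq_a, freq_b = allele_frequencies(60, 30, 10)
print(f"A = {freq_a:.3f}, B = {freq_b:.3f}")
```

By construction the two frequencies sum to 1, as do each of the three pairs reported in the abstract.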

  11. Walking the interactome for candidate prioritization in exome sequencing studies of Mendelian diseases

    DOE PAGES

    Smedley, Damian; Kohler, Sebastian; Czeschik, Johanna Christina; ...

    2014-07-30

    Here, whole-exome sequencing (WES) has opened up previously unheard of possibilities for identifying novel disease genes in Mendelian disorders, only about half of which have been elucidated to date. However, interpretation of WES data remains challenging. As a result, we analyze protein–protein association (PPA) networks to identify candidate genes in the vicinity of genes previously implicated in a disease. The analysis, using a random-walk with restart (RWR) method, is adapted to the setting of WES by developing a composite variant-gene relevance score based on the rarity, location and predicted pathogenicity of variants and the RWR evaluation of genes harboring the variants. Benchmarking using known disease variants from 88 disease-gene families reveals that the correct gene is ranked among the top 10 candidates in ≥50% of cases, a figure which we confirmed using a prospective study of disease genes identified in 2012 and PPA data produced before that date. In conclusion, we implement our method in a freely available Web server, ExomeWalker, that displays a ranked list of candidates together with information on PPAs, frequency and predicted pathogenicity of the variants to allow quick and effective searches for candidates that are likely to reward closer investigation.
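
The random-walk with restart (RWR) step named in the abstract can be sketched generically as follows. This is a minimal illustration of the standard RWR iteration on a small undirected network, not the ExomeWalker implementation. The function name, the restart probability of 0.7, and the four-node toy network are all assumptions made for the example, and the composite variant-gene score described in the abstract is not reproduced.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7,
                             tol=1e-12, max_iter=10_000):
    """Steady-state visiting probability of each node, restarting at the seeds."""
    n = len(adjacency)
    # Column sums give node degrees (the network is assumed undirected).
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        p_next = []
        for i in range(n):
            # Probability mass arriving at node i from its neighbours.
            walk = sum(adjacency[i][j] / col_sums[j] * p[j]
                       for j in range(n) if col_sums[j] > 0)
            p_next.append((1 - restart) * walk + restart * p0[i])
        if sum(abs(a - b) for a, b in zip(p_next, p)) < tol:
            return p_next
        p = p_next
    return p

# Hypothetical toy network: a chain 0 - 1 - 2 - 3, with node 0 as the
# known disease gene (seed).
toy_network = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(toy_network, seeds=[0])
print([round(s, 4) for s in scores])
```

In this toy case the score falls with network distance from the seed, which is the property that lets RWR rank candidate genes by their proximity to genes already implicated in a disease.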

  12. Walking the interactome for candidate prioritization in exome sequencing studies of Mendelian diseases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smedley, Damian; Kohler, Sebastian; Czeschik, Johanna Christina

    Here, whole-exome sequencing (WES) has opened up previously unheard of possibilities for identifying novel disease genes in Mendelian disorders, only about half of which have been elucidated to date. However, interpretation of WES data remains challenging. As a result, we analyze protein–protein association (PPA) networks to identify candidate genes in the vicinity of genes previously implicated in a disease. The analysis, using a random-walk with restart (RWR) method, is adapted to the setting of WES by developing a composite variant-gene relevance score based on the rarity, location and predicted pathogenicity of variants and the RWR evaluation of genes harboring the variants. Benchmarking using known disease variants from 88 disease-gene families reveals that the correct gene is ranked among the top 10 candidates in ≥50% of cases, a figure which we confirmed using a prospective study of disease genes identified in 2012 and PPA data produced before that date. In conclusion, we implement our method in a freely available Web server, ExomeWalker, that displays a ranked list of candidates together with information on PPAs, frequency and predicted pathogenicity of the variants to allow quick and effective searches for candidates that are likely to reward closer investigation.

  13. Confirmation of chromosomal microarray as a first-tier clinical diagnostic test for individuals with developmental delay, intellectual disability, autism spectrum disorders and dysmorphic features.

    PubMed

    Battaglia, Agatino; Doccini, Viola; Bernardini, Laura; Novelli, Antonio; Loddo, Sara; Capalbo, Anna; Filippi, Tiziana; Carey, John C

    2013-11-01

    Submicroscopic chromosomal rearrangements are the most common identifiable causes of intellectual disability and autism spectrum disorders associated with dysmorphic features. Chromosomal microarray (CMA) can detect copy number variants <1 Mb and identifies size and presence of known genes. The aim of this study was to demonstrate the usefulness of CMA, as a first-tier tool in detecting the etiology of unexplained intellectual disability/autism spectrum disorders (ID/ASDs) associated with dysmorphic features in a large cohort of pediatric patients. We studied 349 individuals; 223 males, 126 females, aged 5 months-19 years. Blood samples were analyzed with CMA at a resolution ranging from 1 Mb to 40 Kb. The imbalance was confirmed by FISH or qPCR. We considered copy number variants (CNVs) causative if the variant was responsible for a known syndrome, encompassed gene/s of known function, occurred de novo or, if inherited, the parent was variably affected, and/or the involved gene/s had been reported in association with ID/ASDs in dedicated databases. 91 CNVs were detected in 77 (22.06%) patients: 5 (6.49%) of those presenting with borderline cognitive impairment, 54 (70.13%) with a variable degree of DD/ID, and 18/77 (23.38%) with ID of variable degree and ASDs. 16/77 (20.8%) patients had two different rearrangements. Deletions exceeded duplications (58 versus 33); 45.05% (41/91) of the detected CNVs were de novo, 45.05% (41/91) inherited, and 9.9% (9/91) unknown. The CNVs caused the phenotype in 57/77 (74%) patients; 12/57 (21.05%) had ASDs/ID, and 45/57 (78.95%) had DD/ID. Our study provides further evidence of the high diagnostic yield of CMA for genetic testing in children with unexplained ID/ASDs who had dysmorphic features. We confirm the value of CMA as the first-tier tool in the assessment of those conditions in the pediatric setting. Copyright © 2013 European Paediatric Neurology Society. Published by Elsevier Ltd. All rights reserved.
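
The percentages in the abstract above follow directly from the stated counts. The short check below recomputes them. The helper name and labels are illustrative, while the counts are those given in the abstract.

```python
def pct(numerator, denominator):
    """Percentage rounded to two decimal places."""
    return round(100 * numerator / denominator, 2)

# (label, numerator, denominator), with counts as stated in the abstract.
counts = [
    ("patients with CNVs", 77, 349),
    ("borderline cognitive impairment", 5, 77),
    ("DD/ID", 54, 77),
    ("ID with ASDs", 18, 77),
    ("two rearrangements", 16, 77),
    ("de novo CNVs", 41, 91),
    ("inherited CNVs", 41, 91),
    ("CNVs of unknown origin", 9, 91),
    ("phenotype explained by CNV", 57, 77),
    ("explained, ASDs/ID", 12, 57),
    ("explained, DD/ID", 45, 57),
]
recomputed = {label: pct(n, d) for label, n, d in counts}
for label, value in recomputed.items():
    print(f"{label}: {value}%")

# Subgroup counts should add up to their stated totals.
assert 58 + 33 == 91        # deletions + duplications = CNVs
assert 41 + 41 + 9 == 91    # de novo + inherited + unknown = CNVs
assert 5 + 54 + 18 == 77    # clinical subgroups = patients with CNVs
assert 12 + 45 == 57        # subgroups of patients with a causative CNV
```

The recomputed values agree with the abstract to the precision it reports (for example, 9/91 is 9.89%, given there as 9.9%).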

  14. Population- and individual-specific regulatory variation in Sardinia.

    PubMed

    Pala, Mauro; Zappala, Zachary; Marongiu, Mara; Li, Xin; Davis, Joe R; Cusano, Roberto; Crobu, Francesca; Kukurba, Kimberly R; Gloudemans, Michael J; Reinier, Frederic; Berutti, Riccardo; Piras, Maria G; Mulas, Antonella; Zoledziewska, Magdalena; Marongiu, Michele; Sorokin, Elena P; Hess, Gaelen T; Smith, Kevin S; Busonero, Fabio; Maschio, Andrea; Steri, Maristella; Sidore, Carlo; Sanna, Serena; Fiorillo, Edoardo; Bassik, Michael C; Sawcer, Stephen J; Battle, Alexis; Novembre, John; Jones, Chris; Angius, Andrea; Abecasis, Gonçalo R; Schlessinger, David; Cucca, Francesco; Montgomery, Stephen B

    2017-05-01

    Genetic studies of complex traits have mainly identified associations with noncoding variants. To further determine the contribution of regulatory variation, we combined whole-genome and transcriptome data for 624 individuals from Sardinia to identify common and rare variants that influence gene expression and splicing. We identified 21,183 expression quantitative trait loci (eQTLs) and 6,768 splicing quantitative trait loci (sQTLs), including 619 new QTLs. We identified high-frequency QTLs and found evidence of selection near genes involved in malarial resistance and increased multiple sclerosis risk, reflecting the epidemiological history of Sardinia. Using family relationships, we identified 809 segregating expression outliers (median z score of 2.97), averaging 13.3 genes per individual. Outlier genes were enriched for proximal rare variants, providing a new approach to study large-effect regulatory variants and their relevance to traits. Our results provide insight into the effects of regulatory variants and their relationship to population history and individual genetic risk.

  15. Genetic variants of ghrelin in metabolic disorders.

    PubMed

    Ukkola, Olavi

    2011-11-01

    An increasing understanding of the role of genes in the development of obesity may reveal genetic variants that, in combination with conventional risk factors, may help to predict an individual's risk for developing metabolic disorders. Accumulating evidence indicates that ghrelin plays a role in regulating food intake and energy homeostasis and it is a reasonable candidate gene for obesity-related co-morbidities. In cross-sectional studies low total ghrelin concentrations and some genetic polymorphisms of ghrelin have been associated with obesity-associated diseases. The present review highlights many of the important problems in association studies of genetic variants and complex diseases. It is known that population-specific differences in reported associations exist. We therefore conclude that more studies on variants of the ghrelin gene need to be performed in different populations to gain a deeper understanding of the relationship of the ghrelin gene and its variants to obesity. Copyright © 2011 Elsevier Inc. All rights reserved.

  16. Spectrum and Frequency of the GJB2 Gene Pathogenic Variants in a Large Cohort of Patients with Hearing Impairment Living in a Subarctic Region of Russia (the Sakha Republic)

    PubMed Central

    Posukh, Olga L.; Teryutin, Fedor M.; Solovyev, Aisen V.; Klarov, Leonid A.; Romanov, Georgii P.; Gotovtsev, Nyurgun N.; Kozhevnikov, Andrey A.; Kirillina, Elena V.; Sidorova, Oksana G.; Vasilyeva, Lena M.; Fedotova, Elvira E.; Morozov, Igor V.; Bondar, Alexander A.; Solovyeva, Natalya A.; Kononova, Sardana K.; Rafailov, Adyum M.; Sazonov, Nikolay N.; Alekseev, Anatoliy N.; Tomsky, Mikhail I.; Dzhemileva, Lilya U.; Khusnutdinova, Elza K.; Fedorova, Sardana A.

    2016-01-01

    Pathogenic variants in the GJB2 gene, encoding connexin 26, are known to be a major cause of hearing impairment (HI). More than 300 allelic variants have been identified in the GJB2 gene. Spectrum and allelic frequencies of the GJB2 gene vary significantly among different ethnic groups worldwide. Until now, the spectrum and frequency of the pathogenic variants in exon 1, exon 2 and the flanking intronic regions of the GJB2 gene have not been described thoroughly in the Sakha Republic (Yakutia), which is located in a subarctic region in Russia. The complete sequencing of the non-coding and coding regions of the GJB2 gene was performed in 393 patients with HI (Yakuts, n = 296; Russians, n = 51; mixed and other ethnicities, n = 46) and in 187 normal hearing individuals of Yakut (n = 107) and Russian (n = 80) populations. In the total sample (n = 580), we revealed 12 allelic variants of the GJB2 gene, 8 of which were recessive pathogenic variants. Ten genotypes with biallelic recessive pathogenic variants in the GJB2 gene (in a homozygous or a compound heterozygous state) were found in 192 out of 393 patients (48.85%). We found that the most frequent GJB2 pathogenic variant in the Yakut patients was c.-23+1G>A (51.82%) and that the second most frequent was c.109G>A (2.37%), followed by c.35delG (1.64%). Pathogenic variants c.35delG (22.34%), c.-23+1G>A (5.31%), and c.313_326del14 (2.12%) were found to be the most frequent among the Russian patients. The carrier frequencies of the c.-23+1G>A and c.109G>A pathogenic variants in the Yakut control group were 10.20% and 2.80%, respectively. The carrier frequencies of c.35delG and c.101T>C were identical (2.5%) in the Russian control group. We found that the contribution of the GJB2 gene pathogenic variants in HI in the population of the Sakha Republic (48.85%) was the highest among all of the previously studied regions of Asia. We suggest that extensive accumulation of the c.-23+1G>A pathogenic variant in the indigenous Yakut population (92.20% of all mutant chromosomes in patients) and an extremely high (10.20%) carrier frequency in the control group may indicate a possible selective advantage for the c.-23+1G>A carriers living in subarctic climate. PMID:27224056

  17. Exceptions to the rule: case studies in the prediction of pathogenicity for genetic variants in hereditary cancer genes.

    PubMed

    Rosenthal, E T; Bowles, K R; Pruss, D; van Kan, A; Vail, P J; McElroy, H; Wenstrup, R J

    2015-12-01

    Based on current consensus guidelines and standard practice, many genetic variants detected in clinical testing are classified as disease causing based on their predicted impact on the normal expression or function of the gene in the absence of additional data. However, our laboratory has identified a subset of such variants in hereditary cancer genes for which compelling contradictory evidence emerged after the initial evaluation following the first observation of the variant. Three representative examples of variants in BRCA1, BRCA2 and MSH2 that are predicted to disrupt splicing, prematurely truncate the protein, or remove the start codon were evaluated for pathogenicity by analyzing clinical data with multiple classification algorithms. Available clinical data for all three variants contradicts the expected pathogenic classification. These variants illustrate potential pitfalls associated with standard approaches to variant classification as well as the challenges associated with monitoring data, updating classifications, and reporting potentially contradictory interpretations to the clinicians responsible for translating test outcomes to appropriate clinical action. It is important to address these challenges now as the model for clinical testing moves toward the use of large multi-gene panels and whole exome/genome analysis, which will dramatically increase the number of genetic variants identified. © 2015 The Authors. Clinical Genetics published by John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  18. A novel start codon mutation of the MERTK gene in a patient with retinitis pigmentosa

    PubMed Central

    Jinda, Worapoj; Poungvarin, Naravat; Taylor, Todd D.; Suzuki, Yutaka; Thongnoppakhun, Wanna; Limwongse, Chanin; Lertrit, Patcharee; Suriyaphol, Prapat

    2016-01-01

    Purpose: Retinitis pigmentosa (RP) is a clinically and genetically heterogeneous group of inherited retinal degenerations characterized by progressive loss of photoreceptor cells and RPE functions. More than 70 causative genes are known to be responsible for RP. This study aimed to identify the causative gene in a patient from a consanguineous family with childhood-onset severe retinal dystrophy. Methods: To identify the defective gene, whole exome sequencing was performed. Candidate causative variants were selected and validated using Sanger sequencing. Segregation analysis of the causative gene was performed in additional family members. To verify that the mutation has an effect on protein synthesis, an expression vector containing the first ten amino acids of the mutant protein fused with the DsRed2 fluorescent protein was constructed and transfected into HEK293T cells. Expression of the fusion protein in the transfected cells was measured using fluorescence microscopy. Results: By filtering against public variant databases, a novel homozygous missense mutation (c.3G>A) localized in the start codon of the MERTK gene was detected as a potentially pathogenic mutation for autosomal recessive RP. The c.3G>A mutation cosegregated with the disease phenotype in the family. No expression of the first ten amino acids of the MerTK mutant fused with the DsRed2 fluorescent protein was detected in HEK293T cells, indicating that the mutation affects the translation initiation site of the gene that may lead to loss of function of the MerTK signaling pathway. Conclusions: We report a novel missense mutation (c.3G>A, p.0?) in the MERTK gene that causes severe vision impairment in a patient. Taken together with previous reports, our results expand the spectrum of MERTK mutations and extend our understanding of the role of the MerTK protein in the pathogenesis of retinitis pigmentosa. PMID:27122965

  19. Using whole-exome sequencing to identify variants inherited from mosaic parents

    PubMed Central

    Rios, Jonathan J; Delgado, Mauricio R

    2015-01-01

    Whole-exome sequencing (WES) has allowed the discovery of genes and variants causing rare human disease. This is often achieved by comparing nonsynonymous variants between unrelated patients, and particularly for sporadic or recessive disease, often identifies a single or few candidate genes for further consideration. However, despite the potential for this approach to elucidate the genetic cause of rare human disease, a majority of patients fail to realize a genetic diagnosis using standard exome analysis methods. Although genetic heterogeneity contributes to the difficulty of exome sequence analysis between patients, it remains plausible that rare human disease is not caused by de novo or recessive variants. Multiple human disorders have been described for which the variant was inherited from a phenotypically normal mosaic parent. Here we highlight the potential for exome sequencing to identify a reasonable number of candidate genes when dominant disease variants are inherited from a mosaic parent. We show the power of WES to identify a limited number of candidate genes using this disease model and how sequence coverage affects identification of mosaic variants by WES. We propose this analysis as an alternative to discover genetic causes of rare human disorders for which typical WES approaches fail to identify likely pathogenic variants. PMID:24986828

  20. Common variants of the EPDR1 gene and the risk of Dupuytren’s disease.

    PubMed

    Dębniak, T; Żyluk, A; Puchalski, P; Serrano-Fernandez, P

    2013-10-01

    The object of this study was the investigation of 3 common variants of single nucleotide polymorphisms of the ependymin-related gene 1 and its association with the occurrence of Dupuytren's disease. DNA samples were obtained from the peripheral blood of 508 consecutive patients. The control group comprised 515 healthy adults who were age-matched with the Dupuytren's patients. 3 common variants were analysed using TaqMan® genotyping assays and sequencing. The differences in the frequencies of variants of single nucleotide polymorphisms in patients and the control group were statistically tested. Additionally, haplotype frequency and linkage disequilibrium were analysed for these variants. A statistically significant association was noted between rs16879765_CT, rs16879765_TT and rs13240429_AA variants and Dupuytren's disease. 2 haplotypes: rs2722280_C+rs13240429_A+rs16879765_C and rs2722280_C+rs13240429_G+rs16879765_T were found to be statistically significantly associated with Dupuytren's disease. Moreover, we found that rs13240429 and rs16879765 variants were in strong linkage disequilibrium, while rs2722280 was only in moderate linkage disequilibrium. No significant differences were found in the frequencies of the variants of the gene between the groups with a positive and negative familial history of Dupuytren's disease. In conclusion, results of this study suggest that EPDR1 gene can be added to a growing list of genes associated with Dupuytren's disease development. © Georg Thieme Verlag KG Stuttgart · New York.

  1. Negligible impact of rare autoimmune-locus coding-region variants on missing heritability.

    PubMed

    Hunt, Karen A; Mistry, Vanisha; Bockett, Nicholas A; Ahmad, Tariq; Ban, Maria; Barker, Jonathan N; Barrett, Jeffrey C; Blackburn, Hannah; Brand, Oliver; Burren, Oliver; Capon, Francesca; Compston, Alastair; Gough, Stephen C L; Jostins, Luke; Kong, Yong; Lee, James C; Lek, Monkol; MacArthur, Daniel G; Mansfield, John C; Mathew, Christopher G; Mein, Charles A; Mirza, Muddassar; Nutland, Sarah; Onengut-Gumuscu, Suna; Papouli, Efterpi; Parkes, Miles; Rich, Stephen S; Sawcer, Steven; Satsangi, Jack; Simmonds, Matthew J; Trembath, Richard C; Walker, Neil M; Wozniak, Eva; Todd, John A; Simpson, Michael A; Plagnol, Vincent; van Heel, David A

    2013-06-13

    Genome-wide association studies (GWAS) have identified common variants of modest-effect size at hundreds of loci for common autoimmune diseases; however, a substantial fraction of heritability remains unexplained, to which rare variants may contribute. To discover rare variants and test them for association with a phenotype, most studies re-sequence a small initial sample size and then genotype the discovered variants in a larger sample set. This approach fails to analyse a large fraction of the rare variants present in the entire sample set. Here we perform simultaneous amplicon-sequencing-based variant discovery and genotyping for coding exons of 25 GWAS risk genes in 41,911 UK residents of white European origin, comprising 24,892 subjects with six autoimmune disease phenotypes and 17,019 controls, and show that rare coding-region variants at known loci have a negligible role in common autoimmune disease susceptibility. These results do not support the rare-variant synthetic genome-wide-association hypothesis (in which unobserved rare causal variants lead to association detected at common tag variants). Many known autoimmune disease risk loci contain multiple, independently associated, common and low-frequency variants, and so genes at these loci are a priori stronger candidates for harbouring rare coding-region variants than other genes. Our data indicate that the missing heritability for common autoimmune diseases may not be attributable to the rare coding-region variant portion of the allelic spectrum, but perhaps, as others have proposed, may be a result of many common-variant loci of weak effect.

  2. An automatic and efficient pipeline for disease gene identification through utilizing family-based sequencing data.

    PubMed

    Song, Dandan; Li, Ning; Liao, Lejian

    2015-01-01

    Because they generate enormous amounts of data at lower cost and in less time, whole-exome sequencing technologies provide dramatic opportunities for identifying disease genes implicated in Mendelian disorders. Since many thousands of genomic variants can be sequenced in each exome, it is challenging to filter pathogenic variants in protein coding regions and reduce the number of missed true variants. Therefore, an automatic and efficient pipeline for finding disease variants in Mendelian disorders is designed by exploiting a combination of variant filtering steps to analyze family-based exome sequencing data. Recent studies on the Freeman-Sheldon disease are revisited and show that the proposed method outperforms other existing candidate gene identification methods.

  3. Genome-wide genetic analyses highlight mitogen-activated protein kinase (MAPK) signaling in the pathogenesis of endometriosis.

    PubMed

    Uimari, Outi; Rahmioglu, Nilufer; Nyholt, Dale R; Vincent, Katy; Missmer, Stacey A; Becker, Christian; Morris, Andrew P; Montgomery, Grant W; Zondervan, Krina T

    2017-04-01

    Do genome-wide association study (GWAS) data for endometriosis provide insight into novel biological pathways associated with its pathogenesis? GWAS analysis uncovered multiple pathways that are statistically enriched for genetic association signals, analysis of Stage A disease highlighted a novel variant in MAP3K4, while top pathways significantly associated with all endometriosis and Stage A disease included several mitogen-activated protein kinase (MAPK)-related pathways. Endometriosis is a complex disease with an estimated heritability of 50%. To date, GWAS revealed 10 genomic regions associated with endometriosis, explaining <4% of heritability, while half of the heritability is estimated to be due to common risk variants. Pathway analyses combine the evidence of single variants into gene-based measures, leveraging the aggregate effect of variants in genes and uncovering biological pathways involved in disease pathogenesis. Pathway analysis was conducted utilizing the International Endogene Consortium GWAS data, comprising 3194 surgically confirmed endometriosis cases and 7060 controls of European ancestry with genotype data imputed up to 1000 Genomes Phase three reference panel. GWAS was performed for all endometriosis cases and for Stage A (revised American Fertility Society (rAFS) I/II, n = 1686) and B (rAFS III/IV, n = 1364) cases separately. The identified significant pathways were compared with pathways previously investigated in the literature through candidate association studies. The most comprehensive biological pathway databases, MSigDB (including BioCarta, KEGG, PID, SA, SIG, ST and GO) and PANTHER were utilized to test for enrichment of genetic variants associated with endometriosis. Statistical enrichment analysis was performed using the MAGENTA (Meta-Analysis Gene-set Enrichment of variaNT Associations) software. The first genome-wide association analysis for Stage A endometriosis revealed a novel locus, rs144240142 (P = 6.45 × 10^-8, OR = 1.71, 95% CI = 1.23-2.37), an intronic single-nucleotide polymorphism (SNP) within MAP3K4. This SNP was not associated with Stage B disease (P = 0.086). MAP3K4 was also shown to be differentially expressed in eutopic endometrium between Stage A endometriosis cases and controls (P = 3.8 × 10^-4), but not with Stage B disease (P = 0.26). A total of 14 pathways enriched with genetic endometriosis associations were identified (false discovery rate (FDR)-P < 0.05). The pathways associated with any endometriosis were Grb2-Sos provides linkage to MAPK signaling for integrins pathway (P = 2.8 × 10^-5, FDR-P = 3.0 × 10^-3), Wnt signaling (P = 0.026, FDR-P = 0.026) and p130Cas linkage to MAPK signaling for integrins pathway (P = 6.0 × 10^-4, FDR-P = 0.029); with Stage A endometriosis: extracellular signal-regulated kinase (ERK)1 ERK2 MAPK (P = 5.0 × 10^-4, FDR-P = 5.0 × 10^-4) and with Stage B endometriosis: two overlapping pathways that related to extracellular matrix biology, Core matrisome (P = 1.4 × 10^-3, FDR-P = 0.013) and ECM glycoproteins (P = 1.8 × 10^-3, FDR-P = 7.1 × 10^-3). Genes arising from endometriosis candidate gene studies performed to date were enriched for Interleukin signaling pathway (P = 2.3 × 10^-12), Apoptosis signaling pathway (P = 9.7 × 10^-9) and Gonadotropin releasing hormone receptor pathway (P = 1.2 × 10^-6); however, these pathways did not feature in the results based on GWAS data. Not applicable. The analysis is restricted to (i) variants in/near genes that can be assigned to pathways, excluding intergenic variants; (ii) the gene-based pathway definition as registered in the databases; (iii) women of European ancestry. The top ranked pathways associated with overall and Stage A endometriosis in particular involve integrin-mediated MAPK activation and intracellular ERK/MAPK acting downstream in the MAPK cascade, both acting in the control of cell division, gene expression, cell movement and survival. Other top enriched pathways in Stage B disease include ECM glycoprotein pathways important for extracellular structure and biochemical support. The results highlight the need for increased efforts to understand the functional role of these pathways in endometriosis pathogenesis, including the investigation of the biological effects of the genetic variants on downstream molecular processes in tissue relevant to endometriosis. Additionally, our results offer further support for the hypothesis of at least partially distinct causal pathophysiology for minimal/mild (rAFS I/II) vs. moderate/severe (rAFS III/IV) endometriosis. The genome-wide association data and Wellcome Trust Case Control Consortium (WTCCC) were generated through funding from the Wellcome Trust (WT084766/Z/08/Z, 076113 and 085475) and the National Health and Medical Research Council (NHMRC) of Australia (241944, 339462, 389927, 389875, 389891, 389892, 389938, 443036, 442915, 442981, 496610, 496739, 552485 and 552498). N.R. was funded by a grant from the Medical Research Council UK (MR/K011480/1). A.P.M. is a Wellcome Trust Senior Fellow in Basic Biomedical Science (grant WT098017). All authors declare there are no conflicts of interest. © The Author 2017. Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology.

  4. Genome-wide genetic analyses highlight mitogen-activated protein kinase (MAPK) signaling in the pathogenesis of endometriosis

    PubMed Central

    Uimari, Outi; Rahmioglu, Nilufer; Nyholt, Dale R.; Vincent, Katy; Missmer, Stacey A.; Becker, Christian; Morris, Andrew P.; Montgomery, Grant W.

    2017-01-01

    STUDY QUESTION: Do genome-wide association study (GWAS) data for endometriosis provide insight into novel biological pathways associated with its pathogenesis? SUMMARY ANSWER: GWAS analysis uncovered multiple pathways that are statistically enriched for genetic association signals, analysis of Stage A disease highlighted a novel variant in MAP3K4, while top pathways significantly associated with all endometriosis and Stage A disease included several mitogen-activated protein kinase (MAPK)-related pathways. WHAT IS KNOWN ALREADY: Endometriosis is a complex disease with an estimated heritability of 50%. To date, GWAS revealed 10 genomic regions associated with endometriosis, explaining <4% of heritability, while half of the heritability is estimated to be due to common risk variants. Pathway analyses combine the evidence of single variants into gene-based measures, leveraging the aggregate effect of variants in genes and uncovering biological pathways involved in disease pathogenesis. STUDY DESIGN, SIZE, DURATION: Pathway analysis was conducted utilizing the International Endogene Consortium GWAS data, comprising 3194 surgically confirmed endometriosis cases and 7060 controls of European ancestry with genotype data imputed up to 1000 Genomes Phase three reference panel. GWAS was performed for all endometriosis cases and for Stage A (revised American Fertility Society (rAFS) I/II, n = 1686) and B (rAFS III/IV, n = 1364) cases separately. The identified significant pathways were compared with pathways previously investigated in the literature through candidate association studies. PARTICIPANTS/MATERIALS, SETTING, METHODS: The most comprehensive biological pathway databases, MSigDB (including BioCarta, KEGG, PID, SA, SIG, ST and GO) and PANTHER were utilized to test for enrichment of genetic variants associated with endometriosis. Statistical enrichment analysis was performed using the MAGENTA (Meta-Analysis Gene-set Enrichment of variaNT Associations) software. MAIN RESULTS AND THE ROLE OF CHANCE: The first genome-wide association analysis for Stage A endometriosis revealed a novel locus, rs144240142 (P = 6.45 × 10^-8, OR = 1.71, 95% CI = 1.23–2.37), an intronic single-nucleotide polymorphism (SNP) within MAP3K4. This SNP was not associated with Stage B disease (P = 0.086). MAP3K4 was also shown to be differentially expressed in eutopic endometrium between Stage A endometriosis cases and controls (P = 3.8 × 10^-4), but not with Stage B disease (P = 0.26). A total of 14 pathways enriched with genetic endometriosis associations were identified (false discovery rate (FDR)-P < 0.05). The pathways associated with any endometriosis were Grb2-Sos provides linkage to MAPK signaling for integrins pathway (P = 2.8 × 10^-5, FDR-P = 3.0 × 10^-3), Wnt signaling (P = 0.026, FDR-P = 0.026) and p130Cas linkage to MAPK signaling for integrins pathway (P = 6.0 × 10^-4, FDR-P = 0.029); with Stage A endometriosis: extracellular signal-regulated kinase (ERK)1 ERK2 MAPK (P = 5.0 × 10^-4, FDR-P = 5.0 × 10^-4) and with Stage B endometriosis: two overlapping pathways that related to extracellular matrix biology, Core matrisome (P = 1.4 × 10^-3, FDR-P = 0.013) and ECM glycoproteins (P = 1.8 × 10^-3, FDR-P = 7.1 × 10^-3). Genes arising from endometriosis candidate gene studies performed to date were enriched for Interleukin signaling pathway (P = 2.3 × 10^-12), Apoptosis signaling pathway (P = 9.7 × 10^-9) and Gonadotropin releasing hormone receptor pathway (P = 1.2 × 10^-6); however, these pathways did not feature in the results based on GWAS data. LARGE SCALE DATA: Not applicable. LIMITATIONS, REASONS FOR CAUTION: The analysis is restricted to (i) variants in/near genes that can be assigned to pathways, excluding intergenic variants; (ii) the gene-based pathway definition as registered in the databases; (iii) women of European ancestry. WIDER IMPLICATIONS OF THE FINDINGS: The top ranked pathways associated with overall and Stage A endometriosis in particular involve integrin-mediated MAPK activation and intracellular ERK/MAPK acting downstream in the MAPK cascade, both acting in the control of cell division, gene expression, cell movement and survival. Other top enriched pathways in Stage B disease include ECM glycoprotein pathways important for extracellular structure and biochemical support. The results highlight the need for increased efforts to understand the functional role of these pathways in endometriosis pathogenesis, including the investigation of the biological effects of the genetic variants on downstream molecular processes in tissue relevant to endometriosis. Additionally, our results offer further support for the hypothesis of at least partially distinct causal pathophysiology for minimal/mild (rAFS I/II) vs. moderate/severe (rAFS III/IV) endometriosis. STUDY FUNDING/COMPETING INTEREST(S): The genome-wide association data and Wellcome Trust Case Control Consortium (WTCCC) were generated through funding from the Wellcome Trust (WT084766/Z/08/Z, 076113 and 085475) and the National Health and Medical Research Council (NHMRC) of Australia (241944, 339462, 389927, 389875, 389891, 389892, 389938, 443036, 442915, 442981, 496610, 496739, 552485 and 552498). N.R. was funded by a grant from the Medical Research Council UK (MR/K011480/1). A.P.M. is a Wellcome Trust Senior Fellow in Basic Biomedical Science (grant WT098017). All authors declare there are no conflicts of interest. PMID:28333195

  5. Common single nucleotide variants underlying drug addiction: more than a decade of research.

    PubMed

    Bühler, Kora-Mareen; Giné, Elena; Echeverry-Alzate, Victor; Calleja-Conde, Javier; de Fonseca, Fernando Rodriguez; López-Moreno, Jose Antonio

    2015-09-01

    Drug-related phenotypes are common, complex and highly heritable traits. In the last few years, candidate gene (CGAS) and genome-wide association studies (GWAS) have identified a huge number of single nucleotide polymorphisms (SNPs) associated with drug use, abuse or dependence, mainly related to alcohol or nicotine. Nevertheless, few of these associations have been replicated in independent studies. The aim of this study was to provide a review of the SNPs that have been most significantly associated with alcohol-, nicotine-, cannabis- and cocaine-related phenotypes in humans between the years of 2000 and 2012. To this end, we selected CGAS, GWAS, family-based association and case-only studies published in peer-reviewed international scientific journals (using the PubMed/MEDLINE and Addiction GWAS Resource databases) in which a significant association was reported. A total of 371 studies fit the search criteria. We then filtered SNPs with at least one replication study and performed meta-analysis of the significance of the associations. SNPs in the alcohol metabolizing genes, in the cholinergic gene cluster CHRNA5-CHRNA3-CHRNB4, and in the DRD2 and ANKK1 genes, are, to date, the most replicated and significant gene variants associated with alcohol- and nicotine-related phenotypes. In the case of cannabis and cocaine, far fewer studies and replications have been reported, indicating either a need for further investigation or that the genetics of cannabis/cocaine addiction are more elusive. This review brings a global state-of-the-art vision of the behavioral genetics of addiction and contributes to the formulation of new hypotheses to guide future work. © 2015 Society for the Study of Addiction.

  6. Association analysis of rare variants near the APOE region with CSF and neuroimaging biomarkers of Alzheimer's disease.

    PubMed

    Nho, Kwangsik; Kim, Sungeun; Horgusluoglu, Emrin; Risacher, Shannon L; Shen, Li; Kim, Dokyoon; Lee, Seunggeun; Foroud, Tatiana; Shaw, Leslie M; Trojanowski, John Q; Aisen, Paul S; Petersen, Ronald C; Jack, Clifford R; Weiner, Michael W; Green, Robert C; Toga, Arthur W; Saykin, Andrew J

    2017-05-24

    The APOE ε4 allele is the most significant common genetic risk factor for late-onset Alzheimer's disease (LOAD). The region surrounding APOE on chromosome 19 has also shown consistent association with LOAD. However, no common variants in the region remain significant after adjusting for APOE genotype. We report a rare variant association analysis of genes in the vicinity of APOE with cerebrospinal fluid (CSF) and neuroimaging biomarkers of LOAD. Whole genome sequencing (WGS) was performed on 817 blood DNA samples from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Sequence data from 757 non-Hispanic Caucasian participants was used in the present analysis. We extracted all rare variants (MAF (minor allele frequency) < 0.05) within a 312 kb window in APOE's vicinity encompassing 12 genes. We assessed CSF and neuroimaging (MRI and PET) biomarkers as LOAD-related quantitative endophenotypes. Gene-based analyses of rare variants were performed using the optimal Sequence Kernel Association Test (SKAT-O). A total of 3,334 rare variants (MAF < 0.05) were found within the APOE region. Among them, 72 rare non-synonymous variants were observed. Eight genes spanning the APOE region were significantly associated with CSF Aβ 1-42 (p < 1.0 × 10−3). After controlling for APOE genotype and adjusting for multiple comparisons, 4 genes (CBLC, BCAM, APOE, and RELB) remained significant. Whole-brain surface-based analysis identified highly significant clusters associated with rare variants of CBLC in the temporal lobe region including the entorhinal cortex, as well as frontal lobe regions. Whole-brain voxel-wise analysis of amyloid PET identified significant clusters in the bilateral frontal and parietal lobes showing associations of rare variants of RELB with cortical amyloid burden. Rare variants within genes spanning the APOE region are significantly associated with LOAD-related CSF Aβ 1-42 and neuroimaging biomarkers after adjusting for APOE genotype. 
These findings warrant further investigation and illustrate the role of next generation sequencing and quantitative endophenotypes in assessing rare variants which may help explain missing heritability in AD and other complex diseases.
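
    The rare-variant selection step described in this abstract (MAF < 0.05 inside a fixed window) can be sketched roughly as follows. This is only an illustration, not the authors' pipeline: the `Variant` class, the `rare_in_window` helper, the window coordinates and the allele counts are all hypothetical toy values; only the 0.05 cutoff, the 312 kb window size and the 757 participants (1,514 alleles) come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical minimal variant record (not a real VCF parser)."""
    chrom: str
    pos: int
    alt_count: int       # number of alternate alleles observed
    allele_number: int   # total alleles genotyped (2 x participants)

    @property
    def maf(self) -> float:
        # Minor allele frequency: the smaller of the two allele frequencies.
        af = self.alt_count / self.allele_number
        return min(af, 1.0 - af)


def rare_in_window(variants, chrom, start, end, maf_cutoff=0.05):
    """Keep polymorphic variants inside [start, end] with MAF below the cutoff."""
    return [
        v for v in variants
        if v.chrom == chrom and start <= v.pos <= end and 0 < v.maf < maf_cutoff
    ]


# Toy data: positions are made up; 1514 alleles corresponds to 757 participants.
toy = [
    Variant("19", 1_000, 10, 1514),     # rare, inside window
    Variant("19", 2_000, 400, 1514),    # common, inside window
    Variant("19", 3_000, 1500, 1514),   # alternate allele is the major allele
    Variant("19", 500_000, 5, 1514),    # rare, outside window
]
selected = rare_in_window(toy, "19", 0, 312_000)
print([v.pos for v in selected])
```

    Taking the minimum of the allele frequency and its complement keeps variants whose alternate allele happens to be the major one, which a plain `af < 0.05` test would miss.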

  7. Assessing the spectrum of germline variation in Fanconi anemia genes among patients with head and neck carcinoma before age 50.

    PubMed

    Chandrasekharappa, Settara C; Chinn, Steven B; Donovan, Frank X; Chowdhury, Naweed I; Kamat, Aparna; Adeyemo, Adebowale A; Thomas, James W; Vemulapalli, Meghana; Hussey, Caroline S; Reid, Holly H; Mullikin, James C; Wei, Qingyi; Sturgis, Erich M

    2017-10-15

    Patients with Fanconi anemia (FA) have an increased risk for head and neck squamous cell carcinoma (HNSCC). The authors sought to determine the prevalence of undiagnosed FA and FA carriers among patients with HNSCC as well as an age cutoff for FA genetic screening. Germline DNA samples from 417 patients with HNSCC aged <50 years were screened for sequence variants by targeted next-generation sequencing of the entire length of 16 FA genes. The sequence revealed 194 FA gene variants in 185 patients (44%). The variant spectrum comprised 183 nonsynonymous point mutations, 9 indels, 1 large deletion, and 1 synonymous variant that was predicted to affect splicing. One hundred eight patients (26%) had at least 1 rare variant that was predicted to be damaging, and 57 (14%) had at least 1 rare variant that was predicted to be damaging and had been previously reported. Fifteen patients carried 2 rare variants or an X-linked variant in an FA gene. Overall, an age cutoff for FA screening was not identified among young patients with HNSCC, because there were no significant differences in mutation rates when patients were stratified by age, tumor site, ethnicity, smoking status, or human papillomavirus status. However, an increased burden, or mutation load, of FA gene variants was observed in carriers of the genes FA complementation group D2 (FANCD2), FANCE, and FANCL in the HNSCC patient cohort relative to the 1000 Genomes population. FA germline functional variants offer a novel area of study in HNSCC tumorigenesis. FANCE and FANCL, which are components of the core complex, are known to be responsible for the recruitment and ubiquitination, respectively, of FANCD2, a critical step in the FA DNA repair pathway. In the current cohort, the increased mutation load of FANCD2, FANCE, and FANCL variants among younger patients with HNSCC indicates the importance of the FA pathway in HNSCC. Cancer 2017;123:3943-54. © 2017 American Cancer Society. 

  8. Exome sequencing and genome-wide linkage analysis in 17 families illustrate the complex contribution of TTN truncating variants to dilated cardiomyopathy.

    PubMed

    Norton, Nadine; Li, Duanxiang; Rampersaud, Evadnie; Morales, Ana; Martin, Eden R; Zuchner, Stephan; Guo, Shengru; Gonzalez, Michael; Hedges, Dale J; Robertson, Peggy D; Krumm, Niklas; Nickerson, Deborah A; Hershberger, Ray E

    2013-04-01

    BACKGROUND- Familial dilated cardiomyopathy (DCM) is a genetically heterogeneous disease with >30 known genes. TTN truncating variants were recently implicated in a candidate gene study to cause 25% of familial and 18% of sporadic DCM cases. METHODS AND RESULTS- We used an unbiased genome-wide approach using both linkage analysis and variant filtering across the exome sequences of 48 individuals affected with DCM from 17 families to identify genetic cause. Linkage analysis ranked the TTN region as falling under the second highest genome-wide multipoint linkage peak, multipoint logarithm of odds, 1.59. We identified 6 TTN truncating variants carried by individuals affected with DCM in 7 of 17 DCM families (logarithm of odds, 2.99); 2 of these 7 families also had novel missense variants that segregated with disease. Two additional novel truncating TTN variants did not segregate with DCM. Nucleotide diversity at the TTN locus, including missense variants, was comparable with 5 other known DCM genes. The average number of missense variants in the exome sequences from the DCM cases or the ≈5400 cases from the Exome Sequencing Project was ≈23 per individual. The average number of TTN truncating variants in the Exome Sequencing Project was 0.014 per individual. We also identified a region (chr9q21.11-q22.31) with no known DCM genes with a maximum heterogeneity logarithm of odds score of 1.74. CONCLUSIONS- These data suggest that TTN truncating variants contribute to DCM cause. However, the lack of segregation of all identified TTN truncating variants illustrates the challenge of determining variant pathogenicity even with full exome sequencing.

  9. Rare Variation in TET2 Is Associated with Clinically Relevant Prostate Carcinoma in African-Americans

    PubMed Central

    Koboldt, Daniel C.; Kanchi, Krishna L.; Gui, Bin; Larson, David E.; Fulton, Robert S.; Isaacs, William B.; Kraja, Aldi; Borecki, Ingrid B.; Jia, Li; Wilson, Richard K.; Mardis, Elaine R.; Kibel, Adam S.

    2016-01-01

    Background Common variants have been associated with prostate cancer risk. Unfortunately, few are reproducibly linked to aggressive disease, the phenotype of greatest clinical relevance. One possible explanation is that rare genetic variants underlie a significant proportion of the risk for aggressive disease. Method To identify such variants, we performed a two-stage approach using whole exome sequencing followed by targeted sequencing of 800 genes in 652 aggressive prostate cancer patients and 752 disease-free controls in both African and European Americans. In each population, we tested rare variants for association using two gene-based aggregation tests. We established a study-wide significance threshold of 3.125 × 10−5 to correct for multiple testing. Results TET2 in African-Americans was associated with aggressive disease with 24.4% of cases harboring a rare deleterious variant compared to 9.6% of controls (FET p = 1.84 × 10−5, OR = 3.0; SKAT-O p = 2.74 × 10−5). We report 8 additional genes with suggestive evidence of association, including the DNA repair genes PARP2 and MSH6. Finally, we observed an excess of rare truncation variants in 5 genes including the DNA repair genes MSH6, BRCA1 and BRCA2. This adds to the growing body of evidence that DNA repair pathway defects may influence susceptibility to aggressive prostate cancer. Conclusion Our findings suggest that rare variants influence risk of clinically relevant prostate cancer and, if validated, could serve to identify men for screening, prophylaxis and treatment. Impact This study provides evidence that rare variants in TET2 may help identify African-American men at increased risk for clinically relevant prostate cancer. PMID:27486019
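
    The two headline numbers in this abstract can be checked with simple arithmetic. The sketch below is an illustration, not the authors' code. The abstract does not say how the threshold was derived; dividing 0.05 by 1,600 tests (for example 800 genes times 2 aggregation tests) is an assumption that happens to reproduce the stated 3.125 × 10−5. The odds ratio is a crude estimate from the two rounded carrier percentages, and both function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def odds_ratio(p_cases: float, p_controls: float) -> float:
    """Crude odds ratio from the carrier proportions in cases and controls."""
    odds_cases = p_cases / (1.0 - p_cases)
    odds_controls = p_controls / (1.0 - p_controls)
    return odds_cases / odds_controls


# Assumption: 800 genes x 2 tests = 1600 tests at a family-wise alpha of 0.05.
threshold = bonferroni_threshold(0.05, 800 * 2)

# Carrier proportions quoted in the abstract: 24.4% of cases, 9.6% of controls.
crude_or = odds_ratio(0.244, 0.096)

print(f"threshold = {threshold:.3e}")
print(f"crude OR  = {crude_or:.2f}")
```

    The crude estimate comes out close to the reported OR of 3.0; an exact match is not expected because the percentages are rounded and the published estimate comes from the underlying counts.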

  10. Long QT molecular autopsy in sudden unexplained death in the young (1-40 years old): Lessons learnt from an eight year experience in New Zealand.

    PubMed

    Marcondes, Luciana; Crawford, Jackie; Earle, Nikki; Smith, Warren; Hayes, Ian; Morrow, Paul; Donoghue, Tom; Graham, Amanda; Love, Donald; Skinner, Jonathan R

    2018-01-01

    To review long QT syndrome molecular autopsy results in sudden unexplained death in young (SUDY) between 2006 and 2013 in New Zealand. Audit of the LQTS molecular autopsy results, cardiac investigations and family screening data from gene-positive families. During the study period, 365 SUDY cases were referred for molecular autopsy. 128 cases (35%) underwent LQTS genetic testing. 31 likely pathogenic variants were identified in 27 cases (21%); SCN5A (14/31, 45%), KCNH2 (7/31, 22%), KCNQ1 (4/31, 13%), KCNE2 (3/31, 10%), KCNE1 (2/31, 7%), KCNJ2 (1/31, 3%). Thirteen variants (13/128, 10%) were ultimately classified as pathogenic. Most deaths (63%) occurred during sleep. Gene variant carriage was more likely with a positive medical history (mostly seizures, 63% vs 36%, p = 0.01), amongst females (36% vs 12%, p = 0.001) and whites more than Maori (31% vs 0, p = 0.0009). Children 1-12 years were more likely to be gene-positive (33% vs 14%, p = 0.02). Family screening identified 42 gene-positive relatives, 18 with definitive phenotypic expression of LQTS/Brugada. 76% of the variants were maternally inherited (p = 0.007). Further family investigations and research now support pathogenicity of the variant in 13/27 (48%) of gene-positive cases. In New Zealand, variants in SCN5A and KCNH2, with maternal inheritance, predominate. A rare variant in LQTS genes is more likely in whites rather than Maori, females, children 1-12 years and those with a positive personal and family history of seizures, syncope or SUDY. Family screening supported the diagnosis in a third of the cases. The changing classification of variants creates a significant challenge.

  11. DNA methylation of the filaggrin gene adds to the risk of eczema associated with loss-of-function variants

    PubMed Central

    Ziyab, A. H.; Karmaus, W.; Holloway, J. W.; Zhang, H.; Ewart, S.; Arshad, S. H.

    2012-01-01

    Background Loss-of-function variants within the filaggrin gene (FLG) are associated with a dysfunctional skin barrier that contributes to the development of eczema. Epigenetic modifications, such as DNA methylation, are genetic regulatory mechanisms that modulate gene expression without changing the DNA sequence. Objectives To investigate whether genetic variants and adjacent differential DNA methylation within the FLG gene synergistically act on the development of eczema. Methods A subsample (n = 245, only females aged 18 years) of the Isle of Wight birth cohort participants (n = 1,456) had available information for FLG variants R501X, 2282del4, and S3247X and DNA methylation levels for 10 CpG sites within the FLG gene. Log-binomial regression was used to estimate the risk ratios (RRs) of eczema associated with FLG variants at different methylation levels. Results The period prevalence of eczema was 15.2% at age 18 years and 9.0% of participants were carriers (heterozygous) of FLG variants. Of the 10 CpG sites spanning the genomic region of FLG, methylation levels of CpG site ‘cg07548383’ showed a significant interaction with FLG sequence variants on the risk for eczema. At 86% methylation level, filaggrin haploinsufficient individuals had 5.48-fold increased risk of eczema when compared to those with wild type FLG genotype (p-value = 0.0008). Conclusions Our novel results indicated that the association between FLG loss-of-function variants and eczema is modulated by DNA methylation. Simultaneously assessing the joint effect of genetic and epigenetic factors within the FLG gene further highlights the importance of this genomic region for eczema manifestation. PMID:23003573

  12. Common variants at the CHEK2 gene locus and risk of epithelial ovarian cancer

    PubMed Central

    Lawrenson, Kate; Iversen, Edwin S.; Tyrer, Jonathan; Weber, Rachel Palmieri; Concannon, Patrick; Hazelett, Dennis J.; Li, Qiyuan; Marks, Jeffrey R.; Berchuck, Andrew; Lee, Janet M.; Aben, Katja K.H.; Anton-Culver, Hoda; Antonenkova, Natalia; Bandera, Elisa V.; Bean, Yukie; Beckmann, Matthias W.; Bisogna, Maria; Bjorge, Line; Bogdanova, Natalia; Brinton, Louise A.; Brooks-Wilson, Angela; Bruinsma, Fiona; Butzow, Ralf; Campbell, Ian G.; Carty, Karen; Chang-Claude, Jenny; Chenevix-Trench, Georgia; Chen, Ann; Chen, Zhihua; Cook, Linda S.; Cramer, Daniel W.; Cunningham, Julie M.; Cybulski, Cezary; Plisiecka-Halasa, Joanna; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A.; Dörk, Thilo; du Bois, Andreas; Eccles, Diana; Easton, Douglas T.; Edwards, Robert P.; Eilber, Ursula; Ekici, Arif B.; Fasching, Peter A.; Fridley, Brooke L.; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G.; Glasspool, Rosalind; Goode, Ellen L.; Goodman, Marc T.; Gronwald, Jacek; Harter, Philipp; Hasmad, Hanis Nazihah; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A.T.; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus; Hosono, Satoyo; Jakubowska, Anna; Paul, James; Jensen, Allan; Karlan, Beth Y.; Kjaer, Susanne Kruger; Kelemen, Linda E.; Kellar, Melissa; Kelley, Joseph L.; Kiemeney, Lambertus A.; Krakstad, Camilla; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D.; Lee, Alice W.; Cannioto, Rikki; Leminen, Arto; Lester, Jenny; Levine, Douglas A.; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon F.A.G.; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R.; Nevanlinna, Heli; McNeish, Iain; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B.; Narod, Steven A.; Nedergaard, Lotte; Ness, Roberta B.; Noor Azmi, Mat Adenan; Odunsi, Kunle; Olson, Sara H.; Orlow, Irene; Orsulic, Sandra; Pearce, Celeste L.; Pejovic, Tanja; Pelttari, Liisa M.; Permuth-Wey, Jennifer; Phelan, Catherine M.; Pike, Malcolm C.; Poole, Elizabeth M.; Ramus, Susan J.; 
Risch, Harvey A.; Rosen, Barry; Rossing, Mary Anne; Rothstein, Joseph H.; Rudolph, Anja; Runnebaum, Ingo B.; Rzepecka, Iwona K.; Salvesen, Helga B.; Budzilowska, Agnieszka; Sellers, Thomas A.; Shu, Xiao-Ou; Shvetsov, Yurii B.; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C.; Sucheston, Lara; Tangen, Ingvild L.; Teo, Soo-Hwang; Terry, Kathryn L.; Thompson, Pamela J.; Timorek, Agnieszka; Tworoger, Shelley S.; Nieuwenhuysen, Els Van; Vergote, Ignace; Vierkant, Robert A.; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S.; Wicklund, Kristine G.; Wilkens, Lynne R.; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna H.; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Coetzee, Gerhard A.; Freedman, Matthew L.; Monteiro, Alvaro N.A.; Moes-Sosnowska, Joanna; Kupryjanczyk, Jolanta; Pharoah, Paul D.; Gayther, Simon A.; Schildkraut, Joellen M.

    2015-01-01

    Genome-wide association studies have identified 20 genomic regions associated with risk of epithelial ovarian cancer (EOC), but many additional risk variants may exist. Here, we evaluated associations between common genetic variants [single nucleotide polymorphisms (SNPs) and indels] in DNA repair genes and EOC risk. We genotyped 2896 common variants at 143 gene loci in DNA samples from 15 397 patients with invasive EOC and controls. We found evidence of associations with EOC risk for variants at FANCA, EXO1, E2F4, E2F2, CREB5 and CHEK2 genes (P ≤ 0.001). The strongest risk association was for CHEK2 SNP rs17507066 with serous EOC (P = 4.74 × 10−7). Additional genotyping and imputation of genotypes from the 1000 Genomes Project identified a slightly more significant association for CHEK2 SNP rs6005807 (r² with rs17507066 = 0.84, odds ratio (OR) 1.17, 95% CI 1.11–1.24, P = 1.1 × 10−7). We identified 293 variants in the region with likelihood ratios of less than 1:100 for representing the causal variant. Functional annotation identified 25 candidate SNPs that alter transcription factor binding sites within regulatory elements active in EOC precursor tissues. In The Cancer Genome Atlas dataset, CHEK2 gene expression was significantly higher in primary EOCs compared to normal fallopian tube tissues (P = 3.72 × 10−8). We also identified an association between genotypes of the candidate causal SNP rs12166475 (r² = 0.99 with rs6005807) and CHEK2 expression (P = 2.70 × 10−8). These data suggest that common variants at 22q12.1 are associated with risk of serous EOC and CHEK2 as a plausible target susceptibility gene. PMID:26424751
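
    An odds ratio reported with a 95% confidence interval is enough to back out an approximate standard error and z statistic, which is a common sanity check when reading association results. The sketch below assumes a Wald-type interval on the log-odds scale, which the abstract does not state, and uses the rounded values quoted for rs6005807 (OR 1.17, 95% CI 1.11–1.24). Because of that rounding, only rough, order-of-magnitude agreement with the published P value should be expected. Function names are hypothetical.

```python
import math


def z_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float,
                 z_crit: float = 1.959964):
    """Approximate (z, SE) on the log-odds scale from an OR and its 95% CI.

    Assumes a symmetric Wald interval: log(OR) +/- z_crit * SE.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    return z, se


def two_sided_p(z: float) -> float:
    """Two-sided P value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


z, se = z_from_or_ci(1.17, 1.11, 1.24)
p = two_sided_p(z)
print(f"SE(log OR) ~ {se:.4f}, z ~ {z:.2f}, two-sided P ~ {p:.1e}")
```

    The same helper can be pointed at any OR and interval in these records; results that differ from the published P value by many orders of magnitude usually mean the interval or the test was of a different kind than assumed here.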

  13. The Effects of a BDNF Val66Met Polymorphism on Posttraumatic Stress Disorder: A Meta-Analysis.

    PubMed

    Bountress, Kaitlin E; Bacanu, Silviu-Alin; Tomko, Rachel L; Korte, Kristina J; Hicks, Terrell; Sheerin, Christina; Lind, Mackenzie J; Marraccini, Marisa; Nugent, Nicole; Amstadter, Ananda B

    2018-06-06

    Given evidence that posttraumatic stress disorder (PTSD) is moderately heritable, a number of studies utilizing candidate gene approaches have attempted to examine the potential contributions of theoretically relevant genetic variation. Some of these studies have found support for a brain-derived neurotrophic factor (BDNF) variant, Val66Met, in the risk of developing PTSD, while others have failed to find this link. This study sought to reconcile these conflicting findings using a meta-analysis framework. Analyses were also used to determine whether there is significant heterogeneity in the link between this variant and PTSD. We conducted a systematic review of the literature on BDNF and PTSD from the PsycINFO and PubMed databases. A total of 11 studies were included in the analysis. Findings indicate a marginally significant effect of the BDNF Val66Met variant on PTSD (p < 0.1). However, of the 11 studies included, only 2 suggested an effect with a non-zero confidence interval, one of which showed a z score of 3.31. We did not find any evidence for heterogeneity. Findings from this meta-analytic investigation of the published literature provide little support for the Val66Met variant of BDNF as a predictor of PTSD. Future well-powered agnostic genome-wide association studies with more refined phenotyping are needed to clarify genetic influences on PTSD. © 2018 S. Karger AG, Basel.

  14. A Novel de novo CDH1 Germline Variant Aids in the Classification of C-terminal E-cadherin Alterations Predicted to Escape Nonsense-Mediated mRNA Decay.

    PubMed

    Krempely, Kate; Karam, Rachid

    2018-05-24

    Most truncating CDH1 pathogenic alterations confer an elevated lifetime risk of diffuse gastric cancer and lobular breast cancer. However, transcripts containing carboxyl-terminal (C-terminal) premature stop codons have been demonstrated to escape the nonsense-mediated mRNA decay (NMD) pathway, and gastric and breast cancer risks associated with these truncations should be carefully evaluated. A female patient underwent multigene panel testing due to a personal history of invasive lobular breast cancer diagnosed at age 54, which identified the germline CDH1 nonsense alteration, c.2506G>T (p.E836*), in the last exon of the gene. Subsequent parental testing for the alteration was negative and additional short tandem repeat analysis confirmed the familial relationships and the de novo occurrence in the proband. Based on the de novo occurrence, clinical history, and rarity in general population databases, this alteration was classified as a likely pathogenic variant. This is the most C-terminal pathogenic alteration reported to date. Additionally, this alteration contributed to the classification of six other upstream CDH1 C-terminal truncating variants as pathogenic or likely pathogenic. Identifying the most distal pathogenic alteration provides evidence to classify other C-terminal truncating variants as either pathogenic or benign, a fundamental step to offering pre-symptomatic screening and prophylactic procedures to the appropriate patients. Cold Spring Harbor Laboratory Press.

  15. Long genes and genes with multiple splice variants are enriched in pathways linked to cancer and other multigenic diseases.

    PubMed

    Sahakyan, Aleksandr B; Balasubramanian, Shankar

    2016-03-12

    The role of random mutations and genetic errors in defining the etiology of cancer and other multigenic diseases has recently received much attention. With the view that complex genes should be particularly vulnerable to such events, here we explore the link between the simple properties of the human genes, such as transcript length, number of splice variants, exon/intron composition, and their involvement in the pathways linked to cancer and other multigenic diseases. We reveal a substantial enrichment of cancer pathways with long genes and genes that have multiple splice variants. Although the latter two factors are interdependent, we show that the overall gene length and splicing complexity increase in cancer pathways in a partially decoupled manner. Our systematic survey for the pathways enriched with top lengthy genes and with genes that have multiple splice variants reveals, along with cancer pathways, the pathways involved in various neuronal processes, cardiomyopathies and type II diabetes. We outline a correlation between the gene length and the number of somatic mutations. Our work is a step forward in the assessment of the role of simple gene characteristics in cancer and a wider range of multigenic diseases. We demonstrate a significant accumulation of long genes and genes with multiple splice variants in pathways of multigenic diseases that have already been associated with de novo mutations. Unlike the cancer pathways, we note that the pathways of neuronal processes, cardiomyopathies and type II diabetes contain genes long enough for topoisomerase-dependent gene expression to also be a potential contributing factor in the emergence of pathologies, should topoisomerases become impaired.

  16. Identification of protein-damaging mutations in 10 swine taste receptors and 191 appetite-reward genes.

    PubMed

    Clop, Alex; Sharaf, Abdoallah; Castelló, Anna; Ramos-Onsins, Sebastián; Cirera, Susanna; Mercadé, Anna; Derdak, Sophia; Beltran, Sergi; Huisman, Abe; Fredholm, Merete; van As, Pieter; Sánchez, Armand

    2016-08-26

    Taste receptors (TASRs) are essential for the body's recognition of chemical compounds. In the tongue, TASRs sense the sweet and umami and the toxin-related bitter taste, thus promoting a particular eating behaviour. Moreover, their relevance in other organs is now becoming evident. In the intestine, they regulate nutrient absorption and gut motility. Upon ligand binding, TASRs activate the appetite-reward circuitry to signal the nervous system and keep body homeostasis. With the aim to identify genetic variation in the swine TASRs and in the genes from the appetite and the reward pathways, we have sequenced the exons of 201 TASRs and appetite-reward genes from 304 pigs belonging to ten breeds, wild boars and to two phenotypically extreme groups from a F2 resource with data on growth and fat deposition. We identified 2,766 coding variants, 395 of which were predicted to have a strong impact on protein sequence and function. 334 variants were present in only one breed and at predicted alternative allele frequency (pAAF) ≥ 0.1. The Asian pigs and the wild boars showed the largest proportion of breed specific variants. We also compared the pAAF of the two F2 groups and found that variants in TAS2R39 and CD36 display significant differences, suggesting that these genes could influence growth and fat deposition. We developed a 128-variant genotyping assay and confirmed 57 of these variants. We have identified thousands of variants affecting TASRs as well as genes involved in the appetite and the reward mechanisms. Some of these genes have been already associated with taste preferences, appetite or behaviour in humans and mice. We have also detected indications of a potential relationship of some of these genes with growth and fat deposition, which could have been caused by changes in taste preferences, appetite or reward and ultimately impact on food intake. 
A genotyping array with 57 variants in 31 of these genes is now available for genotyping, to start elucidating the impact of genetic variation in these genes on pig biology and breeding.

  17. snpGeneSets: An R Package for Genome-Wide Study Annotation

    PubMed Central

    Mei, Hao; Li, Lianna; Jiang, Fan; Simino, Jeannette; Griswold, Michael; Mosley, Thomas; Liu, Shijian

    2016-01-01

    Genome-wide studies (GWS) of SNP associations and differential gene expressions have generated abundant results; next-generation sequencing technology has further boosted the number of variants and genes identified. Effective interpretation requires massive annotation and downstream analysis of these genome-wide results, a computationally challenging task. We developed the snpGeneSets package to simplify annotation and analysis of GWS results. Our package integrates local copies of knowledge bases for SNPs, genes, and gene sets, and implements wrapper functions in the R language to enable transparent access to low-level databases for efficient annotation of large genomic data. The package contains functions that execute three types of annotations: (1) genomic mapping annotation for SNPs and genes and functional annotation for gene sets; (2) bidirectional mapping between SNPs and genes, and genes and gene sets; and (3) calculation of gene effect measures from SNP associations and performance of gene set enrichment analyses to identify functional pathways. We applied snpGeneSets to type 2 diabetes (T2D) results from the NHGRI genome-wide association study (GWAS) catalog, a Finnish GWAS, and a genome-wide expression study (GWES). These studies demonstrate the usefulness of snpGeneSets for annotating and performing enrichment analysis of GWS results. The package is open-source, free, and can be downloaded at: https://www.umc.edu/biostats_software/. PMID:27807048

  18. Whole-Exome Sequencing in Familial Parkinson Disease

    PubMed Central

    Farlow, Janice L.; Robak, Laurie A.; Hetrick, Kurt; Bowling, Kevin; Boerwinkle, Eric; Coban-Akdemir, Zeynep H.; Gambin, Tomasz; Gibbs, Richard A.; Gu, Shen; Jain, Preti; Jankovic, Joseph; Jhangiani, Shalini; Kaw, Kaveeta; Lai, Dongbing; Lin, Hai; Ling, Hua; Liu, Yunlong; Lupski, James R.; Muzny, Donna; Porter, Paula; Pugh, Elizabeth; White, Janson; Doheny, Kimberly; Myers, Richard M.; Shulman, Joshua M.; Foroud, Tatiana

    2016-01-01

    IMPORTANCE Parkinson disease (PD) is a progressive neurodegenerative disease for which susceptibility is linked to genetic and environmental risk factors. OBJECTIVE To identify genetic variants contributing to disease risk in familial PD. DESIGN, SETTING, AND PARTICIPANTS A 2-stage study design that included a discovery cohort of families with PD and a replication cohort of familial probands was used. In the discovery cohort, rare exonic variants that segregated in multiple affected individuals in a family and were predicted to be conserved or damaging were retained. Genes with retained variants were prioritized if expressed in the brain and located within PD-relevant pathways. Genes in which prioritized variants were observed in at least 4 families were selected as candidate genes for replication in the replication cohort. The setting was among individuals with familial PD enrolled from academic movement disorder specialty clinics across the United States. All participants had a family history of PD. MAIN OUTCOMES AND MEASURES Identification of genes containing rare, likely deleterious, genetic variants in individuals with familial PD using a 2-stage exome sequencing study design. RESULTS The 93 individuals from 32 families in the discovery cohort (49.5% [46 of 93] female) had a mean (SD) age at onset of 61.8 (10.0) years. The 49 individuals with familial PD in the replication cohort (32.6% [16 of 49] female) had a mean (SD) age at onset of 50.1 (15.7) years. Discovery cohort recruitment dates were 1999 to 2009, and replication cohort recruitment dates were 2003 to 2014. Data analysis dates were 2011 to 2015. Three genes containing a total of 13 rare and potentially damaging variants were prioritized in the discovery cohort. Two of these genes (TNK2 and TNR) also had rare variants that were predicted to be damaging in the replication cohort. 
    All 9 variants identified in the 2 replicated genes in 12 families across the discovery and replication cohorts were confirmed via Sanger sequencing. CONCLUSIONS AND RELEVANCE TNK2 and TNR harbored rare, likely deleterious, variants in individuals having familial PD, with similar findings in an independent cohort. To our knowledge, these genes have not been previously associated with PD, although they have been linked to critical neuronal functions. Further studies are required to confirm a potential role for these genes in the pathogenesis of PD. PMID:26595808
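
    The selection rule in the discovery stage (genes with prioritized variants in at least 4 families) amounts to counting distinct families per gene. The following is a minimal sketch of that rule only, not the authors' pipeline; the function name, family labels, and gene labels are hypothetical.

```python
from collections import defaultdict


def candidate_genes(observations, min_families=4):
    """Select genes whose prioritized variants occur in enough distinct families.

    observations: iterable of (family_id, gene) pairs, one per prioritized variant.
    """
    families_by_gene = defaultdict(set)
    for family_id, gene in observations:
        families_by_gene[gene].add(family_id)
    return sorted(
        gene
        for gene, families in families_by_gene.items()
        if len(families) >= min_families
    )


# Made-up observations; family and gene labels are placeholders.
observations = [
    ("F1", "GENE_X"), ("F2", "GENE_X"), ("F3", "GENE_X"), ("F4", "GENE_X"),
    ("F1", "GENE_Y"), ("F1", "GENE_Y"), ("F2", "GENE_Y"),
]
selected = candidate_genes(observations)
```

    Counting families rather than variants keeps a gene with several variants in one family from passing the threshold on its own.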

  19. Nominal ISOMERs (Incorrect Spellings Of Medicines Eluding Researchers)—variants in the spellings of drug names in PubMed: a database review

    PubMed Central

    Aronson, Jeffrey K

    2016-01-01

    Objective To examine how misspellings of drug names could impede searches for published literature. Design Database review. Data source PubMed. Review methods The study included 30 drug names that are commonly misspelt on prescription charts in hospitals in Birmingham, UK (test set), and 30 control names randomly chosen from a hospital formulary (control set). The following definitions were used: standard names—the international non-proprietary names, variant names—deviations in spelling from standard names that are not themselves standard names in English language nomenclature, and hidden reference variants—variant spellings that identified publications in textword (tw) searches of PubMed or other databases, and which were not identified by textword searches for the standard names. Variant names were generated from standard names by applying letter substitutions, omissions, additions, transpositions, duplications, deduplications, and combinations of these. Searches were carried out in PubMed (30 June 2016) for “standard name[tw]” and “variant name[tw] NOT standard name[tw].” Results The 30 standard names of drugs in the test set gave 325 979 hits in total, and 160 hidden reference variants gave 3872 hits (1.17%). The standard names of the control set gave 470 064 hits, and 79 hidden reference variants gave 766 hits (0.16%). Letter substitutions (particularly i to y and vice versa) and omissions together accounted for 2924 (74%) of the variants. Amitriptyline (8530 hits) yielded 18 hidden reference variants (179 (2.1%) hits). Names ending in “in,” “ine,” or “micin” were commonly misspelt. Failing to search for hidden reference variants of “gentamicin,” “amitriptyline,” “mirtazapine,” and “trazodone” would miss at least 19 systematic reviews. A hidden reference variant related to Christmas, “No-el”, was rare; variants of “X-miss” were rarer. Conclusion When performing searches, researchers should include misspellings of drug names among their search terms. 
    PMID:27974346
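
    The variant-generation step described under Review methods can be sketched as follows. This is an illustration, not the author's procedure: it applies single edits only, restricts substitutions to i/y swaps (an assumption), and covers omissions, duplications, and transpositions of adjacent letters. The function names are hypothetical. The query strings follow the "variant name[tw] NOT standard name[tw]" pattern quoted in the abstract.

```python
def spelling_variants(name):
    """Return single-edit misspellings of a drug name (never the name itself)."""
    name = name.lower()
    variants = set()
    swap = {"i": "y", "y": "i"}
    for pos, ch in enumerate(name):
        # Substitution: i <-> y only, the pair the abstract singles out.
        if ch in swap:
            variants.add(name[:pos] + swap[ch] + name[pos + 1:])
        # Omission: drop one letter.
        variants.add(name[:pos] + name[pos + 1:])
        # Duplication: double one letter.
        variants.add(name[:pos] + ch + name[pos:])
        # Transposition: swap this letter with the next one.
        if pos + 1 < len(name):
            variants.add(name[:pos] + name[pos + 1] + ch + name[pos + 2:])
    variants.discard(name)
    return variants


def hidden_variant_query(standard, variant):
    """Build a textword query for a variant that excludes the standard name."""
    return f"{variant}[tw] NOT {standard}[tw]"


variants = spelling_variants("amitriptyline")
queries = sorted(hidden_variant_query("amitriptyline", v) for v in variants)
```

    Each query would still have to be run and its hits inspected, since a generated string may itself be a standard name or return nothing.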

  20. De Novo Coding Variants Are Strongly Associated with Tourette Disorder.

    PubMed

    Willsey, A Jeremy; Fernandez, Thomas V; Yu, Dongmei; King, Robert A; Dietrich, Andrea; Xing, Jinchuan; Sanders, Stephan J; Mandell, Jeffrey D; Huang, Alden Y; Richer, Petra; Smith, Louw; Dong, Shan; Samocha, Kaitlin E; Neale, Benjamin M; Coppola, Giovanni; Mathews, Carol A; Tischfield, Jay A; Scharf, Jeremiah M; State, Matthew W; Heiman, Gary A

    2017-05-03

    Whole-exome sequencing (WES) and de novo variant detection have proven a powerful approach to gene discovery in complex neurodevelopmental disorders. We have completed WES of 325 Tourette disorder trios from the Tourette International Collaborative Genetics cohort and a replication sample of 186 trios from the Tourette Syndrome Association International Consortium on Genetics (511 total). We observe strong and consistent evidence for the contribution of de novo likely gene-disrupting (LGD) variants (rate ratio [RR] 2.32, p = 0.002). Additionally, de novo damaging variants (LGD and probably damaging missense) are overrepresented in probands (RR 1.37, p = 0.003). We identify four likely risk genes with multiple de novo damaging variants in unrelated probands: WWC1 (WW and C2 domain containing 1), CELSR3 (Cadherin EGF LAG seven-pass G-type receptor 3), NIPBL (Nipped-B-like), and FN1 (fibronectin 1). Overall, we estimate that de novo damaging variants in approximately 400 genes contribute risk in 12% of clinical cases. VIDEO ABSTRACT. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Examining Gene-Environment Interactions in Comorbid Depressive and Disruptive Behavior Disorders using a Bayesian Approach

    PubMed Central

    Adrian, Molly; Kiff, Cara; Glazner, Chris; Kohen, Ruth; Tracy, Julia Helen; Zhou, Chuan; McCauley, Elizabeth; Stoep, Ann Vander

    2015-01-01

    Objective The objective of this study was to apply a Bayesian statistical analytic approach that minimizes multiple testing problems to explore the combined effects of chronic low familial support and variants in 12 candidate genes on risk for a common and debilitating childhood mental health condition. Method Bayesian mixture modeling was used to examine gene by environment interactions among genetic variants and environmental factors (family support) associated in previous studies with the occurrence of comorbid depression and disruptive behavior disorders in youth, using a sample of 255 children. Results One main effect, a variant in the oxytocin receptor gene (OXTR, rs53576), was associated with increased risk for comorbid disorders. Two significant gene x environment interactions and one significant gene x gene interaction emerged. Variants in the nicotinic acetylcholine receptor α5 subunit (CHRNA5, rs16969968) and in the glucocorticoid receptor chaperone protein FK506 binding protein 5 (FKBP5, rs4713902) interacted with chronic low family support in association with child mental health status. One gene x gene interaction, the 5-HTTLPR variant of the serotonin transporter (SERT/SLC6A4) in combination with the μ opioid receptor (OPRM1, rs1799971), was associated with comorbid depression and conduct problems. Conclusions Results indicate that Bayesian modeling is a feasible strategy for conducting behavioral genetics research. This approach, combined with an optimized genetic selection strategy (Vrieze, Iacono, & McGue, 2012), revealed genetic variants involved in stress regulation (FKBP5, SERT x OPRM1), social bonding (OXTR), and nicotine responsivity (CHRNA5) in predicting comorbid status. PMID:26228411

  2. Sequencing of sporadic Attention-Deficit Hyperactivity Disorder (ADHD) identifies novel and potentially pathogenic de novo variants and excludes overlap with genes associated with autism spectrum disorder.

    PubMed

    Kim, Daniel Seung; Burt, Amber A; Ranchalis, Jane E; Wilmot, Beth; Smith, Joshua D; Patterson, Karynne E; Coe, Bradley P; Li, Yatong K; Bamshad, Michael J; Nikolas, Molly; Eichler, Evan E; Swanson, James M; Nigg, Joel T; Nickerson, Deborah A; Jarvik, Gail P

    2017-06-01

    Attention-Deficit Hyperactivity Disorder (ADHD) has high heritability; however, studies of common variation account for <5% of ADHD variance. Using data from affected participants without a family history of ADHD, we sought to identify de novo variants that could account for sporadic ADHD. Considering a total of 128 families, two analyses were conducted in parallel: first, in 11 unaffected parent/affected proband trios (or quads with the addition of an unaffected sibling) we completed exome sequencing. Six de novo missense variants at highly conserved bases were identified and validated from four of the 11 families, in the brain-expressed genes TBC1D9, DAGLA, QARS, CSMD2, TRPM2, and WDR83. Separately, in 117 unrelated probands with sporadic ADHD, we sequenced a panel of 26 genes implicated in intellectual disability (ID) and autism spectrum disorder (ASD) to evaluate whether variation in ASD/ID-associated genes was also present in participants with ADHD. Only one putative deleterious variant (Gln600STOP) in CHD1L was identified; this was found in a single proband. Notably, no other nonsense, splice, frameshift, or highly conserved missense variants in the 26-gene panel were identified and validated. These data suggest that de novo variant analysis in families with independently adjudicated sporadic ADHD diagnosis can identify novel genes implicated in ADHD pathogenesis. Moreover, the fact that only one of the 128 cases (0.8%; 11 exome and 117 MIP sequenced participants) had putative deleterious variants within our data in 26 genes related to ID and ASD suggests significant independence in the genetic pathogenesis of ADHD as compared to ASD and ID phenotypes. © 2017 Wiley Periodicals, Inc.

  3. Rare key functional domain missense substitutions in MRE11A, RAD50, and NBN contribute to breast cancer susceptibility: results from a Breast Cancer Family Registry case-control mutation-screening study

    PubMed Central

    2014-01-01

    Introduction The MRE11A-RAD50-Nibrin (MRN) complex plays several critical roles related to repair of DNA double-strand breaks. Inherited mutations in the three components predispose to genetic instability disorders and the MRN genes have been implicated in breast cancer susceptibility, but the underlying data are not entirely convincing. Here, we address two related questions: (1) are some rare MRN variants intermediate-risk breast cancer susceptibility alleles, and if so (2) do the MRN genes follow a BRCA1/BRCA2 pattern wherein most susceptibility alleles are protein-truncating variants, or do they follow an ATM/CHEK2 pattern wherein half or more of the susceptibility alleles are missense substitutions? Methods Using high-resolution melt curve analysis followed by Sanger sequencing, we mutation screened the coding exons and proximal splice junction regions of the MRN genes in 1,313 early-onset breast cancer cases and 1,123 population controls. Rare variants in the three genes were pooled using bioinformatics methods similar to those previously applied to ATM, BRCA1, BRCA2, and CHEK2, and then assessed by logistic regression. Results Re-analysis of our ATM, BRCA1, and BRCA2 mutation screening data revealed that these genes do not harbor pathogenic alleles (other than modest-risk SNPs) with minor allele frequencies >0.1% in Caucasian Americans, African Americans, or East Asians. Limiting our MRN analyses to variants with allele frequencies of <0.1% and combining protein-truncating variants, likely spliceogenic variants, and key functional domain rare missense substitutions, we found significant evidence that the MRN genes are indeed intermediate-risk breast cancer susceptibility genes (odds ratio (OR) = 2.88, P = 0.0090). Key domain missense substitutions were more frequent than the truncating variants (24 versus 12 observations) and conferred a slightly higher OR (3.07 versus 2.61) with a lower P value (0.029 versus 0.14). 
    Conclusions These data establish that MRE11A, RAD50, and NBN are intermediate-risk breast cancer susceptibility genes. Like ATM and CHEK2, their spectrum of pathogenic variants includes a relatively high proportion of missense substitutions. However, the data neither establish whether variants in each of the three genes are best evaluated under the same analysis model nor achieve clinically actionable classification of individual variants observed in this study. PMID:24894818

  4. MSX1 gene variant - its presence in tooth absence - a case control genetic study.

    PubMed

    Reddy, Naveen Admala; Adusumilli, Gopinath; Devanna, Raghu; Pichai, Saravanan; Rohra, Mayur Gobindram; Arjunan, Sharmila

    2013-10-01

    Non Syndromic tooth agenesis is a congenital anomaly with significant medical, psychological and social ramifications. There is sufficient evidence to hypothesize that the locus for this condition can be identified by candidate genes. The aim of this study was to test whether the MSX1 671 T>C gene variant was involved in the etiology of Non Syndromic tooth agenesis in Raichur patients. Blood samples were collected with informed consent from 50 subjects having Non Syndromic tooth agenesis and 50 controls. Genomic DNA was extracted from the blood samples, polymerase chain reaction (PCR) was performed, and the digestion products were evaluated by restriction fragment length polymorphism (RFLP) analysis. The results showed a positive correlation between the MSX1 671 T>C gene variant and Non Syndromic tooth agenesis in Raichur patients. The MSX1 671 T>C gene variant may be a good screening marker for Non Syndromic tooth agenesis in Raichur patients. How to cite this article: Reddy NA, Adusumilli G, Devanna R, Pichai S, Rohra MG, Arjunan S. Msx1 Gene Variant - Its Presence in Tooth Absence - A Case Control Genetic Study. J Int Oral Health 2013; 5(5):20-6.

  5. Msx1 Gene Variant - Its Presence in Tooth Absence - A Case Control Genetic Study

    PubMed Central

    Reddy, Naveen Admala; Adusumilli, Gopinath; Devanna, Raghu; Pichai, Saravanan; Rohra, Mayur Gobindram; Arjunan, Sharmila

    2013-01-01

    Background: Non Syndromic tooth agenesis is a congenital anomaly with significant medical, psychological and social ramifications. There is sufficient evidence to hypothesize that the locus for this condition can be identified by candidate genes. The aim of this study was to test whether the MSX1 671 T>C gene variant was involved in the etiology of Non Syndromic tooth agenesis in Raichur patients. Materials & Methods: Blood samples were collected with informed consent from 50 subjects having Non Syndromic tooth agenesis and 50 controls. Genomic DNA was extracted from the blood samples, polymerase chain reaction (PCR) was performed, and the digestion products were evaluated by restriction fragment length polymorphism (RFLP) analysis. Results: The results showed a positive correlation between the MSX1 671 T>C gene variant and Non Syndromic tooth agenesis in Raichur patients. Conclusion: The MSX1 671 T>C gene variant may be a good screening marker for Non Syndromic tooth agenesis in Raichur patients. How to cite this article: Reddy NA, Adusumilli G, Devanna R, Pichai S, Rohra MG, Arjunan S. Msx1 Gene Variant - Its Presence in Tooth Absence - A Case Control Genetic Study. J Int Oral Health 2013; 5(5):20-6. PMID:24324300

  6. A novel variant in the SLC12A1 gene in two families with antenatal Bartter syndrome.

    PubMed

    Breinbjerg, Anders; Siggaard Rittig, Charlotte; Gregersen, Niels; Rittig, Søren; Hvarregaard Christensen, Jane

    2017-01-01

    Bartter syndrome is an autosomal-recessive inherited disease in which patients present with hypokalaemia and metabolic alkalosis. We present two apparently nonrelated cases with antenatal Bartter syndrome type I, due to a novel variant in the SLC12A1 gene encoding the bumetanide-sensitive sodium-(potassium)-chloride cotransporter 2 in the thick ascending limb of the loop of Henle. Blood samples were received from the two cases and 19 of their relatives, and deoxyribonucleic acid was extracted. The coding regions of the SLC12A1 gene were amplified using polymerase chain reaction, followed by bidirectional direct deoxyribonucleic acid sequencing. Each affected child in the two families was homozygous for a novel inherited variant in the SLC12A1 gene, c.1614T>A. The variant predicts a change from a tyrosine codon to a stop codon (p.Tyr538Ter). The two cases presented antenatally and at six months of age, respectively. The two cases were homozygous for the same variant in the SLC12A1 gene, but presented clinically at different ages. This could possibly be explained by the presence of other gene variants or environmental factors modifying the phenotypes. The phenotypes of the patients were similar to other patients with antenatal Bartter syndrome. ©2016 Foundation Acta Paediatrica. Published by John Wiley & Sons Ltd.
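
    The step from c.1614T>A to p.Tyr538Ter can be checked with simple codon arithmetic. The sketch below assumes coding positions are numbered from the A of the start codon, as in HGVS nomenclature; the helper name is hypothetical. Because the reference base is T and tyrosine is encoded only by TAT or TAC, the affected codon must be TAT.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
TYROSINE_CODONS = {"TAT", "TAC"}


def codon_of(cds_position):
    """Map a 1-based coding position to (codon number, position within codon)."""
    codon_number = (cds_position - 1) // 3 + 1
    offset = (cds_position - 1) % 3 + 1
    return codon_number, offset


codon_number, offset = codon_of(1614)  # (538, 3): third base of codon 538

# Only a tyrosine codon carrying T at that position fits the reference allele.
ref_codon = next(c for c in sorted(TYROSINE_CODONS) if c[offset - 1] == "T")

# Apply the T>A substitution at the same position within the codon.
alt_codon = ref_codon[:offset - 1] + "A" + ref_codon[offset:]
is_stop = alt_codon in STOP_CODONS
```

    Here TAT becomes TAA, a stop codon, which agrees with the p.Tyr538Ter prediction in the abstract.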

  7. Low load for disruptive mutations in autism genes and their biased transmission

    PubMed Central

    Iossifov, Ivan; Levy, Dan; Allen, Jeremy; Ye, Kenny; Ronemus, Michael; Lee, Yoon-ha; Yamrom, Boris; Wigler, Michael

    2015-01-01

    We previously computed that genes with de novo (DN) likely gene-disruptive (LGD) mutations in children with autism spectrum disorders (ASD) have high vulnerability: disruptive mutations in many of these genes, the vulnerable autism genes, will have a high likelihood of resulting in ASD. Because individuals with ASD have lower fecundity, such mutations in autism genes would be under strong negative selection pressure. An immediate prediction is that these genes will have a lower LGD load than typical genes in the human gene pool. We confirm this hypothesis in an explicit test by measuring the load of disruptive mutations in whole-exome sequence databases from two cohorts. We use information about mutational load to show that lower and higher intelligence quotients (IQ) affected individuals can be distinguished by the mutational load in their respective gene targets, as well as to help prioritize gene targets by their likelihood of being autism genes. Moreover, we demonstrate that transmission of rare disruptions in genes with a lower LGD load occurs more often to affected offspring; we show transmission originates most often from the mother, and transmission of such variants is seen more often in offspring with lower IQ. A surprising proportion of transmission of these rare events comes from genes expressed in the embryonic brain that show sharply reduced expression shortly after birth. PMID:26401017

  8. Screening for rare variants in the PNPLA3 gene in obese liver biopsy patients.

    PubMed

    Zegers, Doreen; Verrijken, An; Francque, Sven; de Freitas, Fenna; Beckers, Sigri; Aerts, Evi; Ruppert, Martin; Hubens, Guy; Michielsen, Peter; Van Hul, Wim; Van Gaal, Luc F

    2016-12-01

    Previous research has clearly implicated the PNPLA3 gene in the etiology of nonalcoholic fatty liver disease as a polymorphism in the gene was found to be robustly associated to the disease. However, data on the involvement of rare PNPLA3 variants in the development of nonalcoholic fatty liver disease (NAFLD) is currently limited. Therefore, we performed an extensive mutation analysis study on a cohort of obese liver biopsy patients to determine PNPLA3 variation and its correlation with fatty liver disease. We screened the entire coding region of the PNPLA3 gene in DNA samples of 393 obese liver biopsy patients with varying degrees of fatty liver disease. Mutation analysis was performed by high-resolution melting curve analysis in combination with direct sequencing. We identified several common polymorphisms as well as one rare synonymous variant (c.867G>A rs139896256), one rare intronic variant (c.979+13C>T) and 3 nonsynonymous coding variants (p.A76T, p.A104V and p.T200M) in the PNPLA3 gene. In silico analysis indicated that the p.A104V variant will probably have no functional effect, whereas for the p.A76T and p.T200M variant a possible pathogenic effect is suggested. Overall, we showed that novel variants in PNPLA3 are very rare in our liver biopsy cohort, thereby indicating that their impact on the etiology of NAFLD is probably limited. Nevertheless, for the three rare coding variants that were identified in patients with advanced liver disease, further functional characterization will be essential to verify their potential disease causality. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  9. Systematic reconstruction of autism biology from massive genetic mutation profiles

    PubMed Central

    Zhang, Chaolin; Jiang, Yong-hui

    2018-01-01

    Autism spectrum disorder (ASD) affects 1% of world population and has become a pressing medical and social problem worldwide. As a paradigmatic complex genetic disease, ASD has been intensively studied and thousands of gene mutations have been reported. Because these mutations rarely recur, it is difficult to (i) pinpoint the fewer disease-causing versus majority random events and (ii) replicate or verify independent studies. A coherent and systematic understanding of autism biology has not been achieved. We analyzed 3392 and 4792 autism-related mutations from two large-scale whole-exome studies across multiple resolution levels, that is, variants (single-nucleotide), genes (protein-coding unit), and pathways (molecular module). These mutations do not recur or replicate at the variant level, but significantly and increasingly do so at gene and pathway levels. Genetic association reveals a novel gene + pathway dual-hit model, where the mutation burden becomes less relevant. In multiple independent analyses, hundreds of variants or genes repeatedly converge to several canonical pathways, either novel or literature-supported. These pathways define recurrent and systematic ASD biology, distinct from previously reported gene groups or networks. They also present a catalog of novel ASD risk factors including 118 variants and 72 genes. At a subpathway level, most variants disrupt the pathway-related gene functions, and in the same gene, they tend to hit residues extremely close to each other and in the same domain. Multiple interacting variants spotlight key modules, including the cAMP (adenosine 3′,5′-monophosphate) second-messenger system and mGluR (metabotropic glutamate receptor) signaling regulation by GRKs (G protein–coupled receptor kinases). At a superpathway level, distinct pathways further interconnect and converge to three biology themes: synaptic function, morphology, and plasticity. PMID:29651456

  10. Systematic reconstruction of autism biology from massive genetic mutation profiles.

    PubMed

    Luo, Weijun; Zhang, Chaolin; Jiang, Yong-Hui; Brouwer, Cory R

    2018-04-01

    Autism spectrum disorder (ASD) affects 1% of world population and has become a pressing medical and social problem worldwide. As a paradigmatic complex genetic disease, ASD has been intensively studied and thousands of gene mutations have been reported. Because these mutations rarely recur, it is difficult to (i) pinpoint the fewer disease-causing versus majority random events and (ii) replicate or verify independent studies. A coherent and systematic understanding of autism biology has not been achieved. We analyzed 3392 and 4792 autism-related mutations from two large-scale whole-exome studies across multiple resolution levels, that is, variants (single-nucleotide), genes (protein-coding unit), and pathways (molecular module). These mutations do not recur or replicate at the variant level, but significantly and increasingly do so at gene and pathway levels. Genetic association reveals a novel gene + pathway dual-hit model, where the mutation burden becomes less relevant. In multiple independent analyses, hundreds of variants or genes repeatedly converge to several canonical pathways, either novel or literature-supported. These pathways define recurrent and systematic ASD biology, distinct from previously reported gene groups or networks. They also present a catalog of novel ASD risk factors including 118 variants and 72 genes. At a subpathway level, most variants disrupt the pathway-related gene functions, and in the same gene, they tend to hit residues extremely close to each other and in the same domain. Multiple interacting variants spotlight key modules, including the cAMP (adenosine 3',5'-monophosphate) second-messenger system and mGluR (metabotropic glutamate receptor) signaling regulation by GRKs (G protein-coupled receptor kinases). At a superpathway level, distinct pathways further interconnect and converge to three biology themes: synaptic function, morphology, and plasticity.

  11. Aberrant Gene Expression in Humans

    PubMed Central

    Yang, Ence; Ji, Guoli; Brinkmeyer-Langford, Candice L.; Cai, James J.

    2015-01-01

    Gene expression as an intermediate molecular phenotype has been a focus of research interest. In particular, studies of expression quantitative trait loci (eQTL) have offered promise for understanding gene regulation through the discovery of genetic variants that explain variation in gene expression levels. Existing eQTL methods are designed for assessing the effects of common variants, but not rare variants. Here, we address the problem by establishing a novel analytical framework for evaluating the effects of rare or private variants on gene expression. Our method starts from the identification of outlier individuals that show markedly different gene expression from the majority of a population, and then reveals the contributions of private SNPs to the aberrant gene expression in these outliers. Using population-scale mRNA sequencing data, we identify outlier individuals using a multivariate approach. We find that outlier individuals are more readily detected with respect to gene sets that include genes involved in cellular regulation and signal transduction, and less likely to be detected with respect to the gene sets with genes involved in metabolic pathways and other fundamental molecular functions. Analysis of polymorphic data suggests that private SNPs of outlier individuals are enriched in the enhancer and promoter regions of corresponding aberrantly-expressed genes, suggesting a specific regulatory role of private SNPs, while the commonly-occurring regulatory genetic variants (i.e., eQTL SNPs) show little evidence of involvement. Additional data suggest that non-genetic factors may also underlie aberrant gene expression. Taken together, our findings advance a novel viewpoint relevant to situations wherein common eQTLs fail to predict gene expression when heritable, rare inter-individual variation exists. 
    The analytical framework we describe, taking into consideration the reality of differential phenotypic robustness, may be valuable for investigating complex traits and conditions. PMID:25617623

  12. Characterization of Homozygous Hb Setif (HBA2: c.283G>T) in the Iranian Population.

    PubMed

    Farashi, Samaneh; Garous, Negin F; Vakili, Shadi; Ashki, Mehri; Imanian, Hashem; Azarkeivan, Azita; Najmabadi, Hossein

    2016-01-01

    Hemoglobin (Hb) variants are abnormalities resulting from point mutations in either of the two α-globin genes (HBA2 or HBA1) or the β-globin gene (HBB). Various reports of Hb variants have been described in Iran and other countries around the world. Hb Setif (or HBA2: c.283G>T) is one of these variants, with a mutation at codon 94 of the α2-globin gene that has been characterized in clinically normal heterozygous individuals. We here report clinical and hematological findings in two homozygous cases of Iranian origin for this unstable Hb variant.

  13. Association of GWAS Top Genes With Late-Onset Alzheimer's Disease in Colombian Population.

    PubMed

    Moreno, Diana Jennifer; Ruiz, Susana; Ríos, Ángela; Lopera, Francisco; Ostos, Henry; Via, Marc; Bedoya, Gabriel

    2017-02-01

    The association of variants in CLU, CR1, PICALM, BIN1, ABCA7, and CD33 genes with late-onset Alzheimer's disease (LOAD) was evaluated and confirmed through genome-wide association studies. However, it is unknown whether these associations can be replicated in admixed populations. The association of 14 single-nucleotide polymorphisms in those genes was evaluated in 280 LOAD cases and 357 controls from the Colombian population. In a multivariate analysis using age, gender, APOE ε4 status, and admixture covariates, significant associations were obtained (P < .05) for variants in BIN1 (rs744373, odds ratio [OR]: 1.42), CLU (rs11136000, OR: 0.66), PICALM (rs541458, OR: 0.69), ABCA7 (rs3764650, OR: 1.7), and CD33 (rs3865444, OR: 1.12). Likewise, a significant interaction effect was observed between CLU and CR1 variants and APOE. This study replicated the associations previously reported in populations of European ancestry and shows that APOE variants have a regulatory role on the effect that variants in other loci have on LOAD, reflecting the importance of gene-gene interactions in the etiology of neurodegenerative diseases.

  14. CDK5RAP2 gene and tau pathophysiology in late-onset sporadic Alzheimer's disease.

    PubMed

    Miron, Justin; Picard, Cynthia; Nilsson, Nathalie; Frappier, Josée; Dea, Doris; Théroux, Louise; Poirier, Judes

    2018-06-01

    Because currently known Alzheimer's disease (AD) single-nucleotide polymorphisms only account for a small fraction of the genetic variance in this disease, there is a need to identify new variants associated with AD. Our team performed a genome-wide association study in the Quebec Founder Population isolate to identify novel protective or risk genetic factors for late-onset sporadic AD and examined the impact of these variants on gene expression and AD pathology. The rs10984186 variant is associated with an increased risk of developing AD and with a higher CDK5RAP2 mRNA prevalence in the hippocampus. On the other hand, the rs4837766 variant, which is among the best cis-expression quantitative trait loci in the CDK5RAP2 gene, is associated with lower mild cognitive impairment/AD risk and conversion rate. The rs10984186 risk and rs4837766 protective polymorphic variants of the CDK5RAP2 gene might act as potent genetic modifiers for AD risk and/or conversion by modulating the expression of this gene. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  15. Exome Sequencing Analysis Reveals Variants in Primary Immunodeficiency Genes in Patients With Very Early Onset Inflammatory Bowel Disease

    PubMed Central

    Kelsen, Judith R.; Dawany, Noor; Moran, Christopher J.; Petersen, Britt-Sabina; Sarmady, Mahdi; Sasson, Ariella; Pauly-Hubbard, Helen; Martinez, Alejandro; Maurer, Kelly; Soong, Joanne; Rappaport, Eric; Franke, Andre; Keller, Andreas; Winter, Harland S.; Mamula, Petar; Piccoli, David; Artis, David; Sonnenberg, Gregory F.; Daly, Mark; Sullivan, Kathleen E.; Baldassano, Robert N.; Devoto, Marcella

    2016-01-01

    Background & Aims: Very early onset inflammatory bowel disease (VEO-IBD), IBD diagnosed ≤5 y of age, frequently presents with a different and more severe phenotype than older-onset IBD. We investigated whether patients with VEO-IBD carry rare or novel variants in genes associated with immunodeficiencies that might contribute to disease development. Methods: Patients with VEO-IBD and parents (when available) were recruited from the Children's Hospital of Philadelphia from March 2013 through July 2014. We analyzed DNA from 125 patients with VEO-IBD (ages 3 weeks to 4 y) and 19 parents, 4 of whom also had IBD. Exome capture was performed by Agilent SureSelect V4, and sequencing was performed using the Illumina HiSeq platform. Alignment to human genome GRCh37 was achieved followed by post-processing and variant calling. Following functional annotation, candidate variants were analyzed for change in protein function, minor allele frequency <0.1%, and scaled combined annotation dependent depletion scores ≤10. We focused on genes associated with primary immunodeficiencies and related pathways. An additional 210 exome samples from patients with pediatric IBD (n=45) or adult-onset Crohn's disease (n=20) and healthy individuals (controls, n=145) were obtained from the University of Kiel, Germany and used as control groups. Results: Four hundred genes and regions associated with primary immunodeficiency, covering approximately 6500 coding exons totaling > 1 Mbp of coding sequence, were selected from the whole exome data. Our analysis revealed novel and rare variants within these genes that could contribute to the development of VEO-IBD, including rare heterozygous missense variants in IL10RA and previously unidentified variants in MSH5 and CD19. Conclusions: In an exome sequence analysis of patients with VEO-IBD and their parents, we identified variants in genes that regulate B- and T-cell functions and could contribute to pathogenesis. Our analysis could lead to the identification of previously unidentified IBD-associated variants. PMID:26193622

  16. Reanalysis of BRCA1/2 negative high risk ovarian cancer patients reveals novel germline risk loci and insights into missing heritability

    PubMed Central

    Dyson, Gregory; Levin, Nancy K.; Chaudhry, Sophia; Rosati, Rita; Kalpage, Hasini; Simon, Michael S.; Tainsky, Michael A.

    2017-01-01

    While up to 25% of ovarian cancer (OVCA) cases are thought to be due to inherited factors, the majority of genetic risk remains unexplained. To address this gap, we sought to identify previously undescribed OVCA risk variants through whole exome sequencing (WES) and candidate gene analysis of 48 women with ovarian cancer who were selected for high risk of genetic inheritance, yet were negative for any known pathogenic variants in either BRCA1 or BRCA2. In silico SNP analysis was employed to identify suspect variants followed by validation using Sanger DNA sequencing. We identified five pathogenic variants in our sample, four of which are in two genes featured on current multi-gene panels (RAD51D and ATM). In addition, we found a pathogenic FANCM variant (R1931*) which has been recently implicated in familial breast cancer risk. Numerous rare and predicted to be damaging variants of unknown significance were detected in genes on current commercial testing panels, most prominently in ATM (n = 6) and PALB2 (n = 5). The BRCA2 variant p.K3326*, resulting in a 93 amino acid truncation, was overrepresented in our sample (odds ratio = 4.95, p = 0.01) and coexisted in the germline of these women with other deleterious variants, suggesting a possible role as a modifier of genetic penetrance. Furthermore, we detected loss of function variants in non-panel genes involved in OVCA-relevant pathways (DNA repair and cell cycle control), including CHEK1, TP53I3, REC8, HMMR, RAD52, RAD1, POLK, POLQ, and MCM4. In summary, our study implicates novel risk loci and highlights the clinical utility of retesting BRCA1/2 negative OVCA patients by genomic sequencing and analysis of genes in relevant pathways. PMID:28591191

  17. Pooled Sequencing of 531 Genes in Inflammatory Bowel Disease Identifies an Associated Rare Variant in BTNL2 and Implicates Other Immune Related Genes

    PubMed Central

    Prescott, Natalie J.; Lehne, Benjamin; Stone, Kristina; Lee, James C.; Taylor, Kirstin; Knight, Jo; Papouli, Efterpi; Mirza, Muddassar M.; Simpson, Michael A.; Spain, Sarah L.; Lu, Grace; Fraternali, Franca; Bumpstead, Suzannah J.; Gray, Emma; Amar, Ariella; Bye, Hannah; Green, Peter; Chung-Faye, Guy; Hayee, Bu’Hussain; Pollok, Richard; Satsangi, Jack; Parkes, Miles; Barrett, Jeffrey C.; Mansfield, John C.; Sanderson, Jeremy; Lewis, Cathryn M.; Weale, Michael E.; Schlitt, Thomas; Mathew, Christopher G.

    2015-01-01

    The contribution of rare coding sequence variants to genetic susceptibility in complex disorders is an important but unresolved question. Most studies thus far have investigated a limited number of genes from regions which contain common disease associated variants. Here we investigate this in inflammatory bowel disease by sequencing the exons and proximal promoters of 531 genes selected from both genome-wide association studies and pathway analysis in pooled DNA panels from 474 cases of Crohn’s disease and 480 controls. 80 variants with evidence of association in the sequencing experiment or with potential functional significance were selected for follow up genotyping in 6,507 IBD cases and 3,064 population controls. The top 5 disease associated variants were genotyped in an extension panel of 3,662 IBD cases and 3,639 controls, and tested for association in a combined analysis of 10,147 IBD cases and 7,008 controls. A rare coding variant p.G454C in the BTNL2 gene within the major histocompatibility complex was significantly associated with increased risk for IBD (p = 9.65 × 10^−10, OR = 2.3 [95% CI = 1.75–3.04]), but was independent of the known common associated CD and UC variants at this locus. Rare (<1%) and low frequency (1–5%) variants in 3 additional genes showed suggestive association (p<0.005) with either an increased risk (ARIH2 c.338-6C>T) or decreased risk (IL12B p.V298F, and NICN p.H191R) of IBD. These results provide additional insights into the involvement of the inhibition of T cell activation in the development of both sub-phenotypes of inflammatory bowel disease. We suggest that although rare coding variants may make a modest overall contribution to complex disease susceptibility, they can inform our understanding of the molecular pathways that contribute to pathogenesis. PMID:25671699

  18. Germline variant FGFR4 p.G388R exposes a membrane-proximal STAT3 binding site.

    PubMed

    Ulaganathan, Vijay K; Sperl, Bianca; Rapp, Ulf R; Ullrich, Axel

    2015-12-24

    Variant rs351855-G/A is a commonly occurring single-nucleotide polymorphism of coding regions in exon 9 of the fibroblast growth factor receptor FGFR4 (CD334) gene (c.1162G>A). It results in an amino-acid change at codon 388 from glycine to arginine (p.Gly388Arg) in the transmembrane domain of the receptor. Despite compelling genetic evidence for the association of this common variant with cancers of the bone, breast, colon, prostate, skin, lung, head and neck, as well as soft-tissue sarcomas and non-Hodgkin lymphoma, the underlying biological mechanism has remained elusive. Here we show that substitution of the conserved glycine 388 residue to a charged arginine residue alters the transmembrane spanning segment and exposes a membrane-proximal cytoplasmic signal transducer and activator of transcription 3 (STAT3) binding site Y(390)-(P)XXQ(393). We demonstrate that such membrane-proximal STAT3 binding motifs in the germline of type I membrane receptors enhance STAT3 tyrosine phosphorylation by recruiting STAT3 proteins to the inner cell membrane. Remarkably, such germline variants frequently co-localize with somatic mutations in the Catalogue of Somatic Mutations in Cancer (COSMIC) database. Using Fgfr4 single nucleotide polymorphism knock-in mice and transgenic mouse models for breast and lung cancers, we validate the enhanced STAT3 signalling induced by the FGFR4 Arg388-variant in vivo. Thus, our findings elucidate the molecular mechanism behind the genetic association of rs351855 with accelerated cancer progression and suggest that germline variants of cell-surface molecules that recruit STAT3 to the inner cell membrane are a significant risk for cancer prognosis and disease progression.

  19. The insulin-sensitivity sulphonylurea receptor variant is associated with thyrotoxic paralysis.

    PubMed

    Rolim, Ana Luiza R; Lindsey, Susan C; Kunii, Ilda S; Crispim, Felipe; Moisés, Regina Célia M S; Maciel, Rui M B; Dias-da-Silva, Magnus R

    2014-10-01

    Thyrotoxicosis is the most common cause of the acquired flaccid muscle paralysis in adults called thyrotoxic periodic paralysis (TPP) and is characterised by transient hypokalaemia and hypophosphataemia under high thyroid hormone levels that is frequently precipitated by carbohydrate load. The sulphonylurea receptor 1 (SUR1 (ABCC8)) is an essential regulatory subunit of the β-cell ATP-sensitive K(+) channel that controls insulin secretion after feeding. Additionally, the SUR1 Ala1369Ser variant appears to be associated with insulin sensitivity. We examined the ABCC8 gene at the single nucleotide level using PCR-restriction fragment length polymorphism (RFLP) analysis to determine its allelic variant frequency and calculated the frequency of the Ala1369Ser C-allele variant in a cohort of 36 Brazilian TPP patients in comparison with 32 controls presenting with thyrotoxicosis without paralysis (TWP). We verified that the frequency of the alanine 1369 C-allele was significantly higher in TPP patients than in TWP patients (61.1 vs 34.4%, odds ratio (OR)=3.42, P=0.039) and was significantly more common than the minor allele frequency observed in the general population from the 1000 Genomes database (61.1 vs 29.0%, OR=4.87, P<0.005). Additionally, the C-allele frequency was similar between TWP patients and the general population (34.4 vs 29%, OR=1.42, P=0.325). We have demonstrated that SUR1 alanine 1369 variant is associated with allelic susceptibility to TPP. We suggest that the hyperinsulinaemia that is observed in TPP may be linked to the ATP-sensitive K(+)/SUR1 alanine variant and, therefore, contribute to the major feedforward precipitating factors in the pathophysiology of TPP. © 2014 Society for Endocrinology.

  20. Divergent Ah Receptor Ligand Selectivity during Hominin Evolution

    PubMed Central

    Hubbard, Troy D.; Murray, Iain A.; Bisson, William H.; Sullivan, Alexis P.; Sebastian, Aswathy; Perry, George H.; Jablonski, Nina G.; Perdew, Gary H.

    2016-01-01

    We have identified a fixed nonsynonymous sequence difference between humans (Val381; derived variant) and Neandertals (Ala381; ancestral variant) in the ligand-binding domain of the aryl hydrocarbon receptor (AHR) gene. In an exome sequence analysis of four Neandertal and Denisovan individuals compared with nine modern humans, there are only 90 total nucleotide sites genome-wide for which archaic hominins are fixed for the ancestral nonsynonymous variant and the modern humans are fixed for the derived variant. Of those sites, only 27, including Val381 in the AHR, also have no reported variability in the human dbSNP database, further suggesting that this highly conserved functional variant is a rare event. Functional analysis of the amino acid variant Ala381 within the AHR carried by Neandertals and nonhuman primates indicate enhanced polycyclic aromatic hydrocarbon (PAH) binding, DNA binding capacity, and AHR mediated transcriptional activity compared with the human AHR. Also relative to human AHR, the Neandertal AHR exhibited 150–1000 times greater sensitivity to induction of Cyp1a1 and Cyp1b1 expression by PAHs (e.g., benzo(a)pyrene). The resulting CYP1A1/CYP1B1 enzymes are responsible for PAH first pass metabolism, which can result in the generation of toxic intermediates and perhaps AHR-associated toxicities. In contrast, the human AHR retains the ancestral sensitivity observed in primates to nontoxic endogenous AHR ligands (e.g., indole, indoxyl sulfate). Our findings reveal that a functionally significant change in the AHR occurred uniquely in humans, relative to other primates, that would attenuate the response to many environmental pollutants, including chemicals present in smoke from fire use during cooking. PMID:27486223

  1. A Shared Genetic Basis for Self-Limited Delayed Puberty and Idiopathic Hypogonadotropic Hypogonadism

    PubMed Central

    Zhu, Jia; Choa, Ruth E.-Y.; Guo, Michael H.; Plummer, Lacey; Buck, Cassandra; Palmert, Mark R.; Hirschhorn, Joel N.; Seminara, Stephanie B.

    2015-01-01

    Context: Delayed puberty (DP) is a common issue and, in the absence of an underlying condition, is typically self-limited. Although DP seems to be heritable, no specific genetic cause for DP has yet been reported. In contrast, many genetic causes have been found for idiopathic hypogonadotropic hypogonadism (IHH), a rare disorder characterized by absent or stalled pubertal development. Objective: The objective of this retrospective study, conducted at academic medical centers, was to determine whether variants in IHH genes contribute to the pathogenesis of DP. Subjects and Outcome Measures: Potentially pathogenic variants in IHH genes were identified in two cohorts: 1) DP family members of an IHH proband previously found to have a variant in an IHH gene, with unaffected family members serving as controls, and 2) DP individuals with no family history of IHH, with ethnically matched control subjects drawn from the Exome Aggregation Consortium. Results: In pedigrees with an IHH proband, the proband's variant was shared by 53% (10/19) of DP family members vs 12% (4/33) of unaffected family members (P = .003). In DP subjects with no family history of IHH, 14% (8/56) had potentially pathogenic variants in IHH genes vs 5.6% (1 907/33 855) of controls (P = .01). Potentially pathogenic variants were found in multiple DP subjects for the genes IL17RD and TAC3. Conclusions: These findings suggest that variants in IHH genes can contribute to the pathogenesis of self-limited DP. Thus, at least in some cases, self-limited DP shares an underlying pathophysiology with IHH. PMID:25636053
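
    The pedigree comparison in this abstract (10/19 DP family members vs 4/33 unaffected family members) is a 2×2 contingency table. The abstract does not say which test produced P = .003; the sketch below assumes a two-sided Fisher exact test, a common choice for small counts, and the function name is illustrative only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumption: this is one plausible test for the counts in the abstract,
    not necessarily the one the authors used.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)

    def prob(k):
        # Hypergeometric probability of k carriers in the first row,
        # with all row and column totals held fixed.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    lo = max(0, row1 - (total - col1))
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in (prob(k) for k in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Carriers / non-carriers: DP relatives 10/9, unaffected relatives 4/29.
p_value = fisher_exact_two_sided(10, 9, 4, 29)
print(f"carrier rates: {10/19:.0%} vs {4/33:.0%}, p = {p_value:.4f}")
```

    Under this assumption the p-value is of the same order as the reported P = .003.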

  2. Microarray gene expression profiling analysis combined with bioinformatics in multiple sclerosis.

    PubMed

    Liu, Mingyuan; Hou, Xiaojun; Zhang, Ping; Hao, Yong; Yang, Yiting; Wu, Xiongfeng; Zhu, Desheng; Guan, Yangtai

    2013-05-01

    Multiple sclerosis (MS) is the most prevalent demyelinating disease and the principal cause of neurological disability in young adults. Recent microarray gene expression profiling studies have identified several genetic variants contributing to the complex pathogenesis of MS; however, expressional and functional studies are still required to further understand its molecular mechanism. The present study aimed to analyze the molecular mechanism of MS using microarray analysis combined with bioinformatics techniques. We downloaded the gene expression profile of MS from Gene Expression Omnibus (GEO) and analysed the microarray data using the differentially coexpressed genes (DCGs) and links package in R and the Database for Annotation, Visualization and Integrated Discovery. The regulatory impact factor (RIF) algorithm was used to measure the impact factor of transcription factors. A total of 1,297 DCGs between MS patients and healthy controls were identified. Functional annotation indicated that these DCGs were associated with immune and neurological functions. Furthermore, the RIF result suggested that IKZF1, BACH1, CEBPB, EGR1, and FOS may play central regulatory roles in controlling gene expression in the pathogenesis of MS. Our findings confirm the presence of multiple molecular alterations in MS and indicate the possibility for identifying prognostic factors associated with MS pathogenesis.

  3. HeLa Nucleic Acid Contamination in The Cancer Genome Atlas Leads to the Misidentification of Human Papillomavirus 18

    PubMed Central

    Cantalupo, Paul G.; Katz, Joshua P.

    2015-01-01

    ABSTRACT We searched The Cancer Genome Atlas (TCGA) database for viruses by comparing non-human reads present in transcriptome sequencing (RNA-Seq) and whole-exome sequencing (WXS) data to viral sequence databases. Human papillomavirus 18 (HPV18) is an etiologic agent of cervical cancer, and as expected, we found robust expression of HPV18 genes in cervical cancer samples. In agreement with previous studies, we also found HPV18 transcripts in non-cervical cancer samples, including those from the colon, rectum, and normal kidney. However, in each of these cases, HPV18 gene expression was low, and single-nucleotide variants and positions of genomic alignments matched the integrated portion of HPV18 present in HeLa cells. Chimeric reads that match a known virus-cell junction of HPV18 integrated in HeLa cells were also present in some samples. We hypothesize that HPV18 sequences in these non-cervical samples are due to nucleic acid contamination from HeLa cells. This finding highlights the problems that contamination presents in computational virus detection pipelines. IMPORTANCE Viruses associated with cancer can be detected by searching tumor sequence databases. Several studies involving searches of the TCGA database have reported the presence of HPV18, a known cause of cervical cancer, in a small number of additional cancers, including those of the rectum, kidney, and colon. We have determined that the sequences related to HPV18 in non-cervical samples are due to nucleic acid contamination from HeLa cells. To our knowledge, this is the first report of the misidentification of viruses in next-generation sequencing data of tumors due to contamination with a cancer cell line. These results raise awareness of the difficulty of accurately identifying viruses in human sequence databases. PMID:25631090

  4. Association of ICAM-1 and HMGA1 Gene Variants with Retinopathy in Type 2 Diabetes Mellitus Among Chinese Individuals.

    PubMed

    Lv, Zhiping; Li, Ying; Wu, Yongzhong; Qu, Yi

    2016-08-01

    To evaluate the association of intercellular cell-adhesion molecule 1 (ICAM-1) and high-mobility group A1 (HMGA1) gene variants with diabetic retinopathy (DR) in a Chinese type 2 diabetes mellitus (T2DM) cohort. A total of 792 patients with T2DM were enrolled and categorized into two groups: (1) the DR group consisted of 448 patients, which was further subclassified into the proliferative DR (PDR) group with 220 patients and the nonproliferative DR (NPDR) group with 228 patients; (2) the diabetes without retinopathy (DNR) group comprised 344 patients who had no signs of DR. The single-nucleotide polymorphism (SNP) rs5498 in ICAM-1 gene and IVS5-13insC variant in HMGA1 gene were genotyped. No evident association was found in the allele frequencies between SNP rs5498 in ICAM-1 gene and DR patients; the combined p values for the additive, dominant, and recessive models in genotype were greater than 0.05. No significant association was identified between the IVS5-13insC variant in HMGA1 gene and DR individuals. Our results revealed that SNP rs5498 in ICAM-1 gene and IVS5-13insC variant in HMGA1 gene were not associated with the susceptibility of DR in the Chinese T2DM cohort.

  5. Mapping cis- and trans-regulatory effects across multiple tissues in twins

    PubMed Central

    Grundberg, Elin; Small, Kerrin S.; Hedman, Åsa K.; Nica, Alexandra C.; Buil, Alfonso; Keildson, Sarah; Bell, Jordana T.; Yang, Tsun-Po; Meduri, Eshwar; Barrett, Amy; Nisbett, James; Sekowska, Magdalena; Wilk, Alicja; Shin, So-Youn; Glass, Daniel; Travers, Mary; Min, Josine L.; Ring, Sue; Ho, Karen; Thorleifsson, Gudmar; Kong, Augustine; Thorsteindottir, Unnur; Ainali, Chrysanthi; Dimas, Antigone S.; Hassanali, Neelam; Ingle, Catherine; Knowles, David; Krestyaninova, Maria; Lowe, Christopher E.; Di Meglio, Paola; Montgomery, Stephen B.; Parts, Leopold; Potter, Simon; Surdulescu, Gabriela; Tsaprouni, Loukia; Tsoka, Sophia; Bataille, Veronique; Durbin, Richard; Nestle, Frank O.; O’Rahilly, Stephen; Soranzo, Nicole; Lindgren, Cecilia M.; Zondervan, Krina T.; Ahmadi, Kourosh R.; Schadt, Eric E.; Stefansson, Kari; Smith, George Davey; McCarthy, Mark I.; Deloukas, Panos; Dermitzakis, Emmanouil T.; Spector, Tim D.

    2013-01-01

    Sequence-based variation in gene expression is a key driver of disease risk. Common variants regulating expression in cis have been mapped in many eQTL studies typically in single tissues from unrelated individuals. Here, we present a comprehensive analysis of gene expression across multiple tissues conducted in a large set of mono- and dizygotic twins that allows systematic dissection of genetic (cis and trans) and non-genetic effects on gene expression. Using identity-by-descent estimates, we show that at least 40% of the total heritable cis-effect on expression cannot be accounted for by common cis-variants, a finding which exposes the contribution of low frequency and rare regulatory variants with respect to both transcriptional regulation and complex trait susceptibility. We show that a substantial proportion of gene expression heritability is trans to the structural gene and identify several replicating trans-variants which act predominantly in a tissue-restricted manner and may regulate the transcription of many genes. PMID:22941192

  6. Rare Variant Association Test with Multiple Phenotypes

    PubMed Central

    Lee, Selyeong; Won, Sungho; Kim, Young Jin; Kim, Yongkang; Kim, Bong-Jo; Park, Taesung

    2016-01-01

    Although genome-wide association studies (GWAS) have now discovered thousands of genetic variants associated with common traits, such variants cannot explain the large degree of “missing heritability,” likely due to rare variants. The advent of next generation sequencing technology has allowed rare variant detection and association with common traits, often by investigating specific genomic regions for rare variant effects on a trait. Although multiply correlated phenotypes are often concurrently observed in GWAS, most studies analyze only single phenotypes, which may lessen statistical power. To increase power, multivariate analyses, which consider correlations between multiple phenotypes, can be used. However, few existing multi-variant analyses can identify rare variants for assessing multiple phenotypes. Here, we propose Multivariate Association Analysis using Score Statistics (MAAUSS), to identify rare variants associated with multiple phenotypes, based on the widely used Sequence Kernel Association Test (SKAT) for a single phenotype. We applied MAAUSS to Whole Exome Sequencing (WES) data from a Korean population of 1,058 subjects, to discover genes associated with multiple traits of liver function. We then assessed validation of those genes by a replication study, using an independent dataset of 3,445 individuals. Notably, we detected the gene ZNF620 among five significant genes. We then performed a simulation study to compare MAAUSS's performance with existing methods. Overall, MAAUSS successfully conserved type 1 error rates and in many cases, had a higher power than the existing methods. This study illustrates a feasible and straightforward approach for identifying rare variants correlated with multiple phenotypes, with likely relevance to missing heritability. PMID:28039885
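
    The abstract describes MAAUSS as building on the single-phenotype SKAT score statistic. As a rough illustration of that building block only (not of MAAUSS itself), the SKAT variance-component statistic with a weighted linear kernel can be written as Q = Σ_j w_j (Σ_i g_ij (y_i − μ_i))². The sketch below assumes a quantitative trait, an intercept-only null model (so μ_i is the sample mean), and caller-supplied weights; it does not compute a p-value, and all names and toy data are hypothetical.

```python
def skat_q(genotypes, phenotype, weights):
    """SKAT-style score statistic Q for one region and one phenotype.

    genotypes: list of per-subject lists of minor-allele counts (0, 1, 2)
    phenotype: list of quantitative trait values, one per subject
    weights:   list of per-variant weights w_j
    Assumption: the null model has an intercept only, so the fitted mean
    is the sample mean of the phenotype.
    """
    n = len(phenotype)
    mean = sum(phenotype) / n
    residuals = [y - mean for y in phenotype]
    q = 0.0
    for j, w in enumerate(weights):
        # Per-variant score: genotype column dotted with null residuals.
        score = sum(genotypes[i][j] * residuals[i] for i in range(n))
        q += w * score ** 2
    return q


# Toy data: 4 subjects, 2 rare variants (values are made up).
G = [[0, 1], [1, 0], [0, 0], [1, 1]]
y = [1.0, 2.0, 3.0, 2.0]
print(skat_q(G, y, [1.0, 1.0]))
```

    Larger weights on rarer variants are the usual design choice in kernel tests of this kind, which is why the weights are left as an explicit argument here.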

  7. Burden of rare variants in ALS genes influences survival in familial and sporadic ALS.

    PubMed

    Pang, Shirley Yin-Yu; Hsu, Jacob Shujui; Teo, Kay-Cheong; Li, Yan; Kung, Michelle H W; Cheah, Kathryn S E; Chan, Danny; Cheung, Kenneth M C; Li, Miaoxin; Sham, Pak-Chung; Ho, Shu-Leong

    2017-10-01

    Genetic variants are implicated in the development of amyotrophic lateral sclerosis (ALS), but it is unclear whether the burden of rare variants in ALS genes has an effect on survival. We performed whole genome sequencing on 8 familial ALS (FALS) patients with superoxide dismutase 1 (SOD1) mutation and whole exome sequencing on 46 sporadic ALS (SALS) patients living in Hong Kong and found that 67% had at least 1 rare variant in the exons of 40 ALS genes; 22% had 2 or more. Patients with 2 or more rare variants had lower probability of survival than patients with 0 or 1 variant (p = 0.001). After adjusting for other factors, each additional rare variant increased the risk of respiratory failure or death by 60% (p = 0.0098). The presence of the rare variant was associated with the risk of ALS (Odds ratio 1.91, 95% confidence interval 1.03-3.61, p = 0.03), and ALS patients had higher rare variant burden than controls (MB, p = 0.004). Our findings support an oligogenic basis with the burden of rare variants affecting the development and survival of ALS. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  8. Functional Analysis of OMICs Data and Small Molecule Compounds in an Integrated "Knowledge-Based" Platform.

    PubMed

    Dubovenko, Alexey; Nikolsky, Yuri; Rakhmatulin, Eugene; Nikolskaya, Tatiana

    2017-01-01

    Analysis of NGS and other sequencing data, gene variants, gene expression, proteomics, and other high-throughput (OMICs) data is challenging because of its biological complexity and high level of technical and biological noise. One way to deal with both problems is to perform analysis with a high fidelity annotated knowledgebase of protein interactions, pathways, and functional ontologies. This knowledgebase has to be structured in a computer-readable format and must include software tools for managing experimental data, analysis, and reporting. Here, we present MetaCore™ and Key Pathway Advisor (KPA), an integrated platform for functional data analysis. On the content side, MetaCore and KPA encompass a comprehensive database of molecular interactions of different types, pathways, network models, and ten functional ontologies covering human, mouse, and rat genes. The analytical toolkit includes tools for gene/protein list enrichment analysis, statistical "interactome" tool for the identification of over- and under-connected proteins in the dataset, and a biological network analysis module made up of network generation algorithms and filters. The suite also features Advanced Search, an application for combinatorial search of the database content, as well as a Java-based tool called Pathway Map Creator for drawing and editing custom pathway maps. Applications of MetaCore and KPA include molecular mode of action of disease research, identification of potential biomarkers and drug targets, pathway hypothesis generation, analysis of biological effects for novel small molecule compounds and clinical applications (analysis of large cohorts of patients, and translational and personalized medicine).

  9. An interactive mutation database for human coagulation factor IX provides novel insights into the phenotypes and genetics of hemophilia B.

    PubMed

    Rallapalli, P M; Kemball-Cook, G; Tuddenham, E G; Gomez, K; Perkins, S J

    2013-07-01

    Factor IX (FIX) is important in the coagulation cascade, being activated to FIXa on cleavage. Defects in the human F9 gene frequently lead to hemophilia B. To assess 1113 unique F9 mutations corresponding to 3721 patient entries in a new and up-to-date interactive web database alongside the FIXa protein structure. The mutations database was built using MySQL and structural analyses were based on a homology model for the human FIXa structure based on closely-related crystal structures. Mutations have been found in 336 (73%) out of 461 residues in FIX. There were 812 unique point mutations, 182 deletions, 54 polymorphisms, 39 insertions and 26 others that together comprise a total of 1113 unique variants. The 64 unique mild severity mutations in the mature protein with known circulating protein phenotypes include 15 (23%) quantitative type I mutations and 41 (64%) predominantly qualitative type II mutations. Inhibitors were described in 59 reports (1.6%) corresponding to 25 unique mutations. The interactive database provides insights into mechanisms of hemophilia B. Type II mutations are deduced to disrupt predominantly those structural regions involved with functional interactions. The interactive features of the database will assist in making judgments about patient management. © 2013 International Society on Thrombosis and Haemostasis.

  10. Isolation and characterization of alternatively spliced variants of the mouse sigma1 receptor gene, Sigmar1.

    PubMed

    Pan, Ling; Pasternak, David A; Xu, Jin; Xu, Mingming; Lu, Zhigang; Pasternak, Gavril W; Pan, Ying-Xian

    2017-01-01

    The sigma1 receptor acts as a chaperone at the endoplasmic reticulum, associates with multiple proteins in various cellular systems, and is involved in a number of diseases, such as addiction, pain, cancer and psychiatric disorders. The sigma1 receptor is encoded by the single copy SIGMAR1 gene. The current study identifies five alternatively spliced variants of the mouse sigma1 receptor gene using a polymerase chain reaction cloning approach. All the splice variants are generated by exon skipping or alternative 3' or 5' splicing, producing the truncated sigma1 receptor. Similar alternative splicing has been observed in the human SIGMAR1 gene based on the molecular cloning or genome sequence prediction, suggesting conservation of alternative splicing of SIGMAR1 gene. Using quantitative polymerase chain reactions, we demonstrate differential expression of several splice variants in mouse tissues and brain regions. When expressed in HEK293 cells, all the splice variants fail to bind sigma ligands, implying that each truncated region in these splice variants is important for ligand binding. However, co-immunoprecipitation (Co-IP) study in HEK293 cells co-transfected with tagged constructs reveals that all the splice variants maintain their ability to physically associate with a mu opioid receptor (mMOR-1), providing useful information to correlate the motifs/sequences necessary for their physical association. Furthermore, a competition Co-IP study showed that all the variants can disrupt in a dose-dependent manner the dimerization of the original sigma1 receptor with mMOR-1, suggesting a potential dominant negative function and providing significant insights into their function.

  11. Multiplex Degenerate Primer Design for Targeted Whole Genome Amplification of Many Viral Genomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gardner, Shea N.; Jaing, Crystal J.; Elsheikh, Maher M.

    Background. Targeted enrichment improves coverage of highly mutable viruses at low concentration in complex samples. Degenerate primers that anneal to conserved regions can facilitate amplification of divergent, low concentration variants, even when the strain present is unknown. Results. A tool for designing multiplex sets of degenerate sequencing primers to tile overlapping amplicons across multiple whole genomes is described. The new script, run_tiled_primers, is part of the PriMux software. Primers were designed for each segment of South American hemorrhagic fever viruses, tick-borne encephalitis, Henipaviruses, Arenaviruses, Filoviruses, Crimean-Congo hemorrhagic fever virus, Rift Valley fever virus, and Japanese encephalitis virus. Each group is highly diverse with as little as 5% genome consensus. Primer sets were computationally checked for nontarget cross reactions against the NCBI nucleotide sequence database. Primers for murine hepatitis virus were demonstrated in the lab to specifically amplify selected genes from a laboratory cultured strain that had undergone extensive passage in vitro and in vivo. Conclusions. This software should help researchers design multiplex sets of primers for targeted whole genome enrichment prior to sequencing to obtain better coverage of low titer, divergent viruses. Applications include viral discovery from a complex background and improved sensitivity and coverage of rapidly evolving strains or variants in a gene family.
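
    To make the idea of a degenerate primer concrete, the sketch below uses the standard IUPAC nucleotide ambiguity codes to count how many concrete sequences a primer stands for, list them, and test whether it matches a given target site. It is a minimal illustration, not the PriMux algorithm or the run_tiled_primers script; the function names and example sequences are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for code in primer:
        count *= len(IUPAC[code])
    return count


def expand(primer):
    """All concrete sequences encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in primer))]


def matches(primer, site):
    """True if every base of the site is allowed at that primer position."""
    return len(primer) == len(site) and all(
        base in IUPAC[code] for code, base in zip(primer, site)
    )


print(degeneracy("ACGTRY"), expand("AR"), matches("ACRT", "ACGT"))
```

    A higher degeneracy lets one primer cover more divergent strains, at the cost of diluting each individual sequence in the primer pool.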

  12. A targeted sequencing panel identifies rare damaging variants in multiple genes in the cranial neural tube defect, anencephaly

    PubMed Central

    Cullup, T.; Boustred, C.; James, C.; Docker, J.; English, C.; Lench, N.; Copp, A.J.; Moore, G.E.; Greene, N.D.E.; Stanier, P.

    2018-01-01

    Neural tube defects (NTDs) affecting the brain (anencephaly) are lethal before or at birth, whereas lower spinal defects (spina bifida) may lead to lifelong neurological handicap. Collectively, NTDs rank among the most common birth defects worldwide. This study focuses on anencephaly, which despite having a similar frequency to spina bifida and being the most common type of NTD observed in mouse models, has had more limited inclusion in genetic studies. A genetic influence is strongly implicated in determining risk of NTDs and a molecular diagnosis is of fundamental importance to families both in terms of understanding the origin of the condition and for managing future pregnancies. Here we used a custom panel of 191 NTD candidate genes to screen 90 patients with cranial NTDs (n = 85 anencephaly and n = 5 craniorachischisis) with a targeted exome sequencing platform. After filtering and comparing to our in‐house control exome database (N = 509), we identified 397 rare variants (minor allele frequency, MAF < 1%), 21 of which were previously unreported and predicted damaging. This included 1 frameshift (PDGFRA), 2 stop‐gained (MAT1A; NOS2) and 18 missense variations. Together with evidence for oligogenic inheritance, this study provides new information on the possible genetic causation of anencephaly. PMID:29205322
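
    The rare-variant filter mentioned above (MAF < 1% against a control set of 509 exomes) can be sketched as follows. This is a minimal sketch only: the study's actual pipeline is not described in the abstract, the record layout and gene labels are hypothetical, and it assumes autosomal diploid loci with every control genotyped.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str                 # hypothetical label, not a gene from the study
    consequence: str          # e.g. "missense", "frameshift"
    control_alt_alleles: int  # alternate-allele count observed in controls


def minor_allele_frequency(alt_alleles, n_controls):
    """MAF in a diploid control set: the rarer of the two allele frequencies."""
    total_alleles = 2 * n_controls
    alt_frequency = alt_alleles / total_alleles
    return min(alt_frequency, 1.0 - alt_frequency)


def rare_variants(variants, n_controls, threshold=0.01):
    """Keep variants whose control MAF is strictly below the threshold."""
    return [
        v for v in variants
        if minor_allele_frequency(v.control_alt_alleles, n_controls) < threshold
    ]


if __name__ == "__main__":
    calls = [
        Variant("GENE_A", "missense", 0),     # absent from controls
        Variant("GENE_B", "frameshift", 10),  # 10 / 1018 alleles
        Variant("GENE_C", "missense", 11),    # 11 / 1018 alleles
    ]
    for v in rare_variants(calls, n_controls=509):
        print(v.gene, v.consequence)
```

    With 509 controls there are 1,018 control alleles, so under these assumptions a variant passes the 1% threshold only if it is seen on at most 10 of them.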

  13. Multiplex Degenerate Primer Design for Targeted Whole Genome Amplification of Many Viral Genomes

    DOE PAGES

    Gardner, Shea N.; Jaing, Crystal J.; Elsheikh, Maher M.; ...

    2014-01-01

    Background. Targeted enrichment improves coverage of highly mutable viruses at low concentration in complex samples. Degenerate primers that anneal to conserved regions can facilitate amplification of divergent, low concentration variants, even when the strain present is unknown. Results. A tool for designing multiplex sets of degenerate sequencing primers to tile overlapping amplicons across multiple whole genomes is described. The new script, run_tiled_primers, is part of the PriMux software. Primers were designed for each segment of South American hemorrhagic fever viruses, tick-borne encephalitis, Henipaviruses, Arenaviruses, Filoviruses, Crimean-Congo hemorrhagic fever virus, Rift Valley fever virus, and Japanese encephalitis virus. Each group is highly diverse with as little as 5% genome consensus. Primer sets were computationally checked for nontarget cross reactions against the NCBI nucleotide sequence database. Primers for murine hepatitis virus were demonstrated in the lab to specifically amplify selected genes from a laboratory cultured strain that had undergone extensive passage in vitro and in vivo. Conclusions. This software should help researchers design multiplex sets of primers for targeted whole genome enrichment prior to sequencing to obtain better coverage of low titer, divergent viruses. Applications include viral discovery from a complex background and improved sensitivity and coverage of rapidly evolving strains or variants in a gene family.

  14. Correlation of rare coding variants in the gene encoding human glucokinase regulatory protein with phenotypic, cellular, and kinetic outcomes.

    PubMed

    Rees, Matthew G; Ng, David; Ruppert, Sarah; Turner, Clesson; Beer, Nicola L; Swift, Amy J; Morken, Mario A; Below, Jennifer E; Blech, Ilana; Mullikin, James C; McCarthy, Mark I; Biesecker, Leslie G; Gloyn, Anna L; Collins, Francis S

    2012-01-01

    Defining the genetic contribution of rare variants to common diseases is a major basic and clinical science challenge that could offer new insights into disease etiology and provide potential for directed gene- and pathway-based prevention and treatment. Common and rare nonsynonymous variants in the GCKR gene are associated with alterations in metabolic traits, most notably serum triglyceride levels. GCKR encodes glucokinase regulatory protein (GKRP), a predominantly nuclear protein that inhibits hepatic glucokinase (GCK) and plays a critical role in glucose homeostasis. The mode of action of rare GCKR variants remains unexplored. We identified 19 nonsynonymous GCKR variants among 800 individuals from the ClinSeq medical sequencing project. Excluding the previously described common missense variant p.Pro446Leu, all variants were rare in the cohort. Accordingly, we functionally characterized all variants to evaluate their potential phenotypic effects. Defects were observed for the majority of the rare variants after assessment of cellular localization, ability to interact with GCK, and kinetic activity of the encoded proteins. Comparing the individuals with functional rare variants to those without such variants showed associations with lipid phenotypes. Our findings suggest that, while nonsynonymous GCKR variants, excluding p.Pro446Leu, are rare in individuals of mixed European descent, the majority do affect protein function. In sum, this study utilizes computational, cell biological, and biochemical methods to present a model for interpreting the clinical significance of rare genetic variants in common disease.

  15. PERCH: A Unified Framework for Disease Gene Prioritization.

    PubMed

    Feng, Bing-Jian

    2017-03-01

    To interpret genetic variants discovered from next-generation sequencing, integration of heterogeneous information is vital for success. This article describes a framework named PERCH (Polymorphism Evaluation, Ranking, and Classification for a Heritable trait), available at http://BJFengLab.org/. It can prioritize disease genes by quantitatively unifying a new deleteriousness measure called BayesDel, an improved assessment of the biological relevance of genes to the disease, a modified linkage analysis, a novel rare-variant association test, and a converted variant call quality score. It supports data that contain various combinations of extended pedigrees, trios, and case-controls, and allows for a reduced penetrance, an elevated phenocopy rate, liability classes, and covariates. BayesDel is more accurate than PolyPhen2, SIFT, FATHMM, LRT, Mutation Taster, Mutation Assessor, PhyloP, GERP++, SiPhy, CADD, MetaLR, and MetaSVM. The overall approach is faster and more powerful than the existing quantitative method pVAAST, as shown by the simulations of challenging situations in finding the missing heritability of a complex disease. This framework can also classify variants of unknown significance (VUS) by quantitatively integrating allele frequencies, deleteriousness, association, and co-segregation. PERCH is a versatile tool for gene prioritization in gene discovery research and variant classification in clinical genetic testing. © 2016 The Authors. Human Mutation published by Wiley Periodicals, Inc.
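
    The idea of quantitatively integrating several lines of evidence can be illustrated with the generic Bayesian rule that posterior odds equal prior odds multiplied by the Bayes factor of each evidence source. The sketch below is not PERCH's scoring scheme, which the abstract does not spell out; it assumes the evidence sources are independent, and the function name and all numbers are hypothetical.

```python
def posterior_probability(prior, log10_bayes_factors):
    """Combine independent evidence, given as log10 Bayes factors, with a prior.

    posterior odds = prior odds * product of Bayes factors, so on the log10
    scale the individual contributions simply add up.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    combined_log10_bf = sum(log10_bayes_factors)
    posterior_odds = prior_odds * 10.0 ** combined_log10_bf
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Hypothetical log10 Bayes factors for deleteriousness, association,
    # and co-segregation evidence on a single variant.
    evidence = [0.8, 0.5, 1.2]
    print(posterior_probability(prior=0.1, log10_bayes_factors=evidence))
```

    Working on the log scale keeps each source's contribution additive, so evidence against pathogenicity enters as a negative term and can cancel evidence in favor.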

  16. The cholinesterase variants found in some African tribes living in Rhodesia.

    PubMed

    Whittaker, M; Lowe, R F

    1976-01-01

    Blood samples from 1,614 Africans living in Rhodesia have been phenotyped for the cholinesterase variants at the E1 and E2 loci. 24% of the African population were non-Rhodesian by birth. 1,227 Rhodesians aligned themselves to 20 tribes, 191 Malawians to 8 tribes, 162 Mozambique Africans to 9 tribes and 34 Zambians to 8 tribes. A high frequency of 0.036 for the Ef1 gene, which varies from tribe to tribe, has been found in Rhodesian and Malawian Africans. Similar high frequencies for this gene are recorded for Zambian (0.045) and Mozambique Africans (0.034). The frequencies of the Es1 gene in these groups are 0.013 (Rhodesian), 0.009 (Malawian), and 0.016 (Mozambique African). The small Zambian sample showed evidence for neither the Es1 nor the C5+ electrophoretic variant. The absence of the Ea1 gene in the 1,613 Africans provides additional evidence of the rarity of this gene in negroid populations. The frequency of the C5+ variant in Rhodesian, Malawian and Mozambique Africans, although varying from tribe to tribe within the range of 0-8%, averages 3% in each group. These represent low frequencies for this variant when compared to other populations. No rare or 'private' electrophoretic variant has been found.

  17. Use of model organism and disease databases to support matchmaking for human disease gene discovery.

    PubMed

    Mungall, Christopher J; Washington, Nicole L; Nguyen-Xuan, Jeremy; Condit, Christopher; Smedley, Damian; Köhler, Sebastian; Groza, Tudor; Shefchek, Kent; Hochheiser, Harry; Robinson, Peter N; Lewis, Suzanna E; Haendel, Melissa A

    2015-10-01

    The Matchmaker Exchange application programming interface (API) allows searching a patient's genotypic or phenotypic profiles across clinical sites, for the purposes of cohort discovery and variant disease causal validation. This API can be used not only to search for matching patients, but also to match against public disease and model organism data. This public disease data enable matching known diseases and variant-phenotype associations using phenotype semantic similarity algorithms developed by the Monarch Initiative. The model data can provide additional evidence to aid diagnosis, suggest relevant models for disease mechanism and treatment exploration, and identify collaborators across the translational divide. The Monarch Initiative provides an implementation of this API for searching multiple integrated sources of data that contextualize the knowledge about any given patient or patient family into the greater biomedical knowledge landscape. While this corpus of data can aid diagnosis, it is also the beginning of research to improve understanding of rare human diseases. © 2015 WILEY PERIODICALS, INC.

  18. Genetic polymorphisms in Na+-taurocholate co-transporting polypeptide (NTCP) and ileal apical sodium-dependent bile acid transporter (ASBT) and ethnic comparisons of functional variants of NTCP among Asian populations.

    PubMed

    Pan, Wei; Song, Im-Sook; Shin, Ho-Jung; Kim, Min-Hye; Choi, Yeong-Lim; Lim, Su-Jeong; Kim, Woo-Young; Lee, Sang-Seop; Shin, Jae-Gook

    2011-06-01

    Genetic variants of Na(+)-taurocholate co-transporting polypeptide (NTCP; SLC10A1) and ileal apical sodium-dependent bile acid transporter (ASBT; SLC10A2), which greatly contribute to bile acid homeostasis, were extensively explored in the Korean population and functional variants of NTCP were compared among Asian populations. From direct DNA sequencing, six SNPs were identified in the SLC10A1 gene and 14 SNPs in the SLC10A2 gene. Three of seven coding variants were non-synonymous SNPs: two variants from SLC10A1 (A64T, S267F) and one from SLC10A2 (A171S). No linkage was analysed in the SLC10A1 gene because of low frequencies of genetic variants, and the SLC10A2 gene was composed of two separated linkage disequilibrium blocks, in contrast to the white population. The stably transfected NTCP-A64T variant showed significantly decreased uptake of taurocholate and rosuvastatin compared with wild-type NTCP. Decreased taurocholate uptake and increased rosuvastatin uptake were observed in the NTCP-S267F variant. The allele frequencies of these functional variants were 1.0% and 3.1%, respectively, in a Korean population. However, NTCP-A64T was not found in Chinese and Vietnamese subjects. The frequency distribution of NTCP-S267F in Koreans was significantly lower than those in Chinese and Vietnamese populations. Our data suggest that NTCP-A64T and -S267F variants cause substrate-dependent functional change in vitro, and show ethnic difference in their allelic frequencies among Asian populations, although the clinical relevance of these variants remains to be evaluated.

  19. Gene variants and binge eating as predictors of comorbidity and outcome of treatment in severe obesity.

    PubMed

    Potoczna, Natascha; Branson, Ruth; Kral, John G; Piec, Grazyna; Steffen, Rudolf; Ricklin, Thomas; Hoehe, Margret R; Lentes, Klaus-Ulrich; Horber, Fritz F

    2004-12-01

    Melanocortin-4 receptor gene (MC4R) variants are associated with obesity and binge eating disorder (BED), whereas the more prevalent proopiomelanocortin (POMC) and leptin receptor gene (LEPR) mutations are rarely associated with obesity or BED. The complete coding regions of MC4R, POMC, and leptin-binding domain of LEPR were comparatively sequenced in 300 patients (233 women and 67 men; mean +/- SEM age, 42 +/- 1 years; mean +/- SEM body mass index, 43.5 +/- 0.3 kg/m2) undergoing laparoscopic gastric banding. Eating behavior, esophagogastric pathology, metabolic syndrome prevalence, and postoperative weight loss and complications were retrospectively compared between carriers and noncarriers of gene variants with and without BED during 36 +/- 3-month follow-up. Nineteen patients (6.3%) carried 8 MC4R variants, 144 (48.0%) carried 13 POMC variants, and 247 (82.3%) carried 11 LEPR variants. All MC4R variant carriers had BED, compared with 18.1% of noncarriers (P < 0.001). BED rates were similar among POMC and LEPR variant carriers and noncarriers. Gastroscopy revealed more erosive esophagitis in bingers than in nonbingers before and after banding (P < 0.04), regardless of genotype. MC4R variant carriers lost less weight (P=0.003), showed less improvement in metabolic syndrome (P < 0.001), had dilated esophagi (P < 0.001) and more vomiting (P < 0.05), and had fivefold more gastric complications (P < 0.001) than noncarriers. Overall outcome was poorest in MC4R variant carriers, better in noncarriers with BED (P < 0.05), and best in noncarriers without BED (P < 0.001). MC4R variants influence comorbidities and treatment outcomes in severe obesity.

  20. Association Between Germline Mutation in VSIG10L and Familial Barrett Neoplasia.

    PubMed

    Fecteau, Ryan E; Kong, Jianping; Kresak, Adam; Brock, Wendy; Song, Yeunjoo; Fujioka, Hisashi; Elston, Robert; Willis, Joseph E; Lynch, John P; Markowitz, Sanford D; Guda, Kishore; Chak, Amitabh

    2016-10-01

    Esophageal adenocarcinoma and its precursor lesion Barrett esophagus have seen a dramatic increase in incidence over the past 4 decades, yet marked genetic heterogeneity of this disease has precluded advances in understanding its pathogenesis and improving treatment. The objective was to identify novel disease susceptibility variants in a familial syndrome of esophageal adenocarcinoma and Barrett esophagus, termed familial Barrett esophagus, by using high-throughput sequencing in affected individuals from a large, multigenerational family. We performed whole exome sequencing (WES) from peripheral lymphocyte DNA on 4 distant relatives from our multiplex, multigenerational familial Barrett esophagus family to identify candidate disease susceptibility variants. Gene variants were filtered, verified, and segregation analysis performed to identify a single candidate variant. Gene expression analysis was done with both quantitative real-time polymerase chain reaction and in situ RNA hybridization. A 3-dimensional organotypic cell culture model of esophageal maturation was utilized to determine the phenotypic effects of our gene variant. We used electron microscopy on esophageal mucosa from an affected family member carrying the gene variant to assess ultrastructural changes. The main outcome was identification of a novel, germline disease susceptibility variant in a previously uncharacterized gene. A multiplex, multigenerational family with 14 members affected (3 members with esophageal adenocarcinoma and 11 with Barrett esophagus) was identified, and whole-exome sequencing identified a germline mutation (S631G) at a highly conserved serine residue in the uncharacterized gene VSIG10L that segregated in affected members. Transfection of the S631G variant into a 3-dimensional organotypic culture model of normal esophageal squamous cells dramatically inhibited epithelial maturation compared with the wild-type. VSIG10L exhibited high expression in normal squamous esophagus with marked loss of expression in Barrett-associated lesions. Electron microscopy of squamous esophageal mucosa harboring the S631G variant revealed dilated intercellular spaces and reduced desmosomes. This study presents VSIG10L as a candidate familial Barrett esophagus susceptibility gene, with a putative role in maintaining normal esophageal homeostasis. Further research assessing VSIG10L function may reveal pathways important for esophageal maturation and the pathogenesis of Barrett esophagus and esophageal adenocarcinoma.
