finding genome-wide three-way: Topics by Science.gov

Sample records for finding genome-wide three-way

Genome-Wide Prediction of the Performance of Three-Way Hybrids in Barley.

PubMed

Li, Zuo; Philipp, Norman; Spiller, Monika; Stiewe, Gunther; Reif, Jochen C; Zhao, Yusheng

2017-03-01

Predicting the grain yield performance of three-way hybrids is challenging. Three-way crosses are relevant for hybrid breeding in barley (Hordeum vulgare L.) and maize (Zea mays L.) adapted to East Africa. The main goal of our study was to implement and evaluate genome-wide prediction approaches of the performance of three-way hybrids using data of single-cross hybrids for a scenario in which parental lines of the three-way hybrids originate from three genetically distinct subpopulations. We extended the ridge regression best linear unbiased prediction (RRBLUP) and devised a genomic selection model allowing for subpopulation-specific marker effects (GSA-RRBLUP: general and subpopulation-specific additive RRBLUP). Using an empirical barley data set, we showed that applying GSA-RRBLUP tripled the prediction ability of three-way hybrids from 0.095 to 0.308 compared with RRBLUP, which models one additive effect for all three subpopulations. The experimental findings were further substantiated with computer simulations. Our results emphasize the potential of GSA-RRBLUP to improve genome-wide hybrid prediction of three-way hybrids for scenarios of genetically diverse parental populations. Because of the advantages of the GSA-RRBLUP model in dealing with hybrids from different parental populations, it may also be a promising approach to boost the prediction ability for hybrid breeding programs based on genetically diverse heterotic groups. Copyright © 2017 Crop Science Society of America.
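The ridge regression step that RRBLUP builds on can be illustrated with a minimal sketch. It is not the authors' implementation and does not include the subpopulation-specific effects of GSA-RRBLUP. The simulated marker matrix, the fixed shrinkage parameter `lam`, the function names, and the use of the Pearson correlation as "prediction ability" are all assumptions made for illustration.

```python
import numpy as np

def rrblup_marker_effects(Z, y, lam):
    """Ridge estimate of marker effects: solve (Z'Z + lam*I) u = Z'y on centered data."""
    Zc = Z - Z.mean(axis=0)          # center each marker column
    yc = y - y.mean()                # center the phenotype
    p = Z.shape[1]
    return np.linalg.solve(Zc.T @ Zc + lam * np.eye(p), Zc.T @ yc)

def predict_performance(Z_new, Z_train, y_train, effects):
    """Predict phenotypes of new genotypes from the estimated marker effects."""
    return y_train.mean() + (Z_new - Z_train.mean(axis=0)) @ effects

def prediction_ability(y_obs, y_pred):
    """Pearson correlation between observed and predicted values (assumed definition)."""
    return float(np.corrcoef(y_obs, y_pred)[0, 1])

# Simulated example: 120 hypothetical hybrids, 30 markers coded 0/1/2.
rng = np.random.default_rng(0)
Z = rng.integers(0, 3, size=(120, 30)).astype(float)
true_effects = rng.normal(size=30)
y = Z @ true_effects + rng.normal(scale=1.0, size=120)

train, test = slice(0, 90), slice(90, 120)
effects = rrblup_marker_effects(Z[train], y[train], lam=1.0)
y_hat = predict_performance(Z[test], Z[train], y[train], effects)
ability = prediction_ability(y[test], y_hat)
```

In practice the shrinkage parameter is usually derived from estimated variance components rather than fixed by hand; the constant used here only keeps the sketch short.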
Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species

PubMed Central

Wang, Jing; Street, Nathaniel R.; Scofield, Douglas G.; Ingvarsson, Pär K.

2016-01-01

A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. PMID:26721855
Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species.

PubMed

Wang, Jing; Street, Nathaniel R; Scofield, Douglas G; Ingvarsson, Pär K

2016-03-01

A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species. Copyright © 2016 by the Genetics Society of America.
Complex multi-enhancer contacts captured by Genome Architecture Mapping (GAM)

PubMed Central

Beagrie, Robert A.; Scialdone, Antonio; Schueler, Markus; Kraemer, Dorothee C.A.; Chotalia, Mita; Xie, Sheila Q.; Barbieri, Mariano; de Santiago, Inês; Lavitas, Liron-Mark; Branco, Miguel R.; Fraser, James; Dostie, Josée; Game, Laurence; Dillon, Niall; Edwards, Paul A.W.; Nicodemi, Mario; Pombo, Ana

2017-01-01

The organization of the genome in the nucleus and the interactions of genes with their regulatory elements are key features of transcriptional control and their disruption can cause disease. We developed a novel genome-wide method, Genome Architecture Mapping (GAM), for measuring chromatin contacts, and other features of three-dimensional chromatin topology, based on sequencing DNA from a large collection of thin nuclear sections. We apply GAM to mouse embryonic stem cells and identify an enrichment for specific interactions between active genes and enhancers across very large genomic distances, using a mathematical model ‘SLICE’ (Statistical Inference of Co-segregation). GAM also reveals an abundance of three-way contacts genome-wide, especially between regions that are highly transcribed or contain super-enhancers, highlighting a previously inaccessible complexity in genome architecture and a major role for gene-expression specific contacts in organizing the genome in mammalian nuclei. PMID:28273065
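The idea of co-segregation across nuclear sections can be sketched as follows. This is a naive illustration, not the SLICE model: it only compares how often three genomic windows appear together in the same section with what independent detection would predict. The segregation matrix, the function name, and the independence baseline are assumptions introduced for the example.

```python
import numpy as np

def triplet_cosegregation(seg, i, j, k):
    """Observed vs. expected co-segregation frequency of three genomic windows.

    seg: boolean array, rows = nuclear sections, columns = genomic windows;
         True means the window was detected in that section.
    Returns (observed, expected), where expected assumes independent detection.
    """
    observed = float(np.mean(seg[:, i] & seg[:, j] & seg[:, k]))
    expected = float(seg[:, i].mean() * seg[:, j].mean() * seg[:, k].mean())
    return observed, expected

# Toy segregation table: 4 sections, 4 windows (hypothetical data).
seg = np.array([
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
], dtype=bool)

together = triplet_cosegregation(seg, 0, 1, 2)   # windows always seen together
apart = triplet_cosegregation(seg, 0, 1, 3)      # window 3 never joins 0 and 1
```

A triplet whose observed frequency clearly exceeds the independence baseline is a candidate three-way contact; a real analysis would also have to account for genomic distance and detection efficiency.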
Detection of Epistasis for Flowering Time Using Bayesian Multilocus Estimation in a Barley MAGIC Population

PubMed Central

Mathew, Boby; Léon, Jens; Sannemann, Wiebke; Sillanpää, Mikko J.

2018-01-01

Gene-by-gene interactions, also known as epistasis, regulate many complex traits in different species. With the availability of low-cost genotyping it is now possible to study epistasis on a genome-wide scale. However, identifying genome-wide epistasis is a high-dimensional multiple regression problem and needs the application of dimensionality reduction techniques. Flowering Time (FT) in crops is a complex trait that is known to be influenced by many interacting genes and pathways in various crops. In this study, we successfully apply Sure Independence Screening (SIS) for dimensionality reduction to identify two-way and three-way epistasis for the FT trait in a Multiparent Advanced Generation Inter-Cross (MAGIC) barley population using the Bayesian multilocus model. The MAGIC barley population was generated from intercrossing among eight parental lines and thus, offered greater genetic diversity to detect higher-order epistatic interactions. Our results suggest that SIS is an efficient dimensionality reduction approach to detect high-order interactions in a Bayesian multilocus model. We also observe that many of our findings (genomic regions with main or higher-order epistatic effects) overlap with known candidate genes that have been already reported in barley and closely related species for the FT trait. PMID:29254994
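A rough sketch of the screening idea: Sure Independence Screening ranks predictors by the absolute value of their marginal correlation with the trait and keeps only the top few, so that two-way and three-way interaction terms are built from a much smaller set. The simulated data, the function names, and the choice of keeping five predictors are illustrative assumptions; the Bayesian multilocus model itself is not shown.

```python
from itertools import combinations
from math import comb

import numpy as np

def sure_independence_screening(X, y, d):
    """Return indices of the d columns of X with the largest |marginal correlation| with y."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # assumes no constant columns
    ys = (y - y.mean()) / y.std()
    marginal = np.abs(Xs.T @ ys) / len(y)
    return np.argsort(marginal)[::-1][:d]

def interaction_terms(X, keep, order):
    """Products of all `order`-way combinations of the screened columns."""
    cols = sorted(int(i) for i in keep)
    return {idx: np.prod(X[:, list(idx)], axis=1) for idx in combinations(cols, order)}

# Simulated example: 50 candidate predictors, two of which affect the trait.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 50))
y = 3.0 * X[:, 4] - 2.0 * X[:, 17] + rng.normal(scale=0.5, size=200)

keep = sure_independence_screening(X, y, d=5)
pairs = interaction_terms(X, keep, order=2)      # comb(5, 2) candidate two-way terms
triples = interaction_terms(X, keep, order=3)    # comb(5, 3) candidate three-way terms
unscreened_triples = comb(X.shape[1], 3)         # three-way terms without screening
```

The comparison between `len(triples)` and `unscreened_triples` shows why screening matters: the number of three-way terms grows cubically with the number of predictors.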
Exploring the potential duty of care in clinical genomics under UK law

PubMed Central

Mitchell, Colin; Ploem, Corrette; Chico, Victoria; Ormondroyd, Elizabeth; Hall, Alison; Wallace, Susan; Fay, Michael; Goodwin, Deirdre; Bell, Jessica; Phillips, Simon; Taylor, Jenny C.; Hennekam, Raoul; Kaye, Jane

2017-01-01

Genome-wide sequencing technologies are beginning to be used in projects that have both clinical diagnostic and research components. The clinical application of this technology, which generates a huge amount of information of varying diagnostic certainty, involves addressing a number of challenges to establish appropriate standards. In this article, we explore the way that UK law may respond to three of these key challenges and could establish new legal duties in relation to feedback of findings that are unrelated to the presenting condition (secondary, additional or incidental findings); duties towards genetic relatives as well as the patient and duties on the part of researchers and professionals who do not have direct contact with patients. When considering these issues, the courts will take account of European and international comparisons, developing guidance and relevant ethical, social and policy factors. The UK courts will also be strongly influenced by precedent set in case law. PMID:28943725
Exploring the potential duty of care in clinical genomics under UK law.

PubMed

Mitchell, Colin; Ploem, Corrette; Chico, Victoria; Ormondroyd, Elizabeth; Hall, Alison; Wallace, Susan; Fay, Michael; Goodwin, Deirdre; Bell, Jessica; Phillips, Simon; Taylor, Jenny C; Hennekam, Raoul; Kaye, Jane

2017-09-01

Genome-wide sequencing technologies are beginning to be used in projects that have both clinical diagnostic and research components. The clinical application of this technology, which generates a huge amount of information of varying diagnostic certainty, involves addressing a number of challenges to establish appropriate standards. In this article, we explore the way that UK law may respond to three of these key challenges and could establish new legal duties in relation to feedback of findings that are unrelated to the presenting condition (secondary, additional or incidental findings); duties towards genetic relatives as well as the patient and duties on the part of researchers and professionals who do not have direct contact with patients. When considering these issues, the courts will take account of European and international comparisons, developing guidance and relevant ethical, social and policy factors. The UK courts will also be strongly influenced by precedent set in case law.
Genome-wide association studies in bladder cancer: first results and potential relevance.

PubMed

Kiemeney, Lambertus A; Grotenhuis, Anne J; Vermeulen, Sita H; Wu, Xifeng

2009-09-01

The role of genetic susceptibility in the development of urinary bladder cancer is unclear, as it is in many other types of cancer. Since 2007, however, an innovative research approach (i.e. genome-wide association studies or GWASs) has led to the identification of numerous genomic loci that harbor susceptibility factors for one or more cancer sites. All GWASs have been published in high-impact journals and the strengths of the design are acknowledged by all experts, but there is criticism about the relevance of the results. In late 2008, the first GWAS in bladder cancer was published. In this review, the principles of GWASs are explained, as well as their strengths and limitations. The study in bladder cancer among 4000 cases and 38,000 controls identified three new susceptibility loci at 8q24, 3q28, and 5p15 that increase the risk of bladder cancer by 22, 19, and 16%, respectively. The results of two other GWASs in bladder cancer are expected to appear this year. Joint analysis of the three studies will probably identify additional susceptibility loci. The results of bladder cancer GWASs may point the way to yet unknown disease mechanisms. So far, the findings are not sufficiently discriminative for risk predictions to be used in clinical care or public health.
A Panel of Ancestry Informative Markers for the Complex Five-Way Admixed South African Coloured Population

PubMed Central

Daya, Michelle; van der Merwe, Lize; Galal, Ushma; Möller, Marlo; Salie, Muneeb; Chimusa, Emile R.; Galanter, Joshua M.; van Helden, Paul D.; Henn, Brenna M.; Gignoux, Chris R.; Hoal, Eileen

2013-01-01

Admixture is a well-known confounder in genetic association studies. If genome-wide data is not available, as would be the case for candidate gene studies, ancestry informative markers (AIMs) are required in order to adjust for admixture. The predominant population group in the Western Cape, South Africa, is the admixed group known as the South African Coloured (SAC). A small set of AIMs that is optimized to distinguish between the five source populations of this population (African San, African non-San, European, South Asian, and East Asian) will enable researchers to cost-effectively reduce false-positive findings resulting from ignoring admixture in genetic association studies of the population. Using genome-wide data to find SNPs with large allele frequency differences between the source populations of the SAC, as quantified by Rosenberg et al.'s In statistic, we developed a panel of AIMs by experimenting with various selection strategies. Subsets of different sizes were evaluated by measuring the correlation between ancestry proportions estimated by each AIM subset with ancestry proportions estimated using genome-wide data. We show that a panel of 96 AIMs can be used to assess ancestry proportions and to adjust for the confounding effect of the complex five-way admixture that occurred in the South African Coloured population. PMID:24376522
Harnessing Omics Big Data in Nine Vertebrate Species by Genome-Wide Prioritization of Sequence Variants with the Highest Predicted Deleterious Effect on Protein Function.

PubMed

Rozman, Vita; Kunej, Tanja

2018-05-10

Harnessing the genomics big data requires innovation in how we extract and interpret biologically relevant variants. Currently, there is no established catalog of prioritized missense variants associated with deleterious protein function phenotypes. We report in this study, to the best of our knowledge, the first genome-wide prioritization of sequence variants with the most deleterious effect on protein function (potentially deleterious variants [pDelVars]) in nine vertebrate species: human, cattle, horse, sheep, pig, dog, rat, mouse, and zebrafish. The analysis was conducted using the Ensembl/BioMart tool. Genes comprising pDelVars in the highest number of examined species were identified using a Python script. Multiple genomic alignments of the selected genes were built to identify interspecies orthologous potentially deleterious variants, which we defined as the "ortho-pDelVars." Genome-wide prioritization revealed that in humans, 0.12% of the known variants are predicted to be deleterious. In seven out of nine examined vertebrate species, the genes encoding the multiple PDZ domain crumbs cell polarity complex component (MPDZ) and the transforming acidic coiled-coil containing protein 2 (TACC2) comprise pDelVars. Five interspecies ortho-pDelVars were identified in three genes. These findings offer new ways to harness genomics big data by facilitating the identification of functional polymorphisms in humans and animal models and thus provide a future basis for optimization of protocols for whole genome prioritization of pDelVars and screening of orthologous sequence variants. The approach presented here can inform various postgenomic applications such as personalized medicine and multiomics study of health interventions (iatromics).
Genome-wide association study of the four-constitution medicine.

PubMed

Yin, Chang Shik; Park, Hi Joon; Chung, Joo-Ho; Lee, Hye-Jung; Lee, Byung-Cheol

2009-12-01

Four-constitution medicine (FCM), also known as Sasang constitutional medicine, is a holistic, traditional system that appraises individual differences and categorizes them into four major constitutional types; it is a heritage of the long tradition of individualized acupuncture medicine. This study reports the first genome-wide association study on FCM, to explore the genetic basis of FCM and facilitate the integration of FCM with conventional individual differences research. Healthy individuals of the Korean population were classified into the four constitutional types (FCTs). A total of 353,202 single nucleotide polymorphisms (SNPs) were typed using whole genome amplified samples, and a six-way comparison of FCM types provided lists of significantly differential SNPs. In one-to-one FCT comparisons, 15,944 SNPs were significantly differential, and 5 SNPs were commonly significant in all of the three comparisons. In one-to-two FCT comparisons, 22,616 SNPs were significantly differential, and 20 SNPs were commonly significant in all of the three comparison groups. This study presents the association between genome-wide SNP profiles and the categorization of the FCM, and it could further provide a starting point for genome-based identification and research of the constitutions of FCM.
Complex multi-enhancer contacts captured by genome architecture mapping.

PubMed

Beagrie, Robert A; Scialdone, Antonio; Schueler, Markus; Kraemer, Dorothee C A; Chotalia, Mita; Xie, Sheila Q; Barbieri, Mariano; de Santiago, Inês; Lavitas, Liron-Mark; Branco, Miguel R; Fraser, James; Dostie, Josée; Game, Laurence; Dillon, Niall; Edwards, Paul A W; Nicodemi, Mario; Pombo, Ana

2017-03-23

The organization of the genome in the nucleus and the interactions of genes with their regulatory elements are key features of transcriptional control and their disruption can cause disease. Here we report a genome-wide method, genome architecture mapping (GAM), for measuring chromatin contacts and other features of three-dimensional chromatin topology on the basis of sequencing DNA from a large collection of thin nuclear sections. We apply GAM to mouse embryonic stem cells and identify enrichment for specific interactions between active genes and enhancers across very large genomic distances using a mathematical model termed SLICE (statistical inference of co-segregation). GAM also reveals an abundance of three-way contacts across the genome, especially between regions that are highly transcribed or contain super-enhancers, providing a level of insight into genome architecture that, owing to the technical limitations of current technologies, has previously remained unattainable. Furthermore, GAM highlights a role for gene-expression-specific contacts in organizing the genome in mammalian nuclei.
Three tiers of genome evolution in reptiles

PubMed Central

Organ, Chris L.; Moreno, Ricardo Godínez; Edwards, Scott V.

2008-01-01

Characterization of reptilian genomes is essential for understanding the overall diversity and evolution of amniote genomes, because reptiles, which include birds, constitute a major fraction of the amniote evolutionary tree. To better understand the evolution and diversity of genomic characteristics in Reptilia, we conducted comparative analyses of online sequence data from Alligator mississippiensis (alligator) and Sphenodon punctatus (tuatara) as well as genome size and karyological data from a wide range of reptilian species. At the whole-genome and chromosomal tiers of organization, we find that reptilian genome size distribution is consistent with a model of continuous gradual evolution while genomic compartmentalization, as manifested in the number of microchromosomes and macrochromosomes, appears to have undergone early rapid change. At the sequence level, the third genomic tier, we find that exon size in Alligator is distributed in a pattern matching that of exons in Gallus (chicken), especially in the 101–200 bp size class. A small spike in the fraction of exons in the 301 bp–1 kb size class is also observed for Alligator, but more so for Sphenodon. For introns, we find that members of Reptilia have a larger fraction of introns within the 101 bp–2 kb size class and a lower fraction of introns within the 5–30 kb size class than do mammals. These findings suggest that the mode of reptilian genome evolution varies across three hierarchical levels of the genome, a pattern consistent with a mosaic model of genomic evolution. PMID:21669810
Discovering hotspots in functional genomic data superposed on 3D chromatin configuration reconstructions

PubMed Central

Capurso, Daniel; Bengtsson, Henrik; Segal, Mark R.

2016-01-01

The spatial organization of the genome influences cellular function, notably gene regulation. Recent studies have assessed the three-dimensional (3D) co-localization of functional annotations (e.g. centromeres, long terminal repeats) using 3D genome reconstructions from Hi-C (genome-wide chromosome conformation capture) data; however, corresponding assessments for continuous functional genomic data (e.g. chromatin immunoprecipitation-sequencing (ChIP-seq) peak height) are lacking. Here, we demonstrate that applying bump hunting via the patient rule induction method (PRIM) to ChIP-seq data superposed on a Saccharomyces cerevisiae 3D genome reconstruction can discover ‘functional 3D hotspots’, regions in 3-space for which the mean ChIP-seq peak height is significantly elevated. For the transcription factor Swi6, the top hotspot by P-value contains MSB2 and ERG11 – known Swi6 target genes on different chromosomes. We verify this finding in a number of ways. First, this top hotspot is relatively stable under PRIM across parameter settings. Second, this hotspot is among the top hotspots by mean outcome identified by an alternative algorithm, k-Nearest Neighbor (k-NN) regression. Third, the distance between MSB2 and ERG11 is smaller than expected (by resampling) in two other 3D reconstructions generated via different normalization and reconstruction algorithms. This analytic approach can discover functional 3D hotspots and potentially reveal novel regulatory interactions. PMID:26869583
Genetic variants associated with subjective well-being, depressive symptoms and neuroticism identified through genome-wide analyses

PubMed Central

Derringer, Jaime; Gratten, Jacob; Lee, James J; Liu, Jimmy Z; de Vlaming, Ronald; Ahluwalia, Tarunveer S; Buchwald, Jadwiga; Cavadino, Alana; Frazier-Wood, Alexis C; Davies, Gail; Furlotte, Nicholas A; Garfield, Victoria; Geisel, Marie Henrike; Gonzalez, Juan R; Haitjema, Saskia; Karlsson, Robert; van der Laan, Sander W; Ladwig, Karl-Heinz; Lahti, Jari; van der Lee, Sven J; Miller, Michael B; Lind, Penelope A; Liu, Tian; Matteson, Lindsay; Mihailov, Evelin; Minica, Camelia C; Nolte, Ilja M; Mook-Kanamori, Dennis O; van der Most, Peter J; Oldmeadow, Christopher; Qian, Yong; Raitakari, Olli; Rawal, Rajesh; Realo, Anu; Rueedi, Rico; Schmidt, Börge; Smith, Albert V; Stergiakouli, Evie; Tanaka, Toshiko; Taylor, Kent; Thorleifsson, Gudmar; Wedenoja, Juho; Wellmann, Juergen; Westra, Harm-Jan; Willems, Sara M; Zhao, Wei; Amin, Najaf; Bakshi, Andrew; Bergmann, Sven; Bjornsdottir, Gyda; Boyle, Patricia A; Cherney, Samantha; Cox, Simon R; Davis, Oliver S P; Ding, Jun; Direk, Nese; Eibich, Peter; Emeny, Rebecca T; Fatemifar, Ghazaleh; Faul, Jessica D; Ferrucci, Luigi; Forstner, Andreas J; Gieger, Christian; Gupta, Richa; Harris, Tamara B; Harris, Juliette M; Holliday, Elizabeth G; Hottenga, Jouke-Jan; De Jager, Philip L; Kaakinen, Marika A; Kajantie, Eero; Karhunen, Ville; Kolcic, Ivana; Kumari, Meena; Launer, Lenore J; Franke, Lude; Li-Gao, Ruifang; Liewald, David C; Koini, Marisa; Loukola, Anu; Marques-Vidal, Pedro; Montgomery, Grant W; Mosing, Miriam A; Paternoster, Lavinia; Pattie, Alison; Petrovic, Katja E; Pulkki-Råback, Laura; Quaye, Lydia; Räikkönen, Katri; Rudan, Igor; Scott, Rodney J; Smith, Jennifer A; Sutin, Angelina R; Trzaskowski, Maciej; Vinkhuyzen, Anna E; Yu, Lei; Zabaneh, Delilah; Attia, John R; Bennett, David A; Berger, Klaus; Bertram, Lars; Boomsma, Dorret I; Snieder, Harold; Chang, Shun-Chiao; Cucca, Francesco; Deary, Ian J; van Duijn, Cornelia M; Eriksson, Johan G; Bültmann, Ute; de Geus, Eco J C; Groenen, Patrick J F; Gudnason, Vilmundur; Hansen, Torben; Hartman, Catharine A; Haworth, Claire M A; Hayward, Caroline; Heath, Andrew C; Hinds, David A; Hyppönen, Elina; Iacono, William G; Järvelin, Marjo-Riitta; Jöckel, Karl-Heinz; Kaprio, Jaakko; Kardia, Sharon L R; Keltikangas-Järvinen, Liisa; Kraft, Peter; Kubzansky, Laura D; Lehtimäki, Terho; Magnusson, Patrik K E; Martin, Nicholas G; McGue, Matt; Metspalu, Andres; Mills, Melinda; de Mutsert, Renée; Oldehinkel, Albertine J; Pasterkamp, Gerard; Pedersen, Nancy L; Plomin, Robert; Polasek, Ozren; Power, Christine; Rich, Stephen S; Rosendaal, Frits R; den Ruijter, Hester M; Schlessinger, David; Schmidt, Helena; Svento, Rauli; Schmidt, Reinhold; Alizadeh, Behrooz Z; Sørensen, Thorkild I A; Spector, Tim D; Starr, John M; Stefansson, Kari; Steptoe, Andrew; Terracciano, Antonio; Thorsteinsdottir, Unnur; Thurik, A Roy; Timpson, Nicholas J; Tiemeier, Henning; Uitterlinden, André G; Vollenweider, Peter; Wagner, Gert G; Weir, David R; Yang, Jian; Conley, Dalton C; Smith, George Davey; Hofman, Albert; Johannesson, Magnus; Laibson, David I; Medland, Sarah E; Meyer, Michelle N; Pickrell, Joseph K; Esko, Tõnu; Krueger, Robert F; Beauchamp, Jonathan P; Koellinger, Philipp D; Benjamin, Daniel J; Bartels, Meike; Cesarini, David

2016-01-01

We conducted genome-wide association studies of three phenotypes: subjective well-being (N = 298,420), depressive symptoms (N = 161,460), and neuroticism (N = 170,910). We identified three variants associated with subjective well-being, two with depressive symptoms, and eleven with neuroticism, including two inversion polymorphisms. The two depressive symptoms loci replicate in an independent depression sample. Joint analyses that exploit the high genetic correlations between the phenotypes (|ρ̂| ≈ 0.8) strengthen the overall credibility of the findings, and allow us to identify additional variants. Across our phenotypes, loci regulating expression in central nervous system and adrenal/pancreas tissues are strongly enriched for association. PMID:27089181
Replicability and Robustness of GWAS for Behavioral Traits

PubMed Central

Rietveld, Cornelius A.; Conley, Dalton; Eriksson, Nicholas; Esko, Tõnu; Medland, Sarah E.; Vinkhuyzen, Anna A.E.; Yang, Jian; Boardman, Jason D.; Chabris, Christopher F.; Dawes, Christopher T.; Domingue, Benjamin W.; Hinds, David A.; Johannesson, Magnus; Kiefer, Amy K.; Laibson, David; Magnusson, Patrik K. E.; Mountain, Joanna L.; Oskarsson, Sven; Rostapshova, Olga; Teumer, Alexander; Tung, Joyce Y.; Visscher, Peter M.; Benjamin, Daniel J.; Cesarini, David; Koellinger, Philipp D.

2015-01-01

A recent genome-wide association study (GWAS) of educational attainment identified three single-nucleotide polymorphisms (SNPs) that, despite their small effect sizes (each R² ≈ 0.02%), reached genome-wide significance (p < 5 × 10⁻⁸) in a large discovery sample and replicated in an independent sample (p < 0.05). The study also reported associations between educational attainment and indices of SNPs called “polygenic scores.” We evaluate the robustness of these findings. Study 1 finds that all three SNPs replicate in another large (N = 34,428) independent sample. We also find that the scores remain predictive (R² ≈ 2%) with stringent controls for stratification (Study 2) and in new within-family analyses (Study 3). Our results show that large and therefore well-powered GWASs can identify replicable genetic associations with behavioral traits. The small effect sizes of individual SNPs are likely to be a major contributing explanation for the striking contrast between our results and the disappointing replication record of most candidate gene studies. PMID:25287667
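As a small worked illustration of the two thresholds quoted above: the conventional genome-wide significance level is commonly motivated as a Bonferroni correction of a 0.05 error rate for roughly one million independent tests, whereas replication of a few pre-selected SNPs only needs the nominal level. The function names and the example p-values are hypothetical and do not come from the study.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def classify_snp(p_discovery, p_replication,
                 discovery_threshold=5e-8, replication_threshold=0.05):
    """Label a SNP by whether it passes the discovery and replication thresholds."""
    if p_discovery >= discovery_threshold:
        return "not genome-wide significant"
    if p_replication >= replication_threshold:
        return "significant, not replicated"
    return "significant and replicated"

# 0.05 spread over about one million independent tests gives about 5e-8.
genome_wide = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical p-values for three SNPs: (discovery, replication).
labels = [
    classify_snp(1e-9, 0.01),
    classify_snp(1e-9, 0.20),
    classify_snp(1e-5, 0.01),
]
```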
Large meta-analysis of genome-wide association studies identifies five loci for lean body mass.

PubMed

Zillikens, M Carola; Demissie, Serkalem; Hsu, Yi-Hsiang; Yerges-Armstrong, Laura M; Chou, Wen-Chi; Stolk, Lisette; Livshits, Gregory; Broer, Linda; Johnson, Toby; Koller, Daniel L; Kutalik, Zoltán; Luan, Jian'an; Malkin, Ida; Ried, Janina S; Smith, Albert V; Thorleifsson, Gudmar; Vandenput, Liesbeth; Hua Zhao, Jing; Zhang, Weihua; Aghdassi, Ali; Åkesson, Kristina; Amin, Najaf; Baier, Leslie J; Barroso, Inês; Bennett, David A; Bertram, Lars; Biffar, Rainer; Bochud, Murielle; Boehnke, Michael; Borecki, Ingrid B; Buchman, Aron S; Byberg, Liisa; Campbell, Harry; Campos Obanda, Natalia; Cauley, Jane A; Cawthon, Peggy M; Cederberg, Henna; Chen, Zhao; Cho, Nam H; Jin Choi, Hyung; Claussnitzer, Melina; Collins, Francis; Cummings, Steven R; De Jager, Philip L; Demuth, Ilja; Dhonukshe-Rutten, Rosalie A M; Diatchenko, Luda; Eiriksdottir, Gudny; Enneman, Anke W; Erdos, Mike; Eriksson, Johan G; Eriksson, Joel; Estrada, Karol; Evans, Daniel S; Feitosa, Mary F; Fu, Mao; Garcia, Melissa; Gieger, Christian; Girke, Thomas; Glazer, Nicole L; Grallert, Harald; Grewal, Jagvir; Han, Bok-Ghee; Hanson, Robert L; Hayward, Caroline; Hofman, Albert; Hoffman, Eric P; Homuth, Georg; Hsueh, Wen-Chi; Hubal, Monica J; Hubbard, Alan; Huffman, Kim M; Husted, Lise B; Illig, Thomas; Ingelsson, Erik; Ittermann, Till; Jansson, John-Olov; Jordan, Joanne M; Jula, Antti; Karlsson, Magnus; Khaw, Kay-Tee; Kilpeläinen, Tuomas O; Klopp, Norman; Kloth, Jacqueline S L; Koistinen, Heikki A; Kraus, William E; Kritchevsky, Stephen; Kuulasmaa, Teemu; Kuusisto, Johanna; Laakso, Markku; Lahti, Jari; Lang, Thomas; Langdahl, Bente L; Launer, Lenore J; Lee, Jong-Young; Lerch, Markus M; Lewis, Joshua R; Lind, Lars; Lindgren, Cecilia; Liu, Yongmei; Liu, Tian; Liu, Youfang; Ljunggren, Östen; Lorentzon, Mattias; Luben, Robert N; Maixner, William; McGuigan, Fiona E; Medina-Gomez, Carolina; Meitinger, Thomas; Melhus, Håkan; Mellström, Dan; Melov, Simon; Michaëlsson, Karl; Mitchell, Braxton D; Morris, Andrew P; Mosekilde, Leif; Newman, Anne; Nielson, Carrie M; O'Connell, Jeffrey R; Oostra, Ben A; Orwoll, Eric S; Palotie, Aarno; Parker, Stephen C J; Peacock, Munro; Perola, Markus; Peters, Annette; Polasek, Ozren; Prince, Richard L; Räikkönen, Katri; Ralston, Stuart H; Ripatti, Samuli; Robbins, John A; Rotter, Jerome I; Rudan, Igor; Salomaa, Veikko; Satterfield, Suzanne; Schadt, Eric E; Schipf, Sabine; Scott, Laura; Sehmi, Joban; Shen, Jian; Soo Shin, Chan; Sigurdsson, Gunnar; Smith, Shad; Soranzo, Nicole; Stančáková, Alena; Steinhagen-Thiessen, Elisabeth; Streeten, Elizabeth A; Styrkarsdottir, Unnur; Swart, Karin M A; Tan, Sian-Tsung; Tarnopolsky, Mark A; Thompson, Patricia; Thomson, Cynthia A; Thorsteinsdottir, Unnur; Tikkanen, Emmi; Tranah, Gregory J; Tuomilehto, Jaakko; van Schoor, Natasja M; Verma, Arjun; Vollenweider, Peter; Völzke, Henry; Wactawski-Wende, Jean; Walker, Mark; Weedon, Michael N; Welch, Ryan; Wichmann, H-Erich; Widen, Elisabeth; Williams, Frances M K; Wilson, James F; Wright, Nicole C; Xie, Weijia; Yu, Lei; Zhou, Yanhua; Chambers, John C; Döring, Angela; van Duijn, Cornelia M; Econs, Michael J; Gudnason, Vilmundur; Kooner, Jaspal S; Psaty, Bruce M; Spector, Timothy D; Stefansson, Kari; Rivadeneira, Fernando; Uitterlinden, André G; Wareham, Nicholas J; Ossowski, Vicky; Waterworth, Dawn; Loos, Ruth J F; Karasik, David; Harris, Tamara B; Ohlsson, Claes; Kiel, Douglas P

2017-07-19

Lean body mass, consisting mostly of skeletal muscle, is important for healthy aging. We performed a genome-wide association study for whole body (20 cohorts of European ancestry with n = 38,292) and appendicular (arms and legs) lean body mass (n = 28,330) measured using dual energy X-ray absorptiometry or bioelectrical impedance analysis, adjusted for sex, age, height, and fat mass. Twenty-one single-nucleotide polymorphisms were significantly associated with lean body mass either genome wide (p < 5 × 10⁻⁸) or suggestively genome wide (p < 2.3 × 10⁻⁶). Replication in 63,475 (47,227 of European ancestry) individuals from 33 cohorts for whole body lean body mass and in 45,090 (42,360 of European ancestry) subjects from 25 cohorts for appendicular lean body mass was successful for five single-nucleotide polymorphisms in/near HSD17B11, VCAN, ADAMTSL3, IRS1, and FTO for total lean body mass and for three single-nucleotide polymorphisms in/near VCAN, ADAMTSL3, and IRS1 for appendicular lean body mass. Our findings provide new insight into the genetics of lean body mass. Lean body mass is a highly heritable trait and is associated with various health conditions. Here, Kiel and colleagues perform a meta-analysis of genome-wide association studies for whole body lean body mass and find five novel genetic loci to be significantly associated.
Anonymization of electronic medical records for validating genome-wide association studies

PubMed Central

Loukides, Grigorios; Gkoulalas-Divanis, Aris; Malin, Bradley

2010-01-01

Genome-wide association studies (GWAS) facilitate the discovery of genotype–phenotype relations from population-based sequence databases, which is an integral facet of personalized medicine. The increasing adoption of electronic medical records allows large amounts of patients’ standardized clinical features to be combined with the genomic sequences of these patients and shared to support validation of GWAS findings and to enable novel discoveries. However, disseminating these data “as is” may lead to patient reidentification when genomic sequences are linked to resources that contain the corresponding patients’ identity information based on standardized clinical features. This work proposes an approach that provably prevents this type of data linkage and furnishes a result that helps support GWAS. Our approach automatically extracts potentially linkable clinical features and modifies them in a way that they can no longer be used to link a genomic sequence to a small number of patients, while preserving the associations between genomic sequences and specific sets of clinical features corresponding to GWAS-related diseases. Extensive experiments with real patient data derived from the Vanderbilt University Medical Center verify that our approach generates data that eliminate the threat of individual reidentification, while supporting GWAS validation and clinical case analysis tasks. PMID:20385806
A knowledge base for tracking the impact of genomics on population health.

PubMed

Yu, Wei; Gwinn, Marta; Dotson, W David; Green, Ridgely Fisk; Clyne, Mindy; Wulf, Anja; Bowen, Scott; Kolor, Katherine; Khoury, Muin J

2016-12-01

We created an online knowledge base (the Public Health Genomics Knowledge Base (PHGKB)) to provide systematically curated and updated information that bridges population-based research on genomics with clinical and public health applications. Weekly horizon scanning of a wide variety of online resources is used to retrieve relevant scientific publications, guidelines, and commentaries. After curation by domain experts, links are deposited into Web-based databases. PHGKB currently consists of nine component databases. Users can search the entire knowledge base or search one or more component databases directly and choose options for customizing the display of their search results. PHGKB offers researchers, policy makers, practitioners, and the general public a way to find information they need to understand the complicated landscape of genomics and population health. Genet Med 18 12, 1312-1314.
Meta-Analysis in Genome-Wide Association Datasets: Strategies and Application in Parkinson Disease

PubMed Central

Evangelou, Evangelos; Maraganore, Demetrius M.; Ioannidis, John P.A.

2007-01-01

Background: Genome-wide association studies hold substantial promise for identifying common genetic variants that regulate susceptibility to complex diseases. However, for the detection of small genetic effects, single studies may be underpowered. Power may be improved by combining genome-wide datasets with meta-analytic techniques. Methodology/Principal Findings: Both single and two-stage genome-wide data may be combined and there are several possible strategies. In the two-stage framework, we considered the options of (1) enhancement of replication data and (2) enhancement of first-stage data, and then, we also considered (3) joint meta-analyses including all first-stage and second-stage data. These strategies were examined empirically using data from two genome-wide association studies (three datasets) on Parkinson disease. In the three strategies, we derived 12, 5, and 49 single nucleotide polymorphisms that show significant associations at conventional levels of statistical significance. None of these remained significant after conservative adjustment for the number of performed analyses in each strategy. However, some may warrant further consideration: 6 SNPs were identified with at least 2 of the 3 strategies and 3 SNPs [rs1000291 on chromosome 3, rs2241743 on chromosome 4 and rs3018626 on chromosome 11] were identified with all 3 strategies and had no or minimal between-dataset heterogeneity (I² = 0, 0 and 15%, respectively). Analyses were primarily limited by the suboptimal overlap of tested polymorphisms across different datasets (e.g., only 31,192 shared polymorphisms between the two tier 1 datasets). Conclusions/Significance: Meta-analysis may be used to improve the power and examine the between-dataset heterogeneity of genome-wide association studies. Prospective designs may be most efficient, if they try to maximize the overlap of genotyping platforms and anticipate the combination of data across many genome-wide association studies. PMID:17332845
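To illustrate the kind of calculation behind pooled estimates and the heterogeneity statistic quoted in the abstract above, the following is a minimal sketch of inverse-variance fixed-effect pooling with Cochran's Q and I². The abstract does not state which pooling model the authors used, so the fixed-effect choice, the function name and the per-dataset effect sizes are all assumptions made for illustration.

```python
import math


def fixed_effect_meta(betas, ses):
    """Pool per-dataset effects by inverse-variance weighting.

    Returns (pooled effect, pooled standard error, Cochran's Q, I^2 in percent).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2: share of variation beyond what chance alone would produce.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, pooled_se, q, i2


# Hypothetical log odds ratios and standard errors for one SNP in three datasets.
betas = [0.20, 0.25, 0.15]
ses = [0.10, 0.12, 0.09]
pooled, pooled_se, q, i2 = fixed_effect_meta(betas, ses)
print(f"pooled={pooled:.3f} se={pooled_se:.3f} Q={q:.3f} I2={i2:.1f}%")
```

An I² of 0 means the observed spread between datasets is no larger than expected by chance, which is the sense in which the three SNPs above are described as having no or minimal heterogeneity.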

Evolutionarily diverse determinants of meiotic DNA break and recombination landscapes across the genome

PubMed Central

Fowler, Kyle R.; Sasaki, Mariko; Milman, Neta

2014-01-01

Fission yeast Rec12 (Spo11 homolog) initiates meiotic recombination by forming developmentally programmed DNA double-strand breaks (DSBs). DSB distributions influence patterns of heredity and genome evolution, but the basis of the highly nonrandom choice of Rec12 cleavage sites is poorly understood, largely because available maps are of relatively low resolution and sensitivity. Here, we determined DSBs genome-wide at near-nucleotide resolution by sequencing the oligonucleotides attached to Rec12 following DNA cleavage. The single oligonucleotide size class allowed us to deeply sample all break events. We find strong evidence across the genome for differential DSB repair accounting for crossover invariance (constant cM/kb in spite of DSB hotspots). Surprisingly, about half of all crossovers occur in regions where DSBs occur at low frequency and are widely dispersed in location from cell to cell. These previously undetected, low-level DSBs thus play an outsized and crucial role in meiosis. We further find that the influence of underlying nucleotide sequence and chromosomal architecture differs in multiple ways from that in budding yeast. DSBs are not strongly restricted to nucleosome-depleted regions, as they are in budding yeast, but are nevertheless spatially influenced by chromatin structure. Our analyses demonstrate that evolutionarily fluid factors contribute to crossover initiation and regulation. PMID:25024163
Discovering hotspots in functional genomic data superposed on 3D chromatin configuration reconstructions.

PubMed

Capurso, Daniel; Bengtsson, Henrik; Segal, Mark R

2016-03-18

The spatial organization of the genome influences cellular function, notably gene regulation. Recent studies have assessed the three-dimensional (3D) co-localization of functional annotations (e.g. centromeres, long terminal repeats) using 3D genome reconstructions from Hi-C (genome-wide chromosome conformation capture) data; however, corresponding assessments for continuous functional genomic data (e.g. chromatin immunoprecipitation-sequencing (ChIP-seq) peak height) are lacking. Here, we demonstrate that applying bump hunting via the patient rule induction method (PRIM) to ChIP-seq data superposed on a Saccharomyces cerevisiae 3D genome reconstruction can discover 'functional 3D hotspots', regions in 3-space for which the mean ChIP-seq peak height is significantly elevated. For the transcription factor Swi6, the top hotspot by P-value contains MSB2 and ERG11 - known Swi6 target genes on different chromosomes. We verify this finding in a number of ways. First, this top hotspot is relatively stable under PRIM across parameter settings. Second, this hotspot is among the top hotspots by mean outcome identified by an alternative algorithm, k-Nearest Neighbor (k-NN) regression. Third, the distance between MSB2 and ERG11 is smaller than expected (by resampling) in two other 3D reconstructions generated via different normalization and reconstruction algorithms. This analytic approach can discover functional 3D hotspots and potentially reveal novel regulatory interactions. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
A community assessment of privacy preserving techniques for human genomes

PubMed Central

2014-01-01

To answer the need for the rigorous protection of biomedical data, we organized the Critical Assessment of Data Privacy and Protection initiative as a community effort to evaluate privacy-preserving dissemination techniques for biomedical data. We focused on the challenge of sharing aggregate human genomic data (e.g., allele frequencies) in a way that preserves the privacy of the data donors, without undermining the utility of genome-wide association studies (GWAS) or impeding their dissemination. Specifically, we designed two problems for disseminating the raw data and the analysis outcome, respectively, based on publicly available data from HapMap and from the Personal Genome Project. A total of six teams participated in the challenges. The final results were presented at a workshop of the iDASH (integrating Data for Analysis, 'anonymization,' and SHaring) National Center for Biomedical Computing. We report the results of the challenge and our findings about the current genome privacy protection techniques. PMID:25521230
A community assessment of privacy preserving techniques for human genomes.

PubMed

Jiang, Xiaoqian; Zhao, Yongan; Wang, Xiaofeng; Malin, Bradley; Wang, Shuang; Ohno-Machado, Lucila; Tang, Haixu

2014-01-01

To answer the need for the rigorous protection of biomedical data, we organized the Critical Assessment of Data Privacy and Protection initiative as a community effort to evaluate privacy-preserving dissemination techniques for biomedical data. We focused on the challenge of sharing aggregate human genomic data (e.g., allele frequencies) in a way that preserves the privacy of the data donors, without undermining the utility of genome-wide association studies (GWAS) or impeding their dissemination. Specifically, we designed two problems for disseminating the raw data and the analysis outcome, respectively, based on publicly available data from HapMap and from the Personal Genome Project. A total of six teams participated in the challenges. The final results were presented at a workshop of the iDASH (integrating Data for Analysis, 'anonymization,' and SHaring) National Center for Biomedical Computing. We report the results of the challenge and our findings about the current genome privacy protection techniques.
Using a Euclid distance discriminant method to find protein coding genes in the yeast genome.

PubMed

Zhang, Chun-Ting; Wang, Ju; Zhang, Ren

2002-02-01

The Euclid distance discriminant method is used to find protein coding genes in the yeast genome, based on the single nucleotide frequencies at three codon positions in the ORFs. The method is extremely simple and may be extended to find genes in prokaryotic genomes or eukaryotic genomes with fewer introns. Six-fold cross-validation tests have demonstrated that the accuracy of the algorithm is better than 93%. Based on this, it is found that the total number of protein coding genes in the yeast genome is at most 5579, about 3.8-7.0% less than the currently widely accepted 5800-6000. The base compositions at three codon positions are analyzed in detail using a graphic method. The result shows that the preferred codons adopted by yeast genes are of the RGW type, where R, G and W indicate the bases of purine, non-G and A/T, whereas the 'codons' in the intergenic sequences are of the form NNN, where N denotes any base. This fact constitutes the basis of the algorithm to distinguish between coding and non-coding ORFs in the yeast genome. The names of putative non-coding ORFs are listed here in detail.
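The idea in the abstract above, classifying an ORF from the single nucleotide frequencies at its three codon positions, can be sketched roughly as follows. The abstract does not spell out the discriminant rule, so this sketch assumes a nearest-centroid rule with plain Euclidean distance; the function names and the toy sequences are hypothetical.

```python
import math

BASES = "ACGT"


def codon_position_frequencies(orf):
    """Return 12 numbers: frequency of A, C, G, T at codon positions 1, 2 and 3."""
    counts = [[0] * 4 for _ in range(3)]
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    for i in range(usable):
        base = orf[i].upper()
        if base in BASES:
            counts[i % 3][BASES.index(base)] += 1
    features = []
    for position_counts in counts:
        total = sum(position_counts) or 1  # avoid division by zero
        features.extend(c / total for c in position_counts)
    return features


def centroid(vectors):
    """Component-wise mean of equally long feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclid(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def looks_coding(orf, coding_centroid, noncoding_centroid):
    """Assumed rule: an ORF is called coding if it lies nearer the coding centroid."""
    features = codon_position_frequencies(orf)
    return euclid(features, coding_centroid) < euclid(features, noncoding_centroid)


# Toy training sets (hypothetical sequences, far too small for real use).
coding = [codon_position_frequencies(s) for s in ("GCAGATGCTGAA", "GATGCAGAAGCT")]
noncoding = [codon_position_frequencies(s) for s in ("TTTTCCAACGGT", "CCGTTAAGTCAT")]
print(looks_coding("GCTGAAGATGCA", centroid(coding), centroid(noncoding)))
```

In practice the two centroids would be estimated from large sets of known coding and non-coding ORFs, and accuracy would be checked by cross-validation as the abstract describes.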
Genome-wide association study meta-analysis of European and Asian-ancestry samples identifies three novel loci associated with bipolar disorder.

PubMed

Chen, D T; Jiang, X; Akula, N; Shugart, Y Y; Wendland, J R; Steele, C J M; Kassem, L; Park, J-H; Chatterjee, N; Jamain, S; Cheng, A; Leboyer, M; Muglia, P; Schulze, T G; Cichon, S; Nöthen, M M; Rietschel, M; McMahon, F J; Farmer, A; McGuffin, P; Craig, I; Lewis, C; Hosang, G; Cohen-Woods, S; Vincent, J B; Kennedy, J L; Strauss, J

2013-02-01

Meta-analyses of bipolar disorder (BD) genome-wide association studies (GWAS) have identified several genome-wide significant signals in European-ancestry samples, but so far account for little of the inherited risk. We performed a meta-analysis of ∼750,000 high-quality genetic markers on a combined sample of ∼14,000 subjects of European and Asian-ancestry (phase I). The most significant findings were further tested in an extended sample of ∼17,700 cases and controls (phase II). The results suggest novel association findings near the genes TRANK1 (LBA1), LMAN2L and PTGFR. In phase I, the most significant single nucleotide polymorphism (SNP), rs9834970 near TRANK1, was significant at the P = 2.4 × 10⁻¹¹ level, with no heterogeneity. Supportive evidence for prior association findings near ANK3 and a locus on chromosome 3p21.1 was also observed. The phase II results were similar, although the heterogeneity test became significant for several SNPs. On the basis of these results and other established risk loci, we used the method developed by Park et al. to estimate the number, and the effect size distribution, of BD risk loci that could still be found by GWAS methods. We estimate that >63,000 case-control samples would be needed to identify the ∼105 BD risk loci discoverable by GWAS, and that these will together explain <6% of the inherited risk. These results support previous GWAS findings and identify three new candidate genes for BD. Further studies are needed to replicate these findings and may potentially lead to identification of functional variants. Sample size will remain a limiting factor in the discovery of common alleles associated with BD.
Genome-wide association analysis identifies 30 new susceptibility loci for schizophrenia.

PubMed

Li, Zhiqiang; Chen, Jianhua; Yu, Hao; He, Lin; Xu, Yifeng; Zhang, Dai; Yi, Qizhong; Li, Changgui; Li, Xingwang; Shen, Jiawei; Song, Zhijian; Ji, Weidong; Wang, Meng; Zhou, Juan; Chen, Boyu; Liu, Yahui; Wang, Jiqiang; Wang, Peng; Yang, Ping; Wang, Qingzhong; Feng, Guoyin; Liu, Benxiu; Sun, Wensheng; Li, Baojie; He, Guang; Li, Weidong; Wan, Chunling; Xu, Qi; Li, Wenjin; Wen, Zujia; Liu, Ke; Huang, Fang; Ji, Jue; Ripke, Stephan; Yue, Weihua; Sullivan, Patrick F; O'Donovan, Michael C; Shi, Yongyong

2017-11-01

We conducted a genome-wide association study (GWAS) with replication in 36,180 Chinese individuals and performed further transancestry meta-analyses with data from the Psychiatry Genomics Consortium (PGC2). Approximately 95% of the genome-wide significant (GWS) index alleles (or their proxies) from the PGC2 study were overrepresented in Chinese schizophrenia cases, including ∼50% that achieved nominal significance and ∼75% that continued to be GWS in the transancestry analysis. The Chinese-only analysis identified seven GWS loci; three of these also were GWS in the transancestry analyses, which identified 109 GWS loci, thus yielding a total of 113 GWS loci (30 novel) in at least one of these analyses. We observed improvements in the fine-mapping resolution at many susceptibility loci. Our results provide several lines of evidence supporting candidate genes at many loci and highlight some pathways for further research. Together, our findings provide novel insight into the genetic architecture and biological etiology of schizophrenia.
Genomes of surface isolates of Alteromonas macleodii: the life of a widespread marine opportunistic copiotroph

PubMed Central

López-Pérez, Mario; Gonzaga, Aitor; Martin-Cuadrado, Ana-Belen; Onyshchenko, Olga; Ghavidel, Akbar; Ghai, Rohit; Rodriguez-Valera, Francisco

2012-01-01

Alteromonas macleodii is a marine gammaproteobacterium with widespread distribution in temperate or tropical waters. We describe three genomes of isolates from surface waters around Europe (Atlantic, Mediterranean and Black Sea) and compare them with a previously described deep Mediterranean isolate (AltDE) that belongs to a widely divergent clade. The surface isolates are quite similar, the most divergent being the Black Sea (BS11) isolate. The genomes contain several genomic islands with different gene content. The recruitment of very similar genomic fragments from metagenomes in different locations indicates that the surface clade is globally abundant with little effect of geography, even the AltDE and the BS11 genomes recruiting from surface samples in open ocean locations. The finding of CRISPR protospacers of AltDE in a lysogenic phage in the Atlantic (English Channel) isolate illustrates a flow of genetic material among these clades and a remarkably wide distribution of this phage. PMID:23019517
Genome-Wide Detection and Analysis of Multifunctional Genes

PubMed Central

Pritykin, Yuri; Ghersi, Dario; Singh, Mona

2015-01-01

Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms—H. sapiens, D. melanogaster, and S. cerevisiae—and show that, as compared to other annotated genes, genes involved in multiple biological processes possess distinct physicochemical properties, are more broadly expressed, tend to be more central in protein interaction networks, tend to be more evolutionarily conserved, and are more likely to be essential. We also find that multifunctional genes are significantly more likely to be involved in human disorders. These same features also hold when multifunctionality is defined with respect to molecular functions instead of biological processes. Our analysis uncovers key features about multifunctional genes, and is a step towards a better genome-wide understanding of gene multifunctionality. PMID:26436655
Multi-criteria decision making approaches for quality control of genome-wide association studies.

PubMed

Malovini, Alberto; Rognoni, Carla; Puca, Annibale; Bellazzi, Riccardo

2009-03-01

Experimental errors in the genotyping phases of a Genome-Wide Association Study (GWAS) can lead to false positive findings and to spurious associations. An appropriate quality control phase could minimize the effects of this kind of error. Several filtering criteria can be used to perform quality control. Currently, no formal methods have been proposed for taking these criteria and the experimenter's preferences into account at the same time. In this paper we propose two strategies for setting appropriate genotyping rate thresholds for GWAS quality control. These two approaches are based on Multi-Criteria Decision Making theory. We have applied our method to a real dataset composed of 734 individuals affected by Arterial Hypertension (AH) and 486 nonagenarians without a history of AH. The proposed strategies appear to deal with GWAS quality control in a sound way, as they rationalize and make explicit the experimenter's choices, thus providing more reproducible results.
GWAS of 126,559 Individuals Identifies Genetic Variants Associated with Educational Attainment

PubMed Central

Rietveld, Cornelius A.; Medland, Sarah E.; Derringer, Jaime; Yang, Jian; Esko, Tõnu; Martin, Nicolas W.; Westra, Harm-Jan; Shakhbazov, Konstantin; Abdellaoui, Abdel; Agrawal, Arpana; Albrecht, Eva; Alizadeh, Behrooz Z.; Amin, Najaf; Barnard, John; Baumeister, Sebastian E.; Benke, Kelly S.; Bielak, Lawrence F.; Boatman, Jeffrey A.; Boyle, Patricia A.; Davies, Gail; de Leeuw, Christiaan; Eklund, Niina; Evans, Daniel S.; Ferhmann, Rudolf; Fischer, Krista; Gieger, Christian; Gjessing, Håkon K.; Hägg, Sara; Harris, Jennifer R.; Hayward, Caroline; Holzapfel, Christina; Ibrahim-Verbaas, Carla A.; Ingelsson, Erik; Jacobsson, Bo; Joshi, Peter K.; Jugessur, Astanand; Kaakinen, Marika; Kanoni, Stavroula; Karjalainen, Juha; Kolcic, Ivana; Kristiansson, Kati; Kutalik, Zoltán; Lahti, Jari; Lee, Sang H.; Lin, Peng; Lind, Penelope A.; Liu, Yongmei; Lohman, Kurt; Loitfelder, Marisa; McMahon, George; Vidal, Pedro Marques; Meirelles, Osorio; Milani, Lili; Myhre, Ronny; Nuotio, Marja-Liisa; Oldmeadow, Christopher J.; Petrovic, Katja E.; Peyrot, Wouter J.; Polašek, Ozren; Quaye, Lydia; Reinmaa, Eva; Rice, John P.; Rizzi, Thais S.; Schmidt, Helena; Schmidt, Reinhold; Smith, Albert V.; Smith, Jennifer A.; Tanaka, Toshiko; Terracciano, Antonio; van der Loos, Matthijs J.H.M.; Vitart, Veronique; Völzke, Henry; Wellmann, Jürgen; Yu, Lei; Zhao, Wei; Allik, Jüri; Attia, John R.; Bandinelli, Stefania; Bastardot, François; Beauchamp, Jonathan; Bennett, David A.; Berger, Klaus; Bierut, Laura J.; Boomsma, Dorret I.; Bültmann, Ute; Campbell, Harry; Chabris, Christopher F.; Cherkas, Lynn; Chung, Mina K.; Cucca, Francesco; de Andrade, Mariza; De Jager, Philip L.; De Neve, Jan-Emmanuel; Deary, Ian J.; Dedoussis, George V.; Deloukas, Panos; Dimitriou, Maria; Eiriksdottir, Gudny; Elderson, Martin F.; Eriksson, Johan G.; Evans, David M.; Faul, Jessica D.; Ferrucci, Luigi; Garcia, Melissa E.; Grönberg, Henrik; Gudnason, Vilmundur; Hall, Per; Harris, Juliette M.; Harris, Tamara B.; Hastie, Nicholas D.; 
Heath, Andrew C.; Hernandez, Dena G.; Hoffmann, Wolfgang; Hofman, Adriaan; Holle, Rolf; Holliday, Elizabeth G.; Hottenga, Jouke-Jan; Iacono, William G.; Illig, Thomas; Järvelin, Marjo-Riitta; Kähönen, Mika; Kaprio, Jaakko; Kirkpatrick, Robert M.; Kowgier, Matthew; Latvala, Antti; Launer, Lenore J.; Lawlor, Debbie A.; Lehtimäki, Terho; Li, Jingmei; Lichtenstein, Paul; Lichtner, Peter; Liewald, David C.; Madden, Pamela A.; Magnusson, Patrik K. E.; Mäkinen, Tomi E.; Masala, Marco; McGue, Matt; Metspalu, Andres; Mielck, Andreas; Miller, Michael B.; Montgomery, Grant W.; Mukherjee, Sutapa; Nyholt, Dale R.; Oostra, Ben A.; Palmer, Lyle J.; Palotie, Aarno; Penninx, Brenda; Perola, Markus; Peyser, Patricia A.; Preisig, Martin; Räikkönen, Katri; Raitakari, Olli T.; Realo, Anu; Ring, Susan M.; Ripatti, Samuli; Rivadeneira, Fernando; Rudan, Igor; Rustichini, Aldo; Salomaa, Veikko; Sarin, Antti-Pekka; Schlessinger, David; Scott, Rodney J.; Snieder, Harold; Pourcain, Beate St; Starr, John M.; Sul, Jae Hoon; Surakka, Ida; Svento, Rauli; Teumer, Alexander; Tiemeier, Henning; Rooij, Frank JAan; Van Wagoner, David R.; Vartiainen, Erkki; Viikari, Jorma; Vollenweider, Peter; Vonk, Judith M.; Waeber, Gérard; Weir, David R.; Wichmann, H.-Erich; Widen, Elisabeth; Willemsen, Gonneke; Wilson, James F.; Wright, Alan F.; Conley, Dalton; Davey-Smith, George; Franke, Lude; Groenen, Patrick J. F.; Hofman, Albert; Johannesson, Magnus; Kardia, Sharon L.R.; Krueger, Robert F.; Laibson, David; Martin, Nicholas G.; Meyer, Michelle N.; Posthuma, Danielle; Thurik, A. Roy; Timpson, Nicholas J.; Uitterlinden, André G.; van Duijn, Cornelia M.; Visscher, Peter M.; Benjamin, Daniel J.; Cesarini, David; Koellinger, Philipp D.

2013-01-01

A genome-wide association study of educational attainment was conducted in a discovery sample of 101,069 individuals and a replication sample of 25,490. Three independent SNPs are genome-wide significant (rs9320913, rs11584700, rs4851266), and all three replicate. Estimated effect sizes are small (R² ≈ 0.02%), approximately 1 month of schooling per allele. A linear polygenic score from all measured SNPs accounts for ≈ 2% of the variance in both educational attainment and cognitive function. Genes in the region of the loci have previously been associated with health, cognitive, and central nervous system phenotypes, and bioinformatics analyses suggest the involvement of the anterior caudate nucleus. These findings provide promising candidate SNPs for follow-up work, and our effect size estimates can anchor power analyses in social-science genetics. PMID:23722424
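For readers unfamiliar with the term, a linear polygenic score is commonly computed as a weighted sum of a person's effect-allele counts. The sketch below shows only that general form; the function name, the allele counts and the per-allele weights are hypothetical and are not taken from the study.

```python
def polygenic_score(allele_counts, effect_sizes):
    """Weighted sum of effect-allele counts (0, 1 or 2 copies per SNP)."""
    if len(allele_counts) != len(effect_sizes):
        raise ValueError("one effect size is needed per SNP")
    return sum(count * beta for count, beta in zip(allele_counts, effect_sizes))


# Hypothetical individual genotyped at three SNPs, with made-up per-allele weights.
counts = [0, 1, 2]
weights = [0.5, 0.25, 0.125]
print(polygenic_score(counts, weights))
```

The share of trait variance explained by such a score would then be estimated by regressing the phenotype on the score in an independent sample.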
Identification of SNPs associated with variola virus virulence

PubMed Central

2013-01-01

Background: Decades after the eradication of smallpox, its etiological agent, variola virus (VARV), remains a threat as a potential bioweapon. Outbreaks of smallpox around the time of the global eradication effort exhibited variable case fatality rates (CFRs), likely attributable in part to complex viral genetic determinants of smallpox virulence. We aimed to identify genome-wide single nucleotide polymorphisms associated with CFR. We evaluated unadjusted and outbreak geographic location-adjusted models of single SNPs and two- and three-way interactions between SNPs. Findings: Using the data mining approach multifactor dimensionality reduction (MDR), we identified five VARV SNPs in models significantly associated with CFR. The top performing unadjusted model and adjusted models both revealed the same two-way gene-gene interaction. We discuss the biological plausibility of the influence of the SNPs identified in these and other significant models on the strain-specific virulence of VARV. Conclusions: We have identified genetic loci in the VARV genome that are statistically associated with VARV virulence as measured by CFR. While our ability to infer a causal relationship between the specific SNPs identified in our analysis and VARV virulence is limited, our results suggest that smallpox severity is in part associated with VARV strain variation and that VARV virulence may be determined by multiple genetic loci. This study represents the first application of MDR to the identification of pathogen gene-gene interactions for predicting infectious disease outbreak severity. PMID:23410064
Genetic susceptibility to type 2 diabetes and obesity: follow-up of findings from genome-wide association studies.

PubMed

Basile, Kevin J; Johnson, Matthew E; Xia, Qianghua; Grant, Struan F A

2014-01-01

Elucidating the underlying genetic variations influencing various complex diseases is one of the major challenges currently facing clinical genetic research. Although these variations are often difficult to uncover, approaches such as genome-wide association studies (GWASs) have been successful at finding statistically significant associations between specific genomic loci and disease susceptibility. GWAS has been especially successful in elucidating genetic variants that influence type 2 diabetes (T2D) and obesity/body mass index (BMI). Specifically, several GWASs have confirmed that a variant in transcription factor 7-like 2 (TCF7L2) confers risk for T2D, while a variant in fat mass and obesity-associated protein (FTO) confers risk for obesity/BMI; indeed both of these signals are considered the most statistically associated loci discovered for these respective traits to date. The discovery of these two key loci in this context has been invaluable for providing novel insight into mechanisms of heritability and disease pathogenesis. As follow-up studies of TCF7L2 and FTO have typically led the way in how to follow up a GWAS discovery, we outline what has been learned from such investigations and how they have implications for the myriad of other loci that have been subsequently reported in this disease context.
Meta-analysis identifies common variants associated with body mass index in East Asians

PubMed Central

Wen, Wanqing; Cho, Yoon Shin; Zheng, Wei; Dorajoo, Rajkumar; Kato, Norihiro; Qi, Lu; Chen, Chien-Hsiun; Delahanty, Ryan J.; Okada, Yukinori; Tabara, Yasuharu; Gu, Dongfeng; Zhu, Dingliang; Haiman, Christopher A.; Mo, Zengnan; Gao, Yu-Tang; Saw, Seang Mei; Go, Min Jin; Takeuchi, Fumihiko; Chang, Li-Ching; Kokubo, Yoshihiro; Liang, Jun; Hao, Mei; Marchand, Loic Le; Zhang, Yi; Hu, Yanling; Wong, Tien Yin; Long, Jirong; Han, Bok-Ghee; Kubo, Michiaki; Yamamoto, Ken; Su, Mei-Hsin; Miki, Tetsuro; Henderson, Brian E.; Song, Huaidong; Tan, Aihua; He, Jiang; Ng, Daniel P.-K.; Cai, Qiuyin; Tsunoda, Tatsuhiko; Tsai, Fuu-Jen; Iwai, Naoharu; Chen, Gary K.; Shi, Jiajun; Xu, Jianfeng; Sim, Xueling; Xiang, Yong-Bing; Maeda, Shiro; Ong, Rick T.H.; Li, Chun; Nakamura, Yusuke; Aung, Tin; Kamatani, Naoyuki; Liu, Jian Jun; Lu, Wei; Yokota, Mitsuhiro; Seielstad, Mark; Fann, Cathy S.J.; Wu, Jer-Yuarn; Lee, Jong-Young; Hu, Frank B.; Tanaka, Toshihiro; Tai, E. Shyong; Shu, Xiao Ou

2012-01-01

Multiple genetic loci associated with obesity or body mass index (BMI) have been identified through genome-wide association studies conducted predominantly in populations of European ancestry. We conducted a meta-analysis of associations between BMI and approximately 2.4 million SNPs in 27,715 East Asians, followed by in silico and de novo replication in 37,691 and 17,642 additional East Asians, respectively. We identified ten BMI-associated loci at the genome-wide significance level (P < 5.0 × 10⁻⁸), including seven previously identified loci (FTO, SEC16B, MC4R, GIPR/QPCTL, ADCY3/RBJ, BDNF, and MAP2K5) and three novel loci in or near the CDKAL1, PCSK1, and GP2 genes. Three additional loci nearly reached the genome-wide significance threshold, including two previously identified loci in the GNPDA2 and TFAP2B genes and a new locus near PAX6, which all had P < 5.0 × 10⁻⁷. Findings from this study may shed light on new pathways involved in obesity and demonstrate the value of conducting genetic studies in non-European populations. PMID:22344219
Genome-wide ancestry of 17th-century enslaved Africans from the Caribbean.

PubMed

Schroeder, Hannes; Ávila-Arcos, María C; Malaspinas, Anna-Sapfo; Poznik, G David; Sandoval-Velasco, Marcela; Carpenter, Meredith L; Moreno-Mayar, José Víctor; Sikora, Martin; Johnson, Philip L F; Allentoft, Morten Erik; Samaniego, José Alfredo; Haviser, Jay B; Dee, Michael W; Stafford, Thomas W; Salas, Antonio; Orlando, Ludovic; Willerslev, Eske; Bustamante, Carlos D; Gilbert, M Thomas P

2015-03-24

Between 1500 and 1850, more than 12 million enslaved Africans were transported to the New World. The vast majority were shipped from West and West-Central Africa, but their precise origins are largely unknown. We used genome-wide ancient DNA analyses to investigate the genetic origins of three enslaved Africans whose remains were recovered on the Caribbean island of Saint Martin. We trace their origins to distinct subcontinental source populations within Africa, including Bantu-speaking groups from northern Cameroon and non-Bantu speakers living in present-day Nigeria and Ghana. To our knowledge, these findings provide the first direct evidence for the ethnic origins of enslaved Africans, at a time for which historical records are scarce, and demonstrate that genomic data provide another type of record that can shed new light on long-standing historical questions.
Genome-wide ancestry of 17th-century enslaved Africans from the Caribbean

PubMed Central

Schroeder, Hannes; Ávila-Arcos, María C.; Malaspinas, Anna-Sapfo; Sandoval-Velasco, Marcela; Carpenter, Meredith L.; Moreno-Mayar, José Víctor; Sikora, Martin; Johnson, Philip L. F.; Allentoft, Morten Erik; Samaniego, José Alfredo; Haviser, Jay B.; Dee, Michael W.; Stafford, Thomas W.; Salas, Antonio; Orlando, Ludovic; Willerslev, Eske; Bustamante, Carlos D.; Gilbert, M. Thomas P.

2015-01-01

Between 1500 and 1850, more than 12 million enslaved Africans were transported to the New World. The vast majority were shipped from West and West-Central Africa, but their precise origins are largely unknown. We used genome-wide ancient DNA analyses to investigate the genetic origins of three enslaved Africans whose remains were recovered on the Caribbean island of Saint Martin. We trace their origins to distinct subcontinental source populations within Africa, including Bantu-speaking groups from northern Cameroon and non-Bantu speakers living in present-day Nigeria and Ghana. To our knowledge, these findings provide the first direct evidence for the ethnic origins of enslaved Africans, at a time for which historical records are scarce, and demonstrate that genomic data provide another type of record that can shed new light on long-standing historical questions. PMID:25755263
Genomecmp: computer software to detect genomic rearrangements using markers

NASA Astrophysics Data System (ADS)

Kulawik, Maciej; Nowak, Robert M.

2017-08-01

Detecting genomic rearrangements is difficult because of the size of the data to be processed. As genome sequences may consist of hundreds of millions of symbols, it is not only practically impossible to compare them by hand, but also a complex problem for computer software. One way to significantly accelerate the process is to use a rearrangement detection algorithm based on unique short sequences called markers. The algorithm described in this paper derives markers from a base genome and finds the markers' positions in another genome. The algorithm has been extended with support for ambiguity symbols. A web application with a graphical user interface has been created using a three-layer architecture, in which users can run tasks simultaneously. The accuracy and efficiency of the proposed solution have been studied using generated and real data.
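The marker idea in this abstract can be illustrated with a minimal Python sketch. It is not the Genomecmp implementation: it assumes markers are simply k-mers that occur exactly once in each genome, it ignores ambiguity symbols, and the function names (kmer_positions, locate_markers, order_breaks) are hypothetical.

```python
from collections import Counter

def kmer_positions(seq, k):
    """Map each k-mer that occurs exactly once in seq to its start position."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    return {kmer: i for i, kmer in enumerate(kmers) if counts[kmer] == 1}

def locate_markers(base, other, k):
    """Return (base_pos, other_pos) pairs for markers unique in both genomes,
    sorted by position in the base genome."""
    base_markers = kmer_positions(base, k)
    other_unique = kmer_positions(other, k)
    hits = [(pos, other_unique[m]) for m, pos in base_markers.items()
            if m in other_unique]
    return sorted(hits)

def order_breaks(hits):
    """List adjacent marker pairs whose order in the other genome is reversed,
    a simple signal that a segment may have moved."""
    return [(a, b) for a, b in zip(hits, hits[1:]) if b[1] < a[1]]

base = "AAACCCGGGTTT"
other = "GGGTTTAAACCC"   # the two halves of base swapped
breaks = order_breaks(locate_markers(base, other, 4))
```

Comparing marker positions instead of full sequences keeps the work proportional to the number of markers, which is the speed-up the abstract refers to. A real tool would also need to handle inversions, repeats and ambiguity symbols.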
Assembling a protein-protein interaction map of the SSU processome from existing datasets.

PubMed

Lim, Young H; Charette, J Michael; Baserga, Susan J

2011-03-10

The small subunit (SSU) processome is a large ribonucleoprotein complex involved in small ribosomal subunit assembly. It consists of the U3 snoRNA and ∼72 proteins. While most of its components have been identified, the protein-protein interactions (PPIs) among them remain largely unknown, and thus the assembly, architecture and function of the SSU processome remain unclear. We queried PPI databases for SSU processome proteins to quantify the degree to which the three genome-wide high-throughput yeast two-hybrid (HT-Y2H) studies, the genome-wide protein fragment complementation assay (PCA) and the literature-curated (LC) datasets cover the SSU processome interactome. We find that coverage of the SSU processome PPI network is remarkably sparse. Two of the three HT-Y2H studies each account for four and six PPIs between only six of the 72 proteins, while the third study accounts for as little as one PPI and two proteins. The PCA dataset has the highest coverage among the genome-wide studies with 27 PPIs between 25 proteins. The LC dataset was the most extensive, accounting for 34 proteins and 38 PPIs, many of which were validated by independent methods, thereby further increasing their reliability. When the collected data were merged, we found that at least 70% of the predicted PPIs have yet to be determined and 26 proteins (36%) have no known partners. Since the SSU processome is conserved in all Eukaryotes, we also queried HT-Y2H datasets from six additional model organisms, but only four orthologues and three previously known interologous interactions were found. This provides a starting point for further work on SSU processome assembly, and spotlights the need for a more complete genome-wide Y2H analysis.
Assembling a Protein-Protein Interaction Map of the SSU Processome from Existing Datasets

PubMed Central

Baserga, Susan J.

2011-01-01

Background: The small subunit (SSU) processome is a large ribonucleoprotein complex involved in small ribosomal subunit assembly. It consists of the U3 snoRNA and ∼72 proteins. While most of its components have been identified, the protein-protein interactions (PPIs) among them remain largely unknown, and thus the assembly, architecture and function of the SSU processome remain unclear. Methodology: We queried PPI databases for SSU processome proteins to quantify the degree to which the three genome-wide high-throughput yeast two-hybrid (HT-Y2H) studies, the genome-wide protein fragment complementation assay (PCA) and the literature-curated (LC) datasets cover the SSU processome interactome. Conclusions: We find that coverage of the SSU processome PPI network is remarkably sparse. Two of the three HT-Y2H studies each account for four and six PPIs between only six of the 72 proteins, while the third study accounts for as little as one PPI and two proteins. The PCA dataset has the highest coverage among the genome-wide studies with 27 PPIs between 25 proteins. The LC dataset was the most extensive, accounting for 34 proteins and 38 PPIs, many of which were validated by independent methods, thereby further increasing their reliability. When the collected data were merged, we found that at least 70% of the predicted PPIs have yet to be determined and 26 proteins (36%) have no known partners. Since the SSU processome is conserved in all Eukaryotes, we also queried HT-Y2H datasets from six additional model organisms, but only four orthologues and three previously known interologous interactions were found. This provides a starting point for further work on SSU processome assembly, and spotlights the need for a more complete genome-wide Y2H analysis. PMID:21423703
Genome-wide association studies in cardiac electrophysiology: recent discoveries and implications for clinical practice.

PubMed

Milan, David J; Lubitz, Steven A; Kääb, Stefan; Ellinor, Patrick T

2010-08-01

Genome-wide association studies have been increasingly used to study the genetics of complex human diseases. Within the field of cardiac electrophysiology, this technique has been applied to conditions such as atrial fibrillation, and several electrocardiographic parameters including the QT interval. While these studies have identified multiple genomic regions associated with each trait, questions remain, including the best way to explore the pathophysiology of each association and the potential for clinical utility. This review will summarize recent genome-wide association study results within cardiac electrophysiology and discuss their broader implications in basic science and clinical medicine. Copyright 2010 Heart Rhythm Society. Published by Elsevier Inc. All rights reserved.

Case–Control Genome-Wide Association Study of Persistent Attention-Deficit Hyperactivity Disorder Identifies FBXO33 as a Novel Susceptibility Gene for the Disorder

PubMed Central

Sánchez-Mora, Cristina; Ramos-Quiroga, Josep A; Bosch, Rosa; Corrales, Montse; Garcia-Martínez, Iris; Nogueira, Mariana; Pagerols, Mireia; Palomar, Gloria; Richarte, Vanesa; Vidal, Raquel; Arias-Vasquez, Alejandro; Bustamante, Mariona; Forns, Joan; Gross-Lesch, Silke; Guxens, Monica; Hinney, Anke; Hoogman, Martine; Jacob, Christian; Jacobsen, Kaya K; Kan, Cornelis C; Kiemeney, Lambertus; Kittel-Schneider, Sarah; Klein, Marieke; Onnink, Marten; Rivero, Olga; Zayats, Tetyana; Buitelaar, Jan; Faraone, Stephen V; Franke, Barbara; Haavik, Jan; Johansson, Stefan; Lesch, Klaus-Peter; Reif, Andreas; Sunyer, Jordi; Bayés, Mònica; Casas, Miguel; Cormand, Bru; Ribasés, Marta

2015-01-01

Attention-deficit hyperactivity disorder (ADHD) is a neurodevelopmental disorder with high heritability. At least 30% of patients diagnosed in childhood continue to suffer from ADHD during adulthood and genetic risk factors may play an essential role in the persistence of the disorder throughout the lifespan. To date, genome-wide association studies (GWAS) of ADHD have been completed in seven independent datasets, six of which were pediatric samples and one on persistent ADHD using a DNA-pooling strategy, but none of them reported genome-wide significant associations. In an attempt to unravel novel genes for the persistence of ADHD into adulthood, we conducted the first two-stage GWAS in adults with ADHD. The discovery sample included 607 ADHD cases and 584 controls. Top signals were subsequently tested for replication in three independent follow-up samples of 2104 ADHD patients and 1901 controls. None of the findings exceeded the genome-wide threshold for significance (PGC<5e−08), but we found evidence for the involvement of the FBXO33 (F-box only protein 33) gene in combined ADHD in the discovery sample (P=9.02e−07) and in the joint analysis of both stages (P=9.7e−03). Additional evidence for an FBXO33 role in ADHD was found through gene-wise and pathway enrichment analyses in our genomic study. Risk alleles were associated with lower FBXO33 expression in lymphoblastoid cell lines and with reduced frontal gray matter volume in a sample of 1300 adult subjects. Our findings point for the first time to the ubiquitination machinery as a new disease mechanism for adult ADHD and establish a rationale for searching for additional risk variants in ubiquitination-related genes. PMID:25284319
A human genome-wide loss-of-function screen identifies effective chikungunya antiviral drugs

PubMed Central

Karlas, Alexander; Berre, Stefano; Couderc, Thérèse; Varjak, Margus; Braun, Peter; Meyer, Michael; Gangneux, Nicolas; Karo-Astover, Liis; Weege, Friderike; Raftery, Martin; Schönrich, Günther; Klemm, Uwe; Wurzlbauer, Anne; Bracher, Franz; Merits, Andres; Meyer, Thomas F.; Lecuit, Marc

2016-01-01

Chikungunya virus (CHIKV) is a globally spreading alphavirus against which there is no commercially available vaccine or therapy. Here we use a genome-wide siRNA screen to identify 156 proviral and 41 antiviral host factors affecting CHIKV replication. We analyse the cellular pathways in which human proviral genes are involved and identify druggable targets. Twenty-one small-molecule inhibitors, some of which are FDA approved, targeting six proviral factors or pathways, have high antiviral activity in vitro, with low toxicity. Three identified inhibitors have prophylactic antiviral effects in mouse models of chikungunya infection. Two of them, the calmodulin inhibitor pimozide and the fatty acid synthesis inhibitor TOFA, have a therapeutic effect in vivo when combined. These results demonstrate the value of loss-of-function screening and pathway analysis for the rational identification of small molecules with therapeutic potential and pave the way for the development of new, host-directed, antiviral agents. PMID:27177310
A human genome-wide loss-of-function screen identifies effective chikungunya antiviral drugs.

PubMed

Karlas, Alexander; Berre, Stefano; Couderc, Thérèse; Varjak, Margus; Braun, Peter; Meyer, Michael; Gangneux, Nicolas; Karo-Astover, Liis; Weege, Friderike; Raftery, Martin; Schönrich, Günther; Klemm, Uwe; Wurzlbauer, Anne; Bracher, Franz; Merits, Andres; Meyer, Thomas F; Lecuit, Marc

2016-05-12

Chikungunya virus (CHIKV) is a globally spreading alphavirus against which there is no commercially available vaccine or therapy. Here we use a genome-wide siRNA screen to identify 156 proviral and 41 antiviral host factors affecting CHIKV replication. We analyse the cellular pathways in which human proviral genes are involved and identify druggable targets. Twenty-one small-molecule inhibitors, some of which are FDA approved, targeting six proviral factors or pathways, have high antiviral activity in vitro, with low toxicity. Three identified inhibitors have prophylactic antiviral effects in mouse models of chikungunya infection. Two of them, the calmodulin inhibitor pimozide and the fatty acid synthesis inhibitor TOFA, have a therapeutic effect in vivo when combined. These results demonstrate the value of loss-of-function screening and pathway analysis for the rational identification of small molecules with therapeutic potential and pave the way for the development of new, host-directed, antiviral agents.
Education and personalized genomics: deciphering the public's genetic health report

PubMed Central

Lamb, Neil E; Myers, Richard M; Gunter, Chris

2010-01-01

Where do members of the public turn to understand what genetic tests mean in terms of their own health? Now that genome-wide association studies and complete genome sequencing are widely available, the importance of education in personalized genomics cannot be overstated. Although some media have introduced the concept of genetic testing to better understand health and disease, the public's understanding of the scope and impact of genetic variation has not kept up with the pace of the science or technology. Unfortunately, the likely sources to which the public turn for guidance – their physician and the media – are often no better prepared. We examine several venues for information, including print and online guides for both lay and health-oriented audiences, and summarize selected resources in multiple formats. We also note the roadblocks to progress and discuss ways to remove them, as urgent action is needed to connect people with their genomes in a meaningful way. PMID:20161675
Genome-wide association analyses identify three new susceptibility loci for primary angle closure glaucoma

PubMed Central

Nongpiur, Monisha E; George, Ronnie; Chen, Li-Jia; Do, Tan; Abu-Amero, Khaled; Huang, Chor Kai; Low, Sancy; Tajudin, Liza-Sharmini A; Perera, Shamira A; Cheng, Ching-Yu; Xu, Liang; Jia, Hongyan; Ho, Ching-Lin; Sim, Kar Seng; Wu, Ren-Yi; Tham, Clement C Y; Chew, Paul T K; Su, Daniel H; Oen, Francis T; Sarangapani, Sripriya; Soumittra, Nagaswamy; Osman, Essam A; Wong, Hon-Tym; Tang, Guangxian; Fan, Sujie; Meng, Hailin; Huong, Dao T L; Wang, Hua; Feng, Bo; Baskaran, Mani; Shantha, Balekudaru; Ramprasad, Vedam L; Kumaramanickavel, Govindasamy; Iyengar, Sudha K; How, Alicia C; Lee, Kelvin Y; Sivakumaran, Theru A; Yong, Victor H K; Ting, Serena M L; Li, Yang; Wang, Ya-Xing; Tay, Wan-Ting; Sim, Xueling; Lavanya, Raghavan; Cornes, Belinda K; Zheng, Ying-Feng; Wong, Tina T; Loon, Seng-Chee; Yong, Vernon K Y; Waseem, Naushin; Yaakub, Azhany; Chia, Kee-Seng; Allingham, R Rand; Hauser, Michael A; Lam, Dennis S C; Hibberd, Martin L; Bhattacharya, Shomi S; Zhang, Mingzhi; Teo, Yik Ying; Tan, Donald T; Jonas, Jost B; Tai, E-Shyong; Saw, Seang-Mei; Hon, Do Nhu; Al-Obeidan, Saleh A; Liu, Jianjun; Chau, Tran Nguyen Bich; Simmons, Cameron P; Bei, Jin-Xin; Zeng, Yi-Xin; Foster, Paul J; Vijaya, Lingam; Wong, Tien-Yin; Pang, Chi-Pui

2014-01-01

Primary angle closure glaucoma (PACG) is a major cause of blindness worldwide. We conducted a genome-wide association study including 1,854 PACG cases and 9,608 controls across 5 sample collections in Asia. Replication experiments were conducted in 1,917 PACG cases and 8,943 controls collected from a further 6 sample collections. We report significant associations at three new loci: rs11024102 in PLEKHA7 (per-allele odds ratio (OR) = 1.22; P = 5.33 × 10^−12), rs3753841 in COL11A1 (per-allele OR = 1.20; P = 9.22 × 10^−10) and rs1015213 located between PCMTD1 and ST18 on chromosome 8q (per-allele OR = 1.50; P = 3.29 × 10^−9). Our findings, accumulated across these independent worldwide collections, suggest possible mechanisms explaining the pathogenesis of PACG. PMID:22922875
3-way Networks: Application of Hypergraphs for Modelling Increased Complexity in Comparative Genomics

DOE PAGES

Weighill, Deborah A.; Jacobson, Daniel A.

2015-03-27

Herein we present and develop the theory of 3-way networks, a type of hypergraph in which each edge models relationships between triplets of objects as opposed to pairs of objects as done by standard network models. We explore approaches of how to prune these 3-way networks, illustrate their utility in comparative genomics and demonstrate how they find relationships which would be missed by standard 2-way network models using a phylogenomic dataset of 211 bacterial genomes.
3-way Networks: Application of Hypergraphs for Modelling Increased Complexity in Comparative Genomics

PubMed Central

Weighill, Deborah A; Jacobson, Daniel A

2015-01-01

We present and develop the theory of 3-way networks, a type of hypergraph in which each edge models relationships between triplets of objects as opposed to pairs of objects as done by standard network models. We explore approaches of how to prune these 3-way networks, illustrate their utility in comparative genomics and demonstrate how they find relationships which would be missed by standard 2-way network models using a phylogenomic dataset of 211 bacterial genomes. PMID:25815802
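To make the notion of a 3-way edge concrete, the following Python sketch builds one edge per triplet of objects and prunes weak edges by a threshold. The abstract does not state how edges are weighted; the three-way Jaccard-style weight used here (shared features divided by all features), the threshold pruning, and the name three_way_edges are assumptions made only for illustration.

```python
from itertools import combinations

def three_way_edges(features, threshold):
    """Build a pruned 3-way network.

    features maps an object name (e.g. a genome) to a set of features
    (e.g. gene families). Each triplet of objects gets one edge whose weight is
    |A & B & C| / |A | B | C| (an assumed similarity, not taken from the paper).
    Edges with weight below threshold are pruned.
    """
    edges = {}
    for a, b, c in combinations(sorted(features), 3):
        shared = features[a] & features[b] & features[c]
        union = features[a] | features[b] | features[c]
        weight = len(shared) / len(union) if union else 0.0
        if weight >= threshold:
            edges[(a, b, c)] = weight
    return edges

genomes = {
    "g1": {1, 2, 3},
    "g2": {2, 3, 4},
    "g3": {2, 3, 5},
    "g4": {7, 8},
}
network = three_way_edges(genomes, threshold=0.3)
```

Because each edge is keyed by a triplet rather than a pair, the result is a hypergraph: the relationship is a property of the three objects taken together and is not assembled from pairwise edges.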
Genome-Wide Meta-Analysis of Longitudinal Alcohol Consumption Across Youth and Early Adulthood.

PubMed

Adkins, Daniel E; Clark, Shaunna L; Copeland, William E; Kennedy, Martin; Conway, Kevin; Angold, Adrian; Maes, Hermine; Liu, Youfang; Kumar, Gaurav; Erkanli, Alaattin; Patkar, Ashwin A; Silberg, Judy; Brown, Tyson H; Fergusson, David M; Horwood, L John; Eaves, Lindon; van den Oord, Edwin J C G; Sullivan, Patrick F; Costello, E J

2015-08-01

The public health burden of alcohol is unevenly distributed across the life course, with levels of use, abuse, and dependence increasing across adolescence and peaking in early adulthood. Here, we leverage this temporal patterning to search for common genetic variants predicting developmental trajectories of alcohol consumption. Comparable psychiatric evaluations measuring alcohol consumption were collected in three longitudinal community samples (N=2,126, obs=12,166). Consumption-repeated measurements spanning adolescence and early adulthood were analyzed using linear mixed models, estimating individual consumption trajectories, which were then tested for association with Illumina 660W-Quad genotype data (866,099 SNPs after imputation and QC). Association results were combined across samples using standard meta-analysis methods. Four meta-analysis associations satisfied our pre-determined genome-wide significance criterion (FDR<0.1) and six others met our 'suggestive' criterion (FDR<0.2). Genome-wide significant associations were highly biologically plausible, including associations within GABA transporter 1, SLC6A1 (solute carrier family 6, member 1), and exonic hits in LOC100129340 (mitofusin-1-like). Pathway analyses elaborated single marker results, indicating significantly enriched associations to intuitive biological mechanisms, including neurotransmission, xenobiotic pharmacodynamics, and nuclear hormone receptors (NHR). These findings underscore the value of combining longitudinal behavioral data and genome-wide genotype information in order to study developmental patterns and improve statistical power in genomic studies.
GWAMA: software for genome-wide association meta-analysis.

PubMed

Mägi, Reedik; Morris, Andrew P

2010-05-28

Despite the recent success of genome-wide association studies in identifying novel loci contributing effects to complex human traits, such as type 2 diabetes and obesity, much of the genetic component of variation in these phenotypes remains unexplained. One way to improve power to detect further novel loci is through meta-analysis of studies from the same population, increasing the sample size over any individual study. Although statistical software analysis packages incorporate routines for meta-analysis, they are ill equipped to meet the challenges of the scale and complexity of data generated in genome-wide association studies. We have developed flexible, open-source software for the meta-analysis of genome-wide association studies. The software incorporates a variety of error trapping facilities, and provides a range of meta-analysis summary statistics. The software is distributed with scripts that allow simple formatting of files containing the results of each association study and generate graphical summaries of genome-wide meta-analysis results. The GWAMA (Genome-Wide Association Meta-Analysis) software has been developed to perform meta-analysis of summary statistics generated from genome-wide association studies of dichotomous phenotypes or quantitative traits. Software with source files, documentation and example data files are freely available online at http://www.well.ox.ac.uk/GWAMA.
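As an illustration of how summary statistics from several studies can be combined for one variant, the Python sketch below applies the standard fixed-effects inverse-variance method. This is a generic textbook calculation and not GWAMA's own code or interface; the function name fixed_effects_meta and the example numbers are hypothetical.

```python
import math

def fixed_effects_meta(betas, ses):
    """Combine per-study effect estimates for one variant.

    betas: per-study effect sizes (e.g. log odds ratios)
    ses:   matching standard errors
    Returns the pooled effect, its standard error, and a two-sided p-value.
    """
    weights = [1.0 / se ** 2 for se in ses]            # inverse-variance weights
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))             # two-sided normal p-value
    return beta, se, p

# Two hypothetical studies reporting the same variant.
beta, se, p = fixed_effects_meta([0.1, 0.3], [0.1, 0.1])
```

The pooled standard error is smaller than that of either input study, which is how meta-analysis gains the power mentioned in the abstract.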
Exploiting the Proteome to Improve the Genome-Wide Genetic Analysis of Epistasis in Common Human Diseases

PubMed Central

Pattin, Kristine A.; Moore, Jason H.

2009-01-01

One of the central goals of human genetics is the identification of loci with alleles or genotypes that confer increased susceptibility. The availability of dense maps of single-nucleotide polymorphisms (SNPs) along with high-throughput genotyping technologies has set the stage for routine genome-wide association studies that are expected to significantly improve our ability to identify susceptibility loci. Before this promise can be realized, there are some significant challenges that need to be addressed. We address here the challenge of detecting epistasis or gene-gene interactions in genome-wide association studies. Discovering epistatic interactions in high dimensional datasets remains a challenge due to the computational complexity resulting from the analysis of all possible combinations of SNPs. One potential way to overcome the computational burden of a genome-wide epistasis analysis would be to devise a logical way to prioritize the many SNPs in a dataset so that the data may be analyzed more efficiently and yet still retain important biological information. One of the strongest demonstrations of the functional relationship between genes is protein-protein interaction. Thus, it is plausible that the expert knowledge extracted from protein interaction databases may allow for a more efficient analysis of genome-wide studies as well as facilitate the biological interpretation of the data. In this review we will discuss the challenges of detecting epistasis in genome-wide genetic studies and the means by which we propose to apply expert knowledge extracted from protein interaction databases to facilitate this process. We explore some of the fundamentals of protein interactions and the databases that are publicly available. PMID:18551320
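The computational burden described above, and the idea of using protein interaction knowledge to narrow the search, can be sketched in Python as follows. This is only an illustration under simple assumptions: each SNP maps to a single gene, the interaction list is given as gene pairs, and the names exhaustive_pairs and ppi_candidate_pairs are hypothetical rather than taken from the review.

```python
from itertools import product
from math import comb

def exhaustive_pairs(n_snps):
    """Number of pairwise SNP combinations in an exhaustive epistasis scan."""
    return comb(n_snps, 2)

def ppi_candidate_pairs(snp_to_gene, ppi_pairs):
    """Keep only SNP pairs whose genes are reported to interact.

    snp_to_gene: dict mapping a SNP id to one gene symbol (an assumed mapping)
    ppi_pairs:   iterable of (gene, gene) interactions from a PPI database
    """
    by_gene = {}
    for snp, gene in snp_to_gene.items():
        by_gene.setdefault(gene, []).append(snp)
    candidates = set()
    for g1, g2 in ppi_pairs:
        for s1, s2 in product(by_gene.get(g1, []), by_gene.get(g2, [])):
            if s1 != s2:
                candidates.add(tuple(sorted((s1, s2))))
    return candidates

snps = {"rs1": "A", "rs2": "A", "rs3": "B", "rs4": "C"}
pairs = ppi_candidate_pairs(snps, [("A", "B")])
```

For 500,000 SNPs, exhaustive_pairs gives 124,999,750,000 pairwise tests, which shows why a prioritization step of this kind is attractive before any interaction model is fitted.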
Genetic Influences on Political Ideologies: Twin Analyses of 19 Measures of Political Ideologies from Five Democracies and Genome-Wide Findings from Three Populations

PubMed Central

Hatemi, Peter K.; Medland, Sarah E.; Klemmensen, Robert; Oskarrson, Sven; Littvay, Levente; Dawes, Chris; Verhulst, Brad; McDermott, Rose; Nørgaard, Asbjørn Sonne; Klofstad, Casey; Christensen, Kaare; Johannesson, Magnus; Magnusson, Patrik K.E.; Eaves, Lindon J.; Martin, Nicholas G.

2014-01-01

Almost forty years ago, evidence from large studies of adult twins and their relatives suggested that between 30-60% of the variance in social and political attitudes could be explained by genetic influences. However, these findings have not been widely accepted or incorporated into the dominant paradigms that explain the etiology of political ideology. This has been attributed in part to measurement and sample limitations, as well as the relative absence of molecular genetic studies. Here we present results from original analyses of a combined sample of over 12,000 twin pairs, ascertained from nine different studies conducted in five democracies, sampled over the course of four decades. We provide evidence that genetic factors play a role in the formation of political ideology, regardless of how ideology is measured, the era, or the population sampled. The only exception is a question that explicitly uses the phrase “Left-Right”. We then present results from one of the first genome-wide association studies on political ideology using data from three samples: a 1990 Australian sample involving 6,894 individuals from 3,516 families; a 2008 Australian sample of 1,160 related individuals from 635 families and a 2010 Swedish sample involving 3,334 individuals from 2,607 families. No polymorphisms reached genome-wide significance in the meta-analysis. The combined evidence suggests that political ideology constitutes a fundamental aspect of one’s genetically informed psychological disposition, but as Fisher proposed long ago, genetic influences on complex traits will be composed of thousands of markers of very small effects and it will require extremely large samples to have enough power in order to identify specific polymorphisms related to complex social traits. PMID:24569950
Genetic influences on political ideologies: twin analyses of 19 measures of political ideologies from five democracies and genome-wide findings from three populations.

PubMed

Hatemi, Peter K; Medland, Sarah E; Klemmensen, Robert; Oskarsson, Sven; Littvay, Levente; Dawes, Christopher T; Verhulst, Brad; McDermott, Rose; Nørgaard, Asbjørn Sonne; Klofstad, Casey A; Christensen, Kaare; Johannesson, Magnus; Magnusson, Patrik K E; Eaves, Lindon J; Martin, Nicholas G

2014-05-01

Almost 40 years ago, evidence from large studies of adult twins and their relatives suggested that between 30 and 60% of the variance in social and political attitudes could be explained by genetic influences. However, these findings have not been widely accepted or incorporated into the dominant paradigms that explain the etiology of political ideology. This has been attributed in part to measurement and sample limitations, as well as the relative absence of molecular genetic studies. Here we present results from original analyses of a combined sample of over 12,000 twin pairs, ascertained from nine different studies conducted in five democracies, sampled over the course of four decades. We provide evidence that genetic factors play a role in the formation of political ideology, regardless of how ideology is measured, the era, or the population sampled. The only exception is a question that explicitly uses the phrase "Left-Right". We then present results from one of the first genome-wide association studies on political ideology using data from three samples: a 1990 Australian sample involving 6,894 individuals from 3,516 families; a 2008 Australian sample of 1,160 related individuals from 635 families and a 2010 Swedish sample involving 3,334 individuals from 2,607 families. No polymorphisms reached genome-wide significance in the meta-analysis. The combined evidence suggests that political ideology constitutes a fundamental aspect of one's genetically informed psychological disposition, but as Fisher proposed long ago, genetic influences on complex traits will be composed of thousands of markers of very small effects and it will require extremely large samples to have enough power in order to identify specific polymorphisms related to complex social traits.
Construction of Ultradense Linkage Maps with Lep-MAP2: Stickleback F2 Recombinant Crosses as an Example

PubMed Central

Rastas, Pasi; Calboli, Federico C. F.; Guo, Baocheng; Shikano, Takahito; Merilä, Juha

2016-01-01

High-density linkage maps are important tools for genome biology and evolutionary genetics by quantifying the extent of recombination, linkage disequilibrium, and chromosomal rearrangements across chromosomes, sexes, and populations. They provide one of the best ways to validate and refine de novo genome assemblies, with the power to identify errors in assemblies increasing with marker density. However, assembly of high-density linkage maps is still challenging due to software limitations. We describe Lep-MAP2, a software for ultradense genome-wide linkage map construction. Lep-MAP2 can handle various family structures and can account for achiasmatic meiosis to gain linkage map accuracy. Simulations show that Lep-MAP2 outperforms other available mapping software both in computational efficiency and accuracy. When applied to two large F2-generation recombinant crosses between two nine-spined stickleback (Pungitius pungitius) populations, it produced two high-density (∼6 markers/cM) linkage maps containing 18,691 and 20,054 single nucleotide polymorphisms. The two maps showed a high degree of synteny, but female maps were 1.5–2 times longer than male maps in all linkage groups, suggesting genome-wide recombination suppression in males. Comparison with the genome sequence of the three-spined stickleback (Gasterosteus aculeatus) revealed a high degree of interspecific synteny with a low frequency (<5%) of interchromosomal rearrangements. However, a fairly large (ca. 10 Mb) translocation from autosome to sex chromosome was detected in both maps. These results illustrate the utility and novel features of Lep-MAP2 in assembling high-density linkage maps, and their usefulness in revealing evolutionarily interesting properties of genomes, such as strong genome-wide sex bias in recombination rates. PMID:26668116
Exploring the post-genomic world: differing explanatory and manipulatory functions of post-genomic sciences

PubMed Central

Holmes, Christina; Carlson, Siobhan M.; McDonald, Fiona; Jones, Mavis; Graham, Janice

2016-01-01

Richard Lewontin proposed that the ability of a scientific field to create a narrative for public understanding garners it social relevance. This article applies Lewontin's conceptual framework of the functions of science (manipulatory and explanatory) to compare and explain the current differences in perceived societal relevance of genetics/genomics and proteomics. We provide three examples to illustrate the social relevance and strong cultural narrative of genetics/genomics for which no counterpart exists for proteomics. We argue that the major difference between genetics/genomics and proteomics is that genomics has a strong explanatory function, due to the strong cultural narrative of heredity. Based on qualitative interviews and observations of proteomics conferences, we suggest that the nature of proteins, lack of public understanding, and theoretical complexity exacerbate this difference for proteomics. Lewontin's framework suggests that social scientists may find that omics sciences affect social relations in different ways than past analyses of genetics. PMID:27134568
Exploring the post-genomic world: differing explanatory and manipulatory functions of post-genomic sciences.

PubMed

Holmes, Christina; Carlson, Siobhan M; McDonald, Fiona; Jones, Mavis; Graham, Janice

2016-01-02

Richard Lewontin proposed that the ability of a scientific field to create a narrative for public understanding garners it social relevance. This article applies Lewontin's conceptual framework of the functions of science (manipulatory and explanatory) to compare and explain the current differences in perceived societal relevance of genetics/genomics and proteomics. We provide three examples to illustrate the social relevance and strong cultural narrative of genetics/genomics for which no counterpart exists for proteomics. We argue that the major difference between genetics/genomics and proteomics is that genomics has a strong explanatory function, due to the strong cultural narrative of heredity. Based on qualitative interviews and observations of proteomics conferences, we suggest that the nature of proteins, lack of public understanding, and theoretical complexity exacerbate this difference for proteomics. Lewontin's framework suggests that social scientists may find that omics sciences affect social relations in different ways than past analyses of genetics.
Transcriptome sequencing reveals genome-wide variation in molecular evolutionary rate among ferns.

PubMed

Grusz, Amanda L; Rothfels, Carl J; Schuettpelz, Eric

2016-08-30

Transcriptomics in non-model plant systems has recently reached a point where the examination of nuclear genome-wide patterns in understudied groups is an achievable reality. This progress is especially notable in evolutionary studies of ferns, for which molecular resources to date have been derived primarily from the plastid genome. Here, we utilize transcriptome data in the first genome-wide comparative study of molecular evolutionary rate in ferns. We focus on the ecologically diverse family Pteridaceae, which comprises about 10% of fern diversity and includes the enigmatic vittarioid ferns, an epiphytic, tropical lineage known for dramatically reduced morphologies and radically elongated phylogenetic branch lengths. Using expressed sequence data for 2091 loci, we perform pairwise comparisons of molecular evolutionary rate among 12 species spanning the three largest clades in the family and ask whether previously documented heterogeneity in plastid substitution rates is reflected in their nuclear genomes. We then inquire whether variation in evolutionary rate is being shaped by genes belonging to specific functional categories and test for differential patterns of selection. We find significant, genome-wide differences in evolutionary rate for vittarioid ferns relative to all other lineages within the Pteridaceae, but we recover few significant correlations between faster/slower vittarioid loci and known functional gene categories. We demonstrate that the faster rates characteristic of the vittarioid ferns are likely not driven by positive selection, nor are they unique to any particular type of nucleotide substitution. Our results reinforce recently reviewed mechanisms hypothesized to shape molecular evolutionary rates in vittarioid ferns and provide novel insight into substitution rate variation both within and among fern nuclear genomes.
Five endometrial cancer risk loci identified through genome-wide association analysis.

PubMed

Cheng, Timothy Ht; Thompson, Deborah J; O'Mara, Tracy A; Painter, Jodie N; Glubb, Dylan M; Flach, Susanne; Lewis, Annabelle; French, Juliet D; Freeman-Mills, Luke; Church, David; Gorman, Maggie; Martin, Lynn; Hodgson, Shirley; Webb, Penelope M; Attia, John; Holliday, Elizabeth G; McEvoy, Mark; Scott, Rodney J; Henders, Anjali K; Martin, Nicholas G; Montgomery, Grant W; Nyholt, Dale R; Ahmed, Shahana; Healey, Catherine S; Shah, Mitul; Dennis, Joe; Fasching, Peter A; Beckmann, Matthias W; Hein, Alexander; Ekici, Arif B; Hall, Per; Czene, Kamila; Darabi, Hatef; Li, Jingmei; Dörk, Thilo; Dürst, Matthias; Hillemanns, Peter; Runnebaum, Ingo; Amant, Frederic; Schrauwen, Stefanie; Zhao, Hui; Lambrechts, Diether; Depreeuw, Jeroen; Dowdy, Sean C; Goode, Ellen L; Fridley, Brooke L; Winham, Stacey J; Njølstad, Tormund S; Salvesen, Helga B; Trovik, Jone; Werner, Henrica Mj; Ashton, Katie; Otton, Geoffrey; Proietto, Tony; Liu, Tao; Mints, Miriam; Tham, Emma; Consortium, Chibcha; Jun Li, Mulin; Yip, Shun H; Wang, Junwen; Bolla, Manjeet K; Michailidou, Kyriaki; Wang, Qin; Tyrer, Jonathan P; Dunlop, Malcolm; Houlston, Richard; Palles, Claire; Hopper, John L; Peto, Julian; Swerdlow, Anthony J; Burwinkel, Barbara; Brenner, Hermann; Meindl, Alfons; Brauch, Hiltrud; Lindblom, Annika; Chang-Claude, Jenny; Couch, Fergus J; Giles, Graham G; Kristensen, Vessela N; Cox, Angela; Cunningham, Julie M; Pharoah, Paul D P; Dunning, Alison M; Edwards, Stacey L; Easton, Douglas F; Tomlinson, Ian; Spurdle, Amanda B

2016-06-01

We conducted a meta-analysis of three endometrial cancer genome-wide association studies (GWAS) and two follow-up phases totaling 7,737 endometrial cancer cases and 37,144 controls of European ancestry. Genome-wide imputation and meta-analysis identified five new risk loci of genome-wide significance at likely regulatory regions on chromosomes 13q22.1 (rs11841589, near KLF5), 6q22.31 (rs13328298, in LOC643623 and near HEY2 and NCOA7), 8q24.21 (rs4733613, telomeric to MYC), 15q15.1 (rs937213, in EIF2AK4, near BMF) and 14q32.33 (rs2498796, in AKT1, near SIVA1). We also found a second independent 8q24.21 signal (rs17232730). Functional studies of the 13q22.1 locus showed that rs9600103 (pairwise r² = 0.98 with rs11841589) is located in a region of active chromatin that interacts with the KLF5 promoter region. The rs9600103[T] allele that is protective in endometrial cancer suppressed gene expression in vitro, suggesting that regulation of the expression of KLF5, a gene linked to uterine development, is implicated in tumorigenesis. These findings provide enhanced insight into the genetic and biological basis of endometrial cancer.
Genome-wide association study of response to cognitive-behavioural therapy in children with anxiety disorders.

PubMed

Coleman, Jonathan R I; Lester, Kathryn J; Keers, Robert; Roberts, Susanna; Curtis, Charles; Arendt, Kristian; Bögels, Susan; Cooper, Peter; Creswell, Cathy; Dalgleish, Tim; Hartman, Catharina A; Heiervang, Einar R; Hötzel, Katrin; Hudson, Jennifer L; In-Albon, Tina; Lavallee, Kristen; Lyneham, Heidi J; Marin, Carla E; Meiser-Stedman, Richard; Morris, Talia; Nauta, Maaike H; Rapee, Ronald M; Schneider, Silvia; Schneider, Sophie C; Silverman, Wendy K; Thastum, Mikael; Thirlwall, Kerstin; Waite, Polly; Wergeland, Gro Janne; Breen, Gerome; Eley, Thalia C

2016-09-01

Anxiety disorders are common, and cognitive-behavioural therapy (CBT) is a first-line treatment. Candidate gene studies have suggested a genetic basis to treatment response, but findings have been inconsistent. We aimed to perform the first genome-wide association study (GWAS) of psychological treatment response in children with anxiety disorders (n = 980). Presence and severity of anxiety was assessed using semi-structured interview at baseline, on completion of treatment (post-treatment), and 3 to 12 months after treatment completion (follow-up). DNA was genotyped using the Illumina Human Core Exome-12v1.0 array. Linear mixed models were used to test associations between genetic variants and response (change in symptom severity) immediately post-treatment and at 6-month follow-up. No variants passed a genome-wide significance threshold (P = 5 × 10⁻⁸) in either analysis. Four variants met criteria for suggestive significance (P < 5 × 10⁻⁶) in association with response post-treatment, and three variants in the 6-month follow-up analysis. This is the first genome-wide therapygenetic study. It suggests no common variants of very high effect underlie response to CBT. Future investigations should maximise power to detect single-variant and polygenic effects by using larger, more homogeneous cohorts. © The Royal College of Psychiatrists 2016.
Genome-wide association study of response to cognitive–behavioural therapy in children with anxiety disorders

PubMed Central

Coleman, Jonathan R. I.; Lester, Kathryn J.; Keers, Robert; Roberts, Susanna; Curtis, Charles; Arendt, Kristian; Bögels, Susan; Cooper, Peter; Creswell, Cathy; Dalgleish, Tim; Hartman, Catharina A.; Heiervang, Einar R.; Hötzel, Katrin; Hudson, Jennifer L.; In-Albon, Tina; Lavallee, Kristen; Lyneham, Heidi J.; Marin, Carla E.; Meiser-Stedman, Richard; Morris, Talia; Nauta, Maaike H.; Rapee, Ronald M.; Schneider, Silvia; Schneider, Sophie C.; Silverman, Wendy K.; Thastum, Mikael; Thirlwall, Kerstin; Waite, Polly; Wergeland, Gro Janne; Breen, Gerome; Eley, Thalia C.

2016-01-01

Background: Anxiety disorders are common, and cognitive–behavioural therapy (CBT) is a first-line treatment. Candidate gene studies have suggested a genetic basis to treatment response, but findings have been inconsistent. Aims: To perform the first genome-wide association study (GWAS) of psychological treatment response in children with anxiety disorders (n = 980). Method: Presence and severity of anxiety was assessed using semi-structured interview at baseline, on completion of treatment (post-treatment), and 3 to 12 months after treatment completion (follow-up). DNA was genotyped using the Illumina Human Core Exome-12v1.0 array. Linear mixed models were used to test associations between genetic variants and response (change in symptom severity) immediately post-treatment and at 6-month follow-up. Results: No variants passed a genome-wide significance threshold (P = 5 × 10⁻⁸) in either analysis. Four variants met criteria for suggestive significance (P < 5 × 10⁻⁶) in association with response post-treatment, and three variants in the 6-month follow-up analysis. Conclusions: This is the first genome-wide therapygenetic study. It suggests no common variants of very high effect underlie response to CBT. Future investigations should maximise power to detect single-variant and polygenic effects by using larger, more homogeneous cohorts. PMID:26989097
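The two thresholds quoted in the abstract above can be illustrated with a short sketch. The genome-wide value of 5 × 10⁻⁸ is commonly motivated as a Bonferroni correction of 0.05 for roughly one million independent tests. The function name, labels, and example p-values below are hypothetical and are not taken from the study.

```python
# Thresholds as quoted in the abstract; everything else here is illustrative.
GENOME_WIDE = 5e-8   # genome-wide significance
SUGGESTIVE = 5e-6    # suggestive significance

def classify_p_value(p):
    """Return a label for a single-variant association p-value."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p < GENOME_WIDE:
        return "genome-wide significant"
    if p < SUGGESTIVE:
        return "suggestive"
    return "not significant"

# Hypothetical p-values, not results from the study.
example = {"variant_a": 3.2e-9, "variant_b": 1.1e-6, "variant_c": 0.04}
labels = {name: classify_p_value(p) for name, p in example.items()}
```

Under these assumptions, a study reporting no genome-wide hits but a handful of suggestive variants would see every p-value fall into the second or third category.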
Multi-instance multi-label distance metric learning for genome-wide protein function prediction.

PubMed

Xu, Yonghui; Min, Huaqing; Song, Hengjie; Wu, Qingyao

2016-08-01

Multi-instance multi-label (MIML) learning has been proven to be effective for the genome-wide protein function prediction problems where each training example is associated with not only multiple instances but also multiple class labels. To find an appropriate MIML learning method for genome-wide protein function prediction, many studies in the literature attempted to optimize objective functions in which dissimilarity between instances is measured using the Euclidean distance. But in many real applications, Euclidean distance may be unable to capture the intrinsic similarity/dissimilarity in feature space and label space. Unlike other previous approaches, in this paper, we propose to learn a multi-instance multi-label distance metric learning framework (MIMLDML) for genome-wide protein function prediction. Specifically, we learn a Mahalanobis distance to preserve and utilize the intrinsic geometric information of both feature space and label space for MIML learning. In addition, we try to deal with the sparsely labeled data by giving weight to the labeled data. Extensive experiments on seven real-world organisms covering the biological three-domain system (i.e., archaea, bacteria, and eukaryote; Woese et al., 1990) show that the MIMLDML algorithm is superior to most state-of-the-art MIML learning algorithms. Copyright © 2016 Elsevier Ltd. All rights reserved.
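To make the contrast between Euclidean and Mahalanobis distance in the abstract above concrete, here is a minimal sketch of the distance itself. It only evaluates sqrt((x - y)^T M (x - y)) for a given matrix M; how MIMLDML learns M is not described in the abstract and is not attempted here. The function name and example matrices are assumptions for illustration.

```python
import math

def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M.

    x, y: equal-length sequences of floats; M: square list of lists.
    """
    diff = [a - b for a, b in zip(x, y)]
    # Matrix-vector product M * diff.
    m_diff = [sum(row[j] * diff[j] for j in range(len(diff))) for row in M]
    quad = sum(d * md for d, md in zip(diff, m_diff))
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(quad, 0.0))

identity = [[1.0, 0.0], [0.0, 1.0]]   # reduces to Euclidean distance
stretched = [[4.0, 0.0], [0.0, 1.0]]  # hypothetical learned metric

d_euclid = mahalanobis([0.0, 0.0], [3.0, 4.0], identity)
d_weighted = mahalanobis([0.0, 0.0], [1.0, 0.0], stretched)
```

With the identity matrix the function returns the ordinary Euclidean distance; a non-identity M rescales or rotates the feature space, which is the extra freedom a learned metric exploits.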

Applications of the 1000 Genomes Project resources

PubMed Central

Zheng-Bradley, Xiangqun

2017-01-01

Abstract: The 1000 Genomes Project created a valuable, worldwide reference for human genetic variation. Common uses of the 1000 Genomes dataset include genotype imputation supporting Genome-wide Association Studies, mapping expression Quantitative Trait Loci, filtering non-pathogenic variants from exome, whole genome and cancer genome sequencing projects, and genetic analysis of population structure and molecular evolution. In this article, we will highlight some of the multiple ways that the 1000 Genomes data can be and has been utilized for genetic studies. PMID:27436001
Genome-Wide Identification and Expression Analysis of WRKY Transcription Factors under Multiple Stresses in Brassica napus

PubMed Central

He, Yajun; Mao, Shaoshuai; Gao, Yulong; Zhu, Liying; Wu, Daoming; Cui, Yixin; Li, Jiana; Qian, Wei

2016-01-01

WRKY transcription factors play important roles in responses to environmental stress stimuli. Using a genome-wide domain analysis, we identified 287 WRKY genes with 343 WRKY domains in the sequenced genome of Brassica napus, 139 in the A sub-genome and 148 in the C sub-genome. These genes were classified into eight groups based on phylogenetic analysis. In the 343 WRKY domains, a total of 26 members showed divergence in the WRKY domain, and 21 belonged to group I. This finding suggested that WRKY genes in group I are more active and variable compared with genes in other groups. Using genome-wide identification and analysis of the WRKY gene family in Brassica napus, we observed genome duplication, chromosomal/segmental duplications and tandem duplication. All of these duplications contributed to the expansion of the WRKY gene family. The duplicate segments that were detected indicated that genome duplication events occurred in the two diploid progenitors B. rapa and B. oleracea before they combined to form B. napus. Analysis of the public microarray database and EST database for B. napus indicated that 74 WRKY genes were induced or preferentially expressed under stress conditions. According to the public QTL data, we identified 77 WRKY genes in 31 QTL regions related to tolerance of various stresses. We further evaluated the expression of 26 BnaWRKY genes under multiple stresses by qRT-PCR. Most of the genes were induced by low temperature, salinity and drought stress, indicating that the WRKYs play important roles in B. napus stress responses. Further, three BnaWRKY genes were strongly responsive to all three stresses simultaneously, which suggests that these three WRKY genes may have multi-functional roles in stress tolerance and can potentially be used in breeding new rapeseed cultivars. We also found six tandem repeat pairs exhibiting similar expression profiles under the various stress conditions, and three pairs were mapped in the stress related QTL regions, indicating a role for tandem duplicate WRKYs in the adaptive responses to environmental stimuli during the evolution process. Our results provide a framework for future studies regarding the function of WRKY genes in response to stress in B. napus. PMID:27322342
Genome-Wide Identification and Expression Analysis of WRKY Transcription Factors under Multiple Stresses in Brassica napus.

PubMed

He, Yajun; Mao, Shaoshuai; Gao, Yulong; Zhu, Liying; Wu, Daoming; Cui, Yixin; Li, Jiana; Qian, Wei

2016-01-01

WRKY transcription factors play important roles in responses to environmental stress stimuli. Using a genome-wide domain analysis, we identified 287 WRKY genes with 343 WRKY domains in the sequenced genome of Brassica napus, 139 in the A sub-genome and 148 in the C sub-genome. These genes were classified into eight groups based on phylogenetic analysis. In the 343 WRKY domains, a total of 26 members showed divergence in the WRKY domain, and 21 belonged to group I. This finding suggested that WRKY genes in group I are more active and variable compared with genes in other groups. Using genome-wide identification and analysis of the WRKY gene family in Brassica napus, we observed genome duplication, chromosomal/segmental duplications and tandem duplication. All of these duplications contributed to the expansion of the WRKY gene family. The duplicate segments that were detected indicated that genome duplication events occurred in the two diploid progenitors B. rapa and B. oleracea before they combined to form B. napus. Analysis of the public microarray database and EST database for B. napus indicated that 74 WRKY genes were induced or preferentially expressed under stress conditions. According to the public QTL data, we identified 77 WRKY genes in 31 QTL regions related to tolerance of various stresses. We further evaluated the expression of 26 BnaWRKY genes under multiple stresses by qRT-PCR. Most of the genes were induced by low temperature, salinity and drought stress, indicating that the WRKYs play important roles in B. napus stress responses. Further, three BnaWRKY genes were strongly responsive to all three stresses simultaneously, which suggests that these three WRKY genes may have multi-functional roles in stress tolerance and can potentially be used in breeding new rapeseed cultivars. We also found six tandem repeat pairs exhibiting similar expression profiles under the various stress conditions, and three pairs were mapped in the stress related QTL regions, indicating a role for tandem duplicate WRKYs in the adaptive responses to environmental stimuli during the evolution process. Our results provide a framework for future studies regarding the function of WRKY genes in response to stress in B. napus.
A Comparative Encyclopedia of DNA Elements in the Mouse Genome

PubMed Central

Yue, Feng; Cheng, Yong; Breschi, Alessandra; Vierstra, Jeff; Wu, Weisheng; Ryba, Tyrone; Sandstrom, Richard; Ma, Zhihai; Davis, Carrie; Pope, Benjamin D.; Shen, Yin; Pervouchine, Dmitri D.; Djebali, Sarah; Thurman, Bob; Kaul, Rajinder; Rynes, Eric; Kirilusha, Anthony; Marinov, Georgi K.; Williams, Brian A.; Trout, Diane; Amrhein, Henry; Fisher-Aylor, Katherine; Antoshechkin, Igor; DeSalvo, Gilberto; See, Lei-Hoon; Fastuca, Meagan; Drenkow, Jorg; Zaleski, Chris; Dobin, Alex; Prieto, Pablo; Lagarde, Julien; Bussotti, Giovanni; Tanzer, Andrea; Denas, Olgert; Li, Kanwei; Bender, M. A.; Zhang, Miaohua; Byron, Rachel; Groudine, Mark T.; McCleary, David; Pham, Long; Ye, Zhen; Kuan, Samantha; Edsall, Lee; Wu, Yi-Chieh; Rasmussen, Matthew D.; Bansal, Mukul S.; Keller, Cheryl A.; Morrissey, Christapher S.; Mishra, Tejaswini; Jain, Deepti; Dogan, Nergiz; Harris, Robert S.; Cayting, Philip; Kawli, Trupti; Boyle, Alan P.; Euskirchen, Ghia; Kundaje, Anshul; Lin, Shin; Lin, Yiing; Jansen, Camden; Malladi, Venkat S.; Cline, Melissa S.; Erickson, Drew T.; Kirkup, Vanessa M; Learned, Katrina; Sloan, Cricket A.; Rosenbloom, Kate R.; de Sousa, Beatriz Lacerda; Beal, Kathryn; Pignatelli, Miguel; Flicek, Paul; Lian, Jin; Kahveci, Tamer; Lee, Dongwon; Kent, W. James; Santos, Miguel Ramalho; Herrero, Javier; Notredame, Cedric; Johnson, Audra; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Canfield, Theresa; Sabo, Peter J.; Wilken, Matthew S.; Reh, Thomas A.; Giste, Erika; Shafer, Anthony; Kutyavin, Tanya; Haugen, Eric; Dunn, Douglas; Reynolds, Alex P.; Neph, Shane; Humbert, Richard; Hansen, R. Scott; De Bruijn, Marella; Selleri, Licia; Rudensky, Alexander; Josefowicz, Steven; Samstein, Robert; Eichler, Evan E.; Orkin, Stuart H.; Levasseur, Dana; Papayannopoulou, Thalia; Chang, Kai-Hsin; Skoultchi, Arthur; Gosh, Srikanta; Disteche, Christine; Treuting, Piper; Wang, Yanli; Weiss, Mitchell J.; Blobel, Gerd A.; Good, Peter J.; Lowdon, Rebecca F.; Adams, Leslie B.; Zhou, Xiao-Qiao; Pazin, Michael J.; Feingold, Elise A.; Wold, Barbara; Taylor, James; Kellis, Manolis; Mortazavi, Ali; Weissman, Sherman M.; Stamatoyannopoulos, John; Snyder, Michael P.; Guigo, Roderic; Gingeras, Thomas R.; Gilbert, David M.; Hardison, Ross C.; Beer, Michael A.; Ren, Bing

2014-01-01

Summary: As the premier model organism in biomedical research, the laboratory mouse shares the majority of protein-coding genes with humans, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications, and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of other sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases. PMID:25409824
A comparative encyclopedia of DNA elements in the mouse genome.

PubMed

Yue, Feng; Cheng, Yong; Breschi, Alessandra; Vierstra, Jeff; Wu, Weisheng; Ryba, Tyrone; Sandstrom, Richard; Ma, Zhihai; Davis, Carrie; Pope, Benjamin D; Shen, Yin; Pervouchine, Dmitri D; Djebali, Sarah; Thurman, Robert E; Kaul, Rajinder; Rynes, Eric; Kirilusha, Anthony; Marinov, Georgi K; Williams, Brian A; Trout, Diane; Amrhein, Henry; Fisher-Aylor, Katherine; Antoshechkin, Igor; DeSalvo, Gilberto; See, Lei-Hoon; Fastuca, Meagan; Drenkow, Jorg; Zaleski, Chris; Dobin, Alex; Prieto, Pablo; Lagarde, Julien; Bussotti, Giovanni; Tanzer, Andrea; Denas, Olgert; Li, Kanwei; Bender, M A; Zhang, Miaohua; Byron, Rachel; Groudine, Mark T; McCleary, David; Pham, Long; Ye, Zhen; Kuan, Samantha; Edsall, Lee; Wu, Yi-Chieh; Rasmussen, Matthew D; Bansal, Mukul S; Kellis, Manolis; Keller, Cheryl A; Morrissey, Christapher S; Mishra, Tejaswini; Jain, Deepti; Dogan, Nergiz; Harris, Robert S; Cayting, Philip; Kawli, Trupti; Boyle, Alan P; Euskirchen, Ghia; Kundaje, Anshul; Lin, Shin; Lin, Yiing; Jansen, Camden; Malladi, Venkat S; Cline, Melissa S; Erickson, Drew T; Kirkup, Vanessa M; Learned, Katrina; Sloan, Cricket A; Rosenbloom, Kate R; Lacerda de Sousa, Beatriz; Beal, Kathryn; Pignatelli, Miguel; Flicek, Paul; Lian, Jin; Kahveci, Tamer; Lee, Dongwon; Kent, W James; Ramalho Santos, Miguel; Herrero, Javier; Notredame, Cedric; Johnson, Audra; Vong, Shinny; Lee, Kristen; Bates, Daniel; Neri, Fidencio; Diegel, Morgan; Canfield, Theresa; Sabo, Peter J; Wilken, Matthew S; Reh, Thomas A; Giste, Erika; Shafer, Anthony; Kutyavin, Tanya; Haugen, Eric; Dunn, Douglas; Reynolds, Alex P; Neph, Shane; Humbert, Richard; Hansen, R Scott; De Bruijn, Marella; Selleri, Licia; Rudensky, Alexander; Josefowicz, Steven; Samstein, Robert; Eichler, Evan E; Orkin, Stuart H; Levasseur, Dana; Papayannopoulou, Thalia; Chang, Kai-Hsin; Skoultchi, Arthur; Gosh, Srikanta; Disteche, Christine; Treuting, Piper; Wang, Yanli; Weiss, Mitchell J; Blobel, Gerd A; Cao, Xiaoyi; Zhong, Sheng; Wang, Ting; Good, Peter J; Lowdon, Rebecca F; Adams, Leslie B; Zhou, Xiao-Qiao; Pazin, Michael J; Feingold, Elise A; Wold, Barbara; Taylor, James; Mortazavi, Ali; Weissman, Sherman M; Stamatoyannopoulos, John A; Snyder, Michael P; Guigo, Roderic; Gingeras, Thomas R; Gilbert, David M; Hardison, Ross C; Beer, Michael A; Ren, Bing

2014-11-20

The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.
The Arabidopsis thaliana mobilome and its impact at the species level.

PubMed

Quadrana, Leandro; Bortolini Silveira, Amanda; Mayhew, George F; LeBlanc, Chantal; Martienssen, Robert A; Jeddeloh, Jeffrey A; Colot, Vincent

2016-06-03

Transposable elements (TEs) are powerful motors of genome evolution yet a comprehensive assessment of recent transposition activity at the species level is lacking for most organisms. Here, using genome sequencing data for 211 Arabidopsis thaliana accessions taken from across the globe, we identify thousands of recent transposition events involving half of the 326 TE families annotated in this plant species. We further show that the composition and activity of the 'mobilome' vary extensively between accessions in relation to climate and genetic factors. Moreover, TEs insert equally throughout the genome and are rapidly purged by natural selection from gene-rich regions because they frequently affect genes, in multiple ways. Remarkably, loci controlling adaptive responses to the environment are the most frequent transposition targets observed. These findings demonstrate the pervasive, species-wide impact that a rich mobilome can have and the importance of transposition as a recurrent generator of large-effect alleles.
Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes

USDA-ARS's Scientific Manuscript database

In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approxima...
RNA 3D Modules in Genome-Wide Predictions of RNA 2D Structure

PubMed Central

Theis, Corinna; Zirbel, Craig L.; zu Siederdissen, Christian Höner; Anthon, Christian; Hofacker, Ivo L.; Nielsen, Henrik; Gorodkin, Jan

2015-01-01

Recent experimental and computational progress has revealed a large potential for RNA structure in the genome. This has been driven by computational strategies that exploit multiple genomes of related organisms to identify common sequences and secondary structures. However, these computational approaches have two main challenges: they are computationally expensive and they have a relatively high false discovery rate (FDR). Simultaneously, RNA 3D structure analysis has revealed modules composed of non-canonical base pairs which occur in non-homologous positions, apparently by independent evolution. These modules can, for example, occur inside structural elements which in RNA 2D predictions appear as internal loops. Hence one question is if the use of such RNA 3D information can improve the prediction accuracy of RNA secondary structure at a genome-wide level. Here, we use RNAz in combination with 3D module prediction tools and apply them on a 13-way vertebrate sequence-based alignment. We find that RNA 3D modules predicted by metaRNAmodules and JAR3D are significantly enriched in the screened windows compared to their shuffled counterparts. The initially estimated FDR of 47.0% is lowered to below 25% when certain 3D module predictions are present in the window of the 2D prediction. We discuss the implications and prospects for further development of computational strategies for detection of RNA 2D structure in genomic sequence. PMID:26509713
An information-gain approach to detecting three-way epistatic interactions in genetic association studies

PubMed Central

Hu, Ting; Chen, Yuanzhu; Kiralis, Jeff W; Collins, Ryan L; Wejse, Christian; Sirugo, Giorgio; Williams, Scott M; Moore, Jason H

2013-01-01

Background: Epistasis has been historically used to describe the phenomenon that the effect of a given gene on a phenotype can be dependent on one or more other genes, and is an essential element for understanding the association between genetic and phenotypic variations. Quantifying epistasis of orders higher than two is very challenging due to both the computational complexity of enumerating all possible combinations in genome-wide data and the lack of efficient and effective methodologies. Objectives: In this study, we propose a fast, non-parametric, and model-free measure for three-way epistasis. Methods: Such a measure is based on information gain, and is able to separate all lower order effects from pure three-way epistasis. Results: Our method was verified on synthetic data and applied to real data from a candidate-gene study of tuberculosis in a West African population. In the tuberculosis data, we found a statistically significant pure three-way epistatic interaction effect that was stronger than any lower-order associations. Conclusion: Our study provides a methodological basis for detecting and characterizing high-order gene-gene interactions in genetic association studies. PMID:23396514
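The idea of separating lower-order effects from pure three-way epistasis can be sketched with mutual information. The sketch below takes the joint information that three attributes carry about the phenotype and subtracts all pairwise synergies and main effects. Whether this matches the authors' exact definition is an assumption, as are the function names and the toy XOR data.

```python
import math
from collections import Counter
from itertools import product

def entropy(rows):
    """Shannon entropy (bits) of the empirical distribution of hashable rows."""
    n = len(rows)
    return -sum((c / n) * math.log2(c / n) for c in Counter(rows).values())

def mutual_info(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def joint(*cols):
    """Combine attribute columns into one column of tuples."""
    return list(zip(*cols))

def ig2(a, b, d):
    """Pairwise synergy: joint information minus the two main effects."""
    return (mutual_info(joint(a, b), d)
            - mutual_info(joint(a), d) - mutual_info(joint(b), d))

def ig3(a, b, c, d):
    """Three-way information gain with all lower-order terms removed."""
    return (mutual_info(joint(a, b, c), d)
            - ig2(a, b, d) - ig2(a, c, d) - ig2(b, c, d)
            - mutual_info(joint(a), d) - mutual_info(joint(b), d)
            - mutual_info(joint(c), d))

# Toy data: phenotype is the XOR of three binary attributes (hypothetical).
combos = list(product([0, 1], repeat=3))
A = [r[0] for r in combos]
B = [r[1] for r in combos]
C = [r[2] for r in combos]
D = [x ^ y ^ z for x, y, z in combos]
pure_three_way = ig3(A, B, C, D)
```

In the XOR toy data no single attribute or pair is informative about the phenotype, so all of the joint information is attributed to the three-way term.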
Genome-wide association screens for Achilles tendon and ACL tears and tendinopathy

PubMed Central

Roos, Thomas R.; Roos, Andrew K.; Kleimeyer, John P.; Ahmed, Marwa A.; Goodlin, Gabrielle T.; Fredericson, Michael; Ioannidis, John P. A.; Avins, Andrew L.; Dragoo, Jason L.

2017-01-01

Achilles tendinopathy or rupture and anterior cruciate ligament (ACL) rupture are substantial injuries affecting athletes, associated with delayed recovery or inability to return to competition. To identify genetic markers that might be used to predict risk for these injuries, we performed genome-wide association screens for these injuries using data from the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort consisting of 102,979 individuals. We did not find any single nucleotide polymorphisms (SNPs) associated with either of these injuries with a p-value that was genome-wide significant (p < 5 × 10⁻⁸). We found, however, four and three polymorphisms with p-values that were borderline significant (p < 10⁻⁶) for Achilles tendon injury and ACL rupture, respectively. We then tested SNPs previously reported to be associated with either Achilles tendon injury or ACL rupture. None showed an association in our cohort with a false discovery rate of less than 5%. We obtained, however, moderate to weak evidence for replication in one case; specifically, rs4919510 in MIR608 had a p-value of 5.1 × 10⁻³ for association with Achilles tendon injury, corresponding to a 7% chance of false replication. Finally, we tested 2855 SNPs in 90 candidate genes for musculoskeletal injury, but did not find any that showed a significant association below a false discovery rate of 5%. We provide data containing summary statistics for the entire genome, which will be useful for future genetic studies on these injuries. PMID:28358823
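The abstract above reports results at a false discovery rate of 5% but does not name the procedure used. As an illustration only, the sketch below applies the Benjamini-Hochberg step-up procedure, a common way to control FDR across many SNP tests; the choice of procedure, the function name, and the p-values are all assumptions.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return sorted indices of hypotheses rejected at the given FDR level.

    Step-up rule: find the largest rank k (1-based, ascending p-values)
    with p_(k) <= k / m * fdr, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# Hypothetical candidate-SNP p-values, not data from the study.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
rejected = benjamini_hochberg(p_vals, fdr=0.05)
```

Here only the two smallest p-values survive, even though five are below 0.05 individually, which shows how a candidate SNP can be nominally significant yet fail an FDR criterion.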
Three chromosomal rearrangements promote genomic divergence between migratory and stationary ecotypes of Atlantic cod.

PubMed

Berg, Paul R; Star, Bastiaan; Pampoulie, Christophe; Sodeland, Marte; Barth, Julia M I; Knutsen, Halvor; Jakobsen, Kjetill S; Jentoft, Sissel

2016-03-17

Identification of genome-wide patterns of divergence provides insight on how genomes are influenced by selection and can reveal the potential for local adaptation in spatially structured populations. In Atlantic cod - historically a major marine resource - Northeast-Arctic- and Norwegian coastal cod are recognized by fundamental differences in migratory and non-migratory behavior, respectively. However, the genomic architecture underlying such behavioral ecotypes is unclear. Here, we have analyzed more than 8,000 polymorphic SNPs distributed throughout all 23 linkage groups and show that loci putatively under selection are localized within three distinct genomic regions, each several megabases long, covering approximately 4% of the Atlantic cod genome. These regions likely represent genomic inversions. The frequencies of these distinct regions differ markedly between the ecotypes, spawning in the vicinity of each other, which contrasts with the low level of divergence in the rest of the genome. The observed patterns strongly suggest that these chromosomal rearrangements are instrumental in local adaptation and separation of Atlantic cod populations, leaving footprints of large genomic regions under selection. Our findings demonstrate the power of using genomic information in further understanding the population dynamics and defining management units in one of the world's most economically important marine resources.
Replication of Previous Genome-wide Association Studies of Bone Mineral Density in Premenopausal American Women

PubMed Central

Ichikawa, Shoji; Koller, Daniel L; Padgett, Leah R; Lai, Dongbing; Hui, Siu L; Peacock, Munro; Foroud, Tatiana; Econs, Michael J

2010-01-01

Bone mineral density (BMD) achieved during young adulthood (peak BMD) is one of the major determinants of osteoporotic fracture in later life. Genetic variants associated with BMD have been identified by three recent genome-wide association studies. The most significant single-nucleotide polymorphisms (SNPs) from these studies were genotyped to test whether they were associated with peak BMD in premenopausal American women. Femoral neck and lumbar spine BMD were determined by dual-energy X-ray absorptiometry in two groups of premenopausal women: 1524 white women and 512 black women. In premenopausal white women, two SNPs in the C6orf97/ESR1 region were significantly associated with BMD (p < 4.8 × 10⁻⁴), with suggestive evidence for CTNNBL1 and LRP5 (p < 0.01). Evidence of association with one of the two SNPs in the C6orf97/ESR1 region also was observed in premenopausal black women. Furthermore, SNPs in SP7 and a chromosome 4 intergenic region showed suggestive association with BMD in black women. Detailed analyses of additional SNPs in the C6orf97/ESR1 region revealed multiple genomic blocks independently associated with femoral neck and lumbar spine BMD. Findings in the three published genome-wide association studies were replicated in independent samples of premenopausal American women, suggesting that genetic variants in these genes or regions contribute to peak BMD in healthy women in various populations. © 2010 American Society for Bone and Mineral Research. PMID:20200978
Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach.

PubMed

Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M

2017-03-27

Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focussed elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.
Copy number variations and genetic admixtures in three Xinjiang ethnic minority groups

PubMed Central

Lou, Haiyi; Li, Shilin; Jin, Wenfei; Fu, Ruiqing; Lu, Dongsheng; Pan, Xinwei; Zhou, Huaigu; Ping, Yuan; Jin, Li; Xu, Shuhua

2015-01-01

Xinjiang is geographically located in central Asia, and it has played an important historical role in connecting eastern Eurasian (EEA) and western Eurasian (WEA) people. However, human population genomic studies in this region have been largely underrepresented, especially with respect to studies of copy number variations (CNVs). Here we constructed the first CNV map of the three major ethnic minority groups, the Uyghur, Kazakh and Kirgiz, using Affymetrix Genome-Wide Human SNP Array 6.0. We systematically compared the properties of CNVs we identified in the three groups with the data from representatives of EEA and WEA. The analyses indicated a typical genetic admixture pattern in all three groups with ancestries from both EEA and WEA. We also identified several CNV regions showing significant deviation of allele frequency from the expected genome-wide distribution, which might be associated with population-specific phenotypes. Our study provides the first genome-wide perspective on the CNVs of three major Xinjiang ethnic minority groups and has implications for both evolutionary and medical studies. PMID:25026903
Copy Number Variation across European Populations

PubMed Central

Chen, Wanting; Hayward, Caroline; Wright, Alan F.; Hicks, Andrew A.; Vitart, Veronique; Knott, Sara; Wild, Sarah H.; Pramstaller, Peter P.; Wilson, James F.; Rudan, Igor; Porteous, David J.

2011-01-01

Genome analysis provides a powerful approach to test for evidence of genetic variation within and between geographical regions and local populations. Copy number variants which comprise insertions, deletions and duplications of genomic sequence provide one such convenient and informative source. Here, we investigate copy number variants from genome wide scans of single nucleotide polymorphisms in three European population isolates, the island of Vis in Croatia, the islands of Orkney in Scotland and the South Tyrol in Italy. We show that whereas the overall copy number variant frequencies are similar between populations, their distribution is highly specific to the population of origin, a finding which is supported by evidence for increased kinship correlation for specific copy number variants within populations. PMID:21829696
Multi-Criteria Decision Making Approaches for Quality Control of Genome-Wide Association Studies

PubMed Central

Malovini, Alberto; Rognoni, Carla; Puca, Annibale; Bellazzi, Riccardo

2009-01-01

Experimental errors in the genotyping phases of a Genome-Wide Association Study (GWAS) can lead to false positive findings and to spurious associations. An appropriate quality control phase could minimize the effects of this kind of error. Several filtering criteria can be used to perform quality control. Currently, no formal methods have been proposed for taking these criteria and the experimenter’s preferences into account at the same time. In this paper we propose two strategies for setting appropriate genotyping rate thresholds for GWAS quality control. These two approaches are based on Multi-Criteria Decision Making theory. We have applied our method to a real dataset composed of 734 individuals affected by Arterial Hypertension (AH) and 486 nonagenarians without a history of AH. The proposed strategies appear to deal with GWAS quality control in a sound way, as they rationalize and make explicit the experimenter’s choices, thus providing more reproducible results. PMID:21347174
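As a rough illustration of the kind of genotyping-rate filter that such a quality control phase applies, the following Python sketch computes per-SNP call rates and keeps the SNPs that meet a threshold. The function names, the toy genotype data and the 0.75 threshold are assumptions made for illustration only; the abstract does not state thresholds, and this is not the authors' multi-criteria method.

```python
from typing import Dict, List, Optional

Genotype = Optional[int]  # 0, 1 or 2 copies of an allele; None marks a missing call


def call_rate(calls: List[Genotype]) -> float:
    """Fraction of non-missing genotype calls for one SNP."""
    if not calls:
        return 0.0
    return sum(c is not None for c in calls) / len(calls)


def filter_snps(genotypes: Dict[str, List[Genotype]], min_call_rate: float) -> List[str]:
    """Return the IDs of SNPs whose call rate meets the threshold."""
    return [snp for snp, calls in genotypes.items() if call_rate(calls) >= min_call_rate]


# Hypothetical toy data: three SNPs typed in four individuals.
toy = {
    "snp_a": [0, 1, 2, 1],        # call rate 1.00
    "snp_b": [0, None, 2, 1],     # call rate 0.75
    "snp_c": [None, None, 1, 0],  # call rate 0.50
}
kept = filter_snps(toy, min_call_rate=0.75)
print(kept)
```

In practice the threshold itself is the quantity under debate, which is where a formal decision-making procedure such as the one proposed in the abstract would come in.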
Genetic and phenotypic features defining industrial relevant Lactococcus lactis, L. cremoris and L. lactis biovar. diacetylactis strains.

PubMed

Manno, Mariano Torres; Zuljan, Federico; Alarcón, Sergio; Esteban, Luis; Blancato, Victor; Espariz, Martín; Magni, Christian

2018-06-23

Lactococcus lactis strains constitute one of the most important starter cultures for cheese production. In this study, a genome-wide analysis was performed including 68 available genomes of L. lactis group strains showing the existence of two species (L. lactis and L. cremoris) and two biovars (L. lactis biovar. diacetylactis and L. cremoris biovar. lactis). The proposed classification scheme revealed coherency among phenotypic (through in silico and in vivo bacterial function profiling), phylogenomic (through maximum likelihood trees) and genomic (using overall genome sequence-based parameters) approaches. Strain biodiversity for the industrial biovar. diacetylactis was also analyzed, finding they are formed by at least three variants with the CC1 clonal complex as the only one distributed worldwide. These findings and methodologies will help improve the selection of L. lactis group strains for industrial use as well as facilitate the interpretation of previous or future research studies on this diverse group of bacteria. Copyright © 2018. Published by Elsevier B.V.
Genome-wide meta-analyses of stratified depression in Generation Scotland and UK Biobank.

PubMed

Hall, Lynsey S; Adams, Mark J; Arnau-Soler, Aleix; Clarke, Toni-Kim; Howard, David M; Zeng, Yanni; Davies, Gail; Hagenaars, Saskia P; Maria Fernandez-Pujals, Ana; Gibson, Jude; Wigmore, Eleanor M; Boutin, Thibaud S; Hayward, Caroline; Scotland, Generation; Porteous, David J; Deary, Ian J; Thomson, Pippa A; Haley, Chris S; McIntosh, Andrew M

2018-01-10

Few replicable genetic associations for Major Depressive Disorder (MDD) have been identified. Recent studies of MDD have identified common risk variants by using a broader phenotype definition in very large samples, or by reducing phenotypic and ancestral heterogeneity. We sought to ascertain whether it is more informative to maximize the sample size using data from all available cases and controls, or to use a sex or recurrent stratified subset of affected individuals. To test this, we compared heritability estimates, genetic correlation with other traits, variance explained by MDD polygenic score, and variants identified by genome-wide meta-analysis for broad and narrow MDD classifications in two large British cohorts - Generation Scotland and UK Biobank. Genome-wide meta-analysis of MDD in males yielded one genome-wide significant locus on 3p22.3, with three genes in this region (CRTAP, GLB1, and TMPPE) demonstrating a significant association in gene-based tests. Meta-analyzed MDD, recurrent MDD and female MDD yielded equivalent heritability estimates, showed no detectable difference in association with polygenic scores, and were each genetically correlated with six health-correlated traits (neuroticism, depressive symptoms, subjective well-being, MDD, a cross-disorder phenotype and Bipolar Disorder). Whilst stratified GWAS analysis revealed a genome-wide significant locus for male MDD, the lack of independent replication, and the consistent pattern of results in other MDD classifications suggests that phenotypic stratification using recurrence or sex in currently available sample sizes is currently weakly justified. Based upon existing studies and our findings, the strategy of maximizing sample sizes is likely to provide the greater gain.
The Genome-Wide Influence on Human BMI Depends on Physical Activity, Life Course, and Historical Period.

PubMed

Guo, Guang; Liu, Hexuan; Wang, Ling; Shen, Haipeng; Hu, Wen

2015-10-01

In this analysis, guided by an evolutionary framework, we investigate how the human genome as a whole interacts with historical period, age, and physical activity to influence body mass index (BMI). The genomic influence is estimated by (1) heritability or the proportion of variance in BMI explained by genome-wide genotype data, and (2) the random effects or the best linear unbiased predictors (BLUPs) of genome-wide association studies (GWAS) data on BMI. Data were used from the Framingham Heart Study (FHS) in the United States. The study was initiated in 1948, and the obesity data were collected repeatedly over the subsequent decades. The analyses draw analysis samples from a pool of >8,000 individuals in the FHS. The hypothesis testing based on Pitman test, permutation Pitman test, F test, and permutation F test produces three sets of significant findings. First, the genomic influence on BMI is substantially larger after the mid-1980s than in the few decades before the mid-1980s within each age group of 21-40, 41-50, 51-60, and >60. Second, the genomic influence on BMI weakens as one ages across the life course, or the genomic influence on BMI tends to be more important during reproductive ages than after reproductive ages within each of the two historical periods. Third, within the age group of 21-50 and not in the age group of >50, the genomic influence on BMI among physically active individuals is substantially smaller than the influence on those who are not physically active. In summary, this study provides evidence that the influence of human genome as a whole on obesity depends on historical period, age, and level of physical activity.

Genome-wide significant association between a sequence variant at 15q15.2 and lung cancer risk

PubMed Central

Rafnar, Thorunn; Sulem, Patrick; Besenbacher, Soren; Gudbjartsson, Daniel F.; Zanon, Carlo; Gudmundsson, Julius; Stacey, Simon N.; Kostic, Jelena P.; Thorgeirsson, Thorgeir E.; Thorleifsson, Gudmar; Bjarnason, Hjordis; Skuladottir, Halla; Gudbjartsson, Tomas; Isaksson, Helgi J.; Isla, Dolores; Murillo, Laura; García-Prats, Maria D.; Panadero, Angeles; Aben, Katja K.H.; Vermeulen, Sita H.; van der Heijden, Henricus F.M.; Feser, William; Miller, York E.; Bunn, Paul A.; Kong, Augustine; Wolf, Holly J.; Franklin, Wilbur A.; Mayordomo, Jose I; Kiemeney, Lambertus A.; Jonsson, Steinn; Thorsteinsdottir, Unnur; Stefansson, Kari

2010-01-01

Genome-wide association studies (GWAS) have identified three genomic regions, at 15q24-25.1, 5p15.33 and 6p21.33, which associate with risk of lung cancer. Large meta-analyses of GWA data have failed to find additional associations of genome-wide significance. In this study, we sought to confirm 7 variants with suggestive association to lung cancer (P<10^-5) in a recently published meta-analysis. In a GWA dataset of 1,447 lung cancer cases and 36,256 controls in Iceland, three correlated variants on 15q15.2 (rs504417, rs11853991 and rs748404) showed a significant association with lung cancer whereas rs4254535 on 2p14, rs1530057 on 3p24.1, rs6438347 on 3q13.31 and rs1926203 on 10q23.31 did not. The most significant variant, rs748404, was genotyped in additional 1,299 lung cancer cases and 4,102 controls from the Netherlands, Spain and the USA and the results combined with published GWAS data. In this analysis, the T allele of rs748404 reached genome-wide significance (OR=1.15, P=1.1×10^-9). Another variant at the same locus, rs12050604, showed association with lung cancer (OR=1.09, P=3.6×10^-6) and remained significant after adjustment for rs748404 and vice versa. rs748404 is located 140 kb centromeric of the TP53BP1 gene that has been implicated in lung cancer risk. Two fully correlated, non-synonymous coding variants in TP53BP1, rs2602141 (Q1136K) and rs560191 (E353D), showed association with lung cancer in our sample set; however, this association did not remain significant after adjustment for rs748404. Our data show that one or more lung cancer risk variants of genome-wide significance and distinct from the coding variants in TP53BP1 are located at 15q15.2. PMID:21303977
Applications of the 1000 Genomes Project resources.

PubMed

Zheng-Bradley, Xiangqun; Flicek, Paul

2017-05-01

The 1000 Genomes Project created a valuable, worldwide reference for human genetic variation. Common uses of the 1000 Genomes dataset include genotype imputation supporting Genome-wide Association Studies, mapping expression Quantitative Trait Loci, filtering non-pathogenic variants from exome, whole genome and cancer genome sequencing projects, and genetic analysis of population structure and molecular evolution. In this article, we will highlight some of the multiple ways that the 1000 Genomes data can be and has been utilized for genetic studies. © The Author 2016. Published by Oxford University Press.
Genetic associations with lipoprotein subfraction measures differ by ethnicity in the multi-ethnic study of atherosclerosis (MESA)

USDA-ARS's Scientific Manuscript database

A recent genome-wide association study associated 62 single nucleotide polymorphisms (SNPs) from 43 genomic loci, with fasting lipoprotein subfractions in European–Americans (EAs) at genome-wide levels of significance across three independent samples. Whether these associations are consistent across...
Genome-wide association study of cognitive functions and educational attainment in UK Biobank (N=112 151)

PubMed Central

Davies, G; Marioni, R E; Liewald, D C; Hill, W D; Hagenaars, S P; Harris, S E; Ritchie, S J; Luciano, M; Fawns-Ritchie, C; Lyall, D; Cullen, B; Cox, S R; Hayward, C; Porteous, D J; Evans, J; McIntosh, A M; Gallacher, J; Craddock, N; Pell, J P; Smith, D J; Gale, C R; Deary, I J

2016-01-01

People's differences in cognitive functions are partly heritable and are associated with important life outcomes. Previous genome-wide association (GWA) studies of cognitive functions have found evidence for polygenic effects yet, to date, there are few replicated genetic associations. Here we use data from the UK Biobank sample to investigate the genetic contributions to variation in tests of three cognitive functions and in educational attainment. GWA analyses were performed for verbal–numerical reasoning (N=36 035), memory (N=112 067), reaction time (N=111 483) and for the attainment of a college or a university degree (N=111 114). We report genome-wide significant single-nucleotide polymorphism (SNP)-based associations in 20 genomic regions, and significant gene-based findings in 46 regions. These include findings in the ATXN2, CYP2D6, APBA1 and CADM2 genes. We report replication of these hits in published GWA studies of cognitive function, educational attainment and childhood intelligence. There is also replication, in UK Biobank, of SNP hits reported previously in GWA studies of educational attainment and cognitive function. GCTA-GREML analyses, using common SNPs (minor allele frequency>0.01), indicated significant SNP-based heritabilities of 31% (s.e.m.=1.8%) for verbal–numerical reasoning, 5% (s.e.m.=0.6%) for memory, 11% (s.e.m.=0.6%) for reaction time and 21% (s.e.m.=0.6%) for educational attainment. Polygenic score analyses indicate that up to 5% of the variance in cognitive test scores can be predicted in an independent cohort. The genomic regions identified include several novel loci, some of which have been associated with intracranial volume, neurodegeneration, Alzheimer's disease and schizophrenia. PMID:27046643
Genome-wide association analysis of age-at-onset in Alzheimer's disease.

PubMed

Kamboh, M I; Barmada, M M; Demirci, F Y; Minster, R L; Carrasquillo, M M; Pankratz, V S; Younkin, S G; Saykin, A J; Sweet, R A; Feingold, E; DeKosky, S T; Lopez, O L

2012-12-01

The risk of Alzheimer's disease (AD) is strongly determined by genetic factors, and recent genome-wide association studies (GWAS) have identified several genes for disease risk. In addition to disease risk, age-at-onset (AAO) of AD also has a strong genetic component, with an estimated heritability of 42%. Identification of AAO genes may help to understand the biological mechanisms that regulate the onset of the disease. Here we report the first GWAS focused on identifying genes for the AAO of AD. We performed a genome-wide meta-analysis on three samples comprising a total of 2222 AD cases. A total of ~2.5 million directly genotyped or imputed single-nucleotide polymorphisms (SNPs) were analyzed in relation to AAO of AD. As expected, the most significant associations were observed in the apolipoprotein E (APOE) region on chromosome 19, where several SNPs surpassed the conservative genome-wide significance threshold (P<5E-08). The most significant SNP outside the APOE region was located in the DCHS2 gene on chromosome 4q31.3 (rs1466662; P=4.95E-07). There were 19 additional significant SNPs in this region at P<1E-04, and the DCHS2 gene is expressed in the cerebral cortex and thus is a potential candidate for affecting AAO in AD. These findings need to be confirmed in additional well-powered samples.
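The P<5E-08 threshold quoted above is conventionally read as a Bonferroni correction of a 0.05 significance level for roughly one million independent common-variant tests. The abstract does not say how the threshold was derived, so the constants and the function name in this small Python sketch are assumptions. It shows why the DCHS2 SNP (P=4.95E-07) is reported as the top signal outside APOE without being called genome-wide significant.

```python
ALPHA = 0.05
N_INDEPENDENT_TESTS = 1_000_000  # conventional assumption, not stated in the abstract

# Bonferroni-style threshold: approximately 5e-08.
GENOME_WIDE_THRESHOLD = ALPHA / N_INDEPENDENT_TESTS


def is_genome_wide_significant(p_value: float,
                               threshold: float = GENOME_WIDE_THRESHOLD) -> bool:
    """True when the p-value falls strictly below the genome-wide threshold."""
    return p_value < threshold


dchs2_p = 4.95e-07  # rs1466662, as reported in the abstract
print(is_genome_wide_significant(dchs2_p))
```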
Family-wide Structural Characterization and Genomic Comparisons Decode the Diversity-oriented Biosynthesis of Thalassospiramides by Marine Proteobacteria

PubMed Central

Zhang, Weipeng; Lu, Liang; Lai, Qiliang; Zhu, Beika; Li, Zhongrui; Xu, Ying; Shao, Zongze; Herrup, Karl; Moore, Bradley S.; Ross, Avena C.; Qian, Pei-Yuan

2016-01-01

The thalassospiramide lipopeptides have great potential for therapeutic applications; however, their structural and functional diversity and biosynthesis are poorly understood. Here, by cultivating 130 Rhodospirillaceae strains sampled from oceans worldwide, we discovered 21 new thalassospiramide analogues and demonstrated their neuroprotective effects. To investigate the diversity of biosynthetic gene cluster (BGC) architectures, we sequenced the draft genomes of 28 Rhodospirillaceae strains. Our family-wide genomic analysis revealed three types of dysfunctional BGCs and four functional BGCs whose architectures correspond to four production patterns. This correlation allowed us to reassess the “diversity-oriented biosynthesis” proposed for the microbial production of thalassospiramides, which involves iteration of several key modules. Preliminary evolutionary investigation suggested that the functional BGCs could have arisen through module/domain loss, whereas the dysfunctional BGCs arose through horizontal gene transfer. Further comparative genomics indicated that thalassospiramide production is likely to be attendant on particular genes/pathways for amino acid metabolism, signaling transduction, and compound efflux. Our findings provide a systematic understanding of thalassospiramide production and new insights into the underlying mechanism. PMID:27875306
Finding Our Way through Phenotypes

PubMed Central

Deans, Andrew R.; Lewis, Suzanna E.; Huala, Eva; Anzaldo, Salvatore S.; Ashburner, Michael; Balhoff, James P.; Blackburn, David C.; Blake, Judith A.; Burleigh, J. Gordon; Chanet, Bruno; Cooper, Laurel D.; Courtot, Mélanie; Csösz, Sándor; Cui, Hong; Dahdul, Wasila; Das, Sandip; Dececchi, T. Alexander; Dettai, Agnes; Diogo, Rui; Druzinsky, Robert E.; Dumontier, Michel; Franz, Nico M.; Friedrich, Frank; Gkoutos, George V.; Haendel, Melissa; Harmon, Luke J.; Hayamizu, Terry F.; He, Yongqun; Hines, Heather M.; Ibrahim, Nizar; Jackson, Laura M.; Jaiswal, Pankaj; James-Zorn, Christina; Köhler, Sebastian; Lecointre, Guillaume; Lapp, Hilmar; Lawrence, Carolyn J.; Le Novère, Nicolas; Lundberg, John G.; Macklin, James; Mast, Austin R.; Midford, Peter E.; Mikó, István; Mungall, Christopher J.; Oellrich, Anika; Osumi-Sutherland, David; Parkinson, Helen; Ramírez, Martín J.; Richter, Stefan; Robinson, Peter N.; Ruttenberg, Alan; Schulz, Katja S.; Segerdell, Erik; Seltmann, Katja C.; Sharkey, Michael J.; Smith, Aaron D.; Smith, Barry; Specht, Chelsea D.; Squires, R. Burke; Thacker, Robert W.; Thessen, Anne; Fernandez-Triana, Jose; Vihinen, Mauno; Vize, Peter D.; Vogt, Lars; Wall, Christine E.; Walls, Ramona L.; Westerfeld, Monte; Wharton, Robert A.; Wirkner, Christian S.; Woolley, James B.; Yoder, Matthew J.; Zorn, Aaron M.; Mabee, Paula

2015-01-01

Despite a large and multifaceted effort to understand the vast landscape of phenotypic data, their current form inhibits productive data analysis. The lack of a community-wide, consensus-based, human- and machine-interpretable language for describing phenotypes and their genomic and environmental contexts is perhaps the most pressing scientific bottleneck to integration across many key fields in biology, including genomics, systems biology, development, medicine, evolution, ecology, and systematics. Here we survey the current phenomics landscape, including data resources and handling, and the progress that has been made to accurately capture relevant data descriptions for phenotypes. We present an example of the kind of integration across domains that computable phenotypes would enable, and we call upon the broader biology community, publishers, and relevant funding agencies to support efforts to surmount today's data barriers and facilitate analytical reproducibility. PMID:25562316
Adaptive evolution has targeted the C-terminal domain of the RXLR effectors of plant pathogenic oomycetes.

PubMed

Win, Joe; Kamoun, Sophien

2008-04-01

Plant pathogenic microbes deliver effector proteins inside host cells to modulate plant defense circuitry and enable parasitic colonization. As genome sequences from plant pathogens become available, genome-wide evolutionary analyses will shed light on how pathogen effector genes evolved and adapted to the cellular environment of their host plants. In the August 2007 issue of Plant Cell, we described adaptive evolution (positive selection) in the cytoplasmic RXLR effectors of three recently sequenced oomycete plant pathogens. Here, we summarize our findings and describe additional data that further validate our approach.
Combined genome-wide linkage and targeted association analysis of head circumference in autism spectrum disorder families.

PubMed

Woodbury-Smith, M; Bilder, D A; Morgan, J; Jerominski, L; Darlington, T; Dyer, T; Paterson, A D; Coon, H

2017-01-01

It has long been recognized that there is an association between enlarged head circumference (HC) and autism spectrum disorder (ASD), but the genetics of HC in ASD is not well understood. In order to investigate the genetic underpinning of HC in ASD, we undertook a genome-wide linkage study of HC followed by linkage signal targeted association among a sample of 67 extended pedigrees with ASD. HC measurements on members of 67 multiplex ASD extended pedigrees were used as a quantitative trait in a genome-wide linkage analysis. The Illumina 6K SNP linkage panel was used, and analyses were carried out using the SOLAR implemented variance components model. Loci identified in this way formed the target for subsequent association analysis using the Illumina OmniExpress chip and imputed genotypes. A modification of the qTDT was used as implemented in SOLAR. We identified a linkage signal spanning 6p21.31 to 6p22.2 (maximum LOD = 3.4). Although targeted association did not find evidence of association with any SNP overall, in one family with the strongest evidence of linkage, there was evidence for association (rs17586672, p = 1.72E-07). Although this region does not overlap with ASD linkage signals in these same samples, it has been associated with other psychiatric risk, including ADHD, developmental dyslexia, schizophrenia, specific language impairment, and juvenile bipolar disorder. The genome-wide significant linkage signal represents the first reported observation of a potential quantitative trait locus for HC in ASD and may be relevant in the context of complex multivariate risk likely leading to ASD.
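For readers unfamiliar with LOD scores: a LOD score is the base-10 logarithm of the ratio between the likelihood of the data under linkage and the likelihood under no linkage. The Python sketch below converts the reported maximum LOD of 3.4 into that likelihood ratio. The helper names are illustrative, and nothing here reproduces the SOLAR variance components analysis itself.

```python
import math


def lod_to_likelihood_ratio(lod: float) -> float:
    """Likelihood ratio (linkage vs. no linkage) implied by a LOD score."""
    return 10.0 ** lod


def likelihood_ratio_to_lod(ratio: float) -> float:
    """LOD score implied by a likelihood ratio."""
    return math.log10(ratio)


max_lod = 3.4  # maximum LOD reported for the 6p21.31 to 6p22.2 signal
ratio = lod_to_likelihood_ratio(max_lod)
print(f"LOD {max_lod} corresponds to a likelihood ratio of about {ratio:.0f} to 1")
```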
Digital Quantification of Human Eye Color Highlights Genetic Association of Three New Loci

PubMed Central

Liu, Fan; Wollstein, Andreas; Hysi, Pirro G.; Ankra-Badu, Georgina A.; Spector, Timothy D.; Park, Daniel; Zhu, Gu; Larsson, Mats; Duffy, David L.; Montgomery, Grant W.; Mackey, David A.; Walsh, Susan; Lao, Oscar; Hofman, Albert; Rivadeneira, Fernando; Vingerling, Johannes R.; Uitterlinden, André G.; Martin, Nicholas G.; Hammond, Christopher J.; Kayser, Manfred

2010-01-01

Previous studies have successfully identified genetic variants in several genes associated with human iris (eye) color; however, they all used simplified categorical trait information. Here, we quantified continuous eye color variation into hue and saturation values using high-resolution digital full-eye photographs and conducted a genome-wide association study on 5,951 Dutch Europeans from the Rotterdam Study. Three new regions, 1q42.3, 17q25.3, and 21q22.13, were highlighted meeting the criterion for genome-wide statistically significant association. The latter two loci were replicated in 2,261 individuals from the UK and in 1,282 from Australia. The LYST gene at 1q42.3 and the DSCR9 gene at 21q22.13 serve as promising functional candidates. A model for predicting quantitative eye colors explained over 50% of trait variance in the Rotterdam Study. Overall, our data exemplify that fine phenotyping is a useful strategy for finding genes involved in human complex traits. PMID:20463881
The Role of Copy Number Variation in Susceptibility to Amyotrophic Lateral Sclerosis: Genome-Wide Association Study and Comparison with Published Loci

PubMed Central

Wain, Louise V.; Pedroso, Inti; Landers, John E.; Breen, Gerome; Shaw, Christopher E.; Leigh, P. Nigel; Brown, Robert H.

2009-01-01

Background: The genetic contribution to sporadic amyotrophic lateral sclerosis (ALS) has not been fully elucidated. There are increasing efforts to characterise the role of copy number variants (CNVs) in human diseases; two previous studies concluded that CNVs may influence risk of sporadic ALS, with multiple rare CNVs more important than common CNVs. A little-explored issue surrounding genome-wide CNV association studies is that of post-calling filtering and merging of raw CNV calls. We undertook simulations to define filter thresholds and considered optimal ways of merging overlapping CNV calls for association testing, taking into consideration possibly overlapping or nested, but distinct, CNVs and boundary estimation uncertainty. Methodology and Principal Findings: In this study we screened Illumina 300K SNP genotyping data from 730 ALS cases and 789 controls for copy number variation. Following quality control filters using thresholds defined by simulation, a total of 11321 CNV calls were made across 575 cases and 621 controls. Using region-based and gene-based association analyses, we identified several loci showing nominally significant association. However, the choice of criteria for combining calls for association testing has an impact on the ranking of the results by their significance. Several loci which were previously reported as being associated with ALS were identified here. However, of another 15 genes previously reported as exhibiting ALS-specific copy number variation, only four exhibited copy number variation in this study. Potentially interesting novel loci, including EEF1D, a translation elongation factor involved in the delivery of aminoacyl tRNAs to the ribosome (a process which has previously been implicated in genetic studies of spinal muscular atrophy) were identified but must be treated with caution due to concerns surrounding genomic location and platform suitability. Conclusions and Significance: Interpretation of CNV association findings must take into account the effects of filtering and combining CNV calls when based on early genome-wide genotyping platforms and modest study sizes. PMID:19997636
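To make the merging question concrete, here is a minimal Python sketch of the simplest possible rule: take the union of any CNV calls that overlap on the same chromosome. The tuple layout, function name and toy coordinates are assumptions for illustration. The abstract does not describe the authors' merging criteria, and this naive union is precisely the kind of rule that would collapse nested but distinct CNVs into one region.

```python
from typing import List, Tuple

CnvCall = Tuple[str, int, int]  # (chromosome, start, end), coordinates inclusive


def merge_overlapping(calls: List[CnvCall]) -> List[CnvCall]:
    """Union-merge calls that overlap on the same chromosome."""
    merged: List[CnvCall] = []
    # Sorting by (chromosome, start, end) places potential overlaps next to each other.
    for chrom, start, end in sorted(calls):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical calls pooled across samples.
toy_calls = [
    ("chr1", 100, 200),
    ("chr1", 150, 300),  # overlaps the first call
    ("chr1", 400, 500),
    ("chr2", 150, 250),  # same coordinates range, different chromosome
]
regions = merge_overlapping(toy_calls)
print(regions)
```

Stricter alternatives, such as requiring a minimum reciprocal overlap before merging, would give different regions and therefore different association rankings, which is the sensitivity the abstract warns about.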
Genome Evolution Due to Allopolyploidization in Wheat

PubMed Central

Feldman, Moshe; Levy, Avraham A.

2012-01-01

The wheat group has evolved through allopolyploidization, namely, through hybridization among species from the plant genera Aegilops and Triticum followed by genome doubling. This speciation process has been associated with ecogeographical expansion and with domestication. In the past few decades, we have searched for explanations for this impressive success. Our studies attempted to probe the bases for the wide genetic variation characterizing these species, which accounts for their great adaptability and colonizing ability. Central to our work was the investigation of how allopolyploidization alters genome structure and expression. We found in wheat that allopolyploidy accelerated genome evolution in two ways: (1) it triggered rapid genome alterations through the instantaneous generation of a variety of cardinal genetic and epigenetic changes (which we termed “revolutionary” changes), and (2) it facilitated sporadic genomic changes throughout the species’ evolution (i.e., evolutionary changes), which are not attainable at the diploid level. Our major findings in natural and synthetic allopolyploid wheat indicate that these alterations have led to the cytological and genetic diploidization of the allopolyploids. These genetic and epigenetic changes reflect the dynamic structural and functional plasticity of the allopolyploid wheat genome. The significance of this plasticity for the successful establishment of wheat allopolyploids, in nature and under domestication, is discussed. PMID:23135324
In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features.

PubMed

Ding, Yiliang; Tang, Yin; Kwok, Chun Kit; Zhang, Yu; Bevilacqua, Philip C; Assmann, Sarah M

2014-01-30

RNA structure has critical roles in processes ranging from ligand sensing to the regulation of translation, polyadenylation and splicing. However, a lack of genome-wide in vivo RNA structural data has limited our understanding of how RNA structure regulates gene expression in living cells. Here we present a high-throughput, genome-wide in vivo RNA structure probing method, structure-seq, in which dimethyl sulphate methylation of unprotected adenines and cytosines is identified by next-generation sequencing. Application of this method to Arabidopsis thaliana seedlings yielded the first in vivo genome-wide RNA structure map at nucleotide resolution for any organism, with quantitative structural information across more than 10,000 transcripts. Our analysis reveals a three-nucleotide periodic repeat pattern in the structure of coding regions, as well as a less-structured region immediately upstream of the start codon, and shows that these features are strongly correlated with translation efficiency. We also find patterns of strong and weak secondary structure at sites of alternative polyadenylation, as well as strong secondary structure at 5' splice sites that correlates with unspliced events. Notably, in vivo structures of messenger RNAs annotated for stress responses are poorly predicted in silico, whereas mRNA structures of genes related to cell function maintenance are well predicted. Global comparison of several structural features between these two categories shows that the mRNAs associated with stress responses tend to have more single-strandedness, longer maximal loop length and higher free energy per nucleotide, features that may allow these RNAs to undergo conformational changes in response to environmental conditions. Structure-seq allows the RNA structurome and its biological roles to be interrogated on a genome-wide scale and should be applicable to any organism.
Genetic predictors of response to serotonergic and noradrenergic antidepressants in major depressive disorder: a genome-wide analysis of individual-level data and a meta-analysis.

PubMed

Tansey, Katherine E; Guipponi, Michel; Perroud, Nader; Bondolfi, Guido; Domenici, Enrico; Evans, David; Hall, Stephanie K; Hauser, Joanna; Henigsberg, Neven; Hu, Xiaolan; Jerman, Borut; Maier, Wolfgang; Mors, Ole; O'Donovan, Michael; Peters, Tim J; Placentino, Anna; Rietschel, Marcella; Souery, Daniel; Aitchison, Katherine J; Craig, Ian; Farmer, Anne; Wendland, Jens R; Malafosse, Alain; Holmans, Peter; Lewis, Glyn; Lewis, Cathryn M; Stensbøl, Tine Bryan; Kapur, Shitij; McGuffin, Peter; Uher, Rudolf

2012-01-01

It has been suggested that outcomes of antidepressant treatment for major depressive disorder could be significantly improved if treatment choice is informed by genetic data. This study aims to test the hypothesis that common genetic variants can predict response to antidepressants in a clinically meaningful way. The NEWMEDS consortium, an academia-industry partnership, assembled a database of over 2,000 European-ancestry individuals with major depressive disorder, prospectively measured treatment outcomes with serotonin reuptake inhibiting or noradrenaline reuptake inhibiting antidepressants and available genetic samples from five studies (three randomized controlled trials, one part-randomized controlled trial, and one treatment cohort study). After quality control, a dataset of 1,790 individuals with high-quality genome-wide genotyping provided adequate power to test the hypotheses that antidepressant response or a clinically significant differential response to the two classes of antidepressants could be predicted from a single common genetic polymorphism. None of the more than half a million genetic markers significantly predicted response to antidepressants overall, serotonin reuptake inhibitors, or noradrenaline reuptake inhibitors, or differential response to the two types of antidepressants (genome-wide significance p < 5×10^-8). No biological pathways were significantly overrepresented in the results. No significant associations (genome-wide significance p < 5×10^-8) were detected in a meta-analysis of NEWMEDS and another large sample (STAR*D), with 2,897 individuals in total. Polygenic scoring found no convergence among multiple associations in NEWMEDS and STAR*D. No single common genetic variant was associated with antidepressant response at a clinically relevant level in a European-ancestry cohort. Effects specific to particular antidepressant drugs could not be investigated in the current study.
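The genome-wide significance threshold quoted in this abstract can be read as a Bonferroni-style correction. The sketch below assumes a family-wise error rate of 0.05 and roughly one million independent tests, which is the usual convention and is not stated in the abstract; the function names are illustrative only.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def is_genome_wide_significant(p_value: float, threshold: float = 5e-8) -> bool:
    """True when a p-value falls below the genome-wide cutoff."""
    return p_value < threshold


# Assumed convention: alpha = 0.05 spread over about one million independent tests.
cutoff = bonferroni_threshold(0.05, 1_000_000)
matches_convention = math.isclose(cutoff, 5e-8)
```

Under that assumption the cutoff comes out at 5×10^-8, so a marker with p = 1×10^-6 would count as suggestive at most and would not be reported as a genome-wide significant predictor.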
A robust clustering algorithm for identifying problematic samples in genome-wide association studies.

PubMed

Bellenguez, Céline; Strange, Amy; Freeman, Colin; Donnelly, Peter; Spencer, Chris C A

2012-01-01

High-throughput genotyping arrays provide an efficient way to survey single nucleotide polymorphisms (SNPs) across the genome in large numbers of individuals. Downstream analysis of the data, for example in genome-wide association studies (GWAS), often involves statistical models of genotype frequencies across individuals. The complexities of the sample collection process and the potential for errors in the experimental assay can lead to biases and artefacts in an individual's inferred genotypes. Rather than attempting to model these complications, it has become a standard practice to remove individuals whose genome-wide data differ from the sample at large. Here we describe a simple, but robust, statistical algorithm to identify samples with atypical summaries of genome-wide variation. Its use as a semi-automated quality control tool is demonstrated using several summary statistics, selected to identify different potential problems, and it is applied to two different genotyping platforms and sample collections. The algorithm is written in R and is freely available at www.well.ox.ac.uk/chris-spencer. Contact: chris.spencer@well.ox.ac.uk. Supplementary data are available at Bioinformatics online.
A Genome-Wide Map of Mitochondrial DNA Recombination in Yeast

PubMed Central

Fritsch, Emilie S.; Chabbert, Christophe D.; Klaus, Bernd; Steinmetz, Lars M.

2014-01-01

In eukaryotic cells, the production of cellular energy requires close interplay between nuclear and mitochondrial genomes. The mitochondrial genome is essential in that it encodes several genes involved in oxidative phosphorylation. Each cell contains several mitochondrial genome copies and mitochondrial DNA recombination is a widespread process occurring in plants, fungi, protists, and invertebrates. Saccharomyces cerevisiae has proved to be an excellent model to dissect mitochondrial biology. Several studies have focused on DNA recombination in this organelle, yet mostly relied on reporter genes or artificial systems. However, no complete mitochondrial recombination map has been released for any eukaryote so far. In the present work, we sequenced pools of diploids originating from a cross between two different S. cerevisiae strains to detect recombination events. This strategy allowed us to generate the first genome-wide map of recombination for yeast mitochondrial DNA. We demonstrated that recombination events are enriched in specific hotspots preferentially localized in non-protein-coding regions. Additionally, comparison of the recombination profiles of two different crosses showed that the genetic background affects hotspot localization and recombination rates. Finally, to gain insights into the mechanisms involved in mitochondrial recombination, we assessed the impact of individual depletion of four genes previously associated with this process. Deletion of NTG1 and MGT1 did not substantially influence the recombination landscape, alluding to the potential presence of additional regulatory factors. Our findings also revealed the loss of large mitochondrial DNA regions in the absence of MHR1, suggesting a pivotal role for Mhr1 in mitochondrial genome maintenance during mating. 
This study provides a comprehensive overview of mitochondrial DNA recombination in yeast and thus paves the way for future mechanistic studies of mitochondrial recombination and genome maintenance. PMID:25081569
A genome-wide map of mitochondrial DNA recombination in yeast.

PubMed

Fritsch, Emilie S; Chabbert, Christophe D; Klaus, Bernd; Steinmetz, Lars M

2014-10-01

In eukaryotic cells, the production of cellular energy requires close interplay between nuclear and mitochondrial genomes. The mitochondrial genome is essential in that it encodes several genes involved in oxidative phosphorylation. Each cell contains several mitochondrial genome copies and mitochondrial DNA recombination is a widespread process occurring in plants, fungi, protists, and invertebrates. Saccharomyces cerevisiae has proved to be an excellent model to dissect mitochondrial biology. Several studies have focused on DNA recombination in this organelle, yet mostly relied on reporter genes or artificial systems. However, no complete mitochondrial recombination map has been released for any eukaryote so far. In the present work, we sequenced pools of diploids originating from a cross between two different S. cerevisiae strains to detect recombination events. This strategy allowed us to generate the first genome-wide map of recombination for yeast mitochondrial DNA. We demonstrated that recombination events are enriched in specific hotspots preferentially localized in non-protein-coding regions. Additionally, comparison of the recombination profiles of two different crosses showed that the genetic background affects hotspot localization and recombination rates. Finally, to gain insights into the mechanisms involved in mitochondrial recombination, we assessed the impact of individual depletion of four genes previously associated with this process. Deletion of NTG1 and MGT1 did not substantially influence the recombination landscape, alluding to the potential presence of additional regulatory factors. Our findings also revealed the loss of large mitochondrial DNA regions in the absence of MHR1, suggesting a pivotal role for Mhr1 in mitochondrial genome maintenance during mating. 
This study provides a comprehensive overview of mitochondrial DNA recombination in yeast and thus paves the way for future mechanistic studies of mitochondrial recombination and genome maintenance. Copyright © 2014 by the Genetics Society of America.
Pervasive adaptive protein evolution apparent in diversity patterns around amino acid substitutions in Drosophila simulans.

PubMed

Sattath, Shmuel; Elyashiv, Eyal; Kolodny, Oren; Rinott, Yosef; Sella, Guy

2011-02-10

In Drosophila, multiple lines of evidence converge in suggesting that beneficial substitutions to the genome may be common. All suffer from confounding factors, however, such that the interpretation of the evidence (in particular, conclusions about the rate and strength of beneficial substitutions) remains tentative. Here, we use genome-wide polymorphism data in D. simulans and sequenced genomes of its close relatives to construct a readily interpretable characterization of the effects of positive selection: the shape of average neutral diversity around amino acid substitutions. As expected under recurrent selective sweeps, we find a trough in diversity levels around amino acid but not around synonymous substitutions, a distinctive pattern that is not expected under alternative models. This characterization is richer than previous approaches, which relied on limited summaries of the data (e.g., the slope of a scatter plot), and relates to underlying selection parameters in a straightforward way, allowing us to make more reliable inferences about the prevalence and strength of adaptation. Specifically, we develop a coalescent-based model for the shape of the entire curve and use it to infer adaptive parameters by maximum likelihood. Our inference suggests that ∼13% of amino acid substitutions cause selective sweeps. Interestingly, it reveals two classes of beneficial fixations: a minority (approximately 3%) that appears to have had large selective effects and accounts for most of the reduction in diversity, and the remaining 10%, which seem to have had very weak selective effects. These estimates therefore help to reconcile the apparent conflict among previously published estimates of the strength of selection. More generally, our findings provide unequivocal evidence for strongly beneficial substitutions in Drosophila and illustrate how the rapidly accumulating genome-wide data can be leveraged to address enduring questions about the genetic basis of adaptation.
Hi-C-constrained physical models of human chromosomes recover functionally-related properties of genome organization

NASA Astrophysics Data System (ADS)

di Stefano, Marco; Paulsen, Jonas; Lien, Tonje G.; Hovig, Eivind; Micheletti, Cristian

2016-10-01

Combining genome-wide structural models with phenomenological data is at the forefront of efforts to understand the organizational principles regulating the human genome. Here, we use chromosome-chromosome contact data as knowledge-based constraints for large-scale three-dimensional models of the human diploid genome. The resulting models remain minimally entangled and acquire several functional features that are observed in vivo and that were never used as input for the model. We find, for instance, that gene-rich, active regions are drawn towards the nuclear center, while gene poor and lamina associated domains are pushed to the periphery. These and other properties persist upon adding local contact constraints, suggesting their compatibility with non-local constraints for the genome organization. The results show that suitable combinations of data analysis and physical modelling can expose the unexpectedly rich functionally-related properties implicit in chromosome-chromosome contact data. Specific directions are suggested for further developments based on combining experimental data analysis and genomic structural modelling.

Hi-C-constrained physical models of human chromosomes recover functionally-related properties of genome organization.

PubMed

Di Stefano, Marco; Paulsen, Jonas; Lien, Tonje G; Hovig, Eivind; Micheletti, Cristian

2016-10-27

Combining genome-wide structural models with phenomenological data is at the forefront of efforts to understand the organizational principles regulating the human genome. Here, we use chromosome-chromosome contact data as knowledge-based constraints for large-scale three-dimensional models of the human diploid genome. The resulting models remain minimally entangled and acquire several functional features that are observed in vivo and that were never used as input for the model. We find, for instance, that gene-rich, active regions are drawn towards the nuclear center, while gene poor and lamina associated domains are pushed to the periphery. These and other properties persist upon adding local contact constraints, suggesting their compatibility with non-local constraints for the genome organization. The results show that suitable combinations of data analysis and physical modelling can expose the unexpectedly rich functionally-related properties implicit in chromosome-chromosome contact data. Specific directions are suggested for further developments based on combining experimental data analysis and genomic structural modelling.
The Arabidopsis thaliana mobilome and its impact at the species level

PubMed Central

Quadrana, Leandro; Bortolini Silveira, Amanda; Mayhew, George F; LeBlanc, Chantal; Martienssen, Robert A; Jeddeloh, Jeffrey A; Colot, Vincent

2016-01-01

Transposable elements (TEs) are powerful motors of genome evolution yet a comprehensive assessment of recent transposition activity at the species level is lacking for most organisms. Here, using genome sequencing data for 211 Arabidopsis thaliana accessions taken from across the globe, we identify thousands of recent transposition events involving half of the 326 TE families annotated in this plant species. We further show that the composition and activity of the 'mobilome' vary extensively between accessions in relation to climate and genetic factors. Moreover, TEs insert equally throughout the genome and are rapidly purged by natural selection from gene-rich regions because they frequently affect genes, in multiple ways. Remarkably, loci controlling adaptive responses to the environment are the most frequent transposition targets observed. These findings demonstrate the pervasive, species-wide impact that a rich mobilome can have and the importance of transposition as a recurrent generator of large-effect alleles. DOI: http://dx.doi.org/10.7554/eLife.15716.001 PMID:27258693
Cpf1-Database: web-based genome-wide guide RNA library design for gene knockout screens using CRISPR-Cpf1.

PubMed

Park, Jeongbin; Bae, Sangsu

2018-03-15

Following the type II CRISPR-Cas9 system, type V CRISPR-Cpf1 endonucleases have been found to be applicable for genome editing in various organisms in vivo. However, there are as yet no web-based tools capable of optimally selecting guide RNAs (gRNAs) among all possible genome-wide target sites. Here, we present Cpf1-Database, a genome-wide gRNA library design tool for LbCpf1 and AsCpf1, which have DNA recognition sequences of 5'-TTTN-3' at the 5' ends of target sites. Cpf1-Database provides a sophisticated but simple way to design gRNAs for AsCpf1 nucleases on the genome scale. One can easily access the data using a straightforward web interface, and using the powerful collections feature one can easily design gRNAs for thousands of genes in a short time. Free access at http://www.rgenome.net/cpf1-database/. Contact: sangsubae@hanyang.ac.kr.
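The recognition rule described in this abstract (a 5'-TTTN-3' sequence at the 5' end of the target site) can be sketched as a simple scan. This is not how Cpf1-Database itself works, which the abstract does not describe; the function name is hypothetical, the 23-nt guide length is an assumed default, and only the forward strand is searched.

```python
from typing import List, Tuple


def find_cpf1_targets(seq: str, guide_len: int = 23) -> List[Tuple[int, str]]:
    """Return (PAM start index, protospacer) pairs for TTTN sites on the given strand.

    guide_len is an assumed protospacer length, not a value taken from the abstract.
    """
    seq = seq.upper()
    hits: List[Tuple[int, str]] = []
    # Stop early enough that a 4-nt PAM plus a full protospacer still fits.
    for i in range(len(seq) - 4 - guide_len + 1):
        if seq[i:i + 3] == "TTT" and seq[i + 3] in "ACGT":
            hits.append((i, seq[i + 4:i + 4 + guide_len]))
    return hits


# Toy sequence with a single TTTA site followed by an 8-nt protospacer.
toy_hits = find_cpf1_targets("GGTTTAACGTACGT", guide_len=8)
```

A real design tool would also scan the reverse complement and score candidates for off-target matches, both of which are left out here.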
On the rank-distance median of 3 permutations.

PubMed

Chindelevitch, Leonid; Pereira Zanetti, João Paulo; Meidanis, João

2018-05-08

Recently, Pereira Zanetti, Biller and Meidanis have proposed a new definition of a rearrangement distance between genomes. In this formulation, each genome is represented as a matrix, and the distance d is the rank distance between these matrices. Although defined in terms of matrices, the rank distance is equal to the minimum total weight of a series of weighted operations that leads from one genome to the other, including inversions, translocations, transpositions, and others. The computational complexity of the median-of-three problem according to this distance is currently unknown. The genome matrices are a special kind of permutation matrices, which we study in this paper. In their paper, the authors provide an [Formula: see text] algorithm for determining three candidate medians, prove the tight approximation ratio [Formula: see text], and provide a sufficient condition for their candidates to be true medians. They also conduct some experiments that suggest that their method is accurate on simulated and real data. In this paper, we extend their results and provide the following: (1) three invariants characterizing the problem of finding the median of 3 matrices; (2) a sufficient condition for uniqueness of medians that can be checked in O(n); (3) a faster, [Formula: see text] algorithm for determining the median under this condition; (4) a new heuristic algorithm for this problem based on compressed sensing; and (5) a [Formula: see text] algorithm that exactly solves the problem when the inputs are orthogonal matrices, a class that includes both permutations and genomes as special cases. Our work provides the first proof that, with respect to the rank distance, the problem of finding the median of 3 genomes, as well as the median of 3 permutations, is exactly solvable in polynomial time, a result which should be contrasted with its NP-hardness for the DCJ (double cut-and-join) distance and most other families of genome rearrangement operations. 
This result, backed by our experimental tests, indicates that the rank distance is a viable alternative to the DCJ distance widely used in genome comparisons.
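For permutation inputs, the rank distance in this abstract can be computed without building matrices. The sketch below relies on a standard linear-algebra fact rather than on anything stated in the abstract: for permutation matrices P and Q of size n, rank(P - Q) equals n minus the number of cycles (fixed points included) of the permutation P^-1 Q. The function names and the list encoding of permutations are assumptions made for illustration, and this is not the authors' median algorithm.

```python
from typing import List


def rank_distance(p: List[int], q: List[int]) -> int:
    """rank(P - Q) for permutations given as lists, where p[i] is the image of i."""
    n = len(p)
    p_inv = [0] * n
    for i, image in enumerate(p):
        p_inv[image] = i
    # r is the composed permutation p^-1 after q; its cycle count determines the rank.
    r = [p_inv[q[i]] for i in range(n)]
    seen = [False] * n
    cycles = 0
    for start in range(n):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = r[j]
    return n - cycles


def median_score(m: List[int], a: List[int], b: List[int], c: List[int]) -> int:
    """Total rank distance from a candidate m to three inputs (lower is better)."""
    return rank_distance(m, a) + rank_distance(m, b) + rank_distance(m, c)


identity, swap, cycle3 = [0, 1, 2], [1, 0, 2], [1, 2, 0]
score = median_score(identity, identity, swap, cycle3)
```

The score only evaluates candidates that are themselves permutations; finding a true median among all matrices is the harder problem the paper addresses.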
Genome-wide association study identifies six new loci influencing pulse pressure and mean arterial pressure

PubMed Central

Wain, Louise V; Verwoert, Germaine C; O’Reilly, Paul F; Shi, Gang; Johnson, Toby; Johnson, Andrew D; Bochud, Murielle; Rice, Kenneth M; Henneman, Peter; Smith, Albert V; Ehret, Georg B; Amin, Najaf; Larson, Martin G; Mooser, Vincent; Hadley, David; Dörr, Marcus; Bis, Joshua C; Aspelund, Thor; Esko, Tõnu; Janssens, A Cecile JW; Zhao, Jing Hua; Heath, Simon; Laan, Maris; Fu, Jingyuan; Pistis, Giorgio; Luan, Jian’an; Arora, Pankaj; Lucas, Gavin; Pirastu, Nicola; Pichler, Irene; Jackson, Anne U; Webster, Rebecca J; Zhang, Feng; Peden, John F; Schmidt, Helena; Tanaka, Toshiko; Campbell, Harry; Igl, Wilmar; Milaneschi, Yuri; Hotteng, Jouke-Jan; Vitart, Veronique; Chasman, Daniel I; Trompet, Stella; Bragg-Gresham, Jennifer L; Alizadeh, Behrooz Z; Chambers, John C; Guo, Xiuqing; Lehtimäki, Terho; Kühnel, Brigitte; Lopez, Lorna M; Polašek, Ozren; Boban, Mladen; Nelson, Christopher P; Morrison, Alanna C; Pihur, Vasyl; Ganesh, Santhi K; Hofman, Albert; Kundu, Suman; Mattace-Raso, Francesco US; Rivadeneira, Fernando; Sijbrands, Eric JG; Uitterlinden, Andre G; Hwang, Shih-Jen; Vasan, Ramachandran S; Wang, Thomas J; Bergmann, Sven; Vollenweider, Peter; Waeber, Gérard; Laitinen, Jaana; Pouta, Anneli; Zitting, Paavo; McArdle, Wendy L; Kroemer, Heyo K; Völker, Uwe; Völzke, Henry; Glazer, Nicole L; Taylor, Kent D; Harris, Tamara B; Alavere, Helene; Haller, Toomas; Keis, Aime; Tammesoo, Mari-Liis; Aulchenko, Yurii; Barroso, Inês; Khaw, Kay-Tee; Galan, Pilar; Hercberg, Serge; Lathrop, Mark; Eyheramendy, Susana; Org, Elin; Sõber, Siim; Lu, Xiaowen; Nolte, Ilja M; Penninx, Brenda W; Corre, Tanguy; Masciullo, Corrado; Sala, Cinzia; Groop, Leif; Voight, Benjamin F; Melander, Olle; O’Donnell, Christopher J; Salomaa, Veikko; d’Adamo, Adamo Pio; Fabretto, Antonella; Faletra, Flavio; Ulivi, Sheila; Del Greco, M Fabiola; Facheris, Maurizio; Collins, Francis S; Bergman, Richard N; Beilby, John P; Hung, Joseph; Musk, A William; Mangino, Massimo; Shin, So-Youn; Soranzo, Nicole; Watkins, Hugh; 
Goel, Anuj; Hamsten, Anders; Gider, Pierre; Loitfelder, Marisa; Zeginigg, Marion; Hernandez, Dena; Najjar, Samer S; Navarro, Pau; Wild, Sarah H; Corsi, Anna Maria; Singleton, Andrew; de Geus, Eco JC; Willemsen, Gonneke; Parker, Alex N; Rose, Lynda M; Buckley, Brendan; Stott, David; Orru, Marco; Uda, Manuela; van der Klauw, Melanie M; Zhang, Weihua; Li, Xinzhong; Scott, James; Chen, Yii-Der Ida; Burke, Gregory L; Kähönen, Mika; Viikari, Jorma; Döring, Angela; Meitinger, Thomas; Davies, Gail; Starr, John M; Emilsson, Valur; Plump, Andrew; Lindeman, Jan H; ’t Hoen, Peter AC; König, Inke R; Felix, Janine F; Clarke, Robert; Hopewell, Jemma C; Ongen, Halit; Breteler, Monique; Debette, Stéphanie; DeStefano, Anita L; Fornage, Myriam; Mitchell, Gary F; Smith, Nicholas L; Holm, Hilma; Stefansson, Kari; Thorleifsson, Gudmar; Thorsteinsdottir, Unnur; Samani, Nilesh J; Preuss, Michael; Rudan, Igor; Hayward, Caroline; Deary, Ian J; Wichmann, H-Erich; Raitakari, Olli T; Palmas, Walter; Kooner, Jaspal S; Stolk, Ronald P; Jukema, J Wouter; Wright, Alan F; Boomsma, Dorret I; Bandinelli, Stefania; Gyllensten, Ulf B; Wilson, James F; Ferrucci, Luigi; Schmidt, Reinhold; Farrall, Martin; Spector, Tim D; Palmer, Lyle J; Tuomilehto, Jaakko; Pfeufer, Arne; Gasparini, Paolo; Siscovick, David; Altshuler, David; Loos, Ruth JF; Toniolo, Daniela; Snieder, Harold; Gieger, Christian; Meneton, Pierre; Wareham, Nicholas J; Oostra, Ben A; Metspalu, Andres; Launer, Lenore; Rettig, Rainer; Strachan, David P; Beckmann, Jacques S; Witteman, Jacqueline CM; Erdmann, Jeanette; van Dijk, Ko Willems; Boerwinkle, Eric; Boehnke, Michael; Ridker, Paul M; Jarvelin, Marjo-Riitta; Chakravarti, Aravinda; Abecasis, Goncalo R; Gudnason, Vilmundur; Newton-Cheh, Christopher; Levy, Daniel; Munroe, Patricia B; Psaty, Bruce M; Caulfield, Mark J; Rao, Dabeeru C

2012-01-01

Numerous genetic loci influence systolic blood pressure (SBP) and diastolic blood pressure (DBP) in Europeans [1-3]. We now report genome-wide association studies of pulse pressure (PP) and mean arterial pressure (MAP). In discovery (N=74,064) and follow-up studies (N=48,607), we identified at genome-wide significance (P = 2.7×10^-8 to P = 2.3×10^-13) four novel PP loci (at 4q12 near CHIC2/PDGFRAI, 7q22.3 near PIK3CG, 8q24.12 in NOV, 11q24.3 near ADAMTS-8), two novel MAP loci (3p21.31 in MAP4, 10q25.3 near ADRB1) and one locus associated with both traits (2q24.3 near FIGN) which has recently been associated with SBP in east Asians. For three of the novel PP signals, the estimated effect for SBP was opposite to that for DBP, in contrast to the majority of common SBP- and DBP-associated variants which show concordant effects on both traits. These findings indicate novel genetic mechanisms underlying blood pressure variation, including pathways that may differentially influence SBP and DBP. PMID:21909110
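A short worked example may help explain why variants with opposite effects on SBP and DBP surface in a pulse pressure analysis. It assumes the standard clinical definitions PP = SBP - DBP and MAP = DBP + PP/3, which the abstract does not spell out, and the per-allele effect sizes are invented purely for illustration.

```python
def pulse_pressure(sbp: float, dbp: float) -> float:
    """Pulse pressure in mmHg: systolic minus diastolic."""
    return sbp - dbp


def mean_arterial_pressure(sbp: float, dbp: float) -> float:
    """Common approximation: diastolic plus one third of the pulse pressure."""
    return dbp + (sbp - dbp) / 3.0


baseline_sbp, baseline_dbp = 120.0, 80.0

# Hypothetical per-allele effects (mmHg): SBP goes up while DBP goes down.
effect_sbp, effect_dbp = 0.5, -0.3

pp_shift = (pulse_pressure(baseline_sbp + effect_sbp, baseline_dbp + effect_dbp)
            - pulse_pressure(baseline_sbp, baseline_dbp))
map_shift = (mean_arterial_pressure(baseline_sbp + effect_sbp, baseline_dbp + effect_dbp)
             - mean_arterial_pressure(baseline_sbp, baseline_dbp))
```

With these made-up numbers the opposing effects add up in PP (a shift of about 0.8 mmHg) but nearly cancel in MAP, which is consistent with the abstract's point that such variants are easier to detect through PP than through SBP or DBP alone.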
Six Degrees of Separation and Employment: Disability Services Reconsidered

ERIC Educational Resources Information Center

Stensrud, Robert; Sover-Wright, Ehren; Gilbride, Dennis

2009-01-01

If six degrees of separation is all that is needed for anyone to find anyone else, and if three clicks can find almost anything on the World Wide Web, perhaps analyzing how social networks form and connect people to each other offers a useful way to reconsider how the rehabilitation profession pursues job placement and employer development. Social…
Whole-genome resequencing of 292 pigeonpea accessions identifies genomic regions associated with domestication and agronomic traits.

PubMed

Varshney, Rajeev K; Saxena, Rachit K; Upadhyaya, Hari D; Khan, Aamir W; Yu, Yue; Kim, Changhoon; Rathore, Abhishek; Kim, Dongseon; Kim, Jihun; An, Shaun; Kumar, Vinay; Anuradha, Ghanta; Yamini, Kalinati Narasimhan; Zhang, Wei; Muniswamy, Sonnappa; Kim, Jong-So; Penmetsa, R Varma; von Wettberg, Eric; Datta, Swapan K

2017-07-01

Pigeonpea (Cajanus cajan), a tropical grain legume with low input requirements, is expected to continue to have an important role in supplying food and nutritional security in developing countries in Asia, Africa and the tropical Americas. From whole-genome resequencing of 292 Cajanus accessions encompassing breeding lines, landraces and wild species, we characterize genome-wide variation. On the basis of a scan for selective sweeps, we find several genomic regions that were likely targets of domestication and breeding. Using genome-wide association analysis, we identify associations between several candidate genes and agronomically important traits. Candidate genes for these traits in pigeonpea have sequence similarity to genes functionally characterized in other plants for flowering time control, seed development and pod dehiscence. Our findings will allow acceleration of genetic gains for key traits to improve yield and sustainability in pigeonpea.
Extensive retroviral diversity in shark.

PubMed

Han, Guan-Zhu

2015-04-28

Retroviruses infect a wide range of vertebrates. However, little is known about the diversity of retroviruses in basal vertebrates. Endogenous retroviruses (ERVs) provide a valuable resource to study the ecology and evolution of retroviruses. I performed a genome-scale screening for ERVs in the elephant shark (Callorhinchus milii) and identified three complete or nearly complete ERVs and many short ERV fragments. I designate these retroviral elements "C. milii ERVs" (CmiERVs). Phylogenetic analysis shows that the CmiERVs form three distinct lineages. The genome invasions by these retroviruses are estimated to have taken place more than 50 million years ago. My results reveal the extensive retroviral diversity in the elephant shark. Diverse retroviruses appear to have been associated with cartilaginous fishes for millions of years. These findings have important implications in understanding the diversity and evolution of retroviruses.
Genome-wide association study in Chinese identifies novel loci for blood pressure and hypertension

PubMed Central

Lu, Xiangfeng; Wang, Laiyuan; Lin, Xu; Huang, Jianfeng; Charles Gu, C.; He, Meian; Shen, Hongbing; He, Jiang; Zhu, Jingwen; Li, Huaixing; Hixson, James E.; Wu, Tangchun; Dai, Juncheng; Lu, Ling; Shen, Chong; Chen, Shufeng; He, Lin; Mo, Zengnan; Hao, Yongchen; Mo, Xingbo; Yang, Xueli; Li, Jianxin; Cao, Jie; Chen, Jichun; Fan, Zhongjie; Li, Ying; Zhao, Liancheng; Li, Hongfan; Lu, Fanghong; Yao, Cailiang; Yu, Lin; Xu, Lihua; Mu, Jianjun; Wu, Xianping; Deng, Ying; Hu, Dongsheng; Zhang, Weidong; Ji, Xu; Guo, Dongshuang; Guo, Zhirong; Zhou, Zhengyuan; Yang, Zili; Wang, Renping; Yang, Jun; Zhou, Xiaoyang; Yan, Weili; Sun, Ningling; Gao, Pingjin; Gu, Dongfeng

2015-01-01

Hypertension is a common disorder and the leading risk factor for cardiovascular disease and premature deaths worldwide. Genome-wide association studies (GWASs) in the European population have identified multiple chromosomal regions associated with blood pressure, and the identified loci altogether explain only a small fraction of the variance for blood pressure. The differences in environmental exposures and genetic background between Chinese and European populations might suggest potential different pathways of blood pressure regulation. To identify novel genetic variants affecting blood pressure variation, we conducted a meta-analysis of GWASs of blood pressure and hypertension in 11 816 subjects followed by replication studies including 69 146 additional individuals. We identified genome-wide significant (P < 5.0 × 10^-8) associations with blood pressure, which included variants at three new loci (CACNA1D, CYP21A2, and MED13L) and a newly discovered variant near SLC4A7. We also replicated 14 previously reported loci, 8 (CASZ1, MOV10, FGF5, CYP17A1, SOX6, ATP2B1, ALDH2, and JAG1) at genome-wide significance, and 6 (FIGN, ULK4, GUCY1A3, HFE, TBX3-TBX5, and TBX3) at a suggestive level of P = 1.81 × 10^-3 to 5.16 × 10^-8. These findings provide new mechanistic insights into the regulation of blood pressure and potential targets for treatments. PMID:25249183
Linkage Disequilibrium And Genome-Wide Association Studies In O. sativa

USDA-ARS's Scientific Manuscript database

There is increasing evidence that genome-wide association studies provide a powerful approach to find the genetic basis of complex phenotypic variation in all kinds of species. For this purpose, we developed the first generation 44K Affymetrix SNP array in rice (see Tung et al. poster). We genotyped...
Whole-exome analysis of foetal autopsy tissue reveals a frameshift mutation in OBSL1, consistent with a diagnosis of 3-M Syndrome.

PubMed

Marshall, Christian R; Farrell, Sandra A; Cushing, Donna; Paton, Tara; Stockley, Tracy L; Stavropoulos, Dimitri J; Ray, Peter N; Szego, Michael; Lau, Lynette; Pereira, Sergio L; Cohn, Ronald D; Wintle, Richard F; Abuzenadah, Adel M; Abu-Elmagd, Muhammad; Scherer, Stephen W

2015-01-01

We report a consanguineous couple that has experienced three consecutive pregnancy losses following the foetal ultrasound finding of short limbs. Post-termination examination revealed no skeletal dysplasia, but some subtle proximal limb shortening in two foetuses, and a spectrum of mildly dysmorphic features. Karyotype was normal in all three foetuses (46, XX) and comparative genomic hybridization microarray analysis detected no pathogenic copy number variants. Whole-exome sequencing and genome-wide homozygosity mapping revealed a previously reported frameshift mutation in the OBSL1 gene (c.1273insA p.T425nfsX40), consistent with a diagnosis of 3-M Syndrome 2 (OMIM #612921), which had not been anticipated from the clinical findings. Our study provides novel insight into the early clinical manifestations of this form of 3-M syndrome, and demonstrates the utility of whole exome sequencing as a tool for prenatal diagnosis in particular when there is a family history suggestive of a recurrent set of clinical symptoms.
Trans-Pacific RAD-Seq population genomics confirms introgressive hybridization in Eastern Pacific Pocillopora corals.

PubMed

Combosch, David J; Vollmer, Steven V

2015-07-01

Discrepancies between morphology-based taxonomy and phylogenetic systematics are common in Scleractinian corals. In Pocillopora corals, nine recently identified genetic lineages disagree fundamentally with the 17 recognized Pocillopora species, including 5 major Indo-Pacific reef-builders. Pocillopora corals hybridize in the Tropical Eastern Pacific, so it is possible that some of the disagreement between the genetics and taxonomy may be due to introgressive hybridization. Here we used 6769 genome-wide SNPs from Restriction-site Associated DNA Sequencing (RAD-Seq) to conduct phylogenomic comparisons among three common, Indo-Pacific Pocillopora species - P. damicornis, P. eydouxi and P. elegans - within and between populations in the Tropical Eastern Pacific (TEP) and the Central Pacific. Genome-wide RAD-Seq comparisons of Central and TEP Pocillopora confirm that the morphospecies P. damicornis, P. eydouxi and P. elegans are not monophyletic, but instead fall into three distinct genetic groups. However, hybrid samples shared fixed alleles with their respective parental species and, even without strict monophyly, P. damicornis share a common set of 33 species-specific alleles across the Pacific. RAD-Seq data confirm the pattern of one-way introgressive hybridization among TEP Pocillopora, suggesting that introgression may play a role in generating shared, polyphyletic lineages among currently recognized Pocillopora species. Levels of population differentiation within genetic lineages indicate significantly higher levels of population differentiation in the Tropical Eastern Pacific than in the Central West Pacific. Copyright © 2015 Elsevier Inc. All rights reserved.
Genome segregation and packaging machinery in Acanthamoeba polyphaga mimivirus is reminiscent of bacterial apparatus.

PubMed

Chelikani, Venkata; Ranjan, Tushar; Zade, Amrutraj; Shukla, Avi; Kondabagil, Kiran

2014-06-01

Genome packaging is a critical step in the virion assembly process. The putative ATP-driven genome packaging motor of Acanthamoeba polyphaga mimivirus (APMV) and other nucleocytoplasmic large DNA viruses (NCLDVs) is a distant ortholog of prokaryotic chromosome segregation motors, such as FtsK and HerA, rather than other viral packaging motors, such as large terminase. Intriguingly, APMV also encodes other components, i.e., three putative serine recombinases and a putative type II topoisomerase, all of which are essential for chromosome segregation in prokaryotes. Based on our analyses of these components and taking the limited available literature into account, here we propose for the first time a model for genome segregation and packaging in APMV that can possibly be extended to NCLDV subfamilies, except perhaps Poxviridae and Ascoviridae. This model might represent a unique variation of the prokaryotic system acquired and contrived by the large DNA viruses of eukaryotes. It is also consistent with previous observations that unicellular eukaryotes, such as amoebae, are melting pots for the advent of chimeric organisms with novel mechanisms. Extremely large viruses with DNA genomes infect a wide range of eukaryotes, from human beings to amoebae and from crocodiles to algae. These large DNA viruses, unlike their much smaller cousins, have the capability of making most of the protein components required for their multiplication. Once they infect the cell, these viruses set up viral replication centers, known as viral factories, to carry out their multiplication with very little help from the host. 
Our sequence analyses show that there is remarkable similarity between prokaryotes (bacteria and archaea) and large DNA viruses, such as mimivirus, vaccinia virus, and pandoravirus, in the way that they process their newly synthesized genetic material to make sure that only one copy of the complete genome is generated and is meticulously placed inside the newly synthesized viral particle. These findings have important evolutionary implications about the origin and evolution of large viruses.
Genome Segregation and Packaging Machinery in Acanthamoeba polyphaga Mimivirus Is Reminiscent of Bacterial Apparatus

PubMed Central

Chelikani, Venkata; Ranjan, Tushar; Zade, Amrutraj; Shukla, Avi

2014-01-01

ABSTRACT Genome packaging is a critical step in the virion assembly process. The putative ATP-driven genome packaging motor of Acanthamoeba polyphaga mimivirus (APMV) and other nucleocytoplasmic large DNA viruses (NCLDVs) is a distant ortholog of prokaryotic chromosome segregation motors, such as FtsK and HerA, rather than other viral packaging motors, such as large terminase. Intriguingly, APMV also encodes other components, i.e., three putative serine recombinases and a putative type II topoisomerase, all of which are essential for chromosome segregation in prokaryotes. Based on our analyses of these components and taking the limited available literature into account, here we propose for the first time a model for genome segregation and packaging in APMV that can possibly be extended to NCLDV subfamilies, except perhaps Poxviridae and Ascoviridae. This model might represent a unique variation of the prokaryotic system acquired and contrived by the large DNA viruses of eukaryotes. It is also consistent with previous observations that unicellular eukaryotes, such as amoebae, are melting pots for the advent of chimeric organisms with novel mechanisms. IMPORTANCE Extremely large viruses with DNA genomes infect a wide range of eukaryotes, from human beings to amoebae and from crocodiles to algae. These large DNA viruses, unlike their much smaller cousins, have the capability of making most of the protein components required for their multiplication. Once they infect the cell, these viruses set up viral replication centers, known as viral factories, to carry out their multiplication with very little help from the host. 
Our sequence analyses show that there is remarkable similarity between prokaryotes (bacteria and archaea) and large DNA viruses, such as mimivirus, vaccinia virus, and pandoravirus, in the way that they process their newly synthesized genetic material to make sure that only one copy of the complete genome is generated and is meticulously placed inside the newly synthesized viral particle. These findings have important evolutionary implications about the origin and evolution of large viruses. PMID:24623441
Identification and Classification of Conserved RNA Secondary Structures in the Human Genome

PubMed Central

Pedersen, Jakob Skou; Bejerano, Gill; Siepel, Adam; Rosenbloom, Kate; Lindblad-Toh, Kerstin; Lander, Eric S; Kent, Jim; Miller, Webb; Haussler, David

2006-01-01

The discoveries of microRNAs and riboswitches, among others, have shown functional RNAs to be biologically more important and genomically more prevalent than previously anticipated. We have developed a general comparative genomics method based on phylogenetic stochastic context-free grammars for identifying functional RNAs encoded in the human genome and used it to survey an eight-way genome-wide alignment of the human, chimpanzee, mouse, rat, dog, chicken, zebra-fish, and puffer-fish genomes for deeply conserved functional RNAs. At a loose threshold for acceptance, this search resulted in a set of 48,479 candidate RNA structures. This screen finds a large number of known functional RNAs, including 195 miRNAs, 62 histone 3′UTR stem loops, and various types of known genetic recoding elements. Among the highest-scoring new predictions are 169 new miRNA candidates, as well as new candidate selenocysteine insertion sites, RNA editing hairpins, RNAs involved in transcript auto regulation, and many folds that form singletons or small functional RNA families of completely unknown function. While the rate of false positives in the overall set is difficult to estimate and is likely to be substantial, the results nevertheless provide evidence for many new human functional RNAs and present specific predictions to facilitate their further characterization. PMID:16628248
Genome-wide Association Analysis of Blood-Pressure Traits in African-Ancestry Individuals Reveals Common Associated Genes in African and Non-African Populations

PubMed Central

Franceschini, Nora; Fox, Ervin; Zhang, Zhaogong; Edwards, Todd L.; Nalls, Michael A.; Sung, Yun Ju; Tayo, Bamidele O.; Sun, Yan V.; Gottesman, Omri; Adeyemo, Adebawole; Johnson, Andrew D.; Young, J. Hunter; Rice, Ken; Duan, Qing; Chen, Fang; Li, Yun; Tang, Hua; Fornage, Myriam; Keene, Keith L.; Andrews, Jeanette S.; Smith, Jennifer A.; Faul, Jessica D.; Guangfa, Zhang; Guo, Wei; Liu, Yu; Murray, Sarah S.; Musani, Solomon K.; Srinivasan, Sathanur; Velez Edwards, Digna R.; Wang, Heming; Becker, Lewis C.; Bovet, Pascal; Bochud, Murielle; Broeckel, Ulrich; Burnier, Michel; Carty, Cara; Chasman, Daniel I.; Ehret, Georg; Chen, Wei-Min; Chen, Guanjie; Chen, Wei; Ding, Jingzhong; Dreisbach, Albert W.; Evans, Michele K.; Guo, Xiuqing; Garcia, Melissa E.; Jensen, Rich; Keller, Margaux F.; Lettre, Guillaume; Lotay, Vaneet; Martin, Lisa W.; Moore, Jason H.; Morrison, Alanna C.; Mosley, Thomas H.; Ogunniyi, Adesola; Palmas, Walter; Papanicolaou, George; Penman, Alan; Polak, Joseph F.; Ridker, Paul M.; Salako, Babatunde; Singleton, Andrew B.; Shriner, Daniel; Taylor, Kent D.; Vasan, Ramachandran; Wiggins, Kerri; Williams, Scott M.; Yanek, Lisa R.; Zhao, Wei; Zonderman, Alan B.; Becker, Diane M.; Berenson, Gerald; Boerwinkle, Eric; Bottinger, Erwin; Cushman, Mary; Eaton, Charles; Nyberg, Fredrik; Heiss, Gerardo; Hirschhron, Joel N.; Howard, Virginia J.; Karczewsk, Konrad J.; Lanktree, Matthew B.; Liu, Kiang; Liu, Yongmei; Loos, Ruth; Margolis, Karen; Snyder, Michael; Go, Min Jin; Kim, Young Jin; Lee, Jong-Young; Jeon, Jae-Pil; Kim, Sung Soo; Han, Bok-Ghee; Cho, Yoon Shin; Sim, Xueling; Tay, Wan Ting; Ong, Rick Twee Hee; Seielstad, Mark; Liu, Jian Jun; Aung, Tin; Wong, Tien Yin; Teo, Yik Ying; Tai, E. Shyong; Chen, Chien-Hsiun; Chang, Li-ching; Chen, Yuan-Tsong; Wu, Jer-Yuarn; Kelly, Tanika N.; Gu, Dongfeng; Hixson, James E.; Sung, Yun Ju; He, Jiang; Tabara, Yasuharu; Kokubo, Yoshihiro; Miki, Tetsuro; Iwai, Naoharu; Kato, Norihiro; Takeuchi, Fumihiko; Katsuya, Tomohiro; Nabika, Toru; Sugiyama, Takao; Zhang, Yi; Huang, Wei; Zhang, Xuegong; Zhou, Xueya; Jin, Li; Zhu, Dingliang; Psaty, Bruce M.; Schork, Nicholas J.; Weir, David R.; Rotimi, Charles N.; Sale, Michele M.; Harris, Tamara; Kardia, Sharon L.R.; Hunt, Steven C.; Arnett, Donna; Redline, Susan; Cooper, Richard S.; Risch, Neil J.; Rao, D.C.; Rotter, Jerome I.; Chakravarti, Aravinda; Reiner, Alex P.; Levy, Daniel; Keating, Brendan J.; Zhu, Xiaofeng

2013-01-01

High blood pressure (BP) is more prevalent and contributes to more severe manifestations of cardiovascular disease (CVD) in African Americans than in any other United States ethnic group. Several small African-ancestry (AA) BP genome-wide association studies (GWASs) have been published, but their findings have failed to replicate to date. We report on a large AA BP GWAS meta-analysis that includes 29,378 individuals from 19 discovery cohorts and subsequent replication in additional samples of AA (n = 10,386), European ancestry (EA) (n = 69,395), and East Asian ancestry (n = 19,601). Five loci (EVX1-HOXA, ULK4, RSPO3, PLEKHG1, and SOX6) reached genome-wide significance (p < 1.0 × 10−8) for either systolic or diastolic BP in a transethnic meta-analysis after correction for multiple testing. Three of these BP loci (EVX1-HOXA, RSPO3, and PLEKHG1) lack previous associations with BP. We also identified one independent signal in a known BP locus (SOX6) and provide evidence for fine mapping in four additional validated BP loci. We also demonstrate that validated EA BP GWAS loci, considered jointly, show significant effects in AA samples. Consequently, these findings suggest that BP loci might have universal effects across studied populations, demonstrating that multiethnic samples are an essential component in identifying, fine mapping, and understanding their trait variability. PMID:23972371
A Novel QTL for Powdery Mildew Resistance in Nordic Spring Barley (Hordeum vulgare L. ssp. vulgare) Revealed by Genome-Wide Association Study

PubMed Central

Bengtsson, Therése; Åhman, Inger; Manninen, Outi; Reitan, Lars; Christerson, Therese; Due Jensen, Jens; Krusell, Lene; Jahoor, Ahmed; Orabi, Jihad

2017-01-01

The powdery mildew fungus, Blumeria graminis f. sp. hordei is a worldwide threat to barley (Hordeum vulgare L. ssp. vulgare) production. One way to control the disease is by the development and deployment of resistant cultivars. A genome-wide association study was performed in a Nordic spring barley panel consisting of 169 genotypes, to identify marker-trait associations significant for powdery mildew. Powdery mildew was scored during three years (2012–2014) in four different locations within the Nordic region. There were strong correlations between data from all locations and years. In total four QTLs were identified, one located on chromosome 4H in the same region as the previously identified mlo locus and three on chromosome 6H. Out of these three QTLs identified on chromosome 6H, two are in the same region as previously reported QTLs for powdery mildew resistance, whereas one QTL appears to be novel. The top NCBI BLASTn hit of the SNP markers within the novel QTL predicted the responsible gene to be the 26S proteasome regulatory subunit, RPN1, which is required for innate immunity and powdery mildew-induced cell death in Arabidopsis. The results from this study have revealed SNP marker candidates that can be exploited for use in marker-assisted selection and stacking of genes for powdery mildew resistance in barley. PMID:29184565
A Novel QTL for Powdery Mildew Resistance in Nordic Spring Barley (Hordeum vulgare L. ssp. vulgare) Revealed by Genome-Wide Association Study.

PubMed

Bengtsson, Therése; Åhman, Inger; Manninen, Outi; Reitan, Lars; Christerson, Therese; Due Jensen, Jens; Krusell, Lene; Jahoor, Ahmed; Orabi, Jihad

2017-01-01

The powdery mildew fungus, Blumeria graminis f. sp. hordei is a worldwide threat to barley (Hordeum vulgare L. ssp. vulgare) production. One way to control the disease is by the development and deployment of resistant cultivars. A genome-wide association study was performed in a Nordic spring barley panel consisting of 169 genotypes, to identify marker-trait associations significant for powdery mildew. Powdery mildew was scored during three years (2012-2014) in four different locations within the Nordic region. There were strong correlations between data from all locations and years. In total four QTLs were identified, one located on chromosome 4H in the same region as the previously identified mlo locus and three on chromosome 6H. Out of these three QTLs identified on chromosome 6H, two are in the same region as previously reported QTLs for powdery mildew resistance, whereas one QTL appears to be novel. The top NCBI BLASTn hit of the SNP markers within the novel QTL predicted the responsible gene to be the 26S proteasome regulatory subunit, RPN1, which is required for innate immunity and powdery mildew-induced cell death in Arabidopsis. The results from this study have revealed SNP marker candidates that can be exploited for use in marker-assisted selection and stacking of genes for powdery mildew resistance in barley.
Whole-genome resequencing reveals world-wide ancestry and adaptive introgression events of domesticated cattle in East Asia.

PubMed

Chen, Ningbo; Cai, Yudong; Chen, Qiuming; Li, Ran; Wang, Kun; Huang, Yongzhen; Hu, Songmei; Huang, Shisheng; Zhang, Hucai; Zheng, Zhuqing; Song, Weining; Ma, Zhijie; Ma, Yun; Dang, Ruihua; Zhang, Zijing; Xu, Lei; Jia, Yutang; Liu, Shanzhai; Yue, Xiangpeng; Deng, Weidong; Zhang, Xiaoming; Sun, Zhouyong; Lan, Xianyong; Han, Jianlin; Chen, Hong; Bradley, Daniel G; Jiang, Yu; Lei, Chuzhao

2018-06-14

Cattle domestication and the complex histories of East Asian cattle breeds warrant further investigation. Through analysing the genomes of 49 modern breeds and eight East Asian ancient samples, worldwide cattle are consistently classified into five continental groups based on Y-chromosome haplotypes and autosomal variants. We find that East Asian cattle populations are mainly composed of three distinct ancestries, including an earlier East Asian taurine ancestry that reached China at least ~3.9 kya, a later introduced Eurasian taurine ancestry, and a novel Chinese indicine ancestry that diverged from Indian indicine approximately 36.6-49.6 kya. We also report historic introgression events that helped domestic cattle from southern China and the Tibetan Plateau achieve rapid adaptation by acquiring ~2.93% and ~1.22% of their genomes from banteng and yak, respectively. Our findings provide new insights into the evolutionary history of cattle and the importance of introgression in adaptation of cattle to new environmental challenges in East Asia.
Neandertal admixture in Eurasia confirmed by maximum-likelihood analysis of three genomes.

PubMed

Lohse, Konrad; Frantz, Laurent A F

2014-04-01

Although there has been much interest in estimating histories of divergence and admixture from genomic data, it has proved difficult to distinguish recent admixture from long-term structure in the ancestral population. Thus, recent genome-wide analyses based on summary statistics have sparked controversy about the possibility of interbreeding between Neandertals and modern humans in Eurasia. Here we derive the probability of full mutational configurations in nonrecombining sequence blocks under both admixture and ancestral structure scenarios. Dividing the genome into short blocks gives an efficient way to compute maximum-likelihood estimates of parameters. We apply this likelihood scheme to triplets of human and Neandertal genomes and compare the relative support for a model of admixture from Neandertals into Eurasian populations after their expansion out of Africa against a history of persistent structure in their common ancestral population in Africa. Our analysis allows us to conclusively reject a model of ancestral structure in Africa and instead reveals strong support for Neandertal admixture in Eurasia at a higher rate (3.4-7.3%) than suggested previously. Using analysis and simulations we show that our inference is more powerful than previous summary statistics and robust to realistic levels of recombination.

Neandertal Admixture in Eurasia Confirmed by Maximum-Likelihood Analysis of Three Genomes

PubMed Central

Lohse, Konrad; Frantz, Laurent A. F.

2014-01-01

Although there has been much interest in estimating histories of divergence and admixture from genomic data, it has proved difficult to distinguish recent admixture from long-term structure in the ancestral population. Thus, recent genome-wide analyses based on summary statistics have sparked controversy about the possibility of interbreeding between Neandertals and modern humans in Eurasia. Here we derive the probability of full mutational configurations in nonrecombining sequence blocks under both admixture and ancestral structure scenarios. Dividing the genome into short blocks gives an efficient way to compute maximum-likelihood estimates of parameters. We apply this likelihood scheme to triplets of human and Neandertal genomes and compare the relative support for a model of admixture from Neandertals into Eurasian populations after their expansion out of Africa against a history of persistent structure in their common ancestral population in Africa. Our analysis allows us to conclusively reject a model of ancestral structure in Africa and instead reveals strong support for Neandertal admixture in Eurasia at a higher rate (3.4−7.3%) than suggested previously. Using analysis and simulations we show that our inference is more powerful than previous summary statistics and robust to realistic levels of recombination. PMID:24532731
A Preliminary Genome-Wide Association Study of Pain-Related Fear: Implications for Orofacial Pain.

PubMed

Randall, Cameron L; Wright, Casey D; Chernus, Jonathan M; McNeil, Daniel W; Feingold, Eleanor; Crout, Richard J; Neiswanger, Katherine; Weyant, Robert J; Shaffer, John R; Marazita, Mary L

2017-01-01

Acute and chronic orofacial pain can significantly impact overall health and functioning. Associations between fear of pain and the experience of orofacial pain are well-documented, and environmental, behavioral, and cognitive components of fear of pain have been elucidated. Little is known, however, regarding the specific genes contributing to fear of pain. A genome-wide association study (GWAS; N = 990) was performed to identify plausible genes that may predispose individuals to various levels of fear of pain. The total score and three subscales (fear of minor, severe, and medical/dental pain) of the Fear of Pain Questionnaire-9 (FPQ-9) were modeled in a variance components modeling framework to test for genetic association with 8.5 M genetic variants across the genome, while adjusting for sex, age, education, and income. Three genetic loci were significantly associated with fear of minor pain (8q24.13, 8p21.2, and 6q26; p < 5 × 10−8 for all) near the genes TMEM65, NEFM, NEFL, AGPAT4, and PARK2. Other suggestive loci were found for the fear of pain total score and each of the FPQ-9 subscales. Multiple genes were identified as possible candidates contributing to fear of pain. The findings may have implications for understanding and treating chronic orofacial pain.
Plume Response to Source Remediation: Case Study of Active Bioremediation

EPA Science Inventory

The three-dimensional distribution of hydraulic conductivity has a profound influence on the prospects for cleaning up contaminated ground water. When there are wide variations in texture (and associated hydraulic conductivity) organic contaminants can find their way into the lo...
Genome-wide detection of selection signatures in Chinese indigenous Laiwu pigs revealed candidate genes regulating fat deposition in muscle.

PubMed

Chen, Minhui; Wang, Jiying; Wang, Yanping; Wu, Ying; Fu, Jinluan; Liu, Jian-Feng

2018-05-18

Currently, genome-wide scans for positive selection signatures in commercial breeds have been investigated. However, few studies have focused on selection footprints of indigenous breeds. Laiwu pig is an invaluable Chinese indigenous pig breed with an extremely high proportion of intramuscular fat (IMF), and an excellent model to detect footprints resulting from natural and artificial selection for fat deposition in muscle. In this study, based on GeneSeek Genomic profiler Porcine HD data, three complementary methods, FST, iHS (integrated haplotype homozygosity score) and CLR (composite likelihood ratio), were implemented to detect selection signatures in the whole genome of Laiwu pigs. In total, 175 candidate selected regions were detected by at least two of the three methods, covering 43.75 Mb and corresponding to 1.79% of the genome sequence. Gene annotation of the selected regions revealed a list of functionally important genes for feed intake and fat deposition, reproduction, and immune response. In particular, in accordance with the phenotypic features of Laiwu pigs, among the candidate genes, we identified several genes, NPY1R, NPY5R, PIK3R1 and JAKMIP1, involved in the actions of two sets of neurons, which are central regulators in maintaining the balance between food intake and energy expenditure. Our results identified a number of regions showing signatures of selection, as well as a list of functional candidate genes with potential effects on phenotypic traits, especially fat deposition in muscle. Our findings provide insights into the mechanisms of artificial selection of fat deposition and further facilitate follow-up functional studies.
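The "at least two of the three methods" rule in the record above can be illustrated with a minimal Python sketch. It assumes each method has already been reduced to a list of flagged window identifiers; the function name `consensus_windows` and the window IDs are hypothetical and are not data or code from the study.

```python
from collections import Counter


def consensus_windows(calls_by_method, min_support=2):
    """Return window IDs flagged by at least `min_support` methods."""
    counts = Counter()
    for windows in calls_by_method.values():
        # set() ensures a method contributes at most one vote per window
        counts.update(set(windows))
    return sorted(w for w, n in counts.items() if n >= min_support)


# Hypothetical window IDs, for illustration only
calls = {
    "FST": ["chr1:10", "chr2:5", "chr7:3"],
    "iHS": ["chr1:10", "chr7:3", "chr9:1"],
    "CLR": ["chr2:5", "chr7:3"],
}
print(consensus_windows(calls))  # ['chr1:10', 'chr2:5', 'chr7:3']
```

A real pipeline would more likely merge overlapping genomic intervals than compare exact window labels; exact IDs are used here only to keep the voting step visible.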
SQC: secure quality control for meta-analysis of genome-wide association studies.

PubMed

Huang, Zhicong; Lin, Huang; Fellay, Jacques; Kutalik, Zoltán; Hubaux, Jean-Pierre

2017-08-01

Due to the limited power of small-scale genome-wide association studies (GWAS), researchers tend to collaborate and establish a larger consortium in order to perform large-scale GWAS. Genome-wide association meta-analysis (GWAMA) is a statistical tool that aims to synthesize results from multiple independent studies to increase the statistical power and reduce false-positive findings of GWAS. However, it has been demonstrated that the aggregate data of individual studies are subject to inference attacks, hence privacy concerns arise when researchers share study data in GWAMA. In this article, we propose a secure quality control (SQC) protocol, which enables checking the quality of data in a privacy-preserving way without revealing sensitive information to a potential adversary. SQC employs state-of-the-art cryptographic and statistical techniques for privacy protection. We implement the solution in a meta-analysis pipeline with real data to demonstrate the efficiency and scalability on commodity machines. The distributed execution of SQC on a cluster of 128 cores for one million genetic variants takes less than one hour, which is a modest cost considering the 10-month time span usually observed for the completion of the QC procedure that includes timing of logistics. SQC is implemented in Java and is publicly available at https://github.com/acs6610987/secureqc. Contact: jean-pierre.hubaux@epfl.ch. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Genome-wide association study of preeclampsia detects novel maternal single nucleotide polymorphisms and copy-number variants in subsets of the Hyperglycemia and Adverse Pregnancy Outcome (HAPO) study cohort

PubMed Central

Zhao, Linlu; Bracken, Michael B.; DeWan, Andrew T.

2013-01-01

Summary A genome-wide association study was undertaken to identify maternal single nucleotide polymorphisms (SNPs) and copy-number variants (CNVs) associated with preeclampsia. Case-control analysis was performed on 1070 Afro-Caribbean (n=21 cases and 1049 controls) and 723 Hispanic (n=62 cases and 661 controls) mothers and 1257 mothers of European ancestry (n=50 cases and 1207 controls) from the Hyperglycemia and Adverse Pregnancy Outcome (HAPO) study. European ancestry subjects were genotyped on Illumina Human610-Quad and Afro-Caribbean and Hispanic subjects were genotyped on Illumina Human1M-Duo BeadChip microarrays. Genome-wide SNP data were analyzed using PLINK. CNVs were called using three detection algorithms (GNOSIS, PennCNV, and QuantiSNP), merged using CNVision, and then screened using stringent criteria. SNP and CNV findings were compared to those of the Study of Pregnancy Hypertension in Iowa (SOPHIA), an independent preeclampsia case-control dataset of Caucasian mothers (n=177 cases and 116 controls). A list of top SNPs was identified for each of the HAPO ethnic groups, but none reached Bonferroni-corrected significance. Novel candidate CNVs showing enrichment among preeclampsia cases were also identified in each of the three ethnic groups. Several variants were suggestively replicated in SOPHIA. The discovered SNPs and copy-number variable regions present interesting candidate genetic variants for preeclampsia that warrant further replication and investigation. PMID:23551011
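For readers unfamiliar with the phrase "Bonferroni-corrected significance" used in the record above, the following minimal Python sketch shows the general idea: the family-wise error rate alpha is divided by the number of tests. The function names, the alpha of 0.05, and the example p-values are illustrative assumptions; the abstract does not state the threshold or number of tests the study used.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def passes_bonferroni(p_values, alpha=0.05):
    """Flag each p-value as significant (True) or not after correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Hypothetical p-values, for illustration only
example = [1e-9, 3e-3, 0.2, 0.01]
print(passes_bonferroni(example))  # cutoff is 0.05 / 4 = 0.0125
```

With one million tests and alpha = 0.05 the cutoff is about 5 × 10−8, which matches the conventional genome-wide significance level quoted in several other records on this page.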
Genome-wide comparison and taxonomic relatedness of multiple Xylella fastidiosa strains reveal the occurrence of three subspecies and a new Xylella species.

PubMed

Marcelletti, Simone; Scortichini, Marco

2016-10-01

A total of 21 Xylella fastidiosa strains were assessed by comparing their genomes to infer their taxonomic relationships. The whole-genome-based average nucleotide identity and tetranucleotide frequency correlation coefficient analyses were performed. In addition, a consensus tree based on comparisons of 956 core gene families, and a genome-wide phylogenetic tree and a Neighbor-net network were constructed with 820,088 nucleotides (i.e., approximately 30-33% of the entire X. fastidiosa genome). All approaches revealed the occurrence of three well-demarcated genetic clusters that represent X. fastidiosa subspecies fastidiosa, multiplex and pauca, with the latter appearing to diverge. We suggest that the proposed but never formally described subspecies 'sandyi' and 'morus' are instead members of the subspecies fastidiosa. These analyses support the view that the Xylella strain isolated from Pyrus pyrifolia in Taiwan is likely to be a new species. A widely used multilocus sequence typing analysis yielded conflicting results.
Human evolutionary genomics: ethical and interpretive issues.

PubMed

Vitti, Joseph J; Cho, Mildred K; Tishkoff, Sarah A; Sabeti, Pardis C

2012-03-01

Genome-wide computational studies can now identify targets of natural selection. The unique information about humans these studies reveal, and the media attention they attract, indicate the need for caution and precision in communicating results. This need is exacerbated by ways in which evolutionary and genetic considerations have been misapplied to support discriminatory policies, by persistent misconceptions of these fields and by the social sensitivity surrounding discussions of racial ancestry. We discuss the foundations, accomplishments and future directions of human evolutionary genomics, attending to ways in which the interpretation of good science can go awry, and offer suggestions for researchers to prevent misapplication of their work. Copyright © 2011 Elsevier Ltd. All rights reserved.
Protocol matters: which methylome are you actually studying?

PubMed Central

Robinson, Mark D; Statham, Aaron L; Speed, Terence P; Clark, Susan J

2011-01-01

The field of epigenetics is now capitalizing on the vast number of emerging technologies, largely based on second-generation sequencing, which interrogate DNA methylation status and histone modifications genome-wide. However, getting an exhaustive and unbiased view of a methylome at a reasonable cost is proving to be a significant challenge. In this article, we take a closer look at the impact of the DNA sequence and bias effects introduced to datasets by genome-wide DNA methylation technologies and, where possible, explore the bioinformatics tools that deconvolve them. There remains much to be learned about the performance of genome-wide technologies, the data we mine from these assays and how they reflect the actual biology. While there are several methods to interrogate the DNA methylation status genome-wide, our opinion is that no single technique suitably covers the minimum criteria of high coverage and high resolution at a reasonable cost. In fact, the fraction of the methylome that is studied currently depends entirely on the inherent biases of the protocol employed. There is promise for this to change, as the third generation of sequencing technologies is expected to again ‘revolutionize’ the way that we study genomes and epigenomes. PMID:21566704
Genome-wide association analysis accounting for environmental factors through propensity-score matching: application to stressful live events in major depressive disorder.

PubMed

Power, Robert A; Cohen-Woods, Sarah; Ng, Mandy Y; Butler, Amy W; Craddock, Nick; Korszun, Ania; Jones, Lisa; Jones, Ian; Gill, Michael; Rice, John P; Maier, Wolfgang; Zobel, Astrid; Mors, Ole; Placentino, Anna; Rietschel, Marcella; Aitchison, Katherine J; Tozzi, Federica; Muglia, Pierandrea; Breen, Gerome; Farmer, Anne E; McGuffin, Peter; Lewis, Cathryn M; Uher, Rudolf

2013-09-01

Stressful life events are an established trigger for depression and may contribute to the heterogeneity within genome-wide association analyses. With depression cases showing an excess of exposure to stressful events compared to controls, there is difficulty in distinguishing between "true" cases and a "normal" response to a stressful environment. This potential contamination of cases, and that from genetically at risk controls that have not yet experienced environmental triggers for onset, may reduce the power of studies to detect causal variants. In the RADIANT sample of 3,690 European individuals, we used propensity score matching to pair cases and controls on exposure to stressful life events. In 805 case-control pairs matched on stressful life event, we tested the influence of 457,670 common genetic variants on the propensity to depression under comparable level of adversity with a sign test. While this analysis produced no significant findings after genome-wide correction for multiple testing, we outline a novel methodology and perspective for providing environmental context in genetic studies. We recommend contextualizing depression by incorporating environmental exposure into genome-wide analyses as a complementary approach to testing gene-environment interactions. Possible explanations for negative findings include a lack of statistical power due to small sample size and conditional effects, resulting from the low rate of adequate matching. Our findings underscore the importance of collecting information on environmental risk factors in studies of depression and other complex phenotypes, so that sufficient sample sizes are available to investigate their effect in genome-wide association analysis. Copyright © 2013 Wiley Periodicals, Inc.
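The sign test on matched pairs mentioned in the record above can be sketched as follows. This is a generic two-sided exact sign test, not the authors' implementation: the abstract does not say how ties were handled or how genotypes were coded, so dropping ties and comparing per-pair allele counts are assumptions, and `sign_test` and the example values are hypothetical.

```python
from math import comb


def sign_test(case_values, control_values):
    """Two-sided exact sign test for matched pairs; tied pairs are dropped."""
    n_plus = sum(c > k for c, k in zip(case_values, control_values))
    n_minus = sum(c < k for c, k in zip(case_values, control_values))
    n = n_plus + n_minus
    if n == 0:
        return 1.0  # every pair tied: no evidence either way
    k = min(n_plus, n_minus)
    # Probability of k or fewer successes under Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical allele counts (0, 1 or 2) at one variant for 8 matched pairs
cases = [2, 1, 2, 2, 1, 2, 1, 0]
controls = [1, 0, 1, 1, 0, 1, 1, 1]
print(sign_test(cases, controls))  # 6 pairs up, 1 down, 1 tie -> 0.125
```

In a genome-wide setting this test would be repeated for each variant and the resulting p-values corrected for multiple testing.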
Genome-Wide Association Study of Seed Dormancy and the Genomic Consequences of Improvement Footprints in Rice (Oryza sativa L.)

PubMed Central

Lu, Qing; Niu, Xiaojun; Zhang, Mengchen; Wang, Caihong; Xu, Qun; Feng, Yue; Yang, Yaolong; Wang, Shan; Yuan, Xiaoping; Yu, Hanyong; Wang, Yiping; Chen, Xiaoping; Liang, Xuanqiang; Wei, Xinghua

2018-01-01

Seed dormancy is an important agronomic trait affecting grain yield and quality because of pre-harvest germination and is influenced by both environmental and genetic factors. However, our knowledge of the factors controlling seed dormancy remains limited. To better reveal the molecular mechanism underlying this trait, a genome-wide association study was conducted in an indica-only population consisting of 453 accessions genotyped using 5,291 SNPs. Nine known and new significant SNPs were identified on eight chromosomes. These lead SNPs explained 34.9% of the phenotypic variation, and four of them were designed as dCAPS markers in the hope of accelerating molecular breeding. Moreover, a total of 212 candidate genes was predicted and eight candidate genes showed plant tissue-specific expression in expression profile data from different public bioinformatics databases. In particular, LOC_Os03g10110, which had a maize homolog involved in embryo development, was identified as a candidate regulator for further biological function investigations. Additionally, a polymorphism information content ratio method was used to screen improvement footprints and 27 selective sweeps were identified, most of which harbored domestication-related genes. Further studies suggested that three significant SNPs were adjacent to the candidate selection signals, supporting the accuracy of our genome-wide association study (GWAS) results. These findings show that genome-wide screening for selective sweeps can be used to identify new improvement-related DNA regions, although the phenotypes are unknown. This study enhances our knowledge of the genetic variation in seed dormancy, and the new dormancy-associated SNPs will provide real benefits in molecular breeding. PMID:29354150
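As a rough illustration of the polymorphism information content (PIC) statistic named in the abstract above, the following Python sketch computes PIC from allele frequencies using the standard Botstein et al. formula. The `pic_ratio` helper and the direction of the ratio (reference group over improved group) are assumptions made for illustration; the abstract does not specify how the authors formed the ratio.

```python
from itertools import combinations
from typing import Sequence


def pic(freqs: Sequence[float]) -> float:
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 for one locus."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


def pic_ratio(reference: Sequence[float], improved: Sequence[float]) -> float:
    """Hypothetical helper: PIC in a reference group over PIC in an improved group.

    A large ratio marks a locus that lost diversity in the improved group.
    """
    improved_pic = pic(improved)
    if improved_pic == 0.0:
        return float("inf")  # locus is fixed in the improved group
    return pic(reference) / improved_pic


# A biallelic SNP at 0.5/0.5 in the reference group and 0.9/0.1 after improvement
print(pic([0.5, 0.5]), pic([0.9, 0.1]), pic_ratio([0.5, 0.5], [0.9, 0.1]))
```

Scanning such a ratio along the genome and flagging windows with extreme values is one plausible way to locate candidate sweeps, though window size and cut-offs would have to come from the original study.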
Genome-wide association and linkage identify modifier loci of lung disease severity in cystic fibrosis at 11p13 and 20q13.2

PubMed Central

Wright, Fred A.; Strug, Lisa J.; Doshi, Vishal K.; Commander, Clayton W.; Blackman, Scott M.; Sun, Lei; Berthiaume, Yves; Cutler, David; Cojocaru, Andreea; Collaco, J. Michael; Corey, Mary; Dorfman, Ruslan; Goddard, Katrina; Green, Deanna; Kent, Jack W.; Lange, Ethan M.; Lee, Seunggeun; Li, Weili; Luo, Jingchun; Mayhew, Gregory M.; Naughton, Kathleen M.; Pace, Rhonda G.; Paré, Peter; Rommens, Johanna M.; Sandford, Andrew; Stonebraker, Jaclyn R.; Sun, Wei; Taylor, Chelsea; Vanscoy, Lori L.; Zou, Fei; Blangero, John; Zielenski, Julian; O’Neal, Wanda K.; Drumm, Mitchell L.; Durie, Peter R.; Knowles, Michael R.; Cutting, Garry R.

2012-01-01

A combined genome-wide association and linkage study was used to identify loci causing variation in CF lung disease severity. A significant association (P=3.34 × 10−8) near EHF and APIP (chr11p13) was identified in F508del homozygotes (n=1,978). The association replicated in F508del homozygotes (P=0.006) from a separate family-based study (n=557), with P=1.49 × 10−9 for the three-study joint meta-analysis. Linkage analysis of 486 sibling pairs from the family-based study identified a significant QTL on chromosome 20q13.2 (LOD=5.03). Our findings provide insight into the causes of variation in lung disease severity in CF and suggest new therapeutic targets for this life-limiting disorder. PMID:21602797
Reverse Engineering of Genome-wide Gene Regulatory Networks from Gene Expression Data

PubMed Central

Liu, Zhi-Ping

2015-01-01

Transcriptional regulation plays vital roles in many fundamental biological processes. Reverse engineering of genome-wide regulatory networks from high-throughput transcriptomic data provides a promising way to characterize the global scenario of regulatory relationships between regulators and their targets. In this review, we summarize and categorize the main frameworks and methods currently available for inferring transcriptional regulatory networks from microarray gene expression profiling data. We give an overview of each strategy and introduce representative methods. Their assumptions, advantages, shortcomings, and possible improvements and extensions are also clarified and discussed. PMID:25937810
Recent human evolution has shaped geographical differences in susceptibility to disease

PubMed Central

2011-01-01

Background Searching for associations between genetic variants and complex diseases has been a very active area of research for over two decades. More than 51,000 potential associations have been studied and published, a figure that keeps increasing, especially with the recent explosion of array-based Genome-Wide Association Studies. Even if the number of true associations described so far is high, many of the putative risk variants detected so far have failed to be consistently replicated and are widely considered false positives. Here, we focus on the world-wide patterns of replicability of published association studies. Results We report three main findings. First, contrary to previous results, genes associated to complex diseases present lower degrees of genetic differentiation among human populations than average genome-wide levels. Second, also contrary to previous results, the differences in replicability of disease associated-loci between Europeans and East Asians are highly correlated with genetic differentiation between these populations. Finally, highly replicated genes present increased levels of high-frequency derived alleles in European and Asian populations when compared to African populations. Conclusions Our findings highlight the heterogeneous nature of the genetic etiology of complex disease, confirm the importance of the recent evolutionary history of our species in current patterns of disease susceptibility and could cast doubts on the status as false positives of some associations that have failed to replicate across populations. PMID:21261943
Copy number variation of individual cattle genomes using next-generation sequencing

USDA-ARS's Scientific Manuscript database

Copy number variations (CNVs) affect a wide range of phenotypic traits; however, CNVs in or near segmental duplication regions are often intractable. Using a read depth approach based on next-generation sequencing, we examined genome-wide copy number differences among five taurine (three Angus, one ...
Copy number variation of individual cattle genomes using next-generation sequencing

USDA-ARS's Scientific Manuscript database

Copy Number Variations (CNVs) affect a wide range of phenotypic traits; however, CNVs in or near segmental duplication regions are often difficult to track. Using a read depth approach based on next generation sequencing, we examined genome-wide copy number differences among five taurine (three Angu...
Investigation of common, low-frequency and rare genome-wide variation in anorexia nervosa

PubMed Central

Huckins, L M; Hatzikotoulas, K; Southam, L; Thornton, L M; Steinberg, J; Aguilera-McKay, F; Treasure, J; Schmidt, U; Gunasinghe, C; Romero, A; Curtis, C; Rhodes, D; Moens, J; Kalsi, G; Dempster, D; Leung, R; Keohane, A; Burghardt, R; Ehrlich, S; Hebebrand, J; Hinney, A; Ludolph, A; Walton, E; Deloukas, P; Hofman, A; Palotie, A; Palta, P; van Rooij, F J A; Stirrups, K; Adan, R; Boni, C; Cone, R; Dedoussis, G; van Furth, E; Gonidakis, F; Gorwood, P; Hudson, J; Kaprio, J; Kas, M; Keski-Rahonen, A; Kiezebrink, K; Knudsen, G-P; Slof-Op 't Landt, M C T; Maj, M; Monteleone, A M; Monteleone, P; Raevuori, A H; Reichborn-Kjennerud, T; Tozzi, F; Tsitsika, A; van Elburg, A; Adan, R A H; Alfredsson, L; Ando, T; Andreassen, O A; Aschauer, H; Baker, J H; Barrett, J C; Bencko, V; Bergen, A W; Berrettini, W H; Birgegard, A; Boni, C; Boraska Perica, V; Brandt, H; Breen, G; Bulik, C M; Carlberg, L; Cassina, M; Cichon, S; Clementi, M; Cohen-Woods, S; Coleman, J; Cone, R D; Courtet, P; Crawford, S; Crow, S; Crowley, J; Danner, U N; Davis, O S P; de Zwaan, M; Dedoussis, G; Degortes, D; DeSocio, J E; Dick, D M; Dikeos, D; Dina, C; Ding, B; Dmitrzak-Weglarz, M; Docampo, E; Duncan, L; Egberts, K; Ehrlich, S; Escaramís, G; Esko, T; Espeseth, T; Estivill, X; Favaro, A; Fernández-Aranda, F; Fichter, M M; Finan, C; Fischer, K; Floyd, J A B; Foretova, L; Forzan, M; Franklin, C S; Gallinger, S; Gambaro, G; Gaspar, H A; Giegling, I; Gonidakis, F; Gorwood, P; Gratacos, M; Guillaume, S; Guo, Y; Hakonarson, H; Halmi, K A; Hatzikotoulas, K; Hauser, J; Hebebrand, J; Helder, S; Herms, S; Herpertz-Dahlmann, B; Herzog, W; Hilliard, C E; Hinney, A; Hübel, C; Huckins, L M; Hudson, J I; Huemer, J; Inoko, H; Janout, V; Jiménez-Murcia, S; Johnson, C; Julià, A; Juréus, A; Kalsi, G; Kaminska, D; Kaplan, A S; Kaprio, J; Karhunen, L; Karwautz, A; Kas, M J H; Kaye, W; Kennedy, J L; Keski-Rahkonen, A; Kiezebrink, K; Klareskog, L; Klump, K L; Knudsen, G P S; Koeleman, B P C; Koubek, D; La Via, M C; Landén, M; Le 
Hellard, S; Levitan, R D; Li, D; Lichtenstein, P; Lilenfeld, L; Lissowska, J; Lundervold, A; Magistretti, P; Maj, M; Mannik, K; Marsal, S; Martin, N; Mattingsdal, M; McDevitt, S; McGuffin, P; Merl, E; Metspalu, A; Meulenbelt, I; Micali, N; Mitchell, J; Mitchell, K; Monteleone, P; Monteleone, A M; Mortensen, P; Munn-Chernoff, M A; Navratilova, M; Nilsson, I; Norring, C; Ntalla, I; Ophoff, R A; O'Toole, J K; Palotie, A; Pante, J; Papezova, H; Pinto, D; Rabionet, R; Raevuori, A; Rajewski, A; Ramoz, N; Rayner, N W; Reichborn-Kjennerud, T; Ripatti, S; Roberts, M; Rotondo, A; Rujescu, D; Rybakowski, F; Santonastaso, P; Scherag, A; Scherer, S W; Schmidt, U; Schork, N J; Schosser, A; Slachtova, L; Sladek, R; Slagboom, P E; Slof-Op 't Landt, M C T; Slopien, A; Soranzo, N; Southam, L; Steen, V M; Strengman, E; Strober, M; Sullivan, P F; Szatkiewicz, J P; Szeszenia-Dabrowska, N; Tachmazidou, I; Tenconi, E; Thornton, L M; Tortorella, A; Tozzi, F; Treasure, J; Tsitsika, A; Tziouvas, K; van Elburg, A A; van Furth, E F; Wagner, G; Walton, E; Watson, H; Wichmann, H-E; Widen, E; Woodside, D B; Yanovski, J; Yao, S; Yilmaz, Z; Zeggini, E; Zerwas, S; Zipfel, S; Collier, D A; Sullivan, P F; Breen, G; Bulik, C M; Zeggini, E

2018-01-01

Anorexia nervosa (AN) is a complex neuropsychiatric disorder presenting with dangerously low body weight, and a deep and persistent fear of gaining weight. To date, only one genome-wide significant locus associated with AN has been identified. We performed an exome-chip based genome-wide association study (GWAS) in 2158 cases from nine populations of European origin and 15 485 ancestrally matched controls. Unlike previous studies, this GWAS also probed association in low-frequency and rare variants. Sixteen independent variants were taken forward for in silico and de novo replication (11 common and 5 rare). No findings reached genome-wide significance. Two notable common variants were identified: rs10791286, an intronic variant in OPCML (P=9.89 × 10−6), and rs7700147, an intergenic variant (P=2.93 × 10−5). No low-frequency variant associations were identified at genome-wide significance, although the study was well-powered to detect low-frequency variants with large effect sizes, suggesting that there may be no AN loci in this genomic search space with large effect sizes. PMID:29155802
Investigation of common, low-frequency and rare genome-wide variation in anorexia nervosa.

PubMed

Huckins, L M; Hatzikotoulas, K; Southam, L; Thornton, L M; Steinberg, J; Aguilera-McKay, F; Treasure, J; Schmidt, U; Gunasinghe, C; Romero, A; Curtis, C; Rhodes, D; Moens, J; Kalsi, G; Dempster, D; Leung, R; Keohane, A; Burghardt, R; Ehrlich, S; Hebebrand, J; Hinney, A; Ludolph, A; Walton, E; Deloukas, P; Hofman, A; Palotie, A; Palta, P; van Rooij, F J A; Stirrups, K; Adan, R; Boni, C; Cone, R; Dedoussis, G; van Furth, E; Gonidakis, F; Gorwood, P; Hudson, J; Kaprio, J; Kas, M; Keski-Rahonen, A; Kiezebrink, K; Knudsen, G-P; Slof-Op 't Landt, M C T; Maj, M; Monteleone, A M; Monteleone, P; Raevuori, A H; Reichborn-Kjennerud, T; Tozzi, F; Tsitsika, A; van Elburg, A; Collier, D A; Sullivan, P F; Breen, G; Bulik, C M; Zeggini, E

2018-05-01

Anorexia nervosa (AN) is a complex neuropsychiatric disorder presenting with dangerously low body weight, and a deep and persistent fear of gaining weight. To date, only one genome-wide significant locus associated with AN has been identified. We performed an exome-chip based genome-wide association study (GWAS) in 2158 cases from nine populations of European origin and 15 485 ancestrally matched controls. Unlike previous studies, this GWAS also probed association in low-frequency and rare variants. Sixteen independent variants were taken forward for in silico and de novo replication (11 common and 5 rare). No findings reached genome-wide significance. Two notable common variants were identified: rs10791286, an intronic variant in OPCML (P=9.89 × 10−6), and rs7700147, an intergenic variant (P=2.93 × 10−5). No low-frequency variant associations were identified at genome-wide significance, although the study was well-powered to detect low-frequency variants with large effect sizes, suggesting that there may be no AN loci in this genomic search space with large effect sizes.
Genetic modifiers of menopausal hormone replacement therapy and breast cancer risk: A genome-wide interaction study

PubMed Central

Rudolph, Anja; Hein, Rebecca; Lindström, Sara; Beckmann, Lars; Behrens, Sabine; Liu, Jianjun; Aschard, Hugues; Bolla, Manjeet K.; Wang, Jean; Truong, Thérèse; Cordina-Duverger, Emilie; Menegaux, Florence; Brüning, Thomas; Harth, Volker; Severi, Gianluca; Baglietto, Laura; Southey, Melissa; Chanock, Stephen J.; Lissowska, Jolanta; Figueroa, Jonine D.; Eriksson, Mikael; Humpreys, Keith; Darabi, Hatef; Olson, Janet E.; Stevens, Kristen N.; Vachon, Celine M.; Knight, Julia A.; Glendon, Gord; Mulligan, Anna Marie; Ashworth, Alan; Orr, Nicholas; Schoemaker, Minouk; Webb, Penny M.; Guénel, Pascal; Brauch, Hiltrud; Giles, Graham; García-Closas, Montserrat; Czene, Kamila; Chenevix-Trench, Georgia; Couch, Fergus J.; Andrulis, Irene L.; Swerdlow, Anthony; Hunter, David J.; Flesch-Janys, Dieter; Easton, Douglas F.; Hall, Per; Nevanlinna, Heli; Kraft, Peter; Chang-Claude, Jenny

2013-01-01

Women using menopausal hormone therapy (MHT) are at increased risk of developing breast cancer (BC). To detect genetic modifiers of the association between current use of MHT and BC risk, we conducted a meta-analysis of four genome-wide case-only studies followed by replication in eleven case-control studies. We used a case-only design to assess interactions between single nucleotide polymorphisms (SNPs) and current MHT use on risk of overall and lobular BC. The discovery stage included 2,920 cases (541 lobular) from four genome-wide association studies. The top 1,391 SNPs showing P-values for interaction (Pint) <3.0×10−03 were selected for replication using pooled case-control data from eleven studies of the Breast Cancer Association Consortium, including 7,689 cases (676 lobular) and 9,266 controls. Fixed effects meta-analysis was used to derive combined Pint. No SNP reached genome-wide significance in either the discovery or combined stage. We observed effect modification of current MHT use on overall BC risk by two SNPs on chr13 near POMP (combined Pint≤8.9×10−06), two SNPs in SLC25A21 (combined Pint≤4.8×10−05), and three SNPs in PLCG2 (combined Pint≤4.5×10−05). The association between current MHT use and lobular BC risk was potentially modified by one SNP in TMEFF2 (combined Pint≤2.7×10−05), one SNP in CD80 (combined Pint≤8.2×10−06), three SNPs on chr17 near TMEM132E (combined Pint≤2.2×10−06), and two SNPs on chr18 near SLC25A52 (combined Pint≤4.6×10−05). In conclusion, polymorphisms in genes related to solute transportation in mitochondria, transmembrane signaling and immune cell activation are potentially modifying BC risk associated with current use of MHT. These findings warrant replication in independent studies. PMID:24080446
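The fixed-effects meta-analysis step named in the abstract above can be sketched in Python as an inverse-variance weighted combination of per-study estimates. This is a generic textbook version under a normal approximation; the function name, inputs, and example numbers are hypothetical and are not taken from the study.

```python
from math import erfc, sqrt
from typing import Sequence, Tuple


def fixed_effects_meta(
    betas: Sequence[float], ses: Sequence[float]
) -> Tuple[float, float, float]:
    """Inverse-variance fixed-effects meta-analysis.

    betas: per-study effect estimates (e.g. log odds ratios for interaction)
    ses:   per-study standard errors
    Returns (combined beta, combined standard error, two-sided p-value).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in ses]  # precise studies count more
    weight_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / weight_sum
    se = sqrt(1.0 / weight_sum)
    z = beta / se
    p = erfc(abs(z) / sqrt(2.0))  # two-sided p under a standard normal
    return beta, se, p


# Two hypothetical studies with equal precision
beta, se, p = fixed_effects_meta([0.1, 0.3], [0.1, 0.1])
print(beta, se, p)
```

Because the weights are fixed by each study's variance alone, this model assumes one shared true effect across studies; heterogeneous effects would call for a random-effects model instead.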
Significance of genome-wide association studies in molecular anthropology.

PubMed

Gupta, Vipin; Khadgawat, Rajesh; Sachdeva, Mohinder Pal

2009-12-01

The successful advent of a genome-wide approach in association studies raises the hopes of human geneticists for solving a genetic maze of complex traits especially the disorders. This approach, which is replete with the application of cutting-edge technology and supported by big science projects (like Human Genome Project; and even more importantly the International HapMap Project) and various important databases (SNP database, CNV database, etc.), has had unprecedented success in rapidly uncovering many of the genetic determinants of complex disorders. The magnitude of this approach in the genetics of classical anthropological variables like height, skin color, eye color, and other genome diversity projects has certainly expanded the horizons of molecular anthropology. Therefore, in this article we have proposed a genome-wide association approach in molecular anthropological studies by providing lessons from the exemplary study of the Wellcome Trust Case Control Consortium. We have also highlighted the importance and uniqueness of Indian population groups in facilitating the design and finding optimum solutions for other genome-wide association-related challenges.

Dissection of complex adult traits in a mouse synthetic population.

PubMed

Burke, David T; Kozloff, Kenneth M; Chen, Shu; West, Joshua L; Wilkowski, Jodi M; Goldstein, Steven A; Miller, Richard A; Galecki, Andrzej T

2012-08-01

Finding the causative genetic variations that underlie complex adult traits is a significant experimental challenge. The unbiased search strategy of genome-wide association (GWAS) has been used extensively in recent human population studies. These efforts, however, typically find only a minor fraction of the genetic loci that are predicted to affect variation. As an experimental model for the analysis of adult polygenic traits, we measured a mouse population for multiple phenotypes and conducted a genome-wide search for effector loci. Complex adult phenotypes, related to body size and bone structure, were measured as component phenotypes, and each subphenotype was associated with a genomic spectrum of candidate effector loci. The strategy successfully detected several loci for the phenotypes, at genome-wide significance, using a single, modest-sized population (N = 505). The effector loci each explain 2%-10% of the measured trait variation and, taken together, the loci can account for over 25% of a trait's total population variation. A replicate population (N = 378) was used to confirm initially observed loci for one trait (femur length), and, when the two groups were merged, the combined population demonstrated increased power to detect loci. In contrast to human population studies, our mouse genome-wide searches find loci that individually explain a larger fraction of the observed variation. Also, the additive effects of our detected mouse loci more closely match the predicted genetic component of variation. The genetic loci discovered are logical candidates for components of the genetic networks having evolutionary conservation with human biology.
Identification of genetic elements in metabolism by high-throughput mouse phenotyping.

PubMed

Rozman, Jan; Rathkolb, Birgit; Oestereicher, Manuela A; Schütt, Christine; Ravindranath, Aakash Chavan; Leuchtenberger, Stefanie; Sharma, Sapna; Kistler, Martin; Willershäuser, Monja; Brommage, Robert; Meehan, Terrence F; Mason, Jeremy; Haselimashhadi, Hamed; Hough, Tertius; Mallon, Ann-Marie; Wells, Sara; Santos, Luis; Lelliott, Christopher J; White, Jacqueline K; Sorg, Tania; Champy, Marie-France; Bower, Lynette R; Reynolds, Corey L; Flenniken, Ann M; Murray, Stephen A; Nutter, Lauryl M J; Svenson, Karen L; West, David; Tocchini-Valentini, Glauco P; Beaudet, Arthur L; Bosch, Fatima; Braun, Robert B; Dobbie, Michael S; Gao, Xiang; Herault, Yann; Moshiri, Ala; Moore, Bret A; Kent Lloyd, K C; McKerlie, Colin; Masuya, Hiroshi; Tanaka, Nobuhiko; Flicek, Paul; Parkinson, Helen E; Sedlacek, Radislav; Seong, Je Kyung; Wang, Chi-Kuang Leo; Moore, Mark; Brown, Steve D; Tschöp, Matthias H; Wurst, Wolfgang; Klingenspor, Martin; Wolf, Eckhard; Beckers, Johannes; Machicao, Fausto; Peter, Andreas; Staiger, Harald; Häring, Hans-Ulrich; Grallert, Harald; Campillos, Monica; Maier, Holger; Fuchs, Helmut; Gailus-Durner, Valerie; Werner, Thomas; Hrabe de Angelis, Martin

2018-01-18

Metabolic diseases are a worldwide problem but the underlying genetic factors and their relevance to metabolic disease remain incompletely understood. Genome-wide research is needed to characterize so-far unannotated mammalian metabolic genes. Here, we generate and analyze metabolic phenotypic data of 2016 knockout mouse strains under the aegis of the International Mouse Phenotyping Consortium (IMPC) and find 974 gene knockouts with strong metabolic phenotypes. 429 of those had no previous link to metabolism and 51 genes remain functionally completely unannotated. We compared human orthologues of these uncharacterized genes in five GWAS consortia and indeed 23 candidate genes are associated with metabolic disease. We further identify common regulatory elements in promoters of candidate genes. As each regulatory element is composed of several transcription factor binding sites, our data reveal an extensive metabolic phenotype-associated network of co-regulated genes. Our systematic mouse phenotype analysis thus paves the way for full functional annotation of the genome.
CCG - News & Events

Cancer.gov

NCI's Center for Cancer Genomics (CCG) has been widely recognized for its research efforts to facilitate advances in cancer genomic research and improve patient outcomes. Find the latest news about and events featuring CCG.
VCGDB: a dynamic genome database of the Chinese population

PubMed Central

2014-01-01

Background The data released by the 1000 Genomes Project contain an increasing number of genome sequences from different nations and populations with a large number of genetic variations. As a result, the focus of human genome studies is changing from single and static to complex and dynamic. The currently available human reference genome (GRCh37) is based on sequencing data from 13 anonymous Caucasian volunteers, which might limit the scope of genomics, transcriptomics, epigenetics, and genome wide association studies. Description We used the massive amount of sequencing data published by the 1000 Genomes Project Consortium to construct the Virtual Chinese Genome Database (VCGDB), a dynamic genome database of the Chinese population based on the whole genome sequencing data of 194 individuals. VCGDB provides dynamic genomic information, which contains 35 million single nucleotide variations (SNVs), 0.5 million insertions/deletions (indels), and 29 million rare variations, together with genomic annotation information. VCGDB also provides a highly interactive user-friendly virtual Chinese genome browser (VCGBrowser) with functions like seamless zooming and real-time searching. In addition, we have established three population-specific consensus Chinese reference genomes that are compatible with mainstream alignment software. Conclusions VCGDB offers a feasible strategy for processing big data to keep pace with the biological data explosion by providing a robust resource for genomics studies; in particular, studies aimed at finding regions of the genome associated with diseases. PMID:24708222
Single-trait and multi-trait genome-wide association analyses identify novel loci for blood pressure in African-ancestry populations

PubMed Central

Liang, Jingjing; Le, Thu H.; Edwards, Digna R. Velez; Tayo, Bamidele O.; Gaulton, Kyle J.; Lu, Yingchang; Jensen, Richard A.; Chen, Guanjie; Schwander, Karen; McKenzie, Colin A.; Fox, Ervin; Nalls, Michael A.; Young, J. Hunter; Lane, Jacqueline M.; Zhou, Jie; Tang, Hua; Fornage, Myriam; Musani, Solomon K.; Wang, Heming; Forrester, Terrence; Chu, Pei-Lun; Evans, Michele K.; Morrison, Alanna C.; Martin, Lisa W.; Wiggins, Kerri L.; Hui, Qin; Zhao, Wei; Jackson, Rebecca D.; Faul, Jessica D.; Reiner, Alex P.; Bray, Michael; Denny, Joshua C.; Mosley, Thomas H.; Palmas, Walter; Guo, Xiuqing; Polak, Joseph F.; Taylor, Ken D.; Boerwinkle, Eric; Bottinger, Erwin P.; Liu, Kiang; Risch, Neil; Hunt, Steven C.; Kooperberg, Charles; Zonderman, Alan B.; Becker, Diane M.; Cai, Jianwen; Loos, Ruth J. F.; Psaty, Bruce M.; Weir, David R.; Kardia, Sharon L. R.; Arnett, Donna K.; Won, Sungho; Edwards, Todd L.; Redline, Susan; Cooper, Richard S.; Rao, D. C.; Rotimi, Charles; Levy, Daniel; Chakravarti, Aravinda

2017-01-01

Hypertension is a leading cause of global disease, mortality, and disability. While individuals of African descent suffer a disproportionate burden of hypertension and its complications, they have been underrepresented in genetic studies. To identify novel susceptibility loci for blood pressure and hypertension in people of African ancestry, we performed both single and multiple-trait genome-wide association analyses. We analyzed 21 genome-wide association studies comprised of 31,968 individuals of African ancestry, and validated our results with an additional 54,395 individuals from multi-ethnic studies. These analyses identified nine loci with eleven independent variants which reached genome-wide significance (P < 1.25×10−8) for systolic or diastolic blood pressure, hypertension, or combined traits. Single-trait analyses identified two loci (TARID/TCF21 and LLPH/TMBIM4) and multiple-trait analyses identified one novel locus (FRMD3) for blood pressure. At these three loci, as well as at GRP20/CDH17, associated variants had alleles common only in African-ancestry populations. Functional annotation showed enrichment for genes expressed in immune and kidney cells, as well as in heart and vascular cells/tissues. Experiments driven by these findings and using angiotensin-II induced hypertension in mice showed altered kidney mRNA expression of six genes, suggesting their potential role in hypertension. Our study provides new evidence for genes related to hypertension susceptibility, and the need to study African-ancestry populations in order to identify biologic factors contributing to hypertension. PMID:28498854
Finding cancer driver mutations in the era of big data research.

PubMed

Poulos, Rebecca C; Wong, Jason W H

2018-04-02

In the last decade, the costs of genome sequencing have decreased considerably. The commencement of large-scale cancer sequencing projects has enabled cancer genomics to join the big data revolution. One of the challenges still facing cancer genomics research is determining which are the driver mutations in an individual cancer, as these contribute only a small subset of the overall mutation profile of a tumour. Focusing primarily on somatic single nucleotide mutations in this review, we consider both coding and non-coding driver mutations, and discuss how such mutations might be identified from cancer sequencing datasets. We describe some of the tools and databases that are available for the annotation of somatic variants and the identification of cancer driver genes. We also address the use of genome-wide variation in mutation load to establish background mutation rates from which to identify driver mutations under positive selection. Finally, we describe the ways in which mutational signatures can act as clues for the identification of cancer drivers, as these mutations may cause, or arise from, certain mutational processes. By defining the molecular changes responsible for driving cancer development, new cancer treatment strategies may be developed or novel preventative measures proposed.
[Construction and application of bioinformatic analysis platform for aquatic pathogen based on the MilkyWay-2 supercomputer].

PubMed

Fang, Xiang; Li, Ning-qiu; Fu, Xiao-zhe; Li, Kai-bin; Lin, Qiang; Liu, Li-hui; Shi, Cun-bin; Wu, Shu-qin

2015-07-01

As a key component of life science, bioinformatics has been widely applied in genomics, transcriptomics, and proteomics. However, the requirement of high-performance computers rather than common personal computers for constructing a bioinformatics platform significantly limited the application of bioinformatics in aquatic science. In this study, we constructed a bioinformatic analysis platform for aquatic pathogens based on the MilkyWay-2 supercomputer. The platform consisted of three functional modules, including genomic and transcriptomic sequencing data analysis, protein structure prediction, and molecular dynamics simulations. To validate the practicability of the platform, we performed bioinformatic analysis on aquatic pathogenic organisms. For example, genes of Flavobacterium johnsoniae M168 were identified and annotated via Blast searches, GO and InterPro annotations. Protein structural models for five small segments of grass carp reovirus HZ-08 were constructed by homology modeling. Molecular dynamics simulations were performed on outer membrane protein A of Aeromonas hydrophila, and the changes of system temperature, total energy, root mean square deviation and conformation of the loops during equilibration were also observed. These results showed that the bioinformatic analysis platform for aquatic pathogens has been successfully built on the MilkyWay-2 supercomputer. This study will provide insights into the construction of bioinformatic analysis platforms for other subjects.
Genetic variation at MECOM, TERT, JAK2 and HBS1L-MYB predisposes to myeloproliferative neoplasms

PubMed Central

Tapper, William; Jones, Amy V.; Kralovics, Robert; Harutyunyan, Ashot S.; Zoi, Katerina; Leung, William; Godfrey, Anna L.; Guglielmelli, Paola; Callaway, Alison; Ward, Daniel; Aranaz, Paula; White, Helen E.; Waghorn, Katherine; Lin, Feng; Chase, Andrew; Joanna Baxter, E.; Maclean, Cathy; Nangalia, Jyoti; Chen, Edwin; Evans, Paul; Short, Michael; Jack, Andrew; Wallis, Louise; Oscier, David; Duncombe, Andrew S.; Schuh, Anna; Mead, Adam J.; Griffiths, Michael; Ewing, Joanne; Gale, Rosemary E.; Schnittger, Susanne; Haferlach, Torsten; Stegelmann, Frank; Döhner, Konstanze; Grallert, Harald; Strauch, Konstantin; Tanaka, Toshiko; Bandinelli, Stefania; Giannopoulos, Andreas; Pieri, Lisa; Mannarelli, Carmela; Gisslinger, Heinz; Barosi, Giovanni; Cazzola, Mario; Reiter, Andreas; Harrison, Claire; Campbell, Peter; Green, Anthony R.; Vannucchi, Alessandro; Cross, Nicholas C.P.

2015-01-01

Clonal proliferation in myeloproliferative neoplasms (MPN) is driven by somatic mutations in JAK2, CALR or MPL, but the contribution of inherited factors is poorly characterized. Using a three-stage genome-wide association study of 3,437 MPN cases and 10,083 controls, we identify two SNPs with genome-wide significance in JAK2V617F-negative MPN: rs12339666 (JAK2; meta-analysis P=1.27 × 10−10) and rs2201862 (MECOM; meta-analysis P=1.96 × 10−9). Two additional SNPs, rs2736100 (TERT) and rs9376092 (HBS1L/MYB), achieve genome-wide significance when including JAK2V617F-positive cases. rs9376092 has a stronger effect in JAK2V617F-negative cases with CALR and/or MPL mutations (Breslow–Day P=4.5 × 10−7), whereas in JAK2V617F-positive cases rs9376092 associates with essential thrombocythemia (ET) rather than polycythemia vera (allelic χ2 P=7.3 × 10−7). Reduced MYB expression, previously linked to development of an ET-like disease in model systems, associates with rs9376092 in normal myeloid cells. These findings demonstrate that multiple germline variants predispose to MPN and link constitutional differences in MYB expression to disease phenotype. PMID:25849990
Genome analysis and polar tube firing dynamics of mosquito-infecting microsporidia

USDA-ARS's Scientific Manuscript database

Microsporidia are highly divergent fungi that are obligate intracellular pathogens of a wide range of host organisms. Here we review recent findings from the genome sequences of mosquito-infecting microsporidian species Edhazardia aedis and Vavraia culicis, which show large differences in genome siz...
Repeated replacement of an intrabacterial symbiont in the tripartite nested mealybug symbiosis

PubMed Central

Husnik, Filip; McCutcheon, John P.

2016-01-01

Stable endosymbiosis of a bacterium into a host cell promotes cellular and genomic complexity. The mealybug Planococcus citri has two bacterial endosymbionts with an unusual nested arrangement: the γ-proteobacterium Moranella endobia lives in the cytoplasm of the β-proteobacterium Tremblaya princeps. These two bacteria, along with genes horizontally transferred from other bacteria to the P. citri genome, encode gene sets that form an interdependent metabolic patchwork. Here, we test the stability of this three-way symbiosis by sequencing host and symbiont genomes for five diverse mealybug species and find marked fluidity over evolutionary time. Although Tremblaya is the result of a single infection in the ancestor of mealybugs, the γ-proteobacterial symbionts result from multiple replacements of inferred different ages from related but distinct bacterial lineages. Our data show that symbiont replacement can happen even in the most intricate symbiotic arrangements and that preexisting horizontally transferred genes can remain stable on genomes in the face of extensive symbiont turnover. PMID:27573819
Polymorphisms, Chromosomal Rearrangements, and Mutator Phenotype Development during Experimental Evolution of Lactobacillus rhamnosus GG.

PubMed

Douillard, François P; Ribbera, Angela; Xiao, Kun; Ritari, Jarmo; Rasinkangas, Pia; Paulin, Lars; Palva, Airi; Hao, Yanling; de Vos, Willem M

2016-07-01

Lactobacillus rhamnosus GG is a lactic acid bacterium widely marketed by the food industry. Its genomic analysis led to the identification of a gene cluster encoding mucus-binding SpaCBA pili, which is located in a genomic island enriched in insertion sequence (IS) elements. In the present study, we analyzed by genome-wide resequencing the genomic integrity of L. rhamnosus GG in four distinct evolutionary experiments conducted for approximately 1,000 generations under conditions of no stress or salt, bile, and repetitive-shearing stress. Under both stress-free and salt-induced stress conditions, the GG population (excluding the mutator lineage in the stress-free series [see below]) accumulated only a few single nucleotide polymorphisms (SNPs) and no frequent chromosomal rearrangements. In contrast, in the presence of bile salts or repetitive shearing stress, some IS elements were found to be activated, resulting in the deletion of large chromosomal segments that include the spaCBA-srtC1 pilus gene cluster. Remarkably, a high number of SNPs were found in three strains obtained after 900 generations of stress-free growth. Detailed analysis showed that these three strains derived from a founder mutant with an altered DNA polymerase subunit that resulted in a mutator phenotype. The present work confirms the stability of the pilus production phenotype in L. rhamnosus GG under stress-free conditions, highlights the possible evolutionary scenarios that may occur when this probiotic strain is extensively cultured, and identifies external factors that affect the chromosomal integrity of GG. The results provide mechanistic insights into the stability of GG in regard to its extensive use in probiotic and other functional food products. Lactobacillus rhamnosus GG is a widely marketed probiotic strain that has been used in numerous clinical studies to assess its health-promoting properties. Hence, the stability of the probiotic functions of L. rhamnosus GG is of importance, and here we studied the impact of external stresses on the genomic integrity of L. rhamnosus GG. We studied three different stresses that are relevant for understanding its robustness and integrity under both ex vivo conditions, i.e., industrial manufacturing conditions, and in vivo conditions, i.e., intestinal tract-associated stress. Overall, our findings contribute to predicting the genomic stability of L. rhamnosus GG and its ecological performance. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Genome-wide search followed by replication reveals genetic interaction of CD80 and ALOX5AP associated with systemic lupus erythematosus in Asian populations.

PubMed

Zhang, Yan; Yang, Jing; Zhang, Jing; Sun, Liangdan; Hirankarn, Nattiya; Pan, Hai-Feng; Lau, Chak Sing; Chan, Tak Mao; Lee, Tsz Leung; Leung, Alexander Moon Ho; Mok, Chi Chiu; Zhang, Lu; Wang, Yongfei; Shen, Jiangshan Jane; Wong, Sik Nin; Lee, Ka Wing; Ho, Marco Hok Kung; Lee, Pamela Pui Wah; Chung, Brian Hon-Yin; Chong, Chun Yin; Wong, Raymond Woon Sing; Mok, Mo Yin; Wong, Wilfred Hing Sang; Tong, Kwok Lung; Tse, Niko Kei Chiu; Li, Xiang-Pei; Avihingsanon, Yingyos; Rianthavorn, Pornpimol; Deekajorndej, Thavatchai; Suphapeetiporn, Kanya; Shotelersuk, Vorasuk; Ying, Shirley King Yee; Fung, Samuel Ka Shun; Lai, Wai Ming; Wong, Chun-Ming; Ng, Irene Oi Lin; Garcia-Barcelo, Maria-Merce; Cherny, Stacey S; Cui, Yong; Sham, Pak Chung; Yang, Sen; Ye, Dong-Qing; Zhang, Xue-Jun; Lau, Yu Lung; Yang, Wanling

2016-05-01

Genetic interaction has been considered as a hallmark of the genetic architecture of systemic lupus erythematosus (SLE). Based on two independent genome-wide association studies (GWAS) on Chinese populations, we performed a genome-wide search for genetic interactions contributing to SLE susceptibility. The study involved a total of 1 659 cases and 3 398 controls in the discovery stage and 2 612 cases and 3 441 controls in three cohorts for replication. Logistic regression and multifactor dimensionality reduction were used to search for genetic interaction. Interaction of CD80 (rs2222631) and ALOX5AP (rs12876893) was found to be significantly associated with SLE (OR_int=1.16, P_int_all=7.7E-04 at false discovery rate<0.05). The single-nucleotide polymorphism rs2222631 was found to be associated with SLE at genome-wide significance (P_all=4.5E-08, OR=0.86) and is independent of rs6804441 in CD80, whose association was reported previously. Significant correlation was observed between expression of these two genes in healthy controls and SLE cases, together with differential expression of these genes between cases and controls, observed from individuals from the Hong Kong cohort. Genetic interactions between BLK (rs13277113) and DDX6 (rs4639966), and between TNFSF4 (rs844648) and PXK (rs6445975) were also observed in both GWAS data sets. Our study represents the first genome-wide evaluation of epistasis interactions on SLE and the findings suggest interactions and independent variants may help partially explain missing heritability for complex diseases. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
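As an illustration of what an interaction odds ratio such as OR_int measures, the following minimal sketch computes a multiplicative interaction term from a 2 × 2 × 2 table of case and control counts. It assumes a simple carrier/non-carrier coding for each SNP, which the abstract does not specify; the function name and the counts are hypothetical, and this is not the authors' pipeline.

```python
from math import sqrt


def interaction_odds_ratio(counts):
    """Multiplicative interaction OR from a 2x2x2 table.

    counts[(a, b)] = (cases, controls), where a and b are 0/1 carrier
    indicators for the two SNPs (an assumed coding, for illustration).
    """
    def odds(a, b):
        cases, controls = counts[(a, b)]
        return cases / controls

    base = odds(0, 0)
    or_a = odds(1, 0) / base    # effect of SNP A alone
    or_b = odds(0, 1) / base    # effect of SNP B alone
    or_ab = odds(1, 1) / base   # joint effect of carrying both
    # Under a saturated logistic model, exp(beta_interaction) equals
    # the joint OR divided by the product of the single-SNP ORs.
    or_int = or_ab / (or_a * or_b)
    # Woolf-type standard error of log(OR_int): square root of the
    # sum of reciprocals of all eight cell counts.
    se_log = sqrt(sum(1 / n for pair in counts.values() for n in pair))
    return or_int, se_log


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = {
    (0, 0): (100, 200),
    (1, 0): (60, 100),
    (0, 1): (50, 100),
    (1, 1): (45, 50),
}
or_int, se_log = interaction_odds_ratio(example)
print(or_int, se_log)
```

An OR_int of 1 would mean the two variants act multiplicatively with no interaction; values above or below 1 indicate that the joint effect departs from the product of the individual effects.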
Physical signals for protein-DNA recognition

NASA Astrophysics Data System (ADS)

Cao, Xiao-Qin; Zeng, Jia; Yan, Hong

2009-09-01

This paper discovers consensus physical signals around eukaryotic splice sites, transcription start sites, and replication origin start and end sites on a genome-wide scale based on their DNA flexibility profiles calculated by three different flexibility models. These salient physical signals are localized highly rigid and flexible DNAs, which may play important roles in protein-DNA recognition by the sliding search mechanism. The found physical signals lead us to a detailed hypothetical view of the search process in which a DNA-binding protein first finds a genomic region close to the target site from an arbitrary starting location by three-dimensional (3D) hopping and intersegment transfer mechanisms for long distances, and subsequently uses the one-dimensional (1D) sliding mechanism facilitated by the localized highly rigid DNAs to accurately locate the target flexible binding site within 30 bp (base pair) short distances. Guided by these physical signals, DNA-binding proteins rapidly search the entire genome to recognize a specific target site from the 3D to 1D pathway. Our findings also show that current promoter prediction programs (PPPs) based on DNA physical properties may suffer from lots of false positives because other functional sites such as splice sites and replication origins have similar physical signals as promoters do.
Genome-wide association analysis of blood-pressure traits in African-ancestry individuals reveals common associated genes in African and non-African populations.

PubMed

Franceschini, Nora; Fox, Ervin; Zhang, Zhaogong; Edwards, Todd L; Nalls, Michael A; Sung, Yun Ju; Tayo, Bamidele O; Sun, Yan V; Gottesman, Omri; Adeyemo, Adebawole; Johnson, Andrew D; Young, J Hunter; Rice, Ken; Duan, Qing; Chen, Fang; Li, Yun; Tang, Hua; Fornage, Myriam; Keene, Keith L; Andrews, Jeanette S; Smith, Jennifer A; Faul, Jessica D; Guangfa, Zhang; Guo, Wei; Liu, Yu; Murray, Sarah S; Musani, Solomon K; Srinivasan, Sathanur; Velez Edwards, Digna R; Wang, Heming; Becker, Lewis C; Bovet, Pascal; Bochud, Murielle; Broeckel, Ulrich; Burnier, Michel; Carty, Cara; Chasman, Daniel I; Ehret, Georg; Chen, Wei-Min; Chen, Guanjie; Chen, Wei; Ding, Jingzhong; Dreisbach, Albert W; Evans, Michele K; Guo, Xiuqing; Garcia, Melissa E; Jensen, Rich; Keller, Margaux F; Lettre, Guillaume; Lotay, Vaneet; Martin, Lisa W; Moore, Jason H; Morrison, Alanna C; Mosley, Thomas H; Ogunniyi, Adesola; Palmas, Walter; Papanicolaou, George; Penman, Alan; Polak, Joseph F; Ridker, Paul M; Salako, Babatunde; Singleton, Andrew B; Shriner, Daniel; Taylor, Kent D; Vasan, Ramachandran; Wiggins, Kerri; Williams, Scott M; Yanek, Lisa R; Zhao, Wei; Zonderman, Alan B; Becker, Diane M; Berenson, Gerald; Boerwinkle, Eric; Bottinger, Erwin; Cushman, Mary; Eaton, Charles; Nyberg, Fredrik; Heiss, Gerardo; Hirschhron, Joel N; Howard, Virginia J; Karczewsk, Konrad J; Lanktree, Matthew B; Liu, Kiang; Liu, Yongmei; Loos, Ruth; Margolis, Karen; Snyder, Michael; Psaty, Bruce M; Schork, Nicholas J; Weir, David R; Rotimi, Charles N; Sale, Michele M; Harris, Tamara; Kardia, Sharon L R; Hunt, Steven C; Arnett, Donna; Redline, Susan; Cooper, Richard S; Risch, Neil J; Rao, D C; Rotter, Jerome I; Chakravarti, Aravinda; Reiner, Alex P; Levy, Daniel; Keating, Brendan J; Zhu, Xiaofeng

2013-09-05

High blood pressure (BP) is more prevalent and contributes to more severe manifestations of cardiovascular disease (CVD) in African Americans than in any other United States ethnic group. Several small African-ancestry (AA) BP genome-wide association studies (GWASs) have been published, but their findings have failed to replicate to date. We report on a large AA BP GWAS meta-analysis that includes 29,378 individuals from 19 discovery cohorts and subsequent replication in additional samples of AA (n = 10,386), European ancestry (EA) (n = 69,395), and East Asian ancestry (n = 19,601). Five loci (EVX1-HOXA, ULK4, RSPO3, PLEKHG1, and SOX6) reached genome-wide significance (p < 1.0 × 10⁻⁸) for either systolic or diastolic BP in a transethnic meta-analysis after correction for multiple testing. Three of these BP loci (EVX1-HOXA, RSPO3, and PLEKHG1) lack previous associations with BP. We also identified one independent signal in a known BP locus (SOX6) and provide evidence for fine mapping in four additional validated BP loci. We also demonstrate that validated EA BP GWAS loci, considered jointly, show significant effects in AA samples. Consequently, these findings suggest that BP loci might have universal effects across studied populations, demonstrating that multiethnic samples are an essential component in identifying, fine mapping, and understanding their trait variability. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Identification of susceptibility loci for colorectal cancer in a genome-wide meta-analysis

PubMed Central

Whiffin, Nicola; Hosking, Fay J.; Farrington, Susan M.; Palles, Claire; Dobbins, Sara E.; Zgaga, Lina; Lloyd, Amy; Kinnersley, Ben; Gorman, Maggie; Tenesa, Albert; Broderick, Peter; Wang, Yufei; Barclay, Ella; Hayward, Caroline; Martin, Lynn; Buchanan, Daniel D.; Win, Aung Ko; Hopper, John; Jenkins, Mark; Lindor, Noralane M.; Newcomb, Polly A.; Gallinger, Steve; Conti, David; Schumacher, Fred; Casey, Graham; Liu, Tao; Campbell, Harry; Lindblom, Annika; Houlston, Richard S.; Tomlinson, Ian P.; Dunlop, Malcolm G.

2014-01-01

To identify common variants influencing colorectal cancer (CRC) risk, we performed a meta-analysis of five genome-wide association studies, comprising 5626 cases and 7817 controls of European descent. We conducted replication of top ranked single nucleotide polymorphisms (SNPs) in additional series totalling 14 037 cases and 15 937 controls, identifying a new CRC risk locus at 10q24.2 [rs1035209; odds ratio (OR) = 1.13, P = 4.54 × 10⁻¹¹]. We also performed meta-analysis of our studies, with previously published data, of several recently purported CRC risk loci. We failed to find convincing evidence for a previously reported genome-wide association at rs11903757 (2q32.3). Of the three additional loci for which evidence of an association in Europeans has been previously described, we failed to show an association between rs59336 (12q24.21) and CRC risk. However, for the other two SNPs, our analyses demonstrated new, formally significant associations with CRC. These are rs3217810 intronic in CCND2 (12p13.32; OR = 1.19, P = 2.16 × 10⁻¹⁰) and rs10911251 near LAMC1 (1q25.3; OR = 1.09, P = 1.75 × 10⁻⁸). Additionally, in our European datasets we found some evidence to support the relationship, originally identified in East Asians, between CRC risk and rs647161, rs2423297 and rs10774214. Our findings provide further insights into the genetic and biological basis of inherited genetic susceptibility to CRC. PMID:24737748
Lack of replication of thirteen single-nucleotide polymorphisms implicated in Parkinson’s disease: a large-scale international study

PubMed Central

Elbaz, Alexis; Nelson, Lorene M; Payami, Haydeh; Ioannidis, John P A; Fiske, Brian K; Annesi, Grazia; Belin, Andrea Carmine; Factor, Stewart A; Ferrarese, Carlo; Hadjigeorgiou, Georgios M; Higgins, Donald S; Kawakami, Hideshi; Krüger, Rejko; Marder, Karen S; Mayeux, Richard P; Mellick, George D; Nutt, John G; Ritz, Beate; Samii, Ali; Tanner, Caroline M; Van Broeckhoven, Christine; Van Den Eeden, Stephen K; Wirdefeldt, Karin; Zabetian, Cyrus P; Dehem, Marie; Montimurro, Jennifer S; Southwick, Audrey; Myers, Richard M; Trikalinos, Thomas A

2013-01-01

Background: A genome-wide association study identified 13 single-nucleotide polymorphisms (SNPs) significantly associated with Parkinson’s disease. Small-scale replication studies were largely non-confirmatory, but a meta-analysis that included data from the original study could not exclude all SNP associations, leaving relevance of several markers uncertain. Methods: Investigators from three Michael J Fox Foundation for Parkinson’s Research-funded genetics consortia, comprising 14 teams, contributed DNA samples from 5526 patients with Parkinson’s disease and 6682 controls, which were genotyped for the 13 SNPs. Most (88%) participants were of white, non-Hispanic descent. We assessed log-additive genetic effects using fixed and random effects models stratified by team and ethnic origin, and tested for heterogeneity across strata. A meta-analysis was undertaken that incorporated data from the original genome-wide study as well as subsequent replication studies. Findings: In fixed and random-effects models no associations with any of the 13 SNPs were identified (odds ratios 0.89 to 1.09). Heterogeneity between studies and between ethnic groups was low for all SNPs. Subgroup analyses by age at study entry, ethnic origin, sex, and family history did not show any consistent associations. In our meta-analysis, no SNP showed significant association (summary odds ratios 0.95 to 1.08); there was little heterogeneity except for SNP rs7520966. Interpretation: Our results do not lend support to the finding that the 13 SNPs reported in the original genome-wide association study are genetic susceptibility factors for Parkinson’s disease. PMID:17052658
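The pooling step mentioned in this abstract (fixed- and random-effects models with a heterogeneity check) can be sketched as follows. The abstract does not name the estimators used, so inverse-variance weighting, the DerSimonian-Laird between-study variance, Cochran's Q and I² are assumptions here, and the function name and inputs are hypothetical.

```python
from math import exp, log


def pool_odds_ratios(odds_ratios, std_errors):
    """Pool per-stratum odds ratios.

    odds_ratios: per-study (or per-stratum) OR estimates.
    std_errors:  standard errors of the corresponding log(OR) values.
    """
    y = [log(o) for o in odds_ratios]
    w = [1 / se ** 2 for se in std_errors]      # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w

    # Cochran's Q measures spread of the estimates around the fixed mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # DerSimonian-Laird between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I^2: share of variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    w_re = [1 / (se ** 2 + tau2) for se in std_errors]
    random_mean = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)

    return {
        "fixed_or": exp(fixed),
        "random_or": exp(random_mean),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical inputs: three strata with similar odds ratios.
print(pool_odds_ratios([0.95, 1.02, 1.08], [0.05, 0.08, 0.10]))
```

When heterogeneity is low, as the abstract reports for most SNPs, tau2 is near zero and the fixed and random-effects summaries nearly coincide.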
The Modern Synthesis in the Light of Microbial Genomics.

PubMed

Booth, Austin; Mariscal, Carlos; Doolittle, W Ford

2016-09-08

We review the theoretical implications of findings in genomics for evolutionary biology since the Modern Synthesis. We examine the ways in which microbial genomics has influenced our understanding of the last universal common ancestor, the tree of life, species, lineages, and evolutionary transitions. We conclude by advocating a piecemeal toolkit approach to evolutionary biology, in lieu of any grand unified theory updated to include microbial genomics.
Comparison of phasing strategies for whole human genomes

PubMed Central

Kirkness, Ewen; Schork, Nicholas J.

2018-01-01

Humans are a diploid species that inherit one set of chromosomes paternally and one homologous set of chromosomes maternally. Unfortunately, most human sequencing initiatives ignore this fact in that they do not directly delineate the nucleotide content of the maternal and paternal copies of the 23 chromosomes individuals possess (i.e., they do not ‘phase’ the genome) often because of the costs and complexities of doing so. We compared 11 different widely-used approaches to phasing human genomes using the publicly available ‘Genome-In-A-Bottle’ (GIAB) phased version of the NA12878 genome as a gold standard. The phasing strategies we compared included laboratory-based assays that prepare DNA in unique ways to facilitate phasing as well as purely computational approaches that seek to reconstruct phase information from general sequencing reads and constructs or population-level haplotype frequency information obtained through a reference panel of haplotypes. To assess the performance of the 11 approaches, we used metrics that included, among others, switch error rates, haplotype block lengths, the proportion of fully phase-resolved genes, phasing accuracy and yield between pairs of SNVs. Our comparisons suggest that a hybrid or combined approach that leverages: 1. population-based phasing using the SHAPEIT software suite, 2. either genome-wide sequencing read data or parental genotypes, and 3. a large reference panel of variant and haplotype frequencies, provides a fast and efficient way to produce highly accurate phase-resolved individual human genomes. We found that for population-based approaches, phasing performance is enhanced with the addition of genome-wide read data; e.g., whole genome shotgun and/or RNA sequencing reads. Further, we found that the inclusion of parental genotype data within a population-based phasing strategy can provide as much as a ten-fold reduction in phasing errors. 
We also considered a majority voting scheme for the construction of a consensus haplotype combining multiple predictions for enhanced performance and site coverage. Finally, we also identified DNA sequence signatures associated with the genomic regions harboring phasing switch errors, which included regions of low polymorphism or SNV density. PMID:29621242
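Two of the ideas in this abstract, the switch error rate and a majority-vote consensus haplotype, can be illustrated with a short sketch. It assumes each phasing result is encoded as a list of 0/1 values over the same heterozygous sites (which allele sits on the first haplotype); that encoding, the function names, and the simple orientation step are illustrative assumptions, not the authors' implementation.

```python
def switch_error_rate(truth, predicted):
    """Fraction of adjacent site pairs whose relative phase is wrong."""
    if len(truth) != len(predicted):
        raise ValueError("haplotypes must cover the same sites")
    if len(truth) < 2:
        return 0.0
    switches = 0
    for i in range(1, len(truth)):
        same_in_truth = truth[i] == truth[i - 1]
        same_in_pred = predicted[i] == predicted[i - 1]
        # A switch occurs when the relative phase of two neighbouring
        # sites differs between the prediction and the gold standard.
        if same_in_truth != same_in_pred:
            switches += 1
    return switches / (len(truth) - 1)


def majority_vote(predictions):
    """Per-site majority vote over several phasing predictions."""
    reference = predictions[0]
    oriented = []
    for pred in predictions:
        # Haplotype labels are arbitrary, so flip a prediction when it
        # agrees with the reference at fewer than half of the sites.
        agree = sum(a == b for a, b in zip(reference, pred))
        if 2 * agree >= len(reference):
            oriented.append(list(pred))
        else:
            oriented.append([1 - x for x in pred])
    # Ties resolve to 0 in this simplified sketch.
    return [1 if 2 * sum(col) > len(col) else 0 for col in zip(*oriented)]


truth = [0, 0, 1, 1, 0]
print(switch_error_rate(truth, [0, 0, 0, 0, 1]))
print(majority_vote([[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 1, 0]]))
```

Because phase is only defined relative to neighbouring sites, a prediction that is the exact complement of the truth has a switch error rate of zero in this sketch.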
Next-generation sequencing of translocation renal cell carcinoma reveals novel RNA splicing partners and frequent mutations of chromatin-remodeling genes.

PubMed

Malouf, Gabriel G; Su, Xiaoping; Yao, Hui; Gao, Jianjun; Xiong, Liangwen; He, Qiuming; Compérat, Eva; Couturier, Jérôme; Molinié, Vincent; Escudier, Bernard; Camparo, Philippe; Doss, Denaha J; Thompson, Erika J; Khayat, David; Wood, Christopher G; Yu, Willie; Teh, Bin T; Weinstein, John; Tannir, Nizar M

2014-08-01

MITF/TFE translocation renal cell carcinoma (TRCC) is a rare subtype of kidney cancer. Its incidence and the genome-wide characterization of its genetic origin have not been fully elucidated. We performed RNA and exome sequencing on an exploratory set of TRCC (n = 7), and validated our findings using The Cancer Genome Atlas (TCGA) clear-cell RCC (ccRCC) dataset (n = 460). Using the TCGA dataset, we identified seven TRCC (1.5%) cases and determined their genomic profile. We discovered three novel partners of MITF/TFE (LUC7L3, KHSRP, and KHDRBS2) that are involved in RNA splicing. TRCC displayed a unique gene expression signature as compared with other RCC types, and showed activation of MITF, the transforming growth factor β1 and the PI3K complex targets. Genes differentially spliced between TRCC and other RCC types were enriched for MITF and ID2 targets. Exome sequencing of TRCC revealed a distinct mutational spectrum as compared with ccRCC, with frequent mutations in chromatin-remodeling genes (six of eight cases, three of which were from the TCGA). In two cases, we identified mutations in INO80D, an ATP-dependent chromatin-remodeling gene, previously shown to control the amplitude of the S phase. Knockdown of INO80D decreased cell proliferation in a novel cell line bearing LUC7L3-TFE3 translocation. This genome-wide study defines the incidence of TRCC within a ccRCC-directed project and expands the genomic spectrum of TRCC by identifying novel MITF/TFE partners involved in RNA splicing and frequent mutations in chromatin-remodeling genes. ©2014 American Association for Cancer Research.
Genomics Assisted Ancestry Deconvolution in Grape

PubMed Central

Sawler, Jason; Reisch, Bruce; Aradhya, Mallikarjuna K.; Prins, Bernard; Zhong, Gan-Yuan; Schwaninger, Heidi; Simon, Charles; Buckler, Edward; Myles, Sean

2013-01-01

The genus Vitis (the grapevine) is a group of highly diverse, diploid woody perennial vines consisting of approximately 60 species from across the northern hemisphere. It is the world’s most valuable horticultural crop with ~8 million hectares planted, most of which is processed into wine. To gain insights into the use of wild Vitis species during the past century of interspecific grape breeding and to provide a foundation for marker-assisted breeding programmes, we present a principal components analysis (PCA) based ancestry estimation method to calculate admixture proportions of hybrid grapes in the United States Department of Agriculture grape germplasm collection using genome-wide polymorphism data. We find that grape breeders have backcrossed to both the domesticated V. vinifera and wild Vitis species and that reasonably accurate genome-wide ancestry estimation can be performed on interspecific Vitis hybrids using a panel of fewer than 50 ancestry informative markers (AIMs). We compare measures of ancestry informativeness used in selecting SNP panels for two-way admixture estimation, and verify the accuracy of our method on simulated populations of admixed offspring. Our method of ancestry deconvolution provides a first step towards selection at the seed or seedling stage for desirable admixture profiles, which will facilitate marker-assisted breeding that aims to introgress traits from wild Vitis species while retaining the desirable characteristics of elite V. vinifera cultivars. PMID:24244717
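The general idea behind PCA-based two-way admixture estimation can be sketched by placing a hybrid on the axis between two ancestral reference groups. The sketch below assumes the first principal component separates the two groups and that coordinates are already computed; the function name, inputs, and the linear interpolation are illustrative assumptions rather than the method as published.

```python
from statistics import mean


def admixture_proportion(sample_pc1, ref_a_pc1, ref_b_pc1):
    """Estimated fraction of ancestry from group B for one sample.

    sample_pc1: PC1 coordinate of the hybrid individual.
    ref_a_pc1, ref_b_pc1: PC1 coordinates of reference individuals
    from the two ancestral groups (for example, V. vinifera and a
    wild Vitis species).
    """
    centroid_a = mean(ref_a_pc1)
    centroid_b = mean(ref_b_pc1)
    if centroid_a == centroid_b:
        raise ValueError("reference groups are not separated on PC1")
    # Relative position of the sample between the two centroids.
    frac_b = (sample_pc1 - centroid_a) / (centroid_b - centroid_a)
    # Clamp to [0, 1]: samples beyond a centroid count as unadmixed.
    return min(1.0, max(0.0, frac_b))


# Hypothetical coordinates: a hybrid halfway between the two groups.
print(admixture_proportion(0.0, [-1.0, -1.2, -0.8], [1.0, 0.9, 1.1]))
```

A first-generation hybrid would be expected near 0.5 under this sketch, and repeated backcrossing to one parent moves the estimate toward 0 or 1.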

Realizing privacy preserving genome-wide association studies.

PubMed

Simmons, Sean; Berger, Bonnie

2016-05-01

As genomics moves into the clinic, there has been much interest in using this medical data for research. At the same time the use of such data raises many privacy concerns. These circumstances have led to the development of various methods to perform genome-wide association studies (GWAS) on patient records while ensuring privacy. In particular, there has been growing interest in applying differentially private techniques to this challenge. Unfortunately, up until now all methods for finding high scoring SNPs in a differentially private manner have had major drawbacks in terms of either accuracy or computational efficiency. Here we overcome these limitations with a substantially modified version of the neighbor distance method for performing differentially private GWAS, and thus are able to produce a more viable mechanism. Specifically, we use input perturbation and an adaptive boundary method to overcome accuracy issues. We also design and implement a convex analysis based algorithm to calculate the neighbor distance for each SNP in constant time, overcoming the major computational bottleneck in the neighbor distance method. It is our hope that methods such as ours will pave the way for more widespread use of patient data in biomedical research. A Python implementation is available at http://groups.csail.mit.edu/cb/DiffPriv/. Contact: bab@csail.mit.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Genome-wide gene expression profiling of low-dose, long-term exposure of human osteosarcoma cells to bisphenol A and its analogs bisphenols AF and S.

PubMed

Fic, A; Mlakar, S Jurković; Juvan, P; Mlakar, V; Marc, J; Dolenc, M Sollner; Broberg, K; Mašič, L Peterlin

2015-08-01

The bisphenols AF (BPAF) and S (BPS) are structural analogs of the endocrine disruptor bisphenol A (BPA), and are used in common products as a replacement for BPA. To elucidate genome-wide gene expression responses, estrogen-dependent osteosarcoma cells were cultured with 10 nM BPA, BPAF, or BPS, for 8 h and 3 months. Genome-wide gene expression was analyzed using the Illumina Expression BeadChip. Three months exposure had significant effects on gene expression, particularly for BPS, followed by BPAF and BPA, according to the number of differentially expressed genes (1980, 778, 60, respectively), the magnitude of changes in gene expression, and the number of enriched biological processes (800, 415, 33, respectively) and pathways (77, 52, 6, respectively). 'Embryonic skeletal system development' was the most enriched bone-related process, which was affected only by BPAF and BPS. Interestingly, all three bisphenols showed highest down-regulation of genes related to the cardiovascular system (e.g., NPPB, NPR3, TXNIP). BPA only and BPA/BPAF/BPS also affected genes related to the immune system and fetal development, respectively. For BPAF and BPS, the 'isoprenoid biosynthetic process' was enriched (up-regulated genes: HMGCS1, PDSS1, ACAT2, RCE1, DHDDS). Compared to BPA, BPAF and BPS had more effects on gene expression after long-term exposure. These findings stress the need for careful toxicological characterization of BPA analogs in the future. Copyright © 2015 Elsevier Ltd. All rights reserved.
Genetics of Venous Thrombosis: Insights from a New Genome Wide Association Study

PubMed Central

Germain, Marine; Saut, Noémie; Greliche, Nicolas; Dina, Christian; Lambert, Jean-Charles; Perret, Claire; Cohen, William; Oudot-Mellakh, Tiphaine; Antoni, Guillemette; Alessi, Marie-Christine; Zelenika, Diana; Cambien, François; Tiret, Laurence; Bertrand, Marion; Dupuy, Anne-Marie; Letenneur, Luc; Lathrop, Mark; Emmerich, Joseph; Amouyel, Philippe; Trégouët, David-Alexandre; Morange, Pierre-Emmanuel

2011-01-01

Background: Venous Thrombosis (VT) is a common multifactorial disease associated with a major public health burden. Genetic factors are known to contribute to the susceptibility of the disease but how many genes are involved and their contribution to VT risk still remain obscure. We aimed to identify genetic variants associated with VT risk. Methodology/Principal Findings: We conducted a genome-wide association study (GWAS) based on 551,141 SNPs genotyped in 1,542 cases and 1,110 controls. Twelve SNPs reached the genome-wide significance level of 2.0 × 10⁻⁸ and encompassed four known VT-associated loci, ABO, F5, F11 and FGG. By means of haplotype analyses, we also provided novel arguments in favor of a role of HIVEP1, PROCR and STAB2, three loci recently hypothesized to participate in the susceptibility to VT. However, no novel VT-associated loci came out of our GWAS. Using a recently proposed statistical methodology, we also showed that common variants could explain about 35% of the genetic variance underlying VT susceptibility among which 3% could be attributable to the main identified VT loci. This analysis additionally suggested that the common variants left to be identified are not uniformly distributed across the genome and that chromosome 20, itself, could contribute to ∼7% of the total genetic variance. Conclusions/Significance: This study might also provide a valuable source of information to expand our understanding of biological mechanisms regulating quantitative biomarkers for VT. PMID:21980494
Common variants of FUT2 are associated with plasma vitamin B12 levels

USDA-ARS's Scientific Manuscript database

A genome-wide scan is a way to distinguish small differences in the genetic makeup of individuals. It can also show whether a mutation in a particular gene is widespread, that is, "polymorphic." The value of these analyses lies in the identification of genes that could influence a th...
New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes.

PubMed

Parker, Brian J; Moltke, Ida; Roth, Adam; Washietl, Stefan; Wen, Jiayu; Kellis, Manolis; Breaker, Ronald; Pedersen, Jakob Skou

2011-11-01

Regulatory RNA structures are often members of families with multiple paralogous instances across the genome. Family members share functional and structural properties, which allow them to be studied as a whole, facilitating both bioinformatic and experimental characterization. We have developed a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein-coding regions comprising 725 individual structures, including 48 families with known structural RNA elements. Known families identified include both noncoding RNAs, e.g., miRNAs and the recently identified MALAT1/MEN β lincRNA family; and cis-regulatory structures, e.g., iron-responsive elements. We also identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one involving six long hairpins in the 3'-UTR of MAT2A, a key metabolic gene that produces the primary human methyl donor S-adenosylmethionine; the other involving a tRNA-like structure in the intron of the tRNA maturation gene POP1. We experimentally validate the predicted MAT2A structures. Finally, we identify potential new regulatory networks, including large families of short hairpins enriched in immunity-related genes, e.g., TNF, FOS, and CTLA4, which include known transcript destabilizing elements. Our findings exemplify the diversity of post-transcriptional regulation and provide a resource for further characterization of new regulatory mechanisms and families of noncoding RNAs.
Genetic Diversity and Societally Important Disparities

PubMed Central

Rosenberg, Noah A.; Kang, Jonathan T. L.

2015-01-01

The magnitude of genetic diversity within human populations varies in a way that reflects the sequence of migrations by which people spread throughout the world. Beyond its use in human evolutionary genetics, worldwide variation in genetic diversity sometimes can interact with social processes to produce differences among populations in their relationship to modern societal problems. We review the consequences of genetic diversity differences in the settings of familial identification in forensic genetic testing, match probabilities in bone marrow transplantation, and representation in genome-wide association studies of disease. In each of these three cases, the contribution of genetic diversity to social differences follows from population-genetic principles. For a fourth setting that is not similarly grounded, we reanalyze with expanded genetic data a report that genetic diversity differences influence global patterns of human economic development, finding no support for the claim. The four examples describe a limit to the importance of genetic diversity for explaining societal differences while illustrating a distinction that certain biologically based scenarios do require consideration of genetic diversity for solving problems to which populations have been differentially predisposed by the unique history of human migrations. PMID:26354973
Exploiting rRNA operon copy number to investigate bacterial reproductive strategies.

PubMed

Roller, Benjamin R K; Stoddard, Steven F; Schmidt, Thomas M

2016-09-12

The potential for rapid reproduction is a hallmark of microbial life, but microbes in nature must also survive and compete when growth is constrained by resource availability. Successful reproduction requires different strategies when resources are scarce and when they are abundant, but a systematic framework for predicting these reproductive strategies in bacteria has not been available. Here, we show that the number of ribosomal RNA operons (rrn) in bacterial genomes predicts two important components of reproduction, growth rate and growth efficiency, which are favoured under contrasting regimes of resource availability. We find that the maximum reproductive rate of bacteria doubles with a doubling of rrn copy number, and the efficiency of carbon use is inversely related to maximal growth rate and rrn copy number. We also identify a feasible explanation for these patterns: the rate and yield of protein synthesis mirror the overall pattern in maximum growth rate and growth efficiency. Furthermore, comparative analysis of genomes from 1,167 bacterial species reveals that rrn copy number predicts traits associated with resource availability, including chemotaxis and genome streamlining. Genome-wide patterns of orthologous gene content covary with rrn copy number, suggesting convergent evolution in response to resource availability. Our findings imply that basic cellular processes adapt in contrasting ways to long-term differences in resource availability. They also establish a basis for predicting changes in bacterial community composition in response to resource perturbations using rrn copy number measurements or inferences.
GWATCH: a web platform for automated gene association discovery analysis.

PubMed

Svitin, Anton; Malov, Sergey; Cherkasov, Nikolay; Geerts, Paul; Rotkevich, Mikhail; Dobrynin, Pavel; Shevchenko, Andrey; Guan, Li; Troyer, Jennifer; Hendrickson, Sher; Dilks, Holli Hutcheson; Oleksyk, Taras K; Donfield, Sharyne; Gomperts, Edward; Jabs, Douglas A; Sezgin, Efe; Van Natta, Mark; Harrigan, P Richard; Brumme, Zabrina L; O'Brien, Stephen J

2014-01-01

As genome-wide sequence analyses for complex human disease determinants are expanding, it is increasingly necessary to develop strategies to promote discovery and validation of potential disease-gene associations. Here we present a dynamic web-based platform - GWATCH - that automates and facilitates four steps in genetic epidemiological discovery: 1) Rapid gene association search and discovery analysis of large genome-wide datasets; 2) Expanded visual display of gene associations for genome-wide variants (SNPs, indels, CNVs), including Manhattan plots, 2D and 3D snapshots of any gene region, and a dynamic genome browser illustrating gene association chromosomal regions; 3) Real-time validation/replication of candidate or putative genes suggested from other sources, limiting Bonferroni genome-wide association study (GWAS) penalties; 4) Open data release and sharing by eliminating privacy constraints (The National Human Genome Research Institute (NHGRI) Institutional Review Board (IRB), informed consent, The Health Insurance Portability and Accountability Act (HIPAA) of 1996 etc.) on unabridged results, which allows for open access comparative and meta-analysis. GWATCH is suitable for both GWAS and whole genome sequence association datasets. We illustrate the utility of GWATCH with three large genome-wide association studies for HIV-AIDS resistance genes screened in large multicenter cohorts; however, association datasets from any study can be uploaded and analyzed by GWATCH.
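The Bonferroni penalty mentioned in step 3 can be illustrated with a short calculation. This is a minimal sketch, not GWATCH code: the function name and the numbers of tests (one million genome-wide variants versus 50 candidate-region variants) are hypothetical values chosen only to show why restricting a test to candidate genes relaxes the significance threshold.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical numbers of tests, for illustration only.
genome_wide = bonferroni_threshold(0.05, 1_000_000)  # scan of one million variants
candidate = bonferroni_threshold(0.05, 50)           # replication in one candidate region

print(f"genome-wide threshold: {genome_wide:.1e}")
print(f"candidate-region threshold: {candidate:.1e}")
```

Because the candidate-region test corrects for far fewer comparisons, an association can replicate there at a P-value that would not survive a genome-wide correction.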
Integrative and conjugative elements and their hosts: composition, distribution and organization

PubMed Central

Touchon, Marie; Rocha, Eduardo P. C.

2017-01-01

Conjugation of single-stranded DNA drives horizontal gene transfer between bacteria and was widely studied in conjugative plasmids. The organization and function of integrative and conjugative elements (ICE), even if they are more abundant, was only studied in a few model systems. Comparative genomics of ICE has been precluded by the difficulty in finding and delimiting these elements. Here, we present the results of a method that circumvents these problems by requiring only the identification of the conjugation genes and the species’ pan-genome. We delimited 200 ICEs and this allowed the first large-scale characterization of these elements. We quantified the presence in ICEs of a wide set of functions associated with the biology of mobile genetic elements, including some that are typically associated with plasmids, such as partition and replication. Protein sequence similarity networks and phylogenetic analyses revealed that ICEs are structured in functional modules. Integrases and conjugation systems have different evolutionary histories, even if the gene repertoires of ICEs can be grouped according to conjugation type. Our characterization of the composition and organization of ICEs paves the way for future functional and evolutionary analyses of their cargo genes, composed of a majority of unknown function genes. PMID:28911112
Genome wide approaches to identify protein-DNA interactions.

PubMed

Ma, Tao; Ye, Zhenqing; Wang, Liguo

2018-05-29

Transcription factors are DNA-binding proteins that play key roles in many fundamental biological processes. Unraveling their interactions with DNA is essential to identify their target genes and understand the regulatory network. Genome-wide identification of their binding sites became feasible thanks to recent progress in experimental and computational approaches. ChIP-chip, ChIP-seq, and ChIP-exo are three widely used techniques to demarcate genome-wide transcription factor binding sites. This review aims to provide an overview of these three techniques including their experimental procedures, computational approaches, and popular analytic tools. ChIP-chip, ChIP-seq, and ChIP-exo have been the major techniques to study genome-wide in vivo protein-DNA interaction. Due to the rapid development of next-generation sequencing technology, array-based ChIP-chip is deprecated and ChIP-seq has become the most widely used technique to identify transcription factor binding sites genome-wide. The newly developed ChIP-exo further improves the spatial resolution to a single nucleotide. Numerous tools have been developed to analyze ChIP-chip, ChIP-seq and ChIP-exo data. However, different programs may employ different mechanisms or underlying algorithms, so each inherently carries its own set of statistical assumptions and biases. Choosing the most appropriate analytic program for a given experiment therefore requires careful consideration. Moreover, most programs have only a command-line interface, so their installation and usage require basic computational expertise in Unix/Linux. Copyright © Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Genome-Wide Association Study Identifies Loci for Salt Tolerance during Germination in Autotetraploid Alfalfa (Medicago sativa L.) using Genotyping by Sequencing

USDA-ARS's Scientific Manuscript database

In this study, we used a diverse panel of alfalfa accessions to identify molecular markers associated with salt tolerance during germination by genome-wide association (GWA) mapping and genotyping-by-sequencing (GBS). Three levels of salt treatments were applied during seed germination. Phenotypic...
Genotyping-by-sequencing in three octoploid cultivated strawberry families

USDA-ARS's Scientific Manuscript database

With the goal of evaluating genotyping-by-sequencing (GBS) in a species with a complex octoploid genome, GBS was used to survey genome-wide single-nucleotide polymorphisms (SNPs) in three biparental strawberry (Fragaria ×ananassa) populations. GBS sequence data were aligned to the F. vesca ‘Fvb’ ref...
Genome-wide pathway-based association analysis identifies risk pathways associated with Parkinson's disease.

PubMed

Zhang, Mingming; Mu, Hongbo; Shang, Zhenwei; Kang, Kai; Lv, Hongchao; Duan, Lian; Li, Jin; Chen, Xinren; Teng, Yanbo; Jiang, Yongshuai; Zhang, Ruijie

2017-01-06

Parkinson's disease (PD) is the second most common neurodegenerative disease. It is generally believed that it is influenced by both genetic and environmental factors, but the precise pathogenesis of PD is unknown to date. In this study, we performed a pathway analysis based on genome-wide association study (GWAS) to detect risk pathways of PD in three GWAS datasets. We first mapped all SNP markers to autosomal genes in each GWAS dataset. Then, we evaluated gene risk values using the minimum P-value of the tagSNPs. We took a pathway as a unit to identify the risk pathways based on the cumulative risks of the genes in the pathway. Finally, we combined the analysis results of the three datasets to detect the high-risk pathways associated with PD. We found five pathways that were shared by all three datasets. In addition, five pathways were shared by two datasets. Most of these pathways are associated with the nervous system. Five pathways had been reported to be PD-related pathways in the previous literature. Our findings also implied that there was a close association between immune response and PD. Continued investigation of these pathways will further help us explain the pathogenesis of PD. Copyright © 2016. Published by Elsevier Ltd.
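The three analysis steps in this abstract (map SNPs to genes, score each gene by its minimum P-value, accumulate gene risks per pathway) can be sketched as follows. This is an illustrative sketch only: the abstract does not give the scoring formula, so summing -log10(minimum P) over member genes is an assumption, and all function names, SNP identifiers, gene names and pathway names are hypothetical.

```python
import math


def gene_min_p(snp_pvalues, snp_to_gene):
    """Assign each gene the smallest P-value among the SNPs mapped to it."""
    best = {}
    for snp, p in snp_pvalues.items():
        gene = snp_to_gene.get(snp)
        if gene is None:
            continue  # SNP is not mapped to any gene
        best[gene] = min(p, best.get(gene, 1.0))
    return best


def pathway_scores(gene_p, pathways):
    """Assumed cumulative risk: sum of -log10(min P) over a pathway's scored genes."""
    return {
        name: sum(-math.log10(gene_p[g]) for g in genes if g in gene_p)
        for name, genes in pathways.items()
    }


def shared_top_pathways(score_tables, top_n):
    """Pathways that rank within the top_n of every dataset."""
    tops = [set(sorted(s, key=s.get, reverse=True)[:top_n]) for s in score_tables]
    return set.intersection(*tops)


# Toy data (hypothetical identifiers and P-values).
snp_p = {"rs1": 1e-4, "rs2": 0.03, "rs3": 0.5, "rs4": 1e-2}
snp_gene = {"rs1": "GENE_A", "rs2": "GENE_A", "rs3": "GENE_B", "rs4": "GENE_C"}
pathways = {"pathway_1": ["GENE_A", "GENE_C"], "pathway_2": ["GENE_B"]}

genes = gene_min_p(snp_p, snp_gene)
scores = pathway_scores(genes, pathways)
print(scores)
```

A plain sum favours large pathways, so a real analysis would typically add a normalisation or permutation step; none is shown here because the abstract does not describe one.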
Genetic structure, divergence and admixture of Han Chinese, Japanese and Korean populations.

PubMed

Wang, Yuchen; Lu, Dongsheng; Chung, Yeun-Jun; Xu, Shuhua

2018-01-01

Han Chinese, Japanese and Korean, the three major ethnic groups of East Asia, share many similarities in appearance, language and culture etc., but their genetic relationships, divergence times and subsequent genetic exchanges have not been well studied. We conducted a genome-wide study and evaluated the population structure of 182 Han Chinese, 90 Japanese and 100 Korean individuals, together with the data of 630 individuals representing 8 populations worldwide. Our analyses revealed that Han Chinese, Japanese and Korean populations have distinct genetic makeup and can be well distinguished based on either the genome-wide data or a panel of ancestry informative markers (AIMs). Their genetic structure corresponds well to their geographical distributions, indicating geographical isolation played a critical role in driving population differentiation in East Asia. The most recent common ancestor of the three populations was dated back to 3000 ~ 3600 years ago. Our analyses also revealed substantial admixture within the three populations which occurred subsequent to initial splits, and distinct gene introgression from surrounding populations, of which the northern ancestral component is dominant. These estimations and findings facilitate the understanding of population history and mechanisms of human genetic diversity in East Asia, and have implications for both evolutionary and medical studies.
Detection and sequencing of West Nile virus RNA from human urine and serum samples during the 2014 seasonal period.

PubMed

Nagy, Anna; Bán, Enikő; Nagy, Orsolya; Ferenczi, Emőke; Farkas, Ágnes; Bányai, Krisztián; Farkas, Szilvia; Takács, Mária

2016-07-01

West Nile virus, a widely distributed mosquito-borne flavivirus, is responsible for numerous animal and human infections in Europe, Africa and the Americas. In Hungary, the average number of human infections falls between 10 and 20 cases each year. The severity of clinically manifesting infections varies widely from the milder form of West Nile fever to West Nile neuroinvasive disease (WNND). In routine laboratory diagnosis of human West Nile virus infections, serological methods are mainly applied due to the limited duration of viremia. However, recent studies suggest that detection of West Nile virus RNA in urine samples may be useful as a molecular diagnostic test for these infections. The Hungarian National Reference Laboratory for Viral Zoonoses serologically confirmed eleven acute human infections during the 2014 seasonal period. In three patients with neurological symptoms, viral RNA was detected from both urine and serum specimens, albeit for a longer period and in higher copy numbers with urine. Phylogenetic analysis of the NS3 genomic region of three strains and the complete genome of one selected strain demonstrated that all three patients had lineage-2 West Nile virus infections. Our findings reaffirm the utility of viral RNA detection in urine as a molecular diagnostic procedure for diagnosis of West Nile virus infections.
Informed consent in direct-to-consumer personal genome testing: the outline of a model between specific and generic consent.

PubMed

Bunnik, Eline M; Janssens, A Cecile J W; Schermer, Maartje H N

2014-09-01

Broad genome-wide testing is increasingly finding its way to the public through the online direct-to-consumer marketing of so-called personal genome tests. Personal genome tests estimate genetic susceptibilities to multiple diseases and other phenotypic traits simultaneously. Providers commonly make use of Terms of Service agreements rather than informed consent procedures. However, to protect consumers from the potential physical, psychological and social harms associated with personal genome testing and to promote autonomous decision-making with regard to the testing offer, we argue that current practices of information provision are insufficient and that there is a place--and a need--for informed consent in personal genome testing, also when it is offered commercially. The increasing quantity, complexity and diversity of most testing offers, however, pose challenges for information provision and informed consent. Both specific and generic models for informed consent fail to meet its moral aims when applied to personal genome testing. Consumers should be enabled to know the limitations, risks and implications of personal genome testing and should be given control over the genetic information they do or do not wish to obtain. We present the outline of a new model for informed consent which can meet both the norm of providing sufficient information and the norm of providing understandable information. The model can be used for personal genome testing, but will also be applicable to other, future forms of broad genetic testing or screening in commercial and clinical settings. © 2012 John Wiley & Sons Ltd.
Genomic signatures of near-extinction and rebirth of the crested ibis and other endangered bird species.

PubMed

Li, Shengbin; Li, Bo; Cheng, Cheng; Xiong, Zijun; Liu, Qingbo; Lai, Jianghua; Carey, Hannah V; Zhang, Qiong; Zheng, Haibo; Wei, Shuguang; Zhang, Hongbo; Chang, Liao; Liu, Shiping; Zhang, Shanxin; Yu, Bing; Zeng, Xiaofan; Hou, Yong; Nie, Wenhui; Guo, Youmin; Chen, Teng; Han, Jiuqiang; Wang, Jian; Wang, Jun; Chen, Chen; Liu, Jiankang; Stambrook, Peter J; Xu, Ming; Zhang, Guojie; Gilbert, M Thomas P; Yang, Huanming; Jarvis, Erich D; Yu, Jun; Yan, Jianqun

2014-01-01

Nearly one-quarter of all avian species is either threatened or nearly threatened. Of these, 73 species are currently being rescued from going extinct in wildlife sanctuaries. One of the previously most critically-endangered is the crested ibis, Nipponia nippon. Once widespread across North-East Asia, by 1981 only seven individuals from two breeding pairs remained in the wild. The recovering crested ibis populations thus provide an excellent example for conservation genomics since every individual bird has been recruited for genomic and demographic studies. Using high-quality genome sequences of multiple crested ibis individuals, its thriving co-habitant, the little egret, Egretta garzetta, and the recently sequenced genomes of 41 other avian species that are under various degrees of survival threats, including the bald eagle, we carry out comparative analyses for genomic signatures of near extinction events in association with environmental and behavioral attributes of species. We confirm that both loss of genetic diversity and enrichment of deleterious mutations of protein-coding genes contribute to the major genetic defects of the endangered species. We further identify that genetic inbreeding and loss-of-function genes in the crested ibis may all constitute genetic susceptibility to other factors including long-term climate change, over-hunting, and agrochemical overuse. We also establish a genome-wide DNA identification platform for molecular breeding and conservation practices, to facilitate sustainable recovery of endangered species. These findings demonstrate common genomic signatures of population decline across avian species and pave a way for further effort in saving endangered species and enhancing conservation genomic efforts.
Genome-Wide Association Study and Linkage Analysis of the Healthy Aging Index

PubMed Central

Minster, Ryan L.; Sanders, Jason L.; Singh, Jatinder; Kammerer, Candace M.; Barmada, M. Michael; Matteini, Amy M.; Zhang, Qunyuan; Wojczynski, Mary K.; Daw, E. Warwick; Brody, Jennifer A.; Arnold, Alice M.; Lunetta, Kathryn L.; Murabito, Joanne M.; Christensen, Kaare; Perls, Thomas T.; Province, Michael A.

2015-01-01

Background. The Healthy Aging Index (HAI) is a tool for measuring the extent of health and disease across multiple systems. Methods. We conducted a genome-wide association study and a genome-wide linkage analysis to map quantitative trait loci associated with the HAI and a modified HAI weighted for mortality risk in 3,140 individuals selected for familial longevity from the Long Life Family Study. The genome-wide association study used the Long Life Family Study as the discovery cohort and individuals from the Cardiovascular Health Study and the Framingham Heart Study as replication cohorts. Results. There were no genome-wide significant findings from the genome-wide association study; however, several single-nucleotide polymorphisms near ZNF704 on chromosome 8q21.13 were suggestively associated with the HAI in the Long Life Family Study (p < 10⁻⁶) and nominally replicated in the Cardiovascular Health Study and Framingham Heart Study. Linkage results revealed significant evidence (log-odds score = 3.36) for a quantitative trait locus for mortality-optimized HAI in women on chromosome 9p24–p23. However, results of fine-mapping studies did not implicate any specific candidate genes within this region of interest. Conclusions. ZNF704 may be a potential candidate gene for studies of the genetic underpinnings of longevity. PMID:25758594
A genome-wide 3C-method for characterizing the three-dimensional architectures of genomes.

PubMed

Duan, Zhijun; Andronescu, Mirela; Schutz, Kevin; Lee, Choli; Shendure, Jay; Fields, Stanley; Noble, William S; Anthony Blau, C

2012-11-01

Accumulating evidence demonstrates that the three-dimensional (3D) organization of chromosomes within the eukaryotic nucleus reflects and influences genomic activities, including transcription, DNA replication, recombination and DNA repair. In order to uncover structure-function relationships, it is necessary first to understand the principles underlying the folding and the 3D arrangement of chromosomes. Chromosome conformation capture (3C) provides a powerful tool for detecting interactions within and between chromosomes. A high throughput derivative of 3C, chromosome conformation capture on chip (4C), executes a genome-wide interrogation of interaction partners for a given locus. We recently developed a new method, a derivative of 3C and 4C, which, similar to Hi-C, is capable of comprehensively identifying long-range chromosome interactions throughout a genome in an unbiased fashion. Hence, our method can be applied to decipher the 3D architectures of genomes. Here, we provide a detailed protocol for this method. Published by Elsevier Inc.
The genome of the generalist plant pathogenic fungus Fusarium avenaceum is enriched with genes involved in redox, signaling and secondary metabolism

USDA-ARS's Scientific Manuscript database

Fusarium avenaceum is a fungus commonly isolated from soil and with a wide range of host plants. We present here three genome sequences of F. avenaceum, one isolated from barley in Finland and two from spring and winter wheat in Canada. The physical sizes of the three genomes range from 41.6-43.2 MB...

Genome scan study of prostate cancer in Arabs: identification of three genomic regions with multiple prostate cancer susceptibility loci in Tunisians.

PubMed

Shan, Jingxuan; Al-Rumaihi, Khalid; Rabah, Danny; Al-Bozom, Issam; Kizhakayil, Dhanya; Farhat, Karim; Al-Said, Sami; Kfoury, Hala; Dsouza, Shoba P; Rowe, Jillian; Khalak, Hanif G; Jafri, Shahzad; Aigha, Idil I; Chouchane, Lotfi

2013-05-13

Large databases focused on genetic susceptibility to prostate cancer have been accumulated from population studies of different ancestries, including Europeans and African-Americans. Arab populations, however, have been only rarely studied. Using Affymetrix Genome-Wide Human SNP Array 6, we conducted a genome-wide association study (GWAS) in which 534,781 single nucleotide polymorphisms (SNPs) were genotyped in 221 Tunisians (90 prostate cancer patients and 131 age-matched healthy controls). TaqMan SNP Genotyping Assays on 11 prostate cancer associated SNPs were performed in a distinct cohort of 337 individuals of Arab ancestry living in Qatar and Saudi Arabia (155 prostate cancer patients and 182 age-matched controls). In-silico expression quantitative trait locus (eQTL) analysis along with mRNA quantification of nearby genes was performed to identify loci potentially cis-regulated by the identified SNPs. Three chromosomal regions, encompassing 14 SNPs, are significantly associated with prostate cancer risk in the Tunisian population (P = 1 × 10⁻⁴ to P = 1 × 10⁻⁵). In addition to SNPs located on chromosome 17q21, previously found associated with prostate cancer in Western populations, two novel chromosomal regions are revealed on chromosome 9p24 and 22q13. eQTL analysis and mRNA quantification indicate that the prostate cancer associated SNPs of chromosome 17 could enhance the expression of the STAT5B gene. Our findings, identifying novel GWAS prostate cancer susceptibility loci, indicate that prostate cancer genetic risk factors could be ethnic specific.
A Genome-Wide Association Study for Regulators of Micronucleus Formation in Mice.

PubMed

McIntyre, Rebecca E; Nicod, Jérôme; Robles-Espinoza, Carla Daniela; Maciejowski, John; Cai, Na; Hill, Jennifer; Verstraten, Ruth; Iyer, Vivek; Rust, Alistair G; Balmus, Gabriel; Mott, Richard; Flint, Jonathan; Adams, David J

2016-08-09

In mammals the regulation of genomic instability plays a key role in tumor suppression and also controls genome plasticity, which is important for recombination during the processes of immunity and meiosis. Most studies to identify regulators of genomic instability have been performed in cells in culture or in systems that report on gross rearrangements of the genome, yet subtle differences in the level of genomic instability can contribute to whole organism phenotypes such as tumor predisposition. Here we performed a genome-wide association study in a population of 1379 outbred Crl:CFW(SW)-US_P08 mice to dissect the genetic landscape of micronucleus formation, a biomarker of chromosomal breaks, whole chromosome loss, and extranuclear DNA. Variation in micronucleus levels is a complex trait with a genome-wide heritability of 53.1%. We identify seven loci influencing micronucleus formation (false discovery rate <5%), and define candidate genes at each locus. Intriguingly at several loci we find evidence for sexual dimorphism in micronucleus formation, with a locus on chromosome 11 being specific to males. Copyright © 2016 McIntyre et al.
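The abstract reports loci at a false discovery rate below 5% without naming the procedure used. As one common way to control the false discovery rate, the Benjamini-Hochberg step-up procedure can be sketched as below. This is an assumption chosen for illustration, not the authors' stated method, and the P-values are made up.

```python
def benjamini_hochberg(pvalues, fdr=0.05):
    """Return a list of booleans marking which tests are rejected at the given FDR."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P-value
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Largest rank k whose P-value satisfies p_(k) <= (k / m) * fdr
        if pvalues[idx] <= rank / m * fdr:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Made-up P-values for ten hypothetical loci.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(pvals, fdr=0.05)
print(significant)
```

In this toy input only the two smallest P-values pass, because the third (0.039) exceeds its rank-specific threshold of 0.015 and no larger rank meets its own threshold.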
A genome-wide signature of positive selection in ancient and recent invasive expansions of the honey bee Apis mellifera

PubMed Central

Zayed, Amro; Whitfield, Charles W.

2008-01-01

Apis mellifera originated in Africa and extended its range into Eurasia in two or more ancient expansions. In 1956, honey bees of African origin were introduced into South America, their descendents admixing with previously introduced European bees, giving rise to the highly invasive and economically devastating “Africanized” honey bee. Here we ask whether the honey bee's out-of-Africa expansions, both ancient and recent (invasive), were associated with a genome-wide signature of positive selection, detected by contrasting genetic differentiation estimates (FST) between coding and noncoding SNPs. In native populations, SNPs in protein-coding regions had significantly higher FST estimates than those in noncoding regions, indicating adaptive evolution in the genome driven by positive selection. This signal of selection was associated with the expansion of honey bees from Africa into Western and Northern Europe, perhaps reflecting adaptation to temperate environments. We estimate that positive selection acted on a minimum of 852–1,371 genes or ≈10% of the bee's coding genome. We also detected positive selection associated with the invasion of African-derived honey bees in the New World. We found that introgression of European-derived alleles into Africanized bees was significantly greater for coding than noncoding regions. Our findings demonstrate that Africanized bees exploited the genetic diversity present from preexisting introductions in an adaptive way. Finally, we found a significant negative correlation between FST estimates and the local GC content surrounding coding SNPs, suggesting that AT-rich genes play an important role in adaptive evolution in the honey bee. PMID:18299560
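The contrast between coding and noncoding FST values can be made concrete with a small sketch. The abstract does not state which FST estimator was used, so the version below is an assumption: the basic heterozygosity-based definition FST = (HT - HS) / HT for one biallelic SNP in two equally weighted populations, with no sample-size correction. The allele frequencies are invented.

```python
def fst_two_pops(p1, p2):
    """FST for one biallelic SNP from allele frequencies in two equally weighted populations."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity in the pooled population
    if h_t == 0:
        return 0.0  # SNP is monomorphic overall, so there is no differentiation to measure
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population heterozygosity
    return (h_t - h_s) / h_t


def mean(values):
    return sum(values) / len(values)


# Invented allele-frequency pairs (population 1, population 2).
coding = [(0.9, 0.1), (0.8, 0.3)]
noncoding = [(0.5, 0.45), (0.6, 0.5)]

mean_coding = mean([fst_two_pops(a, b) for a, b in coding])
mean_noncoding = mean([fst_two_pops(a, b) for a, b in noncoding])
print(mean_coding, mean_noncoding)
```

A higher mean FST in the coding set than in the noncoding set is the kind of pattern the abstract interprets as a signature of positive selection; a real analysis would also need a significance test and an estimator that corrects for sample size.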
A genome-wide signature of positive selection in ancient and recent invasive expansions of the honey bee Apis mellifera.

PubMed

Zayed, Amro; Whitfield, Charles W

2008-03-04

Apis mellifera originated in Africa and extended its range into Eurasia in two or more ancient expansions. In 1956, honey bees of African origin were introduced into South America, their descendents admixing with previously introduced European bees, giving rise to the highly invasive and economically devastating "Africanized" honey bee. Here we ask whether the honey bee's out-of-Africa expansions, both ancient and recent (invasive), were associated with a genome-wide signature of positive selection, detected by contrasting genetic differentiation estimates (F(ST)) between coding and noncoding SNPs. In native populations, SNPs in protein-coding regions had significantly higher F(ST) estimates than those in noncoding regions, indicating adaptive evolution in the genome driven by positive selection. This signal of selection was associated with the expansion of honey bees from Africa into Western and Northern Europe, perhaps reflecting adaptation to temperate environments. We estimate that positive selection acted on a minimum of 852-1,371 genes or approximately 10% of the bee's coding genome. We also detected positive selection associated with the invasion of African-derived honey bees in the New World. We found that introgression of European-derived alleles into Africanized bees was significantly greater for coding than noncoding regions. Our findings demonstrate that Africanized bees exploited the genetic diversity present from preexisting introductions in an adaptive way. Finally, we found a significant negative correlation between F(ST) estimates and the local GC content surrounding coding SNPs, suggesting that AT-rich genes play an important role in adaptive evolution in the honey bee.
VirtualPlant: A Software Platform to Support Systems Biology Research

PubMed Central

Katari, Manpreet S.; Nowicki, Steve D.; Aceituno, Felipe F.; Nero, Damion; Kelfer, Jonathan; Thompson, Lee Parnell; Cabello, Juan M.; Davidson, Rebecca S.; Goldberg, Arthur P.; Shasha, Dennis E.; Coruzzi, Gloria M.; Gutiérrez, Rodrigo A.

2010-01-01

Data generation is no longer the limiting factor in advancing biological research. In addition, data integration, analysis, and interpretation have become key bottlenecks and challenges that biologists conducting genomic research face daily. To enable biologists to derive testable hypotheses from the increasing amount of genomic data, we have developed the VirtualPlant software platform. VirtualPlant enables scientists to visualize, integrate, and analyze genomic data from a systems biology perspective. VirtualPlant integrates genome-wide data concerning the known and predicted relationships among genes, proteins, and molecules, as well as genome-scale experimental measurements. VirtualPlant also provides visualization techniques that render multivariate information in visual formats that facilitate the extraction of biological concepts. Importantly, VirtualPlant helps biologists who are not trained in computer science to mine lists of genes, microarray experiments, and gene networks to address questions in plant biology, such as: What are the molecular mechanisms by which internal or external perturbations affect processes controlling growth and development? We illustrate the use of VirtualPlant with three case studies, ranging from querying a gene of interest to the identification of gene networks and regulatory hubs that control seed development. Whereas the VirtualPlant software was developed to mine Arabidopsis (Arabidopsis thaliana) genomic data, its data structures, algorithms, and visualization tools are designed in a species-independent way. VirtualPlant is freely available at www.virtualplant.org. PMID:20007449
Integrative genomic profiling reveals conserved genetic mechanisms for tumorigenesis in common entities of non-Hodgkin's lymphoma.

PubMed

Green, Michael R; Aya-Bonilla, Carlos; Gandhi, Maher K; Lea, Rod A; Wellwood, Jeremy; Wood, Peter; Marlton, Paula; Griffiths, Lyn R

2011-05-01

Recent developments in genomic technologies have resulted in increased understanding of pathogenic mechanisms and emphasized the importance of central survival pathways. Here, we use a novel bioinformatic based integrative genomic profiling approach to elucidate conserved mechanisms of lymphomagenesis in the three commonest non-Hodgkin's lymphoma (NHL) entities: diffuse large B-cell lymphoma, follicular lymphoma, and B-cell chronic lymphocytic leukemia. By integrating genome-wide DNA copy number analysis and transcriptome profiling of tumor cohorts, we identified genetic lesions present in each entity and highlighted their likely target genes. This revealed a significant enrichment of components of both the apoptosis pathway and the mitogen activated protein kinase pathway, including amplification of the MAP3K12 locus in all three entities, within the set of genes targeted by genetic alterations in these diseases. Furthermore, amplification of 12p13.33 was identified in all three entities and found to target the FOXM1 oncogene. Amplification of FOXM1 was subsequently found to be associated with an increased MYC oncogenic signaling signature, and siRNA-mediated knock-down of FOXM1 resulted in decreased MYC expression and induced G2 arrest. Together, these findings underscore genetic alteration of the MAPK and apoptosis pathways, and genetic amplification of FOXM1 as conserved mechanisms of lymphomagenesis in common NHL entities. Integrative genomic profiling identifies common central survival mechanisms and highlights them as attractive targets for directed therapy. 2011 Wiley-Liss, Inc.
Global Implementation of Genomic Medicine: We Are Not Alone

PubMed Central

Manolio, Teri A.; Abramowicz, Marc; Al-Mulla, Fahd; Anderson, Warwick; Balling, Rudi; Berger, Adam C.; Bleyl, Steven; Chakravarti, Aravinda; Chantratita, Wasun; Chisholm, Rex L.; Dissanayake, Vajira H. W.; Dunn, Michael; Dzau, Victor J.; Han, Bok-Ghee; Hubbard, Tim; Kolbe, Anne; Korf, Bruce; Kubo, Michiaki; Lasko, Paul; Leego, Erkki; Mahasirimongkol, Surakameth; Majumdar, Partha P.; Matthijs, Gert; McLeod, Howard L.; Metspalu, Andres; Meulien, Pierre; Miyano, Satoru; Naparstek, Yaakov; O’Rourke, P. Pearl; Patrinos, George P.; Rehm, Heidi L.; Relling, Mary V.; Rennert, Gad; Rodriguez, Laura Lyman; Roden, Dan M.; Shuldiner, Alan R.; Sinha, Sukdev; Tan, Patrick; Ulfendahl, Mats; Ward, Robyn; Williams, Marc S.; Wong, John E.L.; Green, Eric D.; Ginsburg, Geoffrey S.

2016-01-01

Advances in high-throughput genomic technologies coupled with a growing number of genomic results potentially useful in clinical care have led to ground-breaking genomic medicine implementation programs in various nations. Many of these innovative programs capitalize on unique local capabilities arising from the structure of their health care systems or their cultural or political milieu, as well as from unusual burdens of disease or risk alleles. Many such programs are being conducted in relative isolation and might benefit from sharing of approaches and lessons learned in other nations. The National Human Genome Research Institute recently brought together 25 of these groups from around the world to describe and compare projects, examine the current state of implementation and desired near-term capabilities, and identify opportunities for collaboration to promote the responsible implementation of genomic medicine. The wide variety of nascent programs in diverse settings demonstrates that implementation of genomic medicine is expanding globally in varied and highly innovative ways. Opportunities for collaboration abound in the areas of evidence generation, health information technology, education, workforce development, pharmacogenomics, and policy and regulatory issues. Several international organizations that are already facilitating effective research collaborations should engage to ensure implementation proceeds collaboratively without potentially wasteful duplication. Efforts to coalesce these groups around concrete but compelling signature projects, such as global eradication of genetically-mediated drug reactions or developing a truly global genomic variant data resource across a wide number of ethnicities, would accelerate appropriate implementation of genomics to improve clinical care world-wide. PMID:26041702
Genomic selection and complex trait prediction using a fast EM algorithm applied to genome-wide markers

PubMed Central

2010-01-01

Background The information provided by dense genome-wide markers using high throughput technology is of considerable potential in human disease studies and livestock breeding programs. Genome-wide association studies relate individual single nucleotide polymorphisms (SNP) from dense SNP panels to individual measurements of complex traits, with the underlying assumption being that any association is caused by linkage disequilibrium (LD) between SNP and quantitative trait loci (QTL) affecting the trait. Often SNP are in genomic regions of no trait variation. Whole genome Bayesian models are an effective way of incorporating this and other important prior information into modelling. However a full Bayesian analysis is often not feasible due to the large computational time involved. Results This article proposes an expectation-maximization (EM) algorithm called emBayesB which allows only a proportion of SNP to be in LD with QTL and incorporates prior information about the distribution of SNP effects. The posterior probability of being in LD with at least one QTL is calculated for each SNP along with estimates of the hyperparameters for the mixture prior. A simulated example of genomic selection from an international workshop is used to demonstrate the features of the EM algorithm. The accuracy of prediction is comparable to a full Bayesian analysis but the EM algorithm is considerably faster. The EM algorithm was accurate in locating QTL which explained more than 1% of the total genetic variation. A computational algorithm for very large SNP panels is described. Conclusions emBayesB is a fast and accurate EM algorithm for implementing genomic selection and predicting complex traits by mapping QTL in genome-wide dense SNP marker data. Its accuracy is similar to Bayesian methods but it takes only a fraction of the time. PMID:20969788
CRISPR-cas System as a Genome Engineering Platform: Applications in Biomedicine and Biotechnology.

PubMed

Hashemi, Atieh

2018-01-01

Genome editing mediated by Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and its associated proteins (Cas) has recently emerged as an efficient, rapid and site-specific tool for modifying endogenous genes in biomedically important cell types and whole organisms. It has become a predictable and precise method of choice for genome engineering, with the target specified by a 20-nt targeting sequence within the guide RNA. This review first describes the biology of the CRISPR system. It then reviews applications of CRISPR-Cas9, including efficient generation of a wide variety of biomedically important cellular and animal models, epigenome modification, genome-wide screens, gene therapy, labelling of specific genomic loci in living cells, metabolic engineering of yeast and bacteria, and regulation of endogenous gene expression by an altered version of the system. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Optimized distributed systems achieve significant performance improvement on sorted merging of massive VCF files

PubMed Central

Gao, Jingjing; Jin, Peng; Eng, Celeste; Burchard, Esteban G; Beaty, Terri H; Ruczinski, Ingo; Mathias, Rasika A; Barnes, Kathleen; Wang, Fusheng

2018-01-01

Abstract Background Sorted merging of genomic data is a common data operation necessary in many sequencing-based studies. It involves sorting and merging genomic data from different subjects by their genomic locations. In particular, merging a large number of variant call format (VCF) files is frequently required in large-scale whole-genome sequencing or whole-exome sequencing projects. Traditional single-machine based methods become increasingly inefficient when processing large numbers of files due to the excessive computation time and Input/Output bottleneck. Distributed systems and more recent cloud-based systems offer an attractive solution. However, carefully designed and optimized workflow patterns and execution plans (schemas) are required to take full advantage of the increased computing power while overcoming bottlenecks to achieve high performance. Findings In this study, we custom-design optimized schemas for three Apache big data platforms, Hadoop (MapReduce), HBase, and Spark, to perform sorted merging of a large number of VCF files. These schemas all adopt the divide-and-conquer strategy to split the merging job into sequential phases/stages consisting of subtasks that are conquered in an ordered, parallel, and bottleneck-free way. In two illustrating examples, we test the performance of our schemas on merging multiple VCF files into either a single TPED or a single VCF file, which are benchmarked with the traditional single/parallel multiway-merge methods, message passing interface (MPI)–based high-performance computing (HPC) implementation, and the popular VCFTools. Conclusions Our experiments suggest all three schemas either deliver a significant improvement in efficiency or render much better strong and weak scalabilities over traditional methods. Our findings provide generalized scalable schemas for performing sorted merging on genetics and genomics data using these Apache distributed systems. PMID:29762754
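The sorted-merge operation described in the abstract above can be illustrated on a single machine with a k-way merge. What follows is only a minimal sketch, not the authors' Hadoop, HBase, or Spark schemas: records are simplified to hypothetical (chromosome, position, genotype) tuples, the chromosome order is an assumed lookup table, and every per-subject input is assumed to be sorted by genomic location already.

```python
import heapq
from itertools import groupby

# Hypothetical contig order for this sketch; a real VCF header defines its own.
CHROM_ORDER = {name: rank for rank, name in enumerate(["1", "2", "X"])}


def sort_key(record):
    """Order records by (chromosome rank, position); the payload is ignored."""
    chrom, pos, _payload = record
    return (CHROM_ORDER[chrom], pos)


def merge_sorted_variants(per_subject_records):
    """Merge location-sorted record lists, one list per subject.

    Yields (chrom, pos, genotypes) rows, one per distinct location, where
    genotypes[i] belongs to subject i and "./." marks a missing call.
    """
    n_subjects = len(per_subject_records)
    # Tag every record with its subject index so the merged row can be filled.
    tagged = [
        [(chrom, pos, (idx, genotype)) for chrom, pos, genotype in records]
        for idx, records in enumerate(per_subject_records)
    ]
    # heapq.merge performs a lazy k-way merge of already sorted inputs.
    merged = heapq.merge(*tagged, key=sort_key)
    for (_rank, pos), group in groupby(merged, key=sort_key):
        row = ["./."] * n_subjects
        chrom = None
        for chrom, _pos, (idx, genotype) in group:
            row[idx] = genotype
        yield chrom, pos, row


if __name__ == "__main__":
    subject_a = [("1", 100, "0/1"), ("2", 50, "1/1")]
    subject_b = [("1", 100, "0/0"), ("1", 200, "0/1"), ("X", 5, "1/1")]
    for row in merge_sorted_variants([subject_a, subject_b]):
        print(row)
```

Because the merge is lazy, memory use grows with the number of input files rather than with their total size; the distributed schemas in the abstract address the case where even that single-machine pass becomes the bottleneck.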
Bioinformatic analysis of Msx1 and Msx2 involved in craniofacial development.

PubMed

Dai, Jiewen; Mou, Zhifang; Shen, Shunyao; Dong, Yuefu; Yang, Tong; Shen, Steve Guofang

2014-01-01

Msx1 and Msx2 were revealed to be candidate genes for some craniofacial deformities, such as cleft lip with/without cleft palate (CL/P) and craniosynostosis. Many other genes were demonstrated to have a cross-talk with MSX genes in causing these defects. However, there is no systematic evaluation of these MSX gene-related factors. In this study, we performed a systematic bioinformatic analysis of MSX genes by combining the GeneDecks, DAVID, and STRING databases. The results showed that numerous genes are related to MSX genes, such as Irf6, TP63, Dlx2, Dlx5, Pax3, Pax9, Bmp4, Tgf-beta2, and Tgf-beta3, which have been demonstrated to be involved in CL/P, and Fgfr2, Fgfr1, Fgfr3, and Twist1, which are involved in craniosynostosis. Many of these genes could be enriched into different gene groups involved in different signaling pathways, different craniofacial deformities, and different biological processes. These findings allow us to analyze the function of MSX genes in a gene network. In addition, our findings showed that Sumo, a novel gene whose polymorphisms were demonstrated to be associated with nonsyndromic CL/P by genome-wide association study, has a protein-protein interaction with MSX1. This may offer an alternative method of bioinformatic analysis for genes found by genome-wide association study and may help predict the disruption of protein function caused by a mutation in a gene's DNA sequence. These findings may guide further functional studies in the future.
StereoGene: rapid estimation of genome-wide correlation of continuous or interval feature data.

PubMed

Stavrovskaya, Elena D; Niranjan, Tejasvi; Fertig, Elana J; Wheelan, Sarah J; Favorov, Alexander V; Mironov, Andrey A

2017-10-15

Genomic features with similar genome-wide distributions are generally hypothesized to be functionally related; for example, colocalization of histones and transcription start sites indicates chromatin regulation of transcription factor activity. Therefore, statistical algorithms to perform spatial, genome-wide correlation among genomic features are required. Here, we propose a method, StereoGene, that rapidly estimates genome-wide correlation among pairs of genomic features. These features may represent high-throughput data mapped to a reference genome or sets of genomic annotations in that reference genome. StereoGene enables correlation of continuous data directly, avoiding data binarization and the subsequent data loss. Correlations are computed among neighboring genomic positions using kernel correlation. Representing the correlation as a function of the genome position, StereoGene outputs the local correlation track as part of the analysis. StereoGene also accounts for confounders such as input DNA by partial correlation. We apply our method to numerous comparisons of ChIP-Seq datasets from the Human Epigenome Atlas and FANTOM CAGE to demonstrate its wide applicability. We observe changes in the correlation between epigenomic features across developmental trajectories of several tissue types consistent with known biology and find a novel spatial correlation of CAGE clusters with donor splice sites and with poly(A) sites. These analyses provide examples for the broad applicability of StereoGene for regulatory genomics. The StereoGene C++ source code, program documentation, Galaxy integration scripts and examples are available from the project homepage http://stereogene.bioinf.fbb.msu.ru/. Contact: favorov@sensi.org. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes.

PubMed

Riechmann, J L; Heard, J; Martin, G; Reuber, L; Jiang, C; Keddie, J; Adam, L; Pineda, O; Ratcliffe, O J; Samaha, R R; Creelman, R; Pilgrim, M; Broun, P; Zhang, J Z; Ghandehari, D; Sherman, B K; Yu, G

2000-12-15

The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.
On Computing Breakpoint Distances for Genomes with Duplicate Genes.

PubMed

Shao, Mingfu; Moret, Bernard M E

2017-06-01

A fundamental problem in comparative genomics is to compute the distance between two genomes in terms of its higher level organization (given by genes or syntenic blocks). For two genomes without duplicate genes, we can easily define (and almost always efficiently compute) a variety of distance measures, but the problem is NP-hard under most models when genomes contain duplicate genes. To tackle duplicate genes, three formulations (exemplar, maximum matching, and any matching) have been proposed, all of which aim to build a matching between homologous genes so as to minimize some distance measure. Of the many distance measures, the breakpoint distance (the number of nonconserved adjacencies) was the first one to be studied and remains of significant interest because of its simplicity and model-free property. The three breakpoint distance problems corresponding to the three formulations have been widely studied. Although we provided last year a solution for the exemplar problem that runs very fast on full genomes, computing optimal solutions for the other two problems has remained challenging. In this article, we describe very fast, exact algorithms for these two problems. Our algorithms rely on a compact integer-linear program that we further simplify by developing an algorithm to remove variables, based on new results on the structure of adjacencies and matchings. Through extensive experiments using both simulations and biological data sets, we show that our algorithms run very fast (in seconds) on mammalian genomes and scale well beyond. We also apply these algorithms (as well as the classic orthology tool MSOAR) to create orthology assignment, then compare their quality in terms of both accuracy and coverage. We find that our algorithm for the "any matching" formulation significantly outperforms other methods in terms of accuracy while achieving nearly maximum coverage.
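For the easy case mentioned in the abstract above (two genomes without duplicate genes), the breakpoint distance can be computed directly by counting adjacencies of one genome that are not conserved in the other. The sketch below is an illustration under simplifying assumptions, not the authors' integer-linear-program method: each genome is a single linear chromosome written as a list of signed integers, telomeres are ignored, and an adjacency (a, b) is treated as identical to its reverse-complement reading (-b, -a).

```python
def adjacencies(genome):
    """Return the set of adjacencies of a signed gene order.

    Each adjacency is stored in a canonical form so that reading the
    chromosome in either direction gives the same set.
    """
    result = set()
    for a, b in zip(genome, genome[1:]):
        # (a, b) on one strand is the same adjacency as (-b, -a) on the other.
        result.add(max((a, b), (-b, -a)))
    return result


def breakpoint_distance(genome1, genome2):
    """Count adjacencies of genome1 that are not conserved in genome2.

    Both genomes must contain the same genes exactly once each
    (no duplicate genes), which is the polynomial-time case.
    """
    genes1 = sorted(abs(g) for g in genome1)
    genes2 = sorted(abs(g) for g in genome2)
    if genes1 != genes2 or len(set(genes1)) != len(genes1):
        raise ValueError("genomes must share one copy of each gene")
    return len(adjacencies(genome1) - adjacencies(genome2))


if __name__ == "__main__":
    reference = [1, 2, 3, 4, 5]
    inverted = [1, -3, -2, 4, 5]  # genes 2..3 inverted
    print(breakpoint_distance(reference, inverted))
```

With duplicate genes this direct count is no longer well defined, because one must first choose which copies to match; that matching step is where the exemplar, maximum matching, and any matching formulations differ.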
GWAS of Follicular Lymphoma Reveals Allelic Heterogeneity at 6p21.32 and Suggests Shared Genetic Susceptibility with Diffuse Large B-cell Lymphoma

PubMed Central

Skibola, Christine F.; Darabi, Hatef; Conde, Lucia; Hjalgrim, Henrik; Kumar, Vikrant; Chang, Ellen T.; Rothman, Nathaniel; Cerhan, James R.; Brooks-Wilson, Angela R.; Rehnberg, Emil; Irwan, Ishak D.; Ryder, Lars P.; Brown, Peter N.; Bracci, Paige M.; Agana, Luz; Riby, Jacques; Cozen, Wendy; Davis, Scott; Hartge, Patricia; Morton, Lindsay M.; Severson, Richard K.; Wang, Sophia S.; Slager, Susan L.; Fredericksen, Zachary S.; Novak, Anne J.; Kay, Neil E.; Habermann, Thomas M.; Armstrong, Bruce; Kricker, Anne; Milliken, Sam; Purdue, Mark P.; Vajdic, Claire M.; Boyle, Peter; Lan, Qing; Zahm, Shelia H.; Zhang, Yawei; Zheng, Tongzhang; Leach, Stephen; Spinelli, John J.; Smith, Martyn T.; Chanock, Stephen J.; Padyukov, Leonid; Alfredsson, Lars; Klareskog, Lars; Glimelius, Bengt; Melbye, Mads; Liu, Edison T.; Adami, Hans-Olov; Humphreys, Keith; Liu, Jianjun

2011-01-01

Non-Hodgkin lymphoma (NHL) represents a diverse group of hematological malignancies, of which follicular lymphoma (FL) is a prevalent subtype. A previous genome-wide association study has established a marker, rs10484561 in the human leukocyte antigen (HLA) class II region on 6p21.32 associated with increased FL risk. Here, in a three-stage genome-wide association study, starting with a genome-wide scan of 379 FL cases and 791 controls followed by validation in 1,049 cases and 5,790 controls, we identified a second independent FL–associated locus on 6p21.32, rs2647012 (ORcombined = 0.64, Pcombined = 2×10^−21) located 962 bp away from rs10484561 (r^2 < 0.1 in controls). After mutual adjustment, the associations at the two SNPs remained genome-wide significant (rs2647012: ORadjusted = 0.70, Padjusted = 4×10^−12; rs10484561: ORadjusted = 1.64, Padjusted = 5×10^−15). Haplotype and coalescence analyses indicated that rs2647012 arose on an evolutionarily distinct haplotype from that of rs10484561 and tags a novel allele with an opposite (protective) effect on FL risk. Moreover, in a follow-up analysis of the top 6 FL–associated SNPs in 4,449 cases of other NHL subtypes, rs10484561 was associated with risk of diffuse large B-cell lymphoma (ORcombined = 1.36, Pcombined = 1.4×10^−7). Our results reveal the presence of allelic heterogeneity within the HLA class II region influencing FL susceptibility and indicate a possible shared genetic etiology with diffuse large B-cell lymphoma. These findings suggest that the HLA class II region plays a complex yet important role in NHL. PMID:21533074
The Flynn Effect in South Africa

ERIC Educational Resources Information Center

te Nijenhuis, Jan; Murphy, Raegan; van Eeden, Rene

2011-01-01

This is a study of secular score gains in South Africa. The findings are based on representative samples from datasets utilized in norm studies of popular mainstream intelligence batteries such as the WAIS as well as widely used test batteries which were locally developed and normed in South Africa. Flynn effects were computed in three ways.…
Gigwa-Genotype investigator for genome-wide analyses.

PubMed

Sempéré, Guilhem; Philippe, Florian; Dereeper, Alexis; Ruiz, Manuel; Sarah, Gautier; Larmande, Pierre

2016-06-06

Exploring the structure of genomes and analyzing their evolution is essential to understanding the ecological adaptation of organisms. However, with the large amounts of data being produced by next-generation sequencing, computational challenges arise in terms of storage, search, sharing, analysis and visualization. This is particularly true with regards to studies of genomic variation, which are currently lacking scalable and user-friendly data exploration solutions. Here we present Gigwa, a web-based tool that provides an easy and intuitive way to explore large amounts of genotyping data by filtering it not only on the basis of variant features, including functional annotations, but also on genotype patterns. The data storage relies on MongoDB, which offers good scalability properties. Gigwa can handle multiple databases and may be deployed in either single- or multi-user mode. In addition, it provides a wide range of popular export formats. The Gigwa application is suitable for managing large amounts of genomic variation data. Its user-friendly web interface makes such processing widely accessible. It can either be simply deployed on a workstation or be used to provide a shared data portal for a given community of researchers.
The Human Genome Project: how do we protect Australians?

PubMed

Stott Despoja, N

It is the moon landing of the nineties: the ambitious Human Genome Project--identifying the up to 100,000 genes that make up human DNA and the sequences of the three billion base-pairs that comprise the human genome. However, unlike the moon landing, the effects of the genome project will have a fundamental impact on the way we see ourselves and each other.
Genome-wide study of resistant hypertension identified from electronic health records.

PubMed

Dumitrescu, Logan; Ritchie, Marylyn D; Denny, Joshua C; El Rouby, Nihal M; McDonough, Caitrin W; Bradford, Yuki; Ramirez, Andrea H; Bielinski, Suzette J; Basford, Melissa A; Chai, High Seng; Peissig, Peggy; Carrell, David; Pathak, Jyotishman; Rasmussen, Luke V; Wang, Xiaoming; Pacheco, Jennifer A; Kho, Abel N; Hayes, M Geoffrey; Matsumoto, Martha; Smith, Maureen E; Li, Rongling; Cooper-DeHoff, Rhonda M; Kullo, Iftikhar J; Chute, Christopher G; Chisholm, Rex L; Jarvik, Gail P; Larson, Eric B; Carey, David; McCarty, Catherine A; Williams, Marc S; Roden, Dan M; Bottinger, Erwin; Johnson, Julie A; de Andrade, Mariza; Crawford, Dana C

2017-01-01

Resistant hypertension is defined as high blood pressure that remains above treatment goals in spite of the concurrent use of three antihypertensive agents from different classes. Despite the important health consequences of resistant hypertension, few studies of resistant hypertension have been conducted. To perform a genome-wide association study for resistant hypertension, we defined and identified cases of resistant hypertension and hypertensives with treated, controlled hypertension among >47,500 adults residing in the US linked to electronic health records (EHRs) and genotyped as part of the electronic MEdical Records & GEnomics (eMERGE) Network. Electronic selection logic using billing codes, laboratory values, text queries, and medication records was used to identify resistant hypertension cases and controls at each site, and a total of 3,006 cases of resistant hypertension and 876 controlled hypertensives were identified among eMERGE Phase I and II sites. After imputation and quality control, a total of 2,530,150 SNPs were tested for an association among 2,830 multi-ethnic cases of resistant hypertension and 876 controlled hypertensives. No test of association was genome-wide significant in the full dataset or in the dataset limited to European American cases (n = 1,719) and controls (n = 708). The most significant finding was CLNK rs13144136 at p = 1.00×10^-6 (odds ratio = 0.68; 95% CI = 0.58-0.80) in the full dataset with similar results in the European American only dataset. We also examined whether SNPs known to influence blood pressure or hypertension also influenced resistant hypertension. None was significant after correction for multiple testing. These data highlight both the difficulties and the potential utility of EHR-linked genomic data to study clinically-relevant traits such as resistant hypertension.
InSilico DB genomic datasets hub: an efficient starting point for analyzing genome-wide studies in GenePattern, Integrative Genomics Viewer, and R/Bioconductor.

PubMed

Coletta, Alain; Molter, Colin; Duqué, Robin; Steenhoff, David; Taminau, Jonatan; de Schaetzen, Virginie; Meganck, Stijn; Lazar, Cosmin; Venet, David; Detours, Vincent; Nowé, Ann; Bersini, Hugues; Weiss Solís, David Y

2012-11-18

Genomics datasets are increasingly useful for gaining biomedical insights, with adoption in the clinic underway. However, multiple hurdles related to data management stand in the way of their efficient large-scale utilization. The solution proposed is a web-based data storage hub. Having clear focus, flexibility and adaptability, InSilico DB seamlessly connects genomics dataset repositories to state-of-the-art and free GUI and command-line data analysis tools. The InSilico DB platform is a powerful collaborative environment, with advanced capabilities for biocuration, dataset sharing, and dataset subsetting and combination. InSilico DB is available from https://insilicodb.org.

Genome-wide association studies in Alzheimer's disease.

PubMed

Bertram, Lars; Tanzi, Rudolph E

2009-10-15

Genome-wide association studies (GWAS) have gained considerable momentum over the last couple of years for the identification of novel complex disease genes. In the field of Alzheimer's disease (AD), there are currently eight published and two provisionally reported GWAS, highlighting over two dozen novel potential susceptibility loci beyond the well-established APOE association. On the basis of the data available at the time of this writing, the most compelling novel GWAS signal has been observed in GAB2 (GRB2-associated binding protein 2), followed by less consistently replicated signals in galanin-like peptide (GALP), piggyBac transposable element derived 1 (PGBD1), tyrosine kinase, non-receptor 1 (TNK1). Furthermore, consistent replication has been recently announced for CLU (clusterin, also known as apolipoprotein J). Finally, there are at least three replicated loci in hitherto uncharacterized genomic intervals on chromosomes 14q32.13, 14q31.2 and 6q24.1 likely implicating the existence of novel AD genes in these regions. In this review, we will discuss the characteristics and potential relevance to pathogenesis of the outcomes of all currently available GWAS in AD. A particular emphasis will be laid on findings with independent data in favor of the original association.
Mating system shifts and transposable element evolution in the plant genus Capsella.

PubMed

Ågren, J Arvid; Wang, Wei; Koenig, Daniel; Neuffer, Barbara; Weigel, Detlef; Wright, Stephen I

2014-07-16

Despite having predominately deleterious fitness effects, transposable elements (TEs) are major constituents of eukaryote genomes in general and of plant genomes in particular. Although the proportion of the genome made up of TEs varies at least four-fold across plants, the relative importance of the evolutionary forces shaping variation in TE abundance and distributions across taxa remains unclear. Under several theoretical models, mating system plays an important role in governing the evolutionary dynamics of TEs. Here, we use the recently sequenced Capsella rubella reference genome and short-read whole genome sequencing of multiple individuals to quantify abundance, genome distributions, and population frequencies of TEs in three recently diverged species of differing mating system, two self-compatible species (C. rubella and C. orientalis) and their self-incompatible outcrossing relative, C. grandiflora. We detect different dynamics of TE evolution in our two self-compatible species; C. rubella shows a small increase in transposon copy number, while C. orientalis shows a substantial decrease relative to C. grandiflora. The direction of this change in copy number is genome wide and consistent across transposon classes. For insertions near genes, however, we detect the highest abundances in C. grandiflora. Finally, we also find differences in the population frequency distributions across the three species. Overall, our results suggest that the evolution of selfing may have different effects on TE evolution on a short and on a long timescale. Moreover, cross-species comparisons of transposon abundance are sensitive to reference genome bias, and efforts to control for this bias are key when making comparisons across species.
Meta-analysis identifies a MECOM gene as a novel predisposing factor of osteoporotic fracture

PubMed Central

Hwang, Joo-Yeon; Lee, Seung Hun; Go, Min Jin; Kim, Beom-Jun; Kou, Ikuyo; Ikegawa, Shiro; Guo, Yan; Deng, Hong-Wen; Raychaudhuri, Soumya; Kim, Young Jin; Oh, Ji Hee; Kim, Youngdoe; Moon, Sanghoon; Kim, Dong-Joon; Koo, Heejo; Cha, My-Jung; Lee, Min Hye; Yun, Ji Young; Yoo, Hye-Sook; Kang, Young-Ah; Cho, Eun-Hee; Kim, Sang-Wook; Oh, Ki Won; Kang, Moo Il; Son, Ho Young; Kim, Shin-Yoon; Kim, Ghi Su; Han, Bok-Ghee; Cho, Yoon Shin; Cho, Myeong-Chan; Lee, Jong-Young; Koh, Jung-Min

2014-01-01

Background Osteoporotic fracture (OF) as a clinical endpoint is a major complication of osteoporosis. To screen for OF susceptibility genes, we performed a genome-wide association study and carried out de novo replication analysis of an East Asian population. Methods Association was tested using a logistic regression analysis. A meta-analysis was performed on the combined results using effect sizes and standard errors estimated for each study. Results In a combined meta-analysis of a discovery cohort (288 cases and 1139 controls), three hospital-based sets in replication stage I (462 cases and 1745 controls), and an independent ethnic group in replication stage II (369 cases and 560 controls), we identified a new locus associated with OF (rs784288 in the MECOM gene) that showed genome-wide significance (p=3.59×10^−8; OR 1.39). RNA interference revealed that a MECOM knockdown suppresses osteoclastogenesis. Conclusions Our findings provide new insights into the genetic architecture underlying OF in East Asians. PMID:23349225
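The Methods sentence above, a meta-analysis built from per-study effect sizes and standard errors, corresponds to the general idea of inverse-variance weighting. The sketch below assumes a fixed-effect model, which the abstract does not specify, and the function name and inputs (log odds ratios and their standard errors) are hypothetical rather than the study's data.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance fixed-effect meta-analysis.

    betas: per-study effect estimates (for example, log odds ratios).
    standard_errors: per-study standard errors of those estimates.
    Returns (pooled_beta, pooled_se, z_score, two_sided_p).
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z_score = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    two_sided_p = math.erfc(abs(z_score) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z_score, two_sided_p


if __name__ == "__main__":
    # Hypothetical log odds ratios and standard errors for three studies.
    beta, se, z, p = fixed_effect_meta([0.35, 0.30, 0.33], [0.12, 0.09, 0.11])
    print(f"pooled OR = {math.exp(beta):.2f}, z = {z:.2f}, p = {p:.2e}")
```

Studies with smaller standard errors dominate the pooled estimate, which is why large replication sets can push a modest per-study signal past a genome-wide significance threshold.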
Efficient strategy for detecting gene × gene joint action and its application in schizophrenia.

PubMed

Won, Sungho; Kwon, Min-Seok; Mattheisen, Manuel; Park, Suyeon; Park, Changsoon; Kihara, Daisuke; Cichon, Sven; Ophoff, Roel; Nöthen, Markus M; Rietschel, Marcella; Baur, Max; Uitterlinden, Andre G; Hofmann, A; Lange, Christoph

2014-01-01

We propose a new approach to detect gene × gene joint action in genome-wide association studies (GWASs) for case-control designs. This approach offers an exhaustive search for all two-way joint action (including, as a special case, single gene action) that is computationally feasible at the genome-wide level and has reasonable statistical power under most genetic models. We found that the presence of any gene × gene joint action may imply differences in three types of genetic components: the minor allele frequencies and the amounts of Hardy-Weinberg disequilibrium may differ between cases and controls, and between the two genetic loci the degree of linkage disequilibrium may differ between cases and controls. Using Fisher's method, it is possible to combine the different sources of genetic information in an overall test for detecting gene × gene joint action. The proposed statistical analysis is efficient and its simplicity makes it applicable to GWASs. In the current study, we applied the proposed approach to a GWAS on schizophrenia and found several potential gene × gene interactions. Our application illustrates the practical advantage of the proposed method. © 2013 WILEY PERIODICALS, INC.
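Fisher's method, which the abstract above uses to combine the separate sources of genetic information, turns k p-values into the statistic X = -2 Σ ln p_i and compares it with a chi-square distribution on 2k degrees of freedom. The sketch below shows only that general combination step, assuming the component tests are independent under the null; the function name and example p-values are hypothetical, and it is not the authors' full procedure.

```python
import math


def fisher_combined_p(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the upper tail has the closed form
        P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    so no statistics library is needed.
    """
    k = len(pvalues)
    if k == 0 or any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    half_statistic = -sum(math.log(p) for p in pvalues)  # equals X / 2
    term = 1.0
    series = 1.0
    for i in range(1, k):
        term *= half_statistic / i
        series += term
    return math.exp(-half_statistic) * series


if __name__ == "__main__":
    # Hypothetical p-values for three component tests at one pair of loci
    # (allele frequency, Hardy-Weinberg disequilibrium, linkage disequilibrium).
    print(fisher_combined_p([0.04, 0.20, 0.01]))
```

Three individually modest p-values can combine into a much smaller overall p-value, which is the sense in which pooling the different sources of information can increase power.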
Genome-wide association and genomic prediction of resistance to viral nervous necrosis in European sea bass (Dicentrarchus labrax) using RAD sequencing.

PubMed

Palaiokostas, Christos; Cariou, Sophie; Bestin, Anastasia; Bruant, Jean-Sebastien; Haffray, Pierrick; Morin, Thierry; Cabon, Joëlle; Allal, François; Vandeputte, Marc; Houston, Ross D

2018-06-08

European sea bass (Dicentrarchus labrax) is one of the most important species for European aquaculture. Viral nervous necrosis (VNN), commonly caused by the redspotted grouper nervous necrosis virus (RGNNV), can result in high levels of morbidity and mortality, mainly during the larval and juvenile stages of cultured sea bass. In the absence of efficient therapeutic treatments, selective breeding for host resistance offers a promising strategy to control this disease. Our study aimed at investigating genetic resistance to VNN and genomic-based approaches to improve disease resistance by selective breeding. A population of 1538 sea bass juveniles from a factorial cross between 48 sires and 17 dams was challenged with RGNNV with mortalities and survivors being recorded and sampled for genotyping by the RAD sequencing approach. We used genome-wide genotype data from 9195 single nucleotide polymorphisms (SNPs) for downstream analysis. Estimates of heritability of survival on the underlying scale for the pedigree and genomic relationship matrices were 0.27 (HPD interval 95%: 0.14-0.40) and 0.43 (0.29-0.57), respectively. Classical genome-wide association analysis detected genome-wide significant quantitative trait loci (QTL) for resistance to VNN on chromosomes 3, 20 and 25 (P < 1e-06), where 'chromosome' 25 denotes unassigned scaffolds. Weighted genomic best linear unbiased predictor provided additional support for the QTL on chromosome 3 and suggested that it explained 4% of the additive genetic variation. Genomic prediction approaches were tested to investigate the potential of using genome-wide SNP data to estimate breeding values for resistance to VNN and showed that genomic prediction resulted in a 13% increase in successful classification of resistant and susceptible animals compared to pedigree-based methods, with Bayes A and Bayes B giving the highest predictive ability. Genome-wide significant QTL were identified but each with relatively small effects on the trait. Tests of genomic prediction suggested that incorporating genome-wide SNP data is likely to result in higher accuracy of estimated breeding values for resistance to VNN. RAD sequencing is an effective method for generating such genome-wide SNPs, and our findings highlight the potential of genomic selection to breed farmed European sea bass with improved resistance to VNN.
The National Microbial Pathogen Database Resource (NMPDR): a genomics platform based on subsystem annotation.

PubMed

McNeil, Leslie Klis; Reich, Claudia; Aziz, Ramy K; Bartels, Daniela; Cohoon, Matthew; Disz, Terry; Edwards, Robert A; Gerdes, Svetlana; Hwang, Kaitlyn; Kubal, Michael; Margaryan, Gohar Rem; Meyer, Folker; Mihalo, William; Olsen, Gary J; Olson, Robert; Osterman, Andrei; Paarmann, Daniel; Paczian, Tobias; Parrello, Bruce; Pusch, Gordon D; Rodionov, Dmitry A; Shi, Xinghua; Vassieva, Olga; Vonstein, Veronika; Zagnitko, Olga; Xia, Fangfang; Zinner, Jenifer; Overbeek, Ross; Stevens, Rick

2007-01-01

The National Microbial Pathogen Data Resource (NMPDR) (http://www.nmpdr.org) is a National Institute of Allergy and Infectious Diseases (NIAID)-funded Bioinformatics Resource Center that supports research in selected Category B pathogens. NMPDR contains the complete genomes of approximately 50 strains of pathogenic bacteria that are the focus of our curators, as well as >400 other genomes that provide a broad context for comparative analysis across the three phylogenetic Domains. NMPDR integrates complete, public genomes with expertly curated biological subsystems to provide the most consistent genome annotations. Subsystems are sets of functional roles related by a biologically meaningful organizing principle, which are built over large collections of genomes; they provide researchers with consistent functional assignments in a biologically structured context. Investigators can browse subsystems and reactions to develop accurate reconstructions of the metabolic networks of any sequenced organism. NMPDR provides a comprehensive bioinformatics platform, with tools and viewers for genome analysis. Results of precomputed gene clustering analyses can be retrieved in tabular or graphic format with one-click tools. NMPDR tools include Signature Genes, which finds the set of genes held in common by, or differentiating, two groups of organisms. Essentiality data collated from genome-wide studies have been curated. Drug target identification and high-throughput, in silico, compound screening are in development.
Man’s Best Friend Becomes Biology’s Best in Show: Genome Analyses in the Domestic Dog

PubMed Central

Parker, Heidi G.; Shearin, Abigail L.; Ostrander, Elaine A.

2012-01-01

In the last five years, canine genetics has gone from map construction to complex disease deconstruction. The availability of a draft canine genome sequence, dense marker chips, and an understanding of the genome architecture has changed the types of studies canine geneticists can undertake. There is now a clear recognition that the dog system offers the opportunity to understand the genetics of both simple and complex traits, including those associated with morphology, disease susceptibility, and behavior. In this review, we summarize recent findings regarding canine domestication and review new information on the organization of the canine genome. We discuss studies aimed at finding genes controlling morphological phenotypes and provide examples of the way such paradigms may be applied to studies of behavior. We also discuss the many ways in which the dog has illuminated our understanding of human disease and conclude with a discussion on where the field is likely headed in the next five years. PMID:21047261
A new way to protect privacy in large-scale genome-wide association studies.

PubMed

Kamm, Liina; Bogdanov, Dan; Laur, Sven; Vilo, Jaak

2013-04-01

Increased availability of various genotyping techniques has initiated a race for finding genetic markers that can be used in diagnostics and personalized medicine. Although many genetic risk factors are known, key causes of common diseases with complex heritage patterns are still unknown. Identification of such complex traits requires a targeted study over a large collection of data. Ideally, such studies bring together data from many biobanks. However, data aggregation on such a large scale raises many privacy issues. We show how to conduct such studies without violating privacy of individual donors and without leaking the data to third parties. The presented solution has provable security guarantees. Supplementary data are available at Bioinformatics online.
Genomic impact of cigarette smoke, with application to three smoking-related diseases.

PubMed

Talikka, M; Sierro, N; Ivanov, N V; Chaudhary, N; Peck, M J; Hoeng, J; Coggins, C R E; Peitsch, M C

2012-11-01

There is considerable evidence that inhaled toxicants such as cigarette smoke can cause both irreversible changes to the genetic material (DNA mutations) and putatively reversible changes to the epigenetic landscape (changes in the DNA methylation and chromatin modification state). The diseases that are believed to involve genetic and epigenetic perturbations include lung cancer, chronic obstructive pulmonary disease (COPD), and cardiovascular disease (CVD), all of which are strongly linked epidemiologically to cigarette smoking. In this review, we highlight the significance of genomics and epigenomics in these major smoking-related diseases. We also summarize the in vitro and in vivo findings on the specific perturbations that smoke and its constituent compounds can inflict upon the genome, particularly on the pulmonary system. Finally, we review state-of-the-art genomics and new techniques such as high-throughput sequencing and genome-wide chromatin assays, rapidly evolving techniques which have allowed epigenetic changes to be characterized at the genome level. These techniques have the potential to significantly improve our understanding of the specific mechanisms by which exposure to environmental chemicals causes disease. Such mechanistic knowledge provides a variety of opportunities for enhanced product safety assessment and the discovery of novel therapeutic interventions.
Genome-Wide DNA Methylation Indicates Silencing of Tumor Suppressor Genes in Uterine Leiomyoma

PubMed Central

Navarro, Antonia; Yin, Ping; Monsivais, Diana; Lin, Simon M.; Du, Pan; Wei, Jian-Jun; Bulun, Serdar E.

2012-01-01

Background: Uterine leiomyomas, or fibroids, represent the most common benign tumor of the female reproductive tract. Fibroids become symptomatic in 30% of all women and up to 70% of African American women of reproductive age. Epigenetic dysregulation of individual genes has been demonstrated in leiomyoma cells; however, the in vivo genome-wide distribution of such epigenetic abnormalities remains unknown. Principal Findings: We characterized and compared genome-wide DNA methylation and mRNA expression profiles in uterine leiomyoma and matched adjacent normal myometrial tissues from 18 African American women. We found 55 genes with differential promoter methylation and concomitant differences in mRNA expression in uterine leiomyoma versus normal myometrium. Eighty percent of the identified genes showed an inverse relationship between DNA methylation status and mRNA expression in uterine leiomyoma tissues, and the majority of genes (62%) displayed hypermethylation associated with gene silencing. We selected three genes, the known tumor suppressors KLF11, DLEC1, and KRT19, and verified promoter hypermethylation, mRNA repression and protein expression using bisulfite sequencing, real-time PCR and western blot. Incubation of primary leiomyoma smooth muscle cells with a DNA methyltransferase inhibitor restored KLF11, DLEC1 and KRT19 mRNA levels. Conclusions: These results suggest a possible functional role of promoter DNA methylation-mediated gene silencing in the pathogenesis of uterine leiomyoma in African American women. PMID:22428009
Susceptibility loci for umbilical hernia in swine detected by genome-wide association.

PubMed

Liao, X J; Lia, L; Zhang, Z Y; Long, Y; Yang, B; Ruan, G R; Su, Y; Ai, H S; Zhang, W C; Deng, W Y; Xiao, S J; Ren, J; Ding, N S; Huang, L S

2015-10-01

Umbilical hernia (UH) is a complex disorder caused by both genetic and environmental factors. UH causes animal welfare problems and severe economic loss in the pig industry. Until now, the genetic basis of UH has been poorly understood. The high-density 60K porcine SNP array enables the rapid application of genome-wide association studies (GWAS) to identify genetic loci for phenotypic traits at genome-wide scale in pigs. The objective of this research was to identify susceptibility loci for swine umbilical hernia using the GWAS approach. We genotyped 478 piglets from 142 families representing three Western commercial breeds with the Illumina PorcineSNP60 BeadChip. Significant SNPs were then detected by GWAS using ROADTRIPS (Robust Association-Detection Test for Related Individuals with Population Substructure) software, based on a Bonferroni-corrected threshold (P = 1.67E-06) or suggestive threshold (P = 3.34E-05) and false discovery rate (FDR = 0.05). After quality control, 29,924 qualified SNPs and 472 piglets were used for GWAS. Two suggestive loci predisposing to pig UH were identified in the Duroc population, at 44.25 Mb on SSC2 (rs81358018, P = 3.34E-06, FDR = 0.049933) and at 45.90 Mb on SSC17 (rs81479278, P = 3.30E-06, FDR = 0.049933). No SNP was significantly associated with pig UH in either the Landrace or the Large White population. Furthermore, we carried out a meta-analysis in the combined pure-breed population containing all 472 piglets. rs81479278 (P = 1.16E-06, FDR = 0.022475) was associated with pig UH at the genome-wide significance level. SRC was characterized as a plausible candidate gene for susceptibility to pig UH according to its genomic position and biological functions. To our knowledge, this study gives the first description of a GWAS identifying susceptibility loci for umbilical hernia in pigs. Our findings provide deeper insights into the genetic architecture of umbilical hernia in pigs.
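The two thresholds quoted in this abstract are consistent with the common convention of dividing 0.05 (genome-wide) and 1 (suggestive) by the number of SNPs tested. The abstract does not state how the thresholds were derived, so the sketch below is an assumption that happens to reproduce the reported values from the 29,924 SNPs retained after quality control; the function names are illustrative only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold for a family-wise error rate alpha."""
    return alpha / n_tests


def classify(p_value: float, genome_wide: float, suggestive: float) -> str:
    """Label a P-value against the two thresholds."""
    if p_value < genome_wide:
        return "genome-wide significant"
    if p_value < suggestive:
        return "suggestive"
    return "not significant"


n_snps = 29_924  # SNPs remaining after quality control (from the abstract)

# Assumed convention: 0.05 / n for genome-wide, 1 / n for suggestive.
genome_wide = bonferroni_threshold(0.05, n_snps)
suggestive = bonferroni_threshold(1.0, n_snps)

print(f"genome-wide: {genome_wide:.2e}")  # 1.67e-06
print(f"suggestive:  {suggestive:.2e}")   # 3.34e-05

# P-values reported in the abstract.
print(classify(3.34e-06, genome_wide, suggestive))  # rs81358018, Duroc
print(classify(1.16e-06, genome_wide, suggestive))  # rs81479278, meta-analysis
```

Under this assumption the Duroc signal on SSC2 falls in the suggestive band, while the meta-analysis P-value for rs81479278 passes the genome-wide threshold, matching the wording of the abstract.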
Mapping the sensory perception of apple using descriptive sensory evaluation in a genome wide association study

PubMed Central

Amyotte, Beatrice; Bowen, Amy J.; Banks, Travis; Rajcan, Istvan; Somers, Daryl J.

2017-01-01

Breeding apples is a long-term endeavour and it is imperative that new cultivars are selected to have outstanding consumer appeal. This study has taken the approach of merging sensory science with genome wide association analyses in order to map the human perception of apple flavour and texture onto the apple genome. The goal was to identify genomic associations that could be used in breeding apples for improved fruit quality. A collection of 85 apple cultivars was examined over two years through descriptive sensory evaluation by a trained sensory panel. The trained sensory panel scored randomized sliced samples of each apple cultivar for seventeen taste, flavour and texture attributes using controlled sensory evaluation practices. In addition, the apple collection was subjected to genotyping by sequencing for marker discovery. A genome wide association analysis suggested significant genomic associations for several sensory traits including juiciness, crispness, mealiness and fresh green apple flavour. The findings include previously unreported genomic regions that could be used in apple breeding and suggest that similar sensory association mapping methods could be applied in other plants. PMID:28231290
Genome-Wide Discovery of Drug-Dependent Human Liver Regulatory Elements

PubMed Central

Morrissey, Kari M.; Luizon, Marcelo R.; Hoffmann, Thomas J.; Sun, Xuefeng; Jones, Stacy L.; Force Aldred, Shelley; Ramamoorthy, Anuradha; Desta, Zeruesenay; Liu, Yunlong; Skaar, Todd C.; Trinklein, Nathan D.; Giacomini, Kathleen M.; Ahituv, Nadav

2014-01-01

Inter-individual variation in gene regulatory elements is hypothesized to play a causative role in adverse drug reactions and reduced drug activity. However, relatively little is known about the location and function of drug-dependent elements. To uncover drug-associated elements in a genome-wide manner, we performed RNA-seq and ChIP-seq using antibodies against the pregnane X receptor (PXR) and three active regulatory marks (p300, H3K4me1, H3K27ac) on primary human hepatocytes treated with rifampin or vehicle control. Rifampin and PXR were chosen since they are part of the CYP3A4 pathway, which is known to account for the metabolism of more than 50% of all prescribed drugs. We selected 227 proximal promoters for genes with rifampin-dependent expression or nearby PXR/p300 occupancy sites and assayed their ability to induce luciferase in rifampin-treated HepG2 cells, finding only 10 (4.4%) that exhibited drug-dependent activity. As this result suggested a role for distal enhancer modules, we searched more broadly to identify 1,297 genomic regions bearing a conditional PXR occupancy as well as all three active regulatory marks. These regions are enriched near genes that function in the metabolism of xenobiotics, specifically members of the cytochrome P450 family. We performed enhancer assays in rifampin-treated HepG2 cells for 42 of these sequences as well as 7 sequences that overlap linkage-disequilibrium blocks defined by lead SNPs from pharmacogenomic GWAS studies, revealing 15/42 and 4/7 to be functional enhancers, respectively. A common African haplotype in one of these enhancers in the GSTA locus was found to exhibit potential rifampin hypersensitivity. Combined, our results further suggest that enhancers are the predominant targets of rifampin-induced PXR activation, provide a genome-wide catalog of PXR targets and serve as a model for the identification of drug-responsive regulatory elements. PMID:25275310
Genetic risk factors for ischaemic stroke and its subtypes (the METASTROKE Collaboration): a meta-analysis of genome-wide association studies

PubMed Central

Traylor, Matthew; Farrall, Martin; Holliday, Elizabeth G; Sudlow, Cathie; Hopewell, Jemma C; Cheng, Yu-Ching; Fornage, Myriam; Ikram, M Arfan; Malik, Rainer; Bevan, Steve; Thorsteinsdottir, Unnur; Nalls, Mike A; Longstreth, WT; Wiggins, Kerri L; Yadav, Sunaina; Parati, Eugenio A; DeStefano, Anita L; Worrall, Bradford B; Kittner, Steven J; Khan, Muhammad Saleem; Reiner, Alex P; Helgadottir, Anna; Achterberg, Sefanja; Fernandez-Cadenas, Israel; Abboud, Sherine; Schmidt, Reinhold; Walters, Matthew; Chen, Wei-Min; Ringelstein, E Bernd; O'Donnell, Martin; Ho, Weang Kee; Pera, Joanna; Lemmens, Robin; Norrving, Bo; Higgins, Peter; Benn, Marianne; Sale, Michele; Kuhlenbäumer, Gregor; Doney, Alexander S F; Vicente, Astrid M; Delavaran, Hossein; Algra, Ale; Davies, Gail; Oliveira, Sofia A; Palmer, Colin N A; Deary, Ian; Schmidt, Helena; Pandolfo, Massimo; Montaner, Joan; Carty, Cara; de Bakker, Paul I W; Kostulas, Konstantinos; Ferro, Jose M; van Zuydam, Natalie R; Valdimarsson, Einar; Nordestgaard, Børge G; Lindgren, Arne; Thijs, Vincent; Slowik, Agnieszka; Saleheen, Danish; Paré, Guillaume; Berger, Klaus; Thorleifsson, Gudmar; Hofman, Albert; Mosley, Thomas H; Mitchell, Braxton D; Furie, Karen; Clarke, Robert; Levi, Christopher; Seshadri, Sudha; Gschwendtner, Andreas; Boncoraglio, Giorgio B; Sharma, Pankaj; Bis, Joshua C; Gretarsdottir, Solveig; Psaty, Bruce M; Rothwell, Peter M; Rosand, Jonathan; Meschia, James F; Stefansson, Kari; Dichgans, Martin; Markus, Hugh S

2012-01-01

Background: Various genome-wide association studies (GWAS) have been done in ischaemic stroke, identifying a few loci associated with the disease, but sample sizes have been 3500 cases or less. We established the METASTROKE collaboration with the aim of validating associations from previous GWAS and identifying novel genetic associations through meta-analysis of GWAS datasets for ischaemic stroke and its subtypes. Methods: We meta-analysed data from 15 ischaemic stroke cohorts with a total of 12 389 individuals with ischaemic stroke and 62 004 controls, all of European ancestry. For the associations reaching genome-wide significance in METASTROKE, we did a further analysis, conditioning on the lead single nucleotide polymorphism in every associated region. Replication of novel suggestive signals was done in 13 347 cases and 29 083 controls. Findings: We verified previous associations for cardioembolic stroke near PITX2 (p=2·8×10^-16) and ZFHX3 (p=2·28×10^-8), and for large-vessel stroke at a 9p21 locus (p=3·32×10^-5) and HDAC9 (p=2·03×10^-12). Additionally, we verified that all associations were subtype specific. Conditional analysis in the three regions for which the associations reached genome-wide significance (PITX2, ZFHX3, and HDAC9) indicated that all the signal in each region could be attributed to one risk haplotype. We also identified 12 potentially novel loci at p<5×10^-6. However, we were unable to replicate any of these novel associations in the replication cohort. Interpretation: Our results show that, although genetic variants can be detected in patients with ischaemic stroke when compared with controls, all associations we were able to confirm are specific to a stroke subtype. This finding has two implications. First, to maximise success of genetic studies in ischaemic stroke, detailed stroke subtyping is required. Second, different genetic pathophysiological mechanisms seem to be associated with different stroke subtypes.
Funding: Wellcome Trust, UK Medical Research Council (MRC), Australian National Health and Medical Research Council, National Institutes of Health (NIH) including the National Heart, Lung and Blood Institute (NHLBI), the National Institute on Aging (NIA), the National Human Genome Research Institute (NHGRI), and the National Institute of Neurological Disorders and Stroke (NINDS). PMID:23041239
Landscape genomics reveals altered genome wide diversity within revegetated stands of Eucalyptus microcarpa (Grey Box).

PubMed

Jordan, Rebecca; Dillon, Shannon K; Prober, Suzanne M; Hoffmann, Ary A

2016-12-01

In order to contribute to evolutionary resilience and adaptive potential in highly modified landscapes, revegetated areas should ideally reflect levels of genetic diversity within and across natural stands. Landscape genomic analyses enable such diversity patterns to be characterized at genome and chromosomal levels. Landscape-wide patterns of genomic diversity were assessed in Eucalyptus microcarpa, a dominant tree species widely used in revegetation in Southeastern Australia. Trees from small and large patches within large remnants, small isolated remnants and revegetation sites were assessed across the now highly fragmented distribution of this species using the DArTseq genomic approach. Genomic diversity was similar within all three types of remnant patches analysed, although often significantly but only slightly lower in revegetation sites compared with natural remnants. Differences in diversity between stand types varied across chromosomes. Genomic differentiation was higher between small, isolated remnants, and among revegetated sites compared with natural stands. We conclude that small remnants and revegetated sites of our E. microcarpa samples largely but not completely capture patterns in genomic diversity across the landscape. Genomic approaches provide a powerful tool for assessing restoration efforts across the landscape. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Genetic effects on information processing speed are moderated by age--converging results from three samples.

PubMed

Ising, M; Mather, K A; Zimmermann, P; Brückl, T; Höhne, N; Heck, A; Schenk, L A; Rujescu, D; Armstrong, N J; Sachdev, P S; Reppermund, S

2014-06-01

Information processing is a cognitive trait forming the basis of complex abilities like executive function. The Trail Making Test (TMT) is a well-established test of information processing with moderate to high heritability. Age of the individual also plays an important role. A number of genetic association studies with the TMT have been performed, which, however, did not consider age as a moderating factor. We report the results of genome-wide association studies (GWASs) on age-independent and age-dependent TMT performance in two population-representative community samples (Munich Antidepressant Response Signature, MARS: N1 = 540; Ludwig Maximilians University, LMU: N2 = 350). Age-dependent genome-wide findings were then evaluated in a third sample of healthy elderly subjects (Sydney Memory and Ageing Study, Sydney MAS: N3 = 448). While a meta-analysis on the GWAS findings did not reveal age-independent TMT associations withstanding correction for multiple testing, we found a genome-wide significant age-moderated effect between variants in the DSG1 gene region and TMT-A performance predominantly reflecting visual processing speed (rs2199301, P(meta-analysis) = 1.3 × 10^-7). The direction of the interaction suggests for the minor allele a beneficial effect in younger adults turning into a detrimental effect in older adults. The detrimental effect of the missense single nucleotide polymorphism rs1426310 within the same DSG1 gene region could be replicated in Sydney MAS participants aged 70-79, but not in those aged 80 years and older, presumably a result of survivor bias. Our findings demonstrate opposing effects of DSG1 variants on information processing speed depending on age, which might be related to the complex processes that DSG1 is involved with, including cell adhesion and apoptosis. © 2014 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.
Genome-wide survey and analysis of microsatellites in nematodes, with a focus on the plant-parasitic species Meloidogyne incognita.

PubMed

Castagnone-Sereno, Philippe; Danchin, Etienne G J; Deleury, Emeline; Guillemaud, Thomas; Malausa, Thibaut; Abad, Pierre

2010-10-25

Microsatellites are the most popular source of molecular markers for studying population genetic variation in eukaryotes. However, few data are currently available about their genomic distribution and abundance across the phylum Nematoda. The recent completion of the genomes of several nematode species, including Meloidogyne incognita, a major agricultural pest worldwide, now opens the way for a comparative survey and analysis of microsatellites in these organisms. Using MsatFinder, the total numbers of 1-6 bp perfect microsatellites detected in the complete genomes of five nematode species (Brugia malayi, Caenorhabditis elegans, M. hapla, M. incognita, Pristionchus pacificus) ranged from 2,842 to 61,547, and covered from 0.09 to 1.20% of the nematode genomes. Under our search criteria, the most common repeat motifs for each length class varied among the nematode species considered, with no obvious relation to the AT-richness of their genomes. Overall, (AT)n, (AG)n and (CT)n were the three most frequent dinucleotide microsatellite motifs found in the five genomes considered. Except for two motifs in P. pacificus, all the most frequent trinucleotide motifs were AT-rich, with (AAT)n and (ATT)n being the only ones common to all five nematode species. Particular attention was paid to the microsatellite content of the plant-parasitic species M. incognita. In this species, a repertoire of 4,880 microsatellite loci was identified, of which 2,183 appeared suitable for designing markers for population genetic studies. Interestingly, 1,094 microsatellites were identified in 801 predicted protein-coding regions, 99% of them being trinucleotides. When compared against the InterPro domain database, 497 of these CDS were successfully annotated, and further assigned to Gene Ontology terms. Contrasting patterns of microsatellite abundance and diversity were characterized in five nematode genomes, even in the case of two closely related Meloidogyne species.
2,245 di- to hexanucleotide loci were identified in the genome of M. incognita, providing adequate material for the future development of a wide range of microsatellite markers in this major plant parasite.
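As a rough illustration of what a scan for 1-6 bp perfect microsatellites involves, the sketch below looks for uninterrupted tandem repeats of a short motif. It is not MsatFinder and does not reproduce its search criteria. The minimum repeat counts in `MIN_REPEATS` are hypothetical placeholders (the abstract does not give the thresholds used), and the function names are illustrative.

```python
# Hypothetical minimum repeat counts per motif length; NOT taken from the study.
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_perfect_microsatellites(seq: str, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect repeats."""
    seq = seq.upper()
    hits = []
    for k, min_n in sorted(min_repeats.items()):
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            # Count consecutive exact copies of the motif starting at i.
            n = 1
            while seq[i + n * k:i + (n + 1) * k] == motif:
                n += 1
            if n >= min_n and set(motif) <= set("ACGT") and is_primitive(motif):
                hits.append((i, motif, n))
                i += n * k  # skip past the reported repeat
            else:
                i += 1
    return sorted(hits)


# Toy sequence: an (AT)8 run and an (AAT)5 run separated by spacer bases.
toy = "GG" + "AT" * 8 + "CC" + "AAT" * 5 + "G"
print(find_perfect_microsatellites(toy))
# [(2, 'AT', 8), (20, 'AAT', 5)]
```

The primitive-motif check keeps a dinucleotide run such as (AT)8 from being reported a second time as a tetra- or hexanucleotide repeat. A real survey would also need to handle reverse-complement motif equivalence and genome-scale input, which this sketch leaves out.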
Genome-wide measures of DNA methylation in peripheral blood and the risk of urothelial cell carcinoma: a prospective nested case–control study

PubMed Central

Dugué, Pierre-Antoine; Brinkman, Maree T; Milne, Roger L; Wong, Ee Ming; FitzGerald, Liesel M; Bassett, Julie K; Joo, Jihoon E; Jung, Chol-Hee; Makalic, Enes; Schmidt, Daniel F; Park, Daniel J; Chung, Jessica; Ta, Anthony D; Bolton, Damien M; Lonie, Andrew; Longano, Anthony; Hopper, John L; Severi, Gianluca; Saffery, Richard; English, Dallas R; Southey, Melissa C; Giles, Graham G

2016-01-01

Background: Global DNA methylation has been reported to be associated with urothelial cell carcinoma (UCC) by studies using blood samples collected at diagnosis. Using the Illumina HumanMethylation450 assay, we derived genome-wide measures of blood DNA methylation and assessed them for their prospective association with UCC risk. Methods: We used 439 case–control pairs from the Melbourne Collaborative Cohort Study matched on age, sex, country of birth, DNA sample type, and collection period. Conditional logistic regression was used to compute odds ratios (OR) of UCC risk per s.d. of each genome-wide measure of DNA methylation and 95% confidence intervals (CIs), adjusted for potential confounders. We also investigated associations by disease subtype, sex, smoking, and time since blood collection. Results: The risk of superficial UCC was decreased for individuals with higher levels of our genome-wide DNA methylation measure (OR=0.71, 95% CI: 0.54–0.94; P=0.02). This association was particularly strong for current smokers at sample collection (OR=0.47, 95% CI: 0.27–0.83). Intermediate levels of our genome-wide measure were associated with decreased risk of invasive UCC. Some variation was observed between UCC subtypes and the location and regulatory function of the CpGs included in the genome-wide measures of methylation. Conclusions: Higher levels of our genome-wide DNA methylation measure were associated with decreased risk of superficial UCC and intermediate levels were associated with reduced risk of invasive disease. These findings require replication by other prospective studies. PMID:27490804
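The reported odds ratio, confidence interval and P-value for superficial UCC can be checked against each other with a standard back-calculation. The sketch below assumes a Wald-type 95% interval that is symmetric on the log-odds scale, which is the usual form for logistic regression output but is not stated in the abstract; the function name is illustrative.

```python
import math


def p_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float,
                 z_crit: float = 1.96) -> tuple[float, float]:
    """Recover the z statistic and two-sided P-value from an OR and its CI.

    Assumes the interval is exp(log(OR) +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Superficial UCC, per s.d. of the genome-wide methylation measure (from the abstract).
z, p = p_from_or_ci(0.71, 0.54, 0.94)
print(f"z = {z:.2f}, two-sided p = {p:.2f}")
```

Under that assumption the recovered P-value rounds to the reported P=0.02, so the three published numbers are mutually consistent.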
Genome-wide measures of DNA methylation in peripheral blood and the risk of urothelial cell carcinoma: a prospective nested case-control study.

PubMed

Dugué, Pierre-Antoine; Brinkman, Maree T; Milne, Roger L; Wong, Ee Ming; FitzGerald, Liesel M; Bassett, Julie K; Joo, Jihoon E; Jung, Chol-Hee; Makalic, Enes; Schmidt, Daniel F; Park, Daniel J; Chung, Jessica; Ta, Anthony D; Bolton, Damien M; Lonie, Andrew; Longano, Anthony; Hopper, John L; Severi, Gianluca; Saffery, Richard; English, Dallas R; Southey, Melissa C; Giles, Graham G

2016-09-06

Global DNA methylation has been reported to be associated with urothelial cell carcinoma (UCC) by studies using blood samples collected at diagnosis. Using the Illumina HumanMethylation450 assay, we derived genome-wide measures of blood DNA methylation and assessed them for their prospective association with UCC risk. We used 439 case-control pairs from the Melbourne Collaborative Cohort Study matched on age, sex, country of birth, DNA sample type, and collection period. Conditional logistic regression was used to compute odds ratios (OR) of UCC risk per s.d. of each genome-wide measure of DNA methylation and 95% confidence intervals (CIs), adjusted for potential confounders. We also investigated associations by disease subtype, sex, smoking, and time since blood collection. The risk of superficial UCC was decreased for individuals with higher levels of our genome-wide DNA methylation measure (OR=0.71, 95% CI: 0.54-0.94; P=0.02). This association was particularly strong for current smokers at sample collection (OR=0.47, 95% CI: 0.27-0.83). Intermediate levels of our genome-wide measure were associated with decreased risk of invasive UCC. Some variation was observed between UCC subtypes and the location and regulatory function of the CpGs included in the genome-wide measures of methylation. Higher levels of our genome-wide DNA methylation measure were associated with decreased risk of superficial UCC and intermediate levels were associated with reduced risk of invasive disease. These findings require replication by other prospective studies.

A two-stage genome-wide association study of sporadic amyotrophic lateral sclerosis.

PubMed

Chiò, Adriano; Schymick, Jennifer C; Restagno, Gabriella; Scholz, Sonja W; Lombardo, Federica; Lai, Shiao-Lin; Mora, Gabriele; Fung, Hon-Chung; Britton, Angela; Arepalli, Sampath; Gibbs, J Raphael; Nalls, Michael; Berger, Stephen; Kwee, Lydia Coulter; Oddone, Eugene Z; Ding, Jinhui; Crews, Cynthia; Rafferty, Ian; Washecka, Nicole; Hernandez, Dena; Ferrucci, Luigi; Bandinelli, Stefania; Guralnik, Jack; Macciardi, Fabio; Torri, Federica; Lupoli, Sara; Chanock, Stephen J; Thomas, Gilles; Hunter, David J; Gieger, Christian; Wichmann, H Erich; Calvo, Andrea; Mutani, Roberto; Battistini, Stefania; Giannini, Fabio; Caponnetto, Claudia; Mancardi, Giovanni Luigi; La Bella, Vincenzo; Valentino, Francesca; Monsurrò, Maria Rosaria; Tedeschi, Gioacchino; Marinou, Kalliopi; Sabatelli, Mario; Conte, Amelia; Mandrioli, Jessica; Sola, Patrizia; Salvi, Fabrizio; Bartolomei, Ilaria; Siciliano, Gabriele; Carlesi, Cecilia; Orrell, Richard W; Talbot, Kevin; Simmons, Zachary; Connor, James; Pioro, Erik P; Dunkley, Travis; Stephan, Dietrich A; Kasperaviciute, Dalia; Fisher, Elizabeth M; Jabonka, Sibylle; Sendtner, Michael; Beck, Marcus; Bruijn, Lucie; Rothstein, Jeffrey; Schmidt, Silke; Singleton, Andrew; Hardy, John; Traynor, Bryan J

2009-04-15

The cause of sporadic amyotrophic lateral sclerosis (ALS) is largely unknown, but genetic factors are thought to play a significant role in determining susceptibility to motor neuron degeneration. To identify genetic variants altering risk of ALS, we undertook a two-stage genome-wide association study (GWAS): we followed our initial GWAS of 545 066 SNPs in 553 individuals with ALS and 2338 controls by testing the 7600 most associated SNPs from the first stage in three independent cohorts consisting of 2160 cases and 3008 controls. None of the SNPs selected for replication exceeded the Bonferroni threshold for significance. The two most significantly associated SNPs, rs2708909 and rs2708851 [odds ratio (OR) = 1.17 and 1.18, and P-values = 6.98 × 10^-7 and 1.16 × 10^-6], were located on chromosome 7p13.3 within a 175 kb linkage disequilibrium block containing the SUNC1, HUS1 and C7orf57 genes. These associations did not achieve genome-wide significance in the original cohort and failed to replicate in an additional independent cohort of 989 US cases and 327 controls (OR = 1.18 and 1.19, P-values = 0.08 and 0.06, respectively). Thus, we chose to cautiously interpret our data as hypothesis-generating requiring additional confirmation, especially as all previously reported loci for ALS have failed to replicate successfully. Indeed, the three loci (FGGY, ITPR2 and DPP6) identified in previous GWAS of sporadic ALS were not significantly associated with disease in our study. Our findings suggest that ALS is more genetically and clinically heterogeneous than previously recognized. Genotype data from our study have been made available online to facilitate such future endeavors.
A two-stage genome-wide association study of sporadic amyotrophic lateral sclerosis

PubMed Central

Chiò, Adriano; Schymick, Jennifer C.; Restagno, Gabriella; Scholz, Sonja W.; Lombardo, Federica; Lai, Shiao-Lin; Mora, Gabriele; Fung, Hon-Chung; Britton, Angela; Arepalli, Sampath; Gibbs, J. Raphael; Nalls, Michael; Berger, Stephen; Kwee, Lydia Coulter; Oddone, Eugene Z.; Ding, Jinhui; Crews, Cynthia; Rafferty, Ian; Washecka, Nicole; Hernandez, Dena; Ferrucci, Luigi; Bandinelli, Stefania; Guralnik, Jack; Macciardi, Fabio; Torri, Federica; Lupoli, Sara; Chanock, Stephen J.; Thomas, Gilles; Hunter, David J.; Gieger, Christian; Wichmann, H. Erich; Calvo, Andrea; Mutani, Roberto; Battistini, Stefania; Giannini, Fabio; Caponnetto, Claudia; Mancardi, Giovanni Luigi; La Bella, Vincenzo; Valentino, Francesca; Monsurrò, Maria Rosaria; Tedeschi, Gioacchino; Marinou, Kalliopi; Sabatelli, Mario; Conte, Amelia; Mandrioli, Jessica; Sola, Patrizia; Salvi, Fabrizio; Bartolomei, Ilaria; Siciliano, Gabriele; Carlesi, Cecilia; Orrell, Richard W.; Talbot, Kevin; Simmons, Zachary; Connor, James; Pioro, Erik P.; Dunkley, Travis; Stephan, Dietrich A.; Kasperaviciute, Dalia; Fisher, Elizabeth M.; Jabonka, Sibylle; Sendtner, Michael; Beck, Marcus; Bruijn, Lucie; Rothstein, Jeffrey; Schmidt, Silke; Singleton, Andrew; Hardy, John; Traynor, Bryan J.

2009-01-01

The cause of sporadic amyotrophic lateral sclerosis (ALS) is largely unknown, but genetic factors are thought to play a significant role in determining susceptibility to motor neuron degeneration. To identify genetic variants altering risk of ALS, we undertook a two-stage genome-wide association study (GWAS): we followed our initial GWAS of 545 066 SNPs in 553 individuals with ALS and 2338 controls by testing the 7600 most associated SNPs from the first stage in three independent cohorts consisting of 2160 cases and 3008 controls. None of the SNPs selected for replication exceeded the Bonferroni threshold for significance. The two most significantly associated SNPs, rs2708909 and rs2708851 [odds ratio (OR) = 1.17 and 1.18, and P-values = 6.98 × 10^-7 and 1.16 × 10^-6], were located on chromosome 7p13.3 within a 175 kb linkage disequilibrium block containing the SUNC1, HUS1 and C7orf57 genes. These associations did not achieve genome-wide significance in the original cohort and failed to replicate in an additional independent cohort of 989 US cases and 327 controls (OR = 1.18 and 1.19, P-values = 0.08 and 0.06, respectively). Thus, we chose to cautiously interpret our data as hypothesis-generating requiring additional confirmation, especially as all previously reported loci for ALS have failed to replicate successfully. Indeed, the three loci (FGGY, ITPR2 and DPP6) identified in previous GWAS of sporadic ALS were not significantly associated with disease in our study. Our findings suggest that ALS is more genetically and clinically heterogeneous than previously recognized. Genotype data from our study have been made available online to facilitate such future endeavors. PMID:19193627
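The Bonferroni comparison mentioned in the abstract above can be illustrated with a short worked calculation. This is only a sketch under stated assumptions: the abstract does not say which significance level or which number of tests the authors used, so the conventional alpha of 0.05 and a correction over all 545 066 first-stage SNPs are assumed here, and the function name is hypothetical. Only the SNP count and the two P-values are taken from the abstract.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Assumption: alpha = 0.05 (not stated in the abstract).
ALPHA = 0.05
# Number of SNPs in the first-stage GWAS, as reported in the abstract.
N_SNPS = 545_066

threshold = bonferroni_threshold(ALPHA, N_SNPS)

# P-values of the two most associated SNPs, as reported in the abstract.
top_p_values = {"rs2708909": 6.98e-7, "rs2708851": 1.16e-6}

# A SNP is significant only if its P-value falls below the threshold.
significant = {snp: p < threshold for snp, p in top_p_values.items()}

print(f"Bonferroni threshold: {threshold:.3e}")
for snp, is_significant in significant.items():
    print(snp, "significant" if is_significant else "not significant")
```

Under these assumptions the threshold is roughly 9.2 × 10^-8, so neither reported P-value falls below it, which is consistent with the abstract's statement that the associations did not reach genome-wide significance.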
Epigenetic changes in leukocytes after 8 weeks of resistance exercise training.

PubMed

Denham, Joshua; Marques, Francine Z; Bruns, Emma L; O'Brien, Brendan J; Charchar, Fadi J

2016-06-01

Regular engagement in resistance exercise training elicits many health benefits including improvement to muscular strength, hypertrophy and insulin sensitivity, though the underpinning molecular mechanisms are poorly understood. The purpose of this study was to determine the influence 8 weeks of resistance exercise training has on leukocyte genome-wide DNA methylation and gene expression in healthy young men. Eight young (21.1 ± 2.2 years) men completed one repetition maximum (1RM) testing before completing 8 weeks of supervised, thrice-weekly resistance exercise training comprising three sets of 8-12 repetitions with a load equivalent to 80% of 1RM. Blood samples were collected at rest before and after the 8-week training intervention. Genome-wide DNA methylation and gene expression were assessed on isolated leukocyte DNA and RNA using the 450K BeadChip and HumanHT-12 v4 Expression BeadChip (Illumina), respectively. Resistance exercise training significantly improved upper and lower body strength concurrently with diverse genome-wide DNA methylation and gene expression changes (p ≤ 0.01). DNA methylation changes occurred at multiple regions throughout the genome in context with genes and CpG islands, and in genes relating to axon guidance, diabetes and immune pathways. There were multiple genes with increased expression that were enriched for RNA processing and developmental proteins. Growth factor genes (GHRH and FGF1) showed differential methylation and mRNA expression changes after resistance training. Our findings indicate that resistance exercise training improves muscular strength and is associated with reprogramming of the leukocyte DNA methylome and transcriptome.
SuperDCA for genome-wide epistasis analysis.

PubMed

Puranen, Santeri; Pesonen, Maiju; Pensar, Johan; Xu, Ying Ying; Lees, John A; Bentley, Stephen D; Croucher, Nicholas J; Corander, Jukka

2018-05-29

The potential for genome-wide modelling of epistasis has recently surfaced given the possibility of sequencing densely sampled populations and the emerging families of statistical interaction models. Direct coupling analysis (DCA) has previously been shown to yield valuable predictions for single protein structures, and has recently been extended to genome-wide analysis of bacteria, identifying novel interactions in the co-evolution between resistance, virulence and core genome elements. However, earlier computational DCA methods have not been scalable to enable model fitting simultaneously to 10^4-10^5 polymorphisms, representing the amount of core genomic variation observed in analyses of many bacterial species. Here, we introduce a novel inference method (SuperDCA) that employs a new scoring principle, efficient parallelization, optimization and filtering on phylogenetic information to achieve scalability for up to 10^5 polymorphisms. Using two large population samples of Streptococcus pneumoniae, we demonstrate the ability of SuperDCA to make additional significant biological findings about this major human pathogen. We also show that our method can uncover signals of selection that are not detectable by genome-wide association analysis, even though our analysis does not require phenotypic measurements. SuperDCA, thus, holds considerable potential in building understanding about numerous organisms at a systems biological level.
Genome-wide genetic homogeneity between sexes and populations for human height and body mass index.

PubMed

Yang, Jian; Bakshi, Andrew; Zhu, Zhihong; Hemani, Gibran; Vinkhuyzen, Anna A E; Nolte, Ilja M; van Vliet-Ostaptchouk, Jana V; Snieder, Harold; Esko, Tonu; Milani, Lili; Mägi, Reedik; Metspalu, Andres; Hamsten, Anders; Magnusson, Patrik K E; Pedersen, Nancy L; Ingelsson, Erik; Visscher, Peter M

2015-12-20

Sex-specific genetic effects have been proposed to be an important source of variation for human complex traits. Here we use two distinct genome-wide methods to estimate the autosomal genetic correlation (rg) between men and women for human height and body mass index (BMI), using individual-level (n = ∼44 000) and summary-level (n = ∼133 000) data from genome-wide association studies. Results are consistent and show that the between-sex genetic correlation is not significantly different from unity for both traits. In contrast, we find evidence of genetic heterogeneity between sexes for waist-hip ratio (rg = ∼0.7) and between populations for BMI (rg = ∼0.9 between Europe and the USA) but not for height. The lack of evidence for substantial genetic heterogeneity for body size is consistent with empirical findings across traits and species. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Biological Insights From 108 Schizophrenia-Associated Genetic Loci

PubMed Central

Ripke, Stephan; Neale, Benjamin M; Corvin, Aiden; Walters, James TR; Farh, Kai-How; Holmans, Peter A; Lee, Phil; Bulik-Sullivan, Brendan; Collier, David A; Huang, Hailiang; Pers, Tune H; Agartz, Ingrid; Agerbo, Esben; Albus, Margot; Alexander, Madeline; Amin, Farooq; Bacanu, Silviu A; Begemann, Martin; Belliveau, Richard A; Bene, Judit; Bergen, Sarah E; Bevilacqua, Elizabeth; Bigdeli, Tim B; Black, Donald W; Bruggeman, Richard; Buccola, Nancy G; Buckner, Randy L; Byerley, William; Cahn, Wiepke; Cai, Guiqing; Campion, Dominique; Cantor, Rita M; Carr, Vaughan J; Carrera, Noa; Catts, Stanley V; Chambert, Kimberley D; Chan, Raymond CK; Chan, Ronald YL; Chen, Eric YH; Cheng, Wei; Cheung, Eric FC; Chong, Siow Ann; Cloninger, C Robert; Cohen, David; Cohen, Nadine; Cormican, Paul; Craddock, Nick; Crowley, James J; Curtis, David; Davidson, Michael; Davis, Kenneth L; Degenhardt, Franziska; Del Favero, Jurgen; Demontis, Ditte; Dikeos, Dimitris; Dinan, Timothy; Djurovic, Srdjan; Donohoe, Gary; Drapeau, Elodie; Duan, Jubao; Dudbridge, Frank; Durmishi, Naser; Eichhammer, Peter; Eriksson, Johan; Escott-Price, Valentina; Essioux, Laurent; Fanous, Ayman H; Farrell, Martilias S; Frank, Josef; Franke, Lude; Freedman, Robert; Freimer, Nelson B; Friedl, Marion; Friedman, Joseph I; Fromer, Menachem; Genovese, Giulio; Georgieva, Lyudmila; Giegling, Ina; Giusti-Rodríguez, Paola; Godard, Stephanie; Goldstein, Jacqueline I; Golimbet, Vera; Gopal, Srihari; Gratten, Jacob; de Haan, Lieuwe; Hammer, Christian; Hamshere, Marian L; Hansen, Mark; Hansen, Thomas; Haroutunian, Vahram; Hartmann, Annette M; Henskens, Frans A; Herms, Stefan; Hirschhorn, Joel N; Hoffmann, Per; Hofman, Andrea; Hollegaard, Mads V; Hougaard, David M; Ikeda, Masashi; Joa, Inge; Julià, Antonio; Kahn, René S; Kalaydjieva, Luba; Karachanak-Yankova, Sena; Karjalainen, Juha; Kavanagh, David; Keller, Matthew C; Kennedy, James L; Khrunin, Andrey; Kim, Yunjung; Klovins, Janis; Knowles, James A; Konte, Bettina; Kucinskas, Vaidutis; 
Kucinskiene, Zita Ausrele; Kuzelova-Ptackova, Hana; Kähler, Anna K; Laurent, Claudine; Lee, Jimmy; Lee, S Hong; Legge, Sophie E; Lerer, Bernard; Li, Miaoxin; Li, Tao; Liang, Kung-Yee; Lieberman, Jeffrey; Limborska, Svetlana; Loughland, Carmel M; Lubinski, Jan; Lönnqvist, Jouko; Macek, Milan; Magnusson, Patrik KE; Maher, Brion S; Maier, Wolfgang; Mallet, Jacques; Marsal, Sara; Mattheisen, Manuel; Mattingsdal, Morten; McCarley, Robert W; McDonald, Colm; McIntosh, Andrew M; Meier, Sandra; Meijer, Carin J; Melegh, Bela; Melle, Ingrid; Mesholam-Gately, Raquelle I; Metspalu, Andres; Michie, Patricia T; Milani, Lili; Milanova, Vihra; Mokrab, Younes; Morris, Derek W; Mors, Ole; Murphy, Kieran C; Murray, Robin M; Myin-Germeys, Inez; Müller-Myhsok, Bertram; Nelis, Mari; Nenadic, Igor; Nertney, Deborah A; Nestadt, Gerald; Nicodemus, Kristin K; Nikitina-Zake, Liene; Nisenbaum, Laura; Nordin, Annelie; O’Callaghan, Eadbhard; O’Dushlaine, Colm; O’Neill, F Anthony; Oh, Sang-Yun; Olincy, Ann; Olsen, Line; Van Os, Jim; Pantelis, Christos; Papadimitriou, George N; Papiol, Sergi; Parkhomenko, Elena; Pato, Michele T; Paunio, Tiina; Pejovic-Milovancevic, Milica; Perkins, Diana O; Pietiläinen, Olli; Pimm, Jonathan; Pocklington, Andrew J; Powell, John; Price, Alkes; Pulver, Ann E; Purcell, Shaun M; Quested, Digby; Rasmussen, Henrik B; Reichenberg, Abraham; Reimers, Mark A; Richards, Alexander L; Roffman, Joshua L; Roussos, Panos; Ruderfer, Douglas M; Salomaa, Veikko; Sanders, Alan R; Schall, Ulrich; Schubert, Christian R; Schulze, Thomas G; Schwab, Sibylle G; Scolnick, Edward M; Scott, Rodney J; Seidman, Larry J; Shi, Jianxin; Sigurdsson, Engilbert; Silagadze, Teimuraz; Silverman, Jeremy M; Sim, Kang; Slominsky, Petr; Smoller, Jordan W; So, Hon-Cheong; Spencer, Chris C A; Stahl, Eli A; Stefansson, Hreinn; Steinberg, Stacy; Stogmann, Elisabeth; Straub, Richard E; Strengman, Eric; Strohmaier, Jana; Stroup, T Scott; Subramaniam, Mythily; Suvisaari, Jaana; Svrakic, Dragan M; Szatkiewicz, Jin P;
Söderman, Erik; Thirumalai, Srinivas; Toncheva, Draga; Tosato, Sarah; Veijola, Juha; Waddington, John; Walsh, Dermot; Wang, Dai; Wang, Qiang; Webb, Bradley T; Weiser, Mark; Wildenauer, Dieter B; Williams, Nigel M; Williams, Stephanie; Witt, Stephanie H; Wolen, Aaron R; Wong, Emily HM; Wormley, Brandon K; Xi, Hualin Simon; Zai, Clement C; Zheng, Xuebin; Zimprich, Fritz; Wray, Naomi R; Stefansson, Kari; Visscher, Peter M; Adolfsson, Rolf; Andreassen, Ole A; Blackwood, Douglas HR; Bramon, Elvira; Buxbaum, Joseph D; Børglum, Anders D; Cichon, Sven; Darvasi, Ariel; Domenici, Enrico; Ehrenreich, Hannelore; Esko, Tõnu; Gejman, Pablo V; Gill, Michael; Gurling, Hugh; Hultman, Christina M; Iwata, Nakao; Jablensky, Assen V; Jönsson, Erik G; Kendler, Kenneth S; Kirov, George; Knight, Jo; Lencz, Todd; Levinson, Douglas F; Li, Qingqin S; Liu, Jianjun; Malhotra, Anil K; McCarroll, Steven A; McQuillin, Andrew; Moran, Jennifer L; Mortensen, Preben B; Mowry, Bryan J; Nöthen, Markus M; Ophoff, Roel A; Owen, Michael J; Palotie, Aarno; Pato, Carlos N; Petryshen, Tracey L; Posthuma, Danielle; Rietschel, Marcella; Riley, Brien P; Rujescu, Dan; Sham, Pak C; Sklar, Pamela; St Clair, David; Weinberger, Daniel R; Wendland, Jens R; Werge, Thomas; Daly, Mark J; Sullivan, Patrick F; O’Donovan, Michael C

2014-01-01

Summary: Schizophrenia is a highly heritable disorder. Genetic risk is conferred by a large number of alleles, including common alleles of small effect that might be detected by genome-wide association studies. Here, we report a multi-stage schizophrenia genome-wide association study of up to 36,989 cases and 113,075 controls. We identify 128 independent associations spanning 108 conservatively defined loci that meet genome-wide significance, 83 of which have not been previously reported. Associations were enriched among genes expressed in brain providing biological plausibility for the findings. Many findings have the potential to provide entirely novel insights into aetiology, but associations at DRD2 and multiple genes involved in glutamatergic neurotransmission highlight molecules of known and potential therapeutic relevance to schizophrenia, and are consistent with leading pathophysiological hypotheses. Independent of genes expressed in brain, associations were enriched among genes expressed in tissues that play important roles in immunity, providing support for the hypothesized link between the immune system and schizophrenia. PMID:25056061
Genomics and functional genomics in Chlamydomonas reinhardtii

DOE Office of Scientific and Technical Information (OSTI.GOV)

Blaby, Ian K.; Blaby-Haas, Crysten E.

The availability of the Chlamydomonas reinhardtii nuclear genome sequence continues to enable researchers to address biological questions relevant to algae, land plants and animals in unprecedented ways. As we continue to characterize and understand biological processes in C. reinhardtii and translate that knowledge to other systems, we are faced with the realization that many genes encode proteins without a defined function. The field of functional genomics aims to close this gap between genome sequence and protein function. Transcriptomes, proteomes and phenomes can each provide layers of gene-specific functional data while supplying a global snapshot of cellular behavior under different conditions. Herein we present a brief history of functional genomics, the present status of the C. reinhardtii genome, how genome-wide experiments can aid in supplying protein function inferences, and provide an outlook for functional genomics in C. reinhardtii.
Genomics and functional genomics in Chlamydomonas reinhardtii

DOE PAGES

Blaby, Ian K.; Blaby-Haas, Crysten E.

2017-03-21

The availability of the Chlamydomonas reinhardtii nuclear genome sequence continues to enable researchers to address biological questions relevant to algae, land plants and animals in unprecedented ways. As we continue to characterize and understand biological processes in C. reinhardtii and translate that knowledge to other systems, we are faced with the realization that many genes encode proteins without a defined function. The field of functional genomics aims to close this gap between genome sequence and protein function. Transcriptomes, proteomes and phenomes can each provide layers of gene-specific functional data while supplying a global snapshot of cellular behavior under different conditions. Herein we present a brief history of functional genomics, the present status of the C. reinhardtii genome, how genome-wide experiments can aid in supplying protein function inferences, and provide an outlook for functional genomics in C. reinhardtii.
The function and evolution of the Aspergillus genome

PubMed Central

Gibbons, John G.; Rokas, Antonis

2012-01-01

Species in the filamentous fungal genus Aspergillus display a wide diversity of lifestyles and are of great importance to humans. The decoding of genome sequences from a dozen species that vary widely in their degree of evolutionary affinity has galvanized studies of the function and evolution of the Aspergillus genome in clinical, industrial, and agricultural environments. Here, we synthesize recent key findings that shed light on the architecture of the Aspergillus genome, on the molecular foundations of the genus’ astounding dexterity and diversity in secondary metabolism, and on the genetic underpinnings of virulence in Aspergillus fumigatus, one of the most lethal fungal pathogens. Many of these insights dramatically expand our knowledge of fungal and microbial eukaryote genome evolution and function and argue that Aspergillus constitutes a superb model clade for the study of functional and comparative genomics. PMID:23084572
Genome wide analysis of flowering time trait in multiple environments via high-throughput genotyping technique in Brassica napus L.

PubMed

Li, Lun; Long, Yan; Zhang, Libin; Dalton-Morgan, Jessica; Batley, Jacqueline; Yu, Longjiang; Meng, Jinling; Li, Maoteng

2015-01-01

The prediction of the flowering time (FT) trait in Brassica napus based on genome-wide markers and the detection of underlying genetic factors is important not only for oilseed producers around the world but also for other crop industries in the rotation system in China. In previous studies the low density and mixture of biomarkers used obstructed genomic selection in B. napus and comprehensive mapping of FT related loci. In this study, a high-density genome-wide SNP set was genotyped from a double-haploid population of B. napus. We first performed genomic prediction of FT traits in B. napus using SNPs across the genome under ten environments of three geographic regions via eight existing genomic predictive models. The results showed that all the models achieved comparably high accuracies, verifying the feasibility of genomic prediction in B. napus. Next, we performed a large-scale mapping of FT related loci among three regions, and found 437 associated SNPs, some of which represented known FT genes, such as AP1 and PHYE. The genes tagged by the associated SNPs were enriched in biological processes involved in the formation of flowers. Epistasis analysis showed that significant interactions were found between detected loci, even among some known FT related genes. All the results showed that our large-scale and high-density genotype data are of great practical and scientific value for B. napus. To the best of our knowledge, this is the first evaluation of genomic selection models in B. napus based on a high-density SNP dataset and large-scale mapping of FT loci.
Finding the genes to build C4 rice.

PubMed

Wang, Peng; Vlad, Daniela; Langdale, Jane A

2016-06-01

Rice, a C3 crop, is a staple food for more than half of the world's population, with most consumers living in developing countries. Engineering C4 photosynthetic traits into rice is increasingly suggested as a way to meet the 50% yield increase that is predicted to be needed by 2050. Advances in genome-wide deep-sequencing, gene discovery and genome editing platforms have brought the possibility of engineering a C3 to C4 conversion closer than ever before. Because C4 plants have evolved independently multiple times from C3 origins, it is probable that key genes and gene regulatory networks that regulate C4 were recruited from C3 ancestors. In the past five years there have been over 20 comparative transcriptomic studies published that aimed to identify these recruited C4 genes and regulatory mechanisms. Here we present an overview of what we have learned so far and preview the efforts still needed to provide a practical blueprint for building C4 rice. Copyright © 2016 Elsevier Ltd. All rights reserved.
Interactome analysis reveals ZNF804A, a schizophrenia risk gene, as a novel component of protein translational machinery critical for embryonic neurodevelopment

PubMed Central

Zhou, Y; Dong, F; Lanz, T A; Reinhart, V; Li, M; Liu, L; Zou, J; Xi, H S; Mao, Y

2018-01-01

Recent genome-wide association studies identified over 100 genetic loci that significantly associate with schizophrenia (SZ). A top candidate gene, ZNF804A, was robustly replicated in different populations. However, its neural functions are largely unknown. Here we show in mouse that ZFP804A, the homolog of ZNF804A, is required for normal progenitor proliferation and neuronal migration. Using a yeast two-hybrid genome-wide screen, we identified novel interacting proteins of ZNF804A. Rather than transcriptional factors, genes involved in mRNA translation are highly represented in our interactome result. ZNF804A co-fractionates with translational machinery and modulates the translational efficiency as well as the mTOR pathway. The ribosomal protein RPSA interacts with ZNF804A and rescues the migration and translational defects caused by ZNF804A knockdown. RNA immunoprecipitation–RNAseq (RIP-Seq) identified transcripts bound to ZFP804A. Consistently, ZFP804A associates with many short transcripts involved in translational and mitochondrial regulation. Moreover, among the transcripts associated with ZFP804A, a SZ risk gene, neurogranin (NRGN), is one of ZFP804A targets. Interestingly, downregulation of ZFP804A decreases NRGN expression and overexpression of NRGN can ameliorate ZFP804A-mediated migration defect. To verify the downstream targets of ZNF804A, a Duolink in situ interaction assay confirmed genes from our RIP-Seq data as the ZNF804A targets. Thus, our work uncovered a novel mechanistic link of a SZ risk gene to neurodevelopment and translational control. The interactome-driven approach here is an effective way for translating genome-wide association findings into novel biological insights of human diseases. PMID:28924186
Whole genome sequences in pulse crops: a global community resource to expedite translational genomics and knowledge-based crop improvement.

PubMed

Bohra, Abhishek; Singh, Narendra P

2015-08-01

Unprecedented developments in legume genomics over the last decade have resulted in the acquisition of a wide range of modern genomic resources to underpin genetic improvement of grain legumes. The genome enabled insights direct investigators in various ways that primarily include unearthing novel structural variations, retrieving the lost genetic diversity, introducing novel/exotic alleles from wider gene pools, finely resolving the complex quantitative traits and so forth. To this end, ready availability of cost-efficient and high-density genotyping assays allows genome wide prediction to be increasingly recognized as the key selection criterion in crop breeding. Further, the high-dimensional measurements of agronomically significant phenotypes obtained by using new-generation screening techniques will empower reference based resequencing as well as allele mining and trait mapping methods to comprehensively associate genome diversity with the phenome scale variation. Besides stimulating the forward genetic systems, accessibility to precisely delineated genomic segments reveals novel candidates for reverse genetic techniques like targeted genome editing. The shifting paradigm in plant genomics in turn necessitates optimization of crop breeding strategies to enable the most efficient integration of advanced omics knowledge and tools. We anticipate that the crop improvement schemes will be bolstered remarkably with rational deployment of these genome-guided approaches, ultimately resulting in expanded plant breeding capacities and improved crop performance.
Health Information Retrieval Tool (HIRT)

PubMed Central

Nyun, Mra Thinzar; Ogunyemi, Omolola; Zeng, Qing

2002-01-01

The World Wide Web (WWW) is a powerful way to deliver on-line health information, but one major problem limits its value to consumers: content is highly distributed, while relevant and high quality information is often difficult to find. To address this issue, we experimented with an approach that utilizes three-dimensional anatomic models in conjunction with free-text search.
Whole-Genome Sequencing Suggests Schizophrenia Risk Mechanisms in Humans with 22q11.2 Deletion Syndrome.

PubMed

Merico, Daniele; Zarrei, Mehdi; Costain, Gregory; Ogura, Lucas; Alipanahi, Babak; Gazzellone, Matthew J; Butcher, Nancy J; Thiruvahindrapuram, Bhooma; Nalpathamkalam, Thomas; Chow, Eva W C; Andrade, Danielle M; Frey, Brendan J; Marshall, Christian R; Scherer, Stephen W; Bassett, Anne S

2015-09-16

Chromosome 22q11.2 microdeletions impart a high but incomplete risk for schizophrenia. Possible mechanisms include genome-wide effects of DGCR8 haploinsufficiency. In a proof-of-principle study to assess the power of this model, we used high-quality, whole-genome sequencing of nine individuals with 22q11.2 deletions and extreme phenotypes (schizophrenia, or no psychotic disorder at age >50 years). The schizophrenia group had a greater burden of rare, damaging variants impacting protein-coding neurofunctional genes, including genes involved in neuron projection (nominal P = 0.02, joint burden of three variant types). Variants in the intact 22q11.2 region were not major contributors. Restricting to genes affected by a DGCR8 mechanism tended to amplify between-group differences. Damaging variants in highly conserved long intergenic noncoding RNA genes also were enriched in the schizophrenia group (nominal P = 0.04). The findings support the 22q11.2 deletion model as a threshold-lowering first hit for schizophrenia risk. If applied to a larger and thus better-powered cohort, this appears to be a promising approach to identify genome-wide rare variants in coding and noncoding sequence that perturb gene networks relevant to idiopathic schizophrenia. Similarly designed studies exploiting genetic models may prove useful to help delineate the genetic architecture of other complex phenotypes. Copyright © 2015 Merico et al.
Whole-Genome Sequencing Suggests Schizophrenia Risk Mechanisms in Humans with 22q11.2 Deletion Syndrome

PubMed Central

Merico, Daniele; Zarrei, Mehdi; Costain, Gregory; Ogura, Lucas; Alipanahi, Babak; Gazzellone, Matthew J.; Butcher, Nancy J.; Thiruvahindrapuram, Bhooma; Nalpathamkalam, Thomas; Chow, Eva W. C.; Andrade, Danielle M.; Frey, Brendan J.; Marshall, Christian R.; Scherer, Stephen W.; Bassett, Anne S.

2015-01-01

Chromosome 22q11.2 microdeletions impart a high but incomplete risk for schizophrenia. Possible mechanisms include genome-wide effects of DGCR8 haploinsufficiency. In a proof-of-principle study to assess the power of this model, we used high-quality, whole-genome sequencing of nine individuals with 22q11.2 deletions and extreme phenotypes (schizophrenia, or no psychotic disorder at age >50 years). The schizophrenia group had a greater burden of rare, damaging variants impacting protein-coding neurofunctional genes, including genes involved in neuron projection (nominal P = 0.02, joint burden of three variant types). Variants in the intact 22q11.2 region were not major contributors. Restricting to genes affected by a DGCR8 mechanism tended to amplify between-group differences. Damaging variants in highly conserved long intergenic noncoding RNA genes also were enriched in the schizophrenia group (nominal P = 0.04). The findings support the 22q11.2 deletion model as a threshold-lowering first hit for schizophrenia risk. If applied to a larger and thus better-powered cohort, this appears to be a promising approach to identify genome-wide rare variants in coding and noncoding sequence that perturb gene networks relevant to idiopathic schizophrenia. Similarly designed studies exploiting genetic models may prove useful to help delineate the genetic architecture of other complex phenotypes. PMID:26384369
MicroRNA-guided prioritization of genome-wide association signals reveals the importance of microRNA-target gene networks for complex traits in cattle.

PubMed

Fang, Lingzhao; Sørensen, Peter; Sahana, Goutam; Panitz, Frank; Su, Guosheng; Zhang, Shengli; Yu, Ying; Li, Bingjie; Ma, Li; Liu, George; Lund, Mogens Sandø; Thomsen, Bo

2018-06-19

MicroRNAs (miRNA) are key modulators of gene expression and so act as putative fine-tuners of complex phenotypes. Here, we hypothesized that causal variants of complex traits are enriched in miRNAs and miRNA-target networks. First, we conducted a genome-wide association study (GWAS) for seven functional and milk production traits using imputed sequence variants (13~15 million) and >10,000 animals from three dairy cattle breeds, i.e., Holstein (HOL), Nordic red cattle (RDC) and Jersey (JER). Second, we analyzed for enrichments of association signals in miRNAs and their miRNA-target networks. Our results demonstrated that genomic regions harboring miRNA genes were significantly (P < 0.05) enriched with GWAS signals for milk production traits and mastitis, and that enrichments within miRNA-target gene networks were significantly higher than in random gene-sets for the majority of traits. Furthermore, most between-trait and across-breed correlations of enrichments with miRNA-target networks were significantly greater than with random gene-sets, suggesting pleiotropic effects of miRNAs. Intriguingly, genes that were differentially expressed in response to mammary gland infections were significantly enriched in the miRNA-target networks associated with mastitis. All these findings were consistent across three breeds. Collectively, our observations demonstrate the importance of miRNAs and their targets for the expression of complex traits.
Detecting associated single-nucleotide polymorphisms on the X chromosome in case control genome-wide association studies.

PubMed

Chen, Zhongxue; Ng, Hon Keung Tony; Li, Jing; Liu, Qingzhong; Huang, Hanwen

2017-04-01

In the past decade, hundreds of genome-wide association studies have been conducted to detect the significant single-nucleotide polymorphisms that are associated with certain diseases. However, most of the data from the X chromosome were not analyzed and only a few significantly associated single-nucleotide polymorphisms from the X chromosome have been identified from genome-wide association studies. This is mainly due to the lack of powerful statistical tests. In this paper, we propose a novel statistical approach that combines the information of single-nucleotide polymorphisms on the X chromosome from both males and females in an efficient way. The proposed approach avoids the need to make strong assumptions about the underlying genetic models. Our proposed statistical test is a robust method that only makes the assumption that the risk allele is the same for both females and males if the single-nucleotide polymorphism is associated with the disease for both genders. Through a simulation study and a real data application, we show that the proposed procedure is robust and has excellent performance compared to existing methods. We expect that many more associated single-nucleotide polymorphisms on the X chromosome will be identified if the proposed approach is applied to currently available genome-wide association study data.
Genomic understanding of dinoflagellates.

PubMed

Lin, Senjie

2011-01-01

The phylum of dinoflagellates is characterized by many unusual and interesting genomic and physiological features, the imprint of which, in its immense genome, remains elusive. Much novel understanding has been achieved in the last decade on various aspects of dinoflagellate biology, but most remarkably about the structure, expression pattern and epigenetic modification of protein-coding genes in the nuclear and organellar genomes. Major findings include: 1) the great diversity of dinoflagellates, especially at the base of the dinoflagellate tree of life; 2) mini-circularization of the genomes of typical dinoflagellate plastids (with three membranes, chlorophylls a, c1 and c2, and carotenoid peridinin), the scrambled mitochondrial genome and the extensive mRNA editing occurring in both systems; 3) ubiquitous spliced leader trans-splicing of nuclear-encoded mRNA and demonstrated potential as a novel tool for studying dinoflagellate transcriptomes in mixed cultures and natural assemblages; 4) existence and expression of histones and other nucleosomal proteins; 5) a ribosomal protein set expected of typical eukaryotes; 6) genetic potential of non-photosynthetic solar energy utilization via proton-pump rhodopsin; 7) gene candidates in the toxin synthesis pathways; and 8) evidence of a highly redundant, high gene number and highly recombined genome. Despite this progress, much more work awaits genome-wide transcriptome and whole genome sequencing in order to unfold the molecular mechanisms underlying the numerous mysterious attributes of dinoflagellates. Copyright © 2011 Institut Pasteur. Published by Elsevier SAS. All rights reserved.
Genome-wide association study in Finnish twins highlights the connection between nicotine addiction and neurotrophin signaling pathway.

PubMed

Hällfors, Jenni; Palviainen, Teemu; Surakka, Ida; Gupta, Richa; Buchwald, Jadwiga; Raevuori, Anu; Ripatti, Samuli; Korhonen, Tellervo; Jousilahti, Pekka; Madden, Pamela A F; Kaprio, Jaakko; Loukola, Anu

2018-03-13

The heritability of nicotine dependence based on family studies is substantial. Nevertheless, knowledge of the underlying genetic architecture remains meager. Our aim was to identify novel genetic variants responsible for interindividual differences in smoking behavior. We performed a genome-wide association study on 1715 ever smokers ascertained from the population-based Finnish Twin Cohort enriched for heavy smoking. Data imputation used the 1000 Genomes Phase I reference panel together with a whole genome sequence-based Finnish reference panel. We analyzed three measures of nicotine addiction: smoking quantity, nicotine dependence and nicotine withdrawal. We annotated all genome-wide significant SNPs for their functional potential. First, we detected genome-wide significant association on 16p12 with smoking quantity (P = 8.5 × 10^-9), near CLEC19A. The lead SNP lies 22 kb from a binding site for NF-κB transcription factors, which play a role in the neurotrophin signaling pathway. However, the signal was not replicated in an independent Finnish population-based sample, FINRISK (n = 6763). Second, nicotine withdrawal showed association on 2q21 in an intron of TMEM163 (P = 2.1 × 10^-9), and on 11p15 (P = 6.6 × 10^-8 in an intron of AP2A2, and P = 4.2 × 10^-7 for a missense variant in MUC6, both involved in the neurotrophin signaling pathway). Third, association was detected on 3p22.3 for maximum number of cigarettes smoked per day (P = 3.1 × 10^-8) near STAC. Associating CLEC19A and TMEM163 SNPs were annotated to influence gene expression or methylation. The neurotrophin signaling pathway has previously been associated with smoking behavior. Our findings further support its role in nicotine addiction. © 2018 The Authors. Addiction Biology published by John Wiley & Sons Ltd on behalf of Society for the Study of Addiction.

Cancer Genome Anatomy Project (CGAP) | Office of Cancer Genomics

Cancer.gov

CGAP generated a wide range of genomics data on cancerous cells that are accessible through easy-to-use online tools. Researchers, educators, and students can find "in silico" answers to biological questions through the CGAP website. Request a free copy of the CGAP Website Virtual Tour CD from ocg@mail.nih.gov to learn how to navigate the website.
Genome-wide variation within and between wild and domestic yak.

PubMed

Wang, Kun; Hu, Quanjun; Ma, Hui; Wang, Lizhong; Yang, Yongzhi; Luo, Wenchun; Qiu, Qiang

2014-07-01

The yak is one of the few animals that can thrive in the harsh environment of the Qinghai-Tibetan Plateau and adjacent Alpine regions. Yak provides essential resources allowing Tibetans to live at high altitudes. However, genetic variation within and between wild and domestic yak remains unknown. Here, we present a genome-wide study of the genetic variation within and between wild and domestic yak. Using next-generation sequencing technology, we resequenced three wild and three domestic yak with a mean of fivefold coverage using our published domestic yak genome as a reference. We identified a total of 8.38 million SNPs (7.14 million novel), 383,241 InDels and 126,352 structural variants between the six yak. We observed higher linkage disequilibrium in domestic yak than in wild yak and a modest but distinct genetic divergence between these two groups. We further identified more than a thousand potential selected regions (PSRs) for the three domestic yak by scanning the whole genome. These genomic resources can be further used to study genetic diversity and select superior breeds of yak and other bovid species. © 2014 John Wiley & Sons Ltd.
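The abstract above compares linkage disequilibrium between groups without stating which measure was used. As an illustrative sketch only, the following computes the common r² statistic for two biallelic loci, assuming phased haplotypes coded 0/1 are available; the function name `ld_r_squared` and the input layout are hypothetical and are not taken from the study.

```python
def ld_r_squared(haplotypes):
    """Return r^2 between two biallelic loci.

    haplotypes: list of (a, b) tuples, one per phased haplotype,
    with each allele coded 0 or 1 (an assumed input format).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n            # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Two loci that always co-occur versus two that are independent.
linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r_squared(linked), ld_r_squared(unlinked))
```

With low-coverage resequencing data, haplotypes would first have to be inferred, so this shows only the final arithmetic.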
Genome-wide association studies on HIV susceptibility, pathogenesis and pharmacogenomics

PubMed Central

2012-01-01

Susceptibility to HIV-1 and the clinical course after infection show a substantial heterogeneity between individuals. Part of this variability can be attributed to host genetic variation. Initial candidate gene studies have revealed interesting host factors that influence HIV infection, replication and pathogenesis. Recently, genome-wide association studies (GWAS) were utilized for unbiased searches at a genome-wide level to discover novel genetic factors and pathways involved in HIV-1 infection. This review gives an overview of findings from the GWAS performed on HIV infection, within different cohorts, with variable patient and phenotype selection. Furthermore, novel techniques and strategies in research that might contribute to the complete understanding of virus-host interactions and their role in the pathogenesis of HIV infection are discussed. PMID:22920050
Genome-Wide Association of the Laboratory-Based Nicotine Metabolite Ratio in Three Ancestries.

PubMed

Baurley, James W; Edlund, Christopher K; Pardamean, Carissa I; Conti, David V; Krasnow, Ruth; Javitz, Harold S; Hops, Hyman; Swan, Gary E; Benowitz, Neal L; Bergen, Andrew W

2016-09-01

Metabolic enzyme variation and other patient and environmental characteristics influence smoking behaviors, treatment success, and risk of related disease. Population-specific variation in metabolic genes contributes to challenges in developing and optimizing pharmacogenetic interventions. We applied a custom genome-wide genotyping array for addiction research (Smokescreen), to three laboratory-based studies of nicotine metabolism with oral or venous administration of labeled nicotine and cotinine, to model nicotine metabolism in multiple populations. The trans-3'-hydroxycotinine/cotinine ratio, the nicotine metabolite ratio (NMR), was the nicotine metabolism measure analyzed. Three hundred twelve individuals of self-identified European, African, and Asian American ancestry were genotyped and included in ancestry-specific genome-wide association scans (GWAS) and a meta-GWAS analysis of the NMR. We modeled natural-log transformed NMR with covariates: principal components of genetic ancestry, age, sex, body mass index, and smoking status. African and Asian American NMRs were statistically significantly (P values ≤ 5E-5) lower than European American NMRs. Meta-GWAS analysis identified 36 genome-wide significant variants over a 43 kilobase pair region at CYP2A6 with minimum P = 2.46E-18 at rs12459249, proximal to CYP2A6. Additional minima were located in intron 4 (rs56113850, P = 6.61E-18) and in the CYP2A6-CYP2A7 intergenic region (rs34226463, P = 1.45E-12). Most (34/36) genome-wide significant variants suggested reduced CYP2A6 activity; functional mechanisms were identified and tested in knowledge-bases. Conditional analysis resulted in intergenic variants of possible interest (P values < 5E-5). This meta-GWAS of the NMR identifies CYP2A6 variants, replicates the top-ranked single nucleotide polymorphism from a recent Finnish meta-GWAS of the NMR, identifies functional mechanisms, and provides pan-continental population biomarkers for nicotine metabolism. 
This multiple ancestry meta-GWAS of the laboratory study-based NMR provides novel evidence and replication for genome-wide association of CYP2A6 single nucleotide and insertion-deletion polymorphisms. We identify three regions of genome-wide significance: proximal, intronic, and distal to CYP2A6. We replicate the top-ranking single nucleotide polymorphism from a recent GWAS of the NMR in Finnish smokers, identify a functional mechanism for this intronic variant from in silico analyses of RNA-seq data that is consistent with CYP2A6 expression measured in postmortem lung and liver, and provide additional support for the intergenic region between CYP2A6 and CYP2A7. © The Author 2016. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco.
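The abstract above combines ancestry-specific scans into a meta-GWAS but does not name the combining method. A minimal sketch of one widely used option, inverse-variance-weighted fixed-effect meta-analysis of a single variant, is given here purely as an illustration; the function name `fixed_effect_meta` and the example numbers are hypothetical and are not results from the study.

```python
from math import erfc, sqrt


def fixed_effect_meta(betas, ses):
    """Combine per-study effect estimates for one variant.

    betas: per-study effect sizes; ses: their standard errors.
    Returns (pooled beta, pooled standard error, two-sided P-value).
    """
    weights = [1.0 / (se * se) for se in ses]          # inverse-variance weights
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = sqrt(1.0 / w_sum)
    z = beta / se
    p = erfc(abs(z) / sqrt(2.0))                       # two-sided normal tail probability
    return beta, se, p


# Made-up estimates from three hypothetical ancestry-specific scans.
beta, se, p = fixed_effect_meta([-0.30, -0.25, -0.40], [0.08, 0.10, 0.12])
print(beta, se, p)
```

A fixed-effect model assumes the same true effect in every group; a random-effects model would be the usual alternative when effects are expected to differ between ancestries.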
Genome-Wide Association of the Laboratory-Based Nicotine Metabolite Ratio in Three Ancestries

PubMed Central

Baurley, James W.; Edlund, Christopher K.; Pardamean, Carissa I.; Conti, David V.; Krasnow, Ruth; Javitz, Harold S.; Hops, Hyman; Swan, Gary E.; Benowitz, Neal L.

2016-01-01

Introduction: Metabolic enzyme variation and other patient and environmental characteristics influence smoking behaviors, treatment success, and risk of related disease. Population-specific variation in metabolic genes contributes to challenges in developing and optimizing pharmacogenetic interventions. We applied a custom genome-wide genotyping array for addiction research (Smokescreen), to three laboratory-based studies of nicotine metabolism with oral or venous administration of labeled nicotine and cotinine, to model nicotine metabolism in multiple populations. The trans-3′-hydroxycotinine/cotinine ratio, the nicotine metabolite ratio (NMR), was the nicotine metabolism measure analyzed. Methods: Three hundred twelve individuals of self-identified European, African, and Asian American ancestry were genotyped and included in ancestry-specific genome-wide association scans (GWAS) and a meta-GWAS analysis of the NMR. We modeled natural-log transformed NMR with covariates: principal components of genetic ancestry, age, sex, body mass index, and smoking status. Results: African and Asian American NMRs were statistically significantly (P values ≤ 5E-5) lower than European American NMRs. Meta-GWAS analysis identified 36 genome-wide significant variants over a 43 kilobase pair region at CYP2A6 with minimum P = 2.46E-18 at rs12459249, proximal to CYP2A6. Additional minima were located in intron 4 (rs56113850, P = 6.61E-18) and in the CYP2A6-CYP2A7 intergenic region (rs34226463, P = 1.45E-12). Most (34/36) genome-wide significant variants suggested reduced CYP2A6 activity; functional mechanisms were identified and tested in knowledge-bases. Conditional analysis resulted in intergenic variants of possible interest (P values < 5E-5). 
Conclusions: This meta-GWAS of the NMR identifies CYP2A6 variants, replicates the top-ranked single nucleotide polymorphism from a recent Finnish meta-GWAS of the NMR, identifies functional mechanisms, and provides pan-continental population biomarkers for nicotine metabolism. Implications: This multiple ancestry meta-GWAS of the laboratory study-based NMR provides novel evidence and replication for genome-wide association of CYP2A6 single nucleotide and insertion–deletion polymorphisms. We identify three regions of genome-wide significance: proximal, intronic, and distal to CYP2A6. We replicate the top-ranking single nucleotide polymorphism from a recent GWAS of the NMR in Finnish smokers, identify a functional mechanism for this intronic variant from in silico analyses of RNA-seq data that is consistent with CYP2A6 expression measured in postmortem lung and liver, and provide additional support for the intergenic region between CYP2A6 and CYP2A7. PMID:27113016
snpGeneSets: An R Package for Genome-Wide Study Annotation

PubMed Central

Mei, Hao; Li, Lianna; Jiang, Fan; Simino, Jeannette; Griswold, Michael; Mosley, Thomas; Liu, Shijian

2016-01-01

Genome-wide studies (GWS) of SNP associations and differential gene expressions have generated abundant results; next-generation sequencing technology has further boosted the number of variants and genes identified. Effective interpretation requires massive annotation and downstream analysis of these genome-wide results, a computationally challenging task. We developed the snpGeneSets package to simplify annotation and analysis of GWS results. Our package integrates local copies of knowledge bases for SNPs, genes, and gene sets, and implements wrapper functions in the R language to enable transparent access to low-level databases for efficient annotation of large genomic data. The package contains functions that execute three types of annotations: (1) genomic mapping annotation for SNPs and genes and functional annotation for gene sets; (2) bidirectional mapping between SNPs and genes, and genes and gene sets; and (3) calculation of gene effect measures from SNP associations and performance of gene set enrichment analyses to identify functional pathways. We applied snpGeneSets to type 2 diabetes (T2D) results from the NHGRI genome-wide association study (GWAS) catalog, a Finnish GWAS, and a genome-wide expression study (GWES). These studies demonstrate the usefulness of snpGeneSets for annotating and performing enrichment analysis of GWS results. The package is open-source, free, and can be downloaded at: https://www.umc.edu/biostats_software/. PMID:27807048
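The abstract above mentions gene set enrichment analysis without giving the statistical test. The following is a hedged sketch of one standard way to score enrichment, an upper-tail hypergeometric test, written in Python with the standard library only. It is not the snpGeneSets API (which is an R package), and the function name and arguments are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_in_set, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn at random.

    n_universe: number of genes considered in total
    n_in_set:   number of those genes that belong to the gene set
    n_selected: number of genes flagged by the genome-wide study
    n_overlap:  number of flagged genes that are in the gene set
    """
    max_overlap = min(n_in_set, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Example: 20,000 genes, a 100-gene pathway, 200 flagged genes, 8 in the pathway.
print(hypergeom_enrichment_p(20000, 100, 200, 8))
```

Exact integer binomial coefficients keep the sketch free of rounding problems; for many gene sets a multiple-testing correction would still be needed.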
Mediator binding to UASs is broadly uncoupled from transcription and cooperative with TFIID recruitment to promoters.

PubMed

Grünberg, Sebastian; Henikoff, Steven; Hahn, Steven; Zentner, Gabriel E

2016-11-15

Mediator is a conserved, essential transcriptional coactivator complex, but its in vivo functions have remained unclear due to conflicting data regarding its genome-wide binding pattern obtained by genome-wide ChIP. Here, we used ChEC-seq, a method orthogonal to ChIP, to generate a high-resolution map of Mediator binding to the yeast genome. We find that Mediator associates with upstream activating sequences (UASs) rather than the core promoter or gene body under all conditions tested. Mediator occupancy is surprisingly correlated with transcription levels at only a small fraction of genes. Using the same approach to map TFIID, we find that TFIID is associated with both TFIID- and SAGA-dependent genes and that TFIID and Mediator occupancy is cooperative. Our results clarify Mediator recruitment and binding to the genome, showing that Mediator binding to UASs is widespread, partially uncoupled from transcription, and mediated in part by TFIID. © 2016 The Authors.
Genome-Wide Association Study for Traits Related to Plant and Grain Morphology, and Root Architecture in Temperate Rice Accessions.

PubMed

Biscarini, Filippo; Cozzi, Paolo; Casella, Laura; Riccardi, Paolo; Vattari, Alessandra; Orasen, Gabriele; Perrini, Rosaria; Tacconi, Gianni; Tondelli, Alessandro; Biselli, Chiara; Cattivelli, Luigi; Spindel, Jennifer; McCouch, Susan; Abbruscato, Pamela; Valé, Giampiero; Piffanelli, Pietro; Greco, Raffaella

2016-01-01

In this study we carried out a genome-wide association analysis for plant and grain morphology and root architecture in a unique panel of temperate rice accessions adapted to European pedo-climatic conditions. This is the first study to assess the association of selected phenotypic traits to specific genomic regions in the narrow genetic pool of temperate japonica. A set of 391 rice accessions were GBS-genotyped, yielding (after data editing) 57,000 polymorphic and informative SNPs, among which 54% were in genic regions. In total, 42 significant genotype-phenotype associations were detected: 21 for plant morphology traits, 11 for grain quality traits, 10 for root architecture traits. The FDR of detected associations ranged from 3 · 10^-7 to 0.92 (median: 0.25). In most cases, the significant detected associations co-localised with QTLs and candidate genes controlling the phenotypic variation of single or multiple traits. The most significant associations were those for flag leaf width on chromosome 4 (FDR = 3 · 10^-7) and for plant height on chromosome 6 (FDR = 0.011). We demonstrate the effectiveness and resolution of the developed platform for high-throughput phenotyping, genotyping and GWAS in detecting major QTLs for relevant traits in rice. We identified strong associations that may be used for selection in temperate irrigated rice breeding: e.g. associations for flag leaf width, plant height, root volume and length, grain length, grain width and their ratio. Our findings pave the way to successfully exploit the narrow genetic pool of European temperate rice and to pinpoint the most relevant genetic components contributing to the adaptability and high yield of this germplasm. The generated data could be of direct use in genomic-assisted breeding strategies.
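The abstract above reports FDR values for its associations but does not say how they were computed. As an illustration only, the sketch below implements the Benjamini-Hochberg adjustment, one common way of turning raw P-values into FDR-adjusted values; the choice of this procedure and the function name `benjamini_hochberg` are assumptions, not details from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # indices, smallest P first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw P-values for four marker-trait tests.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

An association is then called significant at a chosen FDR level when its adjusted value falls below that level.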
Identification of Novel Susceptibility Loci for Kawasaki Disease in a Han Chinese Population by a Genome-Wide Association Study

PubMed Central

Huang, Li-Min; Huang, Fu-Yuan; Chiu, Nan-Chang; Chen, Ming-Ren; Chi, Hsin; Lee, Yann-Jinn; Chang, Li-Ching; Liu, Yi-Min; Wang, Hsiang-Hua; Chen, Chien-Hsiun; Chen, Yuan-Tsong; Wu, Jer-Yuarn

2011-01-01

Kawasaki disease (KD) is an acute systemic vasculitis syndrome that primarily affects infants and young children. Its etiology is unknown; however, epidemiological findings suggest that genetic predisposition underlies disease susceptibility. Taiwan has the third-highest incidence of KD in the world, after Japan and Korea. To investigate novel mechanisms that might predispose individuals to KD, we conducted a genome-wide association study (GWAS) in 250 KD patients and 446 controls in a Han Chinese population residing in Taiwan, and further validated our findings in an independent Han Chinese cohort of 208 cases and 366 controls. The most strongly associated single-nucleotide polymorphisms (SNPs) detected in the joint analysis corresponded to three novel loci. Among these KD-associated SNPs three were close to the COPB2 (coatomer protein complex beta-2 subunit) gene: rs1873668 (p = 9.52×10^-5), rs4243399 (p = 9.93×10^-5), and rs16849083 (p = 9.93×10^-5). We also identified a SNP in the intronic region of the ERAP1 (endoplasmic reticulum amino peptidase 1) gene (rs149481, p_best = 4.61×10^-5). Six SNPs (rs17113284, rs8005468, rs10129255, rs2007467, rs10150241, and rs12590667) clustered in an area containing immunoglobulin heavy chain variable regions genes, with p_best values between 2.08×10^-5 and 8.93×10^-6, were also identified. This is the first KD GWAS performed in a Han Chinese population. The novel KD candidates we identified have been implicated in T cell receptor signaling, regulation of proinflammatory cytokines, as well as antibody-mediated immune responses. These findings may lead to a better understanding of the underlying molecular pathogenesis of KD. PMID:21326860
A systematic review of genome-wide research on psychotic experiences and negative symptom traits: New revelations and implications for psychiatry.

PubMed

Ronald, Angelica; Pain, Oliver

2018-05-08

We present a systematic review of genome-wide research on psychotic experience and negative symptom traits (PENS) in the community. We integrate these new findings, most of which have emerged over the last four years, with more established behaviour genetic and epidemiological research. The review includes the first genome-wide association studies of PENS, including a recent meta-analysis, and the first SNP heritability estimates. Sample sizes of < 10,000 participants mean that no genome-wide significant variants have yet been replicated. Importantly, however, in the most recent and well-powered studies, polygenic risk score prediction and linkage disequilibrium (LD) score regression analyses show that all types of PENS share genetic influences with diagnosed schizophrenia and that negative symptom traits also share genetic influences with major depression. These genetic findings corroborate other evidence in supporting a link between PENS in the community and psychiatric conditions. Beyond the systematic review, we highlight recent work on gene-environment correlation, which appears to be a relevant process for psychotic experiences. Genes that influence risk factors such as tobacco use and stressful life events are likely to be harbouring 'hits' that also influence PENS. We argue for the acceptance of PENS within the mainstream, as heritable traits in the same vein as other subclinical psychopathology and personality styles such as neuroticism. While acknowledging some mixed findings, new evidence shows genetic overlap between PENS and psychiatric conditions. In sum, normal variations in adolescent and adult thinking styles, such as feeling paranoid, are heritable and show genetic associations with schizophrenia and major depression.
Genomic newborn screening: public health policy considerations and recommendations.

PubMed

Friedman, Jan M; Cornel, Martina C; Goldenberg, Aaron J; Lister, Karla J; Sénécal, Karine; Vears, Danya F

2017-02-21

The use of genome-wide (whole genome or exome) sequencing for population-based newborn screening presents an opportunity to detect and treat or prevent many more serious early-onset health conditions than is possible today. The Paediatric Task Team of the Global Alliance for Genomics and Health's Regulatory and Ethics Working Group reviewed current understanding and concerns regarding the use of genomic technologies for population-based newborn screening and developed, by consensus, eight recommendations for clinicians, clinical laboratory scientists, and policy makers. Before genome-wide sequencing can be implemented in newborn screening programs, its clinical utility and cost-effectiveness must be demonstrated, and the ability to distinguish disease-causing and benign variants of all genes screened must be established. In addition, each jurisdiction needs to resolve ethical and policy issues regarding the disclosure of incidental or secondary findings to families and ownership, appropriate storage and sharing of genomic data. The best interests of children should be the basis for all decisions regarding the implementation of genomic newborn screening.
Discovery of human inversion polymorphisms by comparative analysis of human and chimpanzee DNA sequence assemblies.

PubMed

Feuk, Lars; MacDonald, Jeffrey R; Tang, Terence; Carson, Andrew R; Li, Martin; Rao, Girish; Khaja, Razi; Scherer, Stephen W

2005-10-01

With a draft genome-sequence assembly for the chimpanzee available, it is now possible to perform genome-wide analyses to identify, at a submicroscopic level, structural rearrangements that have occurred between chimpanzees and humans. The goal of this study was to investigate chromosomal regions that are inverted between the chimpanzee and human genomes. Using the net alignments for the builds of the human and chimpanzee genome assemblies, we identified a total of 1,576 putative regions of inverted orientation, covering more than 154 mega-bases of DNA. The DNA segments are distributed throughout the genome and range from 23 base pairs to 62 mega-bases in length. For the 66 inversions more than 25 kilobases (kb) in length, 75% were flanked on one or both sides by (often unrelated) segmental duplications. Using PCR and fluorescence in situ hybridization we experimentally validated 23 of 27 (85%) semi-randomly chosen regions; the largest novel inversion confirmed was 4.3 mega-bases at human Chromosome 7p14. Gorilla was used as an out-group to assign ancestral status to the variants. All experimentally validated inversion regions were then assayed against a panel of human samples and three of the 23 (13%) regions were found to be polymorphic in the human genome. These polymorphic inversions include 730 kb (at 7p22), 13 kb (at 7q11), and 1 kb (at 16q24) fragments with a 5%, 30%, and 48% minor allele frequency, respectively. Our results suggest that inversions are an important source of variation in primate genome evolution. The finding of at least three novel inversion polymorphisms in humans indicates this type of structural variation may be a more common feature of our genome than previously realized.
A novel variant associated with HDL-C levels by modifying DAGLB expression levels: An annotation-based genome-wide association study.

PubMed

Zhou, Dan; Zhang, Dandan; Sun, Xiaohui; Li, Zhiqiang; Ni, Yaqin; Shan, Zhongyan; Li, Hong; Liu, Chengguo; Zhang, Shuai; Liu, Yi; Zheng, Ruizhi; Pan, Feixia; Zhu, Yimin; Shi, Yongyong; Lai, Maode

2018-06-01

Although a number of genome-wide association studies (GWAS) have been performed for serum lipid levels, limited heritability has been explained. Studies showed that combining data from GWAS and expression quantitative trait loci (eQTLs) signals can both enhance the discovery of trait-associated SNPs and provide a better understanding of the mechanism. We performed an annotation-based, multistage genome-wide screening for serum-lipid-level-associated loci in a total of 6863 Han Chinese. A serum high-density lipoprotein cholesterol (HDL-C) associated variant rs1880118 (hg19 chr7:g.6435220G>C) was replicated (P_combined = 1.4E-10). rs1880118 was associated with DAGLB (diacylglycerol lipase, beta) expression levels in subcutaneous adipose tissue (P = 5.9E-42) and explained 47.7% of the expression variance. After the replication, an active segment covering variants tagged by rs1880118 near the 5' end of DAGLB was annotated using histone modification and transcription factor binding signals. The luciferase reporter assay revealed that the segment containing the minor alleles showed increased transcriptional activity compared with the segment containing the major alleles, which was consistent with the eQTL analyses. The expression-trait association tests indicated the association between DAGLB and serum HDL-C levels using gene-based approaches called "TWAS" (P = 3.0E-8), "SMR" (P = 1.1E-4), and "Sherlock" (P = 1.6E-6). To summarize, we identified a novel HDL-C-associated variant which explained nearly half of the expression variance of DAGLB. Integrated analyses established a genotype-gene-phenotype three-way association and expanded our knowledge of DAGLB in lipid metabolism.
CMS: A Web-Based System for Visualization and Analysis of Genome-Wide Methylation Data of Human Cancers

PubMed Central

Huang, Yi-Wen; Roa, Juan C.; Goodfellow, Paul J.; Kizer, E. Lynette; Huang, Tim H. M.; Chen, Yidong

2013-01-01

Background: DNA methylation of promoter CpG islands is associated with gene suppression, and its unique genome-wide profiles have been linked to tumor progression. Coupled with high-throughput sequencing technologies, it is now possible to efficiently determine genome-wide methylation profiles in cancer cells. Also, experimental and computational technologies make it possible to find the functional relationship between cancer-specific methylation patterns and their clinicopathological parameters. Methodology/Principal Findings: Cancer methylome system (CMS) is a web-based database application designed for the visualization, comparison and statistical analysis of human cancer-specific DNA methylation. Methylation intensities were obtained from MBDCap-sequencing, pre-processed and stored in the database. 191 patient samples (169 tumor and 22 normal specimens) and 41 breast cancer cell-lines are deposited in the database, comprising about 6.6 billion uniquely mapped sequence reads. This provides comprehensive and genome-wide epigenetic portraits of human breast cancer and endometrial cancer to date. Two views are proposed for users to better understand methylation structure at the genomic level or systemic methylation alteration at the gene level. In addition, a variety of annotation tracks are provided to cover genomic information. CMS includes important analytic functions for interpretation of methylation data, such as the detection of differentially methylated regions, statistical calculation of global methylation intensities, multiple gene sets of biologically significant categories, and interactivity with UCSC via custom-track data. We also present examples of discoveries utilizing the framework. Conclusions/Significance: CMS provides visualization and analytic functions for cancer methylome datasets.
A comprehensive collection of datasets, a variety of embedded analytic functions and extensive applications with biological and translational significance make this system powerful and unique in cancer methylation research. CMS is freely accessible at: http://cbbiweb.uthscsa.edu/KMethylomes/. PMID:23630576
Integrative and conjugative elements and their hosts: composition, distribution and organization.

PubMed

Cury, Jean; Touchon, Marie; Rocha, Eduardo P C

2017-09-06

Conjugation of single-stranded DNA drives horizontal gene transfer between bacteria and was widely studied in conjugative plasmids. The organization and function of integrative and conjugative elements (ICE), even if they are more abundant, was only studied in a few model systems. Comparative genomics of ICE has been precluded by the difficulty in finding and delimiting these elements. Here, we present the results of a method that circumvents these problems by requiring only the identification of the conjugation genes and the species' pan-genome. We delimited 200 ICEs and this allowed the first large-scale characterization of these elements. We quantified the presence in ICEs of a wide set of functions associated with the biology of mobile genetic elements, including some that are typically associated with plasmids, such as partition and replication. Protein sequence similarity networks and phylogenetic analyses revealed that ICEs are structured in functional modules. Integrases and conjugation systems have different evolutionary histories, even if the gene repertoires of ICEs can be grouped according to conjugation type. Our characterization of the composition and organization of ICEs paves the way for future functional and evolutionary analyses of their cargo genes, most of which are of unknown function. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Complete Genome Sequence of a Genomovirus Associated with Common Bean Plant Leaves in Brazil.

PubMed

Lamas, Natalia Silva; Fontenele, Rafaela Salgado; Melo, Fernando Lucas; Costa, Antonio Felix; Varsani, Arvind; Ribeiro, Simone Graça

2016-11-10

A new genomovirus has been identified in three common bean plants in Brazil. This virus has a circular genome of 2,220 nucleotides and 3 major open reading frames. It shares 80.7% genome-wide pairwise identity with a genomovirus recovered from Tongan fruit bat guano. Copyright © 2016 Lamas et al.
How endogenous plant pararetroviruses shed light on Musa evolution

PubMed Central

Duroy, Pierre-Olivier; Perrier, Xavier; Laboureau, Nathalie; Jacquemoud-Collet, Jean-Pierre; Iskra-Caruana, Marie-Line

2016-01-01

Background and Aims Banana genomes harbour numerous copies of viral sequences derived from banana streak viruses (BSVs) – dsDNA viruses belonging to the family Caulimoviridae. These viral integrants (eBSVs) are mostly defective, probably as a result of ‘pseudogenization’ driven by host genome evolution. However, some can give rise to infection by releasing a functional viral genome following abiotic stresses. These distinct infective eBSVs correspond to the three main widespread BSV species (BSOLV, BSGFV and BSIMV), fully described within the Musa balbisiana B genomes of the seedy diploid ‘Pisang Klutuk Wulung’ (PKW). Methods We characterize eBSV distribution among a Musa sampling including seedy BB diploids and interspecific hybrids with Musa acuminata exhibiting different levels of ploidy for the B genome (ABB, AAB, AB). We used representative samples of the two areas of sympatry between M. acuminata and M. balbisiana species representing the native area of the most widely cultivated AAB cultivars (in India and in East Asia, ranging from the Philippines to New Guinea). Seventy-seven accessions were characterized using eBSV-related PCR markers and Southern hybridization approaches. We coded both sets of results to create a common dissimilarity matrix with which to interpret eBSV distribution. Key Results We propose a Musa phylogeny driven by the M. balbisiana genome based on a dendrogram resulting from a joint neighbour-joining analysis of the three BSV species, showing for the first time lineages between BB and ABB/AAB hybrids. eBSVs appear to be relevant phylogenetic markers that can illustrate the M. balbisiana phylogeography story. Conclusion The theoretical implications of this study for further elucidation of the historical and geographical process of Musa domestication are numerous. Discovery of banana plants with B genome non-infective for eBSV opens the way to the introduction of new genitors in programmes of genetic banana improvement. PMID:26971286
How endogenous plant pararetroviruses shed light on Musa evolution.

PubMed

Duroy, Pierre-Olivier; Perrier, Xavier; Laboureau, Nathalie; Jacquemoud-Collet, Jean-Pierre; Iskra-Caruana, Marie-Line

2016-04-01

Banana genomes harbour numerous copies of viral sequences derived from banana streak viruses (BSVs) - dsDNA viruses belonging to the family Caulimoviridae. These viral integrants (eBSVs) are mostly defective, probably as a result of 'pseudogenization' driven by host genome evolution. However, some can give rise to infection by releasing a functional viral genome following abiotic stresses. These distinct infective eBSVs correspond to the three main widespread BSV species (BSOLV, BSGFV and BSIMV), fully described within the Musa balbisiana B genomes of the seedy diploid 'Pisang Klutuk Wulung' (PKW). We characterize eBSV distribution among a Musa sampling including seedy BB diploids and interspecific hybrids with Musa acuminata exhibiting different levels of ploidy for the B genome (ABB, AAB, AB). We used representative samples of the two areas of sympatry between M. acuminata and M. balbisiana species representing the native area of the most widely cultivated AAB cultivars (in India and in East Asia, ranging from the Philippines to New Guinea). Seventy-seven accessions were characterized using eBSV-related PCR markers and Southern hybridization approaches. We coded both sets of results to create a common dissimilarity matrix with which to interpret eBSV distribution. We propose a Musa phylogeny driven by the M. balbisiana genome based on a dendrogram resulting from a joint neighbour-joining analysis of the three BSV species, showing for the first time lineages between BB and ABB/AAB hybrids. eBSVs appear to be relevant phylogenetic markers that can illustrate the M. balbisiana phylogeography story. The theoretical implications of this study for further elucidation of the historical and geographical process of Musa domestication are numerous. Discovery of banana plants with B genome non-infective for eBSV opens the way to the introduction of new genitors in programmes of genetic banana improvement. © The Author 2016. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Genome-wide association study identifies three novel loci for type 2 diabetes.

PubMed

Hara, Kazuo; Fujita, Hayato; Johnson, Todd A; Yamauchi, Toshimasa; Yasuda, Kazuki; Horikoshi, Momoko; Peng, Chen; Hu, Cheng; Ma, Ronald C W; Imamura, Minako; Iwata, Minoru; Tsunoda, Tatsuhiko; Morizono, Takashi; Shojima, Nobuhiro; So, Wing Yee; Leung, Ting Fan; Kwan, Patrick; Zhang, Rong; Wang, Jie; Yu, Weihui; Maegawa, Hiroshi; Hirose, Hiroshi; Kaku, Kohei; Ito, Chikako; Watada, Hirotaka; Tanaka, Yasushi; Tobe, Kazuyuki; Kashiwagi, Atsunori; Kawamori, Ryuzo; Jia, Weiping; Chan, Juliana C N; Teo, Yik Ying; Shyong, Tai E; Kamatani, Naoyuki; Kubo, Michiaki; Maeda, Shiro; Kadowaki, Takashi

2014-01-01

Although over 60 loci for type 2 diabetes (T2D) have been identified, there still remains a large genetic component to be clarified. To explore unidentified loci for T2D, we performed a genome-wide association study (GWAS) of 6 209 637 single-nucleotide polymorphisms (SNPs), which were directly genotyped or imputed using East Asian references from the 1000 Genomes Project (June 2011 release) in 5976 Japanese patients with T2D and 20 829 nondiabetic individuals. Nineteen unreported loci were selected and taken forward to follow-up analyses. Combined discovery and follow-up analyses (30 392 cases and 34 814 controls) identified three new loci with genome-wide significance, which were MIR129-LEP [rs791595; risk allele = A; risk allele frequency (RAF) = 0.080; P = 2.55 × 10^-13; odds ratio (OR) = 1.17], GPSM1 [rs11787792; risk allele = A; RAF = 0.874; P = 1.74 × 10^-10; OR = 1.15] and SLC16A13 [rs312457; risk allele = G; RAF = 0.078; P = 7.69 × 10^-13; OR = 1.20]. This study demonstrates that GWASs based on the imputation of genotypes using modern reference haplotypes such as that from the 1000 Genomes Project data can assist in identification of new loci for common diseases.
Selection in action: dissecting the molecular underpinnings of the increasing muscle mass of Belgian Blue Cattle.

PubMed

Druet, Tom; Ahariz, Naima; Cambisano, Nadine; Tamma, Nico; Michaux, Charles; Coppieters, Wouter; Charlier, Carole; Georges, Michel

2014-09-17

Belgian Blue cattle are famous for their exceptional muscular development or "double-muscling". This defining feature emerged following the fixation of a loss-of-function variant in the myostatin gene in the eighties. Since then, sustained selection has further increased muscle mass of Belgian Blue animals to a comparable extent. In the present paper, we study the genetic determinants of this second wave of muscle growth. A scan for selective sweeps did not reveal the recent fixation of another allele with major effect on muscularity. However, a genome-wide association study identified two genome-wide significant and three suggestive quantitative trait loci (QTL) affecting specific muscle groups and jointly explaining 8-21% of the heritability. The top two QTL are caused by presumably recent mutations on unique haplotypes that have rapidly risen in frequency in the population. While one appears on its way to fixation, the ascent of the other is compromised as the likely underlying MRC2 mutation causes crooked tail syndrome in homozygotes. Genomic prediction models indicate that the residual additive variance is largely polygenic. Contrary to complex traits in humans which have a near-exclusive polygenic architecture, muscle mass in beef cattle (as other production traits under directional selection), appears to be controlled by (i) a handful of recent mutations with large effect that rapidly sweep through the population, and (ii) a large number of presumably older variants with very small effects that rise slowly in the population (polygenic adaptation).

Association of CLU and PICALM variants with Alzheimer's disease

PubMed Central

Kamboh, M.I.; Minster, R. L.; Demirci, F.Y.; Ganguli, M.; DeKosky, S.T.; Lopez, O.L.; Barmada, M.M.

2010-01-01

Two recent large genome-wide association studies have reported significant associations in the CLU (APOJ), CR1 and PICALM genes. In order to replicate these findings, we examined 7 single nucleotide polymorphisms (SNPs) most significantly implicated by these studies in a large case-control sample comprising 2,707 individuals. Principal components analysis revealed no population substructure in our sample. While no association was observed with CR1 SNPs (P=0.30–0.457), a trend of association was seen with the PICALM (P=0.071–0.086) and CLU (P=0.148–0.258) SNPs. A meta-analysis of three studies revealed significant associations with all three genes. Our data from an independent and large case-control sample suggest that these gene regions should be followed up by comprehensive resequencing to find functional variants. PMID:20570404
The genomes of three stocks comprising the most widely utilized live sporozoite Theileria parva vaccine exhibit very different degrees and patterns of sequence divergence.

PubMed

Norling, Martin; Bishop, Richard P; Pelle, Roger; Qi, Weihong; Henson, Sonal; Drábek, Elliott F; Tretina, Kyle; Odongo, David; Mwaura, Stephen; Njoroge, Thomas; Bongcam-Rudloff, Erik; Daubenberger, Claudia A; Silva, Joana C

2015-09-24

There are no commercially available vaccines against human protozoan parasitic diseases, despite the success of vaccination-induced long-term protection against infectious diseases. East Coast fever, caused by the protist Theileria parva, kills one million cattle each year in sub-Saharan Africa, and contributes significantly to hunger and poverty in the region. A highly effective, live, multi-isolate vaccine against T. parva exists, but its component isolates have not been characterized. Here we sequence and compare the three component T. parva stocks within this vaccine, the Muguga Cocktail, namely Muguga, Kiambu5 and Serengeti-transformed, aiming to identify genomic features that contribute to vaccine efficacy. We find that Serengeti-transformed, originally isolated from the wildlife carrier, the African Cape buffalo, is remarkably and unexpectedly similar to the Muguga isolate. The 420 detectable non-synonymous SNPs were distributed among only 53 genes, primarily subtelomeric antigens and antigenic families. The Kiambu5 isolate is considerably more divergent, with close to 40,000 SNPs relative to Muguga, including >8,500 non-synonymous mutations distributed among >1,700 (42.5 %) of the predicted genes. These genetic markers of the component stocks can be used to characterize the composition of new batches of the Muguga Cocktail. Differences among these three isolates, while extensive, represent only a small proportion of the genetic variation in the entire species. Given the efficacy of the Muguga Cocktail in inducing long-lasting protection against infections in the field, our results suggest that whole-organism vaccines against parasitic diseases can be highly efficacious despite considerable genome-wide differences relative to the isolates against which they protect.
Genome-Wide Association Study and Linkage Analysis of the Healthy Aging Index.

PubMed

Minster, Ryan L; Sanders, Jason L; Singh, Jatinder; Kammerer, Candace M; Barmada, M Michael; Matteini, Amy M; Zhang, Qunyuan; Wojczynski, Mary K; Daw, E Warwick; Brody, Jennifer A; Arnold, Alice M; Lunetta, Kathryn L; Murabito, Joanne M; Christensen, Kaare; Perls, Thomas T; Province, Michael A; Newman, Anne B

2015-08-01

The Healthy Aging Index (HAI) is a tool for measuring the extent of health and disease across multiple systems. We conducted a genome-wide association study and a genome-wide linkage analysis to map quantitative trait loci associated with the HAI and a modified HAI weighted for mortality risk in 3,140 individuals selected for familial longevity from the Long Life Family Study. The genome-wide association study used the Long Life Family Study as the discovery cohort and individuals from the Cardiovascular Health Study and the Framingham Heart Study as replication cohorts. There were no genome-wide significant findings from the genome-wide association study; however, several single-nucleotide polymorphisms near ZNF704 on chromosome 8q21.13 were suggestively associated with the HAI in the Long Life Family Study (p < 10^-6) and nominally replicated in the Cardiovascular Health Study and Framingham Heart Study. Linkage results revealed significant evidence (log-odds score = 3.36) for a quantitative trait locus for mortality-optimized HAI in women on chromosome 9p24-p23. However, results of fine-mapping studies did not implicate any specific candidate genes within this region of interest. ZNF704 may be a potential candidate gene for studies of the genetic underpinnings of longevity. © The Author 2015. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Controversy and debate on clinical genomics sequencing-paper 1: genomics is not exceptional: rigorous evaluations are necessary for clinical applications of genomic sequencing.

PubMed

Wilson, Brenda J; Miller, Fiona Alice; Rousseau, François

2017-12-01

Next generation genomic sequencing (NGS) technologies (whole genome and whole exome sequencing) are now cheap enough to be within the grasp of many health care organizations. To many, NGS is symbolic of cutting edge health care, offering the promise of "precision" and "personalized" medicine. Historically, research and clinical application has been a two-way street in clinical genetics: research often driven directly by the desire to understand and try to solve immediate clinical problems affecting real, identifiable patients and families, accompanied by a low threshold of willingness to apply research-driven interventions without resort to formal empirical evaluations. However, NGS technologies are not simple substitutes for older technologies and need careful evaluation for use as screening, diagnostic, or prognostic tools. We have concerns across three areas. First, at the moment, analytic validity is unknown because technical platforms are not yet stable, laboratory quality assurance programs are in their infancy, and data interpretation capabilities are badly underdeveloped. Second, clinical validity of genomic findings for patient populations without pre-existing high genetic risk is doubtful, as most clinical experience with NGS technologies relates to patients with a high prior likelihood of a genetic etiology. Finally, we are concerned that proponents argue not only for clinically driven approaches to assessing a patient's genome, but also for seeking out variants associated with unrelated conditions or susceptibilities, so-called "secondary targets"; this is screening on a genomic scale. We argue that clinical uses of genomic sequencing should remain limited to specialist and research settings, that screening for secondary findings in clinical testing should be limited to the maximum extent possible, and that the benefits, harms, and economic implications of their routine use be systematically evaluated. All stakeholders have a responsibility to ensure that patients receive effective, safe health care, in an economically sustainable health care system. There should be no exception for genome-based interventions. Copyright © 2017 Elsevier Inc. All rights reserved.
Improved Statistical Methods Enable Greater Sensitivity in Rhythm Detection for Genome-Wide Data

PubMed Central

Hutchison, Alan L.; Maienschein-Cline, Mark; Chiang, Andrew H.; Tabei, S. M. Ali; Gudjonson, Herman; Bahroos, Neil; Allada, Ravi; Dinner, Aaron R.

2015-01-01

Robust methods for identifying patterns of expression in genome-wide data are important for generating hypotheses regarding gene function. To this end, several analytic methods have been developed for detecting periodic patterns. We improve one such method, JTK_CYCLE, by explicitly calculating the null distribution such that it accounts for multiple hypothesis testing and by including non-sinusoidal reference waveforms. We term this method empirical JTK_CYCLE with asymmetry search, and we compare its performance to JTK_CYCLE with Bonferroni and Benjamini-Hochberg multiple hypothesis testing correction, as well as to five other methods: cyclohedron test, address reduction, stable persistence, ANOVA, and F24. We find that ANOVA, F24, and JTK_CYCLE consistently outperform the other three methods when data are limited and noisy; empirical JTK_CYCLE with asymmetry search gives the greatest sensitivity while controlling for the false discovery rate. Our analysis also provides insight into experimental design and we find that, for a fixed number of samples, better sensitivity and specificity are achieved with higher numbers of replicates than with higher sampling density. Application of the methods to detecting circadian rhythms in a metadataset of microarrays that quantify time-dependent gene expression in whole heads of Drosophila melanogaster reveals annotations that are enriched among genes with highly asymmetric waveforms. These include a wide range of oxidation reduction and metabolic genes, as well as genes with transcripts that have multiple splice forms. PMID:25793520
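As a rough illustration of the two standard multiple-testing corrections named in the abstract above, the following minimal Python sketch contrasts Bonferroni with Benjamini-Hochberg on a list of p-values. It is not the authors' empirical JTK_CYCLE procedure; the function names and the example p-values are hypothetical and chosen only for illustration.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject a hypothesis only if p <= alpha / m (controls family-wise error)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Step-up procedure controlling the false discovery rate at level alpha."""
    m = len(pvalues)
    # Indices of the p-values, ordered from smallest to largest p.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis whose rank is at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, e.g. one rhythmicity test per gene.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(sum(bonferroni(pvals)), sum(benjamini_hochberg(pvals)))
```

On this toy input Benjamini-Hochberg rejects more hypotheses than Bonferroni, which is the usual trade-off: higher sensitivity in exchange for controlling the false discovery rate rather than the family-wise error rate.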
Genome-Wide Variation Patterns Uncover the Origin and Selection in Cultivated Ginseng (Panax ginseng Meyer)

PubMed Central

Li, Ming-Rui; Shi, Feng-Xue; Li, Ya-Ling; Jiang, Peng; Jiao, Lili

2017-01-01

Chinese ginseng (Panax ginseng Meyer) is a medicinally important herb and plays crucial roles in traditional Chinese medicine. Pharmacological analyses identified diverse bioactive components from Chinese ginseng. However, basic biological attributes including domestication and selection of the ginseng plant remain under-investigated. Here, we presented a genome-wide view of the domestication and selection of cultivated ginseng based on the whole genome data. A total of 8,660 protein-coding genes were selected for genome-wide scanning of the 30 wild and cultivated ginseng accessions. In complement, the 45S rDNA, chloroplast and mitochondrial genomes were included to perform phylogenetic and population genetic analyses. The observed spatial genetic structure between northern cultivated ginseng (NCG) and southern cultivated ginseng (SCG) accessions suggested multiple independent origins of cultivated ginseng. Genome-wide scanning further demonstrated that NCG and SCG have undergone distinct selection pressures during the domestication process, with more genes identified in the NCG (97 genes) than in the SCG group (5 genes). Functional analyses revealed that these genes are involved in diverse pathways, including DNA methylation, lignin biosynthesis, and cell differentiation. These findings suggested that the SCG and NCG groups have distinct demographic histories. Candidate genes identified are useful for future molecular breeding of cultivated ginseng. PMID:28922794
Genome-wide significant risk associations for mucinous ovarian carcinoma

PubMed Central

Kelemen, Linda E.; Lawrenson, Kate; Tyrer, Jonathan; Li, Qiyuan; M. Lee, Janet; Seo, Ji-Heui; Phelan, Catherine M.; Beesley, Jonathan; Chen, Xiaoqin; Spindler, Tassja J.; Aben, Katja K.H.; Anton-Culver, Hoda; Antonenkova, Natalia; Baker, Helen; Bandera, Elisa V.; Bean, Yukie; Beckmann, Matthias W.; Bisogna, Maria; Bjorge, Line; Bogdanova, Natalia; Brinton, Louise A.; Brooks-Wilson, Angela; Bruinsma, Fiona; Butzow, Ralf; Campbell, Ian G.; Carty, Karen; Chang-Claude, Jenny; Chen, Y. Ann; Chen, Zhihua; Cook, Linda S.; Cramer, Daniel W.; Cunningham, Julie M.; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A.; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Easton, Douglas T.; Edwards, Robert P.; Eilber, Ursula; Ekici, Arif B.; Engelholm, Svend Aage; Fasching, Peter A.; Fridley, Brooke L.; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G.; Glasspool, Rosalind; Goode, Ellen L.; Goodman, Marc T.; Grownwald, Jacek; Harrington, Patricia; Harter, Philipp; Hasmad, Hanis Nazihah; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A.T.; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus; Hosono, Satoyo; Iversen, Edwin S.; Jakubowska, Anna; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Kellar, Melissa; Kelley, Joseph L.; Kiemeney, Lambertus A.; Krakstad, Camilla; Kjaer, Susanne K.; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D.; Lee, Alice W.; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A.; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon F.A.G.; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R.; McNeish, Iain; Menon, Usha; Modugno, Francesmary; Moes-Sosnowska, Joanna; Moysich, Kirsten B.; Narod, Steven A.; Nedergaard, Lotte; Ness, Roberta B.; Nevanlinna, Heli; Azmi, Mat Adenan Noor; Odunsi, Kunle; Olson, Sara H.; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Paul, James; Pearce, Celeste Leigh; Pejovic, Tanja; Pelttari, Liisa M.; Permuth-Wey, Jennifer; Pike, Malcolm C.; Poole, Elizabeth M.; Ramus, Susan J.; Risch, Harvey A.; Rosen, Barry; Rossing, Mary Anne; Rothstein, Joseph H.; Rudolph, Anja; Runnebaum, Ingo B.; Rzepecka, Iwona K.; Salvesen, Helga B.; Schildkraut, Joellen M.; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C.; Sucheston, Lara; Tangen, Ingvild L.; Teo, Soo-Hwang; Terry, Kathryn L.; Thompson, Pamela J; Tworoger, Shelley S.; van Altena, Anne M.; Van Nieuwenhuysen, Els; Vergote, Ignace; Vierkant, Robert A.; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S.; Wicklund, Kristine G.; Wilkens, Lynne R.; Wlodzimierz, Sawicki; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna H.; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Sellers, Thomas A.; Freedman, Matthew L.; Chenevix-Trench, Georgia; Pharoah, Paul D.; Gayther, Simon A.; Berchuck, Andrew

2015-01-01

Genome-wide association studies have identified several risk associations for ovarian carcinomas (OC) but not for mucinous ovarian carcinomas (MOC). Genotypes from OC cases and controls were imputed into the 1000 Genomes Project reference panel. Analysis of 1,644 MOC cases and 21,693 controls identified three novel risk associations: rs752590 at 2q13 (P = 3.3 × 10^-8), rs711830 at 2q31.1 (P = 7.5 × 10^-12) and rs688187 at 19q13.2 (P = 6.8 × 10^-13). Expression Quantitative Trait Locus (eQTL) analysis in ovarian and colorectal tumors (which are histologically similar to MOC) identified significant eQTL associations for HOXD9 at 2q31.1 in ovarian (P = 4.95 × 10^-4, FDR = 0.003) and colorectal (P = 0.01, FDR = 0.09) tumors, and for PAX8 at 2q13 in colorectal tumors (P = 0.03, FDR = 0.09). Chromosome conformation capture analysis identified interactions between the HOXD9 promoter and risk SNPs at 2q31.1. Overexpressing HOXD9 in MOC cells augmented the neoplastic phenotype. These findings provide the first evidence for MOC susceptibility variants and insights into the underlying biology of the disease. PMID:26075790
A genome-wide association study of copy number variations with umbilical hernia in swine.

PubMed

Long, Yi; Su, Ying; Ai, Huashui; Zhang, Zhiyan; Yang, Bin; Ruan, Guorong; Xiao, Shijun; Liao, Xinjun; Ren, Jun; Huang, Lusheng; Ding, Nengshui

2016-06-01

Umbilical hernia (UH) is one of the most common congenital defects in pigs, leading to considerable economic loss and serious animal welfare problems. To test whether copy number variations (CNVs) contribute to pig UH, we performed a case-control genome-wide CNV association study on 905 pigs from the Duroc, Landrace and Yorkshire breeds using the Porcine SNP60 BeadChip and the PennCNV algorithm. We first constructed a genomic map comprising 6193 CNVs that pertain to 737 CNV regions. Then, we identified eight CNVs significantly associated with the risk for UH in the three pig breeds. Six of seven significantly associated CNVs were validated using quantitative real-time PCR. Notably, a rare CNV (CNV14:13030843-13059455) encompassing the NUGGC gene was strongly associated with UH (permutation-corrected P = 0.0015) in Duroc pigs. This CNV occurred exclusively in seven Duroc UH-affected individuals. SNPs surrounding the CNV did not show association signals, indicating that rare CNVs may play an important role in complex pig diseases such as UH. The NUGGC gene has been implicated in human omphalocele and inguinal hernia. Our finding supports the view that CNVs, including the NUGGC CNV, contribute to the pathogenesis of pig UH. © 2016 Stichting International Foundation for Animal Genetics.
Family studies to find rare high risk variants in migraine.

PubMed

Hansen, Rikke Dyhr; Christensen, Anne Francke; Olesen, Jes

2017-12-01

Migraine has long been known as a common complex disease caused by genetic and environmental factors. The pathophysiology and the specific genetic susceptibility are poorly understood. Common variants only explain a small part of the heritability of migraine. It is thought that rare genetic variants with bigger effect size may be involved in the disease. Since migraine has a tendency to cluster in families, a family approach might be the way to find these variants. This is also indicated by identification of migraine-associated loci in classical linkage analyses in migraine families. A single migraine study using a candidate-gene approach was performed in 2010 identifying a rare mutation in the TRESK potassium channel segregating in a large family with migraine with aura, but this finding has since been questioned. Next-generation sequencing (NGS) technologies now provide an affordable tool to investigate the genetic variation in the entire exome or genome. The family-based study design using NGS is described in this paper. We also review family studies using NGS that have been successful in finding rare variants in other common complex diseases in order to argue the promising application of a family approach to migraine. PubMed was searched to find studies that looked for rare genetic variants in common complex diseases through a family-based design using NGS, excluding studies looking for de-novo mutations, or using a candidate-gene approach and studies on cancer. All issues from Nature Genetics and PLOS Genetics 2014, 2015 and 2016 (UTAI June) were screened for relevant papers. Reference lists from included and other relevant papers were also searched. For the description of the family-based study design using NGS an in-house protocol was used. Thirty-two successful studies, which covered 16 different common complex diseases, were included in this paper. We also found a single migraine study. Twenty-three studies found one or a few family specific variants (less than five), while other studies found several possible variants. Not all of them were genome-wide significant. Four studies performed follow-up analyses in unrelated cases and controls and calculated odds ratios that supported an association between detected variants and risk of disease. Studies of 11 diseases identified rare variants that segregated fully or to a large degree with the disease in the pedigrees. It is possible to find rare high risk variants for common complex diseases through a family-based approach. One study using a family approach and NGS to find rare variants in migraine has already been published but with strong limitations. More studies are under way.
Genomic and Transcriptomic Associations Identify a New Insecticide Resistance Phenotype for the Selective Sweep at the Cyp6g1 Locus of Drosophila melanogaster.

PubMed

Battlay, Paul; Schmidt, Joshua M; Fournier-Level, Alexandre; Robin, Charles

2016-08-09

Scans of the Drosophila melanogaster genome have identified organophosphate resistance loci among those with the most pronounced signature of positive selection. In this study, the molecular basis of resistance to the organophosphate insecticide azinphos-methyl was investigated using the Drosophila Genetic Reference Panel, and genome-wide association. Recently released full transcriptome data were used to extend the utility of the Drosophila Genetic Reference Panel resource beyond traditional genome-wide association studies to allow systems genetics analyses of phenotypes. We found that both genomic and transcriptomic associations independently identified Cyp6g1, a gene involved in resistance to DDT and neonicotinoid insecticides, as the top candidate for azinphos-methyl resistance. This was verified by transgenically overexpressing Cyp6g1 using natural regulatory elements from a resistant allele, resulting in a 6.5-fold increase in resistance. We also identified four novel candidate genes associated with azinphos-methyl resistance, all of which are involved in either regulation of fat storage, or nervous system development. In Cyp6g1, we find a demonstrable resistance locus, a verification that transcriptome data can be used to identify variants associated with insecticide resistance, and an overlap between peaks of a genome-wide association study, and a genome-wide selective sweep analysis. Copyright © 2016 Battlay et al.
Genome Evolution of Plant-Parasitic Nematodes.

PubMed

Kikuchi, Taisei; Eves-van den Akker, Sebastian; Jones, John T

2017-08-04

Plant parasitism has evolved independently on at least four separate occasions in the phylum Nematoda. The application of next-generation sequencing (NGS) to plant-parasitic nematodes has allowed a wide range of genome- or transcriptome-level comparisons, and these have identified genome adaptations that enable parasitism of plants. Current genome data suggest that horizontal gene transfer, gene family expansions, evolution of new genes that mediate interactions with the host, and parasitism-specific gene regulation are important adaptations that allow nematodes to parasitize plants. Sequencing of a larger number of nematode genomes, including plant parasites that show different modes of parasitism or that have evolved in currently unsampled clades, and using free-living taxa as comparators would allow more detailed analysis and a better understanding of the organization of key genes within the genomes. This would facilitate a more complete understanding of the way in which parasitism has shaped the genomes of plant-parasitic nematodes.
Genome-wide Gene Expression Profiling of Acute Metal Exposures in Male Zebrafish

DTIC Science & Technology

2014-10-23

Data in Brief Genome-wide gene expression profiling of acute metal exposures in male zebrafish Christine E. Baer a,⁎, Danielle L. Ippolito b, Naissan... Zebrafish Whole organism Nickel Chromium Cobalt Toxicogenomics To capture global responses to metal poisoning and mechanistic insights into metal...toxicity, gene expression changes were evaluated in whole adult male zebrafish following acute 24 h high dose exposure to three metals with known human
Genomic amplification of the caprine EDNRA locus might lead to a dose dependent loss of pigmentation

PubMed Central

Menzi, Fiona; Keller, Irene; Reber, Irene; Beck, Julia; Brenig, Bertram; Schütz, Ekkehard; Leeb, Tosso; Drögemüller, Cord

2016-01-01

The South African Boer goat displays a characteristic white spotting phenotype, in which the pigment is limited to the head. Exploiting the existing phenotype variation within the breed, we mapped the locus causing this white spotting phenotype to chromosome 17 by genome-wide association. Subsequent whole genome sequencing identified a 1 Mb copy number variant (CNV) harboring 5 genes including EDNRA. The analysis of 358 Boer goats revealed 3 alleles with one, two, and three copies of this CNV. The copy number is correlated with the degree of white spotting in goats. We propose a hypothesis that ectopic overexpression of a mutant EDNRA scavenges EDN3 required for EDNRB signaling and normal melanocyte development and thus likely leads to an absence of melanocytes in the non-pigmented body areas of Boer goats. Our findings demonstrate the value of domestic animals as a reservoir of unique mutants and for identifying a precisely defined functional CNV. PMID:27329507
Genome-wide linkage in Utah autism pedigrees

PubMed Central

Allen-Brady, K; Robison, R; Cannon, D; Varvil, T; Villalobos, M; Pingree, C; Leppert, MF; Miller, J; McMahon, WM; Coon, H

2014-01-01

Genetic studies of autism over the past decade suggest a complex landscape of multiple genes. In the face of this heterogeneity, studies that include large extended pedigrees may offer valuable insight, as the relatively few susceptibility genes within single large families may be more easily discerned. This genome-wide screen of 70 families includes 20 large extended pedigrees of 6–9 generations, 6 moderate-sized families of 4–5 generations, and 44 smaller families of 2–3 generations. The Center for Inherited Disease Research (CIDR) provided genotyping using the Illumina Linkage Panel 12, a 6K single nucleotide polymorphism (SNP) platform. Results from 192 subjects with an Autism Spectrum Disorder (ASD), and 461 of their relatives revealed genome-wide significance on chromosome 15q, with three possibly distinct peaks: 15q13.1-q14 (HLOD=4.09 at 29,459,872bp); 15q14-q21.1 (HLOD=3.59 at 36,837,208bp); and 15q21.1-q22.2 (HLOD=5.31 at 55,629,733bp). Two of these peaks replicate previous findings. There were additional suggestive results on chromosomes 2p25.3-p24.1 (HLOD=1.87), 7q31.31-q32.3 (HLOD=1.97), and 13q12.11-q12.3 (HLOD=1.93). Affected subjects in families supporting the linkage peaks found in this study did not reveal strong evidence for distinct phenotypic subgroups. PMID:19455147
Genome-wide meta-analysis identifies five new susceptibility loci for cutaneous malignant melanoma.

PubMed

Law, Matthew H; Bishop, D Timothy; Lee, Jeffrey E; Brossard, Myriam; Martin, Nicholas G; Moses, Eric K; Song, Fengju; Barrett, Jennifer H; Kumar, Rajiv; Easton, Douglas F; Pharoah, Paul D P; Swerdlow, Anthony J; Kypreou, Katerina P; Taylor, John C; Harland, Mark; Randerson-Moor, Juliette; Akslen, Lars A; Andresen, Per A; Avril, Marie-Françoise; Azizi, Esther; Scarrà, Giovanna Bianchi; Brown, Kevin M; Dębniak, Tadeusz; Duffy, David L; Elder, David E; Fang, Shenying; Friedman, Eitan; Galan, Pilar; Ghiorzo, Paola; Gillanders, Elizabeth M; Goldstein, Alisa M; Gruis, Nelleke A; Hansson, Johan; Helsing, Per; Hočevar, Marko; Höiom, Veronica; Ingvar, Christian; Kanetsky, Peter A; Chen, Wei V; Landi, Maria Teresa; Lang, Julie; Lathrop, G Mark; Lubiński, Jan; Mackie, Rona M; Mann, Graham J; Molven, Anders; Montgomery, Grant W; Novaković, Srdjan; Olsson, Håkan; Puig, Susana; Puig-Butille, Joan Anton; Qureshi, Abrar A; Radford-Smith, Graham L; van der Stoep, Nienke; van Doorn, Remco; Whiteman, David C; Craig, Jamie E; Schadendorf, Dirk; Simms, Lisa A; Burdon, Kathryn P; Nyholt, Dale R; Pooley, Karen A; Orr, Nick; Stratigos, Alexander J; Cust, Anne E; Ward, Sarah V; Hayward, Nicholas K; Han, Jiali; Schulze, Hans-Joachim; Dunning, Alison M; Bishop, Julia A Newton; Demenais, Florence; Amos, Christopher I; MacGregor, Stuart; Iles, Mark M

2015-09-01

Thirteen common susceptibility loci have been reproducibly associated with cutaneous malignant melanoma (CMM). We report the results of an international 2-stage meta-analysis of CMM genome-wide association studies (GWAS). This meta-analysis combines 11 GWAS (5 previously unpublished) and a further three stage 2 data sets, totaling 15,990 CMM cases and 26,409 controls. Five loci not previously associated with CMM risk reached genome-wide significance (P < 5 × 10⁻⁸), as did 2 previously reported but unreplicated loci and all 13 established loci. Newly associated SNPs fall within putative melanocyte regulatory elements, and bioinformatic and expression quantitative trait locus (eQTL) data highlight candidate genes in the associated regions, including one involved in telomere biology.
Genome-wide meta-analysis identifies five new susceptibility loci for cutaneous malignant melanoma

PubMed Central

Law, Matthew H.; Bishop, D. Timothy; Martin, Nicholas G.; Moses, Eric K.; Song, Fengju; Barrett, Jennifer H.; Kumar, Rajiv; Easton, Douglas F.; Pharoah, Paul D. P.; Swerdlow, Anthony J.; Kypreou, Katerina P.; Taylor, John C.; Harland, Mark; Randerson-Moor, Juliette; Akslen, Lars A.; Andresen, Per A.; Avril, Marie-Françoise; Azizi, Esther; Scarrà, Giovanna Bianchi; Brown, Kevin M.; Dębniak, Tadeusz; Duffy, David L.; Elder, David E.; Fang, Shenying; Friedman, Eitan; Galan, Pilar; Ghiorzo, Paola; Gillanders, Elizabeth M.; Goldstein, Alisa M.; Gruis, Nelleke A.; Hansson, Johan; Helsing, Per; Hočevar, Marko; Höiom, Veronica; Ingvar, Christian; Kanetsky, Peter A.; Chen, Wei V.; Landi, Maria Teresa; Lang, Julie; Lathrop, G. Mark; Lubiński, Jan; Mackie, Rona M.; Mann, Graham J.; Molven, Anders; Montgomery, Grant W.; Novaković, Srdjan; Olsson, Håkan; Puig, Susana; Puig-Butille, Joan Anton; Qureshi, Abrar A.; Radford-Smith, Graham L.; van der Stoep, Nienke; van Doorn, Remco; Whiteman, David C.; Craig, Jamie E.; Schadendorf, Dirk; Simms, Lisa A.; Burdon, Kathryn P.; Nyholt, Dale R.; Pooley, Karen A.; Orr, Nick; Stratigos, Alexander J.; Cust, Anne E.; Ward, Sarah V.; Hayward, Nicholas K.; Han, Jiali; Schulze, Hans-Joachim; Dunning, Alison M.; Bishop, Julia A. Newton; MacGregor, Stuart; Iles, Mark M.

2015-01-01

Thirteen common susceptibility loci have been reproducibly associated with cutaneous malignant melanoma (CMM). We report the results of an international 2-stage meta-analysis of CMM genome-wide association studies (GWAS). This meta-analysis combines 11 GWAS (5 previously unpublished) and a further three stage 2 data sets, totaling 15,990 CMM cases and 26,409 controls. Five loci not previously associated with CMM risk reached genome-wide significance (P < 5 × 10⁻⁸), as did two previously-reported but un-replicated loci and all thirteen established loci. Novel SNPs fall within putative melanocyte regulatory elements, and bioinformatic and expression quantitative trait locus (eQTL) data highlight candidate genes including one involved in telomere biology. PMID:26237428
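The genome-wide significance level quoted in the record above (P < 5 × 10⁻⁸) is commonly motivated as a Bonferroni correction of a 0.05 family-wise error rate over roughly one million independent common-variant tests. The sketch below illustrates that arithmetic only; the count of one million tests is an assumption made for illustration and is not taken from the abstract, and the function names are hypothetical.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def is_genome_wide_significant(p_value: float, threshold: float = 5e-8) -> bool:
    """Return True when a p-value falls below the chosen threshold."""
    return p_value < threshold


# Assumption for illustration: about one million independent tests.
threshold = bonferroni_threshold(alpha=0.05, n_tests=1_000_000)
print(f"Bonferroni threshold: {threshold:.1e}")

# Hypothetical p-values, not results from any study listed here.
for p in (3e-9, 2e-7):
    print(p, is_genome_wide_significant(p, threshold))
```

Some studies set a different threshold a priori, so the default of 5 × 10⁻⁸ here should be treated as a convention and not as a fixed rule.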
Parallel paleogenomic transects reveal complex genetic history of early European farmers

PubMed Central

Lipson, Mark; Szécsényi-Nagy, Anna; Mallick, Swapan; Pósa, Annamária; Stégmár, Balázs; Keerl, Victoria; Rohland, Nadin; Stewardson, Kristin; Ferry, Matthew; Michel, Megan; Oppenheimer, Jonas; Broomandkhoshbacht, Nasreen; Harney, Eadaoin; Nordenfelt, Susanne; Llamas, Bastien; Mende, Balázs Gusztáv; Köhler, Kitti; Oross, Krisztián; Bondár, Mária; Marton, Tibor; Osztás, Anett; Jakucs, János; Paluch, Tibor; Horváth, Ferenc; Csengeri, Piroska; Koós, Judit; Sebők, Katalin; Anders, Alexandra; Raczky, Pál; Regenye, Judit; Barna, Judit P.; Fábián, Szilvia; Serlegi, Gábor; Toldi, Zoltán; Nagy, Emese Gyöngyvér; Dani, János; Molnár, Erika; Pálfi, György; Márk, László; Melegh, Béla; Bánfai, Zsolt; Domboróczki, László; Fernández-Eraso, Javier; Mujika-Alustiza, José Antonio; Fernández, Carmen Alonso; Echevarría, Javier Jiménez; Bollongino, Ruth; Orschiedt, Jörg; Schierhold, Kerstin; Meller, Harald; Cooper, Alan; Burger, Joachim; Bánffy, Eszter; Alt, Kurt W.; Lalueza-Fox, Carles; Haak, Wolfgang; Reich, David

2017-01-01

Ancient DNA studies have established that Neolithic European populations were descended from Anatolian migrants [1–8] who received a limited amount of admixture from resident hunter-gatherers [3–5,9]. Many open questions remain, however, about the spatial and temporal dynamics of population interactions and admixture during the Neolithic period. Using the highest-resolution genome-wide ancient DNA data set assembled to date—a total of 180 samples, 130 newly reported here, from the Neolithic and Chalcolithic of Hungary (6000–2900 BCE, n = 100), Germany (5500–3000 BCE, n = 42), and Spain (5500–2200 BCE, n = 38)—we investigate the population dynamics of Neolithization across Europe. We find that genetic diversity was shaped predominantly by local processes, with varied sources and proportions of hunter-gatherer ancestry among the three regions and through time. Admixture between groups with different ancestry profiles was pervasive and resulted in observable population transformation across almost all cultural transitions. Our results shed new light on the ways that gene flow reshaped European populations throughout the Neolithic period and demonstrate the potential of time-series-based sampling and modeling approaches to elucidate multiple dimensions of historical population interactions. PMID:29144465
Characterizing the genetic differences between two distinct migrant groups from Indo-European and Dravidian speaking populations in India.

PubMed

Ali, Mohammad; Liu, Xuanyao; Pillai, Esakimuthu Nisha; Chen, Peng; Khor, Chiea-Chuen; Ong, Rick Twee-Hee; Teo, Yik-Ying

2014-07-22

India is home to many ethnically and linguistically diverse populations. It is hypothesized that a history of invasions by people from Persia and Central Asia, who are referred to as Aryans in Hindu Holy Scriptures, had a defining role in shaping the Indian population canvas. A shift in spoken languages from Dravidian languages to Indo-European languages around 1500 B.C. is central to the Aryan Invasion Theory. Here we investigate the genetic differences between two sub-populations of India consisting of: (1) the Indo-European language speaking Gujarati Indians with genome-wide data from the International HapMap Project; and (2) the Dravidian language speaking Tamil Indians with genome-wide data from the Singapore Genome Variation Project. We implemented three population genetics measures to identify genomic regions that are significantly differentiated between the two Indian populations originating from the north and south of India. These measures singled out genomic regions with: (i) SNPs exhibiting significant variation in allele frequencies in the two Indian populations; and (ii) differential signals of positive natural selection as quantified by the integrated haplotype score (iHS) and cross-population extended haplotype homozygosity (XP-EHH). One of the regions that emerged spans the SLC24A5 gene that has been functionally shown to affect skin pigmentation, with a higher degree of genetic sharing between Gujarati Indians and Europeans. Our finding points to gene flow from Europe to north India that provides an explanation for the lighter skin tones present in North Indians in comparison to South Indians.
Construction of high-quality recombination maps with low-coverage genomic sequencing for joint linkage analysis in maize

USDA-ARS's Scientific Manuscript database

A genome-wide association study (GWAS) is the foremost strategy used for finding genes that control human diseases and agriculturally important traits, but it often reports false positives. In contrast, its complementary method, linkage analysis, provides direct genetic confirmation, but with limite...
Comprehensive evaluation of genome-wide 5-hydroxymethylcytosine profiling approaches in human DNA.

PubMed

Skvortsova, Ksenia; Zotenko, Elena; Luu, Phuc-Loi; Gould, Cathryn M; Nair, Shalima S; Clark, Susan J; Stirzaker, Clare

2017-01-01

The discovery that 5-methylcytosine (5mC) can be oxidized to 5-hydroxymethylcytosine (5hmC) by the ten-eleven translocation (TET) proteins has prompted wide interest in the potential role of 5hmC in reshaping the mammalian DNA methylation landscape. The gold-standard bisulphite conversion technologies to study DNA methylation do not distinguish between 5mC and 5hmC. However, new approaches to mapping 5hmC genome-wide have advanced rapidly, although it is unclear how the different methods compare in accurately calling 5hmC. In this study, we provide a comparative analysis on brain DNA using three 5hmC genome-wide approaches, namely whole-genome bisulphite/oxidative bisulphite sequencing (WG Bis/OxBis-seq), Infinium HumanMethylation450 BeadChip arrays coupled with oxidative bisulphite (HM450K Bis/OxBis) and antibody-based immunoprecipitation and sequencing of hydroxymethylated DNA (hMeDIP-seq). We also perform loci-specific TET-assisted bisulphite sequencing (TAB-seq) for validation of candidate regions. We show that whole-genome single-base resolution approaches are advantaged in providing precise 5hmC values but require high sequencing depth to accurately measure 5hmC, as this modification is commonly in low abundance in mammalian cells. HM450K arrays coupled with oxidative bisulphite provide a cost-effective representation of 5hmC distribution, at CpG sites with 5hmC levels >~10%. However, 5hmC analysis is restricted to the genomic location of the probes, which is an important consideration as 5hmC modification is commonly enriched at enhancer elements. Finally, we show that the widely used hMeDIP-seq method provides an efficient genome-wide profile of 5hmC and shows high correlation with WG Bis/OxBis-seq 5hmC distribution in brain DNA. However, in cell line DNA with low levels of 5hmC, hMeDIP-seq-enriched regions are not detected by WG Bis/OxBis or HM450K, either suggesting misinterpretation of 5hmC calls by hMeDIP or lack of sensitivity of the latter methods. 
We highlight both the advantages and caveats of three commonly used genome-wide 5hmC profiling technologies and show that interpretation of 5hmC data can be significantly influenced by the sensitivity of methods used, especially as the levels of 5hmC are low and vary in different cell types and different genomic locations.
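The paired bisulphite/oxidative bisulphite designs described in the record above rest on a subtraction: standard bisulphite data report 5mC and 5hmC together, whereas oxidative bisulphite data are generally taken to report 5mC alone, so their difference estimates 5hmC. A minimal sketch of that per-site calculation follows. The function name, the site identifiers, and the choice to clip negative differences to zero are illustrative assumptions, not the authors' pipeline.

```python
def estimate_5hmc(bs_beta: float, oxbs_beta: float) -> float:
    """Estimate the 5hmC fraction at one CpG site.

    bs_beta:   methylated fraction from bisulphite data (5mC + 5hmC).
    oxbs_beta: methylated fraction from oxidative bisulphite data (5mC only).

    Negative differences, which can arise from measurement noise,
    are clipped to 0 here (an assumption of this sketch).
    """
    for name, value in (("bs_beta", bs_beta), ("oxbs_beta", oxbs_beta)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return max(0.0, bs_beta - oxbs_beta)


# Hypothetical per-site values: (bisulphite, oxidative bisulphite).
sites = {
    "cg_example_1": (0.80, 0.65),
    "cg_example_2": (0.30, 0.32),  # noise makes the raw difference negative
}

estimates = {site: estimate_5hmc(bs, oxbs) for site, (bs, oxbs) in sites.items()}
for site, value in estimates.items():
    print(f"{site}: estimated 5hmC = {value:.2f}")
```

Because 5hmC is often present at low levels, small differences of this kind are sensitive to sequencing depth and array noise, which is consistent with the sensitivity caveats raised in the abstract.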

Comparative Functional Genomics of Lactobacillus spp. Reveals Possible Mechanisms for Specialization of Vaginal Lactobacilli to Their Environment

PubMed Central

Suzuki, Haruo; Hickey, Roxana J.; Forney, Larry J.

2014-01-01

Lactobacilli are found in a wide variety of habitats. Four species, Lactobacillus crispatus, L. gasseri, L. iners, and L. jensenii, are common and abundant in the human vagina and absent from other habitats. These may be adapted to the vagina and possess characteristics enabling them to thrive in that environment. Furthermore, stable codominance of multiple Lactobacillus species in a single community is infrequently observed. Thus, it is possible that individual vaginal Lactobacillus species possess unique characteristics that confer to them host-specific competitive advantages. We performed comparative functional genomic analyses of representatives of 25 species of Lactobacillus, searching for habitat-specific traits in the genomes of the vaginal lactobacilli. We found that the genomes of the vaginal species were significantly smaller and had significantly lower GC content than those of the nonvaginal species. No protein families were found to be specific to the vaginal species analyzed, but some were either over- or underrepresented relative to nonvaginal species. We also found that within the vaginal species, each genome coded for species-specific protein families. Our results suggest that even though the vaginal species show no general signatures of adaptation to the vaginal environment, each species has specific and perhaps unique ways of interacting with its environment, be it the host or other microbes in the community. These findings will serve as a foundation for further exploring the role of lactobacilli in the ecological dynamics of vaginal microbial communities and their ultimate impact on host health. PMID:24488312
Genome-Wide Association Study of the Genetic Determinants of Emphysema Distribution.

PubMed

Boueiz, Adel; Lutz, Sharon M; Cho, Michael H; Hersh, Craig P; Bowler, Russell P; Washko, George R; Halper-Stromberg, Eitan; Bakke, Per; Gulsvik, Amund; Laird, Nan M; Beaty, Terri H; Coxson, Harvey O; Crapo, James D; Silverman, Edwin K; Castaldi, Peter J; DeMeo, Dawn L

2017-03-15

Emphysema has considerable variability in the severity and distribution of parenchymal destruction throughout the lungs. Upper lobe-predominant emphysema has emerged as an important predictor of response to lung volume reduction surgery. Yet, aside from alpha-1 antitrypsin deficiency, the genetic determinants of emphysema distribution remain largely unknown. To identify the genetic influences of emphysema distribution in non-alpha-1 antitrypsin-deficient smokers. A total of 11,532 subjects with complete genotype and computed tomography densitometry data in the COPDGene (Genetic Epidemiology of Chronic Obstructive Pulmonary Disease [COPD]; non-Hispanic white and African American), ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints), and GenKOLS (Genetics of Chronic Obstructive Lung Disease) studies were analyzed. Two computed tomography scan emphysema distribution measures (difference between upper-third and lower-third emphysema; ratio of upper-third to lower-third emphysema) were tested for genetic associations in all study subjects. Separate analyses in each study population were followed by a fixed effect metaanalysis. Single-nucleotide polymorphism-, gene-, and pathway-based approaches were used. In silico functional evaluation was also performed. We identified five loci associated with emphysema distribution at genome-wide significance. These loci included two previously reported associations with COPD susceptibility (4q31 near HHIP and 15q25 near CHRNA5) and three new associations near SOWAHB, TRAPPC9, and KIAA1462. Gene set analysis and in silico functional evaluation revealed pathways and cell types that may potentially contribute to the pathogenesis of emphysema distribution. This multicohort genome-wide association study identified new genomic loci associated with differential emphysematous destruction throughout the lungs. 
These findings may point to new biologic pathways on which to expand diagnostic and therapeutic approaches in chronic obstructive pulmonary disease. Clinical trial registered with www.clinicaltrials.gov (NCT 00608764).
Reconstructing spatial organizations of chromosomes through manifold learning

PubMed Central

Deng, Wenxuan; Hu, Hailin; Ma, Rui; Zhang, Sai; Yang, Jinglin; Peng, Jian; Kaplan, Tommy; Zeng, Jianyang

2018-01-01

Decoding the spatial organizations of chromosomes has crucial implications for studying eukaryotic gene regulation. Recently, chromosomal conformation capture based technologies, such as Hi-C, have been widely used to uncover the interaction frequencies of genomic loci in a high-throughput and genome-wide manner and provide new insights into the folding of three-dimensional (3D) genome structure. In this paper, we develop a novel manifold learning based framework, called GEM (Genomic organization reconstructor based on conformational Energy and Manifold learning), to reconstruct the three-dimensional organizations of chromosomes by integrating Hi-C data with biophysical feasibility. Unlike previous methods, which explicitly assume specific relationships between Hi-C interaction frequencies and spatial distances, our model directly embeds the neighboring affinities from Hi-C space into 3D Euclidean space. Extensive validations demonstrated that GEM not only greatly outperformed other state-of-the-art modeling methods but also provided physically and physiologically valid 3D representations of the organizations of chromosomes. Furthermore, we for the first time apply the modeled chromatin structures to recover long-range genomic interactions missing from original Hi-C data. PMID:29408992
Reconstructing spatial organizations of chromosomes through manifold learning.

PubMed

Zhu, Guangxiang; Deng, Wenxuan; Hu, Hailin; Ma, Rui; Zhang, Sai; Yang, Jinglin; Peng, Jian; Kaplan, Tommy; Zeng, Jianyang

2018-05-04

Decoding the spatial organizations of chromosomes has crucial implications for studying eukaryotic gene regulation. Recently, chromosomal conformation capture based technologies, such as Hi-C, have been widely used to uncover the interaction frequencies of genomic loci in a high-throughput and genome-wide manner and provide new insights into the folding of three-dimensional (3D) genome structure. In this paper, we develop a novel manifold learning based framework, called GEM (Genomic organization reconstructor based on conformational Energy and Manifold learning), to reconstruct the three-dimensional organizations of chromosomes by integrating Hi-C data with biophysical feasibility. Unlike previous methods, which explicitly assume specific relationships between Hi-C interaction frequencies and spatial distances, our model directly embeds the neighboring affinities from Hi-C space into 3D Euclidean space. Extensive validations demonstrated that GEM not only greatly outperformed other state-of-the-art modeling methods but also provided physically and physiologically valid 3D representations of the organizations of chromosomes. Furthermore, we for the first time apply the modeled chromatin structures to recover long-range genomic interactions missing from original Hi-C data.
Copy Number Variations in Tilapia Genomes.

PubMed

Li, Bi Jun; Li, Hong Lian; Meng, Zining; Zhang, Yong; Lin, Haoran; Yue, Gen Hua; Xia, Jun Hong

2017-02-01

Discovering the nature and pattern of genome variation is fundamental in understanding phenotypic diversity among populations. Although several millions of single nucleotide polymorphisms (SNPs) have been discovered in tilapia, the genome-wide characterization of larger structural variants, such as copy number variation (CNV) regions has not been carried out yet. We conducted a genome-wide scan for CNVs in 47 individuals from three tilapia populations. Based on 254 Gb of high-quality paired-end sequencing reads, we identified 4642 distinct high-confidence CNVs. These CNVs account for 1.9% (12.411 Mb) of the used Nile tilapia reference genome. A total of 1100 predicted CNVs were found overlapping with exon regions of protein genes. Further association analysis based on linear model regression found 85 CNVs ranging between 300 and 27,000 base pairs significantly associated to population types (R² > 0.9 and P > 0.001). Our study sheds first insights on genome-wide CNVs in tilapia. These CNVs among and within tilapia populations may have functional effects on phenotypes and specific adaptation to particular environments.
Regulatory variation: an emerging vantage point for cancer biology.

PubMed

Li, Luolan; Lorzadeh, Alireza; Hirst, Martin

2014-01-01

Transcriptional regulation involves complex and interdependent interactions of noncoding and coding regions of the genome with proteins that interact with and modify them. Genetic variation/mutation in coding and noncoding regions of the genome can drive aberrant transcription and disease. In spite of accounting for nearly 98% of the genome, comparatively little is known about the contribution of noncoding DNA elements to disease. Genome-wide association studies of complex human diseases including cancer have revealed enrichment for variants in the noncoding genome. A striking finding of recent cancer genome re-sequencing efforts has been the previously underappreciated frequency of mutations in epigenetic modifiers across a wide range of cancer types. Taken together, these results point to the importance of dysregulation in transcriptional regulatory control in the genesis of cancer. Powered by recent technological advancements in functional genomic profiling, exploration of normal and transformed regulatory networks will provide novel insight into the initiation and progression of cancer and open new windows to future prognostic and diagnostic tools. © 2013 Wiley Periodicals, Inc.
Prioritizing GWAS Results: A Review of Statistical Methods and Recommendations for Their Application

PubMed Central

Cantor, Rita M.; Lange, Kenneth; Sinsheimer, Janet S.

2010-01-01

Genome-wide association studies (GWAS) have rapidly become a standard method for disease gene discovery. A substantial number of recent GWAS indicate that for most disorders, only a few common variants are implicated and the associated SNPs explain only a small fraction of the genetic risk. This review is written from the viewpoint that findings from the GWAS provide preliminary genetic information that is available for additional analysis by statistical procedures that accumulate evidence, and that these secondary analyses are very likely to provide valuable information that will help prioritize the strongest constellations of results. We review and discuss three analytic methods to combine preliminary GWAS statistics to identify genes, alleles, and pathways for deeper investigations. Meta-analysis seeks to pool information from multiple GWAS to increase the chances of finding true positives among the false positives and provides a way to combine associations across GWAS, even when the original data are unavailable. Testing for epistasis within a single GWAS study can identify the stronger results that are revealed when genes interact. Pathway analysis of GWAS results is used to prioritize genes and pathways within a biological context. Following a GWAS, association results can be assigned to pathways and tested in aggregate with computational tools and pathway databases. Reviews of published methods with recommendations for their application are provided within the framework for each approach. PMID:20074509
Efficient genome-wide association in biobanks using topic modeling identifies multiple novel disease loci.

PubMed

McCoy, Thomas H; Castro, Victor M; Snapper, Leslie A; Hart, Kamber L; Perlis, Roy H

2017-08-31

Biobanks and national registries represent a powerful tool for genomic discovery, but rely on diagnostic codes that may be unreliable and fail to capture the relationship between related diagnoses. We developed an efficient means of conducting genome-wide association studies using combinations of diagnostic codes from electronic health records (EHR) for 10845 participants in a biobanking program at two large academic medical centers. Specifically, we applied latent Dirichlet allocation to fit 50 disease topics based on diagnostic codes, then conducted genome-wide common-variant association for each topic. In sensitivity analysis, these results were contrasted with those obtained from traditional single-diagnosis phenome-wide association analysis, as well as those in which only a subset of diagnostic codes are included per topic. In meta-analysis across three biobank cohorts, we identified 23 disease-associated loci with p<1e-15, including previously associated autoimmune disease loci. In all cases, observed significant associations were of greater magnitude than for single phenome-wide diagnostic codes, and incorporation of less strongly-loading diagnostic codes enhanced association. This strategy provides a more efficient means of phenome-wide association in biobanks with coded clinical data.
Efficient Genome-wide Association in Biobanks Using Topic Modeling Identifies Multiple Novel Disease Loci

PubMed Central

McCoy, Thomas H; Castro, Victor M; Snapper, Leslie A; Hart, Kamber L; Perlis, Roy H

2017-01-01

Biobanks and national registries represent a powerful tool for genomic discovery, but rely on diagnostic codes that can be unreliable and fail to capture relationships between related diagnoses. We developed an efficient means of conducting genome-wide association studies using combinations of diagnostic codes from electronic health records for 10,845 participants in a biobanking program at two large academic medical centers. Specifically, we applied latent Dirichlet allocation to fit 50 disease topics based on diagnostic codes, then conducted a genome-wide common-variant association for each topic. In sensitivity analysis, these results were contrasted with those obtained from traditional single-diagnosis phenome-wide association analysis, as well as those in which only a subset of diagnostic codes were included per topic. In meta-analysis across three biobank cohorts, we identified 23 disease-associated loci with p < 1e-15, including previously associated autoimmune disease loci. In all cases, observed significant associations were of greater magnitude than single phenome-wide diagnostic codes, and incorporation of less strongly loading diagnostic codes enhanced association. This strategy provides a more efficient means of identifying phenome-wide associations in biobanks with coded clinical data. PMID:28861588
Association of genome-wide variation with the risk of incident heart failure in adults of European and African ancestry: a prospective meta-analysis from the cohorts for heart and aging research in genomic epidemiology (CHARGE) consortium.

PubMed

Smith, Nicholas L; Felix, Janine F; Morrison, Alanna C; Demissie, Serkalem; Glazer, Nicole L; Loehr, Laura R; Cupples, L Adrienne; Dehghan, Abbas; Lumley, Thomas; Rosamond, Wayne D; Lieb, Wolfgang; Rivadeneira, Fernando; Bis, Joshua C; Folsom, Aaron R; Benjamin, Emelia; Aulchenko, Yurii S; Haritunians, Talin; Couper, David; Murabito, Joanne; Wang, Ying A; Stricker, Bruno H; Gottdiener, John S; Chang, Patricia P; Wang, Thomas J; Rice, Kenneth M; Hofman, Albert; Heckbert, Susan R; Fox, Ervin R; O'Donnell, Christopher J; Uitterlinden, Andre G; Rotter, Jerome I; Willerson, James T; Levy, Daniel; van Duijn, Cornelia M; Psaty, Bruce M; Witteman, Jacqueline C M; Boerwinkle, Eric; Vasan, Ramachandran S

2010-06-01

Although genetic factors contribute to the onset of heart failure (HF), no large-scale genome-wide investigation of HF risk has been published to date. We have investigated the association of 2,478,304 single-nucleotide polymorphisms with incident HF by meta-analyzing data from 4 community-based prospective cohorts: the Atherosclerosis Risk in Communities Study, the Cardiovascular Health Study, the Framingham Heart Study, and the Rotterdam Study. Eligible participants for these analyses were of European or African ancestry and free of clinical HF at baseline. Each study independently conducted genome-wide scans and imputed data to the approximately 2.5 million single-nucleotide polymorphisms in HapMap. Within each study, Cox proportional hazards regression models provided age- and sex-adjusted estimates of the association between each variant and time to incident HF. Fixed-effect meta-analyses combined results for each single-nucleotide polymorphism from the 4 cohorts to produce an overall association estimate and P value. A genome-wide significance P value threshold was set a priori at 5.0 × 10⁻⁷. During a mean follow-up of 11.5 years, 2526 incident HF events (12%) occurred in 20 926 European-ancestry participants. The meta-analysis identified a genome-wide significant locus at chromosomal position 15q22 (1.4 × 10⁻⁸), which was 58.8 kb from USP3. Among 2895 African-ancestry participants, 466 incident HF events (16%) occurred during a mean follow-up of 13.7 years. One genome-wide significant locus was identified at 12q14 (6.7 × 10⁻⁸), which was 6.3 kb from LRIG3. We identified 2 loci that were associated with incident HF and exceeded genome-wide significance. The findings merit replication in other community-based settings of incident HF.
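The fixed-effect meta-analysis step described in the record above can be sketched with the standard inverse-variance weighting scheme, in which each cohort's estimate is weighted by the reciprocal of its squared standard error. The abstract does not state which weighting the authors used, so this is an assumption; the function name and the per-cohort numbers below are hypothetical and are not results from the study.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis for one variant.

    betas: per-cohort effect estimates (e.g. log hazard ratios).
    ses:   per-cohort standard errors of those estimates.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical log hazard ratios and standard errors from four cohorts.
cohort_betas = [0.12, 0.08, 0.15, 0.10]
cohort_ses = [0.04, 0.05, 0.06, 0.03]

beta, se, z, p = fixed_effect_meta(cohort_betas, cohort_ses)
print(f"pooled beta = {beta:.4f}, SE = {se:.4f}, z = {z:.2f}, p = {p:.2e}")
```

A fixed-effect model assumes that every cohort estimates the same underlying effect; when cohorts differ systematically, a random-effects model is the usual alternative.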
Atlas2 Cloud: a framework for personal genome analysis in the cloud

PubMed Central

2012-01-01

Background: Until recently, sequencing has primarily been carried out in large genome centers which have invested heavily in developing the computational infrastructure that enables genomic sequence analysis. The recent advancements in next generation sequencing (NGS) have led to a wide dissemination of sequencing technologies and data, to highly diverse research groups. It is expected that clinical sequencing will become part of diagnostic routines shortly. However, limited accessibility to computational infrastructure and high quality bioinformatic tools, and the demand for personnel skilled in data analysis and interpretation remain a serious bottleneck. To this end, the cloud computing and Software-as-a-Service (SaaS) technologies can help address these issues. Results: We successfully enabled the Atlas2 Cloud pipeline for personal genome analysis on two different cloud service platforms: a community cloud via the Genboree Workbench, and a commercial cloud via the Amazon Web Services using Software-as-a-Service model. We report a case study of personal genome analysis using our Atlas2 Genboree pipeline. We also outline a detailed cost structure for running Atlas2 Amazon on whole exome capture data, providing cost projections in terms of storage, compute and I/O when running Atlas2 Amazon on a large data set. Conclusions: We find that providing a web interface and an optimized pipeline clearly facilitates usage of cloud computing for personal genome analysis, but for it to be routinely used for large scale projects there needs to be a paradigm shift in the way we develop tools, in standard operating procedures, and in funding mechanisms. PMID:23134663
Atlas2 Cloud: a framework for personal genome analysis in the cloud.

PubMed

Evani, Uday S; Challis, Danny; Yu, Jin; Jackson, Andrew R; Paithankar, Sameer; Bainbridge, Matthew N; Jakkamsetti, Adinarayana; Pham, Peter; Coarfa, Cristian; Milosavljevic, Aleksandar; Yu, Fuli

2012-01-01

Until recently, sequencing has primarily been carried out in large genome centers which have invested heavily in developing the computational infrastructure that enables genomic sequence analysis. The recent advancements in next generation sequencing (NGS) have led to a wide dissemination of sequencing technologies and data, to highly diverse research groups. It is expected that clinical sequencing will become part of diagnostic routines shortly. However, limited accessibility to computational infrastructure and high quality bioinformatic tools, and the demand for personnel skilled in data analysis and interpretation remains a serious bottleneck. To this end, the cloud computing and Software-as-a-Service (SaaS) technologies can help address these issues. We successfully enabled the Atlas2 Cloud pipeline for personal genome analysis on two different cloud service platforms: a community cloud via the Genboree Workbench, and a commercial cloud via the Amazon Web Services using Software-as-a-Service model. We report a case study of personal genome analysis using our Atlas2 Genboree pipeline. We also outline a detailed cost structure for running Atlas2 Amazon on whole exome capture data, providing cost projections in terms of storage, compute and I/O when running Atlas2 Amazon on a large data set. We find that providing a web interface and an optimized pipeline clearly facilitates usage of cloud computing for personal genome analysis, but for it to be routinely used for large scale projects there needs to be a paradigm shift in the way we develop tools, in standard operating procedures, and in funding mechanisms.
Lessons from ten years of genome-wide association studies of asthma

PubMed Central

Vicente, Cristina T; Revez, Joana A; Ferreira, Manuel A R

2017-01-01

Twenty-five genome-wide association studies (GWAS) of asthma were published between 2007 and 2016, the largest with a sample size of 157,242 individuals. Across these studies, 39 genetic variants in low linkage disequilibrium (LD) with each other were reported to associate with disease risk at a significance threshold of P < 5 × 10⁻⁸, including 31 in populations of European ancestry. Results from analyses of the UK Biobank data (n = 380,503) indicate that at least 28 of the 31 associations reported in Europeans represent true-positive findings, collectively explaining 2.5% of the variation in disease liability (median of 0.06% per variant). We identified 49 transcripts as likely target genes of the published asthma risk variants, mostly based on LD with expression quantitative trait loci (eQTL). Of these genes, 16 were previously implicated in disease pathophysiology by functional studies, including TSLP, TNFSF4, ADORA1, CHIT1 and USF1. In contrast, at present, there is limited or no functional evidence directly implicating the remaining 33 likely target genes in asthma pathophysiology. Some of these genes have a known function that is relevant to allergic disease, including F11R, CD247, PGAP3, AAGAB, CAMK4 and PEX14, and so could be prioritized for functional follow-up. We conclude by highlighting three areas of research that are essential to help translate GWAS findings into clinical research or practice, namely validation of target gene predictions, understanding target gene function and their role in disease pathophysiology and genomics-guided prioritization of targets for drug development. PMID:29333270
A common variant mapping to CACNA1A is associated with susceptibility to exfoliation syndrome.

PubMed

Aung, Tin; Ozaki, Mineo; Mizoguchi, Takanori; Allingham, R Rand; Li, Zheng; Haripriya, Aravind; Nakano, Satoko; Uebe, Steffen; Harder, Jeffrey M; Chan, Anita S Y; Lee, Mei Chin; Burdon, Kathryn P; Astakhov, Yury S; Abu-Amero, Khaled K; Zenteno, Juan C; Nilgün, Yildirim; Zarnowski, Tomasz; Pakravan, Mohammad; Safieh, Leen Abu; Jia, Liyun; Wang, Ya Xing; Williams, Susan; Paoli, Daniela; Schlottmann, Patricio G; Huang, Lulin; Sim, Kar Seng; Foo, Jia Nee; Nakano, Masakazu; Ikeda, Yoko; Kumar, Rajesh S; Ueno, Morio; Manabe, Shin-ichi; Hayashi, Ken; Kazama, Shigeyasu; Ideta, Ryuichi; Mori, Yosai; Miyata, Kazunori; Sugiyama, Kazuhisa; Higashide, Tomomi; Chihara, Etsuo; Inoue, Kenji; Ishiko, Satoshi; Yoshida, Akitoshi; Yanagi, Masahide; Kiuchi, Yoshiaki; Aihara, Makoto; Ohashi, Tsutomu; Sakurai, Toshiya; Sugimoto, Takako; Chuman, Hideki; Matsuda, Fumihiko; Yamashiro, Kenji; Gotoh, Norimoto; Miyake, Masahiro; Astakhov, Sergei Y; Osman, Essam A; Al-Obeidan, Saleh A; Owaidhah, Ohoud; Al-Jasim, Leyla; Al Shahwan, Sami; Fogarty, Rhys A; Leo, Paul; Yetkin, Yaz; Oğuz, Çilingir; Kanavi, Mozhgan Rezaei; Beni, Afsaneh Nederi; Yazdani, Shahin; Akopov, Evgeny L; Toh, Kai-Yee; Howell, Gareth R; Orr, Andrew C; Goh, Yufen; Meah, Wee Yang; Peh, Su Qin; Kosior-Jarecka, Ewa; Lukasik, Urszula; Krumbiegel, Mandy; Vithana, Eranga N; Wong, Tien Yin; Liu, Yutao; Koch, Allison E Ashley; Challa, Pratap; Rautenbach, Robyn M; Mackey, David A; Hewitt, Alex W; Mitchell, Paul; Wang, Jie Jin; Ziskind, Ari; Carmichael, Trevor; Ramakrishnan, Rangappa; Narendran, Kalpana; Venkatesh, Rangaraj; Vijayan, Saravanan; Zhao, Peiquan; Chen, Xueyi; Guadarrama-Vallejo, Dalia; Cheng, Ching Yu; Perera, Shamira A; Husain, Rahat; Ho, Su-Ling; Welge-Luessen, Ulrich-Christoph; Mardin, Christian; Schloetzer-Schrehardt, Ursula; Hillmer, Axel M; Herms, Stefan; Moebus, Susanne; Nöthen, Markus M; Weisschuh, Nicole; Shetty, Rohit; Ghosh, Arkasubhra; Teo, Yik Ying; Brown, Matthew A; Lischinsky, Ignacio; Crowston, Jonathan G; 
Coote, Michael; Zhao, Bowen; Sang, Jinghong; Zhang, Nihong; You, Qisheng; Vysochinskaya, Vera; Founti, Panayiota; Chatzikyriakidou, Anthoula; Lambropoulos, Alexandros; Anastasopoulos, Eleftherios; Coleman, Anne L; Wilson, M Roy; Rhee, Douglas J; Kang, Jae Hee; May-Bolchakova, Inna; Heegaard, Steffen; Mori, Kazuhiko; Alward, Wallace L M; Jonas, Jost B; Xu, Liang; Liebmann, Jeffrey M; Chowbay, Balram; Schaeffeler, Elke; Schwab, Matthias; Lerner, Fabian; Wang, Ningli; Yang, Zhenglin; Frezzotti, Paolo; Kinoshita, Shigeru; Fingert, John H; Inatani, Masaru; Tashiro, Kei; Reis, André; Edward, Deepak P; Pasquale, Louis R; Kubota, Toshiaki; Wiggs, Janey L; Pasutto, Francesca; Topouzis, Fotis; Dubina, Michael; Craig, Jamie E; Yoshimura, Nagahisa; Sundaresan, Periasamy; John, Simon W M; Ritch, Robert; Hauser, Michael A; Khor, Chiea-Chuen

2015-04-01

Exfoliation syndrome (XFS) is the most common recognizable cause of open-angle glaucoma worldwide. To better understand the etiology of XFS, we conducted a genome-wide association study (GWAS) of 1,484 cases and 1,188 controls from Japan and followed up the most significant findings in a further 6,901 cases and 20,727 controls from 17 countries across 6 continents. We discovered a genome-wide significant association between a new locus (CACNA1A rs4926244) and increased susceptibility to XFS (odds ratio (OR) = 1.16, P = 3.36 × 10(-11)). Although we also confirmed overwhelming association at the LOXL1 locus, the key SNP marker (LOXL1 rs4886776) demonstrated allelic reversal depending on the ancestry group (Japanese: OR(A allele) = 9.87, P = 2.13 × 10(-217); non-Japanese: OR(A allele) = 0.49, P = 2.35 × 10(-31)). Our findings represent the first genetic locus outside of LOXL1 surpassing genome-wide significance for XFS and provide insight into the biology and pathogenesis of the disease.
Alzheimer Disease Pathology in Cognitively Healthy Elderly: A Genome-wide Study

PubMed Central

Kramer, Patricia L; Xu, Haiyan; Woltjer, Randall L; Westaway, Shawn K; Clark, David; Erten-Lyons, Deniz; Kaye, Jeffrey A; Welsh-Bohmer, Kathleen A; Troncoso, Juan C; Markesbery, William R; Petersen, Ronald C; Turner, R Scott; Kukull, Walter A; Bennett, David A; Galasko, Douglas; Morris, John C; Ott, Jurg

2010-01-01

Many elderly individuals remain dementia-free throughout their life. However, some of these individuals exhibit Alzheimer disease neuropathology on autopsy, evidenced by neurofibrillary tangles (NFTs) in AD-specific brain regions. We conducted a genome-wide association study to identify genetic mechanisms that distinguish non-demented elderly with a heavy NFT burden from those with a low NFT burden. The study included 344 non-demented subjects with autopsy (201 subjects with low and 143 with high NFT levels). Both a genotype test, using logistic regression, and an allele test provided genome-wide significant evidence that variants in the RELN gene are associated with neuropathology in the context of cognitive health. Immunohistochemical data for reelin expression in AD-related brain regions added support for these findings. Reelin signaling pathways modulate phosphorylation of tau, the major component of NFTs, either directly or through β-amyloid pathways that influence tau phosphorylation. Our findings suggest that up-regulation of reelin may be a compensatory response to tau-related or beta-amyloid stress associated with AD even prior to the onset of dementia. PMID:20452100
Genetic findings in anorexia and bulimia nervosa.

PubMed

Hinney, Anke; Scherag, Susann; Hebebrand, Johannes

2010-01-01

Anorexia nervosa (AN) and bulimia nervosa (BN) are complex disorders associated with disordered eating behavior. Heritability estimates derived from twin and family studies are high, so that substantial genetic influences on the etiology can be assumed for both. As the monoaminergic neurotransmitter systems are involved in eating disorders (EDs), candidate gene studies have centered on related genes; additionally, genes relevant for body weight regulation have been considered as candidates. Unfortunately, this approach has yielded very few positive results; confirmed associations or findings substantiated in meta-analyses are scant. None of these associations can be considered unequivocally validated. Systematic genome-wide approaches have been performed to identify genes with no a priori evidence for their relevance in EDs. Family-based scans revealed linkage peaks in single chromosomal regions for AN and BN. Analyses of candidate genes in one of these regions led to the identification of genetic variants associated with AN. Currently, an international consortium is conducting a genome-wide association study for AN, which will hopefully lead to the identification of the first genome-wide significant markers. Copyright © 2010 Elsevier Inc. All rights reserved.
Employing genome-wide SNP discovery and genotyping strategy to extrapolate the natural allelic diversity and domestication patterns in chickpea

PubMed Central

Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D.; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S.; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C. L. L.; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

2015-01-01

The genome-wide discovery and high-throughput genotyping of SNPs in chickpea natural germplasm lines is indispensable to extrapolate their natural allelic diversity, domestication, and linkage disequilibrium (LD) patterns leading to the genetic enhancement of this vital legume crop. We discovered 44,844 high-quality SNPs by sequencing of 93 diverse cultivated desi, kabuli, and wild chickpea accessions using reference genome- and de novo-based GBS (genotyping-by-sequencing) assays that were physically mapped across eight chromosomes of desi and kabuli. Of these, 22,542 SNPs were structurally annotated in different coding and non-coding sequence components of genes. Genes with 3296 non-synonymous and 269 regulatory SNPs could functionally differentiate accessions based on their contrasting agronomic traits. A high experimental validation success rate (92%) and reproducibility (100%) along with strong sensitivity (93–96%) and specificity (99%) of GBS-based SNPs was observed. This infers the robustness of GBS as a high-throughput assay for rapid large-scale mining and genotyping of genome-wide SNPs in chickpea with sub-optimal use of resources. With 23,798 genome-wide SNPs, a relatively high intra-specific polymorphic potential (49.5%) and broader molecular diversity (13–89%)/functional allelic diversity (18–77%) was apparent among 93 chickpea accessions, suggesting their tremendous applicability in rapid selection of desirable diverse accessions/inter-specific hybrids in chickpea crossbred varietal improvement program. The genome-wide SNPs revealed complex admixed domestication pattern, extensive LD estimates (0.54–0.68) and extended LD decay (400–500 kb) in a structured population inclusive of 93 accessions. 
These findings reflect the utility of our identified SNPs for subsequent genome-wide association study (GWAS) and selective sweep-based domestication trait dissection analysis to identify potential genomic loci (gene-associated targets) specifically regulating important complex quantitative agronomic traits in chickpea. The numerous informative genome-wide SNPs, natural allelic diversity-led domestication pattern, and LD-based information generated in our study have got multidimensional applicability with respect to chickpea genomics-assisted breeding. PMID:25873920
Genome-wide single nucleotide polymorphisms reveal population history and adaptive divergence in wild guppies.

PubMed

Willing, Eva-Maria; Bentzen, Paul; van Oosterhout, Cock; Hoffmann, Margarete; Cable, Joanne; Breden, Felix; Weigel, Detlef; Dreyer, Christine

2010-03-01

Adaptation of guppies (Poecilia reticulata) to contrasting upland and lowland habitats has been extensively studied with respect to behaviour, morphology and life history traits. Yet population history has not been studied at the whole-genome level. Although single nucleotide polymorphisms (SNPs) are the most abundant form of variation in many genomes and consequently very informative for a genome-wide picture of standing natural variation in populations, genome-wide SNP data are rarely available for wild vertebrates. Here we use genetically mapped SNP markers to comprehensively survey genetic variation within and among naturally occurring guppy populations from a wide geographic range in Trinidad and Venezuela. Results from three different clustering methods, Neighbor-net, principal component analysis (PCA) and Bayesian analysis show that the population substructure agrees with geographic separation and largely with previously hypothesized patterns of historical colonization. Within major drainages (Caroni, Oropouche and Northern), populations are genetically similar, but those in different geographic regions are highly divergent from one another, with some indications of ancient shared polymorphisms. Clear genomic signatures of a previous introduction experiment were seen, and we detected additional potential admixture events. Headwater populations were significantly less heterozygous than downstream populations. Pairwise F(ST) values revealed marked differences in allele frequencies among populations from different regions, and also among populations within the same region. F(ST) outlier methods indicated some regions of the genome as being under directional selection. Overall, this study demonstrates the power of a genome-wide SNP data set to inform for studies on natural variation, adaptation and evolution of wild populations.
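To make the pairwise F(ST) comparison above concrete, here is a minimal sketch for a single biallelic SNP. It uses Wright's heterozygosity-based definition with two equally weighted populations, which is an assumption: the abstract does not state which estimator the authors used, and the allele frequencies below are hypothetical.

```python
def fst_biallelic(p1, p2):
    """Wright's F_ST for one biallelic SNP in two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    F_ST = (H_T - H_S) / H_T, where H_T is the expected heterozygosity of the
    pooled population and H_S the mean within-population heterozygosity.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_t == 0.0:
        return 0.0  # SNP is monomorphic overall; no differentiation to measure
    return (h_t - h_s) / h_t

# Hypothetical allele frequencies (illustrative only).
print(fst_biallelic(0.50, 0.50))  # identical frequencies: no differentiation
print(fst_biallelic(0.20, 0.80))  # divergent frequencies: higher F_ST
```

Outlier scans of the kind mentioned in the abstract flag SNPs whose F(ST) is unusually high relative to the genome-wide distribution; that step is not shown here.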
Adaptive Evolution of Extreme Acidophile Sulfobacillus thermosulfidooxidans Potentially Driven by Horizontal Gene Transfer and Gene Loss

PubMed Central

Zhang, Xian; Liu, Xueduan; Liang, Yili; Guo, Xue; Xiao, Yunhua; Ma, Liyuan; Miao, Bo; Liu, Hongwei; Peng, Deliang; Huang, Wenkun; Zhang, Yuguang

2017-01-01

ABSTRACT Recent phylogenomic analysis has suggested that three strains isolated from different copper mine tailings around the world were taxonomically affiliated with Sulfobacillus thermosulfidooxidans. Here, we present a detailed investigation of their genomic features, particularly with respect to metabolic potentials and stress tolerance mechanisms. Comprehensive analysis of the Sulfobacillus genomes identified a core set of essential genes with specialized biological functions in the survival of acidophiles in their habitats, despite differences in their metabolic pathways. The Sulfobacillus strains also showed evidence for stress management, thereby enabling them to efficiently respond to harsh environments. Further analysis of metabolic profiles provided novel insights into the presence of genomic streamlining, highlighting the importance of gene loss as a main mechanism that potentially contributes to cellular economization. Another important evolutionary force, especially in larger genomes, is gene acquisition via horizontal gene transfer (HGT), which might play a crucial role in the recruitment of novel functionalities. Also, a successful integration of genes acquired from archaeal donors appears to be an effective way of enhancing the adaptive capacity to cope with environmental changes. Taken together, the findings of this study significantly expand the spectrum of HGT and genome reduction in shaping the evolutionary history of Sulfobacillus strains. IMPORTANCE Horizontal gene transfer (HGT) and gene loss are recognized as major driving forces that contribute to the adaptive evolution of microbial genomes, although their relative importance remains elusive. The findings of this study suggest that highly frequent gene turnovers within microorganisms via HGT were necessary to incur additional novel functionalities to increase the capacity of acidophiles to adapt to changing environments. 
Evidence also reveals a fascinating phenomenon of potential cross-kingdom HGT. Furthermore, genome streamlining may be a critical force in driving the evolution of microbial genomes. Taken together, this study provides insights into the importance of both HGT and gene loss in the evolution and diversification of bacterial genomes. PMID:28115381
Adaptive Evolution of Extreme Acidophile Sulfobacillus thermosulfidooxidans Potentially Driven by Horizontal Gene Transfer and Gene Loss.

PubMed

Zhang, Xian; Liu, Xueduan; Liang, Yili; Guo, Xue; Xiao, Yunhua; Ma, Liyuan; Miao, Bo; Liu, Hongwei; Peng, Deliang; Huang, Wenkun; Zhang, Yuguang; Yin, Huaqun

2017-04-01

Recent phylogenomic analysis has suggested that three strains isolated from different copper mine tailings around the world were taxonomically affiliated with Sulfobacillus thermosulfidooxidans. Here, we present a detailed investigation of their genomic features, particularly with respect to metabolic potentials and stress tolerance mechanisms. Comprehensive analysis of the Sulfobacillus genomes identified a core set of essential genes with specialized biological functions in the survival of acidophiles in their habitats, despite differences in their metabolic pathways. The Sulfobacillus strains also showed evidence for stress management, thereby enabling them to efficiently respond to harsh environments. Further analysis of metabolic profiles provided novel insights into the presence of genomic streamlining, highlighting the importance of gene loss as a main mechanism that potentially contributes to cellular economization. Another important evolutionary force, especially in larger genomes, is gene acquisition via horizontal gene transfer (HGT), which might play a crucial role in the recruitment of novel functionalities. Also, a successful integration of genes acquired from archaeal donors appears to be an effective way of enhancing the adaptive capacity to cope with environmental changes. Taken together, the findings of this study significantly expand the spectrum of HGT and genome reduction in shaping the evolutionary history of Sulfobacillus strains. IMPORTANCE Horizontal gene transfer (HGT) and gene loss are recognized as major driving forces that contribute to the adaptive evolution of microbial genomes, although their relative importance remains elusive. The findings of this study suggest that highly frequent gene turnovers within microorganisms via HGT were necessary to incur additional novel functionalities to increase the capacity of acidophiles to adapt to changing environments. Evidence also reveals a fascinating phenomenon of potential cross-kingdom HGT. 
Furthermore, genome streamlining may be a critical force in driving the evolution of microbial genomes. Taken together, this study provides insights into the importance of both HGT and gene loss in the evolution and diversification of bacterial genomes. Copyright © 2017 American Society for Microbiology.

Genome-wide meta-analysis identifies novel gender specific loci associated with thyroid antibodies level in Croatians.

PubMed

Matana, Antonela; Popović, Marijana; Boutin, Thibaud; Torlak, Vesela; Brdar, Dubravka; Gunjača, Ivana; Kolčić, Ivana; Boraska Perica, Vesna; Punda, Ante; Polašek, Ozren; Hayward, Caroline; Barbalić, Maja; Zemunik, Tatijana

2018-04-18

Autoimmune thyroid diseases (AITD) are multifactorial endocrine diseases most frequently accompanied by Tg and TPO autoantibodies. Both antibodies have a higher prevalence in females and act under a strong genetic influence. To identify novel variants underlying thyroid antibody levels, we performed GWAS meta-analysis on the plasma levels of TgAb and TPOAb in three Croatian cohorts, as well as gender specific GWAS and a bivariate analysis. No significant association was detected with the level of TgAb and TPOAb in the meta-analysis of GWAS or bivariate results for all individuals. The bivariate analysis in females only revealed a genome-wide significant association for the locus near GRIN3A (rs4457391, P = 7.76 × 10⁻⁹). The same locus had borderline association with TPOAb levels in females (rs1935377, P = 8.58 × 10⁻⁸). In conclusion, we identified a novel gender specific locus associated with TgAb and TPOAb levels. Our findings provide a novel insight into genetic and gender differences associated with thyroid antibodies. Copyright © 2018 Elsevier Inc. All rights reserved.
Recapitulation of genome-wide association studies on pulse pressure and mean arterial pressure in the Korean population.

PubMed

Hong, Kyung-Won; Min, Haesook; Heo, Byeong-Mun; Joo, Seong Eun; Kim, Sung Soo; Kim, Yeonjung

2012-06-01

Increased pulse pressure (PP) and decreased mean arterial pressure (MAP) are strong prognostic predictors of adverse cardiovascular events. Recently, the International Consortium for Blood Pressure Genome-Wide Association Studies (ICBP-GWAS) reported eight loci that influenced PP and MAP. The ICBP-GWAS examined 51 cohorts (comprising 122 671 individuals of European ancestry) and identified eight SNPs: five that governed PP and three that controlled MAP. Six of these loci were novel. To replicate these newly identified loci and compare the genetic architecture of PP and MAP between European and Asian populations, we conducted a meta-analysis of the eight SNPs combining data from ICBP and general population-based Korean cohorts. Two SNPs (rs13002573 (FIGN) and rs871606 (CHIC2)) for PP and two SNPs (rs1446468 (FIGN) and rs319690 (MAP4)) for MAP were replicated in Koreans. Although our GWAS found only moderate association, the findings lead us to propose that a similar genetic architecture governs PP and MAP in Asians and Europeans. However, further studies in other Asian populations will be needed to confirm this possibility.
Genetic loci associated with heart rate variability and their effects on cardiac disease risk

PubMed Central

Nolte, Ilja M.; Munoz, M. Loretto; Tragante, Vinicius; Amare, Azmeraw T.; Jansen, Rick; Vaez, Ahmad; von der Heyde, Benedikt; Avery, Christy L.; Bis, Joshua C.; Dierckx, Bram; van Dongen, Jenny; Gogarten, Stephanie M.; Goyette, Philippe; Hernesniemi, Jussi; Huikari, Ville; Hwang, Shih-Jen; Jaju, Deepali; Kerr, Kathleen F.; Kluttig, Alexander; Krijthe, Bouwe P.; Kumar, Jitender; van der Laan, Sander W.; Lyytikäinen, Leo-Pekka; Maihofer, Adam X.; Minassian, Arpi; van der Most, Peter J.; Müller-Nurasyid, Martina; Nivard, Michel; Salvi, Erika; Stewart, James D.; Thayer, Julian F.; Verweij, Niek; Wong, Andrew; Zabaneh, Delilah; Zafarmand, Mohammad H.; Abdellaoui, Abdel; Albarwani, Sulayma; Albert, Christine; Alonso, Alvaro; Ashar, Foram; Auvinen, Juha; Axelsson, Tomas; Baker, Dewleen G.; de Bakker, Paul I. W.; Barcella, Matteo; Bayoumi, Riad; Bieringa, Rob J.; Boomsma, Dorret; Boucher, Gabrielle; Britton, Annie R.; Christophersen, Ingrid; Dietrich, Andrea; Ehret, George B.; Ellinor, Patrick T.; Eskola, Markku; Felix, Janine F.; Floras, John S.; Franco, Oscar H.; Friberg, Peter; Gademan, Maaike G. 
J.; Geyer, Mark A.; Giedraitis, Vilmantas; Hartman, Catharina A.; Hemerich, Daiane; Hofman, Albert; Hottenga, Jouke-Jan; Huikuri, Heikki; Hutri-Kähönen, Nina; Jouven, Xavier; Junttila, Juhani; Juonala, Markus; Kiviniemi, Antti M.; Kors, Jan A.; Kumari, Meena; Kuznetsova, Tatiana; Laurie, Cathy C.; Lefrandt, Joop D.; Li, Yong; Li, Yun; Liao, Duanping; Limacher, Marian C.; Lin, Henry J.; Lindgren, Cecilia M.; Lubitz, Steven A.; Mahajan, Anubha; McKnight, Barbara; zu Schwabedissen, Henriette Meyer; Milaneschi, Yuri; Mononen, Nina; Morris, Andrew P.; Nalls, Mike A.; Navis, Gerjan; Neijts, Melanie; Nikus, Kjell; North, Kari E.; O'Connor, Daniel T.; Ormel, Johan; Perz, Siegfried; Peters, Annette; Psaty, Bruce M.; Raitakari, Olli T.; Risbrough, Victoria B.; Sinner, Moritz F.; Siscovick, David; Smit, Johannes H.; Smith, Nicholas L.; Soliman, Elsayed Z.; Sotoodehnia, Nona; Staessen, Jan A.; Stein, Phyllis K.; Stilp, Adrienne M.; Stolarz-Skrzypek, Katarzyna; Strauch, Konstantin; Sundström, Johan; Swenne, Cees A.; Syvänen, Ann-Christine; Tardif, Jean-Claude; Taylor, Kent D.; Teumer, Alexander; Thornton, Timothy A.; Tinker, Lesley E.; Uitterlinden, André G.; van Setten, Jessica; Voss, Andreas; Waldenberger, Melanie; Wilhelmsen, Kirk C.; Willemsen, Gonneke; Wong, Quenna; Zhang, Zhu-Ming; Zonderman, Alan B.; Cusi, Daniele; Evans, Michele K.; Greiser, Halina K.; van der Harst, Pim; Hassan, Mohammad; Ingelsson, Erik; Järvelin, Marjo-Riitta; Kääb, Stefan; Kähönen, Mika; Kivimaki, Mika; Kooperberg, Charles; Kuh, Diana; Lehtimäki, Terho; Lind, Lars; Nievergelt, Caroline M.; O'Donnell, Chris J.; Oldehinkel, Albertine J.; Penninx, Brenda; Reiner, Alexander P.; Riese, Harriëtte; van Roon, Arie M.; Rioux, John D.; Rotter, Jerome I.; Sofer, Tamar; Stricker, Bruno H.; Tiemeier, Henning; Vrijkotte, Tanja G. M.; Asselbergs, Folkert W.; Brundel, Bianca J. J. M.; Heckbert, Susan R.; Whitsel, Eric A.; den Hoed, Marcel; Snieder, Harold; de Geus, Eco J. C.

2017-01-01

Reduced cardiac vagal control reflected in low heart rate variability (HRV) is associated with greater risks for cardiac morbidity and mortality. In two-stage meta-analyses of genome-wide association studies for three HRV traits in up to 53,174 individuals of European ancestry, we detect 17 genome-wide significant SNPs in eight loci. HRV SNPs tag non-synonymous SNPs (in NDUFA11 and KIAA1755), expression quantitative trait loci (eQTLs) (influencing GNG11, RGS6 and NEO1), or are located in genes preferentially expressed in the sinoatrial node (GNG11, RGS6 and HCN4). Genetic risk scores account for 0.9 to 2.6% of the HRV variance. Significant genetic correlation is found for HRV with heart rate (−0.74
Novel approach for deriving genome wide SNP analysis data from archived blood spots

PubMed Central

2012-01-01

Background The ability to transport and store DNA at room temperature in low volumes has the advantage of optimising cost, time and storage space. Blood spots on adapted filter papers are popular for this, with FTA (Flinders Technology Associates) Whatman™ technology being one of the most recent. Plant material, plasmids, viral particles, bacteria and animal blood have been stored and transported successfully using this technology; however, porcine DNA extraction from FTA Whatman™ cards, which readies nucleic acids for downstream applications such as PCR, whole genome amplification, sequencing and subsequent application to single nucleotide polymorphism microarrays, is a relatively new approach that has hitherto been under-explored. Findings DNA was extracted from FTA Whatman™ cards (following adaptations of the manufacturer’s instructions), whole genome amplified and subsequently analysed to validate the integrity of the DNA for downstream SNP analysis. DNA was successfully extracted from 288/288 samples and amplified by WGA. Allele dropout post WGA was observed in less than 2% of samples and there was no clear evidence of amplification bias or contamination. Acceptable call rates on porcine SNP chips were also achieved using DNA extracted and amplified in this way. Conclusions DNA extracted from FTA Whatman cards is of a high enough quality and quantity following whole genomic amplification to perform meaningful SNP chip studies. PMID:22974252
A Genome-Wide RNAi Screen for Factors Involved in Neuronal Specification in Caenorhabditis elegans

PubMed Central

Cochella, Luisa; Flowers, Eileen B.; Hobert, Oliver

2011-01-01

One of the central goals of developmental neurobiology is to describe and understand the multi-tiered molecular events that control the progression of a fertilized egg to a terminally differentiated neuron. In the nematode Caenorhabditis elegans, the progression from egg to terminally differentiated neuron has been visually traced by lineage analysis. For example, the two gustatory neurons ASEL and ASER, a bilaterally symmetric neuron pair that is functionally lateralized, are generated from a fertilized egg through an invariant sequence of 11 cellular cleavages that occur stereotypically along specific cleavage planes. Molecular events that occur along this developmental pathway are only superficially understood. We take here an unbiased, genome-wide approach to identify genes that may act at any stage to ensure the correct differentiation of ASEL. Screening a genome-wide RNAi library that knocks-down 18,179 genes (94% of the genome), we identified 245 genes that affect the development of the ASEL neuron, such that the neuron is either not generated, its fate is converted to that of another cell, or cells from other lineage branches now adopt ASEL fate. We analyze in detail two factors that we identify from this screen: (1) the proneural gene hlh-14, which we find to be bilaterally expressed in the ASEL/R lineages despite their asymmetric lineage origins and which we find is required to generate neurons from several lineage branches including the ASE neurons, and (2) the COMPASS histone methyltransferase complex, which we find to be a critical embryonic inducer of ASEL/R asymmetry, acting upstream of the previously identified miRNA lsy-6. Our study represents the first comprehensive, genome-wide analysis of a single neuronal cell fate decision. The results of this analysis provide a starting point for future studies that will eventually lead to a more complete understanding of how individual neuronal cell types are generated from a single-cell embryo. PMID:21698137
Extensive Conserved Synteny of Genes between the Karyotypes of Manduca sexta and Bombyx mori Revealed by BAC-FISH Mapping

PubMed Central

Tanaka-Okuyama, Makiko; Shibata, Fukashi; Yoshido, Atsuo; Marec, František; Wu, Chengcang; Zhang, Hongbin; Goldsmith, Marian R.

2009-01-01

Background Genome sequencing projects have been completed for several species representing four highly diverged holometabolous insect orders, Diptera, Hymenoptera, Coleoptera, and Lepidoptera. The striking evolutionary diversity of insects argues a need for efficient methods to apply genome information from such models to genetically uncharacterized species. Constructing conserved synteny maps plays a crucial role in this task. Here, we demonstrate the use of fluorescence in situ hybridization with bacterial artificial chromosome probes as a powerful tool for physical mapping of genes and comparative genome analysis in Lepidoptera, which have numerous and morphologically uniform holokinetic chromosomes. Methodology/Principal Findings We isolated 214 clones containing 159 orthologs of well conserved single-copy genes of a sequenced lepidopteran model, the silkworm, Bombyx mori, from a BAC library of a sphingid with an unexplored genome, the tobacco hornworm, Manduca sexta. We then constructed a BAC-FISH karyotype identifying all 28 chromosomes of M. sexta by mapping 124 loci using the corresponding BAC clones. BAC probes from three M. sexta chromosomes also generated clear signals on the corresponding chromosomes of the convolvulus hawk moth, Agrius convolvuli, which belongs to the same subfamily, Sphinginae, as M. sexta. Conclusions/Significance Comparison of the M. sexta BAC physical map with the linkage map and genome sequence of B. mori pointed to extensive conserved synteny including conserved gene order in most chromosomes. Only a few rearrangements, including three inversions, three translocations, and two fission/fusion events were estimated to have occurred after the divergence of Bombycidae and Sphingidae. These results add to accumulating evidence for the stability of lepidopteran genomes. Generating signals on A. convolvuli chromosomes using heterologous M. sexta probes demonstrated that BAC-FISH with orthologous sequences can be used for karyotyping a wide range of related and genetically uncharacterized species, significantly extending the ability to develop synteny maps for comparative and functional genomics. PMID:19829706
A genome-wide association study of corneal astigmatism: The CREAM Consortium.

PubMed

Shah, Rupal L; Li, Qing; Zhao, Wanting; Tedja, Milly S; Tideman, J Willem L; Khawaja, Anthony P; Fan, Qiao; Yazar, Seyhan; Williams, Katie M; Verhoeven, Virginie J M; Xie, Jing; Wang, Ya Xing; Hess, Moritz; Nickels, Stefan; Lackner, Karl J; Pärssinen, Olavi; Wedenoja, Juho; Biino, Ginevra; Concas, Maria Pina; Uitterlinden, André; Rivadeneira, Fernando; Jaddoe, Vincent W V; Hysi, Pirro G; Sim, Xueling; Tan, Nicholas; Tham, Yih-Chung; Sensaki, Sonoko; Hofman, Albert; Vingerling, Johannes R; Jonas, Jost B; Mitchell, Paul; Hammond, Christopher J; Höhn, René; Baird, Paul N; Wong, Tien-Yin; Cheng, Ching-Yu; Teo, Yik Ying; Mackey, David A; Williams, Cathy; Saw, Seang-Mei; Klaver, Caroline C W; Guggenheim, Jeremy A; Bailey-Wilson, Joan E

2018-01-01

To identify genes and genetic markers associated with corneal astigmatism. A meta-analysis of genome-wide association studies (GWASs) of corneal astigmatism undertaken for 14 European ancestry (n=22,250) and 8 Asian ancestry (n=9,120) cohorts was performed by the Consortium for Refractive Error and Myopia. Cases were defined as having >0.75 diopters of corneal astigmatism. Subsequent gene-based and gene-set analyses of the meta-analyzed results of European ancestry cohorts were performed using VEGAS2 and MAGMA software. Additionally, estimates of single nucleotide polymorphism (SNP)-based heritability for corneal and refractive astigmatism and the spherical equivalent were calculated for Europeans using LD score regression. The meta-analysis of all cohorts identified a genome-wide significant locus near the platelet-derived growth factor receptor alpha (PDGFRA) gene (top SNP: rs7673984, odds ratio=1.12 [95% CI: 1.08-1.16], p=5.55×10−9). No other genome-wide significant loci were identified in the combined analysis or European/Asian ancestry-specific analyses. Gene-based analysis identified three novel candidate genes for corneal astigmatism in Europeans: claudin-7 (CLDN7), acid phosphatase 2, lysosomal (ACP2), and TNF alpha-induced protein 8 like 3 (TNFAIP8L3). In addition to replicating a previously identified genome-wide significant locus for corneal astigmatism near the PDGFRA gene, gene-based analysis identified three novel candidate genes, CLDN7, ACP2, and TNFAIP8L3, that warrant further investigation to understand their role in the pathogenesis of corneal astigmatism. The much lower number of genetic variants and genes demonstrating an association with corneal astigmatism compared to published spherical equivalent GWAS analyses suggests a greater influence of rare genetic variants, non-additive genetic effects, or environmental factors in the development of astigmatism.
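As an illustration of how per-cohort GWAS results can be pooled, the following is a minimal sketch of an inverse-variance-weighted fixed-effect meta-analysis for a single SNP. The abstract does not say which meta-analysis model the consortium used, so the fixed-effect model is an assumption, and the cohort effect sizes below are hypothetical numbers, not data from the study.

```python
from math import erfc, exp, sqrt


def fixed_effect_meta(betas, ses):
    """Pool per-cohort log odds ratios with inverse-variance weights.

    Returns (pooled_beta, pooled_se, two_sided_p) under a normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value of a standard normal: P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    p_value = erfc(abs(z) / sqrt(2.0))
    return pooled_beta, pooled_se, p_value


# Hypothetical per-cohort estimates (log odds ratio and standard error).
cohort_betas = [0.10, 0.13, 0.11]
cohort_ses = [0.04, 0.05, 0.03]

beta, se, p = fixed_effect_meta(cohort_betas, cohort_ses)
print(f"pooled OR = {exp(beta):.3f}, SE(log OR) = {se:.4f}, p = {p:.2e}")
```

Cohorts with smaller standard errors dominate the pooled estimate, which is why large cohorts drive the result of such a meta-analysis.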
A Genome-Wide Association Meta-Analysis of Attention-Deficit/Hyperactivity Disorder Symptoms in Population-Based Paediatric Cohorts

PubMed Central

Groen-Blokhuis, Maria M.; Pourcain, Beate St.; Greven, Corina U.; Pappa, Irene; Tiesler, Carla M.T.; Ang, Wei; Nolte, Ilja M.; Vilor-Tejedor, Natalia; Bacelis, Jonas; Ebejer, Jane L.; Zhao, Huiying; Davies, Gareth E.; Ehli, Erik A.; Evans, David M.; Fedko, Iryna O.; Guxens, Mònica; Hottenga, Jouke-Jan; Hudziak, James J.; Jugessur, Astanand; Kemp, John P.; Krapohl, Eva; Martin, Nicholas G.; Murcia, Mario; Myhre, Ronny; Ormel, Johan; Ring, Susan M.; Standl, Marie; Stergiakouli, Evie; Stoltenberg, Camilla; Thiering, Elisabeth; Timpson, Nicholas J.; Trzaskowski, Maciej; van der Most, Peter J.; Wang, Carol; Nyholt, Dale R.; Medland, Sarah E.; Neale, Benjamin; Jacobsson, Bo; Sunyer, Jordi; Hartman, Catharina A.; Whitehouse, Andrew J.O.; Pennell, Craig E.; Heinrich, Joachim; Plomin, Robert; Smith, George Davey; Tiemeier, Henning; Posthuma, Danielle; Boomsma, Dorret I.

2016-01-01

Objective To elucidate the influence of common genetic variants on childhood attention-deficit/hyperactivity disorder (ADHD) symptoms, to identify genetic variants that explain its high heritability, and to investigate the genetic overlap of ADHD symptom scores with ADHD diagnosis. Method Within the EArly Genetics and Lifecourse Epidemiology (EAGLE) consortium, genome-wide single nucleotide polymorphisms (SNPs) and ADHD symptom scores were available for 17,666 children (< 13 years) from nine population-based cohorts. SNP-based heritability was estimated in data from the three largest cohorts. Meta-analysis based on genome-wide association (GWA) analyses with SNPs was followed by gene-based association tests, and the overlap in results with a meta-analysis in the Psychiatric Genomics Consortium (PGC) case-control ADHD study was investigated. Results SNP-based heritability ranged from 5% to 34%, indicating that variation in common genetic variants influences ADHD symptom scores. The meta-analysis did not detect genome-wide significant SNPs, but three genes, lying close to each other with SNPs in high linkage disequilibrium (LD), showed a gene-wide significant association (p values between 1.46×10−6 and 2.66×10−6). One gene, WASL, is involved in neuronal development. Both SNP- and gene-based analyses indicated overlap with the PGC meta-analysis results with the genetic correlation estimated at 0.96. Conclusion The SNP-based heritability for ADHD symptom scores indicates a polygenic architecture and genes involved in neurite outgrowth are possibly involved. Continuous and dichotomous measures of ADHD appear to assess a genetically common phenotype. A next step is to combine data from population-based and case-control cohorts in genetic association studies to increase sample size and improve statistical power for identifying genetic variants. PMID:27663945
Refining genome-wide linkage intervals using a meta-analysis of genome-wide association studies identifies loci influencing personality dimensions

PubMed Central

Amin, Najaf; Hottenga, Jouke-Jan; Hansell, Narelle K; Janssens, A Cecile JW; de Moor, Marleen HM; Madden, Pamela AF; Zorkoltseva, Irina V; Penninx, Brenda W; Terracciano, Antonio; Uda, Manuela; Tanaka, Toshiko; Esko, Tonu; Realo, Anu; Ferrucci, Luigi; Luciano, Michelle; Davies, Gail; Metspalu, Andres; Abecasis, Goncalo R; Deary, Ian J; Raikkonen, Katri; Bierut, Laura J; Costa, Paul T; Saviouk, Viatcheslav; Zhu, Gu; Kirichenko, Anatoly V; Isaacs, Aaron; Aulchenko, Yurii S; Willemsen, Gonneke; Heath, Andrew C; Pergadia, Michele L; Medland, Sarah E; Axenovich, Tatiana I; de Geus, Eco; Montgomery, Grant W; Wright, Margaret J; Oostra, Ben A; Martin, Nicholas G; Boomsma, Dorret I; van Duijn, Cornelia M

2013-01-01

Personality traits are complex phenotypes related to psychosomatic health. Individually, various gene finding methods have not achieved much success in finding genetic variants associated with personality traits. We performed a meta-analysis of four genome-wide linkage scans (N=6149 subjects) of five basic personality traits assessed with the NEO Five-Factor Inventory. We compared the significant regions from the meta-analysis of linkage scans with the results of a meta-analysis of genome-wide association studies (GWAS) (N∼17 000). We found significant evidence of linkage of neuroticism to chromosome 3p14 (rs1490265, LOD=4.67) and to chromosome 19q13 (rs628604, LOD=3.55); of extraversion to 14q32 (ATGG002, LOD=3.3); and of agreeableness to 3p25 (rs709160, LOD=3.67) and to two adjacent regions on chromosome 15, including 15q13 (rs970408, LOD=4.07) and 15q14 (rs1055356, LOD=3.52) in the individual scans. In the meta-analysis, we found strong evidence of linkage of extraversion to 4q34, 9q34, 10q24 and 11q22, openness to 2p25, 3q26, 9p21, 11q24, 15q26 and 19q13 and agreeableness to 4q34 and 19p13. Significant evidence of association in the GWAS was detected between openness and rs677035 at 11q24 (P-value=2.6 × 10−06, KCNJ1). The findings of our linkage meta-analysis and those of the GWAS suggest that 11q24 is a susceptible locus for openness, with KCNJ1 as the possible candidate gene. PMID:23211697
Accurate computation of survival statistics in genome-wide studies.

PubMed

Vandin, Fabio; Papoutsaki, Alexandra; Raphael, Benjamin J; Upfal, Eli

2015-05-01

A key challenge in genomics is to identify genetic variants that distinguish patients with different survival time following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because: the two populations determined by a genetic variant may have very different sizes; and the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for any size populations. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens-hundreds of likely false positive associations as more significant than these known associations.
Accurate Computation of Survival Statistics in Genome-Wide Studies

PubMed Central

Vandin, Fabio; Papoutsaki, Alexandra; Raphael, Benjamin J.; Upfal, Eli

2015-01-01

A key challenge in genomics is to identify genetic variants that distinguish patients with different survival time following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because: the two populations determined by a genetic variant may have very different sizes; and the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for any size populations. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens-hundreds of likely false positive associations as more significant than these known associations. PMID:25950620
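The gap between the asymptotic and an exact log-rank p-value can be seen on a toy example. The sketch below is not ExaLT, whose algorithm the abstract does not detail. It computes the standard log-rank statistic with its normal approximation, then compares that with a brute-force permutation p-value obtained by enumerating every relabelling of a very small, unbalanced sample. The data and function names are hypothetical, and full enumeration is only feasible for tiny inputs.

```python
from itertools import combinations
from math import erfc, sqrt


def logrank_parts(times, events, in_group1):
    """Return (U, V): observed-minus-expected events in group 1 and its variance."""
    u = 0.0
    v = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for i in at_risk if in_group1[i])
        dying = [i for i in at_risk if times[i] == t and events[i]]
        d = len(dying)
        d1 = sum(1 for i in dying if in_group1[i])
        u += d1 - d * n1 / n
        if n > 1:
            v += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return u, v


def asymptotic_p(times, events, in_group1):
    """Two-sided p-value from the normal approximation to the log-rank statistic."""
    u, v = logrank_parts(times, events, in_group1)
    return erfc(abs(u) / sqrt(2.0 * v))


def permutation_p(times, events, in_group1):
    """Two-sided p-value by enumerating all labelings with the same group size."""
    n = len(times)
    k = sum(1 for g in in_group1 if g)
    u_obs, _ = logrank_parts(times, events, in_group1)
    hits = 0
    total = 0
    for chosen in combinations(range(n), k):
        labels = [False] * n
        for i in chosen:
            labels[i] = True
        u, _ = logrank_parts(times, events, labels)
        total += 1
        if abs(u) >= abs(u_obs) - 1e-12:
            hits += 1
    return hits / total


# Hypothetical toy data: 8 patients, all events observed, 2 carry the variant.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [True] * 8
carrier = [True, True, False, False, False, False, False, False]

print("asymptotic p :", asymptotic_p(times, events, carrier))
print("permutation p:", permutation_p(times, events, carrier))
```

With only two carriers there are just 28 possible labelings, so the permutation p-value cannot fall below 1/28, while the normal approximation reports a much smaller value. This is the kind of overstatement the abstract attributes to unbalanced groups.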
Very low-depth sequencing in a founder population identifies a cardioprotective APOC3 signal missed by genome-wide imputation.

PubMed

Gilly, Arthur; Ritchie, Graham Rs; Southam, Lorraine; Farmaki, Aliki-Eleni; Tsafantakis, Emmanouil; Dedoussis, George; Zeggini, Eleftheria

2016-06-01

Cohort-wide very low-depth whole-genome sequencing (WGS) can comprehensively capture low-frequency sequence variation for the cost of a dense genome-wide genotyping array. Here, we analyse 1x sequence data across the APOC3 gene in a founder population from the island of Crete in Greece (n = 1239) and find significant evidence for association with blood triglyceride levels with the previously reported R19X cardioprotective null mutation (β = −1.09, σ = 0.163, P = 8.2 × 10−11) and a second loss of function mutation, rs138326449 (β = −1.17, σ = 0.188, P = 1.14 × 10−9). The signal cannot be recapitulated by imputing genome-wide genotype data on a large reference panel of 5122 individuals including 249 with 4x WGS data from the same population. Gene-level meta-analysis with other studies reporting burden signals at APOC3 provides robust evidence for a replicable cardioprotective rare variant aggregation (P = 3.2 × 10−31, n = 13 480). © The Author 2016. Published by Oxford University Press.
Very low-depth sequencing in a founder population identifies a cardioprotective APOC3 signal missed by genome-wide imputation

PubMed Central

Gilly, Arthur; Ritchie, Graham Rs; Southam, Lorraine; Farmaki, Aliki-Eleni; Tsafantakis, Emmanouil; Dedoussis, George; Zeggini, Eleftheria

2016-01-01

Cohort-wide very low-depth whole-genome sequencing (WGS) can comprehensively capture low-frequency sequence variation for the cost of a dense genome-wide genotyping array. Here, we analyse 1x sequence data across the APOC3 gene in a founder population from the island of Crete in Greece (n = 1239) and find significant evidence for association with blood triglyceride levels with the previously reported R19X cardioprotective null mutation (β = −1.09,σ = 0.163, P = 8.2 × 10−11) and a second loss of function mutation, rs138326449 (β = −1.17,σ = 0.188, P = 1.14 × 10−9). The signal cannot be recapitulated by imputing genome-wide genotype data on a large reference panel of 5122 individuals including 249 with 4x WGS data from the same population. Gene-level meta-analysis with other studies reporting burden signals at APOC3 provides robust evidence for a replicable cardioprotective rare variant aggregation (P = 3.2 × 10−31, n = 13 480). PMID:27146844
Multi-Instance Metric Transfer Learning for Genome-Wide Protein Function Prediction.

PubMed

Xu, Yonghui; Min, Huaqing; Wu, Qingyao; Song, Hengjie; Ye, Bicui

2017-02-06

Multi-Instance (MI) learning has been proven to be effective for the genome-wide protein function prediction problems where each training example is associated with multiple instances. Many studies in this literature attempted to find an appropriate Multi-Instance Learning (MIL) method for genome-wide protein function prediction under a usual assumption, the underlying distribution from testing data (target domain, i.e., TD) is the same as that from training data (source domain, i.e., SD). However, this assumption may be violated in real practice. To tackle this problem, in this paper, we propose a Multi-Instance Metric Transfer Learning (MIMTL) approach for genome-wide protein function prediction. In MIMTL, we first transfer the source domain distribution to the target domain distribution by utilizing the bag weights. Then, we construct a distance metric learning method with the reweighted bags. At last, we develop an alternative optimization scheme for MIMTL. Comprehensive experimental evidence on seven real-world organisms verifies the effectiveness and efficiency of the proposed MIMTL approach over several state-of-the-art methods.
A Genome-Wide Linkage Scan for Age at Menarche in Three Populations of European Descent

PubMed Central

Anderson, Carl A.; Zhu, Gu; Falchi, Mario; van den Berg, Stéphanie M.; Treloar, Susan A.; Spector, Timothy D.; Martin, Nicholas G.; Boomsma, Dorret I.; Visscher, Peter M.; Montgomery, Grant W.

2008-01-01

Context: Age at menarche (AAM) is an important trait both biologically and socially, a clearly defined event in female pubertal development, and has been associated with many clinically significant phenotypes. Objective: The objective of the study was to identify genetic loci influencing variation in AAM in large population-based samples from three countries. Design/Participants: Recalled AAM data were collected from 13,697 individuals and 4,899 pseudoindependent sister-pairs from three different populations (Australia, The Netherlands, and the United Kingdom) by mailed questionnaire or interview. Genome-wide variance components linkage analysis was implemented on each sample individually and in combination. Results: The mean, sd, and heritability of AAM across the three samples was 13.1 yr, 1.5 yr, and 0.69, respectively. No loci were detected that reached genome-wide significance in the combined analysis, but a suggestive locus was detected on chromosome 12 (logarithm of the odds = 2.0). Three loci of suggestive significance were seen in the U.K. sample on chromosomes 1, 4, and 18 (logarithm of the odds = 2.4, 2.2 and 3.2, respectively). Conclusions: There was no evidence for common highly penetrant variants influencing AAM. Linkage and association suggest that one trait locus for AAM is located on chromosome 12, but further studies are required to replicate these results. PMID:18647812
A genome-wide search for linkage of estimated glomerular filtration rate (eGFR) in the Family Investigation of Nephropathy and Diabetes (FIND).

PubMed

Thameem, Farook; Igo, Robert P; Freedman, Barry I; Langefeld, Carl; Hanson, Robert L; Schelling, Jeffrey R; Elston, Robert C; Duggirala, Ravindranath; Nicholas, Susanne B; Goddard, Katrina A B; Divers, Jasmin; Guo, Xiuqing; Ipp, Eli; Kimmel, Paul L; Meoni, Lucy A; Shah, Vallabh O; Smith, Michael W; Winkler, Cheryl A; Zager, Philip G; Knowler, William C; Nelson, Robert G; Pahl, Madeline V; Parekh, Rulan S; Kao, W H Linda; Rasooly, Rebekah S; Adler, Sharon G; Abboud, Hanna E; Iyengar, Sudha K; Sedor, John R

2013-01-01

Estimated glomerular filtration rate (eGFR), a measure of kidney function, is heritable, suggesting that genes influence renal function. Genes that influence eGFR have been identified through genome-wide association studies. However, family-based linkage approaches may identify loci that explain a larger proportion of the heritability. This study used genome-wide linkage and association scans to identify quantitative trait loci (QTL) that influence eGFR. Genome-wide linkage and sparse association scans of eGFR were performed in families ascertained by probands with advanced diabetic nephropathy (DN) from the multi-ethnic Family Investigation of Nephropathy and Diabetes (FIND) study. This study included 954 African Americans (AA), 781 American Indians (AI), 614 European Americans (EA) and 1,611 Mexican Americans (MA). A total of 3,960 FIND participants were genotyped for 6,000 single nucleotide polymorphisms (SNPs) using the Illumina Linkage IVb panel. GFR was estimated by the Modification of Diet in Renal Disease (MDRD) formula. The non-parametric linkage analysis, accounting for the effects of diabetes duration and BMI, identified the strongest evidence for linkage of eGFR on chromosome 20q11 (log of the odds [LOD] = 3.34; P = 4.4×10−5) in MA and chromosome 15q12 (LOD = 2.84; P = 1.5×10−4) in EA. In all subjects, the strongest linkage signal for eGFR was detected on chromosome 10p12 (P = 5.5×10−4) at 44 cM near marker rs1339048. A subsequent association scan in both ancestry-specific groups and the entire population identified several SNPs significantly associated with eGFR across the genome. The present study describes the localization of QTL influencing eGFR on 20q11 in MA, 15q21 in EA and 10p12 in the combined ethnic groups participating in the FIND study.
Identification of causal genes/variants influencing eGFR, within these linkage and association loci, will open new avenues for functional analyses and development of novel diagnostic markers for DN.
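The LOD scores and p-values quoted above can be related by a common textbook conversion, sketched here. It assumes the linkage test statistic 2·ln(10)·LOD follows a 50:50 mixture of a point mass at zero and a chi-square distribution with one degree of freedom. The abstract does not state how the study derived its p-values, so this is an assumption, and the function name is hypothetical.

```python
from math import erfc, log, sqrt


def lod_to_p(lod):
    """Pointwise p-value for a LOD score under a 1/2 : 1/2 mixture of 0 and chi2(1)."""
    if lod < 0:
        raise ValueError("LOD score must be non-negative")
    chi2 = 2.0 * log(10.0) * lod
    # Survival function of chi2 with 1 d.f. is erfc(sqrt(x / 2)); halve it for the mixture.
    return 0.5 * erfc(sqrt(chi2 / 2.0))


for lod in (3.34, 2.84):
    print(f"LOD = {lod:.2f}  ->  p = {lod_to_p(lod):.1e}")
```

Under this assumption, LOD scores of 3.34 and 2.84 map to p-values of approximately 4.4×10−5 and 1.5×10−4, consistent with the figures reported in the abstract.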
A Genome-Wide Search for Linkage of Estimated Glomerular Filtration Rate (eGFR) in the Family Investigation of Nephropathy and Diabetes (FIND)

PubMed Central

Thameem, Farook; Igo, Robert P.; Freedman, Barry I.; Langefeld, Carl; Hanson, Robert L.; Schelling, Jeffrey R.; Elston, Robert C.; Duggirala, Ravindranath; Nicholas, Susanne B.; Goddard, Katrina A. B.; Divers, Jasmin; Guo, Xiuqing; Ipp, Eli; Kimmel, Paul L.; Meoni, Lucy A.; Shah, Vallabh O.; Smith, Michael W.; Winkler, Cheryl A.; Zager, Philip G.; Knowler, William C.; Nelson, Robert G.; Pahl, Madeline V.; Parekh, Rulan S.; Kao, W. H. Linda; Rasooly, Rebekah S.; Adler, Sharon G.; Abboud, Hanna E.; Iyengar, Sudha K.; Sedor, John R.

2013-01-01

Objective Estimated glomerular filtration rate (eGFR), a measure of kidney function, is heritable, suggesting that genes influence renal function. Genes that influence eGFR have been identified through genome-wide association studies. However, family-based linkage approaches may identify loci that explain a larger proportion of the heritability. This study used genome-wide linkage and association scans to identify quantitative trait loci (QTL) that influence eGFR. Methods Genome-wide linkage and sparse association scans of eGFR were performed in families ascertained by probands with advanced diabetic nephropathy (DN) from the multi-ethnic Family Investigation of Nephropathy and Diabetes (FIND) study. This study included 954 African Americans (AA), 781 American Indians (AI), 614 European Americans (EA) and 1,611 Mexican Americans (MA). A total of 3,960 FIND participants were genotyped for 6,000 single nucleotide polymorphisms (SNPs) using the Illumina Linkage IVb panel. GFR was estimated by the Modification of Diet in Renal Disease (MDRD) formula. Results The non-parametric linkage analysis, accounting for the effects of diabetes duration and BMI, identified the strongest evidence for linkage of eGFR on chromosome 20q11 (log of the odds [LOD] = 3.34; P = 4.4×10−5) in MA and chromosome 15q12 (LOD = 2.84; P = 1.5×10−4) in EA. In all subjects, the strongest linkage signal for eGFR was detected on chromosome 10p12 (P = 5.5×10−4) at 44 cM near marker rs1339048. A subsequent association scan in both ancestry-specific groups and the entire population identified several SNPs significantly associated with eGFR across the genome. Conclusion The present study describes the localization of QTL influencing eGFR on 20q11 in MA, 15q21 in EA and 10p12 in the combined ethnic groups participating in the FIND study. 
Identification of causal genes/variants influencing eGFR, within these linkage and association loci, will open new avenues for functional analyses and development of novel diagnostic markers for DN. PMID:24358131
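For readers unfamiliar with the MDRD formula named in the Methods, here is a minimal sketch of the widely used four-variable form. The abstract does not say which variant the study applied. The leading constant of 175 assumes IDMS-traceable creatinine assays (the original publication used 186), and the function and parameter names are hypothetical.

```python
def mdrd_egfr(creatinine_mg_dl, age_years, female, african_american, constant=175.0):
    """Four-variable MDRD estimate of GFR in mL/min/1.73 m^2.

    constant=175.0 assumes IDMS-traceable serum creatinine; pass 186.0 for the
    original calibration. Which one the FIND study used is not stated.
    """
    if creatinine_mg_dl <= 0 or age_years <= 0:
        raise ValueError("creatinine and age must be positive")
    egfr = constant * creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if african_american:
        egfr *= 1.212
    return egfr


# Hypothetical participant: serum creatinine 1.2 mg/dL, 55 years old, female.
print(round(mdrd_egfr(1.2, 55, female=True, african_american=False), 1))
```

Because the estimate depends only on creatinine, age, sex and ancestry, it can be computed for every participant in a large family study without direct GFR measurement.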
Genetic diagnosis of developmental disorders in the DDD study: a scalable analysis of genome-wide research data

PubMed Central

Wright, Caroline F; Fitzgerald, Tomas W; Jones, Wendy D; Clayton, Stephen; McRae, Jeremy F; van Kogelenberg, Margriet; King, Daniel A; Ambridge, Kirsty; Barrett, Daniel M; Bayzetinova, Tanya; Bevan, A Paul; Bragin, Eugene; Chatzimichali, Eleni A; Gribble, Susan; Jones, Philip; Krishnappa, Netravathi; Mason, Laura E; Miller, Ray; Morley, Katherine I; Parthiban, Vijaya; Prigmore, Elena; Rajan, Diana; Sifrim, Alejandro; Swaminathan, G Jawahar; Tivey, Adrian R; Middleton, Anna; Parker, Michael; Carter, Nigel P; Barrett, Jeffrey C; Hurles, Matthew E; FitzPatrick, David R; Firth, Helen V

2015-01-01

Summary Background Human genome sequencing has transformed our understanding of genomic variation and its relevance to health and disease, and is now starting to enter clinical practice for the diagnosis of rare diseases. The question of whether and how some categories of genomic findings should be shared with individual research participants is currently a topic of international debate, and development of robust analytical workflows to identify and communicate clinically relevant variants is paramount. Methods The Deciphering Developmental Disorders (DDD) study has developed a UK-wide patient recruitment network involving over 180 clinicians across all 24 regional genetics services, and has performed genome-wide microarray and whole exome sequencing on children with undiagnosed developmental disorders and their parents. After data analysis, pertinent genomic variants were returned to individual research participants via their local clinical genetics team. Findings Around 80 000 genomic variants were identified from exome sequencing and microarray analysis in each individual, of which on average 400 were rare and predicted to be protein altering. By focusing only on de novo and segregating variants in known developmental disorder genes, we achieved a diagnostic yield of 27% among 1133 previously investigated yet undiagnosed children with developmental disorders, whilst minimising incidental findings. In families with developmentally normal parents, whole exome sequencing of the child and both parents resulted in a 10-fold reduction in the number of potential causal variants that needed clinical evaluation compared to sequencing only the child. Most diagnostic variants identified in known genes were novel and not present in current databases of known disease variation. 
Interpretation Implementation of a robust translational genomics workflow is achievable within a large-scale rare disease research study to allow feedback of potentially diagnostic findings to clinicians and research participants. Systematic recording of relevant clinical data, curation of a gene–phenotype knowledge base, and development of clinical decision support software are needed in addition to automated exclusion of almost all variants, which is crucial for scalable prioritisation and review of possible diagnostic variants. However, the resource requirements of development and maintenance of a clinical reporting system within a research setting are substantial. Funding Health Innovation Challenge Fund, a parallel funding partnership between the Wellcome Trust and the UK Department of Health. PMID:25529582
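The variant-reduction step described in the Findings can be illustrated with a toy trio filter. This is only a sketch of the idea of keeping child variants that are absent from both parents and fall in known disorder genes. It is not the DDD pipeline, it ignores segregating (inherited) variants, genotype quality and frequency filters, and all gene and variant identifiers below are hypothetical.

```python
def de_novo_candidates(child, mother, father, known_genes):
    """Return child variants seen in neither parent that lie in a known disorder gene.

    Each variant is a (gene, variant_id) tuple; the three call sets are Python sets.
    """
    not_inherited = child - mother - father
    return {variant for variant in not_inherited if variant[0] in known_genes}


# Hypothetical call sets for one family.
child = {("GENE_A", "v1"), ("GENE_B", "v2"), ("GENE_C", "v3")}
mother = {("GENE_B", "v2")}
father = {("GENE_D", "v4")}
known_genes = {"GENE_A", "GENE_B"}

print(de_novo_candidates(child, mother, father, known_genes))
```

Sequencing both parents is what makes the set difference possible, which is in line with the abstract's report that trio sequencing reduced the number of candidate variants roughly 10-fold compared with sequencing the child alone.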
Comparative analysis of metazoan chromatin organization.

PubMed

Ho, Joshua W K; Jung, Youngsook L; Liu, Tao; Alver, Burak H; Lee, Soohyun; Ikegami, Kohta; Sohn, Kyung-Ah; Minoda, Aki; Tolstorukov, Michael Y; Appert, Alex; Parker, Stephen C J; Gu, Tingting; Kundaje, Anshul; Riddle, Nicole C; Bishop, Eric; Egelhofer, Thea A; Hu, Sheng'en Shawn; Alekseyenko, Artyom A; Rechtsteiner, Andreas; Asker, Dalal; Belsky, Jason A; Bowman, Sarah K; Chen, Q Brent; Chen, Ron A-J; Day, Daniel S; Dong, Yan; Dose, Andrea C; Duan, Xikun; Epstein, Charles B; Ercan, Sevinc; Feingold, Elise A; Ferrari, Francesco; Garrigues, Jacob M; Gehlenborg, Nils; Good, Peter J; Haseley, Psalm; He, Daniel; Herrmann, Moritz; Hoffman, Michael M; Jeffers, Tess E; Kharchenko, Peter V; Kolasinska-Zwierz, Paulina; Kotwaliwale, Chitra V; Kumar, Nischay; Langley, Sasha A; Larschan, Erica N; Latorre, Isabel; Libbrecht, Maxwell W; Lin, Xueqiu; Park, Richard; Pazin, Michael J; Pham, Hoang N; Plachetka, Annette; Qin, Bo; Schwartz, Yuri B; Shoresh, Noam; Stempor, Przemyslaw; Vielle, Anne; Wang, Chengyang; Whittle, Christina M; Xue, Huiling; Kingston, Robert E; Kim, Ju Han; Bernstein, Bradley E; Dernburg, Abby F; Pirrotta, Vincenzo; Kuroda, Mitzi I; Noble, William S; Tullius, Thomas D; Kellis, Manolis; MacAlpine, David M; Strome, Susan; Elgin, Sarah C R; Liu, Xiaole Shirley; Lieb, Jason D; Ahringer, Julie; Karpen, Gary H; Park, Peter J

2014-08-28

Genome function is dynamically regulated in part by chromatin, which consists of the histones, non-histone proteins and RNA molecules that package DNA. Studies in Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular mechanisms of genome function in humans, and have revealed conservation of chromatin components and mechanisms. Nevertheless, the three organisms have markedly different genome sizes, chromosome architecture and gene organization. On human and fly chromosomes, for example, pericentric heterochromatin flanks single centromeres, whereas worm chromosomes have dispersed heterochromatin-like regions enriched in the distal chromosomal 'arms', and centromeres distributed along their lengths. To systematically investigate chromatin organization and associated gene regulation across species, we generated and analysed a large collection of genome-wide chromatin data sets from cell lines and developmental stages in worm, fly and human. Here we present over 800 new data sets from our ENCODE and modENCODE consortia, bringing the total to over 1,400. Comparison of combinatorial patterns of histone modifications, nuclear lamina-associated domains, organization of large-scale topological domains, chromatin environment at promoters and enhancers, nucleosome positioning, and DNA replication patterns reveals many conserved features of chromatin organization among the three organisms. We also find notable differences in the composition and locations of repressive chromatin. These data sets and analyses provide a rich resource for comparative and species-specific investigations of chromatin composition, organization and function.
Genome-wide identification and expression profiling of serine proteases and homologs in the diamondback moth, Plutella xylostella (L.).

PubMed

Lin, Hailan; Xia, Xiaofeng; Yu, Liying; Vasseur, Liette; Gurr, Geoff M; Yao, Fengluan; Yang, Guang; You, Minsheng

2015-12-10

Serine proteases (SPs) are crucial proteolytic enzymes responsible for digestion and other processes including signal transduction and immune responses in insects. Serine protease homologs (SPHs) lack catalytic activity but are involved in innate immunity. This study presents a genome-wide investigation of SPs and SPHs in the diamondback moth, Plutella xylostella (L.), a globally-distributed destructive pest of cruciferous crops. A total of 120 putative SPs and 101 putative SPHs were identified in the P. xylostella genome by bioinformatics analysis. Based on the features of trypsin, 38 SPs were putatively designated as trypsin genes. The distribution, transcription orientation, exon-intron structure and sequence alignments suggested that the majority of trypsin genes evolved from tandem duplications. Among the 221 SP/SPH genes, ten SP and three SPH genes with one or more clip domains were predicted and designated as PxCLIPs. Phylogenetic analysis of CLIPs in P. xylostella, two other Lepidoptera species (Bombyx mori and Manduca sexta), and two more distantly related insects (Drosophila melanogaster and Apis mellifera) showed that seven of the 13 PxCLIPs were clustered with homologs of the Lepidoptera rather than other species. Expression profiling of the P. xylostella SP and SPH genes in different developmental stages and tissues showed diverse expression patterns, suggesting high functional diversity with roles in digestion and development. This is the first genome-wide investigation on the SP and SPH genes in P. xylostella. The characterized features and profiled expression patterns of the P. xylostella SPs and SPHs suggest their involvement in digestion, development and immunity of this species. Our findings provide a foundation for further research on the functions of this gene family in P. xylostella, and a better understanding of its capacity to rapidly adapt to a wide range of environmental variables including host plants and insecticides.

A Genome-Wide Scan for Breast Cancer Risk Haplotypes among African American Women

PubMed Central

Song, Chi; Chen, Gary K.; Millikan, Robert C.; Ambrosone, Christine B.; John, Esther M.; Bernstein, Leslie; Zheng, Wei; Hu, Jennifer J.; Ziegler, Regina G.; Nyante, Sarah; Bandera, Elisa V.; Ingles, Sue A.; Press, Michael F.; Deming, Sandra L.; Rodriguez-Gil, Jorge L.; Chanock, Stephen J.; Wan, Peggy; Sheng, Xin; Pooler, Loreall C.; Van Den Berg, David J.; Le Marchand, Loic; Kolonel, Laurence N.; Henderson, Brian E.; Haiman, Chris A.; Stram, Daniel O.

2013-01-01

Genome-wide association studies (GWAS) simultaneously investigating hundreds of thousands of single nucleotide polymorphisms (SNP) have become a powerful tool in the investigation of new disease susceptibility loci. Haplotypes are sometimes thought to be superior to SNPs and are promising in genetic association analyses. The application of genome-wide haplotype analysis, however, is hindered by the complexity of haplotypes themselves and sophistication in computation. We systematically analyzed the haplotype effects for breast cancer risk among 5,761 African American women (3,016 cases and 2,745 controls) using a sliding window approach on the genome-wide scale. Three regions on chromosomes 1, 4 and 18 exhibited moderate haplotype effects. Furthermore, among 21 breast cancer susceptibility loci previously established in European populations, 10p15 and 14q24 are likely to harbor novel haplotype effects. We also proposed a heuristic of determining the significance level and the effective number of independent tests by the permutation analysis on chromosome 22 data. It suggests that the effective number was approximately half of the total (7,794 out of 15,645), thus the half number could serve as a quick reference to evaluating genome-wide significance if a similar sliding window approach of haplotype analysis is adopted in similar populations using similar genotype density. PMID:23468962
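The permutation heuristic mentioned in the abstract can be sketched as follows. Given the smallest p-value from each permuted scan, the alpha-quantile of those minima serves as an empirical family-wise threshold, and dividing alpha by that threshold gives an effective number of independent tests, by analogy with the Bonferroni correction. The abstract does not give the exact procedure, so this is an assumption, and the input values are synthetic.

```python
def permutation_threshold(min_p_values, alpha=0.05):
    """Empirical significance threshold: the alpha-quantile of per-permutation minimum p-values."""
    ordered = sorted(min_p_values)
    # Index of the alpha-quantile; the small epsilon guards against float round-off.
    k = max(1, int(alpha * len(ordered) + 1e-9))
    return ordered[k - 1]


def effective_number_of_tests(min_p_values, alpha=0.05):
    """Bonferroni-style effective test count implied by the permutation threshold."""
    return alpha / permutation_threshold(min_p_values, alpha)


# Synthetic minima from 100 hypothetical permutations.
min_p = [i / 10000 for i in range(1, 101)]
print(permutation_threshold(min_p))
print(effective_number_of_tests(min_p))
```

In the study, this kind of calculation on chromosome 22 suggested an effective number of about half the total number of windows tested (7,794 of 15,645).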
Systematic quantification of HDR and NHEJ reveals effects of locus, nuclease, and cell type on genome-editing.

PubMed

Miyaoka, Yuichiro; Berman, Jennifer R; Cooper, Samantha B; Mayerl, Steven J; Chan, Amanda H; Zhang, Bin; Karlin-Neumann, George A; Conklin, Bruce R

2016-03-31

Precise genome-editing relies on the repair of sequence-specific nuclease-induced DNA nicking or double-strand breaks (DSBs) by homology-directed repair (HDR). However, nonhomologous end-joining (NHEJ), an error-prone repair, acts concurrently, reducing the rate of high-fidelity edits. The identification of genome-editing conditions that favor HDR over NHEJ has been hindered by the lack of a simple method to measure HDR and NHEJ directly and simultaneously at endogenous loci. To overcome this challenge, we developed a novel, rapid, digital PCR-based assay that can simultaneously detect one HDR or NHEJ event out of 1,000 copies of the genome. Using this assay, we systematically monitored genome-editing outcomes of CRISPR-associated protein 9 (Cas9), Cas9 nickases, catalytically dead Cas9 fused to FokI, and transcription activator-like effector nuclease at three disease-associated endogenous gene loci in HEK293T cells, HeLa cells, and human induced pluripotent stem cells. Although it is widely thought that NHEJ generally occurs more often than HDR, we found that more HDR than NHEJ was induced under multiple conditions. Surprisingly, the HDR/NHEJ ratios were highly dependent on gene locus, nuclease platform, and cell type. The new assay system, and our findings based on it, will enable mechanistic studies of genome-editing and help improve genome-editing technology.
Network-Based Identification and Prioritization of Key Regulators of Coronary Artery Disease Loci

PubMed Central

Zhao, Yuqi; Chen, Jing; Freudenberg, Johannes M.; Meng, Qingying; Rajpal, Deepak K.; Yang, Xia

2017-01-01

Objective: Recent genome-wide association studies of coronary artery disease (CAD) have revealed 58 genome-wide significant and 148 suggestive genetic loci. However, the molecular mechanisms through which they contribute to CAD and the clinical implications of these findings remain largely unknown. We aim to retrieve gene subnetworks of the 206 CAD loci and identify and prioritize candidate regulators to better understand the biological mechanisms underlying the genetic associations. Approach and Results: We devised a new integrative genomics approach that incorporated (1) candidate genes from the top CAD loci, (2) the complete genetic association results from the 1000 genomes-based CAD genome-wide association studies from the Coronary Artery Disease Genome Wide Replication and Meta-Analysis Plus the Coronary Artery Disease consortium, (3) tissue-specific gene regulatory networks that depict the potential relationship and interactions between genes, and (4) tissue-specific gene expression patterns between CAD patients and controls. The networks and top-ranked regulators according to these data-driven criteria were further queried against literature, experimental evidence, and drug information to evaluate their disease relevance and potential as drug targets. Our analysis uncovered several potential novel regulators of CAD such as LUM and STAT3, which possess properties suitable as drug targets. We also revealed molecular relations and potential mechanisms through which the top CAD loci operate. Furthermore, we found that multiple CAD-relevant biological processes such as extracellular matrix, inflammatory and immune pathways, complement and coagulation cascades, and lipid metabolism interact in the CAD networks. Conclusions: Our data-driven integrative genomics framework unraveled tissue-specific relations among the candidate genes of the CAD genome-wide association studies loci and prioritized novel network regulatory genes orchestrating biological processes relevant to CAD. PMID:26966275
NordicDB: a Nordic pool and portal for genome-wide control data.

PubMed

Leu, Monica; Humphreys, Keith; Surakka, Ida; Rehnberg, Emil; Muilu, Juha; Rosenström, Päivi; Almgren, Peter; Jääskeläinen, Juha; Lifton, Richard P; Kyvik, Kirsten Ohm; Kaprio, Jaakko; Pedersen, Nancy L; Palotie, Aarno; Hall, Per; Grönberg, Henrik; Groop, Leif; Peltonen, Leena; Palmgren, Juni; Ripatti, Samuli

2010-12-01

A cost-efficient way to increase power in a genetic association study is to pool controls from different sources. The genotyping effort can then be directed to large case series. The Nordic Control database, NordicDB, has been set up as a unique resource in the Nordic area and the data are available for authorized users through the web portal (http://www.nordicdb.org). The current version of NordicDB pools together high-density genome-wide SNP information from ∼5000 controls originating from Finnish, Swedish and Danish studies and shows country-specific allele frequencies for SNP markers. The genetic homogeneity of the samples was investigated using multidimensional scaling (MDS) analysis and pairwise allele frequency differences between the studies. The plot of the first two MDS components showed excellent resemblance to the geographical placement of the samples, with a clear NW-SE gradient. We advise researchers to assess the impact of population structure when incorporating NordicDB controls in association studies. This harmonized Nordic database presents a unique genome-wide resource for future genetic association studies in the Nordic countries.
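As an illustration of the kind of MDS analysis mentioned above, the following is a minimal sketch of classical (Torgerson) MDS applied to pairwise allele frequency differences. It is not the NordicDB pipeline: the frequencies are randomly generated placeholders, the root-mean-square distance is an assumed choice, and `classical_mds` is a hypothetical helper name.

```python
import numpy as np


def classical_mds(dist: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Classical (Torgerson) MDS: embed items given a symmetric distance matrix."""
    n = dist.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the largest eigenvalues; clip tiny negatives caused by round-off.
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Placeholder allele frequencies (rows: study groups, columns: SNPs).
rng = np.random.default_rng(0)
freqs = rng.uniform(0.05, 0.95, size=(6, 200))

# Assumed distance: root-mean-square allele frequency difference per pair.
diff = freqs[:, None, :] - freqs[None, :, :]
dist = np.sqrt((diff ** 2).mean(axis=2))

coords = classical_mds(dist, n_components=2)
print(coords.shape)
```

Plotting the two columns of `coords` against each other gives the kind of two-component MDS plot the abstract describes; with real data, one would then compare the layout with the geographical origin of the samples.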
NordicDB: a Nordic pool and portal for genome-wide control data

PubMed Central

Leu, Monica; Humphreys, Keith; Surakka, Ida; Rehnberg, Emil; Muilu, Juha; Rosenström, Päivi; Almgren, Peter; Jääskeläinen, Juha; Lifton, Richard P; Kyvik, Kirsten Ohm; Kaprio, Jaakko; Pedersen, Nancy L; Palotie, Aarno; Hall, Per; Grönberg, Henrik; Groop, Leif; Peltonen, Leena; Palmgren, Juni; Ripatti, Samuli

2010-01-01

A cost-efficient way to increase power in a genetic association study is to pool controls from different sources. The genotyping effort can then be directed to large case series. The Nordic Control database, NordicDB, has been set up as a unique resource in the Nordic area and the data are available for authorized users through the web portal (http://www.nordicdb.org). The current version of NordicDB pools together high-density genome-wide SNP information from ∼5000 controls originating from Finnish, Swedish and Danish studies and shows country-specific allele frequencies for SNP markers. The genetic homogeneity of the samples was investigated using multidimensional scaling (MDS) analysis and pairwise allele frequency differences between the studies. The plot of the first two MDS components showed excellent resemblance to the geographical placement of the samples, with a clear NW–SE gradient. We advise researchers to assess the impact of population structure when incorporating NordicDB controls in association studies. This harmonized Nordic database presents a unique genome-wide resource for future genetic association studies in the Nordic countries. PMID:20664631
Joint analysis of three genome-wide association studies of esophageal squamous cell carcinoma in Chinese populations.

PubMed

Wu, Chen; Wang, Zhaoming; Song, Xin; Feng, Xiao-Shan; Abnet, Christian C; He, Jie; Hu, Nan; Zuo, Xian-Bo; Tan, Wen; Zhan, Qimin; Hu, Zhibin; He, Zhonghu; Jia, Weihua; Zhou, Yifeng; Yu, Kai; Shu, Xiao-Ou; Yuan, Jian-Min; Zheng, Wei; Zhao, Xue-Ke; Gao, She-Gan; Yuan, Zhi-Qing; Zhou, Fu-You; Fan, Zong-Min; Cui, Ji-Li; Lin, Hong-Li; Han, Xue-Na; Li, Bei; Chen, Xi; Dawsey, Sanford M; Liao, Linda; Lee, Maxwell P; Ding, Ti; Qiao, You-Lin; Liu, Zhihua; Liu, Yu; Yu, Dianke; Chang, Jiang; Wei, Lixuan; Gao, Yu-Tang; Koh, Woon-Puay; Xiang, Yong-Bing; Tang, Ze-Zhong; Fan, Jin-Hu; Han, Jing-Jing; Zhou, Sheng-Li; Zhang, Peng; Zhang, Dong-Yun; Yuan, Yuan; Huang, Ying; Liu, Chunling; Zhai, Kan; Qiao, Yan; Jin, Guangfu; Guo, Chuanhai; Fu, Jianhua; Miao, Xiaoping; Lu, Changdong; Yang, Haijun; Wang, Chaoyu; Wheeler, William A; Gail, Mitchell; Yeager, Meredith; Yuenger, Jeff; Guo, Er-Tao; Li, Ai-Li; Zhang, Wei; Li, Xue-Min; Sun, Liang-Dan; Ma, Bao-Gen; Li, Yan; Tang, Sa; Peng, Xiu-Qing; Liu, Jing; Hutchinson, Amy; Jacobs, Kevin; Giffen, Carol; Burdette, Laurie; Fraumeni, Joseph F; Shen, Hongbing; Ke, Yang; Zeng, Yixin; Wu, Tangchun; Kraft, Peter; Chung, Charles C; Tucker, Margaret A; Hou, Zhi-Chao; Liu, Ya-Li; Hu, Yan-Long; Liu, Yu; Wang, Li; Yuan, Guo; Chen, Li-Sha; Liu, Xiao; Ma, Teng; Meng, Hui; Sun, Li; Li, Xin-Min; Li, Xiu-Min; Ku, Jian-Wei; Zhou, Ying-Fa; Yang, Liu-Qin; Wang, Zhou; Li, Yin; Qige, Qirenwang; Yang, Wen-Jun; Lei, Guang-Yan; Chen, Long-Qi; Li, En-Min; Yuan, Ling; Yue, Wen-Bin; Wang, Ran; Wang, Lu-Wen; Fan, Xue-Ping; Zhu, Fang-Heng; Zhao, Wei-Xing; Mao, Yi-Min; Zhang, Mei; Xing, Guo-Lan; Li, Ji-Lin; Han, Min; Ren, Jing-Li; Liu, Bin; Ren, Shu-Wei; Kong, Qing-Peng; Li, Feng; Sheyhidin, Ilyar; Wei, Wu; Zhang, Yan-Rui; Feng, Chang-Wei; Wang, Jin; Yang, Yu-Hua; Hao, Hong-Zhang; Bao, Qi-De; Liu, Bao-Chi; Wu, Ai-Qun; Xie, Dong; Yang, Wan-Cai; Wang, Liang; Zhao, Xiao-Hang; Chen, Shu-Qing; Hong, Jun-Yan; Zhang, Xue-Jun; Freedman, Neal D; Goldstein, Alisa M; Lin, Dongxin; 
Taylor, Philip R; Wang, Li-Dong; Chanock, Stephen J

2014-09-01

We conducted a joint (pooled) analysis of three genome-wide association studies (GWAS) of esophageal squamous cell carcinoma (ESCC) in individuals of Chinese ancestry (5,337 ESCC cases and 5,787 controls) with 9,654 ESCC cases and 10,058 controls for follow-up. In a logistic regression model adjusted for age, sex, study and two eigenvectors, two new loci achieved genome-wide significance, marked by rs7447927 at 5q31.2 (per-allele odds ratio (OR) = 0.85, 95% confidence interval (CI) = 0.82-0.88; P = 7.72 × 10^-20) and rs1642764 at 17p13.1 (per-allele OR = 0.88, 95% CI = 0.85-0.91; P = 3.10 × 10^-13). rs7447927 is a synonymous SNP in TMEM173, and rs1642764 is an intronic SNP in ATP1B2, near TP53. Furthermore, a locus in the HLA class II region at 6p21.32 (rs35597309) achieved genome-wide significance in the two populations at highest risk for ESCC (OR = 1.33, 95% CI = 1.22-1.46; P = 1.99 × 10^-10). Our joint analysis identifies new ESCC susceptibility loci overall as well as a new locus unique to the population in the Taihang Mountain region at high risk of ESCC.
Comparative Genomic Analysis of Globally Dominant ST131 Clone with Other Epidemiologically Successful Extraintestinal Pathogenic Escherichia coli (ExPEC) Lineages.

PubMed

Shaik, Sabiha; Ranjan, Amit; Tiwari, Sumeet K; Hussain, Arif; Nandanwar, Nishant; Kumar, Narender; Jadhav, Savita; Semmler, Torsten; Baddam, Ramani; Islam, Mohammed Aminul; Alam, Munirul; Wieler, Lothar H; Watanabe, Haruo; Ahmed, Niyaz

2017-10-24

Escherichia coli sequence type 131 (ST131), a pandemic clone responsible for the high incidence of extraintestinal pathogenic E. coli (ExPEC) infections, has been known widely for its contribution to the worldwide dissemination of multidrug resistance. Although other ExPEC-associated and extended-spectrum-β-lactamase (ESBL)-producing E. coli clones, such as ST38, ST405, and ST648 have been studied widely, no comparative genomic data with respect to other genotypes exist for ST131. In this study, comparative genomic analysis was performed for 99 ST131 E. coli strains with 40 genomes from three other STs, including ST38 (n = 12), ST405 (n = 10), and ST648 (n = 18), and functional studies were performed on five in-house strains corresponding to the four STs. Phylogenomic analysis results from this study corroborated the sequence type-specific clonality. Results from the genome-wide resistance profiling confirmed that all strains were inherently multidrug resistant. ST131 genomes showed unique virulence profiles, and analysis of mobile genetic elements and their associated methyltransferases (MTases) has revealed that several of them were missing from the majority of the non-ST131 strains. Despite the fact that non-ST131 strains lacked a few essential genes belonging to the serum resistome, the in-house strains representing all four STs demonstrated similar resistance levels to serum antibactericidal activity. Core genome analysis data revealed that non-ST131 strains usually lacked several ST131-defined genomic coordinates, and a significant number of genes were missing from the core of the ST131 genomes. Data from this study reinforce adaptive diversification of E. coli strains belonging to the ST131 lineage and provide new insights into the molecular mechanisms underlying clonal diversification of the ST131 lineage. IMPORTANCE: E. coli, particularly the ST131 extraintestinal pathogenic E. coli (ExPEC) lineage, is an important cause of community- and hospital-acquired infections, such as urinary tract infections, surgical site infections, bloodstream infections, and sepsis. The treatment of infections caused by ExPEC has become very challenging due to the emergence of resistance to the first-line as well as the last-resort antibiotics. This study analyzes E. coli ST131 against three other important and globally distributed ExPEC lineages (ST38, ST405, and ST648) that also produced extended-spectrum β-lactamase (ESBL). This is perhaps the first study that employs the high-throughput whole-genome sequence-based approach to compare and study the genomic features of these four ExPEC lineages in relation to their functional properties. Findings from this study highlight the differences in the genomic coordinates of ST131 with respect to the other STs considered here. Results from this comparative genomics study can help in advancing the understanding of ST131 evolution and also offer a framework towards future developments in pathogen identification and targeted therapeutics to prevent diseases caused by this pandemic E. coli ST131 clone. Copyright © 2017 Shaik et al.
Genome-wide detection of intervals of genetic heterogeneity associated with complex traits

PubMed Central

Llinares-López, Felipe; Grimm, Dominik G.; Bodenham, Dean A.; Gieraths, Udo; Sugiyama, Mahito; Rowan, Beth; Borgwardt, Karsten

2015-01-01

Motivation: Genetic heterogeneity, the fact that several sequence variants give rise to the same phenotype, is a phenomenon that is of the utmost interest in the analysis of complex phenotypes. Current approaches for finding regions in the genome that exhibit genetic heterogeneity suffer from at least one of two shortcomings: (i) they require the definition of an exact interval in the genome that is to be tested for genetic heterogeneity, potentially missing intervals of high relevance, or (ii) they suffer from an enormous multiple hypothesis testing problem due to the large number of potential candidate intervals being tested, which results in either many false positives or a lack of power to detect true intervals. Results: Here, we present an approach that overcomes both problems: it allows one to automatically find all contiguous sequences of single nucleotide polymorphisms in the genome that are jointly associated with the phenotype. It also solves both the inherent computational efficiency problem and the statistical problem of multiple hypothesis testing, which are both caused by the huge number of candidate intervals. We demonstrate on Arabidopsis thaliana genome-wide association study data that our approach can discover regions that exhibit genetic heterogeneity and would be missed by single-locus mapping. Conclusions: Our novel approach can contribute to the genome-wide discovery of intervals that are involved in the genetic heterogeneity underlying complex phenotypes. Availability and implementation: The code can be obtained at: http://www.bsse.ethz.ch/mlcb/research/bioinformatics-and-computational-biology/sis.html. Contact: felipe.llinares@bsse.ethz.ch Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26072488
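To give a sense of the scale of the multiple-testing problem described above, the sketch below counts and enumerates contiguous SNP intervals. It only illustrates the size of the naive search space and is not the authors' algorithm; the SNP count, the 0.05 error rate, and both function names are assumptions made for illustration.

```python
from typing import Iterator, Tuple


def count_intervals(n_snps: int) -> int:
    """Number of contiguous intervals over n_snps ordered SNPs: n(n+1)/2."""
    return n_snps * (n_snps + 1) // 2


def iter_intervals(n_snps: int, max_len: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) index pairs, end inclusive, of length <= max_len."""
    for start in range(n_snps):
        for end in range(start, min(start + max_len, n_snps)):
            yield start, end


# Hypothetical genome-wide SNP count, chosen only for illustration.
n = 100_000
total = count_intervals(n)

# Assumed family-wise error rate of 0.05 with a naive Bonferroni correction.
naive_threshold = 0.05 / total

print(f"candidate intervals: {total:,}")
print(f"naive per-interval threshold: {naive_threshold:.1e}")
```

The quadratic growth in candidate intervals is what makes a naive correction so conservative, which is the loss of power the abstract refers to.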
Pooled genome wide association detects association upstream of FCRL3 with Graves' disease.

PubMed

Khong, Jwu Jin; Burdon, Kathryn P; Lu, Yi; Laurie, Kate; Leonardos, Lefta; Baird, Paul N; Sahebjada, Srujana; Walsh, John P; Gajdatsy, Adam; Ebeling, Peter R; Hamblin, Peter Shane; Wong, Rosemary; Forehan, Simon P; Fourlanos, Spiros; Roberts, Anthony P; Doogue, Matthew; Selva, Dinesh; Montgomery, Grant W; Macgregor, Stuart; Craig, Jamie E

2016-11-18

Graves' disease is an autoimmune thyroid disease of complex inheritance. Multiple genetic susceptibility loci are thought to be involved in Graves' disease and it is therefore likely that these can be identified by genome wide association studies. This study aimed to determine if a genome wide association study, using a pooling methodology, could detect genomic loci associated with Graves' disease. Nineteen of the top ranking single nucleotide polymorphisms, including HLA-DQA1 and C6orf10, were clustered within the Major Histocompatibility Complex region on chromosome 6p21, with rs1613056 reaching genome wide significance (p = 5 × 10^-8). Technical validation of top ranking non-Major Histocompatibility Complex single nucleotide polymorphisms with individual genotyping in the discovery cohort revealed four single nucleotide polymorphisms with p ≤ 10^-4. rs17676303 on chromosome 1q23.1, located upstream of FCRL3, showed evidence of association with Graves' disease across the discovery, replication and combined cohorts. A second single nucleotide polymorphism, rs9644119, downstream of DPYSL2, showed some evidence of association, supported by findings in the replication cohort, that warrants further study. The pooled genome wide association study identified a genetic variant upstream of FCRL3 as a susceptibility locus for Graves' disease in addition to those identified in the Major Histocompatibility Complex. A second locus downstream of DPYSL2 is potentially a novel genetic variant in Graves' disease that requires further confirmation.
Weak Lensing: Ground vs. Space in the Cosmos Field

NASA Astrophysics Data System (ADS)

Kasliwal, Mansi M.; Massey, R. J.; Ellis, R. S.; Rhodes, J.

2006-12-01

Weak lensing statistics are best for wide surveys with a greater number of galaxies and deep surveys with a higher number density of galaxies. Although space-based surveys are unparalleled in their depth, ground-based surveys are the more cost-effective way to survey wide regions of the sky. We assess the relative merits of the two observing platforms, by using premier, multi-band, ground-based Subaru SuprimeCam data and space-based Hubble ACS data, in the 2 sq. degree COSMOS field in three ways. First, we compare shear measurements of individual galaxies and identify the relative calibration of the two datasets in terms of the largest subset in magnitude and size that is consistent. Second, we compare space- and ground-based mass maps to quantify the relative completeness and contamination of the resulting cluster catalogs. We find that more clusters with XMM catalog counterparts are detected from space than from the ground, and some ground-based clusters are possibly spurious detections. Third, we perform a detailed comparison of the precision with which it is possible to reconstruct the mass and size of four clusters at various redshifts identified from both ground and space. We find that the noise is much lower from space in all three investigations, but find no evidence for systematic overestimation or underestimation of the individual cluster properties by either survey.
Development and application of a novel genome-wide SNP array reveals domestication history in soybean

PubMed Central

Wang, Jiao; Chu, Shanshan; Zhang, Huairen; Zhu, Ying; Cheng, Hao; Yu, Deyue

2016-01-01

Domestication of soybeans occurred under the intense human-directed selections aimed at developing high-yielding lines. Tracing the domestication history and identifying the genes underlying soybean domestication require further exploration. Here, we developed a high-throughput NJAU 355 K SoySNP array and used this array to study the genetic variation patterns in 367 soybean accessions, including 105 wild soybeans and 262 cultivated soybeans. The population genetic analysis suggests that cultivated soybeans have tended to originate from northern and central China, from where they spread to other regions, accompanied with a gradual increase in seed weight. Genome-wide scanning for evidence of artificial selection revealed signs of selective sweeps involving genes controlling domestication-related agronomic traits including seed weight. To further identify genomic regions related to seed weight, a genome-wide association study (GWAS) was conducted across multiple environments in wild and cultivated soybeans. As a result, a strong linkage disequilibrium region on chromosome 20 was found to be significantly correlated with seed weight in cultivated soybeans. Collectively, these findings should provide an important basis for genomic-enabled breeding and advance the study of functional genomics in soybean. PMID:26856884
Development and application of a novel genome-wide SNP array reveals domestication history in soybean.

PubMed

Wang, Jiao; Chu, Shanshan; Zhang, Huairen; Zhu, Ying; Cheng, Hao; Yu, Deyue

2016-02-09

Domestication of soybeans occurred under the intense human-directed selections aimed at developing high-yielding lines. Tracing the domestication history and identifying the genes underlying soybean domestication require further exploration. Here, we developed a high-throughput NJAU 355 K SoySNP array and used this array to study the genetic variation patterns in 367 soybean accessions, including 105 wild soybeans and 262 cultivated soybeans. The population genetic analysis suggests that cultivated soybeans have tended to originate from northern and central China, from where they spread to other regions, accompanied with a gradual increase in seed weight. Genome-wide scanning for evidence of artificial selection revealed signs of selective sweeps involving genes controlling domestication-related agronomic traits including seed weight. To further identify genomic regions related to seed weight, a genome-wide association study (GWAS) was conducted across multiple environments in wild and cultivated soybeans. As a result, a strong linkage disequilibrium region on chromosome 20 was found to be significantly correlated with seed weight in cultivated soybeans. Collectively, these findings should provide an important basis for genomic-enabled breeding and advance the study of functional genomics in soybean.
PATRIC, the bacterial bioinformatics database and analysis resource.

PubMed

Wattam, Alice R; Abraham, David; Dalay, Oral; Disz, Terry L; Driscoll, Timothy; Gabbard, Joseph L; Gillespie, Joseph J; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K; Olson, Robert; Overbeek, Ross; Pusch, Gordon D; Shukla, Maulik; Schulman, Julie; Stevens, Rick L; Sullivan, Daniel E; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J C; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W

2014-01-01

The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein-protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10,000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue.
PATRIC, the bacterial bioinformatics database and analysis resource

PubMed Central

Wattam, Alice R.; Abraham, David; Dalay, Oral; Disz, Terry L.; Driscoll, Timothy; Gabbard, Joseph L.; Gillespie, Joseph J.; Gough, Roger; Hix, Deborah; Kenyon, Ronald; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olson, Robert; Overbeek, Ross; Pusch, Gordon D.; Shukla, Maulik; Schulman, Julie; Stevens, Rick L.; Sullivan, Daniel E.; Vonstein, Veronika; Warren, Andrew; Will, Rebecca; Wilson, Meredith J.C.; Yoo, Hyun Seung; Zhang, Chengdong; Zhang, Yan; Sobral, Bruno W.

2014-01-01

The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein–protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10 000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue. PMID:24225323
Genome-culture coevolution promotes rapid divergence of killer whale ecotypes.

PubMed

Foote, Andrew D; Vijay, Nagarjun; Ávila-Arcos, María C; Baird, Robin W; Durban, John W; Fumagalli, Matteo; Gibbs, Richard A; Hanson, M Bradley; Korneliussen, Thorfinn S; Martin, Michael D; Robertson, Kelly M; Sousa, Vitor C; Vieira, Filipe G; Vinař, Tomáš; Wade, Paul; Worley, Kim C; Excoffier, Laurent; Morin, Phillip A; Gilbert, M Thomas P; Wolf, Jochen B W

2016-05-31

Analysing population genomic data from killer whale ecotypes, which we estimate have globally radiated within less than 250,000 years, we show that genetic structuring including the segregation of potentially functional alleles is associated with socially inherited ecological niche. Reconstruction of ancestral demographic history revealed bottlenecks during founder events, likely promoting ecological divergence and genetic drift resulting in a wide range of genome-wide differentiation between pairs of allopatric and sympatric ecotypes. Functional enrichment analyses provided evidence for regional genomic divergence associated with habitat, dietary preferences and post-zygotic reproductive isolation. Our findings are consistent with expansion of small founder groups into novel niches by an initial plastic behavioural response, perpetuated by social learning imposing an altered natural selection regime. The study constitutes an important step towards an understanding of the complex interaction between demographic history, culture, ecological adaptation and evolution at the genomic level.
Genome-culture coevolution promotes rapid divergence of killer whale ecotypes

PubMed Central

Foote, Andrew D.; Vijay, Nagarjun; Ávila-Arcos, María C.; Baird, Robin W.; Durban, John W.; Fumagalli, Matteo; Gibbs, Richard A.; Hanson, M. Bradley; Korneliussen, Thorfinn S.; Martin, Michael D.; Robertson, Kelly M.; Sousa, Vitor C.; Vieira, Filipe G.; Vinař, Tomáš; Wade, Paul; Worley, Kim C.; Excoffier, Laurent; Morin, Phillip A.; Gilbert, M. Thomas P.; Wolf, Jochen B.W.

2016-01-01

Analysing population genomic data from killer whale ecotypes, which we estimate have globally radiated within less than 250,000 years, we show that genetic structuring including the segregation of potentially functional alleles is associated with socially inherited ecological niche. Reconstruction of ancestral demographic history revealed bottlenecks during founder events, likely promoting ecological divergence and genetic drift resulting in a wide range of genome-wide differentiation between pairs of allopatric and sympatric ecotypes. Functional enrichment analyses provided evidence for regional genomic divergence associated with habitat, dietary preferences and post-zygotic reproductive isolation. Our findings are consistent with expansion of small founder groups into novel niches by an initial plastic behavioural response, perpetuated by social learning imposing an altered natural selection regime. The study constitutes an important step towards an understanding of the complex interaction between demographic history, culture, ecological adaptation and evolution at the genomic level. PMID:27243207
Array-based assay detects genome-wide 5-mC and 5-hmC in the brains of humans, non-human primates, and mice.

PubMed

Chopra, Pankaj; Papale, Ligia A; White, Andrew T J; Hatch, Andrea; Brown, Ryan M; Garthwaite, Mark A; Roseboom, Patrick H; Golos, Thaddeus G; Warren, Stephen T; Alisch, Reid S

2014-02-13

Methylation on the fifth position of cytosine (5-mC) is an essential epigenetic mark that is linked to both normal neurodevelopment and neurological diseases. The recent identification of another modified form of cytosine, 5-hydroxymethylcytosine (5-hmC), in both stem cells and post-mitotic neurons, raises new questions as to the role of this base in mediating epigenetic effects. Genomic studies of these marks using model systems are limited, particularly with array-based tools, because the standard method of detecting DNA methylation cannot distinguish between 5-mC and 5-hmC and most methods have been developed to only survey the human genome. We show that non-human data generated using the optimization of a widely used human DNA methylation array, designed only to detect 5-mC, reproducibly distinguishes tissue types within and between chimpanzee, rhesus, and mouse, with correlations near the human DNA level (R^2 > 0.99). Genome-wide methylation analysis, using this approach, reveals 6,102 differentially methylated loci between rhesus placental and fetal tissues with pathways analysis significantly overrepresented for developmental processes. Restricting the analysis to oncogenes and tumor suppressor genes finds 76 differentially methylated loci, suggesting that rhesus placental tissue carries a cancer epigenetic signature. Similarly, adapting the assay to detect 5-hmC finds highly reproducible 5-hmC levels within human, rhesus, and mouse brain tissue that is species-specific with a hierarchical abundance among the three species (human > rhesus > mouse). Annotation of 5-hmC with respect to gene structure reveals a significant prevalence in the 3'UTR and an association with chromatin-related ontological terms, suggesting an epigenetic feedback loop mechanism for 5-hmC. Together, these data show that this array-based methylation assay is generalizable to all mammals for the detection of both 5-mC and 5-hmC, greatly improving the utility of mammalian model systems to study the role of epigenetics in human health, disease, and evolution.
A Genome-Wide Association Study Suggests Novel Loci Associated with a Schizophrenia-Related Brain-Based Phenotype

PubMed Central

Hass, Johanna; Walton, Esther; Kirsten, Holger; Liu, Jingyu; Priebe, Lutz; Wolf, Christiane; Karbalai, Nazanin; Gollub, Randy; White, Tonya; Roessner, Veit; Müller, Kathrin U.; Paus, Tomas; Smolka, Michael N.; Schumann, Gunter; Scholz, Markus; Cichon, Sven; Calhoun, Vince; Ehrlich, Stefan

2013-01-01

Patients with schizophrenia and their siblings typically show subtle changes of brain structures, such as a reduction of hippocampal volume. Hippocampal volume is heritable, may explain a variety of cognitive symptoms of schizophrenia and is thus considered an intermediate phenotype for this mental illness. The aim of our analyses was to identify single-nucleotide polymorphisms (SNP) related to hippocampal volume without making prior assumptions about possible candidate genes. In this study, we combined genetics, imaging and neuropsychological data obtained from the Mind Clinical Imaging Consortium study of schizophrenia (n = 328). A total of 743,591 SNPs were tested for association with hippocampal volume in a genome-wide association study. Gene expression profiles of human hippocampal tissue were investigated for gene regions of significantly associated SNPs. None of the genetic markers reached genome-wide significance. However, six highly correlated SNPs (rs4808611, rs35686037, rs12982178, rs1042178, rs10406920, rs8170) on chromosome 19p13.11, located within or in close proximity to the genes NR2F6, USHBP1, and BABAM1, as well as four SNPs in three other genomic regions (chromosome 1, 2 and 10) had p-values between 6.75 × 10⁻⁶ and 8.3 × 10⁻⁷. Using existing data of a very recently published GWAS of hippocampal volume and additional data of a multicentre study in a large cohort of adolescents of European ancestry, we found supporting evidence for our results. Furthermore, allelic differences in rs4808611 and rs8170 were highly associated with differential mRNA expression in the cis-acting region. Associations with memory functioning indicate a possible functional importance of the identified risk variants. Our findings provide new insights into the genetic architecture of a brain structure closely linked to schizophrenia. In silico replication, mRNA expression and cognitive data provide additional support for the relevance of our findings. Identification of causal variants and their functional effects may unveil yet unknown players in the neurodevelopment and the pathogenesis of neuropsychiatric disorders. PMID:23805179
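As a rough illustration of why none of the 743,591 tested SNPs reached genome-wide significance, the sketch below computes a Bonferroni-corrected threshold and compares it with the smallest p-value reported in the abstract. The 0.05 family-wise error rate and the use of a Bonferroni correction are assumptions made for illustration; the abstract does not state which threshold the authors applied, and `bonferroni_threshold` is a hypothetical helper name.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction.

    alpha is an assumed family-wise error rate, not a value from the abstract.
    """
    return alpha / n_tests


N_SNPS = 743_591   # number of SNPs tested, as stated in the abstract
BEST_P = 8.3e-7    # smallest p-value reported in the abstract

threshold = bonferroni_threshold(N_SNPS)
is_significant = BEST_P < threshold

print(f"threshold = {threshold:.2e}, significant = {is_significant}")
```

Under these assumptions the best reported p-value remains roughly an order of magnitude above the corrected threshold, which is consistent with the abstract's statement that the associations are suggestive rather than genome-wide significant.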
Family-based Association Analyses of Imputed Genotypes Reveal Genome-Wide Significant Association of Alzheimer’s disease with OSBPL6, PTPRG and PDCL3

PubMed Central

Herold, Christine; Hooli, Basavaraj V.; Mullin, Kristina; Liu, Tian; Roehr, Johannes T; Mattheisen, Manuel; Parrado, Antonio R.; Bertram, Lars; Lange, Christoph; Tanzi, Rudolph E.

2015-01-01

The genetic basis of Alzheimer's disease (AD) is complex and heterogeneous. Over 200 highly penetrant pathogenic variants in the genes APP, PSEN1 and PSEN2 cause a subset of early-onset familial Alzheimer's disease (EOFAD). On the other hand, susceptibility to late-onset forms of AD (LOAD) is indisputably associated with the ε4 allele in the gene APOE, and more recently with variants in more than two dozen additional genes identified in large-scale genome-wide association studies (GWAS) and meta-analysis reports. Taken together, however, although the heritability in AD is estimated to be as high as 80%, a large proportion of the underlying genetic factors still remain to be elucidated. In this study we performed a systematic family-based genome-wide association and meta-analysis on close to 15 million imputed variants from three large collections of AD families (~3,500 subjects from 1,070 families). Using a multivariate phenotype combining affection status and onset age, meta-analysis of the association results revealed three single nucleotide polymorphisms (SNPs) that achieved genome-wide significance for association with AD risk: rs7609954 in the gene PTPRG (P-value = 3.98 × 10⁻⁸), rs1347297 in the gene OSBPL6 (P-value = 4.53 × 10⁻⁸), and rs1513625 near PDCL3 (P-value = 4.28 × 10⁻⁸). In addition, rs72953347 in OSBPL6 (P-value = 6.36 × 10⁻⁷) and two SNPs in the gene CDKAL1 showed marginally significant association with LOAD (rs10456232, P-value = 4.76 × 10⁻⁷; rs62400067, P-value = 3.54 × 10⁻⁷). In summary, family-based GWAS meta-analysis of imputed SNPs revealed novel genomic variants in (or near) PTPRG, OSBPL6, and PDCL3 that influence risk for AD with genome-wide significance. PMID:26830138
Cooperative Genome-Wide Analysis Shows Increased Homozygosity in Early Onset Parkinson's Disease

PubMed Central

Nalls, Michael A.; Martinez, Maria; Schulte, Claudia; Holmans, Peter; Gasser, Thomas; Hardy, John; Singleton, Andrew B.; Wood, Nicholas W.; Brice, Alexis; Heutink, Peter; Williams, Nigel; Morris, Huw R.

2012-01-01

Parkinson's disease (PD) occurs in both familial and sporadic forms, and both monogenic and complex genetic factors have been identified. Early onset PD (EOPD) is particularly associated with autosomal recessive (AR) mutations, and three genes, PARK2, PARK7 and PINK1, have been found to carry mutations leading to AR disease. Since mutations in these genes account for less than 10% of EOPD patients, we hypothesized that further recessive genetic factors are involved in this disorder, which may appear in extended runs of homozygosity. We carried out genome-wide SNP genotyping to look for extended runs of homozygosity (ROHs) in 1,445 EOPD cases and 6,987 controls. Logistic regression analyses showed an increased level of genomic homozygosity in EOPD cases compared to controls. These differences are larger for ROH of 9 Mb and above, where there is a more than three-fold increase in the proportion of cases carrying a ROH. These differences are not explained by occult recessive mutations at existing loci. Controlling for genome-wide homozygosity in logistic regression analyses increased the differences between cases and controls, indicating that in EOPD cases ROHs do not simply relate to genome-wide measures of inbreeding. Homozygosity at a locus on chromosome 19p13.3 was identified as being more common in EOPD cases as compared to controls. Sequencing analysis of genes and predicted transcripts within this locus failed to identify a novel mutation causing EOPD in our cohort. There is an increased rate of genome-wide homozygosity in EOPD, as measured by an increase in ROHs. These ROHs are a signature of inbreeding and do not necessarily harbour disease-causing genetic variants. Although there might be other regions of interest apart from chromosome 19p13.3, we lack the power to detect them with this analysis. PMID:22427796
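The idea of scanning SNP genotypes for extended runs of homozygosity can be sketched as follows. This is a deliberately simplified illustration, not the procedure used in the study: `find_roh` is a hypothetical function, the 0/1/2 genotype coding (1 = heterozygous, `None` = missing) is an assumption, and dedicated ROH callers typically tolerate occasional heterozygous or missing calls inside a run, which this sketch does not. The 9 Mb cut-off in the example is taken from the abstract.

```python
def find_roh(positions, genotypes, min_length_bp):
    """Return (start, end) spans of consecutive homozygous calls.

    positions     -- base-pair coordinates of the SNPs, in ascending order
    genotypes     -- assumed coding: 0 or 2 = homozygous, 1 = heterozygous,
                     None = missing
    min_length_bp -- minimum physical length for a run to be reported
    """
    runs = []
    start = None   # position of the first SNP in the current run
    last = None    # position of the most recent homozygous SNP
    for pos, genotype in zip(positions, genotypes):
        if genotype in (0, 2):
            # A homozygous call opens a new run or extends the current one.
            if start is None:
                start = pos
            last = pos
        else:
            # A heterozygous or missing call closes the current run.
            if start is not None and last - start >= min_length_bp:
                runs.append((start, last))
            start = None
    # Report a run that is still open at the end of the chromosome.
    if start is not None and last - start >= min_length_bp:
        runs.append((start, last))
    return runs


# Toy data: one 9 Mb run, then a heterozygous call, then a short 1 Mb run.
positions = [0, 3_000_000, 6_000_000, 9_000_000, 10_000_000, 12_000_000, 13_000_000]
genotypes = [0, 2, 2, 0, 1, 2, 2]
long_runs = find_roh(positions, genotypes, min_length_bp=9_000_000)
```

With the toy data, only the first stretch meets the 9 Mb minimum; the second stretch is homozygous but too short to be reported.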

Complete and Draft Genome Sequences of Nine Lactobacillus sakei Strains Selected from the Three Known Phylogenetic Lineages and Their Main Clonal Complexes.

PubMed

Loux, Valentin; Coeuret, Gwendoline; Zagorec, Monique; Champomier Vergès, Marie-Christine; Chaillou, Stéphane

2018-04-19

We present here the complete and draft genome sequences of nine Lactobacillus sakei strains, selected from the entire range of clonal complexes from the three known lineages of the species. The strains were chosen to provide a wide view of pangenomic and plasmidic diversity for this important foodborne species. Copyright © 2018 Loux et al.
Human CST Facilitates Genome-wide RAD51 Recruitment to GC-Rich Repetitive Sequences in Response to Replication Stress.

PubMed

Chastain, Megan; Zhou, Qing; Shiva, Olga; Fadri-Moskwik, Maria; Whitmore, Leanne; Jia, Pingping; Dai, Xueyu; Huang, Chenhui; Ye, Ping; Chai, Weihang

2016-08-02

The telomeric CTC1/STN1/TEN1 (CST) complex has been implicated in promoting replication recovery under replication stress at genomic regions, yet its precise role is unclear. Here, we report that STN1 is enriched at GC-rich repetitive sequences genome-wide in response to hydroxyurea (HU)-induced replication stress. STN1 deficiency exacerbates the fragility of these sequences under replication stress, resulting in chromosome fragmentation. We find that upon fork stalling, CST proteins form distinct nuclear foci that colocalize with RAD51. Furthermore, replication stress induces physical association of CST with RAD51 in an ATR-dependent manner. Strikingly, CST deficiency diminishes HU-induced RAD51 foci formation and reduces RAD51 recruitment to telomeres and non-telomeric GC-rich fragile sequences. Collectively, our findings establish that CST promotes RAD51 recruitment to GC-rich repetitive sequences in response to replication stress to facilitate replication restart, thereby providing insights into the mechanism underlying genome stability maintenance. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.
A Study Assessing the Association of Glycated Hemoglobin A1C (HbA1C) Associated Variants with HbA1C, Chronic Kidney Disease and Diabetic Retinopathy in Populations of Asian Ancestry

PubMed Central

Chen, Peng; Ong, Rick Twee-Hee; Tay, Wan-Ting; Sim, Xueling; Ali, Mohammad; Xu, Haiyan; Suo, Chen; Liu, Jianjun; Chia, Kee-Seng; Vithana, Eranga; Young, Terri L.; Aung, Tin; Lim, Wei-Yen; Khor, Chiea-Chuen; Cheng, Ching-Yu; Wong, Tien-Yin; Teo, Yik-Ying; Tai, E-Shyong

2013-01-01

Glycated hemoglobin A1C (HbA1C) level is used as a diagnostic marker for diabetes mellitus and a predictor of diabetes associated complications. Genome-wide association studies have identified genetic variants associated with HbA1C level. Most of these studies have been conducted in populations of European ancestry. Here we report the findings from a meta-analysis of genome-wide association studies of HbA1C levels in 6,682 non-diabetic subjects of Chinese, Malay and South Asian ancestries. We also sought to examine the associations between HbA1C associated SNPs and microvascular complications associated with diabetes mellitus, namely chronic kidney disease and retinopathy. A cluster of 6 SNPs on chromosome 17 showed an association with HbA1C which achieved genome-wide significance in the Malays but not in Chinese and Asian Indians. No other variants achieved genome-wide significance in the individual studies or in the meta-analysis. When we investigated the reproducibility of the findings that emerged from the European studies, six loci out of fifteen were found to be associated with HbA1C with effect sizes similar to those reported in the populations of European ancestry and P-value ≤ 0.05. No convincing associations with chronic kidney disease and retinopathy were identified in this study. PMID:24244560
Optimized gene editing technology for Drosophila melanogaster using germ line-specific Cas9.

PubMed

Ren, Xingjie; Sun, Jin; Housden, Benjamin E; Hu, Yanhui; Roesel, Charles; Lin, Shuailiang; Liu, Lu-Ping; Yang, Zhihao; Mao, Decai; Sun, Lingzhu; Wu, Qujie; Ji, Jun-Yuan; Xi, Jianzhong; Mohr, Stephanie E; Xu, Jiang; Perrimon, Norbert; Ni, Jian-Quan

2013-11-19

The ability to engineer genomes in a specific, systematic, and cost-effective way is critical for functional genomic studies. Recent advances using the CRISPR-associated single-guide RNA system (Cas9/sgRNA) illustrate the potential of this simple system for genome engineering in a number of organisms. Here we report an effective and inexpensive method for genome DNA editing in Drosophila melanogaster whereby plasmid DNAs encoding short sgRNAs under the control of the U6b promoter are injected into transgenic flies in which Cas9 is specifically expressed in the germ line via the nanos promoter. We evaluate the off-targets associated with the method and establish a Web-based resource, along with a searchable, genome-wide database of predicted sgRNAs appropriate for genome engineering in flies. Finally, we discuss the advantages of our method in comparison with other recently published approaches.
Exome-wide DNA capture and next generation sequencing in domestic and wild species.

PubMed

Cosart, Ted; Beja-Pereira, Albano; Chen, Shanyuan; Ng, Sarah B; Shendure, Jay; Luikart, Gordon

2011-07-05

Gene-targeted and genome-wide markers are crucial to advance evolutionary biology, agriculture, and biodiversity conservation by improving our understanding of genetic processes underlying adaptation and speciation. Unfortunately, for eukaryotic species with large genomes it remains costly to obtain genome sequences and to develop genome resources such as genome-wide SNPs. A method is needed to allow gene-targeted, next-generation sequencing that is flexible enough to include any gene or number of genes, unlike transcriptome sequencing. Such a method would allow sequencing of many individuals, avoiding ascertainment bias in subsequent population genetic analyses. We demonstrate the usefulness of a recent technology, exon capture, for genome-wide, gene-targeted marker discovery in species with no genome resources. We use coding gene sequences from the domestic cow genome sequence (Bos taurus) to capture (enrich for), and subsequently sequence, thousands of exons of B. taurus, B. indicus, and Bison bison (wild bison). Our capture array has probes for 16,131 exons in 2,570 genes, including 203 candidate genes with known function and of interest for their association with disease and other fitness traits. We successfully sequenced and mapped exon sequences from across the 29 autosomes and X chromosome in the B. taurus genome sequence. Exon capture and high-throughput sequencing identified thousands of putative SNPs spread evenly across all reference chromosomes, in all three individuals, including hundreds of SNPs in our targeted candidate genes. This study shows exon capture can be customized for SNP discovery in many individuals and for non-model species without genomic resources. Our captured exome subset was small enough for affordable next-generation sequencing, and successfully captured exons from a divergent wild species using the domestic cow genome as reference.
A genome-wide association study identifies multiple loci for variation in human ear morphology.

PubMed

Adhikari, Kaustubh; Reales, Guillermo; Smith, Andrew J P; Konka, Esra; Palmen, Jutta; Quinto-Sanchez, Mirsha; Acuña-Alonzo, Victor; Jaramillo, Claudia; Arias, William; Fuentes, Macarena; Pizarro, María; Barquera Lozano, Rodrigo; Macín Pérez, Gastón; Gómez-Valdés, Jorge; Villamil-Ramírez, Hugo; Hunemeier, Tábita; Ramallo, Virginia; Silva de Cerqueira, Caio C; Hurtado, Malena; Villegas, Valeria; Granja, Vanessa; Gallo, Carla; Poletti, Giovanni; Schuler-Faccini, Lavinia; Salzano, Francisco M; Bortolini, Maria-Cátira; Canizales-Quinteros, Samuel; Rothhammer, Francisco; Bedoya, Gabriel; Calderón, Rosario; Rosique, Javier; Cheeseman, Michael; Bhutta, Mahmood F; Humphries, Steve E; Gonzalez-José, Rolando; Headon, Denis; Balding, David; Ruiz-Linares, Andrés

2015-06-24

Here we report a genome-wide association study for non-pathological pinna morphology in over 5,000 Latin Americans. We find genome-wide significant association at seven genomic regions affecting: lobe size and attachment, folding of antihelix, helix rolling, ear protrusion and antitragus size (linear regression P values 2 × 10⁻⁸ to 3 × 10⁻¹⁴). Four traits are associated with a functional variant in the Ectodysplasin A receptor (EDAR) gene, a key regulator of embryonic skin appendage development. We confirm expression of Edar in the developing mouse ear and that Edar-deficient mice have an abnormally shaped pinna. Two traits are associated with SNPs in a region overlapping the T-Box Protein 15 (TBX15) gene, a major determinant of mouse skeletal development. Strongest association in this region is observed for SNP rs17023457 located in an evolutionarily conserved binding site for the transcription factor Cartilage paired-class homeoprotein 1 (CART1), and we confirm that rs17023457 alters in vitro binding of CART1.
Partitioning heritability by functional annotation using genome-wide association summary statistics.

PubMed

Finucane, Hilary K; Bulik-Sullivan, Brendan; Gusev, Alexander; Trynka, Gosia; Reshef, Yakir; Loh, Po-Ru; Anttila, Verneri; Xu, Han; Zang, Chongzhi; Farh, Kyle; Ripke, Stephan; Day, Felix R; Purcell, Shaun; Stahl, Eli; Lindstrom, Sara; Perry, John R B; Okada, Yukinori; Raychaudhuri, Soumya; Daly, Mark J; Patterson, Nick; Neale, Benjamin M; Price, Alkes L

2015-11-01

Recent work has demonstrated that some functional categories of the genome contribute disproportionately to the heritability of complex diseases. Here we analyze a broad set of functional elements, including cell type-specific elements, to estimate their polygenic contributions to heritability in genome-wide association studies (GWAS) of 17 complex diseases and traits with an average sample size of 73,599. To enable this analysis, we introduce a new method, stratified LD score regression, for partitioning heritability from GWAS summary statistics while accounting for linked markers. This new method is computationally tractable at very large sample sizes and leverages genome-wide information. Our findings include a large enrichment of heritability in conserved regions across many traits, a very large immunological disease-specific enrichment of heritability in FANTOM5 enhancers and many cell type-specific enrichments, including significant enrichment of central nervous system cell types in the heritability of body mass index, age at menarche, educational attainment and smoking behavior.
Old foes, new understandings: nuclear entry of small non-enveloped DNA viruses.

PubMed

Fay, Nikta; Panté, Nelly

2015-06-01

The nuclear import of viral genomes is an important step of the infectious cycle for viruses that replicate in the nucleus of their host cells. Although most viruses use the cellular nuclear import machinery or some components of this machinery, others have developed sophisticated ways to reach the nucleus. Some of these have been known for some time; however, recent studies have changed our understanding of how some non-enveloped DNA viruses access the nucleus. For example, parvoviruses enter the nucleus through small disruptions of the nuclear membranes and nuclear lamina, and adenovirus tugs at the nuclear pore complex, using kinesin-1, to disassemble their capsids and deliver viral proteins and genomes into the nucleus. Here we review recent findings of the nuclear import strategies of three small non-enveloped DNA viruses, including adenovirus, parvovirus, and the polyomavirus simian virus 40. Copyright © 2015 Elsevier B.V. All rights reserved.
LD Score Regression Distinguishes Confounding from Polygenicity in Genome-Wide Association Studies

PubMed Central

Bulik-Sullivan, Brendan K.; Loh, Po-Ru; Finucane, Hilary; Ripke, Stephan; Yang, Jian; Patterson, Nick; Daly, Mark J.; Price, Alkes L.; Neale, Benjamin M.

2015-01-01

Both polygenicity (i.e., many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield an inflated distribution of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from true polygenic signal and bias. We have developed an approach, LD Score regression, that quantifies the contribution of each by examining the relationship between test statistics and linkage disequilibrium (LD). The LD Score regression intercept can be used to estimate a more powerful and accurate correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of test statistic inflation in many GWAS of large sample size. PMID:25642630
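The relationship between test statistics and LD that the method exploits can be sketched in a few lines. Under the LD Score regression model, the expected chi-square statistic of SNP j is N·h²·ℓ_j/M + N·a + 1, where N is the sample size, M the number of SNPs, h² the SNP heritability, ℓ_j the LD Score of the SNP, and a the contribution of confounding; this formula comes from the LD Score regression literature rather than from the abstract above. The sketch below is a simplified illustration only: it uses an unweighted least-squares fit on noise-free toy data, whereas the published method uses a weighted regression with resampling-based standard errors, and all function names and numeric values are hypothetical.

```python
def expected_chi2(ld_score, n, m, h2, a):
    """Expected chi-square statistic of a SNP under the LD Score regression model."""
    return n * h2 * ld_score / m + n * a + 1.0


def fit_line(x, y):
    """Unweighted least-squares fit of y = intercept + slope * x."""
    k = len(x)
    mean_x = sum(x) / k
    mean_y = sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def ldsc_estimates(ld_scores, chi2, n, m):
    """Regress chi-square statistics on LD Scores.

    Returns (intercept, h2): the intercept estimates N*a + 1 (inflation from
    confounding), and the slope is rescaled by M/N to estimate heritability.
    """
    intercept, slope = fit_line(ld_scores, chi2)
    return intercept, slope * m / n


# Toy, noise-free example with made-up parameter values.
N, M, H2, A = 50_000, 1_000_000, 0.4, 1e-6
ld_scores = [1.0, 5.0, 10.0, 50.0, 100.0]
chi2 = [expected_chi2(score, N, M, H2, A) for score in ld_scores]
intercept, h2_hat = ldsc_estimates(ld_scores, chi2, N, M)
```

In this toy setting the fit separates the two sources of inflation: the slope reflects polygenic signal that scales with LD, while an intercept above 1 reflects confounding that does not.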
Genome-wide linkage meta-analysis identifies susceptibility loci at 2q34 and 13q31.3 for genetic generalized epilepsies.

PubMed

Leu, Costin; de Kovel, Carolien G F; Zara, Federico; Striano, Pasquale; Pezzella, Marianna; Robbiano, Angela; Bianchi, Amedeo; Bisulli, Francesca; Coppola, Antonietta; Giallonardo, Anna Teresa; Beccaria, Francesca; Trenité, Dorothée Kasteleijn-Nolst; Lindhout, Dick; Gaus, Verena; Schmitz, Bettina; Janz, Dieter; Weber, Yvonne G; Becker, Felicitas; Lerche, Holger; Kleefuss-Lie, Ailing A; Hallman, Kerstin; Kunz, Wolfram S; Elger, Christian E; Muhle, Hiltrud; Stephani, Ulrich; Møller, Rikke S; Hjalgrim, Helle; Mullen, Saul; Scheffer, Ingrid E; Berkovic, Samuel F; Everett, Kate V; Gardiner, Mark R; Marini, Carla; Guerrini, Renzo; Lehesjoki, Anna-Elina; Siren, Auli; Nabbout, Rima; Baulac, Stephanie; Leguern, Eric; Serratosa, Jose M; Rosenow, Felix; Feucht, Martha; Unterberger, Iris; Covanis, Athanasios; Suls, Arvid; Weckhuysen, Sarah; Kaneva, Radka; Caglayan, Hande; Turkdogan, Dilsad; Baykan, Betul; Bebek, Nerses; Ozbek, Ugur; Hempelmann, Anne; Schulz, Herbert; Rüschendorf, Franz; Trucks, Holger; Nürnberg, Peter; Avanzini, Giuliano; Koeleman, Bobby P C; Sander, Thomas

2012-02-01

Genetic generalized epilepsies (GGEs) have a lifetime prevalence of 0.3% with heritability estimates of 80%. A considerable proportion of families with siblings affected by GGEs presumably display an oligogenic inheritance. The present genome-wide linkage meta-analysis aimed to map: (1) susceptibility loci shared by a broad spectrum of GGEs, and (2) seizure type-related genetic factors preferentially predisposing to either typical absence or myoclonic seizures, respectively. Meta-analysis of three genome-wide linkage datasets was carried out in 379 GGE-multiplex families of European ancestry including 982 relatives with GGEs. To dissect out seizure type-related susceptibility genes, two family subgroups were stratified comprising 235 families with predominantly genetic absence epilepsies (GAEs) and 118 families with an aggregation of juvenile myoclonic epilepsy (JME). To map shared and seizure type-related susceptibility loci, both nonparametric linkage (NPL) and parametric linkage analyses were performed for a broad trait model (GGEs) in the entire set of GGE-multiplex families and a narrow trait model (typical absence or myoclonic seizures) in the subgroups of JME and GAE families. For the entire set of 379 GGE-multiplex families, linkage analysis revealed six loci achieving suggestive evidence for linkage at 1p36.22, 3p14.2, 5q34, 13q12.12, 13q31.3, and 19q13.42. The linkage finding at 5q34 was consistently supported by both NPL and parametric linkage results across all three family groups. A genome-wide significant nonparametric logarithm of odds score of 3.43 was obtained at 2q34 in 118 JME families. Significant parametric linkage to 13q31.3 was found in 235 GAE families assuming recessive inheritance (heterogeneity logarithm of odds = 5.02). Our linkage results support an oligogenic predisposition of familial GGE syndromes. The genetic risk factor at 5q34 confers risk to a broad spectrum of familial GGE syndromes, whereas susceptibility loci at 2q34 and 13q31.3 preferentially predispose to myoclonic seizures or absence seizures, respectively. Phenotype-genotype strategies applying narrow trait definitions in phenotypically homogeneous subgroups of families improve the prospects of disentangling the genetic basis of common familial GGE syndromes. Wiley Periodicals, Inc. © 2012 International League Against Epilepsy.
Genome-Wide Variation Patterns Uncover the Origin and Selection in Cultivated Ginseng (Panax ginseng Meyer).

PubMed

Li, Ming-Rui; Shi, Feng-Xue; Li, Ya-Ling; Jiang, Peng; Jiao, Lili; Liu, Bao; Li, Lin-Feng

2017-09-01

Chinese ginseng (Panax ginseng Meyer) is a medicinally important herb and plays crucial roles in traditional Chinese medicine. Pharmacological analyses identified diverse bioactive components from Chinese ginseng. However, basic biological attributes including domestication and selection of the ginseng plant remain under-investigated. Here, we present a genome-wide view of the domestication and selection of cultivated ginseng based on whole-genome data. A total of 8,660 protein-coding genes were selected for genome-wide scanning of the 30 wild and cultivated ginseng accessions. In addition, the 45S rDNA, chloroplast and mitochondrial genomes were included to perform phylogenetic and population genetic analyses. The observed spatial genetic structure between northern cultivated ginseng (NCG) and southern cultivated ginseng (SCG) accessions suggested multiple independent origins of cultivated ginseng. Genome-wide scanning further demonstrated that NCG and SCG have undergone distinct selection pressures during the domestication process, with more genes identified in the NCG (97 genes) than in the SCG group (5 genes). Functional analyses revealed that these genes are involved in diverse pathways, including DNA methylation, lignin biosynthesis, and cell differentiation. These findings suggested that the SCG and NCG groups have distinct demographic histories. Candidate genes identified are useful for future molecular breeding of cultivated ginseng. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Ancestry, admixture and fitness in Colombian genomes

PubMed Central

Rishishwar, Lavanya; Conley, Andrew B.; Wigington, Charles H.; Wang, Lu; Valderrama-Aguirre, Augusto; King Jordan, I.

2015-01-01

The human dimension of the Columbian Exchange entailed substantial genetic admixture between ancestral source populations from Africa, the Americas and Europe, which had evolved separately for many thousands of years. We sought to address the implications of the creation of admixed American genomes, containing novel allelic combinations, for human health and fitness via analysis of an admixed Colombian population from Medellin. Colombian genomes from Medellin show a wide range of three-way admixture contributions from ancestral source populations. The primary ancestry component for the population is European (average = 74.6%, range = 45.0%–96.7%), followed by Native American (average = 18.1%, range = 2.1%–33.3%) and African (average = 7.3%, range = 0.2%–38.6%). Locus-specific patterns of ancestry were evaluated to search for genomic regions that are enriched across the population for particular ancestry contributions. Adaptive and innate immune system related genes and pathways are particularly over-represented among ancestry-enriched segments, including genes (HLA-B and MAPK10) that are involved in defense against endemic pathogens such as malaria. Genes that encode functions related to skin pigmentation (SCL4A5) and cutaneous glands (EDAR) are also found in regions with anomalous ancestry patterns. These results suggest the possibility that ancestry-specific loci were differentially retained in the modern admixed Colombian population based on their utility in the New World environment. PMID:26197429
Brief Overview of a Decade of Genome-Wide Association Studies on Primary Hypertension.

PubMed

Azam, Afifah Binti; Azizan, Elena Aisha Binti

2018-01-01

Primary hypertension is widely believed to be a complex polygenic disorder whose manifestation is influenced by the interactions of genomic and environmental factors, making identification of susceptibility genes a major challenge. With major advancement in high-throughput genotyping technology, the genome-wide association study (GWAS) has become a powerful tool for researchers studying genetically complex diseases. GWASs work by revealing links between DNA sequence variation and a disease or trait of biomedical importance. The human genome is a very long DNA sequence which consists of billions of nucleotides arranged in a unique way. A single base-pair change in the DNA sequence is known as a single nucleotide polymorphism (SNP). With the help of modern genotyping techniques such as chip-based genotyping arrays, thousands of SNPs can be genotyped easily. Large-scale GWASs, in which more than half a million common SNPs are genotyped and analyzed for disease association in hundreds of thousands of cases and controls, have been broadly successful in identifying SNPs associated with heart diseases, diabetes, autoimmune diseases, and psychiatric disorders. It is, however, still debatable whether GWAS is the best approach for hypertension. The following is a brief overview of the outcomes of a decade of GWASs on primary hypertension.
Assessing Predictive Properties of Genome-Wide Selection in Soybeans

PubMed Central

Xavier, Alencar; Muir, William M.; Rainey, Katy Martin

2016-01-01

Many economically important traits in plant breeding have low heritability or are difficult to measure. For these traits, genomic selection has attractive features and may boost genetic gains. Our goal was to evaluate alternative scenarios to implement genomic selection for yield components in soybean (Glycine max (L.) Merr.). We used a nested association panel with cross validation to evaluate the impacts of training population size, genotyping density, and prediction model on the accuracy of genomic prediction. Our results indicate that training population size was the factor most relevant to improvement in genome-wide prediction, with greatest improvement observed in training sets up to 2000 individuals. We discuss assumptions that influence the choice of the prediction model. Although alternative models had minor impacts on prediction accuracy, the most robust prediction model was the combination of reproducing kernel Hilbert space regression and BayesB. Higher genotyping density marginally improved accuracy. Our study finds that breeding programs seeking efficient genomic selection in soybeans would best allocate resources by investing in a representative training set. PMID:27317786
Genome size diversity in angiosperms and its influence on gene space.

PubMed

Dodsworth, Steven; Leitch, Andrew R; Leitch, Ilia J

2015-12-01

Genome size varies c. 2400-fold in angiosperms (flowering plants), although the range of genome size is skewed towards small genomes, with a mean genome size of 1C = 5.7 Gb. One of the most crucial factors governing genome size in angiosperms is the relative amount and activity of repetitive elements. Recently, there have been new insights into how these repeats, previously discarded as 'junk' DNA, can have a significant impact on gene space (i.e. the part of the genome comprising all the genes and gene-related DNA). Here we review these new findings and explore in what ways genome size itself plays a role in influencing how repeats impact genome dynamics and gene space, including gene expression. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
Genome, transcriptome, and functional analyses of Penicillium expansum provide new insights into secondary metabolism and pathogenicity

USDA-ARS's Scientific Manuscript database

The relationship between secondary metabolism and infection in pathogenic fungi has remained largely elusive. Penicillium comprises a group of plant pathogens with varying host specificities and with the ability to produce a wide array of secondary metabolites. The genomes of three Penicillium exp...
Frontotemporal dementia and its subtypes: a genome-wide association study

PubMed Central

Ferrari, Raffaele; Hernandez, Dena G; Nalls, Michael A; Rohrer, Jonathan D; Ramasamy, Adaikalavan; Kwok, John B J; Dobson-Stone, Carol; Brooks, William S; Schofield, Peter R; Halliday, Glenda M; Hodges, John R; Piguet, Olivier; Bartley, Lauren; Thompson, Elizabeth; Haan, Eric; Hernández, Isabel; Ruiz, Agustín; Boada, Mercè; Borroni, Barbara; Padovani, Alessandro; Cruchaga, Carlos; Cairns, Nigel J; Benussi, Luisa; Binetti, Giuliano; Ghidoni, Roberta; Forloni, Gianluigi; Galimberti, Daniela; Fenoglio, Chiara; Serpente, Maria; Scarpini, Elio; Clarimón, Jordi; Lleó, Alberto; Blesa, Rafael; Waldö, Maria Landqvist; Nilsson, Karin; Nilsson, Christer; Mackenzie, Ian R A; Hsiung, Ging-Yuek R; Mann, David M A; Grafman, Jordan; Morris, Christopher M; Attems, Johannes; Griffiths, Timothy D; McKeith, Ian G; Thomas, Alan J; Pietrini, P; Huey, Edward D; Wassermann, Eric M; Baborie, Atik; Jaros, Evelyn; Tierney, Michael C; Pastor, Pau; Razquin, Cristina; Ortega-Cubero, Sara; Alonso, Elena; Perneczky, Robert; Diehl-Schmid, Janine; Alexopoulos, Panagiotis; Kurz, Alexander; Rainero, Innocenzo; Rubino, Elisa; Pinessi, Lorenzo; Rogaeva, Ekaterina; George-Hyslop, Peter St; Rossi, Giacomina; Tagliavini, Fabrizio; Giaccone, Giorgio; Rowe, James B; Schlachetzki, J C M; Uphill, James; Collinge, John; Mead, S; Danek, Adrian; Van Deerlin, Vivianna M; Grossman, Murray; Trojanowsk, John Q; van der Zee, Julie; Deschamps, William; Van Langenhove, Tim; Cruts, Marc; Van Broeckhoven, Christine; Cappa, Stefano F; Le Ber, Isabelle; Hannequin, Didier; Golfier, Véronique; Vercelletto, Martine; Brice, Alexis; Nacmias, Benedetta; Sorbi, Sandro; Bagnoli, Silvia; Piaceri, Irene; Nielsen, Jørgen E; Hjermind, Lena E; Riemenschneider, Matthias; Mayhaus, Manuel; Ibach, Bernd; Gasparoni, Gilles; Pichler, Sabrina; Gu, Wei; Rossor, Martin N; Fox, Nick C; Warren, Jason D; Spillantini, Maria Grazia; Morris, Huw R; Rizzu, Patrizia; Heutink, Peter; Snowden, Julie S; Rollinson, Sara; Richardson, Anna; Gerhard, 
Alexander; Bruni, Amalia C; Maletta, Raffaele; Frangipane, Francesca; Cupidi, Chiara; Bernardi, Livia; Anfossi, Maria; Gallo, Maura; Conidi, Maria Elena; Smirne, Nicoletta; Rademakers, Rosa; Baker, Matt; Dickson, Dennis W; Graff-Radford, Neill R; Petersen, Ronald C; Knopman, David; Josephs, Keith A; Boeve, Bradley F; Parisi, Joseph E; Seeley, William W; Miller, Bruce L; Karydas, Anna M; Rosen, Howard; van Swieten, John C; Dopper, Elise G P; Seelaar, Harro; Pijnenburg, Yolande AL; Scheltens, Philip; Logroscino, Giancarlo; Capozzo, Rosa; Novelli, Valeria; Puca, Annibale A; Franceschi, M; Postiglione, Alfredo; Milan, Graziella; Sorrentino, Paolo; Kristiansen, Mark; Chiang, Huei-Hsin; Graff, Caroline; Pasquier, Florence; Rollin, Adeline; Deramecourt, Vincent; Lebert, Florence; Kapogiannis, Dimitrios; Ferrucci, Luigi; Pickering-Brown, Stuart; Singleton, Andrew B; Hardy, John; Momeni, Parastoo

2014-01-01

Summary. Background: Frontotemporal dementia (FTD) is a complex disorder characterised by a broad range of clinical manifestations, differential pathological signatures, and genetic variability. Mutations in three genes (MAPT, GRN, and C9orf72) have been associated with FTD. We sought to identify novel genetic risk loci associated with the disorder. Methods: We did a two-stage genome-wide association study on clinical FTD, analysing samples from 3526 patients with FTD and 9402 healthy controls. All participants had European ancestry. In the discovery phase (samples from 2154 patients with FTD and 4308 controls), we did separate association analyses for each FTD subtype (behavioural variant FTD, semantic dementia, progressive non-fluent aphasia, and FTD overlapping with motor neuron disease [FTD-MND]), followed by a meta-analysis of the entire dataset. We carried forward replication of the novel suggestive loci in an independent sample series (samples from 1372 patients and 5094 controls) and then did joint phase and brain expression and methylation quantitative trait loci analyses for the associated (p<5 × 10−8) and suggestive single-nucleotide polymorphisms. Findings: We identified novel associations exceeding the genome-wide significance threshold (p<5 × 10−8) that encompassed the HLA locus at 6p21.3 in the entire cohort. We also identified a potential novel locus at 11q14, encompassing RAB38/CTSC, for the behavioural FTD subtype. Analysis of expression and methylation quantitative trait loci data suggested that these loci might affect expression and methylation in cis. Interpretation: Our findings suggest that immune system processes (link to 6p21.3) and possibly lysosomal and autophagy pathways (link to 11q14) are potentially involved in FTD. Our findings need to be replicated to better define the association of the newly identified loci with disease and possibly to shed light on the pathomechanisms contributing to FTD.
Funding: The National Institute of Neurological Disorders and Stroke and National Institute on Aging, the Wellcome/MRC Centre on Parkinson’s disease, Alzheimer’s Research UK, and Texas Tech University Health Sciences Center. PMID:24943344
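The genome-wide significance threshold quoted in this abstract (p<5 × 10−8) is conventionally explained as a Bonferroni correction of a 0.05 error rate over roughly one million independent common-variant tests. The abstract does not state how the threshold was derived, so the sketch below only illustrates that conventional arithmetic; the function names and the example p-values are hypothetical and are not results from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Assumption: about one million independent tests, the usual rationale for 5e-8.
GENOME_WIDE_THRESHOLD = bonferroni_threshold(0.05, 1_000_000)


def is_genome_wide_significant(p_value: float) -> bool:
    """True when a SNP's p-value falls below the genome-wide threshold."""
    return p_value < GENOME_WIDE_THRESHOLD


# Hypothetical p-values, for illustration only.
for p in (3.0e-9, 2.0e-7):
    print(p, is_genome_wide_significant(p))
```

A SNP that misses this cut-off but still shows a small p-value is the kind of result the abstract calls "suggestive"; the abstract does not give the cut-off used for that category.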
Ischemic Stroke: From Next Generation Sequencing and GWAS to Community Genomics?

PubMed

Black, Michael; Wang, Wenzhi; Wang, Wei

2015-08-01

Stroke is a major cause of mortality and morbidity in both the developed and developing world. Next generation sequencing (NGS) and multi-omics integrative biology research offer new opportunities in the way we research and understand stroke. These biotechnologies also signal a shift from genetics to genomics of stroke, which is highlighted in this review. Stroke is a focal neurological deficit resulting from disruption of the cerebral blood supply. There are two main types of common stroke, ischemic stroke (IS), which comprises 80% of cases, and hemorrhagic stroke (HS) that accounts for about 20% of cases. IS is a complex multi-factorial disease with multiple environmental and genomic determinants. We discuss here IS from genomics and bioinformatics perspectives, including the highlights of the genome wide association studies (GWAS), NGS progress to date, and exome studies. While both 'common variant, common disease' and 'rare variant, common disease' approaches need to be assessed in tandem, future studies into IS omics should also consider pedigree and/or community based sampling to take account of the complex diversity of IS genetics. We conclude by presenting an example of such community genomics research from China in an extended pedigree sample, and the ways in which the intersection of genomics and global society can usefully inform our understanding of IS pathophysiology and potential preventive medicine interventions in the future.
Construction and sequence sampling of deep-coverage, large-insert BAC libraries for three model lepidopteran species

PubMed Central

Wu, Chengcang; Proestou, Dina; Carter, Dorothy; Nicholson, Erica; Santos, Filippe; Zhao, Shaying; Zhang, Hong-Bin; Goldsmith, Marian R

2009-01-01

Background Manduca sexta, Heliothis virescens, and Heliconius erato represent three widely-used insect model species for genomic and fundamental studies in Lepidoptera. Large-insert BAC libraries of these insects are critical resources for many molecular studies, including physical mapping and genome sequencing, but not available to date. Results We report the construction and characterization of six large-insert BAC libraries for the three species and sampling sequence analysis of the genomes. The six BAC libraries were constructed with two restriction enzymes, two libraries for each species, and each has an average clone insert size ranging from 152–175 kb. We estimated that the genome coverage of each library ranged from 6–9 ×, with the two combined libraries of each species being equivalent to 13.0–16.3 × haploid genomes. The genome coverage, quality and utility of the libraries were further confirmed by library screening using 6~8 putative single-copy probes. To provide a first glimpse into these genomes, we sequenced and analyzed the BAC ends of ~200 clones randomly selected from the libraries of each species. The data revealed that the genomes are AT-rich, contain relatively small fractions of repeat elements with a majority belonging to the category of low complexity repeats, and are more abundant in retro-elements than DNA transposons. Among the species, the H. erato genome is somewhat more abundant in repeat elements and simple repeats than those of M. sexta and H. virescens. The BLAST analysis of the BAC end sequences suggested that the evolution of the three genomes is widely varied, with the genome of H. virescens being the most conserved as a typical lepidopteran, whereas both genomes of H. erato and M. sexta appear to have evolved significantly, resulting in a higher level of species- or evolutionary lineage-specific sequences. 
Conclusion The high-quality and large-insert BAC libraries of the insects, together with the identified BACs containing genes of interest, provide valuable information, resources and tools for comprehensive understanding and studies of the insect genomes and for addressing many fundamental questions in Lepidoptera. The sample of the genomic sequences provides the first insight into the constitution and evolution of the insect genomes. PMID:19558662
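The genome coverage figures reported above (6–9 × per library) follow from a standard calculation: total cloned DNA divided by haploid genome size. The sketch below illustrates that calculation together with the Clarke-Carbon estimate of the chance that a given single-copy sequence is present in a library. The clone count and genome size used here are hypothetical placeholders, since the abstract reports neither; only the insert size lies within the reported 152–175 kb range.

```python
def library_coverage(n_clones: int, mean_insert_kb: float, genome_size_mb: float) -> float:
    """Genome equivalents: total cloned DNA divided by haploid genome size."""
    return n_clones * mean_insert_kb / (genome_size_mb * 1000.0)


def probability_of_recovery(n_clones: int, mean_insert_kb: float, genome_size_mb: float) -> float:
    """Clarke-Carbon estimate P = 1 - (1 - I/G)^N for one single-copy sequence."""
    fraction = mean_insert_kb / (genome_size_mb * 1000.0)
    return 1.0 - (1.0 - fraction) ** n_clones


# Hypothetical library: 20,000 clones, 160 kb mean insert, 500 Mb genome.
n_clones, insert_kb, genome_mb = 20_000, 160.0, 500.0

coverage = library_coverage(n_clones, insert_kb, genome_mb)
p_hit = probability_of_recovery(n_clones, insert_kb, genome_mb)
print(f"coverage: {coverage:.1f} x, P(sequence present): {p_hit:.4f}")
```

Under these assumed inputs the library holds 6.4 genome equivalents, which is why screening with a handful of single-copy probes, as the authors did, is a reasonable empirical check on the estimate.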

The three-dimensional genome organization of Drosophila melanogaster through data integration.

PubMed

Li, Qingjiao; Tjong, Harianto; Li, Xiao; Gong, Ke; Zhou, Xianghong Jasmine; Chiolo, Irene; Alber, Frank

2017-07-31

Genome structures are dynamic and non-randomly organized in the nucleus of higher eukaryotes. To maximize the accuracy and coverage of three-dimensional genome structural models, it is important to integrate all available sources of experimental information about a genome's organization. It remains a major challenge to integrate such data from various complementary experimental methods. Here, we present an approach for data integration to determine a population of complete three-dimensional genome structures that are statistically consistent with data from both genome-wide chromosome conformation capture (Hi-C) and lamina-DamID experiments. Our structures resolve the genome at the resolution of topological domains, and reproduce simultaneously both sets of experimental data. Importantly, this data deconvolution framework allows for structural heterogeneity between cells, and hence accounts for the expected plasticity of genome structures. As a case study we choose Drosophila melanogaster embryonic cells, for which both data types are available. Our three-dimensional genome structures have strong predictive power for structural features not directly visible in the initial data sets, and reproduce experimental hallmarks of the D. melanogaster genome organization from independent and our own imaging experiments. Also they reveal a number of new insights about genome organization and its functional relevance, including the preferred locations of heterochromatic satellites of different chromosomes, and observations about homologous pairing that cannot be directly observed in the original Hi-C or lamina-DamID data. Our approach allows systematic integration of Hi-C and lamina-DamID data for complete three-dimensional genome structure calculation, while also explicitly considering genome structural variability.
De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds.

PubMed

Dudchenko, Olga; Batra, Sanjit S; Omer, Arina D; Nyquist, Sarah K; Hoeger, Marie; Durand, Neva C; Shamim, Muhammad S; Machol, Ido; Lander, Eric S; Aiden, Aviva Presser; Aiden, Erez Lieberman

2017-04-07

The Zika outbreak, spread by the Aedes aegypti mosquito, highlights the need to create high-quality assemblies of large genomes in a rapid and cost-effective way. Here we combine Hi-C data with existing draft assemblies to generate chromosome-length scaffolds. We validate this method by assembling a human genome, de novo, from short reads alone (67× coverage). We then combine our method with draft sequences to create genome assemblies of the mosquito disease vectors Ae. aegypti and Culex quinquefasciatus, each consisting of three scaffolds corresponding to the three chromosomes in each species. These assemblies indicate that almost all genomic rearrangements among these species occur within, rather than between, chromosome arms. The genome assembly procedure we describe is fast, inexpensive, and accurate, and can be applied to many species. Copyright © 2017, American Association for the Advancement of Science.
Human inversions and their functional consequences

PubMed Central

Puig, Marta; Casillas, Sònia; Villatoro, Sergi

2015-01-01

Polymorphic inversions are a type of structural variant that is difficult to analyze owing to their balanced nature and the location of breakpoints within complex repeated regions. So far, only a handful of inversions have been studied in detail in humans and current knowledge about their possible functional effects is still limited. However, inversions have been related to phenotypic changes and adaptation in multiple species. In this review, we summarize the evidence for the functional impact of inversions in the human genome. First, given that inversions have been shown to inhibit recombination in heterokaryotes, chromosomes displaying different orientation are expected to evolve independently and this may lead to distinct gene-expression patterns. Second, inversions have a role as disease-causing mutations both by directly affecting gene structure or regulation in different ways, and by predisposing to other secondary arrangements in the offspring of inversion carriers. Finally, several inversions show signals of being selected during human evolution. These findings illustrate the potential of inversions to have phenotypic consequences also in humans and emphasize the importance of their inclusion in genome-wide association studies. PMID:25998059
Epigenetic regulation of bud dormancy events in perennial plants

PubMed Central

Ríos, Gabino; Leida, Carmen; Conejero, Ana; Badenes, María Luisa

2014-01-01

Release of bud dormancy in perennial plants resembles vernalization in Arabidopsis thaliana and cereals. In both cases, a certain period of chilling is required for accomplishing the reproductive phase, and several transcription factors with the MADS-box domain perform a central regulatory role in these processes. The expression of DORMANCY-ASSOCIATED MADS-box (DAM)-related genes has been found to be up-regulated in dormant buds of numerous plant species, such as poplar, raspberry, leafy spurge, blackcurrant, Japanese apricot, and peach. Moreover, functional evidence suggests the involvement of DAM genes in the regulation of seasonal dormancy in peach. Recent findings highlight the presence of genome-wide epigenetic modifications related to dormancy events, and more specifically the epigenetic regulation of DAM-related genes in a similar way to FLOWERING LOCUS C, a key integrator of vernalization effectors on flowering initiation in Arabidopsis. We review the most relevant molecular and genomic contributions in the field of bud dormancy, and discuss the increasing evidence for chromatin modification involvement in the epigenetic regulation of seasonal dormancy cycles in perennial plants. PMID:24917873
Genome-wide DNA methylation sequencing reveals miR-663a is a novel epimutation candidate in CIMP-high endometrial cancer

PubMed Central

Yanokura, Megumi; Banno, Kouji; Adachi, Masataka; Aoki, Daisuke; Abe, Kuniya

2017-01-01

Aberrant DNA methylation is widely observed in many cancers. Concurrent DNA methylation of multiple genes occurs in endometrial cancer and is referred to as the CpG island methylator phenotype (CIMP). However, the features and causes of CIMP-positive endometrial cancer are not well understood. To investigate DNA methylation features characteristic of CIMP-positive endometrial cancer, we first classified samples from 25 patients with endometrial cancer based on the methylation status of three genes, i.e. MLH1, CDH1 (E-cadherin) and APC: CIMP-high (CIMP-H, 2/25, 8.0%), CIMP-low (CIMP-L, 7/25, 28.0%) and CIMP-negative (CIMP(-), 16/25, 64.0%). We then selected two samples each from the CIMP-H and CIMP(-) classes, and analyzed the DNA methylation status of both normal (peripheral blood cells: PBCs) and cancer tissues by genome-wide, targeted bisulfite sequencing. Genomes of the CIMP-H cancer tissues were significantly hypermethylated compared to those of the CIMP(-) tissues. Surprisingly, in normal tissues of the CIMP-H patients, the promoter region of the miR-663a locus is hypermethylated relative to CIMP(-) samples. Consistent with this finding, miR-663a expression was lower in the CIMP-H PBCs than in the CIMP(-) PBCs. The same region of the miR-663a locus is found to be highly methylated in cancer tissues of both CIMP-H and CIMP(-) cases. This is the first report showing that aberrant DNA methylation of the miR-663a promoter can occur in normal tissue of cancer patients, suggesting a possible link between this epigenetic abnormality and endometrial cancer. This raises the possibility that the hypermethylation of the miR-663a promoter represents an epimutation associated with CIMP-H endometrial cancers. Based on these findings, the relationship between aberrant DNA methylation and the CIMP-H phenotype is discussed. PMID:28440489
Genome-wide DNA methylation sequencing reveals miR-663a is a novel epimutation candidate in CIMP-high endometrial cancer.

PubMed

Yanokura, Megumi; Banno, Kouji; Adachi, Masataka; Aoki, Daisuke; Abe, Kuniya

2017-06-01

Aberrant DNA methylation is widely observed in many cancers. Concurrent DNA methylation of multiple genes occurs in endometrial cancer and is referred to as the CpG island methylator phenotype (CIMP). However, the features and causes of CIMP-positive endometrial cancer are not well understood. To investigate DNA methylation features characteristic of CIMP-positive endometrial cancer, we first classified samples from 25 patients with endometrial cancer based on the methylation status of three genes, i.e. MLH1, CDH1 (E-cadherin) and APC: CIMP-high (CIMP-H, 2/25, 8.0%), CIMP-low (CIMP-L, 7/25, 28.0%) and CIMP-negative (CIMP(-), 16/25, 64.0%). We then selected two samples each from the CIMP-H and CIMP(-) classes, and analyzed the DNA methylation status of both normal (peripheral blood cells: PBCs) and cancer tissues by genome-wide, targeted bisulfite sequencing. Genomes of the CIMP-H cancer tissues were significantly hypermethylated compared to those of the CIMP(-) tissues. Surprisingly, in normal tissues of the CIMP-H patients, the promoter region of the miR-663a locus is hypermethylated relative to CIMP(-) samples. Consistent with this finding, miR-663a expression was lower in the CIMP-H PBCs than in the CIMP(-) PBCs. The same region of the miR-663a locus is found to be highly methylated in cancer tissues of both CIMP-H and CIMP(-) cases. This is the first report showing that aberrant DNA methylation of the miR-663a promoter can occur in normal tissue of cancer patients, suggesting a possible link between this epigenetic abnormality and endometrial cancer. This raises the possibility that the hypermethylation of the miR-663a promoter represents an epimutation associated with CIMP-H endometrial cancers. Based on these findings, the relationship between aberrant DNA methylation and the CIMP-H phenotype is discussed.
A Genome Wide Association Study Identifies Common Variants Associated with Lipid Levels in the Chinese Population

PubMed Central

Wu, Chen; Yang, Handong; Yu, Dianke; Yang, Xiaobo; Zhang, Xiaomin; Wang, Yiqin; Sun, Jielin; Gao, Yong; Tan, Aihua; He, Yunfeng; Zhang, Haiying; Qin, Xue; Zhu, Jingwen; Li, Huaixing; Lin, Xu; Zhu, Jiang; Min, Xinwen; Lang, Mingjian; Li, Dongfeng; Zhai, Kan; Chang, Jiang; Tan, Wen; Yuan, Jing; Chen, Weihong; Wang, Youjie; Wei, Sheng; Miao, Xiaoping; Wang, Feng; Fang, Weimin; Liang, Yuan; Deng, Qifei; Dai, Xiayun; Lin, Dafeng; Huang, Suli; Guo, Huan; Lilly Zheng, S.; Xu, Jianfeng; Lin, Dongxin; Hu, Frank B.; Wu, Tangchun

2013-01-01

Plasma lipid levels are important risk factors for cardiovascular disease and are influenced by genetic and environmental factors. Recent genome wide association studies (GWAS) have identified several lipid-associated loci, but these loci have been identified primarily in European populations. In order to identify genetic markers for lipid levels in a Chinese population and analyze the heterogeneity between Europeans and Asians, especially Chinese, we performed a meta-analysis of two genome wide association studies on four common lipid traits including total cholesterol (TC), triglycerides (TG), low-density lipoprotein cholesterol (LDL) and high-density lipoprotein cholesterol (HDL) in a Han Chinese population totaling 3,451 healthy subjects. Replication was performed in an additional 8,830 subjects of Han Chinese ethnicity. We replicated eight loci associated with lipid levels previously reported in a European population. The loci genome wide significantly associated with TC were near DOCK7, HMGCR and ABO; those genome wide significantly associated with TG were near APOA1/C3/A4/A5 and LPL; those genome wide significantly associated with LDL were near HMGCR, ABO and TOMM40; and those genome wide significantly associated with HDL were near LPL, LIPC and CETP. In addition, an additive genotype score of eight SNPs representing the eight loci that were found to be associated with lipid levels was associated with higher TC, TG and LDL levels (P = 5.52 × 10−16, 1.38 × 10−6 and 5.59 × 10−9, respectively). These findings suggest the cumulative effects of multiple genetic loci on plasma lipid levels. Comparisons with previous GWAS of lipids highlight heterogeneity in allele frequency and in effect size for some loci between Chinese and European populations. The results from our GWAS provided comprehensive and convincing evidence of the genetic determinants of plasma lipid levels in a Chinese population. PMID:24386095
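An additive genotype score of the kind mentioned above is often computed as a simple count of trait-raising alleles carried across the selected SNPs (0, 1 or 2 per SNP). The abstract does not say whether the authors' score was weighted, so the sketch below assumes an unweighted count; the SNP identifiers, risk alleles and genotypes are hypothetical placeholders rather than the eight SNPs from the study.

```python
def additive_genotype_score(genotypes: dict[str, str], risk_alleles: dict[str, str]) -> int:
    """Sum the number of risk alleles carried at each SNP (0, 1 or 2 per SNP)."""
    score = 0
    for snp, risk_allele in risk_alleles.items():
        # A genotype is written as two allele letters, e.g. "AG".
        score += genotypes[snp].count(risk_allele)
    return score


# Hypothetical SNPs and risk alleles (placeholders, not taken from the study).
risk_alleles = {"snp1": "A", "snp2": "G", "snp3": "T"}

# Hypothetical individual: homozygous risk, heterozygous, homozygous non-risk.
individual = {"snp1": "AA", "snp2": "AG", "snp3": "CC"}

print(additive_genotype_score(individual, risk_alleles))
```

A weighted variant would multiply each allele count by that SNP's estimated effect size before summing; which form was used here cannot be determined from the abstract.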
Genome-wide analysis of Dongxiang wild rice (Oryza rufipogon Griff.) to investigate lost/acquired genes during rice domestication.

PubMed

Zhang, Fantao; Xu, Tao; Mao, Linyong; Yan, Shuangyong; Chen, Xiwen; Wu, Zhenfeng; Chen, Rui; Luo, Xiangdong; Xie, Jiankun; Gao, Shan

2016-04-26

It is widely accepted that cultivated rice (Oryza sativa L.) was domesticated from common wild rice (Oryza rufipogon Griff.). In contrast to other studies, which concentrate on rice origin, this study aims to elucidate genetically the substantial phenotypic and physiological changes from wild rice to cultivated rice at the whole genome level. Instead of comparing two assembled genomes, this study directly compared the Dongxiang wild rice (DXWR) Illumina sequencing reads with the Nipponbare (O. sativa) complete genome without assembly of the DXWR genome. Based on the results from the comparative genomics analysis, structural variations (SVs) between DXWR and Nipponbare were determined to locate deleted genes which could have been acquired by Nipponbare during rice domestication. To overcome the limit of the SV detection, the DXWR transcriptome was also sequenced and compared with the Nipponbare transcriptome to discover the genes which could have been lost in DXWR during domestication. The 1591 Nipponbare-acquired genes and 206 DXWR-lost transcripts were further analyzed using annotations from multiple sources. The NGS data are available in the NCBI SRA database with ID SRP070627. These results help to better understand the domestication from wild rice to cultivated rice at the whole genome level and provide a genomic data resource for rice genetic research or breeding. One finding confirmed that transposable elements contribute greatly to the genome evolution from wild rice to cultivated rice. Another finding suggested that the photophosphorylation and oxidative phosphorylation systems in cultivated rice could have adapted to environmental changes simultaneously during domestication.
Uncovering drug-responsive regulatory elements

PubMed Central

Luizon, Marcelo R; Ahituv, Nadav

2015-01-01

Nucleotide changes in gene regulatory elements can have a major effect on interindividual differences in drug response. For example, by reviewing all published pharmacogenomic genome-wide association studies, we show here that 96.4% of the associated single nucleotide polymorphisms reside in noncoding regions. We discuss how sequencing technologies are improving our ability to identify drug response-associated regulatory elements genome-wide and to annotate nucleotide variants within them. We highlight specific examples of how nucleotide changes in these elements can affect drug response and illustrate the techniques used to find them and functionally characterize them. Finally, we also discuss challenges in the field of drug-responsive regulatory elements that need to be considered in order to translate these findings into the clinic. PMID:26555224
Exploring hypertension genome-wide association studies findings and impact on pathophysiology, pathways, and pharmacogenetics.

PubMed

Cabrera, Claudia P; Ng, Fu Liang; Warren, Helen R; Barnes, Michael R; Munroe, Patricia B; Caulfield, Mark J

2015-01-01

Hypertension is a major risk factor for global mortality. Recent genome-wide association studies (GWAS) have led to successful identification of many genetic loci influencing blood pressure, although these studies account for less than 5% of heritability. While genetic discovery efforts continue, it is timely to pause and reflect on what information has been gained to date from reported loci. Knowledge from GWAS findings informs our understanding of the pathways and pleiotropy underpinning hypertension and aids in the identification of potential druggable targets. By reviewing blood pressure loci we aim to determine how much potential the current observations have for future clinical utility. The authors have declared no conflicts of interest for this article. © 2015 Wiley Periodicals, Inc.
Genome-wide association study identifies three novel loci in Fuchs endothelial corneal dystrophy

PubMed Central

Afshari, Natalie A.; Igo, Robert P.; Morris, Nathan J.; Stambolian, Dwight; Sharma, Shiwani; Pulagam, V. Lakshmi; Dunn, Steven; Stamler, John F.; Truitt, Barbara J.; Rimmler, Jacqueline; Kuot, Abraham; Croasdale, Christopher R.; Qin, Xuejun; Burdon, Kathryn P.; Riazuddin, S. Amer; Mills, Richard; Klebe, Sonja; Minear, Mollie A.; Zhao, Jiagang; Balajonda, Elmer; Rosenwasser, George O.; Baratz, Keith H; Mootha, V. Vinod; Patel, Sanjay V.; Gregory, Simon G.; Bailey-Wilson, Joan E.; Price, Marianne O.; Price, Francis W.; Craig, Jamie E.; Fingert, John H.; Gottsch, John D.; Aldave, Anthony J.; Klintworth, Gordon K.; Lass, Jonathan H.; Li, Yi-Ju; Iyengar, Sudha K.

2017-01-01

The structure of the cornea is vital to its transparency, and dystrophies that disrupt corneal organization are highly heritable. To understand the genetic aetiology of Fuchs endothelial corneal dystrophy (FECD), the most prevalent corneal disorder requiring transplantation, we conducted a genome-wide association study (GWAS) on 1,404 FECD cases and 2,564 controls of European ancestry, followed by replication and meta-analysis, for a total of 2,075 cases and 3,342 controls. We identify three novel loci meeting genome-wide significance (P<5 × 10−8): KANK4 rs79742895, LAMC1 rs3768617 and LINC00970/ATP1B1 rs1200114. We also observe an overwhelming effect of the established TCF4 locus. Interestingly, we detect differential sex-specific association at LAMC1, with greater risk in women, and TCF4, with greater risk in men. Combining GWAS results with biological evidence we expand the knowledge of common FECD loci from one to four, and provide a deeper understanding of the underlying pathogenic basis of FECD. PMID:28358029
Genome-wide association analysis identifies three new risk loci for gout arthritis in Han Chinese

PubMed Central

Li, Changgui; Li, Zhiqiang; Liu, Shiguo; Wang, Can; Han, Lin; Cui, Lingling; Zhou, Jingguo; Zou, Hejian; Liu, Zhen; Chen, Jianhua; Cheng, Xiaoyu; Zhou, Zhaowei; Ding, Chengcheng; Wang, Meng; Chen, Tong; Cui, Ying; He, Hongmei; Zhang, Keke; Yin, Congcong; Wang, Yunlong; Xing, Shichao; Li, Baojie; Ji, Jue; Jia, Zhaotong; Ma, Lidan; Niu, Jiapeng; Xin, Ying; Liu, Tian; Chu, Nan; Yu, Qing; Ren, Wei; Wang, Xuefeng; Zhang, Aiqing; Sun, Yuping; Wang, Haili; Lu, Jie; Li, Yuanyuan; Qing, Yufeng; Chen, Gang; Wang, Yangang; Zhou, Li; Niu, Haitao; Liang, Jun; Dong, Qian; Li, Xinde; Mi, Qing-Sheng; Shi, Yongyong

2015-01-01

Gout is one of the most common types of inflammatory arthritis, caused by the deposition of monosodium urate crystals in and around the joints. Previous genome-wide association studies (GWASs) have identified many genetic loci associated with raised serum urate concentrations. However, hyperuricemia alone is not sufficient for the development of gout arthritis. Here we conduct a multistage GWAS in Han Chinese using 4,275 male gout patients and 6,272 normal male controls (1,255 cases and 1,848 controls were genome-wide genotyped), with an additional 1,644 hyperuricemic controls. We discover three new risk loci, 17q23.2 (rs11653176, P=1.36 × 10−13, BCAS3), 9p24.2 (rs12236871, P=1.48 × 10−10, RFX3) and 11p15.5 (rs179785, P=1.28 × 10−8, KCNQ1), which contain inflammatory candidate genes. Our results suggest that these loci are most likely related to the progression from hyperuricemia to inflammatory gout, which will provide new insights into the pathogenesis of gout arthritis. PMID:25967671
Genome-wide association analysis identifies three new risk loci for gout arthritis in Han Chinese.

PubMed

Li, Changgui; Li, Zhiqiang; Liu, Shiguo; Wang, Can; Han, Lin; Cui, Lingling; Zhou, Jingguo; Zou, Hejian; Liu, Zhen; Chen, Jianhua; Cheng, Xiaoyu; Zhou, Zhaowei; Ding, Chengcheng; Wang, Meng; Chen, Tong; Cui, Ying; He, Hongmei; Zhang, Keke; Yin, Congcong; Wang, Yunlong; Xing, Shichao; Li, Baojie; Ji, Jue; Jia, Zhaotong; Ma, Lidan; Niu, Jiapeng; Xin, Ying; Liu, Tian; Chu, Nan; Yu, Qing; Ren, Wei; Wang, Xuefeng; Zhang, Aiqing; Sun, Yuping; Wang, Haili; Lu, Jie; Li, Yuanyuan; Qing, Yufeng; Chen, Gang; Wang, Yangang; Zhou, Li; Niu, Haitao; Liang, Jun; Dong, Qian; Li, Xinde; Mi, Qing-Sheng; Shi, Yongyong

2015-05-13

Gout is one of the most common types of inflammatory arthritis, caused by the deposition of monosodium urate crystals in and around the joints. Previous genome-wide association studies (GWASs) have identified many genetic loci associated with raised serum urate concentrations. However, hyperuricemia alone is not sufficient for the development of gout arthritis. Here we conduct a multistage GWAS in Han Chinese using 4,275 male gout patients and 6,272 normal male controls (1,255 cases and 1,848 controls were genome-wide genotyped), with an additional 1,644 hyperuricemic controls. We discover three new risk loci, 17q23.2 (rs11653176, P=1.36 × 10(-13), BCAS3), 9p24.2 (rs12236871, P=1.48 × 10(-10), RFX3) and 11p15.5 (rs179785, P=1.28 × 10(-8), KCNQ1), which contain inflammatory candidate genes. Our results suggest that these loci are most likely related to the progression from hyperuricemia to inflammatory gout, which will provide new insights into the pathogenesis of gout arthritis.
CHESS (CgHExpreSS): a comprehensive analysis tool for the analysis of genomic alterations and their effects on the expression profile of the genome.

PubMed

Lee, Mikyung; Kim, Yangseok

2009-12-16

Genomic alterations frequently occur in many cancer patients and play important mechanistic roles in the pathogenesis of cancer. Furthermore, they can modify the expression level of genes due to altered copy number in the corresponding region of the chromosome. An accumulating body of evidence supports the possibility that strong genome-wide correlation exists between DNA content and gene expression. Therefore, more comprehensive analysis is needed to quantify the relationship between genomic alteration and gene expression. A well-designed bioinformatics tool is essential to perform this kind of integrative analysis. A few programs have already been introduced for integrative analysis. However, published software remains limited in its capacity for comprehensive integrative analysis because of constraints in the implemented algorithms and visualization modules. To address this issue, we have implemented the Java-based program CHESS to allow integrative analysis of two experimental data sets: genomic alteration and genome-wide expression profile. CHESS is composed of a genomic alteration analysis module and an integrative analysis module. The genomic alteration analysis module detects genomic alteration by applying a threshold-based method or the SW-ARRAY algorithm and investigates whether the detected alteration is phenotype-specific or not. The integrative analysis module, in turn, measures the genomic alteration's influence on gene expression. It is divided into two separate parts. The first part calculates the overall correlation between comparative genomic hybridization ratio and gene expression level by applying the following three statistical methods: simple linear regression, Spearman rank correlation and Pearson's correlation. In the second part, CHESS detects the genes that are differentially expressed according to the genomic alteration pattern with three alternative statistical approaches: Student's t-test, Fisher's exact test and chi-square test. By successive operation of the two modules, users can clarify how gene expression levels are affected by the phenotype-specific genomic alterations. Because CHESS was developed for both Java application and web environments, it can be run in a web browser or on a local machine. It also supports all experimental platforms if a properly formatted text file is provided that includes the chromosomal position of probes and their gene identifiers. CHESS is a user-friendly tool for investigating disease-specific genomic alterations and quantitative relationships between those genomic alterations and genome-wide gene expression profiling.
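As a rough illustration of the first part of the integrative analysis described in this abstract, the sketch below computes Pearson's correlation and the Spearman rank correlation between a copy-number ratio and an expression level. It is not CHESS code (CHESS is a Java program); the function names and the toy values in `cgh_log_ratio` and `expression` are hypothetical and chosen only for demonstration.

```python
from statistics import mean


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson's correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: one value per probe or gene.
cgh_log_ratio = [-0.8, -0.2, 0.0, 0.3, 0.9, 1.4]
expression = [4.1, 5.0, 5.2, 5.9, 7.5, 7.1]

print("Pearson:", pearson(cgh_log_ratio, expression))
print("Spearman:", spearman(cgh_log_ratio, expression))
```

The rank-based coefficient is less sensitive to outlying expression values than Pearson's, which may be one reason a tool would offer both alongside simple linear regression.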
Genome-wide association study of a nicotine metabolism biomarker in African American smokers: impact of chromosome 19 genetic influences.

PubMed

Chenoweth, Meghan J; Ware, Jennifer J; Zhu, Andy Z X; Cole, Christopher B; Cox, Lisa Sanderson; Nollen, Nikki; Ahluwalia, Jasjit S; Benowitz, Neal L; Schnoll, Robert A; Hawk, Larry W; Cinciripini, Paul M; George, Tony P; Lerman, Caryn; Knight, Joanne; Tyndale, Rachel F

2018-03-01

The activity of CYP2A6, the major nicotine-inactivating enzyme, is measurable in smokers using the nicotine metabolite ratio (NMR; 3'hydroxycotinine/cotinine). Due to its role in nicotine clearance, the NMR is associated with smoking behaviours and response to pharmacotherapies. The NMR is highly heritable (~80%), and on average lower in African Americans (AA) versus whites. We previously identified several reduced- and loss-of-function CYP2A6 variants common in individuals of African descent. Our current aim was to identify novel genetic influences on the NMR in AA smokers using genome-wide approaches. Genome-wide association study (GWAS). Multiple sites within Canada and the United States. AA smokers from two clinical trials: Pharmacogenetics of Nicotine Addiction Treatment (PNAT)-2 (NCT01314001; n = 504) and Kick-it-at-Swope (KIS)-3 (NCT00666978; n = 450). Genome-wide SNP genotyping, the NMR (phenotype) and population substructure and NMR covariates. Meta-analysis revealed three independent chromosome 19 signals (rs12459249, rs111645190 and rs185430475) associated with the NMR. The top overall hit, rs12459249 (P = 1.47e-39; beta = 0.59 per C (versus T) allele, SE = 0.045), located ~9.5 kb 3' of CYP2A6, remained genome-wide significant after controlling for the common (~10% in AA) non-functional CYP2A6*17 allele. In contrast, rs111645190 and rs185430475 were not genome-wide significant when controlling for CYP2A6*17. In total, 96 signals associated with the NMR were identified; many were not found in prior NMR GWASs in individuals of European descent. The top hits were also associated with the NMR in a third cohort of AA (KIS2; n = 480). None of the hits were in UGT or OCT2 genes. Three independent chromosome 19 signals account for ~20% of the variability in the nicotine metabolite ratio in African American smokers. The hits identified may contribute to inter-ethnic variability in nicotine metabolism, smoking behaviours and tobacco-related disease risk. © 2017 Society for the Study of Addiction.
The clinical application of genome-wide sequencing for monogenic diseases in Canada: Position Statement of the Canadian College of Medical Geneticists

PubMed Central

Boycott, Kym; Hartley, Taila; Adam, Shelin; Bernier, Francois; Chong, Karen; Fernandez, Bridget A; Friedman, Jan M; Geraghty, Michael T; Hume, Stacey; Knoppers, Bartha M; Laberge, Anne-Marie; Majewski, Jacek; Mendoza-Londono, Roberto; Meyn, M Stephen; Michaud, Jacques L; Nelson, Tanya N; Richer, Julie; Sadikovic, Bekim; Skidmore, David L; Stockley, Tracy; Taylor, Sherry; van Karnebeek, Clara; Zawati, Ma'n H; Lauzon, Julie; Armour, Christine M

2015-01-01

Purpose and scope The aim of this Position Statement is to provide recommendations for Canadian medical geneticists, clinical laboratory geneticists, genetic counsellors and other physicians regarding the use of genome-wide sequencing of germline DNA in the context of clinical genetic diagnosis. This statement has been developed to facilitate the clinical translation and development of best practices for clinical genome-wide sequencing for genetic diagnosis of monogenic diseases in Canada; it does not address the clinical application of this technology in other fields such as molecular investigation of cancer or for population screening of healthy individuals. Methods of statement development Two multidisciplinary groups consisting of medical geneticists, clinical laboratory geneticists, genetic counsellors, ethicists, lawyers and genetic researchers were assembled to review existing literature and guidelines on genome-wide sequencing for clinical genetic diagnosis in the context of monogenic diseases, and to make recommendations relevant to the Canadian context. The statement was circulated for comment to the Canadian College of Medical Geneticists (CCMG) membership-at-large and, following incorporation of feedback, approved by the CCMG Board of Directors. The CCMG is a Canadian organisation responsible for certifying medical geneticists and clinical laboratory geneticists, and for establishing professional and ethical standards for clinical genetics services in Canada. 
Results and conclusions Recommendations include (1) clinical genome-wide sequencing is an appropriate approach in the diagnostic assessment of a patient for whom there is suspicion of a significant monogenic disease that is associated with a high degree of genetic heterogeneity, or where specific genetic tests have failed to provide a diagnosis; (2) until the benefits of reporting incidental findings are established, we do not endorse the intentional clinical analysis of disease-associated genes other than those linked to the primary indication; and (3) clinicians should provide genetic counselling and obtain informed consent prior to undertaking clinical genome-wide sequencing. Counselling should include discussion of the limitations of testing, likelihood and implications of diagnosis and incidental findings, and the potential need for further analysis to facilitate clinical interpretation, including studies performed in a research setting. These recommendations will be routinely re-evaluated as knowledge of diagnostic and clinical utility of clinical genome-wide sequencing improves. While the document was developed to direct practice in Canada, the applicability of the statement is broader and will be of interest to clinicians and health jurisdictions internationally. PMID:25951830
The clinical application of genome-wide sequencing for monogenic diseases in Canada: Position Statement of the Canadian College of Medical Geneticists.

PubMed

Boycott, Kym; Hartley, Taila; Adam, Shelin; Bernier, Francois; Chong, Karen; Fernandez, Bridget A; Friedman, Jan M; Geraghty, Michael T; Hume, Stacey; Knoppers, Bartha M; Laberge, Anne-Marie; Majewski, Jacek; Mendoza-Londono, Roberto; Meyn, M Stephen; Michaud, Jacques L; Nelson, Tanya N; Richer, Julie; Sadikovic, Bekim; Skidmore, David L; Stockley, Tracy; Taylor, Sherry; van Karnebeek, Clara; Zawati, Ma'n H; Lauzon, Julie; Armour, Christine M

2015-07-01

The aim of this Position Statement is to provide recommendations for Canadian medical geneticists, clinical laboratory geneticists, genetic counsellors and other physicians regarding the use of genome-wide sequencing of germline DNA in the context of clinical genetic diagnosis. This statement has been developed to facilitate the clinical translation and development of best practices for clinical genome-wide sequencing for genetic diagnosis of monogenic diseases in Canada; it does not address the clinical application of this technology in other fields such as molecular investigation of cancer or for population screening of healthy individuals. Two multidisciplinary groups consisting of medical geneticists, clinical laboratory geneticists, genetic counsellors, ethicists, lawyers and genetic researchers were assembled to review existing literature and guidelines on genome-wide sequencing for clinical genetic diagnosis in the context of monogenic diseases, and to make recommendations relevant to the Canadian context. The statement was circulated for comment to the Canadian College of Medical Geneticists (CCMG) membership-at-large and, following incorporation of feedback, approved by the CCMG Board of Directors. The CCMG is a Canadian organisation responsible for certifying medical geneticists and clinical laboratory geneticists, and for establishing professional and ethical standards for clinical genetics services in Canada. 
Recommendations include (1) clinical genome-wide sequencing is an appropriate approach in the diagnostic assessment of a patient for whom there is suspicion of a significant monogenic disease that is associated with a high degree of genetic heterogeneity, or where specific genetic tests have failed to provide a diagnosis; (2) until the benefits of reporting incidental findings are established, we do not endorse the intentional clinical analysis of disease-associated genes other than those linked to the primary indication; and (3) clinicians should provide genetic counselling and obtain informed consent prior to undertaking clinical genome-wide sequencing. Counselling should include discussion of the limitations of testing, likelihood and implications of diagnosis and incidental findings, and the potential need for further analysis to facilitate clinical interpretation, including studies performed in a research setting. These recommendations will be routinely re-evaluated as knowledge of diagnostic and clinical utility of clinical genome-wide sequencing improves. While the document was developed to direct practice in Canada, the applicability of the statement is broader and will be of interest to clinicians and health jurisdictions internationally. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Longitudinal analyses of the DNA methylome in deployed military servicemen identify susceptibility loci for post-traumatic stress disorder.

PubMed

Rutten, B P F; Vermetten, E; Vinkers, C H; Ursini, G; Daskalakis, N P; Pishva, E; de Nijs, L; Houtepen, L C; Eijssen, L; Jaffe, A E; Kenis, G; Viechtbauer, W; van den Hove, D; Schraut, K G; Lesch, K-P; Kleinman, J E; Hyde, T M; Weinberger, D R; Schalkwyk, L; Lunnon, K; Mill, J; Cohen, H; Yehuda, R; Baker, D G; Maihofer, A X; Nievergelt, C M; Geuze, E; Boks, M P M

2018-05-01

In order to determine the impact of the epigenetic response to traumatic stress on post-traumatic stress disorder (PTSD), this study examined longitudinal changes of genome-wide blood DNA methylation profiles in relation to the development of PTSD symptoms in two prospective military cohorts (one discovery and one replication data set). In the first cohort consisting of male Dutch military servicemen (n=93), the emergence of PTSD symptoms over a deployment period to a combat zone was significantly associated with alterations in DNA methylation levels at 17 genomic positions and 12 genomic regions. Evidence for mediation of the relation between combat trauma and PTSD symptoms by longitudinal changes in DNA methylation was observed at several positions and regions. Bioinformatic analyses of the reported associations identified significant enrichment in several pathways relevant for symptoms of PTSD. Targeted analyses of the significant findings from the discovery sample in an independent prospective cohort of male US marines (n=98) replicated the observed relation between decreases in DNA methylation levels and PTSD symptoms at genomic regions in ZFP57, RNF39 and HIST1H2APS2. Together, our study pinpoints three novel genomic regions where longitudinal decreases in DNA methylation across the period of exposure to combat trauma mark susceptibility for PTSD.
Genome-wide association studies in Africans and African Americans: Expanding the Framework of the Genomics of Human Traits and Disease

PubMed Central

Peprah, Emmanuel; Xu, Huichun; Tekola-Ayele, Fasil; Royal, Charmaine D.

2014-01-01

Genomic research is one of the tools for elucidating the pathogenesis of diseases of global health relevance, and paving the research dimension to clinical and public health translation. Recent advances in genomic research and technologies have increased our understanding of human diseases, genes associated with these disorders, and the relevant mechanisms. Genome-wide association studies (GWAS) have proliferated since the first studies were published several years ago, and have become an important tool in helping researchers comprehend human variation and the role genetic variants play in disease. However, the need to expand the diversity of populations in GWAS has become increasingly apparent as new knowledge is gained about genetic variation. Inclusion of diverse populations in genomic studies is critical to a more complete understanding of human variation and elucidation of the underpinnings of complex diseases. In this review, we summarize the available data on GWAS in recent-African ancestry populations within the western hemisphere (i.e. African Americans and peoples of the Caribbean) and continental African populations. Furthermore, we highlight ways in which genomic studies in populations of recent African ancestry have led to advances in the areas of malaria, HIV, prostate cancer, and other diseases. Finally, we discuss the advantages of conducting GWAS in recent African ancestry populations in the context of addressing existing and emerging global health conditions. PMID:25427668
Genome-wide Selective Sweeps in Natural Bacterial Populations Revealed by Time-series Metagenomics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie

2014-06-18

Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ‘ecotype model’ of diversification, but not previously observed in natural populations.

Genome-wide Selective Sweeps in Natural Bacterial Populations Revealed by Time-series Metagenomics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chan, Leong-Keat; Bendall, Matthew L.; Malfatti, Stephanie

2014-05-12

Multiple evolutionary models have been proposed to explain the formation of genetically and ecologically distinct bacterial groups. Time-series metagenomics enables direct observation of evolutionary processes in natural populations, and if applied over a sufficiently long time frame, this approach could capture events such as gene-specific or genome-wide selective sweeps. Direct observations of either process could help resolve how distinct groups form in natural microbial assemblages. Here, from a three-year metagenomic study of a freshwater lake, we explore changes in single nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in populations of Chlorobiaceae and Methylophilaceae. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied considerably among closely related, co-occurring Methylophilaceae populations. SNP allele frequencies, as well as the relative abundance of certain genes, changed dramatically over time in each population. Interestingly, SNP diversity was purged at nearly every genome position in one of the Chlorobiaceae populations over the course of three years, while at the same time multiple genes either swept through or were swept from this population. These patterns were consistent with a genome-wide selective sweep, a process predicted by the ‘ecotype model’ of diversification, but not previously observed in natural populations.
Genomic signatures of positive selection in humans and the limits of outlier approaches.

PubMed

Kelley, Joanna L; Madeoy, Jennifer; Calhoun, John C; Swanson, Willie; Akey, Joshua M

2006-08-01

Identifying regions of the human genome that have been targets of positive selection will provide important insights into recent human evolutionary history and may facilitate the search for complex disease genes. However, the confounding effects of population demographic history and selection on patterns of genetic variation complicate inferences of selection when a small number of loci are studied. To this end, identifying outlier loci from empirical genome-wide distributions of genetic variation is a promising strategy to detect targets of selection. Here, we evaluate the power and efficiency of a simple outlier approach and describe a genome-wide scan for positive selection using a dense catalog of 1.58 million SNPs that were genotyped in three human populations. In total, we analyzed 14,589 genes, 385 of which possess patterns of genetic variation consistent with the hypothesis of positive selection. Furthermore, several extended genomic regions were found, spanning >500 kb, that contained multiple contiguous candidate selection genes. More generally, these data provide important practical insights into the limits of outlier approaches in genome-wide scans for selection, provide strong candidate selection genes to study in greater detail, and may have important implications for disease related research.
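The outlier strategy this abstract evaluates can be sketched in a few lines: rank loci by a summary statistic and flag those in the extreme tail of the empirical genome-wide distribution. The sketch below is only an illustration under stated assumptions; the statistic, the `tail_fraction` cutoff and the gene names are hypothetical and are not taken from the study.

```python
def empirical_outliers(stat_by_gene, tail_fraction=0.01):
    """Return the genes whose statistic falls in the upper empirical tail.

    stat_by_gene: mapping of gene name to a per-gene summary statistic
    (hypothetical; any measure where larger means more extreme).
    tail_fraction: share of genes to flag (an arbitrary choice here).
    """
    ranked = sorted(stat_by_gene.items(), key=lambda kv: kv[1], reverse=True)
    n_keep = max(1, int(len(ranked) * tail_fraction))
    return [gene for gene, _ in ranked[:n_keep]]


# Hypothetical toy input: 1,000 genes with evenly spaced statistic values.
toy_stats = {f"gene{i}": i / 1000 for i in range(1000)}
candidates = empirical_outliers(toy_stats, tail_fraction=0.01)
print(candidates)
```

A fixed tail fraction always flags the same number of loci whether or not selection has acted, which is one of the limits of outlier approaches that the abstract refers to.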
Chromatin Landscapes of Retroviral and Transposon Integration Profiles

PubMed Central

Badhai, Jitendra; Rust, Alistair G.; Rad, Roland; Hilkens, John; Berns, Anton; van Lohuizen, Maarten; Wessels, Lodewyk F. A.; de Ridder, Jeroen

2014-01-01

The ability of retroviruses and transposons to insert their genetic material into host DNA makes them widely used tools in molecular biology, cancer research and gene therapy. However, these systems have biases that may strongly affect research outcomes. To address this issue, we generated very large datasets consisting of to unselected integrations in the mouse genome for the Sleeping Beauty (SB) and piggyBac (PB) transposons, and the Mouse Mammary Tumor Virus (MMTV). We analyzed (epi)genomic features to generate bias maps at both local and genome-wide scales. MMTV showed a remarkably uniform distribution of integrations across the genome. More distinct preferences were observed for the two transposons, with PB showing remarkable resemblance to bias profiles of the Murine Leukemia Virus. Furthermore, we present a model where target site selection is directed at multiple scales. At a large scale, target site selection is similar across systems, and defined by domain-oriented features, namely expression of proximal genes, proximity to CpG islands and to genic features, chromatin compaction and replication timing. Notable differences between the systems are mainly observed at smaller scales, and are directed by a diverse range of features. To study the effect of these biases on integration sites occupied under selective pressure, we turned to insertional mutagenesis (IM) screens. In IM screens, putative cancer genes are identified by finding frequently targeted genomic regions, or Common Integration Sites (CISs). Within three recently completed IM screens, we identified 7%–33% putative false positive CISs, which are likely not the result of the oncogenic selection process. Moreover, results indicate that PB, compared to SB, is more suited to tag oncogenes. PMID:24721906
A mega-analysis of genome-wide association studies for major depressive disorder.

PubMed

Ripke, Stephan; Wray, Naomi R; Lewis, Cathryn M; Hamilton, Steven P; Weissman, Myrna M; Breen, Gerome; Byrne, Enda M; Blackwood, Douglas H R; Boomsma, Dorret I; Cichon, Sven; Heath, Andrew C; Holsboer, Florian; Lucae, Susanne; Madden, Pamela A F; Martin, Nicholas G; McGuffin, Peter; Muglia, Pierandrea; Noethen, Markus M; Penninx, Brenda P; Pergadia, Michele L; Potash, James B; Rietschel, Marcella; Lin, Danyu; Müller-Myhsok, Bertram; Shi, Jianxin; Steinberg, Stacy; Grabe, Hans J; Lichtenstein, Paul; Magnusson, Patrik; Perlis, Roy H; Preisig, Martin; Smoller, Jordan W; Stefansson, Kari; Uher, Rudolf; Kutalik, Zoltan; Tansey, Katherine E; Teumer, Alexander; Viktorin, Alexander; Barnes, Michael R; Bettecken, Thomas; Binder, Elisabeth B; Breuer, René; Castro, Victor M; Churchill, Susanne E; Coryell, William H; Craddock, Nick; Craig, Ian W; Czamara, Darina; De Geus, Eco J; Degenhardt, Franziska; Farmer, Anne E; Fava, Maurizio; Frank, Josef; Gainer, Vivian S; Gallagher, Patience J; Gordon, Scott D; Goryachev, Sergey; Gross, Magdalena; Guipponi, Michel; Henders, Anjali K; Herms, Stefan; Hickie, Ian B; Hoefels, Susanne; Hoogendijk, Witte; Hottenga, Jouke Jan; Iosifescu, Dan V; Ising, Marcus; Jones, Ian; Jones, Lisa; Jung-Ying, Tzeng; Knowles, James A; Kohane, Isaac S; Kohli, Martin A; Korszun, Ania; Landen, Mikael; Lawson, William B; Lewis, Glyn; Macintyre, Donald; Maier, Wolfgang; Mattheisen, Manuel; McGrath, Patrick J; McIntosh, Andrew; McLean, Alan; Middeldorp, Christel M; Middleton, Lefkos; Montgomery, Grant M; Murphy, Shawn N; Nauck, Matthias; Nolen, Willem A; Nyholt, Dale R; O'Donovan, Michael; Oskarsson, Högni; Pedersen, Nancy; Scheftner, William A; Schulz, Andrea; Schulze, Thomas G; Shyn, Stanley I; Sigurdsson, Engilbert; Slager, Susan L; Smit, Johannes H; Stefansson, Hreinn; Steffens, Michael; Thorgeirsson, Thorgeir; Tozzi, Federica; Treutlein, Jens; Uhr, Manfred; van den Oord, Edwin J C G; Van Grootheest, Gerard; Völzke, Henry; Weilburg, Jeffrey B; Willemsen, Gonneke; Zitman, Frans G; Neale, Benjamin; Daly, Mark; Levinson, Douglas F; Sullivan, Patrick F

2013-04-01

Prior genome-wide association studies (GWAS) of major depressive disorder (MDD) have met with limited success. We sought to increase statistical power to detect disease loci by conducting a GWAS mega-analysis for MDD. In the MDD discovery phase, we analyzed more than 1.2 million autosomal and X chromosome single-nucleotide polymorphisms (SNPs) in 18 759 independent and unrelated subjects of recent European ancestry (9240 MDD cases and 9519 controls). In the MDD replication phase, we evaluated 554 SNPs in independent samples (6783 MDD cases and 50 695 controls). We also conducted a cross-disorder meta-analysis using 819 autosomal SNPs with P<0.0001 for either MDD or the Psychiatric GWAS Consortium bipolar disorder (BIP) mega-analysis (9238 MDD cases/8039 controls and 6998 BIP cases/7775 controls). No SNPs achieved genome-wide significance in the MDD discovery phase, the MDD replication phase or in pre-planned secondary analyses (by sex, recurrent MDD, recurrent early-onset MDD, age of onset, pre-pubertal onset MDD or typical-like MDD from a latent class analyses of the MDD criteria). In the MDD-bipolar cross-disorder analysis, 15 SNPs exceeded genome-wide significance (P<5 × 10(-8)), and all were in a 248 kb interval of high LD on 3p21.1 (chr3:52 425 083-53 822 102, minimum P=5.9 × 10(-9) at rs2535629). Although this is the largest genome-wide analysis of MDD yet conducted, its high prevalence means that the sample is still underpowered to detect genetic effects typical for complex traits. Therefore, we were unable to identify robust and replicable findings. We discuss what this means for genetic research for MDD. The 3p21.1 MDD-BIP finding should be interpreted with caution as the most significant SNP did not replicate in MDD samples, and genotyping in independent samples will be needed to resolve its status.
Ancestry-specific and sex-specific risk alleles identified in a genome-wide gene-by-alcohol dependence interaction study of risky sexual behaviors.

PubMed

Polimanti, Renato; Zhao, Hongyu; Farrer, Lindsay A; Kranzler, Henry R; Gelernter, Joel

2017-12-01

We previously mapped loci for the genome-wide association studies (GWAS) and genome-wide gene-by-alcohol dependence interaction (GW-GxAD) analyses of risky sexual behaviors (RSB). This study extends those findings by analyzing the ancestry- and sex-specific AD-stratified effects on RSB. We examined the concordance of findings for the AD-stratified GWAS and the GW-GxAD analysis of RSB, with concordance defined as genome-wide significance in one analysis and at least nominal significance in the second analysis. A total of 2,173 African-American (AA) and 1,751 European-American (EA) subjects were investigated. Information regarding RSB (lifetime experiences of unprotected sex and multiple sexual partners) and DSM-IV diagnosis of lifetime AD were derived from the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA). In our ancestry- and sex-specific analyses, we identified four independent genome-wide significant (GWS) loci (p < 5 × 10(-8)) and one suggestive locus (p < 6 × 10(-8)). In men, we observed a GWS signal in FAM162A (rs2002594, p = 4.96 × 10(-8)). In women, there was a suggestive locus in PLGRKT (rs3824435, p = 5.52 × 10(-8)). In AAs, there was a GWS signal in GRK5 (rs1316543, p = 1.25 × 10(-9)). In AA men, we observed an intergenic GWS signal (rs12898370, p = 4.49 × 10(-8)) near LINGO1. In EA men, there was a GWS signal in CCSER1 (rs62313897; p = 7.93 × 10(-10)). The loci identified in this GWAS implicate molecular mechanisms related to psychiatric illness and personality features, suggesting that the interplay between AD and RSB is mediated by alleles associated with behavioral traits. © 2017 Wiley Periodicals, Inc.
Linkage Analysis in Autoimmune Addison's Disease: NFATC1 as a Potential Novel Susceptibility Locus.

PubMed

Mitchell, Anna L; Bøe Wolff, Anette; MacArthur, Katie; Weaver, Jolanta U; Vaidya, Bijay; Erichsen, Martina M; Darlay, Rebecca; Husebye, Eystein S; Cordell, Heather J; Pearce, Simon H S

2015-01-01

Autoimmune Addison's disease (AAD) is a rare, highly heritable autoimmune endocrinopathy. It is possible that there may be some highly penetrant variants which confer disease susceptibility that have yet to be discovered. DNA samples from 23 multiplex AAD pedigrees from the UK and Norway (50 cases, 67 controls) were genotyped on the Affymetrix SNP 6.0 array. Linkage analysis was performed using Merlin. EMMAX was used to carry out a genome-wide association analysis comparing the familial AAD cases to 2706 UK WTCCC controls. To explore some of the linkage findings further, a replication study was performed by genotyping 64 SNPs in two of the four linked regions (chromosomes 7 and 18), on the Sequenom iPlex platform in three European AAD case-control cohorts (1097 cases, 1117 controls). The data were analysed using a meta-analysis approach. In a parametric analysis, applying a rare dominant model, loci on chromosomes 7, 9 and 18 had LOD scores >2.8. In a non-parametric analysis, a locus corresponding to the HLA region on chromosome 6, known to be associated with AAD, had a LOD score >3.0. In the genome-wide association analysis, a SNP cluster on chromosome 2 and a pair of SNPs on chromosome 6 were associated with AAD (P < 5 × 10(-7)). A meta-analysis of the replication study data demonstrated that three chromosome 18 SNPs were associated with AAD, including a non-synonymous variant in the NFATC1 gene. This linkage study has implicated a number of novel chromosomal regions in the pathogenesis of AAD in multiplex AAD families and adds further support to the role of HLA in AAD. The genome-wide association analysis has also identified a region of interest on chromosome 2. A replication study has demonstrated that the NFATC1 gene is worthy of future investigation; however, each of the regions identified requires further, systematic analysis.
Segment-Wise Genome-Wide Association Analysis Identifies a Candidate Region Associated with Schizophrenia in Three Independent Samples

PubMed Central

Rietschel, Marcella; Mattheisen, Manuel; Breuer, René; Schulze, Thomas G.; Nöthen, Markus M.; Levinson, Douglas; Shi, Jianxin; Gejman, Pablo V.; Cichon, Sven; Ophoff, Roel A.

2012-01-01

Recent studies suggest that variation in complex disorders (e.g., schizophrenia) is explained by a large number of genetic variants with small effect size (Odds Ratio∼1.05–1.1). The statistical power to detect these genetic variants in Genome Wide Association (GWA) studies with large numbers of cases and controls (∼15,000) is still low. As it will be difficult to further increase sample size, we decided to explore an alternative method for analyzing GWA data in a study of schizophrenia, dramatically reducing the number of statistical tests. The underlying hypothesis was that at least some of the genetic variants related to a common outcome are collocated in segments of chromosomes at a wider scale than single genes. Our approach was therefore to study the association between relatively large segments of DNA and disease status. An association test was performed for each SNP and the number of nominally significant tests in a segment was counted. We then performed a permutation-based binomial test to determine whether this region contained significantly more nominally significant SNPs than expected under the null hypothesis of no association, taking linkage into account. Genome Wide Association data of three independent schizophrenia case/control cohorts with European ancestry (Dutch, German, and US) using segments of DNA with variable length (2 to 32 Mbp) was analyzed. Using this approach we identified a region at chromosome 5q23.3-q31.3 (128–160 Mbp) that was significantly enriched with nominally associated SNPs in three independent case-control samples. We conclude that considering relatively wide segments of chromosomes may reveal reliable relationships between the genome and schizophrenia, suggesting novel methodological possibilities as well as raising theoretical questions. PMID:22723893
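The segment-wise idea in this abstract (count the nominally significant SNPs in a segment, then ask whether that count exceeds expectation) can be illustrated with a plain binomial tail probability. This is a simplified stand-in, not the authors' procedure: the study used a permutation-based test that takes linkage into account, whereas the sketch below assumes independent SNPs. The function names, the `alpha` threshold and the toy P values are hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def segment_enrichment(p_values, alpha=0.05):
    """Count nominally significant SNPs in one segment and test for excess.

    Assumes SNPs are independent, which real genotype data violate because
    of linkage disequilibrium; a permutation scheme would replace this step.
    """
    n = len(p_values)
    k = sum(1 for p in p_values if p < alpha)
    return k, binomial_upper_tail(k, n, alpha)


# Hypothetical per-SNP association P values for one chromosome segment.
segment_p_values = [0.01, 0.2, 0.5, 0.03]
count, tail_probability = segment_enrichment(segment_p_values)
print(count, tail_probability)
```

Testing one statistic per segment rather than one per SNP is what reduces the number of statistical tests, at the cost of localizing any signal only to the segment.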
Mining Genomes of Three Marine Sponge-Associated Actinobacterial Isolates for Secondary Metabolism.

PubMed

Horn, Hannes; Hentschel, Ute; Abdelmohsen, Usama Ramadan

2015-10-01

Here, we report the draft genome sequences of three actinobacterial isolates, Micromonospora sp. RV43, Rubrobacter sp. RV113, and Nocardiopsis sp. RV163 that had previously been isolated from Mediterranean sponges. The draft genomes were analyzed for the presence of gene clusters indicative of secondary metabolism using antiSMASH 3.0 and NapDos pipelines. Our findings demonstrated the chemical richness of sponge-associated actinomycetes and the efficacy of genome mining in exploring the genomic potential of sponge-derived actinomycetes. Copyright © 2015 Horn et al.
A Genome-Wide Association Study of Chronic Obstructive Pulmonary Disease in Hispanics

PubMed Central

Chen, Wei; Brehm, John M.; Manichaikul, Ani; Cho, Michael H.; Boutaoui, Nadia; Yan, Qi; Burkart, Kristin M.; Enright, Paul L.; Rotter, Jerome I.; Petersen, Hans; Leng, Shuguang; Obeidat, Ma’en; Bossé, Yohan; Brandsma, Corry-Anke; Hao, Ke; Rich, Stephen S.; Powell, Rhea; Avila, Lydiana; Soto-Quiros, Manuel; Silverman, Edwin K.; Tesfaigzi, Yohannes; Barr, R. Graham

2015-01-01

Rationale: Genome-wide association studies (GWAS) of chronic obstructive pulmonary disease (COPD) have identified disease-susceptibility loci, mostly in subjects of European descent. Objectives: We hypothesized that by studying Hispanic populations we would be able to identify unique loci that contribute to COPD pathogenesis in Hispanics but remain undetected in GWAS of non-Hispanic populations. Methods: We conducted a metaanalysis of two GWAS of COPD in independent cohorts of Hispanics in Costa Rica and the United States (Multi-Ethnic Study of Atherosclerosis [MESA]). We performed a replication study of the top single-nucleotide polymorphisms in an independent Hispanic cohort in New Mexico (the Lovelace Smokers Cohort). We also attempted to replicate prior findings from genome-wide studies in non-Hispanic populations in Hispanic cohorts. Measurements and Main Results: We found no genome-wide significant association with COPD in our metaanalysis of Costa Rica and MESA. After combining the top results from this metaanalysis with those from our replication study in the Lovelace Smokers Cohort, we identified two single-nucleotide polymorphisms approaching genome-wide significance for an association with COPD. The first (rs858249, combined P value = 6.1 × 10⁻⁸) is near the genes KLHL7 and NUPL2 on chromosome 7. The second (rs286499, combined P value = 8.4 × 10⁻⁸) is located in an intron of DLG2. The two most significant single-nucleotide polymorphisms in FAM13A from a previous genome-wide study in non-Hispanics were associated with COPD in Hispanics. Conclusions: We have identified two novel loci (in or near the genes KLHL7/NUPL2 and DLG2) that may play a role in COPD pathogenesis in Hispanic populations. PMID:25584925
A genome-wide association study of chronic obstructive pulmonary disease in Hispanics.

PubMed

Chen, Wei; Brehm, John M; Manichaikul, Ani; Cho, Michael H; Boutaoui, Nadia; Yan, Qi; Burkart, Kristin M; Enright, Paul L; Rotter, Jerome I; Petersen, Hans; Leng, Shuguang; Obeidat, Ma'en; Bossé, Yohan; Brandsma, Corry-Anke; Hao, Ke; Rich, Stephen S; Powell, Rhea; Avila, Lydiana; Soto-Quiros, Manuel; Silverman, Edwin K; Tesfaigzi, Yohannes; Barr, R Graham; Celedón, Juan C

2015-03-01

Genome-wide association studies (GWAS) of chronic obstructive pulmonary disease (COPD) have identified disease-susceptibility loci, mostly in subjects of European descent. We hypothesized that by studying Hispanic populations we would be able to identify unique loci that contribute to COPD pathogenesis in Hispanics but remain undetected in GWAS of non-Hispanic populations. We conducted a metaanalysis of two GWAS of COPD in independent cohorts of Hispanics in Costa Rica and the United States (Multi-Ethnic Study of Atherosclerosis [MESA]). We performed a replication study of the top single-nucleotide polymorphisms in an independent Hispanic cohort in New Mexico (the Lovelace Smokers Cohort). We also attempted to replicate prior findings from genome-wide studies in non-Hispanic populations in Hispanic cohorts. We found no genome-wide significant association with COPD in our metaanalysis of Costa Rica and MESA. After combining the top results from this metaanalysis with those from our replication study in the Lovelace Smokers Cohort, we identified two single-nucleotide polymorphisms approaching genome-wide significance for an association with COPD. The first (rs858249, combined P value = 6.1 × 10⁻⁸) is near the genes KLHL7 and NUPL2 on chromosome 7. The second (rs286499, combined P value = 8.4 × 10⁻⁸) is located in an intron of DLG2. The two most significant single-nucleotide polymorphisms in FAM13A from a previous genome-wide study in non-Hispanics were associated with COPD in Hispanics. We have identified two novel loci (in or near the genes KLHL7/NUPL2 and DLG2) that may play a role in COPD pathogenesis in Hispanic populations.
A genome-wide approach to children's aggressive behavior: The EAGLE consortium.

PubMed

Pappa, Irene; St Pourcain, Beate; Benke, Kelly; Cavadino, Alana; Hakulinen, Christian; Nivard, Michel G; Nolte, Ilja M; Tiesler, Carla M T; Bakermans-Kranenburg, Marian J; Davies, Gareth E; Evans, David M; Geoffroy, Marie-Claude; Grallert, Harald; Groen-Blokhuis, Maria M; Hudziak, James J; Kemp, John P; Keltikangas-Järvinen, Liisa; McMahon, George; Mileva-Seitz, Viara R; Motazedi, Ehsan; Power, Christine; Raitakari, Olli T; Ring, Susan M; Rivadeneira, Fernando; Rodriguez, Alina; Scheet, Paul A; Seppälä, Ilkka; Snieder, Harold; Standl, Marie; Thiering, Elisabeth; Timpson, Nicholas J; Veenstra, René; Velders, Fleur P; Whitehouse, Andrew J O; Smith, George Davey; Heinrich, Joachim; Hypponen, Elina; Lehtimäki, Terho; Middeldorp, Christel M; Oldehinkel, Albertine J; Pennell, Craig E; Boomsma, Dorret I; Tiemeier, Henning

2016-07-01

Individual differences in aggressive behavior emerge in early childhood and predict persisting behavioral problems and disorders. Studies of antisocial and severe aggression in adulthood indicate substantial underlying biology. However, little attention has been given to genome-wide approaches of aggressive behavior in children. We analyzed data from nine population-based studies and assessed aggressive behavior using well-validated parent-reported questionnaires. This is the largest sample exploring children's aggressive behavior to date (N = 18,988), with measures in two developmental stages (N = 15,668 early childhood and N = 16,311 middle childhood/early adolescence). First, we estimated the additive genetic variance of children's aggressive behavior based on genome-wide SNP information, using genome-wide complex trait analysis (GCTA). Second, genetic associations within each study were assessed using a quasi-Poisson regression approach, capturing the highly right-skewed distribution of aggressive behavior. Third, we performed meta-analyses of genome-wide associations for both the total age-mixed sample and the two developmental stages. Finally, we performed a gene-based test using the summary statistics of the total sample. GCTA quantified variance tagged by common SNPs (10-54%). The meta-analysis of the total sample identified one region in chromosome 2 (2p12) at near genome-wide significance (top SNP rs11126630, P = 5.30 × 10⁻⁸). The separate meta-analyses of the two developmental stages revealed suggestive evidence of association at the same locus. The gene-based analysis indicated association of variation within AVPR1A with aggressive behavior. We conclude that common variants at 2p12 show suggestive evidence for association with childhood aggression. Replication of these initial findings is needed, and further studies should clarify its biological meaning. © 2015 Wiley Periodicals, Inc.
Genome-wide Association Study Identifies Loci for the Polled Phenotype in Yak

PubMed Central

Wu, Xiaoyun; Wang, Kun; Ding, Xuezhi; Wang, Mingcheng; Chu, Min; Xie, Xiuyue; Qiu, Qiang; Yan, Ping

2016-01-01

The absence of horns, known as the polled phenotype, is an economically important trait in modern yak husbandry, but the genomic structure and genetic basis of this phenotype have yet to be discovered. Here, we conducted a genome-wide association study with a panel of 10 horned and 10 polled yaks using whole genome sequencing. We mapped the POLLED locus to a 200-kb interval, which comprises three protein-coding genes. Further characterization of the candidate region showed recent artificial selection signals resulting from the breeding process. We suggest that expressional variations rather than structural variations in protein probably contribute to the polled phenotype. Our results not only represent the first and important step in establishing the genomic structure of the polled region in yak, but also add to our understanding of the polled trait in bovid species. PMID:27389700
Local Genetic Correlation Gives Insights into the Shared Genetic Architecture of Complex Traits.

PubMed

Shi, Huwenbo; Mancuso, Nicholas; Spendlove, Sarah; Pasaniuc, Bogdan

2017-11-02

Although genetic correlations between complex traits provide valuable insights into epidemiological and etiological studies, a precise quantification of which genomic regions disproportionately contribute to the genome-wide correlation is currently lacking. Here, we introduce ρ-HESS, a technique to quantify the correlation between pairs of traits due to genetic variation at a small region in the genome. Our approach requires GWAS summary data only and makes no distributional assumption on the causal variant effect sizes while accounting for linkage disequilibrium (LD) and overlapping GWAS samples. We analyzed large-scale GWAS summary data across 36 quantitative traits, and identified 25 genomic regions that contribute significantly to the genetic correlation among these traits. Notably, we find 6 genomic regions that contribute to the genetic correlation of 10 pairs of traits that show negligible genome-wide correlation, further showcasing the power of local genetic correlation analyses. Finally, we report the distribution of local genetic correlations across the genome for 55 pairs of traits that show putative causal relationships. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Comparison of Widely Used Listeria monocytogenes Strains EGD, 10403S, and EGD-e Highlights Genomic Differences Underlying Variations in Pathogenicity

PubMed Central

Bécavin, Christophe; Bouchier, Christiane; Lechat, Pierre; Archambaud, Cristel; Creno, Sophie; Gouin, Edith; Wu, Zongfu; Kühbacher, Andreas; Brisse, Sylvain; Pucciarelli, M. Graciela; García-del Portillo, Francisco; Hain, Torsten; Portnoy, Daniel A.; Chakraborty, Trinad; Lecuit, Marc; Pizarro-Cerdá, Javier; Moszer, Ivan; Bierne, Hélène; Cossart, Pascale

2014-01-01

For nearly 3 decades, listeriologists and immunologists have used mainly three strains of the same serovar (1/2a) to analyze the virulence of the bacterial pathogen Listeria monocytogenes. The genomes of two of these strains, EGD-e and 10403S, were released in 2001 and 2008, respectively. Here we report the genome sequence of the third reference strain, EGD, and extensive genomic and phenotypic comparisons of the three strains. Strikingly, EGD-e is genetically highly distinct from EGD (29,016 single nucleotide polymorphisms [SNPs]) and 10403S (30,296 SNPs), and is more related to serovar 1/2c than 1/2a strains. We also found that while EGD and 10403S strains are genetically very close (317 SNPs), EGD has a point mutation in the transcriptional regulator PrfA (PrfA*), leading to constitutive expression of several major virulence genes. We generated an EGD-e PrfA* mutant and showed that EGD behaves like this strain in vitro, with slower growth in broth and higher invasiveness in human cells than those of EGD-e and 10403S. In contrast, bacterial counts in blood, liver, and spleen during infection in mice revealed that EGD and 10403S are less virulent than EGD-e, which is itself less virulent than EGD-e PrfA*. Thus, constitutive expression of PrfA-regulated virulence genes does not appear to provide a significant advantage to the EGD strain during infection in vivo, highlighting the fact that in vitro invasion assays are not sufficient for evaluating the pathogenic potential of L. monocytogenes strains. Together, our results pave the way for deciphering unexplained differences or discrepancies in experiments using different L. monocytogenes strains. PMID:24667708
Moving into a new era of periodontal genetic studies: relevance of large case-control samples using severe phenotypes for genome-wide association studies.

PubMed

Vaithilingam, R D; Safii, S H; Baharuddin, N A; Ng, C C; Cheong, S C; Bartold, P M; Schaefer, A S; Loos, B G

2014-12-01

Studies to elucidate the role of genetics as a risk factor for periodontal disease have gone through various phases. In the majority of cases, the initial 'hypothesis-dependent' candidate-gene polymorphism studies did not report valid genetic risk loci. Following a large-scale replication study, these initially positive results are believed to be caused by type 1 errors. However, susceptibility genes, such as CDKN2BAS (Cyclin Dependend KiNase 2B AntiSense RNA; alias ANRIL [ANtisense Rna In the Ink locus]), glycosyltransferase 6 domain containing 1 (GLT6D1) and cyclooxygenase 2 (COX2), have been reported as conclusive risk loci of periodontitis. The search for genetic risk factors accelerated with the advent of 'hypothesis-free' genome-wide association studies (GWAS). However, despite many different GWAS being performed for almost all human diseases, only three GWAS on periodontitis have been published - one reported genome-wide association of GLT6D1 with aggressive periodontitis (a severe phenotype of periodontitis), whereas the remaining two, which were performed on patients with chronic periodontitis, were not able to find significant associations. This review discusses the problems faced and the lessons learned from the search for genetic risk variants of periodontitis. Current and future strategies for identifying genetic variance in periodontitis, and the importance of planning a well-designed genetic study with large and sufficiently powered case-control samples of severe phenotypes, are also discussed. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
UCSC genome browser: deep support for molecular biomedical research.

PubMed

Mangan, Mary E; Williams, Jennifer M; Lathe, Scott M; Karolchik, Donna; Lathe, Warren C

2008-01-01

The volume and complexity of genomic sequence data, and the additional experimental data required for annotation of the genomic context, pose a major challenge for display and access for biomedical researchers. Genome browsers organize this data and make it available in various ways to extract useful information to advance research projects. The UCSC Genome Browser is one of these resources. The official sequence data for a given species forms the framework to display many other types of data such as expression, variation, cross-species comparisons, and more. Visual representations of the data are available for exploration. Data can be queried with sequences. Complex database queries are also easily achieved with the Table Browser interface. Associated tools permit additional query types or access to additional data sources such as images of in situ localizations. Support for solving researcher's issues is provided with active discussion mailing lists and by providing updated training materials. The UCSC Genome Browser provides a source of deep support for a wide range of biomedical molecular research (http://genome.ucsc.edu).
Time for Genome Editing: Next-Generation Attenuated Malaria Parasites.

PubMed

Singer, Mirko; Frischknecht, Friedrich

2017-03-01

Immunization with malaria parasites that developmentally arrest in or immediately after the liver stage is the only way currently known to confer sterilizing immunity in both humans and rodent models. There are various ways to attenuate parasite development resulting in different timings of arrest, which has a significant impact on vaccination efficiency. To understand what most impacts vaccination efficiency, newly developed gain-of-function methods can now be used to generate a wide array of differently attenuated parasites. The combination of multiple attenuation approaches offers the potential to engineer efficiently attenuated Plasmodium parasites and learn about their fascinating biology at the same time. Here we discuss recent studies and the potential of targeted parasite manipulation using genome editing to develop live attenuated malaria vaccines. Copyright © 2016 Elsevier Ltd. All rights reserved.
Genome-Wide Analysis of DNA Methylation and Fine Particulate Matter Air Pollution in Three Study Populations: KORA F3, KORA F4, and the Normative Aging Study.

PubMed

Panni, Tommaso; Mehta, Amar J; Schwartz, Joel D; Baccarelli, Andrea A; Just, Allan C; Wolf, Kathrin; Wahl, Simone; Cyrys, Josef; Kunze, Sonja; Strauch, Konstantin; Waldenberger, Melanie; Peters, Annette

2016-07-01

Epidemiological studies have reported associations between particulate matter (PM) concentrations and cancer and respiratory and cardiovascular diseases. DNA methylation has been identified as a possible link but so far it has only been analyzed in candidate sites. We studied the association between DNA methylation and short- and mid-term air pollution exposure using genome-wide data and identified potential biological pathways for additional investigation. We collected whole blood samples from three independent studies (KORA F3 [2004-2005] and F4 [2006-2008] in Germany, and the Normative Aging Study [1999-2007] in the United States) and measured genome-wide DNA methylation proportions with the Illumina 450k BeadChip. PM concentration was measured daily at fixed monitoring stations and three different trailing averages were considered and regressed against DNA methylation: 2-day, 7-day and 28-day. Meta-analysis was performed to pool the study-specific results. Random-effect meta-analysis revealed 12 CpG (cytosine-guanine dinucleotide) sites as associated with PM concentration (1 for 2-day average, 1 for 7-day, and 10 for 28-day) at a genome-wide Bonferroni significance level (p ≤ 7.5E-8); 9 out of these 12 sites expressed increased methylation. Through estimation of I² for homogeneity assessment across the studies, 4 of these sites (annotated in NSMAF, C1orf212, MSGN1, NXN) showed p > 0.05 and I² < 0.5: the site from the 7-day average results and 3 for the 28-day average. Applying false discovery rate, p-value < 0.05 was observed in 8 and 1,819 additional CpGs at 7- and 28-day average PM2.5 exposure respectively. The PM-related CpG sites found in our study suggest novel plausible systemic pathways linking ambient PM exposure to adverse health effect through variations in DNA methylation. Panni T, Mehta AJ, Schwartz JD, Baccarelli AA, Just AC, Wolf K, Wahl S, Cyrys J, Kunze S, Strauch K, Waldenberger M, Peters A. 2016. A genome-wide analysis of DNA methylation and fine particulate matter air pollution in three study populations: KORA F3, KORA F4, and the Normative Aging Study. Environ Health Perspect 124:983-990; http://dx.doi.org/10.1289/ehp.1509966.
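Two quantities named in the abstract above, trailing exposure averages and the I² heterogeneity statistic, lend themselves to a short Python sketch. The helper names `trailing_mean` and `heterogeneity` are hypothetical, and two details are assumptions rather than facts from the record: the window is taken to include the current day, and I² is computed from Cochran's Q with inverse-variance weights, which is the usual definition but is not spelled out in the abstract.

```python
def trailing_mean(daily, window):
    """Mean of the `window` most recent daily values ending on each day.

    Returns None for days that do not yet have a full window.
    Assumption: the window includes the current day.
    """
    out = []
    for i in range(len(daily)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(daily[i + 1 - window:i + 1]) / window)
    return out


def heterogeneity(estimates, std_errors):
    """Fixed-effect pooled estimate, Cochran's Q, and I-squared.

    estimates: per-study effect estimates for one CpG site.
    std_errors: matching standard errors.
    """
    weights = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    df = len(estimates) - 1
    # I-squared: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, i2
```

With three studies, `heterogeneity` would be called once per CpG site with the three study-specific coefficients, and a site would pass a filter like the one described in the abstract when the returned I² is below 0.5 and the heterogeneity p-value exceeds 0.05.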
p53 shapes genome-wide and cell type-specific changes in microRNA expression during the human DNA damage response.

PubMed

Hattori, Hiroyoshi; Janky, Rekin's; Nietfeld, Wilfried; Aerts, Stein; Madan Babu, M; Venkitaraman, Ashok R

2014-01-01

The human DNA damage response (DDR) triggers profound changes in gene expression, whose nature and regulation remain uncertain. Although certain micro-(mi)RNA species including miR34, miR-18, miR-16 and miR-143 have been implicated in the DDR, there is as yet no comprehensive description of genome-wide changes in the expression of miRNAs triggered by DNA breakage in human cells. We have used next-generation sequencing (NGS), combined with rigorous integrative computational analyses, to describe genome-wide changes in the expression of miRNAs during the human DDR. The changes affect 150 of 1523 miRNAs known in miRBase v18 from 4-24 h after the induction of DNA breakage, in cell-type dependent patterns. The regulatory regions of the most-highly regulated miRNA species are enriched in conserved binding sites for p53. Indeed, genome-wide changes in miRNA expression during the DDR are markedly altered in TP53-/- cells compared to otherwise isogenic controls. The expression levels of certain damage-induced, p53-regulated miRNAs in cancer samples correlate with patient survival. Our work reveals genome-wide and cell type-specific alterations in miRNA expression during the human DDR, which are regulated by the tumor suppressor protein p53. These findings provide a genomic resource to identify new molecules and mechanisms involved in the DDR, and to examine their role in tumor suppression and the clinical outcome of cancer patients.
Parent-of-origin specific allelic associations among 106 genomic loci for age at menarche

PubMed Central

Thompson, Deborah J; Ferreira, Teresa; He, Chunyan; Chasman, Daniel I; Esko, Tõnu; Thorleifsson, Gudmar; Albrecht, Eva; Ang, Wei Q; Corre, Tanguy; Cousminer, Diana L; Feenstra, Bjarke; Franceschini, Nora; Ganna, Andrea; Johnson, Andrew D; Kjellqvist, Sanela; Lunetta, Kathryn L; McMahon, George; Nolte, Ilja M; Paternoster, Lavinia; Porcu, Eleonora; Smith, Albert V; Stolk, Lisette; Teumer, Alexander; Tšernikova, Natalia; Tikkanen, Emmi; Ulivi, Sheila; Wagner, Erin K; Amin, Najaf; Bierut, Laura J; Byrne, Enda M; Hottenga, Jouke-Jan; Koller, Daniel L; Mangino, Massimo; Pers, Tune H; Yerges-Armstrong, Laura M; Zhao, Jing Hua; Andrulis, Irene L; Anton-Culver, Hoda; Atsma, Femke; Bandinelli, Stefania; Beckmann, Matthias W; Benitez, Javier; Blomqvist, Carl; Bojesen, Stig E; Bolla, Manjeet K; Bonanni, Bernardo; Brauch, Hiltrud; Brenner, Hermann; Buring, Julie E; Chang-Claude, Jenny; Chanock, Stephen; Chen, Jinhui; Chenevix-Trench, Georgia; Collée, J. Margriet; Couch, Fergus J; Couper, David; Coveillo, Andrea D; Cox, Angela; Czene, Kamila; D’adamo, Adamo Pio; Smith, George Davey; De Vivo, Immaculata; Demerath, Ellen W; Dennis, Joe; Devilee, Peter; Dieffenbach, Aida K; Dunning, Alison M; Eiriksdottir, Gudny; Eriksson, Johan G; Fasching, Peter A; Ferrucci, Luigi; Flesch-Janys, Dieter; Flyger, Henrik; Foroud, Tatiana; Franke, Lude; Garcia, Melissa E; García-Closas, Montserrat; Geller, Frank; de Geus, Eco EJ; Giles, Graham G; Gudbjartsson, Daniel F; Gudnason, Vilmundur; Guénel, Pascal; Guo, Suiqun; Hall, Per; Hamann, Ute; Haring, Robin; Hartman, Catharina A; Heath, Andrew C; Hofman, Albert; Hooning, Maartje J; Hopper, John L; Hu, Frank B; Hunter, David J; Karasik, David; Kiel, Douglas P; Knight, Julia A; Kosma, Veli-Matti; Kutalik, Zoltan; Lai, Sandra; Lambrechts, Diether; Lindblom, Annika; Mägi, Reedik; Magnusson, Patrik K; Mannermaa, Arto; Martin, Nicholas G; Masson, Gisli; McArdle, Patrick F; McArdle, Wendy L; Melbye, Mads; Michailidou, Kyriaki; Mihailov, Evelin; Milani, 
Lili; Milne, Roger L; Nevanlinna, Heli; Neven, Patrick; Nohr, Ellen A; Oldehinkel, Albertine J; Oostra, Ben A; Palotie, Aarno; Peacock, Munro; Pedersen, Nancy L; Peterlongo, Paolo; Peto, Julian; Pharoah, Paul DP; Postma, Dirkje S; Pouta, Anneli; Pylkäs, Katri; Radice, Paolo; Ring, Susan; Rivadeneira, Fernando; Robino, Antonietta; Rose, Lynda M; Rudolph, Anja; Salomaa, Veikko; Sanna, Serena; Schlessinger, David; Schmidt, Marjanka K; Southey, Mellissa C; Sovio, Ulla; Stampfer, Meir J; Stöckl, Doris; Storniolo, Anna M; Timpson, Nicholas J; Tyrer, Jonathan; Visser, Jenny A; Vollenweider, Peter; Völzke, Henry; Waeber, Gerard; Waldenberger, Melanie; Wallaschofski, Henri; Wang, Qin; Willemsen, Gonneke; Winqvist, Robert; Wolffenbuttel, Bruce HR; Wright, Margaret J; Boomsma, Dorret I; Econs, Michael J; Khaw, Kay-Tee; Loos, Ruth JF; McCarthy, Mark I; Montgomery, Grant W; Rice, John P; Streeten, Elizabeth A; Thorsteinsdottir, Unnur; van Duijn, Cornelia M; Alizadeh, Behrooz Z; Bergmann, Sven; Boerwinkle, Eric; Boyd, Heather A; Crisponi, Laura; Gasparini, Paolo; Gieger, Christian; Harris, Tamara B; Ingelsson, Erik; Järvelin, Marjo-Riitta; Kraft, Peter; Lawlor, Debbie; Metspalu, Andres; Pennell, Craig E; Ridker, Paul M; Snieder, Harold; Sørensen, Thorkild IA; Spector, Tim D; Strachan, David P; Uitterlinden, André G; Wareham, Nicholas J; Widen, Elisabeth; Zygmunt, Marek; Murray, Anna; Easton, Douglas F

2014-01-01

Age at menarche is a marker of timing of puberty in females. It varies widely between individuals, is a heritable trait and is associated with risks for obesity, type 2 diabetes, cardiovascular disease, breast cancer and all-cause mortality [1]. Studies of rare human disorders of puberty and animal models point to a complex hypothalamic-pituitary-hormonal regulation [2,3], but the mechanisms that determine pubertal timing and underlie its links to disease risk remain unclear. Here, using genome-wide and custom-genotyping arrays in up to 182,416 women of European descent from 57 studies, we found robust evidence (P < 5 × 10⁻⁸) for 123 signals at 106 genomic loci associated with age at menarche. Many loci were associated with other pubertal traits in both sexes, and there was substantial overlap with genes implicated in body mass index and various diseases, including rare disorders of puberty. Menarche signals were enriched in imprinted regions, with three loci (DLK1/WDR25, MKRN3/MAGEL2 and KCNK9) demonstrating parent-of-origin specific associations concordant with known parental expression patterns. Pathway analyses implicated nuclear hormone receptors, particularly retinoic acid and gamma-aminobutyric acid-B2 receptor signaling, among novel mechanisms that regulate pubertal timing in humans. Our findings suggest a genetic architecture involving at least hundreds of common variants in the coordinated timing of the pubertal transition. PMID:25231870

Parent-of-origin-specific allelic associations among 106 genomic loci for age at menarche.

PubMed

Perry, John Rb; Day, Felix; Elks, Cathy E; Sulem, Patrick; Thompson, Deborah J; Ferreira, Teresa; He, Chunyan; Chasman, Daniel I; Esko, Tõnu; Thorleifsson, Gudmar; Albrecht, Eva; Ang, Wei Q; Corre, Tanguy; Cousminer, Diana L; Feenstra, Bjarke; Franceschini, Nora; Ganna, Andrea; Johnson, Andrew D; Kjellqvist, Sanela; Lunetta, Kathryn L; McMahon, George; Nolte, Ilja M; Paternoster, Lavinia; Porcu, Eleonora; Smith, Albert V; Stolk, Lisette; Teumer, Alexander; Tšernikova, Natalia; Tikkanen, Emmi; Ulivi, Sheila; Wagner, Erin K; Amin, Najaf; Bierut, Laura J; Byrne, Enda M; Hottenga, Jouke-Jan; Koller, Daniel L; Mangino, Massimo; Pers, Tune H; Yerges-Armstrong, Laura M; Zhao, Jing Hua; Andrulis, Irene L; Anton-Culver, Hoda; Atsma, Femke; Bandinelli, Stefania; Beckmann, Matthias W; Benitez, Javier; Blomqvist, Carl; Bojesen, Stig E; Bolla, Manjeet K; Bonanni, Bernardo; Brauch, Hiltrud; Brenner, Hermann; Buring, Julie E; Chang-Claude, Jenny; Chanock, Stephen; Chen, Jinhui; Chenevix-Trench, Georgia; Collée, J Margriet; Couch, Fergus J; Couper, David; Coveillo, Andrea D; Cox, Angela; Czene, Kamila; D'adamo, Adamo Pio; Smith, George Davey; De Vivo, Immaculata; Demerath, Ellen W; Dennis, Joe; Devilee, Peter; Dieffenbach, Aida K; Dunning, Alison M; Eiriksdottir, Gudny; Eriksson, Johan G; Fasching, Peter A; Ferrucci, Luigi; Flesch-Janys, Dieter; Flyger, Henrik; Foroud, Tatiana; Franke, Lude; Garcia, Melissa E; García-Closas, Montserrat; Geller, Frank; de Geus, Eco Ej; Giles, Graham G; Gudbjartsson, Daniel F; Gudnason, Vilmundur; Guénel, Pascal; Guo, Suiqun; Hall, Per; Hamann, Ute; Haring, Robin; Hartman, Catharina A; Heath, Andrew C; Hofman, Albert; Hooning, Maartje J; Hopper, John L; Hu, Frank B; Hunter, David J; Karasik, David; Kiel, Douglas P; Knight, Julia A; Kosma, Veli-Matti; Kutalik, Zoltan; Lai, Sandra; Lambrechts, Diether; Lindblom, Annika; Mägi, Reedik; Magnusson, Patrik K; Mannermaa, Arto; Martin, Nicholas G; Masson, Gisli; McArdle, Patrick F; McArdle, Wendy L; Melbye, 
Mads; Michailidou, Kyriaki; Mihailov, Evelin; Milani, Lili; Milne, Roger L; Nevanlinna, Heli; Neven, Patrick; Nohr, Ellen A; Oldehinkel, Albertine J; Oostra, Ben A; Palotie, Aarno; Peacock, Munro; Pedersen, Nancy L; Peterlongo, Paolo; Peto, Julian; Pharoah, Paul Dp; Postma, Dirkje S; Pouta, Anneli; Pylkäs, Katri; Radice, Paolo; Ring, Susan; Rivadeneira, Fernando; Robino, Antonietta; Rose, Lynda M; Rudolph, Anja; Salomaa, Veikko; Sanna, Serena; Schlessinger, David; Schmidt, Marjanka K; Southey, Mellissa C; Sovio, Ulla; Stampfer, Meir J; Stöckl, Doris; Storniolo, Anna M; Timpson, Nicholas J; Tyrer, Jonathan; Visser, Jenny A; Vollenweider, Peter; Völzke, Henry; Waeber, Gerard; Waldenberger, Melanie; Wallaschofski, Henri; Wang, Qin; Willemsen, Gonneke; Winqvist, Robert; Wolffenbuttel, Bruce Hr; Wright, Margaret J; Boomsma, Dorret I; Econs, Michael J; Khaw, Kay-Tee; Loos, Ruth Jf; McCarthy, Mark I; Montgomery, Grant W; Rice, John P; Streeten, Elizabeth A; Thorsteinsdottir, Unnur; van Duijn, Cornelia M; Alizadeh, Behrooz Z; Bergmann, Sven; Boerwinkle, Eric; Boyd, Heather A; Crisponi, Laura; Gasparini, Paolo; Gieger, Christian; Harris, Tamara B; Ingelsson, Erik; Järvelin, Marjo-Riitta; Kraft, Peter; Lawlor, Debbie; Metspalu, Andres; Pennell, Craig E; Ridker, Paul M; Snieder, Harold; Sørensen, Thorkild Ia; Spector, Tim D; Strachan, David P; Uitterlinden, André G; Wareham, Nicholas J; Widen, Elisabeth; Zygmunt, Marek; Murray, Anna; Easton, Douglas F; Stefansson, Kari; Murabito, Joanne M; Ong, Ken K

2014-10-02

Age at menarche is a marker of timing of puberty in females. It varies widely between individuals, is a heritable trait and is associated with risks for obesity, type 2 diabetes, cardiovascular disease, breast cancer and all-cause mortality. Studies of rare human disorders of puberty and animal models point to a complex hypothalamic-pituitary-hormonal regulation, but the mechanisms that determine pubertal timing and underlie its links to disease risk remain unclear. Here, using genome-wide and custom-genotyping arrays in up to 182,416 women of European descent from 57 studies, we found robust evidence (P < 5 × 10⁻⁸) for 123 signals at 106 genomic loci associated with age at menarche. Many loci were associated with other pubertal traits in both sexes, and there was substantial overlap with genes implicated in body mass index and various diseases, including rare disorders of puberty. Menarche signals were enriched in imprinted regions, with three loci (DLK1-WDR25, MKRN3-MAGEL2 and KCNK9) demonstrating parent-of-origin-specific associations concordant with known parental expression patterns. Pathway analyses implicated nuclear hormone receptors, particularly retinoic acid and γ-aminobutyric acid-B2 receptor signalling, among novel mechanisms that regulate pubertal timing in humans. Our findings suggest a genetic architecture involving at least hundreds of common variants in the coordinated timing of the pubertal transition.
Familial aggregation analysis of gene expressions

PubMed Central

Rao, Shao-Qi; Xu, Liang-De; Zhang, Guang-Mei; Li, Xia; Li, Lin; Shen, Gong-Qing; Jiang, Yang; Yang, Yue-Ying; Gong, Bin-Sheng; Jiang, Wei; Zhang, Fan; Xiao, Yun; Wang, Qing K

2007-01-01

Traditional studies of familial aggregation are aimed at defining the genetic (and non-genetic) causes of a disease from physiological or clinical traits. However, there has been little attempt to use genome-wide gene expressions, the direct phenotypic measures of genes, as the traits to investigate several extended issues regarding the distributions of familially aggregated genes on chromosomes or in functions. In this study we conducted a genome-wide familial aggregation analysis by using the in vitro cell gene expressions of 3300 human autosome genes (Problem 1 data provided to Genetic Analysis Workshop 15) in order to answer three basic genetics questions. First, we investigated how gene expressions aggregate among different types (degrees) of relative pairs. Second, we conducted a bioinformatics analysis of highly familially aggregated genes to see how they are distributed on chromosomes. Third, we performed a gene ontology enrichment test of familially aggregated genes to find evidence to support their functional consensus. The results indicated that 1) gene expressions did aggregate in families, especially between sibs. Of 3300 human genes analyzed, there were a total of 1105 genes with one or more significant (empirical p < 0.05) familial correlation; 2) there were several genomic hot spots where highly familially aggregated genes (e.g., the chromosome 6 HLA genes cluster) were clustered; 3) as we expected, gene ontology enrichment tests revealed that the 1105 genes were aggregating not only in families but also in functional categories. PMID:18466548
Genomic prediction and genome-wide association analysis of female longevity in a composite beef cattle breed.

PubMed

Hamidi Hay, E; Roberts, A

2017-04-01

Longevity is a highly important trait to the efficiency of beef cattle production. The objective of this study was to evaluate the genomic prediction of longevity and identify genomic regions associated with this trait. The data used in this study consisted of 547 Composite Gene Combination cows (1/2 Red Angus, 1/4 Charolais, 1/4 Tarentaise) born from 2002 to 2011 and genotyped with the Illumina BovineSNP50 BeadChip. Three models were used to assess genomic prediction: Bayes A, Bayes B and GBLUP using a genomic relationship matrix. To identify genomic regions associated with longevity, 2 approaches were adopted: a single-marker genome-wide association study and a Bayesian approach using GenSel software. The genomic prediction accuracy was low: 0.28, 0.25, and 0.22 for Bayes A, Bayes B and GBLUP, respectively. The single-marker genome-wide association study (GWAS) identified 5 loci with a P-value less than 0.05 after false discovery correction: UA-IFASA-7571 on chromosome 19 (58.03 Mb), ARS-BFGL-BAC-15059 on BTA1 (28.8 Mb), ARS-BFGL-NGS-104159 on BTA3 (29.4 Mb), ARS-BFGL-NGS-32882 on BTA9 (104.07 Mb) and ARS-BFGL-NGS-32883 on BTA25 (33.77 Mb). The Bayesian GWAS yielded 4 genomic regions overlapping with the single-marker GWAS results. The region with the highest percentage of genomic variance (3.73%) was detected on chromosome 19. Both GWAS approaches adopted in this study showed evidence for association with various chromosomal locations.
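The abstract above reports loci that pass a 0.05 threshold "after false discovery correction" but does not name the procedure. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up adjustment (an assumption, not something the record confirms), the correction could look like this; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: the study's "false discovery correction" is not specified,
    so BH is used here purely for illustration.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical single-marker p-values, not taken from the study.
    raw = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(raw))
```

A marker would then be reported when its adjusted value falls below 0.05.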
Identification of true EST alignments for recognising transcribed regions.

PubMed

Ma, Chuang; Wang, Jia; Li, Lun; Duan, Mo-Jie; Zhou, Yan-Hong

2011-01-01

Transcribed regions can be determined by aligning Expressed Sequence Tags (ESTs) with genome sequences. The kernel of this strategy is to effectively distinguish true EST alignments from spurious ones. In this study, three measures, Direction Check, Identity Check and Terminal Check, were introduced to more effectively eliminate spurious EST alignments. On the basis of these measures and other widely used measures, a computational tool, named ESTCleanser, has been developed to identify true EST alignments for obtaining reliable transcribed regions. The performance of ESTCleanser has been evaluated on the well-annotated human ENCyclopedia of DNA Elements (ENCODE) regions using human ESTs in the dbEST database. The evaluation results show that the accuracy of ESTCleanser at exon and intron levels is markedly higher than that of UCSC-spliced EST alignments. This work should be helpful to EST-based research on finding new genes, complementing genome annotation, and recognising alternative splicing events and Single Nucleotide Polymorphisms (SNPs).
Genes Important for Schizosaccharomyces pombe Meiosis Identified Through a Functional Genomics Screen

PubMed Central

Blyth, Julie; Makrantoni, Vasso; Barton, Rachael E.; Spanos, Christos; Rappsilber, Juri; Marston, Adele L.

2018-01-01

Meiosis is a specialized cell division that generates gametes, such as eggs and sperm. Errors in meiosis result in miscarriages and are the leading cause of birth defects; however, the molecular origins of these defects remain unknown. Studies in model organisms are beginning to identify the genes and pathways important for meiosis, but the parts list is still poorly defined. Here we present a comprehensive catalog of genes important for meiosis in the fission yeast, Schizosaccharomyces pombe. Our genome-wide functional screen surveyed all nonessential genes for roles in chromosome segregation and spore formation. Novel genes important at distinct stages of the meiotic chromosome segregation and differentiation program were identified. Preliminary characterization implicated three of these genes in centrosome/spindle pole body, centromere, and cohesion function. Our findings represent a near-complete parts list of genes important for meiosis in fission yeast, providing a valuable resource to advance our molecular understanding of meiosis. PMID:29259000
A combined analysis of genetically correlated traits identifies 187 loci and a role for neurogenesis and myelination in intelligence.

PubMed

Hill, W D; Marioni, R E; Maghzian, O; Ritchie, S J; Hagenaars, S P; McIntosh, A M; Gale, C R; Davies, G; Deary, I J

2018-01-11

Intelligence, or general cognitive function, is phenotypically and genetically correlated with many traits, including a wide range of physical and mental health variables. Education is strongly genetically correlated with intelligence (r_g = 0.70). We used these findings as foundations for our use of a novel approach, multi-trait analysis of genome-wide association studies (MTAG; Turley et al. 2017), to combine two large genome-wide association studies (GWASs) of education and intelligence, increasing statistical power and resulting in the largest GWAS of intelligence yet reported. Our study had four goals: first, to facilitate the discovery of new genetic loci associated with intelligence; second, to add to our understanding of the biology of intelligence differences; third, to examine whether combining genetically correlated traits in this way produces results consistent with the primary phenotype of intelligence; and, finally, to test how well this new meta-analytic data sample on intelligence predicts phenotypic intelligence in an independent sample. By combining datasets using MTAG, our functional sample size increased from 199,242 participants to 248,482. We found 187 independent loci associated with intelligence, implicating 538 genes, using both SNP-based and gene-based GWAS. We found evidence that neurogenesis and myelination, as well as genes expressed in the synapse and those involved in the regulation of the nervous system, may explain some of the biological differences in intelligence. The results of our combined analysis demonstrated the same pattern of genetic correlations as those from previous GWASs of intelligence, providing support for the meta-analysis of these genetically-related phenotypes.
Common variants on 2p16.1, 6p22.1 and 10q24.32 are associated with schizophrenia in Han Chinese population.

PubMed

Yu, H; Yan, H; Li, J; Li, Z; Zhang, X; Ma, Y; Mei, L; Liu, C; Cai, L; Wang, Q; Zhang, F; Iwata, N; Ikeda, M; Wang, L; Lu, T; Li, M; Xu, H; Wu, X; Liu, B; Yang, J; Li, K; Lv, L; Ma, X; Wang, C; Li, L; Yang, F; Jiang, T; Shi, Y; Li, T; Zhang, D; Yue, W

2017-07-01

Many schizophrenia susceptibility loci have been identified through genome-wide association studies (GWASs) in European populations. However, until recently, schizophrenia GWASs in non-European populations were limited to small sample sizes and have yielded few loci associated with schizophrenia. To identify genetic risk variations for schizophrenia in the Han Chinese population, we performed a two-stage GWAS of schizophrenia comprising 4384 cases and 5770 controls, followed by independent replications of 13 single-nucleotide polymorphisms in an additional 4339 schizophrenia cases and 7043 controls of Han Chinese ancestry. Furthermore, we conducted additional analyses based on the results in the discovery stage. The combined analysis confirmed evidence of genome-wide significant associations in the Han Chinese population for three loci, at 2p16.1 (rs1051061, in an exon of VRK2, P=1.14 × 10^-12, odds ratio (OR)=1.17), 6p22.1 (rs115070292 in an intron of GABBR1, P=4.96 × 10^-10, OR=0.77) and 10q24.32 (rs10883795 in an intron of AS3MT, P=7.94 × 10^-10, OR=0.87; rs10883765 at an intron of ARL3, P=3.06 × 10^-9, OR=0.87). The polygenic risk score based on Psychiatric Genomics Consortium schizophrenia GWAS data modestly predicted case-control status in the Chinese population (Nagelkerke R^2: 1.7%~5.7%). Our pathway analysis suggested that neurological biological pathways such as GABAergic signaling, dopaminergic signaling, cell adhesion molecules and myelination pathways are involved in schizophrenia. These findings provide new insights into the pathogenesis of schizophrenia in the Han Chinese population. Further studies are needed to establish the biological context and potential clinical utility of these findings.
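The polygenic risk score mentioned in the abstract above is, in its most common form, a weighted sum of risk-allele counts with log odds ratios as weights. The sketch below assumes that standard additive form; the study's own score was built from Psychiatric Genomics Consortium summary statistics, which are not reproduced here. The function name and the allele dosages are hypothetical, and the three odds ratios are borrowed from the abstract only to make the arithmetic concrete.

```python
import math


def polygenic_risk_score(odds_ratios, dosages):
    """Sum of ln(OR) * allele dosage over SNPs (standard additive form).

    Assumption: this is the generic textbook score, not the exact
    pipeline used in the study.
    """
    if len(odds_ratios) != len(dosages):
        raise ValueError("odds_ratios and dosages must have the same length")
    return sum(math.log(o) * d for o, d in zip(odds_ratios, dosages))


if __name__ == "__main__":
    # Odds ratios quoted in the abstract; dosages (0, 1 or 2 copies of the
    # effect allele) are hypothetical values for one individual.
    ors = [1.17, 0.77, 0.87]
    individual = [2, 1, 0]
    print(polygenic_risk_score(ors, individual))
```

A positive score indicates more risk-increasing than protective alleles under these weights; a real analysis would use many more variants.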
West Nile virus (WNV) genome RNAs with up to three adjacent mutations that disrupt long distance 5'-3' cyclization sequence basepairs are viable

DOE Office of Scientific and Technical Information (OSTI.GOV)

Basu, Mausumi; Brinton, Margo A., E-mail: mbrinton@gsu.ed

2011-03-30

Mosquito-borne flavivirus genomes contain conserved 5' and 3' cyclization sequences (CYC) that facilitate long distance RNA-RNA interactions. In previous studies, flavivirus replicon RNA replication was completely inhibited by single or multiple mismatching CYC nt substitutions. In the present study, full-length WNV genomes with one, two or three mismatching CYC substitutions showed reduced replication efficiencies but were viable and generated revertants with increased replication efficiency. Several different three adjacent mismatching CYC substitution mutant RNAs were rescued by a second site mutation that created an additional basepair (nts 147-10913) on the internal genomic side of the 5'-3' CYC. The finding that full-length genomes with up to three mismatching CYC mutations are viable and can be rescued by a single nt spontaneous mutation indicates that more than three adjacent CYC basepair substitutions would be required to increase the safety of vaccine genomes by creating mismatches in inter-genomic recombinants.
Genome-Wide Association Study of Cardiac Structure and Systolic Function in African Americans: The Candidate Gene Association Resource (CARe) Study

PubMed Central

Fox, Ervin R.; Musani, Solomon K.; Barbalic, Maja; Lin, Honghuang; Yu, Bing; Ogunyankin, Kofo O.; Smith, Nicholas L.; Kutlar, Abdullah; Glazer, Nicole L.; Post, Wendy S.; Paltoo, Dina N.; Dries, Daniel L.; Farlow, Deborah N.; Duarte, Christine W.; Kardia, Sharon L.; Meyers, Kristin J.; Sun, Yan V.; Arnett, Donna K.; Patki, Amit A.; Sha, Jin; Cui, Xiangqui; Samdarshi, Tandaw E.; Penman, Alan D.; Bibbins-Domingo, Kirsten; Bůžková, Petra; Benjamin, Emelia J.; Bluemke, David A.; Morrison, Alanna C.; Heiss, Gerardo; Carr, J. Jeffrey; Tracy, Russell P.; Mosley, Thomas H.; Taylor, Herman A.; Psaty, Bruce M.; Heckbert, Susan R.; Cappola, Thomas P.; Vasan, Ramachandran S.

2013-01-01

Background: Using data from four community-based cohorts of African Americans (AA), we tested the association between genome-wide markers (SNPs) and cardiac phenotypes in the Candidate-gene Association REsource (CARe) study. Methods and Results: Among 6,765 AA, we related age, sex, height and weight-adjusted residuals for nine cardiac phenotypes (assessed by echocardiogram or MRI) to 2.5 million SNPs genotyped using Genome-Wide Affymetrix Human SNP Array 6.0 (Affy6.0) and the remainder imputed. Within-cohort genome-wide association analysis was conducted, followed by meta-analysis across cohorts using inverse variance weights (genome-wide significance threshold=4.0 × 10−07). Supplementary pathway analysis was performed. We attempted replication in 3 smaller cohorts of African ancestry and tested look-ups in one consortium of European ancestry (EchoGEN). Across the 9 phenotypes, variants in 4 genetic loci reached genome-wide significance: rs4552931 in UBE2V2 (p=1.43 × 10−07) for left ventricular mass (LVM); rs7213314 in WIPI1 (p=1.68 × 10−07) for LV internal diastolic diameter (LVIDD); rs1571099 in PPAPDC1A (p=2.57 × 10−08) for interventricular septal wall thickness (IVST); and rs9530176 in KLF5 (p=4.02 × 10−07) for ejection fraction (EF). Associated variants were enriched in three signaling pathways involved in cardiac remodeling. None of the 4 loci replicated in cohorts of African ancestry were confirmed in look-ups in EchoGEN. Conclusions: In the largest GWAS of cardiac structure and function to date in AA, we identified 4 genetic loci related to LVM, IVST, LVIDD and EF that reached genome-wide significance. Replication results suggest that these loci may be unique to individuals of African ancestry. Additional large-scale studies are warranted for these complex phenotypes. PMID:23275298
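The abstract above combines per-cohort results by "meta-analysis across cohorts using inverse variance weights". A minimal sketch of the standard fixed-effect version of that calculation follows, assuming each cohort supplies an effect estimate and its standard error for a given SNP; the function name and the example numbers are hypothetical and do not come from the CARe study.

```python
import math


def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect inverse-variance-weighted meta-analysis for one SNP.

    Returns (pooled_beta, pooled_se, z, two_sided_p).
    Assumption: a plain fixed-effect model; the study may have applied
    further adjustments (for example genomic control) not shown here.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


if __name__ == "__main__":
    # Hypothetical per-cohort effect estimates and standard errors.
    print(inverse_variance_meta([0.10, 0.30], [0.10, 0.10]))
```

Cohorts with smaller standard errors (typically the larger cohorts) receive more weight, which is why the pooled standard error is never larger than the smallest input.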
SynFind: Compiling Syntenic Regions across Any Set of Genomes on Demand.

PubMed

Tang, Haibao; Bomhoff, Matthew D; Briones, Evan; Zhang, Liangsheng; Schnable, James C; Lyons, Eric

2015-11-11

The identification of conserved syntenic regions enables discovery of predicted locations for orthologous and homeologous genes, even when no such gene is present. This capability means that synteny-based methods are far more effective than sequence similarity-based methods in identifying true-negatives, a necessity for studying gene loss and gene transposition. However, the identification of syntenic regions requires complex analyses which must be repeated for pairwise comparisons between any two species. Therefore, as the number of published genomes increases, there is a growing demand for scalable, simple-to-use applications to perform comparative genomic analyses that cater to both gene family studies and genome-scale studies. We implemented SynFind, a web-based tool that addresses this need. Given one query genome, SynFind is capable of identifying conserved syntenic regions in any set of target genomes. SynFind is capable of reporting per-gene information, useful for researchers studying specific gene families, as well as genome-wide data sets of syntenic gene and predicted gene locations, critical for researchers focused on large-scale genomic analyses. Inference of syntenic homologs provides the basis for correlation of functional changes around genes of interest between related organisms. Deployed on the CoGe online platform, SynFind is connected to the genomic data from over 15,000 organisms from all domains of life as well as supporting multiple releases of the same organism. SynFind makes use of a powerful job execution framework that promises scalability and reproducibility. SynFind can be accessed at http://genomevolution.org/CoGe/SynFind.pl. A video tutorial of SynFind using Phytophthora as an example is available at http://www.youtube.com/watch?v=2Agczny9Nyc. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Genome-wide association study and gene network analysis of fertility, retained placenta, and metritis in US Holstein cattle

USDA-ARS's Scientific Manuscript database

The objectives of this research were to identify genes, genomic regions, and gene networks associated with three measures of fertility (daughter pregnancy rate, DPR; heifer conception rate, HCR; and cow conception rate, CCR) and two measures of reproductive health (metritis, METR; and retained place...
What is bioinformatics? A proposed definition and overview of the field.

PubMed

Luscombe, N M; Greenbaum, D; Gerstein, M

2001-01-01

The recent flood of data from genome sequences and functional genomics has given rise to a new field, bioinformatics, which combines elements of biology and computer science. Here we propose a definition for this new field and review some of the research that is being pursued, particularly in relation to transcriptional regulatory systems. Our definition is as follows: Bioinformatics is conceptualizing biology in terms of macromolecules (in the sense of physical-chemistry) and then applying "informatics" techniques (derived from disciplines such as applied maths, computer science, and statistics) to understand and organize the information associated with these molecules, on a large-scale. Analyses in bioinformatics predominantly focus on three types of large datasets available in molecular biology: macromolecular structures, genome sequences, and the results of functional genomics experiments (e.g. expression data). Additional information includes the text of scientific papers and "relationship data" from metabolic pathways, taxonomy trees, and protein-protein interaction networks. Bioinformatics employs a wide range of computational techniques including sequence and structural alignment, database design and data mining, macromolecular geometry, phylogenetic tree construction, prediction of protein structure and function, gene finding, and expression data clustering. The emphasis is on approaches integrating a variety of computational methods and heterogeneous data sources. Finally, bioinformatics is a practical discipline. We survey some representative applications, such as finding homologues, designing drugs, and performing large-scale censuses. Additional information pertinent to the review is available over the web at http://bioinfo.mbb.yale.edu/what-is-it.
Evaluation of different sources of DNA for use in genome wide studies and forensic application.

PubMed

Al Safar, Habiba S; Abidi, Fatima H; Khazanehdari, Kamal A; Dadour, Ian R; Tay, Guan K

2011-02-01

In the field of epidemiology, Genome-Wide Association Studies (GWAS) are commonly used to identify genetic predispositions of many human diseases. Large repositories housing biological specimens for clinical and genetic investigations have been established to store material and data for these studies. The logistics of specimen collection and sample storage can be onerous, and new strategies have to be explored. This study examines three different DNA sources (namely, degraded genomic DNA, amplified degraded genomic DNA and amplified extracted DNA from FTA card) for GWAS using the Illumina platform. No significant difference in call rate was detected between amplified degraded genomic DNA extracted from whole blood and amplified DNA retrieved from FTA™ cards. However, using unamplified-degraded genomic DNA reduced the call rate to a mean of 42.6% compared to amplified DNA extracted from FTA card (mean of 96.6%). This study establishes the utility of FTA™ cards as a viable storage matrix for cells from which DNA can be extracted to perform GWAS analysis.
In Situ Hi-C Library Preparation for Plants to Study Their Three-Dimensional Chromatin Interactions on a Genome-Wide Scale.

PubMed

Liu, Chang

2017-01-01

The spatial organization of the genome in the nucleus is critical for many cellular processes. It has been broadly accepted that the packing of chromatin inside the nucleus is not random, but structured at several hierarchical levels. The Hi-C method combines Chromatin Conformation Capture and high-throughput sequencing, which allows interrogating genome-wide chromatin interactions. Depending on the sequencing depth, chromatin packing patterns derived from Hi-C experiments can be viewed on a chromosomal scale or at a local genic level. Here, I describe a protocol of plant in situ Hi-C library preparation, which covers procedures starting from tissue fixation to library amplification.
Identification of SNPs associated with variola virus virulence.

PubMed

Hoen, Anne Gatewood; Gardner, Shea N; Moore, Jason H

2013-02-14

Decades after the eradication of smallpox, its etiological agent, variola virus (VARV), remains a threat as a potential bioweapon. Outbreaks of smallpox around the time of the global eradication effort exhibited variable case fatality rates (CFRs), likely attributable in part to complex viral genetic determinants of smallpox virulence. We aimed to identify genome-wide single nucleotide polymorphisms associated with CFR. We evaluated unadjusted and outbreak geographic location-adjusted models of single SNPs and two- and three-way interactions between SNPs. Using the data mining approach multifactor dimensionality reduction (MDR), we identified five VARV SNPs in models significantly associated with CFR. The top performing unadjusted and adjusted models both revealed the same two-way gene-gene interaction. We discuss the biological plausibility of the influence of the SNPs identified in these and other significant models on the strain-specific virulence of VARV. We have identified genetic loci in the VARV genome that are statistically associated with VARV virulence as measured by CFR. While our ability to infer a causal relationship between the specific SNPs identified in our analysis and VARV virulence is limited, our results suggest that smallpox severity is in part associated with VARV strain variation and that VARV virulence may be determined by multiple genetic loci. This study represents the first application of MDR to the identification of pathogen gene-gene interactions for predicting infectious disease outbreak severity.
Genomes of microsporidia in mosquitoes: status and preliminary findings

USDA-ARS's Scientific Manuscript database

The status and preliminary findings for full genome sequencing of three species of microsporidia with mosquitoes as type hosts will be presented. Vavraia culicis, the type species of the genus Vavraia, was originally described from Culex pipiens. Type material was not available and therefore Vavra...
The impact of iterated games on traffic flow at noncontrolled intersections

NASA Astrophysics Data System (ADS)

Zhao, Chao; Jia, Ning

2015-05-01

Intersections without signal control are common in urban road networks. This paper studies the traffic flow at a noncontrolled intersection within an iterated game framework. We assume drivers have learning ability and can repeatedly adjust their strategies (to give way or to rush through) at the intersection according to their memories. A cellular automata model is applied to investigate the characteristics of the traffic flow. Numerical experiments indicate two main findings. First, the traffic flow exhibits a "volcano-shaped" fundamental diagram with three different phases. Second, most drivers choose to give way at the intersection, but the aggressive drivers cannot be completely eliminated, which is consistent with field observations. Analyses are also given to explain the observed phenomena. These findings allow deeper insight into real-world bottleneck traffic flow.
Genome-Wide Locations of Potential Epimutations Associated with Environmentally Induced Epigenetic Transgenerational Inheritance of Disease Using a Sequential Machine Learning Prediction Approach.

PubMed

Haque, M Muksitul; Holder, Lawrence B; Skinner, Michael K

2015-01-01

Environmentally induced epigenetic transgenerational inheritance of disease and phenotypic variation involves germline transmitted epimutations. The primary epimutations identified involve altered differential DNA methylation regions (DMRs). Different environmental toxicants have been shown to promote exposure (i.e., toxicant) specific signatures of germline epimutations. Analysis of genomic features associated with these epimutations identified low-density CpG regions (<3 CpG / 100bp) termed CpG deserts and a number of unique DNA sequence motifs. The rat genome was annotated for these and additional relevant features. The objective of the current study was to use a machine learning computational approach to predict all potential epimutations in the genome. A number of previously identified sperm epimutations were used as training sets. A novel machine learning approach using a sequential combination of Active Learning and Imbalance Class Learner analysis was developed. The transgenerational sperm epimutation analysis identified approximately 50K individual sites with a 1 kb mean size and 3,233 regions that had a minimum of three adjacent sites with a mean size of 3.5 kb. A select number of the most relevant genomic features were identified with the low density CpG deserts being a critical genomic feature of the features selected. A similar independent analysis with transgenerational somatic cell epimutation training sets identified a smaller number of 1,503 regions of genome-wide predicted sites and differences in genomic feature contributions. The predicted genome-wide germline (sperm) epimutations were found to be distinct from the predicted somatic cell epimutations. Validation of the genome-wide germline predicted sites used two recently identified transgenerational sperm epimutation signature sets from the pesticides dichlorodiphenyltrichloroethane (DDT) and methoxychlor (MXC) exposure lineage F3 generation. 
Analysis of this positive validation data set showed a 100% prediction accuracy for all the DDT-MXC sperm epimutations. Observations further elucidate the genomic features associated with transgenerational germline epimutations and identify a genome-wide set of potential epimutations that can be used to facilitate identification of epigenetic diagnostics for ancestral environmental exposures and disease susceptibility.
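The abstract above defines CpG deserts as low-density regions with fewer than 3 CpG per 100 bp. A minimal sketch of how such windows could be flagged in a DNA sequence follows, assuming fixed, non-overlapping 100 bp windows; the windowing scheme, the function name and the toy sequence are assumptions for illustration and are not the authors' annotation pipeline.

```python
def cpg_desert_windows(sequence, window=100, max_cpg=3):
    """Return (start, cpg_count) for windows with fewer than max_cpg CpGs.

    Assumptions: non-overlapping windows, a trailing partial window is
    ignored, and a CpG split across a window boundary is not counted.
    """
    sequence = sequence.upper()
    deserts = []
    for start in range(0, len(sequence) - window + 1, window):
        # "CG" cannot overlap itself, so str.count gives the CpG count.
        cpg_count = sequence[start:start + window].count("CG")
        if cpg_count < max_cpg:
            deserts.append((start, cpg_count))
    return deserts


if __name__ == "__main__":
    # Toy sequence: a CpG-rich 100 bp block followed by a CpG-free block.
    toy = "CG" * 50 + "AT" * 50
    print(cpg_desert_windows(toy))
```

In the toy sequence only the second window is reported, since the first contains 50 CpG dinucleotides and the second contains none.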
Changes in Malaria Parasite Drug Resistance in an Endemic Population Over a 25-Year Period With Resulting Genomic Evidence of Selection

PubMed Central

Nwakanma, Davis C.; Duffy, Craig W.; Amambua-Ngwa, Alfred; Oriero, Eniyou C.; Bojang, Kalifa A.; Pinder, Margaret; Drakeley, Chris J.; Sutherland, Colin J.; Milligan, Paul J.; MacInnis, Bronwyn; Kwiatkowski, Dominic P.; Clark, Taane G.; Greenwood, Brian M.; Conway, David J.

2014-01-01

Background. Analysis of genome-wide polymorphism in many organisms has potential to identify genes under recent selection. However, data on historical allele frequency changes are rarely available for direct confirmation. Methods. We genotyped single nucleotide polymorphisms (SNPs) in 4 Plasmodium falciparum drug resistance genes in 668 archived parasite-positive blood samples of a Gambian population between 1984 and 2008. This covered a period before antimalarial resistance was detected locally, through subsequent failure of multiple drugs until introduction of artemisinin combination therapy. We separately performed genome-wide sequence analysis of 52 clinical isolates from 2008 to prospect for loci under recent directional selection. Results. Resistance alleles increased from very low frequencies, peaking in 2000 for chloroquine resistance-associated crt and mdr1 genes and at the end of the survey period for dhfr and dhps genes respectively associated with pyrimethamine and sulfadoxine resistance. Temporal changes fit a model incorporating likely selection coefficients over the period. Three of the drug resistance loci were in the top 4 regions under strong selection implicated by the genome-wide analysis. Conclusions. Genome-wide polymorphism analysis of an endemic population sample robustly identifies loci with detailed documentation of recent selection, demonstrating power to prospectively detect emerging drug resistance genes. PMID:24265439
Genomic analysis of thermophilic Bacillus coagulans strains: efficient producers for platform bio-chemicals.

PubMed

Su, Fei; Xu, Ping

2014-01-29

Microbial strains with high substrate efficiency and excellent environmental tolerance are urgently needed for the production of platform bio-chemicals. Bacillus coagulans has these merits; however, little genetic information is available about this species. Here, we determined the genome sequences of five B. coagulans strains, and used a comparative genomic approach to reconstruct the central carbon metabolism of this species to explain their fermentation features. A novel xylose isomerase in the xylose utilization pathway was identified in these strains. Based on a genome-wide positive selection scan, the selection pressure on amino acid metabolism may have played a significant role in the thermal adaptation. We also researched the immune systems of B. coagulans strains, which provide them with acquired resistance to phages and mobile genetic elements. Our genomic analysis provides comprehensive insights into the genetic characteristics of B. coagulans and paves the way for improving and extending the uses of this species.

Genomic analysis of thermophilic Bacillus coagulans strains: efficient producers for platform bio-chemicals

PubMed Central

Su, Fei; Xu, Ping

2014-01-01

Microbial strains with high substrate efficiency and excellent environmental tolerance are urgently needed for the production of platform bio-chemicals. Bacillus coagulans has these merits; however, little genetic information is available about this species. Here, we determined the genome sequences of five B. coagulans strains, and used a comparative genomic approach to reconstruct the central carbon metabolism of this species to explain their fermentation features. A novel xylose isomerase in the xylose utilization pathway was identified in these strains. Based on a genome-wide positive selection scan, the selection pressure on amino acid metabolism may have played a significant role in the thermal adaptation. We also researched the immune systems of B. coagulans strains, which provide them with acquired resistance to phages and mobile genetic elements. Our genomic analysis provides comprehensive insights into the genetic characteristics of B. coagulans and paves the way for improving and extending the uses of this species. PMID:24473268
Polygenic risk score, genome-wide association, and gene set analyses of cognitive domain deficits in schizophrenia.

PubMed

Nakahara, Soichiro; Medland, Sarah; Turner, Jessica A; Calhoun, Vince D; Lim, Kelvin O; Mueller, Bryon A; Bustillo, Juan R; O'Leary, Daniel S; Vaidya, Jatin G; McEwen, Sarah; Voyvodic, James; Belger, Aysenil; Mathalon, Daniel H; Ford, Judith M; Guffanti, Guia; Macciardi, Fabio; Potkin, Steven G; van Erp, Theo G M

2018-06-12

This study assessed genetic contributions to six cognitive domains, identified by the MATRICS Cognitive Consensus Battery as relevant for schizophrenia, cognition-enhancing, clinical trials. Psychiatric Genomics Consortium Schizophrenia polygenic risk scores showed significant negative correlations with each cognitive domain. Genome-wide association analyses identified loci associated with attention/vigilance (rs830786 within HNF4G), verbal memory (rs67017972 near NDUFS4), and reasoning/problem solving (rs76872642 within HDAC9). Gene set analysis identified unique and shared genes across cognitive domains. These findings suggest involvement of common and unique mechanisms across cognitive domains and may contribute to the discovery of new therapeutic targets to treat cognitive deficits in schizophrenia. Copyright © 2018 Elsevier B.V. All rights reserved.
No genes for intelligence in the fluid genome.

PubMed

Ho, Mae-Wan

2013-01-01

Revolution is brewing belatedly within the heartlands of the genetic determinist establishment still in denial about the fluid genome that makes identifying genes even for common disease well-nigh impossible. The fruitless hunt for intelligence genes serves to expose the poverty of an obsolete paradigm that is obstructing knowledge and preventing fruitful policies from being widely implemented. Genome-wide scans using state-of-the-art technologies on extensive databases have failed to find a single gene for intelligence; instead, environment and maternal effects may account for most, if not all, correlation among relatives, while identical twins diverge genetically and epigenetically throughout life. Abundant evidence points to the enormous potential for improving intellectual abilities (and health) through simple environmental and social interventions.
Toward a Genome-Wide Systems Biology Analysis of Host-Pathogen Interactions in Group A Streptococcus

PubMed Central

Musser, James M.; DeLeo, Frank R.

2005-01-01

Genome-wide analysis of microbial pathogens and molecular pathogenesis processes has become an area of considerable activity in the last 5 years. These studies have been made possible by several advances, including completion of the human genome sequence, publication of genome sequences for many human pathogens, development of microarray technology and high-throughput proteomics, and maturation of bioinformatics. Despite these advances, relatively little effort has been expended in the bacterial pathogenesis arena to develop and use integrated research platforms in a systems biology approach to enhance our understanding of disease processes. This review discusses progress made in exploiting an integrated genome-wide research platform to gain new knowledge about how the human bacterial pathogen group A Streptococcus causes disease. Results of these studies have provided many new avenues for basic pathogenesis research and translational research focused on development of an efficacious human vaccine and novel therapeutics. One goal in summarizing this line of study is to bring exciting new findings to the attention of the investigative pathology community. In addition, we hope the review will stimulate investigators to consider using analogous approaches for analysis of the molecular pathogenesis of other microbes. PMID:16314461
A Genome-Wide Association Study Identifies Five Loci Influencing Facial Morphology in Europeans

PubMed Central

Liu, Fan; van der Lijn, Fedde; Schurmann, Claudia; Zhu, Gu; Chakravarty, M. Mallar; Hysi, Pirro G.; Wollstein, Andreas; Lao, Oscar; de Bruijne, Marleen; Ikram, M. Arfan; van der Lugt, Aad; Rivadeneira, Fernando; Uitterlinden, André G.; Hofman, Albert; Niessen, Wiro J.; Homuth, Georg; de Zubicaray, Greig; McMahon, Katie L.; Thompson, Paul M.; Daboul, Amro; Puls, Ralf; Hegenscheid, Katrin; Bevan, Liisa; Pausova, Zdenka; Medland, Sarah E.; Montgomery, Grant W.; Wright, Margaret J.; Wicking, Carol; Boehringer, Stefan; Spector, Timothy D.; Paus, Tomáš; Martin, Nicholas G.; Biffar, Reiner; Kayser, Manfred

2012-01-01

Inter-individual variation in facial shape is one of the most noticeable phenotypes in humans, and it is clearly under genetic regulation; however, almost nothing is known about the genetic basis of normal human facial morphology. We therefore conducted a genome-wide association study for facial shape phenotypes in multiple discovery and replication cohorts, considering almost ten thousand individuals of European descent from several countries. Phenotyping of facial shape features was based on landmark data obtained from three-dimensional head magnetic resonance images (MRIs) and two-dimensional portrait images. We identified five independent genetic loci associated with different facial phenotypes, suggesting the involvement of five candidate genes—PRDM16, PAX3, TP63, C5orf50, and COL17A1—in the determination of the human face. Three of them have been implicated previously in vertebrate craniofacial development and disease, and the remaining two genes potentially represent novel players in the molecular networks governing facial development. Our finding at PAX3 influencing the position of the nasion replicates a recent GWAS of facial features. In addition to the reported GWA findings, we established links between common DNA variants previously associated with NSCL/P at 2p21, 8q24, 13q31, and 17q22 and normal facial-shape variations based on a candidate gene approach. Overall our study implies that DNA variants in genes essential for craniofacial development contribute with relatively small effect size to the spectrum of normal variation in human facial morphology. This observation has important consequences for future studies aiming to identify more genes involved in the human facial morphology, as well as for potential applications of DNA prediction of facial shape such as in future forensic applications. PMID:23028347
Genome-Wide Comparative Analysis Reveals Similar Types of NBS Genes in Hybrid Citrus sinensis Genome and Original Citrus clementine Genome and Provides New Insights into Non-TIR NBS Genes

PubMed Central

Wang, Yunsheng; Zhou, Lijuan; Li, Dazhi; Dai, Liangying; Lawton-Rauh, Amy; Srimani, Pradip K.; Duan, Yongping; Luo, Feng

2015-01-01

In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of the proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that the three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes was shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention. PMID:25811466
Genome-wide comparative analysis reveals similar types of NBS genes in hybrid Citrus sinensis genome and original Citrus clementine genome and provides new insights into non-TIR NBS genes.

PubMed

Wang, Yunsheng; Zhou, Lijuan; Li, Dazhi; Dai, Liangying; Lawton-Rauh, Amy; Srimani, Pradip K; Duan, Yongping; Luo, Feng

2015-01-01

In this study, we identified and compared nucleotide-binding site (NBS) domain-containing genes from three Citrus genomes (C. clementina, C. sinensis from USA and C. sinensis from China). Phylogenetic analysis of all Citrus NBS genes across these three genomes revealed that there are three approximately evenly numbered groups: one group contains the Toll-Interleukin receptor (TIR) domain and two different Non-TIR groups in which most of the proteins contain the Coiled Coil (CC) domain. Motif analysis confirmed that the two groups of CC-containing NBS genes are from different evolutionary origins. We partitioned NBS genes into clades using NBS domain sequence distances and found most clades include NBS genes from all three Citrus genomes. This suggests that the three Citrus genomes have similar numbers and types of NBS genes. We also mapped the re-sequenced reads of three pomelo and three mandarin genomes onto the C. sinensis genome. We found that most NBS genes of the hybrid C. sinensis genome have corresponding homologous genes in both pomelo and mandarin genomes. The homologous NBS genes in pomelo and mandarin suggest that the parental species of C. sinensis may contain similar types of NBS genes. This explains why the hybrid C. sinensis and original C. clementina have similar types of NBS genes in this study. Furthermore, we found that sequence variation amongst Citrus NBS genes was shaped by multiple independent and shared accelerated mutation accumulation events among different groups of NBS genes and in different Citrus genomes. Our comparative analyses yield valuable insight into the structure, organization and evolution of NBS genes in Citrus genomes. Furthermore, our comprehensive analysis showed that the non-TIR NBS genes can be divided into two groups that come from different evolutionary origins. This provides new insights into non-TIR genes, which have not received much attention.
Systems Biology Approaches for Understanding Genome Architecture.

PubMed

Sewitz, Sven; Lipkow, Karen

2016-01-01

The linear and three-dimensional arrangement and composition of chromatin in eukaryotic genomes underlies the mechanisms directing gene regulation. Understanding this organization requires the integration of many data types and experimental results. Here we describe the approach of integrating genome-wide protein-DNA binding data to determine chromatin states. To investigate spatial aspects of genome organization, we present a detailed description of how to run stochastic simulations of protein movements within a simulated nucleus in 3D. This systems level approach enables the development of novel questions aimed at understanding the basic mechanisms that regulate genome dynamics.
The diversity of shell matrix proteins: genome-wide investigation of the pearl oyster, Pinctada fucata.

PubMed

Miyamoto, Hiroshi; Endo, Hirotoshi; Hashimoto, Naoki; Limura, Kurin; Isowa, Yukinobu; Kinoshita, Shigeharu; Kotaki, Tomohiro; Masaoka, Tetsuji; Miki, Takumi; Nakayama, Seiji; Nogawa, Chihiro; Notazawa, Atsuto; Ohmori, Fumito; Sarashina, Isao; Suzuki, Michio; Takagi, Ryousuke; Takahashi, Jun; Takeuchi, Takeshi; Yokoo, Naoki; Satoh, Nori; Toyohara, Haruhiko; Miyashita, Tomoyuki; Wada, Hiroshi; Samata, Tetsuro; Endo, Kazuyoshi; Nagasawa, Hiromichi; Asakawa, Shuichi; Watabe, Shugo

2013-10-01

In molluscs, shell matrix proteins are associated with biomineralization, a biologically controlled process that involves nucleation and growth of calcium carbonate crystals. Identification and characterization of shell matrix proteins are important for better understanding of the adaptive radiation of a large variety of molluscs. We searched the draft genome sequence of the pearl oyster Pinctada fucata and annotated 30 different kinds of shell matrix proteins. Of these, we identified Perlucin, ependymin-related protein and SPARC as common genes shared by bivalves and gastropods; however, most gastropod shell matrix proteins were not found in the P. fucata genome. Glycine-rich proteins were conserved in the genus Pinctada. Another important finding with regard to these annotated genes was that numerous shell matrix proteins are encoded by more than one gene; e.g., three ACCBP-like proteins, three CaLPs, five chitin synthase-like proteins, two N16 proteins (pearlins), 10 N19 proteins, two nacreins, four Pifs, nine shematrins, two prismalin-14 proteins, and 21 tyrosinases. This diversity of shell matrix proteins may be implicated in the morphological diversity of mollusc shells. The annotated genes reported here can be searched in P. fucata gene models version 1.1 and genome assembly version 1.0 ( http://marinegenomics.oist.jp/pinctada_fucata ). These genes should provide a useful resource for studies of the genetic basis of biomineralization and evaluation of the role of shell matrix proteins as an evolutionary toolkit among the molluscs.
Evaluating bacterial gene-finding HMM structures as probabilistic logic programs.

PubMed

Mørk, Søren; Holmes, Ian

2012-03-01

Probabilistic logic programming offers a powerful way to describe and evaluate structured statistical models. To investigate the practicality of probabilistic logic programming for structure learning in bioinformatics, we undertook a simplified bacterial gene-finding benchmark in PRISM, a probabilistic dialect of Prolog. We evaluate Hidden Markov Model structures for bacterial protein-coding gene potential, including a simple null model structure, three structures based on existing bacterial gene finders and two novel model structures. We test standard versions as well as ADPH length modeling and three-state versions of the five model structures. The models are all represented as probabilistic logic programs and evaluated using the PRISM machine learning system in terms of statistical information criteria and gene-finding prediction accuracy, in two bacterial genomes. Neither of our implementations of the two currently most used model structures is the best performing in terms of statistical information criteria or prediction performance, suggesting that better-fitting models might be achievable. The source code of all PRISM models, data and additional scripts are freely available for download at: http://github.com/somork/codonhmm. Supplementary data are available at Bioinformatics online.
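To make the idea of an HMM assigning "coding potential" along a sequence concrete, the following is a minimal Python sketch of Viterbi decoding for a two-state model. It is not one of the PRISM models from the paper: the state names (`N` for non-coding, `C` for coding), the probabilities, and the function name `viterbi` are all illustrative assumptions, and the sequence is assumed to be non-empty.

```python
import math

def viterbi(seq, states, start, trans, emit):
    """Most probable state path for seq under a simple HMM (log-space)."""
    # Initialise with the start and emission probabilities of the first symbol.
    score = {s: math.log(start[s]) + math.log(emit[s][seq[0]]) for s in states}
    back = []  # one dict of back-pointers per position after the first
    for ch in seq[1:]:
        new_score, pointers = {}, {}
        for s in states:
            # Best predecessor state for landing in s at this position.
            prev = max(states, key=lambda p: score[p] + math.log(trans[p][s]))
            new_score[s] = (score[prev] + math.log(trans[prev][s])
                            + math.log(emit[s][ch]))
            pointers[s] = prev
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return "".join(reversed(path))

# Hypothetical toy parameters: the "coding" state is GC-rich and states are sticky.
STATES = ("N", "C")
START = {"N": 0.5, "C": 0.5}
TRANS = {"N": {"N": 0.9, "C": 0.1}, "C": {"N": 0.1, "C": 0.9}}
EMIT = {
    "N": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "C": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}

if __name__ == "__main__":
    print(viterbi("ATATGCGCGCGCATAT", STATES, START, TRANS, EMIT))
```

Real gene finders use richer structures (codon-aware states, explicit length modelling), which is the kind of structural variation the study compares.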
High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies.

PubMed

Goudey, Benjamin; Abedini, Mani; Hopper, John L; Inouye, Michael; Makalic, Enes; Schmidt, Daniel F; Wagner, John; Zhou, Zeyu; Zobel, Justin; Reumann, Matthias

2015-01-01

Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide polymorphisms (SNPs) which are associated with a given disease. Univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate that a three-way interaction analysis on 1.1 million SNP GWAS data would require over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. This is hundreds of times faster than estimates for other CPU based methods and four times faster than runtimes estimated for GPU methods, indicating how the improvement in the level of hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer "Sequoia" at the Lawrence Livermore National Laboratory assuming linear scaling is maintained as our results suggest. Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS.
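The scale of an exhaustive three-way scan, and the linear-scaling estimate quoted in the abstract, can be checked with a few lines of arithmetic. The sketch below is illustrative only: the core counts for Avoca (65,536) and Sequoia (1,572,864) are assumptions taken from public descriptions of those machines, not from the abstract, and the variable names are hypothetical.

```python
import math

n_snps = 1_100_000                      # SNP count from the abstract
triples = math.comb(n_snps, 3)          # number of distinct SNP triples to test

avoca_years = 5.8                       # runtime estimate from the abstract
avoca_cores = 65_536                    # assumption: full Avoca installation
sequoia_cores = 1_572_864               # assumption: full Sequoia installation

# Under perfectly linear scaling, runtime shrinks by the ratio of core counts.
sequoia_months = avoca_years * 12 * avoca_cores / sequoia_cores

if __name__ == "__main__":
    print(f"SNP triples: {triples:.3e}")
    print(f"Estimated Sequoia runtime: {sequoia_months:.1f} months")
```

With these assumed core counts the estimate is consistent with the abstract's "under 3 months" figure; the cubic growth in the number of triples is what makes the analysis so expensive.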
Transcription regulation by distal enhancers

PubMed Central

Stadhouders, Ralph; van den Heuvel, Anita; Kolovos, Petros; Jorna, Ruud; Leslie, Kris; Grosveld, Frank; Soler, Eric

2012-01-01

Genome-wide chromatin profiling efforts have shown that enhancers are often located at large distances from gene promoters within the noncoding genome. Whereas enhancers can stimulate transcription initiation by communicating with promoters via chromatin looping mechanisms, we propose that enhancers may also stimulate transcription elongation by physical interactions with intronic elements. We review here recent findings derived from the study of the hematopoietic system. PMID:22771987
Human cDNA mapping using fluorescence in situ hybridization. Final progress report, April 1, 1994--July 31, 1997

DOE Office of Scientific and Technical Information (OSTI.GOV)

Korenberg, J.R.

The ultimate goal of this research is to generate and apply novel technologies to speed completion and integration of the human genome map and sequence with biomedical problems. To do this, techniques were developed and genome-wide resources generated. This includes a genome-wide Mapped and Integrated BAC/PAC Resource that has been used for gene finding, map completion and anchoring, breakpoint definition and sequencing. In the last period of the grant, the Human Mapped BAC/PAC Resource was also applied to determine regions of human variation and to develop a novel paradigm of primate evolution through to humans. Further, in order to more rapidly evaluate animal models of human disease, a BAC Map of the mouse was generated in collaboration with the MTI Genome Center, Dr. Bruce Birren.
Indel Group in Genomes (IGG) Molecular Genetic Markers

PubMed Central

Burkart-Waco, Diana; Kuppu, Sundaram; Britt, Anne; Chetelat, Roger

2016-01-01

Genetic markers are essential when developing or working with genetically variable populations. Indel Group in Genomes (IGG) markers are primer pairs that amplify single-locus sequences that differ in size for two or more alleles. They are attractive for their ease of use for rapid genotyping and their codominant nature. Here, we describe a heuristic algorithm that uses a k-mer-based approach to search two or more genome sequences to locate polymorphic regions suitable for designing candidate IGG marker primers. As input to the IGG pipeline software, the user provides genome sequences and the desired amplicon sizes and size differences. Primer sequences flanking polymorphic insertions/deletions are produced as output. IGG marker files for three sets of genomes, Solanum lycopersicum/Solanum pennellii, Arabidopsis (Arabidopsis thaliana) Columbia-0/Landsberg erecta-0 accessions, and S. lycopersicum/S. pennellii/Solanum tuberosum (three-way polymorphic) are included. PMID:27436831
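The k-mer idea described above can be illustrated with a small Python sketch: find k-mers that are unique in each of two sequences and shared between them, then report neighbouring anchor pairs whose spacing differs, which marks a candidate indel region that flanking primers could amplify. This is not the IGG pipeline itself; the function names, the toy sequences, and the parameters `k` and `min_diff` are hypothetical, and primer design is left out entirely.

```python
def unique_kmer_positions(seq, k):
    """Map each k-mer that occurs exactly once in seq to its start position."""
    first_pos, repeated = {}, set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in first_pos:
            repeated.add(kmer)
        else:
            first_pos[kmer] = i
    return {km: p for km, p in first_pos.items() if km not in repeated}

def indel_candidates(seq_a, seq_b, k, min_diff):
    """Return (a_start, a_end, b_start, b_end, size_diff) for anchor pairs
    whose spacing differs by at least min_diff bases between the sequences."""
    ua = unique_kmer_positions(seq_a, k)
    ub = unique_kmer_positions(seq_b, k)
    # Shared unique k-mers act as anchors; sort them by position in seq_a.
    anchors = sorted((ua[km], ub[km]) for km in ua if km in ub)
    out = []
    for (a1, b1), (a2, b2) in zip(anchors, anchors[1:]):
        if b2 <= b1:
            continue  # skip anchor pairs that are not collinear
        diff = (a2 - a1) - (b2 - b1)
        if abs(diff) >= min_diff:
            out.append((a1, a2 + k, b1, b2 + k, diff))
    return out

if __name__ == "__main__":
    left, right = "ACGTACCTGA", "TTGCAAGGCT"   # toy flanking sequence
    genome_b = left + right
    genome_a = left + "GGGGGGGG" + right       # same locus with an 8-base insertion
    print(indel_candidates(genome_a, genome_b, k=5, min_diff=3))
```

In this toy case the reported region spans the insertion, so an amplicon across it would differ in size by eight bases between the two sequences.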
Genetic diagnosis of developmental disorders in the DDD study: a scalable analysis of genome-wide research data.

PubMed

Wright, Caroline F; Fitzgerald, Tomas W; Jones, Wendy D; Clayton, Stephen; McRae, Jeremy F; van Kogelenberg, Margriet; King, Daniel A; Ambridge, Kirsty; Barrett, Daniel M; Bayzetinova, Tanya; Bevan, A Paul; Bragin, Eugene; Chatzimichali, Eleni A; Gribble, Susan; Jones, Philip; Krishnappa, Netravathi; Mason, Laura E; Miller, Ray; Morley, Katherine I; Parthiban, Vijaya; Prigmore, Elena; Rajan, Diana; Sifrim, Alejandro; Swaminathan, G Jawahar; Tivey, Adrian R; Middleton, Anna; Parker, Michael; Carter, Nigel P; Barrett, Jeffrey C; Hurles, Matthew E; FitzPatrick, David R; Firth, Helen V

2015-04-04

Human genome sequencing has transformed our understanding of genomic variation and its relevance to health and disease, and is now starting to enter clinical practice for the diagnosis of rare diseases. The question of whether and how some categories of genomic findings should be shared with individual research participants is currently a topic of international debate, and development of robust analytical workflows to identify and communicate clinically relevant variants is paramount. The Deciphering Developmental Disorders (DDD) study has developed a UK-wide patient recruitment network involving over 180 clinicians across all 24 regional genetics services, and has performed genome-wide microarray and whole exome sequencing on children with undiagnosed developmental disorders and their parents. After data analysis, pertinent genomic variants were returned to individual research participants via their local clinical genetics team. Around 80,000 genomic variants were identified from exome sequencing and microarray analysis in each individual, of which on average 400 were rare and predicted to be protein altering. By focusing only on de novo and segregating variants in known developmental disorder genes, we achieved a diagnostic yield of 27% among 1133 previously investigated yet undiagnosed children with developmental disorders, whilst minimising incidental findings. In families with developmentally normal parents, whole exome sequencing of the child and both parents resulted in a 10-fold reduction in the number of potential causal variants that needed clinical evaluation compared to sequencing only the child. Most diagnostic variants identified in known genes were novel and not present in current databases of known disease variation. Implementation of a robust translational genomics workflow is achievable within a large-scale rare disease research study to allow feedback of potentially diagnostic findings to clinicians and research participants. 
Systematic recording of relevant clinical data, curation of a gene-phenotype knowledge base, and development of clinical decision support software are needed in addition to automated exclusion of almost all variants, which is crucial for scalable prioritisation and review of possible diagnostic variants. However, the resource requirements of development and maintenance of a clinical reporting system within a research setting are substantial. Funding: Health Innovation Challenge Fund, a parallel funding partnership between the Wellcome Trust and the UK Department of Health. Copyright © 2015 Wright et al. Open Access article distributed under the terms of CC BY. Published by Elsevier Ltd. All rights reserved.
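The trio-based filtering described in this abstract can be sketched with plain set operations: a child variant is a de novo candidate if neither parent carries it, and it is kept only if it lies in a gene from a curated list. The sketch below is a simplified illustration, not the DDD workflow: the variant tuples, gene names, and the function `de_novo_candidates` are hypothetical, and segregating (inherited) variants, which the study also considers, are not modelled.

```python
def de_novo_candidates(child, mother, father, known_genes):
    """Child variants absent from both parents and located in known genes.

    Each variant is a (gene, position, alternate_allele) tuple.
    """
    inherited = mother | father
    return sorted(v for v in child - inherited if v[0] in known_genes)

if __name__ == "__main__":
    # Hypothetical toy data; real exomes yield tens of thousands of variants.
    child = {("GENE_A", 100, "T"), ("GENE_B", 200, "G"), ("GENE_C", 300, "A")}
    mother = {("GENE_B", 200, "G")}
    father = set()
    known = {"GENE_A", "GENE_B"}
    print(de_novo_candidates(child, mother, father, known))
```

Sequencing both parents is what allows the inherited variants to be subtracted, which is the intuition behind the reduction in candidate variants reported for trios versus child-only sequencing.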
What can genes tell us about the relationship between education and health?

PubMed

Boardman, Jason D; Domingue, Benjamin W; Daw, Jonathan

2015-02-01

We use genome wide data from respondents of the Health and Retirement Study (HRS) to evaluate the possibility that common genetic influences are associated with education and three health outcomes: depression, self-rated health, and body mass index. We use a total of 1.7 million single nucleotide polymorphisms obtained from the Illumina HumanOmni2.5-4v1 chip from 4233 non-Hispanic white respondents to characterize genetic similarities among unrelated persons in the HRS. We then used the Genome Wide Complex Trait Analysis (GCTA) toolkit to estimate univariate and bivariate heritability. We provide evidence that education (h² = 0.33), BMI (h² = 0.43), depression (h² = 0.19), and self-rated health (h² = 0.18) are all moderately heritable phenotypes. We also provide evidence that some of the correlation between depression and education as well as self-rated health and education is due to common genetic factors associated with one or both traits. We find no evidence that the correlation between education and BMI is influenced by common genetic factors. Copyright © 2014 Elsevier Ltd. All rights reserved.
Leveraging genome-wide datasets to quantify the functional role of the anti-Shine-Dalgarno sequence in regulating translation efficiency.

PubMed

Hockenberry, Adam J; Pah, Adam R; Jewett, Michael C; Amaral, Luís A N

2017-01-01

Studies dating back to the 1970s established that sequence complementarity between the anti-Shine-Dalgarno (aSD) sequence on prokaryotic ribosomes and the 5' untranslated region of mRNAs helps to facilitate translation initiation. The optimal location of aSD sequence binding relative to the start codon, the full extents of the aSD sequence and the functional form of the relationship between aSD sequence complementarity and translation efficiency have not been fully resolved. Here, we investigate these relationships by leveraging the sequence diversity of endogenous genes and recently available genome-wide estimates of translation efficiency. We show that, after accounting for predicted mRNA structure, aSD sequence complementarity increases the translation of endogenous mRNAs by roughly 50%. Further, we observe that this relationship is nonlinear, with translation efficiency maximized for mRNAs with intermediate levels of aSD sequence complementarity. The mechanistic insights that we observe are highly robust: we find nearly identical results in multiple datasets spanning three distantly related bacteria. Further, we verify our main conclusions by re-analysing a controlled experimental dataset. © 2017 The Authors.
Maternal experience with predation risk influences genome-wide embryonic gene expression in threespined sticklebacks (Gasterosteus aculeatus).

PubMed

Mommer, Brett C; Bell, Alison M

2014-01-01

There is growing evidence for nongenetic effects of maternal experience on offspring. For example, previous studies have shown that female threespined stickleback fish (Gasterosteus aculeatus) exposed to predation risk produce offspring with altered behavior, metabolism and stress physiology. Here, we investigate the effect of maternal exposure to predation risk on the embryonic transcriptome in sticklebacks. Using RNA-sequencing we compared genome-wide transcription in three day post-fertilization embryos of predator-exposed and control mothers. There were hundreds of differentially expressed transcripts between embryos of predator-exposed mothers and embryos of control mothers including several non-coding RNAs. Gene Ontology analysis revealed biological pathways involved in metabolism, epigenetic inheritance, and neural proliferation and differentiation that differed between treatments. Interestingly, predation risk is associated with an accelerated life history in many vertebrates, and several of the genes and biological pathways that were identified in this study suggest that maternal exposure to predation risk accelerates the timing of embryonic development. Consistent with this hypothesis, embryos of predator-exposed mothers were larger than embryos of control mothers. These findings point to some of the molecular mechanisms that might underlie maternal effects.
Genome-wide association studies in preterm birth: implications for the practicing obstetrician-gynaecologist

PubMed Central

2013-01-01

Preterm birth has the highest mortality and morbidity of all pregnancy complications. The burden of preterm birth on public health worldwide is enormous, yet there are few effective means to prevent a preterm delivery. To date, much of its etiology is unexplained, but genetic predisposition is thought to play a major role. In the upcoming year, the international Preterm Birth Genome Project (PGP) consortium plans to publish a large genome wide association study in early preterm birth. Genome-wide association studies (GWAS) are designed to identify common genetic variants that influence health and disease. Despite the many challenges that are involved, GWAS can be an important discovery tool, revealing genetic variations that are associated with preterm birth. It is highly unlikely that findings of a GWAS can be directly translated into clinical practice in the short run. Nonetheless, it will help us to better understand the etiology of preterm birth and the GWAS results will generate new hypotheses for further research, thus enhancing our understanding of preterm birth and informing prevention efforts in the long run. PMID:23445776
Genome-wide association studies in preterm birth: implications for the practicing obstetrician-gynaecologist.

PubMed

Dolan, Siobhan M; Christiaens, Inge

2013-01-01

Preterm birth has the highest mortality and morbidity of all pregnancy complications. The burden of preterm birth on public health worldwide is enormous, yet there are few effective means to prevent a preterm delivery. To date, much of its etiology is unexplained, but genetic predisposition is thought to play a major role. In the upcoming year, the international Preterm Birth Genome Project (PGP) consortium plans to publish a large genome wide association study in early preterm birth. Genome-wide association studies (GWAS) are designed to identify common genetic variants that influence health and disease. Despite the many challenges that are involved, GWAS can be an important discovery tool, revealing genetic variations that are associated with preterm birth. It is highly unlikely that findings of a GWAS can be directly translated into clinical practice in the short run. Nonetheless, it will help us to better understand the etiology of preterm birth and the GWAS results will generate new hypotheses for further research, thus enhancing our understanding of preterm birth and informing prevention efforts in the long run.

A genome-wide association study of corneal astigmatism: The CREAM Consortium

PubMed Central

Shah, Rupal L.; Li, Qing; Zhao, Wanting; Tedja, Milly S.; Tideman, J. Willem L.; Khawaja, Anthony P.; Fan, Qiao; Yazar, Seyhan; Williams, Katie M.; Verhoeven, Virginie J.M.; Xie, Jing; Wang, Ya Xing; Hess, Moritz; Nickels, Stefan; Lackner, Karl J.; Pärssinen, Olavi; Wedenoja, Juho; Biino, Ginevra; Concas, Maria Pina; Uitterlinden, André; Rivadeneira, Fernando; Jaddoe, Vincent W.V.; Hysi, Pirro G.; Sim, Xueling; Tan, Nicholas; Tham, Yih-Chung; Sensaki, Sonoko; Hofman, Albert; Vingerling, Johannes R.; Jonas, Jost B.; Mitchell, Paul; Hammond, Christopher J.; Höhn, René; Baird, Paul N.; Wong, Tien-Yin; Cheng, Ching-Yu; Teo, Yik Ying; Mackey, David A.; Williams, Cathy; Saw, Seang-Mei; Klaver, Caroline C.W.; Bailey-Wilson, Joan E.

2018-01-01

Purpose: To identify genes and genetic markers associated with corneal astigmatism. Methods: A meta-analysis of genome-wide association studies (GWASs) of corneal astigmatism undertaken for 14 European ancestry (n=22,250) and 8 Asian ancestry (n=9,120) cohorts was performed by the Consortium for Refractive Error and Myopia. Cases were defined as having >0.75 diopters of corneal astigmatism. Subsequent gene-based and gene-set analyses of the meta-analyzed results of European ancestry cohorts were performed using VEGAS2 and MAGMA software. Additionally, estimates of single nucleotide polymorphism (SNP)-based heritability for corneal and refractive astigmatism and the spherical equivalent were calculated for Europeans using LD score regression. Results: The meta-analysis of all cohorts identified a genome-wide significant locus near the platelet-derived growth factor receptor alpha (PDGFRA) gene: top SNP: rs7673984, odds ratio=1.12 (95% CI:1.08–1.16), p=5.55×10⁻⁹. No other genome-wide significant loci were identified in the combined analysis or European/Asian ancestry-specific analyses. Gene-based analysis identified three novel candidate genes for corneal astigmatism in Europeans: claudin-7 (CLDN7), acid phosphatase 2, lysosomal (ACP2), and TNF alpha-induced protein 8 like 3 (TNFAIP8L3). Conclusions: In addition to replicating a previously identified genome-wide significant locus for corneal astigmatism near the PDGFRA gene, gene-based analysis identified three novel candidate genes, CLDN7, ACP2, and TNFAIP8L3, that warrant further investigation to understand their role in the pathogenesis of corneal astigmatism. The much lower number of genetic variants and genes demonstrating an association with corneal astigmatism compared to published spherical equivalent GWAS analyses suggests a greater influence of rare genetic variants, non-additive genetic effects, or environmental factors in the development of astigmatism. PMID:29422769
Chromosome Evolution in the Free-Living Flatworms: First Evidence of Intrachromosomal Rearrangements in Karyotype Evolution of Macrostomum lignano (Platyhelminthes, Macrostomida)

PubMed Central

Zadesenets, Kira S.; Ershov, Nikita I.; Berezikov, Eugene; Rubtsov, Nikolay B.

2017-01-01

The free-living flatworm Macrostomum lignano is a hidden tetraploid. Its genome was formed by a recent whole genome duplication followed by chromosome fusions. Its karyotype (2n = 8) consists of a pair of large chromosomes (MLI1), which contain regions of all other chromosomes, and three pairs of small metacentric chromosomes. Comparison of MLI1 with metacentrics was performed by painting with microdissected DNA probes and fluorescent in situ hybridization of unique DNA fragments. Regions of MLI1 homologous to small metacentrics appeared to be contiguous. Besides the loss of DNA repeat clusters (pericentromeric and telomeric repeats and the 5S rDNA cluster) from MLI1, the difference between small metacentrics MLI2 and MLI4 and regions homologous to them in MLI1 were revealed. Abnormal karyotypes found in the inbred DV1/10 subline were analyzed, and structurally rearranged chromosomes were described with the painting technique, suggesting the mechanism of their origin. The revealed chromosomal rearrangements generate additional diversity, opening the way toward massive loss of duplicated genes from a duplicated genome. Our findings suggest that the karyotype of M. lignano is in the early stage of genome diploidization after whole genome duplication, and further studies on M. lignano and closely related species can address many questions about karyotype evolution in animals. PMID:29084138
The genome landscape of indigenous African cattle.

PubMed

Kim, Jaemin; Hanotte, Olivier; Mwai, Okeyo Ally; Dessie, Tadelle; Bashir, Salim; Diallo, Boubacar; Agaba, Morris; Kim, Kwondo; Kwak, Woori; Sung, Samsun; Seo, Minseok; Jeong, Hyeonsoo; Kwon, Taehyung; Taye, Mengistie; Song, Ki-Duk; Lim, Dajeong; Cho, Seoae; Lee, Hyun-Jeong; Yoon, Duhak; Oh, Sung Jong; Kemp, Stephen; Lee, Hak-Kyo; Kim, Heebal

2017-02-20

The history of African indigenous cattle and their adaptation to environmental and human selection pressure is at the root of their remarkable diversity. Characterization of this diversity is an essential step towards understanding the genomic basis of productivity and adaptation to survival under African farming systems. We analyze patterns of African cattle genetic variation by sequencing 48 genomes from five indigenous populations and comparing them to the genomes of 53 commercial taurine breeds. We find the highest genetic diversity among African zebu and sanga cattle. Our search for genomic regions under selection reveals signatures of selection for environmental adaptive traits. In particular, we identify signatures of selection including genes and/or pathways controlling anemia and feeding behavior in the trypanotolerant N'Dama, coat color and horn development in Ankole, and heat tolerance and tick resistance across African cattle especially in zebu breeds. Our findings unravel at the genome-wide level, the unique adaptive diversity of African cattle while emphasizing the opportunities for sustainable improvement of livestock productivity on the continent.
How could disclosing incidental information from whole-genome sequencing affect patient behavior?

PubMed Central

Christensen, Kurt D; Green, Robert C

2013-01-01

In this article, we argue that disclosure of incidental findings from whole-genome sequencing has the potential to motivate individuals to change health behaviors through psychological mechanisms that differ from typical risk assessment interventions. Their ability to do so, however, is likely to be highly contingent upon the nature of the incidental findings and how they are disclosed, the context of the disclosure and the characteristics of the patient. Moreover, clinicians need to be aware that behavioral responses may occur in unanticipated ways. This article argues for commentators and policy makers to take a cautious but optimistic perspective while empirical evidence is collected through ongoing research involving whole-genome sequencing and the disclosure of incidental information. PMID:24319470
How could disclosing incidental information from whole-genome sequencing affect patient behavior?

PubMed

Christensen, Kurt D; Green, Robert C

2013-06-01

In this article, we argue that disclosure of incidental findings from whole-genome sequencing has the potential to motivate individuals to change health behaviors through psychological mechanisms that differ from typical risk assessment interventions. Their ability to do so, however, is likely to be highly contingent upon the nature of the incidental findings and how they are disclosed, the context of the disclosure and the characteristics of the patient. Moreover, clinicians need to be aware that behavioral responses may occur in unanticipated ways. This article argues for commentators and policy makers to take a cautious but optimistic perspective while empirical evidence is collected through ongoing research involving whole-genome sequencing and the disclosure of incidental information.
A Genome-Wide Metabolic QTL Analysis in Europeans Implicates Two Loci Shaped by Recent Positive Selection

PubMed Central

Nicholson, George; Rantalainen, Mattias; Li, Jia V.; Maher, Anthony D.; Malmodin, Daniel; Ahmadi, Kourosh R.; Faber, Johan H.; Barrett, Amy; Min, Josine L.; Rayner, N. William; Toft, Henrik; Krestyaninova, Maria; Viksna, Juris; Neogi, Sudeshna Guha; Dumas, Marc-Emmanuel; Sarkans, Ugis; Donnelly, Peter; Illig, Thomas; Adamski, Jerzy; Suhre, Karsten; Allen, Maxine; Zondervan, Krina T.; Spector, Tim D.; Nicholson, Jeremy K.; Lindon, John C.

2011-01-01

We have performed a metabolite quantitative trait locus (mQTL) study of the 1H nuclear magnetic resonance spectroscopy (1H NMR) metabolome in humans, building on recent targeted knowledge of genetic drivers of metabolic regulation. Urine and plasma samples were collected from two cohorts of individuals of European descent, with one cohort comprised of female twins donating samples longitudinally. Sample metabolite concentrations were quantified by 1H NMR and tested for association with genome-wide single-nucleotide polymorphisms (SNPs). Four metabolites' concentrations exhibited significant, replicable association with SNP variation (8.6×10−11
Pathway Distiller - multisource biological pathway consolidation

PubMed Central

2012-01-01

Background One method to understand and evaluate an experiment that produces a large set of genes, such as a gene expression microarray analysis, is to identify overrepresentation or enrichment for biological pathways. Because pathways are able to functionally describe the set of genes, much effort has been made to collect curated biological pathways into publicly accessible databases. When combining disparate databases, highly related or redundant pathways exist, making their consolidation into pathway concepts essential. This will facilitate unbiased, comprehensive yet streamlined analysis of experiments that result in large gene sets. Methods After gene set enrichment finds representative pathways for large gene sets, pathways are consolidated into representative pathway concepts. Three complementary, but different methods of pathway consolidation are explored. Enrichment Consolidation combines the set of the pathways enriched for the signature gene list through iterative combining of enriched pathways with other pathways with similar signature gene sets; Weighted Consolidation utilizes a Protein-Protein Interaction network based gene-weighting approach that finds clusters of both enriched and non-enriched pathways limited to the experiments' resultant gene list; and finally the de novo Consolidation method uses several measurements of pathway similarity, that finds static pathway clusters independent of any given experiment. Results We demonstrate that the three consolidation methods provide unified yet different functional insights of a resultant gene set derived from a genome-wide profiling experiment. Results from the methods are presented, demonstrating their applications in biological studies and comparing with a pathway web-based framework that also combines several pathway databases. 
Additionally a web-based consolidation framework that encompasses all three methods discussed in this paper, Pathway Distiller (http://cbbiweb.uthscsa.edu/PathwayDistiller), is established to allow researchers access to the methods and example microarray data described in this manuscript, and the ability to analyze their own gene list by using our unique consolidation methods. Conclusions By combining several pathway systems, implementing different, but complementary pathway consolidation methods, and providing a user-friendly web-accessible tool, we have enabled users the ability to extract functional explanations of their genome wide experiments. PMID:23134636
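The idea of consolidating redundant pathways by gene-set similarity can be illustrated with a minimal sketch. It is not the Pathway Distiller implementation: the Jaccard index, the greedy single-pass grouping, the 0.5 threshold, and all pathway and gene names are assumptions chosen only for illustration (the abstract says "several measurements of pathway similarity" without naming them).

```python
def jaccard(a, b):
    """Jaccard similarity of two gene sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def consolidate(pathways, threshold=0.5):
    """Greedily group pathways whose gene sets overlap strongly.

    `pathways` maps a pathway name to an iterable of gene symbols.
    Each pathway joins the first existing group whose pooled gene set
    has Jaccard similarity >= threshold with it; otherwise it starts
    a new group. The threshold is an illustrative assumption.
    """
    groups = []
    for name, genes in pathways.items():
        genes = set(genes)
        for group in groups:
            if jaccard(group["genes"], genes) >= threshold:
                group["names"].append(name)
                group["genes"] |= genes
                break
        else:
            groups.append({"names": [name], "genes": set(genes)})
    return groups


# Hypothetical pathways: A and B share two of four genes, C is unrelated.
example = {
    "A": {"g1", "g2", "g3"},
    "B": {"g2", "g3", "g4"},
    "C": {"g9", "g10"},
}
groups = consolidate(example)
```

A greedy pass depends on input order; hierarchical clustering on the full similarity matrix would avoid that at higher cost.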
Pathway Distiller - multisource biological pathway consolidation.

PubMed

Doderer, Mark S; Anguiano, Zachry; Suresh, Uthra; Dashnamoorthy, Ravi; Bishop, Alexander J R; Chen, Yidong

2012-01-01

One method to understand and evaluate an experiment that produces a large set of genes, such as a gene expression microarray analysis, is to identify overrepresentation or enrichment for biological pathways. Because pathways are able to functionally describe the set of genes, much effort has been made to collect curated biological pathways into publicly accessible databases. When combining disparate databases, highly related or redundant pathways exist, making their consolidation into pathway concepts essential. This will facilitate unbiased, comprehensive yet streamlined analysis of experiments that result in large gene sets. After gene set enrichment finds representative pathways for large gene sets, pathways are consolidated into representative pathway concepts. Three complementary, but different methods of pathway consolidation are explored. Enrichment Consolidation combines the set of the pathways enriched for the signature gene list through iterative combining of enriched pathways with other pathways with similar signature gene sets; Weighted Consolidation utilizes a Protein-Protein Interaction network based gene-weighting approach that finds clusters of both enriched and non-enriched pathways limited to the experiments' resultant gene list; and finally the de novo Consolidation method uses several measurements of pathway similarity, that finds static pathway clusters independent of any given experiment. We demonstrate that the three consolidation methods provide unified yet different functional insights of a resultant gene set derived from a genome-wide profiling experiment. Results from the methods are presented, demonstrating their applications in biological studies and comparing with a pathway web-based framework that also combines several pathway databases. 
Additionally a web-based consolidation framework that encompasses all three methods discussed in this paper, Pathway Distiller (http://cbbiweb.uthscsa.edu/PathwayDistiller), is established to allow researchers access to the methods and example microarray data described in this manuscript, and the ability to analyze their own gene list by using our unique consolidation methods. By combining several pathway systems, implementing different, but complementary pathway consolidation methods, and providing a user-friendly web-accessible tool, we have enabled users the ability to extract functional explanations of their genome wide experiments.
Marker genes that are less conserved in their sequences are useful for predicting genome-wide similarity levels between closely related prokaryotic strains

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lan, Yemin; Rosen, Gail; Hershberg, Ruth

The 16S rRNA gene is so far the most widely used marker for taxonomical classification and separation of prokaryotes. Since it is universally conserved among prokaryotes, it is possible to use this gene to classify a broad range of prokaryotic organisms. At the same time, it has often been noted that the 16S rRNA gene is too conserved to separate between prokaryotes at finer taxonomic levels. In this paper, we examine how well levels of similarity of 16S rRNA and 73 additional universal or nearly universal marker genes correlate with genome-wide levels of gene sequence similarity. We demonstrate that the percent identity of 16S rRNA predicts genome-wide levels of similarity very well for distantly related prokaryotes, but not for closely related ones. In closely related prokaryotes, we find that there are many other marker genes for which levels of similarity are much more predictive of genome-wide levels of gene sequence similarity. Finally, we show that the identities of the markers that are most useful for predicting genome-wide levels of similarity within closely related prokaryotic lineages vary greatly between lineages. However, the most useful markers are always those that are least conserved in their sequences within each lineage. In conclusion, our results show that by choosing markers that are less conserved in their sequences within a lineage of interest, it is possible to better predict genome-wide gene sequence similarity between closely related prokaryotes than is possible using the 16S rRNA gene. We point readers towards a database we have created (POGO-DB) that can be used to easily establish which markers show lowest levels of sequence conservation within different prokaryotic lineages.
Marker genes that are less conserved in their sequences are useful for predicting genome-wide similarity levels between closely related prokaryotic strains

DOE PAGES

Lan, Yemin; Rosen, Gail; Hershberg, Ruth

2016-05-03

The 16S rRNA gene is so far the most widely used marker for taxonomical classification and separation of prokaryotes. Since it is universally conserved among prokaryotes, it is possible to use this gene to classify a broad range of prokaryotic organisms. At the same time, it has often been noted that the 16S rRNA gene is too conserved to separate between prokaryotes at finer taxonomic levels. In this paper, we examine how well levels of similarity of 16S rRNA and 73 additional universal or nearly universal marker genes correlate with genome-wide levels of gene sequence similarity. We demonstrate that the percent identity of 16S rRNA predicts genome-wide levels of similarity very well for distantly related prokaryotes, but not for closely related ones. In closely related prokaryotes, we find that there are many other marker genes for which levels of similarity are much more predictive of genome-wide levels of gene sequence similarity. Finally, we show that the identities of the markers that are most useful for predicting genome-wide levels of similarity within closely related prokaryotic lineages vary greatly between lineages. However, the most useful markers are always those that are least conserved in their sequences within each lineage. In conclusion, our results show that by choosing markers that are less conserved in their sequences within a lineage of interest, it is possible to better predict genome-wide gene sequence similarity between closely related prokaryotes than is possible using the 16S rRNA gene. We point readers towards a database we have created (POGO-DB) that can be used to easily establish which markers show lowest levels of sequence conservation within different prokaryotic lineages.
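The percent-identity measure that the abstract relies on can be sketched as follows. This is an illustrative definition only: it assumes the two sequences are already aligned to equal length and that columns containing a gap (`-`) are skipped, which is one common convention and not necessarily the one the authors used.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are ignored; this
    gap handling is an assumption made for the sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

For example, `percent_identity("ACGT", "ACGA")` compares four columns, three of which match. A highly conserved marker will sit near 100 for most closely related pairs, which is why it carries little information about their genome-wide similarity.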
The value of new genome references.

PubMed

Worley, Kim C; Richards, Stephen; Rogers, Jeffrey

2017-09-15

Genomic information has become a ubiquitous and almost essential aspect of biological research. Over the last 10-15 years, the cost of generating sequence data from DNA or RNA samples has dramatically declined and our ability to interpret those data increased just as remarkably. Although it is still possible for biologists to conduct interesting and valuable research on species for which genomic data are not available, the impact of having access to a high quality whole genome reference assembly for a given species is nothing short of transformational. Research on a species for which we have no DNA or RNA sequence data is restricted in fundamental ways. In contrast, even access to an initial draft quality genome (see below for definitions) opens a wide range of opportunities that are simply not available without that reference genome assembly. Although a complete discussion of the impact of genome sequencing and assembly is beyond the scope of this short paper, the goal of this review is to summarize the most common and highest impact contributions that whole genome sequencing and assembly has had on comparative and evolutionary biology. Copyright © 2016. Published by Elsevier Inc.
Mining the human phenome using allelic scores that index biological intermediates.

PubMed

Evans, David M; Brion, Marie Jo A; Paternoster, Lavinia; Kemp, John P; McMahon, George; Munafò, Marcus; Whitfield, John B; Medland, Sarah E; Montgomery, Grant W; Timpson, Nicholas J; St Pourcain, Beate; Lawlor, Debbie A; Martin, Nicholas G; Dehghan, Abbas; Hirschhorn, Joel; Smith, George Davey

2013-10-01

It is common practice in genome-wide association studies (GWAS) to focus on the relationship between disease risk and genetic variants one marker at a time. When relevant genes are identified it is often possible to implicate biological intermediates and pathways likely to be involved in disease aetiology. However, single genetic variants typically explain small amounts of disease risk. Our idea is to construct allelic scores that explain greater proportions of the variance in biological intermediates, and subsequently use these scores to data mine GWAS. To investigate the approach's properties, we indexed three biological intermediates where the results of large GWAS meta-analyses were available: body mass index, C-reactive protein and low density lipoprotein levels. We generated allelic scores in the Avon Longitudinal Study of Parents and Children, and in publicly available data from the first Wellcome Trust Case Control Consortium. We compared the explanatory ability of allelic scores in terms of their capacity to proxy for the intermediate of interest, and the extent to which they associated with disease. We found that allelic scores derived from known variants and allelic scores derived from hundreds of thousands of genetic markers explained significant portions of the variance in biological intermediates of interest, and many of these scores showed expected correlations with disease. Genome-wide allelic scores however tended to lack specificity suggesting that they should be used with caution and perhaps only to proxy biological intermediates for which there are no known individual variants. Power calculations confirm the feasibility of extending our strategy to the analysis of tens of thousands of molecular phenotypes in large genome-wide meta-analyses. 
We conclude that our method represents a simple way in which potentially tens of thousands of molecular phenotypes could be screened for causal relationships with disease without having to expensively measure these variables in individual disease collections.
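An allelic score of the kind described here is usually a weighted sum of allele counts across variants. The sketch below shows that calculation under stated assumptions: the SNP identifiers, dosages, and weights are hypothetical, and the weights stand in for per-allele effect sizes taken from a prior GWAS of the intermediate.

```python
def allelic_score(dosages, weights):
    """Weighted allelic score for one individual.

    `dosages` maps a SNP identifier to the count of the effect allele
    (0, 1, or 2, or a fractional imputed dosage). `weights` maps the
    same identifiers to per-allele effect sizes. SNPs missing from
    either mapping are skipped.
    """
    shared = dosages.keys() & weights.keys()
    return sum(dosages[snp] * weights[snp] for snp in shared)


# Hypothetical individual and weights, for illustration only.
person = {"snp1": 2, "snp2": 0, "snp3": 1}
effect_sizes = {"snp1": 0.5, "snp2": 0.25, "snp3": 1.0}
score = allelic_score(person, effect_sizes)  # 2*0.5 + 0*0.25 + 1*1.0
```

Setting every weight to 1 gives an unweighted count of effect alleles, a simpler variant of the same score.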
Optimized distributed systems achieve significant performance improvement on sorted merging of massive VCF files.

PubMed

Sun, Xiaobo; Gao, Jingjing; Jin, Peng; Eng, Celeste; Burchard, Esteban G; Beaty, Terri H; Ruczinski, Ingo; Mathias, Rasika A; Barnes, Kathleen; Wang, Fusheng; Qin, Zhaohui S

2018-06-01

Sorted merging of genomic data is a common data operation necessary in many sequencing-based studies. It involves sorting and merging genomic data from different subjects by their genomic locations. In particular, merging a large number of variant call format (VCF) files is frequently required in large-scale whole-genome sequencing or whole-exome sequencing projects. Traditional single-machine based methods become increasingly inefficient when processing large numbers of files due to the excessive computation time and Input/Output bottleneck. Distributed systems and more recent cloud-based systems offer an attractive solution. However, carefully designed and optimized workflow patterns and execution plans (schemas) are required to take full advantage of the increased computing power while overcoming bottlenecks to achieve high performance. In this study, we custom-design optimized schemas for three Apache big data platforms, Hadoop (MapReduce), HBase, and Spark, to perform sorted merging of a large number of VCF files. These schemas all adopt the divide-and-conquer strategy to split the merging job into sequential phases/stages consisting of subtasks that are conquered in an ordered, parallel, and bottleneck-free way. In two illustrating examples, we test the performance of our schemas on merging multiple VCF files into either a single TPED or a single VCF file, which are benchmarked with the traditional single/parallel multiway-merge methods, message passing interface (MPI)-based high-performance computing (HPC) implementation, and the popular VCFTools. Our experiments suggest all three schemas either deliver a significant improvement in efficiency or render much better strong and weak scalabilities over traditional methods. Our findings provide generalized scalable schemas for performing sorted merging on genetics and genomics data using these Apache distributed systems.
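The single-machine multiway merge that the abstract uses as a baseline can be sketched with a heap-based k-way merge. This is not the authors' Hadoop, HBase, or Spark schema. It assumes each input is already sorted by chromosome and position, that lines are tab-separated with chromosome and position in the first two columns, and that the caller supplies the chromosome ordering; it only interleaves records and does not combine genotype columns for records at the same site, as a full VCF merge would.

```python
import heapq


def record_key(line, chrom_order):
    """Sort key (chromosome rank, position) for a tab-separated record."""
    chrom, pos = line.split("\t", 2)[:2]
    return (chrom_order[chrom], int(pos))


def data_lines(lines):
    """Yield only data records, dropping '#' header lines and blanks."""
    for line in lines:
        if line.strip() and not line.startswith("#"):
            yield line


def merge_sorted_records(streams, chrom_order):
    """Lazily merge several position-sorted record streams into one."""
    cleaned = [data_lines(s) for s in streams]
    return heapq.merge(*cleaned, key=lambda ln: record_key(ln, chrom_order))


# Two tiny hypothetical inputs, each already sorted.
file_a = ["##fileformat=VCFv4.2", "1\t100\t.\tA\tG", "2\t50\t.\tC\tT"]
file_b = ["1\t150\t.\tG\tA", "1\t200\t.\tT\tC"]
order = {"1": 0, "2": 1}
merged = list(merge_sorted_records([file_a, file_b], order))
```

Because `heapq.merge` is lazy, memory use stays proportional to the number of input streams, not to the total number of records. Work is still serialized through one process, the kind of bottleneck the distributed schemas aim to remove.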
Genome-Wide Analysis of Transposon and Retroviral Insertions Reveals Preferential Integrations in Regions of DNA Flexibility.

PubMed

Vrljicak, Pavle; Tao, Shijie; Varshney, Gaurav K; Quach, Helen Ngoc Bao; Joshi, Adita; LaFave, Matthew C; Burgess, Shawn M; Sampath, Karuna

2016-04-07

DNA transposons and retroviruses are important transgenic tools for genome engineering. An important consideration affecting the choice of transgenic vector is their insertion site preferences. Previous large-scale analyses of Ds transposon integration sites in plants were done on the basis of reporter gene expression or germ-line transmission, making it difficult to discern vertebrate integration preferences. Here, we compare over 1300 Ds transposon integration sites in zebrafish with Tol2 transposon and retroviral integration sites. Genome-wide analysis shows that Ds integration sites in the presence or absence of marker selection are remarkably similar and distributed throughout the genome. No strict motif was found, but a preference for structural features in the target DNA associated with DNA flexibility (Twist, Tilt, Rise, Roll, Shift, and Slide) was observed. Remarkably, this feature is also found in transposon and retroviral integrations in maize and mouse cells. Our findings show that structural features influence the integration of heterologous DNA in genomes, and have implications for targeted genome engineering. Copyright © 2016 Vrljicak et al.
Integration of Nuclear- and Extranuclear-Initiated Estrogen Receptor Signaling in Breast Cancer Cells

ERIC Educational Resources Information Center

Madak Erdogan, Zeynep

2009-01-01

Estrogenic hormones exert their effects through binding to Estrogen Receptors (ERs), which work in concert with coregulators and extranuclear signaling pathways to control gene expression in normal as well as cancerous states, including breast tumors. In this thesis, we have used multiple genome-wide analysis tools to elucidate various ways that…
Genetic Associations with Plasma B12, B6, and Folate Levels in an Ischemic Stroke Population from the Vitamin Intervention for Stroke Prevention (VISP) Trial.

PubMed

Keene, Keith L; Chen, Wei-Min; Chen, Fang; Williams, Stephen R; Elkhatib, Stacey D; Hsu, Fang-Chi; Mychaleckyj, Josyf C; Doheny, Kimberly F; Pugh, Elizabeth W; Ling, Hua; Laurie, Cathy C; Gogarten, Stephanie M; Madden, Ebony B; Worrall, Bradford B; Sale, Michele M

2014-01-01

B vitamins play an important role in homocysteine metabolism, with vitamin deficiencies resulting in increased levels of homocysteine and increased risk for stroke. We performed a genome-wide association study (GWAS) in 2,100 stroke patients from the Vitamin Intervention for Stroke Prevention (VISP) trial, a clinical trial designed to determine whether the daily intake of high-dose folic acid, vitamins B6, and B12 reduce recurrent cerebral infarction. Extensive quality control (QC) measures resulted in a total of 737,081 SNPs for analysis. Genome-wide association analyses for baseline quantitative measures of folate, Vitamins B12, and B6 were completed using linear regression approaches, implemented in PLINK. Six associations met or exceeded genome-wide significance (P ≤ 5 × 10(-08)). For baseline Vitamin B12, the strongest association was observed with a non-synonymous SNP (nsSNP) located in the CUBN gene (P = 1.76 × 10(-13)). Two additional CUBN intronic SNPs demonstrated strong associations with B12 (P = 2.92 × 10(-10) and 4.11 × 10(-10)), while a second nsSNP, located in the TCN1 gene, also reached genome-wide significance (P = 5.14 × 10(-11)). For baseline measures of Vitamin B6, we identified genome-wide significant associations for SNPs at the ALPL locus (rs1697421; P = 7.06 × 10(-10) and rs1780316; P = 2.25 × 10(-08)). In addition to the six genome-wide significant associations, nine SNPs (two for Vitamin B6, six for Vitamin B12, and one for folate measures) provided suggestive evidence for association (P ≤ 10(-07)). Our GWAS study has identified six genome-wide significant associations, nine suggestive associations, and successfully replicated 5 of 16 SNPs previously reported to be associated with measures of B vitamins. 
The six genome-wide significant associations are located in gene regions that have shown previous associations with measures of B vitamins; however, four of the nine suggestive associations represent novel findings and warrant further investigation in additional populations.
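The two P-value cut-offs quoted in the abstract (P ≤ 5 × 10^-8 for genome-wide significance, P ≤ 10^-7 for suggestive evidence) can be applied with a small helper. The thresholds come from the abstract; the function name and the label for everything else are assumptions made for this sketch.

```python
GENOME_WIDE_THRESHOLD = 5e-8  # from the abstract
SUGGESTIVE_THRESHOLD = 1e-7   # from the abstract


def classify_association(p_value):
    """Label a SNP association by the abstract's two P-value cut-offs."""
    if p_value <= GENOME_WIDE_THRESHOLD:
        return "genome-wide significant"
    if p_value <= SUGGESTIVE_THRESHOLD:
        return "suggestive"
    return "below threshold"  # label chosen for this sketch


# P-values reported in the abstract for CUBN and one ALPL SNP.
labels = [classify_association(p) for p in (1.76e-13, 2.25e-8)]
```

Both reported values fall under the stricter cut-off, consistent with the abstract counting them among the six genome-wide significant associations.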
Whole genome sequencing distinguishes between relapse and reinfection in recurrent leprosy cases

PubMed Central

Bührer-Sékula, Samira; Benjak, Andrej; Loiseau, Chloé; Singh, Pushpendra; Pontes, Maria A. A.; Gonçalves, Heitor S.; Hungria, Emerith M.; Busso, Philippe; Piton, Jérémie; Silveira, Maria I. S.; Cruz, Rossilene; Schetinni, Antônio; Costa, Maurício B.; Virmond, Marcos C. L.; Diorio, Suzana M.; Dias-Baptista, Ida M. F.; Rosa, Patricia S.; Matsuoka, Masanori; Penna, Maria L. F.; Cole, Stewart T.; Penna, Gerson O.

2017-01-01

Background Since leprosy is both treated and controlled by multidrug therapy (MDT) it is important to monitor recurrent cases for drug resistance and to distinguish between relapse and reinfection as a means of assessing therapeutic efficacy. All three objectives can be reached with single nucleotide resolution using next generation sequencing and bioinformatics analysis of Mycobacterium leprae DNA present in human skin. Methodology DNA was isolated by means of optimized extraction and enrichment methods from samples from three recurrent cases in leprosy patients participating in an open-label, randomized, controlled clinical trial of uniform MDT in Brazil (U-MDT/CT-BR). Genome-wide sequencing of M. leprae was performed and the resultant sequence assemblies analyzed in silico. Principal findings In all three cases, no mutations responsible for resistance to rifampicin, dapsone and ofloxacin were found, thus eliminating drug resistance as a possible cause of disease recurrence. However, sequence differences were detected between the strains from the first and second disease episodes in all three patients. In one case, clear evidence was obtained for reinfection with an unrelated strain whereas in the other two cases, relapse appeared more probable. Conclusions/Significance This is the first report of using M. leprae whole genome sequencing to reveal that treated and cured leprosy patients who remain in endemic areas can be reinfected by another strain. Next generation sequencing can be applied reliably to M. leprae DNA extracted from biopsies to discriminate between cases of relapse and reinfection, thereby providing a powerful tool for evaluating different outcomes of therapeutic regimens and for following disease transmission. PMID:28617800
A Genome Wide Genotyping Study To Find Candidate Genes That Influence Varroa-Sensitive Hygiene (VSH)

USDA-ARS's Scientific Manuscript database

Varroa parasitism of honey bees is widely considered by apicultural researchers to be the greatest threat to beekeeping. Varroa-sensitive hygiene (VSH) is one of two identified behaviors that are highly important for controlling the growth of Varroa mite populations in bee hives. Bees exhibiting th...
Genome-wide association study of antisocial personality disorder

PubMed Central

Rautiainen, M-R; Paunio, T; Repo-Tiihonen, E; Virkkunen, M; Ollila, H M; Sulkava, S; Jolanki, O; Palotie, A; Tiihonen, J

2016-01-01

The pathophysiology of antisocial personality disorder (ASPD) remains unclear. Although the most consistent biological finding is reduced grey matter volume in the frontal cortex, about 50% of the total liability to developing ASPD has been attributed to genetic factors. The contributing genes remain largely unknown. Therefore, we sought to study the genetic background of ASPD. We conducted a genome-wide association study (GWAS) and a replication analysis of Finnish criminal offenders fulfilling DSM-IV criteria for ASPD (N=370, N=5850 for controls, GWAS; N=173, N=3766 for controls and replication sample). The GWAS resulted in suggestive associations of two clusters of single-nucleotide polymorphisms at 6p21.2 and at 6p21.32 at the human leukocyte antigen (HLA) region. Imputation of HLA alleles revealed an independent association with DRB1*01:01 (odds ratio (OR)=2.19 (1.53–3.14), P=1.9 × 10⁻⁵). Two polymorphisms at 6p21.2 LINC00951–LRFN2 gene region were replicated in a separate data set, and rs4714329 reached genome-wide significance (OR=1.59 (1.37–1.85), P=1.6 × 10⁻⁹) in the meta-analysis. The risk allele also associated with antisocial features in the general population conditioned for severe problems in childhood family (β=0.68, P=0.012). Functional analysis in brain tissue in open access GTEx and Braineac databases revealed eQTL associations of rs4714329 with LINC00951 and LRFN2 in cerebellum. In humans, LINC00951 and LRFN2 are both expressed in the brain, especially in the frontal cortex, which is intriguing considering the role of the frontal cortex in behavior and the neuroanatomical findings of reduced gray matter volume in ASPD. To our knowledge, this is the first study showing genome-wide significant and replicable findings on genetic variants associated with any personality disorder. PMID:27598967
Genome-wide association study of antisocial personality disorder.

PubMed

Rautiainen, M-R; Paunio, T; Repo-Tiihonen, E; Virkkunen, M; Ollila, H M; Sulkava, S; Jolanki, O; Palotie, A; Tiihonen, J

2016-09-06

The pathophysiology of antisocial personality disorder (ASPD) remains unclear. Although the most consistent biological finding is reduced grey matter volume in the frontal cortex, about 50% of the total liability to developing ASPD has been attributed to genetic factors. The contributing genes remain largely unknown. Therefore, we sought to study the genetic background of ASPD. We conducted a genome-wide association study (GWAS) and a replication analysis of Finnish criminal offenders fulfilling DSM-IV criteria for ASPD (N=370, N=5850 for controls, GWAS; N=173, N=3766 for controls and replication sample). The GWAS resulted in suggestive associations of two clusters of single-nucleotide polymorphisms at 6p21.2 and at 6p21.32 at the human leukocyte antigen (HLA) region. Imputation of HLA alleles revealed an independent association with DRB1*01:01 (odds ratio (OR)=2.19 (1.53-3.14), P=1.9 × 10(-5)). Two polymorphisms at 6p21.2 LINC00951-LRFN2 gene region were replicated in a separate data set, and rs4714329 reached genome-wide significance (OR=1.59 (1.37-1.85), P=1.6 × 10(-9)) in the meta-analysis. The risk allele also associated with antisocial features in the general population conditioned for severe problems in childhood family (β=0.68, P=0.012). Functional analysis in brain tissue in open access GTEx and Braineac databases revealed eQTL associations of rs4714329 with LINC00951 and LRFN2 in cerebellum. In humans, LINC00951 and LRFN2 are both expressed in the brain, especially in the frontal cortex, which is intriguing considering the role of the frontal cortex in behavior and the neuroanatomical findings of reduced gray matter volume in ASPD. To our knowledge, this is the first study showing genome-wide significant and replicable findings on genetic variants associated with any personality disorder.

The three-dimensional architecture of a bacterial genome and its alteration by genetic perturbation.

PubMed

Umbarger, Mark A; Toro, Esteban; Wright, Matthew A; Porreca, Gregory J; Baù, Davide; Hong, Sun-Hae; Fero, Michael J; Zhu, Lihua J; Marti-Renom, Marc A; McAdams, Harley H; Shapiro, Lucy; Dekker, Job; Church, George M

2011-10-21

We have determined the three-dimensional (3D) architecture of the Caulobacter crescentus genome by combining genome-wide chromatin interaction detection, live-cell imaging, and computational modeling. Using chromosome conformation capture carbon copy (5C), we derive ~13 kb resolution 3D models of the Caulobacter genome. The resulting models illustrate that the genome is ellipsoidal with periodically arranged arms. The parS sites, a pair of short contiguous sequence elements known to be involved in chromosome segregation, are positioned at one pole, where they anchor the chromosome to the cell and contribute to the formation of a compact chromatin conformation. Repositioning these elements resulted in rotations of the chromosome that changed the subcellular positions of most genes. Such rotations did not lead to large-scale changes in gene expression, indicating that genome folding does not strongly affect gene regulation. Collectively, our data suggest that genome folding is globally dictated by the parS sites and chromosome segregation. Copyright © 2011 Elsevier Inc. All rights reserved.
Transposable Element Proliferation and Genome Expansion Are Rare in Contemporary Sunflower Hybrid Populations Despite Widespread Transcriptional Activity of LTR Retrotransposons

PubMed Central

Kawakami, Takeshi; Dhakal, Preeti; Katterhenry, Angela N.; Heatherington, Chelsea A.; Ungerer, Mark C.

2011-01-01

Hybridization is a natural phenomenon that has been linked in several organismal groups to transposable element derepression and copy number amplification. A noteworthy example involves three diploid annual sunflower species from North America that have arisen via ancient hybridization between the same two parental taxa, Helianthus annuus and H. petiolaris. The genomes of the hybrid species have undergone large-scale increases in genome size attributable to long terminal repeat (LTR) retrotransposon proliferation. The parental species that gave rise to the hybrid taxa are widely distributed, often sympatric, and contemporary hybridization between them is common. Natural H. annuus × H. petiolaris hybrid populations likely served as source populations from which the hybrid species arose and, as such, represent excellent natural experiments for examining the potential role of hybridization in transposable element derepression and proliferation in this group. In the current report, we examine multiple H. annuus × H. petiolaris hybrid populations for evidence of genome expansion, LTR retrotransposon copy number increases, and LTR retrotransposon transcriptional activity. We demonstrate that genome expansion and LTR retrotransposon proliferation are rare in contemporary hybrid populations, despite independent proliferation events that took place in the genomes of the ancient hybrid species. Interestingly, LTR retrotransposon lineages that proliferated in the hybrid species genomes remain transcriptionally active in hybrid and nonhybrid genotypes across the entire sampling area. The finding of transcriptional activity but not copy number increases in hybrid genotypes suggests that proliferation and genome expansion in contemporary hybrid populations may be mitigated by posttranscriptional mechanisms of repression. PMID:21282712
Heritability and molecular genetic basis of acoustic startle eye blink and affectively modulated startle response: A genome-wide association study

PubMed Central

VAIDYANATHAN, UMA; MALONE, STEPHEN M.; MILLER, MICHAEL B.; McGUE, MATT; IACONO, WILLIAM G.

2014-01-01

Acoustic startle responses have been studied extensively in relation to individual differences and psychopathology. We examined three indices of the blink response in a picture-viewing paradigm—overall startle magnitude across all picture types, and aversive and pleasant modulation scores—in 3,323 twins and parents. Biometric models and molecular genetic analyses showed that half the variance in overall startle was due to additive genetic effects. No single nucleotide polymorphism was genome-wide significant, but GRIK3 did produce a significant effect when examined as part of a candidate gene set. In contrast, emotion modulation scores showed little evidence of heritability in either biometric or molecular genetic analyses. However, in a genome-wide scan, PARP14 did produce a significant effect for aversive modulation. We conclude that, although overall startle retains potential as an endophenotype, emotion-modulated startle does not. PMID:25387708
Three-dimensional reconstruction of single-cell chromosome structure using recurrence plots.

PubMed

Hirata, Yoshito; Oda, Arisa; Ohta, Kunihiro; Aihara, Kazuyuki

2016-10-11

Single-cell analysis of the three-dimensional (3D) chromosome structure can reveal cell-to-cell variability in genome activities. Here, we propose to apply recurrence plots, a mathematical method of nonlinear time series analysis, to reconstruct the 3D chromosome structure of a single cell based on information of chromosomal contacts from genome-wide chromosome conformation capture (Hi-C) data. This recurrence plot-based reconstruction (RPR) method enables rapid reconstruction of a unique structure in single cells, even from incomplete Hi-C information.
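As a minimal illustration of the recurrence-plot idea mentioned above, the Python sketch below builds a binary recurrence matrix from a list of 3D points: entry (i, j) is 1 when points i and j lie within a distance threshold. It shows only the forward definition of a recurrence plot, not the authors' RPR reconstruction from Hi-C contacts. The function name `recurrence_plot`, the threshold `eps`, and the toy coordinates are assumptions made for illustration.

```python
import math


def recurrence_plot(points, eps):
    """Return a binary matrix R with R[i][j] = 1 when points i and j are within eps."""
    n = len(points)
    return [
        [1 if math.dist(points[i], points[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy coordinates (not data from the study).
toy_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
toy_matrix = recurrence_plot(toy_points, eps=1.5)
```

Roughly speaking, a thresholded contact map can be read as a matrix of this kind, and a reconstruction method has to work in the opposite direction, from the matrix back to coordinates.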
Three-dimensional reconstruction of single-cell chromosome structure using recurrence plots

NASA Astrophysics Data System (ADS)

Hirata, Yoshito; Oda, Arisa; Ohta, Kunihiro; Aihara, Kazuyuki

2016-10-01

Single-cell analysis of the three-dimensional (3D) chromosome structure can reveal cell-to-cell variability in genome activities. Here, we propose to apply recurrence plots, a mathematical method of nonlinear time series analysis, to reconstruct the 3D chromosome structure of a single cell based on information of chromosomal contacts from genome-wide chromosome conformation capture (Hi-C) data. This recurrence plot-based reconstruction (RPR) method enables rapid reconstruction of a unique structure in single cells, even from incomplete Hi-C information.
Genome-wide analysis of epistasis in body mass index using multiple human populations.

PubMed

Wei, Wen-Hua; Hemani, Gib; Gyenesei, Attila; Vitart, Veronique; Navarro, Pau; Hayward, Caroline; Cabrera, Claudia P; Huffman, Jennifer E; Knott, Sara A; Hicks, Andrew A; Rudan, Igor; Pramstaller, Peter P; Wild, Sarah H; Wilson, James F; Campbell, Harry; Hastie, Nicholas D; Wright, Alan F; Haley, Chris S

2012-08-01

We surveyed gene-gene interactions (epistasis) in human body mass index (BMI) in four European populations (n<1200) via exhaustive pair-wise genome scans where interactions were computed as F ratios by testing a linear regression model fitting two single-nucleotide polymorphisms (SNPs) with interactions against the one without. Before the association tests, BMI was corrected for sex and age, normalised and adjusted for relatedness. Neither single SNPs nor SNP interactions were genome-wide significant in either cohort based on the consensus threshold (P=5.0E-08) and a Bonferroni corrected threshold (P=1.1E-12), respectively. Next we compared sub genome-wide significant SNP interactions (P<5.0E-08) across cohorts to identify common epistatic signals, where SNPs were annotated to genes to test for gene ontology (GO) enrichment. Among the epistatic genes contributing to the commonly enriched GO terms, 19 were shared across study cohorts of which 15 are previously published genome-wide association loci, including CDH13 (cadherin 13) associated with height and SORCS2 (sortilin-related VPS10 domain containing receptor 2) associated with circulating insulin-like growth factor 1 and binding protein 3. Interactions between the 19 shared epistatic genes and those involving BMI candidate loci (P<5.0E-08) were tested across cohorts and found eight replicated at the SNP level (P<0.05) in at least one cohort, which were further tested and showed limited replication in a separate European population (n>5000). We conclude that genome-wide analysis of epistasis in multiple populations is an effective approach to provide new insights into the genetic regulation of BMI but requires additional efforts to confirm the findings.
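The pairwise test described above compares a regression model with an interaction against one without and summarises the difference as an F ratio. The Python sketch below shows one way to compute such a ratio for a single SNP pair. It assumes additive 0/1/2 genotype coding and a single product term for the interaction, which may differ from the parameterisation the authors used; the function name `interaction_f_ratio` and the simulated data are hypothetical.

```python
import numpy as np


def interaction_f_ratio(y, snp1, snp2):
    """F ratio for adding a snp1*snp2 product term to an additive two-SNP model."""
    y = np.asarray(y, dtype=float)
    g1 = np.asarray(snp1, dtype=float)
    g2 = np.asarray(snp2, dtype=float)
    ones = np.ones_like(y)
    reduced = np.column_stack([ones, g1, g2])        # model without interaction
    full = np.column_stack([ones, g1, g2, g1 * g2])  # model with interaction

    def rss(design):
        # Residual sum of squares of the ordinary least-squares fit.
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        return float(resid @ resid)

    rss_reduced, rss_full = rss(reduced), rss(full)
    df_num = full.shape[1] - reduced.shape[1]  # extra parameters in the full model
    df_den = len(y) - full.shape[1]            # residual degrees of freedom
    return ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)


# Simulated phenotype and genotypes (assumed values, for illustration only).
rng = np.random.default_rng(0)
g1 = rng.integers(0, 3, size=500)
g2 = rng.integers(0, 3, size=500)
noise = rng.normal(size=500)
f_with_interaction = interaction_f_ratio(2.0 * g1 * g2 + noise, g1, g2)
f_without_interaction = interaction_f_ratio(g1 + g2 + noise, g1, g2)
```

In an exhaustive scan this ratio would be computed for every SNP pair, which is why the multiple-testing threshold quoted in the abstract is so much stricter than the single-SNP one.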
Genomic data into everyday work of a medical practitioner - digital tools for decision-making.

PubMed

Jokiranta, Sakari; Hotakainen, Kristina; Salonen, Iiris; Pöllänen, Pasi; Hänninen, Kai-Petri; Forsström, Jari; Kunnamo, Ilkka

Recent technological development has enabled fast and cost-effective simultaneous analyses of several gene variants or sequence of even the whole genome. For medical practitioners this has created challenges although genomic information may be clinically useful in new applications such as finding out individual risk for diseases influenced by as many as 50,000 variable DNA regions or in detecting pharmacogenetic risks prior to prescribing a medicine. New digital tools have paved the way for utilization of genomic data via easy access and clear clinical interpretation for both doctor and patient. In this review we describe some of these tools and applications for clinical use.
A Central Support System Can Facilitate Implementation and Sustainability of a Classroom-Based Undergraduate Research Experience (CURE) in Genomics

PubMed Central

Lopatto, David; Hauser, Charles; Jones, Christopher J.; Paetkau, Don; Chandrasekaran, Vidya; Dunbar, David; MacKinnon, Christy; Stamm, Joyce; Alvarez, Consuelo; Barnard, Daron; Bedard, James E. J.; Bednarski, April E.; Bhalla, Satish; Braverman, John M.; Burg, Martin; Chung, Hui-Min; DeJong, Randall J.; DiAngelo, Justin R.; Du, Chunguang; Eckdahl, Todd T.; Emerson, Julia; Frary, Amy; Frohlich, Donald; Goodman, Anya L.; Gosser, Yuying; Govind, Shubha; Haberman, Adam; Hark, Amy T.; Hoogewerf, Arlene; Johnson, Diana; Kadlec, Lisa; Kaehler, Marian; Key, S. Catherine Silver; Kokan, Nighat P.; Kopp, Olga R.; Kuleck, Gary A.; Lopilato, Jane; Martinez-Cruzado, Juan C.; McNeil, Gerard; Mel, Stephanie; Nagengast, Alexis; Overvoorde, Paul J.; Parrish, Susan; Preuss, Mary L.; Reed, Laura D.; Regisford, E. Gloria; Revie, Dennis; Robic, Srebrenka; Roecklien-Canfield, Jennifer A.; Rosenwald, Anne G.; Rubin, Michael R.; Saville, Kenneth; Schroeder, Stephanie; Sharif, Karim A.; Shaw, Mary; Skuse, Gary; Smith, Christopher D.; Smith, Mary; Smith, Sheryl T.; Spana, Eric P.; Spratt, Mary; Sreenivasan, Aparna; Thompson, Jeffrey S.; Wawersik, Matthew; Wolyniak, Michael J.; Youngblom, James; Zhou, Leming; Buhler, Jeremy; Mardis, Elaine; Leung, Wilson; Threlfall, Jennifer; Elgin, Sarah C. R.

2014-01-01

In their 2012 report, the President's Council of Advisors on Science and Technology advocated “replacing standard science laboratory courses with discovery-based research courses”—a challenging proposition that presents practical and pedagogical difficulties. In this paper, we describe our collective experiences working with the Genomics Education Partnership, a nationwide faculty consortium that aims to provide undergraduates with a research experience in genomics through a scheduled course (a classroom-based undergraduate research experience, or CURE). We examine the common barriers encountered in implementing a CURE, program elements of most value to faculty, ways in which a shared core support system can help, and the incentives for and rewards of establishing a CURE on our diverse campuses. While some of the barriers and rewards are specific to a research project utilizing a genomics approach, other lessons learned should be broadly applicable. We find that a central system that supports a shared investigation can mitigate some shortfalls in campus infrastructure (such as time for new curriculum development, availability of IT services) and provides collegial support for change. Our findings should be useful for designing similar supportive programs to facilitate change in the way we teach science for undergraduates. PMID:25452493
Genome-Wide Microsatellite Characterization and Marker Development in the Sequenced Brassica Crop Species

PubMed Central

Shi, Jiaqin; Huang, Shunmou; Zhan, Jiepeng; Yu, Jingyin; Wang, Xinfa; Hua, Wei; Liu, Shengyi; Liu, Guihua; Wang, Hanzhong

2014-01-01

Although much research has been conducted, the pattern of microsatellite distribution has remained ambiguous, and the development/utilization of microsatellite markers has still been limited/inefficient in Brassica, due to the lack of genome sequences. In view of this, we conducted genome-wide microsatellite characterization and marker development in three recently sequenced Brassica crops: Brassica rapa, Brassica oleracea and Brassica napus. The analysed microsatellite characteristics of these Brassica species were highly similar or almost identical, which suggests that the pattern of microsatellite distribution is likely conservative in Brassica. The genomic distribution of microsatellites was highly non-uniform and positively or negatively correlated with genes or transposable elements, respectively. Of the total of 115 869, 185 662 and 356 522 simple sequence repeat (SSR) markers developed with high frequencies (408.2, 343.8 and 356.2 per Mb or one every 2.45, 2.91 and 2.81 kb, respectively), most represented new SSR markers, the majority had determined physical positions, and a large number were genic or putative single-locus SSR markers. We also constructed a comprehensive database for the newly developed SSR markers, which was integrated with public Brassica SSR markers and annotated genome components. The genome-wide SSR markers developed in this study provide a useful tool to extend the annotated genome resources of sequenced Brassica species to genetic study/breeding in different Brassica species. PMID:24130371
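The two ways of reporting marker frequency in the abstract above (markers per Mb and one marker every so many kb) are reciprocals of each other. The short Python check below reproduces the conversion. It assumes the densities are listed in the same order as the three species, and the names `mean_spacing_kb` and `densities_per_mb` are chosen here for illustration.

```python
def mean_spacing_kb(markers_per_mb):
    """Average distance between markers in kb, given a density in markers per Mb."""
    return 1000.0 / markers_per_mb


# Densities quoted in the abstract; the species order is an assumption.
densities_per_mb = {
    "Brassica rapa": 408.2,
    "Brassica oleracea": 343.8,
    "Brassica napus": 356.2,
}

spacing_kb = {
    species: round(mean_spacing_kb(density), 2)
    for species, density in densities_per_mb.items()
}
```

The rounded results agree with the 2.45, 2.91 and 2.81 kb figures given in the abstract.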
Genome-wide microsatellite characterization and marker development in the sequenced Brassica crop species.

PubMed

Shi, Jiaqin; Huang, Shunmou; Zhan, Jiepeng; Yu, Jingyin; Wang, Xinfa; Hua, Wei; Liu, Shengyi; Liu, Guihua; Wang, Hanzhong

2014-02-01

Although much research has been conducted, the pattern of microsatellite distribution has remained ambiguous, and the development/utilization of microsatellite markers has still been limited/inefficient in Brassica, due to the lack of genome sequences. In view of this, we conducted genome-wide microsatellite characterization and marker development in three recently sequenced Brassica crops: Brassica rapa, Brassica oleracea and Brassica napus. The analysed microsatellite characteristics of these Brassica species were highly similar or almost identical, which suggests that the pattern of microsatellite distribution is likely conservative in Brassica. The genomic distribution of microsatellites was highly non-uniform and positively or negatively correlated with genes or transposable elements, respectively. Of the total of 115 869, 185 662 and 356 522 simple sequence repeat (SSR) markers developed with high frequencies (408.2, 343.8 and 356.2 per Mb or one every 2.45, 2.91 and 2.81 kb, respectively), most represented new SSR markers, the majority had determined physical positions, and a large number were genic or putative single-locus SSR markers. We also constructed a comprehensive database for the newly developed SSR markers, which was integrated with public Brassica SSR markers and annotated genome components. The genome-wide SSR markers developed in this study provide a useful tool to extend the annotated genome resources of sequenced Brassica species to genetic study/breeding in different Brassica species.
Genome-wide and digital polymerase chain reaction epigenetic assessments of alcohol consumption.

PubMed

Philibert, Robert; Dogan, Meesha; Noel, Amanda; Miller, Shelly; Krukow, Brianna; Papworth, Emma; Cowley, Joseph; Knudsen, April; Beach, Steven R H; Black, Donald

2018-04-28

The lack of readily employable biomarkers of alcohol consumption is a problem for clinicians and researchers. In 2014, we published a preliminary DNA methylation signature of heavy alcohol consumption that remits as a function of abstinence. Herein, we present new genome-wide methylation findings from a cohort of additional subjects and a meta-analysis of the data. Using DNA from 47 consecutive heavy drinkers admitted for alcohol detoxification in the context of alcohol treatment and 47 abstinent controls, we replicate the 2014 results and show that 21,221 CpG residues are differentially methylated in active heavy drinkers. Meta-analysis of all data from the 448,058 probes common to the two methylation platforms shows a similarly profound signature with confirmation of findings from other groups. Principal components analyses show that genome-wide methylation changes in response to alcohol consumption load on two major factors with one component accounting for at least 50% of the total variance in both smokers and nonsmoking alcoholics. Using data from the arrays, we derive a panel of five methylation probes that classifies use status with a receiver operator characteristic area under the curve (AUC) of 0.97. Finally, using droplet digital polymerase chain reaction (PCR), we convert these array-based findings to two marker assays with an AUC of 0.95 and a four marker set AUC of 0.98. We conclude that DNA methylation assessments are capable of quantifying alcohol use status and suggest that readily employable digital PCR approaches for substance consumption may find widespread use in alcohol-related research and patient care. © 2018 Wiley Periodicals, Inc.
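For readers unfamiliar with the AUC figures quoted above, the Python sketch below computes the area under the ROC curve as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. It is a generic illustration, not the authors' analysis; the function name `roc_auc` and the scores are hypothetical.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical classifier scores (not data from the study).
example_auc = roc_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
perfect_auc = roc_auc([0.9, 0.8], [0.2, 0.1])
```

An AUC of 1.0 means every case outranks every control, and 0.5 corresponds to chance-level ranking.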
Population Stratification in the Context of Diverse Epidemiologic Surveys Sans Genome-Wide Data

PubMed Central

Oetjens, Matthew T.; Brown-Gentry, Kristin; Goodloe, Robert; Dilks, Holli H.; Crawford, Dana C.

2016-01-01

Population stratification or confounding by genetic ancestry is a potential cause of false associations in genetic association studies. Estimation of and adjustment for genetic ancestry has become common practice thanks in part to the availability of ancestry informative markers on genome-wide association study (GWAS) arrays. While array data is now widespread, these data are not ubiquitous as several large epidemiologic and clinic-based studies lack genome-wide data. One such large epidemiologic-based study lacking genome-wide data accessible to investigators is the National Health and Nutrition Examination Surveys (NHANES), population-based cross-sectional surveys of Americans linked to demographic, health, and lifestyle data conducted by the Centers for Disease Control and Prevention. DNA samples (n = 14,998) were extracted from biospecimens from consented NHANES participants between 1991–1994 (NHANES III, phase 2) and 1999–2002 and represent three major self-identified racial/ethnic groups: non-Hispanic whites (n = 6,634), non-Hispanic blacks (n = 3,458), and Mexican Americans (n = 3,950). We as the Epidemiologic Architecture for Genes Linked to Environment study genotyped candidate gene and GWAS-identified index variants in NHANES as part of the larger Population Architecture using Genomics and Epidemiology I study for collaborative genetic association studies. To enable basic quality control such as estimation of genetic ancestry to control for population stratification in NHANES sans genome-wide data, we outline here strategies that use limited genetic data to identify the markers optimal for characterizing genetic ancestry. From among 411 and 295 autosomal SNPs available in NHANES III and NHANES 1999–2002, we demonstrate that markers with ancestry information can be identified to estimate global ancestry. Despite limited resolution, global genetic ancestry is highly correlated with self-identified race for the majority of participants, although less so for ethnicity. Overall, the strategies outlined here for a large epidemiologic study can be applied to other datasets accessible for genotype–phenotype studies but are sans genome-wide data. PMID:27200085
GWAS meta-analysis and replication identifies three new susceptibility loci for ovarian cancer

PubMed Central

Pharoah, Paul D. P.; Tsai, Ya-Yu; Ramus, Susan J.; Phelan, Catherine M.; Goode, Ellen L.; Lawrenson, Kate; Price, Melissa; Fridley, Brooke L.; Tyrer, Jonathan P.; Shen, Howard; Weber, Rachel; Karevan, Rod; Larson, Melissa C.; Song, Honglin; Tessier, Daniel C.; Bacot, François; Vincent, Daniel; Cunningham, Julie M.; Dennis, Joe; Dicks, Ed; Aben, Katja K.; Anton-Culver, Hoda; Antonenkova, Natalia; Armasu, Sebastian M.; Baglietto, Laura; Bandera, Elisa V.; Beckmann, Matthias W.; Birrer, Michael J.; Bloom, Greg; Bogdanova, Natalia; Brenton, James D.; Brinton, Louise A.; Brooks-Wilson, Angela; Brown, Robert; Butzow, Ralf; Campbell, Ian; Carney, Michael E; Carvalho, Renato S.; Chang-Claude, Jenny; Chen, Y. Anne; Chen, Zhihua; Chow, Wong-Ho; Cicek, Mine S.; Coetzee, Gerhard; Cook, Linda S.; Cramer, Daniel W.; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Despierre, Evelyn; Doherty, Jennifer A; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Edwards, Robert; Ekici, Arif B.; Fasching, Peter A.; Fenstermacher, David; Flanagan, James; Gao, Yu-Tang; Garcia-Closas, Montserrat; Gentry-Maharaj, Aleksandra; Giles, Graham; Gjyshi, Anxhela; Gore, Martin; Gronwald, Jacek; Guo, Qi; Halle, Mari K; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hillemanns, Peter; Hoatlin, Maureen; Høgdall, Estrid; Høgdall, Claus K.; Hosono, Satoyo; Jakubowska, Anna; Jensen, Allan; Kalli, Kimberly R.; Karlan, Beth Y.; Kelemen, Linda E.; Kiemeney, Lambertus A.; Kjaer, Susanne Krüger; Konecny, Gottfried E.; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D.; Lee, Nathan; Lee, Janet; Leminen, Arto; Lim, Boon Kiong; Lissowska, Jolanta; Lubiński, Jan; Lundvall, Lene; Lurie, Galina; Massuger, Leon F.A.G.; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B.; Nakanishi, Toru; Narod, Steven A.; Ness, Roberta B.; Nevanlinna, Heli; Nickels, Stefan; Noushmehr, Houtan; Odunsi, Kunle; Olson, 
Sara; Orlow, Irene; Paul, James; Pejovic, Tanja; Pelttari, Liisa M; Permuth-Wey, Jenny; Pike, Malcolm C; Poole, Elizabeth M; Qu, Xiaotao; Risch, Harvey A.; Rodriguez-Rodriguez, Lorna; Rossing, Mary Anne; Rudolph, Anja; Runnebaum, Ingo; Rzepecka, Iwona K; Salvesen, Helga B.; Schwaab, Ira; Severi, Gianluca; Shen, Hui; Shridhar, Vijayalakshmi; Shu, Xiao-Ou; Sieh, Weiva; Southey, Melissa C.; Spellman, Paul; Tajima, Kazuo; Teo, Soo-Hwang; Terry, Kathryn L.; Thompson, Pamela J; Timorek, Agnieszka; Tworoger, Shelley S.; van Altena, Anne M.; Berg, David Van Den; Vergote, Ignace; Vierkant, Robert A.; Vitonis, Allison F.; Wang-Gohrke, Shan; Wentzensen, Nicolas; Whittemore, Alice S.; Wik, Elisabeth; Winterhoff, Boris; Woo, Yin Ling; Wu, Anna H; Yang, Hannah P.; Zheng, Wei; Ziogas, Argyrios; Zulkifli, Famida; Goodman, Marc T.; Hall, Per; Easton, Douglas F; Pearce, Celeste L; Berchuck, Andrew; Chenevix-Trench, Georgia; Iversen, Edwin; Monteiro, Alvaro N.A.; Gayther, Simon A.; Schildkraut, Joellen M.; Sellers, Thomas A.

2013-01-01

Genome-wide association studies (GWAS) have identified four susceptibility loci for epithelial ovarian cancer (EOC) with another two loci being close to genome-wide significance. We pooled data from a GWAS conducted in North America with another GWAS from the United Kingdom. We selected the top 24,551 SNPs for inclusion on the iCOGS custom genotyping array. Follow-up genotyping was carried out in 18,174 cases and 26,134 controls from 43 studies from the Ovarian Cancer Association Consortium. We validated the two loci at 3q25 and 17q21 previously near genome-wide significance and identified three novel loci associated with risk: two loci associated with all EOC subtypes, at 8q21 (rs11782652, P=5.5×10⁻⁹) and 10p12 (rs1243180; P=1.8×10⁻⁸), and another locus specific to the serous subtype at 17q12 (rs757210; P=8.1×10⁻¹⁰). An integrated molecular analysis of genes and regulatory regions at these loci provided evidence for functional mechanisms underlying susceptibility that implicates CHMP4C in the pathogenesis of ovarian cancer. PMID:23535730
GWAS meta-analysis and replication identifies three new susceptibility loci for ovarian cancer.

PubMed

Pharoah, Paul D P; Tsai, Ya-Yu; Ramus, Susan J; Phelan, Catherine M; Goode, Ellen L; Lawrenson, Kate; Buckley, Melissa; Fridley, Brooke L; Tyrer, Jonathan P; Shen, Howard; Weber, Rachel; Karevan, Rod; Larson, Melissa C; Song, Honglin; Tessier, Daniel C; Bacot, François; Vincent, Daniel; Cunningham, Julie M; Dennis, Joe; Dicks, Ed; Aben, Katja K; Anton-Culver, Hoda; Antonenkova, Natalia; Armasu, Sebastian M; Baglietto, Laura; Bandera, Elisa V; Beckmann, Matthias W; Birrer, Michael J; Bloom, Greg; Bogdanova, Natalia; Brenton, James D; Brinton, Louise A; Brooks-Wilson, Angela; Brown, Robert; Butzow, Ralf; Campbell, Ian; Carney, Michael E; Carvalho, Renato S; Chang-Claude, Jenny; Chen, Y Anne; Chen, Zhihua; Chow, Wong-Ho; Cicek, Mine S; Coetzee, Gerhard; Cook, Linda S; Cramer, Daniel W; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Despierre, Evelyn; Doherty, Jennifer A; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Edwards, Robert; Ekici, Arif B; Fasching, Peter A; Fenstermacher, David; Flanagan, James; Gao, Yu-Tang; Garcia-Closas, Montserrat; Gentry-Maharaj, Aleksandra; Giles, Graham; Gjyshi, Anxhela; Gore, Martin; Gronwald, Jacek; Guo, Qi; Halle, Mari K; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hillemanns, Peter; Hoatlin, Maureen; Høgdall, Estrid; Høgdall, Claus K; Hosono, Satoyo; Jakubowska, Anna; Jensen, Allan; Kalli, Kimberly R; Karlan, Beth Y; Kelemen, Linda E; Kiemeney, Lambertus A; Kjaer, Susanne Krüger; Konecny, Gottfried E; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D; Lee, Nathan; Lee, Janet; Leminen, Arto; Lim, Boon Kiong; Lissowska, Jolanta; Lubiński, Jan; Lundvall, Lene; Lurie, Galina; Massuger, Leon F A G; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B; Nakanishi, Toru; Narod, Steven A; Ness, Roberta B; Nevanlinna, Heli; Nickels, Stefan; Noushmehr, Houtan; Odunsi, Kunle; Olson, Sara; Orlow, Irene; Paul, James; 
Pejovic, Tanja; Pelttari, Liisa M; Permuth-Wey, Jenny; Pike, Malcolm C; Poole, Elizabeth M; Qu, Xiaotao; Risch, Harvey A; Rodriguez-Rodriguez, Lorna; Rossing, Mary Anne; Rudolph, Anja; Runnebaum, Ingo; Rzepecka, Iwona K; Salvesen, Helga B; Schwaab, Ira; Severi, Gianluca; Shen, Hui; Shridhar, Vijayalakshmi; Shu, Xiao-Ou; Sieh, Weiva; Southey, Melissa C; Spellman, Paul; Tajima, Kazuo; Teo, Soo-Hwang; Terry, Kathryn L; Thompson, Pamela J; Timorek, Agnieszka; Tworoger, Shelley S; van Altena, Anne M; van den Berg, David; Vergote, Ignace; Vierkant, Robert A; Vitonis, Allison F; Wang-Gohrke, Shan; Wentzensen, Nicolas; Whittemore, Alice S; Wik, Elisabeth; Winterhoff, Boris; Woo, Yin Ling; Wu, Anna H; Yang, Hannah P; Zheng, Wei; Ziogas, Argyrios; Zulkifli, Famida; Goodman, Marc T; Hall, Per; Easton, Douglas F; Pearce, Celeste L; Berchuck, Andrew; Chenevix-Trench, Georgia; Iversen, Edwin; Monteiro, Alvaro N A; Gayther, Simon A; Schildkraut, Joellen M; Sellers, Thomas A

2013-04-01

Genome-wide association studies (GWAS) have identified four susceptibility loci for epithelial ovarian cancer (EOC), with another two suggestive loci reaching near genome-wide significance. We pooled data from a GWAS conducted in North America with another GWAS from the UK. We selected the top 24,551 SNPs for inclusion on the iCOGS custom genotyping array. We performed follow-up genotyping in 18,174 individuals with EOC (cases) and 26,134 controls from 43 studies from the Ovarian Cancer Association Consortium. We validated the two loci at 3q25 and 17q21 that were previously found to have associations close to genome-wide significance and identified three loci newly associated with risk: two loci associated with all EOC subtypes at 8q21 (rs11782652, P = 5.5 × 10(-9)) and 10p12 (rs1243180, P = 1.8 × 10(-8)) and another locus specific to the serous subtype at 17q12 (rs757210, P = 8.1 × 10(-10)). An integrated molecular analysis of genes and regulatory regions at these loci provided evidence for functional mechanisms underlying susceptibility and implicated CHMP4C in the pathogenesis of ovarian cancer.
Identification of a unique library of complex, but ordered, arrays of repetitive elements in the human genome and implication of their potential involvement in pathobiology.

PubMed

Lee, Kang-Hoon; Lee, Young-Kwan; Kwon, Deug-Nam; Chiu, Sophia; Chew, Victoria; Rah, Hyungchul; Kujawski, Gregory; Melhem, Ramzi; Hsu, Karen; Chung, Cecilia; Greenhalgh, David G; Cho, Kiho

2011-06-01

Approximately 2% of the human genome is reported to be occupied by genes. Various forms of repetitive elements (REs), both characterized and uncharacterized, are presumed to make up the vast majority of the rest of the genomes of human and other species. In conjunction with a comprehensive annotation of genes, information regarding components of genome biology, such as gene polymorphisms, non-coding RNAs, and certain REs, is found in human genome databases. However, the genome-wide profile of unique RE arrangements formed by different groups of REs has not been fully characterized yet. In this study, the entire human genome was subjected to an unbiased RE survey to establish a whole-genome profile of REs and their arrangements. Due to the limitation in query size within the bl2seq alignment program (National Center for Biotechnology Information [NCBI]) utilized for the RE survey, the entire NCBI reference human genome was fragmented into 6206 units of 0.5M nucleotides. A number of RE arrangements with varying complexities and patterns were identified throughout the genome. Each chromosome had unique profiles of RE arrangements and density, and high levels of RE density were measured near the centromere regions. Subsequently, 175 complex RE arrangements, which were selected throughout the genome, were subjected to a comparison analysis using five different human genome sequences. Interestingly, three of the five human genome databases shared the exactly same arrangement patterns and sequences for all 175 RE arrangement regions (a total of 12,765,625 nucleotides). The findings from this study demonstrate that a substantial fraction of REs in the human genome are clustered into various forms of ordered structures. Further investigations are needed to examine whether some of these ordered RE arrangements contribute to the human pathobiology as a functional genome unit. Copyright © 2011 Elsevier Inc. All rights reserved.
Genome-Wide and Gene-Based Meta-Analyses Identify Novel Loci Influencing Blood Pressure Response to Hydrochlorothiazide.

PubMed

Salvi, Erika; Wang, Zhiying; Rizzi, Federica; Gong, Yan; McDonough, Caitrin W; Padmanabhan, Sandosh; Hiltunen, Timo P; Lanzani, Chiara; Zaninello, Roberta; Chittani, Martina; Bailey, Kent R; Sarin, Antti-Pekka; Barcella, Matteo; Melander, Olle; Chapman, Arlene B; Manunta, Paolo; Kontula, Kimmo K; Glorioso, Nicola; Cusi, Daniele; Dominiczak, Anna F; Johnson, Julie A; Barlassina, Cristina; Boerwinkle, Eric; Cooper-DeHoff, Rhonda M; Turner, Stephen T

2017-01-01

This study aimed to identify novel loci influencing the antihypertensive response to hydrochlorothiazide monotherapy. A genome-wide meta-analysis of blood pressure (BP) response to hydrochlorothiazide was performed in 1739 white hypertensives from 6 clinical trials within the International Consortium for Antihypertensive Pharmacogenomics Studies, making it the largest study to date of its kind. No signals reached genome-wide significance (P<5×10⁻⁸), and the suggestive regions (P<10⁻⁵) were cross-validated in 2 black cohorts treated with hydrochlorothiazide. In addition, a gene-based analysis was performed on candidate genes with previous evidence of involvement in diuretic response, in BP regulation, or in hypertension susceptibility. Using the genome-wide meta-analysis approach, with validation in blacks, we identified 2 suggestive regulatory regions linked to gap junction protein α1 gene (GJA1) and forkhead box A1 gene (FOXA1), relevant for cardiovascular and kidney function. With the gene-based approach, we identified hydroxy-delta-5-steroid dehydrogenase, 3 β- and steroid δ-isomerase 1 gene (HSD3B1) as significantly associated with BP response (P<2.28×10⁻⁴). HSD3B1 encodes the 3β-hydroxysteroid dehydrogenase enzyme and plays a crucial role in the biosynthesis of aldosterone and endogenous ouabain. By amassing all of the available pharmacogenomic studies of BP response to hydrochlorothiazide, and using 2 different analytic approaches, we identified 3 novel loci influencing BP response to hydrochlorothiazide. The gene-based analysis, never before applied to pharmacogenomics of antihypertensive drugs to our knowledge, provided a powerful strategy to identify a locus of interest, which was not identified in the genome-wide meta-analysis because of high allelic heterogeneity. These data pave the way for future investigations on new pathways and drug targets to enhance the current understanding of personalized antihypertensive treatment. © 2016 American Heart Association, Inc.
Mouse Genome Informatics (MGI) Resource: Genetic, Genomic, and Biological Knowledgebase for the Laboratory Mouse.

PubMed

Eppig, Janan T

2017-07-01

The Mouse Genome Informatics (MGI) Resource supports basic, translational, and computational research by providing high-quality, integrated data on the genetics, genomics, and biology of the laboratory mouse. MGI serves a strategic role for the scientific community in facilitating biomedical, experimental, and computational studies investigating the genetics and processes of diseases and enabling the development and testing of new disease models and therapeutic interventions. This review describes the nexus of the body of growing genetic and biological data and the advances in computer technology in the late 1980s, including the World Wide Web, that together launched the beginnings of MGI. MGI develops and maintains a gold-standard resource that reflects the current state of knowledge, provides semantic and contextual data integration that fosters hypothesis testing, continually develops new and improved tools for searching and analysis, and partners with the scientific community to assure research data needs are met. Here we describe one slice of MGI relating to the development of community-wide large-scale mutagenesis and phenotyping projects and introduce ways to access and use these MGI data. References and links to additional MGI aspects are provided. © The Author 2017. Published by Oxford University Press.
Mouse Genome Informatics (MGI) Resource: Genetic, Genomic, and Biological Knowledgebase for the Laboratory Mouse

PubMed Central

Eppig, Janan T.

2017-01-01

The Mouse Genome Informatics (MGI) Resource supports basic, translational, and computational research by providing high-quality, integrated data on the genetics, genomics, and biology of the laboratory mouse. MGI serves a strategic role for the scientific community in facilitating biomedical, experimental, and computational studies investigating the genetics and processes of diseases and enabling the development and testing of new disease models and therapeutic interventions. This review describes the nexus of the body of growing genetic and biological data and the advances in computer technology in the late 1980s, including the World Wide Web, that together launched the beginnings of MGI. MGI develops and maintains a gold-standard resource that reflects the current state of knowledge, provides semantic and contextual data integration that fosters hypothesis testing, continually develops new and improved tools for searching and analysis, and partners with the scientific community to assure research data needs are met. Here we describe one slice of MGI relating to the development of community-wide large-scale mutagenesis and phenotyping projects and introduce ways to access and use these MGI data. References and links to additional MGI aspects are provided. PMID:28838066
Sniffing out significant “Pee values”: genome wide association study of asparagus anosmia

PubMed Central

Markt, Sarah C; Nuttall, Elizabeth; Turman, Constance; Sinnott, Jennifer; Rimm, Eric B; Ecsedy, Ethan; Unger, Robert H; Fall, Katja; Finn, Stephen; Jensen, Majken K; Rider, Jennifer R; Kraft, Peter

2016-01-01

Objective: To determine the inherited factors associated with the ability to smell asparagus metabolites in urine. Design: Genome wide association study. Setting: Nurses’ Health Study and Health Professionals Follow-up Study cohorts. Participants: 6909 men and women of European-American descent with available genetic data from genome wide association studies. Main outcome measure: Participants were characterized as asparagus smellers if they strongly agreed with the prompt “after eating asparagus, you notice a strong characteristic odor in your urine,” and anosmic if otherwise. We calculated per-allele estimates of asparagus anosmia for about nine million single nucleotide polymorphisms using logistic regression. P values <5×10⁻⁸ were considered as genome wide significant. Results: 58.0% of men (n=1449/2500) and 61.5% of women (n=2712/4409) had anosmia. 871 single nucleotide polymorphisms reached genome wide significance for asparagus anosmia, all in a region on chromosome 1 (1q44: 248139851-248595299) containing multiple genes in the olfactory receptor 2 (OR2) family. Conditional analyses revealed three independent markers associated with asparagus anosmia: rs13373863, rs71538191, and rs6689553. Conclusion: A large proportion of people have asparagus anosmia. Genetic variation near multiple olfactory receptor genes is associated with the ability of an individual to smell the metabolites of asparagus in urine. Future replication studies are necessary before considering targeted therapies to help anosmic people discover what they are missing. PMID:27965198
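The percentages and the significance threshold reported in the abstract above can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the abstract. Reading 5×10⁻⁸ as 0.05 divided over roughly one million independent tests is the conventional rationale for that threshold and is an assumption here; the abstract itself does not state it.

```python
# Counts reported in the abstract: (anosmic, total) by sex.
counts = {"men": (1449, 2500), "women": (2712, 4409)}


def percent(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


anosmia_pct = {sex: percent(n, total) for sex, (n, total) in counts.items()}
total_participants = sum(total for _, total in counts.values())

# Conventional genome-wide threshold: 0.05 spread over ~1e6 independent tests.
# This rationale is an assumption, not a detail given in the abstract.
ALPHA = 0.05
INDEPENDENT_TESTS = 1_000_000
genome_wide_threshold = ALPHA / INDEPENDENT_TESTS


def is_genome_wide_significant(p_value, threshold=genome_wide_threshold):
    """True when a p-value falls strictly below the genome-wide threshold."""
    return p_value < threshold


print(anosmia_pct, total_participants, genome_wide_threshold)
```

The two sex-specific totals sum to the 6909 participants stated in the abstract, which is a quick consistency check on the reported denominators.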
Genome-wide and gene-based association implicates FRMD6 in Alzheimer disease.

PubMed

Hong, Mun-Gwan; Reynolds, Chandra A; Feldman, Adina L; Kallin, Mikael; Lambert, Jean-Charles; Amouyel, Philippe; Ingelsson, Erik; Pedersen, Nancy L; Prince, Jonathan A

2012-03-01

Genome-wide association studies (GWAS) that allow for allelic heterogeneity may facilitate the discovery of novel genes not detectable by models that require replication of a single variant site. One strategy to accomplish this is to focus on genes rather than markers as units of association, and so potentially capture a spectrum of causal alleles that differ across populations. Here, we conducted a GWAS of Alzheimer disease (AD) in 2,586 Swedes and performed gene-based meta-analysis with three additional studies from France, Canada, and the United States, in total encompassing 4,259 cases and 8,284 controls. Implementing a newly designed gene-based algorithm, we identified two loci apart from the region around APOE that achieved study-wide significance in combined samples, the strongest finding being for FRMD6 on chromosome 14q (P = 2.6 × 10(-14)) and a weaker signal for NARS2 that is immediately adjacent to GAB2 on chromosome 11q (P = 7.8 × 10(-9)). Ontology-based pathway analyses revealed significant enrichment of genes involved in glycosylation. Results suggest that gene-based approaches that accommodate allelic heterogeneity in GWAS can provide a complementary avenue for gene discovery and may help to explain a portion of the missing heritability not detectable with single nucleotide polymorphisms (SNPs) derived from marker-specific meta-analysis. © 2011 Wiley Periodicals, Inc.

Individualized cattle copy number and segmental duplication maps using next generation sequencing

USDA-ARS's Scientific Manuscript database

Copy Number Variations (CNVs) affect a wide range of phenotypic traits; however, CNVs in or near segmental duplication regions are often intractable. Using a read depth approach based on next generation sequencing, we examined genome-wide copy number differences among five taurine (three Angus, one ...
Genome-Wide Association Study of the Genetic Determinants of Emphysema Distribution

PubMed Central

Boueiz, Adel; Lutz, Sharon M.; Cho, Michael H.; Hersh, Craig P.; Bowler, Russell P.; Washko, George R.; Halper-Stromberg, Eitan; Bakke, Per; Gulsvik, Amund; Laird, Nan M.; Beaty, Terri H.; Coxson, Harvey O.; Crapo, James D.; Silverman, Edwin K.; Castaldi, Peter J.

2017-01-01

Rationale: Emphysema has considerable variability in the severity and distribution of parenchymal destruction throughout the lungs. Upper lobe–predominant emphysema has emerged as an important predictor of response to lung volume reduction surgery. Yet, aside from alpha-1 antitrypsin deficiency, the genetic determinants of emphysema distribution remain largely unknown. Objectives: To identify the genetic influences of emphysema distribution in non–alpha-1 antitrypsin–deficient smokers. Methods: A total of 11,532 subjects with complete genotype and computed tomography densitometry data in the COPDGene (Genetic Epidemiology of Chronic Obstructive Pulmonary Disease [COPD]; non-Hispanic white and African American), ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints), and GenKOLS (Genetics of Chronic Obstructive Lung Disease) studies were analyzed. Two computed tomography scan emphysema distribution measures (difference between upper-third and lower-third emphysema; ratio of upper-third to lower-third emphysema) were tested for genetic associations in all study subjects. Separate analyses in each study population were followed by a fixed effect metaanalysis. Single-nucleotide polymorphism–, gene-, and pathway-based approaches were used. In silico functional evaluation was also performed. Measurements and Main Results: We identified five loci associated with emphysema distribution at genome-wide significance. These loci included two previously reported associations with COPD susceptibility (4q31 near HHIP and 15q25 near CHRNA5) and three new associations near SOWAHB, TRAPPC9, and KIAA1462. Gene set analysis and in silico functional evaluation revealed pathways and cell types that may potentially contribute to the pathogenesis of emphysema distribution. Conclusions: This multicohort genome-wide association study identified new genomic loci associated with differential emphysematous destruction throughout the lungs. 
These findings may point to new biologic pathways on which to expand diagnostic and therapeutic approaches in chronic obstructive pulmonary disease. Clinical trial registered with www.clinicaltrials.gov (NCT 00608764). PMID:27669027
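The abstract above describes per-cohort analyses combined by a fixed effect metaanalysis. A minimal sketch of that step follows, assuming the standard inverse-variance-weighted estimator; the abstract does not say which software or weighting the authors used, and the per-cohort values below are hypothetical.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance-weighted fixed-effect meta-analysis.

    Returns the pooled effect, its standard error, the z statistic,
    and a two-sided p-value from the standard normal distribution.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return pooled_beta, pooled_se, z, p_value


# Hypothetical per-cohort estimates for one SNP (illustrative values only,
# not taken from the COPDGene, ECLIPSE, or GenKOLS results).
cohort_betas = [0.10, 0.14, 0.08]
cohort_ses = [0.04, 0.05, 0.06]

beta, se, z, p = fixed_effect_meta(cohort_betas, cohort_ses)
print(f"pooled beta={beta:.4f}, se={se:.4f}, z={z:.2f}, p={p:.2e}")
```

Cohorts with smaller standard errors receive larger weights, so the pooled estimate is pulled toward the most precisely estimated cohort.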
Phylogeographic and population genetic analyses reveal multiple species of Boa and independent origins of insular dwarfism.

PubMed

Card, Daren C; Schield, Drew R; Adams, Richard H; Corbin, Andrew B; Perry, Blair W; Andrew, Audra L; Pasquesi, Giulia I M; Smith, Eric N; Jezkova, Tereza; Boback, Scott M; Booth, Warren; Castoe, Todd A

2016-09-01

Boa is a Neotropical genus of snakes historically recognized as monotypic despite its expansive distribution. The distinct morphological traits and color patterns exhibited by these snakes, together with the wide diversity of ecosystems they inhabit, collectively suggest that the genus may represent multiple species. Morphological variation within Boa also includes instances of dwarfism observed in multiple offshore island populations. Despite this substantial diversity, the systematics of the genus Boa has received little attention until very recently. In this study we examined the genetic structure and phylogenetic relationships of Boa populations using mitochondrial sequences and genome-wide SNP data obtained from RADseq. We analyzed these data at multiple geographic scales using a combination of phylogenetic inference (including coalescent-based species delimitation) and population genetic analyses. We identified extensive population structure across the range of the genus Boa and multiple lines of evidence for three widely-distributed clades roughly corresponding with the three primary land masses of the Western Hemisphere. We also find both mitochondrial and nuclear support for independent origins and parallel evolution of dwarfism on offshore island clusters in Belize and Cayos Cochinos Menor, Honduras. Copyright © 2016 Elsevier Inc. All rights reserved.
Screening of duplicated loci reveals hidden divergence patterns in a complex salmonid genome

USGS Publications Warehouse

Limborg, Morten T.; Larson, Wesley; Seeb, Lisa W.; Seeb, James E.

2017-01-01

A whole-genome duplication (WGD) doubles the entire genomic content of a species and is thought to have catalysed adaptive radiation in some polyploid-origin lineages. However, little is known about general consequences of a WGD because gene duplicates (i.e., paralogs) are commonly filtered in genomic studies; such filtering may remove substantial portions of the genome in data sets from polyploid-origin species. We demonstrate a new method that enables genome-wide scans for signatures of selection at both nonduplicated and duplicated loci by taking locus-specific copy number into account. We apply this method to RAD sequence data from different ecotypes of a polyploid-origin salmonid (Oncorhynchus nerka) and reveal signatures of divergent selection that would have been missed if duplicated loci were filtered. We also find conserved signatures of elevated divergence at pairs of homeologous chromosomes with residual tetrasomic inheritance, suggesting that joint evolution of some nondiverged gene duplicates may affect the adaptive potential of these genes. These findings illustrate that including duplicated loci in genomic analyses enables novel insights into the evolutionary consequences of WGDs and local segmental gene duplications.
CRISPR/Cas9-mediated gene targeting in Arabidopsis using sequential transformation.

PubMed

Miki, Daisuke; Zhang, Wenxin; Zeng, Wenjie; Feng, Zhengyan; Zhu, Jian-Kang

2018-05-17

Homologous recombination-based gene targeting is a powerful tool for precise genome modification and has been widely used in organisms ranging from yeast to higher organisms such as Drosophila and mouse. However, gene targeting in higher plants, including the most widely used model plant Arabidopsis thaliana, remains challenging. Here we report a sequential transformation method for gene targeting in Arabidopsis. We find that parental lines expressing the bacterial endonuclease Cas9 from the egg cell- and early embryo-specific DD45 gene promoter can improve the frequency of single-guide RNA-targeted gene knock-ins and sequence replacements via homologous recombination at several endogenous sites in the Arabidopsis genome. These heritable gene targeting events can be identified by regular PCR. Our approach enables routine and fine manipulation of the Arabidopsis genome.
Development of a panel of genome-wide ancestry informative markers to study admixture throughout the Americas.

PubMed

Galanter, Joshua Mark; Fernandez-Lopez, Juan Carlos; Gignoux, Christopher R; Barnholtz-Sloan, Jill; Fernandez-Rozadilla, Ceres; Via, Marc; Hidalgo-Miranda, Alfredo; Contreras, Alejandra V; Figueroa, Laura Uribe; Raska, Paola; Jimenez-Sanchez, Gerardo; Zolezzi, Irma Silva; Torres, Maria; Ponte, Clara Ruiz; Ruiz, Yarimar; Salas, Antonio; Nguyen, Elizabeth; Eng, Celeste; Borjas, Lisbeth; Zabala, William; Barreto, Guillermo; González, Fernando Rondón; Ibarra, Adriana; Taboada, Patricia; Porras, Liliana; Moreno, Fabián; Bigham, Abigail; Gutierrez, Gerardo; Brutsaert, Tom; León-Velarde, Fabiola; Moore, Lorna G; Vargas, Enrique; Cruz, Miguel; Escobedo, Jorge; Rodriguez-Santana, José; Rodriguez-Cintrón, William; Chapela, Rocio; Ford, Jean G; Bustamante, Carlos; Seminara, Daniela; Shriver, Mark; Ziv, Elad; Burchard, Esteban Gonzalez; Haile, Robert; Parra, Esteban; Carracedo, Angel

2012-01-01

Most individuals throughout the Americas are admixed descendants of Native American, European, and African ancestors. Complex historical factors have resulted in varying proportions of ancestral contributions between individuals within and among ethnic groups. We developed a panel of 446 ancestry informative markers (AIMs) optimized to estimate ancestral proportions in individuals and populations throughout Latin America. We used genome-wide data from 953 individuals from diverse African, European, and Native American populations to select AIMs optimized for each of the three main continental populations that form the basis of modern Latin American populations. We selected markers on the basis of locus-specific branch length to be informative, well distributed throughout the genome, capable of being genotyped on widely available commercial platforms, and applicable throughout the Americas by minimizing within-continent heterogeneity. We then validated the panel in samples from four admixed populations by comparing ancestry estimates based on the AIMs panel to estimates based on genome-wide association study (GWAS) data. The panel provided balanced discriminatory power among the three ancestral populations and accurate estimates of individual ancestry proportions (R² > 0.9 for ancestral components with significant between-subject variance). Finally, we genotyped samples from 18 populations from Latin America using the AIMs panel and estimated variability in ancestry within and between these populations. This panel and its reference genotype information will be useful resources to explore population history of admixture in Latin America and to correct for the potential effects of population stratification in admixed samples in the region.
Development of a Panel of Genome-Wide Ancestry Informative Markers to Study Admixture Throughout the Americas

PubMed Central

Galanter, Joshua Mark; Fernandez-Lopez, Juan Carlos; Gignoux, Christopher R.; Barnholtz-Sloan, Jill; Fernandez-Rozadilla, Ceres; Via, Marc; Hidalgo-Miranda, Alfredo; Contreras, Alejandra V.; Figueroa, Laura Uribe; Raska, Paola; Jimenez-Sanchez, Gerardo; Silva Zolezzi, Irma; Torres, Maria; Ponte, Clara Ruiz; Ruiz, Yarimar; Salas, Antonio; Nguyen, Elizabeth; Eng, Celeste; Borjas, Lisbeth; Zabala, William; Barreto, Guillermo; Rondón González, Fernando; Ibarra, Adriana; Taboada, Patricia; Porras, Liliana; Moreno, Fabián; Bigham, Abigail; Gutierrez, Gerardo; Brutsaert, Tom; León-Velarde, Fabiola; Moore, Lorna G.; Vargas, Enrique; Cruz, Miguel; Escobedo, Jorge; Rodriguez-Santana, José; Rodriguez-Cintrón, William; Chapela, Rocio; Ford, Jean G.; Bustamante, Carlos; Seminara, Daniela; Shriver, Mark; Ziv, Elad; Gonzalez Burchard, Esteban; Haile, Robert

2012-01-01

Most individuals throughout the Americas are admixed descendants of Native American, European, and African ancestors. Complex historical factors have resulted in varying proportions of ancestral contributions between individuals within and among ethnic groups. We developed a panel of 446 ancestry informative markers (AIMs) optimized to estimate ancestral proportions in individuals and populations throughout Latin America. We used genome-wide data from 953 individuals from diverse African, European, and Native American populations to select AIMs optimized for each of the three main continental populations that form the basis of modern Latin American populations. We selected markers on the basis of locus-specific branch length to be informative, well distributed throughout the genome, capable of being genotyped on widely available commercial platforms, and applicable throughout the Americas by minimizing within-continent heterogeneity. We then validated the panel in samples from four admixed populations by comparing ancestry estimates based on the AIMs panel to estimates based on genome-wide association study (GWAS) data. The panel provided balanced discriminatory power among the three ancestral populations and accurate estimates of individual ancestry proportions (R² > 0.9 for ancestral components with significant between-subject variance). Finally, we genotyped samples from 18 populations from Latin America using the AIMs panel and estimated variability in ancestry within and between these populations. This panel and its reference genotype information will be useful resources to explore population history of admixture in Latin America and to correct for the potential effects of population stratification in admixed samples in the region. PMID:22412386
Analysis of genome-wide copy number variations in Chinese indigenous and western pig breeds by 60 K SNP genotyping arrays.

PubMed

Wang, Yanan; Tang, Zhonglin; Sun, Yaqi; Wang, Hongyang; Wang, Chao; Yu, Shaobo; Liu, Jing; Zhang, Yu; Fan, Bin; Li, Kui; Liu, Bang

2014-01-01

Copy number variations (CNVs) represent a substantial source of structural variants in mammals and contribute to both normal phenotypic variability and disease susceptibility. Although low-resolution CNV maps have been produced for many domestic animals, and several reports have been published about CNVs in the porcine genome, the differences between Chinese and western pigs remain to be elucidated. In this study, we used the Porcine SNP60 BeadChip and the PennCNV algorithm to perform genome-wide CNV detection in 302 individuals from six Chinese indigenous breeds (Tongcheng, Laiwu, Luchuan, Bama, Wuzhishan and Ningxiang pigs), three western breeds (Yorkshire, Landrace and Duroc) and one hybrid (Tongcheng×Duroc). A total of 348 CNV regions (CNVRs) across the genome were identified, covering 150.49 Mb of the pig genome or 6.14% of the autosomal genome sequence. Of these CNVRs, 213 were found only in the six Chinese indigenous breeds, and 60 only in the three western breeds. The characteristics of CNVs in four Chinese normal size breeds (Luchuan, Tongcheng and Laiwu pigs) and two minipig breeds (Bama and Wuzhishan pigs) were also analyzed in this study. Functional annotation suggested that these CNVRs possess a great variety of molecular functions and may play important roles in phenotypic and production trait differences between Chinese and western breeds. Our results are an important complement to the CNV map of the pig genome; they provide new information about the diversity of Chinese and western pig breeds and facilitate further research on porcine genome CNVs.
Analysis of Genome-Wide Copy Number Variations in Chinese Indigenous and Western Pig Breeds by 60 K SNP Genotyping Arrays

PubMed Central

Sun, Yaqi; Wang, Hongyang; Wang, Chao; Yu, Shaobo; Liu, Jing; Zhang, Yu; Fan, Bin; Li, Kui; Liu, Bang

2014-01-01

Copy number variations (CNVs) represent a substantial source of structural variants in mammals and contribute to both normal phenotypic variability and disease susceptibility. Although low-resolution CNV maps have been produced for many domestic animals, and several reports have been published about CNVs in the porcine genome, the differences between Chinese and western pigs remain to be elucidated. In this study, we used the Porcine SNP60 BeadChip and the PennCNV algorithm to perform genome-wide CNV detection in 302 individuals from six Chinese indigenous breeds (Tongcheng, Laiwu, Luchuan, Bama, Wuzhishan and Ningxiang pigs), three western breeds (Yorkshire, Landrace and Duroc) and one hybrid (Tongcheng×Duroc). A total of 348 CNV regions (CNVRs) across the genome were identified, covering 150.49 Mb of the pig genome or 6.14% of the autosomal genome sequence. Of these CNVRs, 213 were found only in the six Chinese indigenous breeds, and 60 only in the three western breeds. The characteristics of CNVs in four Chinese normal size breeds (Luchuan, Tongcheng and Laiwu pigs) and two minipig breeds (Bama and Wuzhishan pigs) were also analyzed in this study. Functional annotation suggested that these CNVRs possess a great variety of molecular functions and may play important roles in phenotypic and production trait differences between Chinese and western breeds. Our results are an important complement to the CNV map of the pig genome; they provide new information about the diversity of Chinese and western pig breeds and facilitate further research on porcine genome CNVs. PMID:25198154
Pervasive, Genome-Wide Transcription in the Organelle Genomes of Diverse Plastid-Bearing Protists.

PubMed

Sanitá Lima, Matheus; Smith, David Roy

2017-11-06

Organelle genomes are among the most sequenced kinds of chromosome. This is largely because they are small and widely used in molecular studies, but also because next-generation sequencing technologies made sequencing easier, faster, and cheaper. However, studies of organelle RNA have not kept pace with those of DNA, despite huge amounts of freely available eukaryotic RNA-sequencing (RNA-seq) data. Little is known about organelle transcription in nonmodel species, and most of the available eukaryotic RNA-seq data have not been mined for organelle transcripts. Here, we use publicly available RNA-seq experiments to investigate organelle transcription in 30 diverse plastid-bearing protists with varying organelle genomic architectures. Mapping RNA-seq data to organelle genomes revealed pervasive, genome-wide transcription, regardless of the taxonomic grouping, gene organization, or noncoding content. For every species analyzed, transcripts covered ≥85% of the mitochondrial and/or plastid genomes (all of which were ≤105 kb), indicating that most of the organelle DNA, coding and noncoding, is transcriptionally active. These results follow earlier studies of model species showing that organellar transcription is coupled and ubiquitous across the genome, requiring significant downstream processing of polycistronic transcripts. Our findings suggest that noncoding organelle DNA can be transcriptionally active, raising questions about the underlying function of these transcripts and underscoring the utility of publicly available RNA-seq data for recovering complete genome sequences. If pervasive transcription is also found in bigger organelle genomes (>105 kb) and across a broader range of eukaryotes, this could indicate that noncoding organelle RNAs are regulating fundamental processes within eukaryotic cells. Copyright © 2017 Sanitá Lima and Smith.
Transcription regulation by distal enhancers: who's in the loop?

PubMed

Stadhouders, Ralph; van den Heuvel, Anita; Kolovos, Petros; Jorna, Ruud; Leslie, Kris; Grosveld, Frank; Soler, Eric

2012-01-01

Genome-wide chromatin profiling efforts have shown that enhancers are often located at large distances from gene promoters within the noncoding genome. Whereas enhancers can stimulate transcription initiation by communicating with promoters via chromatin looping mechanisms, we propose that enhancers may also stimulate transcription elongation by physical interactions with intronic elements. We review here recent findings derived from the study of the hematopoietic system.
GENOME-WIDE GENE-SODIUM INTERACTION ANALYSES ON BLOOD PRESSURE: THE GENSALT STUDY

PubMed Central

Li, Changwei; He, Jiang; Chen, Jing; Zhao, Jinying; Gu, Dongfeng; Hixson, James E.; Rao, Dabeeru C.; Jaquish, Cashell E.; Gu, Charles C.; Chen, Jichun; Huang, Jianfeng; Chen, Shufeng; Kelly, Tanika N.

2016-01-01

We performed genome-wide analyses to identify genomic loci that interact with sodium to influence blood pressure (BP) using single marker (one and two degree-of-freedom joint tests) and gene-based tests among 1,876 Chinese participants of the Genetic Epidemiology Network of Salt-Sensitivity (GenSalt) study. Among GenSalt participants, the average of three urine samples was used to estimate sodium excretion. Nine BP measurements were taken using a random-zero-sphygmomanometer. A total of 2.05 million SNPs were imputed using Affymetrix 6.0 genotype data and the Chinese Han of Beijing and Japanese of Tokyo HapMap reference panel. Promising findings (P<1.00×10⁻⁴) from GenSalt were evaluated for replication among 775 Chinese participants of the Multi-ethnic Study of Atherosclerosis (MESA). SNP and gene-based results were meta-analyzed across the GenSalt and MESA studies to determine genome-wide significance. The one degree-of-freedom tests identified interactions for UST rs13211840 on diastolic BP (P=3.13×10⁻⁹). The two degree-of-freedom tests additionally identified associations for CLGN rs2567241 (P=3.90×10⁻¹²) and LOC105369882 rs11104632 (P=4.51×10⁻⁸) with systolic BP. The CLGN variant rs2567241 was also associated with diastolic BP (P=3.11×10⁻²²) and mean arterial pressure (P=2.86×10⁻¹⁵). Genome-wide gene-based analysis identified MKNK1 (P=6.70×10⁻⁷), C2orf80 (P<1.00×10⁻¹²), EPHA6 (P=2.88×10⁻⁷), SCOC-AS1 (P=4.35×10⁻¹⁴), SCOC (P=6.46×10⁻¹¹), CLGN (P=3.68×10⁻¹³), MGAT4D (P=4.73×10⁻¹¹), ARHGAP42 (P<1.00×10⁻¹²), CASP4 (P=1.31×10⁻⁸), and LINC01478 (P=6.75×10⁻¹⁰) that were associated with at least one BP phenotype. In summary, we identified 8 novel and 1 previously reported BP loci through the examination of SNP and gene-based interactions with sodium. PMID:27271309
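The abstract above distinguishes one degree-of-freedom interaction tests from two degree-of-freedom joint tests. A minimal sketch of the difference follows, assuming the common formulation in which BP is regressed on a SNP term, a sodium term, and their product: the 1-df test asks whether the interaction coefficient alone is zero, while the 2-df test asks whether the SNP main effect and the interaction are jointly zero. The regression model, function names, and numeric inputs are assumptions for illustration; the abstract does not give them.

```python
import math


def chi2_sf_1df(stat):
    """Upper-tail probability of a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(stat / 2.0))


def chi2_sf_2df(stat):
    """Upper-tail probability of a chi-square statistic with 2 df."""
    return math.exp(-stat / 2.0)


def interaction_test_1df(beta_gxe, se_gxe):
    """Wald test of the SNP-by-sodium interaction coefficient alone."""
    stat = (beta_gxe / se_gxe) ** 2
    return stat, chi2_sf_1df(stat)


def joint_test_2df(beta_g, beta_gxe, var_g, var_gxe, cov):
    """Wald test that the SNP main effect and interaction are both zero.

    Computes b' V^-1 b for the 2x2 covariance matrix
    V = [[var_g, cov], [cov, var_gxe]] using its closed-form inverse.
    """
    det = var_g * var_gxe - cov ** 2
    stat = (var_gxe * beta_g ** 2
            - 2.0 * cov * beta_g * beta_gxe
            + var_g * beta_gxe ** 2) / det
    return stat, chi2_sf_2df(stat)


# Hypothetical coefficient estimates for one SNP (illustrative only).
stat1, p1 = interaction_test_1df(beta_gxe=0.9, se_gxe=0.3)
stat2, p2 = joint_test_2df(beta_g=0.6, beta_gxe=0.9,
                           var_g=0.04, var_gxe=0.09, cov=-0.01)
print(f"1-df: chi2={stat1:.2f}, p={p1:.2e}")
print(f"2-df: chi2={stat2:.2f}, p={p2:.2e}")
```

The joint test can detect a locus whose main effect and interaction are each modest but jointly strong, which is consistent with the abstract reporting additional loci from the two degree-of-freedom tests.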
Constrained Allocation Flux Balance Analysis

PubMed Central

Mori, Matteo; Hwa, Terence; Martin, Olivier C.

2016-01-01

New experimental results on bacterial growth inspire a novel top-down approach to study cell metabolism, combining mass balance and proteomic constraints to extend and complement Flux Balance Analysis. We introduce here Constrained Allocation Flux Balance Analysis, CAFBA, in which the biosynthetic costs associated with growth are accounted for in an effective way through a single additional genome-wide constraint. Its roots lie in the experimentally observed pattern of proteome allocation for metabolic functions, allowing regulation and metabolism to be bridged in a transparent way under the principle of growth-rate maximization. We provide a simple method to solve CAFBA efficiently and propose an “ensemble averaging” procedure to account for unknown protein costs. Applying this approach to modeling E. coli metabolism, we find that, as the growth rate increases, CAFBA solutions cross over from respiratory, growth-yield maximizing states (preferred at slow growth) to fermentative states with carbon overflow (preferred at fast growth). In addition, CAFBA allows for quantitatively accurate predictions on the rate of acetate excretion and growth yield based on only 3 parameters determined by empirical growth laws. PMID:27355325
Identification of new susceptibility loci for osteoarthritis (arcOGEN): a genome-wide association study.

PubMed

Zeggini, Eleftheria; Panoutsopoulou, Kalliope; Southam, Lorraine; Rayner, Nigel W; Day-Williams, Aaron G; Lopes, Margarida C; Boraska, Vesna; Esko, Tonu; Evangelou, Evangelos; Hoffman, Albert; Houwing-Duistermaat, Jeanine J; Ingvarsson, Thorvaldur; Jonsdottir, Ingileif; Jonnson, Helgi; Kerkhof, Hanneke J; Kloppenburg, Margreet; Bos, Steffan D; Mangino, Massimo; Metrustry, Sarah; Slagboom, P Eline; Thorleifsson, Gudmar; Raine, Emma V A; Ratnayake, Madhushika; Ricketts, Michelle; Beazley, Claude; Blackburn, Hannah; Bumpstead, Suzannah; Elliott, Katherine S; Hunt, Sarah E; Potter, Simon C; Shin, So-Youn; Yadav, Vijay K; Zhai, Guangju; Sherburn, Kate; Dixon, Kate; Arden, Elizabeth; Aslam, Nadim; Battley, Phillippa-kate; Carluke, Ian; Doherty, Sally; Gordon, Andrew; Joseph, John; Keen, Richard; Koller, Nicola C; Mitchell, Sheryl; O'Neill, Fiona; Paling, Ellen; Reed, Mike R; Rivadeneira, Fernando; Swift, Diane; Walker, Kirsten; Watkins, Bridget; Wheeler, Maggie; Birrell, Fraser; Ioannidis, John P A; Meulenbelt, Ingrid; Metspalu, Andres; Rai, Ashok; Salter, Donald; Stefansson, Kari; Stykarsdottir, Unnur; Uitterlinden, André G; van Meurs, Joyce B J; Chapman, Kay; Deloukas, Panos; Ollier, William E R; Wallis, Gillian A; Arden, Nigel; Carr, Andrew; Doherty, Michael; McCaskie, Andrew; Willkinson, J Mark; Ralston, Stuart H; Valdes, Ana M; Spector, Tim D; Loughlin, John

2012-09-01

Osteoarthritis is the most common form of arthritis worldwide and is a major cause of pain and disability in elderly people. The health economic burden of osteoarthritis is increasing commensurate with obesity prevalence and longevity. Osteoarthritis has a strong genetic component but the success of previous genetic studies has been restricted due to insufficient sample sizes and phenotype heterogeneity. We undertook a large genome-wide association study (GWAS) in 7410 unrelated and retrospectively and prospectively selected patients with severe osteoarthritis in the arcOGEN study, 80% of whom had undergone total joint replacement, and 11,009 unrelated controls from the UK. We replicated the most promising signals in an independent set of up to 7473 cases and 42,938 controls, from studies in Iceland, Estonia, the Netherlands, and the UK. All patients and controls were of European descent. We identified five genome-wide significant loci (binomial test p≤5·0×10(-8)) for association with osteoarthritis and three loci just below this threshold. The strongest association was on chromosome 3 with rs6976 (odds ratio 1·12 [95% CI 1·08-1·16]; p=7·24×10(-11)), which is in perfect linkage disequilibrium with rs11177. This SNP encodes a missense polymorphism within the nucleostemin-encoding gene GNL3. Levels of nucleostemin were raised in chondrocytes from patients with osteoarthritis in functional studies. Other significant loci were on chromosome 9 close to ASTN2, chromosome 6 between FILIP1 and SENP6, chromosome 12 close to KLHDC5 and PTHLH, and in another region of chromosome 12 close to CHST11. One of the signals close to genome-wide significance was within the FTO gene, which is involved in regulation of bodyweight, a strong risk factor for osteoarthritis. All risk variants were common in frequency and exerted small effects. Our findings provide insight into the genetics of arthritis and identify new pathways that might be amenable to future therapeutic intervention.
arcOGEN was funded by a special purpose grant from Arthritis Research UK. Copyright © 2012 Elsevier Ltd. All rights reserved.
Genome-to-genome analysis highlights the effect of the human innate and adaptive immune systems on the hepatitis C virus.

PubMed

Ansari, M Azim; Pedergnana, Vincent; L C Ip, Camilla; Magri, Andrea; Von Delft, Annette; Bonsall, David; Chaturvedi, Nimisha; Bartha, Istvan; Smith, David; Nicholson, George; McVean, Gilean; Trebes, Amy; Piazza, Paolo; Fellay, Jacques; Cooke, Graham; Foster, Graham R; Hudson, Emma; McLauchlan, John; Simmonds, Peter; Bowden, Rory; Klenerman, Paul; Barnes, Eleanor; Spencer, Chris C A

2017-05-01

Outcomes of hepatitis C virus (HCV) infection and treatment depend on viral and host genetic factors. Here we use human genome-wide genotyping arrays and new whole-genome HCV viral sequencing technologies to perform a systematic genome-to-genome study of 542 individuals who were chronically infected with HCV, predominantly genotype 3. We show that both alleles of genes encoding human leukocyte antigen molecules and genes encoding components of the interferon lambda innate immune system drive viral polymorphism. Additionally, we show that IFNL4 genotypes determine HCV viral load through a mechanism dependent on a specific amino acid residue in the HCV NS5A protein. These findings highlight the interplay between the innate immune system and the viral genome in HCV control.
Intra and Interspecific Variations of Gene Expression Levels in Yeast Are Largely Neutral: (Nei Lecture, SMBE 2016, Gold Coast).

PubMed

Yang, Jian-Rong; Maclean, Calum J; Park, Chungoo; Zhao, Huabin; Zhang, Jianzhi

2017-09-01

It is commonly, although not universally, accepted that most intra and interspecific genome sequence variations are more or less neutral, whereas a large fraction of organism-level phenotypic variations are adaptive. Gene expression levels are molecular phenotypes that bridge the gap between genotypes and corresponding organism-level phenotypes. Yet, it is unknown whether natural variations in gene expression levels are mostly neutral or adaptive. Here we address this fundamental question by genome-wide profiling and comparison of gene expression levels in nine yeast strains belonging to three closely related Saccharomyces species and originating from five different ecological environments. We find that the transcriptome-based clustering of the nine strains approximates the genome sequence-based phylogeny irrespective of their ecological environments. Remarkably, only ∼0.5% of genes exhibit similar expression levels among strains from a common ecological environment, no greater than that among strains with comparable phylogenetic relationships but different environments. These and other observations strongly suggest that most intra and interspecific variations in yeast gene expression levels result from the accumulation of random mutations rather than environmental adaptations. This finding has profound implications for understanding the driving force of gene expression evolution, genetic basis of phenotypic adaptation, and general role of stochasticity in evolution. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
The association of genome-wide significant spirometric loci with chronic obstructive pulmonary disease susceptibility.

PubMed

Castaldi, Peter J; Cho, Michael H; Litonjua, Augusto A; Bakke, Per; Gulsvik, Amund; Lomas, David A; Anderson, Wayne; Beaty, Terri H; Hokanson, John E; Crapo, James D; Laird, Nan; Silverman, Edwin K

2011-12-01

Two recent meta-analyses of genome-wide association studies conducted by the CHARGE and SpiroMeta consortia identified novel loci yielding evidence of association at or near genome-wide significance (GWS) with FEV1 and FEV1/FVC. We hypothesized that a subset of these markers would also be associated with chronic obstructive pulmonary disease (COPD) susceptibility. Thirty-two single-nucleotide polymorphisms (SNPs) in or near 17 genes in 11 previously identified GWS spirometric genomic regions were tested for association with COPD status in four COPD case-control study samples (NETT/NAS, the Norway case-control study, ECLIPSE, and the first 1,000 subjects in COPDGene; total sample size, 3,456 cases and 1,906 controls). In addition to testing the 32 spirometric GWS SNPs, we tested a dense panel of imputed HapMap2 SNP markers from the 17 genes located near the 32 GWS SNPs and in a set of 21 well studied COPD candidate genes. Of the previously identified GWS spirometric genomic regions, three loci harbored SNPs associated with COPD susceptibility at a 5% false discovery rate: the 4q24 locus including FLJ20184/INTS12/GSTCD/NPNT, the 6p21 locus including AGER and PPT2, and the 5q33 locus including ADAM19. In conclusion, markers previously associated at or near GWS with spirometric measures were tested for association with COPD status in data from four COPD case-control studies, and three loci showed evidence of association with COPD susceptibility at a 5% false discovery rate.
Diverse and highly recombinant anelloviruses associated with Weddell seals in Antarctica

PubMed Central

Fahsbender, Elizabeth; Kim, Stacy; Kraberger, Simona; Frankfurter, Greg; Eilers, Alice A.; Shero, Michelle R.; Beltran, Roxanne; Kirkham, Amy; McCorkell, Robert; Berngartt, Rachel K.; Male, Maketalena F.; Ballard, Grant; Ainley, David G.; Breitbart, Mya

2017-01-01

The viruses circulating among Antarctic wildlife remain largely unknown. In an effort to identify viruses associated with Weddell seals (Leptonychotes weddellii) inhabiting the Ross Sea, vaginal and nasal swabs, and faecal samples were collected between November 2014 and February 2015. In addition, a Weddell seal kidney and South Polar skua (Stercorarius maccormicki) faeces were opportunistically sampled. Using high throughput sequencing, we identified and recovered 152 anellovirus genomes that share 63–70% genome-wide identities with other pinniped anelloviruses. Genome-wide pairwise comparisons coupled with phylogenetic analysis revealed two novel anellovirus species, tentatively named torque teno Leptonychotes weddellii virus (TTLwV) -1 and -2. TTLwV-1 (n = 133, genomes encompassing 40 genotypes) is highly recombinant, whereas TTLwV-2 (n = 19, genomes encompassing three genotypes) is relatively less recombinant. This study documents ubiquitous TTLwVs among Weddell seals in Antarctica with frequent co-infection by multiple genotypes, however, the role these anelloviruses play in seal health remains unknown. PMID:28744371
Diverse and highly recombinant anelloviruses associated with Weddell seals in Antarctica.

PubMed

Fahsbender, Elizabeth; Burns, Jennifer M; Kim, Stacy; Kraberger, Simona; Frankfurter, Greg; Eilers, Alice A; Shero, Michelle R; Beltran, Roxanne; Kirkham, Amy; McCorkell, Robert; Berngartt, Rachel K; Male, Maketalena F; Ballard, Grant; Ainley, David G; Breitbart, Mya; Varsani, Arvind

2017-01-01

The viruses circulating among Antarctic wildlife remain largely unknown. In an effort to identify viruses associated with Weddell seals (Leptonychotes weddellii) inhabiting the Ross Sea, vaginal and nasal swabs, and faecal samples were collected between November 2014 and February 2015. In addition, a Weddell seal kidney and South Polar skua (Stercorarius maccormicki) faeces were opportunistically sampled. Using high throughput sequencing, we identified and recovered 152 anellovirus genomes that share 63–70% genome-wide identities with other pinniped anelloviruses. Genome-wide pairwise comparisons coupled with phylogenetic analysis revealed two novel anellovirus species, tentatively named torque teno Leptonychotes weddellii virus (TTLwV) -1 and -2. TTLwV-1 (n = 133, genomes encompassing 40 genotypes) is highly recombinant, whereas TTLwV-2 (n = 19, genomes encompassing three genotypes) is relatively less recombinant. This study documents ubiquitous TTLwVs among Weddell seals in Antarctica with frequent co-infection by multiple genotypes, however, the role these anelloviruses play in seal health remains unknown.
Genomic analyses in cotton identify signatures of selection and loci associated with fiber quality and yield traits.

PubMed

Fang, Lei; Wang, Qiong; Hu, Yan; Jia, Yinhua; Chen, Jiedan; Liu, Bingliang; Zhang, Zhiyuan; Guan, Xueying; Chen, Shuqi; Zhou, Baoliang; Mei, Gaofu; Sun, Junling; Pan, Zhaoe; He, Shoupu; Xiao, Songhua; Shi, Weijun; Gong, Wenfang; Liu, Jianguang; Ma, Jun; Cai, Caiping; Zhu, Xiefei; Guo, Wangzhen; Du, Xiongming; Zhang, Tianzhen

2017-07-01

Upland cotton (Gossypium hirsutum) is the most important natural fiber crop in the world. The overall genetic diversity among cultivated species of cotton and the genetic changes that occurred during their improvement are poorly understood. Here we report a comprehensive genomic assessment of modern improved upland cotton based on the genome-wide resequencing of 318 landraces and modern improved cultivars or lines. We detected more associated loci for lint yield than for fiber quality, which suggests that lint yield has stronger selection signatures than other traits. We found that two ethylene-pathway-related genes were associated with increased lint yield in improved cultivars. We evaluated the population frequency of each elite allele in historically released cultivar groups and found that 54.8% of the elite genome-wide association study (GWAS) alleles detected were transferred from three founder landraces: Deltapine 15, Stoneville 2B and Uganda Mian. Our results provide a genomic basis for improving cotton cultivars and for further evolutionary analysis of polyploid crops.
