Sample records for validated genome wide

  1. Anonymization of electronic medical records for validating genome-wide association studies

    PubMed Central

    Loukides, Grigorios; Gkoulalas-Divanis, Aris; Malin, Bradley

    2010-01-01

    Genome-wide association studies (GWAS) facilitate the discovery of genotype–phenotype relations from population-based sequence databases, which is an integral facet of personalized medicine. The increasing adoption of electronic medical records allows large amounts of patients’ standardized clinical features to be combined with the genomic sequences of these patients and shared to support validation of GWAS findings and to enable novel discoveries. However, disseminating these data “as is” may lead to patient reidentification when genomic sequences are linked to resources that contain the corresponding patients’ identity information based on standardized clinical features. This work proposes an approach that provably prevents this type of data linkage and furnishes a result that helps support GWAS. Our approach automatically extracts potentially linkable clinical features and modifies them so that they can no longer be used to link a genomic sequence to a small number of patients, while preserving the associations between genomic sequences and specific sets of clinical features corresponding to GWAS-related diseases. Extensive experiments with real patient data derived from the Vanderbilt University Medical Center verify that our approach generates data that eliminate the threat of individual reidentification, while supporting GWAS validation and clinical case analysis tasks. PMID:20385806
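The linkage risk described in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: the function names, the diagnosis codes, and the threshold `k` are hypothetical, and the check simply counts how many patient records contain a given combination of clinical codes. A combination held by fewer than `k` (but at least one) patients is flagged as potentially linkable.

```python
def support(records, combo):
    """Count the records that contain every code in `combo`."""
    combo = set(combo)
    return sum(1 for codes in records if combo <= codes)


def risky_combinations(records, combos, k):
    """Return the combinations that match between 1 and k-1 patients."""
    return [c for c in combos if 0 < support(records, c) < k]


# Hypothetical records: each patient is a set of diagnosis codes.
records = [
    {"250.00", "401.9"},
    {"250.00", "401.9"},
    {"250.00", "272.4"},
]
# Hypothetical combinations an attacker might know.
combos = [("250.00",), ("250.00", "401.9"), ("250.00", "272.4")]

print(risky_combinations(records, combos, k=2))
```

In this toy data only the last combination is unique to one patient, so a protection scheme would need to generalize or suppress it before release.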

  2. Genome-Wide SNP Detection, Validation, and Development of an 8K SNP Array for Apple

    PubMed Central

    Chagné, David; Crowhurst, Ross N.; Troggio, Michela; Davey, Mark W.; Gilmore, Barbara; Lawley, Cindy; Vanderzande, Stijn; Hellens, Roger P.; Kumar, Satish; Cestaro, Alessandro; Velasco, Riccardo; Main, Dorrie; Rees, Jasper D.; Iezzoni, Amy; Mockler, Todd; Wilhelm, Larry; Van de Weg, Eric; Gardiner, Susan E.; Bassil, Nahla; Peace, Cameron

    2012-01-01

    As high-throughput genetic marker screening systems are essential for a range of genetics studies and plant breeding applications, the International RosBREED SNP Consortium (IRSC) has utilized the Illumina Infinium® II system to develop a medium- to high-throughput SNP screening tool for genome-wide evaluation of allelic variation in apple (Malus×domestica) breeding germplasm. For genome-wide SNP discovery, 27 apple cultivars were chosen to represent worldwide breeding germplasm and re-sequenced at low coverage with the Illumina Genome Analyzer II. Following alignment of these sequences to the whole genome sequence of ‘Golden Delicious’, SNPs were identified using SoapSNP. A total of 2,113,120 SNPs were detected, corresponding to one SNP to every 288 bp of the genome. The Illumina GoldenGate® assay was then used to validate a subset of 144 SNPs with a range of characteristics, using a set of 160 apple accessions. This validation assay enabled fine-tuning of the final subset of SNPs for the Illumina Infinium® II system. The set of stringent filtering criteria developed allowed choice of a set of SNPs that not only exhibited an even distribution across the apple genome and a range of minor allele frequencies to ensure utility across germplasm, but also were located in putative exonic regions to maximize genotyping success rate. A total of 7867 apple SNPs was established for the IRSC apple 8K SNP array v1, of which 5554 were polymorphic after evaluation in segregating families and a germplasm collection. This publicly available genomics resource will provide an unprecedented resolution of SNP haplotypes, which will enable marker-locus-trait association discovery, description of the genetic architecture of quantitative traits, investigation of genetic variation (neutral and functional), and genomic selection in apple. PMID:22363718
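The figures in this abstract can be checked with a few lines of arithmetic. The sketch below only uses numbers quoted above; the "implied span" is an inference from the SNP count and the stated density, not a genome size given by the authors.

```python
total_snps = 2_113_120      # SNPs detected by re-sequencing
bp_per_snp = 288            # stated density: one SNP per 288 bp

# Sequence span implied by the two figures above, in megabases.
implied_span_mb = total_snps * bp_per_snp / 1e6

array_snps = 7867           # SNPs placed on the 8K array
polymorphic_snps = 5554     # SNPs found polymorphic after evaluation

# Share of array SNPs that turned out to be polymorphic.
polymorphic_rate = polymorphic_snps / array_snps

print(f"implied span: {implied_span_mb:.1f} Mb")
print(f"polymorphic:  {polymorphic_rate:.1%}")
```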

  3. Optimal selection of markers for validation or replication from genome-wide association studies.

    PubMed

    Greenwood, Celia M T; Rangrej, Jagadish; Sun, Lei

    2007-07-01

    With reductions in genotyping costs and the fast pace of improvements in genotyping technology, it is not uncommon for the individuals in a single study to undergo genotyping using several different platforms, where each platform may contain different numbers of markers selected via different criteria. For example, a set of cases and controls may be genotyped at markers in a small set of carefully selected candidate genes, and shortly thereafter, the same cases and controls may be used for a genome-wide single nucleotide polymorphism (SNP) association study. After such initial investigations, often, a subset of "interesting" markers is selected for validation or replication. Specifically, by validation, we refer to the investigation of associations between the selected subset of markers and the disease in independent data. However, it is not obvious how to choose the best set of markers for this validation. There may be a prior expectation that some sets of genotyping data are more likely to contain real associations. For example, it may be more likely for markers in plausible candidate genes to show disease associations than markers in a genome-wide scan. Hence, it would be desirable to select proportionally more markers from the candidate gene set. When a fixed number of markers are selected for validation, we propose an approach for identifying an optimal marker-selection configuration by basing the approach on minimizing the stratified false discovery rate. We illustrate this approach using a case-control study of colorectal cancer from Ontario, Canada, and we show that this approach leads to substantial reductions in the estimated false discovery rates in the Ontario dataset for the selected markers, as well as reductions in the expected false discovery rates for the proposed validation dataset. Copyright 2007 Wiley-Liss, Inc.
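The idea of choosing how many markers to take from each stratum can be sketched as follows. This is a simplified illustration, not the authors' estimator: it assumes two strata, approximates the expected number of false discoveries among the top n markers of a stratum as pi0 × m × p(n) (with m the stratum size and p(n) the n-th smallest p-value), and searches all splits of a fixed total. The function names, the `pi0` defaults, and the p-values are hypothetical.

```python
def expected_false(pvals, n, pi0=1.0):
    """Approximate expected false discoveries among the n smallest p-values."""
    if n == 0:
        return 0.0
    ordered = sorted(pvals)
    return pi0 * len(ordered) * ordered[n - 1]


def best_split(candidate, genomewide, total, pi0_cand=1.0, pi0_gw=1.0):
    """Return (n_candidate, n_genomewide, estimated_fdr) minimizing the pooled FDR."""
    best = None
    for n_cand in range(0, min(total, len(candidate)) + 1):
        n_gw = total - n_cand
        if n_gw > len(genomewide):
            continue
        false_hits = (expected_false(candidate, n_cand, pi0_cand)
                      + expected_false(genomewide, n_gw, pi0_gw))
        fdr = false_hits / total
        if best is None or fdr < best[2]:
            best = (n_cand, n_gw, fdr)
    return best


# Hypothetical p-values for a small candidate-gene set and a genome-wide scan.
candidate = [0.001, 0.004, 0.03, 0.2]
genomewide = [0.0001, 0.0005, 0.002, 0.01, 0.05, 0.1, 0.3, 0.5, 0.7, 0.9]

best = best_split(candidate, genomewide, total=3)
print(best)
```

Lowering `pi0_cand` below `pi0_gw` would encode the prior expectation, mentioned in the abstract, that candidate-gene markers are more likely to show real associations.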

  4. Discovery and validation of sub-threshold genome-wide association study loci using epigenomic signatures

    PubMed Central

    Wang, Xinchen; Tucker, Nathan R; Rizki, Gizem; Mills, Robert; Krijger, Peter HL; de Wit, Elzo; Subramanian, Vidya; Bartell, Eric; Nguyen, Xinh-Xinh; Ye, Jiangchuan; Leyton-Mange, Jordan; Dolmatova, Elena V; van der Harst, Pim; de Laat, Wouter; Ellinor, Patrick T; Newton-Cheh, Christopher; Milan, David J; Kellis, Manolis; Boyer, Laurie A

    2016-01-01

    Genetic variants identified by genome-wide association studies explain only a modest proportion of heritability, suggesting that meaningful associations lie 'hidden' below current thresholds. Here, we integrate information from association studies with epigenomic maps to demonstrate that enhancers significantly overlap known loci associated with the cardiac QT interval and QRS duration. We apply functional criteria to identify loci associated with QT interval that do not meet genome-wide significance and are missed by existing studies. We demonstrate that these 'sub-threshold' signals represent novel loci, and that epigenomic maps are effective at discriminating true biological signals from noise. We experimentally validate the molecular, gene-regulatory, cellular and organismal phenotypes of these sub-threshold loci, demonstrating that most sub-threshold loci have regulatory consequences and that genetic perturbation of nearby genes causes cardiac phenotypes in mouse. Our work provides a general approach for improving the detection of novel loci associated with complex human traits. DOI: http://dx.doi.org/10.7554/eLife.10557.001 PMID:27162171

  5. Validation of Genome-Wide Prostate Cancer Associations in Men of African Descent

    PubMed Central

    Chang, Bao-Li; Spangler, Elaine; Gallagher, Stephen; Haiman, Christopher A.; Henderson, Brian; Isaacs, William; Benford, Marnita L.; Kidd, LaCreis R.; Cooney, Kathleen; Strom, Sara; Ann Ingles, Sue; Stern, Mariana C.; Corral, Roman; Joshi, Amit D.; Xu, Jianfeng; Giri, Veda N.; Rybicki, Benjamin; Neslund-Dudas, Christine; Kibel, Adam S.; Thompson, Ian M.; Leach, Robin J.; Ostrander, Elaine A.; Stanford, Janet L.; Witte, John; Casey, Graham; Eeles, Rosalind; Hsing, Ann W.; Chanock, Stephen; Hu, Jennifer J.; John, Esther M.; Park, Jong; Stefflova, Klara; Zeigler-Johnson, Charnita; Rebbeck, Timothy R.

    2010-01-01

    Background: Genome-wide association studies (GWAS) have identified numerous prostate cancer susceptibility alleles, but these loci have been identified primarily in men of European descent. There is limited information about the role of these loci in men of African descent. Methods: We identified 7,788 prostate cancer cases and controls with genotype data for 47 GWAS-identified loci. Results: We identified significant associations for SNP rs10486567 at JAZF1, rs10993994 at MSMB, rs12418451 and rs7931342 at 11q13, and rs5945572 and rs5945619 at NUDT10/11. These associations were in the same direction and of similar magnitude as those reported in men of European descent. Significance was attained at all reported prostate cancer susceptibility regions at chromosome 8q24, including associations reaching genome-wide significance in region 2. Conclusion: We have validated in men of African descent the associations at some, but not all, prostate cancer susceptibility loci originally identified in European descent populations. This may be due to heterogeneity in genetic etiology or in the pattern of genetic variation across populations. Impact: The genetic etiology of prostate cancer in men of African descent differs from that of men of European descent. PMID:21071540

  6. Enhancing genomic prediction with genome-wide association studies in multiparental maize populations

    USDA-ARS's Scientific Manuscript database

    Genome-wide association mapping using dense marker sets has identified some nucleotide variants affecting complex traits which have been validated with fine-mapping and functional analysis. Many sequence variants associated with complex traits in maize have small effects and low repeatability, howev...

  7. Genome-Wide Mapping of Copy Number Variation in Humans: Comparative Analysis of High Resolution Array Platforms

    PubMed Central

    Haraksingh, Rajini R.; Abyzov, Alexej; Gerstein, Mark; Urban, Alexander E.; Snyder, Michael

    2011-01-01

    Accurate and efficient genome-wide detection of copy number variants (CNVs) is essential for understanding human genomic variation, genome-wide CNV association type studies, cytogenetics research and diagnostics, and independent validation of CNVs identified from sequencing based technologies. Numerous array-based platforms for CNV detection exist utilizing array Comparative Genome Hybridization (aCGH), Single Nucleotide Polymorphism (SNP) genotyping or both. We have quantitatively assessed the abilities of twelve leading genome-wide CNV detection platforms to accurately detect Gold Standard sets of CNVs in the genome of HapMap CEU sample NA12878, and found significant differences in performance. The technologies analyzed were the NimbleGen 4.2 M, 2.1 M and 3×720 K Whole Genome and CNV focused arrays, the Agilent 1×1 M CGH and High Resolution and 2×400 K CNV and SNP+CGH arrays, the Illumina Human Omni1Quad array and the Affymetrix SNP 6.0 array. The Gold Standards used were a 1000 Genomes Project sequencing-based set of 3997 validated CNVs and an ultra high-resolution aCGH-based set of 756 validated CNVs. We found that sensitivity, total number, size range and breakpoint resolution of CNV calls were highest for CNV focused arrays. Our results are important for cost effective CNV detection and validation for both basic and clinical applications. PMID:22140474
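Sensitivity against a gold-standard CNV set can be sketched as below. The abstract does not state the matching rule used, so the 50% reciprocal-overlap criterion here is an assumption (a common convention), and the intervals are hypothetical coordinates on a single chromosome.

```python
def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions between intervals a and b."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    shared = max(0, end - start)
    if shared == 0:
        return 0.0
    return min(shared / (a[1] - a[0]), shared / (b[1] - b[0]))


def sensitivity(gold, calls, threshold=0.5):
    """Fraction of gold-standard CNVs matched by at least one call."""
    detected = sum(
        1 for g in gold
        if any(reciprocal_overlap(g, c) >= threshold for c in calls)
    )
    return detected / len(gold)


# Hypothetical (start, end) intervals on one chromosome.
gold = [(100, 200), (500, 700), (1000, 1100)]
calls = [(110, 210), (400, 900), (2000, 2100)]

print(sensitivity(gold, calls))
```

The second gold-standard interval is fully covered by a call, yet it does not count as detected because the call is much larger than the true CNV; this is the kind of breakpoint-resolution difference the abstract refers to.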

  8. Inferring genome-wide interplay landscape between DNA methylation and transcriptional regulation.

    PubMed

    Tang, Binhua; Wang, Xin

    2015-01-01

    DNA methylation and transcriptional regulation play important roles in cancer cell development and differentiation processes. Based on the currently available cell line profiling information from the ENCODE Consortium, we propose a Bayesian inference model to infer and construct genome-wide interaction landscape between DNA methylation and transcriptional regulation, which sheds light on the underlying complex functional mechanisms important within the human cancer and disease context. For the first time, we select all the currently available cell lines (>=20) and transcription factors (>=80) profiling information from the ENCODE Consortium portal. Through the integration of those genome-wide profiling sources, our genome-wide analysis detects multiple functional loci of interest, and indicates that DNA methylation is cell- and region-specific, due to the interplay mechanisms with transcription regulatory activities. We validate our analysis results with the corresponding RNA-sequencing technique for those detected genomic loci. Our results provide novel and meaningful insights for the interplay mechanisms of transcriptional regulation and gene expression for the human cancer and disease studies.

  9. Genome-wide patterns of copy number variation in the diversified chicken genomes using next-generation sequencing.

    PubMed

    Yi, Guoqiang; Qu, Lujiang; Liu, Jianfeng; Yan, Yiyuan; Xu, Guiyun; Yang, Ning

    2014-11-07

    Copy number variation (CNV) is important and widespread in the genome, and is a major cause of disease and phenotypic diversity. Herein, we performed a genome-wide CNV analysis in 12 diversified chicken genomes based on whole genome sequencing. A total of 8,840 CNV regions (CNVRs) covering 98.2 Mb and representing 9.4% of the chicken genome were identified, ranging in size from 1.1 to 268.8 kb with an average of 11.1 kb. Sequencing-based predictions were confirmed at a high validation rate by two independent approaches, including array comparative genomic hybridization (aCGH) and quantitative PCR (qPCR). The Pearson's correlation coefficients between sequencing and aCGH results ranged from 0.435 to 0.755, and qPCR experiments revealed a positive validation rate of 91.71% and a false negative rate of 22.43%. In total, 2,214 (25.0%) predicted CNVRs span 2,216 (36.4%) RefSeq genes associated with specific biological functions. Besides two previously reported copy number variable genes EDN3 and PRLR, we also found some promising genes with potential in phenotypic variation. Two genes, FZD6 and LIMS1, related to disease susceptibility/resistance are covered by CNVRs. The highly duplicated SOCS2 may lead to higher bone mineral density. Entire or partial duplication of some genes like POPDC3 may have great economic importance in poultry breeding. Our results based on extensive genetic diversity provide a more refined chicken CNV map and genome-wide gene copy number estimates, and warrant future CNV association studies for important traits in chickens.
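The summary statistics quoted in this abstract are mutually consistent, as a short calculation shows. Only numbers from the abstract are used; the implied genome size is derived from the stated coverage and percentage, not reported by the authors.

```python
cnvr_count = 8840           # CNV regions identified
covered_mb = 98.2           # total sequence covered by CNVRs (Mb)
genome_fraction = 0.094     # stated share of the chicken genome (9.4%)
gene_cnvrs = 2214           # CNVRs that span RefSeq genes

mean_cnvr_kb = covered_mb * 1000 / cnvr_count      # average CNVR size in kb
implied_genome_mb = covered_mb / genome_fraction   # genome size implied by 9.4%
gene_cnvr_share = gene_cnvrs / cnvr_count          # share of CNVRs spanning genes

print(f"mean CNVR size:  {mean_cnvr_kb:.1f} kb")
print(f"implied genome:  {implied_genome_mb:.0f} Mb")
print(f"gene-spanning:   {gene_cnvr_share:.1%}")
```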

  10. A genome-wide SNP scan accelerates trait-regulatory genomic loci identification in chickpea

    PubMed Central

    Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D.; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S.; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C.L.L.; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

    2015-01-01

    We identified 44844 high-quality SNPs by sequencing 92 diverse chickpea accessions belonging to a seed and pod trait-specific association panel using reference genome- and de novo-based GBS (genotyping-by-sequencing) assays. A GWAS (genome-wide association study) in an association panel of 211, including the 92 sequenced accessions, identified 22 major genomic loci showing significant association (explaining 23–47% phenotypic variation) with pod and seed number/plant and 100-seed weight. Eighteen trait-regulatory major genomic loci underlying 13 robust QTLs were validated and mapped on an intra-specific genetic linkage map by QTL mapping. A combinatorial approach of GWAS, QTL mapping and gene haplotype-specific LD mapping and transcript profiling uncovered one superior haplotype and favourable natural allelic variants in the upstream regulatory region of a CesA-type cellulose synthase (Ca_Kabuli_CesA3) gene regulating high pod and seed number/plant (explaining 47% phenotypic variation) in chickpea. The up-regulation of this superior gene haplotype correlated with increased transcript expression of Ca_Kabuli_CesA3 gene in the pollen and pod of high pod/seed number accession, resulting in higher cellulose accumulation for normal pollen and pollen tube growth. A rapid combinatorial genome-wide SNP genotyping-based approach has potential to dissect complex quantitative agronomic traits and delineate trait-regulatory genomic loci (candidate genes) for genetic enhancement in crop plants, including chickpea. PMID:26058368

  11. Genome-Wide Analysis of A-to-I RNA Editing.

    PubMed

    Savva, Yiannis A; Laurent, Georges St; Reenan, Robert A

    2016-01-01

    Adenosine (A)-to-inosine (I) RNA editing is a fundamental posttranscriptional modification that ensures the deamination of A-to-I in double-stranded (ds) RNA molecules. Intriguingly, the A-to-I RNA editing system is particularly active in the nervous system of higher eukaryotes, altering a plethora of noncoding and coding sequences. Abnormal RNA editing is highly associated with many neurological phenotypes and neurodevelopmental disorders. However, the molecular mechanisms underlying RNA editing-mediated pathogenesis still remain enigmatic and have attracted increasing attention from researchers. Over the last decade, methods available to perform genome-wide transcriptome analysis have evolved rapidly. Within the RNA editing field, researchers have adopted next-generation sequencing technologies to identify RNA-editing sites within genomes and to elucidate the underlying process. However, technical challenges associated with editing site discovery have hindered efforts to uncover comprehensive editing site datasets, resulting in the general perception that the collections of annotated editing sites represent only a small minority of the total number of sites in a given organism, tissue, or cell type of interest. In addition to doubts about sensitivity, existing RNA-editing site lists often contain high percentages of false positives, leading to uncertainty about their validity and usefulness in downstream studies. An accurate investigation of A-to-I editing requires properly validated datasets of editing sites with demonstrated and transparent levels of sensitivity and specificity. Here, we describe a high signal-to-noise method for RNA-editing site detection using single-molecule sequencing (SMS). With this method, authentic RNA-editing sites may be differentiated from artifacts. Machine learning approaches provide a procedure to improve upon and experimentally validate sequencing outcomes through use of computationally predicted, iterative feedback loops.

  12. Assessing Predictive Properties of Genome-Wide Selection in Soybeans

    PubMed Central

    Xavier, Alencar; Muir, William M.; Rainey, Katy Martin

    2016-01-01

    Many economically important traits in plant breeding have low heritability or are difficult to measure. For these traits, genomic selection has attractive features and may boost genetic gains. Our goal was to evaluate alternative scenarios to implement genomic selection for yield components in soybean (Glycine max L. merr). We used a nested association panel with cross validation to evaluate the impacts of training population size, genotyping density, and prediction model on the accuracy of genomic prediction. Our results indicate that training population size was the factor most relevant to improvement in genome-wide prediction, with greatest improvement observed in training sets up to 2000 individuals. We discuss assumptions that influence the choice of the prediction model. Although alternative models had minor impacts on prediction accuracy, the most robust prediction model was the combination of reproducing kernel Hilbert space regression and BayesB. Higher genotyping density marginally improved accuracy. Our study finds that breeding programs seeking efficient genomic selection in soybeans would best allocate resources by investing in a representative training set. PMID:27317786
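The cross-validation scheme mentioned in this abstract can be sketched in a few lines. This is a toy illustration only: a single-marker least-squares line stands in for the prediction models named above (reproducing kernel Hilbert space regression and BayesB are not implemented), and the genotypes, phenotypes, and function names are hypothetical. Accuracy is taken as the Pearson correlation between held-out predictions and observed phenotypes.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def cross_validated_accuracy(genotypes, phenotypes, k):
    """Predict each fold from the remaining folds, then correlate with truth."""
    n = len(genotypes)
    predicted = [0.0] * n
    for fold in k_fold_indices(n, k):
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        slope, intercept = fit_line([genotypes[i] for i in train],
                                    [phenotypes[i] for i in train])
        for i in fold:
            predicted[i] = intercept + slope * genotypes[i]
    return pearson(predicted, phenotypes)


# Hypothetical allele counts (0/1/2) at one marker and a yield-like trait.
genotypes = [0, 1, 2, 0, 1, 2, 0, 1, 2]
phenotypes = [1.0, 2.1, 2.9, 1.2, 1.9, 3.1, 0.9, 2.0, 3.0]

accuracy = cross_validated_accuracy(genotypes, phenotypes, k=3)
print(round(accuracy, 3))
```

Repeating the same loop with training sets of different sizes is one way to examine the training-population effect that the abstract identifies as the most relevant factor.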

  13. Transcription facilitated genome-wide recruitment of topoisomerase I and DNA gyrase.

    PubMed

    Ahmed, Wareed; Sala, Claudia; Hegde, Shubhada R; Jha, Rajiv Kumar; Cole, Stewart T; Nagaraja, Valakunja

    2017-05-01

    Movement of the transcription machinery along a template alters DNA topology resulting in the accumulation of supercoils in DNA. The positive supercoils generated ahead of transcribing RNA polymerase (RNAP) and the negative supercoils accumulating behind impose severe topological constraints impeding transcription process. Previous studies have implied the role of topoisomerases in the removal of torsional stress and the maintenance of template topology but the in vivo interaction of functionally distinct topoisomerases with heterogeneous chromosomal territories is not deciphered. Moreover, how the transcription-induced supercoils influence the genome-wide recruitment of DNA topoisomerases remains to be explored in bacteria. Using ChIP-Seq, we show the genome-wide occupancy profile of both topoisomerase I and DNA gyrase in conjunction with RNAP in Mycobacterium tuberculosis taking advantage of minimal topoisomerase representation in the organism. The study unveils the first in vivo genome-wide interaction of both the topoisomerases with the genomic regions and establishes that transcription-induced supercoils govern their recruitment at genomic sites. Distribution profiles revealed co-localization of RNAP and the two topoisomerases on the active transcriptional units (TUs). At a given locus, topoisomerase I and DNA gyrase were localized behind and ahead of RNAP, respectively, correlating with the twin-supercoiled domains generated. The recruitment of topoisomerases was higher at the genomic loci with higher transcriptional activity and/or at regions under high torsional stress compared to silent genomic loci. Importantly, the occupancy of DNA gyrase, sole type II topoisomerase in Mtb, near the Ter domain of the Mtb chromosome validates its function as a decatenase.

  14. Transcription facilitated genome-wide recruitment of topoisomerase I and DNA gyrase

    PubMed Central

    Ahmed, Wareed; Sala, Claudia; Hegde, Shubhada R.; Jha, Rajiv Kumar

    2017-01-01

    Movement of the transcription machinery along a template alters DNA topology resulting in the accumulation of supercoils in DNA. The positive supercoils generated ahead of transcribing RNA polymerase (RNAP) and the negative supercoils accumulating behind impose severe topological constraints impeding transcription process. Previous studies have implied the role of topoisomerases in the removal of torsional stress and the maintenance of template topology but the in vivo interaction of functionally distinct topoisomerases with heterogeneous chromosomal territories is not deciphered. Moreover, how the transcription-induced supercoils influence the genome-wide recruitment of DNA topoisomerases remains to be explored in bacteria. Using ChIP-Seq, we show the genome-wide occupancy profile of both topoisomerase I and DNA gyrase in conjunction with RNAP in Mycobacterium tuberculosis taking advantage of minimal topoisomerase representation in the organism. The study unveils the first in vivo genome-wide interaction of both the topoisomerases with the genomic regions and establishes that transcription-induced supercoils govern their recruitment at genomic sites. Distribution profiles revealed co-localization of RNAP and the two topoisomerases on the active transcriptional units (TUs). At a given locus, topoisomerase I and DNA gyrase were localized behind and ahead of RNAP, respectively, correlating with the twin-supercoiled domains generated. The recruitment of topoisomerases was higher at the genomic loci with higher transcriptional activity and/or at regions under high torsional stress compared to silent genomic loci. Importantly, the occupancy of DNA gyrase, sole type II topoisomerase in Mtb, near the Ter domain of the Mtb chromosome validates its function as a decatenase. PMID:28463980

  15. Genome-Wide Methylation Analyses in Glioblastoma Multiforme

    PubMed Central

    Lai, Rose K.; Chen, Yanwen; Guan, Xiaowei; Nousome, Darryl; Sharma, Charu; Canoll, Peter; Bruce, Jeffrey; Sloan, Andrew E.; Cortes, Etty; Vonsattel, Jean-Paul; Su, Tao; Delgado-Cruzata, Lissette; Gurvich, Irina; Santella, Regina M.; Ostrom, Quinn; Lee, Annette; Gregersen, Peter; Barnholtz-Sloan, Jill

    2014-01-01

    Few studies have investigated genome-wide methylation in glioblastoma multiforme (GBM). Our goals were to study differential methylation across the genome in gene promoters using an array-based method, as well as repetitive elements using surrogate global methylation markers. The discovery sample set for this study consisted of 54 GBM from Columbia University and Case Western Reserve University, and 24 brain controls from the New York Brain Bank. We assembled a validation dataset using methylation data of 162 TCGA GBM and 140 brain controls from dbGAP. HumanMethylation27 Analysis Bead-Chips (Illumina) were used to interrogate 26,486 informative CpG sites in both the discovery and validation datasets. Global methylation levels were assessed by analysis of L1 retrotransposon (LINE1), 5 methyl-deoxycytidine (5m-dC) and 5 hydroxylmethyl-deoxycytidine (5hm-dC) in the discovery dataset. We validated a total of 1548 CpG sites (1307 genes) that were differentially methylated in GBM compared to controls. There were more than twice as many hypomethylated genes as hypermethylated ones. Both the discovery and validation datasets found 5 tumor methylation classes. Pathway analyses showed that the top ten pathways in hypomethylated genes were all related to functions of innate and acquired immunities. Among hypermethylated pathways, transcriptional regulatory network in embryonic stem cells was the most significant. In the study of global methylation markers, 5m-dC level was the best discriminant among methylation classes, whereas in survival analyses, high level of LINE1 methylation was an independent, favorable prognostic factor in the discovery dataset. Based on a pathway approach, hypermethylation in genes that control stem cell differentiation was a significant, poor prognostic factor of overall survival in both the discovery and validation datasets. Approaches that target these methylated genes may be a future therapeutic goal. PMID:24586730

  16. Memory management in genome-wide association studies

    PubMed Central

    2009-01-01

    Genome-wide association is a powerful tool for the identification of genes that underlie common diseases. Genome-wide association studies generate billions of genotypes and pose significant computational challenges for most users including limited computer memory. We applied a recently developed memory management tool to two analyses of North American Rheumatoid Arthritis Consortium studies and measured the performance in terms of central processing unit and memory usage. We conclude that our memory management approach is simple, efficient, and effective for genome-wide association studies. PMID:20018047

  17. Citalopram and escitalopram plasma drug and metabolite concentrations: genome-wide associations

    PubMed Central

    Ji, Yuan; Schaid, Daniel J; Desta, Zeruesenay; Kubo, Michiaki; Batzler, Anthony J; Snyder, Karen; Mushiroda, Taisei; Kamatani, Naoyuki; Ogburn, Evan; Hall-Flavin, Daniel; Flockhart, David; Nakamura, Yusuke; Mrazek, David A; Weinshilboum, Richard M

    2014-01-01

    Aims: Citalopram (CT) and escitalopram (S-CT) are among the most widely prescribed selective serotonin reuptake inhibitors used to treat major depressive disorder (MDD). We applied a genome-wide association study to identify genetic factors that contribute to variation in plasma concentrations of CT or S-CT and their metabolites in MDD patients treated with CT or S-CT. Methods: Our genome-wide association study was performed using samples from 435 MDD patients. Linear mixed models were used to account for within-subject correlations of longitudinal measures of plasma drug/metabolite concentrations (4 and 8 weeks after the initiation of drug therapy), and single-nucleotide polymorphisms (SNPs) were modelled as additive allelic effects. Results: Genome-wide significant associations were observed for S-CT concentration with SNPs in or near the CYP2C19 gene on chromosome 10 (rs1074145, P = 4.1 × 10^−9) and with S-didesmethylcitalopram concentration for SNPs near the CYP2D6 locus on chromosome 22 (rs1065852, P = 2.0 × 10^−16), supporting the important role of these cytochrome P450 (CYP) enzymes in biotransformation of citalopram. After adjustment for the effect of CYP2C19 functional alleles, the analyses also identified novel loci that will require future replication and functional validation. Conclusions: In vitro and in vivo studies have suggested that the biotransformation of CT to monodesmethylcitalopram and didesmethylcitalopram is mediated by CYP isozymes. The results of our genome-wide association study performed in MDD patients treated with CT or S-CT have confirmed those observations but also identified novel genomic loci that might play a role in variation in plasma levels of CT or its metabolites during the treatment of MDD patients with these selective serotonin reuptake inhibitors. PMID:24528284

  18. Citalopram and escitalopram plasma drug and metabolite concentrations: genome-wide associations.

    PubMed

    Ji, Yuan; Schaid, Daniel J; Desta, Zeruesenay; Kubo, Michiaki; Batzler, Anthony J; Snyder, Karen; Mushiroda, Taisei; Kamatani, Naoyuki; Ogburn, Evan; Hall-Flavin, Daniel; Flockhart, David; Nakamura, Yusuke; Mrazek, David A; Weinshilboum, Richard M

    2014-08-01

    Citalopram (CT) and escitalopram (S-CT) are among the most widely prescribed selective serotonin reuptake inhibitors used to treat major depressive disorder (MDD). We applied a genome-wide association study to identify genetic factors that contribute to variation in plasma concentrations of CT or S-CT and their metabolites in MDD patients treated with CT or S-CT. Our genome-wide association study was performed using samples from 435 MDD patients. Linear mixed models were used to account for within-subject correlations of longitudinal measures of plasma drug/metabolite concentrations (4 and 8 weeks after the initiation of drug therapy), and single-nucleotide polymorphisms (SNPs) were modelled as additive allelic effects. Genome-wide significant associations were observed for S-CT concentration with SNPs in or near the CYP2C19 gene on chromosome 10 (rs1074145, P = 4.1 × 10^−9) and with S-didesmethylcitalopram concentration for SNPs near the CYP2D6 locus on chromosome 22 (rs1065852, P = 2.0 × 10^−16), supporting the important role of these cytochrome P450 (CYP) enzymes in biotransformation of citalopram. After adjustment for the effect of CYP2C19 functional alleles, the analyses also identified novel loci that will require future replication and functional validation. In vitro and in vivo studies have suggested that the biotransformation of CT to monodesmethylcitalopram and didesmethylcitalopram is mediated by CYP isozymes. The results of our genome-wide association study performed in MDD patients treated with CT or S-CT have confirmed those observations but also identified novel genomic loci that might play a role in variation in plasma levels of CT or its metabolites during the treatment of MDD patients with these selective serotonin reuptake inhibitors. © 2014 The British Pharmacological Society.

  19. A genome-wide approach to children's aggressive behavior: The EAGLE consortium.

    PubMed

    Pappa, Irene; St Pourcain, Beate; Benke, Kelly; Cavadino, Alana; Hakulinen, Christian; Nivard, Michel G; Nolte, Ilja M; Tiesler, Carla M T; Bakermans-Kranenburg, Marian J; Davies, Gareth E; Evans, David M; Geoffroy, Marie-Claude; Grallert, Harald; Groen-Blokhuis, Maria M; Hudziak, James J; Kemp, John P; Keltikangas-Järvinen, Liisa; McMahon, George; Mileva-Seitz, Viara R; Motazedi, Ehsan; Power, Christine; Raitakari, Olli T; Ring, Susan M; Rivadeneira, Fernando; Rodriguez, Alina; Scheet, Paul A; Seppälä, Ilkka; Snieder, Harold; Standl, Marie; Thiering, Elisabeth; Timpson, Nicholas J; Veenstra, René; Velders, Fleur P; Whitehouse, Andrew J O; Smith, George Davey; Heinrich, Joachim; Hypponen, Elina; Lehtimäki, Terho; Middeldorp, Christel M; Oldehinkel, Albertine J; Pennell, Craig E; Boomsma, Dorret I; Tiemeier, Henning

    2016-07-01

    Individual differences in aggressive behavior emerge in early childhood and predict persisting behavioral problems and disorders. Studies of antisocial and severe aggression in adulthood indicate substantial underlying biology. However, little attention has been given to genome-wide approaches of aggressive behavior in children. We analyzed data from nine population-based studies and assessed aggressive behavior using well-validated parent-reported questionnaires. This is the largest sample exploring children's aggressive behavior to date (N = 18,988), with measures in two developmental stages (N = 15,668 early childhood and N = 16,311 middle childhood/early adolescence). First, we estimated the additive genetic variance of children's aggressive behavior based on genome-wide SNP information, using genome-wide complex trait analysis (GCTA). Second, genetic associations within each study were assessed using a quasi-Poisson regression approach, capturing the highly right-skewed distribution of aggressive behavior. Third, we performed meta-analyses of genome-wide associations for both the total age-mixed sample and the two developmental stages. Finally, we performed a gene-based test using the summary statistics of the total sample. GCTA quantified variance tagged by common SNPs (10-54%). The meta-analysis of the total sample identified one region in chromosome 2 (2p12) at near genome-wide significance (top SNP rs11126630, P = 5.30 × 10^-8). The separate meta-analyses of the two developmental stages revealed suggestive evidence of association at the same locus. The gene-based analysis indicated association of variation within AVPR1A with aggressive behavior. We conclude that common variants at 2p12 show suggestive evidence for association with childhood aggression. Replication of these initial findings is needed, and further studies should clarify its biological meaning. © 2015 Wiley Periodicals, Inc.

  20. Genome-wide nucleosome map and cytosine methylation levels of an ancient human genome.

    PubMed

    Pedersen, Jakob Skou; Valen, Eivind; Velazquez, Amhed M Vargas; Parker, Brian J; Rasmussen, Morten; Lindgreen, Stinus; Lilje, Berit; Tobin, Desmond J; Kelly, Theresa K; Vang, Søren; Andersson, Robin; Jones, Peter A; Hoover, Cindi A; Tikhonov, Alexei; Prokhortchouk, Egor; Rubin, Edward M; Sandelin, Albin; Gilbert, M Thomas P; Krogh, Anders; Willerslev, Eske; Orlando, Ludovic

    2014-03-01

    Epigenetic information is available from contemporary organisms, but is difficult to track back in evolutionary time. Here, we show that genome-wide epigenetic information can be gathered directly from next-generation sequence reads of DNA isolated from ancient remains. Using the genome sequence data generated from hair shafts of a 4000-yr-old Paleo-Eskimo belonging to the Saqqaq culture, we generate the first ancient nucleosome map coupled with a genome-wide survey of cytosine methylation levels. The validity of both nucleosome map and methylation levels were confirmed by the recovery of the expected signals at promoter regions, exon/intron boundaries, and CTCF sites. The top-scoring nucleosome calls revealed distinct DNA positioning biases, attesting to nucleotide-level accuracy. The ancient methylation levels exhibited high conservation over time, clustering closely with modern hair tissues. Using ancient methylation information, we estimated the age at death of the Saqqaq individual and illustrate how epigenetic information can be used to infer ancient gene expression. Similar epigenetic signatures were found in other fossil material, such as 110,000- to 130,000-yr-old bones, supporting the contention that ancient epigenomic information can be reconstructed from a deep past. Our findings lay the foundation for extracting epigenomic information from ancient samples, allowing shifts in epialleles to be tracked through evolutionary time, as well as providing an original window into modern epigenomics.

  1. Genome-wide nucleosome map and cytosine methylation levels of an ancient human genome

    PubMed Central

    Pedersen, Jakob Skou; Valen, Eivind; Velazquez, Amhed M. Vargas; Parker, Brian J.; Rasmussen, Morten; Lindgreen, Stinus; Lilje, Berit; Tobin, Desmond J.; Kelly, Theresa K.; Vang, Søren; Andersson, Robin; Jones, Peter A.; Hoover, Cindi A.; Tikhonov, Alexei; Prokhortchouk, Egor; Rubin, Edward M.; Sandelin, Albin; Gilbert, M. Thomas P.; Krogh, Anders; Willerslev, Eske; Orlando, Ludovic

    2014-01-01

    Epigenetic information is available from contemporary organisms, but is difficult to track back in evolutionary time. Here, we show that genome-wide epigenetic information can be gathered directly from next-generation sequence reads of DNA isolated from ancient remains. Using the genome sequence data generated from hair shafts of a 4000-yr-old Paleo-Eskimo belonging to the Saqqaq culture, we generate the first ancient nucleosome map coupled with a genome-wide survey of cytosine methylation levels. The validity of both nucleosome map and methylation levels were confirmed by the recovery of the expected signals at promoter regions, exon/intron boundaries, and CTCF sites. The top-scoring nucleosome calls revealed distinct DNA positioning biases, attesting to nucleotide-level accuracy. The ancient methylation levels exhibited high conservation over time, clustering closely with modern hair tissues. Using ancient methylation information, we estimated the age at death of the Saqqaq individual and illustrate how epigenetic information can be used to infer ancient gene expression. Similar epigenetic signatures were found in other fossil material, such as 110,000- to 130,000-yr-old bones, supporting the contention that ancient epigenomic information can be reconstructed from a deep past. Our findings lay the foundation for extracting epigenomic information from ancient samples, allowing shifts in epialleles to be tracked through evolutionary time, as well as providing an original window into modern epigenomics. PMID:24299735

  2. Statistical Selection of Biological Models for Genome-Wide Association Analyses.

    PubMed

    Bi, Wenjian; Kang, Guolian; Pounds, Stanley B

    2018-05-24

    Genome-wide association studies have discovered many biologically important associations of genes with phenotypes. Typically, genome-wide association analyses formally test the association of each genetic feature (SNP, CNV, etc) with the phenotype of interest and summarize the results with multiplicity-adjusted p-values. However, very small p-values only provide evidence against the null hypothesis of no association without indicating which biological model best explains the observed data. Correctly identifying a specific biological model may improve the scientific interpretation and can be used to more effectively select and design a follow-up validation study. Thus, statistical methodology to identify the correct biological model for a particular genotype-phenotype association can be very useful to investigators. Here, we propose a general statistical method to summarize how accurately each of five biological models (null, additive, dominant, recessive, co-dominant) represents the data observed for each variant in a GWAS study. We show that the new method stringently controls the false discovery rate and asymptotically selects the correct biological model. Simulations of two-stage discovery-validation studies show that the new method has these properties and that its validation power is similar to or exceeds that of simple methods that use the same statistical model for all SNPs. Example analyses of three data sets also highlight these advantages of the new method. An R package is freely available at www.stjuderesearch.org/site/depts/biostats/maew. Copyright © 2018. Published by Elsevier Inc.
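
    The five biological models named in the abstract above (null, additive, dominant, recessive, co-dominant) differ in how a genotype is turned into regression covariates. The following is a minimal sketch of one common coding, assuming genotypes are stored as minor-allele counts (0, 1 or 2). The function name `encode_genotype` is hypothetical and is not taken from the authors' R package; the model-selection procedure itself is not reproduced here.

```python
def encode_genotype(minor_allele_count, model):
    """Return the covariate tuple for one genotype under a genetic model.

    minor_allele_count: number of minor-allele copies (0, 1 or 2).
    model: 'null', 'additive', 'dominant', 'recessive' or 'codominant'.
    This coding is an illustrative assumption, not the authors' implementation.
    """
    g = minor_allele_count
    if g not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    if model == "null":
        return ()                      # no genetic effect, no covariate
    if model == "additive":
        return (g,)                    # effect scales with allele count
    if model == "dominant":
        return (1 if g >= 1 else 0,)   # one copy is enough
    if model == "recessive":
        return (1 if g == 2 else 0,)   # two copies are required
    if model == "codominant":
        # separate indicators for heterozygotes and minor homozygotes
        return (1 if g == 1 else 0, 1 if g == 2 else 0)
    raise ValueError("unknown model: " + str(model))
```

    Under this coding a heterozygote (count 1) contributes 1 under the additive and dominant models, 0 under the recessive model, and the indicator pair (1, 0) under the co-dominant model.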

  3. Spotting and validation of a genome wide oligonucleotide chip with duplicate measurement of each gene

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thomassen, Mads; Skov, Vibe; Eiriksdottir, Freyja

    2006-06-16

    The quality of DNA microarray based gene expression data relies on the reproducibility of several steps in a microarray experiment. We have developed a spotted genome wide microarray chip with oligonucleotides printed in duplicate in order to minimise undesirable biases, thereby optimising detection of true differential expression. The validation study design consisted of an assessment of the microarray chip performance using the MessageAmp and FairPlay labelling kits. Intraclass correlation coefficient (ICC) was used to demonstrate that MessageAmp was significantly more reproducible than FairPlay. Further examinations with MessageAmp revealed the applicability of the system. The linear range of the chips was three orders of magnitude, the precision was high, as 95% of measurements deviated less than 1.24-fold from the expected value, and the coefficient of variation for relative expression was 13.6%. Relative quantitation was more reproducible than absolute quantitation and substantial reduction of variance was attained with duplicate spotting. An analysis of variance (ANOVA) demonstrated no significant day-to-day variation.

  4. GWAMA: software for genome-wide association meta-analysis.

    PubMed

    Mägi, Reedik; Morris, Andrew P

    2010-05-28

    Despite the recent success of genome-wide association studies in identifying novel loci contributing effects to complex human traits, such as type 2 diabetes and obesity, much of the genetic component of variation in these phenotypes remains unexplained. One way to improve power to detect further novel loci is through meta-analysis of studies from the same population, increasing the sample size over any individual study. Although statistical software analysis packages incorporate routines for meta-analysis, they are ill equipped to meet the challenges of the scale and complexity of data generated in genome-wide association studies. We have developed flexible, open-source software for the meta-analysis of genome-wide association studies. The software incorporates a variety of error trapping facilities, and provides a range of meta-analysis summary statistics. The software is distributed with scripts that allow simple formatting of files containing the results of each association study and generate graphical summaries of genome-wide meta-analysis results. The GWAMA (Genome-Wide Association Meta-Analysis) software has been developed to perform meta-analysis of summary statistics generated from genome-wide association studies of dichotomous phenotypes or quantitative traits. Software with source files, documentation and example data files are freely available online at http://www.well.ox.ac.uk/GWAMA.
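
    Meta-analysis of per-study summary statistics, as described in the abstract above, is often carried out with fixed-effect inverse-variance weighting. The sketch below illustrates that general technique for a single SNP, assuming each study reports an effect estimate and its standard error. It is not GWAMA's code or interface; the function name `inverse_variance_meta` is hypothetical.

```python
import math

def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect inverse-variance meta-analysis for one SNP.

    betas: per-study effect estimates (e.g. log odds ratios).
    standard_errors: matching per-study standard errors (all > 0).
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    Illustrative assumption only, not GWAMA's implementation.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need equally sized, non-empty inputs")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p
```

    Because the pooled standard error shrinks as studies are added, combining studies can reveal associations that no single study has the power to detect, which is the motivation the abstract gives for meta-analysis.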

  5. Genome-wide association study identifies a locus associated with rotator cuff injury

    PubMed Central

    Roos, Thomas R.; Roos, Andrew K.; Avins, Andrew L.; Ahmed, Marwa A.; Kleimeyer, John P.; Fredericson, Michael; Ioannidis, John P. A.; Dragoo, Jason L.

    2017-01-01

    Rotator cuff tears are common, especially in the fifth and sixth decades of life, but can also occur in the competitive athlete. Genetic differences may contribute to overall injury risk. Identifying genetic loci associated with rotator cuff injury could shed light on the etiology of this injury. We performed a genome-wide association screen using publicly available data from the Research Program in Genes, Environment and Health including 8,357 cases of rotator cuff injury and 94,622 controls. We found rs71404070 to show a genome-wide significant association with rotator cuff injury with p = 2.31 × 10^-8 and an odds ratio of 1.25 per allele. This SNP is located next to cadherin8, which encodes a protein involved in cell adhesion. We also attempted to validate previous gene association studies that had reported a total of 18 SNPs showing a significant association with rotator cuff injury. However, none of the 18 SNPs were validated in our dataset. rs71404070 may be informative in explaining why some individuals are more susceptible to rotator cuff injury than others. PMID:29228018

  6. Genome-wide DNA methylation analysis of pseudohypoparathyroidism patients with GNAS imprinting defects.

    PubMed

    Rochtus, Anne; Martin-Trujillo, Alejandro; Izzi, Benedetta; Elli, Francesca; Garin, Intza; Linglart, Agnes; Mantovani, Giovanna; Perez de Nanclares, Guiomar; Thiele, Suzanne; Decallonne, Brigitte; Van Geet, Chris; Monk, David; Freson, Kathleen

    2016-01-01

    Pseudohypoparathyroidism (PHP) is caused by (epi)genetic defects in the imprinted GNAS cluster. Current classification of PHP patients is hampered by clinical and molecular diagnostic overlaps. The European Consortium for the study of PHP designed a genome-wide methylation study to improve molecular diagnosis. The HumanMethylation 450K BeadChip was used to analyze genome-wide methylation in 24 PHP patients with parathyroid hormone resistance and 20 age- and gender-matched controls. Patients were previously diagnosed with GNAS-specific differentially methylated regions (DMRs) and include 6 patients with known STX16 deletion (PHP(Δstx16)) and 18 without deletion (PHP(neg)). The array demonstrated that PHP patients do not show DNA methylation differences at the whole-genome level. Unsupervised clustering of GNAS-specific DMRs divides PHP(Δstx16) versus PHP(neg) patients. Interestingly, in contrast to the notion that all PHP patients share methylation defects in the A/B DMR while only PHP(Δstx16) patients have normal NESP, GNAS-AS1 and XL methylation, we found a novel DMR (named GNAS-AS2) in the GNAS-AS1 region that is significantly different in both PHP(Δstx16) and PHP(neg), as validated by Sequenom EpiTYPER in a larger PHP cohort. The analysis of 58 DMRs revealed that 8/18 PHP(neg) and 1/6 PHP(Δstx16) patients have multi-locus methylation defects. Validation was performed for FANCC and SVOPL DMRs. This is the first genome-wide methylation study for PHP patients that confirmed that GNAS is the most significant DMR, and the presence of STX16 deletion divides PHP patients in two groups. Moreover, a novel GNAS-AS2 DMR affects all PHP patients, and PHP patients seem sensitive to multi-locus methylation defects.

  7. Pooled genome wide association detects association upstream of FCRL3 with Graves' disease.

    PubMed

    Khong, Jwu Jin; Burdon, Kathryn P; Lu, Yi; Laurie, Kate; Leonardos, Lefta; Baird, Paul N; Sahebjada, Srujana; Walsh, John P; Gajdatsy, Adam; Ebeling, Peter R; Hamblin, Peter Shane; Wong, Rosemary; Forehan, Simon P; Fourlanos, Spiros; Roberts, Anthony P; Doogue, Matthew; Selva, Dinesh; Montgomery, Grant W; Macgregor, Stuart; Craig, Jamie E

    2016-11-18

    Graves' disease is an autoimmune thyroid disease of complex inheritance. Multiple genetic susceptibility loci are thought to be involved in Graves' disease and it is therefore likely that these can be identified by genome wide association studies. This study aimed to determine if a genome wide association study, using a pooling methodology, could detect genomic loci associated with Graves' disease. Nineteen of the top ranking single nucleotide polymorphisms including HLA-DQA1 and C6orf10, were clustered within the Major Histo-compatibility Complex region on chromosome 6p21, with rs1613056 reaching genome wide significance (p = 5 × 10^-8). Technical validation of top ranking non-Major Histo-compatibility Complex single nucleotide polymorphisms with individual genotyping in the discovery cohort revealed four single nucleotide polymorphisms with p ≤ 10^-4. Rs17676303 on chromosome 1q23.1, located upstream of FCRL3, showed evidence of association with Graves' disease across the discovery, replication and combined cohorts. A second single nucleotide polymorphism rs9644119 downstream of DPYSL2 showed some evidence of association supported by finding in the replication cohort that warrants further study. Pooled genome wide association study identified a genetic variant upstream of FCRL3 as a susceptibility locus for Graves' disease in addition to those identified in the Major Histo-compatibility Complex. A second locus downstream of DPYSL2 is potentially a novel genetic variant in Graves' disease that requires further confirmation.

  8. Genome-wide identification of bacterial plant colonization genes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cole, Benjamin J.; Feltcher, Meghan E.; Waters, Robert J.

    Diverse soil-resident bacteria can contribute to plant growth and health, but the molecular mechanisms enabling them to effectively colonize their plant hosts remain poorly understood. We used randomly barcoded transposon mutagenesis sequencing (RB-TnSeq) in Pseudomonas simiae, a model root-colonizing bacterium, to establish a genome-wide map of bacterial genes required for colonization of the Arabidopsis thaliana root system. We identified 115 genes (2% of all P. simiae genes) with functions that are required for maximal competitive colonization of the root system. Among the genes we identified were some with obvious colonization-related roles in motility and carbon metabolism, as well as 44 other genes that had no or vague functional predictions. Independent validation assays of individual genes confirmed colonization functions for 20 of 22 (91%) cases tested. To further characterize genes identified by our screen, we compared the functional contributions of P. simiae genes to growth in 90 distinct in vitro conditions by RB-TnSeq, highlighting specific metabolic functions associated with root colonization genes. Here, our analysis of bacterial genes by sequence-driven saturation mutagenesis revealed a genome-wide map of the genetic determinants of plant root colonization and offers a starting point for targeted improvement of the colonization capabilities of plant-beneficial microbes.

  9. Genome-wide identification of bacterial plant colonization genes

    DOE PAGES

    Cole, Benjamin J.; Feltcher, Meghan E.; Waters, Robert J.; ...

    2017-09-22

    Diverse soil-resident bacteria can contribute to plant growth and health, but the molecular mechanisms enabling them to effectively colonize their plant hosts remain poorly understood. We used randomly barcoded transposon mutagenesis sequencing (RB-TnSeq) in Pseudomonas simiae, a model root-colonizing bacterium, to establish a genome-wide map of bacterial genes required for colonization of the Arabidopsis thaliana root system. We identified 115 genes (2% of all P. simiae genes) with functions that are required for maximal competitive colonization of the root system. Among the genes we identified were some with obvious colonization-related roles in motility and carbon metabolism, as well as 44 other genes that had no or vague functional predictions. Independent validation assays of individual genes confirmed colonization functions for 20 of 22 (91%) cases tested. To further characterize genes identified by our screen, we compared the functional contributions of P. simiae genes to growth in 90 distinct in vitro conditions by RB-TnSeq, highlighting specific metabolic functions associated with root colonization genes. Here, our analysis of bacterial genes by sequence-driven saturation mutagenesis revealed a genome-wide map of the genetic determinants of plant root colonization and offers a starting point for targeted improvement of the colonization capabilities of plant-beneficial microbes.

  10. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    PubMed Central

    2011-01-01

    amplified by PCR from AL8/78 and AS75 and resequenced with the ABI 3730 xl. In a sample of 302 randomly selected putative SNPs, 84.0% in gene regions, 88.0% in repeat junctions, and 81.3% in uncharacterized regions were validated. Conclusion An annotation-based genome-wide SNP discovery pipeline for NGS platforms was developed. The pipeline is suitable for SNP discovery in genomic libraries of complex genomes and does not require a reference genome sequence. The pipeline is applicable to all current NGS platforms, provided that at least one such platform generates relatively long reads. The pipeline package, AGSNP, and the discovered 497,118 Ae. tauschii SNPs can be accessed at (http://avena.pw.usda.gov/wheatD/agsnp.shtml). PMID:21266061

  11. A genome-wide survey of transgenerational genetic effects in autism.

    PubMed

    Tsang, Kathryn M; Croen, Lisa A; Torres, Anthony R; Kharrazi, Martin; Delorenze, Gerald N; Windham, Gayle C; Yoshida, Cathleen K; Zerbo, Ousseny; Weiss, Lauren A

    2013-01-01

    Effects of parental genotype or parent-offspring genetic interaction are well established in model organisms for a variety of traits. However, these transgenerational genetic models are rarely studied in humans. We have utilized an autism case-control study with 735 mother-child pairs to perform genome-wide screening for maternal genetic effects and maternal-offspring genetic interaction. We used simple models of single locus parent-child interaction and identified suggestive results (P < 10^-4) that cannot be explained by main effects, but no genome-wide significant signals. Some of these maternal and maternal-child associations were in or adjacent to autism candidate genes including: PCDH9, FOXP1, GABRB3, NRXN1, RELN, MACROD2, FHIT, RORA, CNTN4, CNTNAP2, FAM135B, LAMA1, NFIA, NLGN4X, RAPGEF4, and SDK1. We attempted validation of potential autism association under maternal-specific models using maternal-paternal comparison in family-based GWAS datasets. Our results suggest that further study of parental genetic effects and parent-child interaction in autism is warranted.

  12. Placental genome and maternal-placental genetic interactions: a genome-wide and candidate gene association study of placental abruption.

    PubMed

    Denis, Marie; Enquobahrie, Daniel A; Tadesse, Mahlet G; Gelaye, Bizu; Sanchez, Sixto E; Salazar, Manuel; Ananth, Cande V; Williams, Michelle A

    2014-01-01

    While available evidence supports the role of genetics in the pathogenesis of placental abruption (PA), PA-related placental genome variations and maternal-placental genetic interactions have not been investigated. Maternal blood and placental samples collected from participants in the Peruvian Abruptio Placentae Epidemiology study were genotyped using Illumina's Cardio-Metabochip platform. We examined 118,782 genome-wide SNPs and 333 SNPs in 32 candidate genes from mitochondrial biogenesis and oxidative phosphorylation pathways in placental DNA from 280 PA cases and 244 controls. We assessed maternal-placental interactions in the candidate gene SNPS and two imprinted regions (IGF2/H19 and C19MC). Univariate and penalized logistic regression models were fit to estimate odds ratios. We examined the combined effect of multiple SNPs on PA risk using weighted genetic risk scores (WGRS) with repeated ten-fold cross-validations. A multinomial model was used to investigate maternal-placental genetic interactions. In placental genome-wide and candidate gene analyses, no SNP was significant after false discovery rate correction. The top genome-wide association study (GWAS) hits were rs544201, rs1484464 (CTNNA2), rs4149570 (TNFRSF1A) and rs13055470 (ZNRF3) (p-values: 1.11e-05 to 3.54e-05). The top 200 SNPs of the GWAS overrepresented genes involved in cell cycle, growth and proliferation. The top candidate gene hits were rs16949118 (COX10) and rs7609948 (THRB) (p-values: 6.00e-03 and 8.19e-03). Participants in the highest quartile of WGRS based on cross-validations using SNPs selected from the GWAS and candidate gene analyses had a 8.40-fold (95% CI: 5.8-12.56) and a 4.46-fold (95% CI: 2.94-6.72) higher odds of PA compared to participants in the lowest quartile. We found maternal-placental genetic interactions on PA risk for two SNPs in PPARG (chr3:12313450 and chr3:12412978) and maternal imprinting effects for multiple SNPs in the C19MC and IGF2/H19 regions. 
Variations in

  13. A Discovery Genome-Wide Association Study of Entrepreneurship

    ERIC Educational Resources Information Center

    Quaye, Lydia; Nicolaou, Nicos; Shane, Scott; Mangino, Massimo

    2012-01-01

    To identify specific genetic variants influencing the phenotype of entrepreneurship, we conducted a genome-wide association study (GWAS) with 3,933 Caucasian females from the TwinsUK Adult Twin Registry. Following stringent genotype quality control, GWAF (genome-wide association analyses for family data) software was used to assess the association…

  14. SuperDCA for genome-wide epistasis analysis.

    PubMed

    Puranen, Santeri; Pesonen, Maiju; Pensar, Johan; Xu, Ying Ying; Lees, John A; Bentley, Stephen D; Croucher, Nicholas J; Corander, Jukka

    2018-05-29

    The potential for genome-wide modelling of epistasis has recently surfaced given the possibility of sequencing densely sampled populations and the emerging families of statistical interaction models. Direct coupling analysis (DCA) has previously been shown to yield valuable predictions for single protein structures, and has recently been extended to genome-wide analysis of bacteria, identifying novel interactions in the co-evolution between resistance, virulence and core genome elements. However, earlier computational DCA methods have not been scalable to enable model fitting simultaneously to 10^4-10^5 polymorphisms, representing the amount of core genomic variation observed in analyses of many bacterial species. Here, we introduce a novel inference method (SuperDCA) that employs a new scoring principle, efficient parallelization, optimization and filtering on phylogenetic information to achieve scalability for up to 10^5 polymorphisms. Using two large population samples of Streptococcus pneumoniae, we demonstrate the ability of SuperDCA to make additional significant biological findings about this major human pathogen. We also show that our method can uncover signals of selection that are not detectable by genome-wide association analysis, even though our analysis does not require phenotypic measurements. SuperDCA, thus, holds considerable potential in building understanding about numerous organisms at a systems biological level.

  15. Comprehensive evaluation of genome-wide 5-hydroxymethylcytosine profiling approaches in human DNA.

    PubMed

    Skvortsova, Ksenia; Zotenko, Elena; Luu, Phuc-Loi; Gould, Cathryn M; Nair, Shalima S; Clark, Susan J; Stirzaker, Clare

    2017-01-01

    The discovery that 5-methylcytosine (5mC) can be oxidized to 5-hydroxymethylcytosine (5hmC) by the ten-eleven translocation (TET) proteins has prompted wide interest in the potential role of 5hmC in reshaping the mammalian DNA methylation landscape. The gold-standard bisulphite conversion technologies to study DNA methylation do not distinguish between 5mC and 5hmC. However, new approaches to mapping 5hmC genome-wide have advanced rapidly, although it is unclear how the different methods compare in accurately calling 5hmC. In this study, we provide a comparative analysis on brain DNA using three 5hmC genome-wide approaches, namely whole-genome bisulphite/oxidative bisulphite sequencing (WG Bis/OxBis-seq), Infinium HumanMethylation450 BeadChip arrays coupled with oxidative bisulphite (HM450K Bis/OxBis) and antibody-based immunoprecipitation and sequencing of hydroxymethylated DNA (hMeDIP-seq). We also perform loci-specific TET-assisted bisulphite sequencing (TAB-seq) for validation of candidate regions. We show that whole-genome single-base resolution approaches are advantaged in providing precise 5hmC values but require high sequencing depth to accurately measure 5hmC, as this modification is commonly in low abundance in mammalian cells. HM450K arrays coupled with oxidative bisulphite provide a cost-effective representation of 5hmC distribution, at CpG sites with 5hmC levels >~10%. However, 5hmC analysis is restricted to the genomic location of the probes, which is an important consideration as 5hmC modification is commonly enriched at enhancer elements. Finally, we show that the widely used hMeDIP-seq method provides an efficient genome-wide profile of 5hmC and shows high correlation with WG Bis/OxBis-seq 5hmC distribution in brain DNA. 
However, in cell line DNA with low levels of 5hmC, hMeDIP-seq-enriched regions are not detected by WG Bis/OxBis or HM450K, either suggesting misinterpretation of 5hmC calls by hMeDIP or lack of sensitivity of the latter methods. We

  16. A Genome-Wide Association Study Identifies Genetic Variants Associated with Mathematics Ability

    PubMed Central

    Chen, Huan; Gu, Xiao-hong; Zhou, Yuxi; Ge, Zeng; Wang, Bin; Siok, Wai Ting; Wang, Guoqing; Huen, Michael; Jiang, Yuyang; Tan, Li-Hai; Sun, Yimin

    2017-01-01

    Mathematics ability is a complex cognitive trait with polygenic heritability. Genome-wide association study (GWAS) has been an effective approach to investigate genetic components underlying mathematic ability. Although previous studies reported several candidate genetic variants, none of them exceeded genome-wide significant threshold in general populations. Herein, we performed GWAS in Chinese elementary school students to identify potential genetic variants associated with mathematics ability. The discovery stage included 494 and 504 individuals from two independent cohorts respectively. The replication stage included another cohort of 599 individuals. In total, 28 of 81 candidate SNPs that met validation criteria were further replicated. Combined meta-analysis of three cohorts identified four SNPs (rs1012694, rs11743006, rs17778739 and rs17777541) of SPOCK1 gene showing association with mathematics ability (minimum p value 5.67 × 10^-10, maximum β −2.43). The SPOCK1 gene is located on chromosome 5q31.2 and encodes a highly conserved glycoprotein testican-1 which was associated with tumor progression and prognosis as well as neurogenesis. This is the first study to report genome-wide significant association of individual SNPs with mathematics ability in general populations. Our preliminary results further supported the role of SPOCK1 during neurodevelopment. The genetic complexities underlying mathematics ability might contribute to explain the basis of human cognition and intelligence at genetic level. PMID:28155865

  17. A Genome-Wide Association Study Identifies Genetic Variants Associated with Mathematics Ability.

    PubMed

    Chen, Huan; Gu, Xiao-Hong; Zhou, Yuxi; Ge, Zeng; Wang, Bin; Siok, Wai Ting; Wang, Guoqing; Huen, Michael; Jiang, Yuyang; Tan, Li-Hai; Sun, Yimin

    2017-02-03

    Mathematics ability is a complex cognitive trait with polygenic heritability. Genome-wide association study (GWAS) has been an effective approach to investigate genetic components underlying mathematic ability. Although previous studies reported several candidate genetic variants, none of them exceeded genome-wide significant threshold in general populations. Herein, we performed GWAS in Chinese elementary school students to identify potential genetic variants associated with mathematics ability. The discovery stage included 494 and 504 individuals from two independent cohorts respectively. The replication stage included another cohort of 599 individuals. In total, 28 of 81 candidate SNPs that met validation criteria were further replicated. Combined meta-analysis of three cohorts identified four SNPs (rs1012694, rs11743006, rs17778739 and rs17777541) of SPOCK1 gene showing association with mathematics ability (minimum p value 5.67 × 10^-10, maximum β -2.43). The SPOCK1 gene is located on chromosome 5q31.2 and encodes a highly conserved glycoprotein testican-1 which was associated with tumor progression and prognosis as well as neurogenesis. This is the first study to report genome-wide significant association of individual SNPs with mathematics ability in general populations. Our preliminary results further supported the role of SPOCK1 during neurodevelopment. The genetic complexities underlying mathematics ability might contribute to explain the basis of human cognition and intelligence at genetic level.

  18. Susceptibility to corticosteroid-induced adrenal suppression: a genome-wide association study.

    PubMed

    Hawcutt, Daniel B; Francis, Ben; Carr, Daniel F; Jorgensen, Andrea L; Yin, Peng; Wallin, Naomi; O'Hara, Natalie; Zhang, Eunice J; Bloch, Katarzyna M; Ganguli, Amitava; Thompson, Ben; McEvoy, Laurence; Peak, Matthew; Crawford, Andrew A; Walker, Brian R; Blair, Joanne C; Couriel, Jonathan; Smyth, Rosalind L; Pirmohamed, Munir

    2018-06-01

    A serious adverse effect of corticosteroid therapy is adrenal suppression. Our aim was to identify genetic variants affecting susceptibility to corticosteroid-induced adrenal suppression. We enrolled children with asthma who used inhaled corticosteroids as part of their treatment from 25 sites across the UK (discovery cohort), as part of the Pharmacogenetics of Adrenal Suppression with Inhaled Steroids (PASS) study. We included two validation cohorts, one comprising children with asthma (PASS study) and the other consisting of adults with chronic obstructive pulmonary disorder (COPD) who were recruited from two UK centres for the Pharmacogenomics of Adrenal Suppression in COPD (PASIC) study. Participants underwent a low-dose short synacthen test. Adrenal suppression was defined as peak cortisol less than 350 nmol/L (in children) and less than 500 nmol/L (in adults). A case-control genome-wide association study was done with the control subset augmented by Wellcome Trust Case Control Consortium 2 (WTCCC2) participants. Single nucleotide polymorphisms (SNPs) that fulfilled criteria to be advanced to replication were tested by a random-effects inverse variance meta-analysis. This report presents the primary analysis. The PASS study is registered in the European Genome-phenome Archive (EGA). The PASS study is complete whereas the PASIC study is ongoing. Between November, 2008, and September, 2011, 499 children were enrolled to the discovery cohort. Between October, 2011, and December, 2012, 81 children were enrolled to the paediatric validation cohort, and from February, 2010, to June, 2015, 78 adults were enrolled to the adult validation cohort. Adrenal suppression was present in 35 (7%) children in the discovery cohort and six (7%) children and 17 (22%) adults in the validation cohorts. In the discovery cohort, 40 SNPs were found to be associated with adrenal suppression (genome-wide significance p<1 × 10⁻⁶), including an intronic SNP within the PDGFD gene

  19. Genome-Wide and Gene-Based Meta-Analyses Identify Novel Loci Influencing Blood Pressure Response to Hydrochlorothiazide.

    PubMed

    Salvi, Erika; Wang, Zhiying; Rizzi, Federica; Gong, Yan; McDonough, Caitrin W; Padmanabhan, Sandosh; Hiltunen, Timo P; Lanzani, Chiara; Zaninello, Roberta; Chittani, Martina; Bailey, Kent R; Sarin, Antti-Pekka; Barcella, Matteo; Melander, Olle; Chapman, Arlene B; Manunta, Paolo; Kontula, Kimmo K; Glorioso, Nicola; Cusi, Daniele; Dominiczak, Anna F; Johnson, Julie A; Barlassina, Cristina; Boerwinkle, Eric; Cooper-DeHoff, Rhonda M; Turner, Stephen T

    2017-01-01

    This study aimed to identify novel loci influencing the antihypertensive response to hydrochlorothiazide monotherapy. A genome-wide meta-analysis of blood pressure (BP) response to hydrochlorothiazide was performed in 1739 white hypertensives from 6 clinical trials within the International Consortium for Antihypertensive Pharmacogenomics Studies, making it the largest study to date of its kind. No signals reached genome-wide significance (P<5×10⁻⁸), and the suggestive regions (P<10⁻⁵) were cross-validated in 2 black cohorts treated with hydrochlorothiazide. In addition, a gene-based analysis was performed on candidate genes with previous evidence of involvement in diuretic response, in BP regulation, or in hypertension susceptibility. Using the genome-wide meta-analysis approach, with validation in blacks, we identified 2 suggestive regulatory regions linked to gap junction protein α1 gene (GJA1) and forkhead box A1 gene (FOXA1), relevant for cardiovascular and kidney function. With the gene-based approach, we identified hydroxy-delta-5-steroid dehydrogenase, 3 β- and steroid δ-isomerase 1 gene (HSD3B1) as significantly associated with BP response (P<2.28×10⁻⁴). HSD3B1 encodes the 3β-hydroxysteroid dehydrogenase enzyme and plays a crucial role in the biosynthesis of aldosterone and endogenous ouabain. By amassing all of the available pharmacogenomic studies of BP response to hydrochlorothiazide, and using 2 different analytic approaches, we identified 3 novel loci influencing BP response to hydrochlorothiazide. The gene-based analysis, never before applied to pharmacogenomics of antihypertensive drugs to our knowledge, provided a powerful strategy to identify a locus of interest, which was not identified in the genome-wide meta-analysis because of high allelic heterogeneity. These data pave the way for future investigations on new pathways and drug targets to enhance the current understanding of personalized antihypertensive treatment.

  20. Genome wide approaches to identify protein-DNA interactions.

    PubMed

    Ma, Tao; Ye, Zhenqing; Wang, Liguo

    2018-05-29

    Transcription factors are DNA-binding proteins that play key roles in many fundamental biological processes. Unraveling their interactions with DNA is essential to identify their target genes and understand the regulatory network. Genome-wide identification of their binding sites became feasible thanks to recent progress in experimental and computational approaches. ChIP-chip, ChIP-seq, and ChIP-exo are three widely used techniques to demarcate genome-wide transcription factor binding sites. This review aims to provide an overview of these three techniques including their experimental procedures, computational approaches, and popular analytic tools. ChIP-chip, ChIP-seq, and ChIP-exo have been the major techniques to study genome-wide in vivo protein-DNA interaction. Due to the rapid development of next-generation sequencing technology, array-based ChIP-chip is deprecated and ChIP-seq has become the most widely used technique to identify transcription factor binding sites genome-wide. The newly developed ChIP-exo further improves the spatial resolution to a single nucleotide. Numerous tools have been developed to analyze ChIP-chip, ChIP-seq and ChIP-exo data. However, different programs may employ different mechanisms or underlying algorithms, so each inherently carries its own set of statistical assumptions and biases. Choosing the most appropriate analytic program for a given experiment therefore needs careful consideration. Moreover, most programs only have a command-line interface, so their installation and usage require basic computational expertise in Unix/Linux.

  1. Genomic prediction in animals and plants: simulation of data, validation, reporting, and benchmarking.

    PubMed

    Daetwyler, Hans D; Calus, Mario P L; Pong-Wong, Ricardo; de Los Campos, Gustavo; Hickey, John M

    2013-02-01

    The genomic prediction of phenotypes and breeding values in animals and plants has developed rapidly into its own research field. Results of genomic prediction studies are often difficult to compare because data simulation varies, real or simulated data are not fully described, and not all relevant results are reported. In addition, some new methods have been compared only in limited genetic architectures, leading to potentially misleading conclusions. In this article we review simulation procedures, discuss validation and reporting of results, and apply benchmark procedures for a variety of genomic prediction methods in simulated and real example data. Plant and animal breeding programs are being transformed by the use of genomic data, which are becoming widely available and cost-effective to predict genetic merit. A large number of genomic prediction studies have been published using both simulated and real data. The relative novelty of this area of research has made the development of scientific conventions difficult with regard to description of the real data, simulation of genomes, validation and reporting of results, and forward in time methods. In this review article we discuss the generation of simulated genotype and phenotype data, using approaches such as the coalescent and forward in time simulation. We outline ways to validate simulated data and genomic prediction results, including cross-validation. The accuracy and bias of genomic prediction are highlighted as performance indicators that should be reported. We suggest that a measure of relatedness between the reference and validation individuals be reported, as its impact on the accuracy of genomic prediction is substantial. A large number of methods were compared in example simulated and real (pine and wheat) data sets, all of which are publicly available. In our limited simulations, most methods performed similarly in traits with a large number of quantitative trait loci (QTL), whereas in traits

  2. Genomic Prediction in Animals and Plants: Simulation of Data, Validation, Reporting, and Benchmarking

    PubMed Central

    Daetwyler, Hans D.; Calus, Mario P. L.; Pong-Wong, Ricardo; de los Campos, Gustavo; Hickey, John M.

    2013-01-01

    The genomic prediction of phenotypes and breeding values in animals and plants has developed rapidly into its own research field. Results of genomic prediction studies are often difficult to compare because data simulation varies, real or simulated data are not fully described, and not all relevant results are reported. In addition, some new methods have been compared only in limited genetic architectures, leading to potentially misleading conclusions. In this article we review simulation procedures, discuss validation and reporting of results, and apply benchmark procedures for a variety of genomic prediction methods in simulated and real example data. Plant and animal breeding programs are being transformed by the use of genomic data, which are becoming widely available and cost-effective to predict genetic merit. A large number of genomic prediction studies have been published using both simulated and real data. The relative novelty of this area of research has made the development of scientific conventions difficult with regard to description of the real data, simulation of genomes, validation and reporting of results, and forward in time methods. In this review article we discuss the generation of simulated genotype and phenotype data, using approaches such as the coalescent and forward in time simulation. We outline ways to validate simulated data and genomic prediction results, including cross-validation. The accuracy and bias of genomic prediction are highlighted as performance indicators that should be reported. We suggest that a measure of relatedness between the reference and validation individuals be reported, as its impact on the accuracy of genomic prediction is substantial. A large number of methods were compared in example simulated and real (pine and wheat) data sets, all of which are publicly available. In our limited simulations, most methods performed similarly in traits with a large number of quantitative trait loci (QTL), whereas in traits

  3. Meta-analysis for genome-wide association studies using case-control design: application and practice

    PubMed Central

    2016-01-01

    This review aimed to arrange the process of a systematic review of genome-wide association studies in order to practice and apply a genome-wide meta-analysis (GWMA). The process has a series of five steps: searching and selection, extraction of related information, evaluation of validity, meta-analysis by type of genetic model, and evaluation of heterogeneity. In contrast to intervention meta-analyses, GWMA has to evaluate the Hardy–Weinberg equilibrium (HWE) in the third step and conduct meta-analyses by five potential genetic models, including dominant, recessive, homozygote contrast, heterozygote contrast, and allelic contrast in the fourth step. The ‘genhwcci’ and ‘metan’ commands of STATA software evaluate the HWE and calculate a summary effect size, respectively. A meta-regression using the ‘metareg’ command of STATA should be conducted to evaluate related factors of heterogeneities. PMID:28092928
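
    Two of the steps listed above, the HWE check and the per-model contrasts, can be sketched in Python. This is only an illustration under stated assumptions: genotype counts are given as (AA, AB, BB) tuples, B is taken to be the risk allele, and the function names are hypothetical. It is not a reimplementation of the STATA commands named in the abstract.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test (1 df) of Hardy-Weinberg equilibrium.

    Returns (chi2, p_value) for observed genotype counts.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def _odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Odds ratio of a 2x2 table (no continuity correction)."""
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)


def genetic_model_odds_ratios(case, control):
    """Odds ratios under five genetic models; B is assumed to be the risk allele.

    `case` and `control` are (AA, AB, BB) genotype counts.
    """
    ca_aa, ca_ab, ca_bb = case
    co_aa, co_ab, co_bb = control
    return {
        "dominant": _odds_ratio(ca_ab + ca_bb, ca_aa, co_ab + co_bb, co_aa),
        "recessive": _odds_ratio(ca_bb, ca_aa + ca_ab, co_bb, co_aa + co_ab),
        "homozygote": _odds_ratio(ca_bb, ca_aa, co_bb, co_aa),
        "heterozygote": _odds_ratio(ca_ab, ca_aa, co_ab, co_aa),
        "allelic": _odds_ratio(2 * ca_bb + ca_ab, 2 * ca_aa + ca_ab,
                               2 * co_bb + co_ab, 2 * co_aa + co_ab),
    }
```

    In practice the HWE test is usually applied to the control group of each study, and the per-study odds ratios under a chosen model would then be pooled in the meta-analysis step.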

  4. Significance of genome-wide association studies in molecular anthropology.

    PubMed

    Gupta, Vipin; Khadgawat, Rajesh; Sachdeva, Mohinder Pal

    2009-12-01

    The successful advent of a genome-wide approach in association studies raises the hopes of human geneticists for solving a genetic maze of complex traits especially the disorders. This approach, which is replete with the application of cutting-edge technology and supported by big science projects (like Human Genome Project; and even more importantly the International HapMap Project) and various important databases (SNP database, CNV database, etc.), has had unprecedented success in rapidly uncovering many of the genetic determinants of complex disorders. The magnitude of this approach in the genetics of classical anthropological variables like height, skin color, eye color, and other genome diversity projects has certainly expanded the horizons of molecular anthropology. Therefore, in this article we have proposed a genome-wide association approach in molecular anthropological studies by providing lessons from the exemplary study of the Wellcome Trust Case Control Consortium. We have also highlighted the importance and uniqueness of Indian population groups in facilitating the design and finding optimum solutions for other genome-wide association-related challenges.

  5. Fluorescence Reporter-Based Genome-Wide RNA Interference Screening to Identify Alternative Splicing Regulators.

    PubMed

    Misra, Ashish; Green, Michael R

    2017-01-01

    Alternative splicing is a regulated process that leads to inclusion or exclusion of particular exons in a pre-mRNA transcript, resulting in multiple protein isoforms being encoded by a single gene. With more than 90 % of human genes known to undergo alternative splicing, it represents a major source for biological diversity inside cells. Although in vitro splicing assays have revealed insights into the mechanisms regulating individual alternative splicing events, our global understanding of alternative splicing regulation is still evolving. In recent years, genome-wide RNA interference (RNAi) screening has transformed biological research by enabling genome-scale loss-of-function screens in cultured cells and model organisms. In addition to resulting in the identification of new cellular pathways and potential drug targets, these screens have also uncovered many previously unknown mechanisms regulating alternative splicing. Here, we describe a method for the identification of alternative splicing regulators using genome-wide RNAi screening, as well as assays for further validation of the identified candidates. With modifications, this method can also be adapted to study the splicing regulation of pre-mRNAs that contain two or more splice isoforms.

  6. Automated ensemble assembly and validation of microbial genomes.

    PubMed

    Koren, Sergey; Treangen, Todd J; Hill, Christopher M; Pop, Mihai; Phillippy, Adam M

    2014-05-03

    The continued democratization of DNA sequencing has sparked a new wave of development of genome assembly and assembly validation methods. As individual research labs, rather than centralized centers, begin to sequence the majority of new genomes, it is important to establish best practices for genome assembly. However, recent evaluations such as GAGE and the Assemblathon have concluded that there is no single best approach to genome assembly. Instead, it is preferable to generate multiple assemblies and validate them to determine which is most useful for the desired analysis; this is a labor-intensive process that is often impossible or unfeasible. To encourage best practices supported by the community, we present iMetAMOS, an automated ensemble assembly pipeline; iMetAMOS encapsulates the process of running, validating, and selecting a single assembly from multiple assemblies. iMetAMOS packages several leading open-source tools into a single binary that automates parameter selection and execution of multiple assemblers, scores the resulting assemblies based on multiple validation metrics, and annotates the assemblies for genes and contaminants. We demonstrate the utility of the ensemble process on 225 previously unassembled Mycobacterium tuberculosis genomes as well as a Rhodobacter sphaeroides benchmark dataset. On these real data, iMetAMOS reliably produces validated assemblies and identifies potential contamination without user intervention. In addition, intelligent parameter selection produces assemblies of R. sphaeroides comparable to or exceeding the quality of those from the GAGE-B evaluation, affecting the relative ranking of some assemblers. Ensemble assembly with iMetAMOS provides users with multiple, validated assemblies for each genome. Although computationally limited to small or mid-sized genomes, this approach is the most effective and reproducible means for generating high-quality assemblies and enables users to select an assembly best tailored to

  7. Genome-wide association analysis for feed efficiency in Angus cattle.

    PubMed

    Rolf, M M; Taylor, J F; Schnabel, R D; McKay, S D; McClure, M C; Northcutt, S L; Kerley, M S; Weaber, R L

    2012-08-01

    Estimated breeding values for average daily feed intake (AFI; kg/day), residual feed intake (RFI; kg/day) and average daily gain (ADG; kg/day) were generated using a mixed linear model incorporating genomic relationships for 698 Angus steers genotyped with the Illumina BovineSNP50 assay. Association analyses of estimated breeding values (EBVs) were performed for 41,028 single nucleotide polymorphisms (SNPs), and permutation analysis was used to empirically establish the genome-wide significance threshold (P < 0.05) for each trait. SNPs significantly associated with each trait were used in a forward selection algorithm to identify genomic regions putatively harbouring genes with effects on each trait. A total of 53, 66 and 68 SNPs explained 54.12% (24.10%), 62.69% (29.85%) and 55.13% (26.54%) of the additive genetic variation (when accounting for the genomic relationships) in steer breeding values for AFI, RFI and ADG, respectively, within this population. Evaluation by pathway analysis revealed that many of these SNPs are in genomic regions that harbour genes with metabolic functions. The presence of genetic correlations between traits resulted in 13.2% of SNPs selected for AFI and 4.5% of SNPs selected for RFI also being selected for ADG in the analysis of breeding values. While our study identifies panels of SNPs significant for efficiency traits in our population, validation of all SNPs in independent populations will be necessary before commercialization.

  8. Genome-wide screening and identification of antigens for rickettsial vaccine development

    USDA-ARS's Scientific Manuscript database

    The capacity to identify immunogens for vaccine development by genome-wide screening has been markedly enhanced by the availability of complete microbial genome sequences coupled to rapid proteomic and bioinformatic analysis. Critical to this genome-wide screening is in vivo testing in the context o...

  9. A genome-wide association study in soybean

    USDA-ARS's Scientific Manuscript database

    A genome-wide association study (GWAS) was performed to estimate the feasibility of identifying genes controlling the quantitative traits, seed protein and oil concentration, in 298 soybean germplasm accessions exhibiting a wide range of seed protein and oil content. A total of 55,159 single nucleo...

  10. Genomic prediction in contrast to a genome-wide association study in explaining heritable variation of complex growth traits in breeding populations of Eucalyptus.

    PubMed

    Müller, Bárbara S F; Neves, Leandro G; de Almeida Filho, Janeo E; Resende, Márcio F R; Muñoz, Patricio R; Dos Santos, Paulo E T; Filho, Estefano Paludzyszyn; Kirst, Matias; Grattapaglia, Dario

    2017-07-11

    The advent of high-throughput genotyping technologies coupled to genomic prediction methods established a new paradigm to integrate genomics and breeding. We carried out whole-genome prediction and contrasted it to a genome-wide association study (GWAS) for growth traits in breeding populations of Eucalyptus benthamii (n = 505) and Eucalyptus pellita (n = 732). Both species are of increasing commercial interest for the development of germplasm adapted to environmental stresses. Predictive ability reached 0.16 in E. benthamii and 0.44 in E. pellita for diameter growth. Predictive abilities using either Genomic BLUP or different Bayesian methods were similar, suggesting that growth adequately fits the infinitesimal model. Genomic prediction models using ~5000-10,000 SNPs provided predictive abilities equivalent to using all 13,787 and 19,506 SNPs genotyped in the E. benthamii and E. pellita populations, respectively. No difference was detected in predictive ability when different sets of SNPs were utilized, based on position (equidistantly genome-wide, inside genes, linkage disequilibrium pruned or on single chromosomes), as long as the total number of SNPs used was above ~5000. Predictive abilities obtained by removing relatedness between training and validation sets fell near zero for E. benthamii and were halved for E. pellita. These results corroborate the current view that relatedness is the main driver of genomic prediction, although some short-range historical linkage disequilibrium (LD) was likely captured for E. pellita. A GWAS identified only one significant association for volume growth in E. pellita, illustrating the fact that while genome-wide regression is able to account for large proportions of the heritability, very little or none of it is captured into significant associations using GWAS in breeding populations of the size evaluated in this study. This study provides further experimental data supporting positive prospects of using genome-wide data to

  11. Optimization and quality control of genome-wide Hi-C library preparation.

    PubMed

    Zhang, Xiang-Yuan; He, Chao; Ye, Bing-Yu; Xie, De-Jian; Shi, Ming-Lei; Zhang, Yan; Shen, Wen-Long; Li, Ping; Zhao, Zhi-Hu

    2017-09-20

    High-throughput chromosome conformation capture (Hi-C) is one of the key assays for genome-wide chromatin interaction studies. It is a time-consuming process that involves many steps and many different kinds of reagents, consumables, and equipment. At present, its reproducibility is unsatisfactory. By optimizing the key steps of the Hi-C experiment, such as crosslinking, pretreatment of digestion, inactivation of restriction enzyme, and in situ ligation, we established a robust Hi-C procedure and prepared two biological replicates of Hi-C libraries from GM12878 cells. After preliminary quality control by Sanger sequencing, the two replicates were sequenced at high throughput. Bioinformatics analysis of the raw sequencing data revealed that the mapping rate and pair-mate rate were around 90% and 72%, respectively. Additionally, after removal of self-circular ligations and dangling-end products, the proportion of valid pairs exceeded 96%. Genome-wide interactome profiling shows clear topologically associating domains (TADs), which is consistent with previous reports. Further correlation analysis showed that the two biological replicates strongly correlate with each other in terms of both bin coverage and all bin pairs. All these results indicate that the optimized Hi-C procedure is robust and stable, which will be very helpful for the wide application of the Hi-C assay.

  12. Genome-Wide Identification and Transferability of Microsatellite Markers between Palmae Species

    PubMed Central

    Xiao, Yong; Xia, Wei; Ma, Jianwei; Mason, Annaliese S.; Fan, Haikuo; Shi, Peng; Lei, Xintao; Ma, Zilong; Peng, Ming

    2016-01-01

    The Palmae family contains 202 genera and approximately 2800 species. Except for Elaeis guineensis and Phoenix dactylifera, almost no genetic and genomic information is available for Palmae species. Therefore, this is an obstacle to the conservation and genetic assessment of Palmae species, especially those that are currently endangered. The study was performed to develop a large number of microsatellite markers which can be used for genetic analysis in different Palmae species. Based on the assembled genome of E. guineensis and P. dactylifera, a total of 814 383 and 371 629 microsatellites were identified. Among these microsatellites identified in E. guineensis, 734 509 primer pairs could be designed from the flanking sequences of these microsatellites. The majority (618 762) of these designed primer pairs had in silico products in the genome of E. guineensis. These 618 762 primer pairs were subsequently used to in silico amplify the genome of P. dactylifera. A total of 7 265 conserved microsatellites were identified between E. guineensis and P. dactylifera. One hundred and thirty-five primer pairs flanking the conserved SSRs were stochastically selected and validated to have high cross-genera transferability, varying from 16.7 to 93.3% with an average of 73.7%. These genome-wide conserved microsatellite markers will provide a useful tool for genetic assessment and conservation of different Palmae species in the future. PMID:27826307
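
    The abstract does not say which tool or thresholds were used to identify microsatellites. As a rough illustration only, perfect simple sequence repeats can be located with regular expressions; the minimum repeat counts and the function name below are assumptions, not the authors' settings.

```python
import re

# Assumed minimum number of repeat units per motif length (1-6 bp);
# these thresholds are illustrative, not taken from the study.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return perfect SSRs as (start, end, motif, n_repeats), 0-based half-open."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" (mononucleotide) or "ATAT" (dinucleotide).
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)
```

    For example, `find_ssrs("GGC" + "AT" * 7 + "CCG" + "A" * 12 + "TTG")` reports one dinucleotide and one mononucleotide repeat. Primer design on the flanking sequences and in silico amplification are separate steps that this sketch does not cover.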

  13. A Genome-Wide Association Study of Circulating Galectin-3

    PubMed Central

    van Veldhuisen, Dirk J.; Westra, Harm-Jan; Bakker, Stephan J. L.; Gansevoort, Ron T.; Muller Kobold, Anneke C.; van Gilst, Wiek H.; Franke, Lude

    2012-01-01

    Galectin-3 is a lectin involved in fibrosis, inflammation and proliferation. Increased circulating levels of galectin-3 have been associated with various diseases, including cancer, immunological disorders, and cardiovascular disease. To enhance our knowledge on galectin-3 biology we performed the first genome-wide association study (GWAS) using the Illumina HumanCytoSNP-12 array imputed with the HapMap 2 CEU panel on plasma galectin-3 levels in 3,776 subjects and follow-up genotyping in an additional 3,516 subjects. We identified 2 genome wide significant loci associated with plasma galectin-3 levels. One locus harbours the LGALS3 gene (rs2274273; P = 2.35×10⁻¹⁸⁸) and the other locus the ABO gene (rs644234; P = 3.65×10⁻⁴⁷). The variance explained by the LGALS3 locus was 25.6% and by the ABO locus 3.8% and jointly they explained 29.2%. Rs2274273 lies in high linkage disequilibrium with two non-synonymous SNPs (rs4644; r² = 1.0, and rs4652; r² = 0.91) and wet lab follow-up genotyping revealed that both are strongly associated with galectin-3 levels (rs4644; P = 4.97×10⁻⁴⁶⁵ and rs4652 P = 1.50×10⁻⁴²¹) and were also associated with LGALS3 gene-expression. The origins of our associations should be further validated by means of functional experiments. PMID:23056639

  14. GenoGAM: genome-wide generalized additive models for ChIP-Seq analysis.

    PubMed

    Stricker, Georg; Engelhardt, Alexander; Schulz, Daniel; Schmid, Matthias; Tresch, Achim; Gagneur, Julien

    2017-08-01

    Chromatin immunoprecipitation followed by deep sequencing (ChIP-Seq) is a widely used approach to study protein-DNA interactions. Often, the quantities of interest are the differential occupancies relative to controls, between genetic backgrounds, treatments, or combinations thereof. Current methods for differential occupancy of ChIP-Seq data rely however on binning or sliding window techniques, for which the choice of the window and bin sizes are subjective. Here, we present GenoGAM (Genome-wide Generalized Additive Model), which brings the well-established and flexible generalized additive models framework to genomic applications using a data parallelism strategy. We model ChIP-Seq read count frequencies as products of smooth functions along chromosomes. Smoothing parameters are objectively estimated from the data by cross-validation, eliminating ad hoc binning and windowing needed by current approaches. GenoGAM provides base-level and region-level significance testing for full factorial designs. Application to a ChIP-Seq dataset in yeast showed increased sensitivity over existing differential occupancy methods while controlling for type I error rate. By analyzing a set of DNA methylation data and illustrating an extension to a peak caller, we further demonstrate the potential of GenoGAM as a generic statistical modeling tool for genome-wide assays. Software is available from Bioconductor: https://www.bioconductor.org/packages/release/bioc/html/GenoGAM.html. Supplementary information is available at Bioinformatics online.

  15. Genome-Wide Detection and Analysis of Multifunctional Genes

    PubMed Central

    Pritykin, Yuri; Ghersi, Dario; Singh, Mona

    2015-01-01

    Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms—H. sapiens, D. melanogaster, and S. cerevisiae—and show that, as compared to other annotated genes, genes involved in multiple biological processes possess distinct physicochemical properties, are more broadly expressed, tend to be more central in protein interaction networks, tend to be more evolutionarily conserved, and are more likely to be essential. We also find that multifunctional genes are significantly more likely to be involved in human disorders. These same features also hold when multifunctionality is defined with respect to molecular functions instead of biological processes. Our analysis uncovers key features about multifunctional genes, and is a step towards a better genome-wide understanding of gene multifunctionality. PMID:26436655

  16. Genome-Wide Locations of Potential Epimutations Associated with Environmentally Induced Epigenetic Transgenerational Inheritance of Disease Using a Sequential Machine Learning Prediction Approach.

    PubMed

    Haque, M Muksitul; Holder, Lawrence B; Skinner, Michael K

    2015-01-01

    Environmentally induced epigenetic transgenerational inheritance of disease and phenotypic variation involves germline transmitted epimutations. The primary epimutations identified involve altered differential DNA methylation regions (DMRs). Different environmental toxicants have been shown to promote exposure (i.e., toxicant) specific signatures of germline epimutations. Analysis of genomic features associated with these epimutations identified low-density CpG regions (<3 CpG / 100bp) termed CpG deserts and a number of unique DNA sequence motifs. The rat genome was annotated for these and additional relevant features. The objective of the current study was to use a machine learning computational approach to predict all potential epimutations in the genome. A number of previously identified sperm epimutations were used as training sets. A novel machine learning approach using a sequential combination of Active Learning and Imbalance Class Learner analysis was developed. The transgenerational sperm epimutation analysis identified approximately 50K individual sites with a 1 kb mean size and 3,233 regions that had a minimum of three adjacent sites with a mean size of 3.5 kb. A select number of the most relevant genomic features were identified with the low density CpG deserts being a critical genomic feature of the features selected. A similar independent analysis with transgenerational somatic cell epimutation training sets identified a smaller number of 1,503 regions of genome-wide predicted sites and differences in genomic feature contributions. The predicted genome-wide germline (sperm) epimutations were found to be distinct from the predicted somatic cell epimutations. Validation of the genome-wide germline predicted sites used two recently identified transgenerational sperm epimutation signature sets from the pesticides dichlorodiphenyltrichloroethane (DDT) and methoxychlor (MXC) exposure lineage F3 generation. Analysis of this positive validation data set
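
    The CpG desert criterion quoted above (<3 CpG / 100bp) can be illustrated with a simple window scan. This is a hedged sketch, not the study's annotation pipeline: the use of non-overlapping windows and the function name are assumptions.

```python
def cpg_desert_windows(seq, window=100, max_cpg=3):
    """Return (start, end, cpg_count) for windows with fewer than `max_cpg` CpGs.

    Windows are non-overlapping and 0-based half-open; a trailing partial
    window is ignored, and a CpG that straddles a window boundary is not
    counted in either window (a simplification made for this sketch).
    """
    seq = seq.upper()
    deserts = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        n_cpg = chunk.count("CG")  # CpG dinucleotides inside this window
        if n_cpg < max_cpg:
            deserts.append((start, start + window, n_cpg))
    return deserts
```

    A sliding window with a smaller step would give finer boundaries at the cost of more computation; the machine learning prediction described in the abstract uses such features as inputs and is not reproduced here.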

  17. Genome-Wide Association Study (GWAS) and Genome-Wide Environment Interaction Study (GWEIS) of Depressive Symptoms in African American and Hispanic/Latina Women

    PubMed Central

    Dunn, Erin C.; Wiste, Anna; Radmanesh, Farid; Almli, Lynn M.; Gogarten, Stephanie M.; Sofer, Tamar; Faul, Jessica D.; Kardia, Sharon L.R.; Smith, Jennifer A.; Weir, David R.; Zhao, Wei; Soare, Thomas W.; Mirza, Saira S.; Hek, Karin; Tiemeier, Henning W.; Goveas, Joseph S.; Sarto, Gloria E.; Snively, Beverly M.; Cornelis, Marilyn; Koenen, Karestan C.; Kraft, Peter; Purcell, Shaun; Ressler, Kerry J.; Rosand, Jonathan; Wassertheil-Smoller, Sylvia; Smoller, Jordan W.

    2016-01-01

    Background Genome-wide association studies (GWAS) have been unable to identify variants linked to depression. We hypothesized that examining depressive symptoms and considering gene-environment interaction (G×E) might improve efficiency for gene discovery. We therefore conducted a GWAS and genome-wide environment interaction study (GWEIS) of depressive symptoms. Methods Using data from the SHARe cohort of the Women’s Health Initiative, comprising African Americans (n=7179) and Hispanics/Latinas (n=3138), we examined genetic main effects and G×E with stressful life events and social support. We also conducted a heritability analysis using genome-wide complex trait analysis (GCTA). Replication was attempted in four independent cohorts. Results No SNPs achieved genome-wide significance for main effects in either discovery sample. The top signals in African Americans were rs73531535 (located 20kb from GPR139, p=5.75×10^−8) and rs75407252 (intronic to CACNA2D3, p=6.99×10^−7). In Hispanics/Latinas, the top signals were rs2532087 (located 27kb from CD38, p=2.44×10^−7) and rs4542757 (intronic to DCC, p=7.31×10^−7). In the GWEIS with stressful life events, one interaction signal was genome-wide significant in African Americans (rs4652467; p=4.10×10^−10; located 14kb from CEP350). This interaction was not observed in a smaller replication cohort. Although heritability estimates for depressive symptoms and stressful life events were each less than 10%, they were strongly genetically correlated (rG=0.95), suggesting that common variation underlying depressive symptoms and stressful life event exposure, though modest on their own, were highly overlapping in this sample. Conclusions Our results underscore the need for larger samples, more GWEIS, and greater investigation into genetic and environmental determinants of depressive symptoms in minorities. PMID:27038408

  18. Quality control and quality assurance in genotypic data for genome-wide association studies

    PubMed Central

    Laurie, Cathy C.; Doheny, Kimberly F.; Mirel, Daniel B.; Pugh, Elizabeth W.; Bierut, Laura J.; Bhangale, Tushar; Boehm, Frederick; Caporaso, Neil E.; Cornelis, Marilyn C.; Edenberg, Howard J.; Gabriel, Stacy B.; Harris, Emily L.; Hu, Frank B.; Jacobs, Kevin; Kraft, Peter; Landi, Maria Teresa; Lumley, Thomas; Manolio, Teri A.; McHugh, Caitlin; Painter, Ian; Paschall, Justin; Rice, John P.; Rice, Kenneth M.; Zheng, Xiuwen; Weir, Bruce S.

    2011-01-01

    Genome-wide scans of nucleotide variation in human subjects are providing an increasing number of replicated associations with complex disease traits. Most of the variants detected have small effects and, collectively, they account for a small fraction of the total genetic variance. Very large sample sizes are required to identify and validate findings. In this situation, even small sources of systematic or random error can cause spurious results or obscure real effects. The need for careful attention to data quality has been appreciated for some time in this field, and a number of strategies for quality control and quality assurance (QC/QA) have been developed. Here we extend these methods and describe a system of QC/QA for genotypic data in genome-wide association studies. This system includes some new approaches that (1) combine analysis of allelic probe intensities and called genotypes to distinguish gender misidentification from sex chromosome aberrations, (2) detect autosomal chromosome aberrations that may affect genotype calling accuracy, (3) infer DNA sample quality from relatedness and allelic intensities, (4) use duplicate concordance to infer SNP quality, (5) detect genotyping artifacts from dependence of Hardy-Weinberg equilibrium (HWE) test p-values on allelic frequency, and (6) demonstrate sensitivity of principal components analysis (PCA) to SNP selection. The methods are illustrated with examples from the ‘Gene Environment Association Studies’ (GENEVA) program. The results suggest several recommendations for QC/QA in the design and execution of genome-wide association studies. PMID:20718045
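
    The Hardy-Weinberg equilibrium (HWE) test mentioned in item (5) can be illustrated with a minimal sketch. This is the textbook one-degree-of-freedom chi-square version, not necessarily the test used in the GENEVA QC/QA system; the function name and genotype counts are hypothetical.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-square test of Hardy-Weinberg equilibrium for one biallelic SNP.

    Takes observed genotype counts (AA, AB, BB) and returns (chi2, p_value).
    Illustrative only: exact tests are often preferred for rare alleles.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")

    # Allele frequency of A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p

    # Expected genotype counts under HWE: n*p^2, 2*n*p*q, n*q^2.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)

    # Skip cells with zero expectation (monomorphic SNP).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: a complete absence of heterozygotes.
    print(hwe_chi_square(50, 0, 50))
```

    A SNP whose HWE p-value is very small in controls is a common candidate for a genotyping artifact, which is why the test appears in QC pipelines.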

  19. Comprehensive performance comparison of high-resolution array platforms for genome-wide Copy Number Variation (CNV) analysis in humans.

    PubMed

    Haraksingh, Rajini R; Abyzov, Alexej; Urban, Alexander Eckehart

    2017-04-24

    High-resolution microarray technology is routinely used in basic research and clinical practice to efficiently detect copy number variants (CNVs) across the entire human genome. A new generation of arrays combining high probe densities with optimized designs will comprise essential tools for genome analysis in the coming years. We systematically compared the genome-wide CNV detection power of all 17 available array designs from the Affymetrix, Agilent, and Illumina platforms by hybridizing the well-characterized genome of 1000 Genomes Project subject NA12878 to all arrays, and performing data analysis using both manufacturer-recommended and platform-independent software. We benchmarked the resulting CNV call sets from each array using a gold standard set of CNVs for this genome derived from 1000 Genomes Project whole genome sequencing data. The arrays tested comprise both SNP and aCGH platforms with varying designs and contain between ~0.5 to ~4.6 million probes. Across the arrays CNV detection varied widely in number of CNV calls (4-489), CNV size range (~40 bp to ~8 Mbp), and percentage of non-validated CNVs (0-86%). We discovered strikingly strong effects of specific array design principles on performance. For example, some SNP array designs with the largest numbers of probes and extensive exonic coverage produced a considerable number of CNV calls that could not be validated, compared to designs with probe numbers that are sometimes an order of magnitude smaller. This effect was only partially ameliorated using different analysis software and optimizing data analysis parameters. High-resolution microarrays will continue to be used as reliable, cost- and time-efficient tools for CNV analysis. However, different applications tolerate different limitations in CNV detection. Our study quantified how these arrays differ in total number and size range of detected CNVs as well as sensitivity, and determined how each array balances these attributes. This analysis will

  20. A Genome-Wide Breast Cancer Scan in African Americans

    DTIC Science & Technology

    2010-06-01

    SNPs from the African American breast cancer scan to COGs, a European collaborative study which has designed a SNP array that will be genotyped...Award Number: W81XWH-08-1-0383 TITLE: A Genome-wide Breast Cancer Scan in African Americans PRINCIPAL INVESTIGATOR: Christopher A...SUBTITLE A Genome-wide Breast Cancer Scan in African Americans 5a. CONTRACT NUMBER 5b. GRANT NUMBER W81XWH-08-1-0383 5c. PROGRAM

  1. The complex genetics of gait speed: genome-wide meta-analysis approach

    PubMed Central

    Lunetta, Kathryn L.; Smith, Jennifer A.; Eicher, John D.; Vered, Rotem; Deelen, Joris; Arnold, Alice M.; Buchman, Aron S.; Tanaka, Toshiko; Faul, Jessica D.; Nethander, Maria; Fornage, Myriam; Adams, Hieab H.; Matteini, Amy M.; Callisaya, Michele L.; Smith, Albert V.; Yu, Lei; De Jager, Philip L.; Evans, Denis A.; Gudnason, Vilmundur; Hofman, Albert; Pattie, Alison; Corley, Janie; Launer, Lenore J.; Knopman, Davis S.; Parimi, Neeta; Turner, Stephen T.; Bandinelli, Stefania; Beekman, Marian; Gutman, Danielle; Sharvit, Lital; Mooijaart, Simon P.; Liewald, David C.; Houwing-Duistermaat, Jeanine J.; Ohlsson, Claes; Moed, Matthijs; Verlinden, Vincent J.; Mellström, Dan; van der Geest, Jos N.; Karlsson, Magnus; Hernandez, Dena; McWhirter, Rebekah; Liu, Yongmei; Thomson, Russell; Tranah, Gregory J.; Uitterlinden, Andre G.; Weir, David R.; Zhao, Wei; Starr, John M.; Johnson, Andrew D.; Ikram, M. Arfan; Bennett, David A.; Cummings, Steven R.; Deary, Ian J.; Harris, Tamara B.; Kardia, Sharon L. R.; Mosley, Thomas H.; Srikanth, Velandai K.; Windham, Beverly G.; Newman, Ann B.; Walston, Jeremy D.; Davies, Gail; Evans, Daniel S.; Slagboom, Eline P.; Ferrucci, Luigi; Kiel, Douglas P.; Murabito, Joanne M.; Atzmon, Gil

    2017-01-01

    Emerging evidence suggests that the basis for variation in late-life mobility is attributable, in part, to genetic factors, which may become increasingly important with age. Our objective was to systematically assess the contribution of genetic variation to gait speed in older individuals. We conducted a meta-analysis of gait speed GWASs in 31,478 older adults from 17 cohorts of the CHARGE consortium, and validated our results in 2,588 older adults from 4 independent studies. We followed our initial discoveries with network and eQTL analysis of candidate signals in tissues. The meta-analysis resulted in a list of 536 suggestive genome wide significant SNPs in or near 69 genes. Further interrogation with Pathway Analysis placed gait speed as a polygenic complex trait in five major networks. Subsequent eQTL analysis revealed several SNPs significantly associated with the expression of PRSS16, WDSUB1 and PTPRT, which in addition to the meta-analysis and pathway suggested that genetic effects on gait speed may occur through synaptic function and neuronal development pathways. No genome-wide significant signals for gait speed were identified from this moderately large sample of older adults, suggesting that more refined physical function phenotypes will be needed to identify the genetic basis of gait speed in aging. PMID:28077804

  2. Layers of epistasis: genome-wide regulatory networks and network approaches to genome-wide association studies.

    PubMed

    Cowper-Sal lari, Richard; Cole, Michael D; Karagas, Margaret R; Lupien, Mathieu; Moore, Jason H

    2011-01-01

    The conceptual foundation of the genome-wide association study (GWAS) has advanced unchecked since its conception. A revision might seem premature as the potential of GWAS has not been fully realized. Multiple technical and practical limitations need to be overcome before GWAS can be fairly criticized. But with the completion of hundreds of studies and a deeper understanding of the genetic architecture of disease, warnings are being raised. The results compiled to date indicate that risk-associated variants lie predominantly in noncoding regions of the genome. Additionally, alternative methodologies are uncovering large and heterogeneous sets of rare variants underlying disease. The fear is that, even in its fulfillment, the current GWAS paradigm might be incapable of dissecting all kinds of phenotypes. In the following text, we review several initiatives that aim to overcome these limitations. The overarching theme of these studies is the inclusion of biological knowledge to both the analysis and interpretation of genotyping data. GWAS is uninformed of biology by design and although there is some virtue in its simplicity, it is also its most conspicuous deficiency. We propose a framework in which to integrate these novel approaches, both empirical and theoretical, in the form of a genome-wide regulatory network (GWRN). By processing experimental data into networks, emerging data types based on chromatin immunoprecipitation are made computationally tractable. This will give GWAS re-analysis efforts the most current and relevant substrates, and root them firmly on our knowledge of human disease. Copyright © 2010 John Wiley & Sons, Inc.

  3. Landscape genomics reveals altered genome wide diversity within revegetated stands of Eucalyptus microcarpa (Grey Box).

    PubMed

    Jordan, Rebecca; Dillon, Shannon K; Prober, Suzanne M; Hoffmann, Ary A

    2016-12-01

    In order to contribute to evolutionary resilience and adaptive potential in highly modified landscapes, revegetated areas should ideally reflect levels of genetic diversity within and across natural stands. Landscape genomic analyses enable such diversity patterns to be characterized at genome and chromosomal levels. Landscape-wide patterns of genomic diversity were assessed in Eucalyptus microcarpa, a dominant tree species widely used in revegetation in Southeastern Australia. Trees from small and large patches within large remnants, small isolated remnants and revegetation sites were assessed across the now highly fragmented distribution of this species using the DArTseq genomic approach. Genomic diversity was similar within all three types of remnant patches analysed, although often significantly but only slightly lower in revegetation sites compared with natural remnants. Differences in diversity between stand types varied across chromosomes. Genomic differentiation was higher between small, isolated remnants, and among revegetated sites compared with natural stands. We conclude that small remnants and revegetated sites of our E. microcarpa samples largely but not completely capture patterns in genomic diversity across the landscape. Genomic approaches provide a powerful tool for assessing restoration efforts across the landscape. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  4. A Genome-Wide Association Study to Identify Genomic Modulators of Rate Control Therapy in Patients with Atrial Fibrillation

    PubMed Central

    Kolek, Matthew J.; Edwards, Todd L.; Muhammad, Raafia; Balouch, Adnan; Shoemaker, M. Benjamin; Blair, Marcia A.; Kor, Kaylen C.; Takahashi, Atsushi; Kubo, Michiaki; Roden, Dan M.; Tanaka, Toshihiro; Darbar, Dawood

    2014-01-01

    For many patients with atrial fibrillation (AF), ventricular rate control with atrioventricular (AV) nodal blockers is considered first-line therapy, though response to treatment is highly variable. Using an extreme phenotype of failure of rate control necessitating AV nodal ablation and pacemaker implantation, we conducted a genome-wide association study (GWAS) to identify genomic modulators of rate control therapy. Cases included 95 patients who failed rate control therapy. Controls (N=190) achieved adequate rate control therapy with ≤2 AV nodal blockers using a conventional clinical definition. Genotyping was performed on the Illumina 610-Quad platform, and results were imputed to the 1000 Genomes reference haplotypes. 554,041 single nucleotide polymorphisms (SNPs) met criteria for minor allele frequency (>0.01), call rate (>95%), and quality control, and 6,055,224 SNPs were available after imputation. No SNP reached the canonical threshold for significance for GWAS of P<5 × 10^−8. Sixty-three SNPs with P<10^−5 at 6 genomic loci were genotyped in a validation cohort of 130 cases and 157 controls. These included 6q24.3 (near SAMD5/SASH1, P=9.36 × 10^−8), 4q12 (IGFBP7, P=1.75 × 10^−7), 6q22.33 (C6orf174, P=4.86 × 10^−7), 3p21.31 (CDCP1, P=1.18 × 10^−6), 12p12.1 (SOX5, P=1.62 × 10^−6), and 7p11 (LANCL2, P=6.51 × 10^−6). However, none of these were significant in the replication cohort or in a meta-analysis of both cohorts. In conclusion, we identified several potentially important genomic modulators of rate control therapy in AF, particularly SOX5, which was previously associated with resting heart rate and PR interval. However, these failed to reach genome-wide significance. PMID:25015694
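
    The filtering criteria quoted in this abstract (minor allele frequency > 0.01, call rate > 95%, genome-wide significance at P < 5 × 10^−8) can be expressed as a short sketch. The `Snp` record and helper names below are hypothetical and are not taken from the study's pipeline; only the thresholds and the reported top p-value come from the abstract.

```python
from dataclasses import dataclass

# Thresholds as quoted in the abstract.
MAF_MIN = 0.01          # minor allele frequency must exceed this
CALL_RATE_MIN = 0.95    # genotype call rate must exceed this
GENOME_WIDE_P = 5e-8    # canonical genome-wide significance threshold


@dataclass
class Snp:
    """Hypothetical per-SNP summary record."""
    snp_id: str
    maf: float
    call_rate: float
    p_value: float


def passes_qc(snp: Snp) -> bool:
    """True if the SNP meets both the MAF and call-rate criteria."""
    return snp.maf > MAF_MIN and snp.call_rate > CALL_RATE_MIN


def is_genome_wide_significant(p_value: float) -> bool:
    """True if the p-value is below the canonical GWAS threshold."""
    return p_value < GENOME_WIDE_P


if __name__ == "__main__":
    # The strongest signal reported (6q24.3, P = 9.36e-8) misses the threshold.
    print(is_genome_wide_significant(9.36e-8))
```

    Applying `is_genome_wide_significant` to each of the six loci listed above returns False in every case, consistent with the abstract's conclusion.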

  5. A genome-wide association study to identify genomic modulators of rate control therapy in patients with atrial fibrillation.

    PubMed

    Kolek, Matthew J; Edwards, Todd L; Muhammad, Raafia; Balouch, Adnan; Shoemaker, M Benjamin; Blair, Marcia A; Kor, Kaylen C; Takahashi, Atsushi; Kubo, Michiaki; Roden, Dan M; Tanaka, Toshihiro; Darbar, Dawood

    2014-08-15

    For many patients with atrial fibrillation, ventricular rate control with atrioventricular (AV) nodal blockers is considered first-line therapy, although response to treatment is highly variable. Using an extreme phenotype of failure of rate control necessitating AV nodal ablation and pacemaker implantation, we conducted a genome-wide association study (GWAS) to identify genomic modulators of rate control therapy. Cases included 95 patients who failed rate control therapy. Controls (n = 190) achieved adequate rate control therapy with ≤2 AV nodal blockers using a conventional clinical definition. Genotyping was performed on the Illumina 610-Quad platform, and results were imputed to the 1000 Genomes reference haplotypes. A total of 554,041 single-nucleotide polymorphisms (SNPs) met criteria for minor allele frequency (>0.01), call rate (>95%), and quality control, and 6,055,224 SNPs were available after imputation. No SNP reached the canonical threshold for significance for GWAS of p <5 × 10(-8). Sixty-three SNPs with p <10(-5) at 6 genomic loci were genotyped in a validation cohort of 130 cases and 157 controls. These included 6q24.3 (near SAMD5/SASH1, p = 9.36 × 10(-8)), 4q12 (IGFBP7, p = 1.75 × 10(-7)), 6q22.33 (C6orf174, p = 4.86 × 10(-7)), 3p21.31 (CDCP1, p = 1.18 × 10(-6)), 12p12.1 (SOX5, p = 1.62 × 10(-6)), and 7p11 (LANCL2, p = 6.51 × 10(-6)). However, none of these were significant in the replication cohort or in a meta-analysis of both cohorts. In conclusion, we identified several potentially important genomic modulators of rate control therapy in atrial fibrillation, particularly SOX5, which was previously associated with heart rate at rest and PR interval. However, these failed to reach genome-wide significance. Copyright © 2014 Elsevier Inc. All rights reserved.

  6. Employing genome-wide SNP discovery and genotyping strategy to extrapolate the natural allelic diversity and domestication patterns in chickpea

    PubMed Central

    Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D.; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S.; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C. L. L.; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

    2015-01-01

    The genome-wide discovery and high-throughput genotyping of SNPs in chickpea natural germplasm lines is indispensable to extrapolate their natural allelic diversity, domestication, and linkage disequilibrium (LD) patterns leading to the genetic enhancement of this vital legume crop. We discovered 44,844 high-quality SNPs by sequencing of 93 diverse cultivated desi, kabuli, and wild chickpea accessions using reference genome- and de novo-based GBS (genotyping-by-sequencing) assays that were physically mapped across eight chromosomes of desi and kabuli. Of these, 22,542 SNPs were structurally annotated in different coding and non-coding sequence components of genes. Genes with 3296 non-synonymous and 269 regulatory SNPs could functionally differentiate accessions based on their contrasting agronomic traits. A high experimental validation success rate (92%) and reproducibility (100%) along with strong sensitivity (93–96%) and specificity (99%) of GBS-based SNPs was observed. This infers the robustness of GBS as a high-throughput assay for rapid large-scale mining and genotyping of genome-wide SNPs in chickpea with sub-optimal use of resources. With 23,798 genome-wide SNPs, a relatively high intra-specific polymorphic potential (49.5%) and broader molecular diversity (13–89%)/functional allelic diversity (18–77%) was apparent among 93 chickpea accessions, suggesting their tremendous applicability in rapid selection of desirable diverse accessions/inter-specific hybrids in chickpea crossbred varietal improvement program. The genome-wide SNPs revealed complex admixed domestication pattern, extensive LD estimates (0.54–0.68) and extended LD decay (400–500 kb) in a structured population inclusive of 93 accessions. These findings reflect the utility of our identified SNPs for subsequent genome-wide association study (GWAS) and selective sweep-based domestication trait dissection analysis to identify potential genomic loci (gene-associated targets) specifically

  7. Genome-wide survey and expression analysis of F-box genes in chickpea.

    PubMed

    Gupta, Shefali; Garg, Vanika; Kant, Chandra; Bhatia, Sabhyata

    2015-02-13

    The F-box genes constitute one of the largest gene families in plants involved in degradation of cellular proteins. F-box proteins can recognize a wide array of substrates and regulate many important biological processes such as embryogenesis, floral development, plant growth and development, biotic and abiotic stress, hormonal responses and senescence, among others. However, little is known about the F-box genes in the important legume crop, chickpea. The available draft genome sequence of chickpea allowed us to conduct a genome-wide survey of the F-box gene family in chickpea. A total of 285 F-box genes were identified in chickpea which were classified based on their C-terminal domain structures into 10 subfamilies. Thirteen putative novel motifs were also identified in F-box proteins with no known functional domain at their C-termini. The F-box genes were physically mapped on the 8 chickpea chromosomes and duplication events were investigated which revealed that the F-box gene family expanded largely due to tandem duplications. Phylogenetic analysis classified the chickpea F-box genes into 9 clusters. Also, maximum syntenic relationship was observed with soybean followed by Medicago truncatula, Lotus japonicus and Arabidopsis. Digital expression analysis of F-box genes in various chickpea tissues as well as under abiotic stress conditions utilizing the available chickpea transcriptome data revealed differential expression patterns with several F-box genes specifically expressing in each tissue, few of which were validated by using quantitative real-time PCR. The genome-wide analysis of chickpea F-box genes provides new opportunities for characterization of candidate F-box genes and elucidation of their function in growth, development and stress responses for utilization in chickpea improvement.

  8. GENOME-WIDE ASSOCIATION STUDY (GWAS) AND GENOME-WIDE BY ENVIRONMENT INTERACTION STUDY (GWEIS) OF DEPRESSIVE SYMPTOMS IN AFRICAN AMERICAN AND HISPANIC/LATINA WOMEN.

    PubMed

    Dunn, Erin C; Wiste, Anna; Radmanesh, Farid; Almli, Lynn M; Gogarten, Stephanie M; Sofer, Tamar; Faul, Jessica D; Kardia, Sharon L R; Smith, Jennifer A; Weir, David R; Zhao, Wei; Soare, Thomas W; Mirza, Saira S; Hek, Karin; Tiemeier, Henning; Goveas, Joseph S; Sarto, Gloria E; Snively, Beverly M; Cornelis, Marilyn; Koenen, Karestan C; Kraft, Peter; Purcell, Shaun; Ressler, Kerry J; Rosand, Jonathan; Wassertheil-Smoller, Sylvia; Smoller, Jordan W

    2016-04-01

    Genome-wide association studies (GWAS) have made little progress in identifying variants linked to depression. We hypothesized that examining depressive symptoms and considering gene-environment interaction (GxE) might improve efficiency for gene discovery. We therefore conducted a GWAS and genome-wide by environment interaction study (GWEIS) of depressive symptoms. Using data from the SHARe cohort of the Women's Health Initiative, comprising African Americans (n = 7,179) and Hispanics/Latinas (n = 3,138), we examined genetic main effects and GxE with stressful life events and social support. We also conducted a heritability analysis using genome-wide complex trait analysis (GCTA). Replication was attempted in four independent cohorts. No SNPs achieved genome-wide significance for main effects in either discovery sample. The top signals in African Americans were rs73531535 (located 20 kb from GPR139, P = 5.75 × 10(-8)) and rs75407252 (intronic to CACNA2D3, P = 6.99 × 10(-7)). In Hispanics/Latinas, the top signals were rs2532087 (located 27 kb from CD38, P = 2.44 × 10(-7)) and rs4542757 (intronic to DCC, P = 7.31 × 10(-7)). In the GWEIS with stressful life events, one interaction signal was genome-wide significant in African Americans (rs4652467; P = 4.10 × 10(-10); located 14 kb from CEP350). This interaction was not observed in a smaller replication cohort. Although heritability estimates for depressive symptoms and stressful life events were each less than 10%, they were strongly genetically correlated (rG = 0.95), suggesting that common variation underlying self-reported depressive symptoms and stressful life event exposure, though modest on their own, were highly overlapping in this sample. Our results underscore the need for larger samples, more GWEIS, and greater investigation into genetic and environmental determinants of depressive symptoms in minorities. © 2016 Wiley Periodicals, Inc.

  9. FGWAS: Functional genome wide association analysis.

    PubMed

    Huang, Chao; Thompson, Paul; Wang, Yalin; Yu, Yang; Zhang, Jingwen; Kong, Dehan; Colen, Rivka R; Knickmeyer, Rebecca C; Zhu, Hongtu

    2017-10-01

    Functional phenotypes (e.g., subcortical surface representation), which commonly arise in imaging genetic studies, have been used to detect putative genes for complexly inherited neuropsychiatric and neurodegenerative disorders. However, existing statistical methods largely ignore the functional features (e.g., functional smoothness and correlation). The aim of this paper is to develop a functional genome-wide association analysis (FGWAS) framework to efficiently carry out whole-genome analyses of functional phenotypes. FGWAS consists of three components: a multivariate varying coefficient model, a global sure independence screening procedure, and a test procedure. Compared with the standard multivariate regression model, the multivariate varying coefficient model explicitly models the functional features of functional phenotypes through the integration of smooth coefficient functions and functional principal component analysis. Statistically, compared with existing methods for genome-wide association studies (GWAS), FGWAS can substantially boost the detection power for discovering important genetic variants influencing brain structure and function. Simulation studies show that FGWAS outperforms existing GWAS methods for searching sparse signals in an extremely large search space, while controlling for the family-wise error rate. We have successfully applied FGWAS to large-scale analysis of data from the Alzheimer's Disease Neuroimaging Initiative for 708 subjects, 30,000 vertices on the left and right hippocampal surfaces, and 501,584 SNPs. Copyright © 2017 Elsevier Inc. All rights reserved.

  10. Genome-wide association study for Crohn's disease in the Quebec Founder Population identifies multiple validated disease loci.

    PubMed

    Raelson, John V; Little, Randall D; Ruether, Andreas; Fournier, Hélène; Paquin, Bruno; Van Eerdewegh, Paul; Bradley, W E C; Croteau, Pascal; Nguyen-Huu, Quynh; Segal, Jonathan; Debrus, Sophie; Allard, René; Rosenstiel, Philip; Franke, Andre; Jacobs, Gunnar; Nikolaus, Susanna; Vidal, Jean-Michel; Szego, Peter; Laplante, Nathalie; Clark, Hilary F; Paulussen, René J; Hooper, John W; Keith, Tim P; Belouchi, Abdelmajid; Schreiber, Stefan

    2007-09-11

    Genome-wide association (GWA) studies offer a powerful unbiased method for the identification of multiple susceptibility genes for complex diseases. Here we report the results of a GWA study for Crohn's disease (CD) using family trios from the Quebec Founder Population (QFP). Haplotype-based association analyses identified multiple regions associated with the disease that met the criteria for genome-wide significance, with many containing a gene whose function appears relevant to CD. A proportion of these were replicated in two independent German Caucasian samples, including the established CD loci NOD2 and IBD5. The recently described IL23R locus was also identified and replicated. For this region, multiple individuals with all major haplotypes in the QFP were sequenced and extensive fine mapping performed to identify risk and protective alleles. Several additional loci, including a region on 3p21 containing several plausible candidate genes, a region near JAKMIP1 on 4p16.1, and two larger regions on chromosome 17 were replicated. Together with previously published loci, the spectrum of CD genes identified to date involves biochemical networks that affect epithelial defense mechanisms, innate and adaptive immune response, and the repair or remodeling of tissue.

  11. Meta-analysis of genome-wide association from genomic prediction models

    USDA-ARS's Scientific Manuscript database

    A limitation of many genome-wide association studies (GWA) in animal breeding is that there are many loci with small effect sizes; thus, larger sample sizes (N) are required to guarantee suitable power of detection. To increase sample size, results from different GWA can be combined in a meta-analys...

  12. Fast and Accurate Approximation to Significance Tests in Genome-Wide Association Studies

    PubMed Central

    Zhang, Yu; Liu, Jun S.

    2011-01-01

    Genome-wide association studies commonly involve simultaneous tests of millions of single nucleotide polymorphisms (SNP) for disease association. The SNPs in nearby genomic regions, however, are often highly correlated due to linkage disequilibrium (LD, a genetic term for correlation). Simple Bonferroni correction for multiple comparisons is therefore too conservative. Permutation tests, which are often employed in practice, are both computationally expensive for genome-wide studies and limited in their scopes. We present an accurate and computationally efficient method, based on Poisson de-clumping heuristics, for approximating genome-wide significance of SNP associations. Compared with permutation tests and other multiple comparison adjustment approaches, our method computes the most accurate and robust p-value adjustments for millions of correlated comparisons within seconds. We demonstrate analytically that the accuracy and the efficiency of our method are nearly independent of the sample size, the number of SNPs, and the scale of p-values to be adjusted. In addition, our method can be easily adopted to estimate false discovery rate. When applied to genome-wide SNP datasets, we observed highly variable p-value adjustment results evaluated from different genomic regions. The variation in adjustments along the genome, however, is well conserved between the European and the African populations. The p-value adjustments are significantly correlated with LD among SNPs, recombination rates, and SNP densities. Given the large variability of sequence features in the genome, we further discuss a novel approach of using SNP-specific (local) thresholds to detect genome-wide significant associations. This article has supplementary material online. PMID:22140288
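
    For reference, the simple Bonferroni correction that this abstract describes as too conservative can be sketched as follows. This is the standard baseline only, not the authors' Poisson de-clumping method; the function names and the figure of one million tests are illustrative assumptions.

```python
from typing import List


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate <= alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def bonferroni_adjust(p_values: List[float]) -> List[float]:
    """Bonferroni-adjusted p-values: each p multiplied by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # Assuming one million independent tests at alpha = 0.05.
    print(bonferroni_threshold(0.05, 1_000_000))
```

    Because the correction treats every test as independent, it does not account for SNPs in LD carrying overlapping information, which is the source of the conservativeness noted above.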

  13. Genetic Structure of the Han Chinese Population Revealed by Genome-wide SNP Variation

    PubMed Central

    Chen, Jieming; Zheng, Houfeng; Bei, Jin-Xin; Sun, Liangdan; Jia, Wei-hua; Li, Tao; Zhang, Furen; Seielstad, Mark; Zeng, Yi-Xin; Zhang, Xuejun; Liu, Jianjun

    2009-01-01

    Population stratification is a potential problem for genome-wide association studies (GWAS), confounding results and causing spurious associations. Hence, understanding how allele frequencies vary across geographic regions or among subpopulations is an important prelude to analyzing GWAS data. Using over 350,000 genome-wide autosomal SNPs in over 6000 Han Chinese samples from ten provinces of China, our study revealed a one-dimensional “north-south” population structure and a close correlation between geography and the genetic structure of the Han Chinese. The north-south population structure is consistent with the historical migration pattern of the Han Chinese population. Metropolitan cities in China were, however, more diffused “outliers,” probably because of the impact of modern migration of peoples. At a very local scale within the Guangdong province, we observed evidence of population structure among dialect groups, probably on account of endogamy within these dialects. Via simulation, we show that empirical levels of population structure observed across modern China can cause spurious associations in GWAS if not properly handled. In the Han Chinese, geographic matching is a good proxy for genetic matching, particularly in validation and candidate-gene studies in which population stratification cannot be directly accessed and accounted for because of the lack of genome-wide data, with the exception of the metropolitan cities, where geographical location is no longer a good indicator of ancestral origin. Our findings are important for designing GWAS in the Chinese population, an activity that is expected to intensify greatly in the near future. PMID:19944401

  14. Pervasive, Genome-Wide Transcription in the Organelle Genomes of Diverse Plastid-Bearing Protists.

    PubMed

    Sanitá Lima, Matheus; Smith, David Roy

    2017-11-06

    Organelle genomes are among the most sequenced kinds of chromosome. This is largely because they are small and widely used in molecular studies, but also because next-generation sequencing technologies made sequencing easier, faster, and cheaper. However, studies of organelle RNA have not kept pace with those of DNA, despite huge amounts of freely available eukaryotic RNA-sequencing (RNA-seq) data. Little is known about organelle transcription in nonmodel species, and most of the available eukaryotic RNA-seq data have not been mined for organelle transcripts. Here, we use publicly available RNA-seq experiments to investigate organelle transcription in 30 diverse plastid-bearing protists with varying organelle genomic architectures. Mapping RNA-seq data to organelle genomes revealed pervasive, genome-wide transcription, regardless of the taxonomic grouping, gene organization, or noncoding content. For every species analyzed, transcripts covered ≥85% of the mitochondrial and/or plastid genomes (all of which were ≤105 kb), indicating that most of the organelle DNA, coding and noncoding, is transcriptionally active. These results follow earlier studies of model species showing that organellar transcription is coupled and ubiquitous across the genome, requiring significant downstream processing of polycistronic transcripts. Our findings suggest that noncoding organelle DNA can be transcriptionally active, raising questions about the underlying function of these transcripts and underscoring the utility of publicly available RNA-seq data for recovering complete genome sequences. If pervasive transcription is also found in bigger organelle genomes (>105 kb) and across a broader range of eukaryotes, this could indicate that noncoding organelle RNAs are regulating fundamental processes within eukaryotic cells. Copyright © 2017 Sanitá Lima and Smith.
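
    A coverage figure such as "transcripts covered ≥85% of the genome" can be computed by merging the mapped intervals and dividing the covered length by the genome length. The sketch below is a minimal illustration under assumed conventions (half-open [start, end) coordinates, a hypothetical function name, made-up intervals); it is not the pipeline used in the study.

```python
def covered_fraction(intervals, genome_length):
    """Fraction of a genome covered by at least one half-open [start, end) interval."""
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Gap before this interval: close the previous merged block.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent interval: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


if __name__ == "__main__":
    # Hypothetical mapped transcripts on a 100 bp toy genome.
    print(covered_fraction([(0, 50), (40, 90), (95, 100)], 100))
```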

  15. Family-Based Genome-Wide Association Scan of Attention-Deficit/Hyperactivity Disorder

    ERIC Educational Resources Information Center

    Mick, Eric; Todorov, Alexandre; Smalley, Susan; Hu, Xiaolan; Loo, Sandra; Todd, Richard D.; Biederman, Joseph; Byrne, Deirdre; Dechairo, Bryan; Guiney, Allan; McCracken, James; McGough, James; Nelson, Stanley F.; Reiersen, Angela M.; Wilens, Timothy E.; Wozniak, Janet; Neale, Benjamin M.; Faraone, Stephen V.

    2010-01-01

    Objective: Genes likely play a substantial role in the etiology of attention-deficit/hyperactivity disorder (ADHD). However, the genetic architecture of the disorder is unknown, and prior genome-wide association studies (GWAS) have not identified a genome-wide significant association. We have conducted a third, independent, multisite GWAS of…

  16. Gigwa-Genotype investigator for genome-wide analyses.

    PubMed

    Sempéré, Guilhem; Philippe, Florian; Dereeper, Alexis; Ruiz, Manuel; Sarah, Gautier; Larmande, Pierre

    2016-06-06

    Exploring the structure of genomes and analyzing their evolution is essential to understanding the ecological adaptation of organisms. However, with the large amounts of data being produced by next-generation sequencing, computational challenges arise in terms of storage, search, sharing, analysis and visualization. This is particularly true with regards to studies of genomic variation, which are currently lacking scalable and user-friendly data exploration solutions. Here we present Gigwa, a web-based tool that provides an easy and intuitive way to explore large amounts of genotyping data by filtering it not only on the basis of variant features, including functional annotations, but also on genotype patterns. The data storage relies on MongoDB, which offers good scalability properties. Gigwa can handle multiple databases and may be deployed in either single- or multi-user mode. In addition, it provides a wide range of popular export formats. The Gigwa application is suitable for managing large amounts of genomic variation data. Its user-friendly web interface makes such processing widely accessible. It can either be simply deployed on a workstation or be used to provide a shared data portal for a given community of researchers.

  17. Genome-wide analysis of WRKY transcription factors in Solanum lycopersicum.

    PubMed

    Huang, Shengxiong; Gao, Yongfeng; Liu, Jikai; Peng, Xiaoli; Niu, Xiangli; Fei, Zhangjun; Cao, Shuqing; Liu, Yongsheng

    2012-06-01

    The WRKY transcription factors have been implicated in multiple biological processes in plants, especially in regulating defense against biotic and abiotic stresses. However, little information is available about the WRKYs in tomato (Solanum lycopersicum). The recent release of the whole-genome sequence of tomato allowed us to perform a genome-wide investigation for tomato WRKY proteins, and to compare these positively identified proteins with their orthologs in model plants, such as Arabidopsis and rice. In the present study, based on the recently released tomato whole-genome sequences, we identified 81 SlWRKY genes that were classified into three main groups, with the second group further divided into five subgroups. Depending on WRKY domains' sequences derived from tomato, Arabidopsis and rice, construction of a phylogenetic tree demonstrated distinct clustering and unique gene expansion of WRKY genes among the three species. Genome mapping analysis revealed that tomato WRKY genes were enriched on several chromosomes, especially on chromosome 5, and 16 % of the family members were tandemly duplicated genes. The tomato WRKYs from each group were shown to share similar motif compositions. Furthermore, tomato WRKY genes showed distinct temporal and spatial expression patterns in different developmental processes and in response to various biotic and abiotic stresses. The expression of 18 selected tomato WRKY genes in response to drought and salt stresses and Pseudomonas syringae invasion, respectively, was validated by quantitative RT-PCR. Our results will provide a platform for functional identification and molecular breeding study of WRKY genes in tomato and probably other Solanaceae plants.

  18. Genome-wide association studies and resting heart rate.

    PubMed

    Kilpeläinen, Tuomas O

    Genome-wide association studies (GWASs) have revolutionized the search for genetic variants regulating resting heart rate. In the last 10 years, GWASs have led to the identification of at least 21 novel heart rate loci. These discoveries have provided valuable insights into the mechanisms and pathways that regulate heart rate and link heart rate to cardiovascular morbidity and mortality. GWASs capture the majority of genetic variation in a population sample by utilizing high-throughput genotyping chips measuring genotypes for up to several millions of SNPs across the genome in thousands of individuals. This allows the identification of the strongest heart rate associated signals at genome-wide level. While GWASs provide robust statistical evidence of the association of a given genetic locus with heart rate, they are only the starting point for detailed follow-up studies to locate the causal variants and genes and gain further insights into the biological mechanisms underlying the observed associations. Copyright © 2016 Elsevier Inc. All rights reserved.

  19. Case-Control Genome-Wide Association Study of Attention-Deficit/Hyperactivity Disorder

    ERIC Educational Resources Information Center

    Neale, Benjamin M.; Medland, Sarah; Ripke, Stephan; Anney, Richard J. L.; Asherson, Philip; Buitelaar, Jan; Franke, Barbara; Gill, Michael; Kent, Lindsey; Holmans, Peter; Middleton, Frank; Thapar, Anita; Lesch, Klaus-Peter; Faraone, Stephen V.; Daly, Mark; Nguyen, Thuy Trang; Schafer, Helmut; Steinhausen, Hans-Christoph; Reif, Andreas; Renner, Tobias J.; Romanos, Marcel; Romanos, Jasmin; Warnke, Andreas; Walitza, Susanne; Freitag, Christine; Meyer, Jobst; Palmason, Haukur; Rothenberger, Aribert; Hawi, Ziarih; Sergeant, Joseph; Roeyers, Herbert; Mick, Eric; Biederman, Joseph

    2010-01-01

    Objective: Although twin and family studies have shown attention-deficit/hyperactivity disorder (ADHD) to be highly heritable, genetic variants influencing the trait at a genome-wide significant level have yet to be identified. Thus additional genome-wide association studies (GWAS) are needed. Method: We used case-control analyses of 896 cases…

  20. Discovering modes of action for therapeutic compounds using a genome-wide screen of yeast heterozygotes.

    PubMed

    Lum, Pek Yee; Armour, Christopher D; Stepaniants, Sergey B; Cavet, Guy; Wolf, Maria K; Butler, J Scott; Hinshaw, Jerald C; Garnier, Philippe; Prestwich, Glenn D; Leonardson, Amy; Garrett-Engele, Philip; Rush, Christopher M; Bard, Martin; Schimmack, Greg; Phillips, John W; Roberts, Christopher J; Shoemaker, Daniel D

    2004-01-09

    Modern medicine faces the challenge of developing safer and more effective therapies to treat human diseases. Many drugs currently in use were discovered without knowledge of their underlying molecular mechanisms. Understanding their biological targets and modes of action will be essential to design improved second-generation compounds. Here, we describe the use of a genome-wide pool of tagged heterozygotes to assess the cellular effects of 78 compounds in Saccharomyces cerevisiae. Specifically, lanosterol synthase in the sterol biosynthetic pathway was identified as a target of the antianginal drug molsidomine, which may explain its cholesterol-lowering effects. Further, the rRNA processing exosome was identified as a potential target of the cell growth inhibitor 5-fluorouracil. This genome-wide screen validated previously characterized targets or helped identify potentially new modes of action for over half of the compounds tested, providing proof of this principle for analyzing the modes of action of clinically relevant compounds.

  1. snpGeneSets: An R Package for Genome-Wide Study Annotation

    PubMed Central

    Mei, Hao; Li, Lianna; Jiang, Fan; Simino, Jeannette; Griswold, Michael; Mosley, Thomas; Liu, Shijian

    2016-01-01

    Genome-wide studies (GWS) of SNP associations and differential gene expressions have generated abundant results; next-generation sequencing technology has further boosted the number of variants and genes identified. Effective interpretation requires massive annotation and downstream analysis of these genome-wide results, a computationally challenging task. We developed the snpGeneSets package to simplify annotation and analysis of GWS results. Our package integrates local copies of knowledge bases for SNPs, genes, and gene sets, and implements wrapper functions in the R language to enable transparent access to low-level databases for efficient annotation of large genomic data. The package contains functions that execute three types of annotations: (1) genomic mapping annotation for SNPs and genes and functional annotation for gene sets; (2) bidirectional mapping between SNPs and genes, and genes and gene sets; and (3) calculation of gene effect measures from SNP associations and performance of gene set enrichment analyses to identify functional pathways. We applied snpGeneSets to type 2 diabetes (T2D) results from the NHGRI genome-wide association study (GWAS) catalog, a Finnish GWAS, and a genome-wide expression study (GWES). These studies demonstrate the usefulness of snpGeneSets for annotating and performing enrichment analysis of GWS results. The package is open-source, free, and can be downloaded at: https://www.umc.edu/biostats_software/. PMID:27807048

  2. Genome-wide DNA polymorphisms in two cultivars of mei (Prunus mume sieb. et zucc.).

    PubMed

    Sun, Lidan; Zhang, Qixiang; Xu, Zongda; Yang, Weiru; Guo, Yu; Lu, Jiuxing; Pan, Huitang; Cheng, Tangren; Cai, Ming

    2013-10-06

    Mei (Prunus mume Sieb. et Zucc.) is a famous ornamental plant and fruit crop grown in East Asian countries. Limited genetic resources, especially molecular markers, have hindered the progress of mei breeding projects. Here, we performed low-depth whole-genome sequencing of Prunus mume 'Fenban' and Prunus mume 'Kouzi Yudie' to identify high-quality polymorphic markers between the two cultivars on a large scale. A total of 1464.1 Mb and 1422.1 Mb of 'Fenban' and 'Kouzi Yudie' sequencing data were uniquely mapped to the mei reference genome with about 6-fold coverage, respectively. We detected a large number of putative polymorphic markers from the 196.9 Mb of sequencing data shared by the two cultivars, which together contained 200,627 SNPs, 4,900 InDels, and 7,063 SSRs. Among these markers, 38,773 SNPs, 174 InDels, and 418 SSRs were distributed in the 22.4 Mb CDS region, and 63.0% of these marker-containing CDS sequences were assigned to GO terms. Subsequently, 670 selected SNPs were validated using Agilent's SureSelect solution phase hybridization assay. A subset of 599 SNPs was used to assess the genetic similarity of a panel of mei germplasm samples and a plum (P. salicina) cultivar, producing a set of informative diversity data. We also analyzed the frequency and distribution of detected InDels and SSRs in the mei genome and validated their usefulness as DNA markers. These markers were successfully amplified in the cultivars and in their segregating progeny. A large set of high-quality polymorphic SNPs, InDels, and SSRs were identified in parallel between 'Fenban' and 'Kouzi Yudie' using low-depth whole-genome sequencing. The study presents extensive data on these polymorphic markers, which can be useful for constructing high-resolution genetic maps, performing genome-wide association studies, and designing genomic selection strategies in mei.

  3. CONAN: copy number variation analysis software for genome-wide association studies

    PubMed Central

    2010-01-01

    Background: Genome-wide association studies (GWAS) based on single nucleotide polymorphisms (SNPs) revolutionized our perception of the genetic regulation of complex traits and diseases. Copy number variations (CNVs) promise to shed additional light on the genetic basis of monogenic as well as complex diseases and phenotypes. Indeed, the number of detected associations between CNVs and certain phenotypes is constantly increasing. However, while several software packages support the determination of CNVs from SNP chip data, the downstream statistical inference of CNV-phenotype associations is still subject to complicated and inefficient in-house solutions, thus strongly limiting the performance of GWAS based on CNVs. Results: CONAN is a freely available client-server software solution which provides an intuitive graphical user interface for categorizing, analyzing and associating CNVs with phenotypes. Moreover, CONAN assists the evaluation process by visualizing detected associations via Manhattan plots in order to enable a rapid identification of genome-wide significant CNV regions. Various file formats including the information on CNVs in population samples are supported as input data. Conclusions: CONAN facilitates the performance of GWAS based on CNVs and the visual analysis of calculated results. CONAN provides a rapid, valid and straightforward software solution to identify genetic variation underlying the 'missing' heritability for complex traits that remains unexplained by recent GWAS. The freely available software can be downloaded at http://genepi-conan.i-med.ac.at. PMID:20546565

  4. International genome-wide meta-analysis identifies new primary biliary cirrhosis risk loci and targetable pathogenic pathways.

    PubMed

    Cordell, Heather J; Han, Younghun; Mells, George F; Li, Yafang; Hirschfield, Gideon M; Greene, Casey S; Xie, Gang; Juran, Brian D; Zhu, Dakai; Qian, David C; Floyd, James A B; Morley, Katherine I; Prati, Daniele; Lleo, Ana; Cusi, Daniele; Gershwin, M Eric; Anderson, Carl A; Lazaridis, Konstantinos N; Invernizzi, Pietro; Seldin, Michael F; Sandford, Richard N; Amos, Christopher I; Siminovitch, Katherine A

    2015-09-22

    Primary biliary cirrhosis (PBC) is a classical autoimmune liver disease for which effective immunomodulatory therapy is lacking. Here we perform meta-analyses of discovery data sets from genome-wide association studies of European subjects (n=2,764 cases and 10,475 controls) followed by validation genotyping in an independent cohort (n=3,716 cases and 4,261 controls). We discover and validate six previously unknown risk loci for PBC (P_combined < 5 × 10^-8) and use pathway analysis to identify JAK-STAT/IL12/IL27 signalling and cytokine-cytokine pathways, for which relevant therapies exist.

  5. Genome-wide Association Study Identifies New Loci for Resistance to Leptosphaeria maculans in Canola

    PubMed Central

    Raman, Harsh; Raman, Rosy; Coombes, Neil; Song, Jie; Diffey, Simon; Kilian, Andrzej; Lindbeck, Kurt; Barbulescu, Denise M.; Batley, Jacqueline; Edwards, David; Salisbury, Phil A.; Marcroft, Steve

    2016-01-01

    Key message “We identified both qualitative and quantitative resistance loci to Leptosphaeria maculans, a fungal pathogen, causing blackleg disease in canola. Several genome-wide significant associations were detected at known and new loci for blackleg resistance. We further validated statistically significant associations in four genetic mapping populations, demonstrating that GWAS marker loci are indeed associated with resistance to L. maculans. One of the novel loci identified for the first time, Rlm12, conveys adult plant resistance in canola.” Blackleg, caused by Leptosphaeria maculans, is a significant disease which affects the sustainable production of canola (Brassica napus). This study reports a genome-wide association study based on 18,804 polymorphic SNPs to identify loci associated with qualitative and quantitative resistance to L. maculans. Genomic regions delimited with 694 significant SNP markers, that are associated with resistance evaluated using 12 single spore isolates and pathotypes from four canola stubble were identified. Several significant associations were detected at known disease resistance loci including in the vicinity of recently cloned Rlm2/LepR3 genes, and at new loci on chromosomes A01/C01, A02/C02, A03/C03, A05/C05, A06, A08, and A09. In addition, we validated statistically significant associations on A01, A07, and A10 in four genetic mapping populations, demonstrating that GWAS marker loci are indeed associated with resistance to L. maculans. One of the novel loci identified for the first time, Rlm12, conveys adult plant resistance and mapped within 13.2 kb from Arabidopsis R gene of TIR-NBS class. We showed that resistance loci are located in the vicinity of R genes of Arabidopsis thaliana and Brassica napus on the sequenced genome of B. napus cv. Darmor-bzh. Significantly associated SNP markers provide a valuable tool to enrich germplasm for favorable alleles in order to improve the level of resistance to L. maculans in

  6. Genome-wide comparative analysis of four Indian Drosophila species.

    PubMed

    Mohanty, Sujata; Khanna, Radhika

    2017-12-01

    Comparative analysis of multiple genomes of closely or distantly related Drosophila species undoubtedly creates excitement among evolutionary biologists in exploring the genomic changes with an ecology and evolutionary perspective. We present herewith the de novo assembled whole genome sequences of four Drosophila species, D. bipectinata, D. takahashii, D. biarmipes and D. nasuta of Indian origin using Next Generation Sequencing technology on an Illumina platform along with their detailed assembly statistics. The comparative genomics analysis, e.g. gene predictions and annotations, functional and orthogroup analysis of coding sequences and genome wide SNP distribution were performed. The whole genome of Zaprionus indianus of Indian origin published earlier by us and the genome sequences of previously sequenced 12 Drosophila species available in the NCBI database were included in the analysis. The present work is a part of our ongoing genomics project of Indian Drosophila species.

  7. Genome-wide association studies and epigenome-wide association studies go together in cancer control

    PubMed Central

    Verma, Mukesh

    2016-01-01

    Completion of the human genome a decade ago laid the foundation for: using genetic information in assessing risk to identify individuals and populations that are likely to develop cancer, and designing treatments based on a person's genetic profiling (precision medicine). Genome-wide association studies (GWAS) completed during the past few years have identified risk-associated single nucleotide polymorphisms that can be used as screening tools in epidemiologic studies of a variety of tumor types. This led to the conduct of epigenome-wide association studies (EWAS). This article discusses the current status, challenges and research opportunities in GWAS and EWAS. Information gained from GWAS and EWAS has potential applications in cancer control and treatment. PMID:27079684

  8. Genotypic variability-based genome-wide association study identifies non-additive loci HLA-C and IL12B for psoriasis.

    PubMed

    Wei, Wen-Hua; Massey, Jonathan; Worthington, Jane; Barton, Anne; Warren, Richard B

    2018-03-01

    Genome-wide association studies (GWASs) have identified a number of loci for psoriasis but largely ignored non-additive effects. We report a genotypic variability-based GWAS (vGWAS) that can prioritize non-additive loci without requiring prior knowledge of interaction types or interacting factors. It proceeds in two steps: first using a mixed model to partition dichotomous phenotypes into an additive component and non-additive environmental residuals on the liability scale, and then applying Levene's (Brown-Forsythe) test to assess equality of the residual variances across genotype groups genome-wide. The vGWAS identified two genome-wide significant (P < 5.0e-08) non-additive loci HLA-C and IL12B that were also genome-wide significant in an accompanying GWAS in the discovery cohort. Both loci were statistically replicated in vGWAS of an independent cohort with a small sample size. HLA-C and IL12B were reported in moderate gene-gene and/or gene-environment interactions on several occasions. We found a moderate interaction with age-of-onset of psoriasis, which was replicated indirectly. The vGWAS also revealed five suggestive loci (P < 6.76e-05) including FUT2 that was associated with psoriasis with environmental aspects triggered by virus infection and/or metabolic factors. Replication and functional investigation are needed to validate the suggestive vGWAS loci.
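
    The second step, testing whether residual variance differs across genotype groups, can be sketched with a hand-written Brown-Forsythe statistic. This is a minimal, standard-library-only illustration: the function name and the residual values are hypothetical, the mixed-model step is not shown, and the p-value (from an F distribution with k-1 and N-k degrees of freedom) is omitted to avoid external dependencies.

```python
from statistics import mean, median


def brown_forsythe(groups):
    """Brown-Forsythe statistic W for equality of variances across groups.

    Each group is a list of values (here: residuals for one genotype class).
    Larger W indicates stronger evidence that the group variances differ.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviations of each value from its own group's median.
    medians = [median(g) for g in groups]
    z = [[abs(y - m) for y in g] for g, m in zip(groups, medians)]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    # One-way ANOVA F statistic computed on the absolute deviations.
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


if __name__ == "__main__":
    # Hypothetical residuals for two genotype groups with different spread.
    print(brown_forsythe([[1, 2, 3], [0, 10, 20]]))
```

    In a vGWAS this statistic would be evaluated once per SNP, with individuals grouped by genotype (0, 1, or 2 copies of an allele).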

  9. Microfluidics for genome-wide studies involving next generation sequencing

    PubMed Central

    Murphy, Travis W.; Lu, Chang

    2017-01-01

    Next-generation sequencing (NGS) has revolutionized how molecular biology studies are conducted. Its decreasing cost and increasing throughput permit profiling of genomic, transcriptomic, and epigenomic features for a wide range of applications. Microfluidics has been proven to be highly complementary to NGS technology with its unique capabilities for handling small volumes of samples and providing platforms for automation, integration, and multiplexing. In this article, we review recent progress on applying microfluidics to facilitate genome-wide studies. We emphasize several technical aspects of NGS and how they benefit from coupling with microfluidic technology. We also summarize recent efforts on developing microfluidic technology for genomic, transcriptomic, and epigenomic studies, with emphasis on single cell analysis. We envision rapid growth in these directions, driven by the needs for testing scarce primary cell samples from patients in the context of precision medicine. PMID:28396707

  10. Web-Based Genome-Wide Association Study Identifies Two Novel Loci and a Substantial Genetic Component for Parkinson's Disease

    PubMed Central

    Do, Chuong B.; Tung, Joyce Y.; Dorfman, Elizabeth; Kiefer, Amy K.; Drabant, Emily M.; Francke, Uta; Mountain, Joanna L.; Goldman, Samuel M.; Tanner, Caroline M.; Langston, J. William; Wojcicki, Anne; Eriksson, Nicholas

    2011-01-01

    Although the causes of Parkinson's disease (PD) are thought to be primarily environmental, recent studies suggest that a number of genes influence susceptibility. Using targeted case recruitment and online survey instruments, we conducted the largest case-control genome-wide association study (GWAS) of PD based on a single collection of individuals to date (3,426 cases and 29,624 controls). We discovered two novel, genome-wide significant associations with PD (rs6812193 near SCARB2 and rs11868035 near SREBF1/RAI1), both replicated in an independent cohort. We also replicated 20 previously discovered genetic associations (including LRRK2, GBA, SNCA, MAPT, GAK, and the HLA region), providing support for our novel study design. Relying on a recently proposed method based on genome-wide sharing estimates between distantly related individuals, we estimated the heritability of PD to be at least 0.27. Finally, using sparse regression techniques, we constructed predictive models that account for 6%–7% of the total variance in liability and that suggest the presence of true associations just beyond genome-wide significance, as confirmed through both internal and external cross-validation. These results indicate a substantial, but by no means total, contribution of genetics underlying susceptibility to both early-onset and late-onset PD, suggesting that, despite the novel associations discovered here and elsewhere, the majority of the genetic component for Parkinson's disease remains to be discovered. PMID:21738487

  11. The Glyphosate-Based Herbicide Roundup Does not Elevate Genome-Wide Mutagenesis of Escherichia coli.

    PubMed

    Tincher, Clayton; Long, Hongan; Behringer, Megan; Walker, Noah; Lynch, Michael

    2017-10-05

    Mutations induced by pollutants may promote pathogen evolution, for example by accelerating mutations conferring antibiotic resistance. Generally, evaluating the genome-wide mutagenic effects of long-term sublethal pollutant exposure at single-nucleotide resolution is extremely difficult. To overcome this technical barrier, we use the mutation accumulation/whole-genome sequencing (MA/WGS) method as a mutagenicity test, to quantitatively evaluate genome-wide mutagenesis of Escherichia coli after long-term exposure to a wide gradient of the glyphosate-based herbicide (GBH) Roundup Concentrate Plus. The genome-wide mutation rate decreases as GBH concentration increases, suggesting that even long-term GBH exposure does not compromise the genome stability of bacteria. Copyright © 2017 Tincher et al.

  12. Genome-wide association and genomic prediction of resistance to viral nervous necrosis in European sea bass (Dicentrarchus labrax) using RAD sequencing.

    PubMed

    Palaiokostas, Christos; Cariou, Sophie; Bestin, Anastasia; Bruant, Jean-Sebastien; Haffray, Pierrick; Morin, Thierry; Cabon, Joëlle; Allal, François; Vandeputte, Marc; Houston, Ross D

    2018-06-08

    European sea bass (Dicentrarchus labrax) is one of the most important species for European aquaculture. Viral nervous necrosis (VNN), commonly caused by the redspotted grouper nervous necrosis virus (RGNNV), can result in high levels of morbidity and mortality, mainly during the larval and juvenile stages of cultured sea bass. In the absence of efficient therapeutic treatments, selective breeding for host resistance offers a promising strategy to control this disease. Our study aimed at investigating genetic resistance to VNN and genomic-based approaches to improve disease resistance by selective breeding. A population of 1538 sea bass juveniles from a factorial cross between 48 sires and 17 dams was challenged with RGNNV with mortalities and survivors being recorded and sampled for genotyping by the RAD sequencing approach. We used genome-wide genotype data from 9195 single nucleotide polymorphisms (SNPs) for downstream analysis. Estimates of heritability of survival on the underlying scale for the pedigree and genomic relationship matrices were 0.27 (HPD interval 95%: 0.14-0.40) and 0.43 (0.29-0.57), respectively. Classical genome-wide association analysis detected genome-wide significant quantitative trait loci (QTL) for resistance to VNN on chromosomes (unassigned scaffolds in the case of 'chromosome' 25) 3, 20 and 25 (P < 1e-06). Weighted genomic best linear unbiased predictor provided additional support for the QTL on chromosome 3 and suggested that it explained 4% of the additive genetic variation. Genomic prediction approaches were tested to investigate the potential of using genome-wide SNP data to estimate breeding values for resistance to VNN and showed that genomic prediction resulted in a 13% increase in successful classification of resistant and susceptible animals compared to pedigree-based methods, with Bayes A and Bayes B giving the highest predictive ability. Genome-wide significant QTL were identified but each with relatively small effects on

  13. ARG-based genome-wide analysis of cacao cultivars.

    PubMed

    Utro, Filippo; Cornejo, Omar Eduardo; Livingstone, Donald; Motamayor, Juan Carlos; Parida, Laxmi

    2012-01-01

    Ancestral recombinations graph (ARG) is a topological structure that captures the relationship between the extant genomic sequences in terms of genetic events including recombinations. IRiS is a system that estimates the ARG on sequences of individuals, at genomic scales, capturing the relationship between these individuals of the species. Recently, this system was used to estimate the ARG of the recombining X Chromosome of a collection of human populations using relatively dense, bi-allelic SNP data. While the ARG is a natural model for capturing the inter-relationship between a single chromosome of the individuals of a species, it is not immediately apparent how the model can utilize whole-genome (across chromosomes) diploid data. Also, the sheer complexity of an ARG structure presents a challenge to graph visualization techniques. In this paper we examine the ARG reconstruction for (1) genome-wide or multiple chromosomes, (2) multi-allelic and (3) extremely sparse data. To aid in the visualization of the results of the reconstructed ARG, we additionally construct a much simplified topology, a classification tree, suggested by the ARG. As the test case, we study the problem of extracting the relationship between populations of Theobroma cacao. The chocolate tree is an outcrossing species in the wild, due to self-incompatibility mechanisms at play. Thus a principled approach to understanding the inter-relationships between the different populations must take the shuffling of the genomic segments into account. The polymorphisms in the test data are short tandem repeats (STR) and are multi-allelic (sometimes as high as 30 distinct possible values at a locus). Each is at a genomic location that is bilaterally transmitted, hence the ARG is a natural model for this data. Another characteristic of this plant data set is that while it is genome-wide, across 10 linkage groups or chromosomes, it is very sparse, i.e., only 96 loci from a genome of approximately 400 megabases

  14. ARG-based genome-wide analysis of cacao cultivars

    PubMed Central

    2012-01-01

    Background: Ancestral recombinations graph (ARG) is a topological structure that captures the relationship between the extant genomic sequences in terms of genetic events including recombinations. IRiS is a system that estimates the ARG on sequences of individuals, at genomic scales, capturing the relationship between these individuals of the species. Recently, this system was used to estimate the ARG of the recombining X Chromosome of a collection of human populations using relatively dense, bi-allelic SNP data. Results: While the ARG is a natural model for capturing the inter-relationship between a single chromosome of the individuals of a species, it is not immediately apparent how the model can utilize whole-genome (across chromosomes) diploid data. Also, the sheer complexity of an ARG structure presents a challenge to graph visualization techniques. In this paper we examine the ARG reconstruction for (1) genome-wide or multiple chromosomes, (2) multi-allelic and (3) extremely sparse data. To aid in the visualization of the results of the reconstructed ARG, we additionally construct a much simplified topology, a classification tree, suggested by the ARG. As the test case, we study the problem of extracting the relationship between populations of Theobroma cacao. The chocolate tree is an outcrossing species in the wild, due to self-incompatibility mechanisms at play. Thus a principled approach to understanding the inter-relationships between the different populations must take the shuffling of the genomic segments into account. The polymorphisms in the test data are short tandem repeats (STR) and are multi-allelic (sometimes as high as 30 distinct possible values at a locus). Each is at a genomic location that is bilaterally transmitted, hence the ARG is a natural model for this data. Another characteristic of this plant data set is that while it is genome-wide, across 10 linkage groups or chromosomes, it is very sparse, i.e., only 96 loci from a genome of

  15. Genome-wide scans for loci under selection in humans

    PubMed Central

    2005-01-01

    Natural selection, which can be defined as the differential contribution of genetic variants to future generations, is the driving force of Darwinian evolution. Identifying regions of the human genome that have been targets of natural selection is an important step in clarifying human evolutionary history and understanding how genetic variation results in phenotypic diversity; it may also facilitate the search for complex disease genes. Technological advances in high-throughput DNA sequencing and single nucleotide polymorphism genotyping have enabled several genome-wide scans of natural selection to be undertaken. Here, some of the observations that are beginning to emerge from these studies will be reviewed, including evidence for geographically restricted selective pressures (i.e., local adaptation) and a relationship between genes subject to natural selection and human disease. In addition, the paper will highlight several important problems that need to be addressed in future genome-wide studies of natural selection. PMID:16004726

  16. Meta-analysis of 32 genome-wide linkage studies of schizophrenia

    PubMed Central

    Ng, MYM; Levinson, DF; Faraone, SV; Suarez, BK; DeLisi, LE; Arinami, T; Riley, B; Paunio, T; Pulver, AE; Irmansyah; Holmans, PA; Escamilla, M; Wildenauer, DB; Williams, NM; Laurent, C; Mowry, BJ; Brzustowicz, LM; Maziade, M; Sklar, P; Garver, DL; Abecasis, GR; Lerer, B; Fallin, MD; Gurling, HMD; Gejman, PV; Lindholm, E; Moises, HW; Byerley, W; Wijsman, EM; Forabosco, P; Tsuang, MT; Hwu, H-G; Okazaki, Y; Kendler, KS; Wormley, B; Fanous, A; Walsh, D; O’Neill, FA; Peltonen, L; Nestadt, G; Lasseter, VK; Liang, KY; Papadimitriou, GM; Dikeos, DG; Schwab, SG; Owen, MJ; O’Donovan, MC; Norton, N; Hare, E; Raventos, H; Nicolini, H; Albus, M; Maier, W; Nimgaonkar, VL; Terenius, L; Mallet, J; Jay, M; Godard, S; Nertney, D; Alexander, M; Crowe, RR; Silverman, JM; Bassett, AS; Roy, M-A; Mérette, C; Pato, CN; Pato, MT; Roos, J Louw; Kohn, Y; Amann-Zalcenstein, D; Kalsi, G; McQuillin, A; Curtis, D; Brynjolfson, J; Sigmundsson, T; Petursson, H; Sanders, AR; Duan, J; Jazin, E; Myles-Worsley, M; Karayiorgou, M; Lewis, CM

    2009-01-01

    A genome scan meta-analysis (GSMA) was carried out on 32 independent genome-wide linkage scan analyses that included 3255 pedigrees with 7413 genotyped cases affected with schizophrenia (SCZ) or related disorders. The primary GSMA divided the autosomes into 120 bins, rank-ordered the bins within each study according to the most positive linkage result in each bin, summed these ranks (weighted for study size) for each bin across studies and determined the empirical probability of a given summed rank (PSR) by simulation. Suggestive evidence for linkage was observed in two single bins, on chromosomes 5q (142-168 Mb) and 2q (103-134 Mb). Genome-wide evidence for linkage was detected on chromosome 2q (119-152 Mb) when bin boundaries were shifted to the middle of the previous bins. The primary analysis met empirical criteria for ‘aggregate’ genome-wide significance, indicating that some or all of 10 bins are likely to contain loci linked to SCZ, including regions of chromosomes 1, 2q, 3q, 4q, 5q, 8p and 10q. In a secondary analysis of 22 studies of European-ancestry samples, suggestive evidence for linkage was observed on chromosome 8p (16-33 Mb). Although the newer genome-wide association methodology has greater power to detect weak associations to single common DNA sequence variants, linkage analysis can detect diverse genetic effects that segregate in families, including multiple rare variants within one locus or several weakly associated loci in the same region. Therefore, the regions supported by this meta-analysis deserve close attention in future studies. PMID:19349958
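
    The rank-based procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' software: the function names, the convention that a higher rank means stronger linkage evidence, the use of caller-supplied weights, and the simple per-bin simulation are all assumptions made here for clarity.

```python
import random

def gsma_summed_ranks(study_scores, weights):
    """Weighted summed rank per bin.

    study_scores: one list per study holding the most positive linkage
    score observed in each bin (all lists have the same length).
    weights: one study-size weight per study (weighting scheme is assumed).
    """
    n_bins = len(study_scores[0])
    summed = [0.0] * n_bins
    for scores, w in zip(study_scores, weights):
        # Rank bins within the study: weakest bin gets rank 1,
        # strongest bin gets rank n_bins (assumed convention).
        order = sorted(range(n_bins), key=lambda b: scores[b])
        for rank, b in enumerate(order, start=1):
            summed[b] += w * rank
    return summed

def empirical_psr(observed, weights, n_bins, n_sim=5000, seed=0):
    """Empirical probability of a summed rank at least as large as observed.

    Under the null, a bin's rank in each study is assumed uniform on
    1..n_bins and independent across studies.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        simulated = sum(w * rng.randint(1, n_bins) for w in weights)
        if simulated >= observed:
            hits += 1
    return hits / n_sim

# Toy data: two studies, three bins (values are made up for illustration).
scores = [[0.1, 2.0, 0.5], [0.3, 1.5, 0.2]]
weights = [1.0, 1.0]
summed = gsma_summed_ranks(scores, weights)
p_top = empirical_psr(max(summed), weights, n_bins=3)
```

    Ties are ignored in this sketch; a real analysis would need a tie-handling rule and far more bins and studies than the toy input.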

  17. Meta-Analysis of Genome-Wide Association Studies of Attention-Deficit/Hyperactivity Disorder

    ERIC Educational Resources Information Center

    Neale, Benjamin M.; Medland, Sarah E.; Ripke, Stephan; Asherson, Philip; Franke, Barbara; Lesch, Klaus-Peter; Faraone, Stephen V.; Nguyen, Thuy Trang; Schafer, Helmut; Holmans, Peter; Daly, Mark; Steinhausen, Hans-Christoph; Freitag, Christine; Reif, Andreas; Renner, Tobias J.; Romanos, Marcel; Romanos, Jasmin; Walitza, Susanne; Warnke, Andreas; Meyer, Jobst; Palmason, Haukur; Buitelaar, Jan; Vasquez, Alejandro Arias; Lambregts-Rommelse, Nanda; Gill, Michael; Anney, Richard J. L.; Langely, Kate; O'Donovan, Michael; Williams, Nigel; Owen, Michael; Thapar, Anita; Kent, Lindsey; Sergeant, Joseph; Roeyers, Herbert; Mick, Eric; Biederman, Joseph; Doyle, Alysa; Smalley, Susan; Loo, Sandra; Hakonarson, Hakon; Elia, Josephine; Todorov, Alexandre; Miranda, Ana; Mulas, Fernando; Ebstein, Richard P.; Rothenberger, Aribert; Banaschewski, Tobias; Oades, Robert D.; Sonuga-Barke, Edmund; McGough, James; Nisenbaum, Laura; Middleton, Frank; Hu, Xiaolan; Nelson, Stan

    2010-01-01

    Objective: Although twin and family studies have shown attention-deficit/hyperactivity disorder (ADHD) to be highly heritable, genetic variants influencing the trait at a genome-wide significant level have yet to be identified. As prior genome-wide association studies (GWAS) have not yielded significant results, we conducted a meta-analysis of…

  18. Genome-wide association study of Tourette Syndrome

    PubMed Central

    Scharf, Jeremiah M.; Yu, Dongmei; Mathews, Carol A.; Neale, Benjamin M.; Stewart, S. Evelyn; Fagerness, Jesen A; Evans, Patrick; Gamazon, Eric; Edlund, Christopher K.; Service, Susan; Tikhomirov, Anna; Osiecki, Lisa; Illmann, Cornelia; Pluzhnikov, Anna; Konkashbaev, Anuar; Davis, Lea K; Han, Buhm; Crane, Jacquelyn; Moorjani, Priya; Crenshaw, Andrew T.; Parkin, Melissa A.; Reus, Victor I.; Lowe, Thomas L.; Rangel-Lugo, Martha; Chouinard, Sylvain; Dion, Yves; Girard, Simon; Cath, Danielle C; Smit, Jan H; King, Robert A.; Fernandez, Thomas; Leckman, James F.; Kidd, Kenneth K.; Kidd, Judith R.; Pakstis, Andrew J.; State, Matthew; Herrera, Luis Diego; Romero, Roxana; Fournier, Eduardo; Sandor, Paul; Barr, Cathy L; Phan, Nam; Gross-Tsur, Varda; Benarroch, Fortu; Pollak, Yehuda; Budman, Cathy L.; Bruun, Ruth D.; Erenberg, Gerald; Naarden, Allan L; Lee, Paul C; Weiss, Nicholas; Kremeyer, Barbara; Berrío, Gabriel Bedoya; Campbell, Desmond; Silgado, Julio C. Cardona; Ochoa, William Cornejo; Restrepo, Sandra C. Mesa; Muller, Heike; Duarte, Ana V. Valencia; Lyon, Gholson J; Leppert, Mark; Morgan, Jubel; Weiss, Robert; Grados, Marco A.; Anderson, Kelley; Davarya, Sarah; Singer, Harvey; Walkup, John; Jankovic, Joseph; Tischfield, Jay A.; Heiman, Gary A.; Gilbert, Donald L.; Hoekstra, Pieter J.; Robertson, Mary M.; Kurlan, Roger; Liu, Chunyu; Gibbs, J. Raphael; Singleton, Andrew; Hardy, John; Strengman, Eric; Ophoff, Roel; Wagner, Michael; Moessner, Rainald; Mirel, Daniel B.; Posthuma, Danielle; Sabatti, Chiara; Eskin, Eleazar; Conti, David V.; Knowles, James A.; Ruiz-Linares, Andres; Rouleau, Guy A.; Purcell, Shaun; Heutink, Peter; Oostra, Ben A.; McMahon, William; Freimer, Nelson; Cox, Nancy J.; Pauls, David L.

    2012-01-01

    Tourette Syndrome (TS) is a developmental disorder that has one of the highest familial recurrence rates among neuropsychiatric diseases with complex inheritance. However, the identification of definitive TS susceptibility genes remains elusive. Here, we report the first genome-wide association study (GWAS) of TS in 1285 cases and 4964 ancestry-matched controls of European ancestry, including two European-derived population isolates, Ashkenazi Jews from North America and Israel, and French Canadians from Quebec, Canada. In a primary meta-analysis of GWAS data from these European ancestry samples, no markers achieved a genome-wide threshold of significance (p < 5 × 10^−8); the top signal was found in rs7868992 on chromosome 9q32 within COL27A1 (p = 1.85 × 10^−6). A secondary analysis including an additional 211 cases and 285 controls from two closely-related Latin-American population isolates from the Central Valley of Costa Rica and Antioquia, Colombia also identified rs7868992 as the top signal (p = 3.6 × 10^−7 for the combined sample of 1496 cases and 5249 controls following imputation with 1000 Genomes data). This study lays the groundwork for the eventual identification of common TS susceptibility variants in larger cohorts and helps to provide a more complete understanding of the full genetic architecture of this disorder. PMID:22889924

  19. Genome-Wide Association Study and Linkage Analysis of the Healthy Aging Index

    PubMed Central

    Minster, Ryan L.; Sanders, Jason L.; Singh, Jatinder; Kammerer, Candace M.; Barmada, M. Michael; Matteini, Amy M.; Zhang, Qunyuan; Wojczynski, Mary K.; Daw, E. Warwick; Brody, Jennifer A.; Arnold, Alice M.; Lunetta, Kathryn L.; Murabito, Joanne M.; Christensen, Kaare; Perls, Thomas T.; Province, Michael A.

    2015-01-01

    Background. The Healthy Aging Index (HAI) is a tool for measuring the extent of health and disease across multiple systems. Methods. We conducted a genome-wide association study and a genome-wide linkage analysis to map quantitative trait loci associated with the HAI and a modified HAI weighted for mortality risk in 3,140 individuals selected for familial longevity from the Long Life Family Study. The genome-wide association study used the Long Life Family Study as the discovery cohort and individuals from the Cardiovascular Health Study and the Framingham Heart Study as replication cohorts. Results. There were no genome-wide significant findings from the genome-wide association study; however, several single-nucleotide polymorphisms near ZNF704 on chromosome 8q21.13 were suggestively associated with the HAI in the Long Life Family Study (p < 10^−6) and nominally replicated in the Cardiovascular Health Study and Framingham Heart Study. Linkage results revealed significant evidence (log-odds score = 3.36) for a quantitative trait locus for mortality-optimized HAI in women on chromosome 9p24–p23. However, results of fine-mapping studies did not implicate any specific candidate genes within this region of interest. Conclusions. ZNF704 may be a potential candidate gene for studies of the genetic underpinnings of longevity. PMID:25758594

  20. Genome-wide association study of the four-constitution medicine.

    PubMed

    Yin, Chang Shik; Park, Hi Joon; Chung, Joo-Ho; Lee, Hye-Jung; Lee, Byung-Cheol

    2009-12-01

    Four-constitution medicine (FCM), also known as Sasang constitutional medicine, and the heritage of the long history of individualized acupuncture medicine tradition, is one of the holistic and traditional systems of constitution to appraise and categorize individual differences into four major types. This study first reports a genome-wide association study on FCM, to explore the genetic basis of FCM and facilitate the integration of FCM with conventional individual differences research. Healthy individuals of the Korean population were classified into the four constitutional types (FCTs). A total of 353,202 single nucleotide polymorphisms (SNPs) were typed using whole genome amplified samples, and six-way comparison of FCM types provided lists of significantly differential SNPs. In one-to-one FCT comparisons, 15,944 SNPs were significantly differential, and 5 SNPs were commonly significant in all of the three comparisons. In one-to-two FCT comparisons, 22,616 SNPs were significantly differential, and 20 SNPs were commonly significant in all of the three comparison groups. This study presents the association between genome-wide SNP profiles and the categorization of the FCM, and it could further provide a starting point of genome-based identification and research of the constitutions of FCM.

  1. RGAugury: a pipeline for genome-wide prediction of resistance gene analogs (RGAs) in plants.

    PubMed

    Li, Pingchuan; Quan, Xiande; Jia, Gaofeng; Xiao, Jin; Cloutier, Sylvie; You, Frank M

    2016-11-02

    Resistance gene analogs (RGAs), such as NBS-encoding proteins, receptor-like protein kinases (RLKs) and receptor-like proteins (RLPs), are potential R-genes that contain specific conserved domains and motifs. Thus, RGAs can be predicted based on their conserved structural features using bioinformatics tools. Computer programs have been developed for the identification of individual domains and motifs from the protein sequences of RGAs but none offer a systematic assessment of the different types of RGAs. A user-friendly and efficient pipeline is needed for large-scale genome-wide RGA predictions of the growing number of sequenced plant genomes. An integrative pipeline, named RGAugury, was developed to automate RGA prediction. The pipeline first identifies RGA-related protein domains and motifs, namely nucleotide binding site (NB-ARC), leucine rich repeat (LRR), transmembrane (TM), serine/threonine and tyrosine kinase (STTK), lysin motif (LysM), coiled-coil (CC) and Toll/Interleukin-1 receptor (TIR). RGA candidates are identified and classified into four major families based on the presence of combinations of these RGA domains and motifs: NBS-encoding, TM-CC, and membrane associated RLP and RLK. All time-consuming analyses of the pipeline are parallelized to improve performance. The pipeline was evaluated using the well-annotated Arabidopsis genome. A total of 98.5%, 85.2%, and 100% of the reported NBS-encoding genes, membrane associated RLPs and RLKs were validated, respectively. The pipeline was also successfully applied to predict RGAs for 50 sequenced plant genomes. A user-friendly web interface was implemented to ease command line operations, facilitate visualization and simplify result management for multiple datasets. RGAugury is an efficient, integrative bioinformatics tool for large scale genome-wide identification of RGAs. It is freely available at Bitbucket: https://bitbucket.org/yaanlpc/rgaugury .
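
    To make the idea of classifying candidates by combinations of domains concrete, here is a small sketch. The abstract does not spell out the decision rules, so the rules below are illustrative assumptions only and should not be read as RGAugury's actual logic or API; the function name `classify_rga` is hypothetical.

```python
def classify_rga(domains):
    """Assign a protein to one of four RGA families from its domain set.

    domains: iterable of domain/motif labels taken from the abstract
    (NB-ARC, LRR, TM, STTK, LysM, CC, TIR).
    The precedence and combinations used here are assumed for illustration.
    """
    d = set(domains)
    if "NB-ARC" in d:
        return "NBS-encoding"
    has_tm = "TM" in d
    has_ectodomain = bool(d & {"LRR", "LysM"})
    if has_tm and has_ectodomain and "STTK" in d:
        return "RLK"      # membrane-associated, with a kinase domain
    if has_tm and has_ectodomain:
        return "RLP"      # membrane-associated, no kinase domain
    if has_tm and "CC" in d:
        return "TM-CC"
    return None           # not an RGA candidate under these assumed rules

examples = {
    "protein_1": ["TIR", "NB-ARC", "LRR"],
    "protein_2": ["LRR", "TM", "STTK"],
    "protein_3": ["LysM", "TM"],
    "protein_4": ["TM", "CC"],
    "protein_5": ["CC"],
}
families = {name: classify_rga(doms) for name, doms in examples.items()}
```

    A rule table like this is easy to audit and to adjust once the real criteria are taken from the pipeline's documentation.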

  2. Genomic prediction and genome-wide association analysis of female longevity in a composite beef cattle breed.

    PubMed

    Hamidi Hay, E; Roberts, A

    2017-04-01

    Longevity is a highly important trait to the efficiency of beef cattle production. The objective of this study was to evaluate the genomic prediction of longevity and identify genomic regions associated with this trait. The data used in this study consisted of 547 Composite Gene Combination cows (1/2 Red Angus, 1/4 Charolais, 1/4 Tarentaise) born from 2002 to 2011 genotyped with Illumina BovineSNP50 BeadChip. Three models were used to assess genomic prediction: Bayes A, Bayes B and GBLUP using a genomic relationship matrix. To identify genomic regions associated with longevity, 2 approaches were adopted: single marker genome wide association and Bayesian approach using GenSel software. The genomic prediction accuracy was low: 0.28, 0.25, and 0.22 for Bayes A, Bayes B and GBLUP, respectively. The single-marker genome wide association study (GWAS) identified 5 loci with P-value less than 0.05 after false discovery correction: UA-IFASA-7571 on chromosome 19 (58.03 Mb), ARS-BFGL-BAC-15059 on BTA1 (28.8 Mb), ARS-BFGL-NGS-104159 on BTA3 (29.4 Mb), ARS-BFGL-NGS-32882 on BTA9 (104.07 Mb) and ARS-BFGL-NGS-32883 on BTA25 (33.77 Mb). The Bayesian GWAS yielded 4 genomic regions overlapping with the single marker GWAS results. The region with the highest percentage of genomic variance (3.73%) was detected on chromosome 19. Both GWAS approaches adopted in this study showed evidence for association with various chromosomal locations.
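
    The abstract reports loci that pass a 0.05 cut-off after false discovery correction but does not name the procedure. As one common possibility, the sketch below applies the Benjamini-Hochberg adjustment; that choice, the function name, and the toy P-values are assumptions, not details from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Toy single-marker P-values (made up for illustration).
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
```

    With real GWAS output, the same function would be applied to the full vector of per-marker P-values before selecting loci below the chosen cut-off.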

  3. Genome-wide RNAi Screening to Identify Host Factors That Modulate Oncolytic Virus Therapy.

    PubMed

    Allan, Kristina J; Mahoney, Douglas J; Baird, Stephen D; Lefebvre, Charles A; Stojdl, David F

    2018-04-03

    High-throughput genome-wide RNAi (RNA interference) screening technology has been widely used for discovering host factors that impact virus replication. Here we present the application of this technology to uncovering host targets that specifically modulate the replication of Maraba virus, an oncolytic rhabdovirus, and vaccinia virus with the goal of enhancing therapy. While the protocol has been tested for use with oncolytic Maraba virus and oncolytic vaccinia virus, this approach is applicable to other oncolytic viruses and can also be utilized for identifying host targets that modulate virus replication in mammalian cells in general. This protocol describes the development and validation of an assay for high-throughput RNAi screening in mammalian cells, the key considerations and preparation steps important for conducting a primary high-throughput RNAi screen, and a step-by-step guide for conducting a primary high-throughput RNAi screen; in addition, it broadly outlines the methods for conducting secondary screen validation and tertiary validation studies. The benefit of high-throughput RNAi screening is that it allows one to catalogue, in an extensive and unbiased fashion, host factors that modulate any aspect of virus replication for which one can develop an in vitro assay such as infectivity, burst size, and cytotoxicity. It has the power to uncover biotherapeutic targets unforeseen based on current knowledge.

  4. A Genome-Wide Association Study of Depressive Symptoms

    PubMed Central

    Cornelis, Marilyn C.; Amin, Najaf; Bakshis, Erin; Baumert, Jens; Ding, Jingzhong; Liu, Yongmei; Marciante, Kristin; Meirelles, Osorio; Nalls, Michael A.; Sun, Yan V.; Vogelzangs, Nicole; Yu, Lei; Bandinelli, Stefania; Benjamin, Emelia J.; Bennett, David A.; Boomsma, Dorret; Cannas, Alessandra; Coker, Laura H.; de Geus, Eco; De Jager, Philip L.; Diez-Roux, Ana V.; Purcell, Shaun; Hu, Frank B.; Rimma, Eric B.; Hunter, David J.; Jensen, Majken K.; Curhan, Gary; Rice, Kenneth; Penman, Alan D.; Rotter, Jerome I.; Sotoodehnia, Nona; Emeny, Rebecca; Eriksson, Johan G.; Evans, Denis A.; Ferrucci, Luigi; Fornage, Myriam; Gudnason, Vilmundur; Hofman, Albert; Illig, Thomas; Kardia, Sharon; Kelly-Hayes, Margaret; Koenen, Karestan; Kraft, Peter; Kuningas, Maris; Massaro, Joseph M.; Melzer, David; Mulas, Antonella; Mulder, Cornelis L.; Murray, Anna; Oostra, Ben A.; Palotie, Aarno; Penninx, Brenda; Petersmann, Astrid; Pilling, Luke C.; Psaty, Bruce; Rawal, Rajesh; Reiman, Eric M.; Schulz, Andrea; Shulman, Joshua M.; Singleton, Andrew B.; Smith, Albert V.; Sutin, Angelina R.; Uitterlinden, André G.; Völzke, Henry; Widen, Elisabeth; Yaffe, Kristine; Zonderman, Alan B.; Cucca, Francesco; Harris, Tamara; Ladwig, Karl-Heinz; Llewellyn, David J.; Räikkönen, Katri; Tanaka, Toshiko

    2013-01-01

    Background Depression is a heritable trait that exists on a continuum of varying severity and duration. Yet, the search for genetic variants associated with depression has had few successes. We exploit the entire continuum of depression to find common variants for depressive symptoms. Methods In this genome-wide association study, we combined the results of 17 population-based studies assessing depressive symptoms with the Center for Epidemiological Studies Depression Scale. Replication of the independent top hits (p < 1 × 10^−5) was performed in five studies assessing depressive symptoms with other instruments. In addition, we performed a combined meta-analysis of all 22 discovery and replication studies. Results The discovery sample comprised 34,549 individuals (mean age of 66.5) and no loci reached genome-wide significance (lowest p = 1.05 × 10^−7). Seven independent single nucleotide polymorphisms were considered for replication. In the replication set (n = 16,709), we found suggestive association of one single nucleotide polymorphism with depressive symptoms (rs161645, 5q21, p = 9.19 × 10^−3). This 5q21 region reached genome-wide significance (p = 4.78 × 10^−8) in the overall meta-analysis combining discovery and replication studies (n = 51,258). Conclusions The results suggest that only a large sample comprising more than 50,000 subjects may be sufficiently powered to detect genes for depressive symptoms. PMID:23290196

  5. Genome-wide SNP identification and QTL mapping for black rot resistance in cabbage.

    PubMed

    Lee, Jonghoon; Izzah, Nur Kholilatul; Jayakodi, Murukarthick; Perumal, Sampath; Joh, Ho Jun; Lee, Hyeon Ju; Lee, Sang-Choon; Park, Jee Young; Yang, Ki-Woung; Nou, Il-Sup; Seo, Joodeok; Yoo, Jaeheung; Suh, Youngdeok; Ahn, Kyounggu; Lee, Ji Hyun; Choi, Gyung Ja; Yu, Yeisoo; Kim, Heebal; Yang, Tae-Jin

    2015-02-03

    Black rot is a destructive bacterial disease causing large yield and quality losses in Brassica oleracea. To detect quantitative trait loci (QTL) for black rot resistance, we performed whole-genome resequencing of two cabbage parental lines and genome-wide SNP identification using the recently published B. oleracea genome sequences as reference. Approximately 11.5 Gb of sequencing data was produced from each parental line. Reference genome-guided mapping and SNP calling revealed 674,521 SNPs between the two cabbage lines, with an average of one SNP per 662.5 bp. Among 167 dCAPS markers derived from candidate SNPs, 117 (70.1%) were validated as bona fide SNPs showing polymorphism between the parental lines. We then improved the resolution of a previous genetic map by adding 103 markers including 87 SNP-based dCAPS markers. The new map is composed of 368 markers and covers 1467.3 cM with an average interval of 3.88 cM between adjacent markers. We evaluated black rot resistance in the mapping population in three independent inoculation tests using F2:3 progenies and identified one major QTL and three minor QTLs. We report successful utilization of whole-genome resequencing for large-scale SNP identification and development of molecular markers for genetic map construction. In addition, we identified novel QTLs for black rot resistance. The high-density genetic map will promote QTL analysis for other important agricultural traits and marker-assisted breeding of B. oleracea.

  6. Empirical estimation of genome-wide significance thresholds based on the 1000 Genomes Project data set.

    PubMed

    Kanai, Masahiro; Tanaka, Toshihiro; Okada, Yukinori

    2016-10-01

    To assess the statistical significance of associations between variants and traits, genome-wide association studies (GWAS) should employ an appropriate threshold that accounts for the massive burden of multiple testing in the study. Although most studies in the current literature commonly set a genome-wide significance threshold at the level of P = 5.0 × 10^-8, the adequacy of this value for respective populations has not been fully investigated. To empirically estimate thresholds for different ancestral populations, we conducted GWAS simulations using the 1000 Genomes Phase 3 data set for Africans (AFR), Europeans (EUR), Admixed Americans (AMR), East Asians (EAS) and South Asians (SAS). The estimated empirical genome-wide significance thresholds were P_sig = 3.24 × 10^-8 (AFR), 9.26 × 10^-8 (EUR), 1.83 × 10^-7 (AMR), 1.61 × 10^-7 (EAS) and 9.46 × 10^-8 (SAS). We additionally conducted trans-ethnic meta-analyses across all populations (ALL) and all populations except for AFR (ΔAFR), which yielded P_sig = 3.25 × 10^-8 (ALL) and 4.20 × 10^-8 (ΔAFR). Our results indicate that the current threshold (P = 5.0 × 10^-8) is overly stringent for all ancestral populations except for Africans; however, we should employ a more stringent threshold when conducting a meta-analysis, regardless of the presence of African samples.
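
    One way to read these thresholds is as the number of independent tests each one would imply under a Bonferroni-style correction. The sketch below does that arithmetic, assuming a family-wise error rate of 0.05; the study derived its thresholds by simulation, so this is an interpretive aid rather than the authors' method.

```python
ALPHA = 0.05  # assumed family-wise error rate

# Thresholds as reported in the abstract above.
thresholds = {
    "conventional": 5.0e-8,
    "AFR": 3.24e-8,
    "EUR": 9.26e-8,
    "AMR": 1.83e-7,
    "EAS": 1.61e-7,
    "SAS": 9.46e-8,
}

def implied_independent_tests(threshold, alpha=ALPHA):
    """Number of independent tests for which alpha / n equals the threshold."""
    return alpha / threshold

implied = {pop: implied_independent_tests(t) for pop, t in thresholds.items()}

for pop, n in implied.items():
    print(f"{pop}: about {n:,.0f} independent tests")
```

    A smaller threshold corresponds to more implied independent tests, which is consistent with the abstract's finding that only the African sample calls for a threshold stricter than the conventional one.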

  7. Genome Wide Association Study of Sepsis in Extremely Premature Infants

    PubMed Central

    Srinivasan, Lakshmi; Page, Grier; Kirpalani, Haresh; Murray, Jeffrey C.; Das, Abhik; Higgins, Rosemary D.; Carlo, Waldemar A.; Bell, Edward F.; Goldberg, Ronald N.; Schibler, Kurt; Sood, Beena G.; Stevenson, David K.; Stoll, Barbara J.; Van Meurs, Krisa P.; Johnson, Karen J.; Levy, Joshua; McDonald, Scott A.; Zaterka-Baxter, Kristin M.; Kennedy, Kathleen A.; Sánchez, Pablo J.; Duara, Shahnaz; Walsh, Michele C.; Shankaran, Seetha; Wynn, James L.; Cotten, C. Michael

    2017-01-01

    Objective To identify genetic variants associated with sepsis (early and late-onset) using a genome wide association (GWA) analysis in a cohort of extremely premature infants. Study Design Previously generated GWA data from the Neonatal Research Network’s anonymized genomic database biorepository of extremely premature infants were used for this study. Sepsis was defined as culture-positive early-onset or late-onset sepsis or culture-proven meningitis. Genomic and whole genome amplified DNA was genotyped for 1.2 million single nucleotide polymorphisms (SNPs); 91% of SNPs were successfully genotyped. We imputed 7.2 million additional SNPs. P values and false discovery rates were calculated from multivariate logistic regression analysis adjusting for gender, gestational age and ancestry. Target statistical value was p < 10^−5. Secondary analyses assessed associations of SNPs with pathogen type. Pathway analyses were also run on primary and secondary end points. Results Data from 757 extremely premature infants were included: 351 infants with sepsis and 406 infants without sepsis. No SNPs reached genome-wide significance levels (5 × 10^−8); two SNPs in proximity to FOXC2 and FOXL1 genes achieved target levels of significance. In secondary analyses, SNPs for ELMO1, IRAK2 (Gram positive sepsis), RALA, IMMP2L (Gram negative sepsis) and PIEZO2 (fungal sepsis) met target significance levels. Pathways associated with sepsis and Gram negative sepsis included gap junctions, fibroblast growth factor receptors, regulators of cell division and Interleukin-1 associated receptor kinase 2 (p values < 0.001 and FDR < 20%). Conclusions No SNPs met genome-wide significance in this cohort of ELBW infants; however, areas of potential association and pathways meriting further study were identified. PMID:28283553

  8. Genome-Wide Divergence and Linkage Disequilibrium Analyses for Capsicum baccatum Revealed by Genome-Anchored Single Nucleotide Polymorphisms

    PubMed Central

    Nimmakayala, Padma; Abburi, Venkata L.; Saminathan, Thangasamy; Almeida, Aldo; Davenport, Brittany; Davidson, Joshua; Reddy, C. V. Chandra Mohan; Hankins, Gerald; Ebert, Andreas; Choi, Doil; Stommel, John; Reddy, Umesh K.

    2016-01-01

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to characterize population structure and species domestication of these two important incompatible cultivated pepper species. Estimated mean nucleotide diversity (π) and Tajima's D across various chromosomes revealed biased distribution toward negative values on all chromosomes (except for chromosome 4) in cultivated C. baccatum, indicating a population bottleneck during domestication of C. baccatum. In contrast, C. annuum chromosomes showed positive π and Tajima's D on all chromosomes except chromosome 8, which may be because of domestication at multiple sites contributing to wider genetic diversity. For C. baccatum, 13,129 SNPs were available, with minor allele frequency (MAF) ≥0.05; PCA of the SNPs revealed 283 C. baccatum accessions grouped into 3 distinct clusters, for strong population structure. The fixation index (FST) between domesticated C. annuum and C. baccatum was 0.78, which indicates genome-wide divergence. We conducted extensive linkage disequilibrium (LD) analysis of C. baccatum var. pendulum cultivars on all adjacent SNP pairs within a chromosome to identify regions of high and low LD interspersed with a genome-wide average LD block size of 99.1 kb. We characterized 1742 haplotypes containing 4420 SNPs (range 9–2 SNPs per haplotype). Genome-wide association study (GWAS) of peduncle length, a trait that differentiates wild and domesticated C. baccatum types, revealed 36 significantly associated genome-wide SNPs. Population structure, identity by state (IBS) and LD patterns across the genome will be of potential use for future GWAS of economically important traits in C. baccatum peppers. PMID:27857720
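
    The minor allele frequency cut-off mentioned in this abstract (MAF ≥ 0.05) can be illustrated with a short filter. The genotype encoding (alternate-allele counts of 0, 1 or 2 per diploid accession, `None` for missing calls), the function names, and the toy data are assumptions for this sketch, not details of the study's pipeline.

```python
def minor_allele_frequency(genotypes):
    """MAF from alternate-allele counts (0, 1, 2); None marks a missing call."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))  # two alleles per accession
    return min(alt_freq, 1.0 - alt_freq)

def filter_by_maf(snp_table, min_maf=0.05):
    """Keep SNPs whose minor allele frequency is at least min_maf."""
    return {
        snp: calls
        for snp, calls in snp_table.items()
        if minor_allele_frequency(calls) >= min_maf
    }

# Toy genotype table (SNP identifiers and calls are made up).
snps = {
    "snp_a": [0, 0, 1, 2],        # alt frequency 3/8
    "snp_b": [0] * 19 + [1],      # alt frequency 1/40, below the cut-off
    "snp_c": [2, 2, 2, None],     # monomorphic among called genotypes
}
kept = filter_by_maf(snps)
```

    Rare and monomorphic sites drop out, leaving only markers informative enough for analyses such as PCA or association testing.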

  9. Genome-Wide Divergence and Linkage Disequilibrium Analyses for Capsicum baccatum Revealed by Genome-Anchored Single Nucleotide Polymorphisms.

    PubMed

    Nimmakayala, Padma; Abburi, Venkata L; Saminathan, Thangasamy; Almeida, Aldo; Davenport, Brittany; Davidson, Joshua; Reddy, C V Chandra Mohan; Hankins, Gerald; Ebert, Andreas; Choi, Doil; Stommel, John; Reddy, Umesh K

    2016-01-01

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to characterize population structure and species domestication of these two important incompatible cultivated pepper species. Estimated mean nucleotide diversity (π) and Tajima's D across various chromosomes revealed biased distribution toward negative values on all chromosomes (except for chromosome 4) in cultivated C. baccatum, indicating a population bottleneck during domestication of C. baccatum. In contrast, C. annuum chromosomes showed positive π and Tajima's D on all chromosomes except chromosome 8, which may be because of domestication at multiple sites contributing to wider genetic diversity. For C. baccatum, 13,129 SNPs were available, with minor allele frequency (MAF) ≥0.05; PCA of the SNPs revealed 283 C. baccatum accessions grouped into 3 distinct clusters, for strong population structure. The fixation index (FST) between domesticated C. annuum and C. baccatum was 0.78, which indicates genome-wide divergence. We conducted extensive linkage disequilibrium (LD) analysis of C. baccatum var. pendulum cultivars on all adjacent SNP pairs within a chromosome to identify regions of high and low LD interspersed with a genome-wide average LD block size of 99.1 kb. We characterized 1742 haplotypes containing 4420 SNPs (range 9-2 SNPs per haplotype). Genome-wide association study (GWAS) of peduncle length, a trait that differentiates wild and domesticated C. baccatum types, revealed 36 significantly associated genome-wide SNPs. Population structure, identity by state (IBS) and LD patterns across the genome will be of potential use for future GWAS of economically important traits in C. baccatum peppers.

  10. Genome-wide association study of alcohol dependence

    PubMed Central

    Treutlein, Jens; Cichon, Sven; Ridinger, Monika; Wodarz, Norbert; Soyka, Michael; Zill, Peter; Maier, Wolfgang; Moessner, Rainald; Gaebel, Wolfgang; Dahmen, Norbert; Fehr, Christoph; Scherbaum, Norbert; Steffens, Michael; Ludwig, Kerstin U.; Frank, Josef; Wichmann, H.- Erich; Schreiber, Stefan; Dragano, Nico; Sommer, Wolfgang; Leonardi-Essmann, Fernando; Lourdusamy, Anbarasu; Gebicke-Haerter, Peter; Wienker, Thomas F.; Sullivan, Patrick F.; Nöthen, Markus M.; Kiefer, Falk; Spanagel, Rainer; Mann, Karl; Rietschel, Marcella

    2014-01-01

    Context Identification of genes contributing to alcohol dependence will improve our understanding of the mechanisms underlying this disorder. Objective To identify susceptibility genes for alcohol dependence through a genome-wide association study (GWAS) and follow-up study in a population of German male inpatients with an early age at onset. Design The GWAS included 487 male inpatients with DSM-IV alcohol dependence with an age at onset below 28 years and 1,358 population based control individuals. The follow-up study included 1,024 male inpatients and 996 age-matched male controls. All subjects were of German descent. The GWAS tested 524,396 single nucleotide polymorphisms (SNPs). All SNPs with p < 10^-4 were subjected to the follow-up study. In addition, nominally significant SNPs from those genes that had also shown expression changes in rat brains after chronic alcohol consumption were selected for the follow-up step. Results The GWAS produced 121 SNPs with nominal p < 10^-4. These, together with 19 additional SNPs from homologs of rat genes showing differential expression, were genotyped in the follow-up sample. Fifteen SNPs showed significant association with the same allele as in the GWAS. In the combined analysis, two closely linked intergenic SNPs met genome-wide significance (rs7590720 p = 9.72 × 10^-9; rs1344694 p = 1.69 × 10^-8). They are located on chromosome 2q35, a region which has been implicated in linkage studies for alcohol phenotypes. Nine SNPs were located in genes, including CDH13 and ADH1C genes which have been reported to be associated with alcohol dependence. Conclusion This is the first GWAS and follow-up study to identify a genome-wide significant association in alcohol dependence. Further independent studies are required to confirm these findings. PMID:19581569

  11. Genome-wide comparative analysis of DNA methylation between soybean cytoplasmic male-sterile line NJCMS5A and its maintainer NJCMS5B.

    PubMed

    Li, Yanwei; Ding, Xianlong; Wang, Xuan; He, Tingting; Zhang, Hao; Yang, Longshu; Wang, Tanliu; Chen, Linfeng; Gai, Junyi; Yang, Shouping

    2017-08-10

    DNA methylation is an important epigenetic modification. It can regulate the expression of many key genes without changing the primary structure of the genomic DNA, and plays a vital role in the growth and development of the organism. The genome-wide DNA methylation profile of the cytoplasmic male sterile (CMS) line in soybean has not been reported so far. In this study, genome-wide comparative analysis of DNA methylation between soybean CMS line NJCMS5A and its maintainer NJCMS5B was conducted by whole-genome bisulfite sequencing. The results showed 3527 differentially methylated regions (DMRs) and 485 differentially methylated genes (DMGs), including 353 high-credible methylated genes, 56 methylated genes coding unknown protein and 76 novel methylated genes with no known function. Among them, 25 DMRs were further validated through bisulfite treatment, confirming that the genome-wide DNA methylation data were reliable, and 9 DMRs were used to confirm the relationship between DNA methylation and gene expression by qRT-PCR. Finally, 8 key DMGs possibly associated with soybean CMS were identified. Genome-wide DNA methylation profile of the soybean CMS line NJCMS5A and its maintainer NJCMS5B was obtained for the first time. Several specific DMGs which participated in pollen and flower development were further identified to be probably associated with soybean CMS. This study will contribute to further understanding of the molecular mechanism behind soybean CMS.

  12. Genome-wide analysis of tandem repeats in plants and green algae

    Treesearch

    Zhixin Zhao; Cheng Guo; Sreeskandarajan Sutharzan; Pei Li; Craig Echt; Jie Zhang; Chun Liang

    2014-01-01

    Tandem repeats (TRs) are widespread in the genomes of prokaryotes and eukaryotes. Based on the sequenced genomes and gene annotations of 31 plant and algal species in Phytozome version 8.0 (http://www.phytozome.net/), we examined TRs on a genome-wide scale, characterized their distributions and motif features, and explored their putative biological functions. Among...

  13. Investigation of common, low-frequency and rare genome-wide variation in anorexia nervosa.

    PubMed

    Huckins, L M; Hatzikotoulas, K; Southam, L; Thornton, L M; Steinberg, J; Aguilera-McKay, F; Treasure, J; Schmidt, U; Gunasinghe, C; Romero, A; Curtis, C; Rhodes, D; Moens, J; Kalsi, G; Dempster, D; Leung, R; Keohane, A; Burghardt, R; Ehrlich, S; Hebebrand, J; Hinney, A; Ludolph, A; Walton, E; Deloukas, P; Hofman, A; Palotie, A; Palta, P; van Rooij, F J A; Stirrups, K; Adan, R; Boni, C; Cone, R; Dedoussis, G; van Furth, E; Gonidakis, F; Gorwood, P; Hudson, J; Kaprio, J; Kas, M; Keski-Rahonen, A; Kiezebrink, K; Knudsen, G-P; Slof-Op 't Landt, M C T; Maj, M; Monteleone, A M; Monteleone, P; Raevuori, A H; Reichborn-Kjennerud, T; Tozzi, F; Tsitsika, A; van Elburg, A; Collier, D A; Sullivan, P F; Breen, G; Bulik, C M; Zeggini, E

    2018-05-01

    Anorexia nervosa (AN) is a complex neuropsychiatric disorder presenting with dangerously low body weight, and a deep and persistent fear of gaining weight. To date, only one genome-wide significant locus associated with AN has been identified. We performed an exome-chip based genome-wide association study (GWAS) in 2158 cases from nine populations of European origin and 15 485 ancestrally matched controls. Unlike previous studies, this GWAS also probed association in low-frequency and rare variants. Sixteen independent variants were taken forward for in silico and de novo replication (11 common and 5 rare). No findings reached genome-wide significance. Two notable common variants were identified: rs10791286, an intronic variant in OPCML (P=9.89 × 10^-6), and rs7700147, an intergenic variant (P=2.93 × 10^-5). No low-frequency variant associations were identified at genome-wide significance, although the study was well-powered to detect low-frequency variants with large effect sizes, suggesting that there may be no AN loci in this genomic search space with large effect sizes.

  14. Investigation of common, low-frequency and rare genome-wide variation in anorexia nervosa

    PubMed Central

    Huckins, L M; Hatzikotoulas, K; Southam, L; Thornton, L M; Steinberg, J; Aguilera-McKay, F; Treasure, J; Schmidt, U; Gunasinghe, C; Romero, A; Curtis, C; Rhodes, D; Moens, J; Kalsi, G; Dempster, D; Leung, R; Keohane, A; Burghardt, R; Ehrlich, S; Hebebrand, J; Hinney, A; Ludolph, A; Walton, E; Deloukas, P; Hofman, A; Palotie, A; Palta, P; van Rooij, F J A; Stirrups, K; Adan, R; Boni, C; Cone, R; Dedoussis, G; van Furth, E; Gonidakis, F; Gorwood, P; Hudson, J; Kaprio, J; Kas, M; Keski-Rahonen, A; Kiezebrink, K; Knudsen, G-P; Slof-Op 't Landt, M C T; Maj, M; Monteleone, A M; Monteleone, P; Raevuori, A H; Reichborn-Kjennerud, T; Tozzi, F; Tsitsika, A; van Elburg, A; Adan, R A H; Alfredsson, L; Ando, T; Andreassen, O A; Aschauer, H; Baker, J H; Barrett, J C; Bencko, V; Bergen, A W; Berrettini, W H; Birgegard, A; Boni, C; Boraska Perica, V; Brandt, H; Breen, G; Bulik, C M; Carlberg, L; Cassina, M; Cichon, S; Clementi, M; Cohen-Woods, S; Coleman, J; Cone, R D; Courtet, P; Crawford, S; Crow, S; Crowley, J; Danner, U N; Davis, O S P; de Zwaan, M; Dedoussis, G; Degortes, D; DeSocio, J E; Dick, D M; Dikeos, D; Dina, C; Ding, B; Dmitrzak-Weglarz, M; Docampo, E; Duncan, L; Egberts, K; Ehrlich, S; Escaramís, G; Esko, T; Espeseth, T; Estivill, X; Favaro, A; Fernández-Aranda, F; Fichter, M M; Finan, C; Fischer, K; Floyd, J A B; Foretova, L; Forzan, M; Franklin, C S; Gallinger, S; Gambaro, G; Gaspar, H A; Giegling, I; Gonidakis, F; Gorwood, P; Gratacos, M; Guillaume, S; Guo, Y; Hakonarson, H; Halmi, K A; Hatzikotoulas, K; Hauser, J; Hebebrand, J; Helder, S; Herms, S; Herpertz-Dahlmann, B; Herzog, W; Hilliard, C E; Hinney, A; Hübel, C; Huckins, L M; Hudson, J I; Huemer, J; Inoko, H; Janout, V; Jiménez-Murcia, S; Johnson, C; Julià, A; Juréus, A; Kalsi, G; Kaminska, D; Kaplan, A S; Kaprio, J; Karhunen, L; Karwautz, A; Kas, M J H; Kaye, W; Kennedy, J L; Keski-Rahkonen, A; Kiezebrink, K; Klareskog, L; Klump, K L; Knudsen, G P S; Koeleman, B P C; Koubek, D; La Via, M C; Landén, M; Le 
Hellard, S; Levitan, R D; Li, D; Lichtenstein, P; Lilenfeld, L; Lissowska, J; Lundervold, A; Magistretti, P; Maj, M; Mannik, K; Marsal, S; Martin, N; Mattingsdal, M; McDevitt, S; McGuffin, P; Merl, E; Metspalu, A; Meulenbelt, I; Micali, N; Mitchell, J; Mitchell, K; Monteleone, P; Monteleone, A M; Mortensen, P; Munn-Chernoff, M A; Navratilova, M; Nilsson, I; Norring, C; Ntalla, I; Ophoff, R A; O'Toole, J K; Palotie, A; Pante, J; Papezova, H; Pinto, D; Rabionet, R; Raevuori, A; Rajewski, A; Ramoz, N; Rayner, N W; Reichborn-Kjennerud, T; Ripatti, S; Roberts, M; Rotondo, A; Rujescu, D; Rybakowski, F; Santonastaso, P; Scherag, A; Scherer, S W; Schmidt, U; Schork, N J; Schosser, A; Slachtova, L; Sladek, R; Slagboom, P E; Slof-Op 't Landt, M C T; Slopien, A; Soranzo, N; Southam, L; Steen, V M; Strengman, E; Strober, M; Sullivan, P F; Szatkiewicz, J P; Szeszenia-Dabrowska, N; Tachmazidou, I; Tenconi, E; Thornton, L M; Tortorella, A; Tozzi, F; Treasure, J; Tsitsika, A; Tziouvas, K; van Elburg, A A; van Furth, E F; Wagner, G; Walton, E; Watson, H; Wichmann, H-E; Widen, E; Woodside, D B; Yanovski, J; Yao, S; Yilmaz, Z; Zeggini, E; Zerwas, S; Zipfel, S; Collier, D A; Sullivan, P F; Breen, G; Bulik, C M; Zeggini, E

    2018-01-01

    Anorexia nervosa (AN) is a complex neuropsychiatric disorder presenting with dangerously low body weight, and a deep and persistent fear of gaining weight. To date, only one genome-wide significant locus associated with AN has been identified. We performed an exome-chip based genome-wide association study (GWAS) in 2158 cases from nine populations of European origin and 15 485 ancestrally matched controls. Unlike previous studies, this GWAS also probed association in low-frequency and rare variants. Sixteen independent variants were taken forward for in silico and de novo replication (11 common and 5 rare). No findings reached genome-wide significance. Two notable common variants were identified: rs10791286, an intronic variant in OPCML (P=9.89 × 10^-6), and rs7700147, an intergenic variant (P=2.93 × 10^-5). No low-frequency variant associations were identified at genome-wide significance, although the study was well-powered to detect low-frequency variants with large effect sizes, suggesting that there may be no AN loci in this genomic search space with large effect sizes. PMID:29155802

  15. Genomic selection and complex trait prediction using a fast EM algorithm applied to genome-wide markers

    PubMed Central

    2010-01-01

    Background The information provided by dense genome-wide markers using high throughput technology is of considerable potential in human disease studies and livestock breeding programs. Genome-wide association studies relate individual single nucleotide polymorphisms (SNP) from dense SNP panels to individual measurements of complex traits, with the underlying assumption being that any association is caused by linkage disequilibrium (LD) between SNP and quantitative trait loci (QTL) affecting the trait. Often SNP are in genomic regions of no trait variation. Whole genome Bayesian models are an effective way of incorporating this and other important prior information into modelling. However a full Bayesian analysis is often not feasible due to the large computational time involved. Results This article proposes an expectation-maximization (EM) algorithm called emBayesB which allows only a proportion of SNP to be in LD with QTL and incorporates prior information about the distribution of SNP effects. The posterior probability of being in LD with at least one QTL is calculated for each SNP along with estimates of the hyperparameters for the mixture prior. A simulated example of genomic selection from an international workshop is used to demonstrate the features of the EM algorithm. The accuracy of prediction is comparable to a full Bayesian analysis but the EM algorithm is considerably faster. The EM algorithm was accurate in locating QTL which explained more than 1% of the total genetic variation. A computational algorithm for very large SNP panels is described. Conclusions emBayesB is a fast and accurate EM algorithm for implementing genomic selection and predicting complex traits by mapping QTL in genome-wide dense SNP marker data. Its accuracy is similar to Bayesian methods but it takes only a fraction of the time. PMID:20969788
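
    The idea of computing, for each SNP, a posterior probability of being in LD with a QTL can be sketched with a generic two-component mixture fitted by EM. This is not emBayesB itself: the abstract does not give the model's densities, so the zero-mean normal components, the fixed variances, and the names e_step, m_step and fit_gamma below are all assumptions made for illustration.

```python
import math


def normal_pdf(x: float, var: float) -> float:
    """Density of a zero-mean normal distribution with variance `var`."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def e_step(effects, gamma, var_null, var_alt):
    """Posterior probability that each SNP effect comes from the 'in LD' component.

    gamma is the prior proportion of SNPs in LD with a QTL (assumed mixture weight).
    """
    posteriors = []
    for b in effects:
        in_ld = gamma * normal_pdf(b, var_alt)
        not_in_ld = (1.0 - gamma) * normal_pdf(b, var_null)
        posteriors.append(in_ld / (in_ld + not_in_ld))
    return posteriors


def m_step(posteriors):
    """Update the mixture weight as the mean posterior probability."""
    return sum(posteriors) / len(posteriors)


def fit_gamma(effects, gamma=0.1, var_null=0.01, var_alt=1.0, n_iter=50):
    """Alternate E and M steps; the variances are held fixed for simplicity."""
    for _ in range(n_iter):
        gamma = m_step(e_step(effects, gamma, var_null, var_alt))
    return gamma
```

    In this simplified form only the mixture proportion is re-estimated; a fuller treatment would also update the variance hyperparameters at each M step.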

  16. Genome-wide association study of rice grain width variation.

    PubMed

    Zheng, Xiao-Ming; Gong, Tingting; Ou, Hong-Ling; Xue, Dayuan; Qiao, Weihua; Wang, Junrui; Liu, Sha; Yang, Qingwen; Olsen, Kenneth M

    2018-04-01

    Seed size is variable within many plant species, and understanding the underlying genetic factors can provide insights into mechanisms of local environmental adaptation. Here we make use of the abundant genomic and germplasm resources available for rice (Oryza sativa) to perform a large-scale genome-wide association study (GWAS) of grain width. Grain width varies widely within the crop and is also known to show climate-associated variation across populations of its wild progenitor. Using a filtered dataset of >1.9 million genome-wide SNPs in a sample of 570 cultivated and wild rice accessions, we performed GWAS with two complementary models, GLM and MLM. The models yielded 10 and 33 significant associations, respectively, and jointly yielded seven candidate locus regions, two of which have been previously identified. Analyses of nucleotide diversity and haplotype distributions at these loci revealed signatures of selection and patterns consistent with adaptive introgression of grain width alleles across rice variety groups. The results provide a 50% increase in the total number of rice grain width loci mapped to date and support a polygenic model whereby grain width is shaped by gene-by-environment interactions. These loci can potentially serve as candidates for studies of adaptive seed size variation in wild grass species.

  17. AID/APOBEC cytosine deaminase induces genome-wide kataegis

    PubMed Central

    2012-01-01

    Clusters of localized hypermutation in human breast cancer genomes, named “kataegis” (from the Greek for thunderstorm), are hypothesized to result from multiple cytosine deaminations catalyzed by AID/APOBEC proteins. However, a direct link between APOBECs and kataegis is still lacking. We have sequenced the genomes of yeast mutants induced in diploids by expression of the gene for PmCDA1, a hypermutagenic deaminase from sea lamprey. Analysis of the distribution of 5,138 induced mutations revealed localized clusters very similar to those found in tumors. Our data provide evidence that unleashed cytosine deaminase activity is an evolutionary conserved, prominent source of genome-wide kataegis events. Reviewers This article was reviewed by: Professor Sandor Pongor, Professor Shamil R. Sunyaev, and Dr Vladimir Kuznetsov. PMID:23249472

  18. Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes.

    PubMed

    Riechmann, J L; Heard, J; Martin, G; Reuber, L; Jiang, C; Keddie, J; Adam, L; Pineda, O; Ratcliffe, O J; Samaha, R R; Creelman, R; Pilgrim, M; Broun, P; Zhang, J Z; Ghandehari, D; Sherman, B K; Yu, G

    2000-12-15

    The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.

  19. Genome-Wide Association Study and Linkage Analysis of the Healthy Aging Index.

    PubMed

    Minster, Ryan L; Sanders, Jason L; Singh, Jatinder; Kammerer, Candace M; Barmada, M Michael; Matteini, Amy M; Zhang, Qunyuan; Wojczynski, Mary K; Daw, E Warwick; Brody, Jennifer A; Arnold, Alice M; Lunetta, Kathryn L; Murabito, Joanne M; Christensen, Kaare; Perls, Thomas T; Province, Michael A; Newman, Anne B

    2015-08-01

    The Healthy Aging Index (HAI) is a tool for measuring the extent of health and disease across multiple systems. We conducted a genome-wide association study and a genome-wide linkage analysis to map quantitative trait loci associated with the HAI and a modified HAI weighted for mortality risk in 3,140 individuals selected for familial longevity from the Long Life Family Study. The genome-wide association study used the Long Life Family Study as the discovery cohort and individuals from the Cardiovascular Health Study and the Framingham Heart Study as replication cohorts. There were no genome-wide significant findings from the genome-wide association study; however, several single-nucleotide polymorphisms near ZNF704 on chromosome 8q21.13 were suggestively associated with the HAI in the Long Life Family Study (p < 10^-6) and nominally replicated in the Cardiovascular Health Study and Framingham Heart Study. Linkage results revealed significant evidence (log-odds score = 3.36) for a quantitative trait locus for mortality-optimized HAI in women on chromosome 9p24-p23. However, results of fine-mapping studies did not implicate any specific candidate genes within this region of interest. ZNF704 may be a potential candidate gene for studies of the genetic underpinnings of longevity. © The Author 2015. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  20. Genome-Wide Analysis of Gene-Gene and Gene-Environment Interactions Using Closed-Form Wald Tests.

    PubMed

    Yu, Zhaoxia; Demetriou, Michael; Gillen, Daniel L

    2015-09-01

    Despite the successful discovery of hundreds of variants for complex human traits using genome-wide association studies, the degree to which genes and environmental risk factors jointly affect disease risk is largely unknown. One obstacle toward this goal is that the computational effort required for testing gene-gene and gene-environment interactions is enormous. As a result, numerous computationally efficient tests were recently proposed. However, the validity of these methods often relies on unrealistic assumptions such as additive main effects, main effects at only one variable, no linkage disequilibrium between the two single-nucleotide polymorphisms (SNPs) in a pair or gene-environment independence. Here, we derive closed-form and consistent estimates for interaction parameters and propose to use Wald tests for testing interactions. The Wald tests are asymptotically equivalent to the likelihood ratio tests (LRTs), largely considered to be the gold standard tests but generally too computationally demanding for genome-wide interaction analysis. Simulation studies show that the proposed Wald tests perform very similarly to the LRTs but are much more computationally efficient. Applying the proposed tests to a genome-wide study of multiple sclerosis, we identify interactions within the major histocompatibility complex region. In this application, we find that (1) focusing on pairs where both SNPs are marginally significant leads to more significant interactions when compared to focusing on pairs where at least one SNP is marginally significant; and (2) parsimonious parameterization of interaction effects might decrease, rather than increase, statistical power. © 2015 WILEY PERIODICALS, INC.
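
    A closed-form Wald test for an interaction can be illustrated with a textbook case: the difference between two log odds ratios estimated from independent 2x2 tables. This is a generic sketch, not the estimator derived in the paper, which the abstract does not spell out. It assumes all cell counts are non-zero and uses the usual large-sample variance (the sum of reciprocal cell counts); the function names are hypothetical.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Return (log odds ratio, variance) for the 2x2 table [[a, b], [c, d]].

    Assumes every cell count is greater than zero.
    """
    estimate = math.log((a * d) / (b * c))
    variance = 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d
    return estimate, variance


def wald_interaction_test(table_exposed, table_unexposed):
    """Wald test that the log odds ratio differs between two independent strata.

    Returns (interaction estimate, standard error, z statistic, two-sided p-value).
    """
    est1, var1 = log_odds_ratio(*table_exposed)
    est0, var0 = log_odds_ratio(*table_unexposed)
    beta = est1 - est0                 # interaction on the log-odds scale
    se = math.sqrt(var1 + var0)        # strata assumed independent
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, z, p_value
```

    Because the estimate and its standard error come straight from the cell counts, no iterative model fitting is needed, which is the property that makes closed-form tests attractive when the number of pairs is very large.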

  1. Genome-wide comparisons of phylogenetic similarities between partial genomic regions and the full-length genome in Hepatitis E virus genotyping.

    PubMed

    Wang, Shuai; Wei, Wei; Luo, Xuenong; Cai, Xuepeng

    2014-01-01

    Besides the complete genome, different partial genomic sequences of Hepatitis E virus (HEV) have been used in genotyping studies, making it difficult to compare the results based on them. No commonly agreed partial region for HEV genotyping has been determined. In this study, we used a statistical method to evaluate the phylogenetic performance of each partial genomic sequence on a genome-wide scale, comparing evolutionary distances between genomic regions and the full-length genomes of 101 HEV isolates to identify short genomic regions that can reproduce HEV genotype assignments based on full-length genomes. Several genomic regions, especially one genomic region at the 3' terminus of the papain-like cysteine protease domain, were found to have relatively high phylogenetic correlations with the full-length genome. Phylogenetic analyses confirmed the identical performances between these regions and the full-length genome in genotyping, in which the HEV isolates involved could be divided into reasonable genotypes. This analysis may be of value in developing a partial sequence-based consensus classification of HEV species.

  2. GUIDE-Seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases

    PubMed Central

    Nguyen, Nhu T.; Liebers, Matthew; Topkar, Ved V.; Thapar, Vishal; Wyvekens, Nicolas; Khayter, Cyd; Iafrate, A. John; Le, Long P.; Aryee, Martin J.; Joung, J. Keith

    2014-01-01

    CRISPR RNA-guided nucleases (RGNs) are widely used genome-editing reagents, but methods to delineate their genome-wide off-target cleavage activities have been lacking. Here we describe an approach for global detection of DNA double-stranded breaks (DSBs) introduced by RGNs and potentially other nucleases. This method, called Genome-wide Unbiased Identification of DSBs Enabled by Sequencing (GUIDE-Seq), relies on capture of double-stranded oligodeoxynucleotides into breaks. Application of GUIDE-Seq to thirteen RGNs in two human cell lines revealed wide variability in RGN off-target activities and unappreciated characteristics of off-target sequences. The majority of identified sites were not detected by existing computational methods or ChIP-Seq. GUIDE-Seq also identified RGN-independent genomic breakpoint ‘hotspots’. Finally, GUIDE-Seq revealed that truncated guide RNAs exhibit substantially reduced RGN-induced off-target DSBs. Our experiments define the most rigorous framework for genome-wide identification of RGN off-target effects to date and provide a method for evaluating the safety of these nucleases prior to clinical use. PMID:25513782

  3. Genome-wide association as a means to understanding the mammary gland

    USDA-ARS's Scientific Manuscript database

    Next-generation sequencing and related technologies have facilitated the creation of enormous public databases that catalogue genomic variation. These databases have facilitated a variety of approaches to discover new genes that regulate normal biology as well as disease. Genome wide association (...

  4. A Genome-wide Combinatorial Strategy Dissects Complex Genetic Architecture of Seed Coat Color in Chickpea

    PubMed Central

    Bajaj, Deepak; Das, Shouvik; Upadhyaya, Hari D.; Ranjan, Rajeev; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C. L. Laxmipathi; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

    2015-01-01

    The study identified 9045 high-quality SNPs employing both genome-wide GBS- and candidate gene-based SNP genotyping assays in 172 accessions, including 93 cultivated (desi and kabuli) and 79 wild chickpea accessions. The GWAS in a structured population of 93 sequenced accessions detected 15 major genomic loci exhibiting significant association with seed coat color. Five seed color-associated major genomic loci underlying robust QTLs mapped on a high-density intra-specific genetic linkage map were validated by QTL mapping. The integration of association and QTL mapping with gene haplotype-specific LD mapping and transcript profiling identified novel allelic variants (non-synonymous SNPs) and haplotypes in a MATE secondary transporter gene regulating light/yellow brown and beige seed coat color differentiation in chickpea. The down-regulation and decreased transcript expression of the beige seed coat color-associated MATE gene haplotype were correlated with reduced proanthocyanidin accumulation in the mature seed coats of beige compared with light/yellow brown seed colored desi and kabuli accessions. This seed color-regulating MATE gene revealed strong purifying selection pressure primarily in LB/YB seed colored desi and wild Cicer reticulatum accessions compared with the BE seed colored kabuli accessions. The functionally relevant molecular tags identified have potential to decipher the complex transcriptional regulatory gene function of seed coat coloration and for understanding the selective sweep-based seed color trait evolutionary pattern in cultivated and wild accessions during chickpea domestication. The genome-wide integrated approach employed will expedite marker-assisted genetic enhancement for developing cultivars with desirable seed coat color types in chickpea. PMID:26635822

  5. A Genome-Wide Breast Cancer Scan in African Americans

    DTIC Science & Technology

    2011-06-01

    cancer in women of African ancestry. 13 References 1. Easton DF, P.K., Dunning AM, Pharoah PDP, Thompson D, Ballinger DG, et al . Genome...M, Hankinson, SE, et al . A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer...Millikan, R.C. Race, breast cancer subtypes, and survival in the Carolina Breast Cancer Study. Jama 295, 2492-502 ( 2006 ). 16 17. Huo, D., Ikpatt

  6. Genome-Wide Association Mapping for Intelligence in Military Working Dogs: Canine Cohort, Canine Intelligence Assessment Regimen, Genome-Wide Single Nucleotide Polymorphism (SNP) Typing, and Unsupervised Classification Algorithm for Genome-Wide Association Data Analysis

    DTIC Science & Technology

    2011-09-01

    Almasy, L, Blangero, J. (2009) Human QTL linkage mapping. Genetica 136:333-340. Amos, CI. (2007) Successful design and conduct of genome-wide...quantitative trait loci. Genetica 136:237-243. Skol AD, Scott LJ, Abecasis GR, Boehnke M. (2006) Joint analysis is more efficient than replication

  7. Genome-wide selection components analysis in a fish with male pregnancy.

    PubMed

    Flanagan, Sarah P; Jones, Adam G

    2017-04-01

    A major goal of evolutionary biology is to identify the genome-level targets of natural and sexual selection. With the advent of next-generation sequencing, whole-genome selection components analysis provides a promising avenue in the search for loci affected by selection in nature. Here, we implement a genome-wide selection components analysis in the sex role reversed Gulf pipefish, Syngnathus scovelli. Our approach involves a double-digest restriction-site associated DNA sequencing (ddRAD-seq) technique, applied to adult females, nonpregnant males, pregnant males, and their offspring. An FST comparison of allele frequencies among these groups reveals 47 genomic regions putatively experiencing sexual selection, as well as 468 regions showing a signature of differential viability selection between males and females. A complementary likelihood ratio test identifies similar patterns in the data as the FST analysis. Sexual selection and viability selection both tend to favor the rare alleles in the population. Ultimately, we conclude that genome-wide selection components analysis can be a useful tool to complement other approaches in the effort to pinpoint genome-level targets of selection in the wild. © 2017 The Author(s). Evolution © 2017 The Society for the Study of Evolution.
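
    An FST comparison of allele frequencies between two groups (for example, adult females versus pregnant males) can be sketched for a single biallelic locus. The code below uses Wright's heterozygosity-based definition, FST = (HT - HS) / HT, which is an assumption: the abstract does not say which FST estimator the authors used, and the function name and sample-size weighting are hypothetical.

```python
def fst_two_groups(p1: float, p2: float, n1: int = 1, n2: int = 1) -> float:
    """Wright's FST at one biallelic locus for two groups.

    p1, p2: frequency of the same allele in each group.
    n1, n2: sample sizes used to weight the groups (equal weights by default).
    """
    w1 = n1 / (n1 + n2)
    w2 = 1.0 - w1
    p_bar = w1 * p1 + w2 * p2
    # Expected heterozygosity in the pooled sample.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Weighted mean expected heterozygosity within groups.
    h_within = w1 * 2.0 * p1 * (1.0 - p1) + w2 * 2.0 * p2 * (1.0 - p2)
    if h_total == 0.0:
        return 0.0  # locus is monomorphic overall, so there is no differentiation
    return (h_total - h_within) / h_total
```

    Loci whose FST between two life-history stages is unusually high relative to the genome-wide distribution are the kind of outliers a selection components analysis would flag.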

  8. Genome-wide expression profiling in pediatric septic shock

    PubMed Central

    Wong, Hector R.

    2013-01-01

    For nearly a decade, our research group has had the privilege of developing and mining a multi-center, microarray-based, genome-wide expression database of critically ill children (≤ 10 years of age) with septic shock. Using bioinformatic and systems biology approaches, the expression data generated through this discovery-oriented, exploratory approach have been leveraged for a variety of objectives, which will be reviewed. Fundamental observations include widespread repression of gene programs corresponding to the adaptive immune system, and biologically significant differential patterns of gene expression across developmental age groups. The data have also identified gene expression-based subclasses of pediatric septic shock having clinically relevant phenotypic differences. The data have also been leveraged for the discovery of novel therapeutic targets, and for the discovery and development of novel stratification and diagnostic biomarkers. Almost a decade of genome-wide expression profiling in pediatric septic shock is now demonstrating tangible results. The studies have progressed from an initial discovery-oriented and exploratory phase, to a new phase where the data are being translated and applied to address several areas of clinical need. PMID:23329198

  9. A Genome-Wide Association Study for Regulators of Micronucleus Formation in Mice.

    PubMed

    McIntyre, Rebecca E; Nicod, Jérôme; Robles-Espinoza, Carla Daniela; Maciejowski, John; Cai, Na; Hill, Jennifer; Verstraten, Ruth; Iyer, Vivek; Rust, Alistair G; Balmus, Gabriel; Mott, Richard; Flint, Jonathan; Adams, David J

    2016-08-09

    In mammals the regulation of genomic instability plays a key role in tumor suppression and also controls genome plasticity, which is important for recombination during the processes of immunity and meiosis. Most studies to identify regulators of genomic instability have been performed in cells in culture or in systems that report on gross rearrangements of the genome, yet subtle differences in the level of genomic instability can contribute to whole organism phenotypes such as tumor predisposition. Here we performed a genome-wide association study in a population of 1379 outbred Crl:CFW(SW)-US_P08 mice to dissect the genetic landscape of micronucleus formation, a biomarker of chromosomal breaks, whole chromosome loss, and extranuclear DNA. Variation in micronucleus levels is a complex trait with a genome-wide heritability of 53.1%. We identify seven loci influencing micronucleus formation (false discovery rate <5%), and define candidate genes at each locus. Intriguingly at several loci we find evidence for sexual dimorphism in micronucleus formation, with a locus on chromosome 11 being specific to males. Copyright © 2016 McIntyre et al.

  10. Genome-Wide Association Study for Identification and Validation of Novel SNP Markers for Sr6 Stem Rust Resistance Gene in Bread Wheat.

    PubMed

    Mourad, Amira M I; Sallam, Ahmed; Belamkar, Vikas; Wegulo, Stephen; Bowden, Robert; Jin, Yue; Mahdy, Ezzat; Bakheit, Bahy; El-Wafaa, Atif A; Poland, Jesse; Baenziger, Peter S

    2018-01-01

    Stem rust (caused by Puccinia graminis f. sp. tritici Erikss. & E. Henn.) is a major disease in wheat (Triticum aestivum L.). However, in recent years it occurs rarely in Nebraska due to weather and the effective selection and gene pyramiding of resistance genes. To understand the genetic basis of stem rust resistance in Nebraska winter wheat, we applied genome-wide association study (GWAS) on a set of 270 winter wheat genotypes (A-set). Genotyping was carried out using genotyping-by-sequencing and ∼35,000 high-quality SNPs were identified. The tested genotypes were evaluated for their resistance to the common stem rust race in Nebraska (QFCSC) in two replications. Marker-trait association identified 32 SNP markers, which were significantly (Bonferroni corrected P < 0.05) associated with the resistance on chromosome 2D. The chromosomal location of the significant SNPs (chromosome 2D) matched the location of Sr6 gene which was expected in these genotypes based on pedigree information. A highly significant linkage disequilibrium (LD, r^2) was found between the significant SNPs and the specific SSR marker for the Sr6 gene (Xcfd43). This suggests the significant SNP markers are tagging Sr6 gene. Out of the 32 significant SNPs, eight SNPs were in six genes that are annotated as being linked to disease resistance in the IWGSC RefSeq v1.0. The 32 significant SNP markers were located in nine haplotype blocks. All the 32 significant SNPs were validated in a set of 60 different genotypes (V-set) using single marker analysis. SNP markers identified in this study can be used in marker-assisted selection, genomic selection, and to develop KASP (Kompetitive Allele Specific PCR) marker for the Sr6 gene. Novel SNPs for Sr6 gene, an important stem rust resistance gene, were identified and validated in this study. These SNPs can be used to improve stem rust resistance in wheat.
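
    The LD statistic r^2 mentioned in this abstract has a simple closed form for two biallelic loci when haplotype frequencies are known. The sketch below assumes phased haplotype frequencies are available, which the abstract does not state (genotype-based estimators are also common); the function name ld_r2 is hypothetical.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation (r^2) between alleles at two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a:  frequency of allele A at the first locus.
    p_b:  frequency of allele B at the second locus.
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denominator = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denominator == 0.0:
        raise ValueError("r^2 is undefined when either locus is monomorphic")
    return (d * d) / denominator
```

    Values near 1 mean a SNP is an almost perfect proxy for the other marker, which is the sense in which a significant SNP can be said to tag a gene such as Sr6.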

  11. StereoGene: rapid estimation of genome-wide correlation of continuous or interval feature data.

    PubMed

    Stavrovskaya, Elena D; Niranjan, Tejasvi; Fertig, Elana J; Wheelan, Sarah J; Favorov, Alexander V; Mironov, Andrey A

    2017-10-15

    Genomic features with similar genome-wide distributions are generally hypothesized to be functionally related; for example, colocalization of histones and transcription start sites indicates chromatin regulation of transcription factor activity. Therefore, statistical algorithms to perform spatial, genome-wide correlation among genomic features are required. Here, we propose a method, StereoGene, that rapidly estimates genome-wide correlation among pairs of genomic features. These features may represent high-throughput data mapped to reference genome or sets of genomic annotations in that reference genome. StereoGene enables correlation of continuous data directly, avoiding the data binarization and subsequent data loss. Correlations are computed among neighboring genomic positions using kernel correlation. Representing the correlation as a function of the genome position, StereoGene outputs the local correlation track as part of the analysis. StereoGene also accounts for confounders such as input DNA by partial correlation. We apply our method to numerous comparisons of ChIP-Seq datasets from the Human Epigenome Atlas and FANTOM CAGE to demonstrate its wide applicability. We observe the changes in the correlation between epigenomic features across developmental trajectories of several tissue types consistent with known biology and find a novel spatial correlation of CAGE clusters with donor splice sites and with poly(A) sites. These analyses provide examples for the broad applicability of StereoGene for regulatory genomics. The StereoGene C++ source code, program documentation, Galaxy integration scripts and examples are available from the project homepage http://stereogene.bioinf.fbb.msu.ru/. Contact: favorov@sensi.org. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  12. Genome-wide mapping of mutations at single-nucleotide resolution for protein, metabolic and genome engineering.

    PubMed

    Garst, Andrew D; Bassalo, Marcelo C; Pines, Gur; Lynch, Sean A; Halweg-Edwards, Andrea L; Liu, Rongming; Liang, Liya; Wang, Zhiwen; Zeitoun, Ramsey; Alexander, William G; Gill, Ryan T

    2017-01-01

    Improvements in DNA synthesis and sequencing have underpinned comprehensive assessment of gene function in bacteria and eukaryotes. Genome-wide analyses require high-throughput methods to generate mutations and analyze their phenotypes, but approaches to date have been unable to efficiently link the effects of mutations in coding regions or promoter elements in a highly parallel fashion. We report that CRISPR-Cas9 gene editing in combination with massively parallel oligomer synthesis can enable trackable editing on a genome-wide scale. Our method, CRISPR-enabled trackable genome engineering (CREATE), links each guide RNA to homologous repair cassettes that both edit loci and function as barcodes to track genotype-phenotype relationships. We apply CREATE to site saturation mutagenesis for protein engineering, reconstruction of adaptive laboratory evolution experiments, and identification of stress tolerance and antibiotic resistance genes in bacteria. We provide preliminary evidence that CREATE will work in yeast. We also provide a webtool to design multiplex CREATE libraries.

  13. Genome-Wide Landscapes of Human Local Adaptation in Asia

    PubMed Central

    Lu, Dongsheng; Xu, Shuhua

    2013-01-01

    Genetic studies of human local adaptation have been facilitated greatly by recent advances in high-throughput genotyping and sequencing technologies. However, few studies have investigated local adaptation in Asian populations on a genome-wide scale and with a high geographic resolution. In this study, taking advantage of the dense population coverage in Southeast Asia, which is the part of the world least studied in terms of natural selection, we depicted genome-wide landscapes of local adaptation in 63 Asian populations representing the majority of linguistic and ethnic groups in Asia. Using genome-wide data analysis, we discovered many genes showing signs of local adaptation or natural selection. Notable examples, such as FOXQ1, MAST2, and CDH4, were found to play a role in hair follicle development and human cancer, signal transduction, and tumor repression, respectively. These showed strong indications of natural selection in Philippine Negritos, a group of aboriginal hunter-gatherers living in the Philippines. MTTP, which has associations with metabolic syndrome, body mass index, and insulin regulation, showed a strong signature of selection in Southeast Asians, including Indonesians. Functional annotation analysis revealed that genes and genetic variants underlying natural selection were generally enriched in the functional category of alternative splicing. Specifically, many genes showing significant differences in allele frequency between northern and southern Asian populations were found to be associated with human height and growth and various immune pathways. In summary, this study contributes to the overall understanding of human local adaptation in Asia and has identified both known and novel signatures of natural selection in the human genome. PMID:23349834

  14. Genome-wide screen of ovary-specific DNA methylation in polycystic ovary syndrome.

    PubMed

    Yu, Ying-Ying; Sun, Cui-Xiang; Liu, Yin-Kun; Li, Yan; Wang, Li; Zhang, Wei

    2015-07-01

    Objective: To compare genome-wide DNA methylation profiles in ovary tissue from women with polycystic ovary syndrome (PCOS) and healthy controls. Design: Case-control study matched for age and body mass index. Setting: University-affiliated hospital. Patient(s): Ten women with PCOS who underwent ovarian drilling to induce ovulation and 10 healthy women who were undergoing laparoscopic sterilization, hysterectomy for benign conditions, diagnostic laparoscopy for pelvic pain, or oophorectomy for nonovarian indications. Intervention(s): None. Main Outcome Measure(s): Genome-wide DNA methylation patterns determined by immunoprecipitation and microarray (MeDIP-chip) analysis. Result(s): The methylation levels were statistically significantly higher in CpG island shores (CGI shores), which lie outside of core promoter regions, and lower within gene bodies in women with PCOS relative to the controls. In addition, high CpG content promoters were the most frequently hypermethylated promoters in PCOS ovaries but were more often hypomethylated in controls. Second, 872 CGIs, specifically methylated in PCOS, represented 342 genes that could be associated with various molecular functions, including protein binding, hormone activity, and transcription regulator activity. Finally, methylation differences were validated in seven genes by methylation-specific polymerase chain reaction. These genes correlated to several functional families related to the pathogenesis of PCOS and may be potential biomarkers for this disease. Conclusion(s): Our results demonstrated that epigenetic modification differs between PCOS and normal ovaries, which may help to further understand the pathophysiology of this disease. Copyright © 2015 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.

  15. Genome-Wide siRNA-Based Functional Genomics of Pigmentation Identifies Novel Genes and Pathways That Impact Melanogenesis in Human Cells

    PubMed Central

    Bodemann, Brian; Petersen, Sean; Aruri, Jayavani; Koshy, Shiney; Richardson, Zachary; Le, Lu Q.; Krasieva, Tatiana; Roth, Michael G.; Farmer, Pat; White, Michael A.

    2008-01-01

    Melanin protects the skin and eyes from the harmful effects of UV irradiation, protects neural cells from toxic insults, and is required for sound conduction in the inner ear. Aberrant regulation of melanogenesis underlies skin disorders (melasma and vitiligo), neurologic disorders (Parkinson's disease), auditory disorders (Waardenburg's syndrome), and ophthalmologic disorders (age-related macular degeneration). Much of the core synthetic machinery driving melanin production has been identified; however, the spectrum of gene products participating in melanogenesis in different physiological niches is poorly understood. Functional genomics based on RNA-mediated interference (RNAi) provides the opportunity to derive unbiased comprehensive collections of pharmaceutically tractable single gene targets supporting melanin production. In this study, we have combined a high-throughput, cell-based, one-well/one-gene screening platform with a genome-wide arrayed synthetic library of chemically synthesized, small interfering RNAs to identify novel biological pathways that govern melanin biogenesis in human melanocytes. Ninety-two novel genes that support pigment production were identified with a low false discovery rate. Secondary validation and preliminary mechanistic studies identified a large panel of targets that converge on tyrosinase expression and stability. Small molecule inhibition of a family of gene products in this class was sufficient to impair chronic tyrosinase expression in pigmented melanoma cells and UV-induced tyrosinase expression in primary melanocytes. Isolation of molecular machinery known to support autophagosome biosynthesis from this screen, together with in vitro and in vivo validation, exposed a close functional relationship between melanogenesis and autophagy. In summary, these studies illustrate the power of RNAi-based functional genomics to identify novel genes, pathways, and pharmacologic agents that impact a biological phenotype and operate

  16. Detecting DNA double-stranded breaks in mammalian genomes by linear amplification-mediated high-throughput genome-wide translocation sequencing.

    PubMed

    Hu, Jiazhi; Meyers, Robin M; Dong, Junchao; Panchakshari, Rohit A; Alt, Frederick W; Frock, Richard L

    2016-05-01

    Unbiased, high-throughput assays for detecting and quantifying DNA double-stranded breaks (DSBs) across the genome in mammalian cells will facilitate basic studies of the mechanisms that generate and repair endogenous DSBs. They will also enable more applied studies, such as those to evaluate the on- and off-target activities of engineered nucleases. Here we describe a linear amplification-mediated high-throughput genome-wide translocation sequencing (LAM-HTGTS) method for the detection of genome-wide 'prey' DSBs via their translocation in cultured mammalian cells to a fixed 'bait' DSB. Bait-prey junctions are cloned directly from isolated genomic DNA using LAM-PCR and unidirectionally ligated to bridge adapters; subsequent PCR steps amplify the single-stranded DNA junction library in preparation for Illumina MiSeq paired-end sequencing. A custom bioinformatics pipeline identifies prey sequences that contribute to junctions and maps them across the genome. LAM-HTGTS differs from related approaches because it detects a wide range of broken end structures with nucleotide-level resolution. Familiarity with nucleic acid methods and next-generation sequencing analysis is necessary for library generation and data interpretation. LAM-HTGTS assays are sensitive, reproducible, relatively inexpensive, scalable and straightforward to implement with a turnaround time of <1 week.

  17. Genome-wide association study of Parkinson's disease in East Asians.

    PubMed

    Foo, Jia Nee; Tan, Louis C; Irwan, Ishak D; Au, Wing-Lok; Low, Hui Qi; Prakash, Kumar-M; Ahmad-Annuar, Azlina; Bei, Jinxin; Chan, Anne Yy; Chen, Chiung Mei; Chen, Yi-Chun; Chung, Sun Ju; Deng, Hao; Lim, Shen-Yang; Mok, Vincent; Pang, Hao; Pei, Zhong; Peng, Rong; Shang, Hui-Fang; Song, Kyuyoung; Tan, Ai Huey; Wu, Yih-Ru; Aung, Tin; Cheng, Ching-Yu; Chew, Fook Tim; Chew, Soo-Hong; Chong, Siow-Ann; Ebstein, Richard P; Lee, Jimmy; Saw, Seang-Mei; Seow, Adeline; Subramaniam, Mythily; Tai, E-Shyong; Vithana, Eranga N; Wong, Tien-Yin; Heng, Khai Koon; Meah, Wee-Yang; Khor, Chiea Chuen; Liu, Hong; Zhang, Furen; Liu, Jianjun; Tan, Eng-King

    2017-01-01

    Genome-wide association studies (GWAS) on Parkinson's disease (PD) have mostly been done in Europeans and Japanese. No study has been done in Han Chinese, who make up nearly a fifth of the world population. We conducted the first Han Chinese GWAS analysing a total of 22,729 subjects (5,125 PD cases and 17,604 controls) from Singapore, Hong Kong, Malaysia, Korea, mainland China and Taiwan. We performed imputation, merging and logistic regression analyses of 2,402,394 SNPs passing quality control filters in 779 PD cases and 13,227 controls, adjusted for the first three principal components. 90 SNPs with association P < 10⁻⁴ were validated in 9 additional sample collections and the results were combined using fixed-effects inverse-variance meta-analysis. We observed strong associations reaching genome-wide significance at SNCA, LRRK2 and MCCC1, confirming their important roles in both European and Asian PD. We also identified significant (P < 0.05) associations at 5 loci (DLG2, SIPA1L2, STK39, VPS13C and RIT2), and observed the same direction of associations at 9 other loci including BST1 and PARK16. Allelic heterogeneity was observed at LRRK2 while European risk SNPs at 6 other loci including MAPT and GBA-SYT11 were non-polymorphic or very rare in our cohort. Overall, we replicate associations at SNCA, LRRK2, MCCC1 and 14 other European PD loci but did not identify Asian-specific loci with large effects (OR > 1.45) on PD risk. Our results also demonstrate some differences in the genetic contribution to PD between Europeans and Asians. Further pan-ethnic meta-analysis with European GWAS cohorts may unravel new PD loci. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
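
    The abstract above names fixed-effects inverse-variance meta-analysis as the method used to combine results across sample collections but gives no formula. The following is a minimal sketch of the standard textbook form of that technique, not the authors' pipeline; the function name `fixed_effects_meta` and its inputs (per-study effect estimates such as log odds ratios, with their standard errors) are assumptions made for illustration.

```python
import math


def fixed_effects_meta(betas, ses):
    """Pool per-study effect estimates with inverse-variance weights.

    betas: per-study effect estimates (e.g. log odds ratios); hypothetical input.
    ses:   matching standard errors, all strictly positive.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equal in length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p
```

    Studies with smaller standard errors dominate the pooled estimate, which is why this approach assumes a single shared effect across collections; when that assumption is doubtful, a random-effects model is the usual alternative.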

  18. Genome-wide introgression among distantly related Heliconius butterfly species.

    PubMed

    Zhang, Wei; Dasmahapatra, Kanchon K; Mallet, James; Moreira, Gilson R P; Kronforst, Marcus R

    2016-02-27

    Although hybridization is thought to be relatively rare in animals, the raw genetic material introduced via introgression may play an important role in fueling adaptation and adaptive radiation. The butterfly genus Heliconius is an excellent system to study hybridization and introgression but most studies have focused on closely related species such as H. cydno and H. melpomene. Here we characterize genome-wide patterns of introgression between H. besckei, the only species with a red and yellow banded 'postman' wing pattern in the tiger-striped silvaniform clade, and co-mimetic H. melpomene nanna. We find a pronounced signature of putative introgression from H. melpomene into H. besckei in the genomic region upstream of the gene optix, known to control red wing patterning, suggesting adaptive introgression of wing pattern mimicry between these two distantly related species. At least 39 additional genomic regions show signals of introgression as strong or stronger than this mimicry locus. Gene flow has been on-going, with evidence of gene exchange at multiple time points, and bidirectional, moving from the melpomene to the silvaniform clade and vice versa. The history of gene exchange has also been complex, with contributions from multiple silvaniform species in addition to H. besckei. We also detect a signature of ancient introgression of the entire Z chromosome between the silvaniform and melpomene/cydno clades. Our study provides a genome-wide portrait of introgression between distantly related butterfly species. We further propose a comprehensive and efficient workflow for gene flow identification in genomic data sets.

  19. Genome-wide association analysis identifies a meningioma risk locus at 11p15.5.

    PubMed

    Claus, Elizabeth B; Cornish, Alex J; Broderick, Peter; Schildkraut, Joellen M; Dobbins, Sara E; Holroyd, Amy; Calvocoressi, Lisa; Lu, Lingeng; Hansen, Helen M; Smirnov, Ivan; Walsh, Kyle M; Schramm, Johannes; Hoffmann, Per; Nöthen, Markus M; Jöckel, Karl-Heinz; Swerdlow, Anthony; Larsen, Signe Benzon; Johansen, Christoffer; Simon, Matthias; Bondy, Melissa; Wrensch, Margaret; Houlston, Richard; Wiemels, Joseph L

    2018-05-12

    Meningiomas are adult brain tumors originating in the meningeal coverings of the brain and spinal cord, with a significant heritable basis. Genome-wide association studies (GWAS) have previously identified only a single risk locus for meningioma, at 10p12.31. To identify a susceptibility locus for meningioma, we conducted a meta-analysis of two GWAS, imputed using a merged reference panel of 1000 Genomes and UK10K data, with validation in two independent sample series totaling 2,138 cases and 12,081 controls. We identified a new susceptibility locus for meningioma at 11p15.5 (rs2686876, odds ratio = 1.44, P = 9.86 × 10⁻⁹). A number of genes localize to the region of linkage disequilibrium encompassing rs2686876, including RIC8A, which plays a central role in the development of neural crest-derived structures, such as the meninges. This finding advances our understanding of the genetic basis of meningioma development and provides additional support for a polygenic model of meningioma.

  20. Genome-wide Association Analysis of Kernel Weight in Hard Winter Wheat

    USDA-ARS's Scientific Manuscript database

    Wheat kernel weight is an important and heritable component of wheat grain yield and a key predictor of flour extraction. Genome-wide association analysis was conducted to identify genomic regions associated with kernel weight and kernel weight environmental response in 8 trials of 299 hard winter ...

  1. Meta-Analysis in Genome-Wide Association Datasets: Strategies and Application in Parkinson Disease

    PubMed Central

    Evangelou, Evangelos; Maraganore, Demetrius M.; Ioannidis, John P.A.

    2007-01-01

    Background: Genome-wide association studies hold substantial promise for identifying common genetic variants that regulate susceptibility to complex diseases. However, for the detection of small genetic effects, single studies may be underpowered. Power may be improved by combining genome-wide datasets with meta-analytic techniques. Methodology/Principal Findings: Both single and two-stage genome-wide data may be combined and there are several possible strategies. In the two-stage framework, we considered the options of (1) enhancement of replication data and (2) enhancement of first-stage data, and then, we also considered (3) joint meta-analyses including all first-stage and second-stage data. These strategies were examined empirically using data from two genome-wide association studies (three datasets) on Parkinson disease. In the three strategies, we derived 12, 5, and 49 single nucleotide polymorphisms that show significant associations at conventional levels of statistical significance. None of these remained significant after conservative adjustment for the number of performed analyses in each strategy. However, some may warrant further consideration: 6 SNPs were identified with at least 2 of the 3 strategies and 3 SNPs [rs1000291 on chromosome 3, rs2241743 on chromosome 4 and rs3018626 on chromosome 11] were identified with all 3 strategies and had no or minimal between-dataset heterogeneity (I² = 0, 0 and 15%, respectively). Analyses were primarily limited by the suboptimal overlap of tested polymorphisms across different datasets (e.g., only 31,192 shared polymorphisms between the two tier 1 datasets). Conclusions/Significance: Meta-analysis may be used to improve the power and examine the between-dataset heterogeneity of genome-wide association studies. Prospective designs may be most efficient, if they try to maximize the overlap of genotyping platforms and anticipate the combination of data across many genome-wide association studies. PMID:17332845
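
    The abstract above reports between-dataset heterogeneity as an I² percentage without defining it. As a hedged illustration, the sketch below computes I² from Cochran's Q using the commonly cited definition I² = max(0, (Q − df) / Q) × 100; the function name `i_squared` and its inputs are hypothetical, and nothing here is taken from the authors' own code.

```python
def i_squared(betas, ses):
    """Return the I-squared heterogeneity statistic as a percentage.

    betas: per-dataset effect estimates; hypothetical input.
    ses:   matching standard errors, all strictly positive.
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equal in length")
    weights = [1.0 / (se ** 2) for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    if df <= 0 or q <= 0.0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100.0
```

    A value near 0% suggests the datasets agree about as well as sampling error alone would predict, while larger values indicate that more of the observed variation reflects real differences between datasets.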

  2. A genome-wide association study of chronic obstructive pulmonary disease in Hispanics.

    PubMed

    Chen, Wei; Brehm, John M; Manichaikul, Ani; Cho, Michael H; Boutaoui, Nadia; Yan, Qi; Burkart, Kristin M; Enright, Paul L; Rotter, Jerome I; Petersen, Hans; Leng, Shuguang; Obeidat, Ma'en; Bossé, Yohan; Brandsma, Corry-Anke; Hao, Ke; Rich, Stephen S; Powell, Rhea; Avila, Lydiana; Soto-Quiros, Manuel; Silverman, Edwin K; Tesfaigzi, Yohannes; Barr, R Graham; Celedón, Juan C

    2015-03-01

    Genome-wide association studies (GWAS) of chronic obstructive pulmonary disease (COPD) have identified disease-susceptibility loci, mostly in subjects of European descent. We hypothesized that by studying Hispanic populations we would be able to identify unique loci that contribute to COPD pathogenesis in Hispanics but remain undetected in GWAS of non-Hispanic populations. We conducted a meta-analysis of two GWAS of COPD in independent cohorts of Hispanics in Costa Rica and the United States (Multi-Ethnic Study of Atherosclerosis [MESA]). We performed a replication study of the top single-nucleotide polymorphisms in an independent Hispanic cohort in New Mexico (the Lovelace Smokers Cohort). We also attempted to replicate prior findings from genome-wide studies in non-Hispanic populations in Hispanic cohorts. We found no genome-wide significant association with COPD in our meta-analysis of Costa Rica and MESA. After combining the top results from this meta-analysis with those from our replication study in the Lovelace Smokers Cohort, we identified two single-nucleotide polymorphisms approaching genome-wide significance for an association with COPD. The first (rs858249, combined P value = 6.1 × 10⁻⁸) is near the genes KLHL7 and NUPL2 on chromosome 7. The second (rs286499, combined P value = 8.4 × 10⁻⁸) is located in an intron of DLG2. The two most significant single-nucleotide polymorphisms in FAM13A from a previous genome-wide study in non-Hispanics were associated with COPD in Hispanics. We have identified two novel loci (in or near the genes KLHL7/NUPL2 and DLG2) that may play a role in COPD pathogenesis in Hispanic populations.

  3. Pharmacogenetic meta-analysis of genome-wide association studies of LDL cholesterol response to statins

    PubMed Central

    Postmus, Iris; Trompet, Stella; Deshmukh, Harshal A.; Barnes, Michael R.; Li, Xiaohui; Warren, Helen R.; Chasman, Daniel I.; Zhou, Kaixin; Arsenault, Benoit J.; Donnelly, Louise A.; Wiggins, Kerri L.; Avery, Christy L.; Griffin, Paula; Feng, QiPing; Taylor, Kent D.; Li, Guo; Evans, Daniel S.; Smith, Albert V.; de Keyser, Catherine E.; Johnson, Andrew D.; de Craen, Anton J. M.; Stott, David J.; Buckley, Brendan M.; Ford, Ian; Westendorp, Rudi G. J.; Eline Slagboom, P.; Sattar, Naveed; Munroe, Patricia B.; Sever, Peter; Poulter, Neil; Stanton, Alice; Shields, Denis C.; O’Brien, Eoin; Shaw-Hawkins, Sue; Ida Chen, Y.-D.; Nickerson, Deborah A.; Smith, Joshua D.; Pierre Dubé, Marie; Matthijs Boekholdt, S.; Kees Hovingh, G.; Kastelein, John J. P.; McKeigue, Paul M.; Betteridge, John; Neil, Andrew; Durrington, Paul N.; Doney, Alex; Carr, Fiona; Morris, Andrew; McCarthy, Mark I.; Groop, Leif; Ahlqvist, Emma; Bis, Joshua C.; Rice, Kenneth; Smith, Nicholas L.; Lumley, Thomas; Whitsel, Eric A.; Stürmer, Til; Boerwinkle, Eric; Ngwa, Julius S.; O’Donnell, Christopher J.; Vasan, Ramachandran S.; Wei, Wei-Qi; Wilke, Russell A.; Liu, Ching-Ti; Sun, Fangui; Guo, Xiuqing; Heckbert, Susan R; Post, Wendy; Sotoodehnia, Nona; Arnold, Alice M.; Stafford, Jeanette M.; Ding, Jingzhong; Herrington, David M.; Kritchevsky, Stephen B.; Eiriksdottir, Gudny; Launer, Leonore J.; Harris, Tamara B.; Chu, Audrey Y.; Giulianini, Franco; MacFadyen, Jean G.; Barratt, Bryan J.; Nyberg, Fredrik; Stricker, Bruno H.; Uitterlinden, André G.; Hofman, Albert; Rivadeneira, Fernando; Emilsson, Valur; Franco, Oscar H.; Ridker, Paul M.; Gudnason, Vilmundur; Liu, Yongmei; Denny, Joshua C.; Ballantyne, Christie M.; Rotter, Jerome I.; Adrienne Cupples, L.; Psaty, Bruce M.; Palmer, Colin N. A.; Tardif, Jean-Claude; Colhoun, Helen M.; Hitman, Graham; Krauss, Ronald M.; Wouter Jukema, J; Caulfield, Mark J.; Donnelly, Peter; Barroso, Ines; Blackwell, Jenefer M.; Bramon, Elvira; Brown, Matthew A.; Casas, Juan P.; Corvin, Aiden; Deloukas, Panos; Duncanson, Audrey; Jankowski, Janusz; Markus, Hugh S.; Mathew, Christopher G.; Palmer, Colin N. A.; Plomin, Robert; Rautanen, Anna; Sawcer, Stephen J.; Trembath, Richard C.; Viswanathan, Ananth C.; Wood, Nicholas W.; Spencer, Chris C. A.; Band, Gavin; Bellenguez, Céline; Freeman, Colin; Hellenthal, Garrett; Giannoulatou, Eleni; Pirinen, Matti; Pearson, Richard; Strange, Amy; Su, Zhan; Vukcevic, Damjan; Donnelly, Peter; Langford, Cordelia; Hunt, Sarah E.; Edkins, Sarah; Gwilliam, Rhian; Blackburn, Hannah; Bumpstead, Suzannah J.; Dronov, Serge; Gillman, Matthew; Gray, Emma; Hammond, Naomi; Jayakumar, Alagurevathi; McCann, Owen T.; Liddle, Jennifer; Potter, Simon C.; Ravindrarajah, Radhi; Ricketts, Michelle; Waller, Matthew; Weston, Paul; Widaa, Sara; Whittaker, Pamela; Barroso, Ines; Deloukas, Panos; Mathew, Christopher G.; Blackwell, Jenefer M.; Brown, Matthew A.; Corvin, Aiden; McCarthy, Mark I.; Spencer, Chris C. A.

    2014-01-01

    Statins effectively lower LDL cholesterol levels in large studies and the observed interindividual response variability may be partially explained by genetic variation. Here we perform a pharmacogenetic meta-analysis of genome-wide association studies (GWAS) in studies addressing the LDL cholesterol response to statins, including up to 18,596 statin-treated subjects. We validate the most promising signals in a further 22,318 statin recipients and identify two loci, SORT1/CELSR2/PSRC1 and SLCO1B1, not previously identified in GWAS. Moreover, we confirm the previously described associations with APOE and LPA. Our findings advance the understanding of the pharmacogenetic architecture of statin response. PMID:25350695

  4. Development of a panel of genome-wide ancestry informative markers to study admixture throughout the Americas.

    PubMed

    Galanter, Joshua Mark; Fernandez-Lopez, Juan Carlos; Gignoux, Christopher R; Barnholtz-Sloan, Jill; Fernandez-Rozadilla, Ceres; Via, Marc; Hidalgo-Miranda, Alfredo; Contreras, Alejandra V; Figueroa, Laura Uribe; Raska, Paola; Jimenez-Sanchez, Gerardo; Zolezzi, Irma Silva; Torres, Maria; Ponte, Clara Ruiz; Ruiz, Yarimar; Salas, Antonio; Nguyen, Elizabeth; Eng, Celeste; Borjas, Lisbeth; Zabala, William; Barreto, Guillermo; González, Fernando Rondón; Ibarra, Adriana; Taboada, Patricia; Porras, Liliana; Moreno, Fabián; Bigham, Abigail; Gutierrez, Gerardo; Brutsaert, Tom; León-Velarde, Fabiola; Moore, Lorna G; Vargas, Enrique; Cruz, Miguel; Escobedo, Jorge; Rodriguez-Santana, José; Rodriguez-Cintrón, William; Chapela, Rocio; Ford, Jean G; Bustamante, Carlos; Seminara, Daniela; Shriver, Mark; Ziv, Elad; Burchard, Esteban Gonzalez; Haile, Robert; Parra, Esteban; Carracedo, Angel

    2012-01-01

    Most individuals throughout the Americas are admixed descendants of Native American, European, and African ancestors. Complex historical factors have resulted in varying proportions of ancestral contributions between individuals within and among ethnic groups. We developed a panel of 446 ancestry informative markers (AIMs) optimized to estimate ancestral proportions in individuals and populations throughout Latin America. We used genome-wide data from 953 individuals from diverse African, European, and Native American populations to select AIMs optimized for each of the three main continental populations that form the basis of modern Latin American populations. We selected markers on the basis of locus-specific branch length to be informative, well distributed throughout the genome, capable of being genotyped on widely available commercial platforms, and applicable throughout the Americas by minimizing within-continent heterogeneity. We then validated the panel in samples from four admixed populations by comparing ancestry estimates based on the AIMs panel to estimates based on genome-wide association study (GWAS) data. The panel provided balanced discriminatory power among the three ancestral populations and accurate estimates of individual ancestry proportions (R² > 0.9 for ancestral components with significant between-subject variance). Finally, we genotyped samples from 18 populations from Latin America using the AIMs panel and estimated variability in ancestry within and between these populations. This panel and its reference genotype information will be useful resources to explore population history of admixture in Latin America and to correct for the potential effects of population stratification in admixed samples in the region.

  5. Development of a Panel of Genome-Wide Ancestry Informative Markers to Study Admixture Throughout the Americas

    PubMed Central

    Galanter, Joshua Mark; Fernandez-Lopez, Juan Carlos; Gignoux, Christopher R.; Barnholtz-Sloan, Jill; Fernandez-Rozadilla, Ceres; Via, Marc; Hidalgo-Miranda, Alfredo; Contreras, Alejandra V.; Figueroa, Laura Uribe; Raska, Paola; Jimenez-Sanchez, Gerardo; Silva Zolezzi, Irma; Torres, Maria; Ponte, Clara Ruiz; Ruiz, Yarimar; Salas, Antonio; Nguyen, Elizabeth; Eng, Celeste; Borjas, Lisbeth; Zabala, William; Barreto, Guillermo; Rondón González, Fernando; Ibarra, Adriana; Taboada, Patricia; Porras, Liliana; Moreno, Fabián; Bigham, Abigail; Gutierrez, Gerardo; Brutsaert, Tom; León-Velarde, Fabiola; Moore, Lorna G.; Vargas, Enrique; Cruz, Miguel; Escobedo, Jorge; Rodriguez-Santana, José; Rodriguez-Cintrón, William; Chapela, Rocio; Ford, Jean G.; Bustamante, Carlos; Seminara, Daniela; Shriver, Mark; Ziv, Elad; Gonzalez Burchard, Esteban; Haile, Robert

    2012-01-01

    Most individuals throughout the Americas are admixed descendants of Native American, European, and African ancestors. Complex historical factors have resulted in varying proportions of ancestral contributions between individuals within and among ethnic groups. We developed a panel of 446 ancestry informative markers (AIMs) optimized to estimate ancestral proportions in individuals and populations throughout Latin America. We used genome-wide data from 953 individuals from diverse African, European, and Native American populations to select AIMs optimized for each of the three main continental populations that form the basis of modern Latin American populations. We selected markers on the basis of locus-specific branch length to be informative, well distributed throughout the genome, capable of being genotyped on widely available commercial platforms, and applicable throughout the Americas by minimizing within-continent heterogeneity. We then validated the panel in samples from four admixed populations by comparing ancestry estimates based on the AIMs panel to estimates based on genome-wide association study (GWAS) data. The panel provided balanced discriminatory power among the three ancestral populations and accurate estimates of individual ancestry proportions (R² > 0.9 for ancestral components with significant between-subject variance). Finally, we genotyped samples from 18 populations from Latin America using the AIMs panel and estimated variability in ancestry within and between these populations. This panel and its reference genotype information will be useful resources to explore population history of admixture in Latin America and to correct for the potential effects of population stratification in admixed samples in the region. PMID:22412386

  6. Efficient Genome-Wide Sequencing and Low-Coverage Pedigree Analysis from Noninvasively Collected Samples

    PubMed Central

    Snyder-Mackler, Noah; Majoros, William H.; Yuan, Michael L.; Shaver, Amanda O.; Gordon, Jacob B.; Kopp, Gisela H.; Schlebusch, Stephen A.; Wall, Jeffrey D.; Alberts, Susan C.; Mukherjee, Sayan; Zhou, Xiang; Tung, Jenny

    2016-01-01

    Research on the genetics of natural populations was revolutionized in the 1990s by methods for genotyping noninvasively collected samples. However, these methods have remained largely unchanged for the past 20 years and lag far behind the genomics era. To close this gap, here we report an optimized laboratory protocol for genome-wide capture of endogenous DNA from noninvasively collected samples, coupled with a novel computational approach to reconstruct pedigree links from the resulting low-coverage data. We validated both methods using fecal samples from 62 wild baboons, including 48 from an independently constructed extended pedigree. We enriched fecal-derived DNA samples up to 40-fold for endogenous baboon DNA and reconstructed near-perfect pedigree relationships even with extremely low-coverage sequencing. We anticipate that these methods will be broadly applicable to the many research systems for which only noninvasive samples are available. The lab protocol and software (“WHODAD”) are freely available at www.tung-lab.org/protocols-and-software.html and www.xzlab.org/software.html, respectively. PMID:27098910

  7. BACCardI--a tool for the validation of genomic assemblies, assisting genome finishing and intergenome comparison.

    PubMed

    Bartels, Daniela; Kespohl, Sebastian; Albaum, Stefan; Drüke, Tanja; Goesmann, Alexander; Herold, Julia; Kaiser, Olaf; Pühler, Alfred; Pfeiffer, Friedhelm; Raddatz, Günter; Stoye, Jens; Meyer, Folker; Schuster, Stephan C

    2005-04-01

    We provide the graphical tool BACCardI for the construction of virtual clone maps from standard assembler output files or BLAST based sequence comparisons. This new tool has been applied to numerous genome projects to solve various problems including (a) validation of whole genome shotgun assemblies, (b) support for contig ordering in the finishing phase of a genome project, and (c) intergenome comparison between related strains when only one of the strains has been sequenced and a large insert library is available for the other. The BACCardI software can seamlessly interact with various sequence assembly packages. Genomic assemblies generated from sequence information need to be validated by independent methods such as physical maps. The time-consuming task of building physical maps can be circumvented by virtual clone maps derived from read pair information of large insert libraries.

  8. Genome-Wide Detection of CNVs and Their Association with Meat Tenderness in Nelore Cattle.

    PubMed

    Silva, Vinicius Henrique da; Regitano, Luciana Correia de Almeida; Geistlinger, Ludwig; Pértille, Fábio; Giachetto, Poliana Fernanda; Brassaloti, Ricardo Augusto; Morosini, Natália Silva; Zimmer, Ralf; Coutinho, Luiz Lehmann

    2016-01-01

    Brazil is one of the largest beef producers and exporters in the world with the Nelore breed representing the vast majority of Brazilian cattle (Bos taurus indicus). Despite the great adaptability of the Nelore breed to tropical climate, meat tenderness (MT) remains to be improved. Several factors including genetic composition can influence MT. In this article, we report a genome-wide analysis of copy number variation (CNV) inferred from Illumina® High Density SNP-chip data for a Nelore population of 723 males. We detected >2,600 CNV regions (CNVRs) representing ≈6.5% of the genome. Comparing our results with previous studies revealed an overlap in ≈1400 CNVRs (>50%). A total of 1,155 CNVRs (43.6%) overlapped 2,750 genes. They were enriched for processes involving guanosine triphosphate (GTP), previously reported to influence skeletal muscle physiology and morphology. Nelore CNVRs also overlapped QTLs for MT reported in other breeds (8.9%, 236 CNVRs) and from a previous study with this population (4.1%, 109 CNVRs). Two CNVRs were also proximal to glutathione metabolism genes that were previously associated with MT. Genome-wide association study of CN state with estimated breeding values derived from meat shear force identified 6 regions, including a region on BTA3 that contains genes of the cAMP and cGMP pathway. Ten CNVRs that overlapped regions associated with MT were successfully validated by qPCR. Our results represent the first comprehensive CNV study in Bos taurus indicus cattle and identify regions in which copy number changes are potentially of importance for the MT phenotype.

  9. Genome-wide analysis of disease progression in age-related macular degeneration.

    PubMed

    Yan, Qi; Ding, Ying; Liu, Yi; Sun, Tao; Fritsche, Lars G; Clemons, Traci; Ratnapriya, Rinki; Klein, Michael L; Cook, Richard J; Liu, Yu; Fan, Ruzong; Wei, Lai; Abecasis, Gonçalo R; Swaroop, Anand; Chew, Emily Y; Weeks, Daniel E; Chen, Wei

    2018-03-01

    Family- and population-based genetic studies have successfully identified multiple disease-susceptibility loci for age-related macular degeneration (AMD), one of the earliest and most successful applications of genome-wide association studies. However, most genetic studies to date have focused on case-control studies of late AMD (choroidal neovascularization or geographic atrophy). The genetic influences on disease progression are largely unexplored. We assembled unique resources to perform a genome-wide bivariate time-to-event analysis to test for association of time-to-late-AMD with ∼9 million variants in 2721 Caucasians from a large multi-center randomized clinical trial, the Age-Related Eye Disease Study. To our knowledge, this is the first genome-wide association study of disease progression (bivariate survival outcome) in AMD genetic studies, thus providing novel insights into AMD genetics. We used a robust Cox proportional hazards model to appropriately account for between-eye correlation when analyzing the progression time in the two eyes of each participant. We identified four previously reported susceptibility loci showing genome-wide significant association with AMD progression: ARMS2-HTRA1 (P = 8.1 × 10^-43), CFH (P = 3.5 × 10^-37), C2-CFB-SKIV2L (P = 8.1 × 10^-10) and C3 (P = 1.2 × 10^-9). Furthermore, we detected association of rs58978565 near TNR (P = 2.3 × 10^-8), rs28368872 near ATF7IP2 (P = 2.9 × 10^-8) and rs142450006 near MMP9 (P = 0.0006) with progression to choroidal neovascularization but not geographic atrophy. Secondary analysis limited to 34 reported risk variants revealed that LIPC and CTRB2-CTRB1 were also associated with AMD progression (P < 0.0015). Our genome-wide analysis thus expands the genetics in both development and progression of AMD and should assist in early identification of high risk individuals.

  10. Potential assessment of genome-wide association study and genomic selection in Japanese pear Pyrus pyrifolia

    PubMed Central

    Iwata, Hiroyoshi; Hayashi, Takeshi; Terakami, Shingo; Takada, Norio; Sawamura, Yutaka; Yamamoto, Toshiya

    2013-01-01

    Although the potential of marker-assisted selection (MAS) in fruit tree breeding has been reported, the need for bi-parental QTL mapping before MAS has hindered the introduction of MAS to fruit tree breeding programs. Genome-wide association studies (GWAS) are an alternative to bi-parental QTL mapping in long-lived perennials. Selection based on genomic predictions of breeding values (genomic selection: GS) is another alternative to MAS. This study examined the potential of GWAS and GS in pear breeding with 76 Japanese pear cultivars to detect significant associations of 162 markers with nine agronomic traits. We applied multilocus Bayesian models accounting for ordinal categorical phenotypes for GWAS and GS model training. Significant associations were detected for harvest time, black spot resistance and the number of spurs, and two of the associations were closely linked to known loci. Genome-wide predictions for GS were accurate at the highest level (0.75) in harvest time, at medium levels (0.38–0.61) in resistance to black spot, firmness of flesh, fruit shape in longitudinal section, fruit size, acid content and number of spurs and at low levels (<0.2) in all soluble solid content and vigor of tree. Results suggest the potential of GWAS and GS for use in future breeding programs in Japanese pear. PMID:23641189

  11. A genome-wide association search for type 2 diabetes genes in African Americans.

    PubMed

    Palmer, Nicholette D; McDonough, Caitrin W; Hicks, Pamela J; Roh, Bong H; Wing, Maria R; An, S Sandy; Hester, Jessica M; Cooke, Jessica N; Bostrom, Meredith A; Rudock, Megan E; Talbert, Matthew E; Lewis, Joshua P; Ferrara, Assiamira; Lu, Lingyi; Ziegler, Julie T; Sale, Michele M; Divers, Jasmin; Shriner, Daniel; Adeyemo, Adebowale; Rotimi, Charles N; Ng, Maggie C Y; Langefeld, Carl D; Freedman, Barry I; Bowden, Donald W; Voight, Benjamin F; Scott, Laura J; Steinthorsdottir, Valgerdur; Morris, Andrew P; Dina, Christian; Welch, Ryan P; Zeggini, Eleftheria; Huth, Cornelia; Aulchenko, Yurii S; Thorleifsson, Gudmar; McCulloch, Laura J; Ferreira, Teresa; Grallert, Harald; Amin, Najaf; Wu, Guanming; Willer, Cristen J; Raychaudhuri, Soumya; McCarroll, Steve A; Langenberg, Claudia; Hofmann, Oliver M; Dupuis, Josée; Qi, Lu; Segrè, Ayellet V; van Hoek, Mandy; Navarro, Pau; Ardlie, Kristin; Balkau, Beverley; Benediktsson, Rafn; Bennett, Amanda J; Blagieva, Roza; Boerwinkle, Eric; Bonnycastle, Lori L; Boström, Kristina Bengtsson; Bravenboer, Bert; Bumpstead, Suzannah; Burtt, Noël P; Charpentier, Guillaume; Chines, Peter S; Cornelis, Marilyn; Couper, David J; Crawford, Gabe; Doney, Alex S F; Elliott, Katherine S; Elliott, Amanda L; Erdos, Michael R; Fox, Caroline S; Franklin, Christopher S; Ganser, Martha; Gieger, Christian; Grarup, Niels; Green, Todd; Griffin, Simon; Groves, Christopher J; Guiducci, Candace; Hadjadj, Samy; Hassanali, Neelam; Herder, Christian; Isomaa, Bo; Jackson, Anne U; Johnson, Paul R V; Jørgensen, Torben; Kao, Wen H L; Klopp, Norman; Kong, Augustine; Kraft, Peter; Kuusisto, Johanna; Lauritzen, Torsten; Li, Man; Lieverse, Aloysius; Lindgren, Cecilia M; Lyssenko, Valeriya; Marre, Michel; Meitinger, Thomas; Midthjell, Kristian; Morken, Mario A; Narisu, Narisu; Nilsson, Peter; Owen, Katharine R; Payne, Felicity; Perry, John R B; Petersen, Ann-Kristin; Platou, Carl; Proença, Christine; Prokopenko, Inga; Rathmann, Wolfgang; Rayner, N William; Robertson, Neil 
R; Rocheleau, Ghislain; Roden, Michael; Sampson, Michael J; Saxena, Richa; Shields, Beverley M; Shrader, Peter; Sigurdsson, Gunnar; Sparsø, Thomas; Strassburger, Klaus; Stringham, Heather M; Sun, Qi; Swift, Amy J; Thorand, Barbara; Tichet, Jean; Tuomi, Tiinamaija; van Dam, Rob M; van Haeften, Timon W; van Herpt, Thijs; van Vliet-Ostaptchouk, Jana V; Walters, G Bragi; Weedon, Michael N; Wijmenga, Cisca; Witteman, Jacqueline; Bergman, Richard N; Cauchi, Stephane; Collins, Francis S; Gloyn, Anna L; Gyllensten, Ulf; Hansen, Torben; Hide, Winston A; Hitman, Graham A; Hofman, Albert; Hunter, David J; Hveem, Kristian; Laakso, Markku; Mohlke, Karen L; Morris, Andrew D; Palmer, Colin N A; Pramstaller, Peter P; Rudan, Igor; Sijbrands, Eric; Stein, Lincoln D; Tuomilehto, Jaakko; Uitterlinden, Andre; Walker, Mark; Wareham, Nicholas J; Watanabe, Richard M; Abecasis, Goncalo R; Boehm, Bernhard O; Campbell, Harry; Daly, Mark J; Hattersley, Andrew T; Hu, Frank B; Meigs, James B; Pankow, James S; Pedersen, Oluf; Wichmann, H-Erich; Barroso, Inês; Florez, Jose C; Frayling, Timothy M; Groop, Leif; Sladek, Rob; Thorsteinsdottir, Unnur; Wilson, James F; Illig, Thomas; Froguel, Philippe; van Duijn, Cornelia M; Stefansson, Kari; Altshuler, David; Boehnke, Michael; McCarthy, Mark I; Soranzo, Nicole; Wheeler, Eleanor; Glazer, Nicole L; Bouatia-Naji, Nabila; Mägi, Reedik; Randall, Joshua; Johnson, Toby; Elliott, Paul; Rybin, Denis; Henneman, Peter; Dehghan, Abbas; Hottenga, Jouke Jan; Song, Kijoung; Goel, Anuj; Egan, Josephine M; Lajunen, Taina; Doney, Alex; Kanoni, Stavroula; Cavalcanti-Proença, Christine; Kumari, Meena; Timpson, Nicholas J; Zabena, Carina; Ingelsson, Erik; An, Ping; O'Connell, Jeffrey; Luan, Jian'an; Elliott, Amanda; McCarroll, Steven A; Roccasecca, Rosa Maria; Pattou, François; Sethupathy, Praveen; Ariyurek, Yavuz; Barter, Philip; Beilby, John P; Ben-Shlomo, Yoav; Bergmann, Sven; Bochud, Murielle; Bonnefond, Amélie; Borch-Johnsen, Knut; Böttcher, Yvonne; Brunner, Eric; 
Bumpstead, Suzannah J; Chen, Yii-Der Ida; Chines, Peter; Clarke, Robert; Coin, Lachlan J M; Cooper, Matthew N; Crisponi, Laura; Day, Ian N M; de Geus, Eco J C; Delplanque, Jerome; Fedson, Annette C; Fischer-Rosinsky, Antje; Forouhi, Nita G; Frants, Rune; Franzosi, Maria Grazia; Galan, Pilar; Goodarzi, Mark O; Graessler, Jürgen; Grundy, Scott; Gwilliam, Rhian; Hallmans, Göran; Hammond, Naomi; Han, Xijing; Hartikainen, Anna-Liisa; Hayward, Caroline; Heath, Simon C; Hercberg, Serge; Hicks, Andrew A; Hillman, David R; Hingorani, Aroon D; Hui, Jennie; Hung, Joe; Jula, Antti; Kaakinen, Marika; Kaprio, Jaakko; Kesaniemi, Y Antero; Kivimaki, Mika; Knight, Beatrice; Koskinen, Seppo; Kovacs, Peter; Kyvik, Kirsten Ohm; Lathrop, G Mark; Lawlor, Debbie A; Le Bacquer, Olivier; Lecoeur, Cécile; Li, Yun; Mahley, Robert; Mangino, Massimo; Manning, Alisa K; Martínez-Larrad, María Teresa; McAteer, Jarred B; McPherson, Ruth; Meisinger, Christa; Melzer, David; Meyre, David; Mitchell, Braxton D; Mukherjee, Sutapa; Naitza, Silvia; Neville, Matthew J; Oostra, Ben A; Orrù, Marco; Pakyz, Ruth; Paolisso, Giuseppe; Pattaro, Cristian; Pearson, Daniel; Peden, John F; Pedersen, Nancy L; Perola, Markus; Pfeiffer, Andreas F H; Pichler, Irene; Polasek, Ozren; Posthuma, Danielle; Potter, Simon C; Pouta, Anneli; Province, Michael A; Psaty, Bruce M; Rayner, Nigel W; Rice, Kenneth; Ripatti, Samuli; Rivadeneira, Fernando; Rolandsson, Olov; Sandbaek, Annelli; Sandhu, Manjinder; Sanna, Serena; Sayer, Avan Aihie; Scheet, Paul; Seedorf, Udo; Sharp, Stephen J; Shields, Beverley; Sijbrands, Eric J G; Silveira, Angela; Simpson, Laila; Singleton, Andrew; Smith, Nicholas L; Sovio, Ulla; Swift, Amy; Syddall, Holly; Syvänen, Ann-Christine; Tanaka, Toshiko; Tönjes, Anke; Uitterlinden, André G; van Dijk, Ko Willems; Varma, Dhiraj; Visvikis-Siest, Sophie; Vitart, Veronique; Vogelzangs, Nicole; Waeber, Gérard; Wagner, Peter J; Walley, Andrew; Ward, Kim L; Watkins, Hugh; Wild, Sarah H; Willemsen, Gonneke; Witteman, 
Jaqueline C M; Yarnell, John W G; Zelenika, Diana; Zethelius, Björn; Zhai, Guangju; Zhao, Jing Hua; Zillikens, M Carola; Borecki, Ingrid B; Loos, Ruth J F; Meneton, Pierre; Magnusson, Patrik K E; Nathan, David M; Williams, Gordon H; Silander, Kaisa; Salomaa, Veikko; Smith, George Davey; Bornstein, Stefan R; Schwarz, Peter; Spranger, Joachim; Karpe, Fredrik; Shuldiner, Alan R; Cooper, Cyrus; Dedoussis, George V; Serrano-Ríos, Manuel; Lind, Lars; Palmer, Lyle J; Franks, Paul W; Ebrahim, Shah; Marmot, Michael; Kao, W H Linda; Pramstaller, Peter Paul; Wright, Alan F; Stumvoll, Michael; Hamsten, Anders; Buchanan, Thomas A; Valle, Timo T; Rotter, Jerome I; Siscovick, David S; Penninx, Brenda W J H; Boomsma, Dorret I; Deloukas, Panos; Spector, Timothy D; Ferrucci, Luigi; Cao, Antonio; Scuteri, Angelo; Schlessinger, David; Uda, Manuela; Ruokonen, Aimo; Jarvelin, Marjo-Riitta; Waterworth, Dawn M; Vollenweider, Peter; Peltonen, Leena; Mooser, Vincent; Sladek, Robert

    2012-01-01

    African Americans are disproportionately affected by type 2 diabetes (T2DM) yet few studies have examined T2DM using genome-wide association approaches in this ethnicity. The aim of this study was to identify genes associated with T2DM in the African American population. We performed a Genome Wide Association Study (GWAS) using the Affymetrix 6.0 array in 965 African-American cases with T2DM and end-stage renal disease (T2DM-ESRD) and 1029 population-based controls. The most significant SNPs (n = 550 independent loci) were genotyped in a replication cohort and 122 SNPs (n = 98 independent loci) were further tested through genotyping three additional validation cohorts followed by meta-analysis in all five cohorts totaling 3,132 cases and 3,317 controls. Twelve SNPs had evidence of association in the GWAS (P<0.0071), were directionally consistent in the Replication cohort and were associated with T2DM in subjects without nephropathy (P<0.05). Meta-analysis in all cases and controls revealed a single SNP reaching genome-wide significance (P<2.5×10^-8). SNP rs7560163 (P = 7.0×10^-9, OR (95% CI) = 0.75 (0.67-0.84)) is located intergenically between RND3 and RBM43. Four additional loci (rs7542900, rs4659485, rs2722769 and rs7107217) were associated with T2DM (P<0.05) and reached more nominal levels of significance (P<2.5×10^-5) in the overall analysis and may represent novel loci that contribute to T2DM. We have identified novel T2DM-susceptibility variants in the African-American population. Notably, T2DM risk was associated with the major allele and implies an interesting genetic architecture in this population. These results suggest that multiple loci underlie T2DM susceptibility in the African-American population and that these loci are distinct from those identified in other ethnic populations.

  12. A Genome-Wide Association Search for Type 2 Diabetes Genes in African Americans

    PubMed Central

    Palmer, Nicholette D.; McDonough, Caitrin W.; Hicks, Pamela J.; Roh, Bong H.; Wing, Maria R.; An, S. Sandy; Hester, Jessica M.; Cooke, Jessica N.; Bostrom, Meredith A.; Rudock, Megan E.; Talbert, Matthew E.; Lewis, Joshua P.; Ferrara, Assiamira; Lu, Lingyi; Ziegler, Julie T.; Sale, Michele M.; Divers, Jasmin; Shriner, Daniel; Adeyemo, Adebowale; Rotimi, Charles N.; Ng, Maggie C. Y.; Langefeld, Carl D.; Freedman, Barry I.; Bowden, Donald W.

    2012-01-01

    African Americans are disproportionately affected by type 2 diabetes (T2DM) yet few studies have examined T2DM using genome-wide association approaches in this ethnicity. The aim of this study was to identify genes associated with T2DM in the African American population. We performed a Genome Wide Association Study (GWAS) using the Affymetrix 6.0 array in 965 African-American cases with T2DM and end-stage renal disease (T2DM-ESRD) and 1029 population-based controls. The most significant SNPs (n = 550 independent loci) were genotyped in a replication cohort and 122 SNPs (n = 98 independent loci) were further tested through genotyping three additional validation cohorts followed by meta-analysis in all five cohorts totaling 3,132 cases and 3,317 controls. Twelve SNPs had evidence of association in the GWAS (P<0.0071), were directionally consistent in the Replication cohort and were associated with T2DM in subjects without nephropathy (P<0.05). Meta-analysis in all cases and controls revealed a single SNP reaching genome-wide significance (P<2.5×10^-8). SNP rs7560163 (P = 7.0×10^-9, OR (95% CI) = 0.75 (0.67–0.84)) is located intergenically between RND3 and RBM43. Four additional loci (rs7542900, rs4659485, rs2722769 and rs7107217) were associated with T2DM (P<0.05) and reached more nominal levels of significance (P<2.5×10^-5) in the overall analysis and may represent novel loci that contribute to T2DM. We have identified novel T2DM-susceptibility variants in the African-American population. Notably, T2DM risk was associated with the major allele and implies an interesting genetic architecture in this population. These results suggest that multiple loci underlie T2DM susceptibility in the African-American population and that these loci are distinct from those identified in other ethnic populations. PMID:22238593

  13. Transethnic genome-wide scan identifies novel Alzheimer's disease loci.

    PubMed

    Jun, Gyungah R; Chung, Jaeyoon; Mez, Jesse; Barber, Robert; Beecham, Gary W; Bennett, David A; Buxbaum, Joseph D; Byrd, Goldie S; Carrasquillo, Minerva M; Crane, Paul K; Cruchaga, Carlos; De Jager, Philip; Ertekin-Taner, Nilufer; Evans, Denis; Fallin, M Danielle; Foroud, Tatiana M; Friedland, Robert P; Goate, Alison M; Graff-Radford, Neill R; Hendrie, Hugh; Hall, Kathleen S; Hamilton-Nelson, Kara L; Inzelberg, Rivka; Kamboh, M Ilyas; Kauwe, John S K; Kukull, Walter A; Kunkle, Brian W; Kuwano, Ryozo; Larson, Eric B; Logue, Mark W; Manly, Jennifer J; Martin, Eden R; Montine, Thomas J; Mukherjee, Shubhabrata; Naj, Adam; Reiman, Eric M; Reitz, Christiane; Sherva, Richard; St George-Hyslop, Peter H; Thornton, Timothy; Younkin, Steven G; Vardarajan, Badri N; Wang, Li-San; Wendlund, Jens R; Winslow, Ashley R; Haines, Jonathan; Mayeux, Richard; Pericak-Vance, Margaret A; Schellenberg, Gerard; Lunetta, Kathryn L; Farrer, Lindsay A

    2017-07-01

    Genetic loci for Alzheimer's disease (AD) have been identified in whites of European ancestry, but the genetic architecture of AD among other populations is less understood. We conducted a transethnic genome-wide association study (GWAS) for late-onset AD in a Stage 1 sample including whites of European ancestry, African-Americans, Japanese, and Israeli-Arabs assembled by the Alzheimer's Disease Genetics Consortium. Suggestive results from Stage 1 from novel loci were followed up using summarized results in the International Genomics Alzheimer's Project GWAS dataset. Genome-wide significant (GWS) associations in single-nucleotide polymorphism (SNP)-based tests (P < 5 × 10^-8) were identified for SNPs in PFDN1/HBEGF, USP6NL/ECHDC3, and BZRAP1-AS1 and for the interaction of the apolipoprotein E (APOE) ε4 allele with an NFIC SNP. We also obtained GWS evidence (P < 2.7 × 10^-6) for gene-based association in the total sample with a novel locus, TPBG (P = 1.8 × 10^-6). Our findings highlight the value of transethnic studies for identifying novel AD susceptibility loci. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  14. Vive la résistance: genome-wide selection against introduced alleles in invasive hybrid zones

    USGS Publications Warehouse

    Kovach, Ryan P.; Hand, Brian K.; Hohenlohe, Paul A.; Cosart, Ted F.; Boyer, Matthew C.; Neville, Helen H.; Muhlfeld, Clint C.; Amish, Stephen J.; Carim, Kellie; Narum, Shawn R.; Lowe, Winsor H.; Allendorf, Fred W.; Luikart, Gordon

    2016-01-01

    Evolutionary and ecological consequences of hybridization between native and invasive species are notoriously complicated because patterns of selection acting on non-native alleles can vary throughout the genome and across environments. Rapid advances in genomics now make it feasible to assess locus-specific and genome-wide patterns of natural selection acting on invasive introgression within and among natural populations occupying diverse environments. We quantified genome-wide patterns of admixture across multiple independent hybrid zones of native westslope cutthroat trout and invasive rainbow trout, the world's most widely introduced fish, by genotyping 339 individuals from 21 populations using 9380 species-diagnostic loci. A significantly greater proportion of the genome appeared to be under selection favouring native cutthroat trout (rather than rainbow trout), and this pattern was pervasive across the genome (detected on most chromosomes). Furthermore, selection against invasive alleles was consistent across populations and environments, even in those where rainbow trout were predicted to have a selective advantage (warm environments). These data corroborate field studies showing that hybrids between these species have lower fitness than the native taxa, and show that these fitness differences are due to selection favouring many native genes distributed widely throughout the genome.

  15. Meta-analysis of sex-specific genome-wide association studies.

    PubMed

    Magi, Reedik; Lindgren, Cecilia M; Morris, Andrew P

    2010-12-01

    Despite the success of genome-wide association studies, much of the genetic contribution to complex human traits is still unexplained. One potential source of genetic variation that may contribute to this "missing heritability" is that which differs in magnitude and/or direction between males and females, which could result from sexual dimorphism in gene expression. Such sex-differentiated effects are common in model organisms, and are becoming increasingly evident in human complex traits through large-scale male- and female-specific meta-analyses. In this article, we review the methodology for meta-analysis of sex-specific genome-wide association studies, and propose a sex-differentiated test of association with quantitative or dichotomous traits, which allows for heterogeneity of allelic effects between males and females. We perform detailed simulations to compare the power of the proposed sex-differentiated meta-analysis with the more traditional "sex-combined" approach, which is ambivalent to gender. The results of this study highlight only a small loss in power for the sex-differentiated meta-analysis when the allelic effects of the causal variant are the same in males and females. However, over a range of models of heterogeneity in allelic effects between genders, our sex-differentiated meta-analysis strategy offers substantial gains in power, and thus has the potential to discover novel loci contributing effects to complex human traits with existing genome-wide association data. © 2010 Wiley-Liss, Inc.
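
    The comparison described in this abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that the sex-differentiated statistic is the sum of the male-specific and female-specific chi-squared statistics (2 degrees of freedom), that the sex-combined statistic comes from a fixed-effects inverse-variance meta-analysis (1 degree of freedom), and that their difference tests heterogeneity between sexes (1 degree of freedom). The abstract does not spell out these formulas, and the function names below are hypothetical.

```python
import math

def inverse_variance_meta(betas, ses):
    """Fixed-effects inverse-variance pooled effect and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable (df = 1 or 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch supports only df = 1 or df = 2")

def sex_meta_tests(male, female):
    """male, female: lists of (beta, se) pairs, one pair per study, for one variant.

    Assumed construction (not taken from the abstract):
      x_diff = x_male + x_female   (sex-differentiated, 2 df)
      x_comb = pooled z squared    (sex-combined, 1 df)
      x_het  = x_diff - x_comb     (heterogeneity between sexes, 1 df)
    """
    b_m, se_m = inverse_variance_meta(*zip(*male))
    b_f, se_f = inverse_variance_meta(*zip(*female))
    # Pooling the two sex-specific estimates equals pooling all studies at once.
    b_c, se_c = inverse_variance_meta([b_m, b_f], [se_m, se_f])
    x_m = (b_m / se_m) ** 2
    x_f = (b_f / se_f) ** 2
    x_comb = (b_c / se_c) ** 2
    x_diff = x_m + x_f
    x_het = max(0.0, x_diff - x_comb)  # guard against rounding below zero
    return {
        "x_combined": x_comb, "p_combined": chi2_sf(x_comb, 1),
        "x_differentiated": x_diff, "p_differentiated": chi2_sf(x_diff, 2),
        "x_heterogeneity": x_het, "p_heterogeneity": chi2_sf(x_het, 1),
    }

if __name__ == "__main__":
    # Made-up effect sizes with opposite directions in males and females.
    males = [(0.20, 0.05), (0.18, 0.06)]
    females = [(-0.15, 0.05), (-0.20, 0.07)]
    for name, value in sex_meta_tests(males, females).items():
        print(f"{name}: {value:.4g}")
```

    Under these assumptions, effects that cancel across sexes leave the sex-combined statistic near zero while the sex-differentiated statistic stays large, which matches the gain in power the abstract reports under heterogeneity.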

  16. Developing genome-wide microsatellite markers of bamboo and their applications on molecular marker assisted taxonomy for accessions in the genus Phyllostachys.

    PubMed

    Zhao, Hansheng; Yang, Li; Peng, Zhenhua; Sun, Huayu; Yue, Xianghua; Lou, Yongfeng; Dong, Lili; Wang, Lili; Gao, Zhimin

    2015-01-26

    Morphology-based taxonomy that relies on the scarce reproductive organs is severely limited in bamboo, mainly owing to the infrequent and unpredictable flowering of bamboo. Here, we present the first genome-wide analysis and application of microsatellites based on the genome of moso bamboo (Phyllostachys edulis) to assist bamboo taxonomy. Of the 127,593 microsatellite repeat motifs identified, primers were designed for 1,451 microsatellites and 1,098 markers were physically mapped on the moso bamboo genome. A total of 917 markers were successfully validated in 9 accessions with ~39.8% polymorphic potential. From the validated microsatellite markers, 23 were selected for polymorphism analysis among 78 accessions, and 64 alleles were detected, with an average of 2.78 alleles per marker. The clustering result indicated that the majority of the accessions were consistent with their current taxonomic classification, confirming the suitability and effectiveness of the developed microsatellite markers. Variation of the microsatellite markers among species was confirmed by sequencing, and in silico comparative genome mapping was investigated. Lastly, a bamboo microsatellite database (http://www.bamboogdb.org/ssr) was implemented for browsing and searching the large body of bamboo microsatellite information. Consequently, our microsatellite markers are valuable for assisting bamboo taxonomy and for genomic studies in bamboo and related grass species.

  17. Multi-Instance Metric Transfer Learning for Genome-Wide Protein Function Prediction.

    PubMed

    Xu, Yonghui; Min, Huaqing; Wu, Qingyao; Song, Hengjie; Ye, Bicui

    2017-02-06

    Multi-Instance (MI) learning has proven effective for genome-wide protein function prediction problems in which each training example is associated with multiple instances. Many studies in this literature have attempted to find an appropriate Multi-Instance Learning (MIL) method for genome-wide protein function prediction under the usual assumption that the underlying distribution of the testing data (target domain, TD) is the same as that of the training data (source domain, SD). However, this assumption may be violated in practice. To tackle this problem, we propose a Multi-Instance Metric Transfer Learning (MIMTL) approach for genome-wide protein function prediction. In MIMTL, we first transfer the source domain distribution to the target domain distribution by utilizing the bag weights. Then, we construct a distance metric learning method with the reweighted bags. Finally, we develop an alternative optimization scheme for MIMTL. Comprehensive experimental evidence on seven real-world organisms verifies the effectiveness and efficiency of the proposed MIMTL approach over several state-of-the-art methods.

  18. An experimental validation of genomic selection in octoploid strawberry

    PubMed Central

    Gezan, Salvador A; Osorio, Luis F; Verma, Sujeet; Whitaker, Vance M

    2017-01-01

    The primary goal of genomic selection is to increase genetic gains for complex traits by predicting performance of individuals for which phenotypic data are not available. The objective of this study was to experimentally evaluate the potential of genomic selection in strawberry breeding and to define a strategy for its implementation. Four clonally replicated field trials, two in each of 2 years and comprising a total of 1628 individuals, were established in 2013–2014 and 2014–2015. Five complex yield and fruit quality traits with moderate to low heritability were assessed in each trial. High-density genotyping was performed with the Affymetrix Axiom IStraw90 single-nucleotide polymorphism array, and 17 479 polymorphic markers were chosen for analysis. Several methods were compared, including Genomic BLUP, Bayes B, Bayes C, Bayesian LASSO Regression, Bayesian Ridge Regression and Reproducing Kernel Hilbert Spaces. Cross-validation within training populations resulted in higher values than for true validations across trials. For true validations, Bayes B gave the highest predictive abilities on average and also the highest selection efficiencies, particularly for yield traits that were the lowest heritability traits. Selection efficiencies using Bayes B for parent selection ranged from 74% for average fruit weight to 34% for early marketable yield. A breeding strategy is proposed in which advanced selection trials are utilized as training populations and in which genomic selection can reduce the breeding cycle from 3 to 2 years for a subset of untested parents based on their predicted genomic breeding values. PMID:28090334

  19. HIV Genome-Wide Protein Associations: a Review of 30 Years of Research

    PubMed Central

    2016-01-01

    SUMMARY The HIV genome encodes a small number of viral proteins (i.e., 16), invariably establishing cooperative associations among HIV proteins and between HIV and host proteins, to invade host cells and hijack their internal machineries. As a known example, the HIV envelope glycoprotein GP120 is closely associated with GP41 for viral entry. From a genome-wide perspective, a hypothesis can be worked out to determine whether 16 HIV proteins could develop 120 possible pairwise associations either by physical interactions or by functional associations mediated via HIV or host molecules. Here, we present the first systematic review of experimental evidence on HIV genome-wide protein associations using a large body of publications accumulated over the past 3 decades. Of 120 possible pairwise associations between 16 HIV proteins, at least 34 physical interactions and 17 functional associations have been identified. To achieve efficient viral replication and infection, HIV protein associations play essential roles (e.g., cleavage, inhibition, and activation) during the HIV life cycle. In either a dispensable or an indispensable manner, each HIV protein collaborates with another viral protein to accomplish specific activities that precisely take place at the proper stages of the HIV life cycle. In addition, HIV genome-wide protein associations have an impact on anti-HIV inhibitors due to the extensive cross talk between drug-inhibited proteins and other HIV proteins. Overall, this study presents for the first time a comprehensive overview of HIV genome-wide protein associations, highlighting meticulous collaborations between all viral proteins during the HIV life cycle. PMID:27357278
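
    The figure of 120 possible pairwise associations quoted in this summary follows from counting unordered pairs among 16 proteins; as a worked check:

```latex
\[
\binom{16}{2} = \frac{16 \times 15}{2} = 120
\]
```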

  20. Phylogenomics of plant genomes: a methodology for genome-wide searches for orthologs in plants

    PubMed Central

    Conte, Matthieu G; Gaillard, Sylvain; Droc, Gaetan; Perin, Christophe

    2008-01-01

    Background: Gene ortholog identification is now a major objective for mining the increasing amount of sequence data generated by complete or partial genome sequencing projects. Comparative and functional genomics urgently need a method for ortholog detection to reduce gene function inference and to aid in the identification of conserved or divergent genetic pathways between several species. As gene functions change during evolution, reconstructing the evolutionary history of genes should be a more accurate way to differentiate orthologs from paralogs. Phylogenomics takes into account phylogenetic information from high-throughput genome annotation and is the most straightforward way to infer orthologs. However, procedures for automatic detection of orthologs are still scarce and suffer from several limitations. Results: We developed a procedure for ortholog prediction between Oryza sativa and Arabidopsis thaliana. Firstly, we established an efficient method to cluster A. thaliana and O. sativa full proteomes into gene families. Then, we developed an optimized phylogenomics pipeline for ortholog inference. We validated the full procedure using test sets of orthologs and paralogs to demonstrate that our method outperforms pairwise methods for ortholog predictions. Conclusion: Our procedure achieved a high level of accuracy in predicting ortholog and paralog relationships. Phylogenomic predictions for all validated gene families in both species were easily achieved and we can conclude that our methodology outperforms similarly based methods. PMID:18426584

  1. A Genome-Wide Association Study of Chronic Obstructive Pulmonary Disease in Hispanics

    PubMed Central

    Chen, Wei; Brehm, John M.; Manichaikul, Ani; Cho, Michael H.; Boutaoui, Nadia; Yan, Qi; Burkart, Kristin M.; Enright, Paul L.; Rotter, Jerome I.; Petersen, Hans; Leng, Shuguang; Obeidat, Ma’en; Bossé, Yohan; Brandsma, Corry-Anke; Hao, Ke; Rich, Stephen S.; Powell, Rhea; Avila, Lydiana; Soto-Quiros, Manuel; Silverman, Edwin K.; Tesfaigzi, Yohannes; Barr, R. Graham

    2015-01-01

    Rationale: Genome-wide association studies (GWAS) of chronic obstructive pulmonary disease (COPD) have identified disease-susceptibility loci, mostly in subjects of European descent. Objectives: We hypothesized that by studying Hispanic populations we would be able to identify unique loci that contribute to COPD pathogenesis in Hispanics but remain undetected in GWAS of non-Hispanic populations. Methods: We conducted a metaanalysis of two GWAS of COPD in independent cohorts of Hispanics in Costa Rica and the United States (Multi-Ethnic Study of Atherosclerosis [MESA]). We performed a replication study of the top single-nucleotide polymorphisms in an independent Hispanic cohort in New Mexico (the Lovelace Smokers Cohort). We also attempted to replicate prior findings from genome-wide studies in non-Hispanic populations in Hispanic cohorts. Measurements and Main Results: We found no genome-wide significant association with COPD in our metaanalysis of Costa Rica and MESA. After combining the top results from this metaanalysis with those from our replication study in the Lovelace Smokers Cohort, we identified two single-nucleotide polymorphisms approaching genome-wide significance for an association with COPD. The first (rs858249, combined P value = 6.1 × 10^-8) is near the genes KLHL7 and NUPL2 on chromosome 7. The second (rs286499, combined P value = 8.4 × 10^-8) is located in an intron of DLG2. The two most significant single-nucleotide polymorphisms in FAM13A from a previous genome-wide study in non-Hispanics were associated with COPD in Hispanics. Conclusions: We have identified two novel loci (in or near the genes KLHL7/NUPL2 and DLG2) that may play a role in COPD pathogenesis in Hispanic populations. PMID:25584925

  2. Genome-wide Association Study of Obsessive-Compulsive Disorder

    PubMed Central

    Stewart, S Evelyn; Yu, Dongmei; Scharf, Jeremiah M; Neale, Benjamin M; Fagerness, Jesen A; Mathews, Carol A; Arnold, Paul D; Evans, Patrick D; Gamazon, Eric R; Osiecki, Lisa; McGrath, Lauren; Haddad, Stephen; Crane, Jacquelyn; Hezel, Dianne; Illman, Cornelia; Mayerfeld, Catherine; Konkashbaev, Anuar; Liu, Chunyu; Pluzhnikov, Anna; Tikhomirov, Anna; Edlund, Christopher K; Rauch, Scott L; Moessner, Rainald; Falkai, Peter; Maier, Wolfgang; Ruhrmann, Stephan; Grabe, Hans-Jörgen; Lennertz, Leonard; Wagner, Michael; Bellodi, Laura; Cavallini, Maria Cristina; Richter, Margaret A; Cook, Edwin H; Kennedy, James L; Rosenberg, David; Stein, Dan J; Hemmings, Sian MJ; Lochner, Christine; Azzam, Amin; Chavira, Denise A; Fournier, Eduardo; Garrido, Helena; Sheppard, Brooke; Umaña, Paul; Murphy, Dennis L; Wendland, Jens R; Veenstra-VanderWeele, Jeremy; Denys, Damiaan; Blom, Rianne; Deforce, Dieter; Van Nieuwerburgh, Filip; Westenberg, Herman GM; Walitza, Susanne; Egberts, Karin; Renner, Tobias; Miguel, Euripedes Constantino; Cappi, Carolina; Hounie, Ana G; Conceição do Rosário, Maria; Sampaio, Aline S; Vallada, Homero; Nicolini, Humberto; Lanzagorta, Nuria; Camarena, Beatriz; Delorme, Richard; Leboyer, Marion; Pato, Carlos N; Pato, Michele T; Voyiaziakis, Emanuel; Heutink, Peter; Cath, Danielle C; Posthuma, Danielle; Smit, Jan H; Samuels, Jack; Bienvenu, O Joseph; Cullen, Bernadette; Fyer, Abby J; Grados, Marco A; Greenberg, Benjamin D; McCracken, James T; Riddle, Mark A; Wang, Ying; Coric, Vladimir; Leckman, James F; Bloch, Michael; Pittenger, Christopher; Eapen, Valsamma; Black, Donald W; Ophoff, Roel A; Strengman, Eric; Cusi, Daniele; Turiel, Maurizio; Frau, Francesca; Macciardi, Fabio; Gibbs, J Raphael; Cookson, Mark R; Singleton, Andrew; Hardy, John; Crenshaw, Andrew T; Parkin, Melissa A; Mirel, Daniel B; Conti, David V; Purcell, Shaun; Nestadt, Gerald; Hanna, Gregory L; Jenike, Michael A; Knowles, James A; Cox, Nancy; Pauls, David L

    2014-01-01

    Obsessive-compulsive disorder (OCD) is a common, debilitating neuropsychiatric illness with complex genetic etiology. The International OCD Foundation Genetics Collaborative (IOCDF-GC) is a multi-national collaboration established to discover the genetic variation predisposing to OCD. A set of individuals affected with DSM-IV OCD, a subset of their parents, and unselected controls, were genotyped with several different Illumina SNP microarrays. After extensive data cleaning, 1,465 cases, 5,557 ancestry-matched controls and 400 complete trios remained, with a common set of 469,410 autosomal and 9,657 X-chromosome SNPs. Ancestry-stratified case-control association analyses were conducted for three genetically-defined subpopulations and combined in two meta-analyses, with and without the trio-based analysis. In the case-control analysis, the lowest two p-values were located within DLGAP1 (p=2.49×10^-6 and p=3.44×10^-6), a member of the neuronal postsynaptic density complex. In the trio analysis, rs6131295, near BTBD3, exceeded the genome-wide significance threshold with a p-value=3.84 × 10^-8. However, when trios were meta-analyzed with the combined case-control samples, the p-value for this variant was 3.62×10^-5, losing genome-wide significance. Although no SNPs were identified to be associated with OCD at a genome-wide significant level in the combined trio-case-control sample, a significant enrichment of methylation-QTLs (p<0.001) and frontal lobe eQTLs (p=0.001) was observed within the top-ranked SNPs (p<0.01) from the trio-case-control analysis, suggesting these top signals may have a broad role in gene expression in the brain, and possibly in the etiology of OCD. PMID:22889921
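
    The significance calls in the abstract above can be checked with a few lines of Python. This is a minimal sketch, not the authors' code: it assumes the conventional genome-wide threshold of P < 5 × 10^-8 (the abstract does not state which threshold was used), and the function name and dictionary labels are hypothetical.

```python
# Assumption: the conventional genome-wide significance threshold.
# The abstract itself does not state the threshold that was applied.
GENOME_WIDE_ALPHA = 5e-8


def is_genome_wide_significant(p_value, alpha=GENOME_WIDE_ALPHA):
    """Return True when a p-value falls below the genome-wide threshold."""
    return p_value < alpha


# p-values as reported in the abstract; the labels are illustrative only.
reported = {
    "rs6131295, trio analysis": 3.84e-8,
    "rs6131295, trios + case-control": 3.62e-5,
    "DLGAP1 lowest p-value, case-control": 2.49e-6,
}

for label, p in reported.items():
    verdict = "significant" if is_genome_wide_significant(p) else "not significant"
    print(f"{label}: p={p:.2e} -> {verdict}")
```

    Under that assumed threshold only the trio-only result passes, which matches the abstract's statement that the variant loses genome-wide significance once the case-control samples are added.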

  3. Genome-Wide Association Mapping and Genomic Prediction Elucidate the Genetic Architecture of Morphological Traits in Arabidopsis.

    PubMed

    Kooke, Rik; Kruijer, Willem; Bours, Ralph; Becker, Frank; Kuhn, André; van de Geest, Henri; Buntjer, Jaap; Doeswijk, Timo; Guerra, José; Bouwmeester, Harro; Vreugdenhil, Dick; Keurentjes, Joost J B

    2016-04-01

    Quantitative traits in plants are controlled by a large number of genes and their interaction with the environment. To disentangle the genetic architecture of such traits, natural variation within species can be explored by studying genotype-phenotype relationships. Genome-wide association studies that link phenotypes to thousands of single nucleotide polymorphism markers are nowadays common practice for such analyses. In many cases, however, the identified individual loci cannot fully explain the heritability estimates, suggesting missing heritability. We analyzed 349 Arabidopsis accessions and found extensive variation and high heritabilities for different morphological traits. The number of significant genome-wide associations was, however, very low. The application of genomic prediction models that take into account the effects of all individual loci may greatly enhance the elucidation of the genetic architecture of quantitative traits in plants. Here, genomic prediction models revealed different genetic architectures for the morphological traits. Integrating genomic prediction and association mapping enabled the assignment of many plausible candidate genes explaining the observed variation. These genes were analyzed for functional and sequence diversity, and good indications that natural allelic variation in many of these genes contributes to phenotypic variation were obtained. For ACS11, an ethylene biosynthesis gene, haplotype differences explaining variation in the ratio of petiole and leaf length could be identified. © 2016 American Society of Plant Biologists. All Rights Reserved.

  4. GStream: Improving SNP and CNV Coverage on Genome-Wide Association Studies

    PubMed Central

    Alonso, Arnald; Marsal, Sara; Tortosa, Raül; Canela-Xandri, Oriol; Julià, Antonio

    2013-01-01

    We present GStream, a method that combines genome-wide SNP and CNV genotyping in the Illumina microarray platform with unprecedented accuracy. This new method outperforms previous well-established SNP genotyping software. More importantly, the CNV calling algorithm of GStream dramatically improves the results obtained by previous state-of-the-art methods and yields an accuracy that is close to that obtained by purely CNV-oriented technologies like Comparative Genomic Hybridization (CGH). We demonstrate the superior performance of GStream using microarray data generated from HapMap samples. Using the reference CNV calls generated by the 1000 Genomes Project (1KGP) and well-known studies on whole genome CNV characterization based either on CGH or genotyping microarray technologies, we show that GStream can increase the number of reliably detected variants up to 25% compared to previously developed methods. Furthermore, the increased genome coverage provided by GStream allows the discovery of CNVs in close linkage disequilibrium with SNPs, previously associated with disease risk in published Genome-Wide Association Studies (GWAS). These results could provide important insights into the biological mechanism underlying the detected disease risk association. With GStream, large-scale GWAS will not only benefit from the combined genotyping of SNPs and CNVs at an unprecedented accuracy, but will also take advantage of the computational efficiency of the method. PMID:23844243

  5. Hematopoietic transcriptional mechanisms: from locus-specific to genome-wide vantage points.

    PubMed

    DeVilbiss, Andrew W; Sanalkumar, Rajendran; Johnson, Kirby D; Keles, Sunduz; Bresnick, Emery H

    2014-08-01

    Hematopoiesis is an exquisitely regulated process in which stem cells in the developing embryo and the adult generate progenitor cells that give rise to all blood lineages. Master regulatory transcription factors control hematopoiesis by integrating signals from the microenvironment and dynamically establishing and maintaining genetic networks. One of the most rudimentary aspects of cell type-specific transcription factor function, how they occupy a highly restricted cohort of cis-elements in chromatin, remains poorly understood. Transformative technologic advances involving the coupling of next-generation DNA sequencing technology with the chromatin immunoprecipitation assay (ChIP-seq) have enabled genome-wide mapping of factor occupancy patterns. However, formidable problems remain; notably, ChIP-seq analysis yields hundreds to thousands of chromatin sites occupied by a given transcription factor, and only a fraction of the sites appear to be endowed with critical, non-redundant function. It has become en vogue to map transcription factor occupancy patterns genome-wide, while using powerful statistical tools to establish correlations to inform biology and mechanisms. With the advent of revolutionary genome editing technologies, one can now reach beyond correlations to conduct definitive hypothesis testing. This review focuses on key discoveries that have emerged during the path from single loci to genome-wide analyses, specifically in the context of hematopoietic transcriptional mechanisms. Copyright © 2014 ISEH - International Society for Experimental Hematology. Published by Elsevier Inc. All rights reserved.

  6. Genome-Wide Distribution, Organisation and Functional Characterization of Disease Resistance and Defence Response Genes across Rice Species

    PubMed Central

    Singh, Sangeeta; Chand, Suresh; Singh, N. K.; Sharma, Tilak Raj

    2015-01-01

    The resistance (R) genes and defense response (DR) genes have become very important resources for the development of disease resistant cultivars. In the present investigation, genome-wide identification, expression, phylogenetic and synteny analysis was done for R and DR-genes across three species of rice viz: Oryza sativa ssp indica cv 93-11, Oryza sativa ssp japonica and wild rice species, Oryza brachyantha. We used the in silico approach to identify and map 786 R-genes and 167 DR-genes, 672 R-genes and 142 DR-genes, 251 R-genes and 86 DR-genes in the japonica, indica and O. brachyantha genomes, respectively. Our analysis showed that 60.5% and 55.6% of the R-genes are tandemly repeated within clusters and distributed over all the rice chromosomes in indica and japonica genomes, respectively. The phylogenetic analysis along with motif distribution shows high degree of conservation of R- and DR-genes in clusters. In silico expression analysis of R-genes and DR-genes showed more than 85% were expressed genes showing corresponding EST matches in the databases. This study gave special emphasis on mechanisms of gene evolution and duplication for R and DR genes across species. Analysis of paralogs across rice species indicated 17% and 4.38% R-genes, 29% and 11.63% DR-genes duplication in indica and Oryza brachyantha, as compared to 20% and 26% duplication of R-genes and DR-genes in japonica respectively. We found that during the course of duplication only 9.5% of R- and DR-genes changed their function and rest of the genes have maintained their identity. Syntenic relationship across three genomes inferred that more orthology is shared between indica and japonica genomes as compared to brachyantha genome. Genome wide identification of R-genes and DR-genes in the rice genome will help in allele mining and functional validation of these genes, and to understand molecular mechanism of disease resistance and their evolution in rice and related species. PMID:25902056

  7. Genome-wide detection of conservative site-specific recombination in bacteria

    PubMed Central

    Mathias Garrett, Elizabeth; Camilli, Andrew

    2018-01-01

    The ability of clonal bacterial populations to generate genomic and phenotypic heterogeneity is thought to be of great importance for many commensal and pathogenic bacteria. One common mechanism contributing to diversity formation relies on the inversion of small genomic DNA segments in a process commonly referred to as conservative site-specific recombination. This phenomenon is known to occur in several bacterial lineages, however it remains notoriously difficult to identify due to the lack of conserved features. Here, we report an easy-to-implement method based on high-throughput paired-end sequencing for genome-wide detection of conservative site-specific recombination on a single-nucleotide level. We demonstrate the effectiveness of the method by successfully detecting several novel inversion sites in an epidemic isolate of the enteric pathogen Clostridium difficile. Using an experimental approach, we validate the inversion potential of all detected sites in C. difficile and quantify their prevalence during exponential and stationary growth in vitro. In addition, we demonstrate that the master recombinase RecV is responsible for the inversion of some but not all invertible sites. Using a fluorescent gene-reporter system, we show that at least one gene from a two-component system located next to an invertible site is expressed in an on-off mode reminiscent of phase variation. We further demonstrate the applicability of our method by mining 209 publicly available sequencing datasets and show that conservative site-specific recombination is common in the bacterial realm but appears to be absent in some lineages. Finally, we show that the gene content associated with the inversion sites is diverse and goes beyond traditionally described surface components. Overall, our method provides a robust platform for detection of conservative site-specific recombination in bacteria and opens a new avenue for global exploration of this important phenomenon. PMID:29621238

  8. Creating a RAW264.7 CRISPR-Cas9 Genome Wide Library

    PubMed Central

    Napier, Brooke A; Monack, Denise M

    2017-01-01

    The bacterial clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 genome editing tools are used in mammalian cells to knock-out specific genes of interest to elucidate gene function. The CRISPR-Cas9 system requires that the mammalian cell expresses Cas9 endonuclease, guide RNA (gRNA) to lead the endonuclease to the gene of interest, and the PAM sequence that links the Cas9 to the gRNA. CRISPR-Cas9 genome wide libraries are used to screen the effect of each gene in the genome on the cellular phenotype of interest, in an unbiased high-throughput manner. In this protocol, we describe our method of creating a CRISPR-Cas9 genome wide library in a transformed murine macrophage cell-line (RAW264.7). We have employed this library to identify novel mediators in the caspase-11 cell death pathway (Napier et al., 2016); however, this library can then be used to screen the importance of specific genes in multiple murine macrophage cellular pathways. PMID:28868328

  9. DNA Breaks and End Resection Measured Genome-wide by End Sequencing.

    PubMed

    Canela, Andres; Sridharan, Sriram; Sciascia, Nicholas; Tubbs, Anthony; Meltzer, Paul; Sleckman, Barry P; Nussenzweig, André

    2016-09-01

    DNA double-strand breaks (DSBs) arise during physiological transcription, DNA replication, and antigen receptor diversification. Mistargeting or misprocessing of DSBs can result in pathological structural variation and mutation. Here we describe a sensitive method (END-seq) to monitor DNA end resection and DSBs genome-wide at base-pair resolution in vivo. We utilized END-seq to determine the frequency and spectrum of restriction-enzyme-, zinc-finger-nuclease-, and RAG-induced DSBs. Beyond sequence preference, chromatin features dictate the repertoire of these genome-modifying enzymes. END-seq can detect at least one DSB per cell among 10,000 cells not harboring DSBs, and we estimate that up to one out of 60 cells contains off-target RAG cleavage. In addition to site-specific cleavage, we detect DSBs distributed over extended regions during immunoglobulin class-switch recombination. Thus, END-seq provides a snapshot of DNA ends genome-wide, which can be utilized for understanding genome-editing specificities and the influence of chromatin on DSB pathway choice. Published by Elsevier Inc.

  10. Genome-wide linkage and association analysis of cardiometabolic phenotypes in Hispanic Americans.

    PubMed

    Hellwege, Jacklyn N; Palmer, Nicholette D; Dimitrov, Latchezar; Keaton, Jacob M; Tabb, Keri L; Sajuthi, Satria; Taylor, Kent D; Ng, Maggie C Y; Speliotes, Elizabeth K; Hawkins, Gregory A; Long, Jirong; Ida Chen, Yii-Der; Lorenzo, Carlos; Norris, Jill M; Rotter, Jerome I; Langefeld, Carl D; Wagenknecht, Lynne E; Bowden, Donald W

    2017-02-01

    Linkage studies of complex genetic diseases have been largely replaced by genome-wide association studies, due in part to limited success in complex trait discovery. However, recent interest in rare and low-frequency variants motivates re-examination of family-based methods. In this study, we investigated the performance of two-point linkage analysis for over 1.6 million single-nucleotide polymorphisms (SNPs) combined with single variant association analysis to identify high impact variants, which are both strongly linked and associated with cardiometabolic traits in up to 1414 Hispanics from the Insulin Resistance Atherosclerosis Family Study (IRASFS). Evaluation of all 50 phenotypes yielded 83 557 000 LOD (logarithm of the odds) scores, with 9214 LOD scores ⩾3.0, 845 ⩾4.0 and 89 ⩾5.0, with a maximal LOD score of 6.49 (rs12956744 in the LAMA1 gene for tumor necrosis factor-α (TNFα) receptor 2). Twenty-seven variants were associated with P<0.005 as well as having an LOD score >4, including variants in the NFIB gene under a linkage peak with TNFα receptor 2 levels on chromosome 9. Linkage regions of interest included a broad peak (31 Mb) on chromosome 1q with acute insulin response (max LOD=5.37). This region was previously documented with type 2 diabetes in family-based studies, providing support for the validity of these results. Overall, we have demonstrated the utility of two-point linkage and association in comprehensive genome-wide array-based SNP genotypes.
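
    For readers unfamiliar with LOD scores, the textbook two-point case can be sketched in Python. This is a simplified illustration under stated assumptions, not the pedigree likelihood computed in the study: it assumes phase-known, fully informative meioses, and the function name and example counts are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score for phase-known meioses (simplified textbook case).

    LOD(theta) = log10( L(theta) / L(0.5) ), where
    L(theta) = theta**k * (1 - theta)**(n - k)
    for k recombinants observed among n informative meioses.
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    k, n = recombinants, meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1 - theta)
    log_l_null = n * math.log10(0.5)  # free recombination, i.e. no linkage
    return log_l_theta - log_l_null


# Hypothetical counts: 1 recombinant among 20 informative meioses.
print(round(lod_score(1, 20, 0.05), 2))
```

    Because the score is a base-10 logarithm of an odds ratio, the LOD ⩾3.0 cut-off used in the abstract corresponds to odds of at least 1000:1 in favour of linkage.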

  11. A genome-wide methylation study on obesity: differential variability and differential methylation.

    PubMed

    Xu, Xiaojing; Su, Shaoyong; Barnes, Vernon A; De Miguel, Carmen; Pollock, Jennifer; Ownby, Dennis; Shi, Hidong; Zhu, Haidong; Snieder, Harold; Wang, Xiaoling

    2013-05-01

    Besides differential methylation, DNA methylation variation has recently been proposed and demonstrated to be a potential contributing factor to cancer risk. Here we aim to examine whether differential variability in methylation is also an important feature of obesity, a typical non-malignant common complex disease. We analyzed genome-wide methylation profiles of over 470,000 CpGs in peripheral blood samples from 48 obese and 48 lean African-American youth aged 14-20 y old. A substantial number of differentially variable CpG sites (DVCs), using statistics based on variances, as well as a substantial number of differentially methylated CpG sites (DMCs), using statistics based on means, were identified. Similar to the findings in cancers, DVCs generally exhibited an outlier structure and were more variable in cases than in controls. By randomly splitting the current sample into a discovery and validation set, we observed that both the DVCs and DMCs identified from the first set could independently predict obesity status in the second set. Furthermore, both the genes harboring DMCs and the genes harboring DVCs showed significant enrichment of genes identified by genome-wide association studies on obesity and related diseases, such as hypertension, dyslipidemia, type 2 diabetes and certain types of cancers, supporting their roles in the etiology and pathogenesis of obesity. We generalized the recent finding on methylation variability in cancer research to obesity and demonstrated that differential variability is also an important feature of obesity-related methylation changes. Future studies on the epigenetics of obesity will benefit from both statistics based on means and statistics based on variances.
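
    The distinction drawn above between statistics based on means (DMCs) and statistics based on variances (DVCs) can be illustrated with a small sketch. The abstract does not name the specific tests used, so the Welch t-statistic and the simple variance ratio below are assumptions chosen for illustration, and the methylation values are made up.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Mean-based statistic: Welch's t for a difference in group means."""
    return (mean(a) - mean(b)) / math.sqrt(
        variance(a) / len(a) + variance(b) / len(b)
    )


def variance_ratio(a, b):
    """Variance-based statistic: ratio of the two sample variances."""
    return variance(a) / variance(b)


# Hypothetical methylation beta values at one CpG (not data from the study).
# Both groups share the same mean, but the case group contains outliers.
lean = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50]
obese = [0.50, 0.30, 0.70, 0.52, 0.48, 0.50]

print("Welch t:", round(welch_t(obese, lean), 3))
print("variance ratio:", round(variance_ratio(obese, lean), 1))
```

    In this toy example the mean-based statistic is essentially zero while the variance ratio is large, which is the kind of outlier-driven pattern the abstract describes for DVCs. A real analysis would also need a significance test and multiple-testing correction across all CpGs.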

  12. A genome-wide 3C-method for characterizing the three-dimensional architectures of genomes.

    PubMed

    Duan, Zhijun; Andronescu, Mirela; Schutz, Kevin; Lee, Choli; Shendure, Jay; Fields, Stanley; Noble, William S; Anthony Blau, C

    2012-11-01

    Accumulating evidence demonstrates that the three-dimensional (3D) organization of chromosomes within the eukaryotic nucleus reflects and influences genomic activities, including transcription, DNA replication, recombination and DNA repair. In order to uncover structure-function relationships, it is necessary first to understand the principles underlying the folding and the 3D arrangement of chromosomes. Chromosome conformation capture (3C) provides a powerful tool for detecting interactions within and between chromosomes. A high throughput derivative of 3C, chromosome conformation capture on chip (4C), executes a genome-wide interrogation of interaction partners for a given locus. We recently developed a new method, a derivative of 3C and 4C, which, similar to Hi-C, is capable of comprehensively identifying long-range chromosome interactions throughout a genome in an unbiased fashion. Hence, our method can be applied to decipher the 3D architectures of genomes. Here, we provide a detailed protocol for this method. Published by Elsevier Inc.

  13. Genome-wide prediction of cis-regulatory regions using supervised deep learning methods.

    PubMed

    Li, Yifeng; Shi, Wenqiang; Wasserman, Wyeth W

    2018-05-31

    In the human genome, 98% of DNA sequences are non-protein-coding regions that were previously disregarded as junk DNA. In fact, non-coding regions host a variety of cis-regulatory regions which precisely control the expression of genes. Thus, identifying active cis-regulatory regions in the human genome is critical for understanding gene regulation and assessing the impact of genetic variation on phenotype. The developments of high-throughput sequencing and machine learning technologies make it possible to predict cis-regulatory regions genome wide. Based on rich data resources such as the Encyclopedia of DNA Elements (ENCODE) and the Functional Annotation of the Mammalian Genome (FANTOM) projects, we introduce DECRES based on supervised deep learning approaches for the identification of enhancer and promoter regions in the human genome. Due to their ability to discover patterns in large and complex data, the introduction of deep learning methods enables a significant advance in our knowledge of the genomic locations of cis-regulatory regions. Using models for well-characterized cell lines, we identify key experimental features that contribute to the predictive performance. Applying DECRES, we delineate locations of 300,000 candidate enhancers genome wide (6.8% of the genome, of which 40,000 are supported by bidirectional transcription data), and 26,000 candidate promoters (0.6% of the genome). The predicted annotations of cis-regulatory regions will provide broad utility for genome interpretation from functional genomics to clinical applications. The DECRES model demonstrates potentials of deep learning technologies when combined with high-throughput sequencing data, and inspires the development of other advanced neural network models for further improvement of genome annotations.

  14. Genome-wide significant loci for addiction and anxiety.

    PubMed

    Hodgson, K; Almasy, L; Knowles, E E M; Kent, J W; Curran, J E; Dyer, T D; Göring, H H H; Olvera, R L; Fox, P T; Pearlson, G D; Krystal, J H; Duggirala, R; Blangero, J; Glahn, D C

    2016-08-01

    Psychiatric comorbidity is common among individuals with addictive disorders, with patients frequently suffering from anxiety disorders. While the genetic architecture of comorbid addictive and anxiety disorders remains unclear, elucidating the genes involved could provide important insights into the underlying etiology. Here we examine a sample of 1284 Mexican-Americans from randomly selected extended pedigrees. Variance decomposition methods were used to examine the role of genetics in addiction phenotypes (lifetime history of alcohol dependence, drug dependence or chronic smoking) and various forms of clinically relevant anxiety. Genome-wide univariate and bivariate linkage scans were conducted to localize the chromosomal regions influencing these traits. Addiction phenotypes and anxiety were shown to be heritable and univariate genome-wide linkage scans revealed significant quantitative trait loci for drug dependence (14q13.2-q21.2, LOD=3.322) and a broad anxiety phenotype (12q24.32-q24.33, LOD=2.918). Significant positive genetic correlations were observed between anxiety and each of the addiction subtypes (ρg=0.550-0.655) and further investigation with bivariate linkage analyses identified significant pleiotropic signals for alcohol dependence-anxiety (9q33.1-q33.2, LOD=3.054) and drug dependence-anxiety (18p11.23-p11.22, LOD=3.425). This study confirms the shared genetic underpinnings of addiction and anxiety and identifies genomic loci involved in the etiology of these comorbid disorders. The linkage signal for anxiety on 12q24 spans the location of TMEM132D, an emerging gene of interest from previous GWAS of anxiety traits, whilst the bivariate linkage signal identified for anxiety-alcohol on 9q33 peak coincides with a region where rare CNVs have been associated with psychiatric disorders. Other signals identified implicate novel regions of the genome in addiction genetics. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  15. Genome-wide association studies in the Japanese population identify seven novel loci for type 2 diabetes

    PubMed Central

    Imamura, Minako; Takahashi, Atsushi; Yamauchi, Toshimasa; Hara, Kazuo; Yasuda, Kazuki; Grarup, Niels; Zhao, Wei; Wang, Xu; Huerta-Chagoya, Alicia; Hu, Cheng; Moon, Sanghoon; Long, Jirong; Kwak, Soo Heon; Rasheed, Asif; Saxena, Richa; Ma, Ronald C. W.; Okada, Yukinori; Iwata, Minoru; Hosoe, Jun; Shojima, Nobuhiro; Iwasaki, Minaka; Fujita, Hayato; Suzuki, Ken; Danesh, John; Jørgensen, Torben; Jørgensen, Marit E.; Witte, Daniel R.; Brandslund, Ivan; Christensen, Cramer; Hansen, Torben; Mercader, Josep M.; Flannick, Jason; Moreno-Macías, Hortensia; Burtt, Noël P.; Zhang, Rong; Kim, Young Jin; Zheng, Wei; Singh, Jai Rup; Tam, Claudia H. T.; Hirose, Hiroshi; Maegawa, Hiroshi; Ito, Chikako; Kaku, Kohei; Watada, Hirotaka; Tanaka, Yasushi; Tobe, Kazuyuki; Kawamori, Ryuzo; Kubo, Michiaki; Cho, Yoon Shin; Chan, Juliana C. N.; Sanghera, Dharambir; Frossard, Philippe; Park, Kyong Soo; Shu, Xiao-Ou; Kim, Bong-Jo; Florez, Jose C.; Tusié-Luna, Teresa; Jia, Weiping; Tai, E Shyong; Pedersen, Oluf; Saleheen, Danish; Maeda, Shiro; Kadowaki, Takashi

    2016-01-01

    Genome-wide association studies (GWAS) have identified more than 80 susceptibility loci for type 2 diabetes (T2D), but most of its heritability still remains to be elucidated. In this study, we conducted a meta-analysis of GWAS for T2D in the Japanese population. Combined data from discovery and subsequent validation analyses (23,399 T2D cases and 31,722 controls) identify 7 new loci with genome-wide significance (P<5 × 10^−8), rs1116357 near CCDC85A, rs147538848 in FAM60A, rs1575972 near DMRTA1, rs9309245 near ASB3, rs67156297 near ATP8B2, rs7107784 near MIR4686 and rs67839313 near INAFM2. Of these, the association of 4 loci with T2D is replicated in multi-ethnic populations other than Japanese (up to 65,936 T2Ds and 158,030 controls, P<0.007). These results indicate that expansion of single ethnic GWAS is still useful to identify novel susceptibility loci to complex traits not only for ethnicity-specific loci but also for common loci across different ethnicities. PMID:26818947

  16. Genome-wide association studies on HIV susceptibility, pathogenesis and pharmacogenomics

    PubMed Central

    2012-01-01

    Susceptibility to HIV-1 and the clinical course after infection show a substantial heterogeneity between individuals. Part of this variability can be attributed to host genetic variation. Initial candidate gene studies have revealed interesting host factors that influence HIV infection, replication and pathogenesis. Recently, genome-wide association studies (GWAS) were utilized for unbiased searches at a genome-wide level to discover novel genetic factors and pathways involved in HIV-1 infection. This review gives an overview of findings from the GWAS performed on HIV infection, within different cohorts, with variable patient and phenotype selection. Furthermore, novel techniques and strategies in research that might contribute to the complete understanding of virus-host interactions and its role on the pathogenesis of HIV infection are discussed. PMID:22920050

  17. Rapid scoring of genes in microbial pan-genome-wide association studies with Scoary.

    PubMed

    Brynildsrud, Ola; Bohlin, Jon; Scheffer, Lonneke; Eldholm, Vegard

    2016-11-25

    Genome-wide association studies (GWAS) have become indispensable in human medicine and genomics, but very few have been carried out on bacteria. Here we introduce Scoary, an ultra-fast, easy-to-use, and widely applicable software tool that scores the components of the pan-genome for associations to observed phenotypic traits while accounting for population stratification, with minimal assumptions about evolutionary processes. We call our approach pan-GWAS to distinguish it from traditional, single nucleotide polymorphism (SNP)-based GWAS. Scoary is implemented in Python and is available under an open source GPLv3 license at https://github.com/AdmiralenOla/Scoary .
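
    To make the idea of scoring a pan-genome component against a trait concrete, the sketch below tests one gene's presence/absence against a binary phenotype using a 2x2 table. It is not Scoary's code or API and omits the correction for population stratification mentioned in the abstract; the choice of a two-sided Fisher's exact test, the function name, and the counts are assumptions made for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table

                      trait present   trait absent
        gene present        a               b
        gene absent         c               d
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is at least as unlikely as the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: the gene is present in all 10 isolates with the trait
# and absent from all 10 isolates without it.
print(fisher_exact_two_sided(10, 0, 0, 10))
```

    A full pan-GWAS would repeat such a test for every gene in the pan-genome and correct for multiple testing, in addition to accounting for population structure.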

  18. A Genome-Wide Investigation of Autozygosity and Breast Cancer Risk

    DTIC Science & Technology

    2011-07-01

    cases than in controls, using logistic regression methods. Using genome-wide SNP data (525,000 SNPs) on 1,647 non-Hispanic white, early-onset...premenopausal breast cancer cases and 1,556 matched controls we identified over 65,000 individual RoHs and 423 genomic regions harbor RoHs for at least 10...we hypothesize that germline autozygosity is more common in breast cancer cases than in controls. More specifically, we hypothesize that there are

  19. Genome-wide identification of significant aberrations in cancer genome.

    PubMed

    Yuan, Xiguo; Yu, Guoqiang; Hou, Xuchu; Shih, Ie-Ming; Clarke, Robert; Zhang, Junying; Hoffman, Eric P; Wang, Roger R; Zhang, Zhen; Wang, Yue

    2012-07-27

    Somatic Copy Number Alterations (CNAs) in human genomes are present in almost all human cancers. Systematic efforts to characterize such structural variants must effectively distinguish significant consensus events from random background aberrations. Here we introduce Significant Aberration in Cancer (SAIC), a new method for characterizing and assessing the statistical significance of recurrent CNA units. Three main features of SAIC include: (1) exploiting the intrinsic correlation among consecutive probes to assign a score to each CNA unit instead of single probes; (2) performing permutations on CNA units that preserve correlations inherent in the copy number data; and (3) iteratively detecting Significant Copy Number Aberrations (SCAs) and estimating an unbiased null distribution by applying an SCA-exclusive permutation scheme. We test and compare the performance of SAIC against four peer methods (GISTIC, STAC, KC-SMART, CMDS) on a large number of simulation datasets. Experimental results show that SAIC outperforms peer methods in terms of larger area under the Receiver Operating Characteristics curve and increased detection power. We then apply SAIC to analyze structural genomic aberrations acquired in four real cancer genome-wide copy number data sets (ovarian cancer, metastatic prostate cancer, lung adenocarcinoma, glioblastoma). When compared with previously reported results, SAIC successfully identifies most SCAs known to be of biological significance and associated with oncogenes (e.g., KRAS, CCNE1, and MYC) or tumor suppressor genes (e.g., CDKN2A/B). Furthermore, SAIC identifies a number of novel SCAs in these copy number data that encompass tumor related genes and may warrant further studies. Supported by a well-grounded theoretical framework, SAIC has been developed and used to identify SCAs in various cancer copy number data sets, providing useful information to study the landscape of cancer genomes. 
Open-source and platform-independent SAIC software is

  20. Genome wide identification of aberrant alternative splicing events in myotonic dystrophy type 2.

    PubMed

    Perfetti, Alessandra; Greco, Simona; Fasanaro, Pasquale; Bugiardini, Enrico; Cardani, Rosanna; Garcia-Manteiga, Jose M; Manteiga, Jose M Garcia; Riba, Michela; Cittaro, Davide; Stupka, Elia; Meola, Giovanni; Martelli, Fabio

    2014-01-01

    Myotonic dystrophy type 2 (DM2) is a genetic, autosomal dominant disease due to expansion of tetraplet (CCTG) repetitions in the first intron of the ZNF9/CNBP gene. DM2 is a multisystemic disorder affecting the skeletal muscle, the heart, the eye and the endocrine system. According to the proposed pathological mechanism, the expanded tetraplets have an RNA toxic effect, disrupting the splicing of many mRNAs. Thus, the identification of aberrantly spliced transcripts is instrumental for our understanding of the molecular mechanisms underpinning the disease. The aim of this study was the identification of new aberrant alternative splicing events in DM2 patients. By genome wide analysis of 10 DM2 patients and 10 controls (CTR), we identified 273 alternative spliced exons in 218 genes. While many aberrant splicing events were already identified in the past, most were new. A subset of these events was validated by qPCR assays in 19 DM2 and 15 CTR subjects. To gain insight into the molecular pathways involving the identified aberrantly spliced genes, we performed a bioinformatics analysis with Ingenuity system. This analysis indicated a deregulation of development, cell survival, metabolism, calcium signaling and contractility. In conclusion, our genome wide analysis provided a database of aberrant splicing events in the skeletal muscle of DM2 patients. The affected genes are involved in numerous pathways and networks important for muscle physio-pathology, suggesting that the identified variants may contribute to DM2 pathogenesis.

  1. Genome Wide Identification of Aberrant Alternative Splicing Events in Myotonic Dystrophy Type 2

    PubMed Central

    Fasanaro, Pasquale; Bugiardini, Enrico; Cardani, Rosanna; Manteiga, Jose M. Garcia; Riba, Michela; Cittaro, Davide; Stupka, Elia; Meola, Giovanni; Martelli, Fabio

    2014-01-01

    Myotonic dystrophy type 2 (DM2) is a genetic, autosomal dominant disease due to expansion of tetraplet (CCTG) repetitions in the first intron of the ZNF9/CNBP gene. DM2 is a multisystemic disorder affecting the skeletal muscle, the heart, the eye and the endocrine system. According to the proposed pathological mechanism, the expanded tetraplets have an RNA toxic effect, disrupting the splicing of many mRNAs. Thus, the identification of aberrantly spliced transcripts is instrumental for our understanding of the molecular mechanisms underpinning the disease. The aim of this study was the identification of new aberrant alternative splicing events in DM2 patients. By genome wide analysis of 10 DM2 patients and 10 controls (CTR), we identified 273 alternative spliced exons in 218 genes. While many aberrant splicing events were already identified in the past, most were new. A subset of these events was validated by qPCR assays in 19 DM2 and 15 CTR subjects. To gain insight into the molecular pathways involving the identified aberrantly spliced genes, we performed a bioinformatics analysis with Ingenuity system. This analysis indicated a deregulation of development, cell survival, metabolism, calcium signaling and contractility. In conclusion, our genome wide analysis provided a database of aberrant splicing events in the skeletal muscle of DM2 patients. The affected genes are involved in numerous pathways and networks important for muscle physio-pathology, suggesting that the identified variants may contribute to DM2 pathogenesis. PMID:24722564

  2. Genome-wide analysis of a Wnt1-regulated transcriptional network implicates neurodegenerative pathways.

    PubMed

    Wexler, Eric M; Rosen, Ezra; Lu, Daning; Osborn, Gregory E; Martin, Elizabeth; Raybould, Helen; Geschwind, Daniel H

    2011-10-04

    Wnt proteins are critical to mammalian brain development and function. The canonical Wnt signaling pathway involves the stabilization and nuclear translocation of β-catenin; however, Wnt also signals through alternative, noncanonical pathways. To gain a systems-level, genome-wide view of Wnt signaling, we analyzed Wnt1-stimulated changes in gene expression by transcriptional microarray analysis in cultured human neural progenitor (hNP) cells at multiple time points over a 72-hour time course. We observed a widespread oscillatory-like pattern of changes in gene expression, involving components of both the canonical and the noncanonical Wnt signaling pathways. A higher-order, systems-level analysis that combined independent component analysis, waveform analysis, and mutual information-based network construction revealed effects on pathways related to cell death and neurodegenerative disease. Wnt effectors were tightly clustered with presenilin1 (PSEN1) and granulin (GRN), which cause dominantly inherited forms of Alzheimer's disease and frontotemporal dementia (FTD), respectively. We further explored a potential link between Wnt1 and GRN and found that Wnt1 decreased GRN expression by hNPs. Conversely, GRN knockdown increased WNT1 expression, demonstrating that Wnt and GRN reciprocally regulate each other. Finally, we provided in vivo validation of the in vitro findings by analyzing gene expression data from individuals with FTD. These unbiased and genome-wide analyses provide evidence for a connection between Wnt signaling and the transcriptional regulation of neurodegenerative disease genes.

  3. Integrative Bayesian variable selection with gene-based informative priors for genome-wide association studies.

    PubMed

    Zhang, Xiaoshuai; Xue, Fuzhong; Liu, Hong; Zhu, Dianwen; Peng, Bin; Wiemels, Joseph L; Yang, Xiaowei

    2014-12-10

    Genome-wide Association Studies (GWAS) are typically designed to identify phenotype-associated single nucleotide polymorphisms (SNPs) individually using univariate analysis methods. Though providing valuable insights into genetic risks of common diseases, the genetic variants identified by GWAS generally account for only a small proportion of the total heritability for complex diseases. To solve this "missing heritability" problem, we implemented a strategy called integrative Bayesian Variable Selection (iBVS), which is based on a hierarchical model that incorporates an informative prior by considering the gene interrelationship as a network. It was applied here to both simulated and real data sets. Simulation studies indicated that the iBVS method was advantageous in its performance with highest AUC in both variable selection and outcome prediction, when compared to Stepwise and LASSO based strategies. In an analysis of a leprosy case-control study, iBVS selected 94 SNPs as predictors, while LASSO selected 100 SNPs. The Stepwise regression yielded a more parsimonious model with only 3 SNPs. The prediction results demonstrated that the iBVS method had comparable performance with that of LASSO, but better than Stepwise strategies. The proposed iBVS strategy is a novel and valid method for Genome-wide Association Studies, with the additional advantage in that it produces more interpretable posterior probabilities for each variable unlike LASSO and other penalized regression methods.

  4. Genome-wide association mapping of qualitatively inherited traits in a germplasm collection

    USDA-ARS's Scientific Manuscript database

    Genome-wide association (GWA) has been used as a tool for dissecting the genetic architecture of quantitatively inherited traits. We demonstrate here that GWA can also be highly useful for detecting the genomic locations of major genes governing categorically defined phenotype variants that exist fo...

  5. Genetic Variance Partitioning and Genome-Wide Prediction with Allele Dosage Information in Autotetraploid Potato.

    PubMed

    Endelman, Jeffrey B; Carley, Cari A Schmitz; Bethke, Paul C; Coombs, Joseph J; Clough, Mark E; da Silva, Washington L; De Jong, Walter S; Douches, David S; Frederick, Curtis M; Haynes, Kathleen G; Holm, David G; Miller, J Creighton; Muñoz, Patricio R; Navarro, Felix M; Novy, Richard G; Palta, Jiwan P; Porter, Gregory A; Rak, Kyle T; Sathuvalli, Vidyasagar R; Thompson, Asunta L; Yencho, G Craig

    2018-05-01

    As one of the world's most important food crops, the potato (Solanum tuberosum L.) has spurred innovation in autotetraploid genetics, including in the use of SNP arrays to determine allele dosage at thousands of markers. By combining genotype and pedigree information with phenotype data for economically important traits, the objectives of this study were to (1) partition the genetic variance into additive vs. nonadditive components, and (2) determine the accuracy of genome-wide prediction. Between 2012 and 2017, a training population of 571 clones was evaluated for total yield, specific gravity, and chip fry color. Genomic covariance matrices for additive (G), digenic dominant (D), and additive × additive epistatic (G # G) effects were calculated using 3895 markers, and the numerator relationship matrix (A) was calculated from a 13-generation pedigree. Based on model fit and prediction accuracy, mixed model analysis with G was superior to A for yield and fry color but not specific gravity. The amount of additive genetic variance captured by markers was 20% of the total genetic variance for specific gravity, compared to 45% for yield and fry color. Within the training population, including nonadditive effects improved accuracy and/or bias for all three traits when predicting total genotypic value. When six F1 populations were used for validation, prediction accuracy ranged from 0.06 to 0.63 and was consistently lower (0.13 on average) without allele dosage information. We conclude that genome-wide prediction is feasible in potato and that it will improve selection for breeding value given the substantial amount of nonadditive genetic variance in elite germplasm. Copyright © 2018 by the Genetics Society of America.

  6. Modelling Human Regulatory Variation in Mouse: Finding the Function in Genome-Wide Association Studies and Whole-Genome Sequencing

    PubMed Central

    Schmouth, Jean-François; Bonaguro, Russell J.; Corso-Diaz, Ximena; Simpson, Elizabeth M.

    2012-01-01

    An increasing body of literature from genome-wide association studies and human whole-genome sequencing highlights the identification of large numbers of candidate regulatory variants of potential therapeutic interest in numerous diseases. Our relatively poor understanding of the functions of non-coding genomic sequence, and the slow and laborious process of experimental validation of the functional significance of human regulatory variants, limits our ability to fully benefit from this information in our efforts to comprehend human disease. Humanized mouse models (HuMMs), in which human genes are introduced into the mouse, suggest an approach to this problem. In the past, HuMMs have been used successfully to study human disease variants; e.g., the complex genetic condition arising from Down syndrome, common monogenic disorders such as Huntington disease and β-thalassemia, and cancer susceptibility genes such as BRCA1. In this commentary, we highlight a novel method for high-throughput single-copy site-specific generation of HuMMs entitled High-throughput Human Genes on the X Chromosome (HuGX). This method can be applied to most human genes for which a bacterial artificial chromosome (BAC) construct can be derived and a mouse-null allele exists. This strategy comprises (1) the use of recombineering technology to create a human variant–harbouring BAC, (2) knock-in of this BAC into the mouse genome using Hprt docking technology, and (3) allele comparison by interspecies complementation. We demonstrate the throughput of the HuGX method by generating a series of seven different alleles for the human NR2E1 gene at Hprt. In future challenges, we consider the current limitations of experimental approaches and call for a concerted effort by the genetics community, for both human and mouse, to solve the challenge of the functional analysis of human regulatory variation. PMID:22396661

  7. Genome-wide association study and accuracy of genomic prediction for teat number in Duroc pigs using genotyping-by-sequencing.

    PubMed

    Tan, Cheng; Wu, Zhenfang; Ren, Jiangli; Huang, Zhuolin; Liu, Dewu; He, Xiaoyan; Prakapenka, Dzianis; Zhang, Ran; Li, Ning; Da, Yang; Hu, Xiaoxiang

    2017-03-29

    The number of teats in pigs is related to a sow's ability to rear piglets to weaning age. Several studies have identified genes and genomic regions that affect teat number in swine but few common results were reported. The objective of this study was to identify genetic factors that affect teat number in pigs, evaluate the accuracy of genomic prediction, and evaluate the contribution of significant genes and genomic regions to genomic broad-sense heritability and prediction accuracy using 41,108 autosomal single nucleotide polymorphisms (SNPs) from genotyping-by-sequencing on 2936 Duroc boars. Narrow-sense heritability and dominance heritability of teat number estimated by genomic restricted maximum likelihood were 0.365 ± 0.030 and 0.035 ± 0.019, respectively. The accuracy of genomic predictions, calculated as the average correlation between the genomic best linear unbiased prediction and phenotype in a tenfold validation study, was 0.437 ± 0.064 for the model with additive and dominance effects and 0.435 ± 0.064 for the model with additive effects only. Genome-wide association studies (GWAS) using three methods of analysis identified 85 significant SNP effects for teat number on chromosomes 1, 6, 7, 10, 11, 12 and 14. The region between 102.9 and 106.0 Mb on chromosome 7, which was reported in several studies, had the most significant SNP effects in or near the PTGR2, FAM161B, LIN52, VRTN, FCF1, AREL1 and LRRC74A genes. This region accounted for 10.0% of the genomic additive heritability and 8.0% of the accuracy of prediction. The second most significant chromosome region not reported by previous GWAS was the region between 77.7 and 79.7 Mb on chromosome 11, where SNPs in the FGF14 gene had the most significant effect and accounted for 5.1% of the genomic additive heritability and 5.2% of the accuracy of prediction. The 85 significant SNPs accounted for 28.5 to 28.8% of the genomic additive heritability and 35.8 to 36.8% of the accuracy of

  8. Genome-wide single nucleotide polymorphisms reveal population history and adaptive divergence in wild guppies.

    PubMed

    Willing, Eva-Maria; Bentzen, Paul; van Oosterhout, Cock; Hoffmann, Margarete; Cable, Joanne; Breden, Felix; Weigel, Detlef; Dreyer, Christine

    2010-03-01

    Adaptation of guppies (Poecilia reticulata) to contrasting upland and lowland habitats has been extensively studied with respect to behaviour, morphology and life history traits. Yet population history has not been studied at the whole-genome level. Although single nucleotide polymorphisms (SNPs) are the most abundant form of variation in many genomes and consequently very informative for a genome-wide picture of standing natural variation in populations, genome-wide SNP data are rarely available for wild vertebrates. Here we use genetically mapped SNP markers to comprehensively survey genetic variation within and among naturally occurring guppy populations from a wide geographic range in Trinidad and Venezuela. Results from three different clustering methods, Neighbor-net, principal component analysis (PCA) and Bayesian analysis show that the population substructure agrees with geographic separation and largely with previously hypothesized patterns of historical colonization. Within major drainages (Caroni, Oropouche and Northern), populations are genetically similar, but those in different geographic regions are highly divergent from one another, with some indications of ancient shared polymorphisms. Clear genomic signatures of a previous introduction experiment were seen, and we detected additional potential admixture events. Headwater populations were significantly less heterozygous than downstream populations. Pairwise F(ST) values revealed marked differences in allele frequencies among populations from different regions, and also among populations within the same region. F(ST) outlier methods indicated some regions of the genome as being under directional selection. Overall, this study demonstrates the power of a genome-wide SNP data set to inform for studies on natural variation, adaptation and evolution of wild populations.
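
    As a reminder of what a pairwise F(ST) value measures, the sketch below computes Wright's F(ST) for a single biallelic SNP from the allele frequencies in two populations, as (H_T - H_S) / H_T. It assumes equal population weights and ignores sample-size corrections; the abstract does not say which estimator the authors used, and the function name and example frequencies are hypothetical.

```python
def pairwise_fst(p1: float, p2: float) -> float:
    """Wright's F_ST for one biallelic locus in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    # Mean expected heterozygosity within the two populations (H_S).
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled population (H_T).
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # The locus is monomorphic overall, so there is no variation to partition.
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical frequencies: strongly diverged populations vs. identical ones.
diverged = pairwise_fst(0.1, 0.9)
identical = pairwise_fst(0.3, 0.3)
```

    Values near 0 indicate similar allele frequencies; values approaching 1 indicate populations that are nearly fixed for different alleles.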

  9. Genome-Wide Identification of Molecular Mimicry Candidates in Parasites

    PubMed Central

    Ludin, Philipp; Nilsson, Daniel; Mäser, Pascal

    2011-01-01

    Among the many strategies employed by parasites for immune evasion and host manipulation, one of the most fascinating is molecular mimicry. With genome sequences available for host and parasite, mimicry of linear amino acid epitopes can be investigated by comparative genomics. Here we developed an in silico pipeline for genome-wide identification of molecular mimicry candidate proteins or epitopes. The predicted proteome of a given parasite was broken down into overlapping fragments, each of which was screened for close hits in the human proteome. Control searches were carried out against unrelated, free-living eukaryotes to eliminate the generally conserved proteins, and with randomized versions of the parasite proteins to get an estimate of statistical significance. This simple but computation-intensive approach yielded interesting candidates from human-pathogenic parasites. From Plasmodium falciparum, it returned a 14 amino acid motif in several of the PfEMP1 variants identical to part of the heparin-binding domain in the immunosuppressive serum protein vitronectin. And in Brugia malayi, fragments were detected that matched to periphilin-1, a protein of cell-cell junctions involved in barrier formation. All the results are publicly available by means of mimicDB, a searchable online database for molecular mimicry candidates from pathogens. To our knowledge, this is the first genome-wide survey for molecular mimicry proteins in parasites. The strategy can be adopted to any pair of host and pathogen, once appropriate negative control organisms are chosen. MimicDB provides a host of new starting points to gain insights into the molecular nature of host-pathogen interactions. PMID:21408160
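
    The fragment-and-screen idea described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes exact identity instead of "close hits", omits the randomized-sequence significance estimate, and uses hypothetical function names and toy sequences.

```python
from typing import Dict, Iterator, List, Set, Tuple


def overlapping_fragments(seq: str, k: int, step: int = 1) -> Iterator[Tuple[int, str]]:
    """Yield (start, fragment) for every length-k window of seq."""
    for start in range(0, len(seq) - k + 1, step):
        yield start, seq[start:start + k]


def index_fragments(proteome: Dict[str, str], k: int) -> Set[str]:
    """Collect every length-k fragment that occurs anywhere in a proteome."""
    return {frag for seq in proteome.values() for _, frag in overlapping_fragments(seq, k)}


def mimicry_candidates(
    parasite: Dict[str, str],
    host: Dict[str, str],
    controls: Dict[str, str],
    k: int = 14,
) -> List[Tuple[str, int, str]]:
    """Return parasite fragments found in the host but absent from control organisms."""
    host_frags = index_fragments(host, k)
    control_frags = index_fragments(controls, k)
    hits = []
    for name, seq in parasite.items():
        for start, frag in overlapping_fragments(seq, k):
            # Keep host matches only; anything also seen in free-living controls
            # is treated as generally conserved and dropped.
            if frag in host_frags and frag not in control_frags:
                hits.append((name, start, frag))
    return hits


# Toy example with k=5 (all sequences are made up).
parasite = {"p1": "MKTAYIAKQRQ"}
host = {"h1": "GGGAKQRQGGG", "h2": "MKTAYWW"}
controls = {"c1": "MKTAYCCC"}
hits = mimicry_candidates(parasite, host, controls, k=5)
```

    In the toy data, the fragment shared with the control organism is filtered out and only the host-specific match remains.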

  10. A robust clustering algorithm for identifying problematic samples in genome-wide association studies.

    PubMed

    Bellenguez, Céline; Strange, Amy; Freeman, Colin; Donnelly, Peter; Spencer, Chris C A

    2012-01-01

    High-throughput genotyping arrays provide an efficient way to survey single nucleotide polymorphisms (SNPs) across the genome in large numbers of individuals. Downstream analysis of the data, for example in genome-wide association studies (GWAS), often involves statistical models of genotype frequencies across individuals. The complexities of the sample collection process and the potential for errors in the experimental assay can lead to biases and artefacts in an individual's inferred genotypes. Rather than attempting to model these complications, it has become a standard practice to remove individuals whose genome-wide data differ from the sample at large. Here we describe a simple, but robust, statistical algorithm to identify samples with atypical summaries of genome-wide variation. Its use as a semi-automated quality control tool is demonstrated using several summary statistics, selected to identify different potential problems, and it is applied to two different genotyping platforms and sample collections. The algorithm is written in R and is freely available at www.well.ox.ac.uk/chris-spencer. Contact: chris.spencer@well.ox.ac.uk. Supplementary data are available at Bioinformatics online.
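
    To illustrate the general idea of flagging samples whose summary statistics are atypical, the sketch below uses a median/MAD robust z-score. This is a deliberately simpler stand-in, not the clustering algorithm from the paper (which is distributed in R); the function name, the 3.5 cutoff, and the heterozygosity values are all hypothetical.

```python
import statistics
from typing import Dict, List


def flag_atypical_samples(summary: Dict[str, float], threshold: float = 3.5) -> List[str]:
    """Flag samples whose summary statistic lies far from the bulk of the data.

    Uses the median and the median absolute deviation (MAD), which are much less
    affected by the outliers themselves than the mean and standard deviation.
    """
    values = list(summary.values())
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        # No spread in the bulk of the data, so no meaningful scale to compare against.
        return []
    # 1.4826 rescales the MAD to match the standard deviation under normality.
    scale = 1.4826 * mad
    return sorted(s for s, v in summary.items() if abs(v - median) / scale > threshold)


# Hypothetical per-sample heterozygosity values; s6 is clearly atypical.
heterozygosity = {"s1": 0.30, "s2": 0.31, "s3": 0.29, "s4": 0.30, "s5": 0.32, "s6": 0.45}
flagged = flag_atypical_samples(heterozygosity)
```

    In practice, flagged samples would be reviewed before exclusion, in keeping with the semi-automated use described in the abstract.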

  11. Genome-wide association studies in maize: praise and stargaze

    USDA-ARS's Scientific Manuscript database

    Genome-wide association study (GWAS) has appeared as a widespread strategy in decoding genotype-phenotype associations in many species thanks to technical advances in next-generation sequencing (NGS) applications. Maize is an ideal crop for GWAS and significant progress has been made in the last dec...

  12. Genome-Wide Screening and Characterization of the Dof Gene Family in Physic Nut (Jatropha curcas L.).

    PubMed

    Wang, Peipei; Li, Jing; Gao, Xiaoyang; Zhang, Di; Li, Anlin; Liu, Changning

    2018-05-29

    Physic nut (Jatropha curcas L.) is a species of flowering plant with great potential for biofuel production and as an emerging model organism for functional genomic analysis, particularly in the Euphorbiaceae family. DNA binding with one finger (Dof) transcription factors play critical roles in numerous biological processes in plants. Nevertheless, the knowledge about members, and the evolutionary and functional characteristics of the Dof gene family in physic nut is insufficient. Therefore, we performed a genome-wide screening and characterization of the Dof gene family within the physic nut draft genome. In total, 24 JcDof genes (encoding 33 JcDof proteins) were identified. All the JcDof genes were divided into three major groups based on phylogenetic inference, which was further validated by the subsequent gene structure and motif analysis. Genome comparison revealed that segmental duplication may have played crucial roles in the expansion of the JcDof gene family, and gene expansion was mainly subjected to positive selection. The expression profile demonstrated the broad involvement of JcDof genes in response to various abiotic stresses, hormonal treatments and functional divergence. This study provides valuable information for better understanding the evolution of JcDof genes, and lays a foundation for future functional exploration of JcDof genes.

  13. Genome wide association mapping for grain shape traits in indica rice.

    PubMed

    Feng, Yue; Lu, Qing; Zhai, Rongrong; Zhang, Mengchen; Xu, Qun; Yang, Yaolong; Wang, Shan; Yuan, Xiaoping; Yu, Hanyong; Wang, Yiping; Wei, Xinghua

    2016-10-01

    Using genome-wide association mapping, 47 SNPs within 27 significant loci were identified for four grain shape traits, and 424 candidate genes were predicted from a public database. Grain shape is a key determinant of grain yield and quality in rice (Oryza sativa L.). However, our knowledge of genes controlling rice grain shape remains limited. Genome-wide association mapping based on linkage disequilibrium (LD) has recently emerged as an effective approach for identifying genes or quantitative trait loci (QTL) underlying complex traits in plants. In this study, association mapping based on 5291 single nucleotide polymorphisms (SNPs) was conducted to identify significant loci associated with grain shape traits in a global collection of 469 diverse rice accessions. A total of 47 SNPs were located in 27 significant loci for four grain traits, and explained approximately 44.93-65.90% of the phenotypic variation for each trait. In total, 424 candidate genes within a 200 kb extension region (±100 kb of each locus) of these loci were predicted. Of them, the cloned genes GS3 and qSW5 showed very strong effects on grain length and grain width in our study. Compared with previously reported QTLs for grain shape traits, we found 11 novel loci, including 3, 3, 2 and 3 loci for grain length, grain width, grain length-width ratio and thousand grain weight, respectively. Validation of these new loci will be performed in future studies. These results revealed that besides GS3 and qSW5, multiple novel loci and mechanisms were involved in determining rice grain shape. These findings provide valuable information for understanding the genetic control of grain shape and for molecular marker-assisted selection (MAS) breeding in rice.
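
    The ±100 kb candidate-gene window mentioned above amounts to an interval-overlap query. A minimal sketch, assuming a simple in-memory gene list; the gene names, coordinates, and function name are hypothetical and do not come from the study:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def genes_near_locus(
    genes: List[Gene], chrom: str, locus_start: int, locus_end: int, flank: int = 100_000
) -> List[str]:
    """Return names of genes overlapping the locus extended by `flank` bp on each side."""
    lo = max(1, locus_start - flank)
    hi = locus_end + flank
    # Two closed intervals overlap when each one starts before the other ends.
    return [g.name for g in genes if g.chrom == chrom and g.start <= hi and g.end >= lo]


# Hypothetical annotation and a single-SNP locus at chr3:1,500,000.
annotation = [
    Gene("geneA", "chr3", 1_390_000, 1_401_000),  # overlaps the left edge of the window
    Gene("geneB", "chr3", 1_550_000, 1_560_000),  # inside the window
    Gene("geneC", "chr3", 1_700_000, 1_710_000),  # too far downstream
    Gene("geneD", "chr5", 1_500_000, 1_505_000),  # wrong chromosome
]
candidates = genes_near_locus(annotation, "chr3", 1_500_000, 1_500_000)
```

    A gene that only partly overlaps the window is counted here; requiring full containment would be an equally defensible choice.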

  14. A Genome-Wide Scan for Breast Cancer Risk Haplotypes among African American Women

    PubMed Central

    Song, Chi; Chen, Gary K.; Millikan, Robert C.; Ambrosone, Christine B.; John, Esther M.; Bernstein, Leslie; Zheng, Wei; Hu, Jennifer J.; Ziegler, Regina G.; Nyante, Sarah; Bandera, Elisa V.; Ingles, Sue A.; Press, Michael F.; Deming, Sandra L.; Rodriguez-Gil, Jorge L.; Chanock, Stephen J.; Wan, Peggy; Sheng, Xin; Pooler, Loreall C.; Van Den Berg, David J.; Le Marchand, Loic; Kolonel, Laurence N.; Henderson, Brian E.; Haiman, Chris A.; Stram, Daniel O.

    2013-01-01

    Genome-wide association studies (GWAS) simultaneously investigating hundreds of thousands of single nucleotide polymorphisms (SNP) have become a powerful tool in the investigation of new disease susceptibility loci. Haplotypes are sometimes thought to be superior to SNPs and are promising in genetic association analyses. The application of genome-wide haplotype analysis, however, is hindered by the complexity of haplotypes themselves and sophistication in computation. We systematically analyzed the haplotype effects for breast cancer risk among 5,761 African American women (3,016 cases and 2,745 controls) using a sliding window approach on the genome-wide scale. Three regions on chromosomes 1, 4 and 18 exhibited moderate haplotype effects. Furthermore, among 21 breast cancer susceptibility loci previously established in European populations, 10p15 and 14q24 are likely to harbor novel haplotype effects. We also proposed a heuristic of determining the significance level and the effective number of independent tests by the permutation analysis on chromosome 22 data. It suggests that the effective number was approximately half of the total (7,794 out of 15,645), thus the half number could serve as a quick reference to evaluating genome-wide significance if a similar sliding window approach of haplotype analysis is adopted in similar populations using similar genotype density. PMID:23468962
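
    The effect of using an effective number of independent tests can be shown with simple arithmetic. The sketch below applies a Bonferroni-style correction to the chromosome 22 counts quoted in the abstract; the 0.05 family-wise error rate and the function name are assumptions, and the authors' permutation procedure itself is not reproduced here.

```python
def corrected_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold for a given family-wise error rate."""
    return alpha / n_tests


TOTAL_TESTS = 15_645      # sliding-window haplotype tests on chromosome 22 (from the abstract)
EFFECTIVE_TESTS = 7_794   # effective number of independent tests (from the abstract)
ALPHA = 0.05              # assumed family-wise error rate

naive = corrected_threshold(ALPHA, TOTAL_TESTS)
effective = corrected_threshold(ALPHA, EFFECTIVE_TESTS)
fraction_independent = EFFECTIVE_TESTS / TOTAL_TESTS

print(f"threshold using all tests:       {naive:.3e}")
print(f"threshold using effective tests: {effective:.3e}")
print(f"fraction of tests independent:   {fraction_independent:.3f}")
```

    Because roughly half of the tests are effectively independent, the corrected threshold is about twice as permissive as a naive correction over all windows.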

  15. Genome-wide patterns of selection in 230 ancient Eurasians

    PubMed Central

    Mathieson, Iain; Lazaridis, Iosif; Rohland, Nadin; Mallick, Swapan; Patterson, Nick; Roodenberg, Songül Alpaslan; Harney, Eadaoin; Stewardson, Kristin; Fernandes, Daniel; Novak, Mario; Sirak, Kendra; Gamba, Cristina; Jones, Eppie R.; Llamas, Bastien; Dryomov, Stanislav; Pickrel, Joseph; Arsuaga, Juan Luís; de Castro, José María Bermúdez; Carbonell, Eudald; Gerritsen, Fokke; Khokhlov, Aleksandr; Kuznetsov, Pavel; Lozano, Marina; Meller, Harald; Mochalov, Oleg; Moiseyev, Vayacheslav; Rojo Guerra, Manuel A.; Roodenberg, Jacob; Vergès, Josep Maria; Krause, Johannes; Cooper, Alan; Alt, Kurt W.; Brown, Dorcas; Anthony, David; Lalueza-Fox, Carles; Haak, Wolfgang; Pinhasi, Ron; Reich, David

    2016-01-01

    Ancient DNA makes it possible to directly witness natural selection by analyzing samples from populations before, during and after adaptation events. Here we report the first scan for selection using ancient DNA, capitalizing on the largest genome-wide dataset yet assembled: 230 West Eurasians dating to between 6500 and 1000 BCE, including 163 with newly reported data. The new samples include the first genome-wide data from the Anatolian Neolithic culture whose genetic material we extracted from the DNA-rich petrous bone and who we show were members of the population that was the source of Europe’s first farmers. We also report a complete transect of the steppe region in Samara between 5500 and 1200 BCE that allows us to recognize admixture from at least two external sources into steppe populations during this period. We detect selection at loci associated with diet, pigmentation and immunity, and two independent episodes of selection on height. PMID:26595274

  16. Genome-wide association studies of obesity and metabolic syndrome.

    PubMed

    Fall, Tove; Ingelsson, Erik

    2014-01-25

    Until just a few years ago, the genetic determinants of obesity and metabolic syndrome were largely unknown, with the exception of a few forms of monogenic extreme obesity. Since genome-wide association studies (GWAS) became available, large advances have been made. The first single nucleotide polymorphism robustly associated with increased body mass index (BMI) was mapped in 2007 to a gene whose function was unknown at the time. This gene, now known as fat mass and obesity associated (FTO), has been repeatedly replicated in several ethnicities and affects obesity by regulating appetite. Since the first report from a GWAS of obesity, an increasing number of markers have been shown to be associated with BMI, other measures of obesity or fat distribution and metabolic syndrome. This systematic review of obesity GWAS will summarize genome-wide significant findings for obesity and metabolic syndrome and briefly give a few suggestions of what is to be expected in the next few years. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  17. Microbial genome-wide association studies: lessons from human GWAS.

    PubMed

    Power, Robert A; Parkhill, Julian; de Oliveira, Tulio

    2017-01-01

    The reduced costs of sequencing have led to whole-genome sequences for a large number of microorganisms, enabling the application of microbial genome-wide association studies (GWAS). Given the successes of human GWAS in understanding disease aetiology and identifying potential drug targets, microbial GWAS are likely to further advance our understanding of infectious diseases. These advances include insights into pressing global health problems, such as antibiotic resistance and disease transmission. In this Review, we outline the methodologies of GWAS, the current state of the field of microbial GWAS, and how lessons from human GWAS can direct the future of the field.

  18. A GENOME-WIDE LINKAGE AND ASSOCIATION SCAN REVEALS NOVEL LOCI FOR AUTISM

    PubMed Central

    Weiss, Lauren A.; Arking, Dan E.

    2009-01-01

    Summary: Although autism is a highly heritable neurodevelopmental disorder, attempts to identify specific susceptibility genes have thus far met with limited success [1]. Genome-wide association studies (GWAS) using half a million or more markers, particularly those with very large sample sizes achieved through meta-analysis, have shown great success in mapping genes for other complex genetic traits (http://www.genome.gov/26525384). Consequently, we initiated a linkage and association mapping study using half a million genome-wide SNPs in a common set of 1,031 multiplex autism families (1,553 affected offspring). We identified regions of suggestive and significant linkage on chromosomes 6q27 and 20p13, respectively. Initial analysis did not yield genome-wide significant associations; however, genotyping of top hits in additional families revealed a SNP on chromosome 5p15 (between SEMA5A and TAS2R1) that was significantly associated with autism (P = 2 × 10^-7). We also demonstrated that expression of SEMA5A is reduced in brains from autistic patients, further implicating SEMA5A as an autism susceptibility gene. The linkage regions reported here provide targets for rare variation screening while the discovery of a single novel association demonstrates the action of common variants. PMID:19812673

  19. MAGNAMWAR: an R package for genome-wide association studies of bacterial orthologs.

    PubMed

    Sexton, Corinne E; Smith, Hayden Z; Newell, Peter D; Douglas, Angela E; Chaston, John M

    2018-06-01

    Here we report on an R package for genome-wide association studies of orthologous genes in bacteria. Before using the software, orthologs from bacterial genomes or metagenomes are defined using local or online implementations of OrthoMCL. These presence-absence patterns are statistically associated with variation in user-collected phenotypes using the Mono-Associated GNotobiotic Animals Metagenome-Wide Association R package (MAGNAMWAR). Genotype-phenotype associations can be performed with several different statistical tests based on the type and distribution of the data. MAGNAMWAR is available on CRAN. Contact: john_chaston@byu.edu.

  20. Genome-wide bisulfite sensitivity profiling of yeast suggests bisulfite inhibits transcription.

    PubMed

    Segovia, Romulo; Mathew, Veena; Tam, Annie S; Stirling, Peter C

    2017-09-01

    Bisulfite, in the form of sodium bisulfite or metabisulfite, is used commercially as a food preservative. Bisulfite is used in the laboratory as a single-stranded DNA mutagen in epigenomic analyses of DNA methylation. Recently it has also been used on whole yeast cells to induce mutations in exposed single-stranded regions in vivo. To understand the effects of bisulfite on live cells we conducted a genome-wide screen for bisulfite sensitive mutants in yeast. Screening the deletion mutant array and collections of essential gene mutants, we define a genetic network of bisulfite sensitive mutants. Validation of screen hits revealed hyper-sensitivity of transcription and RNA processing mutants, rather than DNA repair pathways, and follow-up analyses support a role in perturbation of RNA transactions. We propose a model in which bisulfite-modified nucleotides may interfere with transcription or RNA metabolism when used in vivo.

  1. Transcriptome sequencing reveals genome-wide variation in molecular evolutionary rate among ferns.

    PubMed

    Grusz, Amanda L; Rothfels, Carl J; Schuettpelz, Eric

    2016-08-30

    Transcriptomics in non-model plant systems has recently reached a point where the examination of nuclear genome-wide patterns in understudied groups is an achievable reality. This progress is especially notable in evolutionary studies of ferns, for which molecular resources to date have been derived primarily from the plastid genome. Here, we utilize transcriptome data in the first genome-wide comparative study of molecular evolutionary rate in ferns. We focus on the ecologically diverse family Pteridaceae, which comprises about 10% of fern diversity and includes the enigmatic vittarioid ferns, an epiphytic, tropical lineage known for dramatically reduced morphologies and radically elongated phylogenetic branch lengths. Using expressed sequence data for 2091 loci, we perform pairwise comparisons of molecular evolutionary rate among 12 species spanning the three largest clades in the family and ask whether previously documented heterogeneity in plastid substitution rates is reflected in their nuclear genomes. We then inquire whether variation in evolutionary rate is being shaped by genes belonging to specific functional categories and test for differential patterns of selection. We find significant, genome-wide differences in evolutionary rate for vittarioid ferns relative to all other lineages within the Pteridaceae, but we recover few significant correlations between faster/slower vittarioid loci and known functional gene categories. We demonstrate that the faster rates characteristic of the vittarioid ferns are likely not driven by positive selection, nor are they unique to any particular type of nucleotide substitution. Our results reinforce recently reviewed mechanisms hypothesized to shape molecular evolutionary rates in vittarioid ferns and provide novel insight into substitution rate variation both within and among fern nuclear genomes.

  2. Genome-wide characterization of microRNA in foxtail millet (Setaria italica).

    PubMed

    Yi, Fei; Xie, Shaojun; Liu, Yuwei; Qi, Xin; Yu, Jingjuan

    2013-12-13

    MicroRNAs (miRNAs) are a class of short non-coding, endogenous RNAs that play key roles in many biological processes in both animals and plants. Although many miRNAs have been identified in a large number of organisms, the miRNAs in foxtail millet (Setaria italica) have, until now, been poorly understood. In this study, two replicate small RNA libraries from foxtail millet shoots were sequenced, and 40 million reads representing over 10 million unique sequences were generated. We identified 43 known miRNAs, 172 novel miRNAs and 2 mirtron precursor candidates in foxtail millet. Some miRNA*s of the known and novel miRNAs were detected as well. Further, eight novel miRNAs were validated by stem-loop RT-PCR. Potential targets of the foxtail millet miRNAs were predicted based on our strict criteria. Of the predicted target genes, 79% (351) had functional annotations in InterPro and GO analyses, indicating the targets of the miRNAs were involved in a wide range of regulatory functions and some specific biological processes. A total of 69 pairs of syntenic miRNA precursors that were conserved between foxtail millet and sorghum were found. Additionally, stem-loop RT-PCR was conducted to confirm the tissue-specific expression of some miRNAs in the four tissues identified by deep-sequencing. We predicted, for the first time, 215 miRNAs and 447 miRNA targets in foxtail millet at a genome-wide level. The precursors, expression levels, miRNA* sequences, target functions, conservation, and evolution of miRNAs we identified were investigated. Some of the novel foxtail millet miRNAs and miRNA targets were validated experimentally.

  3. A review of genome-wide approaches to study the genetic basis for spermatogenic defects.

    PubMed

    Aston, Kenneth I; Conrad, Donald F

    2013-01-01

    Rapidly advancing tools for genetic analysis on a genome-wide scale have been instrumental in identifying the genetic bases for many complex diseases. About half of male infertility cases are of unknown etiology in spite of tremendous efforts to characterize the genetic basis for the disorder. Advancing our understanding of the genetic basis for male infertility will require the application of established and emerging genomic tools. This chapter introduces many of the tools available for genetic studies on a genome-wide scale along with principles of study design and data analysis.

  4. Genome-wide association analysis identifies 30 new susceptibility loci for schizophrenia.

    PubMed

    Li, Zhiqiang; Chen, Jianhua; Yu, Hao; He, Lin; Xu, Yifeng; Zhang, Dai; Yi, Qizhong; Li, Changgui; Li, Xingwang; Shen, Jiawei; Song, Zhijian; Ji, Weidong; Wang, Meng; Zhou, Juan; Chen, Boyu; Liu, Yahui; Wang, Jiqiang; Wang, Peng; Yang, Ping; Wang, Qingzhong; Feng, Guoyin; Liu, Benxiu; Sun, Wensheng; Li, Baojie; He, Guang; Li, Weidong; Wan, Chunling; Xu, Qi; Li, Wenjin; Wen, Zujia; Liu, Ke; Huang, Fang; Ji, Jue; Ripke, Stephan; Yue, Weihua; Sullivan, Patrick F; O'Donovan, Michael C; Shi, Yongyong

    2017-11-01

    We conducted a genome-wide association study (GWAS) with replication in 36,180 Chinese individuals and performed further transancestry meta-analyses with data from the Psychiatry Genomics Consortium (PGC2). Approximately 95% of the genome-wide significant (GWS) index alleles (or their proxies) from the PGC2 study were overrepresented in Chinese schizophrenia cases, including ∼50% that achieved nominal significance and ∼75% that continued to be GWS in the transancestry analysis. The Chinese-only analysis identified seven GWS loci; three of these also were GWS in the transancestry analyses, which identified 109 GWS loci, thus yielding a total of 113 GWS loci (30 novel) in at least one of these analyses. We observed improvements in the fine-mapping resolution at many susceptibility loci. Our results provide several lines of evidence supporting candidate genes at many loci and highlight some pathways for further research. Together, our findings provide novel insight into the genetic architecture and biological etiology of schizophrenia.

  5. Partitioning heritability by functional annotation using genome-wide association summary statistics.

    PubMed

    Finucane, Hilary K; Bulik-Sullivan, Brendan; Gusev, Alexander; Trynka, Gosia; Reshef, Yakir; Loh, Po-Ru; Anttila, Verneri; Xu, Han; Zang, Chongzhi; Farh, Kyle; Ripke, Stephan; Day, Felix R; Purcell, Shaun; Stahl, Eli; Lindstrom, Sara; Perry, John R B; Okada, Yukinori; Raychaudhuri, Soumya; Daly, Mark J; Patterson, Nick; Neale, Benjamin M; Price, Alkes L

    2015-11-01

    Recent work has demonstrated that some functional categories of the genome contribute disproportionately to the heritability of complex diseases. Here we analyze a broad set of functional elements, including cell type-specific elements, to estimate their polygenic contributions to heritability in genome-wide association studies (GWAS) of 17 complex diseases and traits with an average sample size of 73,599. To enable this analysis, we introduce a new method, stratified LD score regression, for partitioning heritability from GWAS summary statistics while accounting for linked markers. This new method is computationally tractable at very large sample sizes and leverages genome-wide information. Our findings include a large enrichment of heritability in conserved regions across many traits, a very large immunological disease-specific enrichment of heritability in FANTOM5 enhancers and many cell type-specific enrichments, including significant enrichment of central nervous system cell types in the heritability of body mass index, age at menarche, educational attainment and smoking behavior.

  6. Cooperative Genome-Wide Analysis Shows Increased Homozygosity in Early Onset Parkinson's Disease

    PubMed Central

    Nalls, Michael A.; Martinez, Maria; Schulte, Claudia; Holmans, Peter; Gasser, Thomas; Hardy, John; Singleton, Andrew B.; Wood, Nicholas W.; Brice, Alexis; Heutink, Peter; Williams, Nigel; Morris, Huw R.

    2012-01-01

    Parkinson's disease (PD) occurs in both familial and sporadic forms, and both monogenic and complex genetic factors have been identified. Early onset PD (EOPD) is particularly associated with autosomal recessive (AR) mutations, and three genes, PARK2, PARK7 and PINK1, have been found to carry mutations leading to AR disease. Since mutations in these genes account for less than 10% of EOPD patients, we hypothesized that further recessive genetic factors are involved in this disorder, which may appear in extended runs of homozygosity. We carried out genome-wide SNP genotyping to look for extended runs of homozygosity (ROHs) in 1,445 EOPD cases and 6,987 controls. Logistic regression analyses showed an increased level of genomic homozygosity in EOPD cases compared to controls. These differences are larger for ROH of 9 Mb and above, where there is a more than three-fold increase in the proportion of cases carrying a ROH. These differences are not explained by occult recessive mutations at existing loci. Controlling for genome-wide homozygosity in logistic regression analyses increased the differences between cases and controls, indicating that in EOPD cases ROHs do not simply relate to genome-wide measures of inbreeding. Homozygosity at a locus on chromosome 19p13.3 was identified as being more common in EOPD cases as compared to controls. Sequencing analysis of genes and predicted transcripts within this locus failed to identify a novel mutation causing EOPD in our cohort. There is an increased rate of genome-wide homozygosity in EOPD, as measured by an increase in ROHs. These ROHs are a signature of inbreeding and do not necessarily harbour disease-causing genetic variants. Although there might be other regions of interest apart from chromosome 19p13.3, we lack the power to detect them with this analysis. PMID:22427796
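
    To make the notion of an extended run of homozygosity concrete, the following Python sketch scans one individual's ordered genotypes on a single chromosome and reports runs that span at least a chosen physical length. It is a simplified illustration, not the pipeline used in the study: the function name find_roh, the 0/1/2 genotype coding, and the rule that any heterozygous or missing call ends a run are all assumptions made here. Dedicated tools typically tolerate occasional heterozygous or missing calls inside a run.

```python
def find_roh(positions, genotypes, min_length_bp, min_snps=1):
    """Return (start_bp, end_bp, n_snps) for each qualifying run of homozygosity.

    positions -- SNP base-pair positions, sorted ascending, one chromosome
    genotypes -- alternate-allele counts per SNP: 0 or 2 = homozygous,
                 1 = heterozygous, None = missing (assumed coding)
    """
    if len(positions) != len(genotypes):
        raise ValueError("positions and genotypes must have the same length")

    runs = []      # (first_index, last_index) of each maximal homozygous run
    start = None   # index where the current run began, or None if not in a run
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i
        else:
            # A heterozygous or missing call closes the current run.
            if start is not None:
                runs.append((start, i - 1))
                start = None
    if start is not None:
        runs.append((start, len(genotypes) - 1))

    result = []
    for first, last in runs:
        span = positions[last] - positions[first]
        n_snps = last - first + 1
        if span >= min_length_bp and n_snps >= min_snps:
            result.append((positions[first], positions[last], n_snps))
    return result
```

    Calling find_roh(positions, genotypes, min_length_bp=9_000_000) would keep only runs of 9 Mb and above, the length class for which the abstract reports the largest case/control difference.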

  7. Genome-Wide Association of Heroin Dependence in Han Chinese.

    PubMed

    Kalsi, Gursharan; Euesden, Jack; Coleman, Jonathan R I; Ducci, Francesca; Aliev, Fazil; Newhouse, Stephen J; Liu, Xiehe; Ma, Xiaohong; Wang, Yingcheng; Collier, David A; Asherson, Philip; Li, Tao; Breen, Gerome

    2016-01-01

    Drug addiction is a costly and recurring healthcare problem, necessitating a need to understand risk factors and mechanisms of addiction, and to identify new biomarkers. To date, genome-wide association studies (GWAS) for heroin addiction have been limited; moreover they have been restricted to examining samples of European and African-American origin due to difficulty of recruiting samples from other populations. This is the first study to test a Han Chinese population; we performed a GWAS on a homogeneous sample of 370 Han Chinese subjects diagnosed with heroin dependence using the DSM-IV criteria and 134 ethnically matched controls. Analysis using the diagnostic criteria of heroin dependence yielded suggestive evidence for association between variants in the genes CCDC42 (coiled coil domain 42; p = 2.8 × 10^-7) and BRSK2 (BR serine/threonine 2; p = 4.1 × 10^-6). In addition, we found evidence for risk variants within the ARHGEF10 (Rho guanine nucleotide exchange factor 10) gene on chromosome 8 and variants in a region on chromosome 20q13, which is gene-poor but has a concentration of mRNAs and predicted miRNAs. Gene-based association analysis identified genome-wide significant association between variants in CCDC42 and heroin addiction. Additionally, when we investigated shared risk variants between heroin addiction and risk of other addiction-related and psychiatric phenotypes using polygenic risk scores, we found a suggestive relationship with variants predicting tobacco addiction, and a significant relationship with variants predicting schizophrenia. Our genome-wide association study of heroin dependence provides data in a novel sample, with functionally plausible results and evidence of genetic data of value to the field.

  8. A genome-wide association study of corneal astigmatism: The CREAM Consortium.

    PubMed

    Shah, Rupal L; Li, Qing; Zhao, Wanting; Tedja, Milly S; Tideman, J Willem L; Khawaja, Anthony P; Fan, Qiao; Yazar, Seyhan; Williams, Katie M; Verhoeven, Virginie J M; Xie, Jing; Wang, Ya Xing; Hess, Moritz; Nickels, Stefan; Lackner, Karl J; Pärssinen, Olavi; Wedenoja, Juho; Biino, Ginevra; Concas, Maria Pina; Uitterlinden, André; Rivadeneira, Fernando; Jaddoe, Vincent W V; Hysi, Pirro G; Sim, Xueling; Tan, Nicholas; Tham, Yih-Chung; Sensaki, Sonoko; Hofman, Albert; Vingerling, Johannes R; Jonas, Jost B; Mitchell, Paul; Hammond, Christopher J; Höhn, René; Baird, Paul N; Wong, Tien-Yin; Cheng, Ching-Yu; Teo, Yik Ying; Mackey, David A; Williams, Cathy; Saw, Seang-Mei; Klaver, Caroline C W; Guggenheim, Jeremy A; Bailey-Wilson, Joan E

    2018-01-01

    To identify genes and genetic markers associated with corneal astigmatism. A meta-analysis of genome-wide association studies (GWASs) of corneal astigmatism undertaken for 14 European ancestry (n=22,250) and 8 Asian ancestry (n=9,120) cohorts was performed by the Consortium for Refractive Error and Myopia. Cases were defined as having >0.75 diopters of corneal astigmatism. Subsequent gene-based and gene-set analyses of the meta-analyzed results of European ancestry cohorts were performed using VEGAS2 and MAGMA software. Additionally, estimates of single nucleotide polymorphism (SNP)-based heritability for corneal and refractive astigmatism and the spherical equivalent were calculated for Europeans using LD score regression. The meta-analysis of all cohorts identified a genome-wide significant locus near the platelet-derived growth factor receptor alpha (PDGFRA) gene: top SNP: rs7673984, odds ratio=1.12 (95% CI:1.08-1.16), p=5.55×10^-9. No other genome-wide significant loci were identified in the combined analysis or European/Asian ancestry-specific analyses. Gene-based analysis identified three novel candidate genes for corneal astigmatism in Europeans: claudin-7 (CLDN7), acid phosphatase 2, lysosomal (ACP2), and TNF alpha-induced protein 8 like 3 (TNFAIP8L3). In addition to replicating a previously identified genome-wide significant locus for corneal astigmatism near the PDGFRA gene, gene-based analysis identified three novel candidate genes, CLDN7, ACP2, and TNFAIP8L3, that warrant further investigation to understand their role in the pathogenesis of corneal astigmatism. The much lower number of genetic variants and genes demonstrating an association with corneal astigmatism compared to published spherical equivalent GWAS analyses suggests a greater influence of rare genetic variants, non-additive genetic effects, or environmental factors in the development of astigmatism.

  9. A genome-wide association study of corneal astigmatism: The CREAM Consortium

    PubMed Central

    Shah, Rupal L.; Li, Qing; Zhao, Wanting; Tedja, Milly S.; Tideman, J. Willem L.; Khawaja, Anthony P.; Fan, Qiao; Yazar, Seyhan; Williams, Katie M.; Verhoeven, Virginie J.M.; Xie, Jing; Wang, Ya Xing; Hess, Moritz; Nickels, Stefan; Lackner, Karl J.; Pärssinen, Olavi; Wedenoja, Juho; Biino, Ginevra; Concas, Maria Pina; Uitterlinden, André; Rivadeneira, Fernando; Jaddoe, Vincent W.V.; Hysi, Pirro G.; Sim, Xueling; Tan, Nicholas; Tham, Yih-Chung; Sensaki, Sonoko; Hofman, Albert; Vingerling, Johannes R.; Jonas, Jost B.; Mitchell, Paul; Hammond, Christopher J.; Höhn, René; Baird, Paul N.; Wong, Tien-Yin; Cheng, Ching-Yu; Teo, Yik Ying; Mackey, David A.; Williams, Cathy; Saw, Seang-Mei; Klaver, Caroline C.W.; Bailey-Wilson, Joan E.

    2018-01-01

    Purpose: To identify genes and genetic markers associated with corneal astigmatism. Methods: A meta-analysis of genome-wide association studies (GWASs) of corneal astigmatism undertaken for 14 European ancestry (n=22,250) and 8 Asian ancestry (n=9,120) cohorts was performed by the Consortium for Refractive Error and Myopia. Cases were defined as having >0.75 diopters of corneal astigmatism. Subsequent gene-based and gene-set analyses of the meta-analyzed results of European ancestry cohorts were performed using VEGAS2 and MAGMA software. Additionally, estimates of single nucleotide polymorphism (SNP)-based heritability for corneal and refractive astigmatism and the spherical equivalent were calculated for Europeans using LD score regression. Results: The meta-analysis of all cohorts identified a genome-wide significant locus near the platelet-derived growth factor receptor alpha (PDGFRA) gene: top SNP: rs7673984, odds ratio=1.12 (95% CI:1.08–1.16), p=5.55×10^-9. No other genome-wide significant loci were identified in the combined analysis or European/Asian ancestry-specific analyses. Gene-based analysis identified three novel candidate genes for corneal astigmatism in Europeans: claudin-7 (CLDN7), acid phosphatase 2, lysosomal (ACP2), and TNF alpha-induced protein 8 like 3 (TNFAIP8L3). Conclusions: In addition to replicating a previously identified genome-wide significant locus for corneal astigmatism near the PDGFRA gene, gene-based analysis identified three novel candidate genes, CLDN7, ACP2, and TNFAIP8L3, that warrant further investigation to understand their role in the pathogenesis of corneal astigmatism. The much lower number of genetic variants and genes demonstrating an association with corneal astigmatism compared to published spherical equivalent GWAS analyses suggests a greater influence of rare genetic variants, non-additive genetic effects, or environmental factors in the development of astigmatism. PMID:29422769

  10. Population Stratification in the Context of Diverse Epidemiologic Surveys Sans Genome-Wide Data

    PubMed Central

    Oetjens, Matthew T.; Brown-Gentry, Kristin; Goodloe, Robert; Dilks, Holli H.; Crawford, Dana C.

    2016-01-01

    Population stratification or confounding by genetic ancestry is a potential cause of false associations in genetic association studies. Estimation of and adjustment for genetic ancestry has become common practice thanks in part to the availability of ancestry informative markers on genome-wide association study (GWAS) arrays. While array data is now widespread, these data are not ubiquitous as several large epidemiologic and clinic-based studies lack genome-wide data. One such large epidemiologic-based study lacking genome-wide data accessible to investigators is the National Health and Nutrition Examination Surveys (NHANES), population-based cross-sectional surveys of Americans linked to demographic, health, and lifestyle data conducted by the Centers for Disease Control and Prevention. DNA samples (n = 14,998) were extracted from biospecimens from consented NHANES participants between 1991–1994 (NHANES III, phase 2) and 1999–2002 and represent three major self-identified racial/ethnic groups: non-Hispanic whites (n = 6,634), non-Hispanic blacks (n = 3,458), and Mexican Americans (n = 3,950). We as the Epidemiologic Architecture for Genes Linked to Environment study genotyped candidate gene and GWAS-identified index variants in NHANES as part of the larger Population Architecture using Genomics and Epidemiology I study for collaborative genetic association studies. To enable basic quality control such as estimation of genetic ancestry to control for population stratification in NHANES sans genome-wide data, we outline here strategies that use limited genetic data to identify the markers optimal for characterizing genetic ancestry. From among 411 and 295 autosomal SNPs available in NHANES III and NHANES 1999–2002, we demonstrate that markers with ancestry information can be identified to estimate global ancestry. Despite limited resolution, global genetic ancestry is highly correlated with self-identified race for the majority of participants, although less so

  11. Genome-Wide Meta-Analysis of Sciatica in Finnish Population.

    PubMed

    Lemmelä, Susanna; Solovieva, Svetlana; Shiri, Rahman; Benner, Christian; Heliövaara, Markku; Kettunen, Johannes; Anttila, Verneri; Ripatti, Samuli; Perola, Markus; Seppälä, Ilkka; Juonala, Markus; Kähönen, Mika; Salomaa, Veikko; Viikari, Jorma; Raitakari, Olli T; Lehtimäki, Terho; Palotie, Aarno; Viikari-Juntura, Eira; Husgafvel-Pursiainen, Kirsti

    2016-01-01

    Sciatica or the sciatic syndrome is a common and often disabling low back disorder in the working-age population. It has a relatively high heritability but poorly understood molecular mechanisms. The Finnish population is a genetic isolate where small founder population and bottleneck events have led to enrichment of certain rare and low frequency variants. We performed here the first genome-wide association (GWAS) and meta-analysis of sciatica. The meta-analysis was conducted across two GWAS covering 291 Finnish sciatica cases and 3671 controls genotyped and imputed at 7.7 million autosomal variants. The most promising loci (p < 1 × 10^-6) were replicated in 776 Finnish sciatica patients and 18,489 controls. We identified five intragenic variants, with relatively low frequencies, at two novel loci associated with sciatica at genome-wide significance. These included chr9:14344410:I (rs71321981) at 9p22.3 (NFIB gene; p = 1.30 × 10^-8, MAF = 0.08) and four variants at 15q21.2: rs145901849, rs80035109, rs190200374 and rs117458827 (MYO5A; p = 1.34 × 10^-8, MAF = 0.06; p = 2.32 × 10^-8, MAF = 0.07; p = 3.85 × 10^-8, MAF = 0.06; p = 4.78 × 10^-8, MAF = 0.07, respectively). The most significant association in the meta-analysis, a single base insertion rs71321981 within the regulatory region of the transcription factor NFIB, replicated in an independent Finnish population sample (p = 0.04). Despite identifying 15q21.2 as a promising locus, we were not able to replicate it. It was differentiated; the lead variants within 15q21.2 were more frequent in Finland (6-7%) than in other European populations (1-2%). Imputation accuracies of the three significantly associated variants (chr9:14344410:I, rs190200374, and rs80035109) were validated by genotyping. In summary, our results suggest a novel locus, 9p22.3 (NFIB), which may be involved in susceptibility to sciatica. In addition, another locus, 15q21.2, emerged as a promising one, but failed to replicate.

  12. Genome-Wide Meta-Analysis of Sciatica in Finnish Population

    PubMed Central

    Lemmelä, Susanna; Solovieva, Svetlana; Shiri, Rahman; Benner, Christian; Heliövaara, Markku; Kettunen, Johannes; Anttila, Verneri; Ripatti, Samuli; Perola, Markus; Seppälä, Ilkka; Juonala, Markus; Kähönen, Mika; Salomaa, Veikko; Viikari, Jorma; Raitakari, Olli T.; Lehtimäki, Terho; Palotie, Aarno; Viikari-Juntura, Eira; Husgafvel-Pursiainen, Kirsti

    2016-01-01

    Sciatica or the sciatic syndrome is a common and often disabling low back disorder in the working-age population. It has a relatively high heritability but poorly understood molecular mechanisms. The Finnish population is a genetic isolate where small founder population and bottleneck events have led to enrichment of certain rare and low frequency variants. We performed here the first genome-wide association (GWAS) and meta-analysis of sciatica. The meta-analysis was conducted across two GWAS covering 291 Finnish sciatica cases and 3671 controls genotyped and imputed at 7.7 million autosomal variants. The most promising loci (p < 1 × 10^-6) were replicated in 776 Finnish sciatica patients and 18,489 controls. We identified five intragenic variants, with relatively low frequencies, at two novel loci associated with sciatica at genome-wide significance. These included chr9:14344410:I (rs71321981) at 9p22.3 (NFIB gene; p = 1.30 × 10^-8, MAF = 0.08) and four variants at 15q21.2: rs145901849, rs80035109, rs190200374 and rs117458827 (MYO5A; p = 1.34 × 10^-8, MAF = 0.06; p = 2.32 × 10^-8, MAF = 0.07; p = 3.85 × 10^-8, MAF = 0.06; p = 4.78 × 10^-8, MAF = 0.07, respectively). The most significant association in the meta-analysis, a single base insertion rs71321981 within the regulatory region of the transcription factor NFIB, replicated in an independent Finnish population sample (p = 0.04). Despite identifying 15q21.2 as a promising locus, we were not able to replicate it. It was differentiated; the lead variants within 15q21.2 were more frequent in Finland (6–7%) than in other European populations (1–2%). Imputation accuracies of the three significantly associated variants (chr9:14344410:I, rs190200374, and rs80035109) were validated by genotyping. In summary, our results suggest a novel locus, 9p22.3 (NFIB), which may be involved in susceptibility to sciatica. In addition, another locus, 15q21.2, emerged as a promising one, but failed to replicate. PMID:27764105

  13. Genome-wide investigation reveals high evolutionary rates in annual model plants.

    PubMed

    Yue, Jia-Xing; Li, Jinpeng; Wang, Dan; Araki, Hitoshi; Tian, Dacheng; Yang, Sihai

    2010-11-09

    Rates of molecular evolution vary widely among species. While significant deviations from the molecular clock have been found in many taxa, effects of life histories on molecular evolution are not fully understood. In plants, annual/perennial life history traits have long been suspected to influence the evolutionary rates at the molecular level. To date, however, the number of genes investigated on this subject is limited and the conclusions are mixed. To evaluate the possible heterogeneity in evolutionary rates between annual and perennial plants at the genomic level, we investigated 85 nuclear housekeeping genes, 10 non-housekeeping families, and 34 chloroplast genes using the genomic data from model plants including Arabidopsis thaliana and Medicago truncatula for annuals and grape (Vitis vinifera) and poplar (Populus trichocarpa) for perennials. According to the cross-comparisons among the four species, 74-82% of the nuclear genes and 71-97% of the chloroplast genes suggested higher rates of molecular evolution in the two annuals than those in the two perennials. The significant heterogeneity in evolutionary rate between annuals and perennials was consistently found both in nonsynonymous sites and synonymous sites. While a linear correlation of evolutionary rates in orthologous genes between species was observed in nonsynonymous sites, the correlation was weak or invisible in synonymous sites. This tendency was clearer in nuclear genes than in chloroplast genes, in which the overall evolutionary rate was small. The slope of the regression line was consistently lower than unity, further confirming the higher evolutionary rate in annuals at the genomic level. The higher evolutionary rate in annuals than in perennials appears to be a universal phenomenon both in nuclear and chloroplast genomes in the four dicot model plants we investigated. Therefore, such heterogeneity in evolutionary rate should result from factors that have genome-wide influence, most likely those

  14. Genome-wide association study of antisocial personality disorder

    PubMed Central

    Rautiainen, M-R; Paunio, T; Repo-Tiihonen, E; Virkkunen, M; Ollila, H M; Sulkava, S; Jolanki, O; Palotie, A; Tiihonen, J

    2016-01-01

    The pathophysiology of antisocial personality disorder (ASPD) remains unclear. Although the most consistent biological finding is reduced grey matter volume in the frontal cortex, about 50% of the total liability to developing ASPD has been attributed to genetic factors. The contributing genes remain largely unknown. Therefore, we sought to study the genetic background of ASPD. We conducted a genome-wide association study (GWAS) and a replication analysis of Finnish criminal offenders fulfilling DSM-IV criteria for ASPD (N=370, N=5850 for controls, GWAS; N=173, N=3766 for controls and replication sample). The GWAS resulted in suggestive associations of two clusters of single-nucleotide polymorphisms at 6p21.2 and at 6p21.32 at the human leukocyte antigen (HLA) region. Imputation of HLA alleles revealed an independent association with DRB1*01:01 (odds ratio (OR)=2.19 (1.53–3.14), P=1.9 × 10^-5). Two polymorphisms at 6p21.2 LINC00951–LRFN2 gene region were replicated in a separate data set, and rs4714329 reached genome-wide significance (OR=1.59 (1.37–1.85), P=1.6 × 10^-9) in the meta-analysis. The risk allele also associated with antisocial features in the general population conditioned for severe problems in childhood family (β=0.68, P=0.012). Functional analysis in brain tissue in open access GTEx and Braineac databases revealed eQTL associations of rs4714329 with LINC00951 and LRFN2 in cerebellum. In humans, LINC00951 and LRFN2 are both expressed in the brain, especially in the frontal cortex, which is intriguing considering the role of the frontal cortex in behavior and the neuroanatomical findings of reduced gray matter volume in ASPD. To our knowledge, this is the first study showing genome-wide significant and replicable findings on genetic variants associated with any personality disorder. PMID:27598967

  15. Genome-wide association study of antisocial personality disorder.

    PubMed

    Rautiainen, M-R; Paunio, T; Repo-Tiihonen, E; Virkkunen, M; Ollila, H M; Sulkava, S; Jolanki, O; Palotie, A; Tiihonen, J

    2016-09-06

    The pathophysiology of antisocial personality disorder (ASPD) remains unclear. Although the most consistent biological finding is reduced grey matter volume in the frontal cortex, about 50% of the total liability to developing ASPD has been attributed to genetic factors. The contributing genes remain largely unknown. Therefore, we sought to study the genetic background of ASPD. We conducted a genome-wide association study (GWAS) and a replication analysis of Finnish criminal offenders fulfilling DSM-IV criteria for ASPD (N=370, N=5850 for controls, GWAS; N=173, N=3766 for controls and replication sample). The GWAS resulted in suggestive associations of two clusters of single-nucleotide polymorphisms at 6p21.2 and at 6p21.32 at the human leukocyte antigen (HLA) region. Imputation of HLA alleles revealed an independent association with DRB1*01:01 (odds ratio (OR)=2.19 (1.53-3.14), P=1.9 × 10^-5). Two polymorphisms at 6p21.2 LINC00951-LRFN2 gene region were replicated in a separate data set, and rs4714329 reached genome-wide significance (OR=1.59 (1.37-1.85), P=1.6 × 10^-9) in the meta-analysis. The risk allele also associated with antisocial features in the general population conditioned for severe problems in childhood family (β=0.68, P=0.012). Functional analysis in brain tissue in open access GTEx and Braineac databases revealed eQTL associations of rs4714329 with LINC00951 and LRFN2 in cerebellum. In humans, LINC00951 and LRFN2 are both expressed in the brain, especially in the frontal cortex, which is intriguing considering the role of the frontal cortex in behavior and the neuroanatomical findings of reduced gray matter volume in ASPD. To our knowledge, this is the first study showing genome-wide significant and replicable findings on genetic variants associated with any personality disorder.

  16. Combining Multiple Hypothesis Testing with Machine Learning Increases the Statistical Power of Genome-wide Association Studies

    PubMed Central

    Mieth, Bettina; Kloft, Marius; Rodríguez, Juan Antonio; Sonnenburg, Sören; Vobruba, Robin; Morcillo-Suárez, Carlos; Farré, Xavier; Marigorta, Urko M.; Fehr, Ernst; Dickhaus, Thorsten; Blanchard, Gilles; Schunk, Daniel; Navarro, Arcadi; Müller, Klaus-Robert

    2016-01-01

    The standard approach to the analysis of genome-wide association studies (GWAS) is based on testing each position in the genome individually for statistical significance of its association with the phenotype under investigation. To improve the analysis of GWAS, we propose a combination of machine learning and statistical testing that takes correlation structures within the set of SNPs under investigation in a mathematically well-controlled manner into account. The novel two-step algorithm, COMBI, first trains a support vector machine to determine a subset of candidate SNPs and then performs hypothesis tests for these SNPs together with an adequate threshold correction. Applying COMBI to data from a WTCCC study (2007) and measuring performance as replication by independent GWAS published within the 2008–2015 period, we show that our method outperforms ordinary raw p-value thresholding as well as other state-of-the-art methods. COMBI presents higher power and precision than the examined alternatives while yielding fewer false (i.e. non-replicated) and more true (i.e. replicated) discoveries when its results are validated on later GWAS studies. More than 80% of the discoveries made by COMBI upon WTCCC data have been validated by independent studies. Implementations of the COMBI method are available as a part of the GWASpi toolbox 2.0. PMID:27892471
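
    Note: the two-step idea described above (screen SNPs with a support vector machine, then test only the screened SNPs) can be sketched roughly as follows. This is a simplified illustration, not the COMBI implementation from GWASpi. Several choices are assumptions of this sketch and are not taken from the abstract: the SVM is a linear one trained with a Pegasos-style subgradient method, candidates are ranked by absolute weight, the per-SNP test is a simple two-sided z-test on mean genotype, and the threshold alpha / k is only a placeholder for the paper's "adequate threshold correction", so it does not by itself guarantee error control after data-driven selection. All function names and the synthetic data are hypothetical.

```python
import math
import random

def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style stochastic subgradient descent for a linear SVM (no bias term)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    t = 0
    for _ in range(epochs):
        for i in rng.sample(range(n), n):  # one shuffled pass over the data
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (L2 penalty)
            if margin < 1.0:                           # hinge-loss subgradient
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def genotype_z_test_p(genotypes, y):
    """Two-sided z-test comparing mean genotype in cases (+1) and controls (-1)."""
    cases = [g for g, label in zip(genotypes, y) if label == 1]
    controls = [g for g, label in zip(genotypes, y) if label == -1]
    mean = sum(genotypes) / len(genotypes)
    var = sum((g - mean) ** 2 for g in genotypes) / len(genotypes)
    if var == 0.0:
        return 1.0  # monomorphic SNP carries no information
    diff = sum(cases) / len(cases) - sum(controls) / len(controls)
    z = diff / math.sqrt(var * (1.0 / len(cases) + 1.0 / len(controls)))
    return math.erfc(abs(z) / math.sqrt(2.0))

def combi_sketch(X, y, k=3, alpha=0.05):
    """Step 1: SVM screening to k candidates. Step 2: test candidates only."""
    d = len(X[0])
    means = [sum(row[j] for row in X) / len(X) for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in X]
    w = train_linear_svm(centered, y)
    candidates = sorted(range(d), key=lambda j: abs(w[j]), reverse=True)[:k]
    threshold = alpha / k  # placeholder correction (assumption, see note above)
    return sorted(j for j in candidates
                  if genotype_z_test_p([row[j] for row in X], y) < threshold)

# Synthetic data: 100 cases, 100 controls, 8 SNPs coded 0/1/2; SNP 3 carries signal.
rng = random.Random(1)
y = [1] * 100 + [-1] * 100
X = []
for label in y:
    row = [rng.choice((0, 1, 2)) for _ in range(8)]
    row[3] = rng.choice((1, 2, 2, 2)) if label == 1 else rng.choice((0, 0, 0, 1))
    X.append(row)

discoveries = combi_sketch(X, y)
print("SNP indices reported:", discoveries)
```

    The point of the design is that the second step pays a multiple-testing penalty only for the k screened SNPs rather than for every SNP on the array.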

  17. Combining Multiple Hypothesis Testing with Machine Learning Increases the Statistical Power of Genome-wide Association Studies.

    PubMed

    Mieth, Bettina; Kloft, Marius; Rodríguez, Juan Antonio; Sonnenburg, Sören; Vobruba, Robin; Morcillo-Suárez, Carlos; Farré, Xavier; Marigorta, Urko M; Fehr, Ernst; Dickhaus, Thorsten; Blanchard, Gilles; Schunk, Daniel; Navarro, Arcadi; Müller, Klaus-Robert

    2016-11-28

    The standard approach to the analysis of genome-wide association studies (GWAS) is based on testing each position in the genome individually for statistical significance of its association with the phenotype under investigation. To improve the analysis of GWAS, we propose a combination of machine learning and statistical testing that takes correlation structures within the set of SNPs under investigation in a mathematically well-controlled manner into account. The novel two-step algorithm, COMBI, first trains a support vector machine to determine a subset of candidate SNPs and then performs hypothesis tests for these SNPs together with an adequate threshold correction. Applying COMBI to data from a WTCCC study (2007) and measuring performance as replication by independent GWAS published within the 2008-2015 period, we show that our method outperforms ordinary raw p-value thresholding as well as other state-of-the-art methods. COMBI presents higher power and precision than the examined alternatives while yielding fewer false (i.e. non-replicated) and more true (i.e. replicated) discoveries when its results are validated on later GWAS studies. More than 80% of the discoveries made by COMBI upon WTCCC data have been validated by independent studies. Implementations of the COMBI method are available as a part of the GWASpi toolbox 2.0.

  18. Combining Multiple Hypothesis Testing with Machine Learning Increases the Statistical Power of Genome-wide Association Studies

    NASA Astrophysics Data System (ADS)

    Mieth, Bettina; Kloft, Marius; Rodríguez, Juan Antonio; Sonnenburg, Sören; Vobruba, Robin; Morcillo-Suárez, Carlos; Farré, Xavier; Marigorta, Urko M.; Fehr, Ernst; Dickhaus, Thorsten; Blanchard, Gilles; Schunk, Daniel; Navarro, Arcadi; Müller, Klaus-Robert

    2016-11-01

    The standard approach to the analysis of genome-wide association studies (GWAS) is based on testing each position in the genome individually for statistical significance of its association with the phenotype under investigation. To improve the analysis of GWAS, we propose a combination of machine learning and statistical testing that takes correlation structures within the set of SNPs under investigation in a mathematically well-controlled manner into account. The novel two-step algorithm, COMBI, first trains a support vector machine to determine a subset of candidate SNPs and then performs hypothesis tests for these SNPs together with an adequate threshold correction. Applying COMBI to data from a WTCCC study (2007) and measuring performance as replication by independent GWAS published within the 2008-2015 period, we show that our method outperforms ordinary raw p-value thresholding as well as other state-of-the-art methods. COMBI presents higher power and precision than the examined alternatives while yielding fewer false (i.e. non-replicated) and more true (i.e. replicated) discoveries when its results are validated on later GWAS studies. More than 80% of the discoveries made by COMBI upon WTCCC data have been validated by independent studies. Implementations of the COMBI method are available as a part of the GWASpi toolbox 2.0.

  19. Genome-wide characterization of Mediator recruitment, function, and regulation.

    PubMed

    Grünberg, Sebastian; Zentner, Gabriel E

    2017-05-27

    Mediator is a conserved and essential coactivator complex broadly required for RNA polymerase II (RNAPII) transcription. Recent genome-wide studies of Mediator binding in budding yeast have revealed new insights into the functions of this critical complex and raised new questions about its role in the regulation of gene expression.

  20. Multi-trait analysis of genome-wide association summary statistics using MTAG.

    PubMed

    Turley, Patrick; Walters, Raymond K; Maghzian, Omeed; Okbay, Aysu; Lee, James J; Fontana, Mark Alan; Nguyen-Viet, Tuan Anh; Wedow, Robbee; Zacher, Meghan; Furlotte, Nicholas A; Magnusson, Patrik; Oskarsson, Sven; Johannesson, Magnus; Visscher, Peter M; Laibson, David; Cesarini, David; Neale, Benjamin M; Benjamin, Daniel J

    2018-02-01

    We introduce multi-trait analysis of GWAS (MTAG), a method for joint analysis of summary statistics from genome-wide association studies (GWAS) of different traits, possibly from overlapping samples. We apply MTAG to summary statistics for depressive symptoms (N_eff = 354,862), neuroticism (N = 168,105), and subjective well-being (N = 388,538). As compared to the 32, 9, and 13 genome-wide significant loci identified in the single-trait GWAS (most of which are themselves novel), MTAG increases the number of associated loci to 64, 37, and 49, respectively. Moreover, association statistics from MTAG yield more informative bioinformatics analyses and increase the variance explained by polygenic scores by approximately 25%, matching theoretical expectations.

  1. The druggable genome and support for target identification and validation in drug development.

    PubMed

    Finan, Chris; Gaulton, Anna; Kruger, Felix A; Lumbers, R Thomas; Shah, Tina; Engmann, Jorgen; Galver, Luana; Kelley, Ryan; Karlsson, Anneli; Santos, Rita; Overington, John P; Hingorani, Aroon D; Casas, Juan P

    2017-03-29

    Target identification (determining the correct drug targets for a disease) and target validation (demonstrating an effect of target perturbation on disease biomarkers and disease end points) are important steps in drug development. Clinically relevant associations of variants in genes encoding drug targets model the effect of modifying the same targets pharmacologically. To delineate drug development (including repurposing) opportunities arising from this paradigm, we connected complex disease- and biomarker-associated loci from genome-wide association studies to an updated set of genes encoding druggable human proteins, to agents with bioactivity against these targets, and, where there were licensed drugs, to clinical indications. We used this set of genes to inform the design of a new genotyping array, which will enable association studies of druggable genes for drug target selection and validation in human disease. Copyright © 2017, American Association for the Advancement of Science.

  2. Bioinformatics challenges for genome-wide association studies.

    PubMed

    Moore, Jason H; Asselbergs, Folkert W; Williams, Scott M

    2010-02-15

    The sequencing of the human genome has made it possible to identify an informative set of >1 million single nucleotide polymorphisms (SNPs) across the genome that can be used to carry out genome-wide association studies (GWASs). The availability of massive amounts of GWAS data has necessitated the development of new biostatistical methods for quality control, imputation and analysis issues including multiple testing. This work has been successful and has enabled the discovery of new associations that have been replicated in multiple studies. However, it is now recognized that most SNPs discovered via GWAS have small effects on disease susceptibility and thus may not be suitable for improving health care through genetic testing. One likely explanation for the mixed results of GWAS is that the current biostatistical analysis paradigm is by design agnostic or unbiased in that it ignores all prior knowledge about disease pathobiology. Further, the linear modeling framework that is employed in GWAS often considers only one SNP at a time thus ignoring their genomic and environmental context. There is now a shift away from the biostatistical approach toward a more holistic approach that recognizes the complexity of the genotype-phenotype relationship that is characterized by significant heterogeneity and gene-gene and gene-environment interaction. We argue here that bioinformatics has an important role to play in addressing the complexity of the underlying genetic basis of common human diseases. The goal of this review is to identify and discuss those GWAS challenges that will require computational methods.

  3. Genome-wide Analysis Reveals Extensive Functional Interaction between DNA Replication Initiation and Transcription in the Genome of Trypanosoma brucei

    PubMed Central

    Tiengwe, Calvin; Marcello, Lucio; Farr, Helen; Dickens, Nicholas; Kelly, Steven; Swiderski, Michal; Vaughan, Diane; Gull, Keith; Barry, J. David; Bell, Stephen D.; McCulloch, Richard

    2012-01-01

    Identification of replication initiation sites, termed origins, is a crucial step in understanding genome transmission in any organism. Transcription of the Trypanosoma brucei genome is highly unusual, with each chromosome comprising a few discrete transcription units. To understand how DNA replication occurs in the context of such organization, we have performed genome-wide mapping of the binding sites of the replication initiator ORC1/CDC6 and have identified replication origins, revealing that both localize to the boundaries of the transcription units. A remarkably small number of active origins is seen, whose spacing is greater than in any other eukaryote. We show that replication and transcription in T. brucei have a profound functional overlap, as reducing ORC1/CDC6 levels leads to genome-wide increases in mRNA levels arising from the boundaries of the transcription units. In addition, ORC1/CDC6 loss causes derepression of silent Variant Surface Glycoprotein genes, which are critical for host immune evasion. PMID:22840408

  4. Five endometrial cancer risk loci identified through genome-wide association analysis.

    PubMed

    Cheng, Timothy Ht; Thompson, Deborah J; O'Mara, Tracy A; Painter, Jodie N; Glubb, Dylan M; Flach, Susanne; Lewis, Annabelle; French, Juliet D; Freeman-Mills, Luke; Church, David; Gorman, Maggie; Martin, Lynn; Hodgson, Shirley; Webb, Penelope M; Attia, John; Holliday, Elizabeth G; McEvoy, Mark; Scott, Rodney J; Henders, Anjali K; Martin, Nicholas G; Montgomery, Grant W; Nyholt, Dale R; Ahmed, Shahana; Healey, Catherine S; Shah, Mitul; Dennis, Joe; Fasching, Peter A; Beckmann, Matthias W; Hein, Alexander; Ekici, Arif B; Hall, Per; Czene, Kamila; Darabi, Hatef; Li, Jingmei; Dörk, Thilo; Dürst, Matthias; Hillemanns, Peter; Runnebaum, Ingo; Amant, Frederic; Schrauwen, Stefanie; Zhao, Hui; Lambrechts, Diether; Depreeuw, Jeroen; Dowdy, Sean C; Goode, Ellen L; Fridley, Brooke L; Winham, Stacey J; Njølstad, Tormund S; Salvesen, Helga B; Trovik, Jone; Werner, Henrica Mj; Ashton, Katie; Otton, Geoffrey; Proietto, Tony; Liu, Tao; Mints, Miriam; Tham, Emma; Consortium, Chibcha; Jun Li, Mulin; Yip, Shun H; Wang, Junwen; Bolla, Manjeet K; Michailidou, Kyriaki; Wang, Qin; Tyrer, Jonathan P; Dunlop, Malcolm; Houlston, Richard; Palles, Claire; Hopper, John L; Peto, Julian; Swerdlow, Anthony J; Burwinkel, Barbara; Brenner, Hermann; Meindl, Alfons; Brauch, Hiltrud; Lindblom, Annika; Chang-Claude, Jenny; Couch, Fergus J; Giles, Graham G; Kristensen, Vessela N; Cox, Angela; Cunningham, Julie M; Pharoah, Paul D P; Dunning, Alison M; Edwards, Stacey L; Easton, Douglas F; Tomlinson, Ian; Spurdle, Amanda B

    2016-06-01

    We conducted a meta-analysis of three endometrial cancer genome-wide association studies (GWAS) and two follow-up phases totaling 7,737 endometrial cancer cases and 37,144 controls of European ancestry. Genome-wide imputation and meta-analysis identified five new risk loci of genome-wide significance at likely regulatory regions on chromosomes 13q22.1 (rs11841589, near KLF5), 6q22.31 (rs13328298, in LOC643623 and near HEY2 and NCOA7), 8q24.21 (rs4733613, telomeric to MYC), 15q15.1 (rs937213, in EIF2AK4, near BMF) and 14q32.33 (rs2498796, in AKT1, near SIVA1). We also found a second independent 8q24.21 signal (rs17232730). Functional studies of the 13q22.1 locus showed that rs9600103 (pairwise r^2 = 0.98 with rs11841589) is located in a region of active chromatin that interacts with the KLF5 promoter region. The rs9600103[T] allele that is protective in endometrial cancer suppressed gene expression in vitro, suggesting that regulation of the expression of KLF5, a gene linked to uterine development, is implicated in tumorigenesis. These findings provide enhanced insight into the genetic and biological basis of endometrial cancer.

  5. Sniffing out significant "Pee values": genome wide association study of asparagus anosmia.

    PubMed

    Markt, Sarah C; Nuttall, Elizabeth; Turman, Constance; Sinnott, Jennifer; Rimm, Eric B; Ecsedy, Ethan; Unger, Robert H; Fall, Katja; Finn, Stephen; Jensen, Majken K; Rider, Jennifer R; Kraft, Peter; Mucci, Lorelei A

    2016-12-13

    Objective: To determine the inherited factors associated with the ability to smell asparagus metabolites in urine. Design: Genome wide association study. Setting: Nurses' Health Study and Health Professionals Follow-up Study cohorts. Participants: 6909 men and women of European-American descent with available genetic data from genome wide association studies. Main outcome measures: Participants were characterized as asparagus smellers if they strongly agreed with the prompt "after eating asparagus, you notice a strong characteristic odor in your urine," and anosmic if otherwise. We calculated per-allele estimates of asparagus anosmia for about nine million single nucleotide polymorphisms using logistic regression. P values <5×10^-8 were considered as genome wide significant. Results: 58.0% of men (n=1449/2500) and 61.5% of women (n=2712/4409) had anosmia. 871 single nucleotide polymorphisms reached genome wide significance for asparagus anosmia, all in a region on chromosome 1 (1q44: 248139851-248595299) containing multiple genes in the olfactory receptor 2 (OR2) family. Conditional analyses revealed three independent markers associated with asparagus anosmia: rs13373863, rs71538191, and rs6689553. Conclusions: A large proportion of people have asparagus anosmia. Genetic variation near multiple olfactory receptor genes is associated with the ability of an individual to smell the metabolites of asparagus in urine. Future replication studies are necessary before considering targeted therapies to help anosmic people discover what they are missing. Published by the BMJ Publishing Group Limited.
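
    Note: the prevalence figures and the significance threshold in this abstract are easy to verify by hand. The short check below recomputes the reported percentages from the quoted counts and shows the conventional reading of the genome-wide threshold as a Bonferroni correction of 0.05. The figure of one million independent tests is the usual rule-of-thumb assumption behind that threshold, not a number stated in the abstract, and the helper name is illustrative.

```python
import math

# Counts quoted in the abstract.
men_anosmic, men_total = 1449, 2500
women_anosmic, women_total = 2712, 4409

men_pct = 100.0 * men_anosmic / men_total
women_pct = 100.0 * women_anosmic / women_total
overall_pct = 100.0 * (men_anosmic + women_anosmic) / (men_total + women_total)

print(f"men: {men_pct:.1f}%  women: {women_pct:.1f}%  overall: {overall_pct:.1f}%")
print("participants:", men_total + women_total)

# Conventional rationale for the genome-wide threshold:
# Bonferroni correction of alpha = 0.05 over ~1 million independent tests (assumption).
alpha = 0.05
assumed_independent_tests = 1_000_000
threshold = alpha / assumed_independent_tests

def is_genome_wide_significant(p_value, cutoff=threshold):
    """True when a P value falls below the genome-wide cutoff."""
    return p_value < cutoff

print("threshold:", threshold)
```

    The two sex-specific denominators add up to the 6909 participants reported, so the counts in the abstract are internally consistent.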

  6. Genome-wide analysis identifies 12 loci influencing human reproductive behavior.

    PubMed

    Barban, Nicola; Jansen, Rick; de Vlaming, Ronald; Vaez, Ahmad; Mandemakers, Jornt J; Tropf, Felix C; Shen, Xia; Wilson, James F; Chasman, Daniel I; Nolte, Ilja M; Tragante, Vinicius; van der Laan, Sander W; Perry, John R B; Kong, Augustine; Ahluwalia, Tarunveer S; Albrecht, Eva; Yerges-Armstrong, Laura; Atzmon, Gil; Auro, Kirsi; Ayers, Kristin; Bakshi, Andrew; Ben-Avraham, Danny; Berger, Klaus; Bergman, Aviv; Bertram, Lars; Bielak, Lawrence F; Bjornsdottir, Gyda; Bonder, Marc Jan; Broer, Linda; Bui, Minh; Barbieri, Caterina; Cavadino, Alana; Chavarro, Jorge E; Turman, Constance; Concas, Maria Pina; Cordell, Heather J; Davies, Gail; Eibich, Peter; Eriksson, Nicholas; Esko, Tõnu; Eriksson, Joel; Falahi, Fahimeh; Felix, Janine F; Fontana, Mark Alan; Franke, Lude; Gandin, Ilaria; Gaskins, Audrey J; Gieger, Christian; Gunderson, Erica P; Guo, Xiuqing; Hayward, Caroline; He, Chunyan; Hofer, Edith; Huang, Hongyan; Joshi, Peter K; Kanoni, Stavroula; Karlsson, Robert; Kiechl, Stefan; Kifley, Annette; Kluttig, Alexander; Kraft, Peter; Lagou, Vasiliki; Lecoeur, Cecile; Lahti, Jari; Li-Gao, Ruifang; Lind, Penelope A; Liu, Tian; Makalic, Enes; Mamasoula, Crysovalanto; Matteson, Lindsay; Mbarek, Hamdi; McArdle, Patrick F; McMahon, George; Meddens, S Fleur W; Mihailov, Evelin; Miller, Mike; Missmer, Stacey A; Monnereau, Claire; van der Most, Peter J; Myhre, Ronny; Nalls, Mike A; Nutile, Teresa; Kalafati, Ioanna Panagiota; Porcu, Eleonora; Prokopenko, Inga; Rajan, Kumar B; Rich-Edwards, Janet; Rietveld, Cornelius A; Robino, Antonietta; Rose, Lynda M; Rueedi, Rico; Ryan, Kathleen A; Saba, Yasaman; Schmidt, Daniel; Smith, Jennifer A; Stolk, Lisette; Streeten, Elizabeth; Tönjes, Anke; Thorleifsson, Gudmar; Ulivi, Sheila; Wedenoja, Juho; Wellmann, Juergen; Willeit, Peter; Yao, Jie; Yengo, Loic; Zhao, Jing Hua; Zhao, Wei; Zhernakova, Daria V; Amin, Najaf; Andrews, Howard; Balkau, Beverley; Barzilai, Nir; Bergmann, Sven; Biino, Ginevra; Bisgaard, Hans; Bønnelykke, Klaus; Boomsma, Dorret I; Buring, Julie E; Campbell, Harry; Cappellani, Stefania; Ciullo, Marina; Cox, Simon R; Cucca, Francesco; Toniolo, Daniela; Davey-Smith, George; Deary, Ian J; Dedoussis, George; Deloukas, Panos; van Duijn, Cornelia M; de Geus, Eco J C; Eriksson, Johan G; Evans, Denis A; Faul, Jessica D; Sala, Cinzia Felicita; Froguel, Philippe; Gasparini, Paolo; Girotto, Giorgia; Grabe, Hans-Jörgen; Greiser, Karin Halina; Groenen, Patrick J F; de Haan, Hugoline G; Haerting, Johannes; Harris, Tamara B; Heath, Andrew C; Heikkilä, Kauko; Hofman, Albert; Homuth, Georg; Holliday, Elizabeth G; Hopper, John; Hyppönen, Elina; Jacobsson, Bo; Jaddoe, Vincent W V; Johannesson, Magnus; Jugessur, Astanand; Kähönen, Mika; Kajantie, Eero; Kardia, Sharon L R; Keavney, Bernard; Kolcic, Ivana; Koponen, Päivikki; Kovacs, Peter; Kronenberg, Florian; Kutalik, Zoltan; La Bianca, Martina; Lachance, Genevieve; Iacono, William G; Lai, Sandra; Lehtimäki, Terho; Liewald, David C; Lindgren, Cecilia M; Liu, Yongmei; Luben, Robert; Lucht, Michael; Luoto, Riitta; Magnus, Per; Magnusson, Patrik K E; Martin, Nicholas G; McGue, Matt; McQuillan, Ruth; Medland, Sarah E; Meisinger, Christa; Mellström, Dan; Metspalu, Andres; Traglia, Michela; Milani, Lili; Mitchell, Paul; Montgomery, Grant W; Mook-Kanamori, Dennis; de Mutsert, Renée; Nohr, Ellen A; Ohlsson, Claes; Olsen, Jørn; Ong, Ken K; Paternoster, Lavinia; Pattie, Alison; Penninx, Brenda W J H; Perola, Markus; Peyser, Patricia A; Pirastu, Mario; Polasek, Ozren; Power, Chris; Kaprio, Jaakko; Raffel, Leslie J; Räikkönen, Katri; Raitakari, Olli; Ridker, Paul M; Ring, Susan M; Roll, Kathryn; Rudan, Igor; Ruggiero, Daniela; Rujescu, Dan; Salomaa, Veikko; Schlessinger, David; Schmidt, Helena; Schmidt, Reinhold; Schupf, Nicole; Smit, Johannes; Sorice, Rossella; Spector, Tim D; Starr, John M; Stöckl, Doris; Strauch, Konstantin; Stumvoll, Michael; Swertz, Morris A; Thorsteinsdottir, Unnur; Thurik, A Roy; Timpson, Nicholas J; Tung, Joyce Y; Uitterlinden, André G; Vaccargiu, Simona; Viikari, Jorma; Vitart, Veronique; Völzke, Henry; Vollenweider, Peter; Vuckovic, Dragana; Waage, Johannes; Wagner, Gert G; Wang, Jie Jin; Wareham, Nicholas J; Weir, David R; Willemsen, Gonneke; Willeit, Johann; Wright, Alan F; Zondervan, Krina T; Stefansson, Kari; Krueger, Robert F; Lee, James J; Benjamin, Daniel J; Cesarini, David; Koellinger, Philipp D; den Hoed, Marcel; Snieder, Harold; Mills, Melinda C

    2016-12-01

    The genetic architecture of human reproductive behavior-age at first birth (AFB) and number of children ever born (NEB)-has a strong relationship with fitness, human development, infertility and risk of neuropsychiatric disorders. However, very few genetic loci have been identified, and the underlying mechanisms of AFB and NEB are poorly understood. We report a large genome-wide association study of both sexes including 251,151 individuals for AFB and 343,072 individuals for NEB. We identified 12 independent loci that are significantly associated with AFB and/or NEB in a SNP-based genome-wide association study and 4 additional loci associated in a gene-based effort. These loci harbor genes that are likely to have a role, either directly or by affecting non-local gene expression, in human reproduction and infertility, thereby increasing understanding of these complex traits.

  7. A genome-wide BAC-end sequence survey provides first insights into sweetpotato (Ipomoea batatas (L.) Lam.) genome composition.

    PubMed

    Si, Zengzhi; Du, Bing; Huo, Jinxi; He, Shaozhen; Liu, Qingchang; Zhai, Hong

    2016-11-21

    Sweetpotato, Ipomoea batatas (L.) Lam., is an important food crop widely grown in the world. However, little is known about the genome of this species because it is a highly heterozygous hexaploid. Gaining a more in-depth knowledge of sweetpotato genome is therefore necessary and imperative. In this study, the first bacterial artificial chromosome (BAC) library of sweetpotato was constructed. Clones from the BAC library were end-sequenced and analyzed to provide genome-wide information about this species. The BAC library contained 240,384 clones with an average insert size of 101 kb and had a 7.93-10.82 × coverage of the genome, and the probability of isolating any single-copy DNA sequence from the library was more than 99%. Both ends of 8310 BAC clones randomly selected from the library were sequenced to generate 11,542 high-quality BAC-end sequences (BESs), with an accumulative length of 7,595,261 bp and an average length of 658 bp. Analysis of the BESs revealed that 12.17% of the sweetpotato genome were known repetitive DNA, including 7.37% long terminal repeat (LTR) retrotransposons, 1.15% Non-LTR retrotransposons and 1.42% Class II DNA transposons etc., 18.31% of the genome were identified as sweetpotato-unique repetitive DNA and 10.00% of the genome were predicted to be coding regions. In total, 3,846 simple sequences repeats (SSRs) were identified, with a density of one SSR per 1.93 kb, from which 288 SSRs primers were designed and tested for length polymorphism using 20 sweetpotato accessions, 173 (60.07%) of them produced polymorphic bands. Sweetpotato BESs had significant hits to the genome sequences of I. trifida and more matches to the whole-genome sequences of Solanum lycopersicum than those of Vitis vinifera, Theobroma cacao and Arabidopsis thaliana. The first BAC library for sweetpotato has been successfully constructed. The high quality BESs provide first insights into sweetpotato genome composition, and have significant hits to the genome
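
    Note: several library statistics quoted in this abstract follow from simple arithmetic on the other figures. The sketch below recomputes them. The abstract does not say how the probability of recovering a single-copy sequence was calculated; the Clarke-Carbon relation P = 1 - (1 - I/G)^N (insert size I, genome size G, N clones) is assumed here as the usual choice, and the genome sizes are only those implied by the quoted coverage range, not independently reported values.

```python
# Figures quoted in the abstract.
CLONES = 240_384
MEAN_INSERT_KB = 101
COVERAGE_RANGE = (7.93, 10.82)

total_insert_kb = CLONES * MEAN_INSERT_KB

def implied_genome_size_kb(coverage):
    """Genome size implied by total cloned DNA divided by fold coverage."""
    return total_insert_kb / coverage

def prob_sequence_in_library(coverage):
    """Clarke-Carbon probability that a given single-copy sequence is present.

    Assumption: P = 1 - (1 - I/G)^N, which is close to 1 - exp(-coverage).
    """
    genome_kb = implied_genome_size_kb(coverage)
    return 1.0 - (1.0 - MEAN_INSERT_KB / genome_kb) ** CLONES

for cov in COVERAGE_RANGE:
    size_gb = implied_genome_size_kb(cov) / 1e6
    print(f"{cov}x coverage -> genome ~{size_gb:.2f} Gb, "
          f"P(present) = {prob_sequence_in_library(cov):.5f}")

# Mean BAC-end sequence length and SSR primer polymorphism rate.
mean_bes_length_bp = 7_595_261 / 11_542
polymorphic_pct = 100.0 * 173 / 288
print(f"mean BES length: {mean_bes_length_bp:.0f} bp")
print(f"polymorphic primers: {polymorphic_pct:.2f}%")
```

    Even at the lower coverage bound the recovered probability exceeds the "more than 99%" stated in the abstract.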

  8. Genome-Wide Analysis in Brazilians Reveals Highly Differentiated Native American Genome Regions

    PubMed Central

    Havt, Alexandre; Nayak, Uma; Pinkerton, Relana; Farber, Emily; Concannon, Patrick; Lima, Aldo A.; Guerrant, Richard L.

    2017-01-01

    Despite its population, geographic size, and emerging economic importance, disproportionately little genome-scale research exists into genetic factors that predispose Brazilians to disease, or the population genetics of risk. After identification of suitable proxy populations and careful analysis of tri-continental admixture in 1,538 North-Eastern Brazilians to estimate individual ancestry and ancestral allele frequencies, we computed 400,000 genome-wide locus-specific branch length (LSBL) Fst statistics of Brazilian Amerindian ancestry compared to European and African; and a similar set of differentiation statistics for their Amerindian component compared with the closest Asian 1000 Genomes population (surprisingly, Bengalis in Bangladesh). After ranking SNPs by these statistics, we identified the top 10 highly differentiated SNPs in five genome regions in the LSBL tests of Brazilian Amerindian ancestry compared to European and African; and the top 10 SNPs in eight regions comparing their Amerindian component to the closest Asian 1000 Genomes population. We found SNPs within or proximal to the genes CIITA (rs6498115), SMC6 (rs1834619), and KLHL29 (rs2288697) were most differentiated in the Amerindian-specific branch, while SNPs in the genes ADAMTS9 (rs7631391), DOCK2 (rs77594147), SLC28A1 (rs28649017), ARHGAP5 (rs7151991), and CIITA (rs45601437) were most highly differentiated in the Asian comparison. These genes are known to influence immune function, metabolic and anthropometry traits, and embryonic development. These analyses have identified candidate genes for selection within Amerindian ancestry, and by comparison of the two analyses, those for which the differentiation may have arisen during the migration from Asia to the Americas. PMID:28100790
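
    Note: the locus-specific branch length (LSBL) statistic mentioned above is conventionally computed from three pairwise Fst values at a single SNP. The abstract does not spell out the formula; the sketch below assumes the standard three-population definition, LSBL_A = (Fst_AB + Fst_AC - Fst_BC) / 2, and ranks SNPs by it. The SNP labels and Fst values are invented purely for illustration and are not data from the study.

```python
def lsbl(fst_ab, fst_ac, fst_bc):
    """Branch length leading to population A on a three-population tree.

    Assumed standard definition: (Fst_AB + Fst_AC - Fst_BC) / 2.
    """
    return (fst_ab + fst_ac - fst_bc) / 2.0

def rank_by_lsbl(pairwise_fst, top=10):
    """Return SNP ids sorted by decreasing LSBL for population A.

    pairwise_fst maps a SNP id to a tuple (Fst_AB, Fst_AC, Fst_BC).
    """
    scores = {snp: lsbl(*values) for snp, values in pairwise_fst.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top]

# Hypothetical per-SNP Fst values with A = Amerindian, B = European, C = African.
example = {
    "snp_1": (0.30, 0.25, 0.05),  # differentiated mainly on the A branch
    "snp_2": (0.10, 0.40, 0.35),  # differentiated mainly on the C branch
    "snp_3": (0.02, 0.03, 0.02),  # little differentiation anywhere
}

for snp in rank_by_lsbl(example):
    print(snp, round(lsbl(*example[snp]), 3))
```

    Because the statistic isolates the branch leading to one population, a SNP with high Fst overall but driven by another population (snp_2 here) ranks below one whose differentiation is specific to population A.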

  9. Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach.

    PubMed

    Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M

    2017-03-27

    Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide

  10. Single-trait and multi-trait genome-wide association analyses identify novel loci for blood pressure in African-ancestry populations

    PubMed Central

    Liang, Jingjing; Le, Thu H.; Edwards, Digna R. Velez; Tayo, Bamidele O.; Gaulton, Kyle J.; Lu, Yingchang; Jensen, Richard A.; Chen, Guanjie; Schwander, Karen; McKenzie, Colin A.; Fox, Ervin; Nalls, Michael A.; Young, J. Hunter; Lane, Jacqueline M.; Zhou, Jie; Tang, Hua; Fornage, Myriam; Musani, Solomon K.; Wang, Heming; Forrester, Terrence; Chu, Pei-Lun; Evans, Michele K.; Morrison, Alanna C.; Martin, Lisa W.; Wiggins, Kerri L.; Hui, Qin; Zhao, Wei; Jackson, Rebecca D.; Faul, Jessica D.; Reiner, Alex P.; Bray, Michael; Denny, Joshua C.; Mosley, Thomas H.; Palmas, Walter; Guo, Xiuqing; Polak, Joseph F.; Taylor, Ken D.; Boerwinkle, Eric; Bottinger, Erwin P.; Liu, Kiang; Risch, Neil; Hunt, Steven C.; Kooperberg, Charles; Zonderman, Alan B.; Becker, Diane M.; Cai, Jianwen; Loos, Ruth J. F.; Psaty, Bruce M.; Weir, David R.; Kardia, Sharon L. R.; Arnett, Donna K.; Won, Sungho; Edwards, Todd L.; Redline, Susan; Cooper, Richard S.; Rao, D. C.; Rotimi, Charles; Levy, Daniel; Chakravarti, Aravinda

    2017-01-01

    Hypertension is a leading cause of global disease, mortality, and disability. While individuals of African descent suffer a disproportionate burden of hypertension and its complications, they have been underrepresented in genetic studies. To identify novel susceptibility loci for blood pressure and hypertension in people of African ancestry, we performed both single and multiple-trait genome-wide association analyses. We analyzed 21 genome-wide association studies comprised of 31,968 individuals of African ancestry, and validated our results with additional 54,395 individuals from multi-ethnic studies. These analyses identified nine loci with eleven independent variants which reached genome-wide significance (P < 1.25×10^-8) for either systolic and diastolic blood pressure, hypertension, or for combined traits. Single-trait analyses identified two loci (TARID/TCF21 and LLPH/TMBIM4) and multiple-trait analyses identified one novel locus (FRMD3) for blood pressure. At these three loci, as well as at GRP20/CDH17, associated variants had alleles common only in African-ancestry populations. Functional annotation showed enrichment for genes expressed in immune and kidney cells, as well as in heart and vascular cells/tissues. Experiments driven by these findings and using angiotensin-II induced hypertension in mice showed altered kidney mRNA expression of six genes, suggesting their potential role in hypertension. Our study provides new evidence for genes related to hypertension susceptibility, and the need to study African-ancestry populations in order to identify biologic factors contributing to hypertension. PMID:28498854

  11. Genome-Wide Profiling of DNA Double-Strand Breaks by the BLESS and BLISS Methods.

    PubMed

    Mirzazadeh, Reza; Kallas, Tomasz; Bienko, Magda; Crosetto, Nicola

    2018-01-01

    DNA double-strand breaks (DSBs) are major DNA lesions that are constantly formed during physiological processes such as DNA replication, transcription, and recombination, or as a result of exogenous agents such as ionizing radiation, radiomimetic drugs, and genome editing nucleases. Unrepaired DSBs threaten genomic stability by leading to the formation of potentially oncogenic rearrangements such as translocations. In the past few years, several methods based on next-generation sequencing (NGS) have been developed to study the genome-wide distribution of DSBs or their conversion to translocation events. We developed Breaks Labeling, Enrichment on Streptavidin, and Sequencing (BLESS), which was the first method for direct labeling of DSBs in situ followed by their genome-wide mapping at nucleotide resolution (Crosetto et al., Nat Methods 10:361-365, 2013). Recently, we have further expanded the quantitative nature, applicability, and scalability of BLESS by developing Breaks Labeling In Situ and Sequencing (BLISS) (Yan et al., Nat Commun 8:15058, 2017). Here, we first present an overview of existing methods for genome-wide localization of DSBs, and then focus on the BLESS and BLISS methods, discussing different assay design options depending on the sample type and application.

  12. Developmental Stability Covaries with Genome-Wide and Single-Locus Heterozygosity in House Sparrows

    PubMed Central

    Vangestel, Carl; Mergeay, Joachim; Dawson, Deborah A.; Vandomme, Viki; Lens, Luc

    2011-01-01

    Fluctuating asymmetry (FA), a measure of developmental instability, has been hypothesized to increase with genetic stress. Despite numerous studies providing empirical evidence for associations between FA and genome-wide properties such as multi-locus heterozygosity, support for single-locus effects remains scant. Here we test if, and to what extent, FA co-varies with single- and multilocus markers of genetic diversity in house sparrow (Passer domesticus) populations along an urban gradient. In line with theoretical expectations, FA was inversely correlated with genetic diversity estimated at genome level. However, this relationship was largely driven by variation at a single key locus. Contrary to our expectations, relationships between FA and genetic diversity were not stronger in individuals from urban populations that experience higher nutritional stress. We conclude that loss of genetic diversity adversely affects developmental stability in P. domesticus, and more generally, that the molecular basis of developmental stability may involve complex interactions between local and genome-wide effects. Further study on the relative effects of single-locus and genome-wide effects on the developmental stability of populations with different genetic properties is therefore needed. PMID:21747940

  13. Genome-wide characterization of Mediator recruitment, function, and regulation

    PubMed Central

    2017-01-01

    Mediator is a conserved and essential coactivator complex broadly required for RNA polymerase II (RNAPII) transcription. Recent genome-wide studies of Mediator binding in budding yeast have revealed new insights into the functions of this critical complex and raised new questions about its role in the regulation of gene expression. PMID:28301289

  14. Validation and Implementation of Clinical Laboratory Improvements Act-Compliant Whole-Genome Sequencing in the Public Health Microbiology Laboratory

    PubMed Central

    Kozyreva, Varvara K.; Truong, Chau-Linda; Greninger, Alexander L.; Crandall, John; Mukhopadhyay, Rituparna

    2017-01-01

    Public health microbiology laboratories (PHLs) are on the cusp of unprecedented improvements in pathogen identification, antibiotic resistance detection, and outbreak investigation by using whole-genome sequencing (WGS). However, considerable challenges remain due to the lack of common standards. Here, we describe the validation of WGS on the Illumina platform for routine use in PHLs according to Clinical Laboratory Improvements Act (CLIA) guidelines for laboratory-developed tests (LDTs). We developed a validation panel comprising 10 Enterobacteriaceae isolates, 5 Gram-positive cocci, 5 Gram-negative nonfermenting species, 9 Mycobacterium tuberculosis isolates, and 5 miscellaneous bacteria. The genome coverage range was 15.71× to 216.4× (average, 79.72×; median, 71.55×); the limit of detection (LOD) for single nucleotide polymorphisms (SNPs) was 60×. The accuracy, reproducibility, and repeatability of base calling were >99.9%. The accuracy of phylogenetic analysis was 100%. The specificity and sensitivity inferred from multilocus sequence typing (MLST) and genome-wide SNP-based phylogenetic assays were 100%. The following objectives were accomplished: (i) the establishment of the performance specifications for WGS applications in PHLs according to CLIA guidelines, (ii) the development of quality assurance and quality control measures, (iii) the development of a reporting format for end users with or without WGS expertise, (iv) the availability of a validation set of microorganisms, and (v) the creation of a modular template for the validation of WGS processes in PHLs. The validation panel, sequencing analytics, and raw sequences could facilitate multilaboratory comparisons of WGS data. Additionally, the WGS performance specifications and modular template are adaptable for the validation of other platforms and reagent kits. PMID:28592550

  15. Validation and Implementation of Clinical Laboratory Improvements Act-Compliant Whole-Genome Sequencing in the Public Health Microbiology Laboratory.

    PubMed

    Kozyreva, Varvara K; Truong, Chau-Linda; Greninger, Alexander L; Crandall, John; Mukhopadhyay, Rituparna; Chaturvedi, Vishnu

    2017-08-01

    Public health microbiology laboratories (PHLs) are on the cusp of unprecedented improvements in pathogen identification, antibiotic resistance detection, and outbreak investigation by using whole-genome sequencing (WGS). However, considerable challenges remain due to the lack of common standards. Here, we describe the validation of WGS on the Illumina platform for routine use in PHLs according to Clinical Laboratory Improvements Act (CLIA) guidelines for laboratory-developed tests (LDTs). We developed a validation panel comprising 10 Enterobacteriaceae isolates, 5 Gram-positive cocci, 5 Gram-negative nonfermenting species, 9 Mycobacterium tuberculosis isolates, and 5 miscellaneous bacteria. The genome coverage range was 15.71× to 216.4× (average, 79.72×; median, 71.55×); the limit of detection (LOD) for single nucleotide polymorphisms (SNPs) was 60×. The accuracy, reproducibility, and repeatability of base calling were >99.9%. The accuracy of phylogenetic analysis was 100%. The specificity and sensitivity inferred from multilocus sequence typing (MLST) and genome-wide SNP-based phylogenetic assays were 100%. The following objectives were accomplished: (i) the establishment of the performance specifications for WGS applications in PHLs according to CLIA guidelines, (ii) the development of quality assurance and quality control measures, (iii) the development of a reporting format for end users with or without WGS expertise, (iv) the availability of a validation set of microorganisms, and (v) the creation of a modular template for the validation of WGS processes in PHLs. The validation panel, sequencing analytics, and raw sequences could facilitate multilaboratory comparisons of WGS data. Additionally, the WGS performance specifications and modular template are adaptable for the validation of other platforms and reagent kits. Copyright © 2017 Kozyreva et al.

  16. GST-PRIME: an algorithm for genome-wide primer design.

    PubMed

    Leister, Dario; Varotto, Claudio

    2007-01-01

    The profiling of mRNA expression based on DNA arrays has become a powerful tool to study genome-wide transcription of genes in a number of organisms. GST-PRIME is a software package created to facilitate large-scale primer design for the amplification of probes to be immobilized on arrays for transcriptome analyses, although it can also be applied in low-throughput approaches. GST-PRIME allows highly efficient, direct amplification of gene-sequence tags (GSTs) from genomic DNA (gDNA), starting from annotated genome or transcript sequences. GST-PRIME provides a customer-friendly platform for automatic primer design, and despite the relative simplicity of the algorithm, experimental tests in the model plant species Arabidopsis thaliana confirmed the reliability of the software. This chapter describes the algorithm used for primer design, its input and output files, and the installation of the standalone package and its use.

  17. Genome-Wide Prediction and Validation of Intergenic Enhancers in Arabidopsis Using Open Chromatin Signatures

    PubMed Central

    Zhu, Bo; Zhang, Wenli; Jiang, Jiming

    2015-01-01

    Enhancers are important regulators of gene expression in eukaryotes. Enhancers function independently of their distance and orientation to the promoters of target genes. Thus, enhancers have been difficult to identify. Only a few enhancers, especially distant intergenic enhancers, have been identified in plants. We developed an enhancer prediction system based exclusively on the DNase I hypersensitive sites (DHSs) in the Arabidopsis thaliana genome. A set of 10,044 DHSs located in intergenic regions, which are away from any gene promoters, were predicted to be putative enhancers. We examined the functions of 14 predicted enhancers using the β-glucuronidase gene reporter. Ten of the 14 (71%) candidates were validated by the reporter assay. We also designed 10 constructs using intergenic sequences that are not associated with DHSs, and none of these constructs showed enhancer activities in reporter assays. In addition, the tissue specificity of the putative enhancers can be precisely predicted based on DNase I hypersensitivity data sets developed from different plant tissues. These results suggest that the open chromatin signature-based enhancer prediction system developed in Arabidopsis may serve as a universal system for enhancer identification in plants. PMID:26373455

  18. Genome-wide linkage in Utah autism pedigrees

    PubMed Central

    Allen-Brady, K; Robison, R; Cannon, D; Varvil, T; Villalobos, M; Pingree, C; Leppert, MF; Miller, J; McMahon, WM; Coon, H

    2014-01-01

    Genetic studies of autism over the past decade suggest a complex landscape of multiple genes. In the face of this heterogeneity, studies that include large extended pedigrees may offer valuable insight, as the relatively few susceptibility genes within single large families may be more easily discerned. This genome-wide screen of 70 families includes 20 large extended pedigrees of 6–9 generations, 6 moderate-sized families of 4–5 generations, and 44 smaller families of 2–3 generations. The Center for Inherited Disease Research (CIDR) provided genotyping using the Illumina Linkage Panel 12, a 6K single nucleotide polymorphism (SNP) platform. Results from 192 subjects with an Autism Spectrum Disorder (ASD), and 461 of their relatives revealed genome-wide significance on chromosome 15q, with three possibly distinct peaks: 15q13.1-q14 (HLOD=4.09 at 29,459,872bp); 15q14-q21.1 (HLOD=3.59 at 36,837,208bp); and 15q21.1-q22.2 (HLOD=5.31 at 55,629,733bp). Two of these peaks replicate previous findings. There were additional suggestive results on chromosomes 2p25.3-p24.1 (HLOD=1.87), 7q31.31-q32.3 (HLOD=1.97), and 13q12.11-q12.3 (HLOD=1.93). Affected subjects in families supporting the linkage peaks found in this study did not reveal strong evidence for distinct phenotypic subgroups. PMID:19455147

  19. A genome-wide 20 K citrus microarray for gene expression analysis

    PubMed Central

    Martinez-Godoy, M Angeles; Mauri, Nuria; Juarez, Jose; Marques, M Carmen; Santiago, Julia; Forment, Javier; Gadea, Jose

    2008-01-01

    Background Understanding of genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with a high genome coverage. In recent years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results We have designed and constructed a publicly available genome-wide cDNA microarray that includes 21,081 putative unigenes of citrus. As a functional companion to the microarray, a web-browsable database [1] was created and populated with information about the unigenes represented in the microarray, including cDNA libraries, isolated clones, raw and processed nucleotide and protein sequences, and results of all the structural and functional annotation of the unigenes, like general description, BLAST hits, putative Arabidopsis orthologs, microsatellites, putative SNPs, GO classification and PFAM domains. We have performed a Gene Ontology comparison with the full set of Arabidopsis proteins to estimate the genome coverage of the microarray. We have also performed microarray hybridizations to check its usability. Conclusion This new cDNA microarray replaces the first 7K microarray generated two years ago and allows gene expression analysis at a more global scale. We have followed a rational design to minimize cross-hybridization while maintaining its utility for different citrus species. Furthermore, we also provide access to a website with full structural and functional annotation of the unigenes represented in the microarray, along with the ability to use this site to directly perform gene expression analysis using standard tools at different publicly available servers. Furthermore, we show how this microarray offers a good representation of the citrus genome and present the usefulness of this genomic tool for global studies in citrus by using it to

  20. Implementing meta-analysis from genome-wide association studies for pork quality traits

    USDA-ARS's Scientific Manuscript database

    Pork quality plays an important role in the meat processing industry, thus different methodologies have been implemented to elucidate the genetic architecture of traits affecting meat quality. One of the most common and widely used approaches is to perform genome-wide association (GWA) studies. Howe...

  1. Genome-wide alterations of the DNA replication program during tumor progression

    NASA Astrophysics Data System (ADS)

    Arneodo, A.; Goldar, A.; Argoul, F.; Hyrien, O.; Audit, B.

    2016-08-01

    Oncogenic stress is a major driving force in the early stages of cancer development. Recent experimental findings reveal that, in precancerous lesions and cancers, activated oncogenes may induce stalling and dissociation of DNA replication forks resulting in DNA damage. Replication timing is emerging as an important epigenetic feature that recapitulates several genomic, epigenetic and functional specificities of even closely related cell types. There is increasing evidence that chromosome rearrangements, the hallmark of many cancer genomes, are intimately associated with the DNA replication program and that epigenetic replication timing changes often precede chromosomal rearrangements. The recent development of a novel methodology to map replication fork polarity using deep sequencing of Okazaki fragments has provided new and complementary genome-wide replication profiling data. We review the results of a wavelet-based multi-scale analysis of genomic and epigenetic data including replication profiles along human chromosomes. These results provide new insight into the spatio-temporal replication program and its dynamics during differentiation. Here our goal is to bring to cancer research the experimental protocols and computational methodologies for replication program profiling, and also the modeling of the spatio-temporal replication program. To illustrate our purpose, we report very preliminary results obtained for chronic myelogenous leukemia, the archetype model of cancer. Finally, we discuss promising perspectives on using genome-wide DNA replication profiling as a novel efficient tool for cancer diagnosis, prognosis and personalized treatment.

  2. Assessing genomic selection prediction accuracy in a dynamic barley breeding

    USDA-ARS's Scientific Manuscript database

    Genomic selection is a method to improve quantitative traits in crops and livestock by estimating breeding values of selection candidates using phenotype and genome-wide marker data sets. Prediction accuracy has been evaluated through simulation and cross-validation, however validation based on prog...

  3. A Genome-Wide Study of Modern-Day Tuscans: Revisiting Herodotus's Theory on the Origin of the Etruscans

    PubMed Central

    Gómez-Carballa, Alberto; Amigo, Jorge; Martinón-Torres, Federico

    2014-01-01

    Background The origin of the Etruscan civilization (Etruria, Central Italy) is a long-standing subject of debate among scholars from different disciplines. The bulk of the information has been reconstructed from ancient texts and archaeological findings and, in the last few years, through the analysis of uniparental genetic markers. Methods By meta-analyzing genome-wide data from The 1000 Genomes Project and the literature, we were able to compare the genomic patterns (>540,000 SNPs) of present day Tuscans (N = 98) with other population groups from the main hypothetical source populations, namely, Europe and the Middle East. Results Admixture analysis indicates the presence of a 25–34% Middle Eastern component in modern Tuscans. Different analyses carried out using identity-by-state (IBS) values and genetic distances point to Eastern Anatolia/Southern Caucasus as the most likely geographic origin of the main Middle Eastern genetic component observed in the genome of modern Tuscans. Conclusions The data indicate that the admixture event between local Tuscans and Middle Easterners could have occurred in Central Italy about 2,600–3,100 years ago (y.a.). On the whole, the results validate the theory of the ancient historian Herodotus on the origin of Etruscans. PMID:25230205
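
    The identity-by-state (IBS) values mentioned in this abstract can be illustrated with a minimal sketch. It assumes genotypes coded as allele counts (0, 1, 2) and uses the common "proportion of alleles shared" definition; the function name `ibs_similarity` and the use of `None` for missing calls are illustrative choices, not details taken from the study.

```python
def ibs_similarity(g1, g2):
    """Mean proportion of alleles shared identical-by-state between two individuals.

    g1, g2: sequences of genotypes coded as allele counts (0, 1 or 2);
    None marks a missing call (an assumption made for this sketch).
    """
    shared = 0   # alleles shared IBS, summed over usable SNPs
    typed = 0    # SNPs genotyped in both individuals
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue
        shared += 2 - abs(a - b)   # 2, 1 or 0 alleles shared at this SNP
        typed += 1
    if typed == 0:
        raise ValueError("no SNP is genotyped in both individuals")
    return shared / (2 * typed)


# Two hypothetical individuals typed at four SNPs, one call missing.
print(ibs_similarity([0, 1, 2, None], [1, 1, 0, 2]))
```

    One minus this value is often used as a pairwise genetic distance, which is one way IBS values and distance-based analyses can be connected.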

  4. Genome-wide analysis of epistasis in body mass index using multiple human populations.

    PubMed

    Wei, Wen-Hua; Hemani, Gib; Gyenesei, Attila; Vitart, Veronique; Navarro, Pau; Hayward, Caroline; Cabrera, Claudia P; Huffman, Jennifer E; Knott, Sara A; Hicks, Andrew A; Rudan, Igor; Pramstaller, Peter P; Wild, Sarah H; Wilson, James F; Campbell, Harry; Hastie, Nicholas D; Wright, Alan F; Haley, Chris S

    2012-08-01

    We surveyed gene-gene interactions (epistasis) in human body mass index (BMI) in four European populations (n<1200) via exhaustive pair-wise genome scans where interactions were computed as F ratios by testing a linear regression model fitting two single-nucleotide polymorphisms (SNPs) with interactions against one without. Before the association tests, BMI was corrected for sex and age, normalised and adjusted for relatedness. Neither single SNPs nor SNP interactions were genome-wide significant in any cohort based on the consensus threshold (P=5.0E-08) and a Bonferroni corrected threshold (P=1.1E-12), respectively. Next we compared sub genome-wide significant SNP interactions (P<5.0E-08) across cohorts to identify common epistatic signals, where SNPs were annotated to genes to test for gene ontology (GO) enrichment. Among the epistatic genes contributing to the commonly enriched GO terms, 19 were shared across study cohorts of which 15 are previously published genome-wide association loci, including CDH13 (cadherin 13) associated with height and SORCS2 (sortilin-related VPS10 domain containing receptor 2) associated with circulating insulin-like growth factor 1 and binding protein 3. Interactions between the 19 shared epistatic genes and those involving BMI candidate loci (P<5.0E-08) were tested across cohorts, and eight were found to replicate at the SNP level (P<0.05) in at least one cohort; these were further tested and showed limited replication in a separate European population (n>5000). We conclude that genome-wide analysis of epistasis in multiple populations is an effective approach to provide new insights into the genetic regulation of BMI but requires additional efforts to confirm the findings.
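
    The nested-model F ratio described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's implementation: genotypes are coded additively (0, 1, 2), the interaction is a single product term (1 degree of freedom), and the phenotype is taken as already corrected. All function names and the toy data are hypothetical, and the SNP count of 300,000 passed to `pairwise_bonferroni` is an illustrative figure that does not come from the abstract.

```python
from itertools import product


def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on the columns of X."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations; adequate for this small sketch
    return sum((yi - sum(c * v for c, v in zip(beta, r))) ** 2
               for r, yi in zip(X, y))


def interaction_f_ratio(snp1, snp2, y):
    """F ratio comparing y ~ 1 + snp1 + snp2 + snp1*snp2 against y ~ 1 + snp1 + snp2."""
    n = len(y)
    if n <= 4:
        raise ValueError("need more than 4 observations")
    reduced = [[1.0, a, b] for a, b in zip(snp1, snp2)]
    full = [[1.0, a, b, a * b] for a, b in zip(snp1, snp2)]
    rss_reduced, rss_full = rss(reduced, y), rss(full, y)
    df_extra, df_resid = 1, n - 4  # one extra parameter; residual df of full model
    return ((rss_reduced - rss_full) / df_extra) / (rss_full / df_resid)


def pairwise_bonferroni(n_snps, alpha=0.05):
    """Bonferroni threshold when every unordered SNP pair is tested once."""
    return alpha / (n_snps * (n_snps - 1) / 2)


# Toy data: every genotype combination twice, with a built-in interaction effect.
genotypes = list(product((0, 1, 2), repeat=2)) * 2
snp1 = [a for a, _ in genotypes]
snp2 = [b for _, b in genotypes]
noise = [0.1 * (-1) ** i for i in range(len(genotypes))]
y = [a + b + 2 * a * b + e for (a, b), e in zip(genotypes, noise)]

print(interaction_f_ratio(snp1, snp2, y))
print(pairwise_bonferroni(300_000))
```

    With the illustrative count of 300,000 SNPs, the pairwise threshold is on the order of 1E-12, the same order as the Bonferroni-corrected value quoted above. An exhaustive scan repeats the F test for every SNP pair, which is why the multiple-testing correction is so much stricter than the single-SNP consensus threshold.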

  5. Genome-wide analysis identifies 12 loci influencing human reproductive behavior

    PubMed Central

    Barban, Nicola; Jansen, Rick; de Vlaming, Ronald; Vaez, Ahmad; Mandemakers, Jornt J.; Tropf, Felix C.; Shen, Xia; Wilson, James F.; Chasman, Daniel I.; Nolte, Ilja M.; Tragante, Vinicius; van der Laan, Sander W.; Perry, John R. B.; Kong, Augustine; Ahluwalia, Tarunveer; Albrecht, Eva; Yerges-Armstrong, Laura; Atzmon, Gil; Auro, Kirsi; Ayers, Kristin; Bakshi, Andrew; Ben-Avraham, Danny; Berger, Klaus; Bergman, Aviv; Bertram, Lars; Bielak, Lawrence F.; Bjornsdottir, Gyda; Bonder, Marc Jan; Broer, Linda; Bui, Minh; Barbieri, Caterina; Cavadino, Alana; Chavarro, Jorge E; Turman, Constance; Concas, Maria Pina; Cordell, Heather J.; Davies, Gail; Eibich, Peter; Eriksson, Nicholas; Esko, Tõnu; Eriksson, Joel; Falahi, Fahimeh; Felix, Janine F.; Fontana, Mark Alan; Franke, Lude; Gandin, Ilaria; Gaskins, Audrey J.; Gieger, Christian; Gunderson, Erica P.; Guo, Xiuqing; Hayward, Caroline; He, Chunyan; Hofer, Edith; Huang, Hongyan; Joshi, Peter K.; Kanoni, Stavroula; Karlsson, Robert; Kiechl, Stefan; Kifley, Annette; Kluttig, Alexander; Kraft, Peter; Lagou, Vasiliki; Lecoeur, Cecile; Lahti, Jari; Li-Gao, Ruifang; Lind, Penelope A.; Liu, Tian; Makalic, Enes; Mamasoula, Crysovalanto; Matteson, Lindsay; Mbarek, Hamdi; McArdle, Patrick F.; McMahon, George; Meddens, S. Fleur W.; Mihailov, Evelin; Miller, Mike; Missmer, Stacey A.; Monnereau, Claire; van der Most, Peter J.; Myhre, Ronny; Nalls, Mike A.; Nutile, Teresa; Panagiota, Kalafati Ioanna; Porcu, Eleonora; Prokopenko, Inga; Rajan, Kumar B.; Rich-Edwards, Janet; Rietveld, Cornelius A.; Robino, Antonietta; Rose, Lynda M.; Rueedi, Rico; Ryan, Kathy; Saba, Yasaman; Schmidt, Daniel; Smith, Jennifer A.; Stolk, Lisette; Streeten, Elizabeth; Tonjes, Anke; Thorleifsson, Gudmar; Ulivi, Sheila; Wedenoja, Juho; Wellmann, Juergen; Willeit, Peter; Yao, Jie; Yengo, Loic; Zhao, Jing Hua; Zhao, Wei; Zhernakova, Daria V.; Amin, Najaf; Andrews, Howard; Balkau, Beverley; Barzilai, Nir; Bergmann, Sven; Biino, Ginevra; Bisgaard, Hans; Bønnelykke, Klaus; Boomsma, Dorret I.; Buring, Julie E.; Campbell, Harry; Cappellani, Stefania; Ciullo, Marina; Cox, Simon R.; Cucca, Francesco; Daniela, Toniolo; Davey-Smith, George; Deary, Ian J.; Dedoussis, George; Deloukas, Panos; van Duijn, Cornelia M.; de Geus, Eco JC.; Eriksson, Johan G.; Evans, Denis A.; Faul, Jessica D.; Felicita, Sala Cinzia; Froguel, Philippe; Gasparini, Paolo; Girotto, Giorgia; Grabe, Hans-Jörgen; Greiser, Karin Halina; Groenen, Patrick J.F.; de Haan, Hugoline G.; Haerting, Johannes; Harris, Tamara B.; Heath, Andrew C.; Heikkilä, Kauko; Hofman, Albert; Homuth, Georg; Holliday, Elizabeth G; Hopper, John; Hypponen, Elina; Jacobsson, Bo; Jaddoe, Vincent W. V.; Johannesson, Magnus; Jugessur, Astanand; Kähönen, Mika; Kajantie, Eero; Kardia, Sharon L.R.; Keavney, Bernard; Kolcic, Ivana; Koponen, Päivikki; Kovacs, Peter; Kronenberg, Florian; Kutalik, Zoltan; La Bianca, Martina; Lachance, Genevieve; Iacono, William; Lai, Sandra; Lehtimäki, Terho; Liewald, David C; Lindgren, Cecilia; Liu, Yongmei; Luben, Robert; Lucht, Michael; Luoto, Riitta; Magnus, Per; Magnusson, Patrik K.E.; Martin, Nicholas G.; McGue, Matt; McQuillan, Ruth; Medland, Sarah E.; Meisinger, Christa; Mellström, Dan; Metspalu, Andres; Michela, Traglia; Milani, Lili; Mitchell, Paul; Montgomery, Grant W.; Mook-Kanamori, Dennis; de Mutsert, Renée; Nohr, Ellen A; Ohlsson, Claes; Olsen, Jørn; Ong, Ken K.; Paternoster, Lavinia; Pattie, Alison; Penninx, Brenda WJH; Perola, Markus; Peyser, Patricia A.; Pirastu, Mario; Polasek, Ozren; Power, Chris; Kaprio, Jaakko; Raffel, Leslie J.; Räikkönen, Katri; Raitakari, Olli; Ridker, Paul M.; Ring, Susan M.; Roll, Kathryn; Rudan, Igor; Ruggiero, Daniela; Rujescu, Dan; Salomaa, Veikko; Schlessinger, David; Schmidt, Helena; Schmidt, Reinhold; Schupf, Nicole; Smit, Johannes; Sorice, Rossella; Spector, Tim D.; Starr, John M.; Stöckl, Doris; Strauch, Konstantin; Stumvoll, Michael; Swertz, Morris A.; Thorsteinsdottir, Unnur; Thurik, A. Roy; Timpson, Nicholas J.; Tönjes, Anke; Tung, Joyce Y.; Uitterlinden, André G.; Vaccargiu, Simona; Viikari, Jorma; Vitart, Veronique; Völzke, Henry; Vollenweider, Peter; Vuckovic, Dragana; Waage, Johannes; Wagner, Gert G.; Wang, Jie Jin; Wareham, Nicholas J.; Weir, David R.; Willemsen, Gonneke; Willeit, Johann; Wright, Alan F.; Zondervan, Krina T.; Stefansson, Kari; Krueger, Robert F.; Lee, James J.; Benjamin, Daniel J.; Cesarini, David; Koellinger, Philipp D.; den Hoed, Marcel; Snieder, Harold; Mills, Melinda C.

    2017-01-01

    The genetic architecture of human reproductive behavior – age at first birth (AFB) and number of children ever born (NEB) – has a strong relationship with fitness, human development, infertility and risk of neuropsychiatric disorders. However, very few genetic loci have been identified and the underlying mechanisms of AFB and NEB are poorly understood. We report the largest genome-wide association study to date of both sexes including 251,151 individuals for AFB and 343,072 for NEB. We identified 12 independent loci that are significantly associated with AFB and/or NEB in a SNP-based genome-wide association study, and four additional loci in a gene-based effort. These loci harbor genes that are likely to play a role – either directly or by affecting non-local gene expression – in human reproduction and infertility, thereby increasing our understanding of these complex traits. PMID:27798627

  6. Quality control and conduct of genome-wide association meta-analyses

    PubMed Central

    Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Mägi, Reedik; Ferreira, Teresa; Fall, Tove; Graff, Mariaelisa; Justice, Anne E; Luan, Jian'an; Gustafsson, Stefan; Randall, Joshua C; Vedantam, Sailaja; Workalemahu, Tsegaselassie; Kilpeläinen, Tuomas O; Scherag, André; Esko, Tonu; Kutalik, Zoltán; Heid, Iris M; Loos, Ruth JF

    2014-01-01

    Rigorous organization and quality control (QC) are necessary to facilitate successful genome-wide association meta-analyses (GWAMAs) of statistics aggregated across multiple genome-wide association studies. This protocol provides guidelines for [1] organizational aspects of GWAMAs, and for [2] QC at the study file level, the meta-level across studies, and the meta-analysis output level. Real-world examples highlight issues experienced and solutions developed by the GIANT Consortium that has conducted meta-analyses including data from 125 studies comprising more than 330,000 individuals. We provide a general protocol for conducting GWAMAs and carrying out QC to minimize errors and to guarantee maximum use of the data. We also include details for use of a powerful and flexible software package called EasyQC. For consortia of comparable size to the GIANT consortium, the present protocol takes a minimum of about 10 months to complete. PMID:24762786

  7. Genome-wide association analysis identifies 13 new risk loci for schizophrenia.

    PubMed

    Ripke, Stephan; O'Dushlaine, Colm; Chambert, Kimberly; Moran, Jennifer L; Kähler, Anna K; Akterin, Susanne; Bergen, Sarah E; Collins, Ann L; Crowley, James J; Fromer, Menachem; Kim, Yunjung; Lee, Sang Hong; Magnusson, Patrik K E; Sanchez, Nick; Stahl, Eli A; Williams, Stephanie; Wray, Naomi R; Xia, Kai; Bettella, Francesco; Borglum, Anders D; Bulik-Sullivan, Brendan K; Cormican, Paul; Craddock, Nick; de Leeuw, Christiaan; Durmishi, Naser; Gill, Michael; Golimbet, Vera; Hamshere, Marian L; Holmans, Peter; Hougaard, David M; Kendler, Kenneth S; Lin, Kuang; Morris, Derek W; Mors, Ole; Mortensen, Preben B; Neale, Benjamin M; O'Neill, Francis A; Owen, Michael J; Milovancevic, Milica Pejovic; Posthuma, Danielle; Powell, John; Richards, Alexander L; Riley, Brien P; Ruderfer, Douglas; Rujescu, Dan; Sigurdsson, Engilbert; Silagadze, Teimuraz; Smit, August B; Stefansson, Hreinn; Steinberg, Stacy; Suvisaari, Jaana; Tosato, Sarah; Verhage, Matthijs; Walters, James T; Levinson, Douglas F; Gejman, Pablo V; Kendler, Kenneth S; Laurent, Claudine; Mowry, Bryan J; O'Donovan, Michael C; Owen, Michael J; Pulver, Ann E; Riley, Brien P; Schwab, Sibylle G; Wildenauer, Dieter B; Dudbridge, Frank; Holmans, Peter; Shi, Jianxin; Albus, Margot; Alexander, Madeline; Campion, Dominique; Cohen, David; Dikeos, Dimitris; Duan, Jubao; Eichhammer, Peter; Godard, Stephanie; Hansen, Mark; Lerer, F Bernard; Liang, Kung-Yee; Maier, Wolfgang; Mallet, Jacques; Nertney, Deborah A; Nestadt, Gerald; Norton, Nadine; O'Neill, Francis A; Papadimitriou, George N; Ribble, Robert; Sanders, Alan R; Silverman, Jeremy M; Walsh, Dermot; Williams, Nigel M; Wormley, Brandon; Arranz, Maria J; Bakker, Steven; Bender, Stephan; Bramon, Elvira; Collier, David; Crespo-Facorro, Benedicto; Hall, Jeremy; Iyegbe, Conrad; Jablensky, Assen; Kahn, Rene S; Kalaydjieva, Luba; Lawrie, Stephen; Lewis, Cathryn M; Lin, Kuang; Linszen, Don H; Mata, Ignacio; McIntosh, Andrew; Murray, Robin M; Ophoff, Roel A; Powell, John; Rujescu, Dan; 
Van Os, Jim; Walshe, Muriel; Weisbrod, Matthias; Wiersma, Durk; Donnelly, Peter; Barroso, Ines; Blackwell, Jenefer M; Bramon, Elvira; Brown, Matthew A; Casas, Juan P; Corvin, Aiden P; Deloukas, Panos; Duncanson, Audrey; Jankowski, Janusz; Markus, Hugh S; Mathew, Christopher G; Palmer, Colin N A; Plomin, Robert; Rautanen, Anna; Sawcer, Stephen J; Trembath, Richard C; Viswanathan, Ananth C; Wood, Nicholas W; Spencer, Chris C A; Band, Gavin; Bellenguez, Céline; Freeman, Colin; Hellenthal, Garrett; Giannoulatou, Eleni; Pirinen, Matti; Pearson, Richard D; Strange, Amy; Su, Zhan; Vukcevic, Damjan; Donnelly, Peter; Langford, Cordelia; Hunt, Sarah E; Edkins, Sarah; Gwilliam, Rhian; Blackburn, Hannah; Bumpstead, Suzannah J; Dronov, Serge; Gillman, Matthew; Gray, Emma; Hammond, Naomi; Jayakumar, Alagurevathi; McCann, Owen T; Liddle, Jennifer; Potter, Simon C; Ravindrarajah, Radhi; Ricketts, Michelle; Tashakkori-Ghanbaria, Avazeh; Waller, Matthew J; Weston, Paul; Widaa, Sara; Whittaker, Pamela; Barroso, Ines; Deloukas, Panos; Mathew, Christopher G; Blackwell, Jenefer M; Brown, Matthew A; Corvin, Aiden P; McCarthy, Mark I; Spencer, Chris C A; Bramon, Elvira; Corvin, Aiden P; O'Donovan, Michael C; Stefansson, Kari; Scolnick, Edward; Purcell, Shaun; McCarroll, Steven A; Sklar, Pamela; Hultman, Christina M; Sullivan, Patrick F

    2013-10-01

    Schizophrenia is an idiopathic mental disorder with a heritable component and a substantial public health impact. We conducted a multi-stage genome-wide association study (GWAS) for schizophrenia beginning with a Swedish national sample (5,001 cases and 6,243 controls) followed by meta-analysis with previous schizophrenia GWAS (8,832 cases and 12,067 controls) and finally by replication of SNPs in 168 genomic regions in independent samples (7,413 cases, 19,762 controls and 581 parent-offspring trios). We identified 22 loci associated at genome-wide significance; 13 of these are new, and 1 was previously implicated in bipolar disorder. Examination of candidate genes at these loci suggests the involvement of neuronal calcium signaling. We estimate that 8,300 independent, mostly common SNPs (95% credible interval of 6,300-10,200 SNPs) contribute to risk for schizophrenia and that these collectively account for at least 32% of the variance in liability. Common genetic variation has an important role in the etiology of schizophrenia, and larger studies will allow more detailed understanding of this disorder.

  8. Genome-wide characterization of microRNA in foxtail millet (Setaria italica)

    PubMed Central

    2013-01-01

    Background MicroRNAs (miRNAs) are a class of short non-coding, endogenous RNAs that play key roles in many biological processes in both animals and plants. Although many miRNAs have been identified in a large number of organisms, the miRNAs in foxtail millet (Setaria italica) have, until now, been poorly understood. Results In this study, two replicate small RNA libraries from foxtail millet shoots were sequenced, and 40 million reads representing over 10 million unique sequences were generated. We identified 43 known miRNAs, 172 novel miRNAs and 2 mirtron precursor candidates in foxtail millet. Some miRNA*s of the known and novel miRNAs were detected as well. Further, eight novel miRNAs were validated by stem-loop RT-PCR. Potential targets of the foxtail millet miRNAs were predicted based on our strict criteria. Of the predicted target genes, 79% (351) had functional annotations in InterPro and GO analyses, indicating the targets of the miRNAs were involved in a wide range of regulatory functions and some specific biological processes. A total of 69 pairs of syntenic miRNA precursors that were conserved between foxtail millet and sorghum were found. Additionally, stem-loop RT-PCR was conducted to confirm the tissue-specific expression of some miRNAs in the four tissues identified by deep-sequencing. Conclusions We predicted, for the first time, 215 miRNAs and 447 miRNA targets in foxtail millet at a genome-wide level. The precursors, expression levels, miRNA* sequences, target functions, conservation, and evolution of miRNAs we identified were investigated. Some of the novel foxtail millet miRNAs and miRNA targets were validated experimentally. PMID:24330712

  9. Genetically contextual effects of smoking on genome wide DNA methylation.

    PubMed

    Dogan, Meeshanthini V; Beach, Steven R H; Philibert, Robert A

    2017-09-01

    Smoking is the leading cause of death in the United States. It exerts its effects by increasing susceptibility to a variety of complex disorders among those who smoke, and if pregnant, to their unborn children. In prior efforts to understand the epigenetic mechanisms through which this increased vulnerability is conveyed, a number of investigators have conducted genome wide methylation analyses. Unfortunately, secondary to methodological limitations, these studies were unable to examine methylation in gene regions with significant amounts of genetic variation. Using genome wide genetic and epigenetic data from the Framingham Heart Study, we re-examined the relationship of smoking status to genome wide methylation status. When only methylation status is considered, smoking was significantly associated with differential methylation in 310 genes that map to a variety of biological process and cellular differentiation pathways. However, when SNP effects on the magnitude of smoking associated methylation changes are also considered, cis and trans-interaction effects were noted at a total of 266 and 4353 genes with no marked enrichment for any biological pathways. Furthermore, the SNP variation participating in the significant interaction effects is enriched for loci previously associated with complex medical illnesses. The enlarged scope of the methylome shown to be affected by smoking may better explicate the mediational pathways linking smoking with a myriad of smoking related complex syndromes. Additionally, these results strongly suggest that combined epigenetic and genetic data analyses may be critical for a more complete understanding of the relationship between environmental variables, such as smoking, and pathophysiological outcomes. © 2017 Wiley Periodicals, Inc.

  10. Genome-wide selective sweeps and gene-specific sweeps in natural bacterial populations

    DOE PAGES

    Bendall, Matthew L.; Stevens, Sarah L.R.; Chan, Leong-Keat; ...

    2016-01-08

    Multiple models describe the formation and evolution of distinct microbial phylogenetic groups. These evolutionary models make different predictions regarding how adaptive alleles spread through populations and how genetic diversity is maintained. Processes predicted by competing evolutionary models, for example, genome-wide selective sweeps vs gene-specific sweeps, could be captured in natural populations using time-series metagenomics if the approach were applied over a sufficiently long time frame. Direct observations of either process would help resolve how distinct microbial groups evolve. Using a 9-year metagenomic study of a freshwater lake (2005–2013), we explore changes in single-nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in 30 bacterial populations. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied by >1000-fold among populations. SNP allele frequencies also changed dramatically over time within some populations. Interestingly, nearly all SNP variants were slowly purged over several years from one population of green sulfur bacteria, while at the same time multiple genes either swept through or were lost from this population. Furthermore, these patterns were consistent with a genome-wide selective sweep in progress, a process predicted by the ‘ecotype model’ of speciation but not previously observed in nature. In contrast, other populations contained large, SNP-free genomic regions that appear to have swept independently through the populations prior to the study without purging diversity elsewhere in the genome. Finally, evidence for both genome-wide and gene-specific sweeps suggests that different models of bacterial speciation may apply to different populations coexisting in the same environment.

  11. Sniffing out significant “Pee values”: genome wide association study of asparagus anosmia

    PubMed Central

    Markt, Sarah C; Nuttall, Elizabeth; Turman, Constance; Sinnott, Jennifer; Rimm, Eric B; Ecsedy, Ethan; Unger, Robert H; Fall, Katja; Finn, Stephen; Jensen, Majken K; Rider, Jennifer R; Kraft, Peter

    2016-01-01

    Objective To determine the inherited factors associated with the ability to smell asparagus metabolites in urine. Design Genome wide association study. Setting Nurses’ Health Study and Health Professionals Follow-up Study cohorts. Participants 6909 men and women of European-American descent with available genetic data from genome wide association studies. Main outcome measure Participants were characterized as asparagus smellers if they strongly agreed with the prompt “after eating asparagus, you notice a strong characteristic odor in your urine,” and anosmic if otherwise. We calculated per-allele estimates of asparagus anosmia for about nine million single nucleotide polymorphisms using logistic regression. P values <5×10⁻⁸ were considered as genome wide significant. Results 58.0% of men (n=1449/2500) and 61.5% of women (n=2712/4409) had anosmia. 871 single nucleotide polymorphisms reached genome wide significance for asparagus anosmia, all in a region on chromosome 1 (1q44: 248139851-248595299) containing multiple genes in the olfactory receptor 2 (OR2) family. Conditional analyses revealed three independent markers associated with asparagus anosmia: rs13373863, rs71538191, and rs6689553. Conclusion A large proportion of people have asparagus anosmia. Genetic variation near multiple olfactory receptor genes is associated with the ability of an individual to smell the metabolites of asparagus in urine. Future replication studies are necessary before considering targeted therapies to help anosmic people discover what they are missing. PMID:27965198

  12. Validating genomic reliabilities and gains from phenotypic updates

    USDA-ARS's Scientific Manuscript database

    Reliability can be validated from the variance of the difference of earlier and later estimated breeding values as a fraction of the genetic variance. This new method avoids using squared correlations that can be biased downward by selection. Published genomic reliabilities of U.S. young bulls agree...

  13. Genome-wide association analysis of age-at-onset in Alzheimer's disease.

    PubMed

    Kamboh, M I; Barmada, M M; Demirci, F Y; Minster, R L; Carrasquillo, M M; Pankratz, V S; Younkin, S G; Saykin, A J; Sweet, R A; Feingold, E; DeKosky, S T; Lopez, O L

    2012-12-01

    The risk of Alzheimer's disease (AD) is strongly determined by genetic factors, and recent genome-wide association studies (GWAS) have identified several genes for the disease risk. In addition to the disease risk, age-at-onset (AAO) of AD also has a strong genetic component, with an estimated heritability of 42%. Identification of AAO genes may help to understand the biological mechanisms that regulate the onset of the disease. Here we report the first GWAS focused on identifying genes for the AAO of AD. We performed a genome-wide meta-analysis on three samples comprising a total of 2222 AD cases. A total of ~2.5 million directly genotyped or imputed single-nucleotide polymorphisms (SNPs) were analyzed in relation to AAO of AD. As expected, the most significant associations were observed in the apolipoprotein E (APOE) region on chromosome 19 where several SNPs surpassed the conservative genome-wide significant threshold (P<5E-08). The most significant SNP outside the APOE region was located in the DCHS2 gene on chromosome 4q31.3 (rs1466662; P=4.95E-07). There were 19 additional significant SNPs in this region at P<1E-04 and the DCHS2 gene is expressed in the cerebral cortex and thus is a potential candidate for affecting AAO in AD. These findings need to be confirmed in additional well-powered samples.

  14. Transcriptome-wide investigation of genomic imprinting in chicken

    PubMed Central

    Frésard, Laure; Leroux, Sophie; Servin, Bertrand; Gourichon, David; Dehais, Patrice; Cristobal, Magali San; Marsaud, Nathalie; Vignoles, Florence; Bed'hom, Bertrand; Coville, Jean-Luc; Hormozdiari, Farhad; Beaumont, Catherine; Zerjal, Tatiana; Vignal, Alain; Morisson, Mireille; Lagarrigue, Sandrine; Pitel, Frédérique

    2014-01-01

    Genomic imprinting is an epigenetic mechanism by which alleles of some specific genes are expressed in a parent-of-origin manner. It has been observed in mammals and marsupials, but not in birds. Until now, only a few genes orthologous to mammalian imprinted ones have been analyzed in chicken and did not demonstrate any evidence of imprinting in this species. However, several published observations such as imprinted-like QTL in poultry or reciprocal effects keep the question open. Our main objective was thus to screen the entire chicken genome for parental-allele-specific differential expression on whole embryonic transcriptomes, using high-throughput sequencing. To identify the parental origin of each observed haplotype, two chicken experimental populations were used, as inbred and as genetically distant as possible. Two families were produced from two reciprocal crosses. Transcripts from 20 embryos were sequenced using NGS technology, producing ∼200 Gb of sequences. This allowed the detection of 79 potentially imprinted SNPs, through an analysis method that we validated by detecting imprinting from mouse data already published. However, out of 23 candidates tested by pyrosequencing, none could be confirmed. These results come together, without a priori, with previous statements and phylogenetic considerations assessing the absence of genomic imprinting in chicken. PMID:24452801

  15. Transcriptome-wide investigation of genomic imprinting in chicken.

    PubMed

    Frésard, Laure; Leroux, Sophie; Servin, Bertrand; Gourichon, David; Dehais, Patrice; Cristobal, Magali San; Marsaud, Nathalie; Vignoles, Florence; Bed'hom, Bertrand; Coville, Jean-Luc; Hormozdiari, Farhad; Beaumont, Catherine; Zerjal, Tatiana; Vignal, Alain; Morisson, Mireille; Lagarrigue, Sandrine; Pitel, Frédérique

    2014-04-01

    Genomic imprinting is an epigenetic mechanism by which alleles of some specific genes are expressed in a parent-of-origin manner. It has been observed in mammals and marsupials, but not in birds. Until now, only a few genes orthologous to mammalian imprinted ones have been analyzed in chicken and did not demonstrate any evidence of imprinting in this species. However, several published observations such as imprinted-like QTL in poultry or reciprocal effects keep the question open. Our main objective was thus to screen the entire chicken genome for parental-allele-specific differential expression on whole embryonic transcriptomes, using high-throughput sequencing. To identify the parental origin of each observed haplotype, two chicken experimental populations were used, as inbred and as genetically distant as possible. Two families were produced from two reciprocal crosses. Transcripts from 20 embryos were sequenced using NGS technology, producing ∼200 Gb of sequences. This allowed the detection of 79 potentially imprinted SNPs, through an analysis method that we validated by detecting imprinting from mouse data already published. However, out of 23 candidates tested by pyrosequencing, none could be confirmed. These results come together, without a priori, with previous statements and phylogenetic considerations assessing the absence of genomic imprinting in chicken.

  16. Investigation of Maternal Genotype Effects in Autism by Genome-Wide Association

    PubMed Central

    Yuan, Han; Dougherty, Joseph D.

    2014-01-01

    Lay Abstract Autism spectrum disorders (ASDs) are pervasive developmental disorders which have both a genetic and environmental component. One source of the environmental component is the in utero (prenatal) environment. The maternal genome can potentially contribute to the risk of autism in children by altering this prenatal environment. In this study, the possibility of maternal genotype effects was explored by looking for common variants (single nucleotide polymorphisms, or SNPs) in the maternal genome associated with increased risk of autism in children. We performed a case/control genome-wide association study (GWAS) using mothers of probands as cases and either fathers of probands or normal females as controls, using two collections of families with autism. We did not identify any SNP that reached significance and thus a common variant of large effect is unlikely. However, there was evidence for the possibility of a large number of alleles each carrying a small effect. This suggested that if there is a contribution to autism risk through common-variant maternal genetic effects, it may be the result of multiple loci of small effects. We did not investigate rare variants in this study. Scientific Abstract Like most psychiatric disorders, autism spectrum disorders have both a genetic and an environmental component. While previous studies have clearly demonstrated the contribution of in utero (prenatal) environment on autism risk, most of them focused on transient environmental factors. Based on a recent sibling study, we hypothesized that environmental factors could also come from the maternal genome, which would result in persistent effects across siblings. In this study, the possibility of maternal genotype effects was examined by looking for common variants (single nucleotide polymorphisms, or SNPs) in the maternal genome associated with increased risk of autism in children. A case/control genome-wide association study (GWAS) was performed using mothers of

  17. Genome-wide association between DNA methylation and alternative splicing in an invertebrate

    PubMed Central

    2012-01-01

    Background Gene bodies are the most evolutionarily conserved targets of DNA methylation in eukaryotes. However, the regulatory functions of gene body DNA methylation remain largely unknown. DNA methylation in insects appears to be primarily confined to exons. Two recent studies in Apis mellifera (honeybee) and Nasonia vitripennis (jewel wasp) analyzed transcription and DNA methylation data for one gene in each species to demonstrate that exon-specific DNA methylation may be associated with alternative splicing events. In this study we investigated the relationship between DNA methylation, alternative splicing, and cross-species gene conservation on a genome-wide scale using genome-wide transcription and DNA methylation data. Results We generated RNA deep sequencing data (RNA-seq) to measure genome-wide mRNA expression at the exon- and gene-level. We produced a de novo transcriptome from this RNA-seq data and computationally predicted splice variants for the honeybee genome. We found that exons that are included in transcription are more highly methylated than exons that are skipped during transcription. We detected enrichment for alternative splicing among methylated genes compared to unmethylated genes using Fisher’s exact test. We performed a statistical analysis to reveal that the presence of DNA methylation or alternative splicing are both factors associated with a longer gene length and a greater number of exons in genes. In concordance with this observation, a conservation analysis using BLAST revealed that each of these factors is also associated with higher cross-species gene conservation. Conclusions This study constitutes the first genome-wide analysis exhibiting a positive relationship between exon-level DNA methylation and mRNA expression in the honeybee. Our finding that methylated genes are enriched for alternative splicing suggests that, in invertebrates, exon-level DNA methylation may play a role in the construction of splice variants by positively

  18. Genome-wide association studies in cardiac electrophysiology: recent discoveries and implications for clinical practice.

    PubMed

    Milan, David J; Lubitz, Steven A; Kääb, Stefan; Ellinor, Patrick T

    2010-08-01

    Genome-wide association studies have been increasingly used to study the genetics of complex human diseases. Within the field of cardiac electrophysiology, this technique has been applied to conditions such as atrial fibrillation, and several electrocardiographic parameters including the QT interval. While these studies have identified multiple genomic regions associated with each trait, questions remain, including the best way to explore the pathophysiology of each association and the potential for clinical utility. This review will summarize recent genome-wide association study results within cardiac electrophysiology and discuss their broader implications in basic science and clinical medicine. Copyright 2010 Heart Rhythm Society. Published by Elsevier Inc. All rights reserved.

  19. Accurate computation of survival statistics in genome-wide studies.

    PubMed

    Vandin, Fabio; Papoutsaki, Alexandra; Raphael, Benjamin J; Upfal, Eli

    2015-05-01

    A key challenge in genomics is to identify genetic variants that distinguish patients with different survival time following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because: the two populations determined by a genetic variant may have very different sizes; and the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for any size populations. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens-hundreds of likely false positive associations as more significant than these known associations.
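
    The contrast between an asymptotic and an exact p-value can be illustrated on a toy data set. The sketch below is not ExaLT; it is a brute-force permutation version of the log-rank test, assumed here only for illustration, that enumerates every way of assigning patients to the smaller group and is therefore feasible only for very small samples. The function names and the six-patient data set are hypothetical.

```python
from itertools import combinations


def logrank_u(times, events, in_group1):
    """Log-rank score: observed minus expected events in group 1, summed over event times."""
    u = 0.0
    for t in sorted({ti for ti, ev in zip(times, events) if ev}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)                                   # patients still at risk at t
        n1 = sum(1 for i in at_risk if in_group1[i])       # of those, in group 1
        d = sum(1 for i in at_risk if times[i] == t and events[i])
        d1 = sum(1 for i in at_risk
                 if times[i] == t and events[i] and in_group1[i])
        u += d1 - n1 * d / n
    return u


def exact_logrank_p(times, events, group1_idx):
    """Two-sided p-value from full enumeration of group assignments (tiny samples only)."""
    n, k = len(times), len(group1_idx)

    def score(idx):
        members = set(idx)
        return abs(logrank_u(times, events, [i in members for i in range(n)]))

    observed = score(group1_idx)
    scores = [score(c) for c in combinations(range(n), k)]
    hits = sum(1 for s in scores if s >= observed - 1e-12)
    return hits / len(scores)


# Hypothetical data: six patients, all with an observed event, no ties.
times = [1, 2, 3, 4, 5, 6]
events = [True] * 6
# The two carriers of a (hypothetical) variant are the two earliest deaths.
p_value = exact_logrank_p(times, events, group1_idx=[0, 1])
# 2 of the 15 possible splits are at least as extreme, so p_value is 2/15.
```

    With only two patients in one group, the smallest p-value any split can reach is bounded by the number of possible splits, which is the kind of unbalanced setting where a chi-square approximation can be misleading.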

  20. Accurate Computation of Survival Statistics in Genome-Wide Studies

    PubMed Central

    Vandin, Fabio; Papoutsaki, Alexandra; Raphael, Benjamin J.; Upfal, Eli

    2015-01-01

    A key challenge in genomics is to identify genetic variants that distinguish patients with different survival time following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because: the two populations determined by a genetic variant may have very different sizes; and the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for any size populations. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens-hundreds of likely false positive associations as more significant than these known associations. PMID:25950620

  1. Genome-wide analytical approaches for reverse metabolic engineering of industrially relevant phenotypes in yeast.

    PubMed

    Oud, Bart; van Maris, Antonius J A; Daran, Jean-Marc; Pronk, Jack T

    2012-03-01

    Successful reverse engineering of mutants that have been obtained by nontargeted strain improvement has long presented a major challenge in yeast biotechnology. This paper reviews the use of genome-wide approaches for analysis of Saccharomyces cerevisiae strains originating from evolutionary engineering or random mutagenesis. On the basis of an evaluation of the strengths and weaknesses of different methods, we conclude that for the initial identification of relevant genetic changes, whole genome sequencing is superior to other analytical techniques, such as transcriptome, metabolome, proteome, or array-based genome analysis. Key advantages of this technique over gene expression analysis include the independency of genome sequences on experimental context and the possibility to directly and precisely reproduce the identified changes in naive strains. The predictive value of genome-wide analysis of strains with industrially relevant characteristics can be further improved by classical genetics or simultaneous analysis of strains derived from parallel, independent strain improvement lineages. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.

  2. Genome-wide analytical approaches for reverse metabolic engineering of industrially relevant phenotypes in yeast

    PubMed Central

    Oud, Bart; Maris, Antonius J A; Daran, Jean-Marc; Pronk, Jack T

    2012-01-01

    Successful reverse engineering of mutants that have been obtained by nontargeted strain improvement has long presented a major challenge in yeast biotechnology. This paper reviews the use of genome-wide approaches for analysis of Saccharomyces cerevisiae strains originating from evolutionary engineering or random mutagenesis. On the basis of an evaluation of the strengths and weaknesses of different methods, we conclude that for the initial identification of relevant genetic changes, whole genome sequencing is superior to other analytical techniques, such as transcriptome, metabolome, proteome, or array-based genome analysis. Key advantages of this technique over gene expression analysis include the independency of genome sequences on experimental context and the possibility to directly and precisely reproduce the identified changes in naive strains. The predictive value of genome-wide analysis of strains with industrially relevant characteristics can be further improved by classical genetics or simultaneous analysis of strains derived from parallel, independent strain improvement lineages. PMID:22152095

  3. SvABA: genome-wide detection of structural variants and indels by local assembly.

    PubMed

    Wala, Jeremiah A; Bandopadhayay, Pratiti; Greenwald, Noah F; O'Rourke, Ryan; Sharpe, Ted; Stewart, Chip; Schumacher, Steve; Li, Yilong; Weischenfeldt, Joachim; Yao, Xiaotong; Nusbaum, Chad; Campbell, Peter; Getz, Gad; Meyerson, Matthew; Zhang, Cheng-Zhong; Imielinski, Marcin; Beroukhim, Rameen

    2018-04-01

    Structural variants (SVs), including small insertion and deletion variants (indels), are challenging to detect through standard alignment-based variant calling methods. Sequence assembly offers a powerful approach to identifying SVs, but is difficult to apply at scale genome-wide for SV detection due to its computational complexity and the difficulty of extracting SVs from assembly contigs. We describe SvABA, an efficient and accurate method for detecting SVs from short-read sequencing data using genome-wide local assembly with low memory and computing requirements. We evaluated SvABA's performance on the NA12878 human genome and in simulated and real cancer genomes. SvABA demonstrates superior sensitivity and specificity across a large spectrum of SVs and substantially improves detection performance for variants in the 20-300 bp range, compared with existing methods. SvABA also identifies complex somatic rearrangements with chains of short (<1000 bp) templated-sequence insertions copied from distant genomic regions. We applied SvABA to 344 cancer genomes from 11 cancer types and found that short templated-sequence insertions occur in ∼4% of all somatic rearrangements. Finally, we demonstrate that SvABA can identify sites of viral integration and cancer driver alterations containing medium-sized (50-300 bp) SVs. © 2018 Wala et al.; Published by Cold Spring Harbor Laboratory Press.

  4. Validation of Type 2 Diabetes Risk Variants Identified by Genome-Wide Association Studies in Northern Han Chinese

    PubMed Central

    Rao, Ping; Zhou, Yong; Ge, Si-Qi; Wang, An-Xin; Yu, Xin-Wei; Alzain, Mohamed Ali; Veronica, Andrea Katherine; Qiu, Jing; Song, Man-Shu; Zhang, Jie; Wang, Hao; Fang, Hong-Hong; Gao, Qing; Wang, You-Xin; Wang, Wei

    2016-01-01

    Background: More than 60 genetic susceptibility loci associated with type 2 diabetes mellitus (T2DM) have been established in populations of Asian and European ancestry. Given ethnic differences and environmental factors, validation of the effects of genetic risk variants with reported associations identified by Genome-Wide Association Studies (GWASs) is essential. The study aims at evaluating the associations of T2DM with 29 single nucleotide polymorphisms (SNPs) from 19 candidate genes derived from GWASs in a northern Han Chinese population. Method: In this case-control study, 461 T2DM-diagnosed patients and 434 controls were recruited at the Jidong oil field hospital (Hebei, China) from January 2009 to October 2013. A cumulative genetic risk score (cGRS) was calculated by summation of the number of risk alleles, and a weighted GRS (wGRS) was calculated as the sum of risk alleles at each locus multiplied by their effect sizes for T2DM, using the independent variants selected. Result: The allelic frequency of the “A” allele at rs17106184 (Fas-associated factor 1, FAF1) was significantly higher in the T2DM patients than that of the healthy controls (11.7% vs. 6.4%, p < 0.001). Individuals in the highest quartile of wGRS had an over three-fold increased risk for developing T2DM compared with those in the lowest quartile (odds ratio = 3.06, 95% CI = 1.92–4.88, p < 0.001) adjusted for age, sex, BMI, total cholesterol (TC), triglycerides (TG), low-density lipoprotein cholesterol (LDL-C), systolic blood pressure (SBP) and diastolic blood pressure (DBP). The results were similar when analyzed with the cGRS. Conclusions: We confirmed the association between rs17106184 (FAF1) and T2DM in a northern Han Chinese population. The GRS calculated based on T2DM susceptibility variants may be a useful tool for predicting the T2DM susceptibility. PMID:27589775
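
    The two scores described in the Method section can be sketched as follows. This is a minimal illustration, assuming each genotype is coded as the number of risk alleles carried (0, 1, or 2) and each weight is a per-allele effect size such as a log odds ratio; the SNP labels, counts, and weights are hypothetical and are not taken from the study.

```python
from typing import Mapping


def cumulative_grs(risk_allele_counts: Mapping[str, int]) -> int:
    """cGRS: total number of risk alleles carried across all SNPs."""
    return sum(risk_allele_counts.values())


def weighted_grs(risk_allele_counts: Mapping[str, int],
                 effect_sizes: Mapping[str, float]) -> float:
    """wGRS: risk-allele count at each SNP multiplied by that SNP's effect size, summed."""
    return sum(count * effect_sizes[snp]
               for snp, count in risk_allele_counts.items())


# Hypothetical individual and hypothetical per-allele weights.
genotype = {"snp_a": 2, "snp_b": 0, "snp_c": 1}
weights = {"snp_a": 0.5, "snp_b": 0.25, "snp_c": 0.125}

cgrs = cumulative_grs(genotype)          # 2 + 0 + 1 = 3
wgrs = weighted_grs(genotype, weights)   # 2*0.5 + 0*0.25 + 1*0.125 = 1.125
```

    The unweighted score treats every risk allele as equally important, while the weighted score lets variants with larger reported effects contribute more.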

  5. Efficient genome-wide association in biobanks using topic modeling identifies multiple novel disease loci.

    PubMed

    McCoy, Thomas H; Castro, Victor M; Snapper, Leslie A; Hart, Kamber L; Perlis, Roy H

    2017-08-31

    Biobanks and national registries represent a powerful tool for genomic discovery, but rely on diagnostic codes that may be unreliable and fail to capture the relationship between related diagnoses. We developed an efficient means of conducting genome-wide association studies using combinations of diagnostic codes from electronic health records (EHR) for 10845 participants in a biobanking program at two large academic medical centers. Specifically, we applied latent Dirichlet allocation to fit 50 disease topics based on diagnostic codes, then conducted genome-wide common-variant association for each topic. In sensitivity analysis, these results were contrasted with those obtained from traditional single-diagnosis phenome-wide association analysis, as well as those in which only a subset of diagnostic codes are included per topic. In meta-analysis across three biobank cohorts, we identified 23 disease-associated loci with p<1e-15, including previously associated autoimmune disease loci. In all cases, observed significant associations were of greater magnitude than for single phenome-wide diagnostic codes, and incorporation of less strongly-loading diagnostic codes enhanced association. This strategy provides a more efficient means of phenome-wide association in biobanks with coded clinical data.

  6. Efficient Genome-wide Association in Biobanks Using Topic Modeling Identifies Multiple Novel Disease Loci

    PubMed Central

    McCoy, Thomas H; Castro, Victor M; Snapper, Leslie A; Hart, Kamber L; Perlis, Roy H

    2017-01-01

    Biobanks and national registries represent a powerful tool for genomic discovery, but rely on diagnostic codes that can be unreliable and fail to capture relationships between related diagnoses. We developed an efficient means of conducting genome-wide association studies using combinations of diagnostic codes from electronic health records for 10,845 participants in a biobanking program at two large academic medical centers. Specifically, we applied latent Dirichlet allocation to fit 50 disease topics based on diagnostic codes, then conducted a genome-wide common-variant association for each topic. In sensitivity analysis, these results were contrasted with those obtained from traditional single-diagnosis phenome-wide association analysis, as well as those in which only a subset of diagnostic codes were included per topic. In meta-analysis across three biobank cohorts, we identified 23 disease-associated loci with p < 1e-15, including previously associated autoimmune disease loci. In all cases, observed significant associations were of greater magnitude than single phenome-wide diagnostic codes, and incorporation of less strongly loading diagnostic codes enhanced association. This strategy provides a more efficient means of identifying phenome-wide associations in biobanks with coded clinical data. PMID:28861588

  7. Genome-wide Association Studies from the Cancer Genetic Markers of Susceptibility (CGEMS) Initiative | Office of Cancer Genomics

    Cancer.gov

    CGEMS identifies common inherited genetic variations associated with a number of cancers, including breast and prostate. Data from these genome-wide association studies (GWAS) are available through the Division of Cancer Epidemiology & Genetics website.

  8. Susceptibility to Childhood Pneumonia: A Genome-Wide Analysis.

    PubMed

    Hayden, Lystra P; Cho, Michael H; McDonald, Merry-Lynn N; Crapo, James D; Beaty, Terri H; Silverman, Edwin K; Hersh, Craig P

    2017-01-01

    Previous studies have indicated that in adult smokers, a history of childhood pneumonia is associated with reduced lung function and chronic obstructive pulmonary disease. There have been few previous investigations using genome-wide association studies to investigate genetic predisposition to pneumonia. This study aims to identify the genetic variants associated with the development of pneumonia during childhood and over the course of the lifetime. Study subjects included current and former smokers with and without chronic obstructive pulmonary disease participating in the COPDGene Study. Pneumonia was defined by subject self-report, with childhood pneumonia categorized as having the first episode at <16 years. Genome-wide association studies for childhood pneumonia (843 cases, 9,091 control subjects) and lifetime pneumonia (3,766 cases, 5,659 control subjects) were performed separately in non-Hispanic whites and African Americans. Non-Hispanic white and African American populations were combined in the meta-analysis. Top genetic variants from childhood pneumonia were assessed in network analysis. No single-nucleotide polymorphisms reached genome-wide significance, although we identified potential regions of interest. In the childhood pneumonia analysis, this included variants in NGR1 (P = 6.3 × 10⁻⁸), PAK6 (P = 3.3 × 10⁻⁷), and near MATN1 (P = 2.8 × 10⁻⁷). In the lifetime pneumonia analysis, this included variants in LOC339862 (P = 8.7 × 10⁻⁷), RAPGEF2 (P = 8.4 × 10⁻⁷), PHACTR1 (P = 6.1 × 10⁻⁷), near PRR27 (P = 4.3 × 10⁻⁷), and near MCPH1 (P = 2.7 × 10⁻⁷). Network analysis of the genes associated with childhood pneumonia included top networks related to development, blood vessel morphogenesis, muscle contraction, WNT signaling, DNA damage, apoptosis, inflammation, and immune response (P ≤ 0.05). We have identified genes potentially associated with the risk of pneumonia

  9. Genome-wide association study in Asia-adapted tropical maize reveals novel and explored genomic regions for sorghum downy mildew resistance.

    PubMed

    Rashid, Zerka; Singh, Pradeep Kumar; Vemuri, Hindu; Zaidi, Pervez Haider; Prasanna, Boddupalli Maruthi; Nair, Sudha Krishnan

    2018-01-10

    Globally, downy mildews are among the important foliar diseases of maize that cause significant yield losses. We conducted a genome-wide association study for sorghum downy mildew (SDM; Peronosclerospora sorghi) resistance in a panel of 368 inbred lines adapted to the Asian tropics. High density SNPs from Genotyping-by-sequencing were used in GWAS after controlling for population structure and kinship in the panel using a single locus mixed model. The study identified a set of 26 SNPs that were significantly associated with SDM resistance, with Bonferroni corrected P values ≤ 0.05. Among all the identified SNPs, the minor alleles were found to be favorable to SDM resistance in the mapping panel. Trend regression analysis with 16 independent genetic variants including 12 SNPs and four haplotype blocks identified SNP S2_6154311 on chromosome 2 with P value 2.61E-24 and contributing 26.7% of the phenotypic variation. Six of the SNPs/haplotypes were within the same chromosomal bins as the QTLs for SDM resistance mapped in previous studies. Apart from this, eight novel genomic regions for SDM resistance were identified in this study; they need further validation before being applied in the breeding pipeline. Ten SNPs identified in this study were co-located in reported mildew resistance genes.
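
    The abstract above reports SNPs with "Bonferroni corrected P values ≤ 0.05". As a minimal sketch of what that correction does, the snippet below multiplies each raw p-value by the number of tests and caps the result at 1. The input values and the function names are hypothetical illustrations, not data or code from the study, and the study's actual pipeline may differ.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: min(1, p * m) for m tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def significant_after_bonferroni(p_values, alpha=0.05):
    """Return indices of tests whose adjusted p-value is <= alpha."""
    adjusted = bonferroni_adjust(p_values)
    return [i for i, p in enumerate(adjusted) if p <= alpha]


# Hypothetical raw p-values for five markers (NOT taken from the study).
raw = [1.0e-9, 1.0e-3, 0.02, 0.5, 9.0e-3]

adjusted = bonferroni_adjust(raw)
hits = significant_after_bonferroni(raw)

for i, (p, q) in enumerate(zip(raw, adjusted)):
    flag = "significant" if i in hits else "not significant"
    print(f"marker {i}: raw={p:.3g} adjusted={q:.3g} -> {flag}")
```

    With five tests, a raw p-value of 0.02 becomes 0.1 after adjustment and no longer passes the 0.05 cutoff, which is why a panel with many markers needs very small raw p-values.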

  10. Genome-wide association studies in Alzheimer disease.

    PubMed

    Waring, Stephen C; Rosenberg, Roger N

    2008-03-01

    The genetics of Alzheimer disease (AD) to date support an age-dependent dichotomous model whereby earlier age of disease onset (< 60 years) is explained by 3 fully penetrant genes (APP [NCBI Entrez gene 351], PSEN1 [NCBI Entrez gene 5663], and PSEN2 [NCBI Entrez gene 5664]), whereas later age of disease onset (≥ 65 years) representing most cases of AD has yet to be explained by a purely genetic model. The APOE gene (NCBI Entrez gene 348) is the strongest genetic risk factor for later onset, although it is neither sufficient nor necessary to explain all occurrences of disease. Numerous putative genetic risk alleles and genetic variants have been reported. Although all have relevance to biological mechanisms that may be associated with AD pathogenesis, they await replication in large representative populations. Genome-wide association studies have emerged as an increasingly effective tool for identifying genetic contributions to complex diseases and represent the next frontier for furthering our understanding of the underlying etiologic, biological, and pathologic mechanisms associated with chronic complex disorders. There have already been success stories for diseases such as macular degeneration and diabetes mellitus. Whether this will hold true for a genetically complex and heterogeneous disease such as AD is not known, although early reports are encouraging. This review considers recent publications from studies that have successfully applied genome-wide association methods to investigations of AD by taking advantage of the currently available high-throughput arrays, bioinformatics, and software advances. The inherent strengths, limitations, and challenges associated with study design issues in the context of AD are presented herein.

  11. Quality control and conduct of genome-wide association meta-analyses.

    PubMed

    Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Mägi, Reedik; Ferreira, Teresa; Fall, Tove; Graff, Mariaelisa; Justice, Anne E; Luan, Jian'an; Gustafsson, Stefan; Randall, Joshua C; Vedantam, Sailaja; Workalemahu, Tsegaselassie; Kilpeläinen, Tuomas O; Scherag, André; Esko, Tonu; Kutalik, Zoltán; Heid, Iris M; Loos, Ruth J F

    2014-05-01

    Rigorous organization and quality control (QC) are necessary to facilitate successful genome-wide association meta-analyses (GWAMAs) of statistics aggregated across multiple genome-wide association studies. This protocol provides guidelines for (i) organizational aspects of GWAMAs, and for (ii) QC at the study file level, the meta-level across studies and the meta-analysis output level. Real-world examples highlight issues experienced and solutions developed by the GIANT Consortium that has conducted meta-analyses including data from 125 studies comprising more than 330,000 individuals. We provide a general protocol for conducting GWAMAs and carrying out QC to minimize errors and to guarantee maximum use of the data. We also include details for the use of a powerful and flexible software package called EasyQC. Precise timings will be greatly influenced by consortium size. For consortia of comparable size to the GIANT Consortium, this protocol takes a minimum of about 10 months to complete.

  12. A Genome Wide Survey of SNP Variation Reveals the Genetic Structure of Sheep Breeds

    USDA-ARS's Scientific Manuscript database

    The genetic structure of sheep reflects their domestication and subsequent formation into discrete breeds. Understanding genetic structure is essential for achieving genetic improvement through genome-wide association studies, genomic selection and the dissection of quantitative traits. After identi...

  13. Integration of genome-wide association studies with biological knowledge identifies six novel genes related to kidney function.

    PubMed

    Chasman, Daniel I; Fuchsberger, Christian; Pattaro, Cristian; Teumer, Alexander; Böger, Carsten A; Endlich, Karlhans; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Taliun, Daniel; Li, Man; Gao, Xiaoyi; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C; O'Seaghdha, Conall M; Glazer, Nicole; Isaacs, Aaron; Liu, Ching-Ti; Smith, Albert V; O'Connell, Jeffrey R; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Johnson, Andrew D; Gierman, Hinco J; Feitosa, Mary F; Hwang, Shih-Jen; Atkinson, Elizabeth J; Lohman, Kurt; Cornelis, Marilyn C; Johansson, Asa; Tönjes, Anke; Dehghan, Abbas; Lambert, Jean-Charles; Holliday, Elizabeth G; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y; Murgia, Federico; Trompet, Stella; Imboden, Medea; Coassin, Stefan; Pistis, Giorgio; Harris, Tamara B; Launer, Lenore J; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D; Boerwinkle, Eric; Schmidt, Helena; Cavalieri, Margherita; Rao, Madhumathi; Hu, Frank; Demirkan, Ayse; Oostra, Ben A; de Andrade, Mariza; Turner, Stephen T; Ding, Jingzhong; Andrews, Jeanette S; Freedman, Barry I; Giulianini, Franco; Koenig, Wolfgang; Illig, Thomas; Meisinger, Christa; Gieger, Christian; Zgaga, Lina; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H; Wright, Alan F; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G; Rivadeneira, Fernando; Aulchenko, Yurii S; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Stengel, Bénédicte; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Ketkar, Shamika; Colhoun, Helen; Doney, Alex; Robino, Antonietta; 
Krämer, Bernhard K; Portas, Laura; Ford, Ian; Buckley, Brendan M; Adam, Martin; Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Mitchell, Paul; Ciullo, Marina; Kim, Stuart K; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J Wouter; Probst-Hensch, Nicole M; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; Siscovick, David S; van Duijn, Cornelia M; Borecki, Ingrid B; Kardia, Sharon L R; Liu, Yongmei; Curhan, Gary C; Rudan, Igor; Gyllensten, Ulf; Wilson, James F; Franke, Andre; Pramstaller, Peter P; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline; Hayward, Caroline; Ridker, Paul M; Parsa, Afshin; Bochud, Murielle; Heid, Iris M; Kao, W H Linda; Fox, Caroline S; Köttgen, Anna

    2012-12-15

    In conducting genome-wide association studies (GWAS), analytical approaches leveraging biological information may further understanding of the pathophysiology of clinical traits. To discover novel associations with estimated glomerular filtration rate (eGFR), a measure of kidney function, we developed a strategy for integrating prior biological knowledge into the existing GWAS data for eGFR from the CKDGen Consortium. Our strategy focuses on single nucleotide polymorphisms (SNPs) in genes that are connected by functional evidence, determined by literature mining and gene ontology (GO) hierarchies, to genes near previously validated eGFR associations. It then requires association thresholds consistent with multiple testing, and finally evaluates novel candidates by independent replication. Among the samples of European ancestry, we identified a genome-wide significant SNP in FBXL20 (P = 5.6 × 10⁻⁹) in meta-analysis of all available data, and additional SNPs at the INHBC, LRP2, PLEKHA1, SLC3A2 and SLC7A6 genes meeting multiple-testing corrected significance for replication and overall P-values of 4.5 × 10⁻⁴–2.2 × 10⁻⁷. Neither the novel PLEKHA1 nor FBXL20 associations, both further supported by association with eGFR among African Americans and with transcript abundance, would have been implicated by eGFR candidate gene approaches. LRP2, encoding the megalin receptor, was identified through connection with the previously known eGFR gene DAB2 and extends understanding of the megalin system in kidney function. These findings highlight integration of existing genome-wide association data with independent biological knowledge to uncover novel candidate eGFR associations, including candidates lacking known connections to kidney-specific pathways. The strategy may also be applicable to other clinical phenotypes, although more testing will be needed to assess its potential for discovery in general.

  14. Integration of genome-wide association studies with biological knowledge identifies six novel genes related to kidney function

    PubMed Central

    Chasman, Daniel I.; Fuchsberger, Christian; Pattaro, Cristian; Teumer, Alexander; Böger, Carsten A.; Endlich, Karlhans; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Taliun, Daniel; Li, Man; Gao, Xiaoyi; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C.; O'Seaghdha, Conall M.; Glazer, Nicole; Isaacs, Aaron; Liu, Ching-Ti; Smith, Albert V.; O'Connell, Jeffrey R.; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Johnson, Andrew D.; Gierman, Hinco J.; Feitosa, Mary F.; Hwang, Shih-Jen; Atkinson, Elizabeth J.; Lohman, Kurt; Cornelis, Marilyn C.; Johansson, Åsa; Tönjes, Anke; Dehghan, Abbas; Lambert, Jean-Charles; Holliday, Elizabeth G.; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y.; Murgia, Federico; Trompet, Stella; Imboden, Medea; Coassin, Stefan; Pistis, Giorgio; Harris, Tamara B.; Launer, Lenore J.; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D.; Boerwinkle, Eric; Schmidt, Helena; Cavalieri, Margherita; Rao, Madhumathi; Hu, Frank; Demirkan, Ayse; Oostra, Ben A.; de Andrade, Mariza; Turner, Stephen T.; Ding, Jingzhong; Andrews, Jeanette S.; Freedman, Barry I.; Giulianini, Franco; Koenig, Wolfgang; Illig, Thomas; Meisinger, Christa; Gieger, Christian; Zgaga, Lina; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E.; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H.; Wright, Alan F.; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K.; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G.; Rivadeneira, Fernando; Aulchenko, Yurii S.; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Stengel, Bénédicte; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Ketkar, Shamika; Colhoun, Helen; Doney, 
Alex; Robino, Antonietta; Krämer, Bernhard K.; Portas, Laura; Ford, Ian; Buckley, Brendan M.; Adam, Martin; Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Mitchell, Paul; Ciullo, Marina; Kim, Stuart K.; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J. Wouter; Probst-Hensch, Nicole M.; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R.; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; Siscovick, David S.; van Duijn, Cornelia M.; Borecki, Ingrid B.; Kardia, Sharon L.R.; Liu, Yongmei; Curhan, Gary C.; Rudan, Igor; Gyllensten, Ulf; Wilson, James F.; Franke, Andre; Pramstaller, Peter P.; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline; Hayward, Caroline; Ridker, Paul M; Parsa, Afshin; Bochud, Murielle; Heid, Iris M.; Kao, W.H. Linda; Fox, Caroline S.; Köttgen, Anna

    2012-01-01

    In conducting genome-wide association studies (GWAS), analytical approaches leveraging biological information may further understanding of the pathophysiology of clinical traits. To discover novel associations with estimated glomerular filtration rate (eGFR), a measure of kidney function, we developed a strategy for integrating prior biological knowledge into the existing GWAS data for eGFR from the CKDGen Consortium. Our strategy focuses on single nucleotide polymorphisms (SNPs) in genes that are connected by functional evidence, determined by literature mining and gene ontology (GO) hierarchies, to genes near previously validated eGFR associations. It then requires association thresholds consistent with multiple testing, and finally evaluates novel candidates by independent replication. Among the samples of European ancestry, we identified a genome-wide significant SNP in FBXL20 (P = 5.6 × 10⁻⁹) in meta-analysis of all available data, and additional SNPs at the INHBC, LRP2, PLEKHA1, SLC3A2 and SLC7A6 genes meeting multiple-testing corrected significance for replication and overall P-values of 4.5 × 10⁻⁴–2.2 × 10⁻⁷. Neither the novel PLEKHA1 nor FBXL20 associations, both further supported by association with eGFR among African Americans and with transcript abundance, would have been implicated by eGFR candidate gene approaches. LRP2, encoding the megalin receptor, was identified through connection with the previously known eGFR gene DAB2 and extends understanding of the megalin system in kidney function. These findings highlight integration of existing genome-wide association data with independent biological knowledge to uncover novel candidate eGFR associations, including candidates lacking known connections to kidney-specific pathways. The strategy may also be applicable to other clinical phenotypes, although more testing will be needed to assess its potential for discovery in general. PMID:22962313

  15. Genome-wide Escherichia coli stress response and improved tolerance towards industrially relevant chemicals.

    PubMed

    Rau, Martin Holm; Calero, Patricia; Lennen, Rebecca M; Long, Katherine S; Nielsen, Alex T

    2016-10-13

    Economically viable biobased production of bulk chemicals and biofuels typically requires high product titers. During microbial bioconversion this often leads to product toxicity, and tolerance is therefore a critical element in the engineering of production strains. Here, a systems biology approach was employed to understand the chemical stress response of Escherichia coli, including a genome-wide screen for mutants with increased fitness during chemical stress. Twelve chemicals with significant production potential were selected, consisting of organic solvent-like chemicals (butanol, hydroxy-γ-butyrolactone, 1,4-butanediol, furfural), organic acids (acetate, itaconic acid, levulinic acid, succinic acid), amino acids (serine, threonine) and membrane-intercalating chemicals (decanoic acid, geraniol). The transcriptional response towards these chemicals revealed large overlaps of transcription changes within and between chemical groups, with functions such as energy metabolism, stress response, membrane modification, transporters and iron metabolism being affected. Regulon enrichment analysis identified key regulators likely mediating the transcriptional response, including CRP, RpoS, OmpR, ArcA, Fur and GadX. These regulators, the genes within their regulons and the above mentioned cellular functions therefore constitute potential targets for increasing E. coli chemical tolerance. Fitness determination of genome-wide transposon mutants (Tn-seq) subjected to the same chemical stress identified 294 enriched and 336 depleted mutants and experimental validation revealed up to 60 % increase in mutant growth rates. Mutants enriched in several conditions contained, among others, insertions in genes of the Mar-Sox-Rob regulon as well as transcription and translation related gene functions. The combination of the transcriptional response and mutant screening provides general targets that can increase tolerance towards not only single, but multiple chemicals.

  16. Detection of genome-wide copy number variants in myeloid malignancies using next-generation sequencing.

    PubMed

    Shen, Wei; Paxton, Christian N; Szankasi, Philippe; Longhurst, Maria; Schumacher, Jonathan A; Frizzell, Kimberly A; Sorrells, Shelly M; Clayton, Adam L; Jattani, Rakhi P; Patel, Jay L; Toydemir, Reha; Kelley, Todd W; Xu, Xinjie

    2018-04-01

    Genetic abnormalities, including copy number variants (CNV), copy number neutral loss of heterozygosity (CN-LOH) and gene mutations, underlie the pathogenesis of myeloid malignancies and serve as important diagnostic, prognostic and/or therapeutic markers. Currently, multiple testing strategies are required for comprehensive genetic testing in myeloid malignancies. The aim of this proof-of-principle study was to investigate the feasibility of combining detection of genome-wide large CNVs, CN-LOH and targeted gene mutations into a single assay using next-generation sequencing (NGS). For genome-wide CNV detection, we designed a single nucleotide polymorphism (SNP) sequencing backbone with 22 762 SNP regions evenly distributed across the entire genome. For targeted mutation detection, 62 frequently mutated genes in myeloid malignancies were targeted. We combined this SNP sequencing backbone with a targeted mutation panel, and sequenced 9 healthy individuals and 16 patients with myeloid malignancies using NGS. We detected 52 somatic CNVs, 11 instances of CN-LOH and 39 oncogenic mutations in the 16 patients with myeloid malignancies, and none in the 9 healthy individuals. All CNVs and CN-LOH were confirmed by SNP microarray analysis. We describe a genome-wide SNP sequencing backbone which allows for sensitive detection of genome-wide CNVs and CN-LOH using NGS. This proof-of-principle study has demonstrated that this strategy can provide more comprehensive genetic profiling for patients with myeloid malignancies using a single assay. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  17. GeneCount: genome-wide calculation of absolute tumor DNA copy numbers from array comparative genomic hybridization data

    PubMed Central

    Lyng, Heidi; Lando, Malin; Brøvig, Runar S; Svendsrud, Debbie H; Johansen, Morten; Galteland, Eivind; Brustugun, Odd T; Meza-Zepeda, Leonardo A; Myklebost, Ola; Kristensen, Gunnar B; Hovig, Eivind; Stokke, Trond

    2008-01-01

    Absolute tumor DNA copy numbers can currently be achieved only on a single gene basis by using fluorescence in situ hybridization (FISH). We present GeneCount, a method for genome-wide calculation of absolute copy numbers from clinical array comparative genomic hybridization data. The tumor cell fraction is reliably estimated in the model. Data consistent with FISH results are achieved. We demonstrate significant improvements over existing methods for exploring gene dosages and intratumor copy number heterogeneity in cancers. PMID:18500990

  18. Linkage Disequilibrium And Genome-Wide Association Studies In O. sativa

    USDA-ARS's Scientific Manuscript database

    There is increasing evidence that genome-wide association studies provide a powerful approach to find the genetic basis of complex phenotypic variation in all kinds of species. For this purpose, we developed the first generation 44K Affymetrix SNP array in rice (see Tung et al. poster). We genotyped...

  19. Genome-Wide Association Studies with a Genomic Relationship Matrix: A Case Study with Wheat and Arabidopsis

    PubMed Central

    Gianola, Daniel; Fariello, Maria I.; Naya, Hugo; Schön, Chris-Carolin

    2016-01-01

    Standard genome-wide association studies (GWAS) scan for relationships between each of p molecular markers and a continuously distributed target trait. Typically, a marker-based matrix of genomic similarities among individuals (G) is constructed, to account more properly for the covariance structure in the linear regression model used. We show that the generalized least-squares estimator of the regression of phenotype on one or on m markers is invariant with respect to whether or not the marker(s) tested is(are) used for building G, provided variance components are unaffected by exclusion of such marker(s) from G. The result is arrived at by using a matrix expression such that one can find many inverses of genomic relationship, or of phenotypic covariance matrices, stemming from removing markers tested as fixed, but carrying out a single inversion. When eigenvectors of the genomic relationship matrix are used as regressors with fixed regression coefficients, e.g., to account for population stratification, their removal from G does matter. Removal of eigenvectors from G can have a noticeable effect on estimates of genomic and residual variances, so caution is needed. Concepts were illustrated using genomic data on 599 wheat inbred lines, with grain yield as target trait, and on close to 200 Arabidopsis thaliana accessions. PMID:27520956

  20. Meta-Analyses of Genome-Wide Association Data Hold New Promise for Addiction Genetics.

    PubMed

    Agrawal, Arpana; Edenberg, Howard J; Gelernter, Joel

    2016-09-01

    Meta-analyses of genome-wide association study data have begun to lead to promising new discoveries for behavioral and psychiatrically relevant phenotypes (e.g., schizophrenia, educational attainment). We outline how this methodology can similarly lead to novel discoveries in genomic studies of substance use disorders, and discuss challenges that will need to be overcome to accomplish this goal. We illustrate our approach with the work of the newly established Substance Use Disorders workgroup of the Psychiatric Genomics Consortium.

  1. Genome-wide association links candidate genes to resistance to Plum Pox Virus in apricot (Prunus armeniaca).

    PubMed

    Mariette, Stéphanie; Wong Jun Tai, Fabienne; Roch, Guillaume; Barre, Aurélien; Chague, Aurélie; Decroocq, Stéphane; Groppi, Alexis; Laizet, Yec'han; Lambert, Patrick; Tricon, David; Nikolski, Macha; Audergon, Jean-Marc; Abbott, Albert G; Decroocq, Véronique

    2016-01-01

    In fruit tree species, many important traits have been characterized genetically by using single-family descent mapping in progenies segregating for the traits. However, most mapped loci have not been sufficiently resolved to the individual genes due to insufficient progeny sizes for high resolution mapping and the previous lack of whole-genome sequence resources of the study species. To address this problem for Plum Pox Virus (PPV) candidate resistance gene identification in Prunus species, we implemented a genome-wide association (GWA) approach in apricot. This study exploited the broad genetic diversity of the apricot (Prunus armeniaca) germplasm containing resistance to PPV, next-generation sequence-based genotyping, and the high-quality peach (Prunus persica) genome reference sequence for single nucleotide polymorphism (SNP) identification. The results of this GWA study validated previously reported PPV resistance quantitative trait loci (QTL) intervals, highlighted other potential resistance loci, and resolved each to a limited set of candidate genes for further study. This work substantiates the association genetics approach for resolution of QTL to candidate genes in apricot and suggests that this approach could simplify identification of other candidate genes for other marked trait intervals in this germplasm. © 2015 INRA, UMR 1332 BFP New Phytologist © 2015 New Phytologist Trust.

  2. Inference of gene regulatory networks from genome-wide knockout fitness data

    PubMed Central

    Wang, Liming; Wang, Xiaodong; Arkin, Adam P.; Samoilov, Michael S.

    2013-01-01

    Motivation: Genome-wide fitness is an emerging type of high-throughput biological data generated for individual organisms by creating libraries of knockouts, subjecting them to broad ranges of environmental conditions, and measuring the resulting clone-specific fitnesses. Since fitness is an organism-scale measure of gene regulatory network behaviour, it may offer certain advantages when insights into such phenotypical and functional features are of primary interest over individual gene expression. Previous works have shown that genome-wide fitness data can be used to uncover novel gene regulatory interactions, when compared with results of more conventional gene expression analysis. Yet, to date, few algorithms have been proposed for systematically using genome-wide mutant fitness data for gene regulatory network inference. Results: In this article, we describe a model and propose an inference algorithm for using fitness data from knockout libraries to identify underlying gene regulatory networks. Unlike most prior methods, the presented approach captures not only structural, but also dynamical and non-linear nature of biomolecular systems involved. A state–space model with non-linear basis is used for dynamically describing gene regulatory networks. Network structure is then elucidated by estimating unknown model parameters. Unscented Kalman filter is used to cope with the non-linearities introduced in the model, which also enables the algorithm to run in on-line mode for practical use. Here, we demonstrate that the algorithm provides satisfying results for both synthetic data as well as empirical measurements of GAL network in yeast Saccharomyces cerevisiae and TyrR–LiuR network in bacteria Shewanella oneidensis. Availability: MATLAB code and datasets are available to download at http://www.duke.edu/~lw174/Fitness.zip and http://genomics.lbl.gov/supplemental/fitness-bioinf/ Contact: wangx@ee.columbia.edu or mssamoilov@lbl.gov Supplementary information

  3. Genome-wide meta-analyses of stratified depression in Generation Scotland and UK Biobank.

    PubMed

    Hall, Lynsey S; Adams, Mark J; Arnau-Soler, Aleix; Clarke, Toni-Kim; Howard, David M; Zeng, Yanni; Davies, Gail; Hagenaars, Saskia P; Maria Fernandez-Pujals, Ana; Gibson, Jude; Wigmore, Eleanor M; Boutin, Thibaud S; Hayward, Caroline; Scotland, Generation; Porteous, David J; Deary, Ian J; Thomson, Pippa A; Haley, Chris S; McIntosh, Andrew M

    2018-01-10

    Few replicable genetic associations for Major Depressive Disorder (MDD) have been identified. Recent studies of MDD have identified common risk variants by using a broader phenotype definition in very large samples, or by reducing phenotypic and ancestral heterogeneity. We sought to ascertain whether it is more informative to maximize the sample size using data from all available cases and controls, or to use a sex or recurrent stratified subset of affected individuals. To test this, we compared heritability estimates, genetic correlation with other traits, variance explained by MDD polygenic score, and variants identified by genome-wide meta-analysis for broad and narrow MDD classifications in two large British cohorts - Generation Scotland and UK Biobank. Genome-wide meta-analysis of MDD in males yielded one genome-wide significant locus on 3p22.3, with three genes in this region (CRTAP, GLB1, and TMPPE) demonstrating a significant association in gene-based tests. Meta-analyzed MDD, recurrent MDD and female MDD yielded equivalent heritability estimates, showed no detectable difference in association with polygenic scores, and were each genetically correlated with six health-correlated traits (neuroticism, depressive symptoms, subjective well-being, MDD, a cross-disorder phenotype and Bipolar Disorder). Whilst stratified GWAS analysis revealed a genome-wide significant locus for male MDD, the lack of independent replication, and the consistent pattern of results in other MDD classifications suggests that phenotypic stratification using recurrence or sex in currently available sample sizes is currently weakly justified. Based upon existing studies and our findings, the strategy of maximizing sample sizes is likely to provide the greater gain.

  4. Impacts of Genome-Wide Analyses on Our Understanding of Human Herpesvirus Diversity and Evolution.

    PubMed

    Renner, Daniel W; Szpara, Moriah L

    2018-01-01

    Until fairly recently, genome-wide evolutionary dynamics and within-host diversity were more commonly examined in the context of small viruses than in the context of large double-stranded DNA viruses such as herpesviruses. The high mutation rates and more compact genomes of RNA viruses have inspired the investigation of population dynamics for these species, and recent data now suggest that herpesviruses might also be considered candidates for population modeling. High-throughput sequencing (HTS) and bioinformatics have expanded our understanding of herpesviruses through genome-wide comparisons of sequence diversity, recombination, allele frequency, and selective pressures. Here we discuss recent data on the mechanisms that generate herpesvirus genomic diversity and underlie the evolution of these virus families. We focus on human herpesviruses, with key insights drawn from veterinary herpesviruses and other large DNA virus families. We consider the impacts of cell culture on herpesvirus genomes and how to accurately describe the viral populations under study. The need for a strong foundation of high-quality genomes is also discussed, since it underlies all secondary genomic analyses such as RNA sequencing (RNA-Seq), chromatin immunoprecipitation, and ribosome profiling. Areas where we foresee future progress, such as the linking of viral genetic differences to phenotypic or clinical outcomes, are highlighted as well. Copyright © 2017 Renner and Szpara.

  5. Genome-wide signals of positive selection in human evolution

    PubMed Central

    Enard, David; Messer, Philipp W.; Petrov, Dmitri A.

    2014-01-01

    The role of positive selection in human evolution remains controversial. On the one hand, scans for positive selection have identified hundreds of candidate loci, and the genome-wide patterns of polymorphism show signatures consistent with frequent positive selection. On the other hand, recent studies have argued that many of the candidate loci are false positives and that most genome-wide signatures of adaptation are in fact due to reduction of neutral diversity by linked deleterious mutations, known as background selection. Here we analyze human polymorphism data from the 1000 Genomes Project and detect signatures of positive selection once we correct for the effects of background selection. We show that levels of neutral polymorphism are lower near amino acid substitutions, with the strongest reduction observed specifically near functionally consequential amino acid substitutions. Furthermore, amino acid substitutions are associated with signatures of recent adaptation that should not be generated by background selection, such as unusually long and frequent haplotypes and specific distortions in the site frequency spectrum. We use forward simulations to argue that the observed signatures require a high rate of strongly adaptive substitutions near amino acid changes. We further demonstrate that the observed signatures of positive selection correlate better with the presence of regulatory sequences, as predicted by the ENCODE Project Consortium, than with the positions of amino acid substitutions. Our results suggest that adaptation was frequent in human evolution and provide support for the hypothesis of King and Wilson that adaptive divergence is primarily driven by regulatory changes. PMID:24619126

  6. Genome-wide association study identifies multiple loci associated with bladder cancer risk

    PubMed Central

    Figueroa, Jonine D.; Ye, Yuanqing; Siddiq, Afshan; Garcia-Closas, Montserrat; Chatterjee, Nilanjan; Prokunina-Olsson, Ludmila; Cortessis, Victoria K.; Kooperberg, Charles; Cussenot, Olivier; Benhamou, Simone; Prescott, Jennifer; Porru, Stefano; Dinney, Colin P.; Malats, Núria; Baris, Dalsu; Purdue, Mark; Jacobs, Eric J.; Albanes, Demetrius; Wang, Zhaoming; Deng, Xiang; Chung, Charles C.; Tang, Wei; Bas Bueno-de-Mesquita, H.; Trichopoulos, Dimitrios; Ljungberg, Börje; Clavel-Chapelon, Françoise; Weiderpass, Elisabete; Krogh, Vittorio; Dorronsoro, Miren; Travis, Ruth; Tjønneland, Anne; Brenan, Paul; Chang-Claude, Jenny; Riboli, Elio; Conti, David; Gago-Dominguez, Manuela; Stern, Mariana C.; Pike, Malcolm C.; Van Den Berg, David; Yuan, Jian-Min; Hohensee, Chancellor; Rodabough, Rebecca; Cancel-Tassin, Geraldine; Roupret, Morgan; Comperat, Eva; Chen, Constance; De Vivo, Immaculata; Giovannucci, Edward; Hunter, David J.; Kraft, Peter; Lindstrom, Sara; Carta, Angela; Pavanello, Sofia; Arici, Cecilia; Mastrangelo, Giuseppe; Kamat, Ashish M.; Lerner, Seth P.; Barton Grossman, H.; Lin, Jie; Gu, Jian; Pu, Xia; Hutchinson, Amy; Burdette, Laurie; Wheeler, William; Kogevinas, Manolis; Tardón, Adonina; Serra, Consol; Carrato, Alfredo; García-Closas, Reina; Lloreta, Josep; Schwenn, Molly; Karagas, Margaret R.; Johnson, Alison; Schned, Alan; Armenti, Karla R.; Hosain, G.M.; Andriole, Gerald; Grubb, Robert; Black, Amanda; Ryan Diver, W.; Gapstur, Susan M.; Weinstein, Stephanie J.; Virtamo, Jarmo; Haiman, Chris A.; Landi, Maria T.; Caporaso, Neil; Fraumeni, Joseph F.; Vineis, Paolo; Wu, Xifeng; Silverman, Debra T.; Chanock, Stephen; Rothman, Nathaniel

    2014-01-01

    Candidate gene and genome-wide association studies (GWAS) have identified 11 independent susceptibility loci associated with bladder cancer risk. To discover additional risk variants, we conducted a new GWAS of 2422 bladder cancer cases and 5751 controls, followed by a meta-analysis with two independently published bladder cancer GWAS, resulting in a combined analysis of 6911 cases and 11 814 controls of European descent. TaqMan genotyping of 13 promising single nucleotide polymorphisms with P < 1 × 10^−5 was pursued in a follow-up set of 801 cases and 1307 controls. Two new loci achieved genome-wide statistical significance: rs10936599 on 3q26.2 (P = 4.53 × 10^−9) and rs907611 on 11p15.5 (P = 4.11 × 10^−8). Two notable loci were also identified that approached genome-wide statistical significance: rs6104690 on 20p12.2 (P = 7.13 × 10^−7) and rs4510656 on 6p22.3 (P = 6.98 × 10^−7); these require further studies for confirmation. In conclusion, our study has identified new susceptibility alleles for bladder cancer risk that require fine-mapping and laboratory investigation, which could further understanding of the biological underpinnings of bladder carcinogenesis. PMID:24163127
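
    As a rough illustration of how the four P values reported above relate to a significance cutoff, the sketch below assumes the conventional genome-wide threshold of 5 × 10^−8, which this abstract does not state. The function name classify is hypothetical and chosen only for this example.

```python
# Conventional genome-wide significance threshold.
# Assumption: the abstract above does not state the cutoff it used.
GENOME_WIDE_ALPHA = 5e-8

# P values exactly as reported in the abstract.
reported_p = {
    "rs10936599": 4.53e-9,
    "rs907611": 4.11e-8,
    "rs6104690": 7.13e-7,
    "rs4510656": 6.98e-7,
}


def classify(p_values, alpha=GENOME_WIDE_ALPHA):
    """Split SNP ids into those below alpha and those at or above it."""
    below = sorted(snp for snp, p in p_values.items() if p < alpha)
    at_or_above = sorted(snp for snp, p in p_values.items() if p >= alpha)
    return below, at_or_above


significant, approaching = classify(reported_p)
print("below threshold:", significant)
print("at or above threshold:", approaching)
```

    Under this assumed cutoff, the two loci described as genome-wide significant fall below the threshold and the two described as approaching significance do not, which matches the wording of the abstract.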

  7. Genome-wide gene–environment interaction analysis for asbestos exposure in lung cancer susceptibility

    PubMed Central

    Wei, Qingyi Wei

    2012-01-01

    Asbestos exposure is a known risk factor for lung cancer. Although recent genome-wide association studies (GWASs) have identified some novel loci for lung cancer risk, few addressed genome-wide gene–environment interactions. To determine gene–asbestos interactions in lung cancer risk, we conducted genome-wide gene–environment interaction analyses at levels of single nucleotide polymorphisms (SNPs), genes and pathways, using our published Texas lung cancer GWAS dataset. This dataset included 317 498 SNPs from 1154 lung cancer cases and 1137 cancer-free controls. The initial SNP-level P-values for interactions between genetic variants and self-reported asbestos exposure were estimated by unconditional logistic regression models with adjustment for age, sex, smoking status and pack-years. The P-value for the most significant SNP rs13383928 was 2.17 × 10^−6, which did not reach the genome-wide statistical significance. Using a versatile gene-based test approach, we found that the top significant gene was C7orf54, located on 7q32.1 (P = 8.90 × 10^−5). Interestingly, most of the other significant genes were located on 11q13. When we used an improved gene-set-enrichment analysis approach, we found that the Fas signaling pathway and the antigen processing and presentation pathway were most significant (nominal P < 0.001; false discovery rate < 0.05) among 250 pathways containing 17 572 genes. We believe that our analysis is a pilot study that first describes the gene–asbestos interaction in lung cancer risk at levels of SNPs, genes and pathways. Our findings suggest that immune function regulation-related pathways may be mechanistically involved in asbestos-associated lung cancer risk. Abbreviations: CI, confidence interval; E, environment; FDR, false discovery rate; G, gene; GSEA, gene-set-enrichment analysis; GWAS, genome-wide association studies; i-GSEA, improved gene-set-enrichment analysis approach; OR, odds ratio; SNP, single nucleotide polymorphism. PMID:22637743

  8. Genome-wide mapping of autonomous promoter activity in human cells

    PubMed Central

    van Arensbergen, Joris; FitzPatrick, Vincent D.; de Haas, Marcel; Pagie, Ludo; Sluimer, Jasper; Bussemaker, Harmen J.; van Steensel, Bas

    2017-01-01

    Previous methods to systematically characterize sequence-intrinsic activity of promoters have been limited by relatively low throughput and the length of sequences that could be tested. Here we present Survey of Regulatory Elements (SuRE), a method to assay more than 10^8 DNA fragments, each 0.2–2 kb in size, for their ability to drive transcription autonomously. In SuRE, a plasmid library is constructed of random genomic fragments upstream of a 20 bp barcode and decoded by paired-end sequencing. This library is then transfected into cells and transcribed barcodes are quantified in the RNA by high throughput sequencing. When applied to the human genome, we achieved a 55-fold genome coverage, allowing us to map autonomous promoter activity genome-wide. By computational modeling we delineated subregions within promoters that are relevant for their activity. For instance, we show that antisense promoter transcription is generally dependent on the sense core promoter sequences, and that most enhancers and several families of repetitive elements act as autonomous transcription initiation sites. PMID:28024146

  9. Genome-wide association study identifies 74 loci associated with educational attainment

    PubMed Central

    Okbay, Aysu; Beauchamp, Jonathan P.; Fontana, Mark A.; Lee, James J.; Pers, Tune H.; Rietveld, Cornelius A.; Turley, Patrick; Chen, Guo-Bo; Emilsson, Valur; Meddens, S. Fleur W.; Oskarsson, Sven; Pickrell, Joseph K.; Thom, Kevin; Timshel, Pascal; de Vlaming, Ronald; Abdellaoui, Abdel; Ahluwalia, Tarunveer S.; Bacelis, Jonas; Baumbach, Clemens; Bjornsdottir, Gyda; Brandsma, Johannes H.; Concas, Maria Pina; Derringer, Jaime; Furlotte, Nicholas A.; Galesloot, Tessel E.; Girotto, Giorgia; Gupta, Richa; Hall, Leanne M.; Harris, Sarah E.; Hofer, Edith; Horikoshi, Momoko; Huffman, Jennifer E.; Kaasik, Kadri; Kalafati, Ioanna P.; Karlsson, Robert; Kong, Augustine; Lahti, Jari; van der Lee, Sven J.; de Leeuw, Christiaan; Lind, Penelope A.; Lindgren, Karl-Oskar; Liu, Tian; Mangino, Massimo; Marten, Jonathan; Mihailov, Evelin; Miller, Michael B.; van der Most, Peter J.; Oldmeadow, Christopher; Payton, Antony; Pervjakova, Natalia; Peyrot, Wouter J.; Qian, Yong; Raitakari, Olli; Rueedi, Rico; Salvi, Erika; Schmidt, Börge; Schraut, Katharina E.; Shi, Jianxin; Smith, Albert V.; Poot, Raymond A.; Pourcain, Beate; Teumer, Alexander; Thorleifsson, Gudmar; Verweij, Niek; Vuckovic, Dragana; Wellmann, Juergen; Westra, Harm-Jan; Yang, Jingyun; Zhao, Wei; Zhu, Zhihong; Alizadeh, Behrooz Z.; Amin, Najaf; Bakshi, Andrew; Baumeister, Sebastian E.; Biino, Ginevra; Bønnelykke, Klaus; Boyle, Patricia A.; Campbell, Harry; Cappuccio, Francesco P.; Davies, Gail; De Neve, Jan-Emmanuel; Deloukas, Panos; Demuth, Ilja; Ding, Jun; Eibich, Peter; Eisele, Lewin; Eklund, Niina; Evans, David M.; Faul, Jessica D.; Feitosa, Mary F.; Forstner, Andreas J.; Gandin, Ilaria; Gunnarsson, Bjarni; Halldórsson, Bjarni V.; Harris, Tamara B.; Heath, Andrew C.; Hocking, Lynne J.; Holliday, Elizabeth G.; Homuth, Georg; Horan, Michael A.; Hottenga, Jouke-Jan; de Jager, Philip L.; Joshi, Peter K.; Jugessur, Astanand; Kaakinen, Marika A.; Kähönen, Mika; Kanoni, Stavroula; Keltigangas-Järvinen, Liisa; Kiemeney, 
Lambertus A.L.M.; Kolcic, Ivana; Koskinen, Seppo; Kraja, Aldi T.; Kroh, Martin; Kutalik, Zoltan; Latvala, Antti; Launer, Lenore J.; Lebreton, Maël P.; Levinson, Douglas F.; Lichtenstein, Paul; Lichtner, Peter; Liewald, David C.M.; Loukola, Anu; Madden, Pamela A.; Mägi, Reedik; Mäki-Opas, Tomi; Marioni, Riccardo E.; Marques-Vidal, Pedro; Meddens, Gerardus A.; McMahon, George; Meisinger, Christa; Meitinger, Thomas; Milaneschi, Yusplitri; Milani, Lili; Montgomery, Grant W.; Myhre, Ronny; Nelson, Christopher P.; Nyholt, Dale R.; Ollier, William E.R.; Palotie, Aarno; Paternoster, Lavinia; Pedersen, Nancy L.; Petrovic, Katja E.; Porteous, David J.; Räikkönen, Katri; Ring, Susan M.; Robino, Antonietta; Rostapshova, Olga; Rudan, Igor; Rustichini, Aldo; Salomaa, Veikko; Sanders, Alan R.; Sarin, Antti-Pekka; Schmidt, Helena; Scott, Rodney J.; Smith, Blair H.; Smith, Jennifer A.; Staessen, Jan A.; Steinhagen-Thiessen, Elisabeth; Strauch, Konstantin; Terracciano, Antonio; Tobin, Martin D.; Ulivi, Sheila; Vaccargiu, Simona; Quaye, Lydia; van Rooij, Frank J.A.; Venturini, Cristina; Vinkhuyzen, Anna A.E.; Völker, Uwe; Völzke, Henry; Vonk, Judith M.; Vozzi, Diego; Waage, Johannes; Ware, Erin B.; Willemsen, Gonneke; Attia, John R.; Bennett, David A.; Berger, Klaus; Bertram, Lars; Bisgaard, Hans; Boomsma, Dorret I.; Borecki, Ingrid B.; Bultmann, Ute; Chabris, Christopher F.; Cucca, Francesco; Cusi, Daniele; Deary, Ian J.; Dedoussis, George V.; van Duijn, Cornelia M.; Eriksson, Johan G.; Franke, Barbara; Franke, Lude; Gasparini, Paolo; Gejman, Pablo V.; Gieger, Christian; Grabe, Hans-Jörgen; Gratten, Jacob; Groenen, Patrick J.F.; Gudnason, Vilmundur; van der Harst, Pim; Hayward, Caroline; Hinds, David A.; Hoffmann, Wolfgang; Hyppönen, Elina; Iacono, William G.; Jacobsson, Bo; Järvelin, Marjo-Riitta; Jöckel, Karl-Heinz; Kaprio, Jaakko; Kardia, Sharon L.R.; Lehtimäki, Terho; Lehrer, Steven F.; Magnusson, Patrik K.E.; Martin, Nicholas G.; McGue, Matt; Metspalu, Andres; Pendleton, Neil; 
Penninx, Brenda W.J.H.; Perola, Markus; Pirastu, Nicola; Pirastu, Mario; Polasek, Ozren; Posthuma, Danielle; Power, Christine; Province, Michael A.; Samani, Nilesh J.; Schlessinger, David; Schmidt, Reinhold; Sørensen, Thorkild I.A.; Spector, Tim D.; Stefansson, Kari; Thorsteinsdottir, Unnur; Thurik, A. Roy; Timpson, Nicholas J.; Tiemeier, Henning; Tung, Joyce Y.; Uitterlinden, André G.; Vitart, Veronique; Vollenweider, Peter; Weir, David R.; Wilson, James F.; Wright, Alan F.; Conley, Dalton C.; Krueger, Robert F.; Smith, George Davey; Hofman, Albert; Laibson, David I.; Medland, Sarah E.; Meyer, Michelle N.; Yang, Jian; Johannesson, Magnus; Visscher, Peter M.; Esko, Tõnu; Koellinger, Philipp D.; Cesarini, David; Benjamin, Daniel J.

    2016-01-01

    Educational attainment (EA) is strongly influenced by social and other environmental factors, but genetic factors are also estimated to account for at least 20% of the variation across individuals. We report the results of a genome-wide association study (GWAS) for EA that extends our earlier discovery sample of 101,069 individuals to 293,723 individuals, and a replication in an independent sample of 111,349 individuals from the UK Biobank. We now identify 74 genome-wide significant loci associated with number of years of schooling completed. Single-nucleotide polymorphisms (SNPs) associated with educational attainment are disproportionately found in genomic regions regulating gene expression in the fetal brain. Candidate genes are preferentially expressed in neural tissue, especially during the prenatal period, and enriched for biological pathways involved in neural development. Our findings demonstrate that, even for a behavioral phenotype that is mostly environmentally determined, a well-powered GWAS identifies replicable associated genetic variants that suggest biologically relevant pathways. Because EA is measured in large numbers of individuals, it will continue to be useful as a proxy phenotype in efforts to characterize the genetic influences of related phenotypes, including cognition and neuropsychiatric disease. PMID:27225129

  10. Genome-wide association study identifies 74 loci associated with educational attainment.

    PubMed

    Okbay, Aysu; Beauchamp, Jonathan P; Fontana, Mark Alan; Lee, James J; Pers, Tune H; Rietveld, Cornelius A; Turley, Patrick; Chen, Guo-Bo; Emilsson, Valur; Meddens, S Fleur W; Oskarsson, Sven; Pickrell, Joseph K; Thom, Kevin; Timshel, Pascal; de Vlaming, Ronald; Abdellaoui, Abdel; Ahluwalia, Tarunveer S; Bacelis, Jonas; Baumbach, Clemens; Bjornsdottir, Gyda; Brandsma, Johannes H; Pina Concas, Maria; Derringer, Jaime; Furlotte, Nicholas A; Galesloot, Tessel E; Girotto, Giorgia; Gupta, Richa; Hall, Leanne M; Harris, Sarah E; Hofer, Edith; Horikoshi, Momoko; Huffman, Jennifer E; Kaasik, Kadri; Kalafati, Ioanna P; Karlsson, Robert; Kong, Augustine; Lahti, Jari; van der Lee, Sven J; deLeeuw, Christiaan; Lind, Penelope A; Lindgren, Karl-Oskar; Liu, Tian; Mangino, Massimo; Marten, Jonathan; Mihailov, Evelin; Miller, Michael B; van der Most, Peter J; Oldmeadow, Christopher; Payton, Antony; Pervjakova, Natalia; Peyrot, Wouter J; Qian, Yong; Raitakari, Olli; Rueedi, Rico; Salvi, Erika; Schmidt, Börge; Schraut, Katharina E; Shi, Jianxin; Smith, Albert V; Poot, Raymond A; St Pourcain, Beate; Teumer, Alexander; Thorleifsson, Gudmar; Verweij, Niek; Vuckovic, Dragana; Wellmann, Juergen; Westra, Harm-Jan; Yang, Jingyun; Zhao, Wei; Zhu, Zhihong; Alizadeh, Behrooz Z; Amin, Najaf; Bakshi, Andrew; Baumeister, Sebastian E; Biino, Ginevra; Bønnelykke, Klaus; Boyle, Patricia A; Campbell, Harry; Cappuccio, Francesco P; Davies, Gail; De Neve, Jan-Emmanuel; Deloukas, Panos; Demuth, Ilja; Ding, Jun; Eibich, Peter; Eisele, Lewin; Eklund, Niina; Evans, David M; Faul, Jessica D; Feitosa, Mary F; Forstner, Andreas J; Gandin, Ilaria; Gunnarsson, Bjarni; Halldórsson, Bjarni V; Harris, Tamara B; Heath, Andrew C; Hocking, Lynne J; Holliday, Elizabeth G; Homuth, Georg; Horan, Michael A; Hottenga, Jouke-Jan; de Jager, Philip L; Joshi, Peter K; Jugessur, Astanand; Kaakinen, Marika A; Kähönen, Mika; Kanoni, Stavroula; Keltigangas-Järvinen, Liisa; Kiemeney, Lambertus A L M; Kolcic, Ivana; Koskinen, 
Seppo; Kraja, Aldi T; Kroh, Martin; Kutalik, Zoltan; Latvala, Antti; Launer, Lenore J; Lebreton, Maël P; Levinson, Douglas F; Lichtenstein, Paul; Lichtner, Peter; Liewald, David C M; Loukola, Anu; Madden, Pamela A; Mägi, Reedik; Mäki-Opas, Tomi; Marioni, Riccardo E; Marques-Vidal, Pedro; Meddens, Gerardus A; McMahon, George; Meisinger, Christa; Meitinger, Thomas; Milaneschi, Yusplitri; Milani, Lili; Montgomery, Grant W; Myhre, Ronny; Nelson, Christopher P; Nyholt, Dale R; Ollier, William E R; Palotie, Aarno; Paternoster, Lavinia; Pedersen, Nancy L; Petrovic, Katja E; Porteous, David J; Räikkönen, Katri; Ring, Susan M; Robino, Antonietta; Rostapshova, Olga; Rudan, Igor; Rustichini, Aldo; Salomaa, Veikko; Sanders, Alan R; Sarin, Antti-Pekka; Schmidt, Helena; Scott, Rodney J; Smith, Blair H; Smith, Jennifer A; Staessen, Jan A; Steinhagen-Thiessen, Elisabeth; Strauch, Konstantin; Terracciano, Antonio; Tobin, Martin D; Ulivi, Sheila; Vaccargiu, Simona; Quaye, Lydia; van Rooij, Frank J A; Venturini, Cristina; Vinkhuyzen, Anna A E; Völker, Uwe; Völzke, Henry; Vonk, Judith M; Vozzi, Diego; Waage, Johannes; Ware, Erin B; Willemsen, Gonneke; Attia, John R; Bennett, David A; Berger, Klaus; Bertram, Lars; Bisgaard, Hans; Boomsma, Dorret I; Borecki, Ingrid B; Bültmann, Ute; Chabris, Christopher F; Cucca, Francesco; Cusi, Daniele; Deary, Ian J; Dedoussis, George V; van Duijn, Cornelia M; Eriksson, Johan G; Franke, Barbara; Franke, Lude; Gasparini, Paolo; Gejman, Pablo V; Gieger, Christian; Grabe, Hans-Jörgen; Gratten, Jacob; Groenen, Patrick J F; Gudnason, Vilmundur; van der Harst, Pim; Hayward, Caroline; Hinds, David A; Hoffmann, Wolfgang; Hyppönen, Elina; Iacono, William G; Jacobsson, Bo; Järvelin, Marjo-Riitta; Jöckel, Karl-Heinz; Kaprio, Jaakko; Kardia, Sharon L R; Lehtimäki, Terho; Lehrer, Steven F; Magnusson, Patrik K E; Martin, Nicholas G; McGue, Matt; Metspalu, Andres; Pendleton, Neil; Penninx, Brenda W J H; Perola, Markus; Pirastu, Nicola; Pirastu, Mario; Polasek, 
Ozren; Posthuma, Danielle; Power, Christine; Province, Michael A; Samani, Nilesh J; Schlessinger, David; Schmidt, Reinhold; Sørensen, Thorkild I A; Spector, Tim D; Stefansson, Kari; Thorsteinsdottir, Unnur; Thurik, A Roy; Timpson, Nicholas J; Tiemeier, Henning; Tung, Joyce Y; Uitterlinden, André G; Vitart, Veronique; Vollenweider, Peter; Weir, David R; Wilson, James F; Wright, Alan F; Conley, Dalton C; Krueger, Robert F; Davey Smith, George; Hofman, Albert; Laibson, David I; Medland, Sarah E; Meyer, Michelle N; Yang, Jian; Johannesson, Magnus; Visscher, Peter M; Esko, Tõnu; Koellinger, Philipp D; Cesarini, David; Benjamin, Daniel J

    2016-05-26

    Educational attainment is strongly influenced by social and other environmental factors, but genetic factors are estimated to account for at least 20% of the variation across individuals. Here we report the results of a genome-wide association study (GWAS) for educational attainment that extends our earlier discovery sample of 101,069 individuals to 293,723 individuals, and a replication study in an independent sample of 111,349 individuals from the UK Biobank. We identify 74 genome-wide significant loci associated with the number of years of schooling completed. Single-nucleotide polymorphisms associated with educational attainment are disproportionately found in genomic regions regulating gene expression in the fetal brain. Candidate genes are preferentially expressed in neural tissue, especially during the prenatal period, and enriched for biological pathways involved in neural development. Our findings demonstrate that, even for a behavioural phenotype that is mostly environmentally determined, a well-powered GWAS identifies replicable associated genetic variants that suggest biologically relevant pathways. Because educational attainment is measured in large numbers of individuals, it will continue to be useful as a proxy phenotype in efforts to characterize the genetic influences of related phenotypes, including cognition and neuropsychiatric diseases.

  11. Genome-wide differentiation of various melon horticultural groups for use in genome wide association study for fruit firmness and construction of a high resolution genetic map

    USDA-ARS's Scientific Manuscript database

    We generated 13,789 single nucleotide polymorphism (SNP) markers from 97 melon accessions using genotyping by sequencing and anchored them to chromosomes to understand genome-wide fixation index between various melon morphotypes and linkage disequilibrium (LD) decay for inodorus and cantalupensis, th...

  12. Genetic link between family socioeconomic status and children's educational achievement estimated from genome-wide SNPs.

    PubMed

    Krapohl, E; Plomin, R

    2016-03-01

    One of the best predictors of children's educational achievement is their family's socioeconomic status (SES), but the degree to which this association is genetically mediated remains unclear. For 3000 UK-representative unrelated children we found that genome-wide single-nucleotide polymorphisms could explain a third of the variance of scores on an age-16 UK national examination of educational achievement and half of the correlation between their scores and family SES. Moreover, genome-wide polygenic scores based on a previously published genome-wide association meta-analysis of total number of years in education accounted for ~3.0% variance in educational achievement and ~2.5% in family SES. This study provides the first molecular evidence for substantial genetic influence on differences in children's educational achievement and its association with family SES.

  13. Genetic link between family socioeconomic status and children's educational achievement estimated from genome-wide SNPs

    PubMed Central

    Krapohl, E; Plomin, R

    2016-01-01

    One of the best predictors of children's educational achievement is their family's socioeconomic status (SES), but the degree to which this association is genetically mediated remains unclear. For 3000 UK-representative unrelated children we found that genome-wide single-nucleotide polymorphisms could explain a third of the variance of scores on an age-16 UK national examination of educational achievement and half of the correlation between their scores and family SES. Moreover, genome-wide polygenic scores based on a previously published genome-wide association meta-analysis of total number of years in education accounted for ~3.0% variance in educational achievement and ~2.5% in family SES. This study provides the first molecular evidence for substantial genetic influence on differences in children's educational achievement and its association with family SES. PMID:25754083

  14. Genome-wide comparative transcriptome analysis of CMS-D2 and its maintainer and restorer lines in upland cotton.

    PubMed

    Wu, Jianyong; Zhang, Meng; Zhang, Bingbing; Zhang, Xuexian; Guo, Liping; Qi, Tingxiang; Wang, Hailin; Zhang, Jinfa; Xing, Chaozhu

    2017-06-08

    Cytoplasmic male sterility (CMS) conferred by the cytoplasm from Gossypium harknessii (D2) is an important system for hybrid seed production in Upland cotton (G. hirsutum). The male sterility of CMS-D2 (i.e., A line) can be restored to fertility by a restorer (i.e., R line) carrying the restorer gene Rf1 transferred from the D2 nuclear genome. However, the molecular mechanisms of CMS-D2 and its restoration are poorly understood. In this study, a genome-wide comparative transcriptome analysis was performed to identify differentially expressed genes (DEGs) in flower buds among the isogenic fertile R line and sterile A line derived from a backcross population (BC8F1) and the recurrent parent, i.e., the maintainer (B line). A total of 1464 DEGs were identified among the three isogenic lines, and the Rf1-carrying Chr_D05 and its homeologous Chr_A05 had more DEGs than other chromosomes. The results of GO and KEGG enrichment analysis showed differences in circadian rhythm between the fertile and sterile lines. Eleven DEGs were selected for validation using qRT-PCR, confirming the accuracy of the RNA-seq results. Through genome-wide comparative transcriptome analysis, the differential expression profiles of CMS-D2 and its maintainer and restorer lines in Upland cotton were identified. Our results provide an important foundation for further studies into the molecular mechanisms of the interactions between the restorer gene Rf1 and the CMS-D2 cytoplasm.

  15. Genome-Wide Microsatellite Characterization and Marker Development in the Sequenced Brassica Crop Species

    PubMed Central

    Shi, Jiaqin; Huang, Shunmou; Zhan, Jiepeng; Yu, Jingyin; Wang, Xinfa; Hua, Wei; Liu, Shengyi; Liu, Guihua; Wang, Hanzhong

    2014-01-01

    Although much research has been conducted, the pattern of microsatellite distribution has remained ambiguous, and the development/utilization of microsatellite markers has still been limited/inefficient in Brassica, due to the lack of genome sequences. In view of this, we conducted genome-wide microsatellite characterization and marker development in three recently sequenced Brassica crops: Brassica rapa, Brassica oleracea and Brassica napus. The analysed microsatellite characteristics of these Brassica species were highly similar or almost identical, which suggests that the pattern of microsatellite distribution is likely conservative in Brassica. The genomic distribution of microsatellites was highly non-uniform and positively or negatively correlated with genes or transposable elements, respectively. Of the total of 115 869, 185 662 and 356 522 simple sequence repeat (SSR) markers developed with high frequencies (408.2, 343.8 and 356.2 per Mb or one every 2.45, 2.91 and 2.81 kb, respectively), most represented new SSR markers, the majority had determined physical positions, and a large number were genic or putative single-locus SSR markers. We also constructed a comprehensive database for the newly developed SSR markers, which was integrated with public Brassica SSR markers and annotated genome components. The genome-wide SSR markers developed in this study provide a useful tool to extend the annotated genome resources of sequenced Brassica species to genetic study/breeding in different Brassica species. PMID:24130371
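
    The two ways the abstract expresses marker density (markers per Mb and average spacing in kb) are reciprocals of each other, since 1 Mb = 1000 kb. The short check below recomputes the spacing from the reported densities; the helper name spacing_kb is hypothetical and introduced only for this illustration.

```python
# SSR marker densities (markers per Mb), in the order given in the abstract.
densities_per_mb = [408.2, 343.8, 356.2]


def spacing_kb(density_per_mb):
    """Average distance between markers in kb, given markers per Mb (1 Mb = 1000 kb)."""
    return 1000.0 / density_per_mb


# Round to two decimals, the precision used in the abstract.
spacings = [round(spacing_kb(d), 2) for d in densities_per_mb]

for density, spacing in zip(densities_per_mb, spacings):
    print(f"{density} per Mb is about one marker every {spacing} kb")
```

    The recomputed spacings agree with the 2.45, 2.91 and 2.81 kb figures quoted in the abstract.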

  16. Genome-wide microsatellite characterization and marker development in the sequenced Brassica crop species.

    PubMed

    Shi, Jiaqin; Huang, Shunmou; Zhan, Jiepeng; Yu, Jingyin; Wang, Xinfa; Hua, Wei; Liu, Shengyi; Liu, Guihua; Wang, Hanzhong

    2014-02-01

    Although much research has been conducted, the pattern of microsatellite distribution has remained ambiguous, and the development/utilization of microsatellite markers has still been limited/inefficient in Brassica, due to the lack of genome sequences. In view of this, we conducted genome-wide microsatellite characterization and marker development in three recently sequenced Brassica crops: Brassica rapa, Brassica oleracea and Brassica napus. The analysed microsatellite characteristics of these Brassica species were highly similar or almost identical, which suggests that the pattern of microsatellite distribution is likely conservative in Brassica. The genomic distribution of microsatellites was highly non-uniform and positively or negatively correlated with genes or transposable elements, respectively. Of the total of 115 869, 185 662 and 356 522 simple sequence repeat (SSR) markers developed with high frequencies (408.2, 343.8 and 356.2 per Mb or one every 2.45, 2.91 and 2.81 kb, respectively), most represented new SSR markers, the majority had determined physical positions, and a large number were genic or putative single-locus SSR markers. We also constructed a comprehensive database for the newly developed SSR markers, which was integrated with public Brassica SSR markers and annotated genome components. The genome-wide SSR markers developed in this study provide a useful tool to extend the annotated genome resources of sequenced Brassica species to genetic study/breeding in different Brassica species.

  17. Genome-Wide Association of the Laboratory-Based Nicotine Metabolite Ratio in Three Ancestries

    PubMed Central

    Baurley, James W.; Edlund, Christopher K.; Pardamean, Carissa I.; Conti, David V.; Krasnow, Ruth; Javitz, Harold S.; Hops, Hyman; Swan, Gary E.; Benowitz, Neal L.

    2016-01-01

    Introduction: Metabolic enzyme variation and other patient and environmental characteristics influence smoking behaviors, treatment success, and risk of related disease. Population-specific variation in metabolic genes contributes to challenges in developing and optimizing pharmacogenetic interventions. We applied a custom genome-wide genotyping array for addiction research (Smokescreen), to three laboratory-based studies of nicotine metabolism with oral or venous administration of labeled nicotine and cotinine, to model nicotine metabolism in multiple populations. The trans-3′-hydroxycotinine/cotinine ratio, the nicotine metabolite ratio (NMR), was the nicotine metabolism measure analyzed. Methods: Three hundred twelve individuals of self-identified European, African, and Asian American ancestry were genotyped and included in ancestry-specific genome-wide association scans (GWAS) and a meta-GWAS analysis of the NMR. We modeled natural-log transformed NMR with covariates: principal components of genetic ancestry, age, sex, body mass index, and smoking status. Results: African and Asian American NMRs were statistically significantly (P values ≤ 5E-5) lower than European American NMRs. Meta-GWAS analysis identified 36 genome-wide significant variants over a 43 kilobase pair region at CYP2A6 with minimum P = 2.46E-18 at rs12459249, proximal to CYP2A6. Additional minima were located in intron 4 (rs56113850, P = 6.61E-18) and in the CYP2A6-CYP2A7 intergenic region (rs34226463, P = 1.45E-12). Most (34/36) genome-wide significant variants suggested reduced CYP2A6 activity; functional mechanisms were identified and tested in knowledge-bases. Conditional analysis resulted in intergenic variants of possible interest (P values < 5E-5). Conclusions: This meta-GWAS of the NMR identifies CYP2A6 variants, replicates the top-ranked single nucleotide polymorphism from a recent Finnish meta-GWAS of the NMR, identifies functional mechanisms, and provides pan

  18. Genome-Wide Meta-Analysis of Longitudinal Alcohol Consumption Across Youth and Early Adulthood.

    PubMed

    Adkins, Daniel E; Clark, Shaunna L; Copeland, William E; Kennedy, Martin; Conway, Kevin; Angold, Adrian; Maes, Hermine; Liu, Youfang; Kumar, Gaurav; Erkanli, Alaattin; Patkar, Ashwin A; Silberg, Judy; Brown, Tyson H; Fergusson, David M; Horwood, L John; Eaves, Lindon; van den Oord, Edwin J C G; Sullivan, Patrick F; Costello, E J

    2015-08-01

    The public health burden of alcohol is unevenly distributed across the life course, with levels of use, abuse, and dependence increasing across adolescence and peaking in early adulthood. Here, we leverage this temporal patterning to search for common genetic variants predicting developmental trajectories of alcohol consumption. Comparable psychiatric evaluations measuring alcohol consumption were collected in three longitudinal community samples (N=2,126, obs=12,166). Repeated consumption measurements spanning adolescence and early adulthood were analyzed using linear mixed models, estimating individual consumption trajectories, which were then tested for association with Illumina 660W-Quad genotype data (866,099 SNPs after imputation and QC). Association results were combined across samples using standard meta-analysis methods. Four meta-analysis associations satisfied our pre-determined genome-wide significance criterion (FDR<0.1) and six others met our 'suggestive' criterion (FDR<0.2). Genome-wide significant associations were highly biologically plausible, including associations within GABA transporter 1, SLC6A1 (solute carrier family 6, member 1), and exonic hits in LOC100129340 (mitofusin-1-like). Pathway analyses elaborated single marker results, indicating significantly enriched associations to intuitive biological mechanisms, including neurotransmission, xenobiotic pharmacodynamics, and nuclear hormone receptors (NHR). These findings underscore the value of combining longitudinal behavioral data and genome-wide genotype information in order to study developmental patterns and improve statistical power in genomic studies.
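
    The abstract above sorts associations into 'significant' (FDR<0.1) and 'suggestive' (FDR<0.2) tiers but does not say how the FDR values were computed. The sketch below assumes the Benjamini-Hochberg procedure, a common choice, and applies the two cutoffs to made-up P values; the function names and the toy data are hypothetical and are not taken from the study.

```python
def bh_qvalues(p_values):
    """Benjamini-Hochberg adjusted P values (q-values), returned in input order.

    Assumption: the study's actual FDR method is not given in the abstract.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


def tier(q, significant=0.1, suggestive=0.2):
    """Apply the two FDR cutoffs quoted in the abstract."""
    if q < significant:
        return "significant"
    if q < suggestive:
        return "suggestive"
    return "neither"


# Hypothetical P values, for illustration only.
toy_p = [0.001, 0.02, 0.09, 0.30, 0.80]
labels = [tier(q) for q in bh_qvalues(toy_p)]
print(labels)
```

    With real data the same two-tier labelling would be applied to every tested SNP; an established statistics library would normally be preferred over a hand-written routine.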

  19. Lessons from ten years of genome-wide association studies of asthma

    PubMed Central

    Vicente, Cristina T; Revez, Joana A; Ferreira, Manuel A R

    2017-01-01

    Twenty-five genome-wide association studies (GWAS) of asthma were published between 2007 and 2016, the largest with a sample size of 157 242 individuals. Across these studies, 39 genetic variants in low linkage disequilibrium (LD) with each other were reported to associate with disease risk at a significance threshold of P < 5 × 10^−8, including 31 in populations of European ancestry. Results from analyses of the UK Biobank data (n=380 503) indicate that at least 28 of the 31 associations reported in Europeans represent true-positive findings, collectively explaining 2.5% of the variation in disease liability (median of 0.06% per variant). We identified 49 transcripts as likely target genes of the published asthma risk variants, mostly based on LD with expression quantitative trait loci (eQTL). Of these genes, 16 were previously implicated in disease pathophysiology by functional studies, including TSLP, TNFSF4, ADORA1, CHIT1 and USF1. In contrast, at present, there is limited or no functional evidence directly implicating the remaining 33 likely target genes in asthma pathophysiology. Some of these genes have a known function that is relevant to allergic disease, including F11R, CD247, PGAP3, AAGAB, CAMK4 and PEX14, and so could be prioritized for functional follow-up. We conclude by highlighting three areas of research that are essential to help translate GWAS findings into clinical research or practice, namely validation of target gene predictions, understanding target gene function and their role in disease pathophysiology and genomics-guided prioritization of targets for drug development. PMID:29333270

  20. Genome-wide Pleiotropy Between Parkinson Disease and Autoimmune Diseases.

    PubMed

    Witoelar, Aree; Jansen, Iris E; Wang, Yunpeng; Desikan, Rahul S; Gibbs, J Raphael; Blauwendraat, Cornelis; Thompson, Wesley K; Hernandez, Dena G; Djurovic, Srdjan; Schork, Andrew J; Bettella, Francesco; Ellinghaus, David; Franke, Andre; Lie, Benedicte A; McEvoy, Linda K; Karlsen, Tom H; Lesage, Suzanne; Morris, Huw R; Brice, Alexis; Wood, Nicholas W; Heutink, Peter; Hardy, John; Singleton, Andrew B; Dale, Anders M; Gasser, Thomas; Andreassen, Ole A; Sharma, Manu

    2017-07-01

    Recent genome-wide association studies (GWAS) and pathway analyses supported long-standing observations of an association between immune-mediated diseases and Parkinson disease (PD). The post-GWAS era provides an opportunity for cross-phenotype analyses between different complex phenotypes. The objective was to test the hypothesis that there are common genetic risk variants conveying risk of both PD and autoimmune diseases (ie, pleiotropy) and to identify new shared genetic variants and their pathways by applying a novel statistical framework in a genome-wide approach. Using the conjunction false discovery rate method, this study analyzed GWAS data from a selection of archetypal autoimmune diseases among 138 511 individuals of European ancestry and systematically investigated pleiotropy between PD and type 1 diabetes, Crohn disease, ulcerative colitis, rheumatoid arthritis, celiac disease, psoriasis, and multiple sclerosis. NeuroX data (6927 PD cases and 6108 controls) were used for replication. The study investigated the biological correlation between the top loci through protein-protein interaction and changes in the gene expression and methylation levels. The dates of the analysis were June 10, 2015, to March 4, 2017. The primary outcome was a list of novel loci and their pathways involved in PD and autoimmune diseases. Genome-wide conjunctional analysis identified 17 novel loci at false discovery rate less than 0.05 with overlap between PD and autoimmune diseases, including known PD loci adjacent to GAK, HLA-DRB5, LRRK2, and MAPT for rheumatoid arthritis, ulcerative colitis and Crohn disease. Replication confirmed the involvement of HLA, LRRK2, MAPT, TRIM10, and SETD1A in PD. Among the novel genes discovered, WNT3, KANSL1, CRHR1, BOLA2, and GUCY1A3 are within a protein-protein interaction network with known PD genes. A subset of novel loci was significantly associated with changes in methylation or expression levels of adjacent genes. The study findings provide novel mechanistic

  1. A Genome Wide Association Study Identifies Common Variants Associated with Lipid Levels in the Chinese Population

    PubMed Central

    Wu, Chen; Yang, Handong; Yu, Dianke; Yang, Xiaobo; Zhang, Xiaomin; Wang, Yiqin; Sun, Jielin; Gao, Yong; Tan, Aihua; He, Yunfeng; Zhang, Haiying; Qin, Xue; Zhu, Jingwen; Li, Huaixing; Lin, Xu; Zhu, Jiang; Min, Xinwen; Lang, Mingjian; Li, Dongfeng; Zhai, Kan; Chang, Jiang; Tan, Wen; Yuan, Jing; Chen, Weihong; Wang, Youjie; Wei, Sheng; Miao, Xiaoping; Wang, Feng; Fang, Weimin; Liang, Yuan; Deng, Qifei; Dai, Xiayun; Lin, Dafeng; Huang, Suli; Guo, Huan; Lilly Zheng, S.; Xu, Jianfeng; Lin, Dongxin; Hu, Frank B.; Wu, Tangchun

    2013-01-01

    Plasma lipid levels are important risk factors for cardiovascular disease and are influenced by genetic and environmental factors. Recent genome-wide association studies (GWAS) have identified several lipid-associated loci, but these loci have been identified primarily in European populations. In order to identify genetic markers for lipid levels in a Chinese population and analyze the heterogeneity between Europeans and Asians, especially Chinese, we performed a meta-analysis of two genome-wide association studies on four common lipid traits including total cholesterol (TC), triglycerides (TG), low-density lipoprotein cholesterol (LDL) and high-density lipoprotein cholesterol (HDL) in a Han Chinese population totaling 3,451 healthy subjects. Replication was performed in an additional 8,830 subjects of Han Chinese ethnicity. We replicated eight loci associated with lipid levels previously reported in a European population. The loci associated at genome-wide significance with TC were near DOCK7, HMGCR and ABO; those associated with TG were near APOA1/C3/A4/A5 and LPL; those associated with LDL were near HMGCR, ABO and TOMM40; and those associated with HDL were near LPL, LIPC and CETP. In addition, an additive genotype score of eight SNPs representing the eight loci that were found to be associated with lipid levels was associated with higher TC, TG and LDL levels (P = 5.52×10^-16, 1.38×10^-6 and 5.59×10^-9, respectively). These findings suggest the cumulative effects of multiple genetic loci on plasma lipid levels. Comparisons with previous GWAS of lipids highlight heterogeneity in allele frequency and in effect size for some loci between Chinese and European populations. The results from our GWAS provided comprehensive and convincing evidence of the genetic determinants of plasma lipid levels in a Chinese population. PMID:24386095
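
    The additive genotype score mentioned in this abstract can be illustrated with a minimal sketch. It assumes an unweighted score, that is, the number of trait-raising alleles (0, 1 or 2 per SNP) summed over the eight SNPs; the abstract does not say whether alleles were weighted by effect size, and the function name and example genotypes are hypothetical.

```python
def additive_genotype_score(risk_allele_counts):
    """Sum per-SNP risk-allele counts into a single score.

    Assumption: an unweighted count is used for illustration; the
    source abstract does not describe the exact scoring scheme.
    """
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each genotype must carry 0, 1 or 2 risk alleles")
    return sum(risk_allele_counts)


if __name__ == "__main__":
    # Hypothetical genotypes for one subject at eight SNPs.
    subject = [0, 1, 2, 1, 0, 2, 1, 1]
    print(additive_genotype_score(subject))
```

    With eight biallelic SNPs the score ranges from 0 to 16, and it would then be regressed against each lipid trait.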

  2. Natural CMT2 Variation Is Associated With Genome-Wide Methylation Changes and Temperature Seasonality

    PubMed Central

    Shen, Xia; De Jonge, Jennifer; Forsberg, Simon K. G.; Pettersson, Mats E.; Sheng, Zheya; Hennig, Lars; Carlborg, Örjan

    2014-01-01

    As Arabidopsis thaliana has colonized a wide range of habitats across the world it is an attractive model for studying the genetic mechanisms underlying environmental adaptation. Here, we used public data from two collections of A. thaliana accessions to associate genetic variability at individual loci with differences in climates at the sampling sites. We use a novel method to screen the genome for plastic alleles that tolerate a broader climate range than the major allele. This approach reduces confounding with population structure and increases power compared to standard genome-wide association methods. Sixteen novel loci were found, including an association between Chromomethylase 2 (CMT2) and temperature seasonality where the genome-wide CHH methylation was different for the group of accessions carrying the plastic allele. Cmt2 mutants were shown to be more tolerant to heat-stress, suggesting genetic regulation of epigenetic modifications as a likely mechanism underlying natural adaptation to variable temperatures, potentially through differential allelic plasticity to temperature-stress. PMID:25503602

  3. Genome-wide inference of regulatory networks in Streptomyces coelicolor.

    PubMed

    Castro-Melchor, Marlene; Charaniya, Salim; Karypis, George; Takano, Eriko; Hu, Wei-Shou

    2010-10-18

    The onset of antibiotic production in Streptomyces species is co-ordinated with differentiation events. An understanding of the genetic circuits that regulate these coupled biological phenomena is essential to discover and engineer the pharmacologically important natural products made by these species. The availability of genomic tools and access to a large warehouse of transcriptome data for the model organism, Streptomyces coelicolor, provides incentive to decipher the intricacies of the regulatory cascades and develop biologically meaningful hypotheses. In this study, more than 500 samples of genome-wide temporal transcriptome data, comprising wild-type and more than 25 regulatory gene mutants of Streptomyces coelicolor probed across multiple stress and medium conditions, were investigated. Information based on transcript and functional similarity was used to update a previously-predicted whole-genome operon map and further applied to predict transcriptional networks constituting modules enriched in diverse functions such as secondary metabolism and sigma factor. The predicted network displays a scale-free architecture with a small-world property observed in many biological networks. The networks were further investigated to identify functionally-relevant modules that exhibit functional coherence and a consensus motif in the promoter elements indicative of DNA-binding elements. Despite the enormous experimental as well as computational challenges, a systems approach for integrating diverse genome-scale datasets to elucidate complex regulatory networks is beginning to emerge. We present an integrated analysis of transcriptome data and genomic features to refine a whole-genome operon map and to construct regulatory networks at the cistron level in Streptomyces coelicolor. The functionally-relevant modules identified in this study are potential targets for further studies and verification.

  4. Genome-wide association for abdominal subcutaneous and visceral adipose reveals a novel locus for visceral fat in women.

    PubMed

    Fox, Caroline S; Liu, Yongmei; White, Charles C; Feitosa, Mary; Smith, Albert V; Heard-Costa, Nancy; Lohman, Kurt; Johnson, Andrew D; Foster, Meredith C; Greenawalt, Danielle M; Griffin, Paula; Ding, Jinghong; Newman, Anne B; Tylavsky, Fran; Miljkovic, Iva; Kritchevsky, Stephen B; Launer, Lenore; Garcia, Melissa; Eiriksdottir, Gudny; Carr, J Jeffrey; Gudnason, Vilmunder; Harris, Tamara B; Cupples, L Adrienne; Borecki, Ingrid B

    2012-01-01

    Body fat distribution, particularly centralized obesity, is associated with metabolic risk above and beyond total adiposity. We performed genome-wide association of abdominal adipose depots quantified using computed tomography (CT) to uncover novel loci for body fat distribution among participants of European ancestry. Subcutaneous and visceral fat were quantified in 5,560 women and 4,997 men from 4 population-based studies. Genome-wide genotyping was performed using standard arrays and imputed to ~2.5 million HapMap SNPs. Each study performed a genome-wide association analysis of subcutaneous adipose tissue (SAT), visceral adipose tissue (VAT), VAT adjusted for body mass index, and VAT/SAT ratio (a metric of the propensity to store fat viscerally as compared to subcutaneously) in the overall sample and in women and men separately. A weighted z-score meta-analysis was conducted. For the VAT/SAT ratio, our most significant association was rs11118316 at the LYPLAL1 gene (p = 3.1 × 10^-9), previously identified in association with waist-hip ratio. For SAT, the most significant SNP was in the FTO gene (p = 5.9 × 10^-8). Given the known gender differences in body fat distribution, we performed sex-specific analyses. Our most significant finding was for VAT in women, rs1659258 near THNSL2 (p = 1.6 × 10^-8), but not men (p = 0.75). Validation of this SNP in the GIANT consortium data demonstrated a similar sex-specific pattern, with observed significance in women (p = 0.006) but not men (p = 0.24) for BMI and waist circumference (p = 0.04 [women], p = 0.49 [men]). Finally, we interrogated our data for the 14 recently published loci for body fat distribution (measured by waist-hip ratio adjusted for BMI); associations were observed at 7 of these loci. In contrast, we observed associations at only 7/32 loci previously identified in association with BMI; the majority of overlap was observed with SAT. Genome-wide association for visceral and subcutaneous fat revealed a SNP for

  5. Genome wide association analyses based on a multiple trait approach for modeling feed efficiency

    USDA-ARS's Scientific Manuscript database

    Genome wide association (GWA) of feed efficiency (FE) could help target important genomic regions influencing FE. Data provided by an international dairy FE research consortium consisted of phenotypic records on dry matter intakes (DMI), milk energy (MILKE), and metabolic body weight (MBW) on 6,937 ...

  6. Genome-wide association study identifies three novel loci for type 2 diabetes.

    PubMed

    Hara, Kazuo; Fujita, Hayato; Johnson, Todd A; Yamauchi, Toshimasa; Yasuda, Kazuki; Horikoshi, Momoko; Peng, Chen; Hu, Cheng; Ma, Ronald C W; Imamura, Minako; Iwata, Minoru; Tsunoda, Tatsuhiko; Morizono, Takashi; Shojima, Nobuhiro; So, Wing Yee; Leung, Ting Fan; Kwan, Patrick; Zhang, Rong; Wang, Jie; Yu, Weihui; Maegawa, Hiroshi; Hirose, Hiroshi; Kaku, Kohei; Ito, Chikako; Watada, Hirotaka; Tanaka, Yasushi; Tobe, Kazuyuki; Kashiwagi, Atsunori; Kawamori, Ryuzo; Jia, Weiping; Chan, Juliana C N; Teo, Yik Ying; Shyong, Tai E; Kamatani, Naoyuki; Kubo, Michiaki; Maeda, Shiro; Kadowaki, Takashi

    2014-01-01

    Although over 60 loci for type 2 diabetes (T2D) have been identified, there still remains a large genetic component to be clarified. To explore unidentified loci for T2D, we performed a genome-wide association study (GWAS) of 6 209 637 single-nucleotide polymorphisms (SNPs), which were directly genotyped or imputed using East Asian references from the 1000 Genomes Project (June 2011 release) in 5976 Japanese patients with T2D and 20 829 nondiabetic individuals. Nineteen unreported loci were selected and taken forward to follow-up analyses. Combined discovery and follow-up analyses (30 392 cases and 34 814 controls) identified three new loci with genome-wide significance, which were MIR129-LEP [rs791595; risk allele = A; risk allele frequency (RAF) = 0.080; P = 2.55 × 10^-13; odds ratio (OR) = 1.17], GPSM1 [rs11787792; risk allele = A; RAF = 0.874; P = 1.74 × 10^-10; OR = 1.15] and SLC16A13 [rs312457; risk allele = G; RAF = 0.078; P = 7.69 × 10^-13; OR = 1.20]. This study demonstrates that GWASs based on the imputation of genotypes using modern reference haplotypes such as that from the 1000 Genomes Project data can assist in identification of new loci for common diseases.

  7. Genome-wide Association Study Identifies African-Specific Susceptibility Loci in African Americans with Inflammatory Bowel Disease

    PubMed Central

    Brant, Steven R.; Okou, David T.; Simpson, Claire L.; Cutler, David J.; Haritunians, Talin; Bradfield, Jonathan P.; Chopra, Pankaj; Prince, Jarod; Begum, Ferdouse; Kumar, Archana; Huang, Chengrui; Venkateswaran, Suresh; Datta, Lisa W.; Wei, Zhi; Thomas, Kelly; Herrinton, Lisa J.; Klapproth, Jan-Micheal A.; Quiros, Antonio J.; Seminerio, Jenifer; Liu, Zhenqiu; Alexander, Jonathan S.; Baldassano, Robert N.; Dudley-Brown, Sharon; Cross, Raymond K.; Dassopoulos, Themistocles; Denson, Lee A.; Dhere, Tanvi A.; Dryden, Gerald W.; Hanson, John S.; Hou, Jason K.; Hussain, Sunny Z.; Hyams, Jeffrey S.; Isaacs, Kim L.; Kader, Howard; Kappelman, Michael D.; Katz, Jeffry; Kellermayer, Richard; Kirschner, Barbara S.; Kuemmerle, John F.; Kwon, John H.; Lazarev, Mark; Li, Ellen; Mack, David; Mannon, Peter; Moulton, Dedrick E.; Newberry, Rodney D.; Osuntokun, Bankole O.; Patel, Ashish S.; Saeed, Shehzad A.; Targan, Stephan R.; Valentine, John F.; Wang, Ming-Hsi; Zonca, Martin; Rioux, John D.; Duerr, Richard H.; Silverberg, Mark S.; Cho, Judy H.; Hakonarson, Hakon; Zwick, Michael E.; McGovern, Dermot P.B.; Kugathasan, Subra

    2016-01-01

    Background & Aims: The inflammatory bowel diseases (IBD) ulcerative colitis (UC) and Crohn’s disease (CD) cause significant morbidity and are increasing in prevalence among all populations, including African Americans. More than 200 susceptibility loci have been identified in populations of predominantly European ancestry, but few loci have been associated with IBD in other ethnicities. Methods: We performed 2 high-density, genome-wide scans comprising 2345 cases of African Americans with IBD (1646 with CD, 583 with UC, and 116 inflammatory bowel disease unclassified [IBD-U]) and 5002 individuals without IBD (controls, identified from the Health Retirement Study and Kaiser Permanente database). Single-nucleotide polymorphisms (SNPs) associated at P<5.0×10^-8 in meta-analysis with nominal evidence (P<.05) in each scan were considered to have genome-wide significance. Results: We detected SNPs at HLA-DRB1, and African-specific SNPs at ZNF649 and LSAMP, with associations of genome-wide significance for UC. We detected SNPs at USP25 with associations of genome-wide significance for IBD. No associations of genome-wide significance were detected for CD. In addition, 9 genes previously associated with IBD contained SNPs with significant evidence for replication (P<1.6×10^-6): ADCY3, CXCR6, HLA-DRB1 to HLA-DQA1 (genome-wide significance on conditioning), IL12B, PTGER4, and TNC for IBD; IL23R, PTGER4, and SNX20 (in strong linkage disequilibrium with NOD2) for CD; and KCNQ2 (near TNFRSF6B) for UC. Several of these genes, such as TNC (near TNFSF15), CXCR6, and genes associated with IBD at the HLA locus, contained SNPs with unique association patterns with African-specific alleles. Conclusions: We performed a genome-wide association study of African Americans with IBD and identified loci associated with CD and UC in only this population; we also replicated loci identified in European populations. The detection of variants associated with IBD risk in only

  8. Genome-Wide Association Study Identifies African-Specific Susceptibility Loci in African Americans With Inflammatory Bowel Disease.

    PubMed

    Brant, Steven R; Okou, David T; Simpson, Claire L; Cutler, David J; Haritunians, Talin; Bradfield, Jonathan P; Chopra, Pankaj; Prince, Jarod; Begum, Ferdouse; Kumar, Archana; Huang, Chengrui; Venkateswaran, Suresh; Datta, Lisa W; Wei, Zhi; Thomas, Kelly; Herrinton, Lisa J; Klapproth, Jan-Micheal A; Quiros, Antonio J; Seminerio, Jenifer; Liu, Zhenqiu; Alexander, Jonathan S; Baldassano, Robert N; Dudley-Brown, Sharon; Cross, Raymond K; Dassopoulos, Themistocles; Denson, Lee A; Dhere, Tanvi A; Dryden, Gerald W; Hanson, John S; Hou, Jason K; Hussain, Sunny Z; Hyams, Jeffrey S; Isaacs, Kim L; Kader, Howard; Kappelman, Michael D; Katz, Jeffry; Kellermayer, Richard; Kirschner, Barbara S; Kuemmerle, John F; Kwon, John H; Lazarev, Mark; Li, Ellen; Mack, David; Mannon, Peter; Moulton, Dedrick E; Newberry, Rodney D; Osuntokun, Bankole O; Patel, Ashish S; Saeed, Shehzad A; Targan, Stephan R; Valentine, John F; Wang, Ming-Hsi; Zonca, Martin; Rioux, John D; Duerr, Richard H; Silverberg, Mark S; Cho, Judy H; Hakonarson, Hakon; Zwick, Michael E; McGovern, Dermot P B; Kugathasan, Subra

    2017-01-01

    The inflammatory bowel diseases (IBD) ulcerative colitis (UC) and Crohn's disease (CD) cause significant morbidity and are increasing in prevalence among all populations, including African Americans. More than 200 susceptibility loci have been identified in populations of predominantly European ancestry, but few loci have been associated with IBD in other ethnicities. We performed 2 high-density, genome-wide scans comprising 2345 cases of African Americans with IBD (1646 with CD, 583 with UC, and 116 inflammatory bowel disease unclassified) and 5002 individuals without IBD (controls, identified from the Health Retirement Study and Kaiser Permanente database). Single-nucleotide polymorphisms (SNPs) associated at P < 5.0 × 10^-8 in meta-analysis with nominal evidence (P < .05) in each scan were considered to have genome-wide significance. We detected SNPs at HLA-DRB1, and African-specific SNPs at ZNF649 and LSAMP, with associations of genome-wide significance for UC. We detected SNPs at USP25 with associations of genome-wide significance for IBD. No associations of genome-wide significance were detected for CD. In addition, 9 genes previously associated with IBD contained SNPs with significant evidence for replication (P < 1.6 × 10^-6): ADCY3, CXCR6, HLA-DRB1 to HLA-DQA1 (genome-wide significance on conditioning), IL12B, PTGER4, and TNC for IBD; IL23R, PTGER4, and SNX20 (in strong linkage disequilibrium with NOD2) for CD; and KCNQ2 (near TNFRSF6B) for UC. Several of these genes, such as TNC (near TNFSF15), CXCR6, and genes associated with IBD at the HLA locus, contained SNPs with unique association patterns with African-specific alleles. We performed a genome-wide association study of African Americans with IBD and identified loci associated with UC in only this population; we also replicated IBD, CD, and UC loci identified in European populations. The detection of variants associated with IBD risk in only people of African descent demonstrates the

  9. Genome-Wide Identification and Characterization of WRKY Gene Family in Peanut.

    PubMed

    Song, Hui; Wang, Pengfei; Lin, Jer-Young; Zhao, Chuanzhi; Bi, Yuping; Wang, Xingjun

    2016-01-01

    WRKY, an important transcription factor family, is widely distributed in the plant kingdom. Many reports focused on analysis of phylogenetic relationship and biological function of WRKY protein at the whole genome level in different plant species. However, little is known about WRKY proteins in the genome of Arachis species and their response to salicylic acid (SA) and jasmonic acid (JA) treatment. In this study, we identified 77 and 75 WRKY proteins from the two wild ancestral diploid genomes of cultivated tetraploid peanut, Arachis duranensis and Arachis ipaënsis, using bioinformatics approaches. Most peanut WRKY coding genes were located on A. duranensis chromosome A6 and A. ipaënsis chromosome B3, while the least number of WRKY genes was found in chromosome 9. The WRKY orthologous gene pairs in A. duranensis and A. ipaënsis chromosomes were highly syntenic. Our analysis indicated that segmental duplication events played a major role in AdWRKY and AiWRKY genes, and strong purifying selection was observed in gene duplication pairs. Furthermore, we translate the knowledge gained from the genome-wide analysis result of wild ancestral peanut to cultivated peanut to reveal that gene activities of specific cultivated peanut WRKY gene were changed due to SA and JA treatment. Peanut WRKY7, 8 and 13 genes were down-regulated, whereas WRKY1 and 12 genes were up-regulated with SA and JA treatment. These results could provide valuable information for peanut improvement.

  10. Genome-Wide Identification and Characterization of WRKY Gene Family in Peanut

    PubMed Central

    Song, Hui; Wang, Pengfei; Lin, Jer-Young; Zhao, Chuanzhi; Bi, Yuping; Wang, Xingjun

    2016-01-01

    WRKY, an important transcription factor family, is widely distributed in the plant kingdom. Many reports focused on analysis of phylogenetic relationship and biological function of WRKY protein at the whole genome level in different plant species. However, little is known about WRKY proteins in the genome of Arachis species and their response to salicylic acid (SA) and jasmonic acid (JA) treatment. In this study, we identified 77 and 75 WRKY proteins from the two wild ancestral diploid genomes of cultivated tetraploid peanut, Arachis duranensis and Arachis ipaënsis, using bioinformatics approaches. Most peanut WRKY coding genes were located on A. duranensis chromosome A6 and A. ipaënsis chromosome B3, while the least number of WRKY genes was found in chromosome 9. The WRKY orthologous gene pairs in A. duranensis and A. ipaënsis chromosomes were highly syntenic. Our analysis indicated that segmental duplication events played a major role in AdWRKY and AiWRKY genes, and strong purifying selection was observed in gene duplication pairs. Furthermore, we translate the knowledge gained from the genome-wide analysis result of wild ancestral peanut to cultivated peanut to reveal that gene activities of specific cultivated peanut WRKY gene were changed due to SA and JA treatment. Peanut WRKY7, 8 and 13 genes were down-regulated, whereas WRKY1 and 12 genes were up-regulated with SA and JA treatment. These results could provide valuable information for peanut improvement. PMID:27200012

  11. Cell Context Dependent p53 Genome-Wide Binding Patterns and Enrichment at Repeats

    DOE PAGES

    Botcheva, Krassimira; McCorkle, Sean R.

    2014-11-21

    The p53 ability to elicit stress specific and cell type specific responses is well recognized, but how that specificity is established remains to be defined. Whether upon activation p53 binds to its genomic targets in a cell type and stress type dependent manner is still an open question. Here we show that the p53 binding to the human genome is selective and cell context-dependent. We mapped the genomic binding sites for the endogenous wild type p53 protein in the human cancer cell line HCT116 and compared them to those we previously determined in the normal cell line IMR90. We report distinct p53 genome-wide binding landscapes in two different cell lines, analyzed under the same treatment and experimental conditions, using the same ChIP-seq approach. This is evidence for cell context dependent p53 genomic binding. The observed differences affect the p53 binding sites distribution with respect to major genomic and epigenomic elements (promoter regions, CpG islands and repeats). We correlated the high-confidence p53 ChIP-seq peaks positions with the annotated human repeats (UCSC Human Genome Browser) and observed both common and cell line specific trends. In HCT116, the p53 binding was specifically enriched at LINE repeats, compared to IMR90 cells. The p53 genome-wide binding patterns in HCT116 and IMR90 likely reflect the different epigenetic landscapes in these two cell lines, resulting from cancer-associated changes (accumulated in HCT116) superimposed on tissue specific differences (HCT116 has epithelial, while IMR90 has mesenchymal origin). In conclusion, our data support the model for p53 binding to the human genome in a highly selective manner, mobilizing distinct sets of genes, contributing to distinct pathways.

  12. Distribution of triclosan-resistant genes in major pathogenic microorganisms revealed by metagenome and genome-wide analysis

    PubMed Central

    Khan, Raees; Roy, Nazish; Choi, Kihyuck

    2018-01-01

    The substantial use of triclosan (TCS) has been aimed to kill pathogenic bacteria, but TCS resistance seems to be prevalent in microbial species and limited knowledge exists about TCS resistance determinants in a majority of pathogenic bacteria. We aimed to evaluate the distribution of TCS resistance determinants in major pathogenic bacteria (N = 231) and to assess the enrichment of potentially pathogenic genera in TCS contaminated environments. A TCS-resistant gene (TRG) database was constructed and experimentally validated to predict TCS resistance in major pathogenic bacteria. Genome-wide in silico analysis was performed to define the distribution of TCS-resistant determinants in major pathogens. Microbiome analysis of TCS contaminated soil samples was also performed to investigate the abundance of TCS-resistant pathogens. We experimentally confirmed that TCS resistance could be accurately predicted using genome-wide in silico analysis against TRG database. Predicted TCS resistant phenotypes were observed in all of the tested bacterial strains (N = 17), and heterologous expression of selected TCS resistant genes from those strains conferred expected levels of TCS resistance in an alternative host Escherichia coli. Moreover, genome-wide analysis revealed that potential TCS resistance determinants were abundant among the majority of human-associated pathogens (79%) and soil-borne plant pathogenic bacteria (98%). These included a variety of enoyl-acyl carrier protein reductase (ENRs) homologues, AcrB efflux pumps, and ENR substitutions. FabI ENR, which is the only known effective target for TCS, was either co-localized with other TCS resistance determinants or had TCS resistance-associated substitutions. Furthermore, microbiome analysis revealed that pathogenic genera with intrinsic TCS-resistant determinants exist in TCS contaminated environments. We conclude that TCS may not be as effective against the majority of bacterial pathogens as previously presumed

  13. Genome-wide analyses of LINE–LINE-mediated nonallelic homologous recombination

    PubMed Central

    Startek, Michał; Szafranski, Przemyslaw; Gambin, Tomasz; Campbell, Ian M.; Hixson, Patricia; Shaw, Chad A.; Stankiewicz, Paweł; Gambin, Anna

    2015-01-01

    Nonallelic homologous recombination (NAHR), occurring between low-copy repeats (LCRs) >10 kb in size and sharing >97% DNA sequence identity, is responsible for the majority of recurrent genomic rearrangements in the human genome. Recent studies have shown that transposable elements (TEs) can also mediate recurrent deletions and translocations, indicating the features of substrates that mediate NAHR may be significantly less stringent than previously believed. Using >4 kb length and >95% sequence identity criteria, we analyzed the genome-wide distribution of long interspersed element (LINE) retrotransposons and their potential to mediate NAHR. We identified 17 005 directly oriented LINE pairs located <10 Mbp from each other as potential NAHR substrates, placing 82.8% of the human genome at risk of LINE–LINE-mediated instability. Cross-referencing these regions with CNVs in the Baylor College of Medicine clinical chromosomal microarray database of 36 285 patients, we identified 516 CNVs potentially mediated by LINEs. Using long-range PCR of five different genomic regions in a total of 44 patients, we confirmed that the CNV breakpoints in each patient map within the LINE elements. To additionally assess the scale of LINE–LINE/NAHR phenomenon in the human genome, we tested DNA samples from six healthy individuals on a custom aCGH microarray targeting LINE elements predicted to mediate CNVs and identified 25 LINE–LINE rearrangements. Our data indicate that LINE–LINE-mediated NAHR is widespread and under-recognized, and is an important mechanism of structural rearrangement contributing to human genomic variability. PMID:25613453
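
    The pair-selection criteria quoted in this abstract (>4 kb length, >95% sequence identity, same orientation, <10 Mbp apart) can be illustrated with a minimal sketch. The record type, function name, the way distance is measured (gap between the end of one element and the start of the next), and the caller-supplied `identity` function are all assumptions made for illustration; the abstract does not describe the authors' pipeline, and a genome-scale run would need something faster than this all-pairs loop.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Line:
    """A hypothetical record for one annotated LINE element."""
    chrom: str
    start: int
    end: int
    strand: str


def candidate_pairs(elements, identity, min_len=4_000,
                    min_identity=0.95, max_dist=10_000_000):
    """Return LINE pairs that meet the thresholds quoted in the abstract.

    `identity(a, b)` is a caller-supplied function returning the
    fractional sequence identity of two elements (an assumption; it
    would normally come from a pairwise alignment).
    """
    # Keep only elements longer than the length threshold.
    long_enough = [e for e in elements if e.end - e.start > min_len]
    ordered = sorted(long_enough, key=lambda e: (e.chrom, e.start))
    pairs = []
    for a, b in combinations(ordered, 2):
        # Directly oriented pairs: same chromosome and same strand.
        if a.chrom != b.chrom or a.strand != b.strand:
            continue
        # Assumed distance measure: gap from end of a to start of b.
        if b.start - a.end >= max_dist:
            continue
        if identity(a, b) > min_identity:
            pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    # Hypothetical elements and a constant identity, for demonstration.
    demo = [
        Line("chr1", 0, 5_000, "+"),
        Line("chr1", 1_000_000, 1_006_000, "+"),
        Line("chr1", 2_000_000, 2_005_000, "-"),
    ]
    print(candidate_pairs(demo, identity=lambda a, b: 0.97))
```

    Passing identity in as a function keeps the filtering logic separate from whatever alignment tool supplies the identity values.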

  14. Thermodynamically optimal whole-genome tiling microarray design and validation.

    PubMed

    Cho, Hyejin; Chou, Hui-Hsien

    2016-06-13

    Microarrays are an efficient tool for interrogating the whole transcriptome of a species. A microarray can be designed according to annotated gene sets, but the resulting microarrays cannot be used to identify novel transcripts, and this design method is not applicable to unannotated species. Alternatively, a whole-genome tiling microarray can be designed using only genomic sequences without gene annotations, and it can be used to detect novel RNA transcripts as well as known genes. The difficulty with tiling microarray design lies in the tradeoff between probe-specificity and coverage of the genome. Sequence comparison methods based on BLAST or similar software are commonly employed in microarray design, but they cannot precisely determine the subtle thermodynamic competition between probe targets and partially matched probe nontargets during hybridizations. Using the whole-genome thermodynamic analysis software PICKY to design tiling microarrays, we can achieve maximum whole-genome coverage allowable under the thermodynamic constraints of each target genome. The resulting tiling microarrays are thermodynamically optimal in the sense that all selected probes share the same melting temperature separation range between their targets and closest nontargets, and no additional probes can be added without violating the specificity of the microarray to the target genome. This new design method was used to create two whole-genome tiling microarrays for Escherichia coli MG1655 and Agrobacterium tumefaciens C58, and the experimental results validated the design.

  15. Genome-wide divergence, haplotype distribution and population demographic histories for Gossypium hirsutum and Gossypium barbadense as revealed by genome-anchored SNPs

    PubMed Central

    Reddy, Umesh K.; Nimmakayala, Padma; Abburi, Venkata Lakshmi; Reddy, C. V. C. M.; Saminathan, Thangasamy; Percy, Richard G.; Yu, John Z.; Frelichowski, James; Udall, Joshua A.; Page, Justin T.; Zhang, Dong; Shehzad, Tariq; Paterson, Andrew H.

    2017-01-01

    Use of 10,129 singleton SNPs of known genomic location in tetraploid cotton provided unique opportunities to characterize genome-wide diversity among 440 Gossypium hirsutum and 219 G. barbadense cultivars and landrace accessions of widespread origin. Using the SNPs distributed genome-wide, we examined genetic diversity, haplotype distribution and linkage disequilibrium patterns in the G. hirsutum and G. barbadense genomes to clarify population demographic history. Diversity and identity-by-state analyses have revealed little sharing of alleles between the two cultivated allotetraploid genomes, with a few exceptions that indicated sporadic gene flow. We found a high number of new alleles, representing increased nucleotide diversity, on chromosomes 1 and 2 in cultivated G. hirsutum as compared with low nucleotide diversity on these chromosomes in landrace G. hirsutum. In contrast, G. barbadense showed negative Tajima’s D on several chromosomes for both cultivated and landrace types, which indicates that speciation of G. barbadense itself might have occurred with relatively narrow genetic diversity. The presence of conserved linkage disequilibrium (LD) blocks and haplotypes between G. hirsutum and G. barbadense provides strong evidence for comparable patterns of evolution in their domestication processes. Our study illustrates the potential use of population genetic techniques to identify genomic regions for domestication. PMID:28128280

  16. Characterizing Protein Interactions Employing a Genome-Wide siRNA Cellular Phenotyping Screen

    PubMed Central

    Suratanee, Apichat; Schaefer, Martin H.; Betts, Matthew J.; Soons, Zita; Mannsperger, Heiko; Harder, Nathalie; Oswald, Marcus; Gipp, Markus; Ramminger, Ellen; Marcus, Guillermo; Männer, Reinhard; Rohr, Karl; Wanker, Erich; Russell, Robert B.; Andrade-Navarro, Miguel A.; Eils, Roland; König, Rainer

    2014-01-01

    Characterizing the activating and inhibiting effect of protein-protein interactions (PPI) is fundamental to gain insight into the complex signaling system of a human cell. A plethora of methods has been suggested to infer PPI from data on a large scale, but none of them is able to characterize the effect of this interaction. Here, we present a novel computational development that employs mitotic phenotypes of a genome-wide RNAi knockdown screen and enables identifying the activating and inhibiting effects of PPIs. Exemplarily, we applied our technique to a knockdown screen of HeLa cells cultivated at standard conditions. Using a machine learning approach, we obtained high accuracy (82% AUC of the receiver operating characteristics) by cross-validation using 6,870 known activating and inhibiting PPIs as gold standard. We predicted de novo unknown activating and inhibiting effects for 1,954 PPIs in HeLa cells covering the ten major signaling pathways of the Kyoto Encyclopedia of Genes and Genomes, and made these predictions publicly available in a database. We finally demonstrate that the predicted effects can be used to cluster knockdown genes of similar biological processes in coherent subgroups. The characterization of the activating or inhibiting effect of individual PPIs opens up new perspectives for the interpretation of large datasets of PPIs and thus considerably increases the value of PPIs as an integrated resource for studying the detailed function of signaling pathways of the cellular system of interest. PMID:25255318

  17. Genome-Wide Association Studies with a Genomic Relationship Matrix: A Case Study with Wheat and Arabidopsis.

    PubMed

    Gianola, Daniel; Fariello, Maria I; Naya, Hugo; Schön, Chris-Carolin

    2016-10-13

    Standard genome-wide association studies (GWAS) scan for relationships between each of p molecular markers and a continuously distributed target trait. Typically, a marker-based matrix of genomic similarities among individuals (G) is constructed, to account more properly for the covariance structure in the linear regression model used. We show that the generalized least-squares estimator of the regression of phenotype on one or on m markers is invariant with respect to whether or not the marker(s) tested is (are) used for building G, provided variance components are unaffected by exclusion of such marker(s) from G. The result is arrived at by using a matrix expression such that one can find many inverses of genomic relationship, or of phenotypic covariance matrices, stemming from removing markers tested as fixed, but carrying out a single inversion. When eigenvectors of the genomic relationship matrix are used as regressors with fixed regression coefficients, e.g., to account for population stratification, their removal from G does matter. Removal of eigenvectors from G can have a noticeable effect on estimates of genomic and residual variances, so caution is needed. Concepts were illustrated using genomic data on 599 wheat inbred lines, with grain yield as target trait, and on close to 200 Arabidopsis thaliana accessions. Copyright © 2016 Gianola et al.

  18. Genome-Wide Variation Patterns Uncover the Origin and Selection in Cultivated Ginseng (Panax ginseng Meyer)

    PubMed Central

    Li, Ming-Rui; Shi, Feng-Xue; Li, Ya-Ling; Jiang, Peng; Jiao, Lili

    2017-01-01

    Chinese ginseng (Panax ginseng Meyer) is a medicinally important herb and plays crucial roles in traditional Chinese medicine. Pharmacological analyses identified diverse bioactive components from Chinese ginseng. However, basic biological attributes including domestication and selection of the ginseng plant remain under-investigated. Here, we presented a genome-wide view of the domestication and selection of cultivated ginseng based on the whole genome data. A total of 8,660 protein-coding genes were selected for genome-wide scanning of the 30 wild and cultivated ginseng accessions. In complement, the 45s rDNA, chloroplast and mitochondrial genomes were included to perform phylogenetic and population genetic analyses. The observed spatial genetic structure between northern cultivated ginseng (NCG) and southern cultivated ginseng (SCG) accessions suggested multiple independent origins of cultivated ginseng. Genome-wide scanning further demonstrated that NCG and SCG have undergone distinct selection pressures during the domestication process, with more genes identified in the NCG (97 genes) than in the SCG group (5 genes). Functional analyses revealed that these genes are involved in diverse pathways, including DNA methylation, lignin biosynthesis, and cell differentiation. These findings suggested that the SCG and NCG groups have distinct demographic histories. Candidate genes identified are useful for future molecular breeding of cultivated ginseng. PMID:28922794

  19. Genome-wide scans of genetic variants for psychophysiological endophenotypes: a methodological overview.

    PubMed

    Iacono, William G; Malone, Stephen M; Vaidyanathan, Uma; Vrieze, Scott I

    2014-12-01

    This article provides an introductory overview of the investigative strategy employed to evaluate the genetic basis of 17 endophenotypes examined as part of a 20-year data collection effort from the Minnesota Center for Twin and Family Research. Included are characterization of the study samples, descriptive statistics for key properties of the psychophysiological measures, and rationale behind the steps taken in the molecular genetic study design. The statistical approach included (a) biometric analysis of twin and family data, (b) heritability analysis using 527,829 single nucleotide polymorphisms (SNPs), (c) genome-wide association analysis of these SNPs and 17,601 autosomal genes, (d) follow-up analyses of candidate SNPs and genes hypothesized to have an association with each endophenotype, (e) rare variant analysis of nonsynonymous SNPs in the exome, and (f) whole genome sequencing association analysis using 27 million genetic variants. These methods were used in the accompanying empirical articles comprising this special issue, Genome-Wide Scans of Genetic Variants for Psychophysiological Endophenotypes. Copyright © 2014 Society for Psychophysiological Research.

  20. Genome-wide significant locus for Research Diagnostic Criteria Schizoaffective Disorder Bipolar type.

    PubMed

    Green, Elaine K; Di Florio, Arianna; Forty, Liz; Gordon-Smith, Katherine; Grozeva, Detelina; Fraser, Christine; Richards, Alexander L; Moran, Jennifer L; Purcell, Shaun; Sklar, Pamela; Kirov, George; Owen, Michael J; O'Donovan, Michael C; Craddock, Nick; Jones, Lisa; Jones, Ian R

    2017-12-01

    Studies have suggested that Research Diagnostic Criteria for Schizoaffective Disorder Bipolar type (RDC-SABP) might identify a more genetically homogenous subgroup of bipolar disorder. Aiming to identify loci associated with RDC-SABP, we have performed a replication study using independent RDC-SABP cases (n = 144) and controls (n = 6,559), focusing on the 10 loci that reached a p-value <10^-5 for RDC-SABP in the Wellcome Trust Case Control Consortium (WTCCC) bipolar disorder sample. Combining the WTCCC and replication datasets by meta-analysis (combined RDC-SABP, n = 423, controls, n = 9,494), we observed genome-wide significant association at one SNP, rs2352974, located within the intron of the gene TRAIP on chromosome 3p21.31 (p-value, 4.37 × 10^-8). This locus did not reach genome-wide significance in bipolar disorder or schizophrenia large Psychiatric Genomic Consortium datasets, suggesting that it may represent a relatively specific genetic risk for the bipolar subtype of schizoaffective disorder. © 2017 Wiley Periodicals, Inc.

  1. Genome-wide identification, functional and evolutionary analysis of terpene synthases in pineapple.

    PubMed

    Chen, Xiaoe; Yang, Wei; Zhang, Liqin; Wu, Xianmiao; Cheng, Tian; Li, Guanglin

    2017-10-01

    Terpene synthases (TPSs) are vital for the biosynthesis of active terpenoids, which have important physiological, ecological and medicinal value. Although terpenoids have been reported in pineapple (Ananas comosus), genome-wide investigations of the TPS genes responsible for pineapple terpenoid synthesis are still lacking. By integrating pineapple genome and proteome data, twenty-one putative terpene synthase genes were found in pineapple and divided into five subfamilies. Tandem duplication is the cause of TPS gene family duplication. Furthermore, functional differentiation between each TPS subfamily may have occurred for several reasons. Sixty-two key amino acid sites were identified as type-II functionally divergent between the TPS-a and TPS-c subfamilies. Finally, coevolution analysis indicated that multiple amino acid residues are involved in coevolutionary processes. In addition, the enzyme activity of two TPSs was tested. This genome-wide identification, functional and evolutionary analysis of pineapple TPS genes provides new insight into the roles of the TPS family and lays the basis for further characterizing the function and evolution of the TPS gene family. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. HITS-CLIP yields genome-wide insights into brain alternative RNA processing

    NASA Astrophysics Data System (ADS)

    Licatalosi, Donny D.; Mele, Aldo; Fak, John J.; Ule, Jernej; Kayikci, Melis; Chi, Sung Wook; Clark, Tyson A.; Schweitzer, Anthony C.; Blume, John E.; Wang, Xuning; Darnell, Jennifer C.; Darnell, Robert B.

    2008-11-01

    Protein-RNA interactions have critical roles in all aspects of gene expression. However, applying biochemical methods to understand such interactions in living tissues has been challenging. Here we develop a genome-wide means of mapping protein-RNA binding sites in vivo, by high-throughput sequencing of RNA isolated by crosslinking immunoprecipitation (HITS-CLIP). HITS-CLIP analysis of the neuron-specific splicing factor Nova revealed extremely reproducible RNA-binding maps in multiple mouse brains. These maps provide genome-wide in vivo biochemical footprints confirming the previous prediction that the position of Nova binding determines the outcome of alternative splicing; moreover, they are sufficiently powerful to predict Nova action de novo. HITS-CLIP revealed a large number of Nova-RNA interactions in 3' untranslated regions, leading to the discovery that Nova regulates alternative polyadenylation in the brain. HITS-CLIP, therefore, provides a robust, unbiased means to identify functional protein-RNA interactions in vivo.

  3. The application of genome-wide 5-hydroxymethylcytosine studies in cancer research.

    PubMed

    Thomson, John P; Meehan, Richard R

    2017-01-01

    Early detection and characterization of molecular events associated with tumorigenesis remain high priorities. Genome-wide epigenetic assays are promising diagnostic tools, as aberrant epigenetic events are frequent and often cancer specific. The deposition and analysis of multiple patient-derived cancer epigenomic profiles contributes to our appreciation of the underlying biology, aiding the detection of novel identifiers for cancer subtypes. Modifying enzymes and co-factors regulating these epigenetic marks are frequently mutated in cancers, and as epigenetic modifications themselves are reversible, this makes their study very attractive with respect to pharmaceutical intervention. Here we focus on the novel modified base, 5-hydroxymethylcytosine, and discuss how genome-wide 5-hydroxymethylcytosine profiling expedites our molecular understanding of cancer, serves as a lineage tracer, classifies the mode of action of potentially carcinogenic agents and clarifies the roles of potential novel cancer drug targets, thus assisting the development of new diagnostic/prognostic tools.

  4. Genome-Wide Association Study Identifies Novel Loci Associated With Diisocyanate-Induced Occupational Asthma

    PubMed Central

    Yucesoy, Berran; Kaufman, Kenneth M.; Lummus, Zana L.; Weirauch, Matthew T.; Zhang, Ge; Cartier, André; Boulet, Louis-Philippe; Sastre, Joaquin; Quirce, Santiago; Tarlo, Susan M.; Cruz, Maria-Jesus; Munoz, Xavier; Harley, John B.; Bernstein, David I.

    2015-01-01

    Diisocyanates, reactive chemicals used to produce polyurethane products, are the most common causes of occupational asthma. The aim of this study is to identify susceptibility gene variants that could contribute to the pathogenesis of diisocyanate asthma (DA) using a Genome-Wide Association Study (GWAS) approach. Genome-wide single nucleotide polymorphism (SNP) genotyping was performed in 74 diisocyanate-exposed workers with DA and 824 healthy controls using Omni-2.5 and Omni-5 SNP microarrays. We identified 11 SNPs that exceeded genome-wide significance; the strongest association was for the rs12913832 SNP located on chromosome 15, which has been mapped to the HERC2 gene (p = 6.94 × 10^-14). Strong associations were also found for SNPs near the ODZ3 and CDH17 genes on chromosomes 4 and 8 (rs908084, p = 8.59 × 10^-9 and rs2514805, p = 1.22 × 10^-8, respectively). We also prioritized 38 SNPs with suggestive genome-wide significance (p < 1 × 10^-6). Among them, 17 SNPs map to the PITPNC1, ACMSD, ZBTB16, ODZ3, and CDH17 gene loci. Functional genomics data indicate that 2 of the suggestive SNPs (rs2446823 and rs2446824) are located within putative binding sites for the CCAAT/Enhancer Binding Protein (CEBP) and Hepatocyte Nuclear Factor 4, Alpha transcription factors (TFs), respectively. This study identified SNPs mapping to the HERC2, CDH17, and ODZ3 genes as potential susceptibility loci for DA. Pathway analysis indicated that these genes are associated with antigen processing and presentation, and other immune pathways. Overlap of 2 suggestive SNPs with likely TF binding sites suggests possible roles in disruption of gene regulation. These results provide new insights into the genetic architecture of DA and serve as a basis for future functional and mechanistic studies. PMID:25918132

  5. A genome-wide association study of copy number variations with umbilical hernia in swine.

    PubMed

    Long, Yi; Su, Ying; Ai, Huashui; Zhang, Zhiyan; Yang, Bin; Ruan, Guorong; Xiao, Shijun; Liao, Xinjun; Ren, Jun; Huang, Lusheng; Ding, Nengshui

    2016-06-01

    Umbilical hernia (UH) is one of the most common congenital defects in pigs, leading to considerable economic loss and serious animal welfare problems. To test whether copy number variations (CNVs) contribute to pig UH, we performed a case-control genome-wide CNV association study on 905 pigs from the Duroc, Landrace and Yorkshire breeds using the Porcine SNP60 BeadChip and the PennCNV algorithm. We first constructed a genomic map comprising 6193 CNVs that pertain to 737 CNV regions. Then, we identified eight CNVs significantly associated with the risk for UH in the three pig breeds. Six of seven significantly associated CNVs were validated using quantitative real-time PCR. Notably, a rare CNV (CNV14:13030843-13059455) encompassing the NUGGC gene was strongly associated with UH (permutation-corrected P = 0.0015) in Duroc pigs. This CNV occurred exclusively in seven Duroc UH-affected individuals. SNPs surrounding the CNV did not show association signals, indicating that rare CNVs may play an important role in complex pig diseases such as UH. The NUGGC gene has been implicated in human omphalocele and inguinal hernia. Our finding supports the view that CNVs, including the NUGGC CNV, contribute to the pathogenesis of pig UH. © 2016 Stichting International Foundation for Animal Genetics.

  6. Signatures of positive selection in East African Shorthorn Zebu: a genome-wide SNP analysis

    USDA-ARS's Scientific Manuscript database

    The small East African Shorthorn Zebu is the main indigenous cattle across East Africa. A recent genome wide SNPs analysis has revealed their ancient stable African taurine x Asian zebu admixture. Here, we assess the presence of candidate signature of positive selection in their genome, with the aim...

  7. Quantitative genome-wide methylation analysis of high-grade non-muscle invasive bladder cancer

    PubMed Central

    Kitchen, Mark O.; Bryan, Richard T.; Emes, Richard D.; Glossop, John R.; Luscombe, Christopher; Cheng, K. K.; Zeegers, Maurice P.; James, Nicholas D.; Devall, Adam J.; Mein, Charles A.; Gommersall, Lyndon; Fryer, Anthony A.; Farrell, William E.

    2016-01-01

    High-grade non-muscle invasive bladder cancer (HG-NMIBC) is a clinically unpredictable disease with greater risks of recurrence and progression relative to their low-intermediate-grade counterparts. The molecular events, including those affecting the epigenome, that characterize this disease entity in the context of tumor development, recurrence, and progression, are incompletely understood. We therefore interrogated genome-wide DNA methylation using HumanMethylation450 BeadChip arrays in 21 primary HG-NMIBC tumors relative to normal bladder controls. Using strict inclusion-exclusion criteria we identified 1,057 hypermethylated CpGs within gene promoter-associated CpG islands, representing 256 genes. We validated the array data by bisulphite pyrosequencing and examined 25 array-identified candidate genes in an independent cohort of 30 HG-NMIBC and 18 low-intermediate-grade NMIBC. These analyses revealed significantly higher methylation frequencies in high-grade tumors relative to low-intermediate-grade tumors for the ATP5G2, IRX1 and VAX2 genes (P<0.05), and similarly significant increases in mean levels of methylation in high-grade tumors for the ATP5G2, VAX2, INSRR, PRDM14, VSX1, TFAP2b, PRRX1, and HIST1H4F genes (P<0.05). Although inappropriate promoter methylation was not invariantly associated with reduced transcript expression, a significant association was apparent for the ARHGEF4, PON3, STAT5a, and VAX2 gene transcripts (P<0.05). Herein, we present the first genome-wide DNA methylation analysis in a unique HG-NMIBC cohort, showing extensive and discrete methylation changes relative to normal bladder and low-intermediate-grade tumors. The genes we identified hold significant potential as targets for novel therapeutic intervention either alone, or in combination, with more conventional therapeutic options in the treatment of this clinically unpredictable disease. PMID:26929985

  8. Genome-wide association analysis of age-at-onset in Alzheimer’s disease

    PubMed Central

    Kamboh, M. Ilyas; Barmada, M. Michael; Demirci, F. Yesim; Minster, Ryan L.; Carrasquillo, Minerva M.; Pankratz, V. Shane; Younkin, Steven G.; Saykin, Andrew J.; Sweet, Robert A.; Feingold, Eleanor; DeKosky, Steven T.; Lopez, Oscar L.

    2011-01-01

    The risk of Alzheimer’s disease (AD) is strongly determined by genetic factors, and recent genome-wide association studies (GWAS) have identified several genes for the disease risk. In addition to the disease risk, age-at-onset (AAO) of AD also has a strong genetic component, with an estimated heritability of 42%. Identification of AAO genes may help to understand the biological mechanisms that regulate the onset of the disease. Here we report the first GWAS focused on identifying genes for the AAO of AD. We performed a genome-wide meta-analysis on 3 samples comprising a total of 2,222 AD cases. A total of ~2.5 million directly genotyped or imputed SNPs were analyzed in relation to AAO of AD. As expected, the most significant associations were observed in the APOE region on chromosome 19, where several SNPs surpassed the conservative genome-wide significance threshold (P<5E-08). The most significant SNP outside the APOE region was located in the DCHS2 gene on chromosome 4q31.3 (rs1466662; P=4.95E-07). There were 19 additional significant SNPs in this region at P<1E-04, and the DCHS2 gene is expressed in the cerebral cortex and thus is a potential candidate for affecting AAO in AD. These findings need to be confirmed in additional well-powered samples. PMID:22005931
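
    The two cut-offs quoted in this abstract (P<5E-08 for genome-wide significance and P<1E-04 for the additional SNPs in the DCHS2 region) can be illustrated with a minimal sketch. The function name and the tier labels are assumptions chosen for this example; they are not taken from the study.

```python
GENOME_WIDE = 5e-8  # genome-wide significance threshold quoted in the abstract
SUGGESTIVE = 1e-4   # looser cut-off the abstract applies within the DCHS2 region

def classify_p(p):
    """Return an illustrative tier label for a single association P value."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("P value must lie in [0, 1]")
    if p < GENOME_WIDE:
        return "genome-wide"
    if p < SUGGESTIVE:
        return "suggestive"
    return "not significant"

# rs1466662 is the DCHS2 SNP reported above with P = 4.95E-07.
print(classify_p(4.95e-07))  # suggestive
```

    Under these labels the top DCHS2 SNP falls short of the genome-wide threshold, which is consistent with the abstract's call for confirmation in additional samples.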

  9. Genome-Wide Identification of Regulatory Sequences Undergoing Accelerated Evolution in the Human Genome

    PubMed Central

    Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong

    2016-01-01

    Accelerated evolution of regulatory sequences can alter the expression pattern of target genes and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidence strongly suggest that accelerated evolution of regulatory sequences plays an important role in the evolution of human-specific phenotypes. PMID:27401230

  10. From conservation genetics to conservation genomics: a genome-wide assessment of blue whales (Balaenoptera musculus) in Australian feeding aggregations

    PubMed Central

    Sandoval-Castillo, Jonathan; Jenner, K. Curt S.; Gill, Peter C.; Jenner, Micheline-Nicole M.; Morrice, Margaret G.

    2018-01-01

    Genetic datasets of tens of markers have been superseded through next-generation sequencing technology with genome-wide datasets of thousands of markers. Genomic datasets improve our power to detect low population structure and identify adaptive divergence. The increased population-level knowledge can inform the conservation management of endangered species, such as the blue whale (Balaenoptera musculus). In Australia, there are two known feeding aggregations of the pygmy blue whale (B. m. brevicauda) which have shown no evidence of genetic structure based on a small dataset of 10 microsatellites and mtDNA. Here, we develop and implement a high-resolution dataset of 8294 genome-wide filtered single nucleotide polymorphisms, the first of its kind for blue whales. We use these data to assess whether the Australian feeding aggregations constitute one population and to test for the first time whether there is adaptive divergence between the feeding aggregations. We found no evidence of neutral population structure and negligible evidence of adaptive divergence. We propose that individuals likely travel widely between feeding areas and to breeding areas, which would require them to be adapted to a wide range of environmental conditions. This has important implications for their conservation as this blue whale population is likely vulnerable to a range of anthropogenic threats both off Australia and elsewhere. PMID:29410806

  11. Genome-wide variation within and between wild and domestic yak.

    PubMed

    Wang, Kun; Hu, Quanjun; Ma, Hui; Wang, Lizhong; Yang, Yongzhi; Luo, Wenchun; Qiu, Qiang

    2014-07-01

    The yak is one of the few animals that can thrive in the harsh environment of the Qinghai-Tibetan Plateau and adjacent Alpine regions. Yak provides essential resources allowing Tibetans to live at high altitudes. However, genetic variation within and between wild and domestic yak remains unknown. Here, we present a genome-wide study of the genetic variation within and between wild and domestic yak. Using next-generation sequencing technology, we resequenced three wild and three domestic yak with a mean of fivefold coverage using our published domestic yak genome as a reference. We identified a total of 8.38 million SNPs (7.14 million novel), 383,241 InDels and 126,352 structural variants between the six yak. We observed higher linkage disequilibrium in domestic yak than in wild yak and a modest but distinct genetic divergence between these two groups. We further identified more than a thousand potential selected regions (PSRs) for the three domestic yak by scanning the whole genome. These genomic resources can be further used to study genetic diversity and select superior breeds of yak and other bovid species. © 2014 John Wiley & Sons Ltd.

  12. Meta-Analysis of Genome-Wide Association Studies for Abdominal Aortic Aneurysm Identifies Four New Disease-Specific Risk Loci

    PubMed Central

    Tromp, Gerard; Kuivaniemi, Helena; Gretarsdottir, Solveig; Baas, Annette F.; Giusti, Betti; Strauss, Ewa; van 't Hof, Femke N.G.; Webb, Thomas R.; Erdman, Robert; Ritchie, Marylyn D.; Elmore, James R.; Verma, Anurag; Pendergrass, Sarah; Kullo, Iftikhar J.; Ye, Zi; Peissig, Peggy L.; Gottesman, Omri; Verma, Shefali S.; Malinowski, Jennifer; Rasmussen-Torvik, Laura J.; Borthwick, Kenneth M.; Smelser, Diane T.; Crosslin, David R.; de Andrade, Mariza; Ryer, Evan J.; McCarty, Catherine A.; Böttinger, Erwin P.; Pacheco, Jennifer A.; Crawford, Dana C.; Carrell, David S.; Gerhard, Glenn S.; Franklin, David P.; Carey, David J.; Phillips, Victoria L.; Williams, Michael J.A.; Wei, Wenhua; Blair, Ross; Hill, Andrew A.; Vasudevan, Thodor M.; Lewis, David R.; Thomson, Ian A.; Krysa, Jo; Hill, Geraldine B.; Roake, Justin; Merriman, Tony R.; Oszkinis, Grzegorz; Galora, Silvia; Saracini, Claudia; Abbate, Rosanna; Pulli, Raffaele; Pratesi, Carlo; Saratzis, Athanasios; Verissimo, Ana R.; Bumpstead, Suzannah; Badger, Stephen A.; Clough, Rachel E.; Cockerill, Gillian; Hafez, Hany; Scott, D. Julian A.; Futers, T. Simon; Romaine, Simon P.R.; Bridge, Katherine; Griffin, Kathryn J.; Bailey, Marc A.; Smith, Alberto; Thompson, Matthew M.; van Bockxmeer, Frank M.; Matthiasson, Stefan E.; Thorleifsson, Gudmar; Thorsteinsdottir, Unnur; Blankensteijn, Jan D.; Teijink, Joep A.W.; Wijmenga, Cisca; de Graaf, Jacqueline; Kiemeney, Lambertus A.; Lindholt, Jes S.; Hughes, Anne; Bradley, Declan T.; Stirrups, Kathleen; Golledge, Jonathan; Norman, Paul E.; Powell, Janet T.; Humphries, Steve E.; Hamby, Stephen E.; Goodall, Alison H.; Nelson, Christopher P.; Sakalihasan, Natzi; Courtois, Audrey; Ferrell, Robert E.; Eriksson, Per; Folkersen, Lasse; Franco-Cereceda, Anders; Eicher, John D.; Johnson, Andrew D.; Betsholtz, Christer; Ruusalepp, Arno; Franzén, Oscar; Schadt, Eric E.; Björkegren, Johan L.M.; Lipovich, Leonard; Drolet, Anne M.; Verhoeven, Eric L.; Zeebregts, Clark J.; Geelkerken, Robert H.; van Sambeek, Marc R.; van Sterkenburg, Steven M.; de Vries, Jean-Paul; Stefansson, Kari; Thompson, John R.; de Bakker, Paul I.W.; Deloukas, Panos; Sayers, Robert D.; Harrison, Seamus C.; van Rij, Andre M.; Samani, Nilesh J.

    2017-01-01

    Rationale: Abdominal aortic aneurysm (AAA) is a complex disease with both genetic and environmental risk factors. Together, 6 previously identified risk loci only explain a small proportion of the heritability of AAA. Objective: To identify additional AAA risk loci using data from all available genome-wide association studies. Methods and Results: Through a meta-analysis of 6 genome-wide association study data sets and a validation study totaling 10 204 cases and 107 766 controls, we identified 4 new AAA risk loci: 1q32.3 (SMYD2), 13q12.11 (LINC00540), 20q13.12 (near PCIF1/MMP9/ZNF335), and 21q22.2 (ERG). In various database searches, we observed no new associations between the lead AAA single nucleotide polymorphisms and coronary artery disease, blood pressure, lipids, or diabetes mellitus. Network analyses identified ERG, IL6R, and LDLR as modifiers of MMP9, with a direct interaction between ERG and MMP9. Conclusions: The 4 new risk loci for AAA seem to be specific for AAA compared with other cardiovascular diseases and related traits suggesting that traditional cardiovascular risk factor management may only have limited value in preventing the progression of aneurysmal disease. PMID:27899403

  13. Genome-Wide Association of the Laboratory-Based Nicotine Metabolite Ratio in Three Ancestries.

    PubMed

    Baurley, James W; Edlund, Christopher K; Pardamean, Carissa I; Conti, David V; Krasnow, Ruth; Javitz, Harold S; Hops, Hyman; Swan, Gary E; Benowitz, Neal L; Bergen, Andrew W

    2016-09-01

    Metabolic enzyme variation and other patient and environmental characteristics influence smoking behaviors, treatment success, and risk of related disease. Population-specific variation in metabolic genes contributes to challenges in developing and optimizing pharmacogenetic interventions. We applied a custom genome-wide genotyping array for addiction research (Smokescreen), to three laboratory-based studies of nicotine metabolism with oral or venous administration of labeled nicotine and cotinine, to model nicotine metabolism in multiple populations. The trans-3'-hydroxycotinine/cotinine ratio, the nicotine metabolite ratio (NMR), was the nicotine metabolism measure analyzed. Three hundred twelve individuals of self-identified European, African, and Asian American ancestry were genotyped and included in ancestry-specific genome-wide association scans (GWAS) and a meta-GWAS analysis of the NMR. We modeled natural-log transformed NMR with covariates: principal components of genetic ancestry, age, sex, body mass index, and smoking status. African and Asian American NMRs were statistically significantly (P values ≤ 5E-5) lower than European American NMRs. Meta-GWAS analysis identified 36 genome-wide significant variants over a 43 kilobase pair region at CYP2A6 with minimum P = 2.46E-18 at rs12459249, proximal to CYP2A6. Additional minima were located in intron 4 (rs56113850, P = 6.61E-18) and in the CYP2A6-CYP2A7 intergenic region (rs34226463, P = 1.45E-12). Most (34/36) genome-wide significant variants suggested reduced CYP2A6 activity; functional mechanisms were identified and tested in knowledge-bases. Conditional analysis resulted in intergenic variants of possible interest (P values < 5E-5). This meta-GWAS of the NMR identifies CYP2A6 variants, replicates the top-ranked single nucleotide polymorphism from a recent Finnish meta-GWAS of the NMR, identifies functional mechanisms, and provides pan-continental population biomarkers for nicotine metabolism. This

  14. A mega-analysis of genome-wide association studies for major depressive disorder.

    PubMed

    Ripke, Stephan; Wray, Naomi R; Lewis, Cathryn M; Hamilton, Steven P; Weissman, Myrna M; Breen, Gerome; Byrne, Enda M; Blackwood, Douglas H R; Boomsma, Dorret I; Cichon, Sven; Heath, Andrew C; Holsboer, Florian; Lucae, Susanne; Madden, Pamela A F; Martin, Nicholas G; McGuffin, Peter; Muglia, Pierandrea; Noethen, Markus M; Penninx, Brenda P; Pergadia, Michele L; Potash, James B; Rietschel, Marcella; Lin, Danyu; Müller-Myhsok, Bertram; Shi, Jianxin; Steinberg, Stacy; Grabe, Hans J; Lichtenstein, Paul; Magnusson, Patrik; Perlis, Roy H; Preisig, Martin; Smoller, Jordan W; Stefansson, Kari; Uher, Rudolf; Kutalik, Zoltan; Tansey, Katherine E; Teumer, Alexander; Viktorin, Alexander; Barnes, Michael R; Bettecken, Thomas; Binder, Elisabeth B; Breuer, René; Castro, Victor M; Churchill, Susanne E; Coryell, William H; Craddock, Nick; Craig, Ian W; Czamara, Darina; De Geus, Eco J; Degenhardt, Franziska; Farmer, Anne E; Fava, Maurizio; Frank, Josef; Gainer, Vivian S; Gallagher, Patience J; Gordon, Scott D; Goryachev, Sergey; Gross, Magdalena; Guipponi, Michel; Henders, Anjali K; Herms, Stefan; Hickie, Ian B; Hoefels, Susanne; Hoogendijk, Witte; Hottenga, Jouke Jan; Iosifescu, Dan V; Ising, Marcus; Jones, Ian; Jones, Lisa; Jung-Ying, Tzeng; Knowles, James A; Kohane, Isaac S; Kohli, Martin A; Korszun, Ania; Landen, Mikael; Lawson, William B; Lewis, Glyn; Macintyre, Donald; Maier, Wolfgang; Mattheisen, Manuel; McGrath, Patrick J; McIntosh, Andrew; McLean, Alan; Middeldorp, Christel M; Middleton, Lefkos; Montgomery, Grant M; Murphy, Shawn N; Nauck, Matthias; Nolen, Willem A; Nyholt, Dale R; O'Donovan, Michael; Oskarsson, Högni; Pedersen, Nancy; Scheftner, William A; Schulz, Andrea; Schulze, Thomas G; Shyn, Stanley I; Sigurdsson, Engilbert; Slager, Susan L; Smit, Johannes H; Stefansson, Hreinn; Steffens, Michael; Thorgeirsson, Thorgeir; Tozzi, Federica; Treutlein, Jens; Uhr, Manfred; van den Oord, Edwin J C G; Van Grootheest, Gerard; Völzke, Henry; Weilburg, Jeffrey B; Willemsen, Gonneke; Zitman, Frans G; Neale, Benjamin; Daly, Mark; Levinson, Douglas F; Sullivan, Patrick F

    2013-04-01

    Prior genome-wide association studies (GWAS) of major depressive disorder (MDD) have met with limited success. We sought to increase statistical power to detect disease loci by conducting a GWAS mega-analysis for MDD. In the MDD discovery phase, we analyzed more than 1.2 million autosomal and X chromosome single-nucleotide polymorphisms (SNPs) in 18 759 independent and unrelated subjects of recent European ancestry (9240 MDD cases and 9519 controls). In the MDD replication phase, we evaluated 554 SNPs in independent samples (6783 MDD cases and 50 695 controls). We also conducted a cross-disorder meta-analysis using 819 autosomal SNPs with P<0.0001 for either MDD or the Psychiatric GWAS Consortium bipolar disorder (BIP) mega-analysis (9238 MDD cases/8039 controls and 6998 BIP cases/7775 controls). No SNPs achieved genome-wide significance in the MDD discovery phase, the MDD replication phase or in pre-planned secondary analyses (by sex, recurrent MDD, recurrent early-onset MDD, age of onset, pre-pubertal onset MDD or typical-like MDD from a latent class analysis of the MDD criteria). In the MDD-bipolar cross-disorder analysis, 15 SNPs exceeded genome-wide significance (P<5 × 10^-8), and all were in a 248 kb interval of high LD on 3p21.1 (chr3:52 425 083-53 822 102, minimum P=5.9 × 10^-9 at rs2535629). Although this is the largest genome-wide analysis of MDD yet conducted, its high prevalence means that the sample is still underpowered to detect genetic effects typical for complex traits. Therefore, we were unable to identify robust and replicable findings. We discuss what this means for genetic research for MDD. The 3p21.1 MDD-BIP finding should be interpreted with caution as the most significant SNP did not replicate in MDD samples, and genotyping in independent samples will be needed to resolve its status.

  15. Genome-Wide Association Analysis of Blood Biomarkers in Chronic Obstructive Pulmonary Disease

    PubMed Central

    Kim, Deog Kyeom; Cho, Michael H.; Hersh, Craig P.; Lomas, David A.; Miller, Bruce E.; Kong, Xiangyang; Bakke, Per; Gulsvik, Amund; Agustí, Alvar; Wouters, Emiel; Celli, Bartolome; Coxson, Harvey; Vestbo, Jørgen; MacNee, William; Yates, Julie C.; Rennard, Stephen; Litonjua, Augusto; Qiu, Weiliang; Beaty, Terri H.; Crapo, James D.; Riley, John H.; Tal-Singer, Ruth

    2012-01-01

    Rationale: A genome-wide association study (GWAS) for circulating chronic obstructive pulmonary disease (COPD) biomarkers could identify genetic determinants of biomarker levels and COPD susceptibility. Objectives: To identify genetic variants of circulating protein biomarkers and novel genetic determinants of COPD. Methods: GWAS was performed for two pneumoproteins, Clara cell secretory protein (CC16) and surfactant protein D (SP-D), and five systemic inflammatory markers (C-reactive protein, fibrinogen, IL-6, IL-8, and tumor necrosis factor-α) in 1,951 subjects with COPD. For genome-wide significant single nucleotide polymorphisms (SNPs) (P < 1 × 10^-8), association with COPD susceptibility was tested in 2,939 cases with COPD and 1,380 smoking control subjects. The association of candidate SNPs with mRNA expression in induced sputum was also elucidated. Measurements and Main Results: Genome-wide significant susceptibility loci affecting biomarker levels were found only for the two pneumoproteins. Two discrete loci affecting CC16, one region near the CC16 coding gene (SCGB1A1) on chromosome 11 and another locus approximately 25 Mb away from SCGB1A1, were identified, whereas multiple SNPs on chromosomes 6 and 16, in addition to SNPs near SFTPD, had genome-wide significant associations with SP-D levels. Several SNPs affecting circulating CC16 levels were significantly associated with sputum mRNA expression of SCGB1A1 (P = 0.009–0.03). Several SNPs highly associated with CC16 or SP-D levels were nominally associated with COPD in a collaborative GWAS (P = 0.001–0.049), although these COPD associations were not replicated in two additional cohorts. Conclusions: Distant genetic loci and biomarker-coding genes affect circulating levels of COPD-related pneumoproteins. A subset of these protein quantitative trait loci may influence their gene expression in the lung and/or COPD susceptibility. Clinical trial registered with www.clinicaltrials.gov (NCT 00292552). PMID

  16. Genome-Wide Motif Statistics are Shaped by DNA Binding Proteins over Evolutionary Time Scales

    NASA Astrophysics Data System (ADS)

    Qian, Long; Kussell, Edo

    The composition of genomes with respect to short DNA motifs impacts the ability of DNA binding proteins to locate and bind their target sites. Since nonfunctional DNA binding can be detrimental to cellular functions and ultimately to organismal fitness, organisms could benefit from reducing the number of nonfunctional binding sites genome wide. Using in vitro measurements of binding affinities for a large collection of DNA binding proteins, in multiple species, we detect a significant global avoidance of weak binding sites in genomes. The underlying evolutionary process leaves a distinct genomic hallmark in that similar words have correlated frequencies, which we detect in all species across domains of life. We hypothesize that natural selection against weak binding sites contributes to this process, and using an evolutionary model we show that the strength of selection needed to maintain global word compositions is on the order of point mutation rates. Alternative contributions may come from interference of protein-DNA binding with replication and mutational repair processes, which operates with similar rates. We conclude that genome-wide word compositions have been molded by DNA binding proteins through tiny evolutionary steps over timescales spanning millions of generations.

  17. Genome-wide transposon mutagenesis in pathogenic Leptospira species.

    PubMed

    Murray, Gerald L; Morel, Viviane; Cerqueira, Gustavo M; Croda, Julio; Srikram, Amporn; Henry, Rebekah; Ko, Albert I; Dellagostin, Odir A; Bulach, Dieter M; Sermswan, Rasana W; Adler, Ben; Picardeau, Mathieu

    2009-02-01

    Leptospira interrogans is the most common cause of leptospirosis in humans and animals. Genetic analysis of L. interrogans has been severely hindered by a lack of tools for genetic manipulation. Recently we developed the mariner-based transposon Himar1 to generate the first defined mutants in L. interrogans. In this study, a total of 929 independent transposon mutants were obtained and the location of insertion determined. Of these mutants, 721 were located in the protein coding regions of 551 different genes. While sequence analysis of transposon insertion sites indicated that transposition occurred in an essentially random fashion in the genome, 25 unique transposon mutants were found to exhibit insertions into genes encoding 16S or 23S rRNAs, suggesting these genes are insertional hot spots in the L. interrogans genome. In contrast, loci containing notionally essential genes involved in lipopolysaccharide and heme biosynthesis showed few transposon insertions. The effect of gene disruption on the virulence of a selected set of defined mutants was investigated using the hamster model of leptospirosis. Two attenuated mutants with disruptions in hypothetical genes were identified, thus validating the use of transposon mutagenesis for the identification of novel virulence factors in L. interrogans. This library provides a valuable resource for the study of gene function in L. interrogans. Combined with the genome sequences of L. interrogans, this provides an opportunity to investigate genes that contribute to pathogenesis and will provide a better understanding of the biology of L. interrogans.

  18. Genome-wide single nucleotide polymorphisms (SNPs) for a model invasive ascidian Botryllus schlosseri.

    PubMed

    Gao, Yangchun; Li, Shiguo; Zhan, Aibin

    2018-04-01

    Invasive species cause enormous ecological, environmental and economic damage globally. A comprehensive understanding of invasion mechanisms, particularly the genetic bases of micro-evolutionary processes responsible for invasion success, is essential for reducing potential damage caused by invasive species. The golden star tunicate, Botryllus schlosseri, has become a model species in invasion biology, mainly owing to its highly invasive nature and small, well-sequenced genome. However, genome-wide genetic markers have not been well developed in this highly invasive species, thus limiting the comprehensive understanding of genetic mechanisms of invasion success. Using restriction site-associated DNA (RAD) tag sequencing, here we developed a high-quality resource of 14,119 out of 158,821 SNPs for B. schlosseri. These SNPs were relatively evenly distributed on each chromosome. SNP annotations showed that the majority of SNPs (63.20%) were located in intergenic regions, and 21.51% and 14.58% were located in introns and exons, respectively. In addition, the potential use of the developed SNPs for population genomics studies was primarily assessed through estimates of observed heterozygosity (H_O), expected heterozygosity (H_E), nucleotide diversity (π), Wright's inbreeding coefficient (F_IS) and effective population size (N_e). Our developed SNP resource should provide future studies with genome-wide genetic markers for genetic and genomic investigations, such as the genetic bases of micro-evolutionary processes responsible for invasion success.
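
    The heterozygosity and inbreeding statistics named in this abstract can be illustrated with a minimal sketch. The abstract does not say which estimators or software the authors used; the version below assumes the textbook, uncorrected definitions for a single locus (H_O as the fraction of heterozygous individuals, H_E = 1 - sum of squared allele frequencies, F_IS = 1 - H_O/H_E), and the function name `heterozygosity_stats` is hypothetical.

```python
from collections import Counter


def heterozygosity_stats(genotypes):
    """Return (H_O, H_E, F_IS) for one locus.

    genotypes: list of 2-tuples of allele labels, one tuple per diploid
    individual. Textbook estimators without small-sample correction
    (an assumption; the abstract does not specify the estimators used).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("need at least one genotype")

    # Observed heterozygosity: share of individuals with two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n

    # Allele frequencies from the pooled 2n allele copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n

    # Expected heterozygosity under Hardy-Weinberg proportions.
    h_exp = 1.0 - sum((c / total) ** 2 for c in counts.values())

    # F_IS is undefined when the locus is monomorphic (H_E == 0).
    f_is = None if h_exp == 0 else 1.0 - h_obs / h_exp
    return h_obs, h_exp, f_is
```

    For example, `heterozygosity_stats([("A", "A"), ("A", "G"), ("G", "G"), ("A", "G")])` describes a locus whose observed and expected heterozygosity coincide, so F_IS is zero; a sample containing only the two homozygotes gives F_IS = 1.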

  19. Genome-wide Association Analysis of Blood-Pressure Traits in African-Ancestry Individuals Reveals Common Associated Genes in African and Non-African Populations

    PubMed Central

    Franceschini, Nora; Fox, Ervin; Zhang, Zhaogong; Edwards, Todd L.; Nalls, Michael A.; Sung, Yun Ju; Tayo, Bamidele O.; Sun, Yan V.; Gottesman, Omri; Adeyemo, Adebawole; Johnson, Andrew D.; Young, J. Hunter; Rice, Ken; Duan, Qing; Chen, Fang; Li, Yun; Tang, Hua; Fornage, Myriam; Keene, Keith L.; Andrews, Jeanette S.; Smith, Jennifer A.; Faul, Jessica D.; Guangfa, Zhang; Guo, Wei; Liu, Yu; Murray, Sarah S.; Musani, Solomon K.; Srinivasan, Sathanur; Velez Edwards, Digna R.; Wang, Heming; Becker, Lewis C.; Bovet, Pascal; Bochud, Murielle; Broeckel, Ulrich; Burnier, Michel; Carty, Cara; Chasman, Daniel I.; Ehret, Georg; Chen, Wei-Min; Chen, Guanjie; Chen, Wei; Ding, Jingzhong; Dreisbach, Albert W.; Evans, Michele K.; Guo, Xiuqing; Garcia, Melissa E.; Jensen, Rich; Keller, Margaux F.; Lettre, Guillaume; Lotay, Vaneet; Martin, Lisa W.; Moore, Jason H.; Morrison, Alanna C.; Mosley, Thomas H.; Ogunniyi, Adesola; Palmas, Walter; Papanicolaou, George; Penman, Alan; Polak, Joseph F.; Ridker, Paul M.; Salako, Babatunde; Singleton, Andrew B.; Shriner, Daniel; Taylor, Kent D.; Vasan, Ramachandran; Wiggins, Kerri; Williams, Scott M.; Yanek, Lisa R.; Zhao, Wei; Zonderman, Alan B.; Becker, Diane M.; Berenson, Gerald; Boerwinkle, Eric; Bottinger, Erwin; Cushman, Mary; Eaton, Charles; Nyberg, Fredrik; Heiss, Gerardo; Hirschhron, Joel N.; Howard, Virginia J.; Karczewsk, Konrad J.; Lanktree, Matthew B.; Liu, Kiang; Liu, Yongmei; Loos, Ruth; Margolis, Karen; Snyder, Michael; Go, Min Jin; Kim, Young Jin; Lee, Jong-Young; Jeon, Jae-Pil; Kim, Sung Soo; Han, Bok-Ghee; Cho, Yoon Shin; Sim, Xueling; Tay, Wan Ting; Ong, Rick Twee Hee; Seielstad, Mark; Liu, Jian Jun; Aung, Tin; Wong, Tien Yin; Teo, Yik Ying; Tai, E. 
Shyong; Chen, Chien-Hsiun; Chang, Li-ching; Chen, Yuan-Tsong; Wu, Jer-Yuarn; Kelly, Tanika N.; Gu, Dongfeng; Hixson, James E.; Sung, Yun Ju; He, Jiang; Tabara, Yasuharu; Kokubo, Yoshihiro; Miki, Tetsuro; Iwai, Naoharu; Kato, Norihiro; Takeuchi, Fumihiko; Katsuya, Tomohiro; Nabika, Toru; Sugiyama, Takao; Zhang, Yi; Huang, Wei; Zhang, Xuegong; Zhou, Xueya; Jin, Li; Zhu, Dingliang; Psaty, Bruce M.; Schork, Nicholas J.; Weir, David R.; Rotimi, Charles N.; Sale, Michele M.; Harris, Tamara; Kardia, Sharon L.R.; Hunt, Steven C.; Arnett, Donna; Redline, Susan; Cooper, Richard S.; Risch, Neil J.; Rao, D.C.; Rotter, Jerome I.; Chakravarti, Aravinda; Reiner, Alex P.; Levy, Daniel; Keating, Brendan J.; Zhu, Xiaofeng

    2013-01-01

    High blood pressure (BP) is more prevalent and contributes to more severe manifestations of cardiovascular disease (CVD) in African Americans than in any other United States ethnic group. Several small African-ancestry (AA) BP genome-wide association studies (GWASs) have been published, but their findings have failed to replicate to date. We report on a large AA BP GWAS meta-analysis that includes 29,378 individuals from 19 discovery cohorts and subsequent replication in additional samples of AA (n = 10,386), European ancestry (EA) (n = 69,395), and East Asian ancestry (n = 19,601). Five loci (EVX1-HOXA, ULK4, RSPO3, PLEKHG1, and SOX6) reached genome-wide significance (p < 1.0 × 10^-8) for either systolic or diastolic BP in a transethnic meta-analysis after correction for multiple testing. Three of these BP loci (EVX1-HOXA, RSPO3, and PLEKHG1) lack previous associations with BP. We also identified one independent signal in a known BP locus (SOX6) and provide evidence for fine mapping in four additional validated BP loci. We also demonstrate that validated EA BP GWAS loci, considered jointly, show significant effects in AA samples. Consequently, these findings suggest that BP loci might have universal effects across studied populations, demonstrating that multiethnic samples are an essential component in identifying, fine mapping, and understanding their trait variability. PMID:23972371

  20. pKWmEB: integration of Kruskal-Wallis test with empirical Bayes under polygenic background control for multi-locus genome-wide association study.

    PubMed

    Ren, Wen-Long; Wen, Yang-Jun; Dunwell, Jim M; Zhang, Yuan-Ming

    2018-03-01

    Although nonparametric methods in genome-wide association studies (GWAS) are robust in quantitative trait nucleotide (QTN) detection, the absence of polygenic background control in single-marker association in genome-wide scans results in a high false positive rate. To overcome this issue, we proposed an integrated nonparametric method for multi-locus GWAS. First, a new model transformation was used to whiten the covariance matrix of polygenic matrix K and environmental noise. Using the transferred model, Kruskal-Wallis test along with least angle regression was then used to select all the markers that were potentially associated with the trait. Finally, all the selected markers were placed into multi-locus model, these effects were estimated by empirical Bayes, and all the nonzero effects were further identified by a likelihood ratio test for true QTN detection. This method, named pKWmEB, was validated by a series of Monte Carlo simulation studies. As a result, pKWmEB effectively controlled false positive rate, although a less stringent significance criterion was adopted. More importantly, pKWmEB retained the high power of Kruskal-Wallis test, and provided QTN effect estimates. To further validate pKWmEB, we re-analyzed four flowering time related traits in Arabidopsis thaliana, and detected some previously reported genes that were not identified by the other methods.
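
    As a rough illustration of the single-marker screening step mentioned in this abstract, the sketch below computes the tie-corrected Kruskal-Wallis H statistic for a trait grouped by genotype class. It covers only that one ingredient: the model transformation, least angle regression, empirical Bayes estimation and likelihood ratio test of pKWmEB are not reproduced here. The function name `kruskal_wallis_h` and the genotype coding (0/1/2) are assumptions made for the example, not details taken from the paper.

```python
from collections import defaultdict


def kruskal_wallis_h(values, groups):
    """Tie-corrected Kruskal-Wallis H for trait values split by group label.

    values: phenotype measurements; groups: one label per value
    (for a marker scan, e.g. the genotype class 0/1/2 of each individual).
    """
    n = len(values)
    if n != len(groups) or n < 2:
        raise ValueError("values and groups must have equal length >= 2")

    # Assign average ranks, accumulating the tie term sum(t^3 - t).
    order = sorted(range(n), key=lambda idx: values[idx])
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # mean of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    # Rank sum and size per group.
    rank_sums = defaultdict(float)
    sizes = defaultdict(int)
    for r, g in zip(ranks, groups):
        rank_sums[g] += r
        sizes[g] += 1

    h = 12.0 / (n * (n + 1)) * sum(
        rank_sums[g] ** 2 / sizes[g] for g in sizes
    ) - 3.0 * (n + 1)

    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction
```

    Under the null hypothesis, H is approximately chi-square distributed with (number of groups - 1) degrees of freedom, so a marker scan would call this function once per marker and convert each H to a P value with a chi-square survival function.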

  1. Genome-wide association analysis of symbiotic nitrogen fixation in common bean

    USDA-ARS's Scientific Manuscript database

    A genome-wide association study (GWAS) was conducted to explore the genetic basis of variation for symbiotic nitrogen fixation (SNF) and related traits in the Andean diversity panel (ADP) comprised of 259 common bean (Phaseolus vulgaris) genotypes. The ADP was evaluated for SNF and related traits in...

  2. Genome-wide analyses for personality traits identify six genomic loci and show correlations with psychiatric disorders

    PubMed Central

    Lo, Min-Tzu; Hinds, David A.; Tung, Joyce Y.; Franz, Carol; Fan, Chun-Chieh; Wang, Yunpeng; Smeland, Olav B.; Schork, Andrew; Holland, Dominic; Kauppi, Karolina; Sanyal, Nilotpal; Escott-Price, Valentina; Smith, Daniel J.; O'Donovan, Michael; Stefansson, Hreinn; Bjornsdottir, Gyda; Thorgeirsson, Thorgeir E.; Stefansson, Kari; McEvoy, Linda K.; Dale, Anders M.; Andreassen, Ole A.; Chen, Chi-Hua

    2017-01-01

    Summary: Personality is influenced by genetic and environmental factors, and associated with mental health. However, the underlying genetic determinants are largely unknown. We identified six genetic loci, including five novel loci, significantly associated with personality traits in a meta-analysis of genome-wide association studies (N=123,132–260,861). Of these genome-wide significant loci, extraversion was associated with variants in WSCD2 and near PCDH15, and neuroticism with variants on chromosome 8p23.1 and in L3MBTL2. We performed a principal component analysis to extract major dimensions underlying genetic variations among five personality traits and six psychiatric disorders (N=5,422–18,759). The first genetic dimension separated personality traits and psychiatric disorders, except that neuroticism and openness to experience were clustered with the disorders. High genetic correlations were found between extraversion and attention-deficit/hyperactivity disorder (ADHD), and between openness and schizophrenia/bipolar disorder. The second genetic dimension was closely aligned with extraversion-introversion and grouped neuroticism with internalizing psychopathology (e.g., depression/anxiety). PMID:27918536

  3. Genome-wide analyses for personality traits identify six genomic loci and show correlations with psychiatric disorders.

    PubMed

    Lo, Min-Tzu; Hinds, David A; Tung, Joyce Y; Franz, Carol; Fan, Chun-Chieh; Wang, Yunpeng; Smeland, Olav B; Schork, Andrew; Holland, Dominic; Kauppi, Karolina; Sanyal, Nilotpal; Escott-Price, Valentina; Smith, Daniel J; O'Donovan, Michael; Stefansson, Hreinn; Bjornsdottir, Gyda; Thorgeirsson, Thorgeir E; Stefansson, Kari; McEvoy, Linda K; Dale, Anders M; Andreassen, Ole A; Chen, Chi-Hua

    2017-01-01

    Personality is influenced by genetic and environmental factors and associated with mental health. However, the underlying genetic determinants are largely unknown. We identified six genetic loci, including five novel loci, significantly associated with personality traits in a meta-analysis of genome-wide association studies (N = 123,132-260,861). Of these genome-wide significant loci, extraversion was associated with variants in WSCD2 and near PCDH15, and neuroticism with variants on chromosome 8p23.1 and in L3MBTL2. We performed a principal component analysis to extract major dimensions underlying genetic variations among five personality traits and six psychiatric disorders (N = 5,422-18,759). The first genetic dimension separated personality traits and psychiatric disorders, except that neuroticism and openness to experience were clustered with the disorders. High genetic correlations were found between extraversion and attention-deficit-hyperactivity disorder (ADHD) and between openness and schizophrenia and bipolar disorder. The second genetic dimension was closely aligned with extraversion-introversion and grouped neuroticism with internalizing psychopathology (e.g., depression or anxiety).

  4. Gene expression levels as endophenotypes in genome-wide association studies of Alzheimer disease

    PubMed Central

    Zou, F.; Carrasquillo, M. M.; Pankratz, V. S.; Belbin, O.; Morgan, K.; Allen, M.; Wilcox, S. L.; Ma, L.; Walker, L. P.; Kouri, N.; Burgess, J. D.; Younkin, L. H.; Younkin, Samuel G.; Younkin, C. S.; Bisceglio, G. D.; Crook, J. E.; Dickson, D. W.; Petersen, R. C.; Graff-Radford, N.; Younkin, Steven G.; Ertekin-Taner, N.

    2010-01-01

    Background: Late-onset Alzheimer disease (LOAD) is a common disorder with a substantial genetic component. We postulate that many disease susceptibility variants act by altering gene expression levels. Methods: We measured messenger RNA (mRNA) expression levels of 12 LOAD candidate genes in the cerebella of 200 subjects with LOAD. Using the genotypes from our LOAD genome-wide association study for the cis-single nucleotide polymorphisms (SNPs) (n = 619) of these 12 LOAD candidate genes, we tested for associations with expression levels as endophenotypes. The strongest expression cis-SNP was tested for AD association in 7 independent case-control series (2,280 AD and 2,396 controls). Results: We identified 3 SNPs that associated significantly with IDE (insulin degrading enzyme) expression levels. A single copy of the minor allele for each significant SNP was associated with approximately twofold higher IDE expression levels. The most significant SNP, rs7910977, is 4.2 kb beyond the 3′ end of IDE. The association observed with this SNP was significant even at the genome-wide level (p = 2.7 × 10^-8). Furthermore, the minor allele of rs7910977 associated significantly (p = 0.0046) with reduced LOAD risk (OR = 0.81 with a 95% CI of 0.70-0.94), as expected biologically from its association with elevated IDE expression. Conclusions: These results provide strong evidence that IDE is a late-onset Alzheimer disease (LOAD) gene with variants that modify risk of LOAD by influencing IDE expression. They also suggest that the use of expression levels as endophenotypes in genome-wide association studies may provide a powerful approach for the identification of disease susceptibility alleles. GLOSSARY AD = Alzheimer disease; CI = confidence interval; GWAS = genome-wide association study; LOAD = late-onset Alzheimer disease; mRNA = messenger RNA; OR = odds ratio; SNP = single nucleotide polymorphism. PMID:20142614

  5. In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features.

    PubMed

    Ding, Yiliang; Tang, Yin; Kwok, Chun Kit; Zhang, Yu; Bevilacqua, Philip C; Assmann, Sarah M

    2014-01-30

    RNA structure has critical roles in processes ranging from ligand sensing to the regulation of translation, polyadenylation and splicing. However, a lack of genome-wide in vivo RNA structural data has limited our understanding of how RNA structure regulates gene expression in living cells. Here we present a high-throughput, genome-wide in vivo RNA structure probing method, structure-seq, in which dimethyl sulphate methylation of unprotected adenines and cytosines is identified by next-generation sequencing. Application of this method to Arabidopsis thaliana seedlings yielded the first in vivo genome-wide RNA structure map at nucleotide resolution for any organism, with quantitative structural information across more than 10,000 transcripts. Our analysis reveals a three-nucleotide periodic repeat pattern in the structure of coding regions, as well as a less-structured region immediately upstream of the start codon, and shows that these features are strongly correlated with translation efficiency. We also find patterns of strong and weak secondary structure at sites of alternative polyadenylation, as well as strong secondary structure at 5' splice sites that correlates with unspliced events. Notably, in vivo structures of messenger RNAs annotated for stress responses are poorly predicted in silico, whereas mRNA structures of genes related to cell function maintenance are well predicted. Global comparison of several structural features between these two categories shows that the mRNAs associated with stress responses tend to have more single-strandedness, longer maximal loop length and higher free energy per nucleotide, features that may allow these RNAs to undergo conformational changes in response to environmental conditions. Structure-seq allows the RNA structurome and its biological roles to be interrogated on a genome-wide scale and should be applicable to any organism.

  6. Genomic Predictions and Genome-Wide Association Study of Resistance Against Piscirickettsia salmonis in Coho Salmon (Oncorhynchus kisutch) Using ddRAD Sequencing

    PubMed Central

    Barría, Agustín; Christensen, Kris A.; Yoshida, Grazyella M.; Correa, Katharina; Jedlicki, Ana; Lhorente, Jean P.; Davidson, William S.; Yáñez, José M.

    2018-01-01

    Piscirickettsia salmonis is one of the main infectious diseases affecting coho salmon (Oncorhynchus kisutch) farming, and current treatments have been ineffective for the control of this disease. Genetic improvement for P. salmonis resistance has been proposed as a feasible alternative for the control of this infectious disease in farmed fish. Genotyping by sequencing (GBS) strategies allow genotyping of hundreds of individuals with thousands of single nucleotide polymorphisms (SNPs), which can be used to perform genome wide association studies (GWAS) and predict genetic values using genome-wide information. We used double-digest restriction-site associated DNA (ddRAD) sequencing to dissect the genetic architecture of resistance against P. salmonis in a farmed coho salmon population and to identify molecular markers associated with the trait. We also evaluated genomic selection (GS) models in order to determine the potential to accelerate the genetic improvement of this trait by means of using genome-wide molecular information. A total of 764 individuals from 33 full-sib families (17 highly resistant and 16 highly susceptible) were experimentally challenged against P. salmonis and their genotypes were assayed using ddRAD sequencing. A total of 9,389 SNPs markers were identified in the population. These markers were used to test genomic selection models and compare different GWAS methodologies for resistance measured as day of death (DD) and binary survival (BIN). Genomic selection models showed higher accuracies than the traditional pedigree-based best linear unbiased prediction (PBLUP) method, for both DD and BIN. The models showed an improvement of up to 95% and 155% respectively over PBLUP. One SNP related with B-cell development was identified as a potential functional candidate associated with resistance to P. salmonis defined as DD. PMID:29440129

  7. Genome-wide investigation of genetic changes during modern breeding of Brassica napus.

    PubMed

    Wang, Nian; Li, Feng; Chen, Biyun; Xu, Kun; Yan, Guixin; Qiao, Jiangwei; Li, Jun; Gao, Guizhen; Bancroft, Ian; Meng, Jingling; King, Graham J; Wu, Xiaoming

    2014-08-01

    Considerable genome variation had been incorporated within rapeseed breeding programs over past decades. In past decades, there have been substantial changes in phenotypic properties of rapeseed as a result of extensive breeding effort. Uncovering the underlying patterns of allelic variation in the context of genome organisation would provide knowledge to guide future genetic improvement. We assessed genome-wide genetic changes, including population structure, genetic relatedness, the extent of linkage disequilibrium, nucleotide diversity and genetic differentiation based on F_ST outlier detection, for a panel of 472 Brassica napus inbred accessions using a 60 k Brassica Infinium® SNP array. We found that genetic diversity varied among sub-groups. Moreover, the genetic diversity increased from 1950 to 1980 and then remained at a similar level in China and Europe. We also found that ~6-10% of genomic regions showed high F_ST values. Some QTLs previously associated with important agronomic traits overlapped with these regions. Overall, the B. napus C genome was found to have more high F_ST signals than the A genome, and we concluded that the C genome may contribute more valuable alleles to generate elite traits. The results of this study indicate that considerable genome variation had been incorporated within rapeseed breeding programs over past decades. These results also contribute to understanding the impact of rapeseed improvement on available genome variation and the potential for dissecting complex agronomic traits.

  8. Development and validation of a comprehensive genomic diagnostic tool for myeloid malignancies

    PubMed Central

    McKerrell, Thomas; Moreno, Thaidy; Ponstingl, Hannes; Bolli, Niccolo; Dias, João M. L.; Tischler, German; Colonna, Vincenza; Manasse, Bridget; Bench, Anthony; Bloxham, David; Herman, Bram; Fletcher, Danielle; Park, Naomi; Quail, Michael A.; Manes, Nicla; Hodkinson, Clare; Baxter, Joanna; Sierra, Jorge; Foukaneli, Theodora; Warren, Alan J.; Chi, Jianxiang; Costeas, Paul; Rad, Roland; Huntly, Brian; Grove, Carolyn; Ning, Zemin; Tyler-Smith, Chris; Varela, Ignacio; Scott, Mike; Nomdedeu, Josep; Mustonen, Ville

    2016-01-01

    The diagnosis of hematologic malignancies relies on multidisciplinary workflows involving morphology, flow cytometry, cytogenetic, and molecular genetic analyses. Advances in cancer genomics have identified numerous recurrent mutations with clear prognostic and/or therapeutic significance to different cancers. In myeloid malignancies, there is a clinical imperative to test for such mutations in mainstream diagnosis; however, progress toward this has been slow and piecemeal. Here we describe Karyogene, an integrated targeted resequencing/analytical platform that detects nucleotide substitutions, insertions/deletions, chromosomal translocations, copy number abnormalities, and zygosity changes in a single assay. We validate the approach against 62 acute myeloid leukemia, 50 myelodysplastic syndrome, and 40 blood DNA samples from individuals without evidence of clonal blood disorders. We demonstrate robust detection of sequence changes in 49 genes, including difficult-to-detect mutations such as FLT3 internal-tandem and mixed-lineage leukemia (MLL) partial-tandem duplications, and clinically significant chromosomal rearrangements including MLL translocations to known and unknown partners, identifying the novel fusion gene MLL-DIAPH2 in the process. Additionally, we identify most significant chromosomal gains and losses, and several copy neutral loss-of-heterozygosity mutations at a genome-wide level, including previously unreported changes such as homozygosity for DNMT3A R882 mutations. Karyogene represents a dependable genomic diagnosis platform for translational research and for the clinical management of myeloid malignancies, which can be readily adapted for use in other cancers. PMID:27121471

  9. Genome-Wide Association Study for Susceptibility to and Recoverability From Mastitis in Danish Holstein Cows.

    PubMed

    Welderufael, B G; Løvendahl, Peter; de Koning, Dirk-Jan; Janss, Lucas L G; Fikse, W F

    2018-01-01

    Because mastitis is very frequent and unavoidable, adding recovery information to the genetic evaluation of mastitis is of great interest from both economic and animal welfare points of view. Here we have performed genome-wide association studies (GWAS) to identify associated single nucleotide polymorphisms (SNPs) and investigate the genetic background not only for susceptibility to mastitis but also for recoverability from it. Somatic cell count records from 993 Danish Holstein cows genotyped for a total of 39,378 autosomal SNP markers were used for the association analysis. Single SNP regression analysis was performed using the statistical software package DMU. The substitution effect of each SNP was tested with a t-test, and a genome-wide significance level of P-value < 10^-4 was used to declare significant SNP-trait association. A number of significant SNP variants were identified for both traits. Many of the SNP variants associated with either susceptibility to or recoverability from mastitis were located in or very near genes that have been reported for their role in the immune system. Genes involved in lymphocyte development (e.g., MAST3 and STAB2) and genes involved in macrophage recruitment and regulation of inflammation (PDGFD and PTX3) were suggested as possible causal genes for susceptibility to and recoverability from mastitis, respectively. However, this is the first GWAS for recoverability from mastitis and our results need to be validated. The findings of the current study are, therefore, a starting point for further investigations to identify causal genetic variants or chromosomal regions for both susceptibility to and recoverability from mastitis.

  10. A genome-wide association study of susceptibility to acute lymphoblastic leukemia in adolescents and young adults.

    PubMed

    Perez-Andreu, Virginia; Roberts, Kathryn G; Xu, Heng; Smith, Colton; Zhang, Hui; Yang, Wenjian; Harvey, Richard C; Payne-Turner, Debbie; Devidas, Meenakshi; Cheng, I-Ming; Carroll, William L; Heerema, Nyla A; Carroll, Andrew J; Raetz, Elizabeth A; Gastier-Foster, Julie M; Marcucci, Guido; Bloomfield, Clara D; Mrózek, Krzysztof; Kohlschmidt, Jessica; Stock, Wendy; Kornblau, Steven M; Konopleva, Marina; Paietta, Elisabeth; Rowe, Jacob M; Luger, Selina M; Tallman, Martin S; Dean, Michael; Burchard, Esteban G; Torgerson, Dara G; Yue, Feng; Wang, Yanli; Pui, Ching-Hon; Jeha, Sima; Relling, Mary V; Evans, William E; Gerhard, Daniela S; Loh, Mignon L; Willman, Cheryl L; Hunger, Stephen P; Mullighan, Charles G; Yang, Jun J

    2015-01-22

    Acute lymphoblastic leukemia (ALL) in adolescents and young adults (AYA) is characterized by distinct presenting features and inferior prognosis compared with pediatric ALL. We performed a genome-wide association study (GWAS) to comprehensively identify inherited genetic variants associated with susceptibility to AYA ALL. In the discovery GWAS, we compared genotype frequency at 635,297 single nucleotide polymorphisms (SNPs) in 308 AYA ALL cases and 6,661 non-ALL controls by using a logistic regression model with genetic ancestry as a covariate. SNPs that reached P ≤ 5 × 10^-8 in GWAS were tested in an independent cohort of 162 AYA ALL cases and 5,755 non-ALL controls. We identified a single genome-wide significant susceptibility locus in GATA3: rs3824662, odds ratio (OR), 1.77 (P = 2.8 × 10^-10) and rs3781093, OR, 1.73 (P = 3.2 × 10^-9). These findings were validated in the replication cohort. The risk allele at rs3824662 was most frequent in Philadelphia chromosome (Ph)-like ALL but also conferred susceptibility to non-Ph-like ALL in AYAs. In 1,827 non-selected ALL cases, the risk allele frequency at this SNP was positively correlated with age at diagnosis (P = 6.29 × 10^-11). Our results from this first GWAS of AYA ALL susceptibility point to unique biology underlying leukemogenesis and potentially distinct disease etiology by age group.

  11. Genome-wide Association Study Identifies Loci for the Polled Phenotype in Yak

    PubMed Central

    Wu, Xiaoyun; Wang, Kun; Ding, Xuezhi; Wang, Mingcheng; Chu, Min; Xie, Xiuyue; Qiu, Qiang; Yan, Ping

    2016-01-01

    The absence of horns, known as the polled phenotype, is an economically important trait in modern yak husbandry, but the genomic structure and genetic basis of this phenotype have yet to be discovered. Here, we conducted a genome-wide association study with a panel of 10 horned and 10 polled yaks using whole genome sequencing. We mapped the POLLED locus to a 200-kb interval, which comprises three protein-coding genes. Further characterization of the candidate region showed recent artificial selection signals resulting from the breeding process. We suggest that expressional variations rather than structural variations in protein probably contribute to the polled phenotype. Our results not only represent the first and important step in establishing the genomic structure of the polled region in yak, but also add to our understanding of the polled trait in bovid species. PMID:27389700

  12. Genome-Wide Variation Patterns Uncover the Origin and Selection in Cultivated Ginseng (Panax ginseng Meyer).

    PubMed

    Li, Ming-Rui; Shi, Feng-Xue; Li, Ya-Ling; Jiang, Peng; Jiao, Lili; Liu, Bao; Li, Lin-Feng

    2017-09-01

    Chinese ginseng (Panax ginseng Meyer) is a medicinally important herb and plays crucial roles in traditional Chinese medicine. Pharmacological analyses have identified diverse bioactive components from Chinese ginseng. However, basic biological attributes, including domestication and selection of the ginseng plant, remain under-investigated. Here, we present a genome-wide view of the domestication and selection of cultivated ginseng based on whole-genome data. A total of 8,660 protein-coding genes were selected for genome-wide scanning of the 30 wild and cultivated ginseng accessions. In addition, the 45S rDNA, chloroplast and mitochondrial genomes were included to perform phylogenetic and population genetic analyses. The observed spatial genetic structure between northern cultivated ginseng (NCG) and southern cultivated ginseng (SCG) accessions suggested multiple independent origins of cultivated ginseng. Genome-wide scanning further demonstrated that NCG and SCG have undergone distinct selection pressures during the domestication process, with more genes identified in the NCG (97 genes) than in the SCG group (5 genes). Functional analyses revealed that these genes are involved in diverse pathways, including DNA methylation, lignin biosynthesis, and cell differentiation. These findings suggested that the SCG and NCG groups have distinct demographic histories. The candidate genes identified are useful for future molecular breeding of cultivated ginseng. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  13. Genome-wide engineering of an infectious clone of herpes simplex virus type 1 using synthetic genomics assembly methods.

    PubMed

    Oldfield, Lauren M; Grzesik, Peter; Voorhies, Alexander A; Alperovich, Nina; MacMath, Derek; Najera, Claudia D; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N; Montague, Michael G; Friedman, Robert M; Desai, Prashant J; Vashee, Sanjay

    2017-10-17

    Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOS YA , replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats.

  14. Genome-wide engineering of an infectious clone of herpes simplex virus type 1 using synthetic genomics assembly methods

    PubMed Central

    Grzesik, Peter; Voorhies, Alexander A.; Alperovich, Nina; MacMath, Derek; Najera, Claudia D.; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N.; Montague, Michael G.; Friedman, Robert M.; Desai, Prashant J.

    2017-01-01

    Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOSYA, replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats. PMID:28928148

  15. Evaluation of genome-wide association study results through development of ontology fingerprints

    PubMed Central

    Tsoi, Lam C.; Boehnke, Michael; Klein, Richard L.; Zheng, W. Jim

    2009-01-01

    Motivation: Genome-wide association (GWA) studies may identify multiple variants that are associated with a disease or trait. To narrow down candidates for further validation, quantitatively assessing how identified genes relate to a phenotype of interest is important. Results: We describe an approach to characterize genes or biological concepts (phenotypes, pathways, diseases, etc.) by ontology fingerprint—the set of Gene Ontology (GO) terms that are overrepresented among the PubMed abstracts discussing the gene or biological concept together with the enrichment p-value of these terms generated from a hypergeometric enrichment test. We then quantify the relevance of genes to the trait from a GWA study by calculating similarity scores between their ontology fingerprints using enrichment p-values. We validate this approach by correctly identifying corresponding genes for biological pathways with a 90% average area under the ROC curve (AUC). We applied this approach to rank genes identified through a GWA study that are associated with the lipid concentrations in plasma as well as to prioritize genes within linkage disequilibrium (LD) block. We found that the genes with highest scores were: ABCA1, lipoprotein lipase (LPL) and cholesterol ester transfer protein, plasma for high-density lipoprotein; low-density lipoprotein receptor, APOE and APOB for low-density lipoprotein; and LPL, APOA1 and APOB for triglyceride. In addition, we identified genes relevant to lipid metabolism from the literature even in cases where such knowledge was not reflected in current annotation of these genes. These results demonstrate that ontology fingerprints can be used effectively to prioritize genes from GWA studies for experimental validation. Contact: zhengw@musc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:19349285
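
    The hypergeometric enrichment test mentioned in this abstract can be sketched as follows. The sketch computes the upper-tail probability of seeing at least a given number of abstracts carrying a GO term among those discussing a gene. The function and parameter names and the example counts are illustrative assumptions, not the authors' implementation.

```python
from math import comb

def hypergeom_enrichment_p(total, annotated, sampled, hits):
    """Upper-tail hypergeometric p-value P(X >= hits).

    total     -- abstracts in the whole corpus
    annotated -- abstracts in the corpus that carry the GO term
    sampled   -- abstracts that discuss the gene of interest
    hits      -- abstracts that discuss the gene AND carry the term
    """
    upper = min(annotated, sampled)
    denom = comb(total, sampled)
    tail = sum(
        comb(annotated, i) * comb(total - annotated, sampled - i)
        for i in range(hits, upper + 1)
    )
    return tail / denom

# Invented counts: 40 of 1,000 abstracts carry the term, and 8 of the
# 25 abstracts about the gene do. A small p suggests overrepresentation.
p_enrich = hypergeom_enrichment_p(total=1000, annotated=40, sampled=25, hits=8)
```

    Exact integer binomial coefficients keep the sketch dependency-free; for corpus-sized counts a library routine working in log space would be the more practical choice.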

  16. Genome-wide meta-analysis identifies five new susceptibility loci for cutaneous malignant melanoma.

    PubMed

    Law, Matthew H; Bishop, D Timothy; Lee, Jeffrey E; Brossard, Myriam; Martin, Nicholas G; Moses, Eric K; Song, Fengju; Barrett, Jennifer H; Kumar, Rajiv; Easton, Douglas F; Pharoah, Paul D P; Swerdlow, Anthony J; Kypreou, Katerina P; Taylor, John C; Harland, Mark; Randerson-Moor, Juliette; Akslen, Lars A; Andresen, Per A; Avril, Marie-Françoise; Azizi, Esther; Scarrà, Giovanna Bianchi; Brown, Kevin M; Dębniak, Tadeusz; Duffy, David L; Elder, David E; Fang, Shenying; Friedman, Eitan; Galan, Pilar; Ghiorzo, Paola; Gillanders, Elizabeth M; Goldstein, Alisa M; Gruis, Nelleke A; Hansson, Johan; Helsing, Per; Hočevar, Marko; Höiom, Veronica; Ingvar, Christian; Kanetsky, Peter A; Chen, Wei V; Landi, Maria Teresa; Lang, Julie; Lathrop, G Mark; Lubiński, Jan; Mackie, Rona M; Mann, Graham J; Molven, Anders; Montgomery, Grant W; Novaković, Srdjan; Olsson, Håkan; Puig, Susana; Puig-Butille, Joan Anton; Qureshi, Abrar A; Radford-Smith, Graham L; van der Stoep, Nienke; van Doorn, Remco; Whiteman, David C; Craig, Jamie E; Schadendorf, Dirk; Simms, Lisa A; Burdon, Kathryn P; Nyholt, Dale R; Pooley, Karen A; Orr, Nick; Stratigos, Alexander J; Cust, Anne E; Ward, Sarah V; Hayward, Nicholas K; Han, Jiali; Schulze, Hans-Joachim; Dunning, Alison M; Bishop, Julia A Newton; Demenais, Florence; Amos, Christopher I; MacGregor, Stuart; Iles, Mark M

    2015-09-01

    Thirteen common susceptibility loci have been reproducibly associated with cutaneous malignant melanoma (CMM). We report the results of an international 2-stage meta-analysis of CMM genome-wide association studies (GWAS). This meta-analysis combines 11 GWAS (5 previously unpublished) and a further three stage 2 data sets, totaling 15,990 CMM cases and 26,409 controls. Five loci not previously associated with CMM risk reached genome-wide significance (P < 5 × 10^-8), as did 2 previously reported but unreplicated loci and all 13 established loci. Newly associated SNPs fall within putative melanocyte regulatory elements, and bioinformatic and expression quantitative trait locus (eQTL) data highlight candidate genes in the associated regions, including one involved in telomere biology.

  17. Genome-wide meta-analysis identifies five new susceptibility loci for cutaneous malignant melanoma

    PubMed Central

    Law, Matthew H.; Bishop, D. Timothy; Martin, Nicholas G.; Moses, Eric K.; Song, Fengju; Barrett, Jennifer H.; Kumar, Rajiv; Easton, Douglas F.; Pharoah, Paul D. P.; Swerdlow, Anthony J.; Kypreou, Katerina P.; Taylor, John C.; Harland, Mark; Randerson-Moor, Juliette; Akslen, Lars A.; Andresen, Per A.; Avril, Marie-Françoise; Azizi, Esther; Scarrà, Giovanna Bianchi; Brown, Kevin M.; Dębniak, Tadeusz; Duffy, David L.; Elder, David E.; Fang, Shenying; Friedman, Eitan; Galan, Pilar; Ghiorzo, Paola; Gillanders, Elizabeth M.; Goldstein, Alisa M.; Gruis, Nelleke A.; Hansson, Johan; Helsing, Per; Hočevar, Marko; Höiom, Veronica; Ingvar, Christian; Kanetsky, Peter A.; Chen, Wei V.; Landi, Maria Teresa; Lang, Julie; Lathrop, G. Mark; Lubiński, Jan; Mackie, Rona M.; Mann, Graham J.; Molven, Anders; Montgomery, Grant W.; Novaković, Srdjan; Olsson, Håkan; Puig, Susana; Puig-Butille, Joan Anton; Qureshi, Abrar A.; Radford-Smith, Graham L.; van der Stoep, Nienke; van Doorn, Remco; Whiteman, David C.; Craig, Jamie E.; Schadendorf, Dirk; Simms, Lisa A.; Burdon, Kathryn P.; Nyholt, Dale R.; Pooley, Karen A.; Orr, Nick; Stratigos, Alexander J.; Cust, Anne E.; Ward, Sarah V.; Hayward, Nicholas K.; Han, Jiali; Schulze, Hans-Joachim; Dunning, Alison M.; Bishop, Julia A. Newton; MacGregor, Stuart; Iles, Mark M.

    2015-01-01

    Thirteen common susceptibility loci have been reproducibly associated with cutaneous malignant melanoma (CMM). We report the results of an international 2-stage meta-analysis of CMM genome-wide association studies (GWAS). This meta-analysis combines 11 GWAS (5 previously unpublished) and a further three stage 2 data sets, totaling 15,990 CMM cases and 26,409 controls. Five loci not previously associated with CMM risk reached genome-wide significance (P < 5 × 10^-8), as did two previously reported but unreplicated loci and all thirteen established loci. Novel SNPs fall within putative melanocyte regulatory elements, and bioinformatic and expression quantitative trait locus (eQTL) data highlight candidate genes including one involved in telomere biology. PMID:26237428

  18. Development and validation of a 20K single nucleotide polymorphism (SNP) whole genome genotyping array for apple (Malus × domestica Borkh).

    PubMed

    Bianco, Luca; Cestaro, Alessandro; Sargent, Daniel James; Banchi, Elisa; Derdak, Sophia; Di Guardo, Mario; Salvi, Silvio; Jansen, Johannes; Viola, Roberto; Gut, Ivo; Laurens, Francois; Chagné, David; Velasco, Riccardo; van de Weg, Eric; Troggio, Michela

    2014-01-01

    High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs.

  19. Development and Validation of a 20K Single Nucleotide Polymorphism (SNP) Whole Genome Genotyping Array for Apple (Malus × domestica Borkh)

    PubMed Central

    Bianco, Luca; Cestaro, Alessandro; Sargent, Daniel James; Banchi, Elisa; Derdak, Sophia; Di Guardo, Mario; Salvi, Silvio; Jansen, Johannes; Viola, Roberto; Gut, Ivo; Laurens, Francois; Chagné, David; Velasco, Riccardo; van de Weg, Eric; Troggio, Michela

    2014-01-01

    High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs. PMID:25303088

  20. Genome-wide computational prediction and analysis of core promoter elements across plant monocots and dicots

    USDA-ARS's Scientific Manuscript database

    Transcription initiation, essential to gene expression regulation, involves recruitment of basal transcription factors to the core promoter elements (CPEs). The distribution of currently known CPEs across plant genomes is largely unknown. This is the first large scale genome-wide report on the compu...

  1. A 2-Stage Genome-Wide Association Study to Identify Single Nucleotide Polymorphisms Associated With Development of Erectile Dysfunction Following Radiation Therapy for Prostate Cancer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kerns, Sarah L.; Departments of Pathology and Genetics, Albert Einstein College of Medicine, Bronx, New York; Stock, Richard

    2013-01-01

    Purpose: To identify single nucleotide polymorphisms (SNPs) associated with development of erectile dysfunction (ED) among prostate cancer patients treated with radiation therapy. Methods and Materials: A 2-stage genome-wide association study was performed. Patients were split randomly into a stage I discovery cohort (132 cases, 103 controls) and a stage II replication cohort (128 cases, 102 controls). The discovery cohort was genotyped using Affymetrix 6.0 genome-wide arrays. The 940 top ranking SNPs selected from the discovery cohort were genotyped in the replication cohort using Illumina iSelect custom SNP arrays. Results: Twelve SNPs identified in the discovery cohort and validated in the replication cohort were associated with development of ED following radiation therapy (Fisher combined P values 2.1 × 10^-5 to 6.2 × 10^-4). Notably, these 12 SNPs lie in or near genes involved in erectile function or other normal cellular functions (adhesion and signaling) rather than DNA damage repair. In a multivariable model including nongenetic risk factors, the odds ratios for these SNPs ranged from 1.6 to 5.6 in the pooled cohort. There was a striking relationship between the cumulative number of SNP risk alleles an individual possessed and ED status (Somers' D P value = 1.7 × 10^-29). A 1-allele increase in cumulative SNP score increased the odds for developing ED by a factor of 2.2 (P value = 2.1 × 10^-19). The cumulative SNP score model had a sensitivity of 84% and specificity of 75% for prediction of developing ED at the radiation therapy planning stage. Conclusions: This genome-wide association study identified a set of SNPs that are associated with development of ED following radiation therapy. These candidate genetic predictors warrant more definitive validation in an independent cohort.
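
    The abstract reports Fisher combined P values across the two stages. As a rough illustration of that technique, the sketch below applies Fisher's method: the statistic X = -2 Σ ln p_i follows a chi-square distribution with 2k degrees of freedom under the null, and for even degrees of freedom its tail probability has a closed form. The function name and the input values are hypothetical and are not taken from the study.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Uses the closed-form chi-square survival function for 2k degrees
    of freedom: exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # X/2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total

# Invented stage I and stage II p-values for one SNP.
combined = fisher_combined_p([0.003, 0.02])
```

    Fisher's method assumes the combined tests are independent, which is plausible for non-overlapping discovery and replication cohorts.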

  2. Refining genome-wide linkage intervals using a meta-analysis of genome-wide association studies identifies loci influencing personality dimensions

    PubMed Central

    Amin, Najaf; Hottenga, Jouke-Jan; Hansell, Narelle K; Janssens, A Cecile JW; de Moor, Marleen HM; Madden, Pamela AF; Zorkoltseva, Irina V; Penninx, Brenda W; Terracciano, Antonio; Uda, Manuela; Tanaka, Toshiko; Esko, Tonu; Realo, Anu; Ferrucci, Luigi; Luciano, Michelle; Davies, Gail; Metspalu, Andres; Abecasis, Goncalo R; Deary, Ian J; Raikkonen, Katri; Bierut, Laura J; Costa, Paul T; Saviouk, Viatcheslav; Zhu, Gu; Kirichenko, Anatoly V; Isaacs, Aaron; Aulchenko, Yurii S; Willemsen, Gonneke; Heath, Andrew C; Pergadia, Michele L; Medland, Sarah E; Axenovich, Tatiana I; de Geus, Eco; Montgomery, Grant W; Wright, Margaret J; Oostra, Ben A; Martin, Nicholas G; Boomsma, Dorret I; van Duijn, Cornelia M

    2013-01-01

    Personality traits are complex phenotypes related to psychosomatic health. Individually, various gene finding methods have not achieved much success in finding genetic variants associated with personality traits. We performed a meta-analysis of four genome-wide linkage scans (N = 6,149 subjects) of five basic personality traits assessed with the NEO Five-Factor Inventory. We compared the significant regions from the meta-analysis of linkage scans with the results of a meta-analysis of genome-wide association studies (GWAS) (N ≈ 17,000). We found significant evidence of linkage of neuroticism to chromosome 3p14 (rs1490265, LOD=4.67) and to chromosome 19q13 (rs628604, LOD=3.55); of extraversion to 14q32 (ATGG002, LOD=3.3); and of agreeableness to 3p25 (rs709160, LOD=3.67) and to two adjacent regions on chromosome 15, including 15q13 (rs970408, LOD=4.07) and 15q14 (rs1055356, LOD=3.52) in the individual scans. In the meta-analysis, we found strong evidence of linkage of extraversion to 4q34, 9q34, 10q24 and 11q22, openness to 2p25, 3q26, 9p21, 11q24, 15q26 and 19q13 and agreeableness to 4q34 and 19p13. Significant evidence of association in the GWAS was detected between openness and rs677035 at 11q24 (P-value = 2.6 × 10^-6, KCNJ1). The findings of our linkage meta-analysis and those of the GWAS suggest that 11q24 is a susceptibility locus for openness, with KCNJ1 as the possible candidate gene. PMID:23211697

  3. Genome-wide population structure and evolutionary history of the Frizarta dairy sheep.

    PubMed

    Kominakis, A; Hager-Theodorides, A L; Saridaki, A; Antonakos, G; Tsiamis, G

    2017-10-01

    In the present study, we used genomic data, generated with a medium density single nucleotide polymorphism (SNP) array, to acquire more information on the population structure and evolutionary history of the synthetic Frizarta dairy sheep. First, two typical measures of linkage disequilibrium (LD) were estimated at various physical distances that were then used to make inferences on the effective population size at key past time points. Population structure was also assessed by both multidimensional scaling analysis and k-means clustering on the distance matrix obtained from the animals' genomic relationships. Wright's fixation index F_ST was also employed to assess herds' genetic homogeneity and to indirectly estimate past migration rates. Wright's fixation index F_IS and genomic inbreeding coefficients based on the genomic relationship matrix as well as on runs of homozygosity were also estimated. The Frizarta breed displays relatively low LD levels with r^2 and |D'| equal to 0.18 and 0.50, respectively, at an average inter-marker distance of 31 kb. Linkage disequilibrium decayed rapidly by distance and persisted over just a few thousand base pairs. Rate of LD decay (β) varied widely among the 26 autosomes with larger values estimated for shorter chromosomes (e.g. β=0.057, for OAR6) and smaller values for longer ones (e.g. β=0.022, for OAR2). The inferred effective population size at the beginning of the breed's formation was as high as 549, was then reduced to 463 in 1981 (end of the breed's formation) and further declined to 187, one generation ago. Multidimensional scaling analysis and k-means clustering suggested a genetically homogenous population, F_ST estimates indicated relatively low genetic differentiation between herds, whereas a heat map of the animals' genomic kinship relationships revealed a stratified population, at a herd level. Estimates of genomic inbreeding coefficients suggested that most recent parental relatedness may have been a
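
    The two LD measures named in this abstract, r^2 and |D'|, have standard definitions in terms of haplotype and allele frequencies at two biallelic loci. The sketch below computes both from those frequencies. The function name and the example frequencies are assumptions for illustration; how the study estimated haplotype frequencies from array genotypes is not described in the abstract and is not modelled here.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (r2, abs_d_prime) for two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A and allele B
    p_a  -- frequency of allele A at the first locus  (0 < p_a < 1)
    p_b  -- frequency of allele B at the second locus (0 < p_b < 1)
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    # D' rescales D by its maximum attainable magnitude given the
    # allele frequencies, which depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    abs_d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r2, abs_d_prime

# Invented frequencies for one marker pair.
r2, abs_d_prime = ld_measures(p_ab=0.30, p_a=0.40, p_b=0.50)
```

    Both loci must be polymorphic, since a fixed allele makes the r^2 denominator zero.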

  4. Genome-wide association study of sporadic brain arteriovenous malformations.

    PubMed

    Weinsheimer, Shantel; Bendjilali, Nasrine; Nelson, Jeffrey; Guo, Diana E; Zaroff, Jonathan G; Sidney, Stephen; McCulloch, Charles E; Al-Shahi Salman, Rustam; Berg, Jonathan N; Koeleman, Bobby P C; Simon, Matthias; Bostroem, Azize; Fontanella, Marco; Sturiale, Carmelo L; Pola, Roberto; Puca, Alfredo; Lawton, Michael T; Young, William L; Pawlikowska, Ludmila; Klijn, Catharina J M; Kim, Helen

    2016-09-01

    The pathogenesis of sporadic brain arteriovenous malformations (BAVMs) remains unknown, but studies suggest a genetic component. We estimated the heritability of sporadic BAVM and performed a genome-wide association study (GWAS) to investigate association of common single nucleotide polymorphisms (SNPs) with risk of sporadic BAVM in the international, multicentre Genetics of Arteriovenous Malformation (GEN-AVM) consortium. The Caucasian discovery cohort included 515 BAVM cases and 1191 controls genotyped using Affymetrix genome-wide SNP arrays. Genotype data were imputed to 1000 Genomes Project data, and well-imputed SNPs (>0.01 minor allele frequency) were analysed for association with BAVM. 57 top BAVM-associated SNPs (51 SNPs with p < 10^-5 or p < 10^-4 in candidate pathway genes, and 6 candidate BAVM SNPs) were tested in a replication cohort including 608 BAVM cases and 744 controls. The estimated heritability of BAVM was 17.6% (SE 8.9%, age and sex-adjusted p=0.015). None of the SNPs were significantly associated with BAVM in the replication cohort after correction for multiple testing. 6 SNPs had a nominal p<0.1 in the replication cohort and map to introns in EGFEM1P, SP4 and CDKAL1 or near JAG1 and BNC2. Of the 6 candidate SNPs, 2 in ACVRL1 and MMP3 had a nominal p<0.05 in the replication cohort. We performed the first GWAS of sporadic BAVM in the largest BAVM cohort assembled to date. No GWAS SNPs were replicated, suggesting that common SNPs do not contribute strongly to BAVM susceptibility. However, heritability estimates suggest a modest but significant genetic contribution. Published by the BMJ Publishing Group Limited.

  5. Unraveling the Genetic Etiology of Adult Antisocial Behavior: A Genome-Wide Association Study

    PubMed Central

    Tielbeek, Jorim J.; Medland, Sarah E.; Benyamin, Beben; Byrne, Enda M.; Heath, Andrew C.; Madden, Pamela A. F.; Martin, Nicholas G.; Wray, Naomi R.; Verweij, Karin J. H.

    2012-01-01

    Crime poses a major burden for society. The heterogeneous nature of criminal behavior makes it difficult to unravel its causes. Relatively little research has been conducted on the genetic influences of criminal behavior. The few twin and adoption studies that have been undertaken suggest that about half of the variance in antisocial behavior can be explained by genetic factors. In order to identify the specific common genetic variants underlying this behavior, we conducted the first genome-wide association study (GWAS) on adult antisocial behavior. Our sample comprised a community sample of 4816 individuals who had completed a self-report questionnaire. No genetic polymorphisms reached genome-wide significance for association with adult antisocial behavior. In addition, none of the traditional candidate genes could be confirmed in our study. While not genome-wide significant, the gene with the strongest association (p-value = 8.7 × 10^-5) was DYRK1A, a gene previously related to abnormal brain development and mental retardation. Future studies should use larger, more homogeneous samples to disentangle the etiology of antisocial behavior. Biosocial criminological research allows a more empirically grounded understanding of criminal behavior, which could ultimately inform and improve current treatment strategies. PMID:23077488

  6. Multi-instance multi-label distance metric learning for genome-wide protein function prediction.

    PubMed

    Xu, Yonghui; Min, Huaqing; Song, Hengjie; Wu, Qingyao

    2016-08-01

    Multi-instance multi-label (MIML) learning has been proven to be effective for the genome-wide protein function prediction problems where each training example is associated with not only multiple instances but also multiple class labels. To find an appropriate MIML learning method for genome-wide protein function prediction, many studies in the literature attempted to optimize objective functions in which dissimilarity between instances is measured using the Euclidean distance. But in many real applications, Euclidean distance may be unable to capture the intrinsic similarity/dissimilarity in feature space and label space. Unlike other previous approaches, in this paper, we propose to learn a multi-instance multi-label distance metric learning framework (MIMLDML) for genome-wide protein function prediction. Specifically, we learn a Mahalanobis distance to preserve and utilize the intrinsic geometric information of both feature space and label space for MIML learning. In addition, we try to deal with the sparsely labeled data by giving weight to the labeled data. Extensive experiments on seven real-world organisms covering the biological three-domain system (i.e., archaea, bacteria, and eukaryote; Woese et al., 1990) show that the MIMLDML algorithm is superior to most state-of-the-art MIML learning algorithms. Copyright © 2016 Elsevier Ltd. All rights reserved.
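
    The contrast the abstract draws between Euclidean and Mahalanobis distance can be illustrated with a minimal sketch. This is not the MIMLDML algorithm: the matrix M below is a hand-picked, hypothetical stand-in for a learned metric, and the function name is an assumption made for illustration.

```python
import math

def mahalanobis(x, y, M):
    """Return sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    # Matrix-vector product M d, then the quadratic form d^T (M d).
    Md = [sum(M[i][j] * d[j] for j in range(n)) for i in range(n)]
    return math.sqrt(sum(d[i] * Md[i] for i in range(n)))

x = [1.0, 2.0]
y = [3.0, 5.0]

# With M equal to the identity, the Mahalanobis distance is the Euclidean distance.
identity = [[1.0, 0.0], [0.0, 1.0]]
euclid = mahalanobis(x, y, identity)

# Hypothetical "learned" metric that down-weights the second feature.
M = [[1.0, 0.0], [0.0, 0.25]]
learned = mahalanobis(x, y, M)
```

    Metric learning methods choose M from data so that distances reflect similarity in the task at hand; here M is fixed by hand only to show how the choice of M changes the distance between the same two points.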

  7. Genome-wide Association Study for Ovarian Cancer Susceptibility using Pooled DNA

    PubMed Central

    Lu, Yi; Chen, Xiaoqing; Beesley, Jonathan; Johnatty, Sharon E.; deFazio, Anna; Lambrechts, Sandrina; Lambrechts, Diether; Despierre, Evelyn; Vergotes, Ignace; Chang-Claude, Jenny; Hein, Rebecca; Nickels, Stefan; Wang-Gohrke, Shan; Dörk, Thilo; Dürst, Matthias; Antonenkova, Natalia; Bogdanova, Natalia; Goodman, Marc T.; Lurie, Galina; Wilkens, Lynne R.; Carney, Michael E.; Butzow, Ralf; Nevanlinna, Heli; Heikkinen, Tuomas; Leminen, Arto; Kiemeney, Lambertus A.; Massuger, Leon F.A.G.; van Altena, Anne M.; Aben, Katja K.; Kjaer, Susanne Krüger; Høgdall, Estrid; Jensen, Allan; Brooks-Wilson, Angela; Le, Nhu; Cook, Linda; Earp, Madalene; Kelemen, Linda; Easton, Douglas; Pharoah, Paul; Song, Honglin; Tyrer, Jonathan; Ramus, Susan; Menon, Usha; Gentry-Maharaj, Alexandra; Gayther, Simon A.; Bandera, Elisa V.; Olson, Sara H.; Orlow, Irene; Rodriguez-Rodriguez, Lorna

    2013-01-01

    Recent genome-wide association studies (GWAS) have identified four low-penetrance ovarian cancer susceptibility loci. We hypothesized that further moderate or low penetrance variants exist among the subset of SNPs not well tagged by the genotyping arrays used in the previous studies which would account for some of the remaining risk. We therefore conducted a time- and cost-effective stage 1 GWAS on 342 invasive serous cases and 643 controls genotyped on pooled DNA using the high density Illumina 1M-Duo array. We followed up 20 of the most significantly associated SNPs, which are not well tagged by the lower density arrays used by the published GWAS, and genotyped them on individual DNA. Most of the top 20 SNPs were clearly validated by individually genotyping the samples used in the pools. However, none of the 20 SNPs replicated when tested for association in a much larger stage 2 set of 4,651 cases and 6,966 controls from the Ovarian Cancer Association Consortium. Given that most of the top 20 SNPs from pooling were validated in the same samples by individual genotyping, the lack of replication is likely to be due to the relatively small sample size in our stage 1 GWAS rather than due to problems with the pooling approach. We conclude that there are unlikely to be any moderate or large effects on ovarian cancer risk untagged by the less dense arrays. However, our study lacked power to make clear statements on the existence of hitherto untagged small effect variants. PMID:22794196

  8. Genome-Wide Linkage and Association Analysis Identifies Major Gene Loci for Guttural Pouch Tympany in Arabian and German Warmblood Horses

    PubMed Central

    Metzger, Julia; Ohnesorge, Bernhard; Distl, Ottmar

    2012-01-01

    Equine guttural pouch tympany (GPT) is a hereditary condition affecting foals in their first months of life. Complex segregation analyses in Arabian and German warmblood horses showed the involvement of a major gene as very likely. Genome-wide linkage and association analyses including a high density marker set of single nucleotide polymorphisms (SNPs) were performed to map the genomic region harbouring the potential major gene for GPT. A total of 85 Arabian and 373 German warmblood horses were genotyped on the Illumina equine SNP50 beadchip. Non-parametric multipoint linkage analyses showed genome-wide significance on horse chromosomes (ECA) 3 for German warmblood at 16–26 Mb and 34–55 Mb and for Arabian on ECA15 at 64–65 Mb. Genome-wide association analyses confirmed the linked regions for both breeds. In Arabian, genome-wide association was detected at 64 Mb within the region with the highest linkage peak on ECA15. For German warmblood, signals for genome-wide association were close to the peak region of linkage at 52 Mb on ECA3. The odds ratio for the SNP with the highest genome-wide association was 0.12 for the Arabian. In conclusion, the refinement of the regions with the Illumina equine SNP50 beadchip is an important step to unravel the responsible mutations for GPT. PMID:22848553

  9. Genome-wide copy number variation (CNV) detection in Nelore cattle reveals highly frequent variants in genome regions harboring QTLs affecting production traits.

    PubMed

    da Silva, Joaquim Manoel; Giachetto, Poliana Fernanda; da Silva, Luiz Otávio; Cintra, Leandro Carrijo; Paiva, Samuel Rezende; Yamagishi, Michel Eduardo Beleza; Caetano, Alexandre Rodrigues

    2016-06-13

    Copy number variations (CNVs) have been shown to account for substantial portions of observed genomic variation and have been associated with qualitative and quantitative traits and the onset of disease in a number of species. Information from high-resolution studies to detect, characterize and estimate population-specific variant frequencies will facilitate the incorporation of CNVs in genomic studies to identify genes affecting traits of importance. Genome-wide CNVs were detected in high-density single nucleotide polymorphism (SNP) genotyping data from 1,717 Nelore (Bos indicus) cattle, and in NGS data from eight key ancestral bulls. A total of 68,007 and 12,786 distinct CNVs were observed, respectively. Cross-comparisons of results obtained for the eight resequenced animals revealed that 92 % of the CNVs were observed in both datasets, while 62 % of all detected CNVs were observed to overlap with previously validated cattle copy number variant regions (CNVRs). Observed CNVs were used for obtaining breed-specific CNV frequencies and identification of CNVRs, which were subsequently used for gene annotation. A total of 688 of the detected CNVRs were observed to overlap with 286 non-redundant QTLs associated with important production traits in cattle. All of 34 CNVs previously reported to be associated with milk production traits in Holsteins were also observed in Nelore cattle. Comparisons of estimated frequencies of these CNVs in the two breeds revealed 14, 13, 6 and 14 regions in high (>20 %), low (<20 %) and divergent (NEL > HOL, NEL < HOL) frequencies, respectively. Obtained results significantly enriched the bovine CNV map and enabled the identification of variants that are potentially associated with traits under selection in Nelore cattle, particularly in genome regions harboring QTLs affecting production traits.

  10. Genome-wide association studies in Alzheimer's disease.

    PubMed

    Bertram, Lars; Tanzi, Rudolph E

    2009-10-15

    Genome-wide association studies (GWAS) have gained considerable momentum over the last couple of years for the identification of novel complex disease genes. In the field of Alzheimer's disease (AD), there are currently eight published and two provisionally reported GWAS, highlighting over two dozen novel potential susceptibility loci beyond the well-established APOE association. On the basis of the data available at the time of this writing, the most compelling novel GWAS signal has been observed in GAB2 (GRB2-associated binding protein 2), followed by less consistently replicated signals in galanin-like peptide (GALP), piggyBac transposable element derived 1 (PGBD1), and tyrosine kinase, non-receptor 1 (TNK1). Furthermore, consistent replication has been recently announced for CLU (clusterin, also known as apolipoprotein J). Finally, there are at least three replicated loci in hitherto uncharacterized genomic intervals on chromosomes 14q32.13, 14q31.2 and 6q24.1 likely implicating the existence of novel AD genes in these regions. In this review, we will discuss the characteristics and potential relevance to pathogenesis of the outcomes of all currently available GWAS in AD. A particular emphasis will be laid on findings with independent data in favor of the original association.

  11. Genetic determinants of common epilepsies: a meta-analysis of genome-wide association studies

    PubMed Central

    2014-01-01

    Background: The epilepsies are a clinically heterogeneous group of neurological disorders. Despite strong evidence for heritability, genome-wide association studies have had little success in identification of risk loci associated with epilepsy, probably because of relatively small sample sizes and insufficient power. We aimed to identify risk loci through meta-analyses of genome-wide association studies for all epilepsy and the two largest clinical subtypes (genetic generalised epilepsy and focal epilepsy). Methods: We combined genome-wide association data from 12 cohorts of individuals with epilepsy and controls from population-based datasets. Controls were ethnically matched with cases. We phenotyped individuals with epilepsy into categories of genetic generalised epilepsy, focal epilepsy, or unclassified epilepsy. After standardised filtering for quality control and imputation to account for different genotyping platforms across sites, investigators at each site conducted a linear mixed-model association analysis for each dataset. Combining summary statistics, we conducted fixed-effects meta-analyses of all epilepsy, focal epilepsy, and genetic generalised epilepsy. We set the genome-wide significance threshold at p<1·66 × 10^-8. Findings: We included 8696 cases and 26 157 controls in our analysis. Meta-analysis of the all-epilepsy cohort identified loci at 2q24.3 (p=8·71 × 10^-10), implicating SCN1A, and at 4p15.1 (p=5·44 × 10^-9), harbouring PCDH7, which encodes a protocadherin molecule not previously implicated in epilepsy. For the cohort of genetic generalised epilepsy, we noted a single signal at 2p16.1 (p=9·99 × 10^-9), implicating VRK2 or FANCL. No single nucleotide polymorphism achieved genome-wide significance for focal epilepsy. Interpretation: This meta-analysis describes a new locus not previously implicated in epilepsy and provides further evidence about the genetic architecture of these disorders, with the

  12. Mycobacterium tuberculosis genome-wide screen exposes multiple CD8+ T cell epitopes

    PubMed Central

    Hammond, A S; Klein, M R; Corrah, T; Fox, A; Jaye, A; McAdam, K P; Brookes, R H

    2005-01-01

    Mounting evidence suggests human leucocyte antigen (HLA) class I-restricted CD8+ T cells play a role in protective immunity against tuberculosis yet relatively few epitopes specific for the causative organism, Mycobacterium tuberculosis, are reported. Here a total genome-wide screen of M. tuberculosis was used to identify putative HLA-B*3501 T cell epitopes. Of 479 predicted epitopes, 13 with the highest score were synthesized and used to restimulate lymphocytes from naturally exposed HLA-B*3501 healthy individuals in cultured and ex vivo enzyme-linked immunospot (ELISPOT) assays for interferon (IFN)-γ. All 13 peptides elicited a response that varied considerably between individuals. For three peptides CD8+ T cell lines were expanded and four of the 13 were recognized permissively through the HLA-B7 supertype family. Although further testing is required we show the genome-wide screen to be feasible for the identification of unknown mycobacterial antigens involved in immunity against natural infection. While the mechanisms of protective immunity against M. tuberculosis infection remain unclear, conventional class I-restricted CD8+ T cell responses appear to be widespread throughout the genome. PMID:15762882

  13. From Genome-Wide Association Study to Phenome-Wide Association Study: New Paradigms in Obesity Research.

    PubMed

    Zhang, Y-P; Zhang, Y-Y; Duan, D D

    2016-01-01

    Obesity is a condition in which excess body fat has accumulated to an extent that increases the risk of many chronic diseases. The current clinical classification of obesity is based on measurement of body mass index (BMI), waist-hip ratio, and body fat percentage. However, these measurements do not account for the wide individual variations in fat distribution, degree of fatness or health risks, and genetic variants identified in the genome-wide association studies (GWAS). In this review, we will address this important issue with the introduction of phenome, phenomics, and phenome-wide association study (PheWAS). We will discuss the new paradigm shift from GWAS to PheWAS in obesity research. In the era of precision medicine, phenomics and PheWAS provide the required approaches for better definition and classification of obesity according to the association of obese phenome with their unique molecular makeup, lifestyle, and environmental impact. Copyright © 2016 Elsevier Inc. All rights reserved.

  14. Genome-Wide Association Study of Erosive Tooth Wear in a Finnish Cohort.

    PubMed

    Alaraudanjoki, Viivi Karoliina; Koivisto, Salla; Pesonen, Paula; Männikkö, Minna; Leinonen, Jukka; Tjäderhane, Leo; Laitala, Marja-Liisa; Lussi, Adrian; Anttonen, Vuokko Anna-Marketta

    2018-06-13

    Erosive tooth wear is defined as irreversible loss of dental tissues due to intrinsic or extrinsic acids, exacerbated by mechanical forces. Recent studies have suggested a higher prevalence of erosive tooth wear in males, as well as a genetic contribution to susceptibility to erosive tooth wear. Our aim was to examine erosive tooth wear by performing a genome-wide association study (GWAS) in a sample of the Northern Finland Birth Cohort 1966 (n = 1,962). Erosive tooth wear was assessed clinically using the basic erosive wear examination. A GWAS was performed for the whole sample as well as separately for males and females. We identified one genome-wide significant signal (rs11681214) in the GWAS of the whole sample near the genes PXDN and MYT1L. When the sample was stratified by sex, the strongest genome-wide significant signals were observed in or near the genes FGFR1, C8orf86, CDH4, SCD5, F2R, and ING1. Additionally, multiple suggestive association signals were detected in all GWASs performed. Many of the signals were in or near the genes putatively related to oral environment or tooth development, and some were near the regions considered to be associated with dental caries, such as 2p24, 4q21, and 13q33. Replications of these associations in other samples, as well as experimental studies to determine the biological functions of associated genetic variants, are needed. © 2018 S. Karger AG, Basel.

  15. A Genome-Wide Landscape of Retrocopies in Primate Genomes.

    PubMed

    Navarro, Fábio C P; Galante, Pedro A F

    2015-07-29

    Gene duplication is a key factor contributing to phenotype diversity across and within species. Although the availability of complete genomes has led to the extensive study of genomic duplications, the dynamics and variability of gene duplications mediated by retrotransposition are not well understood. Here, we predict mRNA retrotransposition and use comparative genomics to investigate their origin and variability across primates. Analyzing seven anthropoid primate genomes, we found a similar number of mRNA retrotranspositions (∼7,500 retrocopies) in Catarrhini (Old World Monkeys, including humans), but a surprisingly large number of retrocopies (∼10,000) in Platyrrhini (New World Monkeys), which may be a by-product of higher long interspersed nuclear element 1 activity in these genomes. By inferring retrocopy orthology, we dated most of the primate retrocopy origins, and estimated a decrease in the fixation rate in recent primate history, implying a smaller number of species-specific retrocopies. Moreover, using RNA-Seq data, we identified approximately 3,600 expressed retrocopies. As expected, most of these retrocopies are located near or within known genes, present tissue-specific and even species-specific expression patterns, and show no expression correlation to their parental genes. Taken together, our results provide further evidence that mRNA retrotransposition is an active mechanism in primate evolution and suggest that retrocopies may not only introduce great genetic variability between lineages but also create a large reservoir of potentially functional new genomic loci in primate genomes. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  16. CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining

    PubMed Central

    Navarro, Carmen; Lopez, Francisco J.; Cano, Carlos; Garcia-Alcalde, Fernando; Blanco, Armando

    2014-01-01

    Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences which may appear distant from the gene promoter. Transcription factors are proteins that coordinately bind to these regions at transcription factor binding sites to regulate gene expression. Several tools allow the detection of significant co-occurrences of closely located binding sites (cis-regulatory modules, CRMs). However, these tools present at least one of the following limitations: 1) scope limited to promoter or conserved regions of the genome; 2) inability to identify combinations involving more than two motifs; 3) a requirement for prior information about target motifs. In this work we present CisMiner, a novel methodology to detect putative CRMs by means of a fuzzy itemset mining approach able to operate at genome-wide scale. CisMiner allows a blind search of CRMs without any prior information about target CRMs or limitation on the number of motifs. CisMiner tackles the combinatorial complexity of genome-wide cis-regulatory module extraction using a natural representation of motif combinations as itemsets and applying the Top-Down Fuzzy Frequent-Pattern Tree algorithm to identify significant itemsets. Fuzzy technology allows CisMiner to better handle the imprecision and noise inherent to regulatory processes. Results obtained for a set of well-known binding sites in the S. cerevisiae genome show that our method yields highly reliable predictions. Furthermore, CisMiner was also applied to putative in-silico predicted transcription factor binding sites to identify significant combinations in S. cerevisiae and D. melanogaster, proving that our approach can be further applied genome-wide to more complex genomes. CisMiner is freely accessible at: http://genome2.ugr.es/cisminer. CisMiner can be queried for the results presented in this work and can also perform a customized cis-regulatory module prediction on a query set of transcription factor binding sites provided by

  17. Significant Locus and Metabolic Genetic Correlations Revealed in Genome-Wide Association Study of Anorexia Nervosa.

    PubMed

    Duncan, Laramie; Yilmaz, Zeynep; Gaspar, Helena; Walters, Raymond; Goldstein, Jackie; Anttila, Verneri; Bulik-Sullivan, Brendan; Ripke, Stephan; Thornton, Laura; Hinney, Anke; Daly, Mark; Sullivan, Patrick F; Zeggini, Eleftheria; Breen, Gerome; Bulik, Cynthia M

    2017-09-01

    The authors conducted a genome-wide association study of anorexia nervosa and calculated genetic correlations with a series of psychiatric, educational, and metabolic phenotypes. Following uniform quality control and imputation procedures using the 1000 Genomes Project (phase 3) in 12 case-control cohorts comprising 3,495 anorexia nervosa cases and 10,982 controls, the authors performed standard association analysis followed by a meta-analysis across cohorts. Linkage disequilibrium score regression was used to calculate genome-wide common variant heritability (single-nucleotide polymorphism [SNP]-based heritability [h^2_SNP]), partitioned heritability, and genetic correlations (r_g) between anorexia nervosa and 159 other phenotypes. Results were obtained for 10,641,224 SNPs and insertion-deletion variants with minor allele frequencies >1% and imputation quality scores >0.6. The h^2_SNP of anorexia nervosa was 0.20 (SE=0.02), suggesting that a substantial fraction of the twin-based heritability arises from common genetic variation. The authors identified one genome-wide significant locus on chromosome 12 (rs4622308) in a region harboring a previously reported type 1 diabetes and autoimmune disorder locus. Significant positive genetic correlations were observed between anorexia nervosa and schizophrenia, neuroticism, educational attainment, and high-density lipoprotein cholesterol, and significant negative genetic correlations were observed between anorexia nervosa and body mass index, insulin, glucose, and lipid phenotypes. Anorexia nervosa is a complex heritable phenotype for which this study has uncovered the first genome-wide significant locus. Anorexia nervosa also has large and significant genetic correlations with both psychiatric phenotypes and metabolic traits. The study results encourage a reconceptualization of this frequently lethal disorder as one with both psychiatric and metabolic etiology.

  18. Ab Initio structure prediction for Escherichia coli: towards genome-wide protein structure modeling and fold assignment

    PubMed Central

    Xu, Dong; Zhang, Yang

    2013-01-01

    Genome-wide protein structure prediction and structure-based function annotation have been a long-term goal in molecular biology but have not yet become possible due to difficulties in modeling distant-homology targets. We developed a hybrid pipeline combining ab initio folding and template-based modeling for genome-wide structure prediction applied to the Escherichia coli genome. The pipeline was tested on 43 known sequences, where QUARK-based ab initio folding simulation generated models with TM-score 17% higher than that by traditional comparative modeling methods. For 495 unknown hard sequences, 72 are predicted to have a correct fold (TM-score > 0.5) and 321 have a substantial portion of structure correctly modeled (TM-score > 0.35). 317 sequences can be reliably assigned to a SCOP fold family based on structural analogy to existing proteins in PDB. The presented results, as a case study of E. coli, represent promising progress towards genome-wide structure modeling and fold family assignment using state-of-the-art ab initio folding algorithms. PMID:23719418

  19. Genome-wide association study of handedness excludes simple genetic models

    PubMed Central

    Armour, J AL; Davison, A; McManus, I C

    2014-01-01

    Handedness is a human behavioural phenotype that appears to be congenital, and is often assumed to be inherited, but for which the developmental origin and underlying causation(s) have been elusive. Models of the genetic basis of variation in handedness have been proposed that fit different features of the observed resemblance between relatives, but none has been decisively tested or a corresponding causative locus identified. In this study, we applied data from well-characterised individuals studied at the London Twin Research Unit. Analysis of genome-wide SNP data from 3940 twins failed to identify any locus associated with handedness at a genome-wide level of significance. The most straightforward interpretation of our analyses is that they exclude the simplest formulations of the ‘right-shift' model of Annett and the ‘dextral/chance' model of McManus, although more complex modifications of those models are still compatible with our observations. For polygenic effects, our study is inadequately powered to reliably detect alleles with effect sizes corresponding to an odds ratio of 1.2, but should have good power to detect effects at an odds ratio of 2 or more. PMID:24065183

  20. Exploiting the Proteome to Improve the Genome-Wide Genetic Analysis of Epistasis in Common Human Diseases

    PubMed Central

    Pattin, Kristine A.; Moore, Jason H.

    2009-01-01

    One of the central goals of human genetics is the identification of loci with alleles or genotypes that confer increased susceptibility. The availability of dense maps of single-nucleotide polymorphisms (SNPs) along with high-throughput genotyping technologies has set the stage for routine genome-wide association studies that are expected to significantly improve our ability to identify susceptibility loci. Before this promise can be realized, there are some significant challenges that need to be addressed. We address here the challenge of detecting epistasis or gene-gene interactions in genome-wide association studies. Discovering epistatic interactions in high dimensional datasets remains a challenge due to the computational complexity resulting from the analysis of all possible combinations of SNPs. One potential way to overcome the computational burden of a genome-wide epistasis analysis would be to devise a logical way to prioritize the many SNPs in a dataset so that the data may be analyzed more efficiently and yet still retain important biological information. One of the strongest demonstrations of the functional relationship between genes is protein-protein interaction. Thus, it is plausible that the expert knowledge extracted from protein interaction databases may allow for a more efficient analysis of genome-wide studies as well as facilitate the biological interpretation of the data. In this review we will discuss the challenges of detecting epistasis in genome-wide genetic studies and the means by which we propose to apply expert knowledge extracted from protein interaction databases to facilitate this process. We explore some of the fundamentals of protein interactions and the databases that are publicly available. PMID:18551320
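
    The combinatorial burden described in this abstract, and the idea of prioritizing SNP pairs using protein interaction knowledge, can be sketched as follows. This is an illustrative sketch only: the SNP identifiers, gene names, the SNP-to-gene mapping and the interaction set are all hypothetical, and the filtering rule is a simple assumption rather than the procedure used by the authors.

```python
from itertools import combinations
from math import comb

def count_all_pairs(n_snps):
    """Number of pairwise SNP-SNP tests in an exhaustive two-locus scan."""
    return comb(n_snps, 2)

def prioritized_pairs(snp_to_gene, interacting_genes):
    """Keep only SNP pairs whose genes are listed as interacting proteins."""
    pairs = []
    for a, b in combinations(sorted(snp_to_gene), 2):
        if frozenset((snp_to_gene[a], snp_to_gene[b])) in interacting_genes:
            pairs.append((a, b))
    return pairs

# An exhaustive scan grows quadratically with the number of SNPs.
exhaustive = count_all_pairs(500_000)

# Hypothetical toy data: four SNPs mapped to three genes, one known interaction.
snp_to_gene = {"rsA": "GENE1", "rsB": "GENE2", "rsC": "GENE3", "rsD": "GENE2"}
interacting_genes = {frozenset(("GENE1", "GENE2"))}
selected = prioritized_pairs(snp_to_gene, interacting_genes)
```

    In the toy data, only two of the six possible SNP pairs survive the filter, which is the sense in which expert knowledge can shrink the search space before any statistical test is run.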

  1. Genome-wide screen identifies a novel prognostic signature for breast cancer survival

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mao, Xuan Y.; Lee, Matthew J.; Zhu, Jeffrey

    Large genomic datasets in combination with clinical data can be used as an unbiased tool to identify genes important in patient survival and discover potential therapeutic targets. We used a genome-wide screen to identify 587 genes significantly and robustly deregulated across four independent breast cancer (BC) datasets compared to normal breast tissue. Gene expression of 381 genes was significantly associated with relapse-free survival (RFS) in BC patients. We used a gene co-expression network approach to visualize the genetic architecture in normal breast and BCs. In normal breast tissue, co-expression cliques were identified enriched for cell cycle, gene transcription, cell adhesion, cytoskeletal organization and metabolism. In contrast, in BC, only two major co-expression cliques were identified enriched for cell cycle-related processes or blood vessel development, cell adhesion and mammary gland development processes. Interestingly, gene expression levels of 7 genes were found to be negatively correlated with many cell cycle related genes, highlighting these genes as potential tumor suppressors and novel therapeutic targets. A forward-conditional Cox regression analysis was used to identify a 12-gene signature associated with RFS. A prognostic scoring system was created based on the 12-gene signature. This scoring system robustly predicted BC patient RFS in 60 sampling test sets and was further validated in TCGA and METABRIC BC data. Our integrated study identified a 12-gene prognostic signature that could guide adjuvant therapy for BC patients and includes novel potential molecular targets for therapy.

  2. Genome-wide screen identifies a novel prognostic signature for breast cancer survival

    DOE PAGES

    Mao, Xuan Y.; Lee, Matthew J.; Zhu, Jeffrey; ...

    2017-01-21

    Large genomic datasets in combination with clinical data can be used as an unbiased tool to identify genes important in patient survival and discover potential therapeutic targets. We used a genome-wide screen to identify 587 genes significantly and robustly deregulated across four independent breast cancer (BC) datasets compared to normal breast tissue. Gene expression of 381 genes was significantly associated with relapse-free survival (RFS) in BC patients. We used a gene co-expression network approach to visualize the genetic architecture in normal breast and BCs. In normal breast tissue, co-expression cliques were identified enriched for cell cycle, gene transcription, cell adhesion, cytoskeletal organization and metabolism. In contrast, in BC, only two major co-expression cliques were identified enriched for cell cycle-related processes or blood vessel development, cell adhesion and mammary gland development processes. Interestingly, gene expression levels of 7 genes were found to be negatively correlated with many cell cycle related genes, highlighting these genes as potential tumor suppressors and novel therapeutic targets. A forward-conditional Cox regression analysis was used to identify a 12-gene signature associated with RFS. A prognostic scoring system was created based on the 12-gene signature. This scoring system robustly predicted BC patient RFS in 60 sampling test sets and was further validated in TCGA and METABRIC BC data. Our integrated study identified a 12-gene prognostic signature that could guide adjuvant therapy for BC patients and includes novel potential molecular targets for therapy.

  3. Genome-wide Association for Major Depression Through Age at Onset Stratification: Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium.

    PubMed

    Power, Robert A; Tansey, Katherine E; Buttenschøn, Henriette Nørmølle; Cohen-Woods, Sarah; Bigdeli, Tim; Hall, Lynsey S; Kutalik, Zoltán; Lee, S Hong; Ripke, Stephan; Steinberg, Stacy; Teumer, Alexander; Viktorin, Alexander; Wray, Naomi R; Arolt, Volker; Baune, Bernard T; Boomsma, Dorret I; Børglum, Anders D; Byrne, Enda M; Castelao, Enrique; Craddock, Nick; Craig, Ian W; Dannlowski, Udo; Deary, Ian J; Degenhardt, Franziska; Forstner, Andreas J; Gordon, Scott D; Grabe, Hans J; Grove, Jakob; Hamilton, Steven P; Hayward, Caroline; Heath, Andrew C; Hocking, Lynne J; Homuth, Georg; Hottenga, Jouke J; Kloiber, Stefan; Krogh, Jesper; Landén, Mikael; Lang, Maren; Levinson, Douglas F; Lichtenstein, Paul; Lucae, Susanne; MacIntyre, Donald J; Madden, Pamela; Magnusson, Patrik K E; Martin, Nicholas G; McIntosh, Andrew M; Middeldorp, Christel M; Milaneschi, Yuri; Montgomery, Grant W; Mors, Ole; Müller-Myhsok, Bertram; Nyholt, Dale R; Oskarsson, Hogni; Owen, Michael J; Padmanabhan, Sandosh; Penninx, Brenda W J H; Pergadia, Michele L; Porteous, David J; Potash, James B; Preisig, Martin; Rivera, Margarita; Shi, Jianxin; Shyn, Stanley I; Sigurdsson, Engilbert; Smit, Johannes H; Smith, Blair H; Stefansson, Hreinn; Stefansson, Kari; Strohmaier, Jana; Sullivan, Patrick F; Thomson, Pippa; Thorgeirsson, Thorgeir E; Van der Auwera, Sandra; Weissman, Myrna M; Breen, Gerome; Lewis, Cathryn M

    2017-02-15

    Major depressive disorder (MDD) is a disabling mood disorder, and despite a known heritable component, a large meta-analysis of genome-wide association studies revealed no replicable genetic risk variants. Given prior evidence of heterogeneity by age at onset in MDD, we tested whether genome-wide significant risk variants for MDD could be identified in cases subdivided by age at onset. Discovery case-control genome-wide association studies were performed where cases were stratified using increasing/decreasing age-at-onset cutoffs; significant single nucleotide polymorphisms were tested in nine independent replication samples, giving a total sample of 22,158 cases and 133,749 control subjects for subsetting. Polygenic score analysis was used to examine whether differences in shared genetic risk exist between earlier and adult-onset MDD with commonly comorbid disorders of schizophrenia, bipolar disorder, Alzheimer's disease, and coronary artery disease. We identified one replicated genome-wide significant locus associated with adult-onset (>27 years) MDD (rs7647854, odds ratio: 1.16, 95% confidence interval: 1.11-1.21, p = 5.2 × 10⁻¹¹). Using polygenic score analyses, we show that earlier-onset MDD is genetically more similar to schizophrenia and bipolar disorder than adult-onset MDD. We demonstrate that using additional phenotype data previously collected by genetic studies to tackle phenotypic heterogeneity in MDD can successfully lead to the discovery of genetic risk factors despite reduced sample size. Furthermore, our results suggest that the genetic susceptibility to MDD differs between adult- and earlier-onset MDD, with earlier-onset cases having a greater genetic overlap with schizophrenia and bipolar disorder. Copyright © 2016 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.

  4. Challenges and Opportunities in Genome-Wide Environmental Interaction (GWEI) studies

    PubMed Central

    Aschard, Hugues; Lutz, Sharon; Maus, Bärbel; Duell, Eric J.; Fingerlin, Tasha; Chatterjee, Nilanjan; Kraft, Peter; Van Steen, Kristel

    2012-01-01

    The interest in performing gene-environment interaction studies has seen a significant increase with the increase of advanced molecular genetics techniques. Practically, it became possible to investigate the role of environmental factors in disease risk and hence to investigate their role as genetic effect modifiers. The understanding that genetics is important in the uptake and metabolism of toxic substances is an example of how genetic profiles can modify important environmental risk factors to disease. Several rationales exist to set up gene-environment interaction studies and the technical challenges related to these studies – when the number of environmental or genetic risk factors is relatively small – have been described before. In the post-genomic era, it is now possible to study thousands of genes and their interaction with the environment. This brings along a whole range of new challenges and opportunities. Despite a continuing effort in developing efficient methods and optimal bioinformatics infrastructures to deal with the available wealth of data, the challenge remains how to best present and analyze Genome-Wide Environmental Interaction (GWEI) studies involving multiple genetic and environmental factors. Since GWEIs are performed at the intersection of statistical genetics, bioinformatics and epidemiology, usually similar problems need to be dealt with as for Genome-Wide Association gene-gene Interaction (GWAI) studies. However, additional complexities need to be considered which are typical for large-scale epidemiological studies, but are also related to “joining” two heterogeneous types of data in explaining complex disease trait variation or for prediction purposes. PMID:22760307

  5. Genome-wide association of meat quality traits and tenderness in swine

    USDA-ARS's Scientific Manuscript database

    Pork quality has a large impact on consumer preference and perception of eating quality. A genome-wide association was performed for pork quality traits [intramuscular fat (IMF), slice shear force (SSF), color attributes, purge, cooking loss, and pH] from 531 to 1,237 records on barrows and gilts o...

  6. Mapping the sensory perception of apple using descriptive sensory evaluation in a genome wide association study

    PubMed Central

    Amyotte, Beatrice; Bowen, Amy J.; Banks, Travis; Rajcan, Istvan; Somers, Daryl J.

    2017-01-01

    Breeding apples is a long-term endeavour and it is imperative that new cultivars are selected to have outstanding consumer appeal. This study has taken the approach of merging sensory science with genome wide association analyses in order to map the human perception of apple flavour and texture onto the apple genome. The goal was to identify genomic associations that could be used in breeding apples for improved fruit quality. A collection of 85 apple cultivars was examined over two years through descriptive sensory evaluation by a trained sensory panel. The trained sensory panel scored randomized sliced samples of each apple cultivar for seventeen taste, flavour and texture attributes using controlled sensory evaluation practices. In addition, the apple collection was subjected to genotyping by sequencing for marker discovery. A genome wide association analysis suggested significant genomic associations for several sensory traits including juiciness, crispness, mealiness and fresh green apple flavour. The findings include previously unreported genomic regions that could be used in apple breeding and suggest that similar sensory association mapping methods could be applied in other plants. PMID:28231290

  7. Mapping the sensory perception of apple using descriptive sensory evaluation in a genome wide association study.

    PubMed

    Amyotte, Beatrice; Bowen, Amy J; Banks, Travis; Rajcan, Istvan; Somers, Daryl J

    2017-01-01

    Breeding apples is a long-term endeavour and it is imperative that new cultivars are selected to have outstanding consumer appeal. This study has taken the approach of merging sensory science with genome wide association analyses in order to map the human perception of apple flavour and texture onto the apple genome. The goal was to identify genomic associations that could be used in breeding apples for improved fruit quality. A collection of 85 apple cultivars was examined over two years through descriptive sensory evaluation by a trained sensory panel. The trained sensory panel scored randomized sliced samples of each apple cultivar for seventeen taste, flavour and texture attributes using controlled sensory evaluation practices. In addition, the apple collection was subjected to genotyping by sequencing for marker discovery. A genome wide association analysis suggested significant genomic associations for several sensory traits including juiciness, crispness, mealiness and fresh green apple flavour. The findings include previously unreported genomic regions that could be used in apple breeding and suggest that similar sensory association mapping methods could be applied in other plants.

  8. Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

    PubMed

    Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

    2011-01-01

    Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high-centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.

  9. [Genome-wide association study for adolescent idiopathic scoliosis].

    PubMed

    Ogura, Yoji; Kou, Ikuyo; Scoliosis, Japan; Matsumoto, Morio; Watanabe, Kota; Ikegawa, Shiro

    2016-04-01

    Adolescent idiopathic scoliosis (AIS) is a polygenic disease. Genome-wide association studies (GWASs) have been performed for a lot of polygenic diseases. For AIS, we conducted GWAS and identified the first AIS locus near LBX1. After the discovery, we have extended our study by increasing the numbers of subjects and SNPs. In total, our Japanese GWAS has identified four susceptibility genes. GWASs for AIS have also been performed in the USA and China, which identified one and three susceptibility genes, respectively. Here we review GWASs in Japan and abroad and functional analysis to clarify the pathomechanism of AIS.

  10. A genome-wide aberrant RNA splicing in patients with acute myeloid leukemia identifies novel potential disease markers and therapeutic targets.

    PubMed

    Adamia, Sophia; Haibe-Kains, Benjamin; Pilarski, Patrick M; Bar-Natan, Michal; Pevzner, Samuel; Avet-Loiseau, Herve; Lode, Laurence; Verselis, Sigitas; Fox, Edward A; Burke, John; Galinsky, Ilene; Dagogo-Jack, Ibiayi; Wadleigh, Martha; Steensma, David P; Motyckova, Gabriela; Deangelo, Daniel J; Quackenbush, John; Stone, Richard; Griffin, James D

    2014-03-01

    Despite new treatments, acute myeloid leukemia (AML) remains an incurable disease. More effective drug design requires an expanded view of the molecular complexity that underlies AML. Alternative splicing of RNA is used by normal cells to generate protein diversity. Growing evidence indicates that aberrant splicing of genes plays a key role in cancer. We investigated genome-wide splicing abnormalities in AML and based on these abnormalities, we aimed to identify novel potential biomarkers and therapeutic targets. We used genome-wide alternative splicing screening to investigate alternative splicing abnormalities in two independent AML patient cohorts [Dana-Farber Cancer Institute (DFCI) (Boston, MA) and University Hospital de Nantes (UHN) (Nantes, France)] and normal donors. Selected splicing events were confirmed through cloning and sequencing analysis, and then validated in 193 patients with AML. Our results show that approximately 29% of expressed genes genome-wide were differentially and recurrently spliced in patients with AML compared with normal donors' bone marrow CD34(+) cells. Results were reproducible in two independent AML cohorts. In both cohorts, annotation analyses indicated similar proportions of differentially spliced genes encoding several oncogenes, tumor suppressor proteins, splicing factors, and heterogeneous-nuclear-ribonucleoproteins, proteins involved in apoptosis, cell proliferation, and spliceosome assembly. Our findings are consistent with reports for other malignancies and indicate that AML-specific aberrations in splicing mechanisms are a hallmark of AML pathogenesis. Overall, our results suggest that aberrant splicing is a common characteristic for AML. Our findings also suggest that splice variant transcripts that are the result of splicing aberrations create novel disease markers and provide potential targets for small molecules or antibody therapeutics for this disease. ©2013 AACR

  11. Uprobe: a genome-wide universal probe resource for comparative physical mapping in vertebrates.

    PubMed

    Kellner, Wendy A; Sullivan, Robert T; Carlson, Brian H; Thomas, James W

    2005-01-01

    Interspecies comparisons are important for deciphering the functional content and evolution of genomes. The expansive array of >70 public vertebrate genomic bacterial artificial chromosome (BAC) libraries can provide a means of comparative mapping, sequencing, and functional analysis of targeted chromosomal segments that is independent and complementary to whole-genome sequencing. However, at the present time, no complementary resource exists for the efficient targeted physical mapping of the majority of these BAC libraries. Universal overgo-hybridization probes, designed from regions of sequenced genomes that are highly conserved between species, have been demonstrated to be an effective resource for the isolation of orthologous regions from multiple BAC libraries in parallel. Here we report the application of the universal probe design principle across entire genomes, and the subsequent creation of a complementary probe resource, Uprobe, for screening vertebrate BAC libraries. Uprobe currently consists of whole-genome sets of universal overgo-hybridization probes designed for screening mammalian or avian/reptilian libraries. Retrospective analysis, experimental validation of the probe design process on a panel of representative BAC libraries, and estimates of probe coverage across the genome indicate that the majority of all eutherian and avian/reptilian genes or regions of interest can be isolated using Uprobe. Future implementation of the universal probe design strategy will be used to create an expanded number of whole-genome probe sets that will encompass all vertebrate genomes.

  12. GEM System: automatic prototyping of cell-wide metabolic pathway models from genomes.

    PubMed

    Arakawa, Kazuharu; Yamada, Yohei; Shinoda, Kosaku; Nakayama, Yoichi; Tomita, Masaru

    2006-03-23

    Successful realization of a "systems biology" approach to analyzing cells is a grand challenge for our understanding of life. However, current modeling approaches to cell simulation are labor-intensive, manual affairs, and therefore constitute a major bottleneck in the evolution of computational cell biology. We developed the Genome-based Modeling (GEM) System for the purpose of automatically prototyping simulation models of cell-wide metabolic pathways from genome sequences and other public biological information. Models generated by the GEM System include an entire Escherichia coli metabolism model comprising 968 reactions of 1195 metabolites, achieving 100% coverage when compared with the KEGG database, 92.38% with the EcoCyc database, and 95.06% with iJR904 genome-scale model. The GEM System prototypes qualitative models to reduce the labor-intensive tasks required for systems biology research. Models of over 90 bacterial genomes are available at our web site.

  13. Genome-Wide Association Mapping and Genomic Selection for Alfalfa (Medicago sativa) Forage Quality Traits

    PubMed Central

    Pecetti, Luciano; Brummer, E. Charles; Palmonari, Alberto; Tava, Aldo

    2017-01-01

    Genetic progress for forage quality has been poor in alfalfa (Medicago sativa L.), the most-grown forage legume worldwide. This study aimed at exploring opportunities for marker-assisted selection (MAS) and genomic selection of forage quality traits based on breeding values of parent plants. Some 154 genotypes from a broadly-based reference population were genotyped by genotyping-by-sequencing (GBS), and phenotyped for leaf-to-stem ratio, leaf and stem contents of protein, neutral detergent fiber (NDF) and acid detergent lignin (ADL), and leaf and stem NDF digestibility after 24 hours (NDFD), of their dense-planted half-sib progenies in three growing conditions (summer harvest, full irrigation; summer harvest, suspended irrigation; autumn harvest). Trait-marker analyses were performed on progeny values averaged over conditions, owing to modest germplasm × condition interaction. Genomic selection exploited 11,450 polymorphic SNP markers, whereas a subset of 8,494 M. truncatula-aligned markers were used for a genome-wide association study (GWAS). GWAS confirmed the polygenic control of quality traits and, in agreement with phenotypic correlations, indicated substantially different genetic control of a given trait in stems and leaves. It detected several SNPs in different annotated genes that were highly linked to stem protein content. Also, it identified a small genomic region on chromosome 8 with high concentration of annotated genes associated with leaf ADL, including one gene probably involved in the lignin pathway. Three genomic selection models, i.e., Ridge-regression BLUP, Bayes B and Bayesian Lasso, displayed similar prediction accuracy, whereas SVR-lin was less accurate. Accuracy values were moderate (0.3–0.4) for stem NDFD and leaf protein content, modest for leaf ADL and NDFD, and low to very low for the other traits. Along with previous results for the same germplasm set, this study indicates that GBS data can be exploited to improve both quality traits

  14. Genome-Wide Association Mapping and Genomic Selection for Alfalfa (Medicago sativa) Forage Quality Traits.

    PubMed

    Biazzi, Elisa; Nazzicari, Nelson; Pecetti, Luciano; Brummer, E Charles; Palmonari, Alberto; Tava, Aldo; Annicchiarico, Paolo

    2017-01-01

    Genetic progress for forage quality has been poor in alfalfa (Medicago sativa L.), the most-grown forage legume worldwide. This study aimed at exploring opportunities for marker-assisted selection (MAS) and genomic selection of forage quality traits based on breeding values of parent plants. Some 154 genotypes from a broadly-based reference population were genotyped by genotyping-by-sequencing (GBS), and phenotyped for leaf-to-stem ratio, leaf and stem contents of protein, neutral detergent fiber (NDF) and acid detergent lignin (ADL), and leaf and stem NDF digestibility after 24 hours (NDFD), of their dense-planted half-sib progenies in three growing conditions (summer harvest, full irrigation; summer harvest, suspended irrigation; autumn harvest). Trait-marker analyses were performed on progeny values averaged over conditions, owing to modest germplasm × condition interaction. Genomic selection exploited 11,450 polymorphic SNP markers, whereas a subset of 8,494 M. truncatula-aligned markers were used for a genome-wide association study (GWAS). GWAS confirmed the polygenic control of quality traits and, in agreement with phenotypic correlations, indicated substantially different genetic control of a given trait in stems and leaves. It detected several SNPs in different annotated genes that were highly linked to stem protein content. Also, it identified a small genomic region on chromosome 8 with high concentration of annotated genes associated with leaf ADL, including one gene probably involved in the lignin pathway. Three genomic selection models, i.e., Ridge-regression BLUP, Bayes B and Bayesian Lasso, displayed similar prediction accuracy, whereas SVR-lin was less accurate. Accuracy values were moderate (0.3-0.4) for stem NDFD and leaf protein content, modest for leaf ADL and NDFD, and low to very low for the other traits. Along with previous results for the same germplasm set, this study indicates that GBS data can be exploited to improve both quality traits

  15. Genome-wide methylation analysis identifies genes silenced in non-seminoma cell lines

    PubMed Central

    Noor, Dzul Azri Mohamed; Jeyapalan, Jennie N; Alhazmi, Safiah; Carr, Matthew; Squibb, Benjamin; Wallace, Claire; Tan, Christopher; Cusack, Martin; Hughes, Jaime; Reader, Tom; Shipley, Janet; Sheer, Denise; Scotting, Paul J

    2016-01-01

    Silencing of genes by DNA methylation is a common phenomenon in many types of cancer. However, the genome-wide effect of DNA methylation on gene expression has been analysed in relatively few cancers. Germ cell tumours (GCTs) are a complex group of malignancies. They are unique in developing from a pluripotent progenitor cell. Previous analyses have suggested that non-seminomas exhibit much higher levels of DNA methylation than seminomas. The genomic targets that are methylated, the extent to which this results in gene silencing and the identity of the silenced genes most likely to play a role in the tumours’ biology have not yet been established. In this study, genome-wide methylation and expression analysis of GCT cell lines was combined with gene expression data from primary tumours to address this question. Genome methylation was analysed using the Illumina infinium HumanMethylome450 bead chip system and gene expression was analysed using Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays. Regulation by methylation was confirmed by demethylation using 5-aza-2-deoxycytidine and reverse transcription–quantitative PCR. Large differences in the level of methylation of the CpG islands of individual genes between tumour cell lines correlated well with differential gene expression. Treatment of non-seminoma cells with 5-aza-2-deoxycytidine verified that methylation of all genes tested played a role in their silencing in yolk sac tumour cells and many of these genes were also differentially expressed in primary tumours. Genes silenced by methylation in the various GCT cell lines were identified. Several pluripotency-associated genes were identified as a major functional group of silenced genes. PMID:29263807

  16. Genome-wide methylation analysis identifies genes silenced in non-seminoma cell lines.

    PubMed

    Noor, Dzul Azri Mohamed; Jeyapalan, Jennie N; Alhazmi, Safiah; Carr, Matthew; Squibb, Benjamin; Wallace, Claire; Tan, Christopher; Cusack, Martin; Hughes, Jaime; Reader, Tom; Shipley, Janet; Sheer, Denise; Scotting, Paul J

    2016-01-01

    Silencing of genes by DNA methylation is a common phenomenon in many types of cancer. However, the genome-wide effect of DNA methylation on gene expression has been analysed in relatively few cancers. Germ cell tumours (GCTs) are a complex group of malignancies. They are unique in developing from a pluripotent progenitor cell. Previous analyses have suggested that non-seminomas exhibit much higher levels of DNA methylation than seminomas. The genomic targets that are methylated, the extent to which this results in gene silencing and the identity of the silenced genes most likely to play a role in the tumours' biology have not yet been established. In this study, genome-wide methylation and expression analysis of GCT cell lines was combined with gene expression data from primary tumours to address this question. Genome methylation was analysed using the Illumina infinium HumanMethylome450 bead chip system and gene expression was analysed using Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays. Regulation by methylation was confirmed by demethylation using 5-aza-2-deoxycytidine and reverse transcription-quantitative PCR. Large differences in the level of methylation of the CpG islands of individual genes between tumour cell lines correlated well with differential gene expression. Treatment of non-seminoma cells with 5-aza-2-deoxycytidine verified that methylation of all genes tested played a role in their silencing in yolk sac tumour cells and many of these genes were also differentially expressed in primary tumours. Genes silenced by methylation in the various GCT cell lines were identified. Several pluripotency-associated genes were identified as a major functional group of silenced genes.

  17. Genome-wide detection of intervals of genetic heterogeneity associated with complex traits

    PubMed Central

    Llinares-López, Felipe; Grimm, Dominik G.; Bodenham, Dean A.; Gieraths, Udo; Sugiyama, Mahito; Rowan, Beth; Borgwardt, Karsten

    2015-01-01

    Motivation: Genetic heterogeneity, the fact that several sequence variants give rise to the same phenotype, is a phenomenon that is of the utmost interest in the analysis of complex phenotypes. Current approaches for finding regions in the genome that exhibit genetic heterogeneity suffer from at least one of two shortcomings: (i) they require the definition of an exact interval in the genome that is to be tested for genetic heterogeneity, potentially missing intervals of high relevance, or (ii) they suffer from an enormous multiple hypothesis testing problem due to the large number of potential candidate intervals being tested, which results in either many false positives or a lack of power to detect true intervals. Results: Here, we present an approach that overcomes both problems: it allows one to automatically find all contiguous sequences of single nucleotide polymorphisms in the genome that are jointly associated with the phenotype. It also solves both the inherent computational efficiency problem and the statistical problem of multiple hypothesis testing, which are both caused by the huge number of candidate intervals. We demonstrate on Arabidopsis thaliana genome-wide association study data that our approach can discover regions that exhibit genetic heterogeneity and would be missed by single-locus mapping. Conclusions: Our novel approach can contribute to the genome-wide discovery of intervals that are involved in the genetic heterogeneity underlying complex phenotypes. Availability and implementation: The code can be obtained at: http://www.bsse.ethz.ch/mlcb/research/bioinformatics-and-computational-biology/sis.html. Contact: felipe.llinares@bsse.ethz.ch Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26072488

  18. Genetics of Genome-Wide Recombination Rate Evolution in Mice from an Isolated Island.

    PubMed

    Wang, Richard J; Payseur, Bret A

    2017-08-01

    Recombination rate is a heritable quantitative trait that evolves despite the fundamentally conserved role that recombination plays in meiosis. Differences in recombination rate can alter the landscape of the genome and the genetic diversity of populations. Yet our understanding of the genetic basis of recombination rate evolution in nature remains limited. We used wild house mice (Mus musculus domesticus) from Gough Island (GI), which diverged recently from their mainland counterparts, to characterize the genetics of recombination rate evolution. We quantified genome-wide autosomal recombination rates by immunofluorescence cytology in spermatocytes from 240 F2 males generated from intercrosses between GI-derived mice and the wild-derived inbred strain WSB/EiJ. We identified four quantitative trait loci (QTL) responsible for inter-F2 variation in this trait, the strongest of which had effects that opposed the direction of the parental trait differences. Candidate genes and mutations for these QTL were identified by overlapping the detected intervals with whole-genome sequencing data and publicly available transcriptomic profiles from spermatocytes. Combined with existing studies, our findings suggest that genome-wide recombination rate divergence is not directional and its evolution within and between subspecies proceeds from distinct genetic loci. Copyright © 2017 by the Genetics Society of America.

  19. Genome-wide methylation study of diploid and triploid brown trout (Salmo trutta L.).

    PubMed

    Covelo-Soto, L; Leunda, P M; Pérez-Figueroa, A; Morán, P

    2015-06-01

    The induction of triploidization in fish is a very common practice in aquaculture. Although triploidization has been applied successfully in many salmonid species, little is known about the epigenetic mechanisms implicated in the maintenance of the normal functions of the new polyploid genome. By means of methylation-sensitive amplified polymorphism (MSAP) techniques, genome-wide methylation changes associated with triploidization were assessed in DNA samples obtained from diploid and triploid siblings of brown trout (Salmo trutta). Simple comparative body measurements showed that the triploid trout used in the study were statistically bigger, however, not heavier than their diploid counterparts. The statistical analysis of the MSAP data showed no significant differences between diploid and triploid brown trout in respect to brain, gill, heart, liver, kidney or muscle samples. Nonetheless, local analysis pointed to the possibility of differences in connection with concrete loci. This is the first study that has investigated DNA methylation alterations associated with triploidization in brown trout. Our results set the basis for new studies to be undertaken and provide a new approach concerning triploidization effects of the salmonid genome while also contributing to the better understanding of the genome-wide methylation processes. © 2015 Stichting International Foundation for Animal Genetics.

  20. Genome-Wide Networks of Amino Acid Covariances Are Common among Viruses

    PubMed Central

    Donlin, Maureen J.; Szeto, Brandon; Gohara, David W.; Aurora, Rajeev

    2012-01-01

    Coordinated variation among positions in amino acid sequence alignments can reveal genetic dependencies at noncontiguous positions, but methods to assess these interactions are incompletely developed. Previously, we found genome-wide networks of covarying residue positions in the hepatitis C virus genome (R. Aurora, M. J. Donlin, N. A. Cannon, and J. E. Tavis, J. Clin. Invest. 119:225–236, 2009). Here, we asked whether such networks are present in a diverse set of viruses and, if so, what they may imply about viral biology. Viral sequences were obtained for 16 viruses in 13 species from 9 families. The entire viral coding potential for each virus was aligned, all possible amino acid covariances were identified using the observed-minus-expected-squared algorithm at a false-discovery rate of ≤1%, and networks of covariances were assessed using standard methods. Covariances that spanned the viral coding potential were common in all viruses. In all cases, the covariances formed a single network that contained essentially all of the covariances. The hepatitis C virus networks had hub-and-spoke topologies, but all other networks had random topologies with an unusually large number of highly connected nodes. These results indicate that genome-wide networks of genetic associations and the coordinated evolution they imply are very common in viral genomes, that the networks rarely have the hub-and-spoke topology that dominates other biological networks, and that network topologies can vary substantially even within a given viral group. Five examples with hepatitis B virus and poliovirus are presented to illustrate how covariance network analysis can lead to inferences about viral biology. PMID:22238298

  1. Integration of mouse and human genome-wide association data identifies KCNIP4 as an asthma gene.

    PubMed

    Himes, Blanca E; Sheppard, Keith; Berndt, Annerose; Leme, Adriana S; Myers, Rachel A; Gignoux, Christopher R; Levin, Albert M; Gauderman, W James; Yang, James J; Mathias, Rasika A; Romieu, Isabelle; Torgerson, Dara G; Roth, Lindsey A; Huntsman, Scott; Eng, Celeste; Klanderman, Barbara; Ziniti, John; Senter-Sylvia, Jody; Szefler, Stanley J; Lemanske, Robert F; Zeiger, Robert S; Strunk, Robert C; Martinez, Fernando D; Boushey, Homer; Chinchilli, Vernon M; Israel, Elliot; Mauger, David; Koppelman, Gerard H; Postma, Dirkje S; Nieuwenhuis, Maartje A E; Vonk, Judith M; Lima, John J; Irvin, Charles G; Peters, Stephen P; Kubo, Michiaki; Tamari, Mayumi; Nakamura, Yusuke; Litonjua, Augusto A; Tantisira, Kelan G; Raby, Benjamin A; Bleecker, Eugene R; Meyers, Deborah A; London, Stephanie J; Barnes, Kathleen C; Gilliland, Frank D; Williams, L Keoki; Burchard, Esteban G; Nicolae, Dan L; Ober, Carole; DeMeo, Dawn L; Silverman, Edwin K; Paigen, Beverly; Churchill, Gary; Shapiro, Steve D; Weiss, Scott T

    2013-01-01

    Asthma is a common chronic respiratory disease characterized by airway hyperresponsiveness (AHR). The genetics of asthma have been widely studied in mouse and human, and homologous genomic regions have been associated with mouse AHR and human asthma-related phenotypes. Our goal was to identify asthma-related genes by integrating AHR associations in mouse with human genome-wide association study (GWAS) data. We used Efficient Mixed Model Association (EMMA) analysis to conduct a GWAS of baseline AHR measures from males and females of 31 mouse strains. Genes near or containing SNPs with EMMA p-values <0.001 were selected for further study in human GWAS. The results of the previously reported EVE consortium asthma GWAS meta-analysis consisting of 12,958 diverse North American subjects from 9 study centers were used to select a subset of homologous genes with evidence of association with asthma in humans. Following validation attempts in three human asthma GWAS (i.e., Sepracor/LOCCS/LODO/Illumina, GABRIEL, DAG) and two human AHR GWAS (i.e., SHARP, DAG), the Kv channel interacting protein 4 (KCNIP4) gene was identified as nominally associated with both asthma and AHR at a gene- and SNP-level. In EVE, the smallest KCNIP4 association was at rs6833065 (P-value 2.9e-04), while the strongest associations for Sepracor/LOCCS/LODO/Illumina, GABRIEL, DAG were 1.5e-03, 1.0e-03, 3.1e-03 at rs7664617, rs4697177, rs4696975, respectively. At a SNP level, the strongest association across all asthma GWAS was at rs4697177 (P-value 1.1e-04). The smallest P-values for association with AHR were 2.3e-03 at rs11947661 in SHARP and 2.1e-03 at rs402802 in DAG. Functional studies are required to validate the potential involvement of KCNIP4 in modulating asthma susceptibility and/or AHR. Our results suggest that a useful approach to identify genes associated with human asthma is to leverage mouse AHR association data.

  2. Large meta-analysis of genome-wide association studies identifies five loci for lean body mass.

    PubMed

    Zillikens, M Carola; Demissie, Serkalem; Hsu, Yi-Hsiang; Yerges-Armstrong, Laura M; Chou, Wen-Chi; Stolk, Lisette; Livshits, Gregory; Broer, Linda; Johnson, Toby; Koller, Daniel L; Kutalik, Zoltán; Luan, Jian'an; Malkin, Ida; Ried, Janina S; Smith, Albert V; Thorleifsson, Gudmar; Vandenput, Liesbeth; Hua Zhao, Jing; Zhang, Weihua; Aghdassi, Ali; Åkesson, Kristina; Amin, Najaf; Baier, Leslie J; Barroso, Inês; Bennett, David A; Bertram, Lars; Biffar, Rainer; Bochud, Murielle; Boehnke, Michael; Borecki, Ingrid B; Buchman, Aron S; Byberg, Liisa; Campbell, Harry; Campos Obanda, Natalia; Cauley, Jane A; Cawthon, Peggy M; Cederberg, Henna; Chen, Zhao; Cho, Nam H; Jin Choi, Hyung; Claussnitzer, Melina; Collins, Francis; Cummings, Steven R; De Jager, Philip L; Demuth, Ilja; Dhonukshe-Rutten, Rosalie A M; Diatchenko, Luda; Eiriksdottir, Gudny; Enneman, Anke W; Erdos, Mike; Eriksson, Johan G; Eriksson, Joel; Estrada, Karol; Evans, Daniel S; Feitosa, Mary F; Fu, Mao; Garcia, Melissa; Gieger, Christian; Girke, Thomas; Glazer, Nicole L; Grallert, Harald; Grewal, Jagvir; Han, Bok-Ghee; Hanson, Robert L; Hayward, Caroline; Hofman, Albert; Hoffman, Eric P; Homuth, Georg; Hsueh, Wen-Chi; Hubal, Monica J; Hubbard, Alan; Huffman, Kim M; Husted, Lise B; Illig, Thomas; Ingelsson, Erik; Ittermann, Till; Jansson, John-Olov; Jordan, Joanne M; Jula, Antti; Karlsson, Magnus; Khaw, Kay-Tee; Kilpeläinen, Tuomas O; Klopp, Norman; Kloth, Jacqueline S L; Koistinen, Heikki A; Kraus, William E; Kritchevsky, Stephen; Kuulasmaa, Teemu; Kuusisto, Johanna; Laakso, Markku; Lahti, Jari; Lang, Thomas; Langdahl, Bente L; Launer, Lenore J; Lee, Jong-Young; Lerch, Markus M; Lewis, Joshua R; Lind, Lars; Lindgren, Cecilia; Liu, Yongmei; Liu, Tian; Liu, Youfang; Ljunggren, Östen; Lorentzon, Mattias; Luben, Robert N; Maixner, William; McGuigan, Fiona E; Medina-Gomez, Carolina; Meitinger, Thomas; Melhus, Håkan; Mellström, Dan; Melov, Simon; Michaëlsson, Karl; Mitchell, Braxton D; Morris, Andrew P; Mosekilde, Leif; Newman, Anne; Nielson, Carrie M; O'Connell, Jeffrey R; Oostra, Ben A; Orwoll, Eric S; Palotie, Aarno; Parker, Stephen C J; Peacock, Munro; Perola, Markus; Peters, Annette; Polasek, Ozren; Prince, Richard L; Räikkönen, Katri; Ralston, Stuart H; Ripatti, Samuli; Robbins, John A; Rotter, Jerome I; Rudan, Igor; Salomaa, Veikko; Satterfield, Suzanne; Schadt, Eric E; Schipf, Sabine; Scott, Laura; Sehmi, Joban; Shen, Jian; Soo Shin, Chan; Sigurdsson, Gunnar; Smith, Shad; Soranzo, Nicole; Stančáková, Alena; Steinhagen-Thiessen, Elisabeth; Streeten, Elizabeth A; Styrkarsdottir, Unnur; Swart, Karin M A; Tan, Sian-Tsung; Tarnopolsky, Mark A; Thompson, Patricia; Thomson, Cynthia A; Thorsteinsdottir, Unnur; Tikkanen, Emmi; Tranah, Gregory J; Tuomilehto, Jaakko; van Schoor, Natasja M; Verma, Arjun; Vollenweider, Peter; Völzke, Henry; Wactawski-Wende, Jean; Walker, Mark; Weedon, Michael N; Welch, Ryan; Wichmann, H-Erich; Widen, Elisabeth; Williams, Frances M K; Wilson, James F; Wright, Nicole C; Xie, Weijia; Yu, Lei; Zhou, Yanhua; Chambers, John C; Döring, Angela; van Duijn, Cornelia M; Econs, Michael J; Gudnason, Vilmundur; Kooner, Jaspal S; Psaty, Bruce M; Spector, Timothy D; Stefansson, Kari; Rivadeneira, Fernando; Uitterlinden, André G; Wareham, Nicholas J; Ossowski, Vicky; Waterworth, Dawn; Loos, Ruth J F; Karasik, David; Harris, Tamara B; Ohlsson, Claes; Kiel, Douglas P

    2017-07-19

    Lean body mass, consisting mostly of skeletal muscle, is important for healthy aging. We performed a genome-wide association study for whole body (20 cohorts of European ancestry with n = 38,292) and appendicular (arms and legs) lean body mass (n = 28,330) measured using dual energy X-ray absorptiometry or bioelectrical impedance analysis, adjusted for sex, age, height, and fat mass. Twenty-one single-nucleotide polymorphisms were significantly associated with lean body mass either genome wide (p < 5 × 10^-8) or suggestively genome wide (p < 2.3 × 10^-6). Replication in 63,475 (47,227 of European ancestry) individuals from 33 cohorts for whole body lean body mass and in 45,090 (42,360 of European ancestry) subjects from 25 cohorts for appendicular lean body mass was successful for five single-nucleotide polymorphisms in/near HSD17B11, VCAN, ADAMTSL3, IRS1, and FTO for total lean body mass and for three single-nucleotide polymorphisms in/near VCAN, ADAMTSL3, and IRS1 for appendicular lean body mass. Our findings provide new insight into the genetics of lean body mass. Lean body mass is a highly heritable trait and is associated with various health conditions. Here, Kiel and colleagues perform a meta-analysis of genome-wide association studies for whole body lean body mass and find five novel genetic loci to be significantly associated.

  3. Educational Attainment: A Genome Wide Association Study in 9538 Australians

    PubMed Central

    Martin, Nicolas W.; Medland, Sarah E.; Verweij, Karin J. H.; Lee, S. Hong; Nyholt, Dale R.; Madden, Pamela A.; Heath, Andrew C.; Montgomery, Grant W.; Wright, Margaret J.; Martin, Nicholas G.

    2011-01-01

    Background: Correlations between Educational Attainment (EA) and measures of cognitive performance are as high as 0.8. This makes EA an attractive alternative phenotype for studies wishing to map genes affecting cognition due to the ease of collecting EA data compared to other cognitive phenotypes such as IQ. Methodology: In an Australian family sample of 9538 individuals we performed a genome-wide association scan (GWAS) using the imputed genotypes of ∼2.4 million single nucleotide polymorphisms (SNPs) for a 6-point scale measure of EA. Top hits were checked for replication in an independent sample of 968 individuals. A gene-based test of association was then applied to the GWAS results. Additionally we performed prediction analyses using the GWAS results from our discovery sample to assess the percentage of EA and full scale IQ variance explained by the predicted scores. Results: The best SNP fell short of having a genome-wide significant p-value (p = 9.77 × 10^-7). In our independent replication sample six SNPs among the top 50 hits pruned for linkage disequilibrium (r^2 < 0.8) had a p-value < 0.05 but only one of these SNPs survived correction for multiple testing: rs7106258 (p = 9.7 × 10^-4), located in an intergenic region of chromosome 11q14.1. The gene-based test results were non-significant and our prediction analyses show that the predicted scores explained little variance in EA in our replication sample. Conclusion: While we have identified a polymorphism on chromosome 11q14.1 associated with EA, further replication is warranted. Overall, the absence of genome-wide significant p-values in our large discovery sample confirmed the highly polygenic architecture of EA. Only the assembly of large samples or meta-analytic efforts will be able to assess the implication of common DNA polymorphisms in the etiology of EA. PMID:21694764
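
    The replication result above is consistent with a Bonferroni correction over the 50 pruned top hits, although the abstract does not say which correction was used. The Python sketch below works through that arithmetic under this assumption; the function names and the default of 50 tests are illustrative choices, not details from the study.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance cutoff under a Bonferroni correction."""
    return alpha / n_tests


def survives_correction(p_value, alpha=0.05, n_tests=50):
    """True if p_value passes the corrected cutoff (assumed: 50 tests)."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# p-value reported in the abstract for rs7106258 in the replication sample.
rs7106258_p = 9.7e-4

threshold = bonferroni_threshold(0.05, 50)       # about 0.001
replicated = survives_correction(rs7106258_p)    # passes the corrected cutoff
nominal_only = survives_correction(0.04)         # nominally < 0.05, fails correction
```

    Under this assumed correction, a SNP with a nominal p-value such as 0.04 would count among the six nominally significant hits but would not survive, whereas rs7106258 falls just below the corrected cutoff.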

  4. Ensembl Genomes 2013: scaling up access to genome-wide data

    USDA-ARS's Scientific Manuscript database

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provi...

  5. Genome-Wide Identification of Regulatory Sequences Undergoing Accelerated Evolution in the Human Genome.

    PubMed

    Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong

    2016-10-01

    Accelerated evolution of regulatory sequences can alter the expression pattern of target genes and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidence strongly suggest that accelerated evolution of regulatory sequences plays an important role in the evolution of human-specific phenotypes. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  6. Genome-wide identification, isolation and expression analysis of auxin response factor (ARF) gene family in sweet orange (Citrus sinensis)

    PubMed Central

    Li, Si-Bei; OuYang, Wei-Zhi; Hou, Xiao-Jin; Xie, Liang-Liang; Hu, Chun-Gen; Zhang, Jin-Zhi

    2015-01-01

    Auxin response factors (ARFs) are an important family of proteins in auxin-mediated response, with key roles in various physiological and biochemical processes. To date, a genome-wide overview of the ARF gene family in citrus has not been available. A systematic analysis of this gene family in citrus was begun by carrying out a genome-wide search for the homologs of ARFs. A total of 19 nonredundant ARF genes (CiARF) were found and validated from the sweet orange. A comprehensive overview of the CiARFs was undertaken, including the gene structures, phylogenetic analysis, chromosome locations, conserved motifs of proteins, and cis-elements in promoters of CiARF. Furthermore, expression profiling using real-time PCR revealed expression of many CiARF genes, albeit with different patterns depending on tissue type and/or developmental stage. Comprehensive expression analysis of these genes was also performed under two hormone treatments using real-time PCR. Indole-3-acetic acid (IAA) and N-1-naphthylphthalamic acid (NPA) treatment experiments revealed differential up-regulation and down-regulation, respectively, of the 19 citrus ARF genes in the callus of sweet orange. Our comprehensive analysis of ARF genes further elucidates the roles of CiARF family members during citrus growth and development. PMID:25870601

  7. Genome wide selection in Citrus breeding.

    PubMed

    Gois, I B; Borém, A; Cristofani-Yaly, M; de Resende, M D V; Azevedo, C F; Bastianel, M; Novelli, V M; Machado, M A

    2016-10-17

    Genome wide selection (GWS) is essential for the genetic improvement of perennial species such as Citrus because of its ability to increase gain per unit time and to enable the efficient selection of characteristics with low heritability. This study assessed GWS efficiency in a population of Citrus and compared it with selection based on phenotypic data. A total of 180 individual trees from a cross between Pera sweet orange (Citrus sinensis Osbeck) and Murcott tangor (Citrus sinensis Osbeck x Citrus reticulata Blanco) were evaluated for 10 characteristics related to fruit quality. The hybrids were genotyped using 5287 DArT_seq™ (diversity arrays technology) molecular markers and their effects on phenotypes were predicted using the random regression best linear unbiased predictor (rr-BLUP) method. The predictive ability, prediction bias, and accuracy of GWS were estimated to verify its effectiveness for phenotype prediction. The proportion of genetic variance explained by the markers was also computed. The heritability of the traits, as determined by markers, was 16-28%. The predictive ability of these markers ranged from 0.53 to 0.64, and the regression coefficients between predicted and observed phenotypes were close to unity. Over 35% of the genetic variance was accounted for by the markers. Accuracy estimates with GWS were lower than those obtained by phenotypic analysis; however, GWS was superior in terms of genetic gain per unit time. Thus, GWS may be useful for Citrus breeding as it can predict phenotypes early and accurately, and reduce the length of the selection cycle. This study demonstrates the feasibility of genomic selection in Citrus.
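
    The abstract above reports predictive ability and regression coefficients without defining them. A common convention in genomic selection work, assumed here rather than taken from the paper, is that predictive ability is the Pearson correlation between predicted and observed phenotypes and that prediction bias is judged from the slope of observed on predicted values (a slope near 1 indicating little bias). A minimal Python sketch with made-up numbers follows; the function name and the data are purely illustrative.

```python
from math import sqrt


def predictive_ability_and_slope(predicted, observed):
    """Return (Pearson r, slope of observed regressed on predicted)."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n

    # Sums of squares and cross-products around the means.
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_po = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))

    r = s_po / sqrt(s_pp * s_oo)   # predictive ability (assumed definition)
    slope = s_po / s_pp            # near 1 suggests unbiased predictions
    return r, slope


# Illustrative values only, not data from the study.
predicted_values = [1.0, 2.0, 3.0, 4.0]
observed_values = [2.0, 4.0, 6.0, 8.0]
r, slope = predictive_ability_and_slope(predicted_values, observed_values)
```

    In this toy example the predictions rank the phenotypes perfectly (r of 1) but understate their spread (slope of 2), which shows why the two quantities are reported separately.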

  8. Asbestos-associated genome-wide DNA methylation changes in lung cancer.

    PubMed

    Kettunen, Eeva; Hernandez-Vargas, Hector; Cros, Marie-Pierre; Durand, Geoffroy; Le Calvez-Kelm, Florence; Stuopelyte, Kristina; Jarmalaite, Sonata; Salmenkivi, Kaisa; Anttila, Sisko; Wolff, Henrik; Herceg, Zdenko; Husgafvel-Pursiainen, Kirsti

    2017-11-15

    Previous studies have revealed a robust association between exposure to asbestos and human lung cancer. Accumulating evidence has highlighted the role of epigenome deregulation in the mechanism of carcinogen-induced malignancies. We examined the impact of asbestos on DNA methylation. Our genome-wide studies (using Illumina HumanMethylation450K BeadChip) of lung cancer tissue and paired normal lung from 28 asbestos-exposed or non-exposed patients, mostly smokers, revealed distinctive DNA methylation changes. We identified a number of differentially methylated regions (DMR) and differentially variable, differentially methylated CpGs (DVMC), with individual CpGs further validated by pyrosequencing in an independent series of 91 non-small cell lung cancer and paired normal lung. We discovered and validated BEND4, ZSCAN31 and GPR135 as significantly hypermethylated in lung cancer. DMRs in genes such as RARB (FDR 1.1 × 10^-19, mean change in beta [Δ] -0.09), GPR135 (FDR 1.87 × 10^-8, mean Δ -0.09) and TPO (FDR 8.58 × 10^-5, mean Δ -0.11), and DVMCs in NPTN, NRG2, GLT25D2 and TRPC3 (all with p < 0.05, t-test) were significantly associated with asbestos exposure status in exposed versus non-exposed lung tumors. Hypomethylation was characteristic of DVMCs in lung cancer tissue from asbestos-exposed subjects. When DVMCs related to asbestos or smoking were analyzed, 96% of the elements were unique to either of the exposures, consistent with the concept that the methylation changes in tumors may be specific for risk factors. In conclusion, we identified novel DNA methylation changes associated with lung tumors and asbestos exposure, suggesting that changes may be present in the causal pathway from asbestos exposure to lung cancer. © 2017 UICC.
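
    The FDR values quoted above are adjusted significance measures, but the abstract does not state which procedure produced them. The Python sketch below shows the widely used Benjamini-Hochberg adjustment as one plausible reading; the function name and the example p-values are illustrative and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda k: p_values[k])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative raw p-values only.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
```

    Each adjusted value estimates the false-discovery rate incurred if that test and all tests with smaller p-values were declared significant.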

  9. Human Genomic Loci Important in Common Infectious Diseases: Role of High-Throughput Sequencing and Genome-Wide Association Studies

    PubMed Central

    Sserwadda, Ivan; Amujal, Marion; Namatovu, Norah

    2018-01-01

    HIV/AIDS, tuberculosis (TB), and malaria are 3 major global public health threats that undermine development in many resource-poor settings. Recently, the notion that positive selection during epidemics or longer periods of exposure to common infectious diseases may have had a major effect in modifying the constitution of the human genome is being interrogated at a large scale in many populations around the world. This positive selection from infectious diseases increases power to detect associations in genome-wide association studies (GWASs). High-throughput sequencing (HTS) has transformed the management of infectious diseases and continues to enable large-scale functional characterization of host resistance/susceptibility alleles and loci, a paradigm shift from single candidate gene studies. Application of genome sequencing technologies and genomics has enabled us to interrogate the host-pathogen interface for improving human health. Human populations are constantly locked in evolutionary arms races with pathogens; therefore, identification of common infectious disease-associated genomic variants/markers is important for therapeutic and vaccine development and for screening susceptible individuals in a population. This review describes a range of host-pathogen genomic loci that have been associated with disease susceptibility and resistance patterns in the era of HTS. We further highlight potential opportunities for these genetic markers. PMID:29755620

  10. Genome-wide scan of healthy human connectome discovers SPON1 gene variant influencing dementia severity

    PubMed Central

    Jahanshad, Neda; Rajagopalan, Priya; Hua, Xue; Hibar, Derrek P.; Nir, Talia M.; Toga, Arthur W.; Jack, Clifford R.; Saykin, Andrew J.; Green, Robert C.; Weiner, Michael W.; Medland, Sarah E.; Montgomery, Grant W.; Hansell, Narelle K.; McMahon, Katie L.; de Zubicaray, Greig I.; Martin, Nicholas G.; Wright, Margaret J.; Thompson, Paul M.; Weiner, Michael; Aisen, Paul; Weiner, Michael; Aisen, Paul; Petersen, Ronald; Jack, Clifford R.; Jagust, William; Trojanowski, John Q.; Toga, Arthur W.; Beckett, Laurel; Green, Robert C.; Saykin, Andrew J.; Morris, John; Liu, Enchi; Green, Robert C.; Montine, Tom; Petersen, Ronald; Aisen, Paul; Gamst, Anthony; Thomas, Ronald G.; Donohue, Michael; Walter, Sarah; Gessert, Devon; Sather, Tamie; Beckett, Laurel; Harvey, Danielle; Gamst, Anthony; Donohue, Michael; Kornak, John; Jack, Clifford R.; Dale, Anders; Bernstein, Matthew; Felmlee, Joel; Fox, Nick; Thompson, Paul; Schuff, Norbert; Alexander, Gene; DeCarli, Charles; Jagust, William; Bandy, Dan; Koeppe, Robert A.; Foster, Norm; Reiman, Eric M.; Chen, Kewei; Mathis, Chet; Morris, John; Cairns, Nigel J.; Taylor-Reinwald, Lisa; Trojanowki, J.Q.; Shaw, Les; Lee, Virginia M.Y.; Korecka, Magdalena; Toga, Arthur W.; Crawford, Karen; Neu, Scott; Saykin, Andrew J.; Foroud, Tatiana M.; Potkin, Steven; Shen, Li; Khachaturian, Zaven; Frank, Richard; Snyder, Peter J.; Molchan, Susan; Kaye, Jeffrey; Quinn, Joseph; Lind, Betty; Dolen, Sara; Schneider, Lon S.; Pawluczyk, Sonia; Spann, Bryan M.; Brewer, James; Vanderswag, Helen; Heidebrink, Judith L.; Lord, Joanne L.; Petersen, Ronald; Johnson, Kris; Doody, Rachelle S.; Villanueva-Meyer, Javier; Chowdhury, Munir; Stern, Yaakov; Honig, Lawrence S.; Bell, Karen L.; Morris, John C.; Ances, Beau; Carroll, Maria; Leon, Sue; Mintun, Mark A.; Schneider, Stacy; Marson, Daniel; Griffith, Randall; Clark, David; Grossman, Hillel; Mitsis, Effie; Romirowsky, Aliza; deToledo-Morrell, Leyla; Shah, Raj C.; Duara, Ranjan; Varon, Daniel; Roberts, Peggy; Albert, Marilyn; Onyike, Chiadi; Kielb, Stephanie; Rusinek, Henry; de Leon, Mony J.; Glodzik, Lidia; De Santi, Susan; Doraiswamy, P. Murali; Petrella, Jeffrey R.; Coleman, R. Edward; Arnold, Steven E.; Karlawish, Jason H.; Wolk, David; Smith, Charles D.; Jicha, Greg; Hardy, Peter; Lopez, Oscar L.; Oakley, MaryAnn; Simpson, Donna M.; Porsteinsson, Anton P.; Goldstein, Bonnie S.; Martin, Kim; Makino, Kelly M.; Ismail, M. Saleem; Brand, Connie; Mulnard, Ruth A.; Thai, Gaby; Mc-Adams-Ortiz, Catherine; Womack, Kyle; Mathews, Dana; Quiceno, Mary; Diaz-Arrastia, Ramon; King, Richard; Weiner, Myron; Martin-Cook, Kristen; DeVous, Michael; Levey, Allan I.; Lah, James J.; Cellar, Janet S.; Burns, Jeffrey M.; Anderson, Heather S.; Swerdlow, Russell H.; Apostolova, Liana; Lu, Po H.; Bartzokis, George; Silverman, Daniel H.S.; Graff-Radford, Neill R.; Parfitt, Francine; Johnson, Heather; Farlow, Martin R.; Hake, Ann Marie; Matthews, Brandy R.; Herring, Scott; van Dyck, Christopher H.; Carson, Richard E.; MacAvoy, Martha G.; Chertkow, Howard; Bergman, Howard; Hosein, Chris; Black, Sandra; Stefanovic, Bojana; Caldwell, Curtis; Hsiung, Ging-Yuek Robin; Feldman, Howard; Mudge, Benita; Assaly, Michele; Kertesz, Andrew; Rogers, John; Trost, Dick; Bernick, Charles; Munic, Donna; Kerwin, Diana; Mesulam, Marek-Marsel; Lipowski, Kristina; Wu, Chuang-Kuo; Johnson, Nancy; Sadowsky, Carl; Martinez, Walter; Villena, Teresa; Turner, Raymond Scott; Johnson, Kathleen; Reynolds, Brigid; Sperling, Reisa A.; Johnson, Keith A.; Marshall, Gad; Frey, Meghan; Yesavage, Jerome; Taylor, Joy L.; Lane, Barton; Rosen, Allyson; Tinklenberg, Jared; Sabbagh, Marwan; Belden, Christine; Jacobson, Sandra; Kowall, Neil; Killiany, Ronald; Budson, Andrew E.; Norbash, Alexander; Johnson, Patricia Lynn; Obisesan, Thomas O.; Wolday, Saba; Bwayo, Salome K.; Lerner, Alan; Hudson, Leon; Ogrocki, Paula; Fletcher, Evan; Carmichael, Owen; Olichney, John; DeCarli, Charles; Kittur, Smita; Borrie, Michael; Lee, T.-Y.; Bartha, Rob; Johnson, Sterling; Asthana, Sanjay; Carlsson, Cynthia M.; Potkin, Steven G.; Preda, Adrian; Nguyen, Dana; Tariot, Pierre; Fleisher, Adam; Reeder, Stephanie; Bates, Vernice; Capote, Horacio; Rainka, Michelle; Scharre, Douglas W.; Kataki, Maria; Zimmerman, Earl A.; Celmins, Dzintra; Brown, Alice D.; Pearlson, Godfrey D.; Blank, Karen; Anderson, Karen; Saykin, Andrew J.; Santulli, Robert B.; Schwartz, Eben S.; Sink, Kaycee M.; Williamson, Jeff D.; Garg, Pradeep; Watkins, Franklin; Ott, Brian R.; Querfurth, Henry; Tremont, Geoffrey; Salloway, Stephen; Malloy, Paul; Correia, Stephen; Rosen, Howard J.; Miller, Bruce L.; Mintzer, Jacobo; Longmire, Crystal Flynn; Spicer, Kenneth; Finger, Elizabeth; Rachinsky, Irina; Rogers, John; Kertesz, Andrew; Drost, Dick

    2013-01-01

    Aberrant connectivity is implicated in many neurological and psychiatric disorders, including Alzheimer’s disease and schizophrenia. However, other than a few disease-associated candidate genes, we know little about the degree to which genetics play a role in the brain networks; we know even less about specific genes that influence brain connections. Twin and family-based studies can generate estimates of overall genetic influences on a trait, but genome-wide association scans (GWASs) can screen the genome for specific variants influencing the brain or risk for disease. To identify the heritability of various brain connections, we scanned healthy young adult twins with high-field, high-angular resolution diffusion MRI. We adapted GWASs to screen the brain’s connectivity pattern, allowing us to discover genetic variants that affect the human brain’s wiring. The association of connectivity with the SPON1 variant at rs2618516 on chromosome 11 (11p15.2) reached connectome-wide, genome-wide significance after stringent statistical corrections were enforced, and it was replicated in an independent subsample. rs2618516 was shown to affect brain structure in an elderly population with varying degrees of dementia. Older people who carried the connectivity variant had significantly milder clinical dementia scores and lower risk of Alzheimer’s disease. As a posthoc analysis, we conducted GWASs on several organizational and topological network measures derived from the matrices to discover variants in and around genes associated with autism (MACROD2), development (NEDD4), and mental retardation (UBE2A) significantly associated with connectivity. Connectome-wide, genome-wide screening offers substantial promise to discover genes affecting brain connectivity and risk for brain diseases. PMID:23471985

  11. Genome-wide scan of healthy human connectome discovers SPON1 gene variant influencing dementia severity.

    PubMed

    Jahanshad, Neda; Rajagopalan, Priya; Hua, Xue; Hibar, Derrek P; Nir, Talia M; Toga, Arthur W; Jack, Clifford R; Saykin, Andrew J; Green, Robert C; Weiner, Michael W; Medland, Sarah E; Montgomery, Grant W; Hansell, Narelle K; McMahon, Katie L; de Zubicaray, Greig I; Martin, Nicholas G; Wright, Margaret J; Thompson, Paul M

    2013-03-19

    Aberrant connectivity is implicated in many neurological and psychiatric disorders, including Alzheimer's disease and schizophrenia. However, other than a few disease-associated candidate genes, we know little about the degree to which genetics play a role in the brain networks; we know even less about specific genes that influence brain connections. Twin and family-based studies can generate estimates of overall genetic influences on a trait, but genome-wide association scans (GWASs) can screen the genome for specific variants influencing the brain or risk for disease. To identify the heritability of various brain connections, we scanned healthy young adult twins with high-field, high-angular resolution diffusion MRI. We adapted GWASs to screen the brain's connectivity pattern, allowing us to discover genetic variants that affect the human brain's wiring. The association of connectivity with the SPON1 variant at rs2618516 on chromosome 11 (11p15.2) reached connectome-wide, genome-wide significance after stringent statistical corrections were enforced, and it was replicated in an independent subsample. rs2618516 was shown to affect brain structure in an elderly population with varying degrees of dementia. Older people who carried the connectivity variant had significantly milder clinical dementia scores and lower risk of Alzheimer's disease. As a posthoc analysis, we conducted GWASs on several organizational and topological network measures derived from the matrices to discover variants in and around genes associated with autism (MACROD2), development (NEDD4), and mental retardation (UBE2A) significantly associated with connectivity. Connectome-wide, genome-wide screening offers substantial promise to discover genes affecting brain connectivity and risk for brain diseases.

  12. Systems genetics of obesity in an F2 pig model by genome-wide association, genetic network, and pathway analyses

    PubMed Central

    Kogelman, Lisette J. A.; Pant, Sameer D.; Fredholm, Merete; Kadarmideen, Haja N.

    2014-01-01

    Obesity is a complex condition with world-wide exponentially rising prevalence rates, linked with severe diseases like Type 2 Diabetes. Economic and welfare consequences have led to a raised interest in a better understanding of the biological and genetic background. To date, whole genome investigations focusing on single genetic variants have achieved limited success, and the importance of including genetic interactions is becoming evident. Here, the aim was to perform an integrative genomic analysis in an F2 pig resource population that was constructed with an aim to maximize genetic variation of obesity-related phenotypes and genotyped using the 60K SNP chip. Firstly, Genome Wide Association (GWA) analysis was performed on the Obesity Index to locate candidate genomic regions that were further validated using combined Linkage Disequilibrium Linkage Analysis and investigated by evaluation of haplotype blocks. We built Weighted Interaction SNP Hub (WISH) and differentially wired (DW) networks using genotypic correlations amongst obesity-associated SNPs resulting from GWA analysis. GWA results and SNP modules detected by WISH and DW analyses were further investigated by functional enrichment analyses. The functional annotation of SNPs revealed several genes associated with obesity, e.g., NPC2 and OR4D10. Moreover, gene enrichment analyses identified several significantly associated pathways, over and above the GWA study results, that may influence obesity and obesity related diseases, e.g., metabolic processes. WISH networks based on genotypic correlations allowed further identification of various gene ontology terms and pathways related to obesity and related traits, which were not identified by the GWA study. In conclusion, this is the first study to develop a (genetic) obesity index and employ systems genetics in a porcine model to provide important insights into the complex genetic architecture associated with obesity and many biological pathways that underlie

  13. Genome-wide association studies in preterm birth: implications for the practicing obstetrician-gynaecologist

    PubMed Central

    2013-01-01

    Preterm birth has the highest mortality and morbidity of all pregnancy complications. The burden of preterm birth on public health worldwide is enormous, yet there are few effective means to prevent a preterm delivery. To date, much of its etiology is unexplained, but genetic predisposition is thought to play a major role. In the upcoming year, the international Preterm Birth Genome Project (PGP) consortium plans to publish a large genome wide association study in early preterm birth. Genome-wide association studies (GWAS) are designed to identify common genetic variants that influence health and disease. Despite the many challenges that are involved, GWAS can be an important discovery tool, revealing genetic variations that are associated with preterm birth. It is highly unlikely that findings of a GWAS can be directly translated into clinical practice in the short run. Nonetheless, it will help us to better understand the etiology of preterm birth and the GWAS results will generate new hypotheses for further research, thus enhancing our understanding of preterm birth and informing prevention efforts in the long run. PMID:23445776

  14. Genome-wide association studies in preterm birth: implications for the practicing obstetrician-gynaecologist.

    PubMed

    Dolan, Siobhan M; Christiaens, Inge

    2013-01-01

    Preterm birth has the highest mortality and morbidity of all pregnancy complications. The burden of preterm birth on public health worldwide is enormous, yet there are few effective means to prevent a preterm delivery. To date, much of its etiology is unexplained, but genetic predisposition is thought to play a major role. In the upcoming year, the international Preterm Birth Genome Project (PGP) consortium plans to publish a large genome wide association study in early preterm birth. Genome-wide association studies (GWAS) are designed to identify common genetic variants that influence health and disease. Despite the many challenges that are involved, GWAS can be an important discovery tool, revealing genetic variations that are associated with preterm birth. It is highly unlikely that findings of a GWAS can be directly translated into clinical practice in the short run. Nonetheless, it will help us to better understand the etiology of preterm birth and the GWAS results will generate new hypotheses for further research, thus enhancing our understanding of preterm birth and informing prevention efforts in the long run.

  15. genipe: an automated genome-wide imputation pipeline with automatic reporting and statistical tools.

    PubMed

    Lemieux Perreault, Louis-Philippe; Legault, Marc-André; Asselin, Géraldine; Dubé, Marie-Pierre

    2016-12-01

    Genotype imputation is now commonly performed following genome-wide genotyping experiments. Imputation increases the density of analyzed genotypes in the dataset, enabling fine-mapping across the genome. However, the process of imputation using the most recent publicly available reference datasets can require considerable computation power and the management of hundreds of large intermediate files. We have developed genipe, a complete genome-wide imputation pipeline which includes automatic reporting, imputed data indexing and management, and a suite of statistical tests for imputed data commonly used in genetic epidemiology (Sequence Kernel Association Test, Cox proportional hazards for survival analysis, and linear mixed models for repeated measurements in longitudinal studies). The genipe package is open-source Python software and is freely available for non-commercial use (CC BY-NC 4.0) at https://github.com/pgxcentre/genipe. Documentation and tutorials are available at http://pgxcentre.github.io/genipe. Contact: louis-philippe.lemieux.perreault@statgen.org or marie-pierre.dube@statgen.org. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  16. The Quality and Validation of Structures from Structural Genomics

    PubMed Central

    Domagalski, Marcin J.; Zheng, Heping; Zimmerman, Matthew D.; Dauter, Zbigniew; Wlodawer, Alexander; Minor, Wladek

    2014-01-01

    Quality control of three-dimensional structures of macromolecules is a critical step to ensure the integrity of structural biology data, especially those produced by structural genomics centers. Whereas the Protein Data Bank (PDB) has proven to be a remarkable success overall, the inconsistent quality of structures reveals a lack of universal standards for structure/deposit validation. Here, we review the state-of-the-art methods used in macromolecular structure validation, focusing on validation of structures determined by X-ray crystallography. We describe some general protocols used in the rebuilding and re-refinement of problematic structural models. We also briefly discuss some frontier areas of structure validation, including refinement of protein–ligand complexes, automation of structure redetermination, and the use of NMR structures and computational models to solve X-ray crystal structures by molecular replacement. PMID:24203341

  17. Harnessing the sorghum genome sequence: development of a genome-wide microsatellite (SSR) resource for swift genetic mapping and map based cloning in sorghum

    USDA-ARS's Scientific Manuscript database

    Sorghum is the second cereal crop to have a full genome completely sequenced (Nature (2009), 457:551). This achievement is widely recognized as a scientific milestone for grass genetics and genomics in general. However, the true worth of genetic information lies in translating the sequence informa...

  18. Codon usage bias: causative factors, quantification methods and genome-wide patterns: with emphasis on insect genomes.

    PubMed

    Behura, Susanta K; Severson, David W

    2013-02-01

    Codon usage bias refers to the phenomenon where specific codons are used more often than other synonymous codons during translation of genes, the extent of which varies within and among species. Molecular evolutionary investigations suggest that codon bias is manifested as a result of balance between mutational and translational selection of such genes and that this phenomenon is widespread across species and may contribute to genome evolution in a significant manner. With the advent of whole-genome sequencing of numerous species, both prokaryotes and eukaryotes, genome-wide patterns of codon bias are emerging in different organisms. Various factors such as expression level, GC content, recombination rates, RNA stability, codon position, gene length and others (including environmental stress and population size) can influence codon usage bias within and among species. Moreover, there has been a continuous quest towards developing new concepts and tools to measure the extent of codon usage bias of genes. In this review, we outline the fundamental concepts of evolution of the genetic code, discuss various factors that may influence biased usage of synonymous codons and then outline different principles and methods of measurement of codon usage bias. Finally, we discuss selected studies performed using whole-genome sequences of different insect species to show how codon bias patterns vary within and among genomes. We conclude with generalized remarks on specific emerging aspects of codon bias studies and highlight the recent explosion of genome-sequencing efforts on arthropods (such as twelve Drosophila species, species of ants, honeybee, Nasonia and Anopheles mosquitoes as well as the recent launch of a genome-sequencing project involving 5000 insects and other arthropods) that may help us to understand better the evolution of codon bias and its biological significance. © 2012 The Authors. Biological Reviews © 2012 Cambridge Philosophical Society.
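    The abstract above refers to methods for measuring codon usage bias without naming one. As an illustration only, the sketch below computes relative synonymous codon usage (RSCU), one widely used measure: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The function name `rscu`, the two-family codon table, and the example sequence are assumptions made for this sketch, not material from the review.

```python
from collections import Counter

# Hypothetical, deliberately tiny subset of the standard genetic code:
# two amino acids and their synonymous codons (assumption for illustration).
SYNONYMOUS_FAMILIES = {
    "Leu": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "Lys": ["AAA", "AAG"],
}


def rscu(coding_sequence):
    """Return RSCU per codon for the families listed above.

    RSCU = observed count / (family total / number of synonymous codons).
    A value of 1.0 means no bias; families absent from the sequence are skipped.
    """
    seq = coding_sequence.upper()
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    counts = Counter(codons)
    values = {}
    for family in SYNONYMOUS_FAMILIES.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        expected = total / len(family)
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


# Made-up sequence: CTG CTG CTG TTA AAA AAG
example = rscu("CTGCTGCTGTTAAAAAAG")
```

    In this toy sequence the leucine codon CTG is used three times out of four, so its RSCU is well above 1, while the two lysine codons are used equally and each score 1.0. A real analysis would use the full codon table for the organism's genetic code.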

  19. A novel genome-wide microsatellite resource for species of Eucalyptus with linkage-to-physical correspondence on the reference genome sequence.

    PubMed

    Grattapaglia, Dario; Mamani, Eva M C; Silva-Junior, Orzenil B; Faria, Danielle A

    2015-03-01

    Keystone species in their native ranges, eucalypts, are ecologically and genetically very diverse, growing naturally along extensive latitudinal and altitudinal ranges and variable environments. Besides their ecological importance, eucalypts are also the most widely planted trees for sustainable forestry in the world. We report the development of a novel collection of 535 microsatellites for species of Eucalyptus, 494 designed from ESTs and 41 from genomic libraries. A selected subset of 223 was evaluated for individual identification, parentage testing, and ancestral information content in the two most extensively studied species, Eucalyptus grandis and Eucalyptus globulus. Microsatellites showed high transferability and overlapping allele size range, suggesting they have arisen still in their common ancestor and confirming the extensive genome conservation between these two species. A consensus linkage map with 437 microsatellites, the most comprehensive microsatellite-only genetic map for Eucalyptus, was built by assembling segregation data from three mapping populations and anchored to the Eucalyptus genome. An overall colinearity between recombination-based and physical positioning of 84% of the mapped microsatellites was observed, with some ordering discrepancies and sporadic locus duplications, consistent with the recently described whole genome duplication events in Eucalyptus. The linkage map covered 95.2% of the 605.8-Mbp assembled genome sequence, placing one microsatellite every 1.55 Mbp on average, and an overall estimate of physical to recombination distance of 618 kbp/cM. The genetic parameters estimates together with linkage and physical position data for this large set of microsatellites should assist marker choice for genome-wide population genetics and comparative mapping in Eucalyptus. © 2014 John Wiley & Sons Ltd.

  20. The Csr system regulates genome-wide mRNA stability and transcription and thus gene expression in Escherichia coli.

    PubMed

    Esquerré, Thomas; Bouvier, Marie; Turlan, Catherine; Carpousis, Agamemnon J; Girbal, Laurence; Cocaign-Bousquet, Muriel

    2016-04-26

    Bacterial adaptation requires large-scale regulation of gene expression. We have performed a genome-wide analysis of the Csr system, which regulates many important cellular functions. The Csr system is involved in post-transcriptional regulation, but a role in transcriptional regulation has also been suggested. Two proteins, an RNA-binding protein CsrA and an atypical signaling protein CsrD, participate in the Csr system. Genome-wide transcript stabilities and levels were compared in wildtype E. coli (MG1655) and isogenic mutant strains deficient in CsrA or CsrD activity demonstrating for the first time that CsrA and CsrD are global negative and positive regulators of transcription, respectively. The role of CsrA in transcription regulation may be indirect due to the 4.6-fold increase in csrD mRNA concentration in the CsrA deficient strain. Transcriptional action of CsrA and CsrD on a few genes was validated by transcriptional fusions. In addition to an effect on transcription, CsrA stabilizes thousands of mRNAs. This is the first demonstration that CsrA is a global positive regulator of mRNA stability. For one hundred genes, we predict that direct control of mRNA stability by CsrA might contribute to metabolic adaptation by regulating expression of genes involved in carbon metabolism and transport independently of transcriptional regulation.

  1. Development and application of a novel genome-wide SNP array reveals domestication history in soybean

    PubMed Central

    Wang, Jiao; Chu, Shanshan; Zhang, Huairen; Zhu, Ying; Cheng, Hao; Yu, Deyue

    2016-01-01

    Domestication of soybeans occurred under the intense human-directed selections aimed at developing high-yielding lines. Tracing the domestication history and identifying the genes underlying soybean domestication require further exploration. Here, we developed a high-throughput NJAU 355 K SoySNP array and used this array to study the genetic variation patterns in 367 soybean accessions, including 105 wild soybeans and 262 cultivated soybeans. The population genetic analysis suggests that cultivated soybeans have tended to originate from northern and central China, from where they spread to other regions, accompanied with a gradual increase in seed weight. Genome-wide scanning for evidence of artificial selection revealed signs of selective sweeps involving genes controlling domestication-related agronomic traits including seed weight. To further identify genomic regions related to seed weight, a genome-wide association study (GWAS) was conducted across multiple environments in wild and cultivated soybeans. As a result, a strong linkage disequilibrium region on chromosome 20 was found to be significantly correlated with seed weight in cultivated soybeans. Collectively, these findings should provide an important basis for genomic-enabled breeding and advance the study of functional genomics in soybean. PMID:26856884

  2. Development and application of a novel genome-wide SNP array reveals domestication history in soybean.

    PubMed

    Wang, Jiao; Chu, Shanshan; Zhang, Huairen; Zhu, Ying; Cheng, Hao; Yu, Deyue

    2016-02-09

    Domestication of soybeans occurred under the intense human-directed selections aimed at developing high-yielding lines. Tracing the domestication history and identifying the genes underlying soybean domestication require further exploration. Here, we developed a high-throughput NJAU 355 K SoySNP array and used this array to study the genetic variation patterns in 367 soybean accessions, including 105 wild soybeans and 262 cultivated soybeans. The population genetic analysis suggests that cultivated soybeans have tended to originate from northern and central China, from where they spread to other regions, accompanied with a gradual increase in seed weight. Genome-wide scanning for evidence of artificial selection revealed signs of selective sweeps involving genes controlling domestication-related agronomic traits including seed weight. To further identify genomic regions related to seed weight, a genome-wide association study (GWAS) was conducted across multiple environments in wild and cultivated soybeans. As a result, a strong linkage disequilibrium region on chromosome 20 was found to be significantly correlated with seed weight in cultivated soybeans. Collectively, these findings should provide an important basis for genomic-enabled breeding and advance the study of functional genomics in soybean.

  3. Genome wide association study and genomic prediction for fatty acid composition in Chinese Simmental beef cattle using high density SNP array.

    PubMed

    Zhu, Bo; Niu, Hong; Zhang, Wengang; Wang, Zezhao; Liang, Yonghu; Guan, Long; Guo, Peng; Chen, Yan; Zhang, Lupei; Guo, Yong; Ni, Heming; Gao, Xue; Gao, Huijiang; Xu, Lingyang; Li, Junya

    2017-06-14

    Fatty acid composition of muscle is an important trait contributing to meat quality. Recently, genome-wide association study (GWAS) has been extensively used to explore the molecular mechanism underlying important traits in cattle. In this study, we performed GWAS using high density SNP array to analyze the association between SNPs and fatty acids and evaluated the accuracy of genomic prediction for fatty acids in Chinese Simmental cattle. Using the BayesB method, we identified 35 and 7 regions in Chinese Simmental cattle that displayed significant associations with individual fatty acids and fatty acid groups, respectively. We further obtained several candidate genes which may be involved in fatty acid biosynthesis including elongation of very long chain fatty acids protein 5 (ELOVL5), fatty acid synthase (FASN), caspase 2 (CASP2) and thyroglobulin (TG). Specifically, we obtained strong evidence of association signals for one SNP located at 51.3 Mb for FASN using Genome-wide Rapid Association Mixed Model and Regression-Genomic Control (GRAMMAR-GC) approaches. Also, region-based association test identified multiple SNPs within FASN and ELOVL5 for C14:0. In addition, our result revealed that the effectiveness of genomic prediction for fatty acid composition using BayesB was slightly superior over GBLUP in Chinese Simmental cattle. We identified several significantly associated regions and loci which can be considered as potential candidate markers for genomics-assisted breeding programs. Using multiple methods, our results revealed that FASN and ELOVL5 are associated with fatty acids with strong evidence. Our finding also suggested that it is feasible to perform genomic selection for fatty acids in Chinese Simmental cattle.

  4. Alzheimer Disease Pathology in Cognitively Healthy Elderly: A Genome-wide Study

    PubMed Central

    Kramer, Patricia L; Xu, Haiyan; Woltjer, Randall L; Westaway, Shawn K; Clark, David; Erten-Lyons, Deniz; Kaye, Jeffrey A; Welsh-Bohmer, Kathleen A; Troncoso, Juan C; Markesbery, William R; Petersen, Ronald C; Turner, R Scott; Kukull, Walter A; Bennett, David A; Galasko, Douglas; Morris, John C; Ott, Jurg

    2010-01-01

    Many elderly individuals remain dementia-free throughout their life. However, some of these individuals exhibit Alzheimer disease neuropathology on autopsy, evidenced by neurofibrillary tangles (NFTs) in AD-specific brain regions. We conducted a genome-wide association study to identify genetic mechanisms that distinguish non-demented elderly with a heavy NFT burden from those with a low NFT burden. The study included 344 non-demented subjects with autopsy (201 subjects with low and 143 with high NFT levels). Both a genotype test, using logistic regression, and an allele test provided genome-wide significant evidence that variants in the RELN gene are associated with neuropathology in the context of cognitive health. Immunohistochemical data for reelin expression in AD-related brain regions added support for these findings. Reelin signaling pathways modulate phosphorylation of tau, the major component of NFTs, either directly or through β-amyloid pathways that influence tau phosphorylation. Our findings suggest that up-regulation of reelin may be a compensatory response to tau-related or beta-amyloid stress associated with AD even prior to the onset of dementia. PMID:20452100
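    The abstract above mentions an allele test alongside the genotype test. As a rough sketch of what such a test can look like, the code below runs a Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts in two groups. The abstract does not say which statistic the authors used, so this is an assumption; the function name `allelic_chi_square` and the allele counts are invented for illustration and are not data from the study.

```python
import math


def allelic_chi_square(case_alleles, control_alleles):
    """Pearson chi-square (1 df) on a 2x2 table of allele counts.

    Each argument is (count of allele A, count of allele a); every subject
    contributes two alleles. Returns (chi-square statistic, p-value).
    """
    a, b = case_alleles
    c, d = control_alleles
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column of the table is empty")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up allele counts for 143 high-NFT and 201 low-NFT subjects
# (286 and 402 alleles respectively); NOT data from the study.
chi2, p = allelic_chi_square((100, 186), (100, 302))
```

    A genome-wide scan repeats a test like this at every SNP, so the resulting p-value has to clear a far stricter threshold than 0.05 to count as genome-wide significant.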

  5. Novel efficient genome-wide SNP panels for the conservation of the highly endangered Iberian lynx.

    PubMed

    Kleinman-Ruiz, Daniel; Martínez-Cruz, Begoña; Soriano, Laura; Lucena-Perez, Maria; Cruz, Fernando; Villanueva, Beatriz; Fernández, Jesús; Godoy, José A

    2017-07-21

    The Iberian lynx (Lynx pardinus) has been acknowledged as the most endangered felid species in the world. An intense contraction and fragmentation during the twentieth century left less than 100 individuals split in two isolated and genetically eroded populations by 2002. Genetic monitoring and management so far have been based on 36 STRs, but their limited variability and the more complex situation of current populations demand more efficient molecular markers. The recent characterization of the Iberian lynx genome identified more than 1.6 million SNPs, of which 1536 were selected and genotyped in an extended Iberian lynx sample. We validated 1492 SNPs and analysed their heterozygosity, Hardy-Weinberg equilibrium, and linkage disequilibrium. We then selected a panel of 343 minimally linked autosomal SNPs from which we extracted subsets optimized for four different typical tasks in conservation applications: individual identification, parentage assignment, relatedness estimation, and admixture classification, and compared their power to currently used STR panels. We ascribed 21 SNPs to chromosome X based on their segregation patterns, and identified one additional marker that showed significant differentiation between sexes. For all applications considered, panels of autosomal SNPs showed higher power than the currently used STR set with only a very modest increase in the number of markers. These novel panels of highly informative genome-wide SNPs provide more powerful, efficient, and flexible tools for the genetic management and non-invasive monitoring of Iberian lynx populations. This example highlights an important outcome of whole-genome studies in genetically threatened species.

  6. Implications of genome-wide association studies in cancer therapeutics.

    PubMed

    Patel, Jai N; McLeod, Howard L; Innocenti, Federico

    2013-09-01

    Genome wide association studies (GWAS) provide an agnostic approach to identifying potential genetic variants associated with disease susceptibility, prognosis of survival and/or predictive of drug response. Although these techniques are costly and interpretation of study results is challenging, they do allow for a more unbiased interrogation of the entire genome, resulting in the discovery of novel genes and understanding of novel biological associations. This review will focus on the implications of GWAS in cancer therapy, in particular germ-line mutations, including findings from major GWAS which have identified predictive genetic loci for clinical outcome and/or toxicity. Lessons and challenges in cancer GWAS are also discussed, including the need for functional analysis and replication, as well as future perspectives for biological and clinical utility. Given the large heterogeneity in response to cancer therapeutics, novel methods of identifying mechanisms and biology of variable drug response and ultimately treatment individualization will be indispensable. © 2013 The British Pharmacological Society.

  7. Meta-analysis of genome-wide linkage studies in BMI and obesity.

    PubMed

    Saunders, Catherine L; Chiodini, Benedetta D; Sham, Pak; Lewis, Cathryn M; Abkevich, Victor; Adeyemo, Adebowale A; de Andrade, Mariza; Arya, Rector; Berenson, Gerald S; Blangero, John; Boehnke, Michael; Borecki, Ingrid B; Chagnon, Yvon C; Chen, Wei; Comuzzie, Anthony G; Deng, Hong-Wen; Duggirala, Ravindranath; Feitosa, Mary F; Froguel, Philippe; Hanson, Robert L; Hebebrand, Johannes; Huezo-Dias, Patricia; Kissebah, Ahmed H; Li, Weidong; Luke, Amy; Martin, Lisa J; Nash, Matthew; Ohman, Miina; Palmer, Lyle J; Peltonen, Leena; Perola, Markus; Price, R Arlen; Redline, Susan; Srinivasan, Sathanur R; Stern, Michael P; Stone, Steven; Stringham, Heather; Turner, Stephen; Wijmenga, Cisca; Collier, David A

    2007-09-01

    The objective was to provide an overall assessment of genetic linkage data of BMI and BMI-defined obesity using a nonparametric genome scan meta-analysis. We identified 37 published studies containing data on over 31,000 individuals from more than 10,000 families and obtained genome-wide logarithm of the odds (LOD) scores, non-parametric linkage (NPL) scores, or maximum likelihood scores (MLS). BMI was analyzed in a pooled set of all studies, as a subgroup of 10 studies that used BMI-defined obesity, and for subgroups ascertained through type 2 diabetes, hypertension, or subjects of European ancestry. Bins at chromosome 13q13.2-q33.1, 12q23-q24.3 achieved suggestive evidence of linkage to BMI in the pooled analysis and samples ascertained for hypertension. Nominal evidence of linkage to these regions and suggestive evidence for 11q13.3-22.3 were also observed for BMI-defined obesity. The FTO obesity gene locus at 16q12.2 also showed nominal evidence for linkage. However, overall distribution of summed rank p values <0.05 is not different from that expected by chance. The strongest evidence was obtained in the families ascertained for hypertension at 9q31.1-qter and 12p11.21-q23 (p < 0.01). Despite having substantial statistical power, we did not unequivocally implicate specific loci for BMI or obesity. This may be because genes influencing adiposity are of very small effect, with substantial genetic heterogeneity and variable dependence on environmental factors. However, the observation that the FTO gene maps to one of the highest ranking bins for obesity is interesting and, while not a validation of this approach, indicates that other potential loci identified in this study should be investigated further.

  8. Combining Genome-Wide Information with a Functional Structural Plant Model to Simulate 1-Year-Old Apple Tree Architecture.

    PubMed

    Migault, Vincent; Pallas, Benoît; Costes, Evelyne

    2016-01-01

    In crops, optimizing target traits in breeding programs can be fostered by selecting appropriate combinations of architectural traits which determine light interception and carbon acquisition. In apple tree, architectural traits were observed to be under genetic control. However, architectural traits also result from many organogenetic and morphological processes interacting with the environment. The present study aimed at combining a FSPM built for apple tree, MAppleT, with genetic determinisms of architectural traits, previously described in a bi-parental population. We focused on parameters related to organogenesis (phyllochron and immediate branching) and morphogenesis processes (internode length and leaf area) during the first year of tree growth. Two independent datasets collected in 2004 and 2007 on 116 genotypes, issued from a 'Starkrimson' × 'Granny Smith' cross, were used. The phyllochron was estimated as a function of thermal time and sylleptic branching was modeled subsequently depending on phyllochron. From a genetic map built with SNPs, marker effects were estimated on four MAppleT parameters with rrBLUP, using 2007 data. These effects were then considered in MAppleT to simulate tree development in the two climatic conditions. The genome wide prediction model gave consistent estimations of parameter values with correlation coefficients between observed values and estimated values from SNP markers ranging from 0.79 to 0.96. However, the accuracy of the prediction model following cross validation schemas was lower. Three integrative traits (the number of leaves, trunk length, and number of sylleptic laterals) were considered for validating MAppleT simulations. In 2007 climatic conditions, simulated values were close to observations, highlighting the correct simulation of genetic variability. However, in 2004 conditions which were not used for model calibration, the simulations differed from observations. This study demonstrates the possibility of

  9. A GENOME WIDE ASSOCIATION STUDY FOR DIABETIC NEPHROPATHY GENES IN AFRICAN AMERICANS

    PubMed Central

    McDonough, Caitrin W.; Palmer, Nicholette D.; Hicks, Pamela J.; Roh, Bong H.; An, S. Sandy; Cooke, Jessica N.; Hester, Jessica M.; Wing, Maria R.; Bostrom, Meredith A.; Rudock, Megan E.; Lewis, Joshua P.; Talbert, Matthew E.; Blevins, Rebecca A.; Lu, Lingyi; Ng, Maggie C.Y.; Sale, Michele M.; Divers, Jasmin; Langefeld, Carl D.; Freedman, Barry I.; Bowden, Donald W.

    2011-01-01

    A genome-wide association study was performed using the Affymetrix 6.0 chip to identify genes associated with diabetic nephropathy in African Americans. Association analysis was performed adjusting for admixture in 965 type 2 diabetic African American patients with end-stage renal disease (ESRD) and in 1029 African Americans without type 2 diabetes or kidney disease as controls. The top 724 single nucleotide polymorphisms (SNPs) with evidence of association to diabetic nephropathy were then genotyped in a replication sample of an additional 709 type 2 diabetes-ESRD patients and 690 controls. SNPs with evidence of association in both the original and replication studies were tested in additional African American cohorts consisting of 1246 patients with type 2 diabetes without kidney disease and 1216 with non-diabetic ESRD to differentiate candidate loci for type 2 diabetes-ESRD, type 2 diabetes, and/or all-cause ESRD. Twenty-five SNPs were significantly associated with type 2 diabetes-ESRD in the genome-wide association and initial replication. Although genome-wide significance with type 2 diabetes was not found for any of these 25 SNPs, several genes, including RPS12, LIMK2, and SFI1 are strong candidates for diabetic nephropathy. A combined analysis of all 2890 patients with ESRD showed significant association SNPs in LIMK2 and SFI1 suggesting that they also contribute to all-cause ESRD. Thus, our results suggest that multiple loci underlie susceptibility to kidney disease in African Americans with type 2 diabetes and some may also contribute to all-cause ESRD. PMID:21150874

  10. A genome-wide association study for diabetic nephropathy genes in African Americans.

    PubMed

    McDonough, Caitrin W; Palmer, Nicholette D; Hicks, Pamela J; Roh, Bong H; An, S Sandy; Cooke, Jessica N; Hester, Jessica M; Wing, Maria R; Bostrom, Meredith A; Rudock, Megan E; Lewis, Joshua P; Talbert, Matthew E; Blevins, Rebecca A; Lu, Lingyi; Ng, Maggie C Y; Sale, Michele M; Divers, Jasmin; Langefeld, Carl D; Freedman, Barry I; Bowden, Donald W

    2011-03-01

    A genome-wide association study was performed using the Affymetrix 6.0 chip to identify genes associated with diabetic nephropathy in African Americans. Association analysis was performed adjusting for admixture in 965 type 2 diabetic African American patients with end-stage renal disease (ESRD) and in 1029 African Americans without type 2 diabetes or kidney disease as controls. The top 724 single nucleotide polymorphisms (SNPs) with evidence of association to diabetic nephropathy were then genotyped in a replication sample of an additional 709 type 2 diabetes-ESRD patients and 690 controls. SNPs with evidence of association in both the original and replication studies were tested in additional African American cohorts consisting of 1246 patients with type 2 diabetes without kidney disease and 1216 with non-diabetic ESRD to differentiate candidate loci for type 2 diabetes-ESRD, type 2 diabetes, and/or all-cause ESRD. Twenty-five SNPs were significantly associated with type 2 diabetes-ESRD in the genome-wide association and initial replication. Although genome-wide significance with type 2 diabetes was not found for any of these 25 SNPs, several genes, including RPS12, LIMK2, and SFI1 are strong candidates for diabetic nephropathy. A combined analysis of all 2890 patients with ESRD showed significant association SNPs in LIMK2 and SFI1 suggesting that they also contribute to all-cause ESRD. Thus, our results suggest that multiple loci underlie susceptibility to kidney disease in African Americans with type 2 diabetes and some may also contribute to all-cause ESRD.

  11. Ensembl Genomes 2013: scaling up access to genome-wide data.

    PubMed

    Kersey, Paul Julian; Allen, James E; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Hughes, Daniel Seth Toney; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Langridge, Nicholas; McDowall, Mark D; Maheswari, Uma; Maslen, Gareth; Nuhn, Michael; Ong, Chuang Kee; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Tuli, Mary Ann; Walts, Brandon; Williams, Gareth; Wilson, Derek; Youens-Clark, Ken; Monaco, Marcela K; Stein, Joshua; Wei, Xuehong; Ware, Doreen; Bolser, Daniel M; Howe, Kevin Lee; Kulesha, Eugene; Lawson, Daniel; Staines, Daniel Michael

    2014-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future.

  12. Accuracies of genomic breeding values in American Angus beef cattle using K-means clustering for cross-validation.

    PubMed

    Saatchi, Mahdi; McClure, Mathew C; McKay, Stephanie D; Rolf, Megan M; Kim, JaeWoo; Decker, Jared E; Taxis, Tasia M; Chapple, Richard H; Ramey, Holly R; Northcutt, Sally L; Bauck, Stewart; Woodward, Brent; Dekkers, Jack C M; Fernando, Rohan L; Schnabel, Robert D; Garrick, Dorian J; Taylor, Jeremy F

    2011-11-28

    Genomic selection is a recently developed technology that is beginning to revolutionize animal breeding. The objective of this study was to estimate marker effects to derive prediction equations for direct genomic values for 16 routinely recorded traits of American Angus beef cattle and quantify corresponding accuracies of prediction. Deregressed estimated breeding values were used as observations in a weighted analysis to derive direct genomic values for 3570 sires genotyped using the Illumina BovineSNP50 BeadChip. These bulls were clustered into five groups using K-means clustering on pedigree estimates of additive genetic relationships between animals, with the aim of increasing within-group and decreasing between-group relationships. All five combinations of four groups were used for model training, with cross-validation performed in the group not used in training. Bivariate animal models were used for each trait to estimate the genetic correlation between deregressed estimated breeding values and direct genomic values. Accuracies of direct genomic values ranged from 0.22 to 0.69 for the studied traits, with an average of 0.44. Predictions were more accurate when animals within the validation group were more closely related to animals in the training set. When training and validation sets were formed by random allocation, the accuracies of direct genomic values ranged from 0.38 to 0.85, with an average of 0.65, reflecting the greater relationship between animals in training and validation. The accuracies of direct genomic values obtained from training on older animals and validating in younger animals were intermediate to the accuracies obtained from K-means clustering and random clustering for most traits. The genetic correlation between deregressed estimated breeding values and direct genomic values ranged from 0.15 to 0.80 for the traits studied. 
    These results suggest that genomic estimates of genetic merit can be produced in beef cattle at a young age but

  13. Accuracies of genomic breeding values in American Angus beef cattle using K-means clustering for cross-validation

    PubMed Central

    2011-01-01

    Background Genomic selection is a recently developed technology that is beginning to revolutionize animal breeding. The objective of this study was to estimate marker effects to derive prediction equations for direct genomic values for 16 routinely recorded traits of American Angus beef cattle and quantify corresponding accuracies of prediction. Methods Deregressed estimated breeding values were used as observations in a weighted analysis to derive direct genomic values for 3570 sires genotyped using the Illumina BovineSNP50 BeadChip. These bulls were clustered into five groups using K-means clustering on pedigree estimates of additive genetic relationships between animals, with the aim of increasing within-group and decreasing between-group relationships. All five combinations of four groups were used for model training, with cross-validation performed in the group not used in training. Bivariate animal models were used for each trait to estimate the genetic correlation between deregressed estimated breeding values and direct genomic values. Results Accuracies of direct genomic values ranged from 0.22 to 0.69 for the studied traits, with an average of 0.44. Predictions were more accurate when animals within the validation group were more closely related to animals in the training set. When training and validation sets were formed by random allocation, the accuracies of direct genomic values ranged from 0.38 to 0.85, with an average of 0.65, reflecting the greater relationship between animals in training and validation. The accuracies of direct genomic values obtained from training on older animals and validating in younger animals were intermediate to the accuracies obtained from K-means clustering and random clustering for most traits. The genetic correlation between deregressed estimated breeding values and direct genomic values ranged from 0.15 to 0.80 for the traits studied. Conclusions These results suggest that genomic estimates of genetic merit can be
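    The cross-validation design in this abstract (train on four of the five K-means groups, validate in the group left out) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: the function name `leave_one_group_out` and the toy group labels are hypothetical, and the clustering step that produces the labels is taken as given.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out, train_idx, valid_idx) once per distinct group label.

    `groups` holds one cluster label per animal. The labels are a
    hypothetical input here; in the abstract they come from K-means
    clustering on pedigree-based relationships.
    """
    members = defaultdict(list)
    for idx, label in enumerate(groups):
        members[label].append(idx)
    for held_out in sorted(members):
        # Validation set: every animal in the held-out group.
        valid_idx = members[held_out]
        # Training set: every animal in the remaining groups.
        train_idx = [
            i
            for label in sorted(members)
            if label != held_out
            for i in members[label]
        ]
        yield held_out, train_idx, valid_idx


# Toy example (assumed data): ten animals assigned to five groups.
groups = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
folds = list(leave_one_group_out(groups))
```

    With five groups this yields five folds, each holding out one group, which matches the "five combinations of four groups" described above.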

  14. Genome-Wide Association Study for Susceptibility to and Recoverability From Mastitis in Danish Holstein Cows

    PubMed Central

    Welderufael, B. G.; Løvendahl, Peter; de Koning, Dirk-Jan; Janss, Lucas L. G.; Fikse, W. F.

    2018-01-01

    Because mastitis is very frequent and unavoidable, adding recovery information into the analysis for genetic evaluation of mastitis is of great interest from both an economic and an animal welfare point of view. Here we have performed genome-wide association studies (GWAS) to identify associated single nucleotide polymorphisms (SNPs) and investigate the genetic background not only for susceptibility to – but also for recoverability from mastitis. Somatic cell count records from 993 Danish Holstein cows genotyped for a total of 39378 autosomal SNP markers were used for the association analysis. Single SNP regression analysis was performed using the statistical software package DMU. The substitution effect of each SNP was tested with a t-test, and a genome-wide significance level of P < 10^-4 was used to declare a significant SNP-trait association. A number of significant SNP variants were identified for both traits. Many of the SNP variants associated either with susceptibility to – or recoverability from mastitis were located in or very near to genes that have been reported for their role in the immune system. Genes involved in lymphocyte development (e.g., MAST3 and STAB2) and genes involved in macrophage recruitment and regulation of inflammation (PDGFD and PTX3) were suggested as possible causal genes for susceptibility to – and recoverability from mastitis, respectively. However, this is the first GWAS for recoverability from mastitis and our results need to be validated. The findings in the current study are, therefore, a starting point for further investigations in identifying causal genetic variants or chromosomal regions for both susceptibility to – and recoverability from mastitis. PMID:29755506
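    The single-SNP regression with a t-test mentioned in this abstract can be illustrated with an ordinary least-squares fit of a phenotype on allele counts. This is a sketch under assumptions, not the DMU analysis: the function name `snp_regression`, the 0/1/2 genotype coding, and the toy data are hypothetical, and the model omits the fixed and random effects that a real evaluation would include.

```python
import math


def snp_regression(genotypes, phenotypes):
    """Regress phenotype on allele count; return (slope, t_statistic).

    The slope estimates the allele substitution effect, and the
    t statistic has n - 2 degrees of freedom.
    """
    n = len(genotypes)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genotypes, phenotypes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(genotypes, phenotypes))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return slope, slope / std_err


# Toy data (assumed): allele counts and a continuous trait for six animals.
slope, t_stat = snp_regression([0, 1, 2, 0, 1, 2], [1.0, 2.1, 2.9, 1.1, 1.9, 3.0])
```

    Converting the t statistic to a P-value requires the t distribution with n - 2 degrees of freedom, which is left out here to keep the sketch free of external dependencies.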

  15. Genome-wide association analysis of blood-pressure traits in African-ancestry individuals reveals common associated genes in African and non-African populations.

    PubMed

    Franceschini, Nora; Fox, Ervin; Zhang, Zhaogong; Edwards, Todd L; Nalls, Michael A; Sung, Yun Ju; Tayo, Bamidele O; Sun, Yan V; Gottesman, Omri; Adeyemo, Adebawole; Johnson, Andrew D; Young, J Hunter; Rice, Ken; Duan, Qing; Chen, Fang; Li, Yun; Tang, Hua; Fornage, Myriam; Keene, Keith L; Andrews, Jeanette S; Smith, Jennifer A; Faul, Jessica D; Guangfa, Zhang; Guo, Wei; Liu, Yu; Murray, Sarah S; Musani, Solomon K; Srinivasan, Sathanur; Velez Edwards, Digna R; Wang, Heming; Becker, Lewis C; Bovet, Pascal; Bochud, Murielle; Broeckel, Ulrich; Burnier, Michel; Carty, Cara; Chasman, Daniel I; Ehret, Georg; Chen, Wei-Min; Chen, Guanjie; Chen, Wei; Ding, Jingzhong; Dreisbach, Albert W; Evans, Michele K; Guo, Xiuqing; Garcia, Melissa E; Jensen, Rich; Keller, Margaux F; Lettre, Guillaume; Lotay, Vaneet; Martin, Lisa W; Moore, Jason H; Morrison, Alanna C; Mosley, Thomas H; Ogunniyi, Adesola; Palmas, Walter; Papanicolaou, George; Penman, Alan; Polak, Joseph F; Ridker, Paul M; Salako, Babatunde; Singleton, Andrew B; Shriner, Daniel; Taylor, Kent D; Vasan, Ramachandran; Wiggins, Kerri; Williams, Scott M; Yanek, Lisa R; Zhao, Wei; Zonderman, Alan B; Becker, Diane M; Berenson, Gerald; Boerwinkle, Eric; Bottinger, Erwin; Cushman, Mary; Eaton, Charles; Nyberg, Fredrik; Heiss, Gerardo; Hirschhron, Joel N; Howard, Virginia J; Karczewsk, Konrad J; Lanktree, Matthew B; Liu, Kiang; Liu, Yongmei; Loos, Ruth; Margolis, Karen; Snyder, Michael; Psaty, Bruce M; Schork, Nicholas J; Weir, David R; Rotimi, Charles N; Sale, Michele M; Harris, Tamara; Kardia, Sharon L R; Hunt, Steven C; Arnett, Donna; Redline, Susan; Cooper, Richard S; Risch, Neil J; Rao, D C; Rotter, Jerome I; Chakravarti, Aravinda; Reiner, Alex P; Levy, Daniel; Keating, Brendan J; Zhu, Xiaofeng

    2013-09-05

    High blood pressure (BP) is more prevalent and contributes to more severe manifestations of cardiovascular disease (CVD) in African Americans than in any other United States ethnic group. Several small African-ancestry (AA) BP genome-wide association studies (GWASs) have been published, but their findings have failed to replicate to date. We report on a large AA BP GWAS meta-analysis that includes 29,378 individuals from 19 discovery cohorts and subsequent replication in additional samples of AA (n = 10,386), European ancestry (EA) (n = 69,395), and East Asian ancestry (n = 19,601). Five loci (EVX1-HOXA, ULK4, RSPO3, PLEKHG1, and SOX6) reached genome-wide significance (p < 1.0 × 10^-8) for either systolic or diastolic BP in a transethnic meta-analysis after correction for multiple testing. Three of these BP loci (EVX1-HOXA, RSPO3, and PLEKHG1) lack previous associations with BP. We also identified one independent signal in a known BP locus (SOX6) and provide evidence for fine mapping in four additional validated BP loci. We also demonstrate that validated EA BP GWAS loci, considered jointly, show significant effects in AA samples. Consequently, these findings suggest that BP loci might have universal effects across studied populations, demonstrating that multiethnic samples are an essential component in identifying, fine mapping, and understanding their trait variability. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  16. Realizing privacy preserving genome-wide association studies.

    PubMed

    Simmons, Sean; Berger, Bonnie

    2016-05-01

    As genomics moves into the clinic, there has been much interest in using this medical data for research. At the same time the use of such data raises many privacy concerns. These circumstances have led to the development of various methods to perform genome-wide association studies (GWAS) on patient records while ensuring privacy. In particular, there has been growing interest in applying differentially private techniques to this challenge. Unfortunately, up until now all methods for finding high scoring SNPs in a differentially private manner have had major drawbacks in terms of either accuracy or computational efficiency. Here we overcome these limitations with a substantially modified version of the neighbor distance method for performing differentially private GWAS, and thus are able to produce a more viable mechanism. Specifically, we use input perturbation and an adaptive boundary method to overcome accuracy issues. We also design and implement a convex analysis based algorithm to calculate the neighbor distance for each SNP in constant time, overcoming the major computational bottleneck in the neighbor distance method. It is our hope that methods such as ours will pave the way for more widespread use of patient data in biomedical research. A python implementation is available at http://groups.csail.mit.edu/cb/DiffPriv/ bab@csail.mit.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  17. Multi-Ethnic Genome-Wide Association Study of Cerebral White Matter Hyperintensities on MRI

    PubMed Central

    Verhaaren, Benjamin F.J.; Debette, Stéphanie; Bis, Joshua C.; Smith, Jennifer A.; Ikram, M. Kamran; Adams, Hieab H.; Beecham, Ashley H.; Rajan, Kumar B.; Lopez, Lorna M.; Barral, Sandra; van Buchem, Mark A.; van der Grond, Jeroen; Smith, Albert V.; Hegenscheid, Katrin; Aggarwal, Neelum T.; de Andrade, Mariza; Atkinson, Elizabeth J.; Beekman, Marian; Beiser, Alexa S.; Blanton, Susan H.; Boerwinkle, Eric; Brickman, Adam M.; Bryan, R. Nick; Chauhan, Ganesh; Chen, Christopher P.L.H.; Chouraki, Vincent; de Craen, Anton J.M.; Crivello, Fabrice; Deary, Ian J.; Deelen, Joris; De Jager, Philip L.; Dufouil, Carole; Elkind, Mitchell S.V.; Evans, Denis A.; Freudenberger, Paul; Gottesman, Rebecca F.; Guðnason, Vilmundur; Habes, Mohamad; Heckbert, Susan R.; Heiss, Gerardo; Hilal, Saima; Hofer, Edith; Hofman, Albert; Ibrahim-Verbaas, Carla A.; Knopman, David S.; Lewis, Cora E.; Liao, Jiemin; Liewald, David C.M.; Luciano, Michelle; van der Lugt, Aad; Martinez, Oliver O.; Mayeux, Richard; Mazoyer, Bernard; Nalls, Mike; Nauck, Matthias; Niessen, Wiro J.; Oostra, Ben A.; Psaty, Bruce M.; Rice, Kenneth M.; Rotter, Jerome I.; von Sarnowski, Bettina; Schmidt, Helena; Schreiner, Pamela J.; Schuur, Maaike; Sidney, Stephen S.; Sigurdsson, Sigurdur; Slagboom, P. Eline; Stott, David J.M.; van Swieten, John C.; Teumer, Alexander; Töglhofer, Anna Maria; Traylor, Matthew; Trompet, Stella; Turner, Stephen T.; Tzourio, Christophe; Uh, Hae-Won; Uitterlinden, André G.; Vernooij, Meike W.; Wang, Jing J.; Wong, Tien Y.; Wardlaw, Joanna M.; Windham, B. Gwen; Wittfeld, Katharina; Wolf, Christiane; Wright, Clinton B.; Yang, Qiong; Zhao, Wei; Zijdenbos, Alex; Jukema, J. Wouter; Sacco, Ralph L.; Kardia, Sharon L.R.; Amouyel, Philippe; Mosley, Thomas H.; Longstreth, W. T.; DeCarli, Charles C.; van Duijn, Cornelia M.; Schmidt, Reinhold; Launer, Lenore J.; Grabe, Hans J.; Seshadri, Sudha S.; Ikram, M. Arfan; Fornage, Myriam

    2015-01-01

    Background The burden of cerebral white matter hyperintensities (WMH) is associated with an increased risk of stroke, dementia, and death. WMH are highly heritable, but their genetic underpinnings are incompletely characterized. To identify novel genetic variants influencing WMH burden, we conducted a meta-analysis of multi-ethnic genome-wide association studies. Methods and Results We included 21,079 middle-aged to elderly individuals from 29 population-based cohorts, who were free of dementia and stroke and were of European (N=17,936), African (N=1,943), Hispanic (N=795), and Asian (N=405) descent. WMH burden was quantified on MRI either by a validated automated segmentation method or a validated visual grading scale. Genotype data in each study were imputed to the 1000 Genomes reference. Within each ethnic group, we investigated the relationship between each SNP and WMH burden using a linear regression model adjusted for age, sex, intracranial volume, and principal components of ancestry. A meta-analysis was conducted for each ethnicity separately and for the combined sample. In the European descent samples, we confirmed a previously known locus on chr17q25 (p = 2.7 × 10^-19) and identified novel loci on chr10q24 (p = 1.6 × 10^-9) and chr2p21 (p = 4.4 × 10^-8). In the multi-ethnic meta-analysis, we identified two additional loci, on chr1q22 (p = 2.0 × 10^-8) and chr2p16 (p = 1.5 × 10^-8). The novel loci contained genes that have been implicated in Alzheimer’s disease (chr2p21, chr10q24), intracerebral hemorrhage (chr1q22), neuroinflammatory diseases (chr2p21), and glioma (chr10q24, chr2p16). Conclusions We identified four novel genetic loci that implicate inflammatory and glial proliferative pathways in the development of white matter hyperintensities in addition to previously proposed ischemic mechanisms. PMID:25663218

  18. Genome-wide association mapping of partial resistance to Aphanomyces euteiches in pea

    USDA-ARS's Scientific Manuscript database

    Genome-wide association mapping has recently emerged as a valuable approach to refine genetic basis of polygenic resistance to plant diseases, which are increasingly used in integrated strategies for durable crop protection. Aphanomyces euteiches is a soil borne pathogen of pea and other legumes wor...

  19. A genome-wide association study identifies multiple loci for variation in human ear morphology.

    PubMed

    Adhikari, Kaustubh; Reales, Guillermo; Smith, Andrew J P; Konka, Esra; Palmen, Jutta; Quinto-Sanchez, Mirsha; Acuña-Alonzo, Victor; Jaramillo, Claudia; Arias, William; Fuentes, Macarena; Pizarro, María; Barquera Lozano, Rodrigo; Macín Pérez, Gastón; Gómez-Valdés, Jorge; Villamil-Ramírez, Hugo; Hunemeier, Tábita; Ramallo, Virginia; Silva de Cerqueira, Caio C; Hurtado, Malena; Villegas, Valeria; Granja, Vanessa; Gallo, Carla; Poletti, Giovanni; Schuler-Faccini, Lavinia; Salzano, Francisco M; Bortolini, Maria-Cátira; Canizales-Quinteros, Samuel; Rothhammer, Francisco; Bedoya, Gabriel; Calderón, Rosario; Rosique, Javier; Cheeseman, Michael; Bhutta, Mahmood F; Humphries, Steve E; Gonzalez-José, Rolando; Headon, Denis; Balding, David; Ruiz-Linares, Andrés

    2015-06-24

    Here we report a genome-wide association study for non-pathological pinna morphology in over 5,000 Latin Americans. We find genome-wide significant association at seven genomic regions affecting: lobe size and attachment, folding of antihelix, helix rolling, ear protrusion and antitragus size (linear regression P values 2 × 10^-8 to 3 × 10^-14). Four traits are associated with a functional variant in the Ectodysplasin A receptor (EDAR) gene, a key regulator of embryonic skin appendage development. We confirm expression of Edar in the developing mouse ear and that Edar-deficient mice have an abnormally shaped pinna. Two traits are associated with SNPs in a region overlapping the T-Box Protein 15 (TBX15) gene, a major determinant of mouse skeletal development. Strongest association in this region is observed for SNP rs17023457 located in an evolutionarily conserved binding site for the transcription factor Cartilage paired-class homeoprotein 1 (CART1), and we confirm that rs17023457 alters in vitro binding of CART1.

  20. Genome-Wide Motif Statistics are Shaped by DNA Binding Proteins over Evolutionary Time Scales

    NASA Astrophysics Data System (ADS)

    Qian, Long; Kussell, Edo

    2016-10-01

    The composition of a genome with respect to all possible short DNA motifs impacts the ability of DNA binding proteins to locate and bind their target sites. Since nonfunctional DNA binding can be detrimental to cellular functions and ultimately to organismal fitness, organisms could benefit from reducing the number of nonfunctional DNA binding sites genome wide. Using in vitro measurements of binding affinities for a large collection of DNA binding proteins, in multiple species, we detect a significant global avoidance of weak binding sites in genomes. We demonstrate that the underlying evolutionary process leaves a distinct genomic hallmark in that similar words have correlated frequencies, a signal that we detect in all species across domains of life. We consider the possibility that natural selection against weak binding sites contributes to this process, and using an evolutionary model we show that the strength of selection needed to maintain global word compositions is on the order of point mutation rates. Likewise, we show that evolutionary mechanisms based on interference of protein-DNA binding with replication and mutational repair processes could yield similar results and operate with similar rates. On the basis of these modeling and bioinformatic results, we conclude that genome-wide word compositions have been molded by DNA binding proteins acting through tiny evolutionary steps over time scales spanning millions of generations.

  1. Genome-wide study of resistant hypertension identified from electronic health records.

    PubMed

    Dumitrescu, Logan; Ritchie, Marylyn D; Denny, Joshua C; El Rouby, Nihal M; McDonough, Caitrin W; Bradford, Yuki; Ramirez, Andrea H; Bielinski, Suzette J; Basford, Melissa A; Chai, High Seng; Peissig, Peggy; Carrell, David; Pathak, Jyotishman; Rasmussen, Luke V; Wang, Xiaoming; Pacheco, Jennifer A; Kho, Abel N; Hayes, M Geoffrey; Matsumoto, Martha; Smith, Maureen E; Li, Rongling; Cooper-DeHoff, Rhonda M; Kullo, Iftikhar J; Chute, Christopher G; Chisholm, Rex L; Jarvik, Gail P; Larson, Eric B; Carey, David; McCarty, Catherine A; Williams, Marc S; Roden, Dan M; Bottinger, Erwin; Johnson, Julie A; de Andrade, Mariza; Crawford, Dana C

    2017-01-01

    Resistant hypertension is defined as high blood pressure that remains above treatment goals in spite of the concurrent use of three antihypertensive agents from different classes. Despite the important health consequences of resistant hypertension, few studies of resistant hypertension have been conducted. To perform a genome-wide association study for resistant hypertension, we defined and identified cases of resistant hypertension and hypertensives with treated, controlled hypertension among >47,500 adults residing in the US linked to electronic health records (EHRs) and genotyped as part of the electronic MEdical Records & GEnomics (eMERGE) Network. Electronic selection logic using billing codes, laboratory values, text queries, and medication records was used to identify resistant hypertension cases and controls at each site, and a total of 3,006 cases of resistant hypertension and 876 controlled hypertensives were identified among eMERGE Phase I and II sites. After imputation and quality control, a total of 2,530,150 SNPs were tested for an association among 2,830 multi-ethnic cases of resistant hypertension and 876 controlled hypertensives. No test of association was genome-wide significant in the full dataset or in the dataset limited to European American cases (n = 1,719) and controls (n = 708). The most significant finding was CLNK rs13144136 at p = 1.00 × 10^-6 (odds ratio = 0.68; 95% CI = 0.58-0.80) in the full dataset with similar results in the European American only dataset. We also examined whether SNPs known to influence blood pressure or hypertension also influenced resistant hypertension. None was significant after correction for multiple testing. These data highlight both the difficulties and the potential utility of EHR-linked genomic data to study clinically-relevant traits such as resistant hypertension.
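    To see why the top signal in this abstract falls short of genome-wide significance, the arithmetic below applies a Bonferroni correction over the 2,530,150 tested SNPs. This is only an orientation sketch: the abstract does not state which threshold the authors used, so the family-wise alpha of 0.05, the Bonferroni rule, and the helper name `bonferroni_threshold` are all assumptions.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for a family-wise error rate `alpha`."""
    return alpha / n_tests


N_SNPS = 2_530_150   # number of SNPs tested, from the abstract
TOP_P = 1.00e-6      # P-value reported for CLNK rs13144136

# Assumed family-wise alpha of 0.05; gives roughly 2 × 10^-8 per test.
threshold = bonferroni_threshold(0.05, N_SNPS)
is_significant = TOP_P < threshold
```

    Under these assumptions the reported P-value is about fifty times larger than the per-test threshold, consistent with the abstract's statement that no association was genome-wide significant.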

  2. Genome-wide association study of Alzheimer's disease.

    PubMed

    Kamboh, M I; Demirci, F Y; Wang, X; Minster, R L; Carrasquillo, M M; Pankratz, V S; Younkin, S G; Saykin, A J; Jun, G; Baldwin, C; Logue, M W; Buros, J; Farrer, L; Pericak-Vance, M A; Haines, J L; Sweet, R A; Ganguli, M; Feingold, E; Dekosky, S T; Lopez, O L; Barmada, M M

    2012-05-15

    In addition to apolipoprotein E (APOE), recent large genome-wide association studies (GWASs) have identified nine other genes/loci (CR1, BIN1, CLU, PICALM, MS4A4/MS4A6E, CD2AP, CD33, EPHA1 and ABCA7) for late-onset Alzheimer's disease (LOAD). However, the genetic effect attributable to known loci is about 50%, indicating that additional risk genes for LOAD remain to be identified. In this study, we have used a new GWAS data set from the University of Pittsburgh (1291 cases and 938 controls) to examine in detail the recently implicated nine new regions with Alzheimer's disease (AD) risk, and also performed a meta-analysis utilizing the top 1% GWAS single-nucleotide polymorphisms (SNPs) with P<0.01 along with four independent data sets (2727 cases and 3336 controls) for these SNPs in an effort to identify new AD loci. The new GWAS data were generated on the Illumina Omni1-Quad chip and imputed at ~2.5 million markers. As expected, several markers in the APOE regions showed genome-wide significant associations in the Pittsburgh sample. While we observed nominally significant associations (P<0.05) either within or adjacent to five genes (PICALM, BIN1, ABCA7, MS4A4/MS4A6E and EPHA1), significant signals were observed 69-180 kb outside of the remaining four genes (CD33, CLU, CD2AP and CR1). Meta-analysis on the top 1% SNPs revealed a suggestive novel association in the PPP1R3B gene (top SNP rs3848140 with P = 3.05E-07). The association of this SNP with AD risk was consistent in all five samples with a meta-analysis odds ratio of 2.43. This is a potential candidate gene for AD as this is expressed in the brain and is involved in lipid metabolism. These findings need to be confirmed in additional samples.

  3. Genome-wide association study of Alzheimer's disease

    PubMed Central

    Kamboh, M I; Demirci, F Y; Wang, X; Minster, R L; Carrasquillo, M M; Pankratz, V S; Younkin, S G; Saykin, A J; Jun, G; Baldwin, C; Logue, M W; Buros, J; Farrer, L; Pericak-Vance, M A; Haines, J L; Sweet, R A; Ganguli, M; Feingold, E; DeKosky, S T; Lopez, O L; Barmada, M M

    2012-01-01

    In addition to apolipoprotein E (APOE), recent large genome-wide association studies (GWASs) have identified nine other genes/loci (CR1, BIN1, CLU, PICALM, MS4A4/MS4A6E, CD2AP, CD33, EPHA1 and ABCA7) for late-onset Alzheimer's disease (LOAD). However, the genetic effect attributable to known loci is about 50%, indicating that additional risk genes for LOAD remain to be identified. In this study, we have used a new GWAS data set from the University of Pittsburgh (1291 cases and 938 controls) to examine in detail the recently implicated nine new regions with Alzheimer's disease (AD) risk, and also performed a meta-analysis utilizing the top 1% GWAS single-nucleotide polymorphisms (SNPs) with P<0.01 along with four independent data sets (2727 cases and 3336 controls) for these SNPs in an effort to identify new AD loci. The new GWAS data were generated on the Illumina Omni1-Quad chip and imputed at ~2.5 million markers. As expected, several markers in the APOE regions showed genome-wide significant associations in the Pittsburgh sample. While we observed nominally significant associations (P<0.05) either within or adjacent to five genes (PICALM, BIN1, ABCA7, MS4A4/MS4A6E and EPHA1), significant signals were observed 69-180 kb outside of the remaining four genes (CD33, CLU, CD2AP and CR1). Meta-analysis on the top 1% SNPs revealed a suggestive novel association in the PPP1R3B gene (top SNP rs3848140 with P = 3.05E-07). The association of this SNP with AD risk was consistent in all five samples with a meta-analysis odds ratio of 2.43. This is a potential candidate gene for AD as this is expressed in the brain and is involved in lipid metabolism. These findings need to be confirmed in additional samples. PMID:22832961

  4. A Novel Genome-Information Content-Based Statistic for Genome-Wide Association Analysis Designed for Next-Generation Sequencing Data

    PubMed Central

    Luo, Li; Zhu, Yun

    2012-01-01

    The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze their collective frequency differences between cases and controls shift the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistic for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistic, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T^2, collapsing method, multivariate and collapsing (CMC) method, individual χ^2 test, weighted-sum statistic, and variable threshold statistic. Finally, we apply the seven statistics to a published resequencing dataset from ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets. PMID:22651812

  5. A novel genome-information content-based statistic for genome-wide association analysis designed for next-generation sequencing data.

    PubMed

    Luo, Li; Zhu, Yun; Xiong, Momiao

    2012-06-01

    The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze their collective frequency differences between cases and controls shift the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistic for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistic, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T^2, collapsing method, multivariate and collapsing (CMC) method, individual χ^2 test, weighted-sum statistic, and variable threshold statistic. Finally, we apply the seven statistics to a published resequencing dataset from ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets.

  6. A Genome-Wide Map of Mitochondrial DNA Recombination in Yeast

    PubMed Central

    Fritsch, Emilie S.; Chabbert, Christophe D.; Klaus, Bernd; Steinmetz, Lars M.

    2014-01-01

    In eukaryotic cells, the production of cellular energy requires close interplay between nuclear and mitochondrial genomes. The mitochondrial genome is essential in that it encodes several genes involved in oxidative phosphorylation. Each cell contains several mitochondrial genome copies and mitochondrial DNA recombination is a widespread process occurring in plants, fungi, protists, and invertebrates. Saccharomyces cerevisiae has proved to be an excellent model to dissect mitochondrial biology. Several studies have focused on DNA recombination in this organelle, yet mostly relied on reporter genes or artificial systems. However, no complete mitochondrial recombination map has been released for any eukaryote so far. In the present work, we sequenced pools of diploids originating from a cross between two different S. cerevisiae strains to detect recombination events. This strategy allowed us to generate the first genome-wide map of recombination for yeast mitochondrial DNA. We demonstrated that recombination events are enriched in specific hotspots preferentially localized in non-protein-coding regions. Additionally, comparison of the recombination profiles of two different crosses showed that the genetic background affects hotspot localization and recombination rates. Finally, to gain insights into the mechanisms involved in mitochondrial recombination, we assessed the impact of individual depletion of four genes previously associated with this process. Deletion of NTG1 and MGT1 did not substantially influence the recombination landscape, alluding to the potential presence of additional regulatory factors. Our findings also revealed the loss of large mitochondrial DNA regions in the absence of MHR1, suggesting a pivotal role for Mhr1 in mitochondrial genome maintenance during mating. 
    This study provides a comprehensive overview of mitochondrial DNA recombination in yeast and thus paves the way for future mechanistic studies of mitochondrial recombination and genome

  7. A genome-wide map of mitochondrial DNA recombination in yeast.

    PubMed

    Fritsch, Emilie S; Chabbert, Christophe D; Klaus, Bernd; Steinmetz, Lars M

    2014-10-01

    In eukaryotic cells, the production of cellular energy requires close interplay between nuclear and mitochondrial genomes. The mitochondrial genome is essential in that it encodes several genes involved in oxidative phosphorylation. Each cell contains several mitochondrial genome copies and mitochondrial DNA recombination is a widespread process occurring in plants, fungi, protists, and invertebrates. Saccharomyces cerevisiae has proved to be an excellent model to dissect mitochondrial biology. Several studies have focused on DNA recombination in this organelle, yet mostly relied on reporter genes or artificial systems. However, no complete mitochondrial recombination map has been released for any eukaryote so far. In the present work, we sequenced pools of diploids originating from a cross between two different S. cerevisiae strains to detect recombination events. This strategy allowed us to generate the first genome-wide map of recombination for yeast mitochondrial DNA. We demonstrated that recombination events are enriched in specific hotspots preferentially localized in non-protein-coding regions. Additionally, comparison of the recombination profiles of two different crosses showed that the genetic background affects hotspot localization and recombination rates. Finally, to gain insights into the mechanisms involved in mitochondrial recombination, we assessed the impact of individual depletion of four genes previously associated with this process. Deletion of NTG1 and MGT1 did not substantially influence the recombination landscape, alluding to the potential presence of additional regulatory factors. Our findings also revealed the loss of large mitochondrial DNA regions in the absence of MHR1, suggesting a pivotal role for Mhr1 in mitochondrial genome maintenance during mating. This study provides a comprehensive overview of mitochondrial DNA recombination in yeast and thus paves the way for future mechanistic studies of mitochondrial recombination and genome

  8. Genome-wide copy number variation in the bovine genome detected using low coverage sequence of popular beef breeds

    USDA-ARS's Scientific Manuscript database

    Copy number variations (CNVs) are large insertions, deletions or duplications in the genome that vary between members of a species and are known to affect a wide variety of phenotypic traits. In this study, we identified CNVs in a population of bulls using low coverage next-generation sequence data....

  9. Genome-wide study of methotrexate clearance replicates SLCO1B1

    PubMed Central

    Ramsey, Laura B.; Panetta, John C.; Smith, Colton; Yang, Wenjian; Fan, Yiping; Winick, Naomi J.; Martin, Paul L.; Cheng, Cheng; Devidas, Meenakshi; Pui, Ching-Hon; Evans, William E.; Hunger, Stephen P.; Loh, Mignon

    2013-01-01

    Methotrexate clearance can influence the cure of and toxicity in children with acute lymphoblastic leukemia (ALL). We estimated methotrexate plasma clearance for 1279 patients with ALL treated with methotrexate (24-hour infusion of a 1 g/m² dose or 4-hour infusion of a 2 g/m² dose) on the Children's Oncology Group P9904 and P9905 protocols. Methotrexate clearance was lower in older children (P = 7 × 10⁻⁷), girls (P = 2.7 × 10⁻⁴), and those who received a delayed-intensification phase (P = .0022). A genome-wide analysis showed that methotrexate clearance was associated with polymorphisms in the organic anion transporter gene SLCO1B1 (P = 2.1 × 10⁻¹¹). This replicates findings using different schedules of high-dose methotrexate in St Jude ALL treatment protocols; a combined meta-analysis yields a P value of 5.7 × 10⁻¹⁹ for the association of methotrexate clearance with SLCO1B1 SNP rs4149056. Validation of this variant with 5 different treatment regimens of methotrexate solidifies the robustness of this pharmacogenomic determinant of methotrexate clearance. This study is registered at http://www.clinicaltrials.gov as NCT00005585 and NCT00005596. PMID:23233662

  10. Genome-wide compendium and functional assessment of in vivo heart enhancers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dickel, Diane E.; Barozzi, Iros; Zhu, Yiwen

    Whole-genome sequencing is identifying growing numbers of non-coding variants in human disease studies, but the lack of accurate functional annotations prevents their interpretation. We describe the genome-wide landscape of distant-acting enhancers active in the developing and adult human heart, an organ whose impairment is a predominant cause of mortality and morbidity. Using integrative analysis of >35 epigenomic data sets from mouse and human pre- and postnatal hearts we created a comprehensive reference of >80,000 putative human heart enhancers. To illustrate the importance of enhancers in the regulation of genes involved in heart disease, we deleted the mouse orthologs of two human enhancers near cardiac myosin genes. In both cases, we observe in vivo expression changes and cardiac phenotypes consistent with human heart disease. Our study provides a comprehensive catalogue of human heart enhancers for use in clinical whole-genome sequencing studies and highlights the importance of enhancers for cardiac function.

  11. Genome-wide compendium and functional assessment of in vivo heart enhancers

    DOE PAGES

    Dickel, Diane E.; Barozzi, Iros; Zhu, Yiwen; ...

    2016-10-05

    Whole-genome sequencing is identifying growing numbers of non-coding variants in human disease studies, but the lack of accurate functional annotations prevents their interpretation. We describe the genome-wide landscape of distant-acting enhancers active in the developing and adult human heart, an organ whose impairment is a predominant cause of mortality and morbidity. Using integrative analysis of >35 epigenomic data sets from mouse and human pre- and postnatal hearts we created a comprehensive reference of >80,000 putative human heart enhancers. To illustrate the importance of enhancers in the regulation of genes involved in heart disease, we deleted the mouse orthologs of two human enhancers near cardiac myosin genes. In both cases, we observe in vivo expression changes and cardiac phenotypes consistent with human heart disease. Our study provides a comprehensive catalogue of human heart enhancers for use in clinical whole-genome sequencing studies and highlights the importance of enhancers for cardiac function.

  12. Genome-wide compendium and functional assessment of in vivo heart enhancers

    PubMed Central

    Dickel, Diane E.; Barozzi, Iros; Zhu, Yiwen; Fukuda-Yuzawa, Yoko; Osterwalder, Marco; Mannion, Brandon J.; May, Dalit; Spurrell, Cailyn H.; Plajzer-Frick, Ingrid; Pickle, Catherine S.; Lee, Elizabeth; Garvin, Tyler H.; Kato, Momoe; Akiyama, Jennifer A.; Afzal, Veena; Lee, Ah Young; Gorkin, David U.; Ren, Bing; Rubin, Edward M.; Visel, Axel; Pennacchio, Len A.

    2016-01-01

    Whole-genome sequencing is identifying growing numbers of non-coding variants in human disease studies, but the lack of accurate functional annotations prevents their interpretation. We describe the genome-wide landscape of distant-acting enhancers active in the developing and adult human heart, an organ whose impairment is a predominant cause of mortality and morbidity. Using integrative analysis of >35 epigenomic data sets from mouse and human pre- and postnatal hearts we created a comprehensive reference of >80,000 putative human heart enhancers. To illustrate the importance of enhancers in the regulation of genes involved in heart disease, we deleted the mouse orthologs of two human enhancers near cardiac myosin genes. In both cases, we observe in vivo expression changes and cardiac phenotypes consistent with human heart disease. Our study provides a comprehensive catalogue of human heart enhancers for use in clinical whole-genome sequencing studies and highlights the importance of enhancers for cardiac function. PMID:27703156

  13. Genome-wide association studies to identify rice salt-tolerance markers.

    PubMed

    Patishtan, Juan; Hartley, Tom N; Fonseca de Carvalho, Raquel; Maathuis, Frans J M

    2018-05-01

    Salinity is an ever-increasing menace that affects agriculture worldwide. Crops such as rice are salt sensitive, but their degree of susceptibility varies widely between cultivars, pointing to extensive genetic diversity that can be exploited to identify genes and proteins that are relevant in the response of rice to salt stress. We used a diversity panel of 306 rice accessions and collected phenotypic data after short (6 h), medium (7 d) and long (30 d) salinity treatment (50 mM NaCl). A genome-wide association study (GWAS) was subsequently performed, which identified around 1200 candidate genes from many functional categories, but this was treatment period dependent. Further analysis showed the presence of cation transporters and transcription factors with a known role in salinity tolerance and those that hitherto were not known to be involved in salt stress. Localization analysis of single nucleotide polymorphisms (SNPs) showed the presence of several hundred non-synonymous SNPs (nsSNPs) in coding regions and earmarked specific genomic regions with increased numbers of nsSNPs. It points to components of the ubiquitination pathway as important sources of genetic diversity that could underpin phenotypic variation in stress tolerance. © 2017 John Wiley & Sons Ltd.

  14. Genome-wide association analysis of ischemic stroke in young adults.

    PubMed

    Cheng, Yu-Ching; O'Connell, Jeffrey R; Cole, John W; Stine, O Colin; Dueker, Nicole; McArdle, Patrick F; Sparks, Mary J; Shen, Jess; Laurie, Cathy C; Nelson, Sarah; Doheny, Kimberly F; Ling, Hua; Pugh, Elizabeth W; Brott, Thomas G; Brown, Robert D; Meschia, James F; Nalls, Michael; Rich, Stephen S; Worrall, Bradford; Anderson, Christopher D; Biffi, Alessandro; Cortellini, Lynelle; Furie, Karen L; Rost, Natalia S; Rosand, Jonathan; Manolio, Teri A; Kittner, Steven J; Mitchell, Braxton D

    2011-11-01

    Ischemic stroke (IS) is among the leading causes of death in Western countries. There is a significant genetic component to IS susceptibility, especially among young adults. To date, research to identify genetic loci predisposing to stroke has met only with limited success. We performed a genome-wide association (GWA) analysis of early-onset IS to identify potential stroke susceptibility loci. The GWA analysis was conducted by genotyping 1 million SNPs in a biracial population of 889 IS cases and 927 controls, ages 15-49 years. Genotypes were imputed using the HapMap3 reference panel to provide 1.4 million SNPs for analysis. Logistic regression models adjusting for age, recruitment stages, and population structure were used to determine the association of IS with individual SNPs. Although no single SNP reached genome-wide significance (P < 5 × 10⁻⁸), we identified two SNPs in chromosome 2q23.3, rs2304556 (in FMNL2; P = 1.2 × 10⁻⁷) and rs1986743 (in ARL6IP6; P = 2.7 × 10⁻⁷), strongly associated with early-onset stroke. These data suggest that a novel locus on human chromosome 2q23.3 may be associated with IS susceptibility among young adults.

  15. Nencki Genomics Database—Ensembl funcgen enhanced with intersections, user data and genome-wide TFBS motifs

    PubMed Central

    Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal

    2013-01-01

    We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql -h database.nencki-genomics.org -u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface. Database URL: http://www.nencki-genomics.org. PMID:24089456
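
    The abstract above mentions procedures for computing intersections between regulatory features. As a rough illustration of what such an intersection computes, the sketch below pairs up overlapping features from two lists. It assumes features are plain (chromosome, start, end) tuples with inclusive coordinates; the function name and data layout are hypothetical and are not the database's own stored procedures.

```python
from collections import defaultdict


def intersect_features(features_a, features_b):
    """Return (feature_a, feature_b, overlap) for every overlapping pair.

    Each feature is a hypothetical (chrom, start, end) tuple with
    inclusive coordinates; overlap is the shared (start, end) span.
    """
    # Group the second list by chromosome so only same-chromosome
    # features are ever compared.
    by_chrom = defaultdict(list)
    for feat in features_b:
        by_chrom[feat[0]].append(feat)

    hits = []
    for chrom, start, end in features_a:
        for other in by_chrom.get(chrom, []):
            _, o_start, o_end = other
            # Two closed intervals overlap when each one starts
            # no later than the other one ends.
            if start <= o_end and o_start <= end:
                overlap = (max(start, o_start), min(end, o_end))
                hits.append(((chrom, start, end), other, overlap))
    return hits


if __name__ == "__main__":
    enhancers = [("chr1", 100, 200), ("chr2", 50, 60)]
    motifs = [("chr1", 150, 300), ("chr1", 201, 250), ("chr2", 10, 20)]
    for a, b, shared in intersect_features(enhancers, motifs):
        print(a, b, shared)
```

    This nested loop is quadratic per chromosome; a production system would more plausibly rely on sorted coordinates or an index, which is presumably why the database pre-computes the results.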

  16. Genome-Wide Analysis of the Arabidopsis Replication Timing Program

    PubMed Central

    Brooks, Ashley M.; Wheeler, Emily; LeBlanc, Chantal; Lee, Tae-Jin; Martienssen, Robert A.; Thompson, William F.

    2018-01-01

    Eukaryotes use a temporally regulated process, known as the replication timing program, to ensure that their genomes are fully and accurately duplicated during S phase. Replication timing programs are predictive of genomic features and activity and are considered to be functional readouts of chromatin organization. Although replication timing programs have been described for yeast and animal systems, much less is known about the temporal regulation of plant DNA replication or its relationship to genome sequence and chromatin structure. We used the thymidine analog, 5-ethynyl-2′-deoxyuridine, in combination with flow sorting and Repli-Seq to describe, at high-resolution, the genome-wide replication timing program for Arabidopsis (Arabidopsis thaliana) Col-0 suspension cells. We identified genomic regions that replicate predominantly during early, mid, and late S phase, and correlated these regions with genomic features and with data for chromatin state, accessibility, and long-distance interaction. Arabidopsis chromosome arms tend to replicate early while pericentromeric regions replicate late. Early and mid-replicating regions are gene-rich and predominantly euchromatic, while late regions are rich in transposable elements and primarily heterochromatic. However, the distribution of chromatin states across the different times is complex, with each replication time corresponding to a mixture of states. Early and mid-replicating sequences interact with each other and not with late sequences, but early regions are more accessible than mid regions. The replication timing program in Arabidopsis reflects a bipartite genomic organization with early/mid-replicating regions and late regions forming separate, noninteracting compartments. The temporal order of DNA replication within the early/mid compartment may be modulated largely by chromatin accessibility. PMID:29301956

  17. Genome-wide Analysis of Genetic Loci Associated with Alzheimer’s Disease

    PubMed Central

    Seshadri, Sudha; Fitzpatrick, Annette L.; Arfan Ikram, M; DeStefano, Anita L.; Gudnason, Vilmundur; Boada, Merce; Bis, Joshua C.; Smith, Albert V.; Carassquillo, Minerva M.; Charles Lambert, Jean; Harold, Denise; Schrijvers, Elisabeth M. C.; Ramirez-Lorca, Reposo; Debette, Stephanie; Longstreth, W.T.; Janssens, A. Cecile J.W.; Shane Pankratz, V.; Dartigues, Jean François; Hollingworth, Paul; Aspelund, Thor; Hernandez, Isabel; Beiser, Alexa; Kuller, Lewis H.; Koudstaal, Peter J.; Dickson, Dennis W.; Tzourio, Christophe; Abraham, Richard; Antunez, Carmen; Du, Yangchun; Rotter, Jerome I.; Aulchenko, Yurii S.; Harris, Tamara B.; Petersen, Ronald C.; Berr, Claudine; Owen, Michael J.; Lopez-Arrieta, Jesus; Varadarajan, Badri N.; Becker, James T.; Rivadeneira, Fernando; Nalls, Michael A.; Graff-Radford, Neill R.; Campion, Dominique; Auerbach, Sanford; Rice, Kenneth; Hofman, Albert; Jonsson, Palmi V.; Schmidt, Helena; Lathrop, Mark; Mosley, Thomas H.; Au, Rhoda; Psaty, Bruce M.; Uitterlinden, Andre G.; Farrer, Lindsay A.; Lumley, Thomas; Ruiz, Agustin; Williams, Julie; Amouyel, Philippe; Younkin, Steve G.; Wolf, Philip A.; Launer, Lenore J.; Lopez, Oscar L.; van Duijn, Cornelia M.; Breteler, Monique M. B.

    2010-01-01

    Context: Genome wide association studies (GWAS) have recently identified CLU, PICALM and CR1 as novel genes for late-onset Alzheimer’s disease (AD). Objective: In a three-stage analysis of new and previously published GWAS on over 35000 persons (8371 AD cases), we sought to identify and strengthen additional loci associated with AD and confirm these in an independent sample. We also examined the contribution of recently identified genes to AD risk prediction. Design, Setting, and Participants: We identified strong genetic associations (p<10⁻³) in a Stage 1 sample of 3006 AD cases and 14642 controls by combining new data from the population-based Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium (1367 AD cases (973 incident)) with previously reported results from the Translational Genomics Research Institute (TGEN) and Mayo AD GWAS. We identified 2708 single nucleotide polymorphisms (SNPs) with p-values<10⁻³, and in Stage 2 pooled results for these SNPs with the European AD Initiative (2032 cases, 5328 controls) to identify ten loci with p-values<10⁻⁵. In Stage 3, we combined data for these ten loci with data from the Genetic and Environmental Risk in AD consortium (3333 cases, 6995 controls) to identify four SNPs with a p-value<1.7×10⁻⁸. These four SNPs were replicated in an independent Spanish sample (1140 AD cases and 1209 controls). Main outcome measure: Alzheimer’s Disease. Results: We showed genome-wide significance for two new loci: rs744373 near BIN1 (OR:1.13; 95%CI:1.06–1.21 per copy of the minor allele; p=1.6×10⁻¹¹) and rs597668 near EXOC3L2/BLOC1S3/MARK4 (OR:1.18; 95%CI:1.07–1.29; p=6.5×10⁻⁹). Associations of CLU, PICALM, BIN1 and EXOC3L2 with AD were confirmed in the Spanish sample (p<0.05). However, CLU and PICALM did not improve incident AD prediction beyond age, sex, and APOE (improvement in area under receiver-operating-characteristic curve <0.003). Conclusions: Two novel genetic loci for AD are reported

  18. SUSCEPTIBILITY LOCI FOR UMBILICAL HERNIA IN SWINE DETECTED BY GENOME-WIDE ASSOCIATION.

    PubMed

    Liao, X J; Lia, L; Zhang, Z Y; Long, Y; Yang, B; Ruan, G R; Su, Y; Ai, H S; Zhang, W C; Deng, W Y; Xiao, S J; Ren, J; Ding, N S; Huang, L S

    2015-10-01

    Umbilical hernia (UH) is a complex disorder caused by both genetic and environmental factors. UH brings animal welfare problems and severe economic loss to the pig industry. The genetic basis of UH remains poorly understood. The high-density 60K porcine SNP array enables the rapid application of genome-wide association study (GWAS) to identify genetic loci for phenotypic traits at genome-wide scale in pigs. The objective of this research was to identify susceptibility loci for swine umbilical hernia using the GWAS approach. We genotyped 478 piglets from 142 families representing three Western commercial breeds with the Illumina PorcineSNP60 BeadChip. Significant SNPs were then detected by GWAS using ROADTRIPS (Robust Association-Detection Test for Related Individuals with Population Substructure) software based on a Bonferroni corrected threshold (P = 1.67E-06) or suggestive threshold (P = 3.34E-05) and false discovery rate (FDR = 0.05). After quality control, 29,924 qualified SNPs and 472 piglets were used for GWAS. Two suggestive loci predisposing to pig UH were identified at 44.25 Mb on SSC2 (rs81358018, P = 3.34E-06, FDR = 0.049933) and at 45.90 Mb on SSC17 (rs81479278, P = 3.30E-06, FDR = 0.049933) in the Duroc population. No SNP was associated with pig UH at a significant level in either the Landrace or the Large White population. Furthermore, we carried out a meta-analysis in the combined pure-breed population containing all the 472 piglets. rs81479278 (P = 1.16E-06, FDR = 0.022475) was associated with pig UH at the genome-wide significance level. SRC was characterized as a plausible candidate gene for susceptibility to pig UH according to its genomic position and biological functions. To our knowledge, this study gives the first description of GWAS identifying susceptibility loci for umbilical hernia in pigs. Our findings provide deeper insights into the genetic architecture of umbilical hernia in pigs.
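
    The two thresholds quoted in this abstract are consistent with a simple calculation over the 29,924 SNPs that passed quality control. The sketch below reproduces them under two assumptions that the abstract does not state explicitly: a family-wise error rate of 0.05 for the Bonferroni threshold, and one expected false positive per genome scan for the suggestive threshold. The function names are illustrative only.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test P-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def suggestive_threshold(n_tests):
    """Cutoff allowing one expected false positive across all tests (assumed convention)."""
    return 1.0 / n_tests


if __name__ == "__main__":
    n_snps = 29_924  # SNPs remaining after quality control, from the abstract
    alpha = 0.05     # assumed family-wise error rate

    print(f"Bonferroni: {bonferroni_threshold(alpha, n_snps):.2E}")
    print(f"Suggestive: {suggestive_threshold(n_snps):.2E}")
```

    Under these assumptions the values round to 1.67E-06 and 3.34E-05, matching the figures reported above.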

  19. A genome-wide association study platform built on iPlant cyber-infrastructure

    USDA-ARS's Scientific Manuscript database

    We demonstrated a flexible Genome-Wide Association (GWA) Study (GWAS) platform built upon the iPlant Collaborative Cyber-infrastructure. The platform supports big data management, sharing, and large scale study of both genotype and phenotype data on clusters. End users can add their own analysis too...

  20. Genome-wide DNA methylation profiling in infants born to gestational diabetes mellitus.

    PubMed

    Weng, Xiaoling; Liu, Fatao; Zhang, Hong; Kan, Mengyuan; Wang, Ting; Dong, Mingyue; Liu, Yun

    2018-03-26

    Offspring exposed to gestational diabetes mellitus (GDM) are at a high risk for metabolic diseases. The mechanisms behind the association between offspring exposed to GDM in utero and an increased risk of health consequences later in life remain unclear. The aim of this study was to clarify the changes in methylation levels in the foetuses of women with GDM and to explore the possible mechanisms linking maternal GDM with a high risk of metabolic diseases in offspring later in life. A genome-wide comparative methylome analysis on the umbilical cord blood of infants born to 30 women with GDM and 33 women with normal pregnancy was performed using Infinium HumanMethylation 450 BeadChip assays. A quantitative methylation analysis of 18 CpG dinucleotides was verified in the validation umbilical cord blood samples from 102 newborns exposed to GDM and 103 newborns who experienced normal pregnancy by MassARRAY EpiTYPER. A total of 4485 differentially methylated sites (DMSs), including 2150 hypermethylated sites and 2335 hypomethylated sites, with a mean β-value difference of >0.05, were identified by the 450k array. Good agreement was observed between the MassARRAY validation data and the 450k array data (R² > 0.99; P < 0.0001). Thirty-seven CpGs (representing 20 genes) with a β-value difference of >0.15 between the GDM and healthy groups were identified and showed potential as clinical biomarkers for GDM. "hsa04940: Type I diabetes mellitus" was the most significant Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, with a P-value = 3.20E-07 and 1.36E-02 in the hypermethylated and hypomethylated gene pathway enrichment analyses, respectively. In the Gene Ontology (GO) pathway analyses, immune MHC-related pathways and neuron development-related pathways were significantly enriched. Our results suggest that GDM has epigenetic effects on genes that are preferentially involved in the Type I diabetes mellitus pathway, immune MHC (major histocompatibility complex

  1. Genome-Wide Transposon Mutagenesis in Pathogenic Leptospira Species

    PubMed Central

    Murray, Gerald L.; Morel, Viviane; Cerqueira, Gustavo M.; Croda, Julio; Srikram, Amporn; Henry, Rebekah; Ko, Albert I.; Dellagostin, Odir A.; Bulach, Dieter M.; Sermswan, Rasana W.; Adler, Ben; Picardeau, Mathieu

    2009-01-01

    Leptospira interrogans is the most common cause of leptospirosis in humans and animals. Genetic analysis of L. interrogans has been severely hindered by a lack of tools for genetic manipulation. Recently we developed the mariner-based transposon Himar1 to generate the first defined mutants in L. interrogans. In this study, a total of 929 independent transposon mutants were obtained and the location of insertion determined. Of these mutants, 721 were located in the protein coding regions of 551 different genes. While sequence analysis of transposon insertion sites indicated that transposition occurred in an essentially random fashion in the genome, 25 unique transposon mutants were found to exhibit insertions into genes encoding 16S or 23S rRNAs, suggesting these genes are insertional hot spots in the L. interrogans genome. In contrast, loci containing notionally essential genes involved in lipopolysaccharide and heme biosynthesis showed few transposon insertions. The effect of gene disruption on the virulence of a selected set of defined mutants was investigated using the hamster model of leptospirosis. Two attenuated mutants with disruptions in hypothetical genes were identified, thus validating the use of transposon mutagenesis for the identification of novel virulence factors in L. interrogans. This library provides a valuable resource for the study of gene function in L. interrogans. Combined with the genome sequences of L. interrogans, this provides an opportunity to investigate genes that contribute to pathogenesis and will provide a better understanding of the biology of L. interrogans. PMID:19047402

  2. Genome-Wide Development and Use of Microsatellite Markers for Large-Scale Genotyping Applications in Foxtail Millet [Setaria italica (L.)]

    PubMed Central

    Pandey, Garima; Misra, Gopal; Kumari, Kajal; Gupta, Sarika; Parida, Swarup Kumar; Chattopadhyay, Debasis; Prasad, Manoj

    2013-01-01

    The availability of well-validated informative co-dominant microsatellite markers and saturated genetic linkage map has been limited in foxtail millet (Setaria italica L.). In view of this, we conducted a genome-wide analysis and identified 28 342 microsatellite repeat-motifs spanning 405.3 Mb of foxtail millet genome. The trinucleotide repeats (∼48%) was prevalent when compared with dinucleotide repeats (∼46%). Of the 28 342 microsatellites, 21 294 (∼75%) primer pairs were successfully designed, and a total of 15 573 markers were physically mapped on 9 chromosomes of foxtail millet. About 159 markers were validated successfully in 8 accessions of Setaria sp. with ∼67% polymorphic potential. The high percentage (89.3%) of cross-genera transferability across millet and non-millet species with higher transferability percentage in bioenergy grasses (∼79%, Switchgrass and ∼93%, Pearl millet) signifies their importance in studying the bioenergy grasses. In silico comparative mapping of 15 573 foxtail millet microsatellite markers against the mapping data of sorghum (16.9%), maize (14.5%) and rice (6.4%) indicated syntenic relationships among the chromosomes of foxtail millet and target species. The results, thus, demonstrate the immense applicability of developed microsatellite markers in germplasm characterization, phylogenetics, construction of genetic linkage map for gene/quantitative trait loci discovery, comparative mapping in foxtail millet, including other millets and bioenergy grass species. PMID:23382459
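
    The genome-wide identification of microsatellite repeat motifs described above can be pictured with a small regular-expression scan. The sketch below is only an illustration and not the authors' pipeline: it finds perfect tandem repeats of a fixed motif length, and the function name, the minimum repeat counts and the toy sequence are all assumptions.

```python
import re


def find_microsatellites(seq, motif_len, min_repeats):
    """Find perfect tandem repeats of a motif_len-base motif.

    Returns (start, motif, repeat_count) tuples for every run with at
    least min_repeats consecutive copies. Homopolymer runs such as
    AAAAAA are skipped because they are not di- or trinucleotide motifs.
    """
    # Capture one motif, then require it to repeat via a backreference.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    found = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue
        found.append((match.start(), motif, len(match.group(0)) // motif_len))
    return found


if __name__ == "__main__":
    # Toy sequence: five AC copies, then four AGC copies.
    toy = "TTG" + "AC" * 5 + "AGGCT" + "AGC" * 4 + "AGTT"
    print(find_microsatellites(toy, motif_len=2, min_repeats=5))
    print(find_microsatellites(toy, motif_len=3, min_repeats=4))
```

    Real marker discovery tools also handle imperfect and compound repeats and then design flanking primers, which this sketch does not attempt.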

  3. Genome-wide development and use of microsatellite markers for large-scale genotyping applications in foxtail millet [Setaria italica (L.)].

    PubMed

    Pandey, Garima; Misra, Gopal; Kumari, Kajal; Gupta, Sarika; Parida, Swarup Kumar; Chattopadhyay, Debasis; Prasad, Manoj

    2013-04-01

    The availability of well-validated informative co-dominant microsatellite markers and saturated genetic linkage map has been limited in foxtail millet (Setaria italica L.). In view of this, we conducted a genome-wide analysis and identified 28 342 microsatellite repeat-motifs spanning 405.3 Mb of foxtail millet genome. The trinucleotide repeats (∼48%) was prevalent when compared with dinucleotide repeats (∼46%). Of the 28 342 microsatellites, 21 294 (∼75%) primer pairs were successfully designed, and a total of 15 573 markers were physically mapped on 9 chromosomes of foxtail millet. About 159 markers were validated successfully in 8 accessions of Setaria sp. with ∼67% polymorphic potential. The high percentage (89.3%) of cross-genera transferability across millet and non-millet species with higher transferability percentage in bioenergy grasses (∼79%, Switchgrass and ∼93%, Pearl millet) signifies their importance in studying the bioenergy grasses. In silico comparative mapping of 15 573 foxtail millet microsatellite markers against the mapping data of sorghum (16.9%), maize (14.5%) and rice (6.4%) indicated syntenic relationships among the chromosomes of foxtail millet and target species. The results, thus, demonstrate the immense applicability of developed microsatellite markers in germplasm characterization, phylogenetics, construction of genetic linkage map for gene/quantitative trait loci discovery, comparative mapping in foxtail millet, including other millets and bioenergy grass species.

  4. Efficiently Identifying Significant Associations in Genome-wide Association Studies

    PubMed Central

    Eskin, Eleazar

    2013-01-01

    Over the past several years, genome-wide association studies (GWAS) have implicated hundreds of genes in common disease. More recently, the GWAS approach has been utilized to identify regions of the genome that harbor variation affecting gene expression or expression quantitative trait loci (eQTLs). Unlike GWAS applied to clinical traits, where only a handful of phenotypes are analyzed per study, in eQTL studies, tens of thousands of gene expression levels are measured, and the GWAS approach is applied to each gene expression level. This leads to computing billions of statistical tests and requires substantial computational resources, particularly when applying novel statistical methods such as mixed models. We introduce a novel two-stage testing procedure that identifies all of the significant associations more efficiently than testing all the single nucleotide polymorphisms (SNPs). In the first stage, a small number of informative SNPs, or proxies, across the genome are tested. Based on their observed associations, our approach locates the regions that may contain significant SNPs and only tests additional SNPs from those regions. We show through simulations and analysis of real GWAS datasets that the proposed two-stage procedure increases the computational speed by a factor of 10. Additionally, efficient implementation of our software increases the computational speed relative to the state-of-the-art testing approaches by a factor of 75. PMID:24033261
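
    The two-stage idea in this abstract (test a few proxy SNPs first, then test the remaining SNPs only in regions the proxies flag) can be sketched as follows. This is a simplified heuristic for illustration, not the published procedure: it picks proxies at a fixed stride and uses a fixed lenient cutoff, whereas the paper selects informative proxies and derives which regions to follow up from the observed associations. All names and cutoffs are hypothetical.

```python
def two_stage_scan(stat, n_snps, stride, lenient, strict):
    """Return (significant SNP indices, number of SNPs actually tested).

    stat(i) computes the association statistic for SNP i (assumed costly).
    Stage 1 tests every stride-th SNP as a proxy. Stage 2 tests the
    neighbours of proxies whose statistic reaches the lenient cutoff.
    A SNP is significant when its statistic reaches the strict cutoff.
    """
    tested = {}

    # Stage 1: evenly spaced proxies only.
    for i in range(0, n_snps, stride):
        tested[i] = stat(i)

    # Stage 2: follow up the window around each promising proxy.
    promising = [i for i, s in tested.items() if s >= lenient]
    for proxy in promising:
        lo = max(0, proxy - stride + 1)
        hi = min(n_snps, proxy + stride)
        for j in range(lo, hi):
            if j not in tested:
                tested[j] = stat(j)

    hits = sorted(i for i, s in tested.items() if s >= strict)
    return hits, len(tested)


if __name__ == "__main__":
    # Toy signal: a peak at SNP 23 that decays with distance,
    # standing in for correlation between nearby SNPs.
    def toy_stat(i):
        return max(0, 10 - 2 * abs(i - 23))

    hits, n_tested = two_stage_scan(toy_stat, n_snps=100, stride=5,
                                    lenient=3, strict=8)
    print(hits, n_tested)
```

    In this toy run the scan recovers the three SNPs at the peak while evaluating 32 of the 100 SNPs. The saving depends entirely on nearby SNPs being correlated with a proxy; a signal with no correlated proxy would be missed by this simplified version.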

  5. Pernicious plans revealed: Plasmodium falciparum genome wide expression analysis.

    PubMed

    Llinás, Manuel; DeRisi, Joseph L

    2004-08-01

    The asexual intraerythrocytic developmental cycle (IDC) of Plasmodium falciparum is responsible for the majority of the clinical manifestations of malaria in humans. Although malaria has been studied for over a century, the elucidation of the full genome sequence of P. falciparum has now allowed for in-depth studies of gene expression throughout the entire intraerythrocytic stage. As the mainstays of anti-malarial chemotherapy become increasingly ineffective, we need a deeper understanding of fundamental plasmodial bioregulatory mechanisms to successfully subvert them. Recent gene expression studies have begun to examine different aspects of the IDC and are providing key insights into the basic mechanisms of Plasmodium gene regulation and are helping to define gene functions. However, to date, no transcription factor has been fully characterized from Plasmodium and the definitive identification of cis-acting regulatory elements along with their corresponding trans-acting partners is still lacking. The characterization of the transcriptome of P. falciparum is the first major step towards the understanding of the genome wide regulation of gene expression in this parasite. IDC expression data for almost every gene in the P. falciparum genome can now be publicly queried at and. The results of these studies suggest promising leads for identifying novel targets for anti-malarial therapeutics and vaccines in addition to providing a solid foundation for the ongoing elucidation of plasmodial gene expression.

  6. Genome-Wide Association Study of Seed Dormancy and the Genomic Consequences of Improvement Footprints in Rice (Oryza sativa L.)

    PubMed Central

    Lu, Qing; Niu, Xiaojun; Zhang, Mengchen; Wang, Caihong; Xu, Qun; Feng, Yue; Yang, Yaolong; Wang, Shan; Yuan, Xiaoping; Yu, Hanyong; Wang, Yiping; Chen, Xiaoping; Liang, Xuanqiang; Wei, Xinghua

    2018-01-01

    Seed dormancy is an important agronomic trait affecting grain yield and quality because of pre-harvest germination and is influenced by both environmental and genetic factors. However, our knowledge of the factors controlling seed dormancy remains limited. To better reveal the molecular mechanism underlying this trait, a genome-wide association study was conducted in an indica-only population consisting of 453 accessions genotyped using 5,291 SNPs. Nine known and new significant SNPs were identified on eight chromosomes. These lead SNPs explained 34.9% of the phenotypic variation, and four of them were designed as dCAPS markers in the hope of accelerating molecular breeding. Moreover, a total of 212 candidate genes was predicted and eight candidate genes showed plant tissue-specific expression in expression profile data from different public bioinformatics databases. In particular, LOC_Os03g10110, which had a maize homolog involved in embryo development, was identified as a candidate regulator for further biological function investigations. Additionally, a polymorphism information content ratio method was used to screen improvement footprints and 27 selective sweeps were identified, most of which harbored domestication-related genes. Further studies suggested that three significant SNPs were adjacent to the candidate selection signals, supporting the accuracy of our genome-wide association study (GWAS) results. These findings show that genome-wide screening for selective sweeps can be used to identify new improvement-related DNA regions, although the phenotypes are unknown. This study enhances our knowledge of the genetic variation in seed dormancy, and the new dormancy-associated SNPs will provide real benefits in molecular breeding. PMID:29354150

  7. Genome-Wide Structural Variation Detection by Genome Mapping on Nanochannel Arrays.

    PubMed

    Mak, Angel C Y; Lai, Yvonne Y Y; Lam, Ernest T; Kwok, Tsz-Piu; Leung, Alden K Y; Poon, Annie; Mostovoy, Yulia; Hastie, Alex R; Stedman, William; Anantharaman, Thomas; Andrews, Warren; Zhou, Xiang; Pang, Andy W C; Dai, Heng; Chu, Catherine; Lin, Chin; Wu, Jacob J K; Li, Catherine M L; Li, Jing-Woei; Yim, Aldrin K Y; Chan, Saki; Sibert, Justin; Džakula, Željko; Cao, Han; Yiu, Siu-Ming; Chan, Ting-Fung; Yip, Kevin Y; Xiao, Ming; Kwok, Pui-Yan

    2016-01-01

    Comprehensive whole-genome structural variation detection is challenging with current approaches. With diploid cells as DNA source and the presence of numerous repetitive elements, short-read DNA sequencing cannot be used to detect structural variation efficiently. In this report, we show that genome mapping with long, fluorescently labeled DNA molecules imaged on nanochannel arrays can be used for whole-genome structural variation detection without sequencing. While whole-genome haplotyping is not achieved, local phasing (across >150-kb regions) is routine, as molecules from the parental chromosomes are examined separately. In one experiment, we generated genome maps from a trio from the 1000 Genomes Project, compared the maps against that derived from the reference human genome, and identified structural variations that are >5 kb in size. We find that these individuals have many more structural variants than those published, including some with the potential of disrupting gene function or regulation. Copyright © 2016 by the Genetics Society of America.

  8. Genome-wide analysis of murine renal distal convoluted tubular cells for the target genes of mineralocorticoid receptor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ueda, Kohei; Fujiki, Katsunori; Shirahige, Katsuhiko

    Highlights: • We define a target gene of MR as one with MR binding to the adjacent region of DNA. • We use ChIP-seq analysis in combination with microarray. • We, for the first time, explore the genome-wide binding profile of MR. • We reveal 5 genes as the direct target genes of MR in the renal epithelial cell-line. Abstract: Background and objective: Mineralocorticoid receptor (MR) is a member of the nuclear receptor family of proteins and contributes to fluid homeostasis in the kidney. Although the aldosterone-MR pathway induces the expression of several genes in the kidney, it is often unclear whether that expression is accompanied by direct regulation by MR through its binding to the regulatory region of each gene. The purpose of this study is to identify the direct target genes of MR in a murine distal convoluted tubular epithelial cell-line (mDCT). Methods: We analyzed the DNA samples of mDCT cells overexpressing 3xFLAG-hMR after treatment with 10⁻⁷ M aldosterone for 1 h by chromatin immunoprecipitation with deep sequencing (ChIP-seq) and mRNA of the cell-line with treatment of 10⁻⁷ M aldosterone for 3 h by microarray. Results: 3xFLAG-hMR overexpressed in mDCT cells accumulated in the nucleus in response to 10⁻⁹ M aldosterone. Twenty-five genes were indicated as the candidate target genes of MR by ChIP-seq and microarray analyses. Five genes, Sgk1, Fkbp5, Rasl12, Tns1 and Tsc22d3 (Gilz), were validated as the direct target genes of MR by RT-qPCR and ChIP-qPCR. MR binding regions adjacent to Ctgf and Serpine1 were also validated. Conclusions: We, for the first time, captured the genome-wide distribution of MR in mDCT cells and, furthermore, identified five MR target genes in the cell-line. These results will contribute to further studies on the mechanisms of kidney diseases.

  9. Genome-Wide Association Study of Receptive Language Ability of 12-Year-Olds

    ERIC Educational Resources Information Center

    Harlaar, Nicole; Meaburn, Emma L.; Hayiou-Thomas, Marianna E.; Davis, Oliver S. P.; Docherty, Sophia; Hanscombe, Ken B.; Haworth, Claire M. A.; Price, Thomas S.; Trzaskowski, Maciej; Dale, Philip S.; Plomin, Robert

    2014-01-01

    Purpose: Researchers have previously shown that individual differences in measures of receptive language ability at age 12 are highly heritable. In the current study, the authors attempted to identify some of the genes responsible for the heritability of receptive language ability using a "genome-wide association" approach. Method: The…

  10. Genome-wide significant predictors of metabolites in the one-carbon metabolism pathway

    USDA-ARS's Scientific Manuscript database

    Low plasma B-vitamin levels and elevated homocysteine have been associated with cancer, cardiovascular disease, and neurodegenerative disorders. Common variants in FUT2 on chromosome 19q13 were associated with plasma vitamin B12 levels among women in a genome-wide association study (GWAS) in the Nur...

  11. Genome-Wide Polygenic Scores Predict Reading Performance throughout the School Years

    ERIC Educational Resources Information Center

    Selzam, Saskia; Dale, Philip S.; Wagner, Richard K.; DeFries, John C.; Cederlöf, Martin; O'Reilly, Paul F.; Krapohl, Eva; Plomin, Robert

    2017-01-01

    It is now possible to create individual-specific genetic scores, called genome-wide polygenic scores (GPS). We used a GPS for years of education ("EduYears") to predict reading performance assessed at UK National Curriculum Key Stages 1 (age 7), 2 (age 12) and 3 (age 14) and on reading tests administered at ages 7 and 12 in a UK sample…
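
    A genome-wide polygenic score of the kind named above is commonly computed as a weighted sum of allele counts, with weights taken from GWAS effect-size estimates. The following is a minimal sketch under that assumption; the function name `polygenic_score`, the three-SNP example, and the weights are hypothetical and do not come from the study.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages for one individual.

    dosages: count (0, 1 or 2) of the effect allele at each SNP.
    weights: per-SNP effect sizes, e.g. from GWAS summary statistics.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


if __name__ == "__main__":
    # Made-up individual genotyped at three SNPs, with made-up weights.
    example_dosages = [0, 1, 2]
    example_weights = [0.10, -0.20, 0.05]
    print(polygenic_score(example_dosages, example_weights))
```

    Real pipelines typically also handle missing genotypes, linkage disequilibrium between SNPs, and P-value thresholds for SNP inclusion; none of that is modelled here.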

  12. Genome-wide association study with 1000 genomes imputation identifies signals for nine sex hormone-related phenotypes.

    PubMed

    Ruth, Katherine S; Campbell, Purdey J; Chew, Shelby; Lim, Ee Mun; Hadlow, Narelle; Stuckey, Bronwyn G A; Brown, Suzanne J; Feenstra, Bjarke; Joseph, John; Surdulescu, Gabriela L; Zheng, Hou Feng; Richards, J Brent; Murray, Anna; Spector, Tim D; Wilson, Scott G; Perry, John R B

    2016-02-01

    Genetic factors contribute strongly to sex hormone levels, yet knowledge of the regulatory mechanisms remains incomplete. Genome-wide association studies (GWAS) have identified only a small number of loci associated with sex hormone levels, with several reproductive hormones yet to be assessed. The aim of the study was to identify novel genetic variants contributing to the regulation of sex hormones. We performed GWAS using genotypes imputed from the 1000 Genomes reference panel. The study used genotype and phenotype data from a UK twin register. We included 2913 individuals (up to 294 males) from the Twins UK study, excluding individuals receiving hormone treatment. Phenotypes were standardised for age, sex, BMI, stage of menstrual cycle and menopausal status. We tested 7,879,351 autosomal SNPs for association with levels of dehydroepiandrosterone sulphate (DHEAS), oestradiol, free androgen index (FAI), follicle-stimulating hormone (FSH), luteinizing hormone (LH), prolactin, progesterone, sex hormone-binding globulin and testosterone. Eight independent genetic variants reached genome-wide significance (P < 5 × 10⁻⁸), with minor allele frequencies of 1.3-23.9%. Novel signals included variants for progesterone (P = 7.68 × 10⁻¹²), oestradiol (P = 1.63 × 10⁻⁸) and FAI (P = 1.50 × 10⁻⁸). A genetic variant near the FSHB gene was identified which influenced both FSH (P = 1.74 × 10⁻⁸) and LH (P = 3.94 × 10⁻⁹) levels. A separate locus on chromosome 7 was associated with both DHEAS (P = 1.82 × 10⁻¹⁴) and progesterone (P = 6.09 × 10⁻¹⁴). This study highlights loci that are relevant to reproductive function and suggests overlap in the genetic basis of hormone regulation.
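
    The significance filter described in this abstract can be illustrated with a small sketch. It uses only the threshold, SNP count, and P values quoted in the abstract; the names `GENOME_WIDE_ALPHA`, `reported_p`, `bonferroni_threshold`, and `hits_below` are hypothetical. The Bonferroni figure is included only for comparison and is an assumption, not something the abstract says the authors applied.

```python
GENOME_WIDE_ALPHA = 5e-8     # genome-wide significance threshold quoted in the abstract
N_TESTED_SNPS = 7_879_351    # autosomal SNPs tested, as quoted in the abstract

# P values as reported in the abstract, keyed by phenotype.
reported_p = {
    "progesterone (novel signal)": 7.68e-12,
    "oestradiol": 1.63e-8,
    "FAI": 1.50e-8,
    "FSH (near FSHB)": 1.74e-8,
    "LH (near FSHB)": 3.94e-9,
    "DHEAS (chromosome 7)": 1.82e-14,
    "progesterone (chromosome 7)": 6.09e-14,
}


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def hits_below(p_values, threshold):
    """Return the entries whose P value is strictly below the threshold."""
    return {name: p for name, p in p_values.items() if p < threshold}


if __name__ == "__main__":
    strict = bonferroni_threshold(0.05, N_TESTED_SNPS)
    print("conventional threshold:", GENOME_WIDE_ALPHA)
    print("Bonferroni for all tested SNPs:", strict)
    print("pass conventional:", sorted(hits_below(reported_p, GENOME_WIDE_ALPHA)))
    print("pass strict Bonferroni:", sorted(hits_below(reported_p, strict)))
```

    Dividing 0.05 by every imputed SNP is stricter than the conventional cut-off because imputed SNPs are highly correlated; the conventional value corresponds to roughly one million independent tests.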

  13. Genome-Wide Mutagenesis in Borrelia burgdorferi.

    PubMed

    Lin, Tao; Gao, Lihui

    2018-01-01

    population of mutants with different tags, after being recovered from different tissues of infected mice and ticks, mutants from the output pool and the input pool are detected using high-throughput, semi-quantitative Luminex® FLEXMAP™ or next-generation sequencing (Tn-seq) technologies. Thus far, we have created a high-density, sequence-defined transposon library of over 6600 STM mutants for the efficient genome-wide investigation of genes and gene products required for wild-type pathogenesis, host-pathogen interactions, in vitro growth, in vivo survival, physiology, morphology, chemotaxis, motility, structure, metabolism, gene regulation, plasmid maintenance and replication, etc. The insertion sites of 4480 transposon mutants have been determined. About 800 predicted protein-encoding genes in the genome were disrupted in the STM transposon library. The infectivity and some functions of 800 mutants in 500 genes have been determined. Analysis of these transposon mutants has yielded valuable information regarding the genes and gene products important in the pathogenesis and biology of B. burgdorferi and its tick vectors.

  14. Genome-wide annotation of the soybean WRKY family and functional characterization of genes involved in response to Phakopsora pachyrhizi infection.

    PubMed

    Bencke-Malato, Marta; Cabreira, Caroline; Wiebke-Strohm, Beatriz; Bücker-Neto, Lauro; Mancini, Estefania; Osorio, Marina B; Homrich, Milena S; Turchetto-Zolet, Andreia Carina; De Carvalho, Mayra C C G; Stolf, Renata; Weber, Ricardo L M; Westergaard, Gastón; Castagnaro, Atílio P; Abdelnoor, Ricardo V; Marcelino-Guimarães, Francismar C; Margis-Pinheiro, Márcia; Bodanese-Zanettini, Maria Helena

    2014-09-10

    Many previous studies have shown that soybean WRKY transcription factors are involved in the plant response to biotic and abiotic stresses. Phakopsora pachyrhizi is the causal agent of Asian Soybean Rust, one of the most important soybean diseases. There is evidence that WRKYs are involved in the resistance of some soybean genotypes against that fungus. The WRKY genes previously annotated in the soybean genome underrepresented the family. In the present study, a genome-wide annotation of the soybean WRKY family was carried out and members involved in the response to P. pachyrhizi were identified. As a result of a search of soybean genomic databases, 182 WRKY-encoding genes were annotated and 33 putative pseudogenes identified. Genes involved in the response to P. pachyrhizi infection were identified using superSAGE, RNA-Seq of microdissected lesions and microarray experiments. Seventy-five genes were differentially expressed during fungal infection. The expression of eight WRKY genes was validated by RT-qPCR. The expression of these genes in a resistant genotype was earlier and/or stronger than in a susceptible genotype in response to P. pachyrhizi infection. Soybean somatic embryos were transformed in order to overexpress or silence WRKY genes. Embryos overexpressing a WRKY gene were obtained, but they were unable to convert into plants. When infected with P. pachyrhizi, the leaves of the silenced transgenic line showed a higher number of lesions than the wild-type plants. The present study reports a genome-wide annotation of the soybean WRKY family. The participation of some members in the response to P. pachyrhizi infection was demonstrated. The results contribute to the elucidation of gene function and suggest the manipulation of WRKYs as a strategy to increase fungal resistance in soybean plants.

  15. Haplotype-Based Genome-Wide Prediction Models Exploit Local Epistatic Interactions Among Markers

    PubMed Central

    Jiang, Yong; Schmidt, Renate H.; Reif, Jochen C.

    2018-01-01

    Genome-wide prediction approaches represent versatile tools for the analysis and prediction of complex traits. Mostly they rely on marker-based information, but scenarios have been reported in which models capitalizing on closely-linked markers that were combined into haplotypes outperformed marker-based models. Detailed comparisons were undertaken to reveal under which circumstances haplotype-based genome-wide prediction models are superior to marker-based models. Specifically, it was of interest to analyze whether and how haplotype-based models may take local epistatic effects between markers into account. Assuming that populations consisted of fully homozygous individuals, a marker-based model in which local epistatic effects inside haplotype blocks were exploited (LEGBLUP) was linearly transformable into a haplotype-based model (HGBLUP). This theoretical derivation formally revealed that haplotype-based genome-wide prediction models capitalize on local epistatic effects among markers. Simulation studies corroborated this finding. Due to its computational efficiency the HGBLUP model promises to be an interesting tool for studies in which ultra-high-density SNP data sets are studied. Applying the HGBLUP model to empirical data sets revealed higher prediction accuracies than for marker-based models for both traits studied using a mouse panel. In contrast, only a small subset of the traits analyzed in crop populations showed such a benefit. Cases in which higher prediction accuracies are observed for HGBLUP than for marker-based models are expected to be of immediate relevance for breeders, due to the tight linkage a beneficial haplotype will be preserved for many generations. In this respect the inheritance of local epistatic effects very much resembles the one of additive effects. PMID:29549092

  16. Haplotype-Based Genome-Wide Prediction Models Exploit Local Epistatic Interactions Among Markers.

    PubMed

    Jiang, Yong; Schmidt, Renate H; Reif, Jochen C

    2018-05-04

    Genome-wide prediction approaches represent versatile tools for the analysis and prediction of complex traits. Mostly they rely on marker-based information, but scenarios have been reported in which models capitalizing on closely-linked markers that were combined into haplotypes outperformed marker-based models. Detailed comparisons were undertaken to reveal under which circumstances haplotype-based genome-wide prediction models are superior to marker-based models. Specifically, it was of interest to analyze whether and how haplotype-based models may take local epistatic effects between markers into account. Assuming that populations consisted of fully homozygous individuals, a marker-based model in which local epistatic effects inside haplotype blocks were exploited (LEGBLUP) was linearly transformable into a haplotype-based model (HGBLUP). This theoretical derivation formally revealed that haplotype-based genome-wide prediction models capitalize on local epistatic effects among markers. Simulation studies corroborated this finding. Due to its computational efficiency the HGBLUP model promises to be an interesting tool for studies in which ultra-high-density SNP data sets are studied. Applying the HGBLUP model to empirical data sets revealed higher prediction accuracies than for marker-based models for both traits studied using a mouse panel. In contrast, only a small subset of the traits analyzed in crop populations showed such a benefit. Cases in which higher prediction accuracies are observed for HGBLUP than for marker-based models are expected to be of immediate relevance for breeders, due to the tight linkage a beneficial haplotype will be preserved for many generations. In this respect the inheritance of local epistatic effects very much resembles the one of additive effects. Copyright © 2018 Jiang et al.

  17. BLISS is a versatile and quantitative method for genome-wide profiling of DNA double-strand breaks.

    PubMed

    Yan, Winston X; Mirzazadeh, Reza; Garnerone, Silvano; Scott, David; Schneider, Martin W; Kallas, Tomasz; Custodio, Joaquin; Wernersson, Erik; Li, Yinqing; Gao, Linyi; Federova, Yana; Zetsche, Bernd; Zhang, Feng; Bienko, Magda; Crosetto, Nicola

    2017-05-12

    Precisely measuring the location and frequency of DNA double-strand breaks (DSBs) along the genome is instrumental to understanding genomic fragility, but current methods are limited in versatility, sensitivity or practicality. Here we present Breaks Labeling In Situ and Sequencing (BLISS), featuring the following: (1) direct labelling of DSBs in fixed cells or tissue sections on a solid surface; (2) low-input requirement by linear amplification of tagged DSBs by in vitro transcription; (3) quantification of DSBs through unique molecular identifiers; and (4) easy scalability and multiplexing. We apply BLISS to profile endogenous and exogenous DSBs in low-input samples of cancer cells, embryonic stem cells and liver tissue. We demonstrate the sensitivity of BLISS by assessing the genome-wide off-target activity of two CRISPR-associated RNA-guided endonucleases, Cas9 and Cpf1, observing that Cpf1 has higher specificity than Cas9. Our results establish BLISS as a versatile, sensitive and efficient method for genome-wide DSB mapping in many applications.
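
    Point (3) of the abstract, quantification of DSBs through unique molecular identifiers (UMIs), rests on a simple idea: reads that share both a break location and a UMI are treated as amplification copies of one original event. The sketch below shows that idea with exact-match collapsing only. The function name `count_dsb_events`, the read tuple layout, and the example reads are assumptions for illustration and are not taken from the BLISS pipeline, which may also correct sequencing errors in UMIs.

```python
from collections import defaultdict


def count_dsb_events(reads):
    """Count distinct UMIs observed at each break site.

    reads: iterable of (chrom, pos, strand, umi) tuples, one per sequenced read.
    Returns a dict mapping (chrom, pos, strand) to the number of unique UMIs.
    """
    umis_per_site = defaultdict(set)
    for chrom, pos, strand, umi in reads:
        # Reads with the same site and the same UMI collapse into one set entry.
        umis_per_site[(chrom, pos, strand)].add(umi)
    return {site: len(umis) for site, umis in umis_per_site.items()}


if __name__ == "__main__":
    # Made-up reads: the first two are duplicates of the same tagged break.
    example_reads = [
        ("chr1", 1000, "+", "ACGTACGT"),
        ("chr1", 1000, "+", "ACGTACGT"),
        ("chr1", 1000, "+", "TTGACCAA"),
        ("chr2", 5000, "-", "GGCATCGA"),
    ]
    for site, n_events in sorted(count_dsb_events(example_reads).items()):
        print(site, n_events)
```

    Counting unique UMIs rather than raw reads keeps the estimate from being inflated by the linear amplification step.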

  18. A genome-wide association study of anorexia nervosa.

    PubMed

    Boraska, V; Franklin, C S; Floyd, J A B; Thornton, L M; Huckins, L M; Southam, L; Rayner, N W; Tachmazidou, I; Klump, K L; Treasure, J; Lewis, C M; Schmidt, U; Tozzi, F; Kiezebrink, K; Hebebrand, J; Gorwood, P; Adan, R A H; Kas, M J H; Favaro, A; Santonastaso, P; Fernández-Aranda, F; Gratacos, M; Rybakowski, F; Dmitrzak-Weglarz, M; Kaprio, J; Keski-Rahkonen, A; Raevuori, A; Van Furth, E F; Slof-Op 't Landt, M C T; Hudson, J I; Reichborn-Kjennerud, T; Knudsen, G P S; Monteleone, P; Kaplan, A S; Karwautz, A; Hakonarson, H; Berrettini, W H; Guo, Y; Li, D; Schork, N J; Komaki, G; Ando, T; Inoko, H; Esko, T; Fischer, K; Männik, K; Metspalu, A; Baker, J H; Cone, R D; Dackor, J; DeSocio, J E; Hilliard, C E; O'Toole, J K; Pantel, J; Szatkiewicz, J P; Taico, C; Zerwas, S; Trace, S E; Davis, O S P; Helder, S; Bühren, K; Burghardt, R; de Zwaan, M; Egberts, K; Ehrlich, S; Herpertz-Dahlmann, B; Herzog, W; Imgart, H; Scherag, A; Scherag, S; Zipfel, S; Boni, C; Ramoz, N; Versini, A; Brandys, M K; Danner, U N; de Kovel, C; Hendriks, J; Koeleman, B P C; Ophoff, R A; Strengman, E; van Elburg, A A; Bruson, A; Clementi, M; Degortes, D; Forzan, M; Tenconi, E; Docampo, E; Escaramís, G; Jiménez-Murcia, S; Lissowska, J; Rajewski, A; Szeszenia-Dabrowska, N; Slopien, A; Hauser, J; Karhunen, L; Meulenbelt, I; Slagboom, P E; Tortorella, A; Maj, M; Dedoussis, G; Dikeos, D; Gonidakis, F; Tziouvas, K; Tsitsika, A; Papezova, H; Slachtova, L; Martaskova, D; Kennedy, J L; Levitan, R D; Yilmaz, Z; Huemer, J; Koubek, D; Merl, E; Wagner, G; Lichtenstein, P; Breen, G; Cohen-Woods, S; Farmer, A; McGuffin, P; Cichon, S; Giegling, I; Herms, S; Rujescu, D; Schreiber, S; Wichmann, H-E; Dina, C; Sladek, R; Gambaro, G; Soranzo, N; Julia, A; Marsal, S; Rabionet, R; Gaborieau, V; Dick, D M; Palotie, A; Ripatti, S; Widén, E; Andreassen, O A; Espeseth, T; Lundervold, A; Reinvang, I; Steen, V M; Le Hellard, S; Mattingsdal, M; Ntalla, I; Bencko, V; Foretova, L; Janout, V; Navratilova, M; Gallinger, S; Pinto, D; 
Scherer, S W; Aschauer, H; Carlberg, L; Schosser, A; Alfredsson, L; Ding, B; Klareskog, L; Padyukov, L; Courtet, P; Guillaume, S; Jaussent, I; Finan, C; Kalsi, G; Roberts, M; Logan, D W; Peltonen, L; Ritchie, G R S; Barrett, J C; Estivill, X; Hinney, A; Sullivan, P F; Collier, D A; Zeggini, E; Bulik, C M

    2014-10-01

    Anorexia nervosa (AN) is a complex and heritable eating disorder characterized by dangerously low body weight. Neither candidate gene studies nor an initial genome-wide association study (GWAS) have yielded significant and replicated results. We performed a GWAS in 2907 cases with AN from 14 countries (15 sites) and 14 860 ancestrally matched controls as part of the Genetic Consortium for AN (GCAN) and the Wellcome Trust Case Control Consortium 3 (WTCCC3). Individual association analyses were conducted in each stratum and meta-analyzed across all 15 discovery data sets. Seventy-six (72 independent) single nucleotide polymorphisms were taken forward for in silico (two data sets) or de novo (13 data sets) replication genotyping in 2677 independent AN cases and 8629 European ancestry controls along with 458 AN cases and 421 controls from Japan. The final global meta-analysis across discovery and replication data sets comprised 5551 AN cases and 21 080 controls. AN subtype analyses (1606 AN restricting; 1445 AN binge-purge) were performed. No findings reached genome-wide significance. Two intronic variants were suggestively associated: rs9839776 (P=3.01 × 10⁻⁷) in SOX2OT and rs17030795 (P=5.84 × 10⁻⁶) in PPP3CA. Two additional signals were specific to Europeans: rs1523921 (P=5.76 × 10⁻⁶) between CUL3 and FAM124B and rs1886797 (P=8.05 × 10⁻⁶) near SPATA13. Comparing discovery with replication results, 76% of the effects were in the same direction, an observation highly unlikely to be due to chance (P=4 × 10⁻⁶), strongly suggesting that true findings exist but our sample, the largest yet reported, was underpowered for their detection. The accrual of large genotyped AN case-control samples should be an immediate priority for the field.
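
    The claim that 76% directional agreement between discovery and replication is unlikely to arise by chance is the kind of statement a sign test supports. The sketch below computes a one-sided exact binomial P value under the null hypothesis that each effect is equally likely to point either way. The function name `sign_test_p` is hypothetical, and the counts in the example (55 concordant out of 72 independent SNPs, about 76%) are an assumption chosen to match the percentages in the abstract; the abstract does not state which counts or which test the authors used.

```python
from math import comb


def sign_test_p(n_same, n_total):
    """One-sided exact binomial P value for directional concordance.

    Probability of observing n_same or more concordant effects out of
    n_total when each effect agrees in direction with probability 0.5.
    """
    if not 0 <= n_same <= n_total:
        raise ValueError("n_same must lie between 0 and n_total")
    tail = sum(comb(n_total, k) for k in range(n_same, n_total + 1))
    return tail / 2 ** n_total


if __name__ == "__main__":
    # Assumed counts: 55 of 72 independent SNPs in the same direction.
    print(sign_test_p(55, 72))
```

    A small P value here says only that the direction of effects agrees more often than coin-flipping would predict; it does not identify which individual SNPs are true associations.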

  19. A genome-wide association study of anorexia nervosa

    PubMed Central

    Boraska, Vesna; Franklin, Christopher S; Floyd, James AB; Thornton, Laura M; Huckins, Laura M; Southam, Lorraine; Rayner, N William; Tachmazidou, Ioanna; Klump, Kelly L; Treasure, Janet; Lewis, Cathryn M; Schmidt, Ulrike; Tozzi, Federica; Kiezebrink, Kirsty; Hebebrand, Johannes; Gorwood, Philip; Adan, Roger AH; Kas, Martien JH; Favaro, Angela; Santonastaso, Paolo; Fernández-Aranda, Fernando; Gratacos, Monica; Rybakowski, Filip; Dmitrzak-Weglarz, Monika; Kaprio, Jaakko; Keski-Rahkonen, Anna; Raevuori, Anu; Van Furth, Eric F; Landt, Margarita CT Slof-Op t; Hudson, James I; Reichborn-Kjennerud, Ted; Knudsen, Gun Peggy S; Monteleone, Palmiero; Kaplan, Allan S; Karwautz, Andreas; Hakonarson, Hakon; Berrettini, Wade H; Guo, Yiran; Li, Dong; Schork, Nicholas J.; Komaki, Gen; Ando, Tetsuya; Inoko, Hidetoshi; Esko, Tõnu; Fischer, Krista; Männik, Katrin; Metspalu, Andres; Baker, Jessica H; Cone, Roger D; Dackor, Jennifer; DeSocio, Janiece E; Hilliard, Christopher E; O'Toole, Julie K; Pantel, Jacques; Szatkiewicz, Jin P; Taico, Chrysecolla; Zerwas, Stephanie; Trace, Sara E; Davis, Oliver SP; Helder, Sietske; Bühren, Katharina; Burghardt, Roland; de Zwaan, Martina; Egberts, Karin; Ehrlich, Stefan; Herpertz-Dahlmann, Beate; Herzog, Wolfgang; Imgart, Hartmut; Scherag, André; Scherag, Susann; Zipfel, Stephan; Boni, Claudette; Ramoz, Nicolas; Versini, Audrey; Brandys, Marek K; Danner, Unna N; de Kovel, Carolien; Hendriks, Judith; Koeleman, Bobby PC; Ophoff, Roel A; Strengman, Eric; van Elburg, Annemarie A; Bruson, Alice; Clementi, Maurizio; Degortes, Daniela; Forzan, Monica; Tenconi, Elena; Docampo, Elisa; Escaramís, Geòrgia; Jiménez-Murcia, Susana; Lissowska, Jolanta; Rajewski, Andrzej; Szeszenia-Dabrowska, Neonila; Slopien, Agnieszka; Hauser, Joanna; Karhunen, Leila; Meulenbelt, Ingrid; Slagboom, P Eline; Tortorella, Alfonso; Maj, Mario; Dedoussis, George; Dikeos, Dimitris; Gonidakis, Fragiskos; Tziouvas, Konstantinos; Tsitsika, Artemis; Papezova, Hana; Slachtova, Lenka; 
Martaskova, Debora; Kennedy, James L.; Levitan, Robert D.; Yilmaz, Zeynep; Huemer, Julia; Koubek, Doris; Merl, Elisabeth; Wagner, Gudrun; Lichtenstein, Paul; Breen, Gerome; Cohen-Woods, Sarah; Farmer, Anne; McGuffin, Peter; Cichon, Sven; Giegling, Ina; Herms, Stefan; Rujescu, Dan; Schreiber, Stefan; Wichmann, H-Erich; Dina, Christian; Sladek, Rob; Gambaro, Giovanni; Soranzo, Nicole; Julia, Antonio; Marsal, Sara; Rabionet, Raquel; Gaborieau, Valerie; Dick, Danielle M; Palotie, Aarno; Ripatti, Samuli; Widén, Elisabeth; Andreassen, Ole A; Espeseth, Thomas; Lundervold, Astri; Reinvang, Ivar; Steen, Vidar M; Le Hellard, Stephanie; Mattingsdal, Morten; Ntalla, Ioanna; Bencko, Vladimir; Foretova, Lenka; Janout, Vladimir; Navratilova, Marie; Gallinger, Steven; Pinto, Dalila; Scherer, Stephen; Aschauer, Harald; Carlberg, Laura; Schosser, Alexandra; Alfredsson, Lars; Ding, Bo; Klareskog, Lars; Padyukov, Leonid; Finan, Chris; Kalsi, Gursharan; Roberts, Marion; Logan, Darren W; Peltonen, Leena; Ritchie, Graham RS; Barrett, Jeffrey C; Estivill, Xavier; Hinney, Anke; Sullivan, Patrick F; Collier, David A; Zeggini, Eleftheria; Bulik, Cynthia M

    2015-01-01

    Anorexia nervosa (AN) is a complex and heritable eating disorder characterized by dangerously low body weight. Neither candidate gene studies nor an initial genome wide association study (GWAS) have yielded significant and replicated results. We performed a GWAS in 2,907 cases with AN from 14 countries (15 sites) and 14,860 ancestrally matched controls as part of the Genetic Consortium for AN (GCAN) and the Wellcome Trust Case Control Consortium 3 (WTCCC3). Individual association analyses were conducted in each stratum and meta-analyzed across all 15 discovery datasets. Seventy-six (72 independent) SNPs were taken forward for in silico (two datasets) or de novo (13 datasets) replication genotyping in 2,677 independent AN cases and 8,629 European ancestry controls along with 458 AN cases and 421 controls from Japan. The final global meta-analysis across discovery and replication datasets comprised 5,551 AN cases and 21,080 controls. AN subtype analyses (1,606 AN restricting; 1,445 AN binge-purge) were performed. No findings reached genome-wide significance. Two intronic variants were suggestively associated: rs9839776 (P=3.01×10⁻⁷) in SOX2OT and rs17030795 (P=5.84×10⁻⁶) in PPP3CA. Two additional signals were specific to Europeans: rs1523921 (P=5.76×10⁻⁶) between CUL3 and FAM124B and rs1886797 (P=8.05×10⁻⁶) near SPATA13. Comparing discovery to replication results, 76% of the effects were in the same direction, an observation highly unlikely to be due to chance (P=4×10⁻⁶), strongly suggesting that true findings exist but that our sample, the largest yet reported, was underpowered for their detection. The accrual of large genotyped AN case-control samples should be an immediate priority for the field. PMID:24514567

  20. Meta-analysis of Genome-wide Association Studies for Neuroticism, and the Polygenic Association With Major Depressive Disorder.

    PubMed

    de Moor, Marleen H M; van den Berg, Stéphanie M; Verweij, Karin J H; Krueger, Robert F; Luciano, Michelle; Arias Vasquez, Alejandro; Matteson, Lindsay K; Derringer, Jaime; Esko, Tõnu; Amin, Najaf; Gordon, Scott D; Hansell, Narelle K; Hart, Amy B; Seppälä, Ilkka; Huffman, Jennifer E; Konte, Bettina; Lahti, Jari; Lee, Minyoung; Miller, Mike; Nutile, Teresa; Tanaka, Toshiko; Teumer, Alexander; Viktorin, Alexander; Wedenoja, Juho; Abecasis, Goncalo R; Adkins, Daniel E; Agrawal, Arpana; Allik, Jüri; Appel, Katja; Bigdeli, Timothy B; Busonero, Fabio; Campbell, Harry; Costa, Paul T; Davey Smith, George; Davies, Gail; de Wit, Harriet; Ding, Jun; Engelhardt, Barbara E; Eriksson, Johan G; Fedko, Iryna O; Ferrucci, Luigi; Franke, Barbara; Giegling, Ina; Grucza, Richard; Hartmann, Annette M; Heath, Andrew C; Heinonen, Kati; Henders, Anjali K; Homuth, Georg; Hottenga, Jouke-Jan; Iacono, William G; Janzing, Joost; Jokela, Markus; Karlsson, Robert; Kemp, John P; Kirkpatrick, Matthew G; Latvala, Antti; Lehtimäki, Terho; Liewald, David C; Madden, Pamela A F; Magri, Chiara; Magnusson, Patrik K E; Marten, Jonathan; Maschio, Andrea; Medland, Sarah E; Mihailov, Evelin; Milaneschi, Yuri; Montgomery, Grant W; Nauck, Matthias; Ouwens, Klaasjan G; Palotie, Aarno; Pettersson, Erik; Polasek, Ozren; Qian, Yong; Pulkki-Råback, Laura; Raitakari, Olli T; Realo, Anu; Rose, Richard J; Ruggiero, Daniela; Schmidt, Carsten O; Slutske, Wendy S; Sorice, Rossella; Starr, John M; St Pourcain, Beate; Sutin, Angelina R; Timpson, Nicholas J; Trochet, Holly; Vermeulen, Sita; Vuoksimaa, Eero; Widen, Elisabeth; Wouda, Jasper; Wright, Margaret J; Zgaga, Lina; Porteous, David; Minelli, Alessandra; Palmer, Abraham A; Rujescu, Dan; Ciullo, Marina; Hayward, Caroline; Rudan, Igor; Metspalu, Andres; Kaprio, Jaakko; Deary, Ian J; Räikkönen, Katri; Wilson, James F; Keltikangas-Järvinen, Liisa; Bierut, Laura J; Hettema, John M; Grabe, Hans J; van Duijn, Cornelia M; Evans, David M; Schlessinger, David; Pedersen, 
Nancy L; Terracciano, Antonio; McGue, Matt; Penninx, Brenda W J H; Martin, Nicholas G; Boomsma, Dorret I

    2015-07-01

    Neuroticism is a pervasive risk factor for psychiatric conditions. It genetically overlaps with major depressive disorder (MDD) and is therefore an important phenotype for psychiatric genetics. The Genetics of Personality Consortium has created a resource for genome-wide association analyses of personality traits in more than 63,000 participants (including MDD cases). To identify genetic variants associated with neuroticism by performing a meta-analysis of genome-wide association results based on 1000 Genomes imputation; to evaluate whether common genetic variants as assessed by single-nucleotide polymorphisms (SNPs) explain variation in neuroticism by estimating SNP-based heritability; and to examine whether SNPs that predict neuroticism also predict MDD. Genome-wide association meta-analysis of 30 cohorts with genome-wide genotype, personality, and MDD data from the Genetics of Personality Consortium. The study included 63,661 participants from 29 discovery cohorts and 9786 participants from a replication cohort. Participants came from Europe, the United States, or Australia. Analyses were conducted between 2012 and 2014. Neuroticism scores harmonized across all 29 discovery cohorts by item response theory analysis, and clinical MDD case-control status in 2 of the cohorts. A genome-wide significant SNP was found on 3p14 in MAGI1 (rs35855737; P = 9.26 × 10⁻⁹ in the discovery meta-analysis). This association was not replicated (P = .32), but the SNP was still genome-wide significant in the meta-analysis of all 30 cohorts (P = 2.38 × 10⁻⁸). Common genetic variants explain 15% of the variance in neuroticism. Polygenic scores based on the meta-analysis of neuroticism in 27 cohorts significantly predicted neuroticism (1.09 × 10⁻¹² < P < .05) and MDD (4.02 × 10⁻⁹ < P < .05) in the 2 other cohorts. This study identifies a novel locus for neuroticism. The variant is located in a known gene that has been associated with

  1. Differential DNA Methylation Analysis without a Reference Genome.

    PubMed

    Klughammer, Johanna; Datlinger, Paul; Printz, Dieter; Sheffield, Nathan C; Farlik, Matthias; Hadler, Johanna; Fritsch, Gerhard; Bock, Christoph

    2015-12-22

    Genome-wide DNA methylation mapping uncovers epigenetic changes associated with animal development, environmental adaptation, and species evolution. To address the lack of high-throughput methods for DNA methylation analysis in non-model organisms, we developed an integrated approach for studying DNA methylation differences independent of a reference genome. Experimentally, our method relies on an optimized 96-well protocol for reduced representation bisulfite sequencing (RRBS), which we have validated in nine species (human, mouse, rat, cow, dog, chicken, carp, sea bass, and zebrafish). Bioinformatically, we developed the RefFreeDMA software to deduce ad hoc genomes directly from RRBS reads and to pinpoint differentially methylated regions between samples or groups of individuals (http://RefFreeDMA.computational-epigenetics.org). The identified regions are interpreted using motif enrichment analysis and/or cross-mapping to annotated genomes. We validated our method by reference-free analysis of cell-type-specific DNA methylation in the blood of human, cow, and carp. In summary, we present a cost-effective method for epigenome analysis in ecology and evolution, which enables epigenome-wide association studies in natural populations and species without a reference genome. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  2. Genome-wide analyses of four major histone modifications in Arabidopsis hybrids at the germinating seed stage.

    PubMed

    Zhu, Anyu; Greaves, Ian K; Dennis, Elizabeth S; Peacock, W James

    2017-02-07

    Hybrid vigour (heterosis) has been used for decades in cropping agriculture, especially in the production of maize and rice, because hybrid varieties exceed their parents in plant biomass and seed yield. The molecular basis of hybrid vigour is not fully understood. Previous studies have suggested that epigenetic systems could play a role in heterosis. In this project, we investigated genome-wide patterns of four histone modifications in Arabidopsis hybrids in germinating seeds. We found that although hybrids have similar histone modification patterns to the parents in most regions of the genome, they have altered patterns at specific loci. A small subset of genes show changes in histone modifications in the hybrids that correlate with changes in gene expression. Our results also show that genome-wide patterns of histone modifications in germinating seeds parallel those at later developmental stages of seedlings. Ler/C24 hybrids showed genome-wide patterns of histone modifications similar to those of the parents at an early germination stage. However, a small subset of genes, such as FLC, showed correlated changes in histone modification and in gene expression in the hybrids. The altered patterns of histone modifications for those genes in hybrids could be related to some heterotic traits in Arabidopsis, such as flowering time, and could play a role in hybrid vigour establishment.

  3. Exploring Relationships between Host Genome and Microbiome: New Insights from Genome-Wide Association Studies

    PubMed Central

    Abdul-Aziz, Muslihudeen A.; Cooper, Alan; Weyrich, Laura S.

    2016-01-01

    As our understanding of the human microbiome expands, impacts on health and disease continue to be revealed. Alterations in the microbiome can result in dysbiosis, which has now been linked to subsequent autoimmune and metabolic diseases, highlighting the need to identify factors that shape the microbiome. Research has identified that the composition and functions of the human microbiome can be influenced by diet, age, sex, and environment. More recently, studies have explored how human genetic variation may also influence the microbiome. Here, we review several recent analytical advances in this new research area, including those that use genome-wide association studies to examine host genome–microbiome interactions, while controlling for the influence of other factors. We find that current research is limited by small sample sizes, lack of cohort replication, and insufficient confirmatory mechanistic studies. In addition, we discuss the importance of understanding long-term interactions between the host genome and microbiome, as well as the potential impacts of disrupting this relationship, and explore new research avenues that may provide information about the co-evolutionary history of humans and their microorganisms. PMID:27785127

  4. Genome-wide association studies for the identification of biomarkers in metabolic diseases.

    PubMed

    Pattin, Kristine A; Moore, Jason H

    2010-01-01

    The field of genetics as it relates to metabolic disorders such as obesity and type II diabetes is complicated, and along with the medical research community, great strides are being taken to begin to understand the biological and genetic underpinnings of these diseases, with the hope of improving therapeutic, diagnostic and preventive strategies. Although research on metabolic disorders has been continuing for decades, the completion of the Human Genome Project in 2003 and the International HapMap Project in 2005 gave rise to an abundance of research tools, such as genome-wide genotyping, which allow researchers to conduct genome-wide association studies (GWAS) for detecting genetic variants that confer increased or decreased susceptibility to such complex diseases. In this review, the complex nature of metabolic disorders is discussed, specifically obesity and type II diabetes, as well as the limitations of the GWAS as applied to these disorders. While acknowledging limitations of GWAS, it is hoped to provide an insight about how GWAS can be adapted and advantageous in the clinical setting, enhancing prevention, diagnosis and treatment of these diseases. To be able to use the GWAS in a clinical setting is a complex challenge, yet it is hoped that in the future this tool will ultimately allow the development of pharmaceutical options that are capable of targeting the cause of metabolic disorders, not just the symptoms themselves.

  5. Genetic Diversity in the Modern Horse Illustrated from Genome-Wide SNP Data

    PubMed Central

    Petersen, Jessica L.; Mickelson, James R.; Cothran, E. Gus; Andersson, Lisa S.; Axelsson, Jeanette; Bailey, Ernie; Bannasch, Danika; Binns, Matthew M.; Borges, Alexandre S.; Brama, Pieter; da Câmara Machado, Artur; Distl, Ottmar; Felicetti, Michela; Fox-Clipsham, Laura; Graves, Kathryn T.; Guérin, Gérard; Haase, Bianca; Hasegawa, Telhisa; Hemmann, Karin; Hill, Emmeline W.; Leeb, Tosso; Lindgren, Gabriella; Lohi, Hannes; Lopes, Maria Susana; McGivney, Beatrice A.; Mikko, Sofia; Orr, Nicholas; Penedo, M. Cecilia T; Piercy, Richard J.; Raekallio, Marja; Rieder, Stefan; Røed, Knut H.; Silvestrelli, Maurizio; Swinburne, June; Tozaki, Teruaki; Vaudin, Mark; Wade, Claire M.; McCue, Molly E.

    2013-01-01

    Horses were domesticated from the Eurasian steppes 5,000–6,000 years ago. Since then, the use of horses for transportation, warfare, and agriculture, as well as selection for desired traits and fitness, has resulted in diverse populations distributed across the world, many of which have become or are in the process of becoming formally organized into closed, breeding populations (breeds). This report describes the use of a genome-wide set of autosomal SNPs and 814 horses from 36 breeds to provide the first detailed description of equine breed diversity. FST calculations, parsimony, and distance analysis demonstrated relationships among the breeds that largely reflect geographic origins and known breed histories. Low levels of population divergence were observed between breeds that are relatively early on in the process of breed development, and between those with high levels of within-breed diversity, whether due to large population size, ongoing outcrossing, or large within-breed phenotypic diversity. Populations with low within-breed diversity included those which have experienced population bottlenecks, have been under intense selective pressure, or are closed populations with long breed histories. These results provide new insights into the relationships among and the diversity within breeds of horses. In addition these results will facilitate future genome-wide association studies and investigations into genomic targets of selection. PMID:23383025
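
    Illustrative sketch (not taken from the cited study): the abstract above reports FST calculations but does not say which estimator was used. Assuming the simplest textbook form, Wright's FST = (HT - HS) / HT for a single biallelic SNP, the calculation might look like the following. The function name `fst_biallelic` and its arguments are hypothetical.

```python
def fst_biallelic(freqs, sizes=None):
    """Wright's F_ST for one biallelic SNP.

    freqs: allele frequency of the same allele in each population.
    sizes: optional relative population sizes (assumed equal if omitted).
    This is a textbook estimator chosen for illustration only.
    """
    if sizes is None:
        sizes = [1.0] * len(freqs)
    total = sum(sizes)
    weights = [s / total for s in sizes]

    # Pooled allele frequency and total expected heterozygosity H_T.
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    h_t = 2.0 * p_bar * (1.0 - p_bar)

    # Mean within-population expected heterozygosity H_S.
    h_s = sum(w * 2.0 * p * (1.0 - p) for w, p in zip(weights, freqs))

    if h_t == 0.0:
        return 0.0  # monomorphic site: no differentiation is measurable
    return (h_t - h_s) / h_t


# Two equally sized populations with strongly diverged frequencies.
diverged = fst_biallelic([0.2, 0.8])
# Two populations with identical frequencies.
identical = fst_biallelic([0.3, 0.3])
```

    With frequencies 0.2 and 0.8 the value is 0.36, while identical frequencies give 0, which matches the intuition that low FST indicates little divergence between breeds.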

  6. Genome-Wide Prediction of the Performance of Three-Way Hybrids in Barley.

    PubMed

    Li, Zuo; Philipp, Norman; Spiller, Monika; Stiewe, Gunther; Reif, Jochen C; Zhao, Yusheng

    2017-03-01

    Predicting the grain yield performance of three-way hybrids is challenging. Three-way crosses are relevant for hybrid breeding in barley ( L.) and maize ( L.) adapted to East Africa. The main goal of our study was to implement and evaluate genome-wide prediction approaches of the performance of three-way hybrids using data of single-cross hybrids for a scenario in which parental lines of the three-way hybrids originate from three genetically distinct subpopulations. We extended the ridge regression best linear unbiased prediction (RRBLUP) and devised a genomic selection model allowing for subpopulation-specific marker effects (GSA-RRBLUP: general and subpopulation-specific additive RRBLUP). Using an empirical barley data set, we showed that applying GSA-RRBLUP tripled the prediction ability of three-way hybrids from 0.095 to 0.308 compared with RRBLUP, modeling one additive effect for all three subpopulations. The experimental findings were further substantiated with computer simulations. Our results emphasize the potential of GSA-RRBLUP to improve genome-wide hybrid prediction of three-way hybrids for scenarios of genetically diverse parental populations. Because of the advantages of the GSA-RRBLUP model in dealing with hybrids from different parental populations, it may also be a promising approach to boost the prediction ability for hybrid breeding programs based on genetically diverse heterotic groups. Copyright © 2017 Crop Science Society of America.

  7. A genome-wide resource of cell cycle and cell shape genes of fission yeast

    PubMed Central

    Hayles, Jacqueline; Wood, Valerie; Jeffery, Linda; Hoe, Kwang-Lae; Kim, Dong-Uk; Park, Han-Oh; Salas-Pino, Silvia; Heichinger, Christian; Nurse, Paul

    2013-01-01

    To identify near complete sets of genes required for the cell cycle and cell shape, we have visually screened a genome-wide gene deletion library of 4843 fission yeast deletion mutants (95.7% of total protein encoding genes) for their effects on these processes. A total of 513 genes have been identified as being required for cell cycle progression, 276 of which have not been previously described as cell cycle genes. Deletions of a further 333 genes lead to specific alterations in cell shape and another 524 genes result in generally misshapen cells. Here, we provide the first eukaryotic resource of gene deletions, which describes a near genome-wide set of genes required for the cell cycle and cell shape. PMID:23697806

  8. Integrated data analysis for genome-wide research.

    PubMed

    Steinfath, Matthias; Repsilber, Dirk; Scholz, Matthias; Walther, Dirk; Selbig, Joachim

    2007-01-01

    Integrated data analysis is introduced as the intermediate level of a systems biology approach to analyse different 'omics' datasets, i.e., genome-wide measurements of transcripts, protein levels or protein-protein interactions, and metabolite levels aiming at generating a coherent understanding of biological function. In this chapter we focus on different methods of correlation analyses ranging from simple pairwise correlation to kernel canonical correlation which were recently applied in molecular biology. Several examples are presented to illustrate their application. The input data for this analysis frequently originate from different experimental platforms. Therefore, preprocessing steps such as data normalisation and missing value estimation are inherent to this approach. The corresponding procedures, potential pitfalls and biases, and available software solutions are reviewed. The multiplicity of observations obtained in omics-profiling experiments necessitates the application of multiple testing correction techniques.
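
    Illustrative sketch (not taken from the cited chapter): the abstract above notes that omics-profiling experiments require multiple testing correction but names no specific procedure. Assuming the widely used Benjamini-Hochberg step-up procedure for false discovery rate control, and a Bonferroni correction for comparison, a minimal version might look like this. The function names and the example p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject hypotheses whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans in the input order: True where the
    hypothesis is rejected at false discovery rate alpha.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank

    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical p-values from ten tests.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bh_calls = benjamini_hochberg(example_p)
bonf_calls = bonferroni(example_p)
```

    On these made-up values Benjamini-Hochberg rejects the two smallest p-values, while Bonferroni rejects only the smallest, which illustrates why the choice of correction matters when many features are tested at once.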

  9. Prospecting sugarcane resistance to Sugarcane yellow leaf virus by genome-wide association.

    PubMed

    Debibakas, S; Rocher, S; Garsmeur, O; Toubi, L; Roques, D; D'Hont, A; Hoarau, J-Y; Daugrois, J H

    2014-08-01

    Using GWAS approaches, we detected independent resistant markers in sugarcane towards a vectored virus disease. Based on comparative genomics, several candidate genes potentially involved in virus/aphid/plant interactions were pinpointed. Yellow leaf of sugarcane is an emerging viral disease whose causal agent is a Polerovirus, the Sugarcane yellow leaf virus (SCYLV) transmitted by aphids. To identify quantitative trait loci controlling resistance to yellow leaf which are of direct relevance for breeding, we undertook a genome-wide association study (GWAS) on a sugarcane cultivar panel (n = 189) representative of current breeding germplasm. This panel was fingerprinted with 3,949 polymorphic markers (DArT and AFLP). The panel was phenotyped for SCYLV infection in leaves and stalks in two trials for two crop cycles, under natural disease pressure prevalent in Guadeloupe. Mixed linear models including co-factors representing population structure fixed effects and pairwise kinship random effects provided an efficient control of the risk of inflated type-I error at a genome-wide level. Six independent markers were significantly detected in association with SCYLV resistance phenotype. These markers explained individually between 9 and 14 % of the disease variation of the cultivar panel. Their frequency in the panel was relatively low (8-20 %). Among them, two markers were detected repeatedly across the GWAS exercises based on the different disease resistance parameters. These two markers could be blasted on Sorghum bicolor genome and candidate genes potentially involved in plant-aphid or plant-virus interactions were localized in the vicinity of sorghum homologs of sugarcane markers. Our results illustrate the potential of GWAS approaches to prospect among sugarcane germplasm for accessions likely bearing resistance alleles of significant effect useful in breeding programs.

  10. The accuracy of Genomic Selection in Norwegian red cattle assessed by cross-validation.

    PubMed

    Luan, Tu; Woolliams, John A; Lien, Sigbjørn; Kent, Matthew; Svendsen, Morten; Meuwissen, Theo H E

    2009-11-01

    Genomic Selection (GS) is a newly developed tool for the estimation of breeding values for quantitative traits through the use of dense markers covering the whole genome. For a successful application of GS, accuracy of the prediction of genomewide breeding value (GW-EBV) is a key issue to consider. Here we investigated the accuracy and possible bias of GW-EBV prediction, using real bovine SNP genotyping (18,991 SNPs) and phenotypic data of 500 Norwegian Red bulls. The study was performed on milk yield, fat yield, protein yield, first lactation mastitis traits, and calving ease. Three methods, best linear unbiased prediction (G-BLUP), Bayesian statistics (BayesB), and a mixture model approach (MIXTURE), were used to estimate marker effects, and their accuracy and bias were estimated by using cross-validation. The accuracies of the GW-EBV prediction were found to vary widely between 0.12 and 0.62. G-BLUP gave overall the highest accuracy. We observed a strong relationship between the accuracy of the prediction and the heritability of the trait. GW-EBV prediction for production traits with high heritability achieved higher accuracy and also lower bias than health traits with low heritability. To achieve a similar accuracy for the health traits probably more records will be needed.

  11. A comprehensive 1,000 Genomes-based genome-wide association meta-analysis of coronary artery disease.

    PubMed

    Nikpay, Majid; Goel, Anuj; Won, Hong-Hee; Hall, Leanne M; Willenborg, Christina; Kanoni, Stavroula; Saleheen, Danish; Kyriakou, Theodosios; Nelson, Christopher P; Hopewell, Jemma C; Webb, Thomas R; Zeng, Lingyao; Dehghan, Abbas; Alver, Maris; Armasu, Sebastian M; Auro, Kirsi; Bjonnes, Andrew; Chasman, Daniel I; Chen, Shufeng; Ford, Ian; Franceschini, Nora; Gieger, Christian; Grace, Christopher; Gustafsson, Stefan; Huang, Jie; Hwang, Shih-Jen; Kim, Yun Kyoung; Kleber, Marcus E; Lau, King Wai; Lu, Xiangfeng; Lu, Yingchang; Lyytikäinen, Leo-Pekka; Mihailov, Evelin; Morrison, Alanna C; Pervjakova, Natalia; Qu, Liming; Rose, Lynda M; Salfati, Elias; Saxena, Richa; Scholz, Markus; Smith, Albert V; Tikkanen, Emmi; Uitterlinden, Andre; Yang, Xueli; Zhang, Weihua; Zhao, Wei; de Andrade, Mariza; de Vries, Paul S; van Zuydam, Natalie R; Anand, Sonia S; Bertram, Lars; Beutner, Frank; Dedoussis, George; Frossard, Philippe; Gauguier, Dominique; Goodall, Alison H; Gottesman, Omri; Haber, Marc; Han, Bok-Ghee; Huang, Jianfeng; Jalilzadeh, Shapour; Kessler, Thorsten; König, Inke R; Lannfelt, Lars; Lieb, Wolfgang; Lind, Lars; Lindgren, Cecilia M; Lokki, Marja-Liisa; Magnusson, Patrik K; Mallick, Nadeem H; Mehra, Narinder; Meitinger, Thomas; Memon, Fazal-Ur-Rehman; Morris, Andrew P; Nieminen, Markku S; Pedersen, Nancy L; Peters, Annette; Rallidis, Loukianos S; Rasheed, Asif; Samuel, Maria; Shah, Svati H; Sinisalo, Juha; Stirrups, Kathleen E; Trompet, Stella; Wang, Laiyuan; Zaman, Khan S; Ardissino, Diego; Boerwinkle, Eric; Borecki, Ingrid B; Bottinger, Erwin P; Buring, Julie E; Chambers, John C; Collins, Rory; Cupples, L Adrienne; Danesh, John; Demuth, Ilja; Elosua, Roberto; Epstein, Stephen E; Esko, Tõnu; Feitosa, Mary F; Franco, Oscar H; Franzosi, Maria Grazia; Granger, Christopher B; Gu, Dongfeng; Gudnason, Vilmundur; Hall, Alistair S; Hamsten, Anders; Harris, Tamara B; Hazen, Stanley L; Hengstenberg, Christian; Hofman, Albert; Ingelsson, Erik; Iribarren, Carlos; Jukema, J Wouter; Karhunen, Pekka J; Kim, Bong-Jo; Kooner, Jaspal S; Kullo, Iftikhar J; Lehtimäki, Terho; Loos, Ruth J F; Melander, Olle; Metspalu, Andres; März, Winfried; Palmer, Colin N; Perola, Markus; Quertermous, Thomas; Rader, Daniel J; Ridker, Paul M; Ripatti, Samuli; Roberts, Robert; Salomaa, Veikko; Sanghera, Dharambir K; Schwartz, Stephen M; Seedorf, Udo; Stewart, Alexandre F; Stott, David J; Thiery, Joachim; Zalloua, Pierre A; O'Donnell, Christopher J; Reilly, Muredach P; Assimes, Themistocles L; Thompson, John R; Erdmann, Jeanette; Clarke, Robert; Watkins, Hugh; Kathiresan, Sekar; McPherson, Ruth; Deloukas, Panos; Schunkert, Heribert; Samani, Nilesh J; Farrall, Martin

    2015-10-01

    Existing knowledge of genetic variants affecting risk of coronary artery disease (CAD) is largely based on genome-wide association study (GWAS) analysis of common SNPs. Leveraging phased haplotypes from the 1000 Genomes Project, we report a GWAS meta-analysis of ∼185,000 CAD cases and controls, interrogating 6.7 million common (minor allele frequency (MAF) > 0.05) and 2.7 million low-frequency (0.005 < MAF < 0.05) variants. In addition to confirming most known CAD-associated loci, we identified ten new loci (eight additive and two recessive) that contain candidate causal genes newly implicating biological processes in vessel walls. We observed intralocus allelic heterogeneity but little evidence of low-frequency variants with larger effects and no evidence of synthetic association. Our analysis provides a comprehensive survey of the fine genetic architecture of CAD, showing that genetic susceptibility to this common disease is largely determined by common SNPs of small effect size.

  12. A genome-wide survey of CD4+ lymphocyte regulatory genetic variants identifies novel asthma genes

    PubMed Central

    Sharma, Sunita; Zhou, Xiaobo; Thibault, Derek M.; Himes, Blanca E.; Liu, Andy; Szefler, Stanley J.; Strunk, Robert; Castro, Mario; Hansel, Nadia N.; Diette, Gregory B.; Vonakis, Becky M.; Adkinson, N. Franklin; Avila, Lydiana; Soto-Quiros, Manuel; Barraza-Villareal, Albino; Lemanske, Robert F.; Solway, Julian; Krishnan, Jerry; White, Steven R.; Cheadle, Chris; Berger, Alan E.; Fan, Jinshui; Boorgula, Meher Preethi; Nicolae, Dan; Gilliland, Frank; Barnes, Kathleen; London, Stephanie J.; Martinez, Fernando; Ober, Carole; Celedón, Juan C.; Carey, Vincent J.; Weiss, Scott T.; Raby, Benjamin A.

    2014-01-01

    Background: Genome-wide association studies have yet to identify the majority of genetic variants involved in asthma. We hypothesized that expression quantitative trait locus (eQTL) mapping can identify novel asthma genes by enabling prioritization of putative functional variants for association testing. Objective: We evaluated 6,706 cis-acting expression-associated variants (eSNP) identified through a genome-wide eQTL survey of CD4+ lymphocytes for association with asthma. Methods: eSNP were tested for association with asthma in 359 asthma cases and 846 controls from the Childhood Asthma Management Program, with verification using family-based testing. Significant associations were tested for replication in 579 parent-child trios with asthma from Costa Rica. Further functional validation was performed by Formaldehyde Assisted Isolation of Regulatory Elements (FAIRE)-qPCR and Chromatin-Immunoprecipitation (ChIP)-PCR in lung derived epithelial cell lines (Beas-2B and A549) and Jurkat cells, a leukemia cell line derived from T lymphocytes. Results: Cis-acting eSNP demonstrated associations with asthma in both cohorts. We confirmed the previously-reported association of ORMDL3/GSDMB variants with asthma (combined p=2.9 × 10⁻⁸). Reproducible associations were also observed for eSNP in three additional genes: FADS2 (p=0.002), NAGA (p=0.0002), and F13A1 (p=0.0001). We subsequently demonstrated that FADS2 mRNA is increased in CD4+ lymphocytes in asthmatics, and that the associated eSNPs reside within DNA segments with histone modifications that denote open chromatin status and confer enhancer activity. Conclusions: Our results demonstrate the utility of eQTL mapping in the identification of novel asthma genes, and provide evidence for the importance of FADS2, NAGA, and F13A1 in the pathogenesis of asthma. PMID:24934276

  13. Genome-wide DNA methylation measurements in prostate tissues uncovers novel prostate cancer diagnostic biomarkers and transcription factor binding patterns.

    PubMed

    Kirby, Marie K; Ramaker, Ryne C; Roberts, Brian S; Lasseigne, Brittany N; Gunther, David S; Burwell, Todd C; Davis, Nicholas S; Gulzar, Zulfiqar G; Absher, Devin M; Cooper, Sara J; Brooks, James D; Myers, Richard M

    2017-04-17

    Current diagnostic tools for prostate cancer lack specificity and sensitivity for detecting very early lesions. DNA methylation is a stable genomic modification that is detectable in peripheral patient fluids such as urine and blood plasma that could serve as a non-invasive diagnostic biomarker for prostate cancer. We measured genome-wide DNA methylation patterns in 73 clinically annotated fresh-frozen prostate cancers and 63 benign-adjacent prostate tissues using the Illumina Infinium HumanMethylation450 BeadChip array. We overlaid the most significantly differentially methylated sites in the genome with transcription factor binding sites measured by the Encyclopedia of DNA Elements consortium. We used logistic regression and receiver operating characteristic curves to assess the performance of candidate diagnostic models. We identified methylation patterns that have a high predictive power for distinguishing malignant prostate tissue from benign-adjacent prostate tissue, and these methylation signatures were validated using data from The Cancer Genome Atlas Project. Furthermore, by overlaying ENCODE transcription factor binding data, we observed an enrichment of enhancer of zeste homolog 2 binding in gene regulatory regions with higher DNA methylation in malignant prostate tissues. DNA methylation patterns are greatly altered in prostate cancer tissue in comparison to benign-adjacent tissue. We have discovered patterns of DNA methylation marks that can distinguish prostate cancers with high specificity and sensitivity in multiple patient tissue cohorts, and we have identified transcription factors binding in these differentially methylated regions that may play important roles in prostate cancer development.
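
    Illustrative sketch (not taken from the cited study): the abstract above assesses candidate diagnostic models with receiver operating characteristic (ROC) curves. Assuming a model has already produced a score per sample, the area under the ROC curve can be computed as the probability that a randomly chosen tumour sample scores higher than a randomly chosen benign sample. The function name `roc_auc` and the scores below are hypothetical.

```python
def roc_auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive
    sample receives the higher score; ties count as half a win.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up model scores: higher means "more likely malignant".
tumour_scores = [0.9, 0.8, 0.4]
benign_scores = [0.7, 0.3, 0.2]
auc = roc_auc(tumour_scores, benign_scores)
```

    In this toy example eight of the nine tumour/benign pairs are ranked correctly, so the AUC is 8/9. An AUC of 0.5 would correspond to a classifier no better than chance, and 1.0 to perfect separation.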

  14. Genome-wide evolutionary dynamics of influenza B viruses on a global scale

    PubMed Central

    Langat, Pinky; Bowden, Thomas A.; Edwards, Stephanie; Gall, Astrid; Rambaut, Andrew; Daniels, Rodney S.; Russell, Colin A.; Pybus, Oliver G.; McCauley, John

    2017-01-01

    The global-scale epidemiology and genome-wide evolutionary dynamics of influenza B remain poorly understood compared with influenza A viruses. We compiled a spatio-temporally comprehensive dataset of influenza B viruses, comprising over 2,500 genomes sampled worldwide between 1987 and 2015, including 382 newly-sequenced genomes that fill substantial gaps in previous molecular surveillance studies. Our contributed data increase the number of available influenza B virus genomes in Europe, Africa and Central Asia, improving the global context to study influenza B viruses. We reveal Yamagata-lineage diversity results from co-circulation of two antigenically-distinct groups that also segregate genetically across the entire genome, without evidence of intra-lineage reassortment. In contrast, Victoria-lineage diversity stems from geographic segregation of different genetic clades, with variability in the degree of geographic spread among clades. Differences between the lineages are reflected in their antigenic dynamics, as Yamagata-lineage viruses show alternating dominance between antigenic groups, while Victoria-lineage viruses show antigenic drift of a single lineage. Structural mapping of amino acid substitutions on trunk branches of influenza B gene phylogenies further supports these antigenic differences and highlights two potential mechanisms of adaptation for polymerase activity. Our study provides new insights into the epidemiological and molecular processes shaping influenza B virus evolution globally. PMID:29284042

  15. Genome-wide analysis of alternative splicing during human heart development

    NASA Astrophysics Data System (ADS)

    Wang, He; Chen, Yanmei; Li, Xinzhong; Chen, Guojun; Zhong, Lintao; Chen, Gangbing; Liao, Yulin; Liao, Wangjun; Bin, Jianping

    2016-10-01

    Alternative splicing (AS) drives determinative changes during mouse heart development. Recent high-throughput technological advancements have facilitated genome-wide AS analysis, but such an analysis of the transition from the human foetal heart to the adult stage has not been reported. Here, we present a high-resolution global analysis of AS transitions between human foetal and adult hearts. RNA-sequencing data showed that extensive AS transitions occurred between human foetal and adult hearts, and AS events occurred more frequently in protein-coding genes than in long non-coding RNA (lncRNA). A significant difference in AS patterns was found between foetal and adult hearts. The predicted difference in AS events was further confirmed using quantitative reverse transcription-polymerase chain reaction analysis of human heart samples. Functional foetal-specific AS event analysis showed enrichment associated with cell proliferation-related pathways including cell cycle, whereas adult-specific AS events were associated with protein synthesis. Furthermore, 42.6% of foetal-specific AS events showed significant changes in gene expression levels between foetal and adult hearts. Genes exhibiting both foetal-specific AS and differential expression were highly enriched in cell cycle-associated functions. In conclusion, we provided a genome-wide profiling of AS transitions between foetal and adult hearts and proposed that AS transitions and differential gene expression may play determinative roles in human heart development.

  16. Genome-wide Association Study of a Quantitative Disordered Gambling Trait

    PubMed Central

    Lind, Penelope A.; Zhu, Gu; Montgomery, Grant W; Madden, Pamela A.F.; Heath, Andrew C.; Martin, Nicholas G.; Slutske, Wendy S.

    2012-01-01

    Disordered gambling is a moderately heritable trait, but the underlying genetic basis is largely unknown. We performed a genome-wide association study (GWAS) for disordered gambling using a quantitative factor score in 1,312 twins from 894 Australian families. Association was conducted for 2,381,914 single nucleotide polymorphisms (SNPs) using the family-based association test in Merlin followed by gene and pathway enrichment analyses. Although no SNP reached genome-wide significance, six achieved P-values < 1 × 10⁻⁵ with variants in three genes (MT1X, ATXN1 and VLDLR) implicated in disordered gambling. Secondary case-control analyses found two SNPs on chromosome 9 (rs1106076 and rs12305135 near VLDLR) and rs10812227 near FZD10 on chromosome 12 to be significantly associated with lifetime DSM-IV pathological gambling and SOGS classified probable pathological gambling status. Furthermore, several addiction-related pathways were enriched for SNPs associated with disordered gambling. Finally, gene-based analysis of 24 candidate genes for dopamine agonist induced gambling in individuals with Parkinson’s disease suggested an enrichment of SNPs associated with disordered gambling. We report the first GWAS of disordered gambling. While further replication is required, the identification of susceptibility loci and biological pathways will be important in characterizing the biological mechanisms that underpin disordered gambling. PMID:22780124

  17. Genome-wide association studies in oesophageal adenocarcinoma and Barrett's oesophagus: a large-scale meta-analysis.

    PubMed

    Gharahkhani, Puya; Fitzgerald, Rebecca C; Vaughan, Thomas L; Palles, Claire; Gockel, Ines; Tomlinson, Ian; Buas, Matthew F; May, Andrea; Gerges, Christian; Anders, Mario; Becker, Jessica; Kreuser, Nicole; Noder, Tania; Venerito, Marino; Veits, Lothar; Schmidt, Thomas; Manner, Hendrik; Schmidt, Claudia; Hess, Timo; Böhmer, Anne C; Izbicki, Jakob R; Hölscher, Arnulf H; Lang, Hauke; Lorenz, Dietmar; Schumacher, Brigitte; Hackelsberger, Andreas; Mayershofer, Rupert; Pech, Oliver; Vashist, Yogesh; Ott, Katja; Vieth, Michael; Weismüller, Josef; Nöthen, Markus M; Attwood, Stephen; Barr, Hugh; Chegwidden, Laura; de Caestecker, John; Harrison, Rebecca; Love, Sharon B; MacDonald, David; Moayyedi, Paul; Prenen, Hans; Watson, R G Peter; Iyer, Prasad G; Anderson, Lesley A; Bernstein, Leslie; Chow, Wong-Ho; Hardie, Laura J; Lagergren, Jesper; Liu, Geoffrey; Risch, Harvey A; Wu, Anna H; Ye, Weimin; Bird, Nigel C; Shaheen, Nicholas J; Gammon, Marilie D; Corley, Douglas A; Caldas, Carlos; Moebus, Susanne; Knapp, Michael; Peters, Wilbert H M; Neuhaus, Horst; Rösch, Thomas; Ell, Christian; MacGregor, Stuart; Pharoah, Paul; Whiteman, David C; Jankowski, Janusz; Schumacher, Johannes

    2016-10-01

    Oesophageal adenocarcinoma represents one of the fastest rising cancers in high-income countries. Barrett's oesophagus is the premalignant precursor of oesophageal adenocarcinoma. However, only a few patients with Barrett's oesophagus develop adenocarcinoma, which complicates clinical management in the absence of valid predictors. Within an international consortium investigating the genetics of Barrett's oesophagus and oesophageal adenocarcinoma, we aimed to identify novel genetic risk variants for the development of Barrett's oesophagus and oesophageal adenocarcinoma. We did a meta-analysis of all genome-wide association studies of Barrett's oesophagus and oesophageal adenocarcinoma available in PubMed up to Feb 29, 2016; all patients were of European ancestry and disease was confirmed histopathologically. All participants were from four separate studies within Europe, North America, and Australia and were genotyped on high-density single nucleotide polymorphism (SNP) arrays. Meta-analysis was done with a fixed-effects inverse variance-weighting approach and with a standard genome-wide significance threshold (p<5 × 10⁻⁸). We also did an association analysis after reweighting of loci with an approach that investigates annotation enrichment among genome-wide significant loci. Furthermore, the entire dataset was analysed with bioinformatics approaches, including functional annotation databases and gene-based and pathway-based methods, to identify pathophysiologically relevant cellular mechanisms. Our sample comprised 6167 patients with Barrett's oesophagus and 4112 individuals with oesophageal adenocarcinoma, in addition to 17 159 representative controls from four genome-wide association studies in Europe, North America, and Australia. We identified eight new risk loci associated with either Barrett's oesophagus or oesophageal adenocarcinoma, within or near the genes CFTR (rs17451754; p=4·8 × 10⁻¹⁰), MSRA (rs17749155; p=5·2 × 10⁻¹⁰), LINC00208
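
    Illustrative sketch (not taken from the cited study): the abstract above describes a fixed-effects inverse variance-weighted meta-analysis with a genome-wide significance threshold of 5 × 10⁻⁸. Assuming each study reports a per-SNP effect estimate and its standard error, the pooled estimate might be computed as follows. The function name `fixed_effect_meta` and the input values are hypothetical, and the p-value uses a two-sided normal approximation.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # threshold quoted in the abstract


def fixed_effect_meta(betas, ses):
    """Fixed-effects inverse-variance-weighted meta-analysis for one SNP.

    betas: per-study effect estimates (for example log odds ratios).
    ses:   per-study standard errors of those estimates.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")

    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se

    # Two-sided p-value under the standard normal: erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Made-up effect estimates from four studies with equal precision.
beta, se, z, p = fixed_effect_meta([0.10, 0.12, 0.08, 0.11], [0.02, 0.02, 0.02, 0.02])
is_genome_wide_significant = p < GENOME_WIDE_ALPHA
```

    With equal standard errors the pooled estimate reduces to the simple mean of the study estimates and the pooled standard error shrinks by the square root of the number of studies, which is why combining studies can push a signal past the genome-wide threshold that no single study reaches.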

  18. Genome-wide association mapping of crown rust resistance in oat elite germplasm

    USDA-ARS's Scientific Manuscript database

    Oat crown rust, caused by Puccinia coronata f. sp. avenae, is a major constraint to oat production in many parts of the world. In this first comprehensive multi-environment genome-wide association map of oat crown rust, we used 2,972 SNPs genotyped on 631 oat lines for association mapping of quantit...

  19. Genome-wide association studies in pharmacogenetics research debate

    PubMed Central

    Bailey, Kent R; Cheng, Cheng

    2016-01-01

    Will genome-wide association studies (GWAS) ‘work’ for pharmacogenetics research? This question was the topic of a staged debate, with pro and con sides, intended to bring out the strengths and weaknesses of GWAS for pharmacogenetics studies. After a full day of seminars at the Fifth Statistical Analysis Workshop of the Pharmacogenetics Research Network, the lively debate was held – appropriately – at Goonies Comedy Club in Rochester (MN, USA). The pro side emphasized that the many GWAS successes in identifying genetic variants associated with disease risk show that the approach works; that the current genotyping platforms are efficient, with good imputation methods to fill in missing data; that its global assessment is always a success even if no significant associations are detected; and that genetic effects are likely to be large because humans have not evolved in a drug-therapy environment. By contrast, the con side emphasized that we have limited knowledge of the complexity of the genome; that limited clinical phenotypes compromise studies; that the likely multifactorial nature of drug response clouds the small genetic effects; and that sample sizes and replication studies are limited in pharmacogenetics. Lively and insightful discussions highlighted further research efforts that might benefit GWAS in pharmacogenetics. PMID:20235786

  20. Pathway analysis of genome-wide association datasets of personality traits.

    PubMed

    Kim, H-N; Kim, B-H; Cho, J; Ryu, S; Shin, H; Sung, J; Shin, C; Cho, N H; Sung, Y A; Choi, B-O; Kim, H-L

    2015-04-01

    Although several genome-wide association (GWA) studies of human personality have been recently published, genetic variants that are highly associated with certain personality traits remain unknown, due to difficulty reproducing results. To further investigate these genetic variants, we assessed biological pathways using GWA datasets. Pathway analysis using GWA data was performed on 1089 Korean women whose personality traits were measured with the Revised NEO Personality Inventory for the 5-factor model of personality. A total of 1042 pathways containing 8297 genes were included in our study. Of these, 14 pathways were highly enriched with association signals that were validated in 1490 independent samples. These pathways include association of: Neuroticism with axon guidance [L1 cell adhesion molecule (L1CAM) interactions]; Extraversion with neuronal system and voltage-gated potassium channels; Agreeableness with L1CAM interaction, neurotransmitter receptor binding and downstream transmission in postsynaptic cells; and Conscientiousness with the interferon-gamma and platelet-derived growth factor receptor beta polypeptide pathways. Several genes that contribute to top-ranked pathways in this study were previously identified in GWA studies or by pathway analysis in schizophrenia or other neuropsychiatric disorders. Here we report the first pathway analysis of all five personality traits. Importantly, our analysis identified novel pathways that contribute to understanding the etiology of personality traits. © 2015 The Authors. Genes, Brain and Behavior published by International Behavioural and Neural Genetics Society and John Wiley & Sons Ltd.