Jia, Peilin; Wang, Lily; Fanous, Ayman H.; Pato, Carlos N.; Edwards, Todd L.; Zhao, Zhongming
2012-01-01
With the recent success of genome-wide association studies (GWAS), a wealth of association data has been generated for more than 200 complex diseases/traits, creating a strong demand for data integration and interpretation. A combinatory analysis of multiple GWAS datasets, or an integrative analysis of GWAS data and other high-throughput data, has been particularly promising. In this study, we proposed an integrative analysis framework of multiple GWAS datasets by overlaying association signals onto the protein-protein interaction network, and demonstrated it using schizophrenia datasets. Building on a dense module search algorithm, we first searched for significantly enriched subnetworks for schizophrenia in each single GWAS dataset and then implemented a discovery-evaluation strategy to identify module genes with consistent association signals. We validated the module genes in an independent dataset, and also examined them through meta-analysis of the related SNPs using multiple GWAS datasets. As a result, we identified 205 module genes with a joint effect significantly associated with schizophrenia; these module genes included a number of well-studied candidate genes such as DISC1, GNA12, GNA13, GNAI1, GPR17, and GRIN2B. Further functional analysis suggested these genes are involved in neuronal related processes. Additionally, meta-analysis found that 18 SNPs in 9 module genes had P_meta < 1×10⁻⁴, including the gene HLA-DQA1 located in the MHC region on chromosome 6, which was reported in previous studies using the largest cohort of schizophrenia patients to date. These results demonstrated that our bi-directional network-based strategy is efficient for identifying disease-associated genes with modest signals in GWAS datasets. This approach can be applied to any other complex diseases/traits where multiple GWAS datasets are available. PMID:22792057
Liu, Guiyou; Zhang, Fang; Jiang, Yongshuai; Hu, Yang; Gong, Zhongying; Liu, Shoufeng; Chen, Xiuju; Jiang, Qinghua; Hao, Junwei
2017-02-01
Much effort has been expended on identifying the genetic determinants of multiple sclerosis (MS). Existing large-scale genome-wide association study (GWAS) datasets provide strong support for using pathway and network-based analysis methods to investigate the mechanisms underlying MS. However, no shared genetic pathways have been identified to date. We hypothesize that shared genetic pathways may indeed exist in different MS-GWAS datasets. Here, we report results from a three-stage analysis of GWAS and expression datasets. In stage 1, we conducted multiple pathway analyses of two MS-GWAS datasets. In stage 2, we performed a candidate pathway analysis of the large-scale MS-GWAS dataset. In stage 3, we performed a pathway analysis using the dysregulated MS gene list from seven human MS case-control expression datasets. In stage 1, we identified 15 shared pathways. In stage 2, we successfully replicated 14 of these 15 significant pathways. In stage 3, we found that dysregulated MS genes were significantly enriched in 10 of 15 MS risk pathways identified in stages 1 and 2. We report shared genetic pathways in different MS-GWAS datasets and highlight some new MS risk pathways. Our findings provide new insights on the genetic determinants of MS.
Kwon, Ji-Sun; Kim, Jihye; Nam, Dougu; Kim, Sangsoo
2012-06-01
Gene set analysis (GSA) is useful in interpreting a genome-wide association study (GWAS) result in terms of biological mechanism. We compared the performance of two different GSA implementations that accept GWAS p-values of single nucleotide polymorphisms (SNPs) or gene-by-gene summaries thereof, GSA-SNP and i-GSEA4GWAS, under the same settings of inputs and parameters. GSA runs were made with two sets of p-values from a Korean type 2 diabetes mellitus GWAS study: 259,188 and 1,152,947 SNPs of the original and imputed genotype datasets, respectively. When Gene Ontology terms were used as gene sets, i-GSEA4GWAS produced 283 and 1,070 hits for the unimputed and imputed datasets, respectively. On the other hand, GSA-SNP reported 94 and 38 hits, respectively, for both datasets. Similar, but to a lesser degree, trends were observed with Kyoto Encyclopedia of Genes and Genomes (KEGG) gene sets as well. The huge number of hits by i-GSEA4GWAS for the imputed dataset was probably an artifact due to the scaling step in the algorithm. The decrease in hits by GSA-SNP for the imputed dataset may be due to the fact that it relies on Z-statistics, which is sensitive to variations in the background level of associations. Judicious evaluation of the GSA outcomes, perhaps based on multiple programs, is recommended.
Saeed, Mohammad
2017-05-01
Systemic lupus erythematosus (SLE) is a complex disorder. Genetic association studies of complex disorders suffer from the following three major issues: phenotypic heterogeneity, false positive (type I error), and false negative (type II error) results. Hence, genes with low to moderate effects are missed in standard analyses, especially after statistical corrections. OASIS is a novel linkage disequilibrium clustering algorithm that can potentially address false positives and negatives in genome-wide association studies (GWAS) of complex disorders such as SLE. OASIS was applied to two SLE dbGAP GWAS datasets (6077 subjects; ∼0.75 million single-nucleotide polymorphisms). OASIS identified three known SLE genes viz. IFIH1, TNIP1, and CD44, not previously reported using these GWAS datasets. In addition, 22 novel loci for SLE were identified and the 5 SLE genes previously reported using these datasets were verified. OASIS methodology was validated using single-variant replication and gene-based analysis with GATES. This led to the verification of 60% of OASIS loci. New SLE genes that OASIS identified and were further verified include TNFAIP6, DNAJB3, TTF1, GRIN2B, MON2, LATS2, SNX6, RBFOX1, NCOA3, and CHAF1B. This study presents the OASIS algorithm, software, and the meta-analyses of two publicly available SLE GWAS datasets along with the novel SLE genes. Hence, OASIS is a novel linkage disequilibrium clustering method that can be universally applied to existing GWAS datasets for the identification of new genes.
Genotype-based gene signature of glioma risk.
Huang, Yen-Tsung; Zhang, Yi; Wu, Zhijin; Michaud, Dominique S
2017-07-01
Glioma accounts for 80% of malignant brain tumors, but its etiologic determinants remain elusive. Despite genetic susceptibility loci identified by genome-wide association study (GWAS), the agnostic approach leaves open the possibility that other susceptibility genes remain to be discovered. Here we conduct a gene-centric integrative GWAS (iGWAS) of glioma risk that combines transcriptomics and genetics. We synthesized a brain transcriptomics dataset (n = 354), a GWAS dataset (n = 4203), and an advanced glioma tumor transcriptomic dataset (n = 483) to conduct an iGWAS. Using the expression quantitative trait loci (eQTL) dataset, we built models to predict gene expression for the GWAS data, based on eQTL genotypes. With the predicted gene expression, iGWAS analyses were performed using a novel statistical method. Gene signature risk score was constructed using a penalized logistic regression model. A total of 30527 transcripts were analyzed using the iGWAS approach. Four novel glioma susceptibility genes were identified with internal and external validation, including DRD5 (P = 3.0 × 10⁻⁷⁹), WDR1 (P = 8.4 × 10⁻⁷⁷), NOMO1 (P = 1.3 × 10⁻²⁵), and PDXDC1 (P = 8.3 × 10⁻²⁴). The genotype-predicted transcription pattern between cases and controls is consistent with that between tumor and its matched normal tissue. The genotype-based 4-gene signature improved the classification between glioma cases and controls based on age, gender, and population stratification, with area under the receiver operating characteristic curve increasing from 0.77 to 0.85 (P = 8.1 × 10⁻²³). A new genotype-based gene signature of glioma was identified using a novel iGWAS approach, which integrates multiplatform genomic data as well as different genetic association studies. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Neuro-Oncology. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
Privacy-preserving GWAS analysis on federated genomic datasets.
Constable, Scott D; Tang, Yuzhe; Wang, Shuang; Jiang, Xiaoqian; Chapin, Steve
2015-01-01
The biomedical community benefits from the increasing availability of genomic data to support meaningful scientific research, e.g., Genome-Wide Association Studies (GWAS). However, a high-quality GWAS usually requires a large number of samples, which can grow beyond the capability of a single institution. Federated genomic data analysis holds the promise of enabling cross-institution collaboration for effective GWAS, but it raises concerns about patient privacy and medical information confidentiality (as data are being exchanged across institutional boundaries), which becomes an inhibiting factor for practical use. We present a privacy-preserving GWAS framework on federated genomic datasets. Our method is to layer the GWAS computations on top of secure multi-party computation (MPC) systems. This approach allows two parties in a distributed system to mutually perform secure GWAS computations without exposing their private data to the outside. We demonstrate our technique by implementing a framework for minor allele frequency counting and χ² statistics calculation, which are typical computations used in GWAS. For efficient prototyping, we use a state-of-the-art MPC framework, i.e., Portable Circuit Format (PCF) 1. Our experimental results show promise in realizing both efficient and secure cross-institution GWAS computations.
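To make the two computations named above concrete, here is a minimal plaintext sketch in Python of minor allele frequency counting and an allelic χ² test. It is not the authors' secure implementation: it uses no MPC or PCF circuits, and the function names and the 0/1/2 genotype encoding are assumptions made for illustration only.

```python
def minor_allele_frequency(genotypes):
    """Minor allele frequency from per-individual alternate-allele counts (0, 1 or 2)."""
    n_alleles = 2 * len(genotypes)
    if n_alleles == 0:
        raise ValueError("no genotypes supplied")
    alt_freq = sum(genotypes) / n_alleles
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1.0 - alt_freq)


def allelic_chi_square(case_genotypes, control_genotypes):
    """Pearson chi-square statistic (1 degree of freedom) on the 2x2 allele count table."""
    a = sum(case_genotypes)                 # alternate alleles in cases
    b = 2 * len(case_genotypes) - a         # reference alleles in cases
    c = sum(control_genotypes)              # alternate alleles in controls
    d = 2 * len(control_genotypes) - c      # reference alleles in controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0                          # a margin is empty, so no association is testable
    return n * (a * d - b * c) ** 2 / denom


cases = [2, 1, 1, 0]
controls = [0, 0, 1, 0]
maf_controls = minor_allele_frequency(controls)
chi2 = allelic_chi_square(cases, controls)
```

In a federated setting each institution would hold only its own genotype counts; the sketch pools them in the clear purely to show the arithmetic that a secure protocol would have to reproduce.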
Chung, Dongjun; Kim, Hang J; Zhao, Hongyu
2017-02-01
Genome-wide association studies (GWAS) have identified tens of thousands of genetic variants associated with hundreds of phenotypes and diseases, which have provided clinical and medical benefits to patients with novel biomarkers and therapeutic targets. However, identification of risk variants associated with complex diseases remains challenging as they are often affected by many genetic variants with small or moderate effects. There has been accumulating evidence suggesting that different complex traits share common risk basis, namely pleiotropy. Recently, several statistical methods have been developed to improve statistical power to identify risk variants for complex traits through a joint analysis of multiple GWAS datasets by leveraging pleiotropy. While these methods were shown to improve statistical power for association mapping compared to separate analyses, they are still limited in the number of phenotypes that can be integrated. In order to address this challenge, in this paper, we propose a novel statistical framework, graph-GPA, to integrate a large number of GWAS datasets for multiple phenotypes using a hidden Markov random field approach. Application of graph-GPA to a joint analysis of GWAS datasets for 12 phenotypes shows that graph-GPA improves statistical power to identify risk variants compared to statistical methods based on smaller number of GWAS datasets. In addition, graph-GPA also promotes better understanding of genetic mechanisms shared among phenotypes, which can potentially be useful for the development of improved diagnosis and therapeutics. The R implementation of graph-GPA is currently available at https://dongjunchung.github.io/GGPA/.
Privacy-Preserving Data Exploration in Genome-Wide Association Studies.
Johnson, Aaron; Shmatikov, Vitaly
2013-08-01
Genome-wide association studies (GWAS) have become a popular method for analyzing sets of DNA sequences in order to discover the genetic basis of disease. Unfortunately, statistics published as the result of GWAS can be used to identify individuals participating in the study. To prevent privacy breaches, even previously published results have been removed from public databases, impeding researchers' access to the data and hindering collaborative research. Existing techniques for privacy-preserving GWAS focus on answering specific questions, such as correlations between a given pair of SNPs (DNA sequence variations). This does not fit the typical GWAS process, where the analyst may not know in advance which SNPs to consider and which statistical tests to use, how many SNPs are significant for a given dataset, etc. We present a set of practical, privacy-preserving data mining algorithms for GWAS datasets. Our framework supports exploratory data analysis, where the analyst does not know a priori how many and which SNPs to consider. We develop privacy-preserving algorithms for computing the number and location of SNPs that are significantly associated with the disease, the significance of any statistical test between a given SNP and the disease, any measure of correlation between SNPs, and the block structure of correlations. We evaluate our algorithms on real-world datasets and demonstrate that they produce significantly more accurate results than prior techniques while guaranteeing differential privacy.
Design and analysis of multiple diseases genome-wide association studies without controls.
Chen, Zhongxue; Huang, Hanwen; Ng, Hon Keung Tony
2012-11-15
In genome-wide association studies (GWAS), multiple diseases with shared controls is one of the case-control study designs. If data obtained from these studies are appropriately analyzed, this design can have several advantages such as improving statistical power in detecting associations and reducing the time and cost in the data collection process. In this paper, we propose a study design for GWAS which involves multiple diseases but without controls. We also propose corresponding statistical data analysis strategy for GWAS with multiple diseases but no controls. Through a simulation study, we show that the statistical association test with the proposed study design is more powerful than the test with single disease sharing common controls, and it has comparable power to the overall test based on the whole dataset including the controls. We also apply the proposed method to a real GWAS dataset to illustrate the methodologies and the advantages of the proposed design. Some possible limitations of this study design and testing method and their solutions are also discussed. Our findings indicate that the proposed study design and statistical analysis strategy could be more efficient than the usual case-control GWAS as well as those with shared controls. Copyright © 2012 Elsevier B.V. All rights reserved.
Protein Interaction Networks Reveal Novel Autism Risk Genes within GWAS Statistical Noise
Correia, Catarina; Oliveira, Guiomar; Vicente, Astrid M.
2014-01-01
Genome-wide association studies (GWAS) for Autism Spectrum Disorder (ASD) thus far met limited success in the identification of common risk variants, consistent with the notion that variants with small individual effects cannot be detected individually in single SNP analysis. To further capture disease risk gene information from ASD association studies, we applied a network-based strategy to the Autism Genome Project (AGP) and the Autism Genetics Resource Exchange GWAS datasets, combining family-based association data with Human Protein-Protein interaction (PPI) data. Our analysis showed that autism-associated proteins at higher than conventional levels of significance (P<0.1) directly interact more than random expectation and are involved in a limited number of interconnected biological processes, indicating that they are functionally related. The functionally coherent networks generated by this approach contain ASD-relevant disease biology, as demonstrated by an improved positive predictive value and sensitivity in retrieving known ASD candidate genes relative to the top associated genes from either GWAS, as well as a higher gene overlap between the two ASD datasets. Analysis of the intersection between the networks obtained from the two ASD GWAS and six unrelated disease datasets identified fourteen genes exclusively present in the ASD networks. These are mostly novel genes involved in abnormal nervous system phenotypes in animal models, and in fundamental biological processes previously implicated in ASD, such as axon guidance, cell adhesion or cytoskeleton organization. Overall, our results highlighted novel susceptibility genes previously hidden within GWAS statistical “noise” that warrant further analysis for causal variants. PMID:25409314
Zhang, Mingming; Mu, Hongbo; Shang, Zhenwei; Kang, Kai; Lv, Hongchao; Duan, Lian; Li, Jin; Chen, Xinren; Teng, Yanbo; Jiang, Yongshuai; Zhang, Ruijie
2017-01-06
Parkinson's disease (PD) is the second most common neurodegenerative disease. It is generally believed to be influenced by both genetic and environmental factors, but the precise pathogenesis of PD is unknown to date. In this study, we performed a pathway analysis based on genome-wide association study (GWAS) data to detect risk pathways of PD in three GWAS datasets. We first mapped all SNP markers to autosomal genes in each GWAS dataset. Then, we evaluated gene risk values using the minimum P-value of the tagSNPs. We took a pathway as a unit to identify the risk pathways based on the cumulative risks of the genes in the pathway. Finally, we combined the analysis results of the three datasets to detect the high-risk pathways associated with PD. We found five pathways shared by all three datasets. We also found five pathways shared by two datasets. Most of these pathways are associated with the nervous system. Five pathways had been reported to be PD-related pathways in the previous literature. Our findings also implied that there was a close association between immune response and PD. Continued investigation of these pathways will further help us explain the pathogenesis of PD. Copyright © 2016. Published by Elsevier Ltd.
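The gene- and pathway-level scoring steps described above can be sketched as follows. The abstract does not give the exact formulas, so this Python sketch assumes that a gene's risk value is the minimum P-value among its mapped SNPs and that a pathway's cumulative risk is the sum of -log10 of those minima; the function names and the example SNP, gene, and pathway identifiers are hypothetical.

```python
import math


def gene_min_pvalues(snp_pvalues, snp_to_gene):
    """Assign each gene the smallest P-value among the SNPs mapped to it."""
    gene_min_p = {}
    for snp, p in snp_pvalues.items():
        gene = snp_to_gene.get(snp)
        if gene is None:
            continue  # SNP not mapped to any gene, so it is ignored
        if gene not in gene_min_p or p < gene_min_p[gene]:
            gene_min_p[gene] = p
    return gene_min_p


def pathway_cumulative_risk(pathway_genes, gene_min_p):
    """Assumed cumulative score: sum of -log10(min P) over pathway genes with data."""
    return sum(-math.log10(gene_min_p[g]) for g in pathway_genes if g in gene_min_p)


# Hypothetical toy data for illustration only.
snp_pvalues = {"rs1": 0.01, "rs2": 0.001, "rs3": 0.1, "rs4": 0.5}
snp_to_gene = {"rs1": "GENE_A", "rs2": "GENE_A", "rs3": "GENE_B"}
gene_scores = gene_min_pvalues(snp_pvalues, snp_to_gene)
score = pathway_cumulative_risk({"GENE_A", "GENE_B", "GENE_C"}, gene_scores)
```

A minimum-P summary tends to favor genes with many SNPs, so a real analysis would usually add a correction for gene size or a permutation step; none is included here.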
Leveraging lung tissue transcriptome to uncover candidate causal genes in COPD genetic associations.
Lamontagne, Maxime; Bérubé, Jean-Christophe; Obeidat, Ma'en; Cho, Michael H; Hobbs, Brian D; Sakornsakolpat, Phuwanat; de Jong, Kim; Boezen, H Marike; Nickle, David; Hao, Ke; Timens, Wim; van den Berge, Maarten; Joubert, Philippe; Laviolette, Michel; Sin, Don D; Paré, Peter D; Bossé, Yohan
2018-05-15
Causal genes of chronic obstructive pulmonary disease (COPD) remain elusive. The current study aims at integrating genome-wide association studies (GWAS) and lung expression quantitative trait loci (eQTL) data to map COPD candidate causal genes and gain biological insights into the recently discovered COPD susceptibility loci. Two complementary genomic datasets on COPD were studied. First, the lung eQTL dataset which included whole-genome gene expression and genotyping data from 1038 individuals. Second, the largest COPD GWAS to date from the International COPD Genetics Consortium (ICGC) with 13 710 cases and 38 062 controls. Methods that integrated GWAS with eQTL signals including transcriptome-wide association study (TWAS), colocalization and Mendelian randomization-based (SMR) approaches were used to map causality genes, i.e. genes with the strongest evidence of being the functional effector at specific loci. These methods were applied at the genome-wide level and at COPD risk loci derived from the GWAS literature. Replication was performed using lung data from GTEx. We collated 129 non-overlapping risk loci for COPD from the GWAS literature. At the genome-wide scale, 12 new COPD candidate genes/loci were revealed and six replicated in GTEx including CAMK2A, DMPK, MYO15A, TNFRSF10A, BTN3A2 and TRBV30. In addition, we mapped candidate causal genes for 60 out of the 129 GWAS-nominated loci and 23 of them were replicated in GTEx. Mapping candidate causal genes in lung tissue represents an important contribution to the genetics of COPD, enriches our biological interpretation of GWAS findings, and brings us closer to clinical translation of genetic associations.
Grover, Sandeep; Del Greco M, Fabiola; Stein, Catherine M; Ziegler, Andreas
2017-01-01
Confounding and reverse causality have prevented us from drawing meaningful clinical interpretation even in well-powered observational studies. Confounding may be attributed to our inability to randomize the exposure variable in observational studies. Mendelian randomization (MR) is one approach to overcome confounding. It utilizes one or more genetic polymorphisms as a proxy for the exposure variable of interest. Polymorphisms are randomly distributed in a population, they are static throughout an individual's lifetime, and may thus help in inferring directionality in exposure-outcome associations. Genome-wide association studies (GWAS) or meta-analyses of GWAS are characterized by large sample sizes and the availability of many single nucleotide polymorphisms (SNPs), making GWAS-based MR an attractive approach. GWAS-based MR comes with specific challenges, including multiple causality. Despite shortcomings, it still remains one of the most powerful techniques for inferring causality. With MR still an evolving concept with complex statistical challenges, the literature is relatively scarce in terms of providing working examples incorporating real datasets. In this chapter, we provide a step-by-step guide for causal inference based on the principles of MR with a real dataset using both individual and summary data from unrelated individuals. We suggest best possible practices and give recommendations based on the current literature.
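As an illustration of summary-data MR, the following Python sketch computes the standard Wald ratio for a single SNP and the inverse-variance weighted (IVW) estimate across several SNPs. It is not taken from the chapter: the function names are hypothetical, the standard errors use the common first-order approximation that ignores uncertainty in the SNP-exposure estimates, and the instruments are assumed to be independent and valid.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Causal estimate from one SNP: SNP-outcome effect divided by SNP-exposure effect."""
    if beta_exposure == 0:
        raise ValueError("SNP-exposure effect must be non-zero")
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)  # first-order approximation
    return estimate, se


def ivw_estimate(betas_exposure, betas_outcome, ses_outcome):
    """Fixed-effect inverse-variance weighted estimate across independent SNPs."""
    num = sum(bx * by / sy ** 2
              for bx, by, sy in zip(betas_exposure, betas_outcome, ses_outcome))
    den = sum(bx ** 2 / sy ** 2
              for bx, sy in zip(betas_exposure, ses_outcome))
    return num / den, math.sqrt(1.0 / den)


# Hypothetical summary statistics for two SNPs.
bx = [0.1, 0.2]      # SNP-exposure effects
by = [0.05, 0.1]     # SNP-outcome effects
sy = [0.01, 0.01]    # standard errors of the SNP-outcome effects
single, single_se = wald_ratio(bx[0], by[0], sy[0])
pooled, pooled_se = ivw_estimate(bx, by, sy)
```

The IVW estimate is equivalent to a weighted regression of the SNP-outcome effects on the SNP-exposure effects through the origin, which is why it is sensitive to directional pleiotropy.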
LEAP: biomarker inference through learning and evaluating association patterns.
Jiang, Xia; Neapolitan, Richard E
2015-03-01
Single nucleotide polymorphism (SNP) high-dimensional datasets are available from Genome Wide Association Studies (GWAS). Such data provide researchers opportunities to investigate the complex genetic basis of diseases. Much of genetic risk might be due to undiscovered epistatic interactions, which are interactions in which combinations of several genes affect disease. Research aimed at discovering interacting SNPs from GWAS datasets proceeded in two directions. First, tools were developed to evaluate candidate interactions. Second, algorithms were developed to search over the space of candidate interactions. Another problem when learning interacting SNPs, which has not received much attention, is evaluating how likely it is that the learned SNPs are associated with the disease. A complete system should provide this information as well. We develop such a system. Our system, called LEAP, includes a new heuristic search algorithm for learning interacting SNPs, and a Bayesian network based algorithm for computing the probability of their association. We evaluated the performance of LEAP using 100 1,000-SNP simulated datasets, each of which contains 15 SNPs involved in interactions. When learning interacting SNPs from these datasets, LEAP outperformed seven other methods. Furthermore, only SNPs involved in interactions were found to be probable. We also used LEAP to analyze real Alzheimer's disease and breast cancer GWAS datasets. We obtained interesting and new results from the Alzheimer's dataset, but limited results from the breast cancer dataset. We conclude that our results support that LEAP is a useful tool for extracting candidate interacting SNPs from high-dimensional datasets and determining their probability. © 2015 The Authors. Genetic Epidemiology published by Wiley Periodicals, Inc.
Liley, James; Wallace, Chris
2015-02-01
Genome-wide association studies (GWAS) have been successful in identifying single nucleotide polymorphisms (SNPs) associated with many traits and diseases. However, at existing sample sizes, these variants explain only part of the estimated heritability. Leverage of GWAS results from related phenotypes may improve detection without the need for larger datasets. The Bayesian conditional false discovery rate (cFDR) constitutes an upper bound on the expected false discovery rate (FDR) across a set of SNPs whose p values for two diseases are both less than two disease-specific thresholds. Calculation of the cFDR requires only summary statistics and has several advantages over traditional GWAS analysis. However, existing methods require distinct control samples between studies. Here, we extend the technique to allow for some or all controls to be shared, increasing applicability. Several different SNP sets can be defined with the same cFDR value, and we show that the expected FDR across the union of these sets may exceed expected FDR in any single set. We describe a procedure to establish an upper bound for the expected FDR among the union of such sets of SNPs. We apply our technique to pairwise analysis of p values from ten autoimmune diseases with variable sharing of controls, enabling discovery of 59 SNP-disease associations which do not reach GWAS significance after genomic control in individual datasets. Most of the SNPs we highlight have previously been confirmed using replication studies or larger GWAS, a useful validation of our technique; we report eight SNP-disease associations across five diseases not previously declared. Our technique extends and strengthens the previous algorithm, and establishes robust limits on the expected FDR. This approach can improve SNP detection in GWAS, and give insight into shared aetiology between phenotypically related conditions.
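A rough Python sketch of the basic empirical cFDR estimator from summary p values may help make the idea concrete. It implements only the simple form in which the p value for the principal disease is divided by the empirical conditional distribution given the second disease; it assumes independent controls, so it does not implement the shared-control extension or the bound over unions of SNP sets described above. The function name, the cap at 1, and the toy p values are assumptions for illustration.

```python
def empirical_cfdr(p_i, p_j, pvals_i, pvals_j):
    """Estimate cFDR(p_i | p_j) = p_i * #{P_j <= p_j} / #{P_i <= p_i and P_j <= p_j}.

    pvals_i and pvals_j hold the p values of the same SNPs, in the same order,
    for the principal and the conditional phenotype respectively.
    """
    if len(pvals_i) != len(pvals_j):
        raise ValueError("p-value lists must describe the same SNPs")
    n_cond = sum(1 for q in pvals_j if q <= p_j)
    n_both = sum(1 for a, b in zip(pvals_i, pvals_j) if a <= p_i and b <= p_j)
    if n_both == 0:
        return 1.0  # no SNP in the region, so fall back to the trivial bound
    return min(1.0, p_i * n_cond / n_both)


# Hypothetical p values for four SNPs in two phenotypes.
p_disease_1 = [0.001, 0.01, 0.5, 0.8]
p_disease_2 = [0.01, 0.02, 0.9, 0.03]
conditioned = empirical_cfdr(0.01, 0.02, p_disease_1, p_disease_2)
unconditioned = empirical_cfdr(0.01, 1.0, p_disease_1, p_disease_2)
```

With these toy values the estimate conditioned on a small p value in the second phenotype is lower than the one with no effective conditioning, which is the enrichment effect that the method exploits.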
Nonsyndromic cleft palate: An association study at GWAS candidate loci in a multiethnic sample.
Ishorst, Nina; Francheschelli, Paola; Böhmer, Anne C; Khan, Mohammad Faisal J; Heilmann-Heimbach, Stefanie; Fricker, Nadine; Little, Julian; Steegers-Theunissen, Regine P M; Peterlin, Borut; Nowak, Stefanie; Martini, Markus; Kruse, Teresa; Dunsche, Anton; Kreusch, Thomas; Gölz, Lina; Aldhorae, Khalid; Halboub, Esam; Reutter, Heiko; Mossey, Peter; Nöthen, Markus M; Rubini, Michele; Ludwig, Kerstin U; Knapp, Michael; Mangold, Elisabeth
2018-06-01
Nonsyndromic cleft palate only (nsCPO) is a common and multifactorial form of orofacial clefting. In contrast to successes achieved for the other common form of orofacial clefting, that is, nonsyndromic cleft lip with/without cleft palate (nsCL/P), genome wide association studies (GWAS) of nsCPO have identified only one genome wide significant locus. The aim of the present study was to investigate whether common variants contribute to nsCPO and, if so, to identify novel risk loci. We genotyped 33 SNPs at 27 candidate loci from 2 previously published nsCPO GWAS in an independent multiethnic sample. It included: (i) a family-based sample of European ancestry (n = 212); and (ii) two case/control samples of Central European (n = 94/339) and Arabian ancestry (n = 38/231), respectively. A separate association analysis was performed for each genotyped dataset, and meta-analyses were performed. After association analysis and meta-analyses, none of the 33 SNPs showed genome-wide significance. Two variants showed nominally significant association in the imputed GWAS dataset and exhibited a further decrease in p-value in a European and an overall meta-analysis including imputed GWAS data, respectively (rs395572: P_MetaEU = 3.16 × 10⁻⁴; rs6809420: P_MetaAll = 2.80 × 10⁻⁴). Our findings suggest that there is a limited contribution of common variants to nsCPO. However, the individual effect sizes might be too small for detection of further associations in the present sample sizes. Rare variants may play a more substantial role in nsCPO than in nsCL/P, for which GWAS of smaller sample sizes have identified genome-wide significant loci. Whole-exome/genome sequencing studies of nsCPO are now warranted. © 2018 Wiley Periodicals, Inc.
López-Isac, Elena; Martín, Jose-Ezequiel; Assassi, Shervin; Simeón, Carmen P; Carreira, Patricia; Ortego-Centeno, Norberto; Freire, Mayka; Beltrán, Emma; Narváez, Javier; Alegre-Sancho, Juan J; Fernández-Gutiérrez, Benjamín; Balsa, Alejandro; Ortiz, Ana M; González-Gay, Miguel A; Beretta, Lorenzo; Santaniello, Alessandro; Bellocchi, Chiara; Lunardi, Claudio; Moroncini, Gianluca; Gabrielli, Armando; Witte, Torsten; Hunzelmann, Nicolas; Distler, Jörg HW; Riekemasten, Gabriella; van der Helm-van Mil, Annete H; de Vries-Bouwstra, Jeska; Magro-Checa, Cesar; Voskuyl, Alexandre E; Vonk, Madelon C; Molberg, Øyvind; Merriman, Tony; Hesselstrand, Roger; Nordin, Annika; Padyukov, Leonid; Herrick, Ariane; Eyre, Steve; Koeleman, Bobby PC; Denton, Christopher P; Fonseca, Carmen; Radstake, Timothy RDJ; Worthington, Jane; Mayes, Maureen D; Martín, Javier
2017-01-01
Objectives: Systemic sclerosis (SSc) and rheumatoid arthritis (RA) are autoimmune diseases that share clinical and immunological characteristics. To date, several shared SSc-RA loci have been identified independently. In this study, we aimed to systematically search for new common SSc-RA loci through an inter-disease meta-GWAS strategy. Methods: We performed a meta-analysis combining GWAS datasets of SSc and RA using a strategy that allowed identification of loci with both same-direction and opposing-direction allelic effects. The top single-nucleotide polymorphisms (SNPs) were followed up in independent SSc and RA case-control cohorts. This allowed us to increase the sample size to a total of 8,830 SSc patients, 16,870 RA patients and 43,393 controls. Results: The cross-disease meta-analysis of the GWAS datasets identified several loci with nominal association signals (P-value < 5 × 10⁻⁶), which also showed evidence of association in the disease-specific GWAS scan. These loci included several genomic regions not previously reported as shared loci, besides risk factors associated with both diseases in previous studies. The follow-up of the putatively new SSc-RA loci identified IRF4 as a shared risk factor for these two diseases (P_combined = 3.29 × 10⁻¹²). In addition, the analysis of the biological relevance of the known SSc-RA shared loci pointed to the type I interferon and the interleukin 12 signaling pathways as the main common etiopathogenic factors. Conclusions: Our study has identified a novel shared locus, IRF4, for SSc and RA and highlighted the usefulness of cross-disease GWAS meta-analysis in the identification of common risk loci. PMID:27111665
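One way to picture a cross-disease meta-analysis that allows both same-direction and opposing-direction effects is the Python sketch below. It is an assumption about how such a test could be set up, not the authors' pipeline: it runs a fixed-effect inverse-variance meta-analysis of two per-SNP effect estimates twice, once as given and once with the sign of the second disease's effect flipped. The function names and the example effect sizes are hypothetical.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance meta-analysis; returns (beta, se, z, two-sided p)."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return beta, se, z, p


def cross_disease_meta(beta_a, se_a, beta_b, se_b):
    """Test one SNP under same-direction and opposing-direction effect models."""
    return {
        "same_direction": inverse_variance_meta([beta_a, beta_b], [se_a, se_b]),
        # Flipping the second effect lets alleles that raise risk for one
        # disease and lower it for the other reinforce rather than cancel.
        "opposite_direction": inverse_variance_meta([beta_a, -beta_b], [se_a, se_b]),
    }


# Hypothetical log odds ratios: risk allele for disease A, protective for disease B.
result = cross_disease_meta(0.2, 0.05, -0.2, 0.05)
```

Because each SNP is tested under two models, a real analysis would need to account for the extra test when setting significance thresholds.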
Giambartolomei, Claudia; Vukcevic, Damjan; Schadt, Eric E; Franke, Lude; Hingorani, Aroon D; Wallace, Chris; Plagnol, Vincent
2014-05-01
Genetic association studies, in particular the genome-wide association study (GWAS) design, have provided a wealth of novel insights into the aetiology of a wide range of human diseases and traits, in particular cardiovascular diseases and lipid biomarkers. The next challenge consists of understanding the molecular basis of these associations. The integration of multiple association datasets, including gene expression datasets, can contribute to this goal. We have developed a novel statistical methodology to assess whether two association signals are consistent with a shared causal variant. An application is the integration of disease scans with expression quantitative trait locus (eQTL) studies, but any pair of GWAS datasets can be integrated in this framework. We demonstrate the value of the approach by re-analysing a gene expression dataset in 966 liver samples with a published meta-analysis of lipid traits including >100,000 individuals of European ancestry. Combining all lipid biomarkers, our re-analysis supported 26 out of 38 reported colocalisation results with eQTLs and identified 14 new colocalisation results, hence highlighting the value of a formal statistical test. In three cases of reported eQTL-lipid pairs (SYPL2, IFT172, TBKBP1) for which our analysis suggests that the eQTL pattern is not consistent with the lipid association, we identify alternative colocalisation results with SORT1, GCKR, and KPNB1, indicating that these genes are more likely to be causal in these genomic intervals. A key feature of the method is the ability to derive the output statistics from single SNP summary statistics, hence making it possible to perform systematic meta-analysis type comparisons across multiple GWAS datasets (implemented online at http://coloc.cs.ucl.ac.uk/coloc/). 
Our methodology provides information about candidate causal genes in associated intervals and has direct implications for the understanding of complex diseases as well as the design of drugs to target disease pathways.
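The use of single-SNP summary statistics can be sketched as follows: each SNP's effect estimate and standard error are turned into a Wakefield approximate Bayes factor, and the Bayes factors for the two traits are combined into posterior probabilities for five hypotheses (H0: no association, H1/H2: one trait only, H3: two distinct causal variants, H4: one shared causal variant). This is an illustrative sketch assuming at most one causal variant per trait in the region; the prior values (`prior_sd`, `p1`, `p2`, `p12`) and the example data are assumptions, not values taken from the abstract.

```python
import math

def log_abf(beta, se, prior_sd=0.15):
    """Wakefield log approximate Bayes factor for one SNP (prior_sd is assumed)."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - r) + r * z * z)

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 for a region with at least two SNPs."""
    n = len(labf1)
    # All ordered pairs of distinct SNPs: causal SNP i for trait 1, j for trait 2.
    cross = [labf1[i] + labf2[j] for i in range(n) for j in range(n) if i != j]
    shared = [a + b for a, b in zip(labf1, labf2)]
    log_h = [
        0.0,                                               # H0
        math.log(p1) + log_sum_exp(labf1),                 # H1
        math.log(p2) + log_sum_exp(labf2),                 # H2
        math.log(p1) + math.log(p2) + log_sum_exp(cross),  # H3
        math.log(p12) + log_sum_exp(shared),               # H4
    ]
    total = log_sum_exp(log_h)
    return [math.exp(h - total) for h in log_h]

# Hypothetical region of three SNPs where the second SNP is strong in both traits.
trait1 = [log_abf(b, 0.05) for b in (0.05, 0.30, 0.025)]
trait2 = [log_abf(b, 0.05) for b in (0.025, 0.25, 0.05)]
print(coloc_posteriors(trait1, trait2))
```

The pairwise enumeration for H3 is quadratic in the number of SNPs; it is written this way for clarity rather than speed.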
GWASeq: targeted re-sequencing follow up to GWAS.
Salomon, Matthew P; Li, Wai Lok Sibon; Edlund, Christopher K; Morrison, John; Fortini, Barbara K; Win, Aung Ko; Conti, David V; Thomas, Duncan C; Duggan, David; Buchanan, Daniel D; Jenkins, Mark A; Hopper, John L; Gallinger, Steven; Le Marchand, Loïc; Newcomb, Polly A; Casey, Graham; Marjoram, Paul
2016-03-03
For the last decade the conceptual framework of the Genome-Wide Association Study (GWAS) has dominated the investigation of human disease and other complex traits. While GWAS have been successful in identifying a large number of variants associated with various phenotypes, the overall amount of heritability explained by these variants remains small. This raises the question of how best to follow up on a GWAS, localize causal variants accounting for GWAS hits, and as a consequence explain more of the so-called "missing" heritability. Advances in high throughput sequencing technologies now allow for the efficient and cost-effective collection of vast amounts of fine-scale genomic data to complement GWAS. We investigate these issues using a colon cancer dataset. After QC, our data consisted of 1993 cases, 899 controls. Using marginal tests of associations, we identify 10 variants distributed among six targeted regions that are significantly associated with colorectal cancer, with eight of the variants being novel to this study. Additionally, we perform so-called 'SNP-set' tests of association and identify two sets of variants that implicate both common and rare variants in the etiology of colorectal cancer. Here we present a large-scale targeted re-sequencing resource focusing on genomic regions implicated in colorectal cancer susceptibility previously identified in several GWAS, which aims to 1) provide fine-scale targeted sequencing data for fine-mapping and 2) provide data resources to address methodological questions regarding the design of sequencing-based follow-up studies to GWAS. Additionally, we show that this strategy successfully identifies novel variants associated with colorectal cancer susceptibility and can implicate both common and rare variants.
Cloud computing for detecting high-order genome-wide epistatic interaction via dynamic clustering.
Guo, Xuan; Meng, Yu; Yu, Ning; Pan, Yi
2014-04-10
Taking advantage of high-throughput single nucleotide polymorphism (SNP) genotyping technology, large genome-wide association studies (GWASs) have been considered to hold promise for unravelling complex relationships between genotype and phenotype. At present, traditional single-locus-based methods are insufficient to detect interactions involving multiple loci, which are widespread in complex traits. In addition, statistical tests for high-order epistatic interactions with more than 2 SNPs pose computational and analytical challenges because the computation increases exponentially as the cardinality of SNP combinations gets larger. In this paper, we provide a simple, fast and powerful method using dynamic clustering and cloud computing to detect genome-wide multi-locus epistatic interactions. We have constructed systematic experiments to compare power performance against some recently proposed algorithms, including TEAM, SNPRuler, EDCF and BOOST. Furthermore, we have applied our method to two real GWAS datasets, Age-related macular degeneration (AMD) and Rheumatoid arthritis (RA) datasets, where we find some novel potential disease-related genetic factors that do not show up in the detection of two-locus epistatic interactions. Experimental results on simulated data demonstrate that our method is more powerful than some recently proposed methods on both two- and three-locus disease models. Our method has discovered many novel high-order associations that are significantly enriched in cases from two real GWAS datasets. Moreover, the running times of the cloud implementation of our method on the AMD dataset and the RA dataset are roughly 2 hours and 50 hours, respectively, on a cluster with forty small virtual machines for detecting two-locus interactions. Therefore, we believe that our method is suitable and effective for the full-scale analysis of multiple-locus epistatic interactions in GWAS.
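The computational challenge mentioned above comes from the number of SNP combinations that an exhaustive search must test. A short worked example, assuming a hypothetical panel of 500,000 SNPs (the panel size is not taken from the abstract):

```python
import math

def n_combinations(n_snps, order):
    """Number of distinct SNP sets of the given interaction order."""
    return math.comb(n_snps, order)

# Hypothetical panel size; the count grows rapidly with interaction order.
for order in (1, 2, 3):
    print(order, n_combinations(500_000, order))
```

With this panel size there are about 1.25 × 10^11 SNP pairs and about 2 × 10^16 SNP triples, which is why exhaustive testing beyond two loci generally needs pruning, clustering, or distributed computation.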
Cloud computing for detecting high-order genome-wide epistatic interaction via dynamic clustering
2014-01-01
Background Taking advantage of high-throughput single nucleotide polymorphism (SNP) genotyping technology, large genome-wide association studies (GWASs) have been considered to hold promise for unravelling complex relationships between genotype and phenotype. At present, traditional single-locus-based methods are insufficient to detect interactions involving multiple loci, which are widespread in complex traits. In addition, statistical tests for high-order epistatic interactions with more than 2 SNPs pose computational and analytical challenges because the computation increases exponentially as the cardinality of SNP combinations gets larger. Results In this paper, we provide a simple, fast and powerful method using dynamic clustering and cloud computing to detect genome-wide multi-locus epistatic interactions. We have constructed systematic experiments to compare power performance against some recently proposed algorithms, including TEAM, SNPRuler, EDCF and BOOST. Furthermore, we have applied our method to two real GWAS datasets, Age-related macular degeneration (AMD) and Rheumatoid arthritis (RA) datasets, where we find some novel potential disease-related genetic factors that do not show up in the detection of two-locus epistatic interactions. Conclusions Experimental results on simulated data demonstrate that our method is more powerful than some recently proposed methods on both two- and three-locus disease models. Our method has discovered many novel high-order associations that are significantly enriched in cases from two real GWAS datasets. Moreover, the running times of the cloud implementation of our method on the AMD dataset and the RA dataset are roughly 2 hours and 50 hours, respectively, on a cluster with forty small virtual machines for detecting two-locus interactions. Therefore, we believe that our method is suitable and effective for the full-scale analysis of multiple-locus epistatic interactions in GWAS. PMID:24717145
Elliott, Katherine S; Chapman, Kay; Day-Williams, Aaron; Panoutsopoulou, Kalliope; Southam, Lorraine; Lindgren, Cecilia M; Arden, Nigel; Aslam, Nadim; Birrell, Fraser; Carluke, Ian; Carr, Andrew; Deloukas, Panos; Doherty, Michael; Loughlin, John; McCaskie, Andrew; Ollier, William E R; Rai, Ashok; Ralston, Stuart; Reed, Mike R; Spector, Timothy D; Valdes, Ana M; Wallis, Gillian A; Wilkinson, Mark; Zeggini, Eleftheria
2013-06-01
Obesity as measured by body mass index (BMI) is one of the major risk factors for osteoarthritis. In addition, genetic overlap has been reported between osteoarthritis and normal adult height variation. We investigated whether this relationship is due to a shared genetic aetiology on a genome-wide scale. We compared genetic association summary statistics (effect size, p value) for BMI and height from the GIANT consortium genome-wide association study (GWAS) with genetic association summary statistics from the arcOGEN consortium osteoarthritis GWAS. Significance was evaluated by permutation. Replication of osteoarthritis association of the highlighted signals was investigated in an independent dataset. Phenotypic information of height and BMI was accounted for in a separate analysis using osteoarthritis-free controls. We found significant overlap between osteoarthritis and height (p=3.3×10−5 for signals with p≤0.05) when the GIANT and arcOGEN GWAS were compared. For signals with p≤0.001 we found 17 shared signals between osteoarthritis and height and four between osteoarthritis and BMI. However, only one of the height or BMI signals that had shown evidence of association with osteoarthritis in the arcOGEN GWAS was also associated with osteoarthritis in the independent dataset: rs12149832, within the FTO gene (combined p=2.3×10−5). As expected, this signal was attenuated when we adjusted for BMI. We found a significant excess of shared signals between both osteoarthritis and height and osteoarthritis and BMI, suggestive of a common genetic aetiology. However, only one signal showed association with osteoarthritis when followed up in a new dataset.
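The overlap comparison can be sketched as counting SNPs that pass a p-value threshold in both studies and judging the count against a permutation null. The abstract does not describe the permutation scheme, so the label shuffling below is an assumption; in particular it ignores linkage disequilibrium between SNPs, which a real analysis would need to handle. All names and inputs are hypothetical.

```python
import random

def count_shared(p_a, p_b, threshold):
    """Number of SNPs with p <= threshold in both studies (lists are SNP-aligned)."""
    return sum(1 for a, b in zip(p_a, p_b) if a <= threshold and b <= threshold)

def permutation_pvalue(p_a, p_b, threshold, n_perm=1000, seed=0):
    """Empirical p-value for the observed overlap under shuffled SNP labels."""
    rng = random.Random(seed)
    observed = count_shared(p_a, p_b, threshold)
    shuffled = list(p_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the SNP-to-SNP pairing between studies
        if count_shared(p_a, shuffled, threshold) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical aligned p-values for six SNPs in two studies.
study_a = [0.01, 0.20, 0.03, 0.70, 0.04, 0.90]
study_b = [0.02, 0.01, 0.50, 0.60, 0.03, 0.80]
print(permutation_pvalue(study_a, study_b, threshold=0.05))
```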
Elliott, Katherine S; Chapman, Kay; Day-Williams, Aaron; Panoutsopoulou, Kalliope; Southam, Lorraine; Lindgren, Cecilia M; Arden, Nigel; Aslam, Nadim; Birrell, Fraser; Carluke, Ian; Carr, Andrew; Deloukas, Panos; Doherty, Michael; Loughlin, John; McCaskie, Andrew; Ollier, William E R; Rai, Ashok; Ralston, Stuart; Reed, Mike R; Spector, Timothy D; Valdes, Ana M; Wallis, Gillian A; Wilkinson, Mark; Zeggini, Eleftheria
2013-01-01
Objectives Obesity as measured by body mass index (BMI) is one of the major risk factors for osteoarthritis. In addition, genetic overlap has been reported between osteoarthritis and normal adult height variation. We investigated whether this relationship is due to a shared genetic aetiology on a genome-wide scale. Methods We compared genetic association summary statistics (effect size, p value) for BMI and height from the GIANT consortium genome-wide association study (GWAS) with genetic association summary statistics from the arcOGEN consortium osteoarthritis GWAS. Significance was evaluated by permutation. Replication of osteoarthritis association of the highlighted signals was investigated in an independent dataset. Phenotypic information of height and BMI was accounted for in a separate analysis using osteoarthritis-free controls. Results We found significant overlap between osteoarthritis and height (p=3.3×10−5 for signals with p≤0.05) when the GIANT and arcOGEN GWAS were compared. For signals with p≤0.001 we found 17 shared signals between osteoarthritis and height and four between osteoarthritis and BMI. However, only one of the height or BMI signals that had shown evidence of association with osteoarthritis in the arcOGEN GWAS was also associated with osteoarthritis in the independent dataset: rs12149832, within the FTO gene (combined p=2.3×10−5). As expected, this signal was attenuated when we adjusted for BMI. Conclusions We found a significant excess of shared signals between both osteoarthritis and height and osteoarthritis and BMI, suggestive of a common genetic aetiology. However, only one signal showed association with osteoarthritis when followed up in a new dataset. PMID:22956599
An Adaptive Association Test for Multiple Phenotypes with GWAS Summary Statistics.
Kim, Junghi; Bai, Yun; Pan, Wei
2015-12-01
We study the problem of testing for single marker-multiple phenotype associations based on genome-wide association study (GWAS) summary statistics without access to individual-level genotype and phenotype data. For most published GWASs, because obtaining summary data is substantially easier than accessing individual-level phenotype and genotype data, while often multiple correlated traits have been collected, the problem studied here has become increasingly important. We propose a powerful adaptive test and compare its performance with some existing tests. We illustrate its applications to analyses of a meta-analyzed GWAS dataset with three blood lipid traits and another with sex-stratified anthropometric traits, and further demonstrate its potential power gain over some existing methods through realistic simulation studies. We start from the situation with only one set of (possibly meta-analyzed) genome-wide summary statistics, then extend the method to meta-analysis of multiple sets of genome-wide summary statistics, each from one GWAS. We expect the proposed test to be useful in practice as more powerful than or complementary to existing methods. © 2015 WILEY PERIODICALS, INC.
2011-01-01
Background To date, nine Parkinson disease (PD) genome-wide association studies in North American, European and Asian populations have been published. The majority of studies have confirmed the association of the previously identified genetic risk factors, SNCA and MAPT, and two studies have identified three new PD susceptibility loci/genes (PARK16, BST1 and HLA-DRB5). In a recent meta-analysis of datasets from five of the published PD GWAS an additional 6 novel candidate genes (SYT11, ACMSD, STK39, MCCC1/LAMP3, GAK and CCDC62/HIP1R) were identified. Collectively the associations identified in these GWAS account for only a small proportion of the estimated total heritability of PD suggesting that an 'unknown' component of the genetic architecture of PD remains to be identified. Methods We applied a GWAS approach to a relatively homogeneous Ashkenazi Jewish (AJ) population from New York to search for both 'rare' and 'common' genetic variants that confer risk of PD by examining any SNPs with allele frequencies exceeding 2%. We have focused on a genetic isolate, the AJ population, as a discovery dataset since this cohort has a higher sharing of genetic background and historically experienced a significant bottleneck. We also conducted a replication study using two publicly available datasets from dbGaP. The joint analysis dataset had a combined sample size of 2,050 cases and 1,836 controls. Results We identified the top 57 SNPs showing the strongest evidence of association in the AJ dataset (p < 9.9 × 10−5). Six SNPs located within gene regions had positive signals in at least one other independent dbGaP dataset: LOC100505836 (Chr3p24), LOC153328/SLC25A48 (Chr5q31.1), UNC13B (9p13.3), SLCO3A1 (15q26.1), WNT3 (17q21.3) and NSF (17q21.3).
We also replicated published associations for the gene regions SNCA (Chr4q21; rs3775442, p = 0.037), PARK16 (Chr1q32.1; rs823114 (NUCKS1), p = 6.12 × 10−4), BST1 (Chr4p15; rs12502586, p = 0.027), STK39 (Chr2q24.3; rs3754775, p = 0.005), and LAMP3 (Chr3; rs12493050, p = 0.005) in addition to the two most common PD susceptibility genes in the AJ population LRRK2 (Chr12q12; rs34637584, p = 1.56 × 10−4) and GBA (Chr1q21; rs2990245, p = 0.015). Conclusions We have demonstrated the utility of the AJ dataset in PD candidate gene and SNP discovery both by replication in dbGaP datasets with a larger sample size and by replicating association of previously identified PD susceptibility genes. Our GWAS study has identified candidate gene regions for PD that are implicated in neuronal signalling and the dopamine pathway. PMID:21812969
2012-01-01
Background Genome-wide association studies (GWAS) do not provide a full account of the heritability of genetic diseases since gene-gene interactions, also known as epistasis are not considered in single locus GWAS. To address this problem, a considerable number of methods have been developed for identifying disease-associated gene-gene interactions. However, these methods typically fail to identify interacting markers explaining more of the disease heritability over single locus GWAS, since many of the interactions significant for disease are obscured by uninformative marker interactions e.g., linkage disequilibrium (LD). Results In this study, we present a novel SNP interaction prioritization algorithm, named iLOCi (Interacting Loci). This algorithm accounts for marker dependencies separately in case and control groups. Disease-associated interactions are then prioritized according to a novel ranking score calculated from the difference in marker dependencies for every possible pair between case and control groups. The analysis of a typical GWAS dataset can be completed in less than a day on a standard workstation with parallel processing capability. The proposed framework was validated using simulated data and applied to real GWAS datasets using the Wellcome Trust Case Control Consortium (WTCCC) data. The results from simulated data showed the ability of iLOCi to identify various types of gene-gene interactions, especially for high-order interaction. From the WTCCC data, we found that among the top ranked interacting SNP pairs, several mapped to genes previously known to be associated with disease, and interestingly, other previously unreported genes with biologically related roles. Conclusion iLOCi is a powerful tool for uncovering true disease interacting markers and thus can provide a more complete understanding of the genetic basis underlying complex disease. The program is available for download at http://www4a.biotec.or.th/GI/tools/iloci. PMID:23281813
He, Awen; Wang, Wenyu; Prakash, N Tejo; Tinkov, Alexey A; Skalny, Anatoly V; Wen, Yan; Hao, Jingcan; Guo, Xiong; Zhang, Feng
2018-03-01
Chemical elements are closely related to human health. Extensive genomic profile data of complex diseases offer us a good opportunity to systematically investigate the relationships between elements and complex diseases/traits. In this study, we applied a gene set enrichment analysis (GSEA) approach to detect the associations between elements and complex diseases/traits by integrating element-gene interaction datasets and genome-wide association study (GWAS) data of complex diseases/traits. To illustrate the performance of GSEA, the element-gene interaction datasets of 24 elements were extracted from the comparative toxicogenomics database (CTD). GWAS summary datasets of 24 complex diseases or traits were downloaded from the dbGaP or GEFOS websites. We observed significant associations between 7 elements and 13 complex diseases or traits (all false discovery rate (FDR) < 0.05), including reported relationships such as aluminum vs. Alzheimer's disease (FDR = 0.042), calcium vs. bone mineral density (FDR = 0.031), magnesium vs. systemic lupus erythematosus (FDR = 0.012) as well as novel associations, such as nickel vs. hypertriglyceridemia (FDR = 0.002) and bipolar disorder (FDR = 0.027). Our study results are consistent with previous biological studies, supporting the good performance of GSEA. Our results based on the GSEA framework provide novel clues for discovering causal relationships between elements and complex diseases. © 2017 WILEY PERIODICALS, INC.
Populus Trichocarpa Genome-Wide Association Study (GWAS) Population SNP Dataset Released
Tuskan, Gerald; Muchero, Wellington; Chen, Jin-Gui; Jacobson, Daniel; Tschaplinski, Timothy; Rokhsar, Daniel S; Schackwitz, Wendy S; Schmutz, Jeremy; DiFazio, Stephen P
2016-01-01
This dataset includes genetic variations found in 882 poplar trees, and provides useful information to scientists studying plants as well as researchers more generally in the fields of biofuels, materials science, and secondary plant compounds. For nearly 10 years, researchers with DOE’s BioEnergy Science Center (BESC), a multi-institutional organization headquartered at ORNL, have studied the genome of Populus, a fast-growing perennial tree recognized for its economic potential in biofuels production. This Genome-Wide Association Study (GWAS) dataset includes more than 28 million single nucleotide polymorphisms, or SNPs, that have been derived from 17 trillion bases of sequence data generated from 882 undomesticated Populus genotypes. Each SNP represents a variation in a single DNA nucleotide, or building block, that can act as a biological marker and/or causal allele within a protein sequence, helping scientists locate genes associated with certain characteristics, conditions or diseases. The results of this analysis have been used, among other things, to 1) seek genetic control of cell-wall recalcitrance, a natural characteristic of plant cell walls that prevents the release of sugars under microbial conversion and restricts biofuels production, and 2) identify the molecular mechanisms controlling deposition of lignin in plant structures. Lignin is a polyphenolic polymer that strengthens plant cell walls and acts as a barrier to microbial access to cellulose during saccharification, the process of breaking cellulose down into simple sugars for fermentation. Although the dataset’s most immediate applications are in fundamental plant sciences, ORNL researchers plan to use the GWAS data to inform applied work in areas such as cleaner, sustainable transportation biofuels, carbon fiber for lightweight vehicles and alternatives to conventional plastics and building insulation materials.
Multi-criteria decision making approaches for quality control of genome-wide association studies.
Malovini, Alberto; Rognoni, Carla; Puca, Annibale; Bellazzi, Riccardo
2009-03-01
Experimental errors in the genotyping phases of a Genome-Wide Association Study (GWAS) can lead to false positive findings and to spurious associations. An appropriate quality control phase could minimize the effects of this kind of error. Several filtering criteria can be used to perform quality control. Currently, no formal methods have been proposed for taking these criteria and the experimenter's preferences into account at the same time. In this paper we propose two strategies for setting appropriate genotyping rate thresholds for GWAS quality control. These two approaches are based on the Multi-Criteria Decision Making theory. We have applied our method to a real dataset composed of 734 individuals affected by Arterial Hypertension (AH) and 486 nonagenarians without history of AH. The proposed strategies appear to deal with GWAS quality control in a sound way, as they rationalize and make explicit the experimenter's choices, thus providing more reproducible results.
Efficiently Identifying Significant Associations in Genome-wide Association Studies
Eskin, Eleazar
2013-01-01
Abstract Over the past several years, genome-wide association studies (GWAS) have implicated hundreds of genes in common disease. More recently, the GWAS approach has been utilized to identify regions of the genome that harbor variation affecting gene expression or expression quantitative trait loci (eQTLs). Unlike GWAS applied to clinical traits, where only a handful of phenotypes are analyzed per study, in eQTL studies, tens of thousands of gene expression levels are measured, and the GWAS approach is applied to each gene expression level. This leads to computing billions of statistical tests and requires substantial computational resources, particularly when applying novel statistical methods such as mixed models. We introduce a novel two-stage testing procedure that identifies all of the significant associations more efficiently than testing all the single nucleotide polymorphisms (SNPs). In the first stage, a small number of informative SNPs, or proxies, across the genome are tested. Based on their observed associations, our approach locates the regions that may contain significant SNPs and only tests additional SNPs from those regions. We show through simulations and analysis of real GWAS datasets that the proposed two-stage procedure increases the computational speed by a factor of 10. Additionally, efficient implementation of our software increases the computational speed relative to the state-of-the-art testing approaches by a factor of 75. PMID:24033261
BlueSNP: R package for highly scalable genome-wide association studies using Hadoop clusters.
Huang, Hailiang; Tata, Sandeep; Prill, Robert J
2013-01-01
Computational workloads for genome-wide association studies (GWAS) are growing in scale and complexity outpacing the capabilities of single-threaded software designed for personal computers. The BlueSNP R package implements GWAS statistical tests in the R programming language and executes the calculations across computer clusters configured with Apache Hadoop, a de facto standard framework for distributed data processing using the MapReduce formalism. BlueSNP makes computationally intensive analyses, such as estimating empirical p-values via data permutation, and searching for expression quantitative trait loci over thousands of genes, feasible for large genotype-phenotype datasets. http://github.com/ibm-bioinformatics/bluesnp
PExFInS: An Integrative Post-GWAS Explorer for Functional Indels and SNPs
Cheng, Zhongshan; Chu, Hin; Fan, Yanhui; Li, Cun; Song, You-Qiang; Zhou, Jie; Yuen, Kwok-Yung
2015-01-01
Expression quantitative trait loci (eQTLs) mapping and linkage disequilibrium (LD) analysis have been widely employed to interpret findings of genome-wide association studies (GWAS). With the availability of deep sequencing data of 423 lymphoblastoid cell lines (LCLs) from six global populations and the microarray expression data, we performed eQTL analysis, identified more than 228 K SNP cis-eQTLs and 21 K indel cis-eQTLs and generated a LCL cis-eQTL database. We demonstrate that the percentages of population-shared and population-specific cis-eQTLs are comparable; while indel cis-eQTLs in the population-specific subsection make more contribution to gene expression variations than those in the population-shared subsection. We found cis-eQTLs, especially the population-shared cis-eQTLs are significantly enriched toward transcription start site. Moreover, the National Human Genome Research Institute cataloged GWAS SNPs are enriched for LCL cis-eQTLs. Specifically, 32.8% GWAS SNPs are LCL cis-eQTLs, among which 12.5% can be tagged by indel cis-eQTLs, suggesting the fundamental contribution of indel cis-eQTLs to GWAS association signals. To search for functional indels and SNPs tagging GWAS SNPs, a pipeline Post-GWAS Explorer for Functional Indels and SNPs (PExFInS) has been developed, integrating LD analysis, functional annotation from public databases, cis-eQTL mapping with our LCL cis-eQTL database and other published cis-eQTL datasets. PMID:26612672
Song, Minsun; Wheeler, William; Caporaso, Neil E; Landi, Maria Teresa; Chatterjee, Nilanjan
2018-03-01
Genome-wide association studies (GWAS) are now routinely imputed for untyped single nucleotide polymorphisms (SNPs) based on various powerful statistical algorithms for imputation trained on reference datasets. The use of predicted allele counts for imputed SNPs as the dosage variable is known to produce valid score test for genetic association. In this paper, we investigate how to best handle imputed SNPs in various modern complex tests for genetic associations incorporating gene-environment interactions. We focus on case-control association studies where inference for an underlying logistic regression model can be performed using alternative methods that rely on varying degree on an assumption of gene-environment independence in the underlying population. As increasingly large-scale GWAS are being performed through consortia effort where it is preferable to share only summary-level information across studies, we also describe simple mechanisms for implementing score tests based on standard meta-analysis of "one-step" maximum-likelihood estimates across studies. Applications of the methods in simulation studies and a dataset from GWAS of lung cancer illustrate ability of the proposed methods to maintain type-I error rates for the underlying testing procedures. For analysis of imputed SNPs, similar to typed SNPs, the retrospective methods can lead to considerable efficiency gain for modeling of gene-environment interactions under the assumption of gene-environment independence. Methods are made available for public use through CGEN R software package. © 2017 WILEY PERIODICALS, INC.
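The dosage variable mentioned above is the expected count of the coded allele given the imputed genotype probabilities. A minimal sketch, assuming the imputation output supplies the heterozygote and coded-allele homozygote probabilities per individual; the function name and inputs are hypothetical and this is not the CGEN interface.

```python
def dosage(p_het, p_hom_alt):
    """Expected coded-allele count: 1 * P(heterozygote) + 2 * P(homozygote)."""
    if p_het < 0 or p_hom_alt < 0 or p_het + p_hom_alt > 1.0 + 1e-9:
        raise ValueError("genotype probabilities must be non-negative and sum to at most 1")
    return p_het + 2.0 * p_hom_alt

# Hypothetical imputed genotype probabilities for three individuals.
for probs in [(0.0, 0.0), (0.3, 0.6), (0.0, 1.0)]:
    print(probs, dosage(*probs))
```

The dosage ranges from 0 to 2 and replaces the hard genotype call as the covariate in a regression-based score test.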
TEAM: efficient two-locus epistasis tests in human genome-wide association study.
Zhang, Xiang; Huang, Shunping; Zou, Fei; Wang, Wei
2010-06-15
As a promising tool for identifying genetic markers underlying phenotypic differences, genome-wide association study (GWAS) has been extensively investigated in recent years. In GWAS, detecting epistasis (or gene-gene interaction) is preferable over single locus study since many diseases are known to be complex traits. A brute force search is infeasible for epistasis detection in the genome-wide scale because of the intensive computational burden. Existing epistasis detection algorithms are designed for dataset consisting of homozygous markers and small sample size. In human study, however, the genotype may be heterozygous, and number of individuals can be up to thousands. Thus, existing methods are not readily applicable to human datasets. In this article, we propose an efficient algorithm, TEAM, which significantly speeds up epistasis detection for human GWAS. Our algorithm is exhaustive, i.e. it does not ignore any epistatic interaction. Utilizing the minimum spanning tree structure, the algorithm incrementally updates the contingency tables for epistatic tests without scanning all individuals. Our algorithm has broader applicability and is more efficient than existing methods for large sample study. It supports any statistical test that is based on contingency tables, and enables both family-wise error rate and false discovery rate controlling. Extensive experiments show that our algorithm only needs to examine a small portion of the individuals to update the contingency tables, and it achieves at least an order of magnitude speed up over the brute force approach.
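A contingency-table-based two-locus test can be sketched as follows: individuals are counted by joint genotype at the two SNPs and by phenotype, and a Pearson chi-square statistic is computed from the table. This is the brute-force version that rescans all individuals for each SNP pair; it does not implement TEAM's minimum-spanning-tree incremental update. Genotypes are assumed to be coded 0/1/2 and phenotypes 0/1, and the function names are hypothetical.

```python
from collections import Counter

def two_locus_table(snp_a, snp_b, phenotype):
    """Count individuals per (genotype_a, genotype_b, phenotype) cell."""
    return Counter(zip(snp_a, snp_b, phenotype))

def chi_square(table):
    """Pearson chi-square of joint genotype (up to 9 classes) versus phenotype."""
    genotypes = {(a, b) for a, b, _ in table}
    total = sum(table.values())
    # Column totals: number of controls (0) and cases (1).
    col = {y: sum(c for (_, _, yy), c in table.items() if yy == y) for y in (0, 1)}
    stat = 0.0
    for a, b in genotypes:
        row = table[(a, b, 0)] + table[(a, b, 1)]
        for y in (0, 1):
            expected = row * col[y] / total
            if expected > 0:
                stat += (table[(a, b, y)] - expected) ** 2 / expected
    return stat

# Hypothetical data: SNP A tracks the phenotype perfectly, SNP B is constant.
table = two_locus_table([0, 0, 1, 1], [0, 0, 0, 0], [0, 0, 1, 1])
print(chi_square(table))
```

Because every SNP pair needs its own table, the cost of this direct approach scales with the number of pairs times the number of individuals, which is the burden that incremental table updates aim to reduce.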
Fan, Qianrui; Wang, Wenyu; Hao, Jingcan; He, Awen; Wen, Yan; Guo, Xiong; Wu, Cuiyan; Ning, Yujie; Wang, Xi; Wang, Sen; Zhang, Feng
2017-08-01
Neuroticism is a fundamental personality trait with a significant genetic determinant. To identify novel susceptibility genes for neuroticism, we conducted an integrative analysis of genomic and transcriptomic data of genome wide association study (GWAS) and expression quantitative trait locus (eQTL) study. GWAS summary data were derived from published studies of neuroticism, involving a total of 170,906 subjects. An eQTL dataset containing 927,753 eQTLs was obtained from an eQTL meta-analysis of 5311 samples. Integrative analysis of GWAS and eQTL data was conducted by summary data-based Mendelian randomization (SMR) analysis software. To identify neuroticism associated gene sets, the SMR analysis results were further subjected to gene set enrichment analysis (GSEA). The gene set annotation dataset (containing 13,311 annotated gene sets) of GSEA Molecular Signatures Database was used. SMR single gene analysis identified 6 significant genes for neuroticism, including MSRA (p value=2.27×10−10), MGC57346 (p value=6.92×10−7), BLK (p value=1.01×10−6), XKR6 (p value=1.11×10−6), C17ORF69 (p value=1.12×10−6) and KIAA1267 (p value=4.00×10−6). Gene set enrichment analysis observed significant association for Chr8p23 gene set (false discovery rate=0.033). Our results provide novel clues for the genetic mechanism studies of neuroticism. Copyright © 2017. Published by Elsevier Inc.
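The SMR step can be sketched for a single gene as follows: for the top cis-eQTL SNP, the effect of expression on the trait is estimated as the ratio of the SNP's GWAS effect to its eQTL effect, and the test statistic combines the two z-scores and is compared with a chi-square distribution with one degree of freedom. This is a minimal sketch of the commonly described SMR statistic, not the SMR software itself; the function name and the input values are hypothetical.

```python
import math

def smr_test(b_gwas, se_gwas, b_eqtl, se_eqtl):
    """Return (effect of expression on trait, SMR chi-square statistic, p-value)."""
    z_gwas = b_gwas / se_gwas
    z_eqtl = b_eqtl / se_eqtl
    b_xy = b_gwas / b_eqtl  # ratio estimate of the expression-to-trait effect
    t_smr = (z_gwas ** 2 * z_eqtl ** 2) / (z_gwas ** 2 + z_eqtl ** 2)
    p = math.erfc(math.sqrt(t_smr / 2.0))  # upper tail of chi-square with 1 d.f.
    return b_xy, t_smr, p

# Hypothetical summary statistics for the top cis-eQTL of one gene.
print(smr_test(b_gwas=0.05, se_gwas=0.01, b_eqtl=0.5, se_eqtl=0.05))
```

The statistic is bounded above by the smaller of the two squared z-scores, so a gene can only reach significance when both the GWAS and the eQTL signal at that SNP are strong.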
Kar, Siddhartha P; Tyrer, Jonathan P; Li, Qiyuan; Lawrenson, Kate; Aben, Katja K H; Anton-Culver, Hoda; Antonenkova, Natalia; Chenevix-Trench, Georgia; Baker, Helen; Bandera, Elisa V; Bean, Yukie T; Beckmann, Matthias W; Berchuck, Andrew; Bisogna, Maria; Bjørge, Line; Bogdanova, Natalia; Brinton, Louise; Brooks-Wilson, Angela; Butzow, Ralf; Campbell, Ian; Carty, Karen; Chang-Claude, Jenny; Chen, Yian Ann; Chen, Zhihua; Cook, Linda S; Cramer, Daniel; Cunningham, Julie M; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Easton, Douglas F; Edwards, Robert P; Ekici, Arif B; Fasching, Peter A; Fridley, Brooke L; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G; Glasspool, Rosalind; Goode, Ellen L; Goodman, Marc T; Grownwald, Jacek; Harrington, Patricia; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A T; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus K; Hosono, Satoyo; Iversen, Edwin S; Jakubowska, Anna; Paul, James; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Kjaer, Susanne K; Kelemen, Linda E; Kellar, Melissa; Kelley, Joseph; Kiemeney, Lambertus A; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D; Lee, Alice W; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R; McNeish, Iain A; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B; Narod, Steven A; Nedergaard, Lotte; Ness, Roberta B; Nevanlinna, Heli; Odunsi, Kunle; Olson, Sara H; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Pearce, Celeste Leigh; Pejovic, Tanja; Pelttari, Liisa M; Permuth-Wey, Jennifer; Phelan, Catherine M; Pike, Malcolm C; Poole, Elizabeth M; Ramus, Susan J; Risch, Harvey A; Rosen, Barry; Rossing, Mary Anne; Rothstein, Joseph H; Rudolph, Anja; 
Runnebaum, Ingo B; Rzepecka, Iwona K; Salvesen, Helga B; Schildkraut, Joellen M; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C; Sucheston-Campbell, Lara E; Tangen, Ingvild L; Teo, Soo-Hwang; Terry, Kathryn L; Thompson, Pamela J; Timorek, Agnieszka; Tsai, Ya-Yu; Tworoger, Shelley S; van Altena, Anne M; Van Nieuwenhuysen, Els; Vergote, Ignace; Vierkant, Robert A; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S; Wicklund, Kristine G; Wilkens, Lynne R; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Sellers, Thomas A; Monteiro, Alvaro N A; Freedman, Matthew L; Gayther, Simon A; Pharoah, Paul D P
2015-10-01
Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified by coexpression may also be enriched for additional EOC risk associations. We selected TF genes within 1 Mb of the top signal at the 12 genome-wide significant risk loci. Mutual information, a form of correlation, was used to build networks of genes strongly coexpressed with each selected TF gene in the unified microarray dataset of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this dataset were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). Gene set enrichment analysis identified six networks centered on TF genes (HOXB2, HOXB5, HOXB6, HOXB7 at 17q21.32 and HOXD1, HOXD3 at 2q31) that were significantly enriched for genes from the risk-associated end of the ranked list (P < 0.05 and FDR < 0.05). These results were replicated (P < 0.05) using an independent association study (7,035 cases/21,693 controls). Genes underlying enrichment in the six networks were pooled into a combined network. We identified a HOX-centric network associated with serous EOC risk containing several genes with known or emerging roles in serous EOC development. Network analysis integrating large, context-specific datasets has the potential to offer mechanistic insights into cancer susceptibility and prioritize genes for experimental characterization. ©2015 American Association for Cancer Research.
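The coexpression measure mentioned above, mutual information, can be illustrated with a simple plug-in estimate on discretized expression values. This is only a sketch: the abstract does not state which estimator or binning scheme was used, so the equal-width binning, the number of bins, and the function names below are assumptions.

```python
import math
from collections import Counter

def discretize(values, n_bins=3):
    """Equal-width binning of continuous expression values into n_bins levels."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def mutual_information(x, y):
    """Plug-in mutual information (in bits) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi
```

In a network-building setting, a score like this would be computed between the TF gene and every other gene across tumors, and the highest-scoring genes retained; the cut-off is a separate design choice not covered here.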
GWATCH: a web platform for automated gene association discovery analysis.
Svitin, Anton; Malov, Sergey; Cherkasov, Nikolay; Geerts, Paul; Rotkevich, Mikhail; Dobrynin, Pavel; Shevchenko, Andrey; Guan, Li; Troyer, Jennifer; Hendrickson, Sher; Dilks, Holli Hutcheson; Oleksyk, Taras K; Donfield, Sharyne; Gomperts, Edward; Jabs, Douglas A; Sezgin, Efe; Van Natta, Mark; Harrigan, P Richard; Brumme, Zabrina L; O'Brien, Stephen J
2014-01-01
As genome-wide sequence analyses for complex human disease determinants are expanding, it is increasingly necessary to develop strategies to promote discovery and validation of potential disease-gene associations. Here we present a dynamic web-based platform - GWATCH - that automates and facilitates four steps in genetic epidemiological discovery: 1) Rapid gene association search and discovery analysis of large genome-wide datasets; 2) Expanded visual display of gene associations for genome-wide variants (SNPs, indels, CNVs), including Manhattan plots, 2D and 3D snapshots of any gene region, and a dynamic genome browser illustrating gene association chromosomal regions; 3) Real-time validation/replication of candidate or putative genes suggested from other sources, limiting Bonferroni genome-wide association study (GWAS) penalties; 4) Open data release and sharing by eliminating privacy constraints (The National Human Genome Research Institute (NHGRI) Institutional Review Board (IRB), informed consent, The Health Insurance Portability and Accountability Act (HIPAA) of 1996 etc.) on unabridged results, which allows for open access comparative and meta-analysis. GWATCH is suitable for both GWAS and whole genome sequence association datasets. We illustrate the utility of GWATCH with three large genome-wide association studies for HIV-AIDS resistance genes screened in large multicenter cohorts; however, association datasets from any study can be uploaded and analyzed by GWATCH.
He, Hao; Zhang, Lei; Li, Jian; Wang, Yu-Ping; Zhang, Ji-Gang; Shen, Jie; Guo, Yan-Fang
2014-01-01
Context: To date, few systems genetics studies in the bone field have been performed. We designed our study from a systems-level perspective by integrating genome-wide association studies (GWASs), the human protein-protein interaction (PPI) network, and gene expression to identify gene modules contributing to osteoporosis risk. Methods: First we searched for modules significantly enriched with bone mineral density (BMD)-associated genes in the human PPI network by using 2 large meta-analysis GWAS datasets through a dense module search algorithm. One included 7 individual GWAS samples (Meta7). The other was from the Genetic Factors for Osteoporosis Consortium (GEFOS2). One was assigned as a discovery dataset and the other as an evaluation dataset, and vice versa. Results: In total, 42 modules and 129 modules were identified as significant in both the Meta7 and GEFOS2 datasets for femoral neck and spine BMD, respectively. There were 3340 modules identified for hip BMD only in Meta7. As candidate modules, they were assessed for biological relevance to BMD by gene set enrichment analysis in 2 expression profiles generated from circulating monocytes in subjects with low versus high BMD values. Interestingly, there were 2 modules significantly enriched in monocytes from the low BMD group in both gene expression datasets (nominal P value <.05). The 2 modules contained 16 nonredundant genes. Functional enrichment analysis revealed that both modules were enriched for genes involved in Wnt receptor signaling and osteoblast differentiation. Conclusion: We highlighted 2 modules and novel genes playing important roles in the regulation of bone mass, providing important clues for therapeutic approaches for osteoporosis. PMID:25119315
Demirci, F. Yesim; Wang, Xingbin; Kelly, Jennifer A.; Morris, David L.; Barmada, M. Michael; Feingold, Eleanor; Kao, Amy H.; Sivils, Kathy L.; Bernatsky, Sasha; Pineau, Christian; Clarke, Ann; Ramsey-Goldman, Rosalind; Vyse, Timothy J.; Gaffney, Patrick M.; Manzi, Susan; Kamboh, M. Ilyas
2016-01-01
Objective: Genome-wide association studies (GWASs) in individuals of European ancestry identified a number of systemic lupus erythematosus (SLE) susceptibility loci using earlier versions of high-density genotyping platforms. Follow-up studies on suggestive GWAS regions using larger samples and more markers identified additional SLE loci in European-descent subjects. Here we report the results of a multi-stage study that we performed to identify novel SLE loci. Methods: In Stage 1, we conducted a new GWAS of SLE in a North American case-control sample of European ancestry (n=1,166) genotyped on Affymetrix Genome-Wide Human SNP Array 6.0. In Stage 2, we further investigated top new suggestive GWAS hits by in silico evaluation and meta-analysis using an additional dataset of European-descent subjects (>2,500 individuals), followed by replication of top meta-analysis findings in another dataset of European-descent subjects (>10,000 individuals) in Stage 3. Results: As expected, our GWAS revealed most significant associations at the major histocompatibility complex locus (6p21), which easily surpassed genome-wide significance threshold (P<5×10−8). Several other SLE signals/loci previously implicated in Caucasians and/or Asians were also supported in Stage 1 discovery sample and strongest signals were observed at 2q32/STAT4 (P=3.6×10−7) and at 8p23/BLK (P=8.1×10−6). Stage 2 meta-analyses identified a new genome-wide significant SLE locus at 12q12 (meta P=3.1×10−8), which was replicated in Stage 3. Conclusion: Our multi-stage study identified and replicated a new SLE locus that warrants further follow-up in additional studies. Publicly available databases suggest that this new SLE signal falls within a functionally relevant genomic region and near biologically important genes. PMID:26316170
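As an illustration of how per-study results can be pooled and compared against the genome-wide threshold quoted above, here is a minimal sketch of an inverse-variance weighted fixed-effect meta-analysis. The abstract does not say which meta-analysis model was used, so the fixed-effect choice, the normal approximation, and the function name are assumptions.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # significance threshold quoted in the abstract

def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis.

    betas: per-study effect estimates (e.g. log odds ratios for one SNP)
    ses:   the corresponding standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled_beta / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value under a standard normal
    return pooled_beta, pooled_se, z, p
```

A SNP would be called genome-wide significant under this sketch when the returned `p` falls below `GENOME_WIDE_ALPHA`.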
The relationship between the human genome and microbiome comes into view
Goodrich, Julia K.; Davenport, Emily R.; Clark, Andrew G.; Ley, Ruth E.
2017-01-01
The microbiome’s involvement in health and disease, and the complexity of its composition and function, make it intriguing to consider human genetic factors that impact microbiome composition. Genes may influence health through their ability to promote a stable microbial community in the gut. Studies of heritability yield a consistent subset of microbes that are impacted by genes, but the use of genome-wide association studies (GWAS) to identify specific genetic variants associated with microbiota phenotypes has proven challenging. Processing microbiome datasets into traits to be modeled and reducing the burden of multiple testing are just some of the technical hurdles in microbiome GWAS. Studies to date are small by GWAS standards, making cross-study comparisons and validations particularly important in identifying authentic signals. Cross-study comparisons are hampered by differences in analytical approaches. Nevertheless, some consistent associations have emerged between populations, most notably between Bifidobacteria and the lactase non-persister genotype. These early successes open the way for the microbiome to be incorporated into studies that quantify interactions among genotype, environment, and the microbiome for predicting disease susceptibility. PMID:28934590
An algorithm for direct causal learning of influences on patient outcomes.
Rathnam, Chandramouli; Lee, Sanghoon; Jiang, Xia
2017-01-01
This study aims at developing and introducing a new algorithm, called direct causal learner (DCL), for learning the direct causal influences of a single target. We applied it to both simulated and real clinical and genome wide association study (GWAS) datasets and compared its performance to classic causal learning algorithms. The DCL algorithm learns the causes of a single target from passive data using Bayesian-scoring, instead of using independence checks, and a novel deletion algorithm. We generate 14,400 simulated datasets and measure the number of datasets for which DCL correctly and partially predicts the direct causes. We then compare its performance with the constraint-based path consistency (PC) and conservative PC (CPC) algorithms, the Bayesian-score based fast greedy search (FGS) algorithm, and the partial ancestral graphs algorithm fast causal inference (FCI). In addition, we extend our comparison of all five algorithms to both a real GWAS dataset and real breast cancer datasets over various time-points in order to observe how effective they are at predicting the causal influences of Alzheimer's disease and breast cancer survival. DCL consistently outperforms FGS, PC, CPC, and FCI in discovering the parents of the target for the datasets simulated using a simple network. Overall, DCL predicts significantly more datasets correctly (McNemar's test significance: p<0.0001) than any of the other algorithms for these network types. For example, when assessing overall performance (simple and complex network results combined), DCL correctly predicts approximately 1400 more datasets than the top FGS method, 1600 more datasets than the top CPC method, 4500 more datasets than the top PC method, and 5600 more datasets than the top FCI method. 
Although FGS did correctly predict more datasets than DCL for the complex networks, and DCL correctly predicted only a few more datasets than CPC for these networks, there is no significant difference in performance between these three algorithms for this network type. However, when we use a more continuous measure of accuracy, we find that all the DCL methods are able to better partially predict more direct causes than FGS and CPC for the complex networks. In addition, DCL consistently had faster runtimes than the other algorithms. In the application to the real datasets, DCL identified rs6784615, located on the NISCH gene, and rs10824310, located on the PRKG1 gene, as direct causes of late onset Alzheimer's disease (LOAD) development. In addition, DCL identified ER category as a direct predictor of breast cancer mortality within 5 years, and HER2 status as a direct predictor of 10-year breast cancer mortality. These predictors have been identified in previous studies to have a direct causal relationship with their respective phenotypes, supporting the predictive power of DCL. When the other algorithms discovered predictors from the real datasets, these predictors were either also found by DCL or could not be supported by previous studies. Our results show that DCL outperforms FGS, PC, CPC, and FCI in almost every case, demonstrating its potential to advance causal learning. Furthermore, our DCL algorithm effectively identifies direct causes in the LOAD and Metabric GWAS datasets, which indicates its potential for clinical applications. Copyright © 2016 Elsevier B.V. All rights reserved.
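The paired comparison of two algorithms over the same simulated datasets, as in the McNemar's test reported above, depends only on the discordant datasets (those that one algorithm predicts correctly and the other does not). A minimal sketch of the exact two-sided version follows; whether the authors used the exact or the chi-square form is not stated, and the function name is hypothetical.

```python
from math import comb

def mcnemar_exact(b, c):
    """Exact two-sided McNemar test from discordant pair counts.

    b: datasets algorithm A predicted correctly and algorithm B did not
    c: datasets algorithm B predicted correctly and algorithm A did not
    Under the null hypothesis, each discordant dataset favors A or B with probability 1/2.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Datasets on which both algorithms succeed, or both fail, do not enter the statistic at all.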
Multi-Criteria Decision Making Approaches for Quality Control of Genome-Wide Association Studies
Malovini, Alberto; Rognoni, Carla; Puca, Annibale; Bellazzi, Riccardo
2009-01-01
Experimental errors in the genotyping phases of a Genome-Wide Association Study (GWAS) can lead to false positive findings and to spurious associations. An appropriate quality control phase could minimize the effects of this kind of error. Several filtering criteria can be used to perform quality control. Currently, no formal methods have been proposed for taking into account these criteria and the experimenter’s preferences at the same time. In this paper we propose two strategies for setting appropriate genotyping rate thresholds for GWAS quality control. These two approaches are based on Multi-Criteria Decision Making theory. We have applied our method to a real dataset composed of 734 individuals affected by Arterial Hypertension (AH) and 486 nonagenarians without a history of AH. The proposed strategies appear to deal with GWAS quality control in a sound way, as they rationalize and make explicit the experimenter’s choices, thus providing more reproducible results. PMID:21347174
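For readers unfamiliar with the quantity being thresholded, the sketch below shows a plain genotyping-rate filter. It does not implement the multi-criteria decision making strategies proposed in the paper; it only shows where a chosen threshold would be applied. The data layout (missing calls stored as `None`), the SNP identifiers, and the function names are assumptions.

```python
def genotyping_rate(genotypes):
    """Fraction of non-missing calls; missing calls are represented as None."""
    return sum(g is not None for g in genotypes) / len(genotypes)

def filter_by_call_rate(snp_calls, threshold):
    """Keep SNPs whose genotyping rate is at least `threshold`.

    snp_calls: dict mapping a SNP identifier to its list of genotype calls
    """
    return {snp: calls for snp, calls in snp_calls.items()
            if genotyping_rate(calls) >= threshold}

# Hypothetical toy data: "snp1" has one missing call out of four.
calls = {"snp1": [0, 1, None, 2], "snp2": [0, 1, 2, 2]}
kept = filter_by_call_rate(calls, threshold=0.9)
```

Raising the threshold removes more markers and lowering it admits noisier ones, which is the trade-off the proposed strategies are meant to make explicit.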
Gelernter, Joel; Sherva, Richard; Koesterer, Ryan; Almasy, Laura; Zhao, Hongyu; Kranzler, Henry R.; Farrer, Lindsay
2013-01-01
We report a GWAS for cocaine dependence (CD) in three sets of African- and European-American subjects (AAs and EAs, respectively), to identify pathways, genes, and alleles important in CD risk. The discovery GWAS dataset (n=5,697 subjects) was genotyped using the Illumina OmniQuad microarray (890,000 analyzed SNPs). Additional genotypes were imputed based on the 1000 Genomes reference panel. Top-ranked findings were evaluated by incorporating information from publicly available GWAS data from 4,063 subjects. Then, the most significant GWAS SNPs were genotyped in 2,549 independent subjects. We observed one genomewide-significant (GWS) result: rs7086629 at the FAM53B (“family with sequence similarity 53, member B”) locus. This was supported in both AAs and EAs; p-value (meta-analysis of all samples) =4.28×10−8. The gene maps to the same chromosomal region as the maximum peak we observed in a previous linkage study. NCOR2 (nuclear receptor corepressor 2) SNP rs150954431 was associated with p=1.19×10−9 in the EA discovery sample. SNP rs2456778, which maps to CDK1 (“cyclin-dependent kinase 1”), was associated with cocaine-induced paranoia in AAs in the discovery sample only (p=4.68×10−8). This is the first study to identify risk variants for CD using GWAS. Our results implicate novel risk loci and provide insights into potential therapeutic and prevention strategies. PMID:23958962
Bernatsky, S; Easton, D F; Dunning, A; Michailidou, K; Ramsey-Goldman, R; Gordon, C; Clarke, A E; Foulkes, W
2012-07-01
Recent work has demonstrated an important decrease in breast cancers for women with systemic lupus erythematosus (SLE). The reason behind this phenomenon is unknown. Our purpose was to explore whether the single nucleotide polymorphisms (SNPs) predisposing to SLE might be protective against breast cancer (in women in the general population). We focused on loci relevant to 10 SNPs associated with SLE (with a p value of <10−9). We determined whether we could establish a decreased frequency of these SNPs in breast cancer cases versus controls, within the general population. To do this we used a large breast cancer genome-wide association study (GWAS) dataset, involving 3,659 breast cancer cases and 4,897 controls. These subjects were all primarily of European ancestry. The population-based GWAS breast cancer data we examined suggested little evidence for important associations between breast cancer and SLE-related SNPs. Within the general population GWAS data, a cytosine (C) nucleotide substitution at rs9888739 (on chromosome 16p11.2) showed a very weak inverse association with breast cancer. The odds ratio (OR) for the rs9888739-C allele was 0.907551 (p value 0.049899) in the GWAS breast cancer sample, compared to controls. There was a slightly stronger, positive, association with breast cancer for rs6445975-G (guanine) on chromosome 3p14.3, with a breast cancer OR of 1.0911 (p value 0.0097). Within this large breast cancer dataset, we did not demonstrate important associations with 10 lupus-associated SNPs. If decreased breast cancer risk in SLE is influenced by genetic profiles, this may be due to complex interactions and/or epigenetic factors.
A Population Genetic Signal of Polygenic Adaptation
Berg, Jeremy J.; Coop, Graham
2014-01-01
Adaptation in response to selection on polygenic phenotypes may occur via subtle allele frequency shifts at many loci. Current population genomic techniques are not well posed to identify such signals. In the past decade, detailed knowledge about the specific loci underlying polygenic traits has begun to emerge from genome-wide association studies (GWAS). Here we combine this knowledge from GWAS with robust population genetic modeling to identify traits that may have been influenced by local adaptation. We exploit the fact that GWAS provide an estimate of the additive effect size of many loci to estimate the mean additive genetic value for a given phenotype across many populations as simple weighted sums of allele frequencies. We use a general model of neutral genetic value drift for an arbitrary number of populations with an arbitrary relatedness structure. Based on this model, we develop methods for detecting unusually strong correlations between genetic values and specific environmental variables, as well as a generalization of QST/FST comparisons to test for over-dispersion of genetic values among populations. Finally we lay out a framework to identify the individual populations or groups of populations that contribute to the signal of overdispersion. These tests have considerably greater power than their single locus equivalents due to the fact that they look for positive covariance between like effect alleles, and also significantly outperform methods that do not account for population structure. We apply our tests to the Human Genome Diversity Panel (HGDP) dataset using GWAS data for height, skin pigmentation, type 2 diabetes, body mass index, and two inflammatory bowel disease datasets. This analysis uncovers a number of putative signals of local adaptation, and we discuss the biological interpretation and caveats of these results. PMID:25102153
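The "simple weighted sums of allele frequencies" described above can be written out directly. The sketch below estimates a mean additive genetic value per population from GWAS effect sizes and population allele frequencies. The factor of 2 (for diploid individuals), the data layout, and the function name are assumptions made for illustration, and the neutral-drift model and the tests built on it are not shown.

```python
def mean_genetic_values(effect_sizes, allele_freqs):
    """Estimated mean additive genetic value for each population.

    effect_sizes: GWAS additive effect estimates, one per locus
    allele_freqs: dict mapping population -> effect-allele frequencies,
                  listed in the same locus order as effect_sizes
    Returns dict mapping population -> 2 * sum over loci of (effect * frequency).
    """
    return {pop: 2.0 * sum(a * p for a, p in zip(effect_sizes, freqs))
            for pop, freqs in allele_freqs.items()}

# Hypothetical two-locus example with two populations.
values = mean_genetic_values(
    [0.5, -0.25],
    {"pop_A": [0.5, 0.5], "pop_B": [1.0, 0.0]},
)
```

Differences among the resulting per-population values are what the tests described in the abstract compare against the expectation under neutral drift.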
Hyde, Craig L.; Nagle, Mike W.; Tian, Chao; Chen, Xing; Paciga, Sara A.; Wendland, Jens R.; Tung, Joyce; Hinds, David A.; Perlis, Roy H.; Winslow, Ashley R.
2016-01-01
Despite strong evidence supporting the heritability of Major Depressive Disorder, previous genome-wide studies were unable to identify risk loci among individuals of European descent. We used self-reported data from 75,607 individuals reporting clinical diagnosis of depression and 231,747 reporting no history of depression through 23andMe, and meta-analyzed these results with published MDD GWAS results. We identified five independent variants from four regions associated with self-report of clinical diagnosis or treatment for depression. Loci with pval<1.0×10−5 in the meta-analysis were further analyzed in a replication dataset (45,773 cases and 106,354 controls) from 23andMe. A total of 17 independent SNPs from 15 regions reached genome-wide significance after joint-analysis over all three datasets. Some of these loci were also implicated in GWAS of related psychiatric traits. These studies provide evidence for large-scale consumer genomic data as a powerful and efficient complement to traditional means of ascertainment for neuropsychiatric disease genomics. PMID:27479909
Walsh, Kyle M; Anderson, Erik; Hansen, Helen M; Decker, Paul A; Kosel, Matt L; Kollmeyer, Thomas; Rice, Terri; Zheng, Shichun; Xiao, Yuanyuan; Chang, Jeffrey S; McCoy, Lucie S; Bracci, Paige M; Wiemels, Joe L; Pico, Alexander R; Smirnov, Ivan; Lachance, Daniel H; Sicotte, Hugues; Eckel-Passow, Jeanette E; Wiencke, John K; Jenkins, Robert B; Wrensch, Margaret R
2013-02-01
Genomewide association studies (GWAS) and candidate-gene studies have implicated single-nucleotide polymorphisms (SNPs) in at least 45 different genes as putative glioma risk factors. Attempts to validate these associations have yielded variable results and few genetic risk factors have been consistently replicated. We conducted a case-control study of Caucasian glioma cases and controls from the University of California San Francisco (810 cases, 512 controls) and the Mayo Clinic (852 cases, 789 controls) in an attempt to replicate previously reported genetic risk factors for glioma. Sixty SNPs selected from the literature (eight from GWAS and 52 from candidate-gene studies) were successfully genotyped on an Illumina custom genotyping panel. Eight SNPs in/near seven different genes (TERT, EGFR, CCDC26, CDKN2A, PHLDB1, RTEL1, TP53) were significantly associated with glioma risk in the combined dataset (P < 0.05), with all associations in the same direction as in previous reports. Several SNP associations showed considerable differences across histologic subtype. All eight successfully replicated associations were first identified by GWAS, whereas none of the putative risk SNPs from candidate-gene studies was associated in the full case-control sample (all P values > 0.05). Although several confirmed associations are located near genes long known to be involved in gliomagenesis (e.g., EGFR, CDKN2A, TP53), these associations were first discovered by the GWAS approach and are in noncoding regions. These results highlight that the deficiencies of the candidate-gene approach lie in selecting both appropriate genes and relevant SNPs within these genes. © 2012 WILEY PERIODICALS, INC.
Hu, Ting; Darabos, Christian; Cricco, Maria E.; Kong, Emily; Moore, Jason H.
2014-01-01
The large volume of GWAS data poses great computational challenges for analyzing genetic interactions associated with common human diseases. We propose a computational framework for characterizing epistatic interactions among large sets of genetic attributes in GWAS data. We build the human phenotype network (HPN) and focus on a disease of interest. In this study, we use the GLAUGEN glaucoma GWAS dataset and apply the HPN as a biological knowledge-based filter to prioritize genetic variants. Then, we use the statistical epistasis network (SEN) to identify a significant connected network of pairwise epistatic interactions among the prioritized SNPs. These clearly highlight the complex genetic basis of glaucoma. Furthermore, we identify key SNPs by quantifying structural network characteristics. Through functional annotation of these key SNPs using Biofilter, a software tool that accesses multiple publicly available human genetic data sources, we find supporting biomedical evidence linking glaucoma to an array of genetic diseases, which supports our concept. We conclude by suggesting hypotheses for a better understanding of the disease. PMID:25592582
Ascertainment bias from imputation methods evaluation in wheat.
Brandariz, Sofía P; González Reymúndez, Agustín; Lado, Bettina; Malosetti, Marcos; Garcia, Antonio Augusto Franco; Quincke, Martín; von Zitzewitz, Jarislav; Castro, Marina; Matus, Iván; Del Pozo, Alejandro; Castro, Ariel J; Gutiérrez, Lucía
2016-10-04
Whole-genome genotyping techniques like Genotyping-by-sequencing (GBS) are being used for genetic studies such as Genome-Wide Association (GWAS) and Genomewide Selection (GS), where different strategies for imputation have been developed. Nevertheless, imputation error may lead to poor performance (i.e. lower power or a higher false positive rate) in analyses such as GWAS, where complete data are not required and markers are tested one at a time. The aim of this study was to compare the performance of GWAS analysis for Quantitative Trait Loci (QTL) of major and minor effect using different imputation methods when no reference panel is available in a wheat GBS panel. In this study, we compared the power and false positive rate of dissecting quantitative traits for imputed and not-imputed marker score matrices in: (1) a complete molecular marker barley panel array, and (2) a GBS wheat panel with missing data. We found that there is an ascertainment bias in imputation method comparisons. Simulating over a complete matrix and creating missing data at random showed that imputation methods have poorer performance. Furthermore, we found that when QTL were simulated with imputed data, the imputation methods performed better than the not-imputed ones. On the other hand, when QTL were simulated with not-imputed data, the not-imputed method and one of the imputation methods performed better for dissecting quantitative traits. Moreover, larger differences between imputation methods were detected for QTL of major effect than for QTL of minor effect. We also compared the different marker score matrices for GWAS analysis in a real wheat phenotype dataset, and we found minimal differences, indicating that imputation did not improve GWAS performance when a reference panel was not available. In a wheat GBS panel with no reference panel available, GWAS analysis performed more poorly when an imputed marker score matrix was used.
Pathway Analysis in Attention Deficit Hyperactivity Disorder: An Ensemble Approach
Mooney, Michael A.; McWeeney, Shannon K.; Faraone, Stephen V.; Hinney, Anke; Hebebrand, Johannes; Nigg, Joel T.; Wilmot, Beth
2016-01-01
Despite a wealth of evidence for the role of genetics in attention deficit hyperactivity disorder (ADHD), specific and definitive genetic mechanisms have not been identified. Pathway analyses, a subset of gene-set analyses, extend the knowledge gained from genome-wide association studies (GWAS) by providing functional context for genetic associations. However, there are numerous methods for association testing of gene sets and no real consensus regarding the best approach. The present study applied six pathway analysis methods to identify pathways associated with ADHD in two GWAS datasets from the Psychiatric Genomics Consortium. Methods that utilize genotypes to model pathway-level effects identified more replicable pathway associations than methods using summary statistics. In addition, pathways implicated by more than one method were significantly more likely to replicate. A number of brain-relevant pathways, such as RhoA signaling, glycosaminoglycan biosynthesis, fibroblast growth factor receptor activity, and pathways containing potassium channel genes, were nominally significant by multiple methods in both datasets. These results support previous hypotheses about the role of regulation of neurotransmitter release, neurite outgrowth and axon guidance in contributing to the ADHD phenotype and suggest the value of cross-method convergence in evaluating pathway analysis results. PMID:27004716
A genome-wide association study of anorexia nervosa
Boraska, Vesna; Franklin, Christopher S; Floyd, James AB; Thornton, Laura M; Huckins, Laura M; Southam, Lorraine; Rayner, N William; Tachmazidou, Ioanna; Klump, Kelly L; Treasure, Janet; Lewis, Cathryn M; Schmidt, Ulrike; Tozzi, Federica; Kiezebrink, Kirsty; Hebebrand, Johannes; Gorwood, Philip; Adan, Roger AH; Kas, Martien JH; Favaro, Angela; Santonastaso, Paolo; Fernández-Aranda, Fernando; Gratacos, Monica; Rybakowski, Filip; Dmitrzak-Weglarz, Monika; Kaprio, Jaakko; Keski-Rahkonen, Anna; Raevuori, Anu; Van Furth, Eric F; Landt, Margarita CT Slof-Op t; Hudson, James I; Reichborn-Kjennerud, Ted; Knudsen, Gun Peggy S; Monteleone, Palmiero; Kaplan, Allan S; Karwautz, Andreas; Hakonarson, Hakon; Berrettini, Wade H; Guo, Yiran; Li, Dong; Schork, Nicholas J.; Komaki, Gen; Ando, Tetsuya; Inoko, Hidetoshi; Esko, Tõnu; Fischer, Krista; Männik, Katrin; Metspalu, Andres; Baker, Jessica H; Cone, Roger D; Dackor, Jennifer; DeSocio, Janiece E; Hilliard, Christopher E; O'Toole, Julie K; Pantel, Jacques; Szatkiewicz, Jin P; Taico, Chrysecolla; Zerwas, Stephanie; Trace, Sara E; Davis, Oliver SP; Helder, Sietske; Bühren, Katharina; Burghardt, Roland; de Zwaan, Martina; Egberts, Karin; Ehrlich, Stefan; Herpertz-Dahlmann, Beate; Herzog, Wolfgang; Imgart, Hartmut; Scherag, André; Scherag, Susann; Zipfel, Stephan; Boni, Claudette; Ramoz, Nicolas; Versini, Audrey; Brandys, Marek K; Danner, Unna N; de Kovel, Carolien; Hendriks, Judith; Koeleman, Bobby PC; Ophoff, Roel A; Strengman, Eric; van Elburg, Annemarie A; Bruson, Alice; Clementi, Maurizio; Degortes, Daniela; Forzan, Monica; Tenconi, Elena; Docampo, Elisa; Escaramís, Geòrgia; Jiménez-Murcia, Susana; Lissowska, Jolanta; Rajewski, Andrzej; Szeszenia-Dabrowska, Neonila; Slopien, Agnieszka; Hauser, Joanna; Karhunen, Leila; Meulenbelt, Ingrid; Slagboom, P Eline; Tortorella, Alfonso; Maj, Mario; Dedoussis, George; Dikeos, Dimitris; Gonidakis, Fragiskos; Tziouvas, Konstantinos; Tsitsika, Artemis; Papezova, Hana; Slachtova, Lenka; 
Martaskova, Debora; Kennedy, James L.; Levitan, Robert D.; Yilmaz, Zeynep; Huemer, Julia; Koubek, Doris; Merl, Elisabeth; Wagner, Gudrun; Lichtenstein, Paul; Breen, Gerome; Cohen-Woods, Sarah; Farmer, Anne; McGuffin, Peter; Cichon, Sven; Giegling, Ina; Herms, Stefan; Rujescu, Dan; Schreiber, Stefan; Wichmann, H-Erich; Dina, Christian; Sladek, Rob; Gambaro, Giovanni; Soranzo, Nicole; Julia, Antonio; Marsal, Sara; Rabionet, Raquel; Gaborieau, Valerie; Dick, Danielle M; Palotie, Aarno; Ripatti, Samuli; Widén, Elisabeth; Andreassen, Ole A; Espeseth, Thomas; Lundervold, Astri; Reinvang, Ivar; Steen, Vidar M; Le Hellard, Stephanie; Mattingsdal, Morten; Ntalla, Ioanna; Bencko, Vladimir; Foretova, Lenka; Janout, Vladimir; Navratilova, Marie; Gallinger, Steven; Pinto, Dalila; Scherer, Stephen; Aschauer, Harald; Carlberg, Laura; Schosser, Alexandra; Alfredsson, Lars; Ding, Bo; Klareskog, Lars; Padyukov, Leonid; Finan, Chris; Kalsi, Gursharan; Roberts, Marion; Logan, Darren W; Peltonen, Leena; Ritchie, Graham RS; Barrett, Jeffrey C; Estivill, Xavier; Hinney, Anke; Sullivan, Patrick F; Collier, David A; Zeggini, Eleftheria; Bulik, Cynthia M
2015-01-01
Anorexia nervosa (AN) is a complex and heritable eating disorder characterized by dangerously low body weight. Neither candidate gene studies nor an initial genome-wide association study (GWAS) have yielded significant and replicated results. We performed a GWAS in 2,907 cases with AN from 14 countries (15 sites) and 14,860 ancestrally matched controls as part of the Genetic Consortium for AN (GCAN) and the Wellcome Trust Case Control Consortium 3 (WTCCC3). Individual association analyses were conducted in each stratum and meta-analyzed across all 15 discovery datasets. Seventy-six (72 independent) SNPs were taken forward for in silico (two datasets) or de novo (13 datasets) replication genotyping in 2,677 independent AN cases and 8,629 European ancestry controls along with 458 AN cases and 421 controls from Japan. The final global meta-analysis across discovery and replication datasets comprised 5,551 AN cases and 21,080 controls. AN subtype analyses (1,606 AN restricting; 1,445 AN binge-purge) were performed. No findings reached genome-wide significance. Two intronic variants were suggestively associated: rs9839776 (P=3.01×10^-7) in SOX2OT and rs17030795 (P=5.84×10^-6) in PPP3CA. Two additional signals were specific to Europeans: rs1523921 (P=5.76×10^-6) between CUL3 and FAM124B and rs1886797 (P=8.05×10^-6) near SPATA13. Comparing discovery to replication results, 76% of the effects were in the same direction, an observation highly unlikely to be due to chance (P=4×10^-6), strongly suggesting that true findings exist but that our sample, the largest yet reported, was underpowered for their detection. The accrual of large genotyped AN case-control samples should be an immediate priority for the field. PMID:24514567
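The direction-of-effect argument in this abstract can be illustrated with a one-sided binomial sign test. The sketch below is only an illustration under stated assumptions: the abstract does not name the test that was used, and the count of 55 concordant SNPs is back-calculated here from "76%" of the 72 independent SNPs. The function name `sign_test_p` is hypothetical.

```python
from math import comb

def sign_test_p(n_snps, n_same_direction):
    """One-sided binomial sign test.

    Probability of seeing at least `n_same_direction` concordant effects
    among `n_snps` independent SNPs if each direction were a fair coin flip.
    """
    tail = sum(comb(n_snps, k) for k in range(n_same_direction, n_snps + 1))
    return tail / 2 ** n_snps

# Assumption: 76% of 72 independent SNPs is roughly 55 concordant SNPs.
p_value = sign_test_p(72, 55)
print(f"one-sided sign-test P = {p_value:.2e}")
```

Under these assumptions the tail probability is on the order of 10^-6, which is the scale of the P value quoted in the abstract.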
Pathway-Based Genome-Wide Association Studies for Two Meat Production Traits in Simmental Cattle.
Fan, Huizhong; Wu, Yang; Zhou, Xiaojing; Xia, Jiangwei; Zhang, Wengang; Song, Yuxin; Liu, Fei; Chen, Yan; Zhang, Lupei; Gao, Xue; Gao, Huijiang; Li, Junya
2015-12-17
Most single nucleotide polymorphisms (SNPs) detected by genome-wide association studies (GWAS) explain only a small fraction of phenotypic variation. Pathway-based GWAS were proposed to improve the proportion of genes that could be explained for some human complex traits by enriching large numbers of SNPs within genetic groups. However, few attempts have been made to apply them to quantitative traits in domestic animals. In this study, we used a dataset with approximately 7,700,000 SNPs from 807 Simmental cattle and analyzed live weight and longissimus muscle area using a modified pathway-based GWAS method that orthogonalises the highly linked SNPs within each gene using principal component analysis (PCA). Of the 262 biological pathways of cattle collected from the KEGG database, the gamma aminobutyric acid (GABA)ergic synapse pathway and the non-alcoholic fatty liver disease (NAFLD) pathway were significantly associated with the two traits analyzed. The GABAergic synapse pathway was biologically relevant to the traits analyzed because of its roles in feed intake and weight gain. The proposed method had high statistical power and a low false discovery rate compared to the smallest P-value and SNP set enrichment analysis methods.
Zhu, Zhaozhong; Anttila, Verneri; Smoller, Jordan W; Lee, Phil H
2018-01-01
Recent advances in genome-wide association studies (GWAS) suggest that pleiotropic effects on human complex traits are widespread. A number of classic and recent meta-analysis methods have been used to identify genetic loci with pleiotropic effects, but the overall performance of these methods is not well understood. In this work, we use extensive simulations and case studies of GWAS datasets to investigate the power and type-I error rates of ten meta-analysis methods. We specifically focus on three conditions commonly encountered in the studies of multiple traits: (1) extensive heterogeneity of genetic effects; (2) characterization of trait-specific association; and (3) inflated correlation of GWAS due to overlapping samples. Although the statistical power is highly variable under distinct study conditions, we found that several methods had superior power under diverse heterogeneity. In particular, the classic fixed-effects model showed surprisingly good performance when a variant is associated with more than half of the study traits. As the number of traits with null effects increases, ASSET performed best, with competitive specificity and sensitivity. With opposite directional effects, CPASSOC showed first-rate power. However, caution is advised when using CPASSOC for studying genetically correlated traits with overlapping samples. We conclude with a discussion of unresolved issues and directions for future research.
Second-generation PLINK: rising to the challenge of larger and richer datasets.
Chang, Christopher C; Chow, Carson C; Tellier, Laurent Cam; Vattikuti, Shashaank; Purcell, Shaun M; Lee, James J
2015-01-01
PLINK 1 is a widely used open-source C/C++ toolset for genome-wide association studies (GWAS) and research in population genetics. However, the steady accumulation of data from imputation and whole-genome sequencing studies has exposed a strong need for faster and scalable implementations of key functions, such as logistic regression, linkage disequilibrium estimation, and genomic distance evaluation. In addition, GWAS and population-genetic data now frequently contain genotype likelihoods, phase information, and/or multiallelic variants, none of which can be represented by PLINK 1's primary data format. To address these issues, we are developing a second-generation codebase for PLINK. The first major release from this codebase, PLINK 1.9, introduces extensive use of bit-level parallelism, [Formula: see text]-time/constant-space Hardy-Weinberg equilibrium and Fisher's exact tests, and many other algorithmic improvements. In combination, these changes accelerate most operations by 1-4 orders of magnitude, and allow the program to handle datasets too large to fit in RAM. We have also developed an extension to the data format which adds low-overhead support for genotype likelihoods, phase, multiallelic variants, and reference vs. alternate alleles, which is the basis of our planned second release (PLINK 2.0). The second-generation versions of PLINK will offer dramatic improvements in performance and compatibility. For the first time, users without access to high-end computing resources can perform several essential analyses of the feature-rich and very large genetic datasets coming into use.
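The idea of bit-level parallelism mentioned above can be sketched as follows: many genotypes are packed into one machine word, and allele counts are obtained with a few mask and popcount operations instead of a per-sample loop. This is an illustrative sketch only. The 2-bit encoding used here (0, 1, 2 = alternate-allele count, 3 = missing) and the function names are assumptions made for the example; they are not PLINK's file format or its C/C++ implementation.

```python
def pack_genotypes(genotypes):
    """Pack genotypes (0, 1, 2 alternate alleles, or None for missing)
    into one integer, two bits per sample (hypothetical encoding)."""
    word = 0
    for i, g in enumerate(genotypes):
        code = 3 if g is None else g
        word |= code << (2 * i)
    return word

def count_alt_alleles(word, n_samples):
    """Count alternate alleles and missing calls using masks and popcounts."""
    low_mask = int("01" * n_samples, 2)   # selects the low bit of every 2-bit slot
    low = word & low_mask                 # low bit set: heterozygous or missing
    high = (word >> 1) & low_mask         # high bit set: homozygous alt or missing
    n_missing = bin(low & high).count("1")
    # Each missing slot contributed 1 (low) + 2 (high) = 3 spurious alleles.
    n_alt = bin(low).count("1") + 2 * bin(high).count("1") - 3 * n_missing
    return n_alt, n_missing

packed = pack_genotypes([0, 1, 2, None, 2, 1])
alt_count, missing_count = count_alt_alleles(packed, 6)
print(alt_count, missing_count)
```

The work per word is constant regardless of how many samples it holds, which is the general reason packed representations speed up allele counting.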
Deciphering Signaling Pathway Networks to Understand the Molecular Mechanisms of Metformin Action
Sun, Jingchun; Zhao, Min; Jia, Peilin; Wang, Lily; Wu, Yonghui; Iverson, Carissa; Zhou, Yubo; Bowton, Erica; Roden, Dan M.; Denny, Joshua C.; Aldrich, Melinda C.; Xu, Hua; Zhao, Zhongming
2015-01-01
A drug exerts its effects typically through a signal transduction cascade, which is non-linear and involves intertwined networks of multiple signaling pathways. Construction of such a signaling pathway network (SPNetwork) can enable identification of novel drug targets and deep understanding of drug action. However, it is challenging to synopsize critical components of these interwoven pathways into one network. To tackle this issue, we developed a novel computational framework, the Drug-specific Signaling Pathway Network (DSPathNet). The DSPathNet amalgamates the prior drug knowledge and drug-induced gene expression via random walk algorithms. Using the drug metformin, we illustrated this framework and obtained one metformin-specific SPNetwork containing 477 nodes and 1,366 edges. To evaluate this network, we performed the gene set enrichment analysis using the disease genes of type 2 diabetes (T2D) and cancer, one T2D genome-wide association study (GWAS) dataset, three cancer GWAS datasets, and one GWAS dataset of cancer patients with T2D on metformin. The results showed that the metformin network was significantly enriched with disease genes for both T2D and cancer, and that the network also included genes that may be associated with metformin-associated cancer survival. Furthermore, from the metformin SPNetwork and common genes to T2D and cancer, we generated a subnetwork to highlight the molecule crosstalk between T2D and cancer. The follow-up network analyses and literature mining revealed that seven genes (CDKN1A, ESR1, MAX, MYC, PPARGC1A, SP1, and STK11) and one novel MYC-centered pathway with CDKN1A, SP1, and STK11 might play important roles in metformin’s antidiabetic and anticancer effects. Some results are supported by previous studies. 
In summary, our study 1) develops a novel framework to construct drug-specific signal transduction networks; 2) provides insights into the molecular mode of metformin; and 3) serves as a model for exploring signaling pathways to facilitate understanding of drug action, disease pathogenesis, and identification of drug targets. PMID:26083494
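The abstract says that DSPathNet combines prior knowledge and expression data "via random walk algorithms" without giving details. As a rough illustration of that family of methods, the sketch below implements a generic random walk with restart on a toy undirected graph. It should not be read as the DSPathNet procedure: the graph, the node labels, the restart probability of 0.3, and the function name are all assumptions made for the example.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-12, max_iter=10000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    `adjacency` maps each node to a list of neighbours (every node is assumed
    to have at least one); `seeds` are the nodes the walker restarts from.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            # Spread the walking share of u's probability evenly over its neighbours.
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p

# Toy path graph A - B - C - D with a single seed node A (hypothetical labels).
toy_graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = random_walk_with_restart(toy_graph, seeds={"A"})
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 4))
```

Nodes closer to the seed receive higher steady-state scores, which is how such walks rank candidate network members relative to known starting genes.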
iPat: intelligent prediction and association tool for genomic research.
Chen, Chunpeng James; Zhang, Zhiwu
2018-06-01
The ultimate goal of genomic research is to effectively predict phenotypes from genotypes so that medical management can improve human health and molecular breeding can increase agricultural production. Genomic prediction or selection (GS) plays a complementary role to genome-wide association studies (GWAS), which are the primary method to identify genes underlying phenotypes. Unfortunately, most computing tools cannot perform data analyses for both GWAS and GS. Furthermore, the majority of these tools are executed through a command-line interface (CLI), which requires programming skills. Non-programmers struggle to use them efficiently because of the steep learning curves and zero tolerance for data formats and mistakes when inputting keywords and parameters. To address these problems, this study developed a software package, named the Intelligent Prediction and Association Tool (iPat), with a user-friendly graphical user interface. With iPat, GWAS or GS can be performed using a pointing device to simply drag and/or click on graphical elements to specify input data files, choose input parameters and select analytical models. Models available to users include those implemented in third party CLI packages such as GAPIT, PLINK, FarmCPU, BLINK, rrBLUP and BGLR. Users can choose any data format and conduct analyses with any of these packages. File conversions are automatically conducted for specified input data and selected packages. A GWAS-assisted genomic prediction method was implemented to perform genomic prediction using any GWAS method such as FarmCPU. iPat was written in Java for adaptation to multiple operating systems including Windows, Mac and Linux. The iPat executable file, user manual, tutorials and example datasets are freely available at http://zzlab.net/iPat.
Wang, W; Huang, S; Hou, W; Liu, Y; Fan, Q; He, A; Wen, Y; Hao, J; Guo, X; Zhang, F
2017-10-01
Several genome-wide association studies (GWAS) of bone mineral density (BMD) have successfully identified multiple susceptibility genes, yet isolated susceptibility genes are often difficult to interpret biologically. The aim of this study was to unravel the genetic background of BMD at pathway level, by integrating BMD GWAS data with genome-wide expression quantitative trait loci (eQTLs) and methylation quantitative trait loci (meQTLs) data. We employed the GWAS datasets of BMD from the Genetic Factors for Osteoporosis Consortium (GEFOS), analysing patients' BMD. The areas studied included 32 735 femoral necks, 28 498 lumbar spines, and 8143 forearms. Genome-wide eQTLs (containing 923 021 eQTLs) and meQTLs (containing 683 152 unique methylation sites with local meQTLs) data sets were collected from recently published studies. Gene scores were first calculated by summary data-based Mendelian randomisation (SMR) software and meQTL-aligned GWAS results. Gene set enrichment analysis (GSEA) was then applied to identify BMD-associated gene sets with a predefined significance level of 0.05. We identified multiple gene sets associated with BMD in one or more regions, including relevant known biological gene sets such as the Reactome Circadian Clock (GSEA p-value = 1.0 × 10^-4 for lumbar spines and 2.7 × 10^-2 for femoral necks BMD in eQTLs-based GSEA) and insulin-like growth factor receptor binding (GSEA p-value = 5.0 × 10^-4 for femoral necks and 2.6 × 10^-2 for lumbar spines BMD in meQTLs-based GSEA). Our results provided novel clues for subsequent functional analysis of bone metabolism, and illustrated the benefit of integrating eQTLs and meQTLs data into pathway association analysis for genetic studies of complex human diseases. Cite this article: W. Wang, S. Huang, W. Hou, Y. Liu, Q. Fan, A. He, Y. Wen, J. Hao, X. Guo, F. Zhang. Integrative analysis of GWAS, eQTLs and meQTLs data suggests that multiple gene sets are associated with bone mineral density. 
Bone Joint Res 2017;6:572-576. © 2017 Wang et al.
Genome-wide association study yields variants at 20p12.2 that associate with urinary bladder cancer.
Rafnar, Thorunn; Sulem, Patrick; Thorleifsson, Gudmar; Vermeulen, Sita H; Helgason, Hannes; Saemundsdottir, Jona; Gudjonsson, Sigurjon A; Sigurdsson, Asgeir; Stacey, Simon N; Gudmundsson, Julius; Johannsdottir, Hrefna; Alexiusdottir, Kristin; Petursdottir, Vigdis; Nikulasson, Sigfus; Geirsson, Gudmundur; Jonsson, Thorvaldur; Aben, Katja K H; Grotenhuis, Anne J; Verhaegh, Gerald W; Dudek, Aleksandra M; Witjes, J Alfred; van der Heijden, Antoine G; Vrieling, Alina; Galesloot, Tessel E; De Juan, Ana; Panadero, Angeles; Rivera, Fernando; Hurst, Carolyn; Bishop, D Timothy; Sak, Sei C; Choudhury, Ananya; Teo, Mark T W; Arici, Cecilia; Carta, Angela; Toninelli, Elena; de Verdier, Petra; Rudnai, Peter; Gurzau, Eugene; Koppova, Kvetoslava; van der Keur, Kirstin A; Lurkin, Irene; Goossens, Mieke; Kellen, Eliane; Guarrera, Simonetta; Russo, Alessia; Critelli, Rossana; Sacerdote, Carlotta; Vineis, Paolo; Krucker, Clémentine; Zeegers, Maurice P; Gerullis, Holger; Ovsiannikov, Daniel; Volkert, Frank; Hengstler, Jan G; Selinski, Silvia; Magnusson, Olafur T; Masson, Gisli; Kong, Augustine; Gudbjartsson, Daniel; Lindblom, Annika; Zwarthoff, Ellen; Porru, Stefano; Golka, Klaus; Buntinx, Frank; Matullo, Giuseppe; Kumar, Rajiv; Mayordomo, José I; Steineck, D Gunnar; Kiltie, Anne E; Jonsson, Eirikur; Radvanyi, François; Knowles, Margaret A; Thorsteinsdottir, Unnur; Kiemeney, Lambertus A; Stefansson, Kari
2014-10-15
Genome-wide association studies (GWAS) of urinary bladder cancer (UBC) have yielded common variants at 12 loci that associate with risk of the disease. We report here the results of a GWAS of UBC including 1670 UBC cases and 90 180 controls, followed by replication analysis in an additional 5266 UBC cases and 10 456 controls. We tested a dataset containing 34.2 million variants, generated by imputation based on whole-genome sequencing of 2230 Icelanders. Several correlated variants at 20p12, represented by rs62185668, show genome-wide significant association with UBC after combining discovery and replication results (OR = 1.19, P = 1.5 × 10^-11 for rs62185668-A, minor allele frequency = 23.6%). The variants are located in a non-coding region approximately 300 kb upstream from the JAG1 gene, an important component of the Notch signaling pathways that may be oncogenic or tumor suppressive in several forms of cancer. Our results add to the growing number of UBC risk variants discovered through GWAS. © The Author 2014. Published by Oxford University Press.
Yang, Cheng-Hong; Chuang, Li-Yeh; Lin, Yu-Da
2017-08-01
Detecting epistatic interactions in genome-wide association studies (GWAS) is a computational challenge. The huge number of single-nucleotide polymorphism (SNP) combinations prevents some powerful algorithms from being applied to detect potential epistasis in large-scale SNP datasets. We propose a new algorithm, termed DECMDR, which combines the differential evolution (DE) algorithm with classification-based multifactor-dimensionality reduction (CMDR). DECMDR uses CMDR as a fitness measure to evaluate candidate solutions in the DE process when scanning for potential statistical epistasis in GWAS. The results indicated that DECMDR outperforms existing algorithms in terms of detection success rate on large simulated datasets and on real data obtained from the Wellcome Trust Case Control Consortium. In the running-time comparison, DECMDR could efficiently apply CMDR to detect significant associations between cases and controls amongst all possible SNP combinations in GWAS. DECMDR is freely available at https://goo.gl/p9sLuJ. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press.
A comprehensive survey of genetic variation in 20,691 subjects from four large cohorts
Loomis, Stephanie; Turman, Constance; Huang, Hongyan; Huang, Jinyan; Aschard, Hugues; Chan, Andrew T.; Choi, Hyon; Cornelis, Marilyn; Curhan, Gary; De Vivo, Immaculata; Eliassen, A. Heather; Fuchs, Charles; Gaziano, Michael; Hankinson, Susan E.; Hu, Frank; Jensen, Majken; Kang, Jae H.; Kabrhel, Christopher; Liang, Liming; Pasquale, Louis R.; Rimm, Eric; Stampfer, Meir J.; Tamimi, Rulla M.; Tworoger, Shelley S.; Wiggs, Janey L.; Hunter, David J.; Kraft, Peter
2017-01-01
The Nurses’ Health Study (NHS), Nurses’ Health Study II (NHSII), Health Professionals Follow Up Study (HPFS) and the Physicians’ Health Study (PHS) have collected detailed longitudinal data on multiple exposures and traits for approximately 310,000 study participants over the last 35 years. Over 160,000 study participants across the cohorts have donated a DNA sample and to date, 20,691 subjects have been genotyped as part of genome-wide association studies (GWAS) of twelve primary outcomes. However, these studies utilized six different GWAS arrays, making it difficult to conduct analyses of secondary phenotypes or share controls across studies. To allow for secondary analyses of these data, we have created three new datasets merged by platform family and performed imputation using a common reference panel, the 1,000 Genomes Phase I release. Here, we describe the methodology behind the data merging and imputation and present imputation quality statistics and association results from two GWAS of secondary phenotypes (body mass index (BMI) and venous thromboembolism (VTE)). We observed the strongest BMI association for the FTO SNP rs55872725 (β = 0.45, p = 3.48×10^-22), and using a significance level of p = 0.05, we replicated 19 out of 32 known BMI SNPs. For VTE, we observed the strongest association for the rs2040445 SNP (OR = 2.17, 95% CI: 1.79–2.63, p = 2.70×10^-15), located downstream of F5, and also observed significant associations for the known ABO and F11 regions. This pooled resource can be used to maximize power in GWAS of phenotypes collected across the cohorts and for studying gene-environment interactions as well as rare phenotypes and genotypes. PMID:28301549
Chang, Lun-Ching; Jamain, Stephane; Lin, Chien-Wei; Rujescu, Dan; Tseng, George C; Sibille, Etienne
2014-01-01
Large scale gene expression (transcriptome) analysis and genome-wide association studies (GWAS) for single nucleotide polymorphisms have generated a considerable amount of gene- and disease-related information, but heterogeneity and various sources of noise have limited the discovery of disease mechanisms. As systematic dataset integration is becoming essential, we developed methods and performed meta-clustering of gene coexpression links in 11 transcriptome studies from postmortem brains of human subjects with major depressive disorder (MDD) and non-psychiatric control subjects. We next sought enrichment in the top 50 meta-analyzed coexpression modules for genes otherwise identified by GWAS for various sets of disorders. One coexpression module of 88 genes was consistently and significantly associated with GWAS for MDD, other neuropsychiatric disorders and brain functions, and for medical illnesses with elevated clinical risk of depression, but not for other diseases. In support of the superior discriminative power of this novel approach, we observed no significant enrichment for GWAS-related genes in coexpression modules extracted from single studies or in meta-modules using gene expression data from non-psychiatric control subjects. Genes in the identified module encode proteins implicated in neuronal signaling and structure, including glutamate metabotropic receptors (GRM1, GRM7), GABA receptors (GABRA2, GABRA4), and neurotrophic and development-related proteins [BDNF, reelin (RELN), Ephrin receptors (EPHA3, EPHA5)]. These results are consistent with the current understanding of molecular mechanisms of MDD and provide a set of putative interacting molecular partners, potentially reflecting components of a functional module across cells and biological pathways that are synchronously recruited in MDD, other brain disorders and MDD-related illnesses. 
Collectively, this study demonstrates the importance of integrating transcriptome data, gene coexpression modules and GWAS results for providing novel and complementary approaches to investigate the molecular pathology of MDD and other complex brain disorders.
GWAS in a Box: Statistical and Visual Analytics of Structured Associations via GenAMap
Xing, Eric P.; Curtis, Ross E.; Schoenherr, Georg; Lee, Seunghak; Yin, Junming; Puniyani, Kriti; Wu, Wei; Kinnaird, Peter
2014-01-01
With the continuous improvement in genotyping and molecular phenotyping technology and the decreasing typing cost, it is expected that in a few years, more and more clinical studies of complex diseases will recruit thousands of individuals for pan-omic genetic association analyses. Hence, there is a great need for algorithms and software tools that could scale up to the whole omic level, integrate different omic data, leverage rich structure information, and be easily accessible to non-technical users. We present GenAMap, an interactive analytics software platform that 1) automates the execution of principled machine learning methods that detect genome- and phenome-wide associations among genotypes, gene expression data, and clinical or other macroscopic traits, and 2) provides new visualization tools specifically designed to aid in the exploration of association mapping results. Algorithmically, GenAMap is based on a new paradigm for GWAS and PheWAS analysis, termed structured association mapping, which leverages various structures in the omic data. We demonstrate the function of GenAMap via a case study of the Brem and Kruglyak yeast dataset, and then apply it on a comprehensive eQTL analysis of the NIH heterogeneous stock mice dataset and report some interesting findings. GenAMap is available from http://sailing.cs.cmu.edu/genamap. PMID:24905018
Pathway-Based Kernel Boosting for the Analysis of Genome-Wide Association Studies
Friedrichs, Stefanie; Manitz, Juliane; Burger, Patricia; Amos, Christopher I; Risch, Angela; Chang-Claude, Jenny; Wichmann, Heinz-Erich; Kneib, Thomas; Bickeböller, Heike; Hofner, Benjamin
2017-01-01
The analysis of genome-wide association studies (GWAS) benefits from the investigation of biologically meaningful gene sets, such as gene-interaction networks (pathways). We propose an extension to a successful kernel-based pathway analysis approach by integrating kernel functions into a powerful algorithmic framework for variable selection, to enable investigation of multiple pathways simultaneously. We employ genetic similarity kernels from the logistic kernel machine test (LKMT) as base-learners in a boosting algorithm. A model to explain case-control status is created iteratively by selecting pathways that improve its prediction ability. We evaluated our method in simulation studies adopting 50 pathways for different sample sizes and genetic effect strengths. Additionally, we included an exemplary application of kernel boosting to a rheumatoid arthritis and a lung cancer dataset. Simulations indicate that kernel boosting outperforms the LKMT in certain genetic scenarios. Applications to GWAS data on rheumatoid arthritis and lung cancer resulted in sparse models which were based on pathways interpretable in a clinical sense. Kernel boosting is highly flexible in terms of considered variables and overcomes the problem of multiple testing. Additionally, it enables the prediction of clinical outcomes. Thus, kernel boosting constitutes a new, powerful tool in the analysis of GWAS data and towards the understanding of biological processes involved in disease susceptibility. PMID:28785300
Genetic variants in the PIWI-piRNA pathway gene DCP1A predict melanoma disease-specific survival.
Zhang, Weikang; Liu, Hongliang; Yin, Jieyun; Wu, Wenting; Zhu, Dakai; Amos, Christopher I; Fang, Shenying; Lee, Jeffrey E; Li, Yi; Han, Jiali; Wei, Qingyi
2016-12-15
The Piwi-piRNA pathway is important for germ cell maintenance, genome integrity, DNA methylation and retrotransposon control and thus may be involved in cancer development. In this study, we comprehensively analyzed prognostic roles of 3,116 common SNPs in PIWI-piRNA pathway genes in melanoma disease-specific survival. A published genome-wide association study (GWAS) by The University of Texas M.D. Anderson Cancer Center was used to identify associated SNPs, which were later validated by another GWAS from the Harvard Nurses' Health Study and Health Professionals Follow-up Study. After multiple testing correction, we found that there were 27 common SNPs in two genes (PIWIL4 and DCP1A) with false discovery rate < 0.2 in the discovery dataset. Three tagSNPs (i.e., rs7933369 and rs508485 in PIWIL4; rs11551405 in DCP1A) were replicated. The rs11551405 A allele, located at the 3' UTR microRNA binding site of DCP1A, was associated with an increased risk of melanoma disease-specific death in both the discovery dataset [adjusted hazard ratio (HR) = 1.66, 95% confidence interval (CI) = 1.21-2.27, p = 1.50 × 10^-3] and the validation dataset (HR = 1.55, 95% CI = 1.03-2.34, p = 0.038), compared with the C allele, and their meta-analysis showed an HR of 1.62 (95% CI, 1.26-2.08, p = 1.55 × 10^-4). Using RNA-seq data from the 1000 Genomes Project, we found that DCP1A mRNA expression levels increased significantly with the A allele number of rs11551405. Additional large, prospective studies are needed to validate these findings. © 2016 UICC.
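The pooled hazard ratio quoted above can be approximately reproduced from the two study-level estimates. The sketch below assumes an inverse-variance fixed-effects model on the log-HR scale, with standard errors back-calculated from the reported 95% confidence intervals; the abstract does not state which meta-analysis model was used, so this is an illustration rather than the authors' procedure. Function names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% confidence interval

def log_hr_and_se(hr, ci_low, ci_high):
    """Log hazard ratio and its standard error recovered from a 95% CI."""
    return math.log(hr), (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)

def fixed_effects_meta(studies):
    """Inverse-variance weighted pooling of (HR, CI low, CI high) tuples."""
    weighted_sum = total_weight = 0.0
    for hr, ci_low, ci_high in studies:
        beta, se = log_hr_and_se(hr, ci_low, ci_high)
        weight = 1.0 / se ** 2
        weighted_sum += weight * beta
        total_weight += weight
    beta = weighted_sum / total_weight
    se = math.sqrt(1.0 / total_weight)
    p_two_sided = math.erfc(abs(beta / se) / math.sqrt(2))
    return (math.exp(beta), math.exp(beta - Z_95 * se),
            math.exp(beta + Z_95 * se), p_two_sided)

# Discovery and validation estimates for rs11551405 as given in the abstract.
pooled_hr, pooled_low, pooled_high, pooled_p = fixed_effects_meta(
    [(1.66, 1.21, 2.27), (1.55, 1.03, 2.34)]
)
print(f"HR = {pooled_hr:.2f} (95% CI {pooled_low:.2f}-{pooled_high:.2f}), p = {pooled_p:.2e}")
```

Because the inputs are rounded to two decimals, the pooled p value from this sketch agrees with the published one only approximately.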
Anand, Vibha; Rosenman, Marc B; Downs, Stephen M
2013-09-01
To develop a map of disease associations exclusively using two publicly available genetic sources: the catalog of single nucleotide polymorphisms (SNPs) from the HapMap, and the catalog of Genome Wide Association Studies (GWAS) from the NHGRI, and to evaluate it with a large, long-standing electronic medical record (EMR). A computational model, In Silico Bayesian Integration of GWAS (IsBIG), was developed to learn associations among diseases using a Bayesian network (BN) framework, using only genetic data. The IsBIG model (I-Model) was re-trained using data from our EMR (M-Model). Separately, another clinical model (C-Model) was learned from this training dataset. The I-Model was compared with both the M-Model and the C-Model for power to discriminate a disease given other diseases using a test dataset from our EMR. Area under receiver operator characteristics curve was used as a performance measure. Direct associations between diseases in the I-Model were also searched in the PubMed database and in classes of the Human Disease Network (HDN). On the basis of genetic information alone, the I-Model linked a third of diseases from our EMR. When compared to the M-Model, the I-Model predicted diseases given other diseases with 94% specificity, 33% sensitivity, and 80% positive predictive value. The I-Model contained 117 direct associations between diseases. Of those associations, 20 (17%) were absent from the searches of the PubMed database; one of these was present in the C-Model. Of the direct associations in the I-Model, 7 (35%) were absent from disease classes of HDN. Using only publicly available genetic sources we have mapped associations in GWAS to a human disease map using an in silico approach. Furthermore, we have validated this disease map using phenotypic data from our EMR. Models predicting disease associations on the basis of known genetic associations alone are specific but not sensitive. 
Genetic data, as it currently exists, can only explain a fraction of the risk of a disease. Our approach makes a quantitative statement about disease variation that can be explained in an EMR on the basis of genetic associations described in the GWAS. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
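The performance measures reported in this abstract (sensitivity, specificity, positive predictive value) follow directly from a confusion matrix. The sketch below shows the arithmetic on made-up counts chosen only to give a "specific but not sensitive" pattern; they are toy numbers, not data from the study, and the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and positive predictive value from counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positives among actual positives
        "specificity": tn / (tn + fp),   # true negatives among actual negatives
        "ppv": tp / (tp + fp),           # true positives among predicted positives
    }

# Toy counts (hypothetical, for illustration only).
toy_metrics = diagnostic_metrics(tp=40, fp=10, tn=190, fn=60)
for name, value in toy_metrics.items():
    print(f"{name}: {value:.2f}")
```

A model with few false positives but many false negatives scores high on specificity and PPV while sensitivity stays low, which is the pattern the abstract describes.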
Boucher, Gabrielle; Beauchamp, Claudine; Trynka, Gosia; Dubois, Patrick C.; Lagacé, Caroline; Stokkers, Pieter C. F.; Hommes, Daan W.; Barisani, Donatella; Palmieri, Orazio; Annese, Vito; van Heel, David A.; Weersma, Rinse K.; Daly, Mark J.; Wijmenga, Cisca; Rioux, John D.
2011-01-01
Crohn's disease (CD) and celiac disease (CelD) are chronic intestinal inflammatory diseases, involving genetic and environmental factors in their pathogenesis. The two diseases can co-occur within families, and studies suggest that CelD patients have a higher risk of developing CD than the general population. These observations suggest that CD and CelD may share common genetic risk loci. Two such shared loci, IL18RAP and PTPN2, have already been identified independently in these two diseases. The aim of our study was to explicitly identify shared risk loci for these diseases by combining results from genome-wide association study (GWAS) datasets of CD and CelD. Specifically, GWAS results from CelD (768 cases, 1,422 controls) and CD (3,230 cases, 4,829 controls) were combined in a meta-analysis. Nine independent regions had nominal association p-value <1.0×10−5 in this meta-analysis and showed evidence of association with the individual diseases in the original scans (p-value <1×10−2 in CelD and <1×10−3 in CD). These include the two previously reported shared loci, IL18RAP and PTPN2, with p-values of 3.37×10−8 and 6.39×10−9, respectively, in the meta-analysis. The other seven had not been reported as shared loci and thus were tested in additional CelD (3,149 cases and 4,714 controls) and CD (1,835 cases and 1,669 controls) cohorts. Two of these loci, TAGAP and PUS10, showed significant evidence of replication (Bonferroni corrected p-values <0.0071) in the combined CelD and CD replication cohorts and were firmly established as shared risk loci of genome-wide significance, with overall combined p-values of 1.55×10−10 and 1.38×10−11 respectively. Through a meta-analysis of GWAS data from CD and CelD, we have identified four shared risk loci: PTPN2, IL18RAP, TAGAP, and PUS10. The combined analysis of the two datasets provided the power, lacking in the individual GWAS for single diseases, to detect shared loci with a relatively small effect. PMID:21298027
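The replication threshold of 0.0071 quoted above is consistent with a Bonferroni correction over the seven loci taken forward to replication. The short sketch below reproduces that arithmetic; the family-wise error rate of 0.05 is an assumption, since the abstract does not state it.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Assumed family-wise alpha of 0.05; seven loci were tested in replication.
threshold = bonferroni_threshold(alpha=0.05, n_tests=7)
print(f"{threshold:.4f}")  # 0.0071
```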
Gene-Gene and Gene-Environment Interactions in Ulcerative Colitis
Wang, Ming-Hsi; Fiocchi, Claudio; Zhu, Xiaofeng; Ripke, Stephan; Kamboh, M. Ilyas; Rebert, Nancy; Duerr, Richard H.; Achkar, Jean-Paul
2014-01-01
Genome-wide association studies (GWAS) have identified at least 133 ulcerative colitis (UC) associated loci. The role of genetic factors in clinical practice is not clearly defined. The relevance of genetic variants to disease pathogenesis is still uncertain because gene-gene and gene-environment interactions have not been characterized. We examined the predictive value of combining the 133 UC risk loci with genetic interactions in an ongoing inflammatory bowel disease (IBD) GWAS. The Wellcome Trust Case-Control Consortium (WTCCC) IBD GWAS was used as a replication cohort. We applied logic regression (LR), a novel adaptive regression methodology, to search for high order interactions. Exploratory genotype correlations with UC sub-phenotypes (extent of disease, need of surgery, age of onset, extra-intestinal manifestations and primary sclerosing cholangitis (PSC)) were conducted. The combination of 133 UC loci yielded good UC risk predictability (area under the curve [AUC] of 0.86). A higher cumulative allele score predicted higher UC risk. Through LR, several lines of evidence for genetic interactions were identified and successfully replicated in the WTCCC cohort. The genetic interactions combined with the gene-smoking interaction significantly improved predictability in the model (AUC, from 0.86 to 0.89, P=3.26E-05). Explained UC variance increased from 37% to 42% after adding the interaction terms. A within-case analysis found a suggestive genetic association with PSC. Our study demonstrates that the LR methodology allows the identification and replication of high order genetic interactions in UC GWAS datasets. UC risk can be predicted by the 133 loci, and prediction is improved by adding gene-gene and gene-environment interactions. PMID:24241240
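To make the link between a cumulative allele score and the AUC concrete, the sketch below sums risk-allele counts per individual and computes the AUC as the probability that a randomly chosen case outscores a randomly chosen control. The unweighted score, the function names, and the toy genotypes are assumptions for illustration; the abstract does not specify how the score was weighted.

```python
def allele_score(genotypes):
    """Unweighted cumulative score: sum of risk-allele counts (0, 1 or 2 per locus)."""
    return sum(genotypes)


def auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy genotypes at three hypothetical loci.
cases = [[2, 1, 1], [1, 1, 2], [2, 2, 0]]
controls = [[0, 1, 1], [1, 0, 0], [2, 1, 1]]

case_scores = [allele_score(g) for g in cases]
control_scores = [allele_score(g) for g in controls]
print(round(auc(case_scores, control_scores), 3))
```

An AUC of 0.5 corresponds to a score that does not separate cases from controls, and 1.0 to perfect separation.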
Convergent Genetic and Expression Datasets Highlight TREM2 in Parkinson's Disease Susceptibility.
Liu, Guiyou; Liu, Yongquan; Jiang, Qinghua; Jiang, Yongshuai; Feng, Rennan; Zhang, Liangcai; Chen, Zugen; Li, Keshen; Liu, Jiafeng
2016-09-01
A rare TREM2 missense mutation (rs75932628-T) was reported to confer a significant Alzheimer's disease (AD) risk. A recent study indicated no evidence of the involvement of this variant in Parkinson's disease (PD). Here, we used the genetic and expression data to reinvestigate the potential association between TREM2 and PD susceptibility. In stage 1, using 10 independent studies (N = 89,157; 8787 cases and 80,370 controls), we conducted a subgroup meta-analysis. We identified a significant association between rs75932628 and PD (P = 3.10E-03, odds ratio (OR) = 3.88, 95 % confidence interval (CI) 1.58-9.54) in No-Northern Europe subgroup, and significantly increased PD risks (P = 0.01 for Mann-Whitney test) in No-Northern Europe subgroup than in Northern Europe subgroup. In stage 2, we used the summary results from a large-scale PD genome-wide association study (GWAS; N = 108,990; 13,708 cases and 95,282 controls) to search for other TREM2 variants contributing to PD susceptibility. We identified 14 single-nucleotide polymorphisms (SNPs) associated with PD within 50-kb upstream and downstream range of TREM2. In stage 3, using two brain expression GWAS datasets (N = 773), we identified 6 of the 14 SNPs regulating increased expression of TREM2. In stage 4, using the whole human genome microarray data (N = 50), we further identified significantly increased expression of TREM2 in PD cases compared with controls in human prefrontal cortex. In summary, convergent genetic and expression datasets demonstrate that TREM2 is a potent risk factor for PD and may be a therapeutic target in PD and other neurodegenerative diseases.
FORESEE: Fully Outsourced secuRe gEnome Study basEd on homomorphic Encryption
2015-01-01
Background The increasing availability of genome data motivates massive research studies in personalized treatment and precision medicine. Public cloud services provide a flexible way to mitigate the storage and computation burden in conducting genome-wide association studies (GWAS). However, data privacy is a widespread concern when sharing sensitive information in a cloud environment. Methods We presented a novel framework (FORESEE: Fully Outsourced secuRe gEnome Study basEd on homomorphic Encryption) to fully outsource GWAS (i.e., chi-square statistic computation) using homomorphic encryption. The proposed framework enables secure divisions over encrypted data. We introduced two division protocols (i.e., secure errorless division and secure approximation division) with a trade-off between complexity and accuracy in computing chi-square statistics. Results The proposed framework was evaluated for the task of chi-square statistic computation with two case-control datasets from the 2015 iDASH genome privacy protection challenge. Experimental results show that the performance of FORESEE can be significantly improved through algorithmic optimization and parallel computation. Remarkably, the secure approximation division provides a significant performance gain without missing any significant SNPs in the chi-square association test using the aforementioned datasets. Conclusions Unlike many existing HME-based studies, in which final results need to be computed by the data owner due to the lack of the secure division operation, the proposed FORESEE framework supports complete outsourcing to the cloud and outputs the final encrypted chi-square statistics. PMID:26733391
FORESEE: Fully Outsourced secuRe gEnome Study basEd on homomorphic Encryption.
Zhang, Yuchen; Dai, Wenrui; Jiang, Xiaoqian; Xiong, Hongkai; Wang, Shuang
2015-01-01
The increasing availability of genome data motivates massive research studies in personalized treatment and precision medicine. Public cloud services provide a flexible way to mitigate the storage and computation burden in conducting genome-wide association studies (GWAS). However, data privacy is a widespread concern when sharing sensitive information in a cloud environment. We presented a novel framework (FORESEE: Fully Outsourced secuRe gEnome Study basEd on homomorphic Encryption) to fully outsource GWAS (i.e., chi-square statistic computation) using homomorphic encryption. The proposed framework enables secure divisions over encrypted data. We introduced two division protocols (i.e., secure errorless division and secure approximation division) with a trade-off between complexity and accuracy in computing chi-square statistics. The proposed framework was evaluated for the task of chi-square statistic computation with two case-control datasets from the 2015 iDASH genome privacy protection challenge. Experimental results show that the performance of FORESEE can be significantly improved through algorithmic optimization and parallel computation. Remarkably, the secure approximation division provides a significant performance gain without missing any significant SNPs in the chi-square association test using the aforementioned datasets. Unlike many existing HME-based studies, in which final results need to be computed by the data owner due to the lack of the secure division operation, the proposed FORESEE framework supports complete outsourcing to the cloud and outputs the final encrypted chi-square statistics.
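The reason division matters for outsourcing can be seen in the plaintext form of the statistic. The sketch below computes the Pearson chi-square for a 2x2 allele-count table, keeping the numerator and denominator separate: both need only additions and multiplications, while the final step is a division. This is ordinary unencrypted Python, not the FORESEE protocol, and the function name and counts are hypothetical.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (1 degree of freedom) for the 2x2 table [[a, b], [c, d]].

    Rows could be cases/controls and columns the two alleles of a SNP.
    """
    n = a + b + c + d
    # Numerator and denominator use only additions and multiplications.
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    # The division is the step that needs a dedicated protocol on encrypted data.
    return numerator / denominator


# Hypothetical allele counts: cases (30, 70), controls (50, 50).
print(round(chi_square_2x2(30, 70, 50, 50), 3))
```

Without a secure division, a cloud service could return only the encrypted numerator and denominator, leaving the data owner to decrypt both and finish the computation.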
Polygenic risk score and heritability estimates reveals a genetic relationship between ASD and OCD.
Guo, W; Samuels, J F; Wang, Y; Cao, H; Ritter, M; Nestadt, P S; Krasnow, J; Greenberg, B D; Fyer, A J; McCracken, J T; Geller, D A; Murphy, D L; Knowles, J A; Grados, M A; Riddle, M A; Rasmussen, S A; McLaughlin, N C; Nurmi, E L; Askland, K D; Cullen, B A; Piacentini, J; Pauls, D L; Bienvenu, O J; Stewart, S E; Goes, F S; Maher, B; Pulver, A E; Valle, D; Mattheisen, M; Qian, J; Nestadt, G; Shugart, Y Y
2017-07-01
Obsessive-compulsive disorder (OCD) and autism spectrum disorder (ASD) are both highly heritable neurodevelopmental disorders that conceivably share genetic risk factors. However, the underlying genetic determinants remain largely unknown. In this work, the authors describe a combined genome-wide association study (GWAS) of ASD and OCD. The OCD dataset includes 2998 individuals in nuclear families. The ASD dataset includes 6898 individuals in case-parents trios. GWAS summary statistics were examined for potential enrichment of functional variants associated with gene expression levels in brain regions. The top ranked SNP is rs4785741 (chromosome 16) with P value=6.9×10−7 in our re-analysis. Polygenic risk score analyses were conducted to investigate the genetic relationship within and across the two disorders. These analyses identified a significant polygenic component of ASD, predicting 0.11% of the phenotypic variance in an independent OCD data set. In addition, we examined the genomic architecture of ASD and OCD by estimating heritability on different chromosomes and different allele frequencies, analyzing genome-wide common variant data by using the Genome-wide Complex Trait Analysis (GCTA) program. The estimated global heritability in these imputed data is 0.427 (se=0.093) for OCD and 0.174 (se=0.053) for ASD. Published by Elsevier B.V.
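A polygenic risk score of the kind mentioned above is commonly computed as a weighted sum of risk-allele dosages, with weights taken from a discovery GWAS. The sketch below shows that general form only; the function name, dosages, and effect sizes are hypothetical, and the SNP selection and p-value thresholds used by the authors are not reproduced.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of allele dosages (0 to 2) using per-SNP effect sizes."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(dosage * weight for dosage, weight in zip(dosages, weights))


# Hypothetical individual genotyped at three SNPs, with hypothetical
# log-odds-ratio weights from a discovery dataset.
dosages = [0, 1, 2]
weights = [0.10, -0.05, 0.20]
print(round(polygenic_risk_score(dosages, weights), 2))
```

In a cross-disorder analysis, the weights would come from one disorder (for example ASD) and the resulting scores would be tested as predictors of case status in an independent sample of the other.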
Wang, Yi-Ting; Sung, Pei-Yuan; Lin, Peng-Lin; Yu, Ya-Wen; Chung, Ren-Hua
2015-05-15
Genome-wide association studies (GWAS) have become a common approach to identifying single nucleotide polymorphisms (SNPs) associated with complex diseases. As complex diseases are caused by the joint effects of multiple genes, while the effect of an individual gene or SNP is modest, a method considering the joint effects of multiple SNPs can be more powerful than testing individual SNPs. The multi-SNP analysis aims to test association based on a SNP set, usually defined based on biological knowledge such as gene or pathway, which may contain only a portion of SNPs with effects on the disease. Therefore, a challenge for the multi-SNP analysis is how to effectively select a subset of SNPs with promising association signals from the SNP set. We developed the Optimal P-value Threshold Pedigree Disequilibrium Test (OPTPDT). The OPTPDT uses general nuclear families. A variable p-value threshold algorithm is used to determine an optimal p-value threshold for selecting a subset of SNPs. A permutation procedure is used to assess the significance of the test. We used simulations to verify that the OPTPDT has correct type I error rates. Our power studies showed that the OPTPDT can be more powerful than the set-based test in PLINK, the multi-SNP FBAT test, and the p-value based test GATES. We applied the OPTPDT to a family-based autism GWAS dataset for gene-based association analysis and identified MACROD2-AS1 with genome-wide significance (p-value=2.5×10−6). Our simulation results suggested that the OPTPDT is a valid and powerful test. The OPTPDT will be helpful for gene-based or pathway association analysis. The method is ideal for the secondary analysis of existing GWAS datasets, which may identify a set of SNPs with joint effects on the disease.
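The general idea of a variable p-value threshold combined with permutation can be sketched as follows. This is a generic illustration, not the OPTPDT itself: the pedigree disequilibrium statistic and the family-based permutation scheme are not implemented, and the inputs (per-SNP p-values and statistics for the observed data and for each permuted replicate) are assumed to be computed elsewhere. All names are hypothetical.

```python
def set_statistic(snps, threshold):
    """Sum the statistics of SNPs whose p-value passes the threshold.

    snps is a list of (p_value, statistic) pairs for one replicate.
    """
    return sum(stat for p_value, stat in snps if p_value <= threshold)


def variable_threshold_test(observed, permuted, thresholds):
    """Return (best threshold, overall permutation p-value).

    The observed data are treated as one more replicate, so the smallest
    attainable p-value is 1 / (number of permutations + 1).
    """
    replicates = [observed] + permuted
    n = len(replicates)
    stats = [[set_statistic(rep, t) for t in thresholds] for rep in replicates]

    def empirical_p(r, j):
        # Share of replicates whose statistic at threshold j is at least as large.
        return sum(1 for row in stats if row[j] >= stats[r][j]) / n

    # Minimum p-value over thresholds, evaluated for every replicate so that
    # the search over thresholds is itself accounted for.
    min_p = [min(empirical_p(r, j) for j in range(len(thresholds))) for r in range(n)]
    best = min(range(len(thresholds)), key=lambda j: empirical_p(0, j))
    overall_p = sum(1 for value in min_p if value <= min_p[0]) / n
    return thresholds[best], overall_p
```

Because each permuted replicate is also scanned over all thresholds, the overall p-value is not inflated by choosing the best threshold after looking at the data.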
Genome-wide interaction study of smoking and bladder cancer risk
Figueroa, Jonine D.; Han, Summer S.; Garcia-Closas, Montserrat; Baris, Dalsu; Jacobs, Eric J.; Kogevinas, Manolis; Schwenn, Molly; Malats, Nuria; Johnson, Alison; Purdue, Mark P.; Caporaso, Neil; Landi, Maria Teresa; Prokunina-Olsson, Ludmila; Wang, Zhaoming; Hutchinson, Amy; Burdette, Laurie; Wheeler, William; Vineis, Paolo; Siddiq, Afshan; Cortessis, Victoria K.; Kooperberg, Charles; Cussenot, Olivier; Benhamou, Simone; Prescott, Jennifer; Porru, Stefano; Bueno-de-Mesquita, H.Bas; Trichopoulos, Dimitrios; Ljungberg, Börje; Clavel-Chapelon, Françoise; Weiderpass, Elisabete; Krogh, Vittorio; Dorronsoro, Miren; Travis, Ruth; Tjønneland, Anne; Brenan, Paul; Chang-Claude, Jenny; Riboli, Elio; Conti, David; Gago-Dominguez, Manuela; Stern, Mariana C.; Pike, Malcolm C.; Van Den Berg, David; Yuan, Jian-Min; Hohensee, Chancellor; Rodabough, Rebecca; Cancel-Tassin, Geraldine; Roupret, Morgan; Comperat, Eva; Chen, Constance; De Vivo, Immaculata; Giovannucci, Edward; Hunter, David J.; Kraft, Peter; Lindstrom, Sara; Carta, Angela; Pavanello, Sofia; Arici, Cecilia; Mastrangelo, Giuseppe; Karagas, Margaret R.; Schned, Alan; Armenti, Karla R.; Hosain, G.M.Monawar; Haiman, Chris A.; Fraumeni, Joseph F.; Chanock, Stephen J.; Chatterjee, Nilanjan; Rothman, Nathaniel; Silverman, Debra T.
2014-01-01
Bladder cancer is a complex disease with known environmental and genetic risk factors. We performed a genome-wide interaction study (GWAS) of smoking and bladder cancer risk based on primary scan data from 3002 cases and 4411 controls from the National Cancer Institute Bladder Cancer GWAS. Alternative methods were used to evaluate both additive and multiplicative interactions between individual single nucleotide polymorphisms (SNPs) and smoking exposure. SNPs with interaction P values < 5 × 10−5 were evaluated further in an independent dataset of 2422 bladder cancer cases and 5751 controls. We identified 10 SNPs that showed association in a consistent manner with the initial dataset and in the combined dataset, providing evidence of interaction with tobacco use. Further, two of these novel SNPs showed strong evidence of association with bladder cancer in tobacco use subgroups that approached genome-wide significance. Specifically, rs1711973 (FOXF2) on 6p25.3 was a susceptibility SNP for never smokers [combined odds ratio (OR) = 1.34, 95% confidence interval (CI) = 1.20–1.50, P value = 5.18 × 10−7]; and rs12216499 (RSPH3-TAGAP-EZR) on 6q25.3 was a susceptibility SNP for ever smokers (combined OR = 0.75, 95% CI = 0.67–0.84, P value = 6.35 × 10−7). In our analysis of smoking and bladder cancer, the tests for multiplicative interaction seemed to more commonly identify susceptibility loci with associations in never smokers, whereas the additive interaction analysis identified more loci with associations among smokers—including the known smoking and NAT2 acetylation interaction. Our findings provide additional evidence of gene–environment interactions for tobacco and bladder cancer. PMID:24662972
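The distinction between additive and multiplicative interaction can be illustrated with two standard summary measures computed from odds ratios relative to unexposed non-carriers. The sketch below is an assumption about how such interactions are commonly quantified, not the specific tests used in the study, and the odds ratios are invented for illustration.

```python
def interaction_measures(or_gene: float, or_smoke: float, or_both: float):
    """Return (multiplicative ratio, RERI) for a gene-by-smoking interaction.

    or_gene:  odds ratio for carriers who never smoked
    or_smoke: odds ratio for non-carriers who smoked
    or_both:  odds ratio for carriers who smoked
    A ratio of 1 means no interaction on the multiplicative scale;
    a RERI of 0 means no interaction on the additive scale.
    """
    multiplicative = or_both / (or_gene * or_smoke)
    reri = or_both - or_gene - or_smoke + 1.0  # relative excess risk due to interaction
    return multiplicative, reri


# Invented odds ratios: the joint effect equals the product of the two
# separate effects, yet exceeds what their sum would predict.
ratio, reri = interaction_measures(or_gene=1.5, or_smoke=3.0, or_both=4.5)
print(ratio, reri)
```

With these values there is no interaction on the multiplicative scale but a positive interaction on the additive scale, which shows why the two kinds of test can highlight different loci.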
Bhaskar, Anand; Javanmard, Adel; Courtade, Thomas A; Tse, David
2017-03-15
Genetic variation in human populations is influenced by geographic ancestry due to spatial locality in historical mating and migration patterns. Spatial population structure in genetic datasets has been traditionally analyzed using either model-free algorithms, such as principal components analysis (PCA) and multidimensional scaling, or using explicit spatial probabilistic models of allele frequency evolution. We develop a general probabilistic model and an associated inference algorithm that unify the model-based and data-driven approaches to visualizing and inferring population structure. Our spatial inference algorithm can also be effectively applied to the problem of population stratification in genome-wide association studies (GWAS), where hidden population structure can create fictitious associations when population ancestry is correlated with both the genotype and the trait. Our algorithm Geographic Ancestry Positioning (GAP) relates local genetic distances between samples to their spatial distances, and can be used for visually discerning population structure as well as accurately inferring the spatial origin of individuals on a two-dimensional continuum. On both simulated and several real datasets from diverse human populations, GAP exhibits substantially lower error in reconstructing spatial ancestry coordinates compared to PCA. We also develop an association test that uses the ancestry coordinates inferred by GAP to accurately account for ancestry-induced correlations in GWAS. Based on simulations and analysis of a dataset of 10 metabolic traits measured in a Northern Finland cohort, which is known to exhibit significant population structure, we find that our method has superior power to current approaches. Our software is available at https://github.com/anand-bhaskar/gap. Contact: abhaskar@stanford.edu or ajavanma@usc.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved.
Metabolomic and Genome-wide Association Studies Reveal Potential Endogenous Biomarkers for OATP1B1.
Yee, S W; Giacomini, M M; Hsueh, C-H; Weitz, D; Liang, X; Goswami, S; Kinchen, J M; Coelho, A; Zur, A A; Mertsch, K; Brian, W; Kroetz, D L; Giacomini, K M
2016-11-01
Transporter-mediated drug-drug interactions (DDIs) are a major cause of drug toxicities. Using published genome-wide association studies (GWAS) of the human metabolome, we identified 20 metabolites associated with genetic variants in organic anion transporter, OATP1B1 (P < 5 × 10−8). Of these, 12 metabolites were significantly higher in plasma samples from volunteers dosed with the OATP1B1 inhibitor, cyclosporine (CSA) vs. placebo (q-value < 0.2). Conjugated bile acids and fatty acid dicarboxylates were among the metabolites discovered using both GWAS and CSA administration. In vitro studies confirmed tetradecanedioate (TDA) and hexadecanedioate (HDA) were novel substrates of OATP1B1 as well as OAT1 and OAT3. This study highlights the use of multiple datasets for the discovery of endogenous metabolites that represent potential in vivo biomarkers for transporter-mediated DDIs. Future studies are needed to determine whether these metabolites can serve as qualified biomarkers for organic anion transporters. Quantitative relationships between metabolite levels and modulation of transporters should be established. © 2016 American Society for Clinical Pharmacology and Therapeutics.
Edenberg, Howard J; Foroud, Tatiana
2014-01-01
Multiple lines of evidence strongly indicate that genetic factors contribute to the risk for alcohol use disorders (AUD). There is substantial heterogeneity in AUD, which complicates studies seeking to identify specific genetic factors. To identify these genetic effects, several different alcohol-related phenotypes have been analyzed, including diagnosis and quantitative measures related to AUDs. Study designs have used candidate gene analyses, genetic linkage studies, genomewide association studies (GWAS), and analyses of rare variants. Two genes that encode enzymes of alcohol metabolism have the strongest effect on AUD: aldehyde dehydrogenase 2 and alcohol dehydrogenase 1B each has strongly protective variants that reduce risk, with odds ratios approximately 0.2-0.4. A number of other genes important in AUD have been identified and replicated, including GABRA2 and alcohol dehydrogenases 1B and 4. GWAS have identified additional candidates. Rare variants are likely also to play a role; studies of these are just beginning. A multifaceted approach to gene identification, targeting both rare and common variations and assembling much larger datasets for meta-analyses, is critical for identifying the key genes and pathways important in AUD. © 2014 Elsevier B.V. All rights reserved.
Evangelou, Marina; Smyth, Deborah J; Fortune, Mary D; Burren, Oliver S; Walker, Neil M; Guo, Hui; Onengut-Gumuscu, Suna; Chen, Wei-Min; Concannon, Patrick; Rich, Stephen S; Todd, John A; Wallace, Chris
2014-01-01
Pathway analysis can complement point-wise single nucleotide polymorphism (SNP) analysis in exploring genomewide association study (GWAS) data to identify specific disease-associated genes that can be candidate causal genes. We propose a straightforward methodology that can be used for conducting a gene-based pathway analysis using summary GWAS statistics in combination with widely available reference genotype data. We used this method to perform a gene-based pathway analysis of a type 1 diabetes (T1D) meta-analysis GWAS (of 7,514 cases and 9,045 controls). An important feature of the conducted analysis is the removal of the major histocompatibility complex gene region, the major genetic risk factor for T1D. Thirty-one of the 1,583 (2%) tested pathways were identified to be enriched for association with T1D at a 5% false discovery rate. We analyzed these 31 pathways and their genes to identify SNPs in or near these pathway genes that showed potentially novel association with T1D and attempted to replicate the association of 22 SNPs in additional samples. Replication P-values were skewed () with 12 of the 22 SNPs showing . Support, including replication evidence, was obtained for nine T1D associated variants in genes ITGB7 (rs11170466, ), NRP1 (rs722988, ), BAD (rs694739, ), CTSB (rs1296023, ), FYN (rs11964650, ), UBE2G1 (rs9906760, ), MAP3K14 (rs17759555, ), ITGB1 (rs1557150, ), and IL7R (rs1445898, ). The proposed methodology can be applied to other GWAS datasets for which only summary level data are available. PMID:25371288
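Selecting pathways "at a 5% false discovery rate" is often done with the Benjamini-Hochberg step-up procedure. The sketch below shows that procedure as one plausible reading; the abstract does not name the FDR method, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return the indices of hypotheses rejected by the Benjamini-Hochberg procedure."""
    m = len(p_values)
    # Indices ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        # Keep the largest rank k such that p_(k) <= (k / m) * fdr.
        if p_values[index] <= rank / m * fdr:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical pathway p-values.
pathway_p = [0.01, 0.02, 0.03, 0.5]
print(benjamini_hochberg(pathway_p))
```

Every hypothesis ranked at or below the largest passing rank is rejected, even if its own p-value fell above its individual step threshold.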
easyGWAS: A Cloud-Based Platform for Comparing the Results of Genome-Wide Association Studies.
Grimm, Dominik G; Roqueiro, Damian; Salomé, Patrice A; Kleeberger, Stefan; Greshake, Bastian; Zhu, Wangsheng; Liu, Chang; Lippert, Christoph; Stegle, Oliver; Schölkopf, Bernhard; Weigel, Detlef; Borgwardt, Karsten M
2017-01-01
The ever-growing availability of high-quality genotypes for a multitude of species has enabled researchers to explore the underlying genetic architecture of complex phenotypes at an unprecedented level of detail using genome-wide association studies (GWAS). The systematic comparison of results obtained from GWAS of different traits opens up new possibilities, including the analysis of pleiotropic effects. Other advantages that result from the integration of multiple GWAS are the ability to replicate GWAS signals and to increase statistical power to detect such signals through meta-analyses. In order to facilitate the simple comparison of GWAS results, we present easyGWAS, a powerful, species-independent online resource for computing, storing, sharing, annotating, and comparing GWAS. The easyGWAS tool supports multiple species, the uploading of private genotype data and summary statistics of existing GWAS, as well as advanced methods for comparing GWAS results across different experiments and data sets in an interactive and user-friendly interface. easyGWAS is also a public data repository for GWAS data and summary statistics and already includes published data and results from several major GWAS. We demonstrate the potential of easyGWAS with a case study of the model organism Arabidopsis thaliana, using flowering and growth-related traits. © 2016 American Society of Plant Biologists. All rights reserved.
Gianfrancesco, Olympia; Griffiths, Daniel; Myers, Paul; Collier, David A; Bubb, Vivien J; Quinn, John P
2016-10-01
Genome-wide association studies (GWAS) have identified a region at chromosome 1p21.3, containing the microRNA MIR137, to be among the most significant associations for schizophrenia. However, the mechanism by which genetic variation at this locus increases risk of schizophrenia is unknown. Identifying key regulatory regions around MIR137 is crucial to understanding the potential role of this gene in the aetiology of psychiatric disorders. Through alignment of vertebrate genomes, we identified seven non-coding regions at the MIR137 locus with conservation comparable to exons (>70 %). Bioinformatic analysis using the Psychiatric Genomics Consortium GWAS dataset for schizophrenia showed five of the ECRs to have genome-wide significant SNPs in or adjacent to their sequence. Analysis of available datasets on chromatin marks and histone modification data showed that three of the ECRs were predicted to be functional in the human brain, and three in development. In vitro analysis of ECR activity using reporter gene assays showed that all seven of the selected ECRs displayed transcriptional regulatory activity in the SH-SY5Y neuroblastoma cell line. This data suggests a regulatory role in the developing and adult brain for these highly conserved regions at the MIR137 schizophrenia-associated locus and further that these domains could act individually or synergistically to regulate levels of MIR137 expression.
Identification of a Bipolar Disorder Vulnerable Gene CHDH at 3p21.1.
Chang, Hong; Li, Lingyi; Peng, Tao; Grigoroiu-Serbanescu, Maria; Bergen, Sarah E; Landén, Mikael; Hultman, Christina M; Forstner, Andreas J; Strohmaier, Jana; Hecker, Julian; Schulze, Thomas G; Müller-Myhsok, Bertram; Reif, Andreas; Mitchell, Philip B; Martin, Nicholas G; Cichon, Sven; Nöthen, Markus M; Jamain, Stéphane; Leboyer, Marion; Bellivier, Frank; Etain, Bruno; Kahn, Jean-Pierre; Henry, Chantal; Rietschel, Marcella; Xiao, Xiao; Li, Ming
2017-09-01
Genome-wide analysis (GWA) is an effective strategy to discover extreme effects surpassing genome-wide significant levels in studying complex disorders; however, when sample size is limited, the true effects may fail to achieve genome-wide significance. In such cases, there may be authentic results among the pools of nominal candidates, and an alternative approach is to consider nominal candidates that are replicable across different samples. Here, we found that mRNA expression of the choline dehydrogenase gene (CHDH) was uniformly upregulated in the brains of bipolar disorder (BPD) patients compared with healthy controls across different studies. Follow-up genetic analyses of CHDH variants in multiple independent clinical datasets (including 11,564 cases and 17,686 controls) identified a risk SNP rs9836592 showing consistent associations with BPD (P meta = 5.72 × 10−4), and the risk allele indicated an increased CHDH expression in multiple neuronal tissues (lowest P = 6.70 × 10−16). These converging results may identify a nominal but true BPD susceptibility gene CHDH. Further exploratory analysis revealed suggestive associations of rs9836592 with childhood intelligence (P = 0.044) and educational attainment (P = 0.0039), a "proxy phenotype" of general cognitive abilities. Intriguingly, the CHDH gene is located at chromosome 3p21.1, a risk region implicated in previous BPD genome-wide association studies (GWAS), but CHDH lies outside of the core GWAS linkage disequilibrium (LD) region, and our studied SNP rs9836592 is ∼1.2 Mb 3' downstream of the previous GWAS loci (e.g., rs2251219) with no LD between them; thus, the association observed here is unlikely to be a reflection of previous GWAS signals. In summary, our results imply that CHDH may play a previously unknown role in the etiology of BPD and also highlight the informative value of integrating gene expression and genetic code in advancing our understanding of its biological basis.
Batra, Jyotsna; Lose, Felicity; O'Mara, Tracy; Marquart, Louise; Stephens, Carson; Alexander, Kimberly; Srinivasan, Srilakshmi; Eeles, Rosalind A.; Easton, Douglas F.; Olama, Ali Amin Al; Kote-Jarai, Zsofia; Guy, Michelle; Muir, Kenneth; Lophatananon, Artitaya; Rahman, Aneela A.; Neal, David E.; Hamdy, Freddie C.; Donovan, Jenny L.; Chambers, Suzanne; Gardiner, Robert A.; Aitken, Joanne; Yaxley, John; Kedda, Mary-Anne
2011-01-01
Background Kallikrein 15 (KLK15)/Prostinogen is a plausible candidate for prostate cancer susceptibility. Elevated KLK15 expression has been reported in prostate cancer and it has been described as an unfavorable prognostic marker for the disease. Objectives We performed a comprehensive analysis of association of variants in the KLK15 gene with prostate cancer risk and aggressiveness by genotyping tagSNPs, as well as putative functional SNPs identified by extensive bioinformatics analysis. Methods and Data Sources Twelve out of 22 SNPs, selected on the basis of linkage disequilibrium pattern, were analyzed in an Australian sample of 1,011 histologically verified prostate cancer cases and 1,405 ethnically matched controls. Replication was sought from two existing genome wide association studies (GWAS): the Cancer Genetic Markers of Susceptibility (CGEMS) project and a UK GWAS study. Results Two KLK15 SNPs, rs2659053 and rs3745522, showed evidence of association (p<0.05) but were not present on the GWAS platforms. KLK15 SNP rs2659056 was found to be associated with prostate cancer aggressiveness and showed evidence of association in a replication cohort of 5,051 patients from the UK, Australia, and the CGEMS dataset of US samples. A highly significant association with Gleason score was observed when the data was combined from these three studies with an Odds Ratio (OR) of 0.85 (95% CI = 0.77–0.93; p = 2.7×10−4). The rs2659056 SNP is predicted to alter binding of the RORalpha transcription factor, which has a role in the control of cell growth and differentiation and has been suggested to control the metastatic behavior of prostate cancer cells. Conclusions Our findings suggest a role for KLK15 genetic variation in the etiology of prostate cancer among men of European ancestry, although further studies in very large sample sets are necessary to confirm effect sizes. PMID:22132073
Lee, Tae-Rim; Ahn, Jin Mo; Kim, Gyuhee; Kim, Sangsoo
2017-12-01
Next-generation sequencing (NGS) technology has become a trend in the genomics research area. There are many software programs and automated pipelines to analyze NGS data, which can ease the pain for traditional scientists who are not familiar with computer programming. However, downstream analyses, such as finding differentially expressed genes or visualizing linkage disequilibrium maps and genome-wide association study (GWAS) data, still remain a challenge. Here, we introduce a dockerized web application written in R using the Shiny platform to visualize pre-analyzed RNA sequencing and GWAS data. In addition, we have integrated a genome browser based on the JBrowse platform and an automated intermediate parsing process required for custom track construction, so that users can easily build and navigate their personal genome tracks with in-house datasets. This application will help scientists perform series of downstream analyses and obtain a more integrative understanding about various types of genomic data by interactively visualizing them with customizable options.
Random Bits Forest: a Strong Classifier/Regressor for Big Data
NASA Astrophysics Data System (ADS)
Wang, Yi; Li, Yi; Pu, Weilin; Wen, Kathryn; Shugart, Yin Yao; Xiong, Momiao; Jin, Li
2016-07-01
Efficiency, memory consumption, and robustness are common problems with many popular methods for data analysis. As a solution, we present Random Bits Forest (RBF), a classification and regression algorithm that integrates neural networks (for depth), boosting (for width), and random forests (for prediction accuracy). Through a gradient boosting scheme, it first generates and selects ~10,000 small, 3-layer random neural networks. These networks are then fed into a modified random forest algorithm to obtain predictions. Testing with datasets from the UCI (University of California, Irvine) Machine Learning Repository shows that RBF outperforms other popular methods in both accuracy and robustness, especially with large datasets (N > 1000). The algorithm also performed well in testing with an independent dataset, a real psoriasis genome-wide association study (GWAS).
Rahmioglu, Nilufer; Nyholt, Dale R.; Morris, Andrew P.; Missmer, Stacey A.; Montgomery, Grant W.; Zondervan, Krina T.
2014-01-01
BACKGROUND Endometriosis is a heritable common gynaecological condition influenced by multiple genetic and environmental factors. Genome-wide association studies (GWASs) have proved successful in identifying common genetic variants of moderate effects for various complex diseases. To date, eight GWAS and replication studies from multiple populations have been published on endometriosis. In this review, we investigate the consistency and heterogeneity of the results across all the studies and their implications for an improved understanding of the aetiology of the condition. METHODS Meta-analyses were conducted on four GWASs and four replication studies including a total of 11 506 cases and 32 678 controls, and on the subset of studies that investigated associations for revised American Fertility Society (rAFS) Stage III/IV including 2859 cases. The datasets included 9039 cases and 27 343 controls of European (Australia, Belgium, Italy, UK, USA) and 2467 cases and 5335 controls of Japanese ancestry. Fixed and Han and Elkin random-effects models, and heterogeneity statistics (Cochran's Q test), were used to investigate the evidence of the nine reported genome-wide significant loci across datasets and populations. RESULTS Meta-analysis showed that seven out of nine loci had consistent directions of effect across studies and populations, and six out of nine remained genome-wide significant (P < 5 × 10⁻⁸), including rs12700667 on 7p15.2 (P = 1.6 × 10⁻⁹), rs7521902 near WNT4 (P = 1.8 × 10⁻¹⁵), rs10859871 near VEZT (P = 4.7 × 10⁻¹⁵), rs1537377 near CDKN2B-AS1 (P = 1.5 × 10⁻⁸), rs7739264 near ID4 (P = 6.2 × 10⁻¹⁰) and rs13394619 in GREB1 (P = 4.5 × 10⁻⁸). In addition to the six loci, two showed borderline genome-wide significant associations with Stage III/IV endometriosis, including rs1250248 in FN1 (P = 8 × 10⁻⁸) and rs4141819 on 2p14 (P = 9.2 × 10⁻⁸).
Two independent inter-genic loci, rs4141819 and rs6734792 on chromosome 2, showed significant evidence of heterogeneity across datasets (P < 0.005). Eight of the nine loci had stronger effect sizes among Stage III/IV cases, implying that they are likely to be implicated in the development of moderate to severe, or ovarian, disease. While three out of nine loci were inter-genic, the remaining were in or near genes with known functions of biological relevance to endometriosis, varying from roles in developmental pathways to cellular growth/carcinogenesis. CONCLUSIONS Our meta-analysis shows remarkable consistency in endometriosis GWAS results across studies, with little evidence of population-based heterogeneity. They also show that the phenotypic classifications used in GWAS to date have been limited. Stronger associations with Stage III/IV disease observed for most loci emphasize the importance for future studies to include detailed sub-phenotype information. Functional studies in relevant tissues are needed to understand the effect of the variants on downstream biological pathways. PMID:24676469
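The fixed-effects model and Cochran's Q heterogeneity statistic mentioned in the methods can be sketched as follows. This is a minimal illustration of standard inverse-variance pooling, not the review's actual pipeline; the per-study effect sizes and standard errors are hypothetical, and the Han and Elkin random-effects model named in the abstract is not implemented here.

```python
import math

def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I^2.

    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   per-study standard errors
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2: share of variation attributed to heterogeneity (floored at 0)
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, pooled_se, q, i2

# Hypothetical log odds ratios and standard errors for four studies
pooled, pooled_se, q, i2 = fixed_effect_meta(
    [0.18, 0.22, 0.15, 0.20], [0.05, 0.08, 0.06, 0.07]
)
z = pooled / pooled_se
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
print(f"beta = {pooled:.3f}, SE = {pooled_se:.3f}, Q = {q:.2f}, P = {p_value:.2e}")
```

Under the null hypothesis of homogeneity, Q is compared against a chi-square distribution with (number of studies − 1) degrees of freedom.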
Characterization of Large Structural Genetic Mosaicism in Human Autosomes
Machiela, Mitchell J.; Zhou, Weiyin; Sampson, Joshua N.; Dean, Michael C.; Jacobs, Kevin B.; Black, Amanda; Brinton, Louise A.; Chang, I-Shou; Chen, Chu; Chen, Constance; Chen, Kexin; Cook, Linda S.; Crous Bou, Marta; De Vivo, Immaculata; Doherty, Jennifer; Friedenreich, Christine M.; Gaudet, Mia M.; Haiman, Christopher A.; Hankinson, Susan E.; Hartge, Patricia; Henderson, Brian E.; Hong, Yun-Chul; Hosgood, H. Dean; Hsiung, Chao A.; Hu, Wei; Hunter, David J.; Jessop, Lea; Kim, Hee Nam; Kim, Yeul Hong; Kim, Young Tae; Klein, Robert; Kraft, Peter; Lan, Qing; Lin, Dongxin; Liu, Jianjun; Le Marchand, Loic; Liang, Xiaolin; Lissowska, Jolanta; Lu, Lingeng; Magliocco, Anthony M.; Matsuo, Keitaro; Olson, Sara H.; Orlow, Irene; Park, Jae Yong; Pooler, Loreall; Prescott, Jennifer; Rastogi, Radhai; Risch, Harvey A.; Schumacher, Fredrick; Seow, Adeline; Setiawan, Veronica Wendy; Shen, Hongbing; Sheng, Xin; Shin, Min-Ho; Shu, Xiao-Ou; VanDen Berg, David; Wang, Jiu-Cun; Wentzensen, Nicolas; Wong, Maria Pik; Wu, Chen; Wu, Tangchun; Wu, Yi-Long; Xia, Lucy; Yang, Hannah P.; Yang, Pan-Chyr; Zheng, Wei; Zhou, Baosen; Abnet, Christian C.; Albanes, Demetrius; Aldrich, Melinda C.; Amos, Christopher; Amundadottir, Laufey T.; Berndt, Sonja I.; Blot, William J.; Bock, Cathryn H.; Bracci, Paige M.; Burdett, Laurie; Buring, Julie E.; Butler, Mary A.; Carreón, Tania; Chatterjee, Nilanjan; Chung, Charles C.; Cook, Michael B.; Cullen, Michael; Davis, Faith G.; Ding, Ti; Duell, Eric J.; Epstein, Caroline G.; Fan, Jin-Hu; Figueroa, Jonine D.; Fraumeni, Joseph F.; Freedman, Neal D.; Fuchs, Charles S.; Gao, Yu-Tang; Gapstur, Susan M.; Patiño-Garcia, Ana; Garcia-Closas, Montserrat; Gaziano, J. 
Michael; Giles, Graham G.; Gillanders, Elizabeth M.; Giovannucci, Edward L.; Goldin, Lynn; Goldstein, Alisa M.; Greene, Mark H.; Hallmans, Goran; Harris, Curtis C.; Henriksson, Roger; Holly, Elizabeth A.; Hoover, Robert N.; Hu, Nan; Hutchinson, Amy; Jenab, Mazda; Johansen, Christoffer; Khaw, Kay-Tee; Koh, Woon-Puay; Kolonel, Laurence N.; Kooperberg, Charles; Krogh, Vittorio; Kurtz, Robert C.; LaCroix, Andrea; Landgren, Annelie; Landi, Maria Teresa; Li, Donghui; Liao, Linda M.; Malats, Nuria; McGlynn, Katherine A.; McNeill, Lorna H.; McWilliams, Robert R.; Melin, Beatrice S.; Mirabello, Lisa; Peplonska, Beata; Peters, Ulrike; Petersen, Gloria M.; Prokunina-Olsson, Ludmila; Purdue, Mark; Qiao, You-Lin; Rabe, Kari G.; Rajaraman, Preetha; Real, Francisco X.; Riboli, Elio; Rodríguez-Santiago, Benjamín; Rothman, Nathaniel; Ruder, Avima M.; Savage, Sharon A.; Schwartz, Ann G.; Schwartz, Kendra L.; Sesso, Howard D.; Severi, Gianluca; Silverman, Debra T.; Spitz, Margaret R.; Stevens, Victoria L.; Stolzenberg-Solomon, Rachael; Stram, Daniel; Tang, Ze-Zhong; Taylor, Philip R.; Teras, Lauren R.; Tobias, Geoffrey S.; Viswanathan, Kala; Wacholder, Sholom; Wang, Zhaoming; Weinstein, Stephanie J.; Wheeler, William; White, Emily; Wiencke, John K.; Wolpin, Brian M.; Wu, Xifeng; Wunder, Jay S.; Yu, Kai; Zanetti, Krista A.; Zeleniuch-Jacquotte, Anne; Ziegler, Regina G.; de Andrade, Mariza; Barnes, Kathleen C.; Beaty, Terri H.; Bierut, Laura J.; Desch, Karl C.; Doheny, Kimberly F.; Feenstra, Bjarke; Ginsburg, David; Heit, John A.; Kang, Jae H.; Laurie, Cecilia A.; Li, Jun Z.; Lowe, William L.; Marazita, Mary L.; Melbye, Mads; Mirel, Daniel B.; Murray, Jeffrey C.; Nelson, Sarah C.; Pasquale, Louis R.; Rice, Kenneth; Wiggs, Janey L.; Wise, Anastasia; Tucker, Margaret; Pérez-Jurado, Luis A.; Laurie, Cathy C.; Caporaso, Neil E.; Yeager, Meredith; Chanock, Stephen J.
2015-01-01
Analyses of genome-wide association study (GWAS) data have revealed that detectable genetic mosaicism involving large (>2 Mb) structural autosomal alterations occurs in a fraction of individuals. We present results for a set of 24,849 genotyped individuals (total GWAS set II [TGSII]) in whom 341 large autosomal abnormalities were observed in 168 (0.68%) individuals. Merging data from the new TGSII set with data from two prior reports (the Gene-Environment Association Studies and the total GWAS set I) generated a large dataset of 127,179 individuals; we then conducted a meta-analysis to investigate the patterns of detectable autosomal mosaicism (n = 1,315 events in 925 [0.73%] individuals). Restricting to events >2 Mb in size, we observed an increase in event frequency as event size decreased. The combined results underscore that the rate of detectable mosaicism increases with age (p value = 5.5 × 10⁻³¹) and is higher in men (p value = 0.002) but lower in participants of African ancestry (p value = 0.003). In a subset of 47 individuals from whom serial samples were collected up to 6 years apart, complex changes were noted over time and showed an overall increase in the proportion of mosaic cells as age increased. Our large combined sample allowed for a unique ability to characterize detectable genetic mosaicism involving large structural events and strengthens the emerging evidence of non-random erosion of the genome in the aging population. PMID:25748358
Microbial genome-wide association studies: lessons from human GWAS.
Power, Robert A; Parkhill, Julian; de Oliveira, Tulio
2017-01-01
The reduced costs of sequencing have led to whole-genome sequences for a large number of microorganisms, enabling the application of microbial genome-wide association studies (GWAS). Given the successes of human GWAS in understanding disease aetiology and identifying potential drug targets, microbial GWAS are likely to further advance our understanding of infectious diseases. These advances include insights into pressing global health problems, such as antibiotic resistance and disease transmission. In this Review, we outline the methodologies of GWAS, the current state of the field of microbial GWAS, and how lessons from human GWAS can direct the future of the field.
Pengelly, Reuben J; Tapper, William; Gibson, Jane; Knut, Marcin; Tearle, Rick; Collins, Andrew; Ennis, Sarah
2015-09-03
An understanding of linkage disequilibrium (LD) structures in the human genome underpins much of medical genetics and provides a basis for disease gene mapping and investigating biological mechanisms such as recombination and selection. Whole genome sequencing (WGS) provides the opportunity to determine LD structures at maximal resolution. We compare LD maps constructed from WGS data with LD maps produced from the array-based HapMap dataset, for representative European and African populations. WGS provides up to 5.7-fold greater SNP density than array-based data and achieves much greater resolution of LD structure, allowing for identification of up to 2.8-fold more regions of intense recombination. The absence of ascertainment bias in variant genotyping improves the population representativeness of the WGS maps, and highlights the extent of uncaptured variation using array genotyping methodologies. The complete capture of LD patterns using WGS allows for higher genome-wide association study (GWAS) power compared to array-based GWAS, with WGS also allowing for the analysis of rare variation. The impact of marker ascertainment issues in arrays has been greatest for Sub-Saharan African populations where larger sample sizes and substantially higher marker densities are required to fully resolve the LD structure. WGS provides the best possible resource for LD mapping due to the maximal marker density and lack of ascertainment bias. WGS LD maps provide a rich resource for medical and population genetics studies. The increasing availability of WGS data for large populations will allow for improved research utilising LD, such as GWAS and recombination biology studies.
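For readers unfamiliar with how LD between two markers is quantified, the sketch below computes the pairwise r² statistic from phased haplotypes. This is a common LD summary used here purely for illustration; the LD maps described in the abstract are built with a different, model-based procedure that is not reproduced here, and the haplotype data are hypothetical.

```python
def ld_r2(hap_a, hap_b):
    """Pairwise LD (r^2) between two biallelic SNPs.

    hap_a, hap_b: equal-length lists of 0/1 alleles, one entry per
    phased haplotype, giving the allele carried at SNP A and SNP B.
    """
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    # Frequency of haplotypes carrying allele 1 at both SNPs
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom

# Hypothetical haplotypes: the first pair is in complete LD, the second is not
print(ld_r2([0, 0, 1, 1], [0, 0, 1, 1]))
print(ld_r2([0, 0, 1, 1], [0, 1, 0, 1]))
```

Denser marker sets, such as those from WGS, give more SNP pairs at short distances, which is one reason the resolution of LD structure improves.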
Jiang, Jicai; Shen, Botong; O'Connell, Jeffrey R; VanRaden, Paul M; Cole, John B; Ma, Li
2017-05-30
Although genome-wide association and genomic selection studies have primarily focused on additive effects, dominance and imprinting effects play an important role in mammalian biology and development. The degree to which these non-additive genetic effects contribute to phenotypic variation and whether QTL acting in a non-additive manner can be detected in genetic association studies remain controversial. To empirically answer these questions, we analyzed a large cattle dataset that consisted of 42,701 genotyped Holstein cows with genotyped parents and phenotypic records for eight production and reproduction traits. SNP genotypes were phased in pedigree to determine the parent-of-origin of alleles, and a three-component GREML was applied to obtain variance decomposition for additive, dominance, and imprinting effects. The results showed a significant non-zero contribution from dominance to production traits but not to reproduction traits. Imprinting effects significantly contributed to both production and reproduction traits. Interestingly, imprinting effects contributed more to reproduction traits than to production traits. Using GWAS and imputation-based fine-mapping analyses, we identified and validated a dominance association signal with milk yield near RUNX2, a candidate gene that has been associated with milk production in mice. When adding non-additive effects into the prediction models, however, we observed little or no increase in prediction accuracy for the eight traits analyzed. Collectively, our results suggested that non-additive effects contributed a non-negligible amount (more for reproduction traits) to the total genetic variance of complex traits in cattle, and detection of QTLs with non-additive effect is possible in GWAS using a large dataset.
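Phasing genotypes to determine the parent-of-origin of alleles is what makes an imprinting component estimable, because the two heterozygotes become distinguishable. A minimal sketch of one common genotype coding for the three effects follows; the exact parameterization used in the study is not given in the abstract, so this coding and the function name `encode_genotype` are assumptions for illustration.

```python
def encode_genotype(paternal, maternal):
    """Code a phased biallelic genotype for three effect types.

    paternal, maternal: 0/1 allele inherited from each parent.
    Returns (additive, dominance, imprinting):
      additive   = count of allele 1 (0, 1 or 2)
      dominance  = 1 for heterozygotes, 0 otherwise
      imprinting = +1 if allele 1 is paternal only, -1 if maternal only
    """
    additive = paternal + maternal
    dominance = 1 if paternal != maternal else 0
    imprinting = paternal - maternal
    return additive, dominance, imprinting

# The two heterozygotes share additive and dominance codes
# and differ only in the imprinting code.
for pat, mat in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print((pat, mat), encode_genotype(pat, mat))
```

Covariates coded this way can be used to build separate relationship matrices, one per effect type, which is the general idea behind a multi-component variance decomposition.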
The AraGWAS Catalog: a curated and standardized Arabidopsis thaliana GWAS catalog
Togninalli, Matteo; Seren, Ümit; Meng, Dazhe; Fitz, Joffrey; Nordborg, Magnus; Weigel, Detlef
2018-01-01
The abundance of high-quality genotype and phenotype data for the model organism Arabidopsis thaliana enables scientists to study the genetic architecture of many complex traits at an unprecedented level of detail using genome-wide association studies (GWAS). GWAS have been a great success in A. thaliana and many SNP-trait associations have been published. With the AraGWAS Catalog (https://aragwas.1001genomes.org) we provide a publicly available, manually curated and standardized GWAS catalog for all publicly available phenotypes from the central A. thaliana phenotype repository, AraPheno. All GWAS have been recomputed on the latest imputed genotype release of the 1001 Genomes Consortium using a standardized GWAS pipeline to ensure comparability between results. The catalog currently includes 167 phenotypes and more than 222 000 SNP-trait associations with P < 10⁻⁴, of which 3887 are significantly associated using permutation-based thresholds. The AraGWAS Catalog can be accessed via a modern web-interface and provides various features to easily access, download and visualize the results and summary statistics across GWAS. PMID:29059333
Genomic regions associated with bovine milk fatty acids in both summer and winter milk samples
2012-01-01
Background: In this study we perform a genome-wide association study (GWAS) for bovine milk fatty acids from summer milk samples. This study replicates a previous study where we performed a GWAS for bovine milk fatty acids based on winter milk samples from the same population. Fatty acids from summer and winter milk are genetically similar traits and we therefore compare the regions detected in summer milk to the regions previously detected in winter milk GWAS to discover regions that explain genetic variation in both summer and winter milk. Results: The GWAS of summer milk samples resulted in 51 regions associated with one or more milk fatty acids. Results are in agreement with most associations that were previously detected in a GWAS of fatty acids from winter milk samples, including eight ‘new’ regions that were not considered in the individual studies. The high correlation between the –log10(P-values) and effects of SNPs that were found significant in both GWAS implies that the effects of the SNPs were similar on winter and summer milk fatty acids. Conclusions: The GWAS of fatty acids based on summer milk samples was in agreement with most of the associations detected in the GWAS of fatty acids based on winter milk samples. Associations that were in agreement between both GWAS are more likely to be involved in fatty acid synthesis compared to regions detected in only one GWAS and are therefore worthwhile to pursue in fine-mapping studies. PMID:23107417
Wang, Yanru; Freedman, Jennifer A; Liu, Hongliang; Moorman, Patricia G; Hyslop, Terry; George, Daniel J; Lee, Norman H; Patierno, Steven R; Wei, Qingyi
2017-08-15
Evidence suggests that cells with a stemness phenotype play a pivotal role in oncogenesis, and prostate cells exhibiting this phenotype have been identified. We used two genome-wide association study (GWAS) datasets of African descendants, from the Multiethnic/Minority Cohort Study of Diet and Cancer (MEC) and the Ghana Prostate Study, and two GWAS datasets of non-Hispanic whites, from the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial and the Breast and Prostate Cancer Cohort Consortium (BPC3), to analyze the associations between genetic variants of stemness-related genes and racial disparities in susceptibility to prostate cancer. We evaluated associations of single-nucleotide polymorphisms (SNPs) in 25 stemness-related genes with prostate cancer risk in 1,609 cases and 2,550 controls of non-Hispanic whites (4,934 SNPs) and 1,144 cases and 1,116 controls of African descendants (5,448 SNPs) with correction by false discovery rate ≤0.2. We identified 32 SNPs in five genes (TP63, ALDH1A1, WNT1, MET and EGFR) that were significantly associated with prostate cancer risk, of which six SNPs in three genes (TP63, ALDH1A1 and WNT1) and eight EGFR SNPs showed heterogeneity in susceptibility between these two racial groups. In addition, 13 SNPs in MET and one in ALDH1A1 were found only in African descendants. The in silico bioinformatics analyses revealed that EGFR rs2072454 and SNPs in linkage with the identified SNPs in MET and ALDH1A1 (r² > 0.6) were predicted to regulate RNA splicing. These variants may serve as novel biomarkers for racial disparities in prostate cancer risk. © 2017 UICC.
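The abstract reports correction by a false discovery rate of ≤0.2 but does not name the procedure. The sketch below assumes the widely used Benjamini-Hochberg step-up procedure; the p-values are hypothetical and the function name is illustrative only.

```python
def benjamini_hochberg(pvalues, q=0.2):
    """Indices of tests declared significant at FDR level q.

    Assumes the Benjamini-Hochberg step-up rule: find the largest
    rank k with p_(k) <= k * q / m and reject the k smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# Hypothetical SNP p-values
pvals = [0.001, 0.04, 0.5, 0.03, 0.9]
print(benjamini_hochberg(pvals, q=0.2))
```

With thousands of SNPs tested per group, a threshold of 0.2 tolerates more false positives than a Bonferroni correction in exchange for greater sensitivity to modest effects.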
Luo, Li; Zhu, Yun; Xiong, Momiao
2012-06-01
The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze collective frequency differences between cases and controls have shifted the variant-by-variant analysis paradigm used for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistic for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistic, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T², collapsing method, multivariate and collapsing (CMC) method, individual χ² test, weighted-sum statistic, and variable threshold statistic. Finally, we apply the seven statistics to a published resequencing dataset from ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets. PMID:22651812
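Of the seven statistics compared, the collapsing method is the simplest group test to illustrate: each individual is reduced to a single indicator of carrying any rare allele in the region, and carrier frequencies are compared between cases and controls. The sketch below uses a 1-degree-of-freedom Pearson chi-square test on the resulting 2x2 table; the genotype data are hypothetical, and this is not the authors' genome-information content-based statistic.

```python
import math

def collapse(genotypes):
    """1 if an individual carries at least one rare allele in the region."""
    return [1 if any(g > 0 for g in row) else 0 for row in genotypes]

def collapsing_test(case_genotypes, control_genotypes):
    """Pearson chi-square (1 df) comparing rare-allele carrier frequency.

    Each argument is a list of individuals; each individual is a list of
    rare-allele counts (0, 1 or 2), one per variant in the region.
    """
    a = sum(collapse(case_genotypes))        # carrier cases
    b = len(case_genotypes) - a              # non-carrier cases
    c = sum(collapse(control_genotypes))     # carrier controls
    d = len(control_genotypes) - c           # non-carrier controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # no variation in carrier status or group
    chi2 = n * (a * d - b * c) ** 2 / denom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square tail, 1 df
    return chi2, p_value

# Hypothetical region with three rare variants
cases = [[1, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 0]]
controls = [[0, 0, 0], [0, 0, 0], [1, 0, 0], [0, 0, 0]]
print(collapsing_test(cases, controls))
```

Because every carrier counts equally regardless of which variant is carried, this test illustrates the limitation noted in the abstract: differences in effect among variants at different positions are ignored.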
Carter, Chris J.; France, James; Crean, StJohn; Singhrao, Sim K.
2017-01-01
Periodontal disease is of established etiology in which polymicrobial synergistic ecology has become dysbiotic under the influence of Porphyromonas gingivalis. Following breakdown of the host's protective oral tissue barriers, P. gingivalis migrates to developing inflammatory pathologies that associate with Alzheimer's disease (AD). Periodontal disease is a risk factor for cardiovascular disorders (CVD), type II diabetes mellitus (T2DM), AD and other chronic diseases, whilst T2DM exacerbates periodontitis. This study analyzed the relationship between the P. gingivalis/host interactome and the genes identified in genome-wide association studies (GWAS) for the aforementioned conditions using data from GWASdb (P < 1E-03) and, in some cases, from the NCBI/EBI GWAS database (P < 1E-05). Gene expression data from periodontitis or P. gingivalis microarray was compared to microarray datasets from the AD hippocampus and/or from carotid artery plaques. The results demonstrated that the host genes of the P. gingivalis interactome were significantly enriched in genes deposited in GWASdb genes related to cognitive disorders, AD and dementia, and its co-morbid conditions T2DM, obesity, and CVD. The P. gingivalis/host interactome was also enriched in GWAS genes from the more stringent NCBI-EBI database for AD, atherosclerosis and T2DM. The misregulated genes in periodontitis tissue or P. gingivalis infected macrophages also matched those in the AD hippocampus or atherosclerotic plaques. Together, these data suggest important gene/environment interactions between P. gingivalis and susceptibility genes or gene expression changes in conditions where periodontal disease is a contributory factor. PMID:29311898
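Enrichment of one gene set (here, interactome genes) within another (GWAS genes) is often assessed with a hypergeometric tail probability. The abstract does not specify the test used, so the following is a sketch under that assumption, with hypothetical gene counts.

```python
from math import comb

def hypergeom_enrichment_p(population, successes, draws, observed):
    """P(X >= observed) for a hypergeometric draw without replacement.

    population: total genes considered (e.g. all annotated genes)
    successes:  genes in the reference set (e.g. GWAS genes)
    draws:      genes in the query set (e.g. interactome genes)
    observed:   overlap between the query and reference sets
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20,000 genes, 500 GWAS genes,
# 300 interactome genes, 25 genes in common
print(hypergeom_enrichment_p(20000, 500, 300, 25))
```

The expected overlap by chance in this hypothetical case is 300 × 500 / 20,000 = 7.5 genes, so an overlap of 25 would give a very small tail probability.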
Characterization of large structural genetic mosaicism in human autosomes.
Machiela, Mitchell J; Zhou, Weiyin; Sampson, Joshua N; Dean, Michael C; Jacobs, Kevin B; Black, Amanda; Brinton, Louise A; Chang, I-Shou; Chen, Chu; Chen, Constance; Chen, Kexin; Cook, Linda S; Crous Bou, Marta; De Vivo, Immaculata; Doherty, Jennifer; Friedenreich, Christine M; Gaudet, Mia M; Haiman, Christopher A; Hankinson, Susan E; Hartge, Patricia; Henderson, Brian E; Hong, Yun-Chul; Hosgood, H Dean; Hsiung, Chao A; Hu, Wei; Hunter, David J; Jessop, Lea; Kim, Hee Nam; Kim, Yeul Hong; Kim, Young Tae; Klein, Robert; Kraft, Peter; Lan, Qing; Lin, Dongxin; Liu, Jianjun; Le Marchand, Loic; Liang, Xiaolin; Lissowska, Jolanta; Lu, Lingeng; Magliocco, Anthony M; Matsuo, Keitaro; Olson, Sara H; Orlow, Irene; Park, Jae Yong; Pooler, Loreall; Prescott, Jennifer; Rastogi, Radhai; Risch, Harvey A; Schumacher, Fredrick; Seow, Adeline; Setiawan, Veronica Wendy; Shen, Hongbing; Sheng, Xin; Shin, Min-Ho; Shu, Xiao-Ou; VanDen Berg, David; Wang, Jiu-Cun; Wentzensen, Nicolas; Wong, Maria Pik; Wu, Chen; Wu, Tangchun; Wu, Yi-Long; Xia, Lucy; Yang, Hannah P; Yang, Pan-Chyr; Zheng, Wei; Zhou, Baosen; Abnet, Christian C; Albanes, Demetrius; Aldrich, Melinda C; Amos, Christopher; Amundadottir, Laufey T; Berndt, Sonja I; Blot, William J; Bock, Cathryn H; Bracci, Paige M; Burdett, Laurie; Buring, Julie E; Butler, Mary A; Carreón, Tania; Chatterjee, Nilanjan; Chung, Charles C; Cook, Michael B; Cullen, Michael; Davis, Faith G; Ding, Ti; Duell, Eric J; Epstein, Caroline G; Fan, Jin-Hu; Figueroa, Jonine D; Fraumeni, Joseph F; Freedman, Neal D; Fuchs, Charles S; Gao, Yu-Tang; Gapstur, Susan M; Patiño-Garcia, Ana; Garcia-Closas, Montserrat; Gaziano, J Michael; Giles, Graham G; Gillanders, Elizabeth M; Giovannucci, Edward L; Goldin, Lynn; Goldstein, Alisa M; Greene, Mark H; Hallmans, Goran; Harris, Curtis C; Henriksson, Roger; Holly, Elizabeth A; Hoover, Robert N; Hu, Nan; Hutchinson, Amy; Jenab, Mazda; Johansen, Christoffer; Khaw, Kay-Tee; Koh, Woon-Puay; Kolonel, Laurence N; Kooperberg, 
Charles; Krogh, Vittorio; Kurtz, Robert C; LaCroix, Andrea; Landgren, Annelie; Landi, Maria Teresa; Li, Donghui; Liao, Linda M; Malats, Nuria; McGlynn, Katherine A; McNeill, Lorna H; McWilliams, Robert R; Melin, Beatrice S; Mirabello, Lisa; Peplonska, Beata; Peters, Ulrike; Petersen, Gloria M; Prokunina-Olsson, Ludmila; Purdue, Mark; Qiao, You-Lin; Rabe, Kari G; Rajaraman, Preetha; Real, Francisco X; Riboli, Elio; Rodríguez-Santiago, Benjamín; Rothman, Nathaniel; Ruder, Avima M; Savage, Sharon A; Schwartz, Ann G; Schwartz, Kendra L; Sesso, Howard D; Severi, Gianluca; Silverman, Debra T; Spitz, Margaret R; Stevens, Victoria L; Stolzenberg-Solomon, Rachael; Stram, Daniel; Tang, Ze-Zhong; Taylor, Philip R; Teras, Lauren R; Tobias, Geoffrey S; Viswanathan, Kala; Wacholder, Sholom; Wang, Zhaoming; Weinstein, Stephanie J; Wheeler, William; White, Emily; Wiencke, John K; Wolpin, Brian M; Wu, Xifeng; Wunder, Jay S; Yu, Kai; Zanetti, Krista A; Zeleniuch-Jacquotte, Anne; Ziegler, Regina G; de Andrade, Mariza; Barnes, Kathleen C; Beaty, Terri H; Bierut, Laura J; Desch, Karl C; Doheny, Kimberly F; Feenstra, Bjarke; Ginsburg, David; Heit, John A; Kang, Jae H; Laurie, Cecilia A; Li, Jun Z; Lowe, William L; Marazita, Mary L; Melbye, Mads; Mirel, Daniel B; Murray, Jeffrey C; Nelson, Sarah C; Pasquale, Louis R; Rice, Kenneth; Wiggs, Janey L; Wise, Anastasia; Tucker, Margaret; Pérez-Jurado, Luis A; Laurie, Cathy C; Caporaso, Neil E; Yeager, Meredith; Chanock, Stephen J
2015-03-05
Analyses of genome-wide association study (GWAS) data have revealed that detectable genetic mosaicism involving large (>2 Mb) structural autosomal alterations occurs in a fraction of individuals. We present results for a set of 24,849 genotyped individuals (total GWAS set II [TGSII]) in whom 341 large autosomal abnormalities were observed in 168 (0.68%) individuals. Merging data from the new TGSII set with data from two prior reports (the Gene-Environment Association Studies and the total GWAS set I) generated a large dataset of 127,179 individuals; we then conducted a meta-analysis to investigate the patterns of detectable autosomal mosaicism (n = 1,315 events in 925 [0.73%] individuals). Restricting to events >2 Mb in size, we observed an increase in event frequency as event size decreased. The combined results underscore that the rate of detectable mosaicism increases with age (p value = 5.5 × 10⁻³¹) and is higher in men (p value = 0.002) but lower in participants of African ancestry (p value = 0.003). In a subset of 47 individuals from whom serial samples were collected up to 6 years apart, complex changes were noted over time and showed an overall increase in the proportion of mosaic cells as age increased. Our large combined sample allowed for a unique ability to characterize detectable genetic mosaicism involving large structural events and strengthens the emerging evidence of non-random erosion of the genome in the aging population. Copyright © 2015 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Wang, Shuang; Zhang, Yuchen; Dai, Wenrui; Lauter, Kristin; Kim, Miran; Tang, Yuzhe; Xiong, Hongkai; Jiang, Xiaoqian
2016-01-01
Motivation: Genome-wide association studies (GWAS) have been widely used in discovering the association between genotypes and phenotypes. Human genome data contain valuable but highly sensitive information. Unprotected disclosure of such information might put individuals' privacy at risk. It is important to protect human genome data. Exact logistic regression is a bias-reduction method based on a penalized likelihood to discover rare variants that are associated with disease susceptibility. We propose the HEALER framework to facilitate secure rare variants analysis with a small sample size. Results: We target an algorithm design that reduces the computational and storage costs of learning a homomorphic exact logistic regression model (i.e. evaluating P-values of coefficients), where the circuit depth is proportional to the logarithmic scale of data size. We evaluate the algorithm performance using rare Kawasaki Disease datasets. Availability and implementation: Download HEALER at http://research.ucsd-dbmi.org/HEALER/ Contact: shw070@ucsd.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26446135
2013-01-01
Background: The theoretical basis of genome-wide association studies (GWAS) is statistical inference of linkage disequilibrium (LD) between any polymorphic marker and a putative disease locus. Most methods widely implemented for such analyses are vulnerable to several key demographic factors and deliver poor statistical power for detecting genuine associations as well as a high false positive rate. Here, we present a likelihood-based statistical approach that properly accounts for the non-random nature of case–control samples with regard to the genotypic distribution at the loci in the populations under study, and that confers flexibility to test for genetic association in the presence of confounding factors such as population structure and non-randomness of samples. Results: We implemented this novel method together with several popular methods from the GWAS literature to re-analyze recently published Parkinson’s disease (PD) case–control samples. The real data analysis and computer simulation show that the new method confers not only significantly improved statistical power for detecting associations but also robustness to the difficulties stemming from non-random sampling and genetic structure when compared to its rivals. In particular, the new method detected 44 significant SNPs within 25 chromosomal regions of size < 1 Mb, but only 6 SNPs in two of these regions were previously detected by the trend-test-based methods. It discovered two SNPs located 1.18 Mb and 0.18 Mb from the PD candidates, FGF20 and PARK8, without invoking false positive risk. Conclusions: We developed a novel likelihood-based method that provides adequate estimation of LD and other population model parameters from case and control samples, eases the integration of samples from multiple genetically divergent populations, and thus confers statistically robust and powerful analyses of GWAS. 
On the basis of simulation studies and analysis of real datasets, we demonstrated significant improvement of the new method over the non-parametric trend test, which is the most widely implemented test in the GWAS literature. PMID:23394771
Veerkamp, Roel F; Bouwman, Aniek C; Schrooten, Chris; Calus, Mario P L
2016-12-01
Whole-genome sequence data is expected to capture genetic variation more completely than common genotyping panels. Our objective was to compare the proportion of variance explained and the accuracy of genomic prediction by using imputed sequence data or preselected SNPs from a genome-wide association study (GWAS) with imputed whole-genome sequence data. Phenotypes were available for 5503 Holstein-Friesian bulls. Genotypes were imputed up to whole-genome sequence (13,789,029 segregating DNA variants) by using run 4 of the 1000 bull genomes project. The program GCTA was used to perform GWAS for protein yield (PY), somatic cell score (SCS) and interval from first to last insemination (IFL). From the GWAS, subsets of variants were selected and genomic relationship matrices (GRM) were used to estimate the variance explained in 2087 validation animals and to evaluate the genomic prediction ability. Finally, two GRM were fitted together in several models to evaluate the effect of selected variants that were in competition with all the other variants. The GRM based on full sequence data explained only marginally more genetic variation than that based on common SNP panels: for PY, SCS and IFL, genomic heritability improved from 0.81 to 0.83, 0.83 to 0.87 and 0.69 to 0.72, respectively. Sequence data also helped to identify more variants linked to quantitative trait loci and resulted in clearer GWAS peaks across the genome. The proportion of total variance explained by the selected variants combined in a GRM was considerably smaller than that explained by all variants (less than 0.31 for all traits). When selected variants were used, accuracy of genomic predictions decreased and bias increased. Although 35 to 42 variants were detected that together explained 13 to 19% of the total variance (18 to 23% of the genetic variance) when fitted alone, there was no advantage in using dense sequence information for genomic prediction in the Holstein data used in our study. 
Detection and selection of variants within a single breed are difficult due to long-range linkage disequilibrium. Stringent selection of variants resulted in more biased genomic predictions, although this might be due to the training population being the same dataset from which the selected variants were identified.
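As an illustration of what a genomic relationship matrix (GRM) is, the following is a minimal sketch, assuming the commonly used allele-frequency-standardised form in which the relationship between two animals is the average over variants of (x_a - 2p)(x_b - 2p) / (2p(1 - p)). The abstract does not state which GRM definition was used, so the formula, the function name and the toy genotypes are assumptions made for illustration, and the same expression is used here for diagonal and off-diagonal entries.

```python
def genomic_relationship_matrix(genotypes):
    """Build a GRM from allele counts.

    genotypes[i][j] is the allele count (0, 1 or 2) of animal i at variant j.
    Assumed form: mean over variants of
    (x_a - 2p)(x_b - 2p) / (2p(1 - p)); monomorphic variants are skipped.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency of the counted allele at each variant.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    used = [j for j in range(m) if 0.0 < freqs[j] < 1.0]
    grm = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a, n):
            total = sum(
                (genotypes[a][j] - 2 * freqs[j]) * (genotypes[b][j] - 2 * freqs[j])
                / (2 * freqs[j] * (1 - freqs[j]))
                for j in used
            )
            grm[a][b] = grm[b][a] = total / len(used)
    return grm


# Hypothetical toy data: three animals, two variants.
toy_grm = genomic_relationship_matrix([[0, 2], [2, 0], [1, 1]])
```

Fitting two GRMs together, as described above, would mean building one such matrix from the selected variants and another from the remaining variants; the variance-component estimation itself is not shown here.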
Transethnic genome-wide scan identifies novel Alzheimer's disease loci.
Jun, Gyungah R; Chung, Jaeyoon; Mez, Jesse; Barber, Robert; Beecham, Gary W; Bennett, David A; Buxbaum, Joseph D; Byrd, Goldie S; Carrasquillo, Minerva M; Crane, Paul K; Cruchaga, Carlos; De Jager, Philip; Ertekin-Taner, Nilufer; Evans, Denis; Fallin, M Danielle; Foroud, Tatiana M; Friedland, Robert P; Goate, Alison M; Graff-Radford, Neill R; Hendrie, Hugh; Hall, Kathleen S; Hamilton-Nelson, Kara L; Inzelberg, Rivka; Kamboh, M Ilyas; Kauwe, John S K; Kukull, Walter A; Kunkle, Brian W; Kuwano, Ryozo; Larson, Eric B; Logue, Mark W; Manly, Jennifer J; Martin, Eden R; Montine, Thomas J; Mukherjee, Shubhabrata; Naj, Adam; Reiman, Eric M; Reitz, Christiane; Sherva, Richard; St George-Hyslop, Peter H; Thornton, Timothy; Younkin, Steven G; Vardarajan, Badri N; Wang, Li-San; Wendlund, Jens R; Winslow, Ashley R; Haines, Jonathan; Mayeux, Richard; Pericak-Vance, Margaret A; Schellenberg, Gerard; Lunetta, Kathryn L; Farrer, Lindsay A
2017-07-01
Genetic loci for Alzheimer's disease (AD) have been identified in whites of European ancestry, but the genetic architecture of AD among other populations is less understood. We conducted a transethnic genome-wide association study (GWAS) for late-onset AD in a Stage 1 sample including whites of European ancestry, African-Americans, Japanese, and Israeli-Arabs assembled by the Alzheimer's Disease Genetics Consortium. Suggestive results from Stage 1 from novel loci were followed up using summarized results in the International Genomics Alzheimer's Project GWAS dataset. Genome-wide significant (GWS) associations in single-nucleotide polymorphism (SNP)-based tests (P < 5 × 10^-8) were identified for SNPs in PFDN1/HBEGF, USP6NL/ECHDC3, and BZRAP1-AS1 and for the interaction of the apolipoprotein E (APOE) ε4 allele with an NFIC SNP. We also obtained GWS evidence (P < 2.7 × 10^-6) for gene-based association in the total sample with a novel locus, TPBG (P = 1.8 × 10^-6). Our findings highlight the value of transethnic studies for identifying novel AD susceptibility loci. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Jannink, Jean-Luc
2010-01-01
Genome-wide association studies (GWAS) may benefit from utilizing haplotype information for making marker-phenotype associations. Several rationales for grouping single nucleotide polymorphisms (SNPs) into haplotype blocks exist, but any advantage may depend on such factors as genetic architecture of traits, patterns of linkage disequilibrium in the study population, and marker density. The objective of this study was to explore the utility of haplotypes for GWAS in barley (Hordeum vulgare) to offer a first detailed look at this approach for identifying agronomically important genes in crops. To accomplish this, we used genotype and phenotype data from the Barley Coordinated Agricultural Project and constructed haplotypes using three different methods. Marker-trait associations were tested by the efficient mixed-model association algorithm (EMMA). When QTL were simulated using single SNPs dropped from the marker dataset, a simple sliding window performed as well or better than single SNPs or the more sophisticated methods of blocking SNPs into haplotypes. Moreover, the haplotype analyses performed better 1) when QTL were simulated as polymorphisms that arose subsequent to marker variants, and 2) in analysis of empirical heading date data. These results demonstrate that the information content of haplotypes is dependent on the particular mutational and recombinational history of the QTL and nearby markers. Analysis of the empirical data also confirmed our intuition that the distribution of QTL alleles in nature is often unlike the distribution of marker variants, and hence utilizing haplotype information could capture associations that would elude single SNPs. We recommend routine use of both single SNP and haplotype markers for GWAS to take advantage of the full information content of the genotype data. PMID:21124933
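The simple sliding-window approach mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the input is taken to be phased haplotypes coded as strings with one character per SNP, and the window width of 3 is a hypothetical default, since the abstract does not give the window size or data encoding used in the study.

```python
def sliding_window_haplotypes(phased, width=3):
    """Group adjacent SNPs into overlapping haplotype windows.

    phased: list of haplotype strings, one character per SNP
            (hypothetical encoding, e.g. "0101").
    Returns one list per window holding each haplotype's allele string
    for that window; each distinct string can then be treated as a
    multi-allelic marker in the association test.
    """
    n_snps = len(phased[0])
    windows = []
    for start in range(n_snps - width + 1):
        windows.append([hap[start:start + width] for hap in phased])
    return windows


# Hypothetical example: two haplotypes over four SNPs, windows of two SNPs.
example_windows = sliding_window_haplotypes(["0101", "0111"], width=2)
```

Each window advances by one SNP, so a dataset with n SNPs yields n - width + 1 haplotype markers in addition to the single-SNP markers.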
Analysis of the Influence of microRNAs in Lithium Response in Bipolar Disorder.
Reinbold, Céline S; Forstner, Andreas J; Hecker, Julian; Fullerton, Janice M; Hoffmann, Per; Hou, Liping; Heilbronner, Urs; Degenhardt, Franziska; Adli, Mazda; Akiyama, Kazufumi; Akula, Nirmala; Ardau, Raffaella; Arias, Bárbara; Backlund, Lena; Benabarre, Antonio; Bengesser, Susanne; Bhattacharjee, Abesh K; Biernacka, Joanna M; Birner, Armin; Marie-Claire, Cynthia; Cervantes, Pablo; Chen, Guo-Bo; Chen, Hsi-Chung; Chillotti, Caterina; Clark, Scott R; Colom, Francesc; Cousins, David A; Cruceanu, Cristiana; Czerski, Piotr M; Dayer, Alexandre; Étain, Bruno; Falkai, Peter; Frisén, Louise; Gard, Sébastien; Garnham, Julie S; Goes, Fernando S; Grof, Paul; Gruber, Oliver; Hashimoto, Ryota; Hauser, Joanna; Herms, Stefan; Jamain, Stéphane; Jiménez, Esther; Kahn, Jean-Pierre; Kassem, Layla; Kittel-Schneider, Sarah; Kliwicki, Sebastian; König, Barbara; Kusumi, Ichiro; Lackner, Nina; Laje, Gonzalo; Landén, Mikael; Lavebratt, Catharina; Leboyer, Marion; Leckband, Susan G; López Jaramillo, Carlos A; MacQueen, Glenda; Manchia, Mirko; Martinsson, Lina; Mattheisen, Manuel; McCarthy, Michael J; McElroy, Susan L; Mitjans, Marina; Mondimore, Francis M; Monteleone, Palmiero; Nievergelt, Caroline M; Ösby, Urban; Ozaki, Norio; Perlis, Roy H; Pfennig, Andrea; Reich-Erkelenz, Daniela; Rouleau, Guy A; Schofield, Peter R; Schubert, K Oliver; Schweizer, Barbara W; Seemüller, Florian; Severino, Giovanni; Shekhtman, Tatyana; Shilling, Paul D; Shimoda, Kazutaka; Simhandl, Christian; Slaney, Claire M; Smoller, Jordan W; Squassina, Alessio; Stamm, Thomas J; Stopkova, Pavla; Tighe, Sarah K; Tortorella, Alfonso; Turecki, Gustavo; Volkert, Julia; Witt, Stephanie H; Wright, Adam J; Young, L Trevor; Zandi, Peter P; Potash, James B; DePaulo, J Raymond; Bauer, Michael; Reininghaus, Eva; Novák, Tomáš; Aubry, Jean-Michel; Maj, Mario; Baune, Bernhard T; Mitchell, Philip B; Vieta, Eduard; Frye, Mark A; Rybakowski, Janusz K; Kuo, Po-Hsiu; Kato, Tadafumi; Grigoroiu-Serbanescu, Maria; Reif, Andreas; Del Zompo, 
Maria; Bellivier, Frank; Schalling, Martin; Wray, Naomi R; Kelsoe, John R; Alda, Martin; McMahon, Francis J; Schulze, Thomas G; Rietschel, Marcella; Nöthen, Markus M; Cichon, Sven
2018-01-01
Bipolar disorder (BD) is a common, highly heritable neuropsychiatric disease characterized by recurrent episodes of mania and depression. Lithium is the best-established long-term treatment for BD, even though individual response is highly variable. Evidence suggests that some of this variability has a genetic basis. This is supported by the largest genome-wide association study (GWAS) of lithium response to date conducted by the International Consortium on Lithium Genetics (ConLiGen). Recently, we performed the first genome-wide analysis of the involvement of miRNAs in BD and identified nine BD-associated miRNAs. However, it is unknown whether these miRNAs are also associated with lithium response in BD. In the present study, we therefore tested whether common variants at these nine candidate miRNAs contribute to the variance in lithium response in BD. Furthermore, we systematically analyzed whether any other miRNA in the genome is implicated in the response to lithium. For this purpose, we performed gene-based tests for all known miRNA coding genes in the ConLiGen GWAS dataset ( n = 2,563 patients) using a set-based testing approach adapted from the versatile gene-based test for GWAS (VEGAS2). In the candidate approach, miR-499a showed a nominally significant association with lithium response, providing some evidence for involvement in both development and treatment of BD. In the genome-wide miRNA analysis, 71 miRNAs showed nominally significant associations with the dichotomous phenotype and 106 with the continuous trait for treatment response. A total of 15 miRNAs revealed nominal significance in both phenotypes with miR-633 showing the strongest association with the continuous trait ( p = 9.80E-04) and miR-607 with the dichotomous phenotype ( p = 5.79E-04). No association between miRNAs and treatment response to lithium in BD in either of the tested conditions withstood multiple testing correction. 
Given the limited power of our study, the investigation of miRNAs in larger GWAS samples of BD and lithium response is warranted.
An alternative covariance estimator to investigate genetic heterogeneity in populations.
Heslot, Nicolas; Jannink, Jean-Luc
2015-11-26
For genomic prediction and genome-wide association studies (GWAS) using mixed models, covariance between individuals is estimated using molecular markers. Based on the properties of mixed models, using available molecular data for prediction is optimal if this covariance is known. Under this assumption, adding individuals to the analysis should never be detrimental. However, some empirical studies showed that increasing training population size decreased prediction accuracy. Recently, results from theoretical models indicated that even if marker density is high and the genetic architecture of traits is controlled by many loci with small additive effects, the covariance between individuals, which depends on relationships at causal loci, is not always well estimated by the whole-genome kinship. We propose an alternative covariance estimator named K-kernel, to account for potential genetic heterogeneity between populations that is characterized by a lack of genetic correlation, and to limit the information flow between a priori unknown populations in a trait-specific manner. This is similar to a multi-trait model and parameters are estimated by REML and, in extreme cases, it can allow for an independent genetic architecture between populations. As such, K-kernel is useful to study the problem of the design of training populations. K-kernel was compared to other covariance estimators or kernels to examine its fit to the data, cross-validated accuracy and suitability for GWAS on several datasets. It provides a significantly better fit to the data than the genomic best linear unbiased prediction model and, in some cases it performs better than other kernels such as the Gaussian kernel, as shown by an empirical null distribution. In GWAS simulations, alternative kernels control type I errors as well as or better than the classical whole-genome kinship and increase statistical power. No or small gains were observed in cross-validated prediction accuracy. 
This alternative covariance estimator can be used to gain insight into trait-specific genetic heterogeneity by identifying relevant sub-populations that lack genetic correlation between them. Genetic correlation can be 0 between identified sub-populations by performing automatic selection of relevant sets of individuals to be included in the training population. It may also increase statistical power in GWAS.
Southam, Lorraine; Panoutsopoulou, Kalliope; Rayner, N William; Chapman, Kay; Durrant, Caroline; Ferreira, Teresa; Arden, Nigel; Carr, Andrew; Deloukas, Panos; Doherty, Michael; Loughlin, John; McCaskie, Andrew; Ollier, William E R; Ralston, Stuart; Spector, Timothy D; Valdes, Ana M; Wallis, Gillian A; Wilkinson, J Mark; Marchini, Jonathan; Zeggini, Eleftheria
2011-05-01
Imputation is an extremely valuable tool in conducting and synthesising genome-wide association studies (GWASs). Directly typed SNP quality control (QC) is thought to affect imputation quality. It is, therefore, common practise to use quality-controlled (QCed) data as an input for imputing genotypes. This study aims to determine the effect of commonly applied QC steps on imputation outcomes. We performed several iterations of imputing SNPs across chromosome 22 in a dataset consisting of 3177 samples with Illumina 610 k (Illumina, San Diego, CA, USA) GWAS data, applying different QC steps each time. The imputed genotypes were compared with the directly typed genotypes. In addition, we investigated the correlation between alternatively QCed data. We also applied a series of post-imputation QC steps balancing elimination of poorly imputed SNPs and information loss. We found that the difference between the unQCed data and the fully QCed data on imputation outcome was minimal. Our study shows that imputation of common variants is generally very accurate and robust to GWAS QC, which is not a major factor affecting imputation outcome. A minority of common-frequency SNPs with particular properties cannot be accurately imputed regardless of QC stringency. These findings may not generalise to the imputation of low frequency and rare variants.
Genome-wide association study of rice grain width variation.
Zheng, Xiao-Ming; Gong, Tingting; Ou, Hong-Ling; Xue, Dayuan; Qiao, Weihua; Wang, Junrui; Liu, Sha; Yang, Qingwen; Olsen, Kenneth M
2018-04-01
Seed size is variable within many plant species, and understanding the underlying genetic factors can provide insights into mechanisms of local environmental adaptation. Here we make use of the abundant genomic and germplasm resources available for rice (Oryza sativa) to perform a large-scale genome-wide association study (GWAS) of grain width. Grain width varies widely within the crop and is also known to show climate-associated variation across populations of its wild progenitor. Using a filtered dataset of >1.9 million genome-wide SNPs in a sample of 570 cultivated and wild rice accessions, we performed GWAS with two complementary models, GLM and MLM. The models yielded 10 and 33 significant associations, respectively, and jointly yielded seven candidate locus regions, two of which have been previously identified. Analyses of nucleotide diversity and haplotype distributions at these loci revealed signatures of selection and patterns consistent with adaptive introgression of grain width alleles across rice variety groups. The results provide a 50% increase in the total number of rice grain width loci mapped to date and support a polygenic model whereby grain width is shaped by gene-by-environment interactions. These loci can potentially serve as candidates for studies of adaptive seed size variation in wild grass species.
Gene-diet interaction effects on BMI levels in the Singapore Chinese population.
Chang, Xuling; Dorajoo, Rajkumar; Sun, Ye; Han, Yi; Wang, Ling; Khor, Chiea-Chuen; Sim, Xueling; Tai, E-Shyong; Liu, Jianjun; Yuan, Jian-Min; Koh, Woon-Puay; van Dam, Rob M; Friedlander, Yechiel; Heng, Chew-Kiat
2018-02-24
Recent genome-wide association studies (GWAS) have identified 97 body-mass index (BMI) associated loci. We aimed to evaluate if dietary intake modifies BMI associations at these loci in the Singapore Chinese population. We utilized GWAS information from six data subsets from two adult Chinese populations (N = 7817). Seventy-eight genotyped or imputed index BMI single nucleotide polymorphisms (SNPs) that passed quality control procedures were available in all datasets. Alternative Healthy Eating Index (AHEI)-2010 score and ten nutrient variables were evaluated. Linear regression analyses between z score transformed BMI (Z-BMI) and dietary factors were performed. Interaction analyses were performed by introducing the interaction term (diet x SNP) in the same regression model. Analysis was carried out in each cohort individually and subsequently meta-analyzed using the inverse-variance weighted method. Analyses were also evaluated with a weighted gene-risk score (wGRS) constructed from BMI index SNPs from recent large-scale GWAS. Nominal associations between Z-BMI and AHEI-2010 and some dietary factors were identified (P = 0.047-0.010). The BMI wGRS was robustly associated with Z-BMI (P = 1.55 × 10^-15) but not with any dietary variables. Dietary variables did not significantly interact with the wGRS to modify BMI associations. When interaction analyses were repeated using individual SNPs, a significant association between cholesterol intake and rs4740619 (CCDC171) was identified (β = 0.077, adjP interaction = 0.043). The CCDC171 gene locus may interact with cholesterol intake to increase BMI in the Singaporean Chinese population; however, most known obesity risk loci were not associated with dietary intake and did not interact with diet to modify BMI levels.
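The inverse-variance weighted pooling of per-cohort estimates named above can be sketched as below. This is a minimal fixed-effect version, assuming each cohort contributes a regression coefficient and its standard error; the function name and the example numbers are hypothetical and are not taken from the study.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance weighted meta-analysis.

    betas: per-cohort effect estimates (e.g. diet x SNP interaction terms).
    ses:   matching standard errors.
    Returns the pooled estimate, its standard error, z score and
    two-sided normal p-value.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical interaction estimates from three cohorts.
pooled_beta, pooled_se, pooled_z, pooled_p = inverse_variance_meta(
    [0.08, 0.05, 0.10], [0.04, 0.05, 0.06]
)
```

Cohorts with smaller standard errors receive larger weights, so the pooled estimate is dominated by the most precisely estimated cohorts.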
Villegas, Raquel; Williams, Scott M.; Gao, Yu-Tang; Long, Jirong; Shi, Jiajun; Cai, Hui; Li, Honglan; Chen, Ching-Chu; Tai, E. Shyong; Hu, Frank; Cai, Qiuyin; Zheng, Wei; Shu, Xiao-Ou
2014-01-01
Summary: We used a two-stage study design to evaluate whether variations in the peroxisome proliferator-activated receptors (PPAR) and the peroxisome proliferator-activated receptor gamma co-activator 1 (PGC1) gene families (PPARA, PPARG, PPARD, PPARGC1A, and PPARGC1B) are associated with T2D risk. Stage I used data from a genome-wide association study (GWAS) from Shanghai, China (1,019 T2D cases and 1,709 controls) and from a meta-analysis of data from the Asian Genetic Epidemiology Network for T2D (AGEN-T2D). Criteria for selection of SNPs for stage II were: 1) P<0.05 in single marker analysis in Shanghai GWAS and P<0.05 in the meta-analysis or 2) P<10^-3 in the meta-analysis alone and 3) minor allele frequency ≥0.10. Nine SNPs from the PGC1 family were assessed in stage II (an independent set of middle-aged men and women from Shanghai with 1,700 T2D cases and 1,647 controls). One SNP in PPARGC1B, rs251464, was replicated in stage II (OR=0.87; 95% CI: 0.77–0.99). Gene-body mass index (BMI) and gene-exercise interactions and T2D risk were evaluated in a combined dataset (Shanghai GWAS and stage II data: 2,719 cases and 3,356 controls). One SNP in PPARGC1A, rs12640088, had a significant interaction with BMI. No interactions between the PPARGC1B gene and BMI or exercise were observed. PMID:24359475
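The stage II selection criteria listed above can be written as a small filter. This sketch assumes one reading of the criteria, namely that a SNP must satisfy criterion 1 or criterion 2 and, in either case, criterion 3; the abstract's wording leaves the grouping open, and the function and argument names are hypothetical.

```python
def selected_for_stage2(p_shanghai, p_meta, maf):
    """Apply the stage I selection rule under the assumed reading
    (criterion 1 OR criterion 2) AND criterion 3.

    p_shanghai: single-marker P value in the Shanghai GWAS.
    p_meta:     P value in the AGEN-T2D meta-analysis.
    maf:        minor allele frequency.
    """
    criterion_1 = p_shanghai < 0.05 and p_meta < 0.05
    criterion_2 = p_meta < 1e-3
    criterion_3 = maf >= 0.10
    return (criterion_1 or criterion_2) and criterion_3


# Hypothetical SNPs: the first qualifies, the second fails on frequency.
print(selected_for_stage2(0.03, 0.04, 0.20))
print(selected_for_stage2(0.03, 0.04, 0.05))
```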
Been, L F; Hatfield, J L; Shankar, A; Aston, C E; Ralhan, S; Wander, G S; Mehra, N K; Singh, J R; Mulvihill, J J; Sanghera, D K
2012-11-01
Two common variants (rs1387153, rs10830963) in MTNR1B have been reported to have independent effects on fasting blood glucose (FBG) levels with increased risk of type 2 diabetes (T2D) in recent genome-wide association studies (GWAS). In this investigation, we report the association of these two variants, and an additional variant (rs1374645) within the GWAS locus of MTNR1B, with FBG, 2h glucose, insulin resistance (HOMA IR), β-cell function (HOMA B), and T2D in our sample of Asian Sikhs from India. Our cohort comprised 2222 subjects [1201 T2D, 1021 controls]. None of these SNPs was associated with T2D in this cohort. Our data also could not confirm association of rs1387153 and rs10830963 with the FBG phenotype. However, upon stratifying data according to body mass index (BMI) (low ≤ 25 kg/m² and high > 25 kg/m²) in normoglycemic subjects (n = 1021), rs1374645 revealed a strong association with low FBG levels in the low BMI group (β = -0.073, p = 0.002, Bonferroni p = 0.01) compared to the high BMI group (β = 0.015, p = 0.50). We also detected strong evidence of interaction between rs1374645 and BMI with respect to FBG levels (p = 0.002). Our data provide new information about the significant impact of another MTNR1B variant on FBG levels that appears to be modulated by BMI. Future confirmation on independent datasets and functional studies will be required to define the role of this variant in fasting glucose variation. Published by Elsevier B.V.
Prioritizing GWAS Results: A Review of Statistical Methods and Recommendations for Their Application
Cantor, Rita M.; Lange, Kenneth; Sinsheimer, Janet S.
2010-01-01
Genome-wide association studies (GWAS) have rapidly become a standard method for disease gene discovery. A substantial number of recent GWAS indicate that for most disorders, only a few common variants are implicated and the associated SNPs explain only a small fraction of the genetic risk. This review is written from the viewpoint that findings from the GWAS provide preliminary genetic information that is available for additional analysis by statistical procedures that accumulate evidence, and that these secondary analyses are very likely to provide valuable information that will help prioritize the strongest constellations of results. We review and discuss three analytic methods to combine preliminary GWAS statistics to identify genes, alleles, and pathways for deeper investigations. Meta-analysis seeks to pool information from multiple GWAS to increase the chances of finding true positives among the false positives and provides a way to combine associations across GWAS, even when the original data are unavailable. Testing for epistasis within a single GWAS study can identify the stronger results that are revealed when genes interact. Pathway analysis of GWAS results is used to prioritize genes and pathways within a biological context. Following a GWAS, association results can be assigned to pathways and tested in aggregate with computational tools and pathway databases. Reviews of published methods with recommendations for their application are provided within the framework for each approach. PMID:20074509
Evangelou, Marina; Smyth, Deborah J; Fortune, Mary D; Burren, Oliver S; Walker, Neil M; Guo, Hui; Onengut-Gumuscu, Suna; Chen, Wei-Min; Concannon, Patrick; Rich, Stephen S; Todd, John A; Wallace, Chris
2014-12-01
Pathway analysis can complement point-wise single nucleotide polymorphism (SNP) analysis in exploring genomewide association study (GWAS) data to identify specific disease-associated genes that can be candidate causal genes. We propose a straightforward methodology that can be used for conducting a gene-based pathway analysis using summary GWAS statistics in combination with widely available reference genotype data. We used this method to perform a gene-based pathway analysis of a type 1 diabetes (T1D) meta-analysis GWAS (of 7,514 cases and 9,045 controls). An important feature of the conducted analysis is the removal of the major histocompatibility complex gene region, the major genetic risk factor for T1D. Thirty-one of the 1,583 (2%) tested pathways were identified to be enriched for association with T1D at a 5% false discovery rate. We analyzed these 31 pathways and their genes to identify SNPs in or near these pathway genes that showed potentially novel association with T1D and attempted to replicate the association of 22 SNPs in additional samples. Replication P-values were skewed (P=9.85×10^-11) with 12 of the 22 SNPs showing P<0.05. Support, including replication evidence, was obtained for nine T1D associated variants in genes ITGB7 (rs11170466, P=7.86×10^-9), NRP1 (rs722988, 4.88×10^-8), BAD (rs694739, 2.37×10^-7), CTSB (rs1296023, 2.79×10^-7), FYN (rs11964650, P=5.60×10^-7), UBE2G1 (rs9906760, 5.08×10^-7), MAP3K14 (rs17759555, 9.67×10^-7), ITGB1 (rs1557150, 1.93×10^-6), and IL7R (rs1445898, 2.76×10^-6). The proposed methodology can be applied to other GWAS datasets for which only summary level data are available. © 2014 The Authors. Genetic Epidemiology published by Wiley Periodicals, Inc.
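Selecting pathways at a 5% false discovery rate can be illustrated with the Benjamini-Hochberg step-up procedure. The abstract does not name the FDR procedure that was used, so Benjamini-Hochberg is an assumption here, and the function name and example P values are hypothetical.

```python
def benjamini_hochberg(pvalues, fdr=0.05):
    """Benjamini-Hochberg step-up procedure (assumed, not confirmed
    by the abstract).

    Returns a list of booleans, in the input order, marking which
    hypotheses (e.g. pathways) are declared significant at the given FDR.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    # Find the largest rank k with p_(k) <= fdr * k / m.
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= fdr * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical pathway P values.
flags = benjamini_hochberg([0.04, 0.001, 0.03, 0.5])
```

Because the procedure is step-up, a P value that misses its own rank threshold is still rejected when a larger P value further down the sorted list meets its threshold.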
van den Berg, Irene; Boichard, Didier; Lund, Mogens Sandø
2016-11-01
The objective of this study was to compare mapping precision and power of within-breed and multibreed genome-wide association studies (GWAS) and to compare the results obtained by the multibreed GWAS with 3 meta-analysis methods. The multibreed GWAS was expected to improve mapping precision compared with a within-breed GWAS because linkage disequilibrium is conserved over shorter distances across breeds than within breeds. The multibreed GWAS was also expected to increase detection power for quantitative trait loci (QTL) segregating across breeds. GWAS were performed for production traits in dairy cattle, using imputed full genome sequences of 16,031 bulls, originating from 6 French and Danish dairy cattle populations. Our results show that a multibreed GWAS can be a valuable tool for the detection and fine mapping of quantitative trait loci. The number of QTL detected with the multibreed GWAS was larger than the number detected by the within-breed GWAS, indicating an increase in power, especially when the 2 Holstein populations were combined. The largest number of QTL was detected when all populations were combined. The analysis combining all breeds was, however, dominated by Holstein, and QTL segregating in other breeds but not in Holstein were sometimes overshadowed by larger QTL segregating in Holstein. Therefore, the GWAS combining all breeds except Holstein was useful to detect such peaks. Combining all breeds except Holstein resulted in smaller QTL intervals on average, but this outcome was not the case when the Holstein populations were included in the analysis. Although no decrease in the average QTL size was observed, mapping precision did improve for several QTL. Out of 3 different multibreed meta-analysis methods, the weighted z-scores model resulted in the most similar results to the full multibreed GWAS and can be useful as an alternative to a full multibreed GWAS. 
Differences between the multibreed GWAS and the meta-analyses were larger when different breeds were combined than when the 2 Holstein populations were combined. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
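A weighted z-score meta-analysis of the kind compared above can be sketched as follows. This assumes the common Stouffer-type form with weights equal to the square root of each population's sample size; the abstract does not specify the weights, and the function name and example values are hypothetical.

```python
import math


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-population z scores for one variant.

    Assumed weighting: w_i = sqrt(n_i), giving
    Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2)).
    Returns the combined z score and its two-sided normal p-value.
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    z = numerator / denominator
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical z scores for one variant in three populations.
combined_z, combined_p = weighted_z_meta([2.1, 1.4, -0.3], [5000, 3000, 800])
```

Only the sign and size of each within-population test statistic are needed, which is why this kind of meta-analysis can stand in for a full multibreed analysis when genotypes cannot be pooled.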
Sevillano, Claudia A; Lopes, Marcos S; Harlizius, Barbara; Hanenberg, Egiel H A T; Knol, Egbert F; Bastiaansen, John W M
2015-03-21
Cryptorchidism and scrotal/inguinal hernia are the most frequent congenital defects in pigs. Identification of genomic regions that control these congenital defects is of great interest to breeding programs, both from an animal welfare point of view as well as for economic reasons. The aim of this genome-wide association study (GWAS) was to identify single nucleotide polymorphisms (SNPs) that are strongly associated with these congenital defects. Genotypes were available for 2570 Large White (LW) and 2272 Landrace (LR) pigs. Breeding values were estimated based on 1 359 765 purebred and crossbred male offspring, using a binary trait animal model. Estimated breeding values were deregressed (DEBV) and taken as the response variable in the GWAS. Heritability estimates were equal to 0.26 ± 0.02 for cryptorchidism and to 0.31 ± 0.01 for scrotal/inguinal hernia. Seven and 31 distinct QTL regions were associated with cryptorchidism in the LW and LR datasets, respectively. The top SNP per region explained between 0.96% and 1.10% and between 0.48% and 2.77% of the total variance of cryptorchidism incidence in the LW and LR populations, respectively. Five distinct QTL regions associated with scrotal/inguinal hernia were detected in both LW and LR datasets. The top SNP per region explained between 1.22% and 1.60% and between 1.15% and 1.46% of the total variance of scrotal/inguinal hernia incidence in the LW and LR populations, respectively. For each trait, we identified one overlapping region between the LW and LR datasets, i.e. a region on SSC8 (Sus scrofa chromosome) between 65 and 73 Mb for cryptorchidism and a region on SSC13 between 34 and 37 Mb for scrotal/inguinal hernia. The use of DEBV in combination with a binary trait model was a powerful approach to detect regions associated with difficult traits such as cryptorchidism and scrotal/inguinal hernia that have a low incidence and for which affected animals are generally not available for genotyping. 
Several novel QTL regions were detected for cryptorchidism and scrotal/inguinal hernia, and for several previously known QTL regions, the confidence interval was narrowed down.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oh, J; Deasy, J; Kerns, S
Purpose: We investigated whether integrating machine learning and bioinformatics techniques on genome-wide association study (GWAS) data can improve the performance of models predicting the risk of developing radiation-induced late rectal bleeding and erectile dysfunction in prostate cancer patients. Methods: We analyzed a GWAS dataset generated from 385 prostate cancer patients treated with radiotherapy. Using genotype information from these patients, we designed a machine learning-based predictive model of late radiation-induced toxicities: rectal bleeding and erectile dysfunction. The model was built using 2/3 of the samples (training) and tested with the remaining 1/3 (validation). To identify important single nucleotide polymorphisms (SNPs), we computed the SNP importance score from our random forest regression model. We performed gene ontology (GO) enrichment analysis for genes near the important SNPs. Results: After univariate analysis on the training dataset, we filtered out SNPs with p>0.001, leaving 749 and 367 SNPs for model building for rectal bleeding and erectile dysfunction, respectively. On the validation dataset, our random forest regression model achieved an area under the curve (AUC) of 0.70 and 0.62 for rectal bleeding and erectile dysfunction, respectively. We performed GO enrichment analysis for the top 25%, 50%, 75%, and 100% of the SNPs selected in the univariate analysis. When we used the top 50% of SNPs, more plausible biological processes were obtained for both toxicities. An additional test with the top 50% of SNPs improved predictive power, with AUC=0.71 and 0.65 for rectal bleeding and erectile dysfunction. For erectile dysfunction, performance improved further to AUC=0.67 when age and androgen deprivation therapy were added to the model.
Conclusion: Our approach, which combines machine learning and bioinformatics techniques, enabled the design of better models and the identification of more plausible biological processes associated with the outcomes.
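The evaluation scheme in the abstract above (a 2/3 training and 1/3 validation split, scored by AUC) can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names `split_two_thirds` and `auc` are hypothetical, no random forest is fitted, and the AUC is computed with the standard rank-based identity rather than with whatever software the authors used.

```python
import random


def split_two_thirds(samples, seed=0):
    """Shuffle samples and split them into 2/3 training and 1/3 validation.

    Hypothetical helper; the seed only makes the shuffle reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    labels: 1 for patients with the toxicity, 0 otherwise.
    scores: predicted risk; a tie between a case and a control counts as half.
    """
    cases = [s for label, s in zip(labels, scores) if label == 1]
    controls = [s for label, s in zip(labels, scores) if label == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

With 385 patients, `split_two_thirds(range(385))` gives 256 training and 129 validation samples; the exact rounding used in the study is not stated in the abstract.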
GenoCore: A simple and fast algorithm for core subset selection from large genotype datasets.
Jeong, Seongmun; Kim, Jae-Yoon; Jeong, Soon-Chun; Kang, Sung-Taeg; Moon, Jung-Kyung; Kim, Namshin
2017-01-01
Selecting core subsets from plant genotype datasets is important for enhancing cost-effectiveness and shortening the time required for genome-wide association studies (GWAS), genomics-assisted breeding of crop species, and similar analyses. Recently, a large number of genetic markers (>100,000 single nucleotide polymorphisms) have been identified from high-density single nucleotide polymorphism (SNP) arrays and next-generation sequencing (NGS) data. However, no software is available for picking out an efficient and consistent core subset from such a huge dataset. It is necessary to develop software that can coherently extract genetically important samples from a population. We here present a new program, GenoCore, which can quickly and efficiently find the core subset representing the entire population. We introduce simple measures of coverage and diversity scores, which reflect genotype errors and genetic variations, and can help to select samples rapidly and accurately from crop genotype datasets. We compared our method with other core collection software on example datasets to validate its performance in terms of genetic distance, diversity, coverage, required system resources, and the number of selected samples. GenoCore selects the smallest, most consistent, and most representative core collection from all samples, using less memory with more efficient scores, and shows greater genetic coverage compared to the other software tested. GenoCore was written in R language, and can be accessed online with an example dataset and test results at https://github.com/lovemun/Genocore.
Yang, Jinliang; Jiang, Haiying; Yeh, Cheng-Ting; Yu, Jianming; Jeddeloh, Jeffrey A; Nettleton, Dan; Schnable, Patrick S
2015-11-01
Although approaches for performing genome-wide association studies (GWAS) are well developed, conventional GWAS requires high-density genotyping of large numbers of individuals from a diversity panel. Here we report a method for performing GWAS that does not require genotyping of large numbers of individuals. Instead XP-GWAS (extreme-phenotype GWAS) relies on genotyping pools of individuals from a diversity panel that have extreme phenotypes. This analysis measures allele frequencies in the extreme pools, enabling discovery of associations between genetic variants and traits of interest. This method was evaluated in maize (Zea mays) using the well-characterized kernel row number trait, which was selected to enable comparisons between the results of XP-GWAS and conventional GWAS. An exome-sequencing strategy was used to focus sequencing resources on genes and their flanking regions. A total of 0.94 million variants were identified and served as evaluation markers; comparisons among pools showed that 145 of these variants were statistically associated with the kernel row number phenotype. These trait-associated variants were significantly enriched in regions identified by conventional GWAS. XP-GWAS was able to resolve several linked QTL and detect trait-associated variants within a single gene under a QTL peak. XP-GWAS is expected to be particularly valuable for detecting genes or alleles responsible for quantitative variation in species for which extensive genotyping resources are not available, such as wild progenitors of crops, orphan crops, and other poorly characterized species such as those of ecological interest. © 2015 The Authors The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.
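The core idea of comparing allele frequencies between extreme-phenotype pools can be sketched as a 2×2 test on allele counts. This is a generic Pearson chi-square on hypothetical counts, not the statistical model used in XP-GWAS (the abstract does not describe it), and it treats pooled counts as independent observations, which real pooled sequencing data may violate.

```python
def allele_frequency(alt_count, total_count):
    """Fraction of observations in a pool that carry the alternate allele."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return alt_count / total_count


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction).

    Table layout (hypothetical counts):
                    alt   ref
        high pool    a     b
        low pool     c     d
    """
    n = a + b + c + d
    row_high, row_low = a + b, c + d
    col_alt, col_ref = a + c, b + d
    if 0 in (row_high, row_low, col_alt, col_ref):
        raise ValueError("every row and column must have a non-zero total")
    return n * (a * d - b * c) ** 2 / (row_high * row_low * col_alt * col_ref)
```

For example, 60 alternate and 40 reference observations in the high pool against 30 and 70 in the low pool give frequencies of 0.6 and 0.3; a variant with identical frequencies in both pools gives a statistic of zero.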
Zhang, Kunlin; Chang, Suhua; Cui, Sijia; Guo, Liyuan; Zhang, Liuyan; Wang, Jing
2011-07-01
Genome-wide association studies (GWAS) are widely used to identify genes involved in human complex diseases and other traits. One key challenge in interpreting GWAS data is to identify causal SNPs and provide sound evidence of how they affect the trait. Current research focuses on identifying candidate causal variants among the most significant GWAS SNPs, but support for the underlying biological mechanisms, as represented by pathways, is lacking. Although pathway-based analysis (PBA) has been designed to identify disease-related pathways by analyzing the full list of SNPs from GWAS, it does not emphasize interpreting causal SNPs. To our knowledge, no web server is currently available that addresses this challenge within one analytical framework. ICSNPathway was developed to identify candidate causal SNPs and their corresponding candidate causal pathways from GWAS by integrating linkage disequilibrium (LD) analysis, functional SNP annotation and PBA. ICSNPathway provides a feasible solution to bridge the gap between GWAS and disease mechanism studies by generating hypotheses of the form SNP → gene → pathway(s). The ICSNPathway server is freely available at http://icsnpathway.psych.ac.cn/.
Ulmer, Megan; Li, Jun; Yaspan, Brian L; Ozel, Ayse Bilge; Richards, Julia E; Moroi, Sayoko E; Hawthorne, Felicia; Budenz, Donald L; Friedman, David S; Gaasterland, Douglas; Haines, Jonathan; Kang, Jae H; Lee, Richard; Lichter, Paul; Liu, Yutao; Pasquale, Louis R; Pericak-Vance, Margaret; Realini, Anthony; Schuman, Joel S; Singh, Kuldev; Vollrath, Douglas; Weinreb, Robert; Wollstein, Gadi; Zack, Donald J; Zhang, Kang; Young, Terri; Allingham, R Rand; Wiggs, Janey L; Ashley-Koch, Allison; Hauser, Michael A
2012-07-03
To investigate the effects of central corneal thickness (CCT)-associated variants on primary open-angle glaucoma (POAG) risk using single nucleotide polymorphisms (SNP) data from the Glaucoma Genes and Environment (GLAUGEN) and National Eye Institute (NEI) Glaucoma Human Genetics Collaboration (NEIGHBOR) consortia. A replication analysis of previously reported CCT SNPs was performed in a CCT dataset (n = 1117) and these SNPs were then tested for association with POAG using a larger POAG dataset (n = 6470). Then a CCT genome-wide association study (GWAS) was performed. Top SNPs from this analysis were selected and tested for association with POAG. cDNA libraries from fetal and adult brain and ocular tissue samples were generated and used for candidate gene expression analysis. Association with one of 20 previously published CCT SNPs was replicated: rs12447690, near the ZNF469 gene (P = 0.001; β = -5.08 μm/allele). None of these SNPs were significantly associated with POAG. In the CCT GWAS, no SNPs reached genome-wide significance. After testing 50 candidate SNPs for association with POAG, one SNP was identified, rs7481514 within the neurotrimin (NTM) gene, that was significantly associated with POAG in a low-tension subset (P = 0.00099; Odds Ratio [OR] = 1.28). Additionally, SNPs in the CNTNAP4 gene showed suggestive association with POAG (top SNP = rs1428758; P = 0.018; OR = 0.84). NTM and CNTNAP4 were shown to be expressed in ocular tissues. The results suggest previously reported CCT loci are not significantly associated with POAG susceptibility. By performing a quantitative analysis of CCT and a subsequent analysis of POAG, SNPs in two cell adhesion molecules, NTM and CNTNAP4, were identified and may increase POAG susceptibility in a subset of cases.
Huang, Yen-Tsung; Liang, Liming; Moffatt, Miriam F; Cookson, William O C M; Lin, Xihong
2015-07-01
Genome-wide association studies (GWAS) have been a standard practice in identifying single nucleotide polymorphisms (SNPs) for disease susceptibility. We propose a new approach, termed integrative GWAS (iGWAS) that exploits the information of gene expressions to investigate the mechanisms of the association of SNPs with a disease phenotype, and to incorporate the family-based design for genetic association studies. Specifically, the relations among SNPs, gene expression, and disease are modeled within the mediation analysis framework, which allows us to disentangle the genetic effect on a disease phenotype into two parts: an effect mediated through a gene expression (mediation effect, ME) and an effect through other biological mechanisms or environment-mediated mechanisms (alternative effect, AE). We develop omnibus tests for the ME and AE that are robust to underlying true disease models. Numerical studies show that the iGWAS approach is able to facilitate discovering genetic association mechanisms, and outperforms the SNP-only method for testing genetic associations. We conduct a family-based iGWAS of childhood asthma that integrates genetic and genomic data. The iGWAS approach identifies six novel susceptibility genes (MANEA, MRPL53, LYCAT, ST8SIA4, NDFIP1, and PTCH1) using the omnibus test with false discovery rate less than 1%, whereas no gene using SNP-only analyses survives with the same cut-off. The iGWAS analyses further characterize that genetic effects of these genes are mostly mediated through their gene expressions. In summary, the iGWAS approach provides a new analytic framework to investigate the mechanism of genetic etiology, and identifies novel susceptibility genes of childhood asthma that were biologically meaningful. © 2015 WILEY PERIODICALS, INC.
Jin, Ying; Andersen, Genevieve; Yorgov, Daniel; Ferrara, Tracey M; Ben, Songtao; Brownson, Kelly M; Holland, Paulene J; Birlea, Stanca A; Siebert, Janet; Hartmann, Anke; Lienert, Anne; van Geel, Nanja; Lambert, Jo; Luiten, Rosalie M; Wolkerstorfer, Albert; Wietze van der Veen, J P; Bennett, Dorothy C; Taïeb, Alain; Ezzedine, Khaled; Kemp, E Helen; Gawkrodger, David J; Weetman, Anthony P; Kõks, Sulev; Prans, Ele; Kingo, Külli; Karelson, Maire; Wallace, Margaret R; McCormack, Wayne T; Overbeck, Andreas; Moretti, Silvia; Colucci, Roberta; Picardo, Mauro; Silverberg, Nanette B; Olsson, Mats; Valle, Yan; Korobko, Igor; Böhm, Markus; Lim, Henry W; Hamzavi, Iltefat; Zhou, Li; Mi, Qing-Sheng; Fain, Pamela R; Santorico, Stephanie A; Spritz, Richard A
2016-11-01
Vitiligo is an autoimmune disease in which depigmented skin results from the destruction of melanocytes, with epidemiological association with other autoimmune diseases. In previous linkage and genome-wide association studies (GWAS1 and GWAS2), we identified 27 vitiligo susceptibility loci in patients of European ancestry. We carried out a third GWAS (GWAS3) in European-ancestry subjects, with augmented GWAS1 and GWAS2 controls, genome-wide imputation, and meta-analysis of all three GWAS, followed by an independent replication. The combined analyses, with 4,680 cases and 39,586 controls, identified 23 new significantly associated loci and 7 suggestive loci. Most encode immune and apoptotic regulators, with some also associated with other autoimmune diseases, as well as several melanocyte regulators. Bioinformatic analyses indicate a predominance of causal regulatory variation, some of which corresponds to expression quantitative trait loci (eQTLs) at these loci. Together, the identified genes provide a framework for the genetic architecture and pathobiology of vitiligo, highlight relationships with other autoimmune diseases and melanoma, and offer potential targets for treatment.
Jin, Ying; Andersen, Genevieve; Yorgov, Daniel; Ferrara, Tracey M; Ben, Songtao; Brownson, Kelly M; Holland, Paulene J; Birlea, Stanca A; Siebert, Janet; Hartmann, Anke; Lienert, Anne; van Geel, Nanja; Lambert, Jo; Luiten, Rosalie M; Wolkerstorfer, Albert; van der Veen, JP Wietze; Bennett, Dorothy C; Taïeb, Alain; Ezzedine, Khaled; Kemp, E Helen; Gawkrodger, David J; Weetman, Anthony P; Kõks, Sulev; Prans, Ele; Kingo, Külli; Karelson, Maire; Wallace, Margaret R; McCormack, Wayne T; Overbeck, Andreas; Moretti, Silvia; Colucci, Roberta; Picardo, Mauro; Silverberg, Nanette B; Olsson, Mats; Valle, Yan; Korobko, Igor; Böhm, Markus; Lim, Henry W.; Hamzavi, Iltefat; Zhou, Li; Mi, Qing-Sheng; Fain, Pamela R.; Santorico, Stephanie A; Spritz, Richard A
2016-01-01
Vitiligo is an autoimmune disease in which depigmented skin results from destruction of melanocytes, with epidemiologic association with other autoimmune diseases. In previous linkage and genome-wide association studies (GWAS1, GWAS2), we identified 27 vitiligo susceptibility loci in patients of European (EUR) ancestry. We carried out a third GWAS (GWAS3) in EUR subjects, with augmented GWAS1 and GWAS2 controls, genome-wide imputation, and meta-analysis of all three GWAS, followed by an independent replication. The combined analyses, with 4,680 cases and 39,586 controls, identified 23 new loci and 7 suggestive loci, most encoding immune and apoptotic regulators, some also associated with other autoimmune diseases, as well as several melanocyte regulators. Bioinformatic analyses indicate a predominance of causal regulatory variation, some corresponding to eQTL at these loci. Together, the identified genes provide a framework for vitiligo genetic architecture and pathobiology, highlight relationships to other autoimmune diseases and melanoma, and offer potential targets for treatment. PMID:27723757
Kilaru, V; Iyer, S V; Almli, L M; Stevens, J S; Lori, A; Jovanovic, T; Ely, T D; Bradley, B; Binder, E B; Koen, N; Stein, D J; Conneely, K N; Wingo, A P; Smith, A K; Ressler, K J
2016-05-24
Post-traumatic stress disorder (PTSD) develops in only some people following trauma exposure, but the mechanisms differentially explaining risk versus resilience remain largely unknown. PTSD is heritable but candidate gene studies and genome-wide association studies (GWAS) have identified only a modest number of genes that reliably contribute to PTSD. New gene-based methods may help identify additional genes that increase risk for PTSD development or severity. We applied gene-based testing to GWAS data from the Grady Trauma Project (GTP), a primarily African American cohort, and identified two genes (NLGN1 and ZNRD1-AS1) that associate with PTSD after multiple test correction. Although the top SNP from NLGN1 did not replicate, we observed gene-based replication of NLGN1 with PTSD in the Drakenstein Child Health Study (DCHS) cohort from Cape Town. NLGN1 has previously been associated with autism, and it encodes neuroligin 1, a protein involved in synaptogenesis, learning, and memory. Within the GTP dataset, a single nucleotide polymorphism (SNP), rs6779753, underlying the gene-based association, associated with the intermediate phenotypes of higher startle response and greater functional magnetic resonance imaging activation of the amygdala, orbitofrontal cortex, right thalamus and right fusiform gyrus in response to fearful faces. These findings support a contribution of the NLGN1 gene pathway to the neurobiological underpinnings of PTSD.
Kilaru, V; Iyer, S V; Almli, L M; Stevens, J S; Lori, A; Jovanovic, T; Ely, T D; Bradley, B; Binder, E B; Koen, N; Stein, D J; Conneely, K N; Wingo, A P; Smith, A K; Ressler, K J
2016-01-01
Post-traumatic stress disorder (PTSD) develops in only some people following trauma exposure, but the mechanisms differentially explaining risk versus resilience remain largely unknown. PTSD is heritable but candidate gene studies and genome-wide association studies (GWAS) have identified only a modest number of genes that reliably contribute to PTSD. New gene-based methods may help identify additional genes that increase risk for PTSD development or severity. We applied gene-based testing to GWAS data from the Grady Trauma Project (GTP), a primarily African American cohort, and identified two genes (NLGN1 and ZNRD1-AS1) that associate with PTSD after multiple test correction. Although the top SNP from NLGN1 did not replicate, we observed gene-based replication of NLGN1 with PTSD in the Drakenstein Child Health Study (DCHS) cohort from Cape Town. NLGN1 has previously been associated with autism, and it encodes neuroligin 1, a protein involved in synaptogenesis, learning, and memory. Within the GTP dataset, a single nucleotide polymorphism (SNP), rs6779753, underlying the gene-based association, associated with the intermediate phenotypes of higher startle response and greater functional magnetic resonance imaging activation of the amygdala, orbitofrontal cortex, right thalamus and right fusiform gyrus in response to fearful faces. These findings support a contribution of the NLGN1 gene pathway to the neurobiological underpinnings of PTSD. PMID:27219346
Sajuthi, Satria P.; Sharma, Neeraj K.; Chou, Jeff W.; Palmer, Nicholette D.; McWilliams, David R.; Beal, John; Comeau, Mary E.; Ma, Lijun; Calles-Escandon, Jorge; Demons, Jamehl; Rogers, Samantha; Cherry, Kristina; Menon, Lata; Kouba, Ethel; Davis, Donna; Burris, Marcie; Byerly, Sara J.; Ng, Maggie C.Y.; Maruthur, Nisa M.; Patel, Sanjay R.; Bielak, Lawrence F.; Lange, Leslie; Guo, Xiuqing; Sale, Michèle M.; Chan, Kei Hang; Monda, Keri L.; Chen, Gary K.; Taylor, Kira; Palmer, Cameron; Edwards, Todd L; North, Kari E.; Haiman, Christopher A.; Bowden, Donald W.; Freedman, Barry I.; Langefeld, Carl D.; Das, Swapan K.
2016-01-01
Relative to European Americans, type 2 diabetes (T2D) is more prevalent in African Americans (AAs). Genetic variation may modulate transcript abundance in insulin-responsive tissues and contribute to risk; yet published studies identifying expression quantitative trait loci (eQTLs) in African ancestry populations are restricted to blood cells. This study aims to develop a map of genetically regulated transcripts expressed in tissues important for glucose homeostasis in AAs, critical for identifying the genetic etiology of T2D and related traits. Quantitative measures of adipose and muscle gene expression, and genotypic data were integrated in 260 non-diabetic AAs to identify expression regulatory variants. Their roles in genetic susceptibility to T2D, and related metabolic phenotypes were evaluated by mining GWAS datasets. eQTL analysis identified 1,971 and 2,078 cis-eGenes in adipose and muscle, respectively. Cis-eQTLs for 885 transcripts including top cis-eGenes CHURC1, USMG5, and ERAP2, were identified in both tissues. 62.1% of top cis-eSNPs were within ±50kb of transcription start sites and cis-eGenes were enriched for mitochondrial transcripts. Mining GWAS databases revealed association of cis-eSNPs for more than 50 genes with T2D (e.g. PIK3C2A, RBMS1, UFSP1), gluco-metabolic phenotypes (e.g. INPP5E, SNX17, ERAP2, FN3KRP), and obesity (e.g. POMC, CPEB4). Integration of GWAS meta-analysis data from AA cohorts revealed the most significant association for cis-eSNPs of ATP5SL and MCCC1 genes, with T2D and BMI, respectively. This study developed the first comprehensive map of adipose and muscle tissue eQTLs in AAs (publicly accessible at https://mdsetaa.phs.wakehealth.edu) and identified genetically-regulated transcripts for delineating genetic causes of T2D, and related metabolic phenotypes. PMID:27193597
Fast Genome-Wide QTL Association Mapping on Pedigree and Population Data.
Zhou, Hua; Blangero, John; Dyer, Thomas D; Chan, Kei-Hang K; Lange, Kenneth; Sobel, Eric M
2017-04-01
Since most analysis software for genome-wide association studies (GWAS) currently exploits only unrelated individuals, there is a need for efficient applications that can handle general pedigree data or mixtures of both population and pedigree data. Even datasets thought to consist of only unrelated individuals may include cryptic relationships that can lead to false positives if not discovered and controlled for. In addition, family designs possess compelling advantages. They are better equipped to detect rare variants, control for population stratification, and facilitate the study of parent-of-origin effects. Pedigrees selected for extreme trait values often segregate a single gene with strong effect. Finally, many pedigrees are available as an important legacy from the era of linkage analysis. Unfortunately, pedigree likelihoods are notoriously hard to compute. In this paper, we reexamine the computational bottlenecks and implement ultra-fast pedigree-based GWAS analysis. Kinship coefficients can either be based on explicitly provided pedigrees or automatically estimated from dense markers. Our strategy (a) works for random sample data, pedigree data, or a mix of both; (b) entails no loss of power; (c) allows for any number of covariate adjustments, including correction for population stratification; (d) allows for testing SNPs under additive, dominant, and recessive models; and (e) accommodates both univariate and multivariate quantitative traits. On a typical personal computer (six CPU cores at 2.67 GHz), analyzing a univariate HDL (high-density lipoprotein) trait from the San Antonio Family Heart Study (935,392 SNPs on 1,388 individuals in 124 pedigrees) takes less than 2 min and 1.5 GB of memory. Complete multivariate QTL analysis of the three time-points of the longitudinal HDL multivariate trait takes less than 5 min and 1.5 GB of memory.
The algorithm is implemented as the Ped-GWAS Analysis (Option 29) in the Mendel statistical genetics package, which is freely available for Macintosh, Linux, and Windows platforms from http://genetics.ucla.edu/software/mendel. © 2016 WILEY PERIODICALS, INC.
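The abstract above mentions kinship coefficients derived from explicitly provided pedigrees. A minimal sketch of the textbook recursion follows; it is not the Mendel implementation, and the function name, input layout, and the assumptions that founders are unrelated and that parents are listed before their children are all choices made for this illustration.

```python
from functools import lru_cache


def kinship_matrix(pedigree):
    """Kinship coefficients for every pair of individuals in a pedigree.

    pedigree: list of (id, father_id, mother_id) tuples, with parents listed
    before their children and None for the unknown parents of founders.
    Founders are assumed unrelated and non-inbred (an assumption of this sketch).
    """
    order = {pid: k for k, (pid, _, _) in enumerate(pedigree)}
    parents = {pid: (father, mother) for pid, father, mother in pedigree}

    @lru_cache(maxsize=None)
    def phi(i, j):
        if i is None or j is None:
            return 0.0  # an unknown parent contributes no shared ancestry
        if i == j:
            father, mother = parents[i]
            return 0.5 * (1.0 + phi(father, mother))
        # Recurse on whoever appears later in the pedigree, so the recursion
        # never passes through an ancestor of the other individual.
        if order[i] < order[j]:
            i, j = j, i
        father, mother = parents[i]
        return 0.5 * (phi(father, j) + phi(mother, j))

    ids = [pid for pid, _, _ in pedigree]
    return {(a, b): phi(a, b) for a in ids for b in ids}
```

For a nuclear family of two unrelated parents and two children, this gives 0.5 for each individual with themselves, 0.25 for parent-child and full-sibling pairs, and 0 between the parents.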
Orozco, Gisela; Goh, Chee L; Al Olama, Ali Amin; Benlloch-Garcia, Sara; Govindasami, Koveela; Guy, Michelle; Muir, Kenneth R; Giles, Graham G; Severi, Gianluca; Neal, David E; Hamdy, Freddie C; Donovan, Jenny L; Kote-Jarai, Zsofia; Easton, Douglas F; Eyre, Steve; Eeles, Rosalind A
2013-06-01
WHAT'S KNOWN ON THE SUBJECT? AND WHAT DOES THE STUDY ADD?: The link between inflammation and cancer has long been reported and inflammation is thought to play a role in the pathogenesis of many cancers, including prostate cancer (PrCa). Over the last 5 years, genome-wide association studies (GWAS) have reported numerous susceptibility loci that predispose individuals to many different traits. The present study aims to ascertain if there are common genetic risk profiles that might predispose individuals to both PrCa and the autoimmune inflammatory condition, rheumatoid arthritis. These results could have potential public health impact in terms of screening and chemoprevention. To investigate if potential common pathways exist for the pathogenesis of autoimmune disease and prostate cancer (PrCa). To ascertain if the single nucleotide polymorphisms (SNPs) reported by genome-wide association studies (GWAS) as being associated with susceptibility to PrCa are also associated with susceptibility to the autoimmune disease rheumatoid arthritis (RA). The original Wellcome Trust Case Control Consortium (WTCCC) UK RA GWAS study was expanded to include a total of 3221 cases and 5272 controls. In all, 37 germline autosomal SNPs at genome-wide significance associated with PrCa risk were identified from a UK/Australian PrCa GWAS. Allele frequencies were compared for these 37 SNPs between RA cases and controls using a chi-squared trend test and corrected for multiple testing (Bonferroni). In all, 33 SNPs could be analysed in the RA dataset. Proxies could not be located for the SNPs in 3q26, 5p15 and for two SNPs in 17q12. After applying a Bonferroni correction for the number of SNPs tested, the SNP mapping to CCHCR1 (rs130067) retained statistically significant evidence for association (P = 6 × 10^-4; odds ratio [OR] = 1.15, 95% CI: 1.06-1.24); this has also been associated with psoriasis.
However, further analyses showed that the association of this allele was due to confounding by RA-associated HLA-DRB1 alleles. There is currently no evidence that SNPs associated with PrCa at genome-wide significance are associated with the development of RA. Studies like this are important in determining if common genetic risk profiles might predispose individuals to many diseases, which could have implications for public health in terms of screening and chemoprevention. © 2012 BJU International.
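The Bonferroni step in the abstract above can be checked with a few lines of arithmetic. The sketch assumes a family-wise significance level of 0.05, which the abstract does not state; the function names are illustrative only.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def passes_bonferroni(p_value, alpha, n_tests):
    """True if a single p-value survives correction for n_tests tests."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Assumed alpha of 0.05 with the 33 SNPs that could be analysed.
threshold = bonferroni_threshold(0.05, 33)
rs130067_survives = passes_bonferroni(6e-4, 0.05, 33)
```

Under that assumed level, the per-SNP threshold is about 0.0015, so the reported P = 6 × 10^-4 for rs130067 falls below it, consistent with the abstract's statement that the SNP remained significant after correction.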
Fernández, Maria V.; Budde, John; Del-Aguila, Jorge L.; Ibañez, Laura; Deming, Yuetiva; Harari, Oscar; Norton, Joanne; Morris, John C.; Goate, Alison M.; Cruchaga, Carlos
2018-01-01
Gene-based tests to study the combined effect of rare variants on a particular phenotype have been widely developed for case-control studies, but their evolution and adaptation for family-based studies, especially studies of complex incomplete families, has been slower. In this study, we have performed a practical examination of all the latest gene-based methods available for family-based study designs using both simulated and real datasets. We examined the performance of several collapsing, variance-component, and transmission disequilibrium tests across eight different software packages and 22 models utilizing a cohort of 285 families (N = 1,235) with late-onset Alzheimer disease (LOAD). After a thorough examination of each of these tests, we propose a methodological approach to identify, with high confidence, genes associated with the tested phenotype and we provide recommendations to select the best software and model for family-based gene-based analyses. Additionally, in our dataset, we identified PTK2B, a GWAS candidate gene for sporadic AD, along with six novel genes (CHRD, CLCN2, HDLBP, CPAMD8, NLRP9, and MAS1L) as candidate genes for familial LOAD. PMID:29670507
Duncan, Emma L; Danoy, Patrick; Kemp, John P; Leo, Paul J; McCloskey, Eugene; Nicholson, Geoffrey C; Eastell, Richard; Prince, Richard L; Eisman, John A; Jones, Graeme; Sambrook, Philip N; Reid, Ian R; Dennison, Elaine M; Wark, John; Richards, J Brent; Uitterlinden, Andre G; Spector, Tim D; Esapa, Chris; Cox, Roger D; Brown, Steve D M; Thakker, Rajesh V; Addison, Kathryn A; Bradbury, Linda A; Center, Jacqueline R; Cooper, Cyrus; Cremin, Catherine; Estrada, Karol; Felsenberg, Dieter; Glüer, Claus-C; Hadler, Johanna; Henry, Margaret J; Hofman, Albert; Kotowicz, Mark A; Makovey, Joanna; Nguyen, Sing C; Nguyen, Tuan V; Pasco, Julie A; Pryce, Karena; Reid, David M; Rivadeneira, Fernando; Roux, Christian; Stefansson, Kari; Styrkarsdottir, Unnur; Thorleifsson, Gudmar; Tichawangana, Rumbidzai; Evans, David M; Brown, Matthew A
2011-04-01
Osteoporotic fracture is a major cause of morbidity and mortality worldwide. Low bone mineral density (BMD) is a major predisposing factor to fracture and is known to be highly heritable. Site-, gender-, and age-specific genetic effects on BMD are thought to be significant, but have largely not been considered in the design of genome-wide association studies (GWAS) of BMD to date. We report here a GWAS using a novel study design focusing on women of a specific age (postmenopausal women, age 55-85 years), with either extreme high or low hip BMD (age- and gender-adjusted BMD z-scores of +1.5 to +4.0, n = 1055, or -4.0 to -1.5, n = 900), with replication in cohorts of women drawn from the general population (n = 20,898). The study replicates 21 of 26 known BMD-associated genes. Additionally, we report suggestive association of a further six new genetic associations in or around the genes CLCN7, GALNT3, IBSP, LTBP3, RSPO3, and SOX4, with replication in two independent datasets. A novel mouse model with a loss-of-function mutation in GALNT3 is also reported, which has high bone mass, supporting the involvement of this gene in BMD determination. In addition to identifying further genes associated with BMD, this study confirms the efficiency of extreme-truncate selection designs for quantitative trait association studies.
Duncan, Emma L.; Danoy, Patrick; Kemp, John P.; Leo, Paul J.; McCloskey, Eugene; Nicholson, Geoffrey C.; Eastell, Richard; Prince, Richard L.; Eisman, John A.; Jones, Graeme; Sambrook, Philip N.; Reid, Ian R.; Dennison, Elaine M.; Wark, John; Richards, J. Brent; Uitterlinden, Andre G.; Spector, Tim D.; Esapa, Chris; Cox, Roger D.; Brown, Steve D. M.; Thakker, Rajesh V.; Addison, Kathryn A.; Bradbury, Linda A.; Center, Jacqueline R.; Cooper, Cyrus; Cremin, Catherine; Estrada, Karol; Felsenberg, Dieter; Glüer, Claus-C.; Hadler, Johanna; Henry, Margaret J.; Hofman, Albert; Kotowicz, Mark A.; Makovey, Joanna; Nguyen, Sing C.; Nguyen, Tuan V.; Pasco, Julie A.; Pryce, Karena; Reid, David M.; Rivadeneira, Fernando; Roux, Christian; Stefansson, Kari; Styrkarsdottir, Unnur; Thorleifsson, Gudmar; Tichawangana, Rumbidzai; Evans, David M.; Brown, Matthew A.
2011-01-01
Osteoporotic fracture is a major cause of morbidity and mortality worldwide. Low bone mineral density (BMD) is a major predisposing factor to fracture and is known to be highly heritable. Site-, gender-, and age-specific genetic effects on BMD are thought to be significant, but have largely not been considered in the design of genome-wide association studies (GWAS) of BMD to date. We report here a GWAS using a novel study design focusing on women of a specific age (postmenopausal women, age 55–85 years), with either extreme high or low hip BMD (age- and gender-adjusted BMD z-scores of +1.5 to +4.0, n = 1055, or −4.0 to −1.5, n = 900), with replication in cohorts of women drawn from the general population (n = 20,898). The study replicates 21 of 26 known BMD–associated genes. Additionally, we report suggestive association of a further six new genetic associations in or around the genes CLCN7, GALNT3, IBSP, LTBP3, RSPO3, and SOX4, with replication in two independent datasets. A novel mouse model with a loss-of-function mutation in GALNT3 is also reported, which has high bone mass, supporting the involvement of this gene in BMD determination. In addition to identifying further genes associated with BMD, this study confirms the efficiency of extreme-truncate selection designs for quantitative trait association studies. PMID:21533022
Byrne, Enda M; Gehrman, Philip R; Trzaskowski, Maciej; Tiemeier, Henning; Pack, Allan I
2016-10-01
We sought to examine how much of the heritability of self-report sleep duration is tagged by common genetic variation in populations of European ancestry and to test if the common variants contributing to sleep duration are also associated with other diseases and traits. We utilized linkage disequilibrium (LD)-score regression to estimate the heritability tagged by common single nucleotide polymorphisms (SNPs) in the CHARGE consortium genome-wide association study (GWAS) of self-report sleep duration. We also used bivariate LD-score regression to investigate the genetic correlation of sleep duration with other publicly available GWAS datasets. We show that 6% (SE = 1%) of the variance in self-report sleep duration in the CHARGE study is tagged by common SNPs in European populations. Furthermore, we find evidence of a positive genetic correlation (rG) between sleep duration and type 2 diabetes (rG = 0.26, P = 0.02), and between sleep duration and schizophrenia (rG = 0.19, P = 0.01). Our results show that increased sample sizes will identify more common variants for self-report sleep duration; however, the heritability tagged is small when compared to other traits and diseases. These results also suggest that those who carry variants that increase risk to type 2 diabetes and schizophrenia are more likely to report longer sleep duration. © 2016 Associated Professional Sleep Societies, LLC.
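The regression idea behind the heritability estimate above can be sketched in a few lines. This is only a toy illustration, assuming the standard LD-score relation E[chi2_j] = intercept + (N * h2 / M) * l_j and an unweighted least-squares fit; the function name `ldsc_h2` and the toy inputs are hypothetical, and real LD-score regression additionally uses regression weights and block-jackknife standard errors.

```python
def ldsc_h2(chi2, ld_scores, n_samples, n_snps):
    """Toy LD-score regression: fit chi2_j = intercept + slope * l_j by
    ordinary least squares and convert the slope to h2 = slope * M / N.
    (Assumption: no weighting and no jackknife, unlike real LDSC.)"""
    k = len(chi2)
    mean_l = sum(ld_scores) / k
    mean_c = sum(chi2) / k
    sxy = sum((l - mean_l) * (c - mean_c) for l, c in zip(ld_scores, chi2))
    sxx = sum((l - mean_l) ** 2 for l in ld_scores)
    slope = sxy / sxx
    intercept = mean_c - slope * mean_l
    return slope * n_snps / n_samples, intercept


# Synthetic, noise-free inputs built from the relation itself (not real data).
toy_ld = [1.0, 5.0, 10.0, 20.0, 40.0]
toy_chi2 = [1.0 + 50000 * 0.06 * l / 1000 for l in toy_ld]
toy_h2, toy_intercept = ldsc_h2(toy_chi2, toy_ld, n_samples=50000, n_snps=1000)
```

An intercept near 1 in such a fit is usually read as little confounding, while the slope carries the polygenic signal.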
Figueroa, Jonine D.; Middlebrooks, Candace D.; Banday, A. Rouf; Ye, Yuanqing; Garcia-Closas, Montserrat; Chatterjee, Nilanjan; Koutros, Stella; Kiemeney, Lambertus A.; Rafnar, Thorunn; Bishop, Timothy; Furberg, Helena; Matullo, Giuseppe; Golka, Klaus; Gago-Dominguez, Manuela; Taylor, Jack A.; Fletcher, Tony; Siddiq, Afshan; Cortessis, Victoria K.; Kooperberg, Charles; Cussenot, Olivier; Benhamou, Simone; Prescott, Jennifer; Porru, Stefano; Dinney, Colin P.; Malats, Núria; Baris, Dalsu; Purdue, Mark P.; Jacobs, Eric J.; Albanes, Demetrius; Wang, Zhaoming; Chung, Charles C.; Vermeulen, Sita H.; Aben, Katja K.; Galesloot, Tessel E.; Thorleifsson, Gudmar; Sulem, Patrick; Stefansson, Kari; Kiltie, Anne E.; Harland, Mark; Teo, Mark; Offit, Kenneth; Vijai, Joseph; Bajorin, Dean; Kopp, Ryan; Fiorito, Giovanni; Guarrera, Simonetta; Sacerdote, Carlotta; Selinski, Silvia; Hengstler, Jan G.; Gerullis, Holger; Ovsiannikov, Daniel; Blaszkewicz, Meinolf; Castelao, Jose Esteban; Calaza, Manuel; Martinez, Maria Elena; Cordeiro, Patricia; Xu, Zongli; Panduri, Vijayalakshmi; Kumar, Rajiv; Gurzau, Eugene; Koppova, Kvetoslava; Bueno-De-Mesquita, H. Bas; Ljungberg, Börje; Clavel-Chapelon, Françoise; Weiderpass, Elisabete; Krogh, Vittorio; Dorronsoro, Miren; Travis, Ruth C.; Tjønneland, Anne; Brennan, Paul; Chang-Claude, Jenny; Riboli, Elio; Conti, David; Stern, Marianna C.; Pike, Malcolm C.; Van Den Berg, David; Yuan, Jian-Min; Hohensee, Chancellor; Jeppson, Rebecca P.; Cancel-Tassin, Geraldine; Roupret, Morgan; Comperat, Eva; Turman, Constance; De Vivo, Immaculata; Giovannucci, Edward; Hunter, David J.; Kraft, Peter; Lindstrom, Sara; Carta, Angela; Pavanello, Sofia; Arici, Cecilia; Mastrangelo, Giuseppe; Kamat, Ashish M.; Zhang, Liren; Gong, Yilei; Pu, Xia; Hutchinson, Amy; Burdett, Laurie; Wheeler, William A.; Karagas, Margaret R.; Johnson, Alison; Schned, Alan; Monawar Hosain, G. M.; Schwenn, Molly; Kogevinas, Manolis; Tardón, Adonina; Serra, Consol; Carrato, Alfredo; García-Closas, Reina; Lloreta, Josep; Andriole, Gerald; Grubb, Robert; Black, Amanda; Diver, W. Ryan; Gapstur, Susan M.; Weinstein, Stephanie; Virtamo, Jarmo; Haiman, Christopher A.; Landi, Maria Teresa; Caporaso, Neil E.; Fraumeni, Joseph F.; Vineis, Paolo; Wu, Xifeng; Chanock, Stephen J.; Silverman, Debra T.; Prokunina-Olsson, Ludmila; Rothman, Nathaniel
2016-01-01
Candidate gene and genome-wide association studies (GWAS) have identified 15 independent genomic regions associated with bladder cancer risk. In search for additional susceptibility variants, we followed up on four promising single-nucleotide polymorphisms (SNPs) that had not achieved genome-wide significance in 6911 cases and 11 814 controls (rs6104690, rs4510656, rs5003154 and rs4907479, P < 1 × 10−6), using additional data from existing GWAS datasets and targeted genotyping for studies that did not have GWAS data. In a combined analysis, which included data on up to 15 058 cases and 286 270 controls, two SNPs achieved genome-wide statistical significance: rs6104690 in a gene desert at 20p12.2 (P = 2.19 × 10−11) and rs4907479 within the MCF2L gene at 13q34 (P = 3.3 × 10−10). Imputation and fine-mapping analyses were performed in these two regions for a subset of 5551 bladder cancer cases and 10 242 controls. Analyses at the 13q34 region suggest a single signal marked by rs4907479. In contrast, we detected two signals in the 20p12.2 region—the first signal is marked by rs6104690, and the second signal is marked by two moderately correlated SNPs (r2 = 0.53), rs6108803 and the previously reported rs62185668. The second 20p12.2 signal is more strongly associated with the risk of muscle-invasive (T2-T4 stage) compared with non-muscle-invasive (Ta, T1 stage) bladder cancer (case–case P ≤ 0.02 for both rs62185668 and rs6108803). Functional analyses are needed to explore the biological mechanisms underlying these novel genetic associations with risk for bladder cancer. PMID:26732427
Hsu, Yi-Hsiang; Kiel, Douglas P
2012-10-01
The primary goals of genome-wide association studies (GWAS) are to discover new molecular and biological pathways involved in the regulation of bone metabolism that can be leveraged for drug development. In addition, the identified genetic determinants may be used to enhance current risk factor profiles. There have been more than 40 published GWAS on skeletal phenotypes, predominantly focused on dual-energy x-ray absorptiometry-derived bone mineral density (BMD) of the hip and spine. Sixty-six BMD loci have been replicated across all the published GWAS, confirming the highly polygenic nature of BMD variation. Only seven of the 66 previously reported genes (LRP5, SOST, ESR1, TNFRSF11B, TNFRSF11A, TNFSF11, PTH) from candidate gene association studies have been confirmed by GWAS. Among 59 novel BMD GWAS loci that have not been reported by previous candidate gene association studies, some have been shown to be involved in key biological pathways involving the skeleton, particularly Wnt signaling (AXIN1, LRP5, CTNNB1, DKK1, FOXC2, HOXC6, LRP4, MEF2C, PTHLH, RSPO3, SFRP4, TGFBR3, WLS, WNT3, WNT4, WNT5B, WNT16), bone development: ossification (CLCN7, CSF1, MEF2C, MEPE, PKDCC, PTHLH, RUNX2, SOX6, SOX9, SPP1, SP7), mesenchymal-stem-cell differentiation (FAM3C, MEF2C, RUNX2, SOX4, SOX9, SP7), osteoclast differentiation (JAG1, RUNX2), and TGF-signaling (FOXL1, SPTBN1, TGFBR3). There are still 30 BMD GWAS loci without prior molecular or biological evidence of their involvement in skeletal phenotypes. Other skeletal phenotypes that either have been or are being studied include hip geometry, bone ultrasound, quantitative computed tomography, high-resolution peripheral quantitative computed tomography, biochemical markers, and fractures such as vertebral, nonvertebral, hip, and forearm. Although several challenges lie ahead as GWAS moves into the next generation, there are prospects of new discoveries in skeletal biology. This review integrates findings from previous GWAS and provides a roadmap for future directions building on current GWAS successes.
Khramtsova, Ekaterina A; Stranger, Barbara E
2017-02-01
Over the last decade, genome-wide association studies (GWAS) have generated vast amounts of analysis results, requiring the development of novel tools for data visualization. Quantile–quantile (QQ) plots and Manhattan plots are classical tools that have been used to visually summarize GWAS results and identify genetic variants significantly associated with traits of interest. However, static visualizations limit the information that can be shown. Here, we present Assocplots, a Python package for viewing and exploring GWAS results not only using classic static Manhattan and QQ plots, but also through a dynamic extension that allows interactive visualization of the relationships between GWAS results from multiple cohorts or studies. The Assocplots package is open source and distributed under the MIT license via GitHub (https://github.com/khramts/assocplots) along with examples, documentation and installation instructions. Contact: ekhramts@medicine.bsd.uchicago.edu or bstranger@medicine.bsd.uchicago.edu
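To make the QQ plot mentioned above concrete, the coordinates it draws can be computed as follows. This is a minimal sketch and not the Assocplots API: the function name `qq_points` is hypothetical, and the expected quantile rank / (n + 1) is one common convention among several.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs for a QQ plot.

    P-values are sorted ascending; the i-th smallest (1-based) is paired
    with the uniform expectation i / (n + 1). All p-values must be > 0.
    """
    ordered = sorted(p_values)
    n = len(ordered)
    points = []
    for rank, p in enumerate(ordered, start=1):
        expected = -math.log10(rank / (n + 1))
        observed = -math.log10(p)
        points.append((expected, observed))
    return points


# Points far above the diagonal (observed > expected) suggest association
# signal or inflation of the test statistics.
example_points = qq_points([0.5, 0.01, 0.2])
```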
Statistical methods to detect novel genetic variants using publicly available GWAS summary data.
Guo, Bin; Wu, Baolin
2018-03-01
We propose statistical methods to detect novel genetic variants using only genome-wide association study (GWAS) summary data, without access to raw genotype and phenotype data. With more and more summary data being posted for public access in the post-GWAS era, the proposed methods are practically useful for identifying additional interesting genetic variants and shedding light on the underlying disease mechanism. We illustrate the utility of our proposed methods with an application to GWAS meta-analysis results of fasting glucose from the international MAGIC consortium. We found several novel genome-wide significant loci that are worth further study. Copyright © 2018 Elsevier Ltd. All rights reserved.
Wei, Wen-Hua; Massey, Jonathan; Worthington, Jane; Barton, Anne; Warren, Richard B
2018-03-01
Genome-wide association studies (GWASs) have identified a number of loci for psoriasis but have largely ignored non-additive effects. We report a genotypic variability-based GWAS (vGWAS) that can prioritize non-additive loci without requiring prior knowledge of interaction types or interacting factors. It proceeds in two steps: a mixed model first partitions dichotomous phenotypes into an additive component and non-additive environmental residuals on the liability scale, and Levene's (Brown-Forsythe) test then assesses equality of the residual variances across genotype groups genome-wide. The vGWAS identified two genome-wide significant (P < 5.0e-08) non-additive loci, HLA-C and IL12B, that were also genome-wide significant in an accompanying GWAS in the discovery cohort. Both loci were statistically replicated in a vGWAS of an independent cohort with a small sample size. HLA-C and IL12B have been reported in moderate gene-gene and/or gene-environment interactions on several occasions. We found a moderate interaction with age-of-onset of psoriasis, which was replicated indirectly. The vGWAS also revealed five suggestive loci (P < 6.76e-05), including FUT2, which was associated with psoriasis with environmental aspects triggered by virus infection and/or metabolic factors. Replication and functional investigation are needed to validate the suggestive vGWAS loci.
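The variance-equality step described above can be illustrated with a small sketch of the Brown-Forsythe statistic (Levene's test centred on the median). It assumes the per-genotype residuals are already available; the function name `brown_forsythe_f` is hypothetical, and the mixed-model step and the F-distribution p-value are deliberately left out.

```python
from statistics import median


def brown_forsythe_f(groups):
    """Brown-Forsythe statistic for equality of variances.

    groups: one list of residuals per genotype group (e.g. AA, AB, BB).
    Returns (F, df_between, df_within). F is a one-way ANOVA F statistic
    computed on absolute deviations from each group's median.
    """
    devs = []
    for g in groups:
        centre = median(g)
        devs.append([abs(x - centre) for x in g])
    k = len(devs)
    n_total = sum(len(d) for d in devs)
    grand_mean = sum(sum(d) for d in devs) / n_total
    group_means = [sum(d) / len(d) for d in devs]
    between = sum(len(d) * (m - grand_mean) ** 2
                  for d, m in zip(devs, group_means))
    within = sum((z - m) ** 2
                 for d, m in zip(devs, group_means) for z in d)
    f_stat = (between / (k - 1)) / (within / (n_total - k))
    return f_stat, k - 1, n_total - k
```

A large F relative to the F(df_between, df_within) distribution indicates that residual variance differs by genotype, which is the kind of signal a variability-based scan looks for.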
Xu, Zheng; Zhang, Guosheng; Duan, Qing; Chai, Shengjie; Zhang, Baqun; Wu, Cong; Jin, Fulai; Yue, Feng; Li, Yun; Hu, Ming
2016-03-11
Genome-wide association studies (GWAS) have identified thousands of genetic variants associated with complex traits and diseases. However, most of them are located in the non-protein coding regions, and therefore it is challenging to hypothesize the functions of these non-coding GWAS variants. Recent large efforts such as the ENCODE and Roadmap Epigenomics projects have predicted a large number of regulatory elements. However, the target genes of these regulatory elements remain largely unknown. Chromatin conformation capture based technologies such as Hi-C can directly measure the chromatin interactions and have generated an increasingly comprehensive catalog of the interactome between the distal regulatory elements and their potential target genes. Leveraging such information revealed by Hi-C holds the promise of elucidating the functions of genetic variants in human diseases. In this work, we present HiView, the first integrative genome browser to leverage Hi-C results for the interpretation of GWAS variants. HiView is able to display Hi-C data and statistical evidence for chromatin interactions in genomic regions surrounding any given GWAS variant, enabling straightforward visualization and interpretation. We believe that as the first GWAS variants-centered Hi-C genome browser, HiView is a useful tool guiding post-GWAS functional genomics studies. HiView is freely accessible at: http://www.unc.edu/~yunmli/HiView .
Carlson, Christopher S.; Matise, Tara C.; North, Kari E.; Haiman, Christopher A.; Fesinmeyer, Megan D.; Buyske, Steven; Schumacher, Fredrick R.; Peters, Ulrike; Franceschini, Nora; Ritchie, Marylyn D.; Duggan, David J.; Spencer, Kylee L.; Dumitrescu, Logan; Eaton, Charles B.; Thomas, Fridtjof; Young, Alicia; Carty, Cara; Heiss, Gerardo; Le Marchand, Loic; Crawford, Dana C.; Hindorff, Lucia A.; Kooperberg, Charles L.
2013-09-01
The vast majority of genome-wide association study (GWAS) findings reported to date are from populations with European Ancestry (EA), and it is not yet clear how broadly the genetic associations described will generalize to populations of diverse ancestry. The Population Architecture Using Genomics and Epidemiology (PAGE) study is a consortium of multi-ancestry, population-based studies formed with the objective of refining our understanding of the genetic architecture of common traits emerging from GWAS. In the present analysis of five common diseases and traits, including body mass index, type 2 diabetes, and lipid levels, we compare direction and magnitude of effects for GWAS-identified variants in multiple non-EA populations against EA findings. We demonstrate that, in all populations analyzed, a significant majority of GWAS-identified variants have allelic associations in the same direction as in EA, with none showing a statistically significant effect in the opposite direction, after adjustment for multiple testing. However, 25% of tagSNPs identified in EA GWAS have significantly different effect sizes in at least one non-EA population, and these differential effects were most frequent in African Americans where all differential effects were diluted toward the null. We demonstrate that differential LD between tagSNPs and functional variants within populations contributes significantly to dilute effect sizes in this population. Although most variants identified from GWAS in EA populations generalize to all non-EA populations assessed, genetic models derived from GWAS findings in EA may generate spurious results in non-EA populations due to differential effect sizes. Regardless of the origin of the differential effects, caution should be exercised in applying any genetic risk prediction model based on tagSNPs outside of the ancestry group in which it was derived. Models based directly on functional variation may generalize more robustly, but the identification of functional variants remains challenging. PMID:24068893
[Genetic factors in myocardial infarction].
Hara, Masahiko; Sakata, Yasuhiko; Sato, Hiroshi
2013-02-01
One of the main mechanisms of acute myocardial infarction (AMI) is plaque rupture or erosion followed by intraluminal thrombus formation and occlusion of the coronary arteries. Thus far, many underlying conditions or environmental factors, such as hypertension, diabetes, dyslipidemia, smoking or obesity, as well as a family history of coronary artery diseases have been identified as risks for the onset of AMI. These risks suggest that AMI occurs due to interactions between underlying conditions and multiple genetic susceptibilities. For this reason, many target gene-disease association studies have been performed with the recent introduction of genome-wide association studies (GWAS) that have further revealed new genetic susceptibilities for AMI. GWAS is a way to examine many common genetic variants in different individuals to see if any variant is associated with a trait in a case-control fashion, and typically focuses on associations between single-nucleotide polymorphisms (SNP) and traits. SNP on chromosome 9p21 is one of the robust susceptibility variants for AMI which has been identified by many GWAS. In this review, we overview the methodology of GWAS, introduce genetic variants identified by GWAS as those with susceptibility for AMI, and describe the foresight of using GWAS to investigate genetic susceptibility to AMI.
BioSMACK: a linux live CD for genome-wide association analyses.
Hong, Chang Bum; Kim, Young Jin; Moon, Sanghoon; Shin, Young-Ah; Go, Min Jin; Kim, Dong-Joon; Lee, Jong-Young; Cho, Yoon Shin
2012-01-01
Recent advances in high-throughput genotyping technologies have enabled us to conduct a genome-wide association study (GWAS) on a large cohort. However, analyzing millions of single nucleotide polymorphisms (SNPs) is still a difficult task for researchers conducting a GWAS. Several difficulties such as compatibilities and dependencies are often encountered by researchers using analytical tools, during the installation of software. This is a huge obstacle to any research institute without computing facilities and specialists. Therefore, a proper research environment is an urgent need for researchers working on GWAS. We developed BioSMACK to provide a research environment for GWAS that requires no configuration and is easy to use. BioSMACK is based on the Ubuntu Live CD that offers a complete Linux-based operating system environment without installation. Moreover, we provide users with a GWAS manual consisting of a series of guidelines for GWAS and useful examples. BioSMACK is freely available at http://ksnp.cdc.go.kr/biosmack.
Hall, F Scott; Drgonova, Jana; Jain, Siddharth; Uhl, George R
2013-12-01
Substantial genetic contributions to addiction vulnerability are supported by data from twin studies, linkage studies, candidate gene association studies and, more recently, Genome Wide Association Studies (GWAS). Parallel to this work, animal studies have attempted to identify the genes that may contribute to responses to addictive drugs and addiction liability, initially focusing upon genes for the targets of the major drugs of abuse. These studies identified genes/proteins that affect responses to drugs of abuse; however, this does not necessarily mean that variation in these genes contributes to the genetic component of addiction liability. One of the major problems with initial linkage and candidate gene studies was an a priori focus on the genes thought to be involved in addiction based upon the known contributions of those proteins to drug actions, making the identification of novel genes unlikely. The GWAS approach is systematic and agnostic to such a priori assumptions. From the numerous GWAS now completed several conclusions may be drawn: (1) addiction is highly polygenic; each allelic variant contributing in a small, additive fashion to addiction vulnerability; (2) unexpected, compared to our a priori assumptions, classes of genes are most important in explaining addiction vulnerability; (3) although substantial genetic heterogeneity exists, there is substantial convergence of GWAS signals on particular genes. This review traces the history of this research; from initial transgenic mouse models based upon candidate gene and linkage studies, through the progression of GWAS for addiction and nicotine cessation, to the current human and transgenic mouse studies post-GWAS. © 2013.
Kim, Jihye; Yoo, Minjae; Shin, Jimin; Kim, Hyunmin; Kang, Jaewoo; Tan, Aik Choon
2018-01-01
Traditional Chinese medicine (TCM), which originated in ancient China, has been practiced for thousands of years to treat various symptoms and diseases. However, the molecular mechanisms of TCM in treating these diseases remain unknown. In this study, we employ a systems pharmacology-based approach for connecting GWAS diseases with TCM for potential drug repurposing and repositioning. We studied 102 TCM components and their target genes by analyzing microarray gene expression experiments. We constructed disease-gene networks from 2558 GWAS studies. We applied a systems pharmacology approach to prioritize disease-target genes. Using this bioinformatics approach, we analyzed 14,713 GWAS disease-TCM-target gene pairs and identified 115 disease-gene pairs with q value < 0.2. We validated several of these GWAS disease-TCM-target gene pairs with literature evidence, demonstrating that this computational approach could reveal novel indications for TCM. We also developed the TCM-Disease web application to facilitate TCM drug repurposing efforts. Systems pharmacology is a promising approach for connecting GWAS diseases with TCM for potential drug repurposing and repositioning. The computational approaches described in this study could easily be extended to other disease-gene network analyses. PMID:29765977
Rare Variant Association Test with Multiple Phenotypes
Lee, Selyeong; Won, Sungho; Kim, Young Jin; Kim, Yongkang; Kim, Bong-Jo; Park, Taesung
2016-01-01
Although genome-wide association studies (GWAS) have now discovered thousands of genetic variants associated with common traits, such variants cannot explain the large degree of “missing heritability,” likely due to rare variants. The advent of next generation sequencing technology has allowed rare variant detection and association with common traits, often by investigating specific genomic regions for rare variant effects on a trait. Although multiply correlated phenotypes are often concurrently observed in GWAS, most studies analyze only single phenotypes, which may lessen statistical power. To increase power, multivariate analyses, which consider correlations between multiple phenotypes, can be used. However, few existing multi-variant analyses can identify rare variants for assessing multiple phenotypes. Here, we propose Multivariate Association Analysis using Score Statistics (MAAUSS), to identify rare variants associated with multiple phenotypes, based on the widely used Sequence Kernel Association Test (SKAT) for a single phenotype. We applied MAAUSS to Whole Exome Sequencing (WES) data from a Korean population of 1,058 subjects, to discover genes associated with multiple traits of liver function. We then assessed validation of those genes by a replication study, using an independent dataset of 3,445 individuals. Notably, we detected the gene ZNF620 among five significant genes. We then performed a simulation study to compare MAAUSS's performance with existing methods. Overall, MAAUSS successfully conserved type 1 error rates and in many cases, had a higher power than the existing methods. This study illustrates a feasible and straightforward approach for identifying rare variants correlated with multiple phenotypes, with likely relevance to missing heritability. PMID:28039885
Genotype harmonizer: automatic strand alignment and format conversion for genotype data integration.
Deelen, Patrick; Bonder, Marc Jan; van der Velde, K Joeri; Westra, Harm-Jan; Winder, Erwin; Hendriksen, Dennis; Franke, Lude; Swertz, Morris A
2014-12-11
To gain statistical power or to allow fine mapping, researchers typically want to pool data before meta-analyses or genotype imputation. However, the necessary harmonization of genetic datasets is currently error-prone because of many different file formats and lack of clarity about which genomic strand is used as reference. Genotype Harmonizer (GH) is a command-line tool to harmonize genetic datasets by automatically solving issues concerning genomic strand and file format. GH solves the unknown strand issue by aligning ambiguous A/T and G/C SNPs to a specified reference, using linkage disequilibrium patterns without prior knowledge of the used strands. GH supports many common GWAS/NGS genotype formats including PLINK, binary PLINK, VCF, SHAPEIT2 & Oxford GEN. GH is implemented in Java and a large part of the functionality can also be used as Java 'Genotype-IO' API. All software is open source under license LGPLv3 and available from http://www.molgenis.org/systemsgenetics. GH can be used to harmonize genetic datasets across different file formats and can be easily integrated as a step in routine meta-analysis and imputation pipelines.
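The strand problem described above comes down to a simple allele check, sketched here under stated assumptions. This is not the Genotype Harmonizer API: the function names are hypothetical, and the sketch only flags A/T and G/C SNPs as ambiguous rather than resolving them from linkage disequilibrium patterns as GH is described to do.

```python
# Watson-Crick complements used when a SNP is reported on the other strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(allele1, allele2):
    """A/T and G/C SNPs look identical on both strands."""
    return COMPLEMENT[allele1] == allele2


def align_to_reference(alleles, ref_alleles):
    """Classify a study SNP against the reference allele pair.

    Returns 'ambiguous' (A/T or G/C, needs extra information such as LD),
    'same' (alleles already match), 'flip' (match after complementing),
    or 'mismatch' (alleles cannot be reconciled).
    """
    if is_strand_ambiguous(*alleles):
        return "ambiguous"
    if set(alleles) == set(ref_alleles):
        return "same"
    if {COMPLEMENT[a] for a in alleles} == set(ref_alleles):
        return "flip"
    return "mismatch"
```

For example, a T/C SNP checked against an A/G reference is classified as "flip", whereas an A/T SNP cannot be resolved from its alleles alone.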
Tissue-Specific Enrichment of Lymphoma Risk Loci in Regulatory Elements
Hayes, James E.; Trynka, Gosia; Vijai, Joseph; Offit, Kenneth; Raychaudhuri, Soumya; Klein, Robert J.
2015-01-01
Though numerous polymorphisms have been associated with risk of developing lymphoma, how these variants function to promote tumorigenesis is poorly understood. Here, we report that lymphoma risk SNPs, especially in the non-Hodgkin’s lymphoma subtype chronic lymphocytic leukemia, are significantly enriched for co-localization with epigenetic marks of active gene regulation. These enrichments were seen in a lymphoid-specific manner for numerous ENCODE datasets, including DNase-hypersensitivity as well as multiple segmentation-defined enhancer regions. Furthermore, we identify putatively functional SNPs that are both in regulatory elements in lymphocytes and are associated with gene expression changes in blood. We developed an algorithm, UES, that uses a Monte Carlo simulation approach to calculate the enrichment of previously identified risk SNPs in various functional elements. This multiscale approach integrating multiple datasets helps disentangle the underlying biology of lymphoma, and more broadly, is generally applicable to GWAS results from other diseases as well. PMID:26422229
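A Monte Carlo enrichment test of the general kind mentioned above might look like the following. This is a generic sketch, not the UES algorithm itself: the function name, the uniform sampling of null SNP sets from a background list, and the add-one empirical p-value are all assumptions, and the published method may match null SNPs on additional properties that are not modeled here.

```python
import random


def enrichment_p_value(risk_snps, background_snps, element_snps,
                       n_sim=1000, seed=0):
    """Empirical p-value for overlap between risk SNPs and an annotation.

    Null sets of the same size as risk_snps are drawn at random from
    background_snps; the p-value is the add-one fraction of null sets
    whose overlap with element_snps is at least the observed overlap.
    """
    rng = random.Random(seed)
    pool = list(background_snps)
    element = set(element_snps)
    observed = sum(1 for s in risk_snps if s in element)
    exceed = 0
    for _ in range(n_sim):
        draw = rng.sample(pool, len(risk_snps))
        if sum(1 for s in draw if s in element) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_sim + 1)
```

The add-one correction keeps the estimate away from exactly zero, so the smallest reportable p-value is 1 / (n_sim + 1).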
Ma, Li; Runesha, H Birali; Dvorkin, Daniel; Garbe, John R; Da, Yang
2008-01-01
Background Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) markers provide opportunities to detect epistatic SNPs associated with quantitative traits and to detect the exact mode of an epistasis effect. Computational difficulty is the main bottleneck for epistasis testing in large scale GWAS. Results The EPISNPmpi and EPISNP computer programs were developed for testing single-locus and epistatic SNP effects on quantitative traits in GWAS, including tests of three single-locus effects for each SNP (SNP genotypic effect, additive and dominance effects) and five epistasis effects for each pair of SNPs (two-locus interaction, additive × additive, additive × dominance, dominance × additive, and dominance × dominance) based on the extended Kempthorne model. EPISNPmpi is the parallel computing program for epistasis testing in large scale GWAS and achieved excellent scalability for large scale analysis and portability for various parallel computing platforms. EPISNP is the serial computing program based on the EPISNPmpi code for epistasis testing in small scale GWAS using commonly available operating systems and computer hardware. Three serial computing utility programs were developed for graphical viewing of test results and epistasis networks, and for estimating CPU time and disk space requirements. Conclusion The EPISNPmpi parallel computing program provides an effective computing tool for epistasis testing in large scale GWAS, and the epiSNP serial computing programs are convenient tools for epistasis analysis in small scale GWAS using commonly available computer hardware. PMID:18644146
Evaluation of European Schizophrenia GWAS Loci in Asian Populations via Comprehensive Meta-Analyses.
Xiao, Xiao; Luo, Xiong-Jian; Chang, Hong; Liu, Zichao; Li, Ming
2017-08-01
Schizophrenia is a severe and highly heritable neuropsychiatric disorder. Recent genetic analyses including genome-wide association studies (GWAS) have implicated multiple genome-wide significant variants for schizophrenia among European populations. However, many of these risk variants have largely not been validated in populations of different ancestry, such as Asians. To test whether these European GWAS significant loci are associated with schizophrenia in Asian populations, we conducted a systematic literature search and meta-analyses on 19 single nucleotide polymorphisms (SNPs) in Asian populations by combining all available case-control and family-based samples, including up to 30,000 individuals. We employed classical fixed (or random) effects inverse variance weighted methods to calculate summary odds ratios (ORs) and 95% confidence intervals (CIs). Among the 19 GWAS loci, we replicated the risk associations of nine markers (e.g., SNPs at VRK2, ITIH3/4, NDST3, NOTCH4) surpassing the significance level (two-tailed P < 0.05), and three additional SNPs in MIR137 and ZNF804A also showed trend associations (one-tailed P < 0.05). The allelic effects of these risk associations are in the same direction in the Asian replication samples as in the initial European GWAS findings, and the successful replication of these GWAS loci in a different ethnic group provides stronger evidence for their clinical associations with schizophrenia. Further studies focusing on the molecular mechanisms of these GWAS significant loci will become increasingly important for understanding the pathogenesis of schizophrenia.
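The fixed-effect inverse variance weighted pooling named above can be written out briefly. This is a minimal sketch assuming each study reports an odds ratio with a symmetric (on the log scale) 95% confidence interval; the function name `fixed_effect_meta` is hypothetical, and the random-effects variant is not shown.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% normal quantile


def fixed_effect_meta(studies):
    """Pool odds ratios with inverse-variance weights.

    studies: list of (odds_ratio, ci_lower, ci_upper) with 95% CIs.
    Returns (pooled_or, pooled_ci_lower, pooled_ci_upper).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for odds_ratio, lower, upper in studies:
        beta = math.log(odds_ratio)
        # Standard error recovered from the CI width on the log scale.
        se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * beta
        weight_total += weight
    beta_pooled = weighted_sum / weight_total
    se_pooled = math.sqrt(1.0 / weight_total)
    return (math.exp(beta_pooled),
            math.exp(beta_pooled - Z_95 * se_pooled),
            math.exp(beta_pooled + Z_95 * se_pooled))
```

Pooling two identical studies leaves the summary OR unchanged but narrows the interval, which is why combining modest samples can recover associations that no single study detects.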
FHSA-SED: Two-Locus Model Detection for Genome-Wide Association Study with Harmony Search Algorithm
Tuo, Shouheng; Zhang, Junying; Yuan, Xiguo; Zhang, Yuanyuan; Liu, Zhaowen
2016-01-01
Motivation The two-locus model is a typical and important disease model to be identified in genome-wide association studies (GWAS). Owing to the heavy computational burden and the diversity of disease models, existing methods suffer from low detection power, high computational cost, and a preference for some types of disease models. Method In this study, two scoring functions (the Bayesian network based K2-score and the Gini-score) are used to characterize a pair of SNP loci as a candidate model; the two criteria are adopted simultaneously to improve identification power and to tackle the preference for particular disease models. The harmony search algorithm (HSA) is improved to quickly find the most likely candidate models among all two-locus models, and a local search algorithm with a two-dimensional tabu table is presented to avoid repeatedly evaluating disease models that have a strong marginal effect. Finally, the G-test statistic is used to further test the candidate models. Results We investigate our method, named FHSA-SED, on 82 simulated datasets and a real AMD dataset, and compare it with two typical methods (MACOED and CSE) that were developed recently based on swarm intelligence search algorithms. The results of the simulation experiments indicate that our method outperforms the two compared algorithms in terms of detection power, computation time, evaluation times, sensitivity (TPR), specificity (SPC), positive predictive value (PPV) and accuracy (ACC). Our method has identified two SNPs (rs3775652 and rs10511467) that may also be associated with disease in the AMD dataset. PMID:27014873
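The final G-test step named in this abstract can be sketched as follows. This is an illustrative reading, not the FHSA-SED implementation: the function name `g_test_two_locus` and the 0/1/2 genotype coding are assumptions, and the harmony search, K2-score and Gini-score stages are not shown.

```python
import math
from collections import Counter


def g_test_two_locus(geno_a, geno_b, status):
    """G statistic for a two-SNP genotype combination versus case/control status.

    geno_a, geno_b: per-individual genotypes coded 0/1/2 (assumed coding).
    status: per-individual phenotype, 0 = control, 1 = case.
    Returns (G, degrees of freedom).
    """
    combos = list(zip(geno_a, geno_b))
    observed = Counter(zip(combos, status))  # cell counts of the contingency table
    combo_totals = Counter(combos)           # row totals (genotype combinations)
    status_totals = Counter(status)          # column totals (cases, controls)
    n = len(status)

    g = 0.0
    for (combo, s), obs in observed.items():  # empty cells contribute 0
        expected = combo_totals[combo] * status_totals[s] / n
        g += 2.0 * obs * math.log(obs / expected)

    # Only genotype combinations actually observed count toward the df.
    df = (len(combo_totals) - 1) * (len(status_totals) - 1)
    return g, df
```

Under the null hypothesis G is approximately chi-square distributed with the returned degrees of freedom, so a p-value can be taken from a chi-square survival function (for example `scipy.stats.chi2.sf`, if SciPy is available).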
Testing Genetic Pleiotropy with GWAS Summary Statistics for Marginal and Conditional Analyses.
Deng, Yangqing; Pan, Wei
2017-12-01
There is growing interest in testing genetic pleiotropy, which is when a single genetic variant influences multiple traits. Several methods have been proposed; however, these methods have some limitations. First, all the proposed methods are based on the use of individual-level genotype and phenotype data; in contrast, for logistical, and other, reasons, summary statistics of univariate SNP-trait associations are typically only available based on meta- or mega-analyzed large genome-wide association study (GWAS) data. Second, existing tests are based on marginal pleiotropy, which cannot distinguish between direct and indirect associations of a single genetic variant with multiple traits due to correlations among the traits. Hence, it is useful to consider conditional analysis, in which a subset of traits is adjusted for another subset of traits. For example, in spite of substantial lowering of low-density lipoprotein cholesterol (LDL) with statin therapy, some patients still maintain high residual cardiovascular risk, and, for these patients, it might be helpful to reduce their triglyceride (TG) level. For this purpose, in order to identify new therapeutic targets, it would be useful to identify genetic variants with pleiotropic effects on LDL and TG after adjusting the latter for LDL; otherwise, a pleiotropic effect of a genetic variant detected by a marginal model could simply be due to its association with LDL only, given the well-known correlation between the two types of lipids. Here, we develop a new pleiotropy testing procedure based only on GWAS summary statistics that can be applied for both marginal analysis and conditional analysis. Although the main technical development is based on published union-intersection testing methods, care is needed in specifying conditional models to avoid invalid statistical estimation and inference. 
In addition to the previously used likelihood ratio test, we also propose using generalized estimating equations under the working independence model for robust inference. We provide numerical examples based on both simulated and real data, including two large lipid GWAS summary association datasets based on ∼100,000 and ∼189,000 samples, respectively, to demonstrate the difference between marginal and conditional analyses, as well as the effectiveness of our new approach. Copyright © 2017 by the Genetics Society of America.
Applications and Limitations of Mouse Models for Understanding Human Atherosclerosis
von Scheidt, Moritz; Zhao, Yuqi; Kurt, Zeyneb; Pan, Calvin; Zeng, Lingyao; Yang, Xia; Schunkert, Heribert; Lusis, Aldons J.
2017-01-01
Most of the biological understanding of mechanisms underlying coronary artery disease (CAD) derives from studies of mouse models. The identification of multiple CAD loci and strong candidate genes in large human genome-wide association studies (GWAS) presented an opportunity to examine the relevance of mouse models for the human disease. We comprehensively reviewed the mouse literature, including 827 literature-derived genes, and compared it to human data. First, we observed striking concordance of risk factors for atherosclerosis in mice and humans. Second, there was highly significant overlap of mouse genes with human genes identified by GWAS. In particular, of the 46 genes with strong association signals in CAD-GWAS that were studied in mouse models all but one exhibited consistent effects on atherosclerosis-related phenotypes. Third, we compared 178 CAD-associated pathways derived from human GWAS with 263 from mouse studies and observed that over 50% were consistent between both species. PMID:27916529
Progress of genome wide association study in domestic animals
2012-01-01
Domestic animals are invaluable resources for study of the molecular architecture of complex traits. Although the mapping of quantitative trait loci (QTL) responsible for economically important traits in domestic animals has achieved remarkable results in recent decades, not all of the genetic variation in the complex traits has been captured because of the low density of markers used in QTL mapping studies. The genome wide association study (GWAS), which utilizes high-density single-nucleotide polymorphism (SNP), provides a new way to tackle this issue. Encouraging achievements in dissection of the genetic mechanisms of complex diseases in humans have resulted from the use of GWAS. At present, GWAS has been applied to the field of domestic animal breeding and genetics, and some advances have been made. Many genes or markers that affect economic traits of interest in domestic animals have been identified. In this review, advances in the use of GWAS in domestic animals are described. PMID:22958308
Kuo, Kevin H M
2017-01-01
The issue of multiple testing, also termed multiplicity, is ubiquitous in studies where multiple hypotheses are tested simultaneously. Genome-wide association study (GWAS), a type of genetic association study that has gained popularity in the past decade, is most susceptible to the issue of multiple testing. Different methodologies have been employed to address the issue of multiple testing in GWAS. The purpose of the review is to examine the methodologies employed in dealing with multiple testing in the context of gene discovery using GWAS in sickle cell disease complications.
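As a concrete illustration of multiplicity control, the sketch below implements two widely used corrections, the Bonferroni threshold and the Benjamini-Hochberg step-up procedure. The abstract does not say which methods the review covers, so these are stand-in examples, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold controlling the family-wise error rate."""
    return alpha / n_tests


def benjamini_hochberg(p_values, fdr=0.05):
    """Indices of hypotheses rejected by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Find the largest rank k with p_(k) <= k * fdr / m.
        if p_values[idx] <= rank * fdr / m:
            largest_rank = rank
    return sorted(order[:largest_rank])


# With one million independent tests, alpha = 0.05 gives the familiar
# genome-wide threshold of 5e-8.
genome_wide = bonferroni_threshold(0.05, 1_000_000)
```

Bonferroni is the more conservative of the two; Benjamini-Hochberg controls the expected proportion of false discoveries rather than the chance of any false positive.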
Fang, Hai; Knezevic, Bogdan; Burnham, Katie L; Knight, Julian C
2016-12-13
Biological interpretation of genomic summary data such as those resulting from genome-wide association studies (GWAS) and expression quantitative trait loci (eQTL) studies is one of the major bottlenecks in medical genomics research, calling for efficient and integrative tools to resolve this problem. We introduce eXploring Genomic Relations (XGR), an open source tool designed for enhanced interpretation of genomic summary data enabling downstream knowledge discovery. Targeting users of varying computational skills, XGR utilises prior biological knowledge and relationships in a highly integrated but easily accessible way to make user-input genomic summary datasets more interpretable. We show how by incorporating ontology, annotation, and systems biology network-driven approaches, XGR generates more informative results than conventional analyses. We apply XGR to GWAS and eQTL summary data to explore the genomic landscape of the activated innate immune response and common immunological diseases. We provide genomic evidence for a disease taxonomy supporting the concept of a disease spectrum from autoimmune to autoinflammatory disorders. We also show how XGR can define SNP-modulated gene networks and pathways that are shared and distinct between diseases, how it achieves functional, phenotypic and epigenomic annotations of genes and variants, and how it enables exploring annotation-based relationships between genetic variants. XGR provides a single integrated solution to enhance interpretation of genomic summary data for downstream biological discovery. XGR is released as both an R package and a web-app, freely available at http://galahad.well.ox.ac.uk/XGR .
Huang, Dandan; Yi, Xianfu; Zhang, Shijie; Zheng, Zhanye; Wang, Panwen; Xuan, Chenghao; Sham, Pak Chung; Wang, Junwen; Li, Mulin Jun
2018-05-16
Genome-wide association studies have identified thousands of susceptibility loci for many human complex traits, and yet for most of these associations the true causal variants remain unknown. Tissue/cell type-specific prediction and prioritization of non-coding regulatory variants will facilitate the identification of causal variants and underlying pathogenic mechanisms for particular complex diseases and traits. By leveraging recent large-scale functional genomics/epigenomics data, we develop an intuitive web server, GWAS4D (http://mulinlab.tmu.edu.cn/gwas4d or http://mulinlab.org/gwas4d), that systematically evaluates GWAS signals and identifies context-specific regulatory variants. The updated web server includes six major features: (i) updates the regulatory variant prioritization method with our new algorithm; (ii) incorporates 127 tissue/cell type-specific epigenomic datasets; (iii) integrates motifs of 1480 transcriptional regulators from 13 public resources; (iv) uniformly processes Hi-C data and generates significant interactions at 5 kb resolution across 60 tissues/cell types; (v) adds comprehensive non-coding variant functional annotations; (vi) provides a highly interactive visualization function for SNP-target interaction. Using a GWAS fine-mapped set for 161 coronary artery disease risk loci, we demonstrate that GWAS4D is able to efficiently prioritize disease-causal regulatory variants.
Frau, Francesca; Crowther, Daniel; Ruetten, Hartmut; Allebrandt, Karla V
2017-05-01
Genome-wide association studies (GWAs) for type 2 diabetes (T2D) have been successful in identifying many loci with robust association signals. Nevertheless, there is a clear need for post-GWAs strategies to understand the mechanism of action and clinical relevance of these variants. The association of several comorbidities with T2D suggests a common etiology for these phenotypes and complicates the management of the disease. In this study, we focused on the genetics underlying these relationships, using systems genomics to identify genetic variation associated with T2D and 12 other traits. GWAs summary statistics for pairwise comparisons were obtained for glycemic traits, obesity, coronary artery disease, and lipids from large consortia GWAs meta-analyses. We used a network medicine approach to leverage experimental information about the identified genes and variants with cross-trait effects for biological function interpretation. We identified a set of 38 genetic variants with cross-trait effects that point to a main network of genes that should be relevant for T2D and its comorbidities. We prioritized the T2D associated genes based on the number of traits they showed association with and the experimental evidence showing their relation to the disease etiology. In this study, we demonstrated how systems genomics and network medicine approaches can shed light on GWAs discoveries, translating findings into a more therapeutically relevant context. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Machiela, Mitchell J; Hofmann, Jonathan N; Carreras-Torres, Robert; Brown, Kevin M; Johansson, Mattias; Wang, Zhaoming; Foll, Matthieu; Li, Peng; Rothman, Nathaniel; Savage, Sharon A; Gaborieau, Valerie; McKay, James D; Ye, Yuanqing; Henrion, Marc; Bruinsma, Fiona; Jordan, Susan; Severi, Gianluca; Hveem, Kristian; Vatten, Lars J; Fletcher, Tony; Koppova, Kvetoslava; Larsson, Susanna C; Wolk, Alicja; Banks, Rosamonde E; Selby, Peter J; Easton, Douglas F; Pharoah, Paul; Andreotti, Gabriella; Freeman, Laura E Beane; Koutros, Stella; Albanes, Demetrius; Mannisto, Satu; Weinstein, Stephanie; Clark, Peter E; Edwards, Todd E; Lipworth, Loren; Gapstur, Susan M; Stevens, Victoria L; Carol, Hallie; Freedman, Matthew L; Pomerantz, Mark M; Cho, Eunyoung; Kraft, Peter; Preston, Mark A; Wilson, Kathryn M; Gaziano, J Michael; Sesso, Howard S; Black, Amanda; Freedman, Neal D; Huang, Wen-Yi; Anema, John G; Kahnoski, Richard J; Lane, Brian R; Noyes, Sabrina L; Petillo, David; Colli, Leandro M; Sampson, Joshua N; Besse, Celine; Blanche, Helene; Boland, Anne; Burdette, Laurie; Prokhortchouk, Egor; Skryabin, Konstantin G; Yeager, Meredith; Mijuskovic, Mirjana; Ognjanovic, Miodrag; Foretova, Lenka; Holcatova, Ivana; Janout, Vladimir; Mates, Dana; Mukeriya, Anush; Rascu, Stefan; Zaridze, David; Bencko, Vladimir; Cybulski, Cezary; Fabianova, Eleonora; Jinga, Viorel; Lissowska, Jolanta; Lubinski, Jan; Navratilova, Marie; Rudnai, Peter; Szeszenia-Dabrowska, Neonila; Benhamou, Simone; Cancel-Tassin, Geraldine; Cussenot, Olivier; Bueno-de-Mesquita, H Bas; Canzian, Federico; Duell, Eric J; Ljungberg, Börje; Sitaram, Raviprakash T; Peters, Ulrike; White, Emily; Anderson, Garnet L; Johnson, Lisa; Luo, Juhua; Buring, Julie; Lee, I-Min; Chow, Wong-Ho; Moore, Lee E; Wood, Christopher; Eisen, Timothy; Larkin, James; Choueiri, Toni K; Lathrop, G Mark; Teh, Bin Tean; Deleuze, Jean-Francois; Wu, Xifeng; Houlston, Richard S; Brennan, Paul; Chanock, Stephen J; Scelo, Ghislaine; Purdue, Mark P
2017-11-01
Relative telomere length in peripheral blood leukocytes has been evaluated as a potential biomarker for renal cell carcinoma (RCC) risk in several studies, with conflicting findings. We performed an analysis of genetic variants associated with leukocyte telomere length to assess the relationship between telomere length and RCC risk using Mendelian randomization, an approach unaffected by biases from temporal variability and reverse causation that might have affected earlier investigations. Genotypes from nine telomere length-associated variants for 10 784 cases and 20 406 cancer-free controls from six genome-wide association studies (GWAS) of RCC were aggregated into a weighted genetic risk score (GRS) predictive of leukocyte telomere length. Odds ratios (ORs) relating the GRS and RCC risk were computed in individual GWAS datasets and combined by meta-analysis. Longer genetically inferred telomere length was associated with an increased risk of RCC (OR=2.07 per predicted kilobase increase, 95% confidence interval [CI]=1.70-2.53, p<0.0001). As a sensitivity analysis, we excluded two telomere length variants in linkage disequilibrium (R²>0.5) with GWAS-identified RCC risk variants (rs10936599 and rs9420907) from the telomere length GRS; despite this exclusion, a statistically significant association between the GRS and RCC risk persisted (OR=1.73, 95% CI=1.36-2.21, p<0.0001). Exploratory analyses for individual histologic subtypes suggested comparable associations with the telomere length GRS for clear cell (N=5573, OR=1.93, 95% CI=1.50-2.49, p<0.0001), papillary (N=573, OR=1.96, 95% CI=1.01-3.81, p=0.046), and chromophobe RCC (N=203, OR=2.37, 95% CI=0.78-7.17, p=0.13). Our investigation adds to the growing body of evidence indicating some aspect of longer telomere length is important for RCC risk. Telomeres are segments of DNA at chromosome ends that maintain chromosomal stability. 
Our study investigated the relationship between genetic variants associated with telomere length and renal cell carcinoma risk. We found evidence suggesting individuals with inherited predisposition to longer telomere length are at increased risk of developing renal cell carcinoma. Published by Elsevier B.V.
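The weighted genetic risk score described in this abstract can be sketched as a weighted sum of allele counts. The snippet is a minimal illustration rather than the study's code: `weighted_grs` is a hypothetical name, and the weights stand for assumed per-allele effects on telomere length taken from an external source.

```python
def weighted_grs(allele_counts, weights):
    """Weighted genetic risk score for one individual.

    allele_counts: number of telomere-lengthening alleles (0, 1 or 2) at each variant.
    weights: per-allele effect on telomere length for each variant (assumed to be
             externally estimated; the units carry through to the score).
    """
    if len(allele_counts) != len(weights):
        raise ValueError("allele_counts and weights must have the same length")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


# Hypothetical three-variant example (values are illustrative, not from the study).
score = weighted_grs([0, 1, 2], [0.1, 0.2, 0.3])
```

In a Mendelian randomization analysis, a score of this kind is then related to disease status, for example by logistic regression within each dataset, and the resulting per-dataset odds ratios are combined by meta-analysis.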
GWAS-based pathway analysis differentiates between fluid and crystallized intelligence.
Christoforou, A; Espeseth, T; Davies, G; Fernandes, C P D; Giddaluru, S; Mattheisen, M; Tenesa, A; Harris, S E; Liewald, D C; Payton, A; Ollier, W; Horan, M; Pendleton, N; Haggarty, P; Djurovic, S; Herms, S; Hoffman, P; Cichon, S; Starr, J M; Lundervold, A; Reinvang, I; Steen, V M; Deary, I J; Le Hellard, S
2014-09-01
Cognitive abilities vary among people. About 40-50% of this variability is due to general intelligence (g), which reflects the positive correlation among individuals' scores on diverse cognitive ability tests. g is positively correlated with many life outcomes, such as education, occupational status and health, motivating the investigation of its underlying biology. In psychometric research, a distinction is made between general fluid intelligence (gF) - the ability to reason in novel situations - and general crystallized intelligence (gC) - the ability to apply acquired knowledge. This distinction is supported by developmental and cognitive neuroscience studies. Classical epidemiological studies and recent genome-wide association studies (GWASs) have established that these cognitive traits have a large genetic component. However, no robust genetic associations have been published thus far due largely to the known polygenic nature of these traits and insufficient sample sizes. Here, using two GWAS datasets, in which the polygenicity of gF and gC traits was previously confirmed, a gene- and pathway-based approach was undertaken with the aim of characterizing and differentiating their genetic architecture. Pathway analysis, using genes selected on the basis of relaxed criteria, revealed notable differences between these two traits. gF appeared to be characterized by genes affecting the quantity and quality of neurons and therefore neuronal efficiency, whereas long-term depression (LTD) seemed to underlie gC. Thus, this study supports the gF-gC distinction at the genetic level and identifies functional annotations and pathways worthy of further investigation. © 2014 The Authors. Genes, Brain and Behavior published by International Behavioural and Neural Genetics Society and John Wiley & Sons Ltd.
2012-01-01
Background Through the wealth of information contained within them, genome-wide association studies (GWAS) have the potential to provide researchers with a systematic means of associating genetic variants with a wide variety of disease phenotypes. Due to the limitations of approaches that have analyzed single variants one at a time, it has been proposed that the genetic basis of these disorders could be determined through detailed analysis of the genetic variants themselves and in conjunction with one another. The construction of models that account for these subsets of variants requires methodologies that generate predictions based on the total risk of a particular group of polymorphisms. However, due to the excessive number of variants, constructing these types of models has so far been computationally infeasible. Results We have implemented an algorithm, known as greedy RLS, that we use to perform the first known wrapper-based feature selection on the genome-wide level. The running time of greedy RLS grows linearly in the number of training examples, the number of features in the original data set, and the number of selected features. This speed is achieved through computational short-cuts based on matrix calculus. Since the memory consumption in present-day computers can form an even tighter bottleneck than running time, we also developed a space efficient variation of greedy RLS which trades running time for memory. These approaches are then compared to traditional wrapper-based feature selection implementations based on support vector machines (SVM) to reveal the relative speed-up and to assess the feasibility of the new algorithm. As a proof of concept, we apply greedy RLS to the Hypertension – UK National Blood Service WTCCC dataset and select the most predictive variants using 3-fold external cross-validation in less than 26 minutes on a high-end desktop. 
On this dataset, we also show that greedy RLS has a better classification performance on independent test data than a classifier trained using features selected by a statistical p-value-based filter, which is currently the most popular approach for constructing predictive models in GWAS. Conclusions Greedy RLS is the first known implementation of a machine learning based method with the capability to conduct a wrapper-based feature selection on an entire GWAS containing several thousand examples and over 400,000 variants. In our experiments, greedy RLS selected a highly predictive subset of genetic variants in a fraction of the time spent by wrapper-based selection methods used together with SVM classifiers. The proposed algorithms are freely available as part of the RLScore software library at http://users.utu.fi/aatapa/RLScore/. PMID:22551170
Genome-wide association study of alcohol dependence
Treutlein, Jens; Cichon, Sven; Ridinger, Monika; Wodarz, Norbert; Soyka, Michael; Zill, Peter; Maier, Wolfgang; Moessner, Rainald; Gaebel, Wolfgang; Dahmen, Norbert; Fehr, Christoph; Scherbaum, Norbert; Steffens, Michael; Ludwig, Kerstin U.; Frank, Josef; Wichmann, H.- Erich; Schreiber, Stefan; Dragano, Nico; Sommer, Wolfgang; Leonardi-Essmann, Fernando; Lourdusamy, Anbarasu; Gebicke-Haerter, Peter; Wienker, Thomas F.; Sullivan, Patrick F.; Nöthen, Markus M.; Kiefer, Falk; Spanagel, Rainer; Mann, Karl; Rietschel, Marcella
2014-01-01
Context Identification of genes contributing to alcohol dependence will improve our understanding of the mechanisms underlying this disorder. Objective To identify susceptibility genes for alcohol dependence through a genome-wide association study (GWAS) and follow-up study in a population of German male inpatients with an early age at onset. Design The GWAS included 487 male inpatients with DSM-IV alcohol dependence with an age at onset below 28 years and 1,358 population based control individuals. The follow-up study included 1,024 male inpatients and 996 age-matched male controls. All subjects were of German descent. The GWAS tested 524,396 single nucleotide polymorphisms (SNPs). All SNPs with p<10⁻⁴ were subjected to the follow-up study. In addition, nominally significant SNPs from those genes that had also shown expression changes in rat brains after chronic alcohol consumption were selected for the follow-up step. Results The GWAS produced 121 SNPs with nominal p<10⁻⁴. These, together with 19 additional SNPs from homologs of rat genes showing differential expression, were genotyped in the follow-up sample. Fifteen SNPs showed significant association with the same allele as in the GWAS. In the combined analysis, two closely linked intergenic SNPs met genome-wide significance (rs7590720 p=9.72×10⁻⁹; rs1344694 p=1.69×10⁻⁸). They are located on chromosome 2q35, a region which has been implicated in linkage studies for alcohol phenotypes. Nine SNPs were located in genes, including CDH13 and ADH1C genes which have been reported to be associated with alcohol dependence. Conclusion This is the first GWAS and follow-up study to identify a genome-wide significant association in alcohol dependence. Further independent studies are required to confirm these findings. PMID:19581569
Implications of genome-wide association studies in cancer therapeutics.
Patel, Jai N; McLeod, Howard L; Innocenti, Federico
2013-09-01
Genome wide association studies (GWAS) provide an agnostic approach to identifying potential genetic variants associated with disease susceptibility, prognosis of survival and/or predictive of drug response. Although these techniques are costly and interpretation of study results is challenging, they do allow for a more unbiased interrogation of the entire genome, resulting in the discovery of novel genes and understanding of novel biological associations. This review will focus on the implications of GWAS in cancer therapy, in particular germ-line mutations, including findings from major GWAS which have identified predictive genetic loci for clinical outcome and/or toxicity. Lessons and challenges in cancer GWAS are also discussed, including the need for functional analysis and replication, as well as future perspectives for biological and clinical utility. Given the large heterogeneity in response to cancer therapeutics, novel methods of identifying mechanisms and biology of variable drug response and ultimately treatment individualization will be indispensable. © 2013 The British Pharmacological Society.
Fast and accurate genotype imputation in genome-wide association studies through pre-phasing
Howie, Bryan; Fuchsberger, Christian; Stephens, Matthew; Marchini, Jonathan; Abecasis, Gonçalo R.
2013-01-01
Sequencing efforts, including the 1000 Genomes Project and disease-specific efforts, are producing large collections of haplotypes that can be used for genotype imputation in genome-wide association studies (GWAS). Imputing from these reference panels can help identify new risk alleles, but the use of large panels with existing methods imposes a high computational burden. To keep imputation broadly accessible, we introduce a strategy called “pre-phasing” that maintains the accuracy of leading methods while cutting computational costs by orders of magnitude. In brief, we first statistically estimate the haplotypes for each GWAS individual (“pre-phasing”) and then impute missing genotypes into these estimated haplotypes. This reduces the computational cost because: (i) the GWAS samples must be phased only once, whereas standard methods would implicitly re-phase with each reference panel update; (ii) it is much faster to match a phased GWAS haplotype to one reference haplotype than to match unphased GWAS genotypes to a pair of reference haplotypes. This strategy will be particularly valuable for repeated imputation as reference panels evolve. PMID:22820512
Transethnic differences in GWAS signals: A simulation study.
Zanetti, Daniela; Weale, Michael E
2018-05-07
Genome-wide association studies (GWASs) have allowed researchers to identify thousands of single nucleotide polymorphisms (SNPs) and other variants associated with particular complex traits. Previous studies have reported differences in the strength and even the direction of GWAS signals across different populations. These differences could be due to a combination of (1) lack of power, (2) allele frequency differences, (3) linkage disequilibrium (LD) differences, and (4) true differences in causal variant effect sizes. To determine whether properties (1)-(3) on their own might be sufficient to explain the patterns previously noted in strong GWAS signals, we simulated case-control data of European, Asian and African ancestry, applying realistic allele frequencies and LD from 1000 Genomes data but enforcing equal causal effect sizes across populations. Much of the observed differences in strong GWAS signals could indeed be accounted for by allele frequency and LD differences, enhanced by the Euro-centric SNP bias and lower SNP coverage found in older GWAS panels. While we cannot rule out a role for true transethnic effect size differences, our results suggest that strong causal effects may be largely shared among human populations, motivating the use of transethnic data for fine-mapping. © 2018 John Wiley & Sons Ltd/University College London.
Hamza, Taye H.; Chen, Honglei; Hill-Burns, Erin M.; Rhodes, Shannon L.; Montimurro, Jennifer; Kay, Denise M.; Tenesa, Albert; Kusel, Victoria I.; Sheehan, Patricia; Eaaswarkhanth, Muthukrishnan; Yearout, Dora; Samii, Ali; Roberts, John W.; Agarwal, Pinky; Bordelon, Yvette; Park, Yikyung; Wang, Liyong; Gao, Jianjun; Vance, Jeffery M.; Kendler, Kenneth S.; Bacanu, Silviu-Alin; Scott, William K.; Ritz, Beate; Nutt, John; Factor, Stewart A.; Zabetian, Cyrus P.; Payami, Haydeh
2011-01-01
Our aim was to identify genes that influence the inverse association of coffee with the risk of developing Parkinson's disease (PD). We used genome-wide genotype data and lifetime caffeinated-coffee-consumption data on 1,458 persons with PD and 931 without PD from the NeuroGenetics Research Consortium (NGRC), and we performed a genome-wide association and interaction study (GWAIS), testing each SNP's main-effect plus its interaction with coffee, adjusting for sex, age, and two principal components. We then stratified subjects as heavy or light coffee-drinkers and performed genome-wide association study (GWAS) in each group. We replicated the most significant SNP. Finally, we imputed the NGRC dataset, increasing genomic coverage to examine the region of interest in detail. The primary analyses (GWAIS, GWAS, Replication) were performed using genotyped data. In GWAIS, the most significant signal came from rs4998386 and the neighboring SNPs in GRIN2A. GRIN2A encodes an NMDA-glutamate-receptor subunit and regulates excitatory neurotransmission in the brain. Achieving P_2df = 10⁻⁶, GRIN2A surpassed all known PD susceptibility genes in significance in the GWAIS. In stratified GWAS, the GRIN2A signal was present in heavy coffee-drinkers (OR = 0.43; P = 6×10⁻⁷) but not in light coffee-drinkers. The a priori Replication hypothesis that “Among heavy coffee-drinkers, rs4998386_T carriers have lower PD risk than rs4998386_CC carriers” was confirmed: OR_Replication = 0.59, P_Replication = 10⁻³; OR_Pooled = 0.51, P_Pooled = 7×10⁻⁸. Compared to light coffee-drinkers with rs4998386_CC genotype, heavy coffee-drinkers with rs4998386_CC genotype had 18% lower risk (P = 3×10⁻³), whereas heavy coffee-drinkers with rs4998386_TC genotype had 59% lower risk (P = 6×10⁻¹³). Imputation revealed a block of SNPs that achieved P_2df<5×10⁻⁸ in GWAIS, and OR = 0.41, P = 3×10⁻⁸ in heavy coffee-drinkers. 
This study is proof of concept that inclusion of environmental factors can help identify genes that are missed in GWAS. Both adenosine antagonists (caffeine-like) and glutamate antagonists (GRIN2A-related) are being tested in clinical trials for treatment of PD. GRIN2A may be a useful pharmacogenetic marker for subdividing individuals in clinical trials to determine which medications might work best for which patients. PMID:21876681
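The joint test behind the P2df values above can be illustrated with a short sketch. It assumes, for illustration only, that the 2-df test is a likelihood-ratio test comparing a covariate-only logistic model with one that adds the SNP term and the SNP × coffee term; the abstract does not state which test statistic was used, and the function names and log-likelihood inputs here are hypothetical.

```python
import math


def joint_2df_p_value(loglik_null: float, loglik_full: float) -> float:
    """P-value of a 2-df likelihood-ratio test (assumed test form).

    loglik_null: log-likelihood of the model with covariates only.
    loglik_full: log-likelihood after adding the SNP main effect and the
                 SNP x coffee interaction (two extra parameters).
    """
    lr_stat = max(2.0 * (loglik_full - loglik_null), 0.0)
    # With 2 degrees of freedom the chi-square survival function
    # has the closed form exp(-x / 2).
    return math.exp(-lr_stat / 2.0)


def statistic_for_p(p: float) -> float:
    """Chi-square (2 df) statistic that corresponds to a given p-value."""
    return -2.0 * math.log(p)
```

Under these assumptions, `statistic_for_p(5e-8)` gives the size of the 2-df statistic a SNP needs to reach the conventional genome-wide threshold.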
Zhu, Qianqian; Shepherd, Lori; Lunetta, Kathryn L.; Yao, Song; Liu, Qian; Hu, Qiang; Haddad, Stephen A.; Sucheston-Campbell, Lara; Bensen, Jeannette T.; Bandera, Elisa V.; Rosenberg, Lynn; Liu, Song; Haiman, Christopher A.; Olshan, Andrew F.; Palmer, Julie R.; Ambrosone, Christine B.
2016-01-01
Leveraging population-distinct linkage disequilibrium (LD) patterns, trans-ethnic follow-up of variants discovered from genome-wide association studies (GWAS) has proved to be useful in facilitating the identification of bona fide causal variants. We previously developed the preferential LD approach, a novel method that successfully identified causal variants driving the GWAS signals within European-descent populations even when the causal variants were only weakly linked with the GWAS-discovered variants. To evaluate the performance of our approach in a trans-ethnic setting, we applied it to follow up breast cancer GWAS hits identified mostly from populations of European ancestry in African Americans (AA). We evaluated 74 breast cancer GWAS variants in 8,315 AA women from the African American Breast Cancer Epidemiology and Risk (AMBER) consortium. Only 27% of them were associated with breast cancer risk at significance level α=0.05, suggesting race-specificity of the identified breast cancer risk loci. We followed up on those replicated GWAS hits in the AMBER consortium utilizing the preferential LD approach, to search for causal variants or better breast cancer markers from the 1000 Genomes variant catalog. Our approach identified stronger breast cancer markers for 80% of the GWAS hits with at least nominal breast cancer association, and in 81% of these cases, the marker identified was among the top 10 of all 1000 Genomes variants in the corresponding locus. The results support trans-ethnic application of the preferential LD approach in search for candidate causal variants, and may have implications for future genetic research of breast cancer in AA women. PMID:27825120
e-GRASP: an integrated evolutionary and GRASP resource for exploring disease associations.
Karim, Sajjad; NourEldin, Hend Fakhri; Abusamra, Heba; Salem, Nada; Alhathli, Elham; Dudley, Joel; Sanderford, Max; Scheinfeldt, Laura B; Chaudhary, Adeel G; Al-Qahtani, Mohammed H; Kumar, Sudhir
2016-10-17
Genome-wide association studies (GWAS) have become a mainstay of biological research concerned with discovering genetic variation linked to phenotypic traits and diseases. Both discrete and continuous traits can be analyzed in GWAS to discover associations between single nucleotide polymorphisms (SNPs) and traits of interest. Associations are typically determined by estimating the significance of the statistical relationship between genetic loci and the given trait. However, the prioritization of bona fide, reproducible genetic associations from GWAS results remains a central challenge in identifying genomic loci underlying common complex diseases. Evolutionary-aware meta-analysis of the growing GWAS literature is one way to address this challenge and to advance from association to causation in the discovery of genotype-phenotype relationships. We have created an evolutionary GWAS resource to enable in-depth query and exploration of published GWAS results. This resource uses the publicly available GWAS results annotated in the GRASP2 database. The GRASP2 database includes results from 2082 studies, 177 broad phenotype categories, and ~8.87 million SNP-phenotype associations. For each SNP in e-GRASP, we present information from the GRASP2 database for convenience as well as evolutionary information (e.g., rate and timespan). Users can, therefore, identify not only SNPs with highly significant phenotype-association P-values, but also SNPs that are highly replicated and/or occur at evolutionarily conserved sites that are likely to be functionally important. Additionally, we provide an evolutionary-adjusted SNP association ranking (E-rank) that uses cross-species evolutionary conservation scores and population allele frequencies to transform P-values in an effort to enhance the discovery of SNPs with a greater probability of biologically meaningful disease associations.
By adding an evolutionary dimension to the GWAS results available in the GRASP2 database, our e-GRASP resource will enable a more effective exploration of SNPs not only by the statistical significance of trait associations, but also by the number of studies in which associations have been replicated, and the evolutionary context of the associated mutations. Therefore, e-GRASP will be a valuable resource for aiding researchers in the identification of bona fide, reproducible genetic associations from GWAS results. This resource is freely available at http://www.mypeg.info/egrasp .
Hill, W D; Marioni, R E; Maghzian, O; Ritchie, S J; Hagenaars, S P; McIntosh, A M; Gale, C R; Davies, G; Deary, I J
2018-01-11
Intelligence, or general cognitive function, is phenotypically and genetically correlated with many traits, including a wide range of physical and mental health variables. Education is strongly genetically correlated with intelligence (r_g = 0.70). We used these findings as foundations for our use of a novel approach, multi-trait analysis of genome-wide association studies (MTAG; Turley et al. 2017), to combine two large genome-wide association studies (GWASs) of education and intelligence, increasing statistical power and resulting in the largest GWAS of intelligence yet reported. Our study had four goals: first, to facilitate the discovery of new genetic loci associated with intelligence; second, to add to our understanding of the biology of intelligence differences; third, to examine whether combining genetically correlated traits in this way produces results consistent with the primary phenotype of intelligence; and, finally, to test how well this new meta-analytic data sample on intelligence predicts phenotypic intelligence in an independent sample. By combining datasets using MTAG, our functional sample size increased from 199,242 participants to 248,482. We found 187 independent loci associated with intelligence, implicating 538 genes, using both SNP-based and gene-based GWAS. We found evidence that neurogenesis and myelination, as well as genes expressed in the synapse and those involved in the regulation of the nervous system, may explain some of the biological differences in intelligence. The results of our combined analysis demonstrated the same pattern of genetic correlations as those from previous GWASs of intelligence, providing support for the meta-analysis of these genetically-related phenotypes.
Nagaie, Satoshi; Ogishima, Soichi; Nakaya, Jun; Tanaka, Hiroshi
2015-01-01
Genome-wide association studies (GWAS) and linkage analysis have identified many single nucleotide polymorphisms (SNPs) related to disease. There are many unknown SNPs with minor allele frequencies (MAFs) as low as 0.005 that have intermediate effects, with odds ratios between 1.5 and 3.0. Low frequency variants having intermediate effects on disease pathogenesis are believed to have complex interactions with environmental factors, called gene-environment interactions (GxE). Hence, we describe a model using a 3D Manhattan plot, called the GxE landscape plot, to visualize association p-values for gene-environment interactions (GxE). We used the Gene-Environment iNteraction Simulator 2 (GENS2) program to simulate interactions between two genetic loci and one environmental factor in this exercise. The dataset used for training contains disease status, gender, 20 environmental exposures and 100 genotypes for 170 subjects, and p-values were calculated by the Cochran-Mantel-Haenszel chi-squared test on known data. Subsequently, we created a 3D GxE landscape plot of the negative logarithm of the association p-values for all the possible combinations of genetic and environmental factors with their hierarchical clustering. Thus, the GxE landscape plot is a valuable model to predict association p-values for GxE and similarity among genotypes and environments in the context of disease pathogenesis. GxE - Gene-environment interactions, GWAS - Genome-wide association study, MAFs - Minor allele frequencies, SNPs - Single nucleotide polymorphisms, EWAS - Environment-wide association study, FDR - False discovery rate, JPT+CHB - HapMap population of Japanese in Tokyo, Japan - Han Chinese in Beijing.
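The quantity plotted on the vertical axis of such a landscape, the negative logarithm of a Cochran-Mantel-Haenszel p-value, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes 2×2 tables (genotype carrier vs. non-carrier by case vs. control) stratified by a hypothetical environmental exposure level, uses no continuity correction, and all function names are made up for the example.

```python
import math


def cmh_statistic(strata):
    """Cochran-Mantel-Haenszel chi-square (1 df, no continuity correction).

    strata: iterable of 2x2 tables given as (a, b, c, d), where
            a = exposed cases, b = exposed controls,
            c = unexposed cases, d = unexposed controls.
    """
    observed = expected = variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        observed += a
        expected += (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    if variance == 0:
        raise ValueError("no informative strata")
    return (observed - expected) ** 2 / variance


def chi2_1df_p_value(stat):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(stat / 2.0))


def neg_log10_p(stat):
    """Height of one point in a Manhattan-style plot."""
    return -math.log10(chi2_1df_p_value(stat))
```

Evaluating `neg_log10_p(cmh_statistic(tables))` for every genotype and exposure pair would give the grid of heights that a 3D plot of this kind displays.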
Workalemahu, Tsegaselassie; Enquobahrie, Daniel A; Gelaye, Bizu; Sanchez, Sixto E; Garcia, Pedro J; Tekola-Ayele, Fasil; Hajat, Anjum; Thornton, Timothy A; Ananth, Cande V; Williams, Michelle A
2018-06-01
Accumulating epidemiological evidence points to strong genetic susceptibility to placental abruption (PA). However, characterization of genes associated with PA remains incomplete. We conducted a genome-wide association study (GWAS) of PA and a meta-analysis of GWAS. Participants of the Placental Abruption Genetic Epidemiology (PAGE) study, a population based case-control study of PA conducted in Lima, Peru, were genotyped using the Illumina HumanCore-24 BeadChip platform. Genotypes were imputed using the 1000 genomes reference panel, and >4.9 million SNPs that passed quality control were analyzed. We performed a GWAS in PAGE participants (507 PA cases and 1090 controls) and a GWAS meta-analysis in 2512 participants (959 PA cases and 1553 controls) that included PAGE and the previously reported Peruvian Abruptio Placentae Epidemiology (PAPE) study. We fitted population stratification-adjusted logistic regression models and fixed-effects meta-analyses using inverse-variance weighting. Independent loci (linkage-disequilibrium<0.80) suggestively associated with PA (P-value<5e-5) included rs4148646 and rs2074311 in ABCC8, rs7249210, rs7250184, rs7249100 and rs10401828 in ZNF28, rs11133659 in CTNND2, and rs2074314 and rs35271178 near KCNJ11 in the PAGE GWAS. Similarly, independent loci suggestively associated with PA in the GWAS meta-analysis included rs76258369 near IRX1, and rs7094759 and rs12264492 in ADAM12. Functional analyses of these genes showed trophoblast-like cell interaction, as well as networks involved in endocrine system disorders, cardiovascular diseases, and cellular function. We identified several genetic loci and related functions that may play a role in PA risk. Understanding genetic factors underlying pathophysiological mechanisms of PA may facilitate prevention and early diagnostic efforts. Published by Elsevier Ltd.
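The fixed-effects, inverse-variance-weighted meta-analysis named above combines per-study effect estimates as in the following sketch. The inputs are hypothetical per-study log odds ratios and standard errors for one SNP; the function name is illustrative, and the two-sided p-value uses a normal approximation.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance-weighted fixed-effects estimate for one SNP.

    betas:           per-study log odds ratios.
    standard_errors: per-study standard errors of those estimates.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p
```

Studies with smaller standard errors receive proportionally more weight, which is why the pooled standard error is always smaller than that of any single contributing study.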
Cowper-Sal lari, Richard; Cole, Michael D; Karagas, Margaret R; Lupien, Mathieu; Moore, Jason H
2011-01-01
The conceptual foundation of the genome-wide association study (GWAS) has advanced unchecked since its conception. A revision might seem premature as the potential of GWAS has not been fully realized. Multiple technical and practical limitations need to be overcome before GWAS can be fairly criticized. But with the completion of hundreds of studies and a deeper understanding of the genetic architecture of disease, warnings are being raised. The results compiled to date indicate that risk-associated variants lie predominantly in noncoding regions of the genome. Additionally, alternative methodologies are uncovering large and heterogeneous sets of rare variants underlying disease. The fear is that, even in its fulfillment, the current GWAS paradigm might be incapable of dissecting all kinds of phenotypes. In the following text, we review several initiatives that aim to overcome these limitations. The overarching theme of these studies is the inclusion of biological knowledge to both the analysis and interpretation of genotyping data. GWAS is uninformed of biology by design and although there is some virtue in its simplicity, it is also its most conspicuous deficiency. We propose a framework in which to integrate these novel approaches, both empirical and theoretical, in the form of a genome-wide regulatory network (GWRN). By processing experimental data into networks, emerging data types based on chromatin immunoprecipitation are made computationally tractable. This will give GWAS re-analysis efforts the most current and relevant substrates, and root them firmly on our knowledge of human disease. Copyright © 2010 John Wiley & Sons, Inc.
Bossini-Castillo, Lara; Martin, Jose-Ezequiel; Broen, Jasper; Gorlova, Olga; Simeón, Carmen P.; Beretta, Lorenzo; Vonk, Madelon C.; Luis Callejas, Jose; Castellví, Ivan; Carreira, Patricia; José García-Hernández, Francisco; Fernández Castro, Mónica; Coenen, Marieke J.H.; Riemekasten, Gabriela; Witte, Torsten; Hunzelmann, Nicolas; Kreuter, Alexander; Distler, Jörg H.W.; Koeleman, Bobby P.; Voskuyl, Alexandre E.; Schuerwegh, Annemie J.; Palm, Øyvind; Hesselstrand, Roger; Nordin, Annika; Airó, Paolo; Lunardi, Claudio; Scorza, Raffaella; Shiels, Paul; van Laar, Jacob M.; Herrick, Ariane; Worthington, Jane; Denton, Christopher; Tan, Filemon K.; Arnett, Frank C.; Agarwal, Sandeep K.; Assassi, Shervin; Fonseca, Carmen; Mayes, Maureen D.; Radstake, Timothy R.D.J.; Martin, Javier
2012-01-01
A single-nucleotide polymorphism (SNP) at the IL12RB2 locus showed a suggestive association signal in a previously published genome-wide association study (GWAS) in systemic sclerosis (SSc). Aiming to reveal the possible implication of the IL12RB2 gene in SSc, we conducted a follow-up study of this locus in different Caucasian cohorts. We analyzed 10 GWAS-genotyped SNPs in the IL12RB2 region (2309 SSc patients and 5161 controls). We then selected three SNPs (rs3790567, rs3790566 and rs924080) based on their significance level in the GWAS, for follow-up in an independent European cohort comprising 3344 SSc and 3848 controls. The most-associated SNP (rs3790567) was further tested in an independent cohort comprising 597 SSc patients and 1139 controls from the USA. After conditional logistic regression analysis of the GWAS data, we selected rs3790567 [PMH= 1.92 × 10−5 odds ratio (OR) = 1.19] as the genetic variant with the firmest independent association observed in the analyzed GWAS peak of association. After the first follow-up phase, only the association of rs3790567 was consistent (PMH= 4.84 × 10−3 OR = 1.12). The second follow-up phase confirmed this finding (Pχ2 = 2.82 × 10−4 OR = 1.34). After performing overall pooled-analysis of all the cohorts included in the present study, the association found for the rs3790567 SNP in the IL12RB2 gene region reached GWAS-level significant association (PMH= 2.82 × 10−9 OR = 1.17). Our data clearly support the IL12RB2 genetic association with SSc, and suggest a relevant role of the interleukin 12 signaling pathway in SSc pathogenesis. PMID:22076442
Wu, Mengmeng; Zeng, Wanwen; Liu, Wenqiang; Lv, Hairong; Chen, Ting; Jiang, Rui
2018-06-03
Genome-wide association studies (GWAS) have successfully discovered a number of disease-associated genetic variants in the past decade, providing an unprecedented opportunity for deciphering the genetic basis of human inherited diseases. However, it is still a challenging task to extract biological knowledge from the GWAS data, due to such issues as missing heritability and weak interpretability. Indeed, the fact that the majority of discovered loci fall into noncoding regions without clear links to genes has been preventing the characterization of their functions and calls for a sophisticated approach to bridge genetic and genomic studies. Towards this problem, network-based prioritization of candidate genes, which performs integrated analysis of gene networks with GWAS data, has emerged as a promising direction and attracted much attention. However, most existing methods overlook the sparse and noisy properties of gene networks and thus may lead to suboptimal performance. Motivated by this understanding, we proposed a novel method called REGENT for integrating multiple gene networks with GWAS data to prioritize candidate genes for complex diseases. We leveraged a technique called network representation learning to embed a gene network into a compact and robust feature space, and then designed a hierarchical statistical model to integrate features of multiple gene networks with GWAS data for the effective inference of genes associated with a disease of interest. We applied our method to six complex diseases and demonstrated the superior performance of REGENT over existing approaches in recovering known disease-associated genes. We further conducted a pathway analysis and showed the ability of REGENT to discover disease-associated pathways. We expect to see applications of our method to a broad spectrum of diseases for post-GWAS analysis. REGENT is freely available at https://github.com/wmmthu/REGENT. Copyright © 2018 Elsevier Inc. All rights reserved.
Genome-wide association studies in maize: praise and stargaze
USDA-ARS's Scientific Manuscript database
Genome-wide association study (GWAS) has appeared as a widespread strategy in decoding genotype-phenotype associations in many species thanks to technical advances in next-generation sequencing (NGS) applications. Maize is an ideal crop for GWAS and significant progress has been made in the last dec...
Genetics of Sputum Gene Expression in Chronic Obstructive Pulmonary Disease
Qiu, Weiliang; Cho, Michael H.; Riley, John H.; Anderson, Wayne H.; Singh, Dave; Bakke, Per; Gulsvik, Amund; Litonjua, Augusto A.; Lomas, David A.; Crapo, James D.; Beaty, Terri H.; Celli, Bartolome R.; Rennard, Stephen; Tal-Singer, Ruth; Fox, Steven M.; Silverman, Edwin K.; Hersh, Craig P.
2011-01-01
Previous expression quantitative trait loci (eQTL) studies have performed genetic association studies for gene expression, but most of these studies examined lymphoblastoid cell lines from non-diseased individuals. We examined the genetics of gene expression in a relevant disease tissue from chronic obstructive pulmonary disease (COPD) patients to identify functional effects of known susceptibility genes and to find novel disease genes. By combining gene expression profiling on induced sputum samples from 131 COPD cases from the ECLIPSE Study with genomewide single nucleotide polymorphism (SNP) data, we found 4315 significant cis-eQTL SNP-probe set associations (3309 unique SNPs). The 3309 SNPs were tested for association with COPD in a genomewide association study (GWAS) dataset, which included 2940 COPD cases and 1380 controls. Adjusting for 3309 tests (p<1.5e-5), the two SNPs which were significantly associated with COPD were located in two separate genes in a known COPD locus on chromosome 15: CHRNA5 and IREB2. Detailed analysis of chromosome 15 demonstrated additional eQTLs for IREB2 mapping to that gene. eQTL SNPs for CHRNA5 mapped to multiple linkage disequilibrium (LD) bins. The eQTLs for IREB2 and CHRNA5 were not in LD. Seventy-four additional eQTL SNPs were associated with COPD at p<0.01. These were genotyped in two COPD populations, finding replicated associations with a SNP in PSORS1C1, in the HLA-C region on chromosome 6. Integrative analysis of GWAS and gene expression data from relevant tissue from diseased subjects has located potential functional variants in two known COPD genes and has identified a novel COPD susceptibility locus. PMID:21949713
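The threshold quoted above (p<1.5e-5 for 3309 tests) is consistent with a Bonferroni correction at a family-wise α of 0.05; the α value is inferred here rather than stated in the abstract. A minimal sketch, with hypothetical function names and SNP identifiers:

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def significant_snps(p_values: dict, alpha: float = 0.05) -> list:
    """Return SNP ids whose p-value passes the corrected threshold.

    p_values: mapping of SNP id -> association p-value (hypothetical input).
    """
    threshold = bonferroni_threshold(alpha, len(p_values))
    return sorted(snp for snp, p in p_values.items() if p < threshold)
```

With `alpha=0.05` and `n_tests=3309`, the threshold is approximately 1.51e-5, matching the rounded value in the text.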
Schizophrenia interactome with 504 novel protein–protein interactions
Ganapathiraju, Madhavi K; Thahir, Mohamed; Handen, Adam; Sarkar, Saumendra N; Sweet, Robert A; Nimgaonkar, Vishwajit L; Loscher, Christine E; Bauer, Eileen M; Chaparala, Srilakshmi
2016-01-01
Genome-wide association studies of schizophrenia (GWAS) have revealed the role of rare and common genetic variants, but the functional effects of the risk variants remain to be understood. Protein interactome-based studies can facilitate the study of molecular mechanisms by which the risk genes relate to schizophrenia (SZ) genesis, but protein–protein interactions (PPIs) are unknown for many of the liability genes. We developed a computational model to discover PPIs, which is found to be highly accurate according to computational evaluations and experimental validations of selected PPIs. We present here, 365 novel PPIs of liability genes identified by the SZ Working Group of the Psychiatric Genomics Consortium (PGC). Seventeen genes that had no previously known interactions have 57 novel interactions by our method. Among the new interactors are 19 drug targets that are targeted by 130 drugs. In addition, we computed 147 novel PPIs of 25 candidate genes investigated in the pre-GWAS era. While there is little overlap between the GWAS genes and the pre-GWAS genes, the interactomes reveal that they largely belong to the same pathways, thus reconciling the apparent disparities between the GWAS and prior gene association studies. The interactome including 504 novel PPIs overall, could motivate other systems biology studies and trials with repurposed drugs. The PPIs are made available on a webserver, called Schizo-Pi at http://severus.dbmi.pitt.edu/schizo-pi with advanced search capabilities. PMID:27336055
Wei, Wen-Hua; Bowes, John; Plant, Darren; Viatte, Sebastien; Yarwood, Annie; Massey, Jonathan; Worthington, Jane; Eyre, Stephen
2016-04-25
Genotypic variability based genome-wide association studies (vGWASs) can identify potentially interacting loci without prior knowledge of the interacting factors. We report a two-stage approach to make vGWAS applicable to diseases: firstly using a mixed model approach to partition dichotomous phenotypes into additive risk and non-additive environmental residuals on the liability scale and secondly using the Levene's (Brown-Forsythe) test to assess equality of the residual variances across genotype groups per marker. We found widespread significant (P < 2.5e-05) vGWAS signals within the major histocompatibility complex (MHC) across all three study cohorts of rheumatoid arthritis. We further identified 10 epistatic interactions between the vGWAS signals independent of the MHC additive effects, each with a weak effect but jointly explained 1.9% of phenotypic variance. PTPN22 was also identified in the discovery cohort but replicated in only one independent cohort. Combining the three cohorts boosted power of vGWAS and additionally identified TYK2 and ANKRD55. Both PTPN22 and TYK2 had evidence of interactions reported elsewhere. We conclude that vGWAS can help discover interacting loci for complex diseases but require large samples to find additional signals.
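The second stage described above, a Levene's (Brown-Forsythe) test of residual variance across genotype groups, rests on the statistic sketched below. This is an illustration only: the residuals would come from the first-stage mixed model, which is not reproduced here, and the function name is hypothetical. Converting W to a p-value requires the F distribution with (k − 1, N − k) degrees of freedom, which the standard library does not provide; `scipy.stats.levene` with `center='median'` computes the same test.

```python
import statistics


def brown_forsythe_w(groups):
    """Brown-Forsythe W statistic (Levene's test centred on the median).

    groups: list of lists, one list of residuals per genotype group
            (for example genotypes AA, AB, BB at a single marker).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviations of each value from its own group median.
    deviations = [[abs(y - statistics.median(g)) for y in g] for g in groups]
    group_means = [sum(z) / len(z) for z in deviations]
    grand_mean = sum(sum(z) for z in deviations) / n_total
    between = sum(len(z) * (m - grand_mean) ** 2
                  for z, m in zip(deviations, group_means))
    within = sum((v - m) ** 2
                 for z, m in zip(deviations, group_means) for v in z)
    if within == 0:
        raise ValueError("no within-group variation in the deviations")
    return (n_total - k) / (k - 1) * between / within
```

A large W at a marker indicates that the spread of residuals differs between genotype groups, which is the signal a variance-based scan looks for.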
Anonymization of electronic medical records for validating genome-wide association studies
Loukides, Grigorios; Gkoulalas-Divanis, Aris; Malin, Bradley
2010-01-01
Genome-wide association studies (GWAS) facilitate the discovery of genotype–phenotype relations from population-based sequence databases, which is an integral facet of personalized medicine. The increasing adoption of electronic medical records allows large amounts of patients’ standardized clinical features to be combined with the genomic sequences of these patients and shared to support validation of GWAS findings and to enable novel discoveries. However, disseminating these data “as is” may lead to patient reidentification when genomic sequences are linked to resources that contain the corresponding patients’ identity information based on standardized clinical features. This work proposes an approach that provably prevents this type of data linkage and furnishes a result that helps support GWAS. Our approach automatically extracts potentially linkable clinical features and modifies them in a way that they can no longer be used to link a genomic sequence to a small number of patients, while preserving the associations between genomic sequences and specific sets of clinical features corresponding to GWAS-related diseases. Extensive experiments with real patient data derived from the Vanderbilt University Medical Center verify that our approach generates data that eliminate the threat of individual reidentification, while supporting GWAS validation and clinical case analysis tasks. PMID:20385806
A Conceptual Framework for Pharmacodynamic Genome-wide Association Studies in Pharmacogenomics
Wu, Rongling; Tong, Chunfa; Wang, Zhong; Mauger, David; Tantisira, Kelan; Szefler, Stanley J.; Chinchilli, Vernon M.; Israel, Elliot
2013-01-01
Genome-wide association studies (GWAS) have emerged as a powerful tool to identify loci that affect drug response or susceptibility to adverse drug reactions. However, current GWAS based on a simple analysis of associations between genotype and phenotype ignore the biochemical reactions of drug response, thus limiting the scope of inference about its genetic architecture. To facilitate the inference of GWAS in pharmacogenomics, we sought to undertake the mathematical integration of the pharmacodynamic process of drug reactions through computational models. By estimating and testing the genetic control of pharmacodynamic and pharmacokinetic parameters, this mechanistic approach not only enhances the biological and clinical relevance of significant genetic associations, but also improves the statistical power and robustness of gene detection. This report discusses the general principle and development of pharmacodynamics-based GWAS, highlights the practical use of this approach in addressing various pharmacogenomic problems, and suggests that this approach will be an important method to study the genetic architecture of drug responses or reactions. PMID:21920452
Genetic determinants of leucocyte telomere length in children: a neglected and challenging field.
Stathopoulou, Maria G; Petrelis, Alexandros M; Buxton, Jessica L; Froguel, Philippe; Blakemore, Alexandra I F; Visvikis-Siest, Sophie
2015-03-01
Telomere length is associated with a large range of human diseases. Genome-wide association studies (GWAS) have identified genetic variants that are associated with leucocyte telomere length (LTL). However, these studies are limited to adult populations. Nevertheless, childhood is a crucial period for the determination of LTL, and the assessment of age-specific genetic determinants, although neglected, could be of great importance. Our aim was to provide insights and preliminary results on genetic determinants of LTL in children. Healthy children (n = 322, age range = 6.75-17 years) with available GWAS data (Illumina Human CNV370-Duo array) were included. The LTL was measured using multiplex quantitative real-time polymerase chain reaction. Linear regression models adjusted for age, gender, parental age at child's birth, and body mass index were used to test the associations of LTL with polymorphisms identified in adult GWAS and to perform a discovery-only GWAS. The previously GWAS-identified variants in adults were not associated with LTL in our paediatric sample. This lack of association was not due to possible interactions with age or gene × gene interactions. Furthermore, a discovery-only GWAS approach demonstrated six novel variants that reached the level of suggestive association (P ≤ 5 × 10−5) and explain a high percentage of children's LTL. The study of genetic determinants of LTL in children may identify novel variants not previously identified in adults. Studies in large-scale children populations are needed for the confirmation of these results, possibly through a childhood consortium that could better handle the methodological challenges of the LTL genetic epidemiology field. © 2015 John Wiley & Sons Ltd.
Sharma, Amitabh; Gulbahce, Natali; Pevzner, Samuel J.; Menche, Jörg; Ladenvall, Claes; Folkersen, Lasse; Eriksson, Per; Orho-Melander, Marju; Barabási, Albert-László
2013-01-01
Genome wide association studies (GWAS) identify susceptibility loci for complex traits, but do not identify particular genes of interest. Integration of functional and network information may help in overcoming this limitation and identifying new susceptibility loci. Using GWAS and comorbidity data, we present a network-based approach to predict candidate genes for lipid and lipoprotein traits. We apply a prediction pipeline incorporating interactome, co-expression, and comorbidity data to Global Lipids Genetics Consortium (GLGC) GWAS for four traits of interest, identifying phenotypically coherent modules. These modules provide insights regarding gene involvement in complex phenotypes with multiple susceptibility alleles and low effect sizes. To experimentally test our predictions, we selected four candidate genes and genotyped representative SNPs in the Malmö Diet and Cancer Cardiovascular Cohort. We found significant associations with LDL-C and total-cholesterol levels for a synonymous SNP (rs234706) in the cystathionine beta-synthase (CBS) gene (p = 1 × 10−5 and adjusted-p = 0.013, respectively). Further, liver samples taken from 206 patients revealed that patients with the minor allele of rs234706 had significant dysregulation of CBS (p = 0.04). Despite the known biological role of CBS in lipid metabolism, SNPs within the locus have not yet been identified in GWAS of lipoprotein traits. Thus, the GWAS-based Comorbidity Module (GCM) approach identifies candidate genes missed by GWAS studies, serving as a broadly applicable tool for the investigation of other complex disease phenotypes. PMID:23882023
Integration of mouse and human genome-wide association data identifies KCNIP4 as an asthma gene.
Himes, Blanca E; Sheppard, Keith; Berndt, Annerose; Leme, Adriana S; Myers, Rachel A; Gignoux, Christopher R; Levin, Albert M; Gauderman, W James; Yang, James J; Mathias, Rasika A; Romieu, Isabelle; Torgerson, Dara G; Roth, Lindsey A; Huntsman, Scott; Eng, Celeste; Klanderman, Barbara; Ziniti, John; Senter-Sylvia, Jody; Szefler, Stanley J; Lemanske, Robert F; Zeiger, Robert S; Strunk, Robert C; Martinez, Fernando D; Boushey, Homer; Chinchilli, Vernon M; Israel, Elliot; Mauger, David; Koppelman, Gerard H; Postma, Dirkje S; Nieuwenhuis, Maartje A E; Vonk, Judith M; Lima, John J; Irvin, Charles G; Peters, Stephen P; Kubo, Michiaki; Tamari, Mayumi; Nakamura, Yusuke; Litonjua, Augusto A; Tantisira, Kelan G; Raby, Benjamin A; Bleecker, Eugene R; Meyers, Deborah A; London, Stephanie J; Barnes, Kathleen C; Gilliland, Frank D; Williams, L Keoki; Burchard, Esteban G; Nicolae, Dan L; Ober, Carole; DeMeo, Dawn L; Silverman, Edwin K; Paigen, Beverly; Churchill, Gary; Shapiro, Steve D; Weiss, Scott T
2013-01-01
Asthma is a common chronic respiratory disease characterized by airway hyperresponsiveness (AHR). The genetics of asthma have been widely studied in mouse and human, and homologous genomic regions have been associated with mouse AHR and human asthma-related phenotypes. Our goal was to identify asthma-related genes by integrating AHR associations in mouse with human genome-wide association study (GWAS) data. We used Efficient Mixed Model Association (EMMA) analysis to conduct a GWAS of baseline AHR measures from males and females of 31 mouse strains. Genes near or containing SNPs with EMMA p-values <0.001 were selected for further study in human GWAS. The results of the previously reported EVE consortium asthma GWAS meta-analysis consisting of 12,958 diverse North American subjects from 9 study centers were used to select a subset of homologous genes with evidence of association with asthma in humans. Following validation attempts in three human asthma GWAS (i.e., Sepracor/LOCCS/LODO/Illumina, GABRIEL, DAG) and two human AHR GWAS (i.e., SHARP, DAG), the Kv channel interacting protein 4 (KCNIP4) gene was identified as nominally associated with both asthma and AHR at a gene- and SNP-level. In EVE, the smallest KCNIP4 association was at rs6833065 (P-value 2.9e-04), while the strongest associations for Sepracor/LOCCS/LODO/Illumina, GABRIEL, DAG were 1.5e-03, 1.0e-03, 3.1e-03 at rs7664617, rs4697177, rs4696975, respectively. At a SNP level, the strongest association across all asthma GWAS was at rs4697177 (P-value 1.1e-04). The smallest P-values for association with AHR were 2.3e-03 at rs11947661 in SHARP and 2.1e-03 at rs402802 in DAG. Functional studies are required to validate the potential involvement of KCNIP4 in modulating asthma susceptibility and/or AHR. Our results suggest that a useful approach to identify genes associated with human asthma is to leverage mouse AHR association data.
The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog)
MacArthur, Jacqueline; Bowler, Emily; Cerezo, Maria; Gil, Laurent; Hall, Peggy; Hastings, Emma; Junkins, Heather; McMahon, Aoife; Milano, Annalisa; Morales, Joannella; Pendlington, Zoe May; Welter, Danielle; Burdett, Tony; Hindorff, Lucia; Flicek, Paul; Cunningham, Fiona; Parkinson, Helen
2017-01-01
The NHGRI-EBI GWAS Catalog has provided data from published genome-wide association studies since 2008. In 2015, the database was redesigned and relocated to EMBL-EBI. The new infrastructure includes a new graphical user interface (www.ebi.ac.uk/gwas/), ontology supported search functionality and an improved curation interface. These developments have improved the data release frequency by increasing automation of curation and providing scaling improvements. The range of available Catalog data has also been extended with structured ancestry and recruitment information added for all studies. The infrastructure improvements also support scaling for larger arrays, exome and sequencing studies, allowing the Catalog to adapt to the needs of evolving study design, genotyping technologies and user needs in the future. PMID:27899670
Genome-wide association studies and epigenome-wide association studies go together in cancer control
Verma, Mukesh
2016-01-01
Completion of the human genome a decade ago laid the foundation for: using genetic information in assessing risk to identify individuals and populations that are likely to develop cancer, and designing treatments based on a person's genetic profiling (precision medicine). Genome-wide association studies (GWAS) completed during the past few years have identified risk-associated single nucleotide polymorphisms that can be used as screening tools in epidemiologic studies of a variety of tumor types. This led to the conduct of epigenome-wide association studies (EWAS). This article discusses the current status, challenges and research opportunities in GWAS and EWAS. Information gained from GWAS and EWAS has potential applications in cancer control and treatment. PMID:27079684
A genome-wide gene–environment interaction analysis for tobacco smoke and lung cancer susceptibility
Zhang, Ruyang; Chu, Minjie; Zhao, Yang; Wu, Chen; Guo, Huan; Shi, Yongyong; Dai, Juncheng; Wei, Yongyue; Jin, Guangfu; Ma, Hongxia; Dong, Jing; Yi, Honggang; Bai, Jianling; Gong, Jianhang; Sun, Chongqi; Zhu, Meng; Wu, Tangchun; Hu, Zhibin; Lin, Dongxin; Shen, Hongbing; Chen, Feng
2014-01-01
Tobacco smoke is the major environmental risk factor underlying lung carcinogenesis. However, only approximately one-tenth of smokers develop lung cancer in their lifetime, indicating significant individual variation in susceptibility to lung cancer, and the reasons for this are largely unknown. In particular, the genetic variants discovered in genome-wide association studies (GWAS) account for only a small fraction of the phenotypic variation in lung cancer, and gene–environment interactions are thought to explain the missing fraction of disease heritability. The ability to identify smokers at high risk of developing cancer has substantial preventive implications. Thus, we undertook a gene–smoking interaction analysis in a GWAS of lung cancer in a Han Chinese population using a two-phase case–control design. In the discovery phase, we evaluated all pair-wise (591 370) gene–smoking interactions in 5408 subjects (2331 cases and 3077 controls) using a logistic regression model with covariate adjustment. In the replication phase, promising interactions were validated in an independent population of 3023 subjects (1534 cases and 1489 controls). We identified interactions between two single nucleotide polymorphisms and smoking. The interaction P values are 6.73 × 10−6 and 3.84 × 10−6 for rs1316298 and rs4589502, respectively, in the combined dataset from the two phases. An antagonistic interaction (rs1316298–smoking) and a synergistic interaction (rs4589502–smoking) were observed. The two interactions identified in our study may help explain some of the missing heritability in lung cancer susceptibility and present strong evidence for further study of these gene–smoking interactions, which may benefit intensive screening and smoking cessation interventions. PMID:24658283
Hill, W David
2018-04-01
Intelligence and educational attainment are strongly genetically correlated. This relationship can be exploited by Multi-Trait Analysis of GWAS (MTAG) to add power to genome-wide association studies (GWAS) of intelligence. MTAG allows the user to meta-analyze GWASs of different phenotypes, based on their genetic correlations, to identify associations specific to the trait of choice. An MTAG analysis using GWAS data sets on intelligence and education was conducted by Lam et al. (2017), who reported 70 loci that they described as 'trait specific' to intelligence. This article examines whether the analysis conducted by Lam et al. (2017) has resulted in genetic information about a phenotype that is more similar to education than intelligence.
Rapid scoring of genes in microbial pan-genome-wide association studies with Scoary.
Brynildsrud, Ola; Bohlin, Jon; Scheffer, Lonneke; Eldholm, Vegard
2016-11-25
Genome-wide association studies (GWAS) have become indispensable in human medicine and genomics, but very few have been carried out on bacteria. Here we introduce Scoary, an ultra-fast, easy-to-use, and widely applicable software tool that scores the components of the pan-genome for associations to observed phenotypic traits while accounting for population stratification, with minimal assumptions about evolutionary processes. We call our approach pan-GWAS to distinguish it from traditional, single nucleotide polymorphism (SNP)-based GWAS. Scoary is implemented in Python and is available under an open source GPLv3 license at https://github.com/AdmiralenOla/Scoary .
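The abstract above describes scoring each pan-genome component for association with a phenotype. As a rough illustration only, the sketch below runs a two-sided Fisher's exact test on a gene presence/absence versus trait 2x2 table. This is an assumed, naive stand-in: it does not account for population stratification, which the abstract names as part of Scoary's approach, and the function name and counts are hypothetical rather than taken from the Scoary code base.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: isolates with the gene and the trait
    b: isolates with the gene, without the trait
    c: isolates without the gene, with the trait
    d: isolates with neither
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # holding all row and column totals fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical toy counts: gene present in 8 of 9 resistant isolates
# and in 2 of 7 susceptible isolates.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

In a pan-genome screen this test would be repeated for every gene, so the resulting P-values would still need a multiple-testing correction.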
Inflammation in Alzheimer's Disease and Molecular Genetics: Recent Update.
Zhang, Zhi-Gang; Li, Yan; Ng, Cheung Toa; Song, You-Qiang
2015-10-01
Alzheimer's disease (AD) is a complex age-related neurodegenerative disorder of the central nervous system. Since the first description of AD in 1907, many hypotheses have been proposed to explain its causes. The inflammation theory is one of them. Pathological and biochemical studies of brains from AD individuals have provided solid evidence of the activation of inflammatory pathways. Furthermore, people on long-term anti-inflammatory medication have shown a reduced risk of developing the disease. After three decades of genetic study in AD, dozens of loci harboring genetic variants influencing inflammatory pathways in AD patients have been identified through genome-wide association studies (GWAS). The best-known GWAS risk factor linked to immune response and inflammation in AD development is arguably the APOE ε4 allele. However, a growing number of other inflammation-related AD candidate genes have recently been discovered through GWAS. In the present study, we review inflammation in AD and immunity-associated GWAS risk genes such as HLA-DRB5/DRB1, INPP5D, MEF2C, CR1, CLU and TREM2.
Chen, Guo-Bo; Lee, Sang Hong; Brion, Marie-Jo A; Montgomery, Grant W; Wray, Naomi R; Radford-Smith, Graham L; Visscher, Peter M
2014-09-01
As custom arrays are cheaper than generic GWAS arrays, larger sample size is achievable for gene discovery. Custom arrays can tag more variants through denser genotyping of SNPs at associated loci, but at the cost of losing genome-wide coverage. Balancing this trade-off is important for optimizing experimental designs. We quantified both the gain in captured SNP-heritability at known candidate regions and the loss due to imperfect genome-wide coverage for inflammatory bowel disease using immunochip (iChip) and imputed GWAS data on 61,251 and 38,550 samples, respectively. For Crohn's disease (CD), the iChip and GWAS data explained 19 and 26% of variation in liability, respectively, and SNPs in the densely genotyped iChip regions explained 13% of the SNP-heritability for both the iChip and GWAS data. For ulcerative colitis (UC), the iChip and GWAS data explained 15 and 19% of variation in liability, respectively, and the dense iChip regions explained 10 and 9% of the SNP-heritability in the iChip and the GWAS data. From bivariate analyses, estimates of the genetic correlation in risk between CD and UC were 0.75 (SE 0.017) and 0.62 (SE 0.042) for the iChip and GWAS data, respectively. We also quantified the SNP-heritability of genomic regions that did or did not contain the previous 163 GWAS hits for CD and UC, and SNP-heritability of the overlapping loci between the densely genotyped iChip regions and the 163 GWAS hits. For both diseases, over different genomic partitioning, the densely genotyped regions on the iChip tagged at least as much variation in liability as the corresponding regions in the GWAS data; however, a certain amount of tagged SNP-heritability in the GWAS data was lost using the iChip due to the low coverage at unselected regions. These results imply that custom arrays with a GWAS backbone will facilitate more gene discovery, both at associated and novel loci. © The Author 2014. Published by Oxford University Press. All rights reserved.
Genome-wide association study of acute post-surgical pain in humans
Kim, Hyungsuk; Ramsay, Edward; Lee, Hyewon; Wahl, Sharon; Dionne, Raymond A
2009-01-01
Aims: Testing a relatively small genomic region with a few hundred SNPs provides limited information. Genome-wide association studies (GWAS) provide an opportunity to overcome the limitation of candidate gene association studies. Here, we report the results of a GWAS for the responses to an NSAID analgesic. Materials & methods: European Americans (60 females and 52 males) undergoing oral surgery were genotyped with the Affymetrix 500K SNP assay. Additional SNP genotyping was performed in the gene in linkage disequilibrium with the candidate SNP revealed by the GWAS. Results: GWAS revealed a candidate SNP (rs2562456) associated with analgesic onset, which is in linkage disequilibrium with a gene encoding a zinc finger protein. Additional SNP genotyping of ZNF429 confirmed the association with analgesic onset in humans (p = 1.8 × 10−10, degrees of freedom = 103, F = 28.3). We also found candidate loci for the maximum post-operative pain rating (rs17122021, p = 6.9 × 10−7) and post-operative pain onset time (rs6693882, p = 2.1 × 10−6); however, these associations did not survive correction for multiple comparisons. Conclusion: GWAS for acute clinical pain followed by additional SNP genotyping of a neighboring gene suggests that genetic variations in or near the loci encoding DNA binding proteins play a role in the individual variations in responses to analgesic drugs. PMID:19207018
Genome-wide association studies in pharmacogenetics research debate
Bailey, Kent R; Cheng, Cheng
2016-01-01
Will genome-wide association studies (GWAS) ‘work’ for pharmacogenetics research? This question was the topic of a staged debate, with pro and con sides, aimed to bring out the strengths and weaknesses of GWAS for pharmacogenetics studies. After a full day of seminars at the Fifth Statistical Analysis Workshop of the Pharmacogenetics Research Network, the lively debate was held – appropriately – at Goonies Comedy Club in Rochester (MN, USA). The pro side emphasized that the many GWAS successes for identifying genetic variants associated with disease risk show that it works; that the current genotyping platforms are efficient, with good imputation methods to fill in missing data; that its global assessment is always a success even if no significant associations are detected; and that genetic effects are likely to be large because humans have not evolved in a drug-therapy environment. By contrast, the con side emphasized that we have limited knowledge of the complexity of the genome; limited clinical phenotypes compromise studies; the likely multifactorial nature of drug response clouding the small genetic effects; and limitations of sample size and replication studies in pharmacogenetic studies. Lively and insightful discussions emphasized further research efforts that might benefit GWAS in pharmacogenetics. PMID:20235786
Whitton, Laura; Cosgrove, Donna; Clarkson, Christopher; Harold, Denise; Kendall, Kimberley; Richards, Alex; Mantripragada, Kiran; Owen, Michael J; O'Donovan, Michael C; Walters, James; Hartmann, Annette; Konte, Betina; Rujescu, Dan; Gill, Michael; Corvin, Aiden; Rea, Stephen; Donohoe, Gary; Morris, Derek W
2016-12-01
Epigenetic mechanisms are an important heritable and dynamic means of regulating various genomic functions, including gene expression, to orchestrate brain development, adult neurogenesis, and synaptic plasticity. These processes when perturbed are thought to contribute to schizophrenia pathophysiology. A core feature of schizophrenia is cognitive dysfunction. For genetic disorders where cognitive impairment is more severe such as intellectual disability, there are a disproportionally high number of genes involved in the epigenetic regulation of gene transcription. Evidence now supports some shared genetic aetiology between schizophrenia and intellectual disability. GWAS have identified 108 chromosomal regions associated with schizophrenia risk that span 350 genes. This study identified genes mapping to those loci that have epigenetic functions, and tested the risk alleles defining those loci for association with cognitive deficits. We developed a list of 350 genes with epigenetic functions and cross-referenced this with the GWAS loci. This identified eight candidate genes: BCL11B, CHD7, EP300, EPC2, GATAD2A, KDM3B, RERE, SATB2. Using a dataset of Irish psychosis cases and controls (n = 1235), the schizophrenia risk SNPs at these loci were tested for effects on IQ, working memory, episodic memory, and attention. Strongest associations were for rs6984242 with both measures of IQ (P = 0.001) and episodic memory (P = 0.007). We link rs6984242 to CHD7 via a long range eQTL. These associations were not replicated in independent samples. Our study highlights that a number of genes mapping to risk loci for schizophrenia may function as epigenetic regulators of gene expression but further studies are required to establish a role for these genes in cognition. © 2016 Wiley Periodicals, Inc.
Sánchez-Mora, Cristina; Ramos-Quiroga, Josep A; Bosch, Rosa; Corrales, Montse; Garcia-Martínez, Iris; Nogueira, Mariana; Pagerols, Mireia; Palomar, Gloria; Richarte, Vanesa; Vidal, Raquel; Arias-Vasquez, Alejandro; Bustamante, Mariona; Forns, Joan; Gross-Lesch, Silke; Guxens, Monica; Hinney, Anke; Hoogman, Martine; Jacob, Christian; Jacobsen, Kaya K; Kan, Cornelis C; Kiemeney, Lambertus; Kittel-Schneider, Sarah; Klein, Marieke; Onnink, Marten; Rivero, Olga; Zayats, Tetyana; Buitelaar, Jan; Faraone, Stephen V; Franke, Barbara; Haavik, Jan; Johansson, Stefan; Lesch, Klaus-Peter; Reif, Andreas; Sunyer, Jordi; Bayés, Mònica; Casas, Miguel; Cormand, Bru; Ribasés, Marta
2015-01-01
Attention-deficit hyperactivity disorder (ADHD) is a neurodevelopmental disorder with high heritability. At least 30% of patients diagnosed in childhood continue to suffer from ADHD during adulthood and genetic risk factors may play an essential role in the persistence of the disorder throughout lifespan. To date, genome-wide association studies (GWAS) of ADHD have been completed in seven independent datasets, six of which were pediatric samples and one on persistent ADHD using a DNA-pooling strategy, but none of them reported genome-wide significant associations. In an attempt to unravel novel genes for the persistence of ADHD into adulthood, we conducted the first two-stage GWAS in adults with ADHD. The discovery sample included 607 ADHD cases and 584 controls. Top signals were subsequently tested for replication in three independent follow-up samples of 2104 ADHD patients and 1901 controls. None of the findings exceeded the genome-wide threshold for significance (PGC<5e−08), but we found evidence for the involvement of the FBXO33 (F-box only protein 33) gene in combined ADHD in the discovery sample (P=9.02e−07) and in the joint analysis of both stages (P=9.7e−03). Additional evidence for a FBXO33 role in ADHD was found through gene-wise and pathway enrichment analyses in our genomic study. Risk alleles were associated with lower FBXO33 expression in lymphoblastoid cell lines and with reduced frontal gray matter volume in a sample of 1300 adult subjects. Our findings point for the first time at the ubiquitination machinery as a new disease mechanism for adult ADHD and establish a rationale for searching for additional risk variants in ubiquitination-related genes. PMID:25284319
Antoni, G; Morange, P-E; Luo, Y; Saut, N; Burgos, G; Heath, S; Germain, M; Biron-Andreani, C; Schved, J-F; Pernod, G; Galan, P; Zelenika, D; Alessi, M-C; Drouet, L; Visvikis-Siest, S; Wells, P S; Lathrop, M; Emmerich, J; Tregouet, D-A; Gagnon, F
2010-12-01
Factor VIII (FVIII) and von Willebrand factor (VWF) are two known quantitative risk factors for venous thromboembolism (VTE). We aimed to identify new loci that could contribute to VTE susceptibility and to modulating FVIII and/or VWF levels. A pedigree linkage analysis was first performed in five extended French-Canadian families, including 253 individuals, to identify genomic regions linked to FVIII or VWF levels. Identified regions were further explored using 'in silico' genome-wide association studies (GWAS) data on VTE (419 patients and 1228 controls), and two independent case-control studies (MARTHA and FARIVE) for VTE, gathering 1166 early-onset patients and 1408 healthy individuals. Single nucleotide polymorphisms (SNPs) associated with VTE risk were further investigated in relation to plasma levels of FVIII and VWF in a cohort of 108 healthy nuclear families. Four main linkage regions were identified, among them the well-characterized ABO locus, the recently identified STAB2 gene, and a third one, on chromosome 6q13-14, harbouring four non-redundant SNPs associated with VTE at P < 10−4 in the GWAS dataset. The association of one of these SNPs, rs9363864, with VTE was further replicated in the MARTHA and FARIVE studies. The rs9363864-AA genotype was associated with a lower risk for VTE (OR = 0.58 [0.42-0.80], P = 0.0005) but mainly in non-carriers of the FV Leiden mutation. This genotype was further found to be associated with the lowest levels of FVIII (P = 0.006) and VWF (P = 0.001). The BAI3 locus, to which rs9363864 maps, is a new candidate for VTE risk. © 2010 International Society on Thrombosis and Haemostasis.
Espin-Garcia, Osvaldo; Craiu, Radu V; Bull, Shelley B
2018-02-01
We evaluate two-phase designs to follow-up findings from genome-wide association study (GWAS) when the cost of regional sequencing in the entire cohort is prohibitive. We develop novel expectation-maximization-based inference under a semiparametric maximum likelihood formulation tailored for post-GWAS inference. A GWAS-SNP (where SNP is single nucleotide polymorphism) serves as a surrogate covariate in inferring association between a sequence variant and a normally distributed quantitative trait (QT). We assess test validity and quantify efficiency and power of joint QT-SNP-dependent sampling and analysis under alternative sample allocations by simulations. Joint allocation balanced on SNP genotype and extreme-QT strata yields significant power improvements compared to marginal QT- or SNP-based allocations. We illustrate the proposed method and evaluate the sensitivity of sample allocation to sampling variation using data from a sequencing study of systolic blood pressure. © 2017 The Authors. Genetic Epidemiology Published by Wiley Periodicals, Inc.
The MR-Base platform supports systematic causal inference across the human phenome
Wade, Kaitlin H; Haberland, Valeriia; Baird, Denis; Laurin, Charles; Burgess, Stephen; Bowden, Jack; Langdon, Ryan; Tan, Vanessa Y; Yarmolinsky, James; Shihab, Hashem A; Timpson, Nicholas J; Evans, David M; Relton, Caroline; Martin, Richard M; Davey Smith, George
2018-01-01
Results from genome-wide association studies (GWAS) can be used to infer causal relationships between phenotypes, using a strategy known as 2-sample Mendelian randomization (2SMR) and bypassing the need for individual-level data. However, 2SMR methods are evolving rapidly and GWAS results are often insufficiently curated, undermining efficient implementation of the approach. We therefore developed MR-Base (http://www.mrbase.org): a platform that integrates a curated database of complete GWAS results (no restrictions according to statistical significance) with an application programming interface, web app and R packages that automate 2SMR. The software includes several sensitivity analyses for assessing the impact of horizontal pleiotropy and other violations of assumptions. The database currently comprises 11 billion single nucleotide polymorphism-trait associations from 1673 GWAS and is updated on a regular basis. Integrating data with software ensures more rigorous application of hypothesis-driven analyses and allows millions of potential causal relationships to be efficiently evaluated in phenome-wide association studies. PMID:29846171
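To make the idea of 2-sample Mendelian randomization from summary statistics concrete, the sketch below computes a per-SNP Wald ratio and a fixed-effect inverse-variance weighted (IVW) estimate. It is a minimal illustration under stated assumptions: the function names and the toy effect sizes are hypothetical, the code does not use or reproduce the MR-Base API or R packages, and the instruments are assumed to be independent and valid.

```python
from math import sqrt


def wald_ratio(beta_exposure, beta_outcome):
    """Causal estimate from a single SNP: SNP-outcome effect / SNP-exposure effect."""
    return beta_outcome / beta_exposure


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate across SNPs, with first-order weights.

    Equivalent to a weighted regression of SNP-outcome effects on
    SNP-exposure effects through the origin, weighted by 1 / se_outcome**2.
    """
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    estimate = numerator / denominator
    standard_error = sqrt(1.0 / denominator)
    return estimate, standard_error


# Hypothetical summary statistics for three independent SNPs.
bx = [0.10, 0.20, 0.30]   # SNP-exposure effects (exposure GWAS)
by = [0.05, 0.10, 0.15]   # SNP-outcome effects (outcome GWAS)
se_y = [0.01, 0.02, 0.01]  # standard errors of the SNP-outcome effects

estimate, standard_error = ivw_estimate(bx, by, se_y)
```

Because only per-SNP effect sizes and standard errors enter the calculation, no individual-level data are needed, which is the property the abstract highlights. Sensitivity analyses for horizontal pleiotropy are not shown here.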
Dolan, Siobhan M; Christiaens, Inge
2013-01-01
Preterm birth has the highest mortality and morbidity of all pregnancy complications. The burden of preterm birth on public health worldwide is enormous, yet there are few effective means to prevent a preterm delivery. To date, much of its etiology is unexplained, but genetic predisposition is thought to play a major role. In the upcoming year, the international Preterm Birth Genome Project (PGP) consortium plans to publish a large genome wide association study in early preterm birth. Genome-wide association studies (GWAS) are designed to identify common genetic variants that influence health and disease. Despite the many challenges that are involved, GWAS can be an important discovery tool, revealing genetic variations that are associated with preterm birth. It is highly unlikely that findings of a GWAS can be directly translated into clinical practice in the short run. Nonetheless, it will help us to better understand the etiology of preterm birth and the GWAS results will generate new hypotheses for further research, thus enhancing our understanding of preterm birth and informing prevention efforts in the long run. PMID:23445776
Mullin, Benjamin H; Zhao, Jing Hua; Brown, Suzanne J; Perry, John R B; Luan, Jian'an; Zheng, Hou-Feng; Langenberg, Claudia; Dudbridge, Frank; Scott, Robert; Wareham, Nick J; Spector, Tim D; Richards, J Brent; Walsh, John P; Wilson, Scott G
2017-07-15
Osteoporosis is a common and debilitating bone disease that is characterised by low bone mineral density, typically assessed using dual-energy X-ray absorptiometry. Quantitative ultrasound (QUS), commonly utilising the two parameters velocity of sound (VOS) and broadband ultrasound attenuation (BUA), is an alternative technology used to assess bone properties at peripheral skeletal sites. The genetic influence on the bone qualities assessed by QUS remains an under-studied area. We performed a comprehensive genome-wide association study (GWAS) including low-frequency variants (minor allele frequency ≥0.005) for BUA and VOS using a discovery population of individuals with whole-genome sequence (WGS) data from the UK10K project (n = 1268). These results were then meta-analysed with those from two deeply imputed GWAS replication cohorts (n = 1610 and 13 749). In the gender-combined analysis, we identified eight loci associated with BUA and five with VOS at the genome-wide significance level, including three novel loci for BUA at 8p23.1 (PPP1R3B), 11q23.1 (LOC387810) and 22q11.21 (SEPT5) (P = 2.4 × 10−8 to 1.6 × 10−9). Gene-based association testing in the gender-combined dataset revealed eight loci associated with BUA and seven with VOS after correction for multiple testing, with one novel locus for BUA at FAM167A (8p23.1) (P = 1.4 × 10−6). An additional novel locus for BUA was seen in the male-specific analysis at DEFB103B (8p23.1) (P = 1.8 × 10−6). Fracture analysis revealed significant associations between variation at the WNT16 and RSPO3 loci and fracture risk (P = 0.004 and 4.0 × 10−4, respectively). In conclusion, by performing a large GWAS meta-analysis for QUS parameters of bone using a combination of WGS and deeply imputed genotype data, we have identified five novel genetic loci associated with BUA. © The Author 2017. Published by Oxford University Press. All rights reserved.
Traylor, Matthew; Farrall, Martin; Holliday, Elizabeth G; Sudlow, Cathie; Hopewell, Jemma C; Cheng, Yu-Ching; Fornage, Myriam; Ikram, M Arfan; Malik, Rainer; Bevan, Steve; Thorsteinsdottir, Unnur; Nalls, Mike A; Longstreth, WT; Wiggins, Kerri L; Yadav, Sunaina; Parati, Eugenio A; DeStefano, Anita L; Worrall, Bradford B; Kittner, Steven J; Khan, Muhammad Saleem; Reiner, Alex P; Helgadottir, Anna; Achterberg, Sefanja; Fernandez-Cadenas, Israel; Abboud, Sherine; Schmidt, Reinhold; Walters, Matthew; Chen, Wei-Min; Ringelstein, E Bernd; O'Donnell, Martin; Ho, Weang Kee; Pera, Joanna; Lemmens, Robin; Norrving, Bo; Higgins, Peter; Benn, Marianne; Sale, Michele; Kuhlenbäumer, Gregor; Doney, Alexander S F; Vicente, Astrid M; Delavaran, Hossein; Algra, Ale; Davies, Gail; Oliveira, Sofia A; Palmer, Colin N A; Deary, Ian; Schmidt, Helena; Pandolfo, Massimo; Montaner, Joan; Carty, Cara; de Bakker, Paul I W; Kostulas, Konstantinos; Ferro, Jose M; van Zuydam, Natalie R; Valdimarsson, Einar; Nordestgaard, Børge G; Lindgren, Arne; Thijs, Vincent; Slowik, Agnieszka; Saleheen, Danish; Paré, Guillaume; Berger, Klaus; Thorleifsson, Gudmar; Hofman, Albert; Mosley, Thomas H; Mitchell, Braxton D; Furie, Karen; Clarke, Robert; Levi, Christopher; Seshadri, Sudha; Gschwendtner, Andreas; Boncoraglio, Giorgio B; Sharma, Pankaj; Bis, Joshua C; Gretarsdottir, Solveig; Psaty, Bruce M; Rothwell, Peter M; Rosand, Jonathan; Meschia, James F; Stefansson, Kari; Dichgans, Martin; Markus, Hugh S
2012-01-01
Summary Background Various genome-wide association studies (GWAS) have been done in ischaemic stroke, identifying a few loci associated with the disease, but sample sizes have been 3500 cases or less. We established the METASTROKE collaboration with the aim of validating associations from previous GWAS and identifying novel genetic associations through meta-analysis of GWAS datasets for ischaemic stroke and its subtypes. Methods We meta-analysed data from 15 ischaemic stroke cohorts with a total of 12 389 individuals with ischaemic stroke and 62 004 controls, all of European ancestry. For the associations reaching genome-wide significance in METASTROKE, we did a further analysis, conditioning on the lead single nucleotide polymorphism in every associated region. Replication of novel suggestive signals was done in 13 347 cases and 29 083 controls. Findings We verified previous associations for cardioembolic stroke near PITX2 (p=2·8×10−16) and ZFHX3 (p=2·28×10−8), and for large-vessel stroke at a 9p21 locus (p=3·32×10−5) and HDAC9 (p=2·03×10−12). Additionally, we verified that all associations were subtype specific. Conditional analysis in the three regions for which the associations reached genome-wide significance (PITX2, ZFHX3, and HDAC9) indicated that all the signal in each region could be attributed to one risk haplotype. We also identified 12 potentially novel loci at p<5×10−6. However, we were unable to replicate any of these novel associations in the replication cohort. Interpretation Our results show that, although genetic variants can be detected in patients with ischaemic stroke when compared with controls, all associations we were able to confirm are specific to a stroke subtype. This finding has two implications. First, to maximise success of genetic studies in ischaemic stroke, detailed stroke subtyping is required. Second, different genetic pathophysiological mechanisms seem to be associated with different stroke subtypes. 
Funding Wellcome Trust, UK Medical Research Council (MRC), Australian National and Medical Health Research Council, National Institutes of Health (NIH) including National Heart, Lung and Blood Institute (NHLBI), the National Institute on Aging (NIA), the National Human Genome Research Institute (NHGRI), and the National Institute of Neurological Disorders and Stroke (NINDS). PMID:23041239
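The METASTROKE abstract above pools association results from 15 cohorts but does not state the pooling model. As an illustration only, the sketch below shows a fixed-effect inverse-variance meta-analysis of per-cohort log odds ratios for one SNP, which is a common choice for GWAS meta-analysis. The choice of model is an assumption, and the function name and the example effect sizes are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Pool per-cohort effect estimates (e.g. log odds ratios) for one SNP.

    Each cohort is weighted by the inverse of its variance, so larger,
    more precise cohorts contribute more to the pooled estimate.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided P-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical log odds ratios and standard errors from three cohorts.
pooled_beta, pooled_se, z, p = fixed_effect_meta(
    betas=[0.18, 0.25, 0.21],
    ses=[0.10, 0.08, 0.12],
)
```

A SNP would be declared genome-wide significant only if the pooled P-value falls below the chosen threshold; the conditional and subtype-specific analyses described in the abstract are separate steps not sketched here.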
Power considerations for λ inflation factor in meta-analyses of genome-wide association studies.
Georgiopoulos, Georgios; Evangelou, Evangelos
2016-05-19
The genomic control (GC) approach is extensively used to effectively control false positive signals due to population stratification in genome-wide association studies (GWAS). However, GC affects the statistical power of GWAS. The loss of power depends on the magnitude of the inflation factor (λ) that is used for GC. We simulated meta-analyses of different GWAS. Minor allele frequency (MAF) ranged from 0·001 to 0·5 and λ was sampled from two scenarios: (i) random scenario (empirically-derived distribution of real λ values) and (ii) selected scenario from simulation parameter modification. Adjustment for λ was considered under single correction (within study corrected standard errors) and double correction (additional λ corrected summary estimate). MAF was a pivotal determinant of observed power. In random λ scenario, double correction induced a symmetric power reduction in comparison to single correction. For MAF 1·2 and MAF >5%. Our results provide a quick but detailed index for power considerations of future meta-analyses of GWAS that enables a more flexible design from early steps based on the number of studies accumulated in different groups and the λ values observed in the single studies.
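The genomic control adjustment discussed above can be sketched in a few lines. The example estimates λ as the median of the observed 1-df chi-square statistics divided by the median of the null chi-square distribution, then inflates standard errors by the square root of λ. The helper names are hypothetical, and the convention of leaving results untouched when λ ≤ 1 is an assumption that is common in practice but not stated in the abstract.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Genomic inflation factor: observed median chi-square / expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def gc_correct_se(se, lam):
    """Inflate a standard error by sqrt(lambda); no correction if lambda <= 1."""
    return se * max(lam, 1.0) ** 0.5


def single_correction(study_ses, study_lambdas):
    """Single correction: adjust each study's SE with that study's own lambda."""
    return [gc_correct_se(se, lam) for se, lam in zip(study_ses, study_lambdas)]


# Hypothetical per-study standard errors for one SNP and per-study lambdas.
corrected = single_correction([0.10, 0.12], [1.21, 0.95])
# Double correction would additionally compute lambda from the meta-analysis
# chi-square statistics and apply gc_correct_se to the pooled standard error.
```

Because each correction step widens the standard error, the test statistic shrinks, which is the source of the power loss that the abstract quantifies.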
Cis-eQTL-based trans-ethnic meta-analysis reveals novel genes associated with breast cancer risk
Tai, Caroline G.; Passarelli, Michael N.; Hu, Donglei; Huntsman, Scott; Zaitlen, Noah; Ziv, Elad; Witte, John S.
2017-01-01
Breast cancer is the most common solid organ malignancy and the most frequent cause of cancer death among women worldwide. Previous research has yielded insights into its genetic etiology, but there remains a gap in the understanding of genetic factors that contribute to risk, and particularly in the biological mechanisms by which genetic variation modulates risk. The National Cancer Institute’s “Up for a Challenge” (U4C) competition provided an opportunity to further elucidate the genetic basis of the disease. Our group leveraged the seven datasets made available by the U4C organizers and data from the publicly available UK Biobank cohort to examine associations between imputed gene expression and breast cancer risk. In particular, we used reference datasets describing the breast tissue and whole blood transcriptomes to impute expression levels in breast cancer cases and controls. In trans-ethnic meta-analyses of U4C and UK Biobank data, we found significant associations between breast cancer risk and the expression of RCCD1 (joint p-value: 3.6 × 10(-6)) and DHODH (p-value: 7.1 × 10(-6)) in breast tissue, as well as a suggestive association for ANKLE1 (p-value: 9.3 × 10(-5)). Expression of RCCD1 in whole blood was also suggestively associated with disease risk (p-value: 1.2 × 10(-5)), as was expression of ACAP1 (p-value: 1.9 × 10(-5)) and LRRC25 (p-value: 5.2 × 10(-5)). While genome-wide association studies (GWAS) have implicated RCCD1 and ANKLE1 in breast cancer risk, they have not identified the remaining three genes. Among the genetic variants that contributed to the predicted expression of the five genes, we found 23 nominally (p-value < 0.05) associated with breast cancer risk, among which 15 are not in high linkage disequilibrium with risk variants previously identified by GWAS. In summary, we used a transcriptome-based approach to investigate the genetic underpinnings of breast carcinogenesis. This approach provided an avenue for deciphering the functional relevance of genes and genetic variants involved in breast cancer. PMID:28362817
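As a rough illustration of what "imputing" expression from genotypes can mean, the sketch below assumes a simple linear model in which predicted expression is a weighted sum of allele dosages. The abstract does not state which prediction model was used, so the function name, variant IDs, and weights here are all hypothetical.

```python
from typing import Dict


def impute_expression(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Predicted expression = sum over variants of weight * allele dosage.

    Variants absent from `dosages` are skipped (an assumption made for this
    sketch; real pipelines treat missing genotypes more carefully).
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Hypothetical per-variant weights for one gene, as if learned in a reference panel.
weights = {"rs_a": 0.40, "rs_b": -0.25, "rs_c": 0.10}
# Hypothetical allele dosages (0 to 2) for one study participant.
person = {"rs_a": 2.0, "rs_b": 1.0}

print(round(impute_expression(person, weights), 3))
```

The imputed value for each participant would then be tested for association with case-control status in place of individual SNP genotypes.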
Dobbyn, Amanda; Huckins, Laura M; Boocock, James; Sloofman, Laura G; Glicksberg, Benjamin S; Giambartolomei, Claudia; Hoffman, Gabriel E; Perumal, Thanneer M; Girdhar, Kiran; Jiang, Yan; Raj, Towfique; Ruderfer, Douglas M; Kramer, Robin S; Pinto, Dalila; Akbarian, Schahram; Roussos, Panos; Domenici, Enrico; Devlin, Bernie; Sklar, Pamela; Stahl, Eli A; Sieberts, Solveig K
2018-06-07
Causal genes and variants within genome-wide association study (GWAS) loci can be identified by integrating GWAS statistics with expression quantitative trait loci (eQTL) and determining which variants underlie both GWAS and eQTL signals. Most analyses, however, consider only the marginal eQTL signal, rather than dissect this signal into multiple conditionally independent signals for each gene. Here we show that analyzing conditional eQTL signatures, which could be important under specific cellular or temporal contexts, leads to improved fine mapping of GWAS associations. Using genotypes and gene expression levels from post-mortem human brain samples (n = 467) reported by the CommonMind Consortium (CMC), we find that conditional eQTL are widespread; 63% of genes with primary eQTL also have conditional eQTL. In addition, genomic features associated with conditional eQTL are consistent with context-specific (e.g., tissue-, cell type-, or developmental time point-specific) regulation of gene expression. Integrating the 2014 Psychiatric Genomics Consortium schizophrenia (SCZ) GWAS and CMC primary and conditional eQTL data reveals 40 loci with strong evidence for co-localization (posterior probability > 0.8), including six loci with co-localization of conditional eQTL. Our co-localization analyses support previously reported genes, identify novel genes associated with schizophrenia risk, and provide specific hypotheses for their functional follow-up. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Gene- and pathway-based association tests for multiple traits with GWAS summary statistics.
Kwak, Il-Youp; Pan, Wei
2017-01-01
To identify novel genetic variants associated with complex traits and to shed new light on underlying biology, in addition to the most popular single SNP-single trait association analysis, it would be useful to explore multiple correlated (intermediate) traits at the gene- or pathway-level by mining existing single GWAS or meta-analyzed GWAS data. For this purpose, we present an adaptive gene-based test and a pathway-based test for association analysis of multiple traits with GWAS summary statistics. The proposed tests are adaptive at both the SNP- and trait-levels; that is, they account for possibly varying association patterns (e.g. signal sparsity levels) across SNPs and traits, thus maintaining high power across a wide range of situations. Furthermore, the proposed methods are general: they can be applied to mixed types of traits, and to Z-statistics or P-values as summary statistics obtained from either a single GWAS or a meta-analysis of multiple GWAS. Our numerical studies with simulated and real data demonstrated the promising performance of the proposed methods. The methods are implemented in R package aSPU, freely and publicly available at: https://cran.r-project.org/web/packages/aSPU/ Contact: weip@biostat.umn.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
A genome-wide survey of transgenerational genetic effects in autism.
Tsang, Kathryn M; Croen, Lisa A; Torres, Anthony R; Kharrazi, Martin; Delorenze, Gerald N; Windham, Gayle C; Yoshida, Cathleen K; Zerbo, Ousseny; Weiss, Lauren A
2013-01-01
Effects of parental genotype or parent-offspring genetic interaction are well established in model organisms for a variety of traits. However, these transgenerational genetic models are rarely studied in humans. We have utilized an autism case-control study with 735 mother-child pairs to perform genome-wide screening for maternal genetic effects and maternal-offspring genetic interaction. We used simple models of single locus parent-child interaction and identified suggestive results (P<10(-4)) that cannot be explained by main effects, but no genome-wide significant signals. Some of these maternal and maternal-child associations were in or adjacent to autism candidate genes including: PCDH9, FOXP1, GABRB3, NRXN1, RELN, MACROD2, FHIT, RORA, CNTN4, CNTNAP2, FAM135B, LAMA1, NFIA, NLGN4X, RAPGEF4, and SDK1. We attempted validation of potential autism association under maternal-specific models using maternal-paternal comparison in family-based GWAS datasets. Our results suggest that further study of parental genetic effects and parent-child interaction in autism is warranted.
Genetic overlap between Alzheimer’s disease and Parkinson’s disease at the MAPT locus
Desikan, Rahul S.; Schork, Andrew J.; Wang, Yunpeng; Witoelar, Aree; Sharma, Manu; McEvoy, Linda K.; Holland, Dominic; Brewer, James B.; Chen, Chi-Hua; Thompson, Wesley K.; Harold, Denise; Williams, Julie; Owen, Michael J.; O’Donovan, Michael C.; Pericak-Vance, Margaret A.; Mayeux, Richard; Haines, Jonathan L.; Farrer, Lindsay A.; Schellenberg, Gerard D.; Heutink, Peter; Singleton, Andrew B.; Brice, Alexis; Wood, Nicolas W.; Hardy, John; Martinez, Maria; Choi, Seung Hoi; DeStefano, Anita; Ikram, M. Arfan; Bis, Joshua C.; Smith, Albert; Fitzpatrick, Annette L.; Launer, Lenore; van Duijn, Cornelia; Seshadri, Sudha; Ulstein, Ingun Dina; Aarsland, Dag; Fladby, Tormod; Djurovic, Srdjan; Hyman, Bradley T.; Snaedal, Jon; Stefansson, Hreinn; Stefansson, Kari; Gasser, Thomas; Andreassen, Ole A.; Dale, Anders M.
2015-01-01
We investigated genetic overlap between Alzheimer’s disease (AD) and Parkinson’s disease (PD). Using summary statistics (p-values) from large recent genome-wide association studies (GWAS) (total n = 89,904 individuals), we sought to identify single nucleotide polymorphisms (SNPs) associating with both AD and PD. We found and replicated association of both AD and PD with the A allele of rs393152 within the extended MAPT region on chromosome 17 (meta-analysis p-value across 5 independent AD cohorts = 1.65 × 10(-7)). In independent datasets, we found a dose-dependent effect of the A allele of rs393152 on intra-cerebral MAPT transcript levels and volume loss within the entorhinal cortex and hippocampus. Our findings identify the tau-associated MAPT locus as a site of genetic overlap between AD and PD and, extending prior work, show that the MAPT region increases risk of Alzheimer’s neurodegeneration. PMID:25687773
The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog).
MacArthur, Jacqueline; Bowler, Emily; Cerezo, Maria; Gil, Laurent; Hall, Peggy; Hastings, Emma; Junkins, Heather; McMahon, Aoife; Milano, Annalisa; Morales, Joannella; Pendlington, Zoe May; Welter, Danielle; Burdett, Tony; Hindorff, Lucia; Flicek, Paul; Cunningham, Fiona; Parkinson, Helen
2017-01-04
The NHGRI-EBI GWAS Catalog has provided data from published genome-wide association studies since 2008. In 2015, the database was redesigned and relocated to EMBL-EBI. The new infrastructure includes a new graphical user interface (www.ebi.ac.uk/gwas/), ontology supported search functionality and an improved curation interface. These developments have improved the data release frequency by increasing automation of curation and providing scaling improvements. The range of available Catalog data has also been extended with structured ancestry and recruitment information added for all studies. The infrastructure improvements also support scaling for larger arrays, exome and sequencing studies, allowing the Catalog to adapt to the needs of evolving study design, genotyping technologies and user needs in the future. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
The (in)famous GWAS P-value threshold revisited and updated for low-frequency variants.
Fadista, João; Manning, Alisa K; Florez, Jose C; Groop, Leif
2016-08-01
Genome-wide association studies (GWAS) have long relied on proposed statistical significance thresholds to be able to differentiate true positives from false positives. Although the genome-wide significance P-value threshold of 5 × 10(-8) has become a standard for common-variant GWAS, it has not been updated to cope with the lower allele frequency spectrum used in many recent array-based GWAS studies and sequencing studies. Using a whole-genome- and -exome-sequencing data set of 2875 individuals of European ancestry from the Genetics of Type 2 Diabetes (GoT2D) project and a whole-exome-sequencing data set of 13 000 individuals from five ancestries from the GoT2D and T2D-GENES (Type 2 Diabetes Genetic Exploration by Next-generation sequencing in multi-Ethnic Samples) projects, we describe guidelines for genome- and exome-wide association P-value thresholds needed to correct for multiple testing, explaining the impact of linkage disequilibrium thresholds for distinguishing independent variants, minor allele frequency and ancestry characteristics. We emphasize the advantage of studying recent genetic isolate populations when performing rare and low-frequency genetic association analyses, as the multiple testing burden is diminished due to higher genetic homogeneity.
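The conventional threshold mentioned above can be read as a Bonferroni correction: a family-wise error rate of 0.05 divided by roughly one million independent common-variant tests. The sketch below shows that arithmetic only; the second test count is an illustrative assumption, not a value reported in the abstract.

```python
def bonferroni_threshold(alpha: float, n_independent_tests: int) -> float:
    """Per-test P-value threshold that keeps the family-wise error rate at alpha."""
    if n_independent_tests <= 0:
        raise ValueError("n_independent_tests must be positive")
    return alpha / n_independent_tests


# About one million independent common variants gives the familiar 5e-8.
common = bonferroni_threshold(0.05, 1_000_000)
# Hypothetical larger testing burden once low-frequency variants are included.
low_frequency = bonferroni_threshold(0.05, 5_000_000)

print(f"{common:.1e} {low_frequency:.1e}")
```

The number of effectively independent tests depends on the linkage disequilibrium cutoff, allele frequency range, and ancestry, which is why a single fixed threshold does not suit every study design.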
Overview of the Genetics of Alcohol Use Disorder
Tawa, Elisabeth A.; Hall, Samuel D.; Lohoff, Falk W.
2016-01-01
Aims: Alcohol Use Disorder (AUD) is a chronic psychiatric illness characterized by harmful drinking patterns leading to negative emotional, physical, and social ramifications. While the underlying pathophysiology of AUD is poorly understood, there is substantial evidence for a genetic component; however, identification of universal genetic risk variants for AUD has been difficult. Recent efforts in the search for AUD susceptibility genes will be reviewed in this article. Methods: In this review, we provide an overview of genetic studies on AUD, including twin studies, linkage studies, candidate gene studies, and genome-wide association studies (GWAS). Results: Several potential genetic susceptibility factors for AUD have been identified, but the genes of alcohol metabolism, alcohol dehydrogenase (ADH) and aldehyde dehydrogenase (ALDH), have been found to be protective against the development of AUD. GWAS have also identified a heterogeneous list of SNPs associated with AUD and alcohol-related phenotypes, emphasizing the complexity and heterogeneity of the disorder. In addition, many of these findings have small effect sizes when compared to alcohol metabolism genes, and biological relevance is often unknown. Conclusions: Although studies spanning multiple approaches have suggested a genetic basis for AUD, identification of the genetic risk variants has been challenging. Some promising results are emerging from GWAS; however, larger sample sizes are needed to improve GWAS results and resolution. As the field of genetics is rapidly developing, whole genome sequencing could soon become the new standard of interrogation of the genes and neurobiological pathways which contribute to the complex phenotype of AUD. Short summary: This review examines the genetic underpinnings of Alcohol Use Disorder (AUD), with an emphasis on GWAS approaches for identifying genetic risk variants. The most promising results associated with AUD and alcohol-related phenotypes have included SNPs of the alcohol metabolism genes ADH and ALDH. PMID:27445363
Bioinformatics challenges for genome-wide association studies.
Moore, Jason H; Asselbergs, Folkert W; Williams, Scott M
2010-02-15
The sequencing of the human genome has made it possible to identify an informative set of >1 million single nucleotide polymorphisms (SNPs) across the genome that can be used to carry out genome-wide association studies (GWASs). The availability of massive amounts of GWAS data has necessitated the development of new biostatistical methods for quality control, imputation and analysis issues including multiple testing. This work has been successful and has enabled the discovery of new associations that have been replicated in multiple studies. However, it is now recognized that most SNPs discovered via GWAS have small effects on disease susceptibility and thus may not be suitable for improving health care through genetic testing. One likely explanation for the mixed results of GWAS is that the current biostatistical analysis paradigm is by design agnostic or unbiased in that it ignores all prior knowledge about disease pathobiology. Further, the linear modeling framework that is employed in GWAS often considers only one SNP at a time thus ignoring their genomic and environmental context. There is now a shift away from the biostatistical approach toward a more holistic approach that recognizes the complexity of the genotype-phenotype relationship that is characterized by significant heterogeneity and gene-gene and gene-environment interaction. We argue here that bioinformatics has an important role to play in addressing the complexity of the underlying genetic basis of common human diseases. The goal of this review is to identify and discuss those GWAS challenges that will require computational methods.
Reprogramming neurodegeneration in the big data era.
Zhou, Lujia; Verstreken, Patrik
2018-02-01
Recent genome-wide association studies (GWAS) have identified numerous genetic risk variants for late-onset Alzheimer's disease (AD) and Parkinson's disease (PD). However, deciphering the functional consequences of GWAS data is challenging due to a lack of reliable model systems to study the genetic variants that are often of low penetrance and non-coding identities. Pluripotent stem cell (PSC) technologies offer unprecedented opportunities for molecular phenotyping of GWAS variants in human neurons and microglia. Moreover, rapid technological advances in whole-genome RNA-sequencing and epigenome mapping fuel comprehensive and unbiased investigations of molecular alterations in PSC-derived disease models. Here, we review and discuss how integrated studies that utilize PSC technologies and genome-wide approaches may bring new mechanistic insight into the pathogenesis of AD and PD. Copyright © 2018 Elsevier Ltd. All rights reserved.
Nieuwenhuis, Maartje A.; Siedlinski, Matteusz; van den Berge, Maarten; Granell, Raquel; Li, Xingnan; Niens, Marijke; van der Vlies, Pieter; Altmüller, Janine; Nürnberg, Peter; Kerkhof, Marjan; van Schayck, Onno C.; Riemersma, Ronald A.; van der Molen, Thys; de Monchy, Jan G.; Bossé, Yohan; Sandford, Andrew; Bruijnzeel-Koomen, Carla A.; van Wijk, Roy G.; ten Hacken, Nick H.; Timens, Wim; Boezen, H. Marike; Henderson, John; Kabesch, Michael; Vonk, Judith M.; Postma, Dirkje S.; Koppelman, Gerard H.
2016-01-01
Background: Genome-wide association studies (GWAS) of asthma have identified single nucleotide polymorphisms (SNPs) that modestly increase the risk for asthma. This could be due to phenotypic heterogeneity of asthma. Bronchial hyperresponsiveness (BHR) is a phenotypic hallmark of asthma. We aim to identify susceptibility genes for asthma combined with BHR and analyse the presence of cis-eQTLs among replicated SNPs. Secondly, we compare the genetic association of SNPs previously associated with (doctor-diagnosed) asthma to our GWAS of asthma with BHR. Methods: A GWAS was performed in 920 asthmatics with BHR and 980 controls. Top SNPs of our GWAS were analysed in four replication cohorts and lung cis-eQTL analysis was performed on replicated SNPs. We investigated association of SNPs previously associated with asthma in our data. Results: 368 SNPs were followed up for replication. Six SNPs in genes encoding ABI3BP, NAF1, MICA and the 17q21 locus replicated in one or more cohorts, with one locus (17q21) achieving genome-wide significance after meta-analysis. Five out of 6 replicated SNPs regulated 35 gene transcripts in whole lung. Eight of 20 asthma-associated SNPs from previous GWAS were significantly associated with asthma and BHR. Three SNPs, in IL-33 and GSDMB, showed larger effect sizes in our data compared to published literature. Conclusions: Combining GWAS with subsequent lung eQTL analysis revealed disease-associated SNPs regulating lung mRNA expression levels of potential new asthma genes. Adding BHR to the asthma definition does not lead to an overall larger genetic effect size than analysing (doctor-diagnosed) asthma. PMID:27439200
Mägi, Reedik; Suleimanov, Yury V; Clarke, Geraldine M; Kaakinen, Marika; Fischer, Krista; Prokopenko, Inga; Morris, Andrew P
2017-01-11
Genome-wide association studies (GWAS) of single nucleotide polymorphisms (SNPs) have been successful in identifying loci contributing genetic effects to a wide range of complex human diseases and quantitative traits. The traditional approach to GWAS analysis is to consider each phenotype separately, despite the fact that many diseases and quantitative traits are correlated with each other, and often measured in the same sample of individuals. Multivariate analyses of correlated phenotypes have been demonstrated, by simulation, to increase power to detect association with SNPs, and thus may enable improved detection of novel loci contributing to diseases and quantitative traits. We have developed the SCOPA software to enable GWAS analysis of multiple correlated phenotypes. The software implements "reverse regression" methodology, which treats the genotype of an individual at a SNP as the outcome and the phenotypes as predictors in a general linear model. SCOPA can be applied to quantitative traits and categorical phenotypes, and can accommodate imputed genotypes under a dosage model. The accompanying META-SCOPA software enables meta-analysis of association summary statistics from SCOPA across GWAS. Application of SCOPA to two GWAS of high- and low-density lipoprotein cholesterol, triglycerides and body mass index, and subsequent meta-analysis with META-SCOPA, highlighted stronger association signals than univariate phenotype analysis at established lipid and obesity loci. The META-SCOPA meta-analysis also revealed a novel signal of association at genome-wide significance for triglycerides mapping to GPC5 (lead SNP rs71427535, p = 1.1 × 10(-8)), which has not been reported in previous large-scale GWAS of lipid traits. The SCOPA and META-SCOPA software enable discovery and dissection of multiple phenotype association signals through implementation of a powerful reverse regression approach.
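To make the "reverse regression" idea concrete, the sketch below regresses genotype dosage on several phenotypes by ordinary least squares and reports the coefficients and R². It is a toy illustration in plain Python, not the SCOPA implementation; the function names and data are hypothetical, and a real analysis would follow the fit with a joint significance test of the phenotype coefficients.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def reverse_regression(
    dosages: Sequence[float], phenotypes: Sequence[Sequence[float]]
) -> Tuple[List[float], float]:
    """Fit dosage ~ intercept + phenotypes; return (coefficients, R^2)."""
    rows = [[1.0] + list(p) for p in phenotypes]  # design matrix with intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * g for r, g in zip(rows, dosages)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)  # normal equations
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_g = sum(dosages) / len(dosages)
    ss_res = sum((g - f) ** 2 for g, f in zip(dosages, fitted))
    ss_tot = sum((g - mean_g) ** 2 for g in dosages)
    return coef, 1.0 - ss_res / ss_tot


# Hypothetical data: six individuals, two standardized traits, imputed dosages.
traits = [(0.0, 1.0), (1.0, 0.0), (2.0, 1.0), (1.0, 2.0), (0.0, 0.0), (2.0, 2.0)]
dosage = [0.1, 0.9, 1.8, 1.2, 0.0, 2.0]
coefficients, r_squared = reverse_regression(dosage, traits)
print([round(c, 3) for c in coefficients], round(r_squared, 3))
```

Placing the genotype on the left-hand side lets any number of correlated traits enter as predictors in one model, which is what makes the multi-phenotype test straightforward.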
Genome-wide association studies for the identification of biomarkers in metabolic diseases.
Pattin, Kristine A; Moore, Jason H
2010-01-01
The field of genetics as it relates to metabolic disorders such as obesity and type II diabetes is complicated, and along with the medical research community, great strides are being taken to begin to understand the biological and genetic underpinnings of these diseases, with the hope of improving therapeutic, diagnostic and preventive strategies. Although research on metabolic disorders has been continuing for decades, the completion of the Human Genome Project in 2003 and the International HapMap Project in 2005 gave rise to an abundance of research tools, such as genome-wide genotyping, which allow researchers to conduct genome-wide association studies (GWAS) for detecting genetic variants that confer increased or decreased susceptibility to such complex diseases. In this review, the complex nature of metabolic disorders is discussed, specifically obesity and type II diabetes, as well as the limitations of the GWAS as applied to these disorders. While acknowledging limitations of GWAS, it is hoped to provide an insight about how GWAS can be adapted and advantageous in the clinical setting, enhancing prevention, diagnosis and treatment of these diseases. To be able to use the GWAS in a clinical setting is a complex challenge, yet it is hoped that in the future this tool will ultimately allow the development of pharmaceutical options that are capable of targeting the cause of metabolic disorders, not just the symptoms themselves.
Analysis and visualization of Arabidopsis thaliana GWAS using web 2.0 technologies.
Huang, Yu S; Horton, Matthew; Vilhjálmsson, Bjarni J; Seren, Umit; Meng, Dazhe; Meyer, Christopher; Ali Amer, Muhammad; Borevitz, Justin O; Bergelson, Joy; Nordborg, Magnus
2011-01-01
With large-scale genomic data becoming the norm in biological studies, the storing, integrating, viewing and searching of such data have become a major challenge. In this article, we describe the development of an Arabidopsis thaliana database that hosts the geographic information and genetic polymorphism data for over 6000 accessions and genome-wide association study (GWAS) results for 107 phenotypes representing the largest collection of Arabidopsis polymorphism data and GWAS results to date. Taking advantage of a series of the latest web 2.0 technologies, such as Ajax (Asynchronous JavaScript and XML), GWT (Google-Web-Toolkit), MVC (Model-View-Controller) web framework and Object Relationship Mapper, we have created a web-based application (web app) for the database, that offers an integrated and dynamic view of geographic information, genetic polymorphism and GWAS results. Essential search functionalities are incorporated into the web app to aid reverse genetics research. The database and its web app have proven to be a valuable resource to the Arabidopsis community. The whole framework serves as an example of how biological data, especially GWAS, can be presented and accessed through the web. In the end, we illustrate the potential to gain new insights through the web app by two examples, showcasing how it can be used to facilitate forward and reverse genetics research. Database URL: http://arabidopsis.usc.edu/
Population Stratification in the Context of Diverse Epidemiologic Surveys Sans Genome-Wide Data
Oetjens, Matthew T.; Brown-Gentry, Kristin; Goodloe, Robert; Dilks, Holli H.; Crawford, Dana C.
2016-01-01
Population stratification or confounding by genetic ancestry is a potential cause of false associations in genetic association studies. Estimation of and adjustment for genetic ancestry has become common practice thanks in part to the availability of ancestry informative markers on genome-wide association study (GWAS) arrays. While array data are now widespread, they are not ubiquitous, as several large epidemiologic and clinic-based studies lack genome-wide data. One such large epidemiologic-based study lacking genome-wide data accessible to investigators is the National Health and Nutrition Examination Surveys (NHANES), population-based cross-sectional surveys of Americans linked to demographic, health, and lifestyle data conducted by the Centers for Disease Control and Prevention. DNA samples (n = 14,998) were extracted from biospecimens from consented NHANES participants between 1991–1994 (NHANES III, phase 2) and 1999–2002 and represent three major self-identified racial/ethnic groups: non-Hispanic whites (n = 6,634), non-Hispanic blacks (n = 3,458), and Mexican Americans (n = 3,950). We, as the Epidemiologic Architecture for Genes Linked to Environment study, genotyped candidate gene and GWAS-identified index variants in NHANES as part of the larger Population Architecture using Genomics and Epidemiology I study for collaborative genetic association studies. To enable basic quality control such as estimation of genetic ancestry to control for population stratification in NHANES sans genome-wide data, we outline here strategies that use limited genetic data to identify the markers optimal for characterizing genetic ancestry. From among 411 and 295 autosomal SNPs available in NHANES III and NHANES 1999–2002, we demonstrate that markers with ancestry information can be identified to estimate global ancestry. Despite limited resolution, global genetic ancestry is highly correlated with self-identified race for the majority of participants, although less so for ethnicity. Overall, the strategies outlined here for a large epidemiologic study can be applied to other datasets that are accessible for genotype–phenotype studies but are sans genome-wide data. PMID:27200085
Tang, Hongwei; Wei, Peng; Duell, Eric J; Risch, Harvey A; Olson, Sara H; Bueno-de-Mesquita, H Bas; Gallinger, Steven; Holly, Elizabeth A; Petersen, Gloria M; Bracci, Paige M; McWilliams, Robert R; Jenab, Mazda; Riboli, Elio; Tjønneland, Anne; Boutron-Ruault, Marie Christine; Kaaks, Rudolf; Trichopoulos, Dimitrios; Panico, Salvatore; Sund, Malin; Peeters, Petra H M; Khaw, Kay-Tee; Amos, Christopher I; Li, Donghui
2014-01-01
Obesity and diabetes are potentially alterable risk factors for pancreatic cancer. Genetic factors that modify the associations of obesity and diabetes with pancreatic cancer have previously not been examined at the genome-wide level. Using genome-wide association studies (GWAS) genotype and risk factor data from the Pancreatic Cancer Case Control Consortium, we conducted a discovery study of 2,028 cases and 2,109 controls to examine gene-obesity and gene-diabetes interactions in relation to pancreatic cancer risk by using the likelihood-ratio test nested in logistic regression models and Ingenuity Pathway Analysis (IPA). After adjusting for multiple comparisons, a significant interaction of the chemokine signaling pathway with obesity (P = 3.29 × 10(-6)) and a near significant interaction of calcium signaling pathway with diabetes (P = 1.57 × 10(-4)) in modifying the risk of pancreatic cancer were observed. These findings were supported by results from IPA analysis of the top genes with nominal interactions. The major contributing genes to the two top pathways include GNGT2, RELA, TIAM1, and GNAS. None of the individual genes or single-nucleotide polymorphism (SNP) except one SNP remained significant after adjusting for multiple testing. Notably, SNP rs10818684 of the PTGS1 gene showed an interaction with diabetes (P = 7.91 × 10(-7)) at a false discovery rate of 6%. Genetic variations in inflammatory response and insulin resistance may affect the risk of obesity- and diabetes-related pancreatic cancer. These observations should be replicated in additional large datasets. A gene-environment interaction analysis may provide new insights into the genetic susceptibility and molecular mechanisms of obesity- and diabetes-related pancreatic cancer.
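The abstract reports one SNP interaction "at a false discovery rate of 6%" without naming the procedure. One common way to obtain such values is the Benjamini-Hochberg adjustment, sketched below under that assumption; the P-values in the example are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values (q-values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical interaction P-values for four SNPs.
p = [0.01, 0.04, 0.03, 0.005]
print([round(q, 3) for q in benjamini_hochberg(p)])
```

A SNP would then be reported at, say, a 6% false discovery rate if its adjusted value is at or below 0.06.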
Transferability of genome-wide associated loci for asthma in African Americans.
Faruque, Mezbah U; Chen, Guanjie; Doumatey, Ayo P; Zhou, Jie; Huang, Hanxia; Shriner, Daniel; Adeyemo, Adebowale A; Rotimi, Charles N; Dunston, Georgia M
2017-01-02
Transferability of significantly associated loci or GWAS "hits" adds credibility to genotype-disease associations and provides evidence for generalizability across different ancestral populations. We sought evidence of association of known asthma-associated single nucleotide polymorphisms (SNPs) in an African American population. Subjects comprised 661 participants (261 asthma cases and 400 controls) from the Howard University Family Study. Forty-eight SNPs previously reported to be associated with asthma by GWAS were selected for testing. We used a combined strategy, first taking an "exact" approach in which we looked up only the reported index SNP. For those index SNPs missing from our dataset, we used a "local" approach that examined all the regional SNPs in LD with the index SNP. Out of the 48 SNPs, our cohort had genotype data available for 27, which were examined for exact replication. Of these, two SNPs were found positively associated with asthma. These included: rs10508372 (OR = 1.567 [95%CI, 1.133-2.167], P = 0.0066) and rs2378383 (OR = 2.147 [95%CI, 1.149-4.013], P = 0.0166), located on chromosomal bands 10p14 and 9q21.31, respectively. Local replication of the remaining 21 loci showed association at two chromosomal loci (9p24.1-rs2381413 and 6p21.32-rs3132947; Bonferroni-corrected P values: 0.0033 and 0.0197, respectively). Of note, multiple SNPs in LD with rs2381413 located upstream of IL33 were significantly associated with asthma. This study has successfully transferred four reported asthma-associated loci in an independent African American population. Identification of several asthma-associated SNPs upstream of IL33, a gene previously implicated in allergic inflammation of the asthmatic airway, supports the generalizability of this finding.
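For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained, the sketch below computes both from a 2×2 table of allele counts using the standard Wald interval on the log scale. The counts are hypothetical, and the abstract does not say whether its estimates came from a simple table or an adjusted regression model.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: risk alleles in cases      b: other alleles in cases
    c: risk alleles in controls   d: other alleles in controls
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, not taken from the study.
or_, lo, hi = odds_ratio_ci(60, 40, 40, 60)
print(f"OR = {or_:.3f} [95% CI, {lo:.3f}-{hi:.3f}]")
```

An interval that excludes 1, as for both SNPs reported above, corresponds to a nominal P-value below 0.05.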
Biernacka, Joanna M.; Geske, Jennifer R.; Schneekloth, Terry D.; Frye, Mark A.; Cunningham, Julie M.; Choi, Doo-Sup; Tapp, Courtney L.; Lewis, Bradley R.; Drews, Maureen S.; L.Pietrzak, Tracy; Colby, Colin L.; Hall-Flavin, Daniel K.; Loukianova, Larissa L.; Heit, John A.; Mrazek, David A.; Karpyak, Victor M.
2013-01-01
Genome-wide association studies (GWAS) have revealed many single nucleotide polymorphisms (SNPs) associated with complex traits. Although these studies frequently fail to identify statistically significant associations, the top association signals from GWAS may be enriched for true associations. We therefore investigated the association of alcohol dependence with 43 SNPs selected from association signals in the first two published GWAS of alcoholism. Our analysis of 808 alcohol-dependent cases and 1,248 controls provided evidence of association of alcohol dependence with SNP rs1614972 in the ADH1C gene (unadjusted p = 0.0017). Because the GWAS study that originally reported association of alcohol dependence with this SNP [1] included only men, we also performed analyses in sex-specific strata. The results suggest that this SNP has a similar effect in both sexes (men: OR (95%CI) = 0.80 (0.66, 0.95); women: OR (95%CI) = 0.83 (0.66, 1.03)). We also observed marginal evidence of association of the rs1614972 minor allele with lower alcohol consumption in the non-alcoholic controls (p = 0.081), and independently in the alcohol-dependent cases (p = 0.046). Despite a number of potential differences between the samples investigated by the prior GWAS and the current study, data presented here provide additional support for the association of SNP rs1614972 in ADH1C with alcohol dependence and extend this finding by demonstrating association with consumption levels in both non-alcoholic and alcohol-dependent populations. Further studies should investigate the association of other polymorphisms in this gene with alcohol dependence and related alcohol-use phenotypes. PMID:23516558
Al-Tassan, Nada A; Whiffin, Nicola; Hosking, Fay J; Palles, Claire; Farrington, Susan M; Dobbins, Sara E; Harris, Rebecca; Gorman, Maggie; Tenesa, Albert; Meyer, Brian F; Wakil, Salma M; Kinnersley, Ben; Campbell, Harry; Martin, Lynn; Smith, Christopher G; Idziaszczyk, Shelley; Barclay, Ella; Maughan, Timothy S; Kaplan, Richard; Kerr, Rachel; Kerr, David; Buchanan, Daniel D; Buchannan, Daniel D; Win, Aung Ko; Hopper, John; Jenkins, Mark; Lindor, Noralane M; Newcomb, Polly A; Gallinger, Steve; Conti, David; Schumacher, Fred; Casey, Graham; Dunlop, Malcolm G; Tomlinson, Ian P; Cheadle, Jeremy P; Houlston, Richard S
2015-05-20
Genome-wide association studies (GWAS) of colorectal cancer (CRC) have identified 23 susceptibility loci thus far. Analyses of previously conducted GWAS indicate additional risk loci are yet to be discovered. To identify novel CRC susceptibility loci, we conducted a new GWAS and performed a meta-analysis with five published GWAS (totalling 7,577 cases and 9,979 controls of European ancestry), imputing genotypes utilising the 1000 Genomes Project. The combined analysis identified new, significant associations with CRC at 1p36.2 marked by rs72647484 (minor allele frequency [MAF] = 0.09) near CDC42 and WNT4 (P = 1.21 × 10(-8), odds ratio [OR] = 1.21) and at 16q24.1 marked by rs16941835 (MAF = 0.21, P = 5.06 × 10(-8); OR = 1.15) within the long non-coding RNA (lncRNA) RP11-58A18.1 and ~500 kb from the nearest coding gene FOXL1. Additionally, we identified a promising association at 10p13 with rs10904849 intronic to CUBN (MAF = 0.32, P = 7.01 × 10(-8); OR = 1.14). These findings provide further insights into the genetic and biological basis of inherited genetic susceptibility to CRC. Additionally, our analysis further demonstrates that imputation can be used to exploit GWAS data to identify novel disease-causing variants.
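Combining per-study results for one SNP is often done with a fixed-effect, inverse-variance-weighted meta-analysis. The sketch below assumes that model, which the abstract does not specify, and uses hypothetical per-study log odds ratios and standard errors.

```python
import math
from typing import Sequence, Tuple


def fixed_effect_meta(
    betas: Sequence[float], ses: Sequence[float]
) -> Tuple[float, float, float, float]:
    """Inverse-variance-weighted pooled effect, its SE, Z score, and two-sided P."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return beta, se, z, p


# Hypothetical log odds ratios and standard errors from three studies.
log_ors = [0.18, 0.22, 0.15]
std_errs = [0.06, 0.08, 0.05]
beta, se, z, p = fixed_effect_meta(log_ors, std_errs)
print(f"pooled OR = {math.exp(beta):.3f}, P = {p:.2e}")
```

Studies with smaller standard errors receive more weight, so a large cohort dominates the pooled estimate.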
Family-Based Genome-Wide Association Scan of Attention-Deficit/Hyperactivity Disorder
ERIC Educational Resources Information Center
Mick, Eric; Todorov, Alexandre; Smalley, Susan; Hu, Xiaolan; Loo, Sandra; Todd, Richard D.; Biederman, Joseph; Byrne, Deirdre; Dechairo, Bryan; Guiney, Allan; McCracken, James; McGough, James; Nelson, Stanley F.; Reiersen, Angela M.; Wilens, Timothy E.; Wozniak, Janet; Neale, Benjamin M.; Faraone, Stephen V.
2010-01-01
Objective: Genes likely play a substantial role in the etiology of attention-deficit/hyperactivity disorder (ADHD). However, the genetic architecture of the disorder is unknown, and prior genome-wide association studies (GWAS) have not identified a genome-wide significant association. We have conducted a third, independent, multisite GWAS of…
Nested association mapping for dissecting complex traits using Peanut 58K SNP array
USDA-ARS's Scientific Manuscript database
Genome-wide association studies (GWAS) and linkage mapping have been the two most predominant strategies to dissect complex traits, but are limited by the occurrence of false positives reported for GWAS, and low resolution in the case of linkage analysis. This has led to the development of a joint a...
USDA-ARS's Scientific Manuscript database
Copy number variation (CNV) is an important type of genetic variation contributing to phenotypic differences among mammals and may serve as an alternative molecular marker to single nucleotide polymorphism (SNP) for genome-wide association study (GWAS). Recently, GWAS analysis using CNV has been app...
Transcriptional risk scores link GWAS to eQTLs and predict complications in Crohn's disease.
Marigorta, Urko M; Denson, Lee A; Hyams, Jeffrey S; Mondal, Kajari; Prince, Jarod; Walters, Thomas D; Griffiths, Anne; Noe, Joshua D; Crandall, Wallace V; Rosh, Joel R; Mack, David R; Kellermayer, Richard; Heyman, Melvin B; Baker, Susan S; Stephens, Michael C; Baldassano, Robert N; Markowitz, James F; Kim, Mi-Ok; Dubinsky, Marla C; Cho, Judy; Aronow, Bruce J; Kugathasan, Subra; Gibson, Greg
2017-10-01
Gene expression profiling can be used to uncover the mechanisms by which loci identified through genome-wide association studies (GWAS) contribute to pathology. Given that most GWAS hits are in putative regulatory regions and transcript abundance is physiologically closer to the phenotype of interest, we hypothesized that summation of risk-allele-associated gene expression, namely a transcriptional risk score (TRS), should provide accurate estimates of disease risk. We integrate summary-level GWAS and expression quantitative trait locus (eQTL) data with RNA-seq data from the RISK study, an inception cohort of pediatric Crohn's disease. We show that TRSs based on genes regulated by variants linked to inflammatory bowel disease (IBD) not only outperform genetic risk scores (GRSs) in distinguishing Crohn's disease from healthy samples, but also serve to identify patients who in time will progress to complicated disease. Our dissection of eQTL effects may be used to distinguish genes whose association with disease is through promotion versus protection, thereby linking statistical association to biological mechanism. The TRS approach constitutes a potential strategy for personalized medicine that enhances inference from static genotypic risk assessment.
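The idea of a transcriptional risk score as a "summation of risk-allele-associated gene expression" can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: expression is standardized per gene across samples, each gene carries a hypothetical direction (+1 if the risk allele is associated with higher expression, -1 if lower), and the score is the signed sum. The gene names and values are placeholders.

```python
from statistics import mean, pstdev

def transcriptional_risk_scores(expression, risk_direction):
    """Signed sum of per-gene z-scores for each sample.

    expression:     {gene: [expression value per sample]}
    risk_direction: {gene: +1 or -1}, assumed to come from eQTL data
                    (+1 means the risk allele raises expression).
    """
    n_samples = len(next(iter(expression.values())))
    scores = [0.0] * n_samples
    for gene, values in expression.items():
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue  # a gene with no variation carries no information
        sign = risk_direction[gene]
        for i, value in enumerate(values):
            scores[i] += sign * (value - mu) / sd
    return scores

# Placeholder data: two genes, three samples.
expression = {"GENE_A": [1.0, 2.0, 3.0], "GENE_B": [3.0, 2.0, 1.0]}
risk_direction = {"GENE_A": +1, "GENE_B": -1}
print(transcriptional_risk_scores(expression, risk_direction))
```

Orienting every gene by its risk direction is what lets high and low expression both contribute to a single score; without the sign, promoting and protective effects would cancel.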
Meta-analysis of Parkinson's disease: identification of a novel locus, RIT2.
Pankratz, Nathan; Beecham, Gary W; DeStefano, Anita L; Dawson, Ted M; Doheny, Kimberly F; Factor, Stewart A; Hamza, Taye H; Hung, Albert Y; Hyman, Bradley T; Ivinson, Adrian J; Krainc, Dmitri; Latourelle, Jeanne C; Clark, Lorraine N; Marder, Karen; Martin, Eden R; Mayeux, Richard; Ross, Owen A; Scherzer, Clemens R; Simon, David K; Tanner, Caroline; Vance, Jeffery M; Wszolek, Zbigniew K; Zabetian, Cyrus P; Myers, Richard H; Payami, Haydeh; Scott, William K; Foroud, Tatiana
2012-03-01
Genome-wide association (GWAS) methods have identified genes contributing to Parkinson's disease (PD); we sought to identify additional genes associated with PD susceptibility. A 2-stage design was used. First, individual level genotypic data from 5 recent PD GWAS (Discovery Sample: 4,238 PD cases and 4,239 controls) were combined. Following imputation, a logistic regression model was employed in each dataset to test for association with PD susceptibility and results from each dataset were meta-analyzed. Second, 768 single-nucleotide polymorphisms (SNPs) were genotyped in an independent Replication Sample (3,738 cases and 2,111 controls). Genome-wide significance was reached for SNPs in SNCA (rs356165; G: odds ratio [OR]=1.37; p=9.3×10(-21)), MAPT (rs242559; C: OR=0.78; p=1.5×10(-10)), GAK/DGKQ (rs11248051; T: OR=1.35; p=8.2×10(-9)/rs11248060; T: OR=1.35; p=2.0×10(-9)), and the human leukocyte antigen (HLA) region (rs3129882; A: OR=0.83; p=1.2×10(-8)), which were previously reported. The Replication Sample confirmed the associations with SNCA, MAPT, and the HLA region and also with GBA (E326K; OR=1.71; p=5×10(-8) Combined Sample) (N370; OR=3.08; p=7×10(-5) Replication sample). A novel PD susceptibility locus, RIT2, on chromosome 18 (rs12456492; p=5×10(-5) Discovery Sample; p=1.52×10(-7) Replication sample; p=2×10(-10) Combined Sample) was replicated. Conditional analyses within each of the replicated regions identified distinct SNP associations within GBA and SNCA, suggesting that there may be multiple risk alleles within these genes. We identified a novel PD susceptibility locus, RIT2, replicated several previously identified loci, and identified more than 1 risk allele within SNCA and GBA. Copyright © 2012 American Neurological Association.
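The step in which per-dataset logistic regression results are meta-analyzed can be sketched with a fixed-effect, inverse-variance-weighted combination of log odds ratios. The abstract does not state which meta-analysis model was used, so the choice of model here is an assumption, and the input numbers are invented.

```python
import math

def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance fixed-effect meta-analysis of per-study effects.

    betas:           per-study log odds ratios
    standard_errors: per-study standard errors of those log odds ratios
    Returns (pooled beta, pooled SE, z statistic, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p

# Hypothetical per-dataset estimates for one SNP (not from the study).
beta, se, z, p = fixed_effect_meta([0.30, 0.25, 0.35], [0.10, 0.12, 0.15])
print(f"pooled OR = {math.exp(beta):.2f}, p = {p:.2e}")
```

Pooling shrinks the standard error roughly with the square root of the combined sample size, which is how a signal that is only suggestive in each dataset can reach genome-wide significance in the combined sample.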
Meta-analysis of Parkinson disease: Identification of a novel locus, RIT2
Pankratz, Nathan; Beecham, Gary W.; DeStefano, Anita L.; Dawson, Ted M.; Doheny, Kimberly F.; Factor, Stewart A.; Hamza, Taye H.; Hung, Albert Y.; Hyman, Bradley T.; Ivinson, Adrian J.; Krainc, Dmitri; Latourelle, Jeanne C.; Clark, Lorraine N.; Marder, Karen; Martin, Eden R.; Mayeux, Richard; Ross, Owen A.; Scherzer, Clemens R.; Simon, David K.; Tanner, Caroline; Vance, Jeffery M.; Wszolek, Zbigniew K.; Zabetian, Cyrus P.; Myers, Richard H.; Payami, Haydeh; Scott, William K.; Foroud, Tatiana
2011-01-01
Objective: Genome-wide association (GWAS) methods have identified genes contributing to Parkinson disease (PD); we sought to identify additional genes associated with PD susceptibility. Methods: A two-stage design was used. First, individual level genotypic data from five recent PD GWAS (Discovery Sample: 4,238 PD cases and 4,239 controls) were combined. Following imputation, a logistic regression model was employed in each dataset to test for association with PD susceptibility and results from each dataset were meta-analyzed. Second, 768 SNPs were genotyped in an independent Replication Sample (3,738 cases and 2,111 controls). Results: Genome-wide significance was reached for SNPs in SNCA (rs356165, G: odds ratio (OR)=1.37; p=9.3 × 10(-21)), MAPT (rs242559, C: OR=0.78; p=1.5 × 10(-10)), GAK/DGKQ (rs11248051, T: OR=1.35; p=8.2 × 10(-9)/rs11248060, T: OR=1.35; p=2.0 × 10(-9)), and the HLA region (rs3129882, A: OR=0.83; p=1.2 × 10(-8)), which were previously reported. The Replication Sample confirmed the associations with SNCA, MAPT, and the HLA region and also with GBA (E326K OR=1.71; p=5 × 10(-8) Combined Sample) (N370 OR=3.08; p=7 × 10(-5) Replication sample). A novel PD susceptibility locus, RIT2, on chromosome 18 (rs12456492; p=5 × 10(-5) Discovery Sample; p=1.52 × 10(-7) Replication sample; p=2 × 10(-10) Combined Sample) was replicated. Conditional analyses within each of the replicated regions identified distinct SNP associations within GBA and SNCA, suggesting that there may be multiple risk alleles within these genes. Interpretation: We identified a novel PD susceptibility locus, RIT2, replicated several previously identified loci, and identified more than one risk allele within SNCA and GBA. PMID:22451204
Genomic Influences on Hyperuricemia and Gout.
Merriman, Tony
2017-08-01
Genome-wide association studies (GWAS) have identified nearly 30 loci associated with urate concentrations that also influence the subsequent risk of gout. The ABCG2 Q141 K variant is highly likely to be causal and results in internalization of ABCG2, which can be rescued by drugs. Three other GWAS loci contain uric acid transporter genes, which are also highly likely to be causal. However identification of causal genes at other urate loci is challenging. Finally, relatively little is known about the genetic control of progression from hyperuricemia to gout. Only 4 small GWAS have been published for gout. Copyright © 2017 Elsevier Inc. All rights reserved.
Mieth, Bettina; Kloft, Marius; Rodríguez, Juan Antonio; Sonnenburg, Sören; Vobruba, Robin; Morcillo-Suárez, Carlos; Farré, Xavier; Marigorta, Urko M.; Fehr, Ernst; Dickhaus, Thorsten; Blanchard, Gilles; Schunk, Daniel; Navarro, Arcadi; Müller, Klaus-Robert
2016-01-01
The standard approach to the analysis of genome-wide association studies (GWAS) is based on testing each position in the genome individually for statistical significance of its association with the phenotype under investigation. To improve the analysis of GWAS, we propose a combination of machine learning and statistical testing that takes correlation structures within the set of SNPs under investigation in a mathematically well-controlled manner into account. The novel two-step algorithm, COMBI, first trains a support vector machine to determine a subset of candidate SNPs and then performs hypothesis tests for these SNPs together with an adequate threshold correction. Applying COMBI to data from a WTCCC study (2007) and measuring performance as replication by independent GWAS published within the 2008–2015 period, we show that our method outperforms ordinary raw p-value thresholding as well as other state-of-the-art methods. COMBI presents higher power and precision than the examined alternatives while yielding fewer false (i.e. non-replicated) and more true (i.e. replicated) discoveries when its results are validated on later GWAS studies. More than 80% of the discoveries made by COMBI upon WTCCC data have been validated by independent studies. Implementations of the COMBI method are available as a part of the GWASpi toolbox 2.0. PMID:27892471
Zbrowse: An interactive GWAS results browser
USDA-ARS's Scientific Manuscript database
The growing number of genotyped populations, the advent of high-throughput phenotyping techniques and the development of GWAS analysis software has rapidly accelerated the number of GWAS experimental results. Candidate gene discovery from these results files is often tedious, involving many manual s...
Yang, Wanneng; Guo, Zilong; Huang, Chenglong; Duan, Lingfeng; Chen, Guoxing; Jiang, Ni; Fang, Wei; Feng, Hui; Xie, Weibo; Lian, Xingming; Wang, Gongwei; Luo, Qingming; Zhang, Qifa; Liu, Qian; Xiong, Lizhong
2014-01-01
Even as the study of plant genomics rapidly develops through the use of high-throughput sequencing techniques, traditional plant phenotyping lags far behind. Here we develop a high-throughput rice phenotyping facility (HRPF) to monitor 13 traditional agronomic traits and 2 newly defined traits during the rice growth period. Using genome-wide association studies (GWAS) of the 15 traits, we identify 141 associated loci, 25 of which contain known genes such as the Green Revolution semi-dwarf gene, SD1. Based on a performance evaluation of the HRPF and GWAS results, we demonstrate that high-throughput phenotyping has the potential to replace traditional phenotyping techniques and can provide valuable gene identification information. The combination of the multifunctional phenotyping tools HRPF and GWAS provides deep insights into the genetic architecture of important traits. PMID:25295980
Genome-wide association mapping of frost tolerance in barley (Hordeum vulgare L.)
2013-01-01
Background: Frost tolerance is a key trait with economic and agronomic importance in barley because it is a major component of winter hardiness, and therefore limits the geographical distribution of the crop and the effective transfer of quality traits between spring and winter crop types. Three main frost tolerance QTL (Fr-H1, Fr-H2 and Fr-H3) have been identified from bi-parental genetic mapping but it can be argued that those mapping populations only capture a portion of the genetic diversity of the species. A genetically broad dataset consisting of 184 genotypes, representative of the barley gene pool cultivated in the Mediterranean basin over an extended time period, was genotyped with 1536 SNP markers. Frost tolerance phenotype scores were collected from two trial sites, Foradada (Spain) and Fiorenzuola (Italy) and combined with the genotypic data in genome wide association analyses (GWAS) using Eigenstrat and kinship approaches to account for population structure. Results: GWAS analyses identified twelve and seven positive SNP associations at Foradada and Fiorenzuola, respectively, using Eigenstrat and six and four, respectively, using kinship. Linkage disequilibrium analyses of the significant SNP associations showed they are genetically independent. In the kinship analysis, two of the significant SNP associations were tightly linked to the Fr-H2 and HvBmy loci on chromosomes 5H and 4HL, respectively. The other significant kinship associations were located in genomic regions that have not previously been associated with cold stress. Conclusions: Haplotype analysis revealed that most of the significant SNP loci are fixed in the winter or facultative types, while they are freely segregating within the un-adapted spring barley genepool.
Although there is a major interest in detecting new variation to improve frost tolerance of available winter and facultative types, from a GWAS perspective, working within the un-adapted spring germplasm pool is an attractive alternative strategy which would minimize statistical issues, simplify the interpretation of the data and identify phenology independent genetic determinants of frost tolerance. PMID:23802597
Refining Susceptibility Loci of Chronic Obstructive Pulmonary Disease with Lung eQTLs
Lamontagne, Maxime; Couture, Christian; Postma, Dirkje S.; Timens, Wim; Sin, Don D.; Paré, Peter D.; Hogg, James C.; Nickle, David; Laviolette, Michel; Bossé, Yohan
2013-01-01
Chronic obstructive pulmonary disease (COPD) is the fourth leading cause of mortality worldwide. Recent genome-wide association studies (GWAS) have identified robust susceptibility loci associated with COPD. However, the mechanisms mediating the risk conferred by these loci remain to be found. The goal of this study was to identify causal genes/variants within susceptibility loci associated with COPD. In the discovery cohort, genome-wide gene expression profiles of 500 non-tumor lung specimens were obtained from patients undergoing lung surgery. Blood DNA from the same patients was genotyped for 1.2 million SNPs. Following genotyping and gene expression quality control filters, 409 samples were analyzed. Lung expression quantitative trait loci (eQTLs) were identified and overlaid onto three COPD susceptibility loci derived from GWAS: 4q31 (HHIP), 4q22 (FAM13A), and 19q13 (RAB4B, EGLN2, MIA, CYP2A6). Significant eQTLs were replicated in two independent datasets (n = 363 and 339). SNPs previously associated with COPD and lung function on 4q31 (rs1828591, rs13118928) were associated with the mRNA expression of HHIP. An association between mRNA expression level of FAM13A and SNP rs2045517 was detected at 4q22, but did not reach statistical significance. At 19q13, significant eQTLs were detected with EGLN2. In summary, this study supports HHIP, FAM13A, and EGLN2 as the most likely causal COPD genes on 4q31, 4q22, and 19q13, respectively. Strong lung eQTL SNPs identified in this study will need to be tested for association with COPD in case-control studies. Further functional studies will also be needed to understand the role of genes regulated by disease-related variants in COPD. PMID:23936167
Veturi, Yogasudha; Ritchie, Marylyn D
2018-01-01
Transcriptome-wide association studies (TWAS) have recently been employed as an approach that can draw upon the advantages of genome-wide association studies (GWAS) and gene expression studies to identify genes associated with complex traits. Unlike standard GWAS, summary level data suffices for TWAS and offers improved statistical power. Two popular TWAS methods include either (a) imputing the cis genetic component of gene expression from smaller sized studies (using multi-SNP prediction or MP) into much larger effective sample sizes afforded by GWAS - TWAS-MP or (b) using summary-based Mendelian randomization - TWAS-SMR. Although these methods have been effective at detecting functional variants, it remains unclear how extensive variability in the genetic architecture of complex traits and diseases impacts TWAS results. Our goal was to investigate the different scenarios under which these methods yielded enough power to detect significant expression-trait associations. In this study, we conducted extensive simulations based on 6000 randomly chosen, unrelated Caucasian males from Geisinger's MyCode population to compare the power to detect cis expression-trait associations (within 500 kb of a gene) using the above-described approaches. To test TWAS across varying genetic backgrounds we simulated gene expression and phenotype using different quantitative trait loci per gene and cis-expression /trait heritability under genetic models that differentiate the effect of causality from that of pleiotropy. For each gene, on a training set ranging from 100 to 1000 individuals, we either (a) estimated regression coefficients with gene expression as the response using five different methods: LASSO, elastic net, Bayesian LASSO, Bayesian spike-slab, and Bayesian ridge regression or (b) performed eQTL analysis. 
We then sampled with replacement 50,000, 150,000, and 300,000 individuals respectively from the testing set of the remaining 5000 individuals and conducted GWAS on each set. Subsequently, we integrated the GWAS summary statistics derived from the testing set with the weights (or eQTLs) derived from the training set to identify expression-trait associations using (a) TWAS-MP (b) TWAS-SMR (c) eQTL-based GWAS, or (d) standalone GWAS. Finally, we examined the power to detect functionally relevant genes using the different approaches under the considered simulation scenarios. In general, we observed great similarities among TWAS-MP methods although the Bayesian methods resulted in improved power in comparison to LASSO and elastic net as the trait architecture grew more complex while training sample sizes and expression heritability remained small. Finally, we observed high power under causality but very low to moderate power under pleiotropy.
Guo, Michael; Liu, Zun; Willen, Jessie; Shaw, Cameron P; Richard, Daniel; Jagoda, Evelyn; Doxey, Andrew C; Hirschhorn, Joel; Capellini, Terence D
2017-12-05
GWAS have identified hundreds of height-associated loci. However, determining causal mechanisms is challenging, especially since height-relevant tissues (e.g. growth plates) are difficult to study. To uncover mechanisms by which height GWAS variants function, we performed epigenetic profiling of murine femoral growth plates. The profiled open chromatin regions recapitulate known chondrocyte and skeletal biology, are enriched at height GWAS loci, particularly near differentially expressed growth plate genes, and enriched for binding motifs of transcription factors with roles in chondrocyte biology. At specific loci, our analyses identified compelling mechanisms for GWAS variants. For example, at CHSY1, we identified a candidate causal variant (rs9920291) overlapping an open chromatin region. Reporter assays demonstrated that rs9920291 shows allelic regulatory activity, and CRISPR/Cas9 targeting of human chondrocytes demonstrates that the region regulates CHSY1 expression. Thus, integrating biologically relevant epigenetic information (here, from growth plates) with genetic association results can identify biological mechanisms important for human growth.
Scalable privacy-preserving data sharing methodology for genome-wide association studies.
Yu, Fei; Fienberg, Stephen E; Slavković, Aleksandra B; Uhler, Caroline
2014-08-01
The protection of privacy of individual-level information in genome-wide association study (GWAS) databases has been a major concern of researchers following the publication of "an attack" on GWAS data by Homer et al. (2008). Traditional statistical methods for confidentiality and privacy protection of statistical databases do not scale well to deal with GWAS data, especially in terms of guarantees regarding protection from linkage to external information. The more recent concept of differential privacy, introduced by the cryptographic community, is an approach that provides a rigorous definition of privacy with meaningful privacy guarantees in the presence of arbitrary external information, although the guarantees may come at a serious price in terms of data utility. Building on such notions, Uhler et al. (2013) proposed new methods to release aggregate GWAS data without compromising an individual's privacy. We extend the methods developed in Uhler et al. (2013) for releasing differentially-private χ²-statistics by allowing for an arbitrary number of cases and controls, and for releasing differentially-private allelic test statistics. We also provide a new interpretation by assuming the controls' data are known, which is a realistic assumption because some GWAS use publicly available data as controls. We assess the performance of the proposed methods through a risk-utility analysis on a real data set consisting of DNA samples collected by the Wellcome Trust Case Control Consortium and compare the methods with the differentially-private release mechanism proposed by Johnson and Shmatikov (2013). Copyright © 2014 Elsevier Inc. All rights reserved.
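The trade-off between privacy and utility mentioned in the abstract above can be made concrete with the generic Laplace mechanism, a standard way to release a statistic under ε-differential privacy. This is a sketch of the general technique only, not the specific release methods of the paper: the sensitivity of the χ² statistic is exactly what such papers derive, so here it is a caller-supplied parameter, and the numbers in the example are invented.

```python
import random

def laplace_release(value, sensitivity, epsilon, rng=None):
    """Release `value` with Laplace noise of scale sensitivity / epsilon.

    sensitivity: maximum change in the statistic when one individual's
                 record changes (assumed to be supplied by the caller)
    epsilon:     privacy budget; smaller values mean stronger privacy
                 and noisier output
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    # The difference of two independent exponential draws with mean
    # `scale` follows a Laplace distribution with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return value + noise

# Hypothetical statistic and sensitivity, for illustration only.
rng = random.Random(42)
print(laplace_release(25.0, sensitivity=4.0, epsilon=1.0, rng=rng))
```

Because the noise scale is sensitivity divided by ε, halving ε doubles the typical perturbation, which is the "serious price in terms of data utility" the abstract refers to. Releasing several statistics also divides the budget among them.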
Genome-Wide Association Study of Multiple Sclerosis Confirms a Novel Locus at 5p13.1
Sanna, Serena; Gayán, Javier; Urcelay, Elena; Zara, Ilenia; Pitzalis, Maristella; Cavanillas, María L.; Arroyo, Rafael; Zoledziewska, Magdalena; Marrosu, Marisa; Fernández, Oscar; Leyva, Laura; Alcina, Antonio; Fedetz, Maria; Moreno-Rey, Concha; Velasco, Juan; Real, Luis M.; Ruiz-Peña, Juan Luis; Cucca, Francesco
2012-01-01
Multiple Sclerosis (MS) is the most common progressive and disabling neurological condition affecting young adults in the world today. From a genetic point of view, MS is a complex disorder resulting from the combination of genetic and non-genetic factors. We aimed to identify previously unidentified loci by conducting a new GWAS of MS in a sample of 296 MS cases and 801 controls from the Spanish population. A meta-analysis of our data in combination with previous GWAS was performed. A total of 17 GWAS-significant SNPs, corresponding to three different loci, were identified: HLA, IL2RA, and 5p13.1. All three have been previously reported as GWAS-significant. We confirmed our observation in 5p13.1 for rs9292777 using two additional independent Spanish samples to make a total of 4912 MS cases and 7498 controls (ORpooled = 0.84; 95%CI: 0.80–0.89; p = 1.36 × 10(-9)). This SNP differs from the one reported within this locus in a recent GWAS. Although it is unclear whether both signals are tapping the same genetic association, it seems clear that this locus plays an important role in the pathogenesis of MS. PMID:22570697
Koller, Daniel L.; Ichikawa, Shoji; Lai, Dongbing; Padgett, Leah R.; Doheny, Kimberly F.; Pugh, Elizabeth; Paschall, Justin; Hui, Siu L.; Edenberg, Howard J.; Xuei, Xiaoling; Peacock, Munro; Econs, Michael J.; Foroud, Tatiana
2010-01-01
Context: Several genome-wide association studies (GWAS) have been performed to identify genes contributing to bone mineral density (BMD), typically in samples of elderly women and men. Objective: The objective of the study was to identify genes contributing to BMD in premenopausal women. Design: GWAS using the Illumina 610Quad array in premenopausal European-American (EA) women and replication of the top 50 single-nucleotide polymorphisms (SNPs) for two BMD measures in African-American (AA) women. Subjects: Subjects included 1524 premenopausal EA women aged 20–45 yr from 762 sibships and 669 AA premenopausal women aged 20–44 yr from 383 sibships. Interventions: There were no interventions. Main Outcome Measures: BMD was measured at the lumbar spine and femoral neck by dual-energy x-ray absorptiometry. Age- and weight-adjusted BMD values were tested for association with each SNP, with P values determined by permutation. Results: SNPs in CATSPERB on chromosome 14 provided evidence of association with femoral neck BMD (rs1298989, P = 2.7 × 10(-5); rs1285635, P = 3.0 × 10(-5)) in the EA women, and some supporting evidence was also observed with these SNPs in the AA women (rs1285635, P = 0.003). Genes identified in other BMD GWAS studies, including IBSP and ADAMTS18, were also among the most significant findings in our GWAS. Conclusions: Evidence of association to several novel loci was detected in a GWAS of premenopausal EA women, and SNPs in one of these loci also provided supporting evidence in a sample of AA women. PMID:20164292
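The phrase "P values determined by permutation" in the abstract above can be illustrated with a basic permutation test for one SNP. The sketch makes simplifying assumptions that the study itself did not: individuals are treated as unrelated (the real sample consists of sibships, so a valid permutation scheme would have to respect family structure), and the test statistic is the absolute covariance between genotype dosage and the adjusted phenotype. The data are invented.

```python
import random

def abs_covariance_sum(x, y):
    """Absolute sum of cross-products of deviations from the means."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return abs(sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)))

def permutation_p_value(genotypes, phenotypes, n_perm=1000, seed=0):
    """Empirical p-value for genotype-phenotype association.

    Phenotypes are shuffled relative to genotypes to build the null
    distribution; the +1 terms keep the estimate away from zero.
    """
    rng = random.Random(seed)
    observed = abs_covariance_sum(genotypes, phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs_covariance_sum(genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Invented data: dosage 0/1/2 and a phenotype that tracks it closely.
genotypes = [0] * 10 + [1] * 10 + [2] * 10
phenotypes = [g + 0.1 * ((i % 3) - 1) for i, g in enumerate(genotypes)]
print(permutation_p_value(genotypes, phenotypes))
```

The smallest p-value this procedure can report is 1 / (n_perm + 1), so resolving values near 10(-5), as in the abstract, would need on the order of 100,000 or more permutations per SNP.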
Cellular dissection of psoriasis for transcriptome analyses and the post-GWAS era
2014-01-01
Background: Genome-scale studies of psoriasis have been used to identify genes of potential relevance to disease mechanisms. For many identified genes, however, the cell type mediating disease activity is uncertain, which has limited our ability to design gene functional studies based on genomic findings. Methods: We identified differentially expressed genes (DEGs) with altered expression in psoriasis lesions (n = 216 patients), as well as candidate genes near susceptibility loci from psoriasis GWAS studies. These gene sets were characterized based upon their expression across 10 cell types present in psoriasis lesions. Susceptibility-associated variation at intergenic (non-coding) loci was evaluated to identify sites of allele-specific transcription factor binding. Results: Half of DEGs showed highest expression in skin cells, although the dominant cell type differed between psoriasis-increased DEGs (keratinocytes, 35%) and psoriasis-decreased DEGs (fibroblasts, 33%). In contrast, psoriasis GWAS candidates tended to have highest expression in immune cells (71%), with a significant fraction showing maximal expression in neutrophils (24%, P < 0.001). By identifying candidate cell types for genes near susceptibility loci, we could identify and prioritize SNPs at which susceptibility variants are predicted to influence transcription factor binding. This led to the identification of potentially causal (non-coding) SNPs for which susceptibility variants influence binding of AP-1, NF-κB, IRF1, STAT3 and STAT4. Conclusions: These findings underscore the role of innate immunity in psoriasis and highlight neutrophils as a cell type linked with pathogenetic mechanisms. Assignment of candidate cell types to genes emerging from GWAS studies provides a first step towards functional analysis, and we have proposed an approach for generating hypotheses to explain GWAS hits at intergenic loci. PMID:24885462
Lloyd-Jones, Luke R; Robinson, Matthew R; Yang, Jian; Visscher, Peter M
2018-04-01
Genome-wide association studies (GWAS) have identified thousands of loci that are robustly associated with complex diseases. The use of linear mixed model (LMM) methodology for GWAS is becoming more prevalent due to its ability to control for population structure and cryptic relatedness and to increase power. The odds ratio (OR) is a common measure of the association of a disease with an exposure (e.g., a genetic variant) and is readily available from logistic regression. However, when the LMM is applied to all-or-none traits it provides estimates of genetic effects on the observed 0-1 scale, a different scale to that in logistic regression. This limits the comparability of results across studies, for example in a meta-analysis, and makes the interpretation of the magnitude of an effect from an LMM GWAS difficult. In this study, we derived transformations from the genetic effects estimated under the LMM to the OR that only rely on summary statistics. To test the proposed transformations, we used real genotypes from two large, publicly available data sets to simulate all-or-none phenotypes for a set of scenarios that differ in underlying model, disease prevalence, and heritability. Furthermore, we applied these transformations to GWAS summary statistics for type 2 diabetes generated from 108,042 individuals in the UK Biobank. In both simulation and real-data application, we observed very high concordance between the transformed OR from the LMM and either the simulated truth or estimates from logistic regression. The transformations derived and validated in this study improve the comparability of results from prospective and already performed LMM GWAS on complex diseases by providing a reliable transformation to a common comparative scale for the genetic effects. Copyright © 2018 by the Genetics Society of America.
Genetics of common forms of heart failure: challenges and potential solutions.
Rau, Christoph D; Lusis, Aldons J; Wang, Yibin
2015-05-01
In contrast to many other human diseases, the use of genome-wide association studies (GWAS) to identify genes for heart failure (HF) has had limited success. We will discuss the underlying challenges as well as potential new approaches to understanding the genetics of common forms of HF. Recent research using intermediate phenotypes, more detailed and quantitative stratification of HF symptoms, founder populations and novel animal models has begun to allow researchers to make headway toward explaining the genetics underlying HF using GWAS techniques. By expanding analyses of HF to improved clinical traits, additional HF classifications and innovative model systems, the intractability of human HF GWAS should be ameliorated significantly.
Brookes, Keeley J; McConnell, George; Williams, Kirsty; Chaudhury, Sultan; Madhan, Gaganjit; Patel, Tulsi; Turley, Christopher; Guetta-Baranes, Tamar; Bras, Jose; Guerreiro, Rita; Hardy, John; Francis, Paul T; Morgan, Kevin
2018-06-08
The Brains for Dementia Research (BDR) project is a recently established longitudinal cohort which aims to provide brain tissue for research purposes from neuropathologically defined samples. Here we present the findings from our analysis of the 19 established GWAS index SNPs for Alzheimer's disease, in order to determine whether the BDR sample also displays association with these variants. A highly significant association of the APOEɛ4 allele was identified (p = 3.99 × 10^-12). Association tests for the 19 GWAS SNPs found that although no SNPs survive multiple testing, nominally significant findings were detected and concordance with the Lambert et al. GWAS meta-analysis was observed.
Zheng, Jie; Erzurumluoglu, A Mesut; Elsworth, Benjamin L; Kemp, John P; Howe, Laurence; Haycock, Philip C; Hemani, Gibran; Tansey, Katherine; Laurin, Charles; Pourcain, Beate St; Warrington, Nicole M; Finucane, Hilary K; Price, Alkes L; Bulik-Sullivan, Brendan K; Anttila, Verneri; Paternoster, Lavinia; Gaunt, Tom R; Evans, David M; Neale, Benjamin M
2017-01-15
LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the SNP heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available GWAS summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously. In this manuscript, we describe LD Hub - a centralized database of summary-level GWAS results for 173 diseases/traits from different publicly available resources/consortia and a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub; and estimated SNP heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis GWAS meta-analysis to examine the genetic correlation between the condition and other potentially related traits. In response to the growing availability of publicly accessible GWAS summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of GWAS results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies. The web interface and instructions for using LD Hub are available at http://ldsc.broadinstitute.org/. Contact: jie.zheng@bristol.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author 2016. Published by Oxford University Press.
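The core idea behind LD score regression can be sketched in a few lines: under the standard model, the expected association statistic of SNP j is roughly 1 + N·h2·l_j/M, so regressing chi-square statistics on LD scores gives a slope proportional to SNP heritability. The code below is a minimal sketch with simulated inputs. It uses plain unweighted least squares and omits the regression weights and block-jackknife standard errors of the real method; the function name and all numbers are hypothetical.

```python
import numpy as np


def ldsc_h2(chi2: np.ndarray, ld_scores: np.ndarray, n: int, m: int) -> tuple[float, float]:
    """Return (h2 estimate, intercept) from an unweighted regression of chi2 on LD score."""
    slope, intercept = np.polyfit(ld_scores, chi2, deg=1)
    # Under the model, slope = n * h2 / m, so rescale to recover h2.
    return float(slope * m / n), float(intercept)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n, true_h2 = 20_000, 50_000, 0.4             # hypothetical SNP count, sample size, h2
    ld = rng.uniform(1.0, 200.0, size=m)             # made-up LD scores
    expected = 1.0 + n * true_h2 * ld / m            # E[chi2_j] = 1 + N * h2 * l_j / M
    chi2 = expected * rng.chisquare(df=1, size=m)    # noisy statistics with that mean
    print(ldsc_h2(chi2, ld, n, m))
```

Because only per-SNP summary statistics and LD scores enter the regression, the cost does not depend on the number of genotyped individuals, which matches the tractability argument in the abstract.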
Abe, Makiko; Ito, Hidemi; Oze, Isao; Nomura, Masatoshi; Ogawa, Yoshihiro; Matsuo, Keitaro
2017-12-01
Little is known about differences in genetic predisposition to colorectal cancer (CRC) between ethnicities; however, many genetic traits common to CRC have been identified. This study investigated whether adding SNPs identified in GWAS of East Asian populations could improve risk prediction in Japanese individuals, and explored the possible application of genetic risk groups as an instrument for risk communication. A total of 558 patients with histologically verified colorectal cancer and 1116 first-visit outpatients were included in the derivation study, and 547 cases and 547 controls in the replication study. In each population, we evaluated prediction models for the risk of CRC that combined the genetic risk group based on SNPs from GWAS in European populations, and a similarly developed model adding SNPs from GWAS in East Asian populations. We examined whether adding East Asian-specific SNPs would improve discrimination. Six SNPs (rs6983267, rs4779584, rs4444235, rs9929218, rs10936599, rs16969681) of 23 SNPs from European-based GWAS and five SNPs (rs704017, rs11196172, rs10774214, rs647161, rs2423279) of ten SNPs from Asian-based GWAS were selected for the CRC risk prediction model. Compared with the 6-SNP model, the 11-SNP model including Asian GWAS SNPs showed improved discrimination capacity in receiver operating characteristic analysis. The 11-SNP model resulted in a statistically significant improvement in both the derivation (P = 0.0039) and replication studies (P = 0.0018) compared with the 6-SNP model. We estimated the cumulative risk of CRC using genetic risk groups based on 11 SNPs and found that the cumulative risk at age 80 is approximately 13% in the high-risk group versus 6% in the low-risk group. We constructed a more efficient CRC risk prediction model with 11 SNPs, including newly identified East Asian-based GWAS SNPs (rs704017, rs11196172, rs10774214, rs647161, rs2423279). Risk grouping based on 11 SNPs depicted a lifetime difference in CRC risk. This might be useful for effective individualized prevention in East Asian populations.
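The comparison of SNP panels by discrimination can be pictured with a small sketch: score each person by the number of risk alleles carried, then compute the area under the ROC curve as the probability that a randomly chosen case outscores a randomly chosen control. The unweighted allele count, the function names, and the genotypes below are all made up for illustration; the study's actual model (risk groups, covariates, weighting) is not specified in this abstract.

```python
from typing import Sequence


def risk_allele_count(genotypes: Sequence[int]) -> int:
    """Unweighted genetic risk score: total risk alleles (0, 1 or 2 per SNP)."""
    return sum(genotypes)


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Probability that a random case outscores a random control (ties count 1/2)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


if __name__ == "__main__":
    # Made-up genotypes for three cases and three controls at four SNPs.
    cases = [[2, 1, 1, 2], [1, 1, 2, 1], [0, 1, 1, 1]]
    controls = [[1, 0, 1, 1], [0, 1, 0, 1], [1, 1, 1, 2]]
    print(auc([risk_allele_count(g) for g in cases],
              [risk_allele_count(g) for g in controls]))
```

Running the same calculation with a 6-SNP and an 11-SNP genotype panel on the same individuals would give two AUC values whose difference is the kind of discrimination gain the abstract reports.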
USDA-ARS's Scientific Manuscript database
Copy number variation (CNV) is an important type of genetic variation contributing to phenotypic differences among mammals and may serve as an alternative molecular marker to single nucleotide polymorphism (SNP) for genome-wide association study (GWAS). Recently, GWAS analysis using CNV has been app...
Genome-wide association studies in Alzheimer's disease.
Bertram, Lars; Tanzi, Rudolph E
2009-10-15
Genome-wide association studies (GWAS) have gained considerable momentum over the last couple of years for the identification of novel complex disease genes. In the field of Alzheimer's disease (AD), there are currently eight published and two provisionally reported GWAS, highlighting over two dozen novel potential susceptibility loci beyond the well-established APOE association. On the basis of the data available at the time of this writing, the most compelling novel GWAS signal has been observed in GAB2 (GRB2-associated binding protein 2), followed by less consistently replicated signals in galanin-like peptide (GALP), piggyBac transposable element derived 1 (PGBD1), tyrosine kinase, non-receptor 1 (TNK1). Furthermore, consistent replication has been recently announced for CLU (clusterin, also known as apolipoprotein J). Finally, there are at least three replicated loci in hitherto uncharacterized genomic intervals on chromosomes 14q32.13, 14q31.2 and 6q24.1 likely implicating the existence of novel AD genes in these regions. In this review, we will discuss the characteristics and potential relevance to pathogenesis of the outcomes of all currently available GWAS in AD. A particular emphasis will be laid on findings with independent data in favor of the original association.
Unsupervised text mining for assessing and augmenting GWAS results.
Ailem, Melissa; Role, François; Nadif, Mohamed; Demenais, Florence
2016-04-01
Text mining can assist in the analysis and interpretation of large-scale biomedical data, helping biologists to quickly and cheaply gain confirmation of hypothesized relationships between biological entities. We set this question in the context of genome-wide association studies (GWAS), an actively emerging field that has helped identify many genes associated with multifactorial diseases. These studies make it possible to identify groups of genes associated with the same phenotype, but provide no information about the relationships between these genes. Therefore, our objective is to leverage unsupervised text mining techniques, using text-based cosine similarity comparisons and clustering applied to candidate and random gene vectors, in order to augment the GWAS results. We propose a generic framework, which we used to characterize the relationships between 10 genes reported to be associated with asthma by a previous GWAS. The results of this experiment showed that the similarities between these 10 genes were significantly stronger than would be expected by chance (one-sided p-value<0.01). The clustering of observed and randomly selected genes also made it possible to generate hypotheses about potential functional relationships between these genes and thus contributed to the discovery of new candidate genes for asthma. Copyright © 2016 Elsevier Inc. All rights reserved.
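The comparison of candidate genes against random gene sets can be sketched as follows: represent each gene by a text-derived vector, compute the mean pairwise cosine similarity within the candidate set, and count how often equally sized random sets from a background pool are at least as similar. This is a minimal sketch of that general idea; the function names, the sampling scheme, and the assumption that every vector is non-zero are illustrative and not taken from the paper.

```python
import numpy as np


def mean_pairwise_cosine(vectors: np.ndarray) -> float:
    """Mean cosine similarity over all distinct pairs of (non-zero) row vectors."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ unit.T
    upper = np.triu_indices(len(vectors), k=1)   # each pair once, no diagonal
    return float(sims[upper].mean())


def empirical_p_value(candidates: np.ndarray, background: np.ndarray,
                      n_draws: int = 1000, seed: int = 0) -> float:
    """One-sided p-value: share of random gene sets at least as similar as the candidates."""
    rng = np.random.default_rng(seed)
    observed = mean_pairwise_cosine(candidates)
    hits = 0
    for _ in range(n_draws):
        idx = rng.choice(len(background), size=len(candidates), replace=False)
        if mean_pairwise_cosine(background[idx]) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_draws + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    background = rng.random((200, 50))                           # made-up term vectors
    candidates = background[:10] + 3.0 * rng.random(50)          # shared component added
    print(empirical_p_value(candidates, background))
```

The add-one correction keeps the empirical p-value bounded below by 1/(n_draws + 1), so the number of random draws sets the smallest p-value that can be reported.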
GWAS meta-analysis and replication identifies three new susceptibility loci for ovarian cancer
Pharoah, Paul D. P.; Tsai, Ya-Yu; Ramus, Susan J.; Phelan, Catherine M.; Goode, Ellen L.; Lawrenson, Kate; Price, Melissa; Fridley, Brooke L.; Tyrer, Jonathan P.; Shen, Howard; Weber, Rachel; Karevan, Rod; Larson, Melissa C.; Song, Honglin; Tessier, Daniel C.; Bacot, François; Vincent, Daniel; Cunningham, Julie M.; Dennis, Joe; Dicks, Ed; Aben, Katja K.; Anton-Culver, Hoda; Antonenkova, Natalia; Armasu, Sebastian M.; Baglietto, Laura; Bandera, Elisa V.; Beckmann, Matthias W.; Birrer, Michael J.; Bloom, Greg; Bogdanova, Natalia; Brenton, James D.; Brinton, Louise A.; Brooks-Wilson, Angela; Brown, Robert; Butzow, Ralf; Campbell, Ian; Carney, Michael E; Carvalho, Renato S.; Chang-Claude, Jenny; Chen, Y. Anne; Chen, Zhihua; Chow, Wong-Ho; Cicek, Mine S.; Coetzee, Gerhard; Cook, Linda S.; Cramer, Daniel W.; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Despierre, Evelyn; Doherty, Jennifer A; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Edwards, Robert; Ekici, Arif B.; Fasching, Peter A.; Fenstermacher, David; Flanagan, James; Gao, Yu-Tang; Garcia-Closas, Montserrat; Gentry-Maharaj, Aleksandra; Giles, Graham; Gjyshi, Anxhela; Gore, Martin; Gronwald, Jacek; Guo, Qi; Halle, Mari K; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hillemanns, Peter; Hoatlin, Maureen; Høgdall, Estrid; Høgdall, Claus K.; Hosono, Satoyo; Jakubowska, Anna; Jensen, Allan; Kalli, Kimberly R.; Karlan, Beth Y.; Kelemen, Linda E.; Kiemeney, Lambertus A.; Kjaer, Susanne Krüger; Konecny, Gottfried E.; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D.; Lee, Nathan; Lee, Janet; Leminen, Arto; Lim, Boon Kiong; Lissowska, Jolanta; Lubiński, Jan; Lundvall, Lene; Lurie, Galina; Massuger, Leon F.A.G.; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B.; Nakanishi, Toru; Narod, Steven A.; Ness, Roberta B.; Nevanlinna, Heli; Nickels, Stefan; Noushmehr, Houtan; Odunsi, Kunle; Olson, 
Sara; Orlow, Irene; Paul, James; Pejovic, Tanja; Pelttari, Liisa M; Permuth-Wey, Jenny; Pike, Malcolm C; Poole, Elizabeth M; Qu, Xiaotao; Risch, Harvey A.; Rodriguez-Rodriguez, Lorna; Rossing, Mary Anne; Rudolph, Anja; Runnebaum, Ingo; Rzepecka, Iwona K; Salvesen, Helga B.; Schwaab, Ira; Severi, Gianluca; Shen, Hui; Shridhar, Vijayalakshmi; Shu, Xiao-Ou; Sieh, Weiva; Southey, Melissa C.; Spellman, Paul; Tajima, Kazuo; Teo, Soo-Hwang; Terry, Kathryn L.; Thompson, Pamela J; Timorek, Agnieszka; Tworoger, Shelley S.; van Altena, Anne M.; Berg, David Van Den; Vergote, Ignace; Vierkant, Robert A.; Vitonis, Allison F.; Wang-Gohrke, Shan; Wentzensen, Nicolas; Whittemore, Alice S.; Wik, Elisabeth; Winterhoff, Boris; Woo, Yin Ling; Wu, Anna H; Yang, Hannah P.; Zheng, Wei; Ziogas, Argyrios; Zulkifli, Famida; Goodman, Marc T.; Hall, Per; Easton, Douglas F; Pearce, Celeste L; Berchuck, Andrew; Chenevix-Trench, Georgia; Iversen, Edwin; Monteiro, Alvaro N.A.; Gayther, Simon A.; Schildkraut, Joellen M.; Sellers, Thomas A.
2013-01-01
Genome-wide association studies (GWAS) have identified four susceptibility loci for epithelial ovarian cancer (EOC) with another two loci being close to genome-wide significance. We pooled data from a GWAS conducted in North America with another GWAS from the United Kingdom. We selected the top 24,551 SNPs for inclusion on the iCOGS custom genotyping array. Follow-up genotyping was carried out in 18,174 cases and 26,134 controls from 43 studies from the Ovarian Cancer Association Consortium. We validated the two loci at 3q25 and 17q21 previously near genome-wide significance and identified three novel loci associated with risk; two loci associated with all EOC subtypes, at 8q21 (rs11782652, P = 5.5 × 10^-9) and 10p12 (rs1243180; P = 1.8 × 10^-8), and another locus specific to the serous subtype at 17q12 (rs757210; P = 8.1 × 10^-10). An integrated molecular analysis of genes and regulatory regions at these loci provided evidence for functional mechanisms underlying susceptibility that implicates CHMP4C in the pathogenesis of ovarian cancer. PMID:23535730
GWAS meta-analysis and replication identifies three new susceptibility loci for ovarian cancer.
Pharoah, Paul D P; Tsai, Ya-Yu; Ramus, Susan J; Phelan, Catherine M; Goode, Ellen L; Lawrenson, Kate; Buckley, Melissa; Fridley, Brooke L; Tyrer, Jonathan P; Shen, Howard; Weber, Rachel; Karevan, Rod; Larson, Melissa C; Song, Honglin; Tessier, Daniel C; Bacot, François; Vincent, Daniel; Cunningham, Julie M; Dennis, Joe; Dicks, Ed; Aben, Katja K; Anton-Culver, Hoda; Antonenkova, Natalia; Armasu, Sebastian M; Baglietto, Laura; Bandera, Elisa V; Beckmann, Matthias W; Birrer, Michael J; Bloom, Greg; Bogdanova, Natalia; Brenton, James D; Brinton, Louise A; Brooks-Wilson, Angela; Brown, Robert; Butzow, Ralf; Campbell, Ian; Carney, Michael E; Carvalho, Renato S; Chang-Claude, Jenny; Chen, Y Anne; Chen, Zhihua; Chow, Wong-Ho; Cicek, Mine S; Coetzee, Gerhard; Cook, Linda S; Cramer, Daniel W; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Despierre, Evelyn; Doherty, Jennifer A; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Edwards, Robert; Ekici, Arif B; Fasching, Peter A; Fenstermacher, David; Flanagan, James; Gao, Yu-Tang; Garcia-Closas, Montserrat; Gentry-Maharaj, Aleksandra; Giles, Graham; Gjyshi, Anxhela; Gore, Martin; Gronwald, Jacek; Guo, Qi; Halle, Mari K; Harter, Philipp; Hein, Alexander; Heitz, Florian; Hillemanns, Peter; Hoatlin, Maureen; Høgdall, Estrid; Høgdall, Claus K; Hosono, Satoyo; Jakubowska, Anna; Jensen, Allan; Kalli, Kimberly R; Karlan, Beth Y; Kelemen, Linda E; Kiemeney, Lambertus A; Kjaer, Susanne Krüger; Konecny, Gottfried E; Krakstad, Camilla; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D; Lee, Nathan; Lee, Janet; Leminen, Arto; Lim, Boon Kiong; Lissowska, Jolanta; Lubiński, Jan; Lundvall, Lene; Lurie, Galina; Massuger, Leon F A G; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R; Menon, Usha; Modugno, Francesmary; Moysich, Kirsten B; Nakanishi, Toru; Narod, Steven A; Ness, Roberta B; Nevanlinna, Heli; Nickels, Stefan; Noushmehr, Houtan; Odunsi, Kunle; Olson, Sara; Orlow, Irene; Paul, James; 
Pejovic, Tanja; Pelttari, Liisa M; Permuth-Wey, Jenny; Pike, Malcolm C; Poole, Elizabeth M; Qu, Xiaotao; Risch, Harvey A; Rodriguez-Rodriguez, Lorna; Rossing, Mary Anne; Rudolph, Anja; Runnebaum, Ingo; Rzepecka, Iwona K; Salvesen, Helga B; Schwaab, Ira; Severi, Gianluca; Shen, Hui; Shridhar, Vijayalakshmi; Shu, Xiao-Ou; Sieh, Weiva; Southey, Melissa C; Spellman, Paul; Tajima, Kazuo; Teo, Soo-Hwang; Terry, Kathryn L; Thompson, Pamela J; Timorek, Agnieszka; Tworoger, Shelley S; van Altena, Anne M; van den Berg, David; Vergote, Ignace; Vierkant, Robert A; Vitonis, Allison F; Wang-Gohrke, Shan; Wentzensen, Nicolas; Whittemore, Alice S; Wik, Elisabeth; Winterhoff, Boris; Woo, Yin Ling; Wu, Anna H; Yang, Hannah P; Zheng, Wei; Ziogas, Argyrios; Zulkifli, Famida; Goodman, Marc T; Hall, Per; Easton, Douglas F; Pearce, Celeste L; Berchuck, Andrew; Chenevix-Trench, Georgia; Iversen, Edwin; Monteiro, Alvaro N A; Gayther, Simon A; Schildkraut, Joellen M; Sellers, Thomas A
2013-04-01
Genome-wide association studies (GWAS) have identified four susceptibility loci for epithelial ovarian cancer (EOC), with another two suggestive loci reaching near genome-wide significance. We pooled data from a GWAS conducted in North America with another GWAS from the UK. We selected the top 24,551 SNPs for inclusion on the iCOGS custom genotyping array. We performed follow-up genotyping in 18,174 individuals with EOC (cases) and 26,134 controls from 43 studies from the Ovarian Cancer Association Consortium. We validated the two loci at 3q25 and 17q21 that were previously found to have associations close to genome-wide significance and identified three loci newly associated with risk: two loci associated with all EOC subtypes at 8q21 (rs11782652, P = 5.5 × 10^-9) and 10p12 (rs1243180, P = 1.8 × 10^-8) and another locus specific to the serous subtype at 17q12 (rs757210, P = 8.1 × 10^-10). An integrated molecular analysis of genes and regulatory regions at these loci provided evidence for functional mechanisms underlying susceptibility and implicated CHMP4C in the pathogenesis of ovarian cancer.
Peprah, Emmanuel; Xu, Huichun; Tekola-Ayele, Fasil; Royal, Charmaine D.
2014-01-01
Genomic research is one of the tools for elucidating the pathogenesis of diseases of global health relevance, and paving the research dimension to clinical and public health translation. Recent advances in genomic research and technologies have increased our understanding of human diseases, genes associated with these disorders, and the relevant mechanisms. Genome-wide association studies (GWAS) have proliferated since the first studies were published several years ago, and have become an important tool in helping researchers comprehend human variation and the role genetic variants play in disease. However, the need to expand the diversity of populations in GWAS has become increasingly apparent as new knowledge is gained about genetic variation. Inclusion of diverse populations in genomic studies is critical to a more complete understanding of human variation and elucidation of the underpinnings of complex diseases. In this review, we summarize the available data on GWAS in recent-African ancestry populations within the western hemisphere (i.e. African Americans and peoples of the Caribbean) and continental African populations. Furthermore, we highlight ways in which genomic studies in populations of recent African ancestry have led to advances in the areas of malaria, HIV, prostate cancer, and other diseases. Finally, we discuss the advantages of conducting GWAS in recent African ancestry populations in the context of addressing existing and emerging global health conditions. PMID:25427668
Benner, Christian; Havulinna, Aki S; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ripatti, Samuli; Pirinen, Matti
2017-10-05
During the past few years, various novel statistical methods have been developed for fine-mapping with the use of summary statistics from genome-wide association studies (GWASs). Although these approaches require information about the linkage disequilibrium (LD) between variants, there has not been a comprehensive evaluation of how estimation of the LD structure from reference genotype panels performs in comparison with that from the original individual-level GWAS data. Using population genotype data from Finland and the UK Biobank, we show here that a reference panel of 1,000 individuals from the target population is adequate for a GWAS cohort of up to 10,000 individuals, whereas smaller panels, such as those from the 1000 Genomes Project, should be avoided. We also show, both theoretically and empirically, that the size of the reference panel needs to scale with the GWAS sample size; this has important consequences for the application of these methods in ongoing GWAS meta-analyses and large biobank studies. We conclude by providing software tools and by recommending practices for sharing LD information to more efficiently exploit summary statistics in genetics research. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
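The role of reference-panel size can be pictured with a short sketch that estimates LD as the pairwise correlation between variants in a 0/1/2 genotype matrix and repeats the estimate for panels of different sizes. The simulation, function names, and parameter values are assumptions made for illustration; they are not the software released with the study, and monomorphic variants (which would produce undefined correlations) are not handled.

```python
import numpy as np


def ld_matrix(genotypes: np.ndarray) -> np.ndarray:
    """Pairwise Pearson correlation between variants (columns) of a 0/1/2 genotype matrix."""
    return np.corrcoef(genotypes, rowvar=False)


def simulate_genotypes(n: int, rng: np.random.Generator) -> np.ndarray:
    """Two correlated variants: the second copies the first for about 70% of individuals."""
    first = rng.binomial(2, 0.3, size=n)
    second = np.where(rng.random(n) < 0.7, first, rng.binomial(2, 0.3, size=n))
    return np.column_stack([first, second])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for panel_size in (100, 1_000, 10_000):
        panel = simulate_genotypes(panel_size, rng)
        # Smaller panels give noisier estimates of the same underlying correlation.
        print(panel_size, round(float(ld_matrix(panel)[0, 1]), 3))
```

Sampling error in each estimated correlation shrinks roughly with the square root of the panel size, which gives some intuition for why a fixed-size panel becomes the limiting factor as the GWAS itself grows.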
Iwata, Hiroyoshi; Hayashi, Takeshi; Terakami, Shingo; Takada, Norio; Sawamura, Yutaka; Yamamoto, Toshiya
2013-01-01
Although the potential of marker-assisted selection (MAS) in fruit tree breeding has been reported, bi-parental QTL mapping before MAS has hindered the introduction of MAS to fruit tree breeding programs. Genome-wide association studies (GWAS) are an alternative to bi-parental QTL mapping in long-lived perennials. Selection based on genomic predictions of breeding values (genomic selection: GS) is another alternative for MAS. This study examined the potential of GWAS and GS in pear breeding with 76 Japanese pear cultivars to detect significant associations of 162 markers with nine agronomic traits. We applied multilocus Bayesian models accounting for ordinal categorical phenotypes for GWAS and GS model training. Significant associations were detected at harvest time, black spot resistance and the number of spurs and two of the associations were closely linked to known loci. Genome-wide predictions for GS were accurate at the highest level (0.75) in harvest time, at medium levels (0.38–0.61) in resistance to black spot, firmness of flesh, fruit shape in longitudinal section, fruit size, acid content and number of spurs and at low levels (<0.2) in all soluble solid content and vigor of tree. Results suggest the potential of GWAS and GS for use in future breeding programs in Japanese pear. PMID:23641189
Sardos, Julie; Rouard, Mathieu; Hueber, Yann; Cenci, Alberto; Hyma, Katie E.; van den Houwe, Ines; Hribova, Eva; Courtois, Brigitte; Roux, Nicolas
2016-01-01
Banana (Musa sp.) is a vegetatively propagated, low fertility, potentially hybrid and polyploid crop. These qualities make the breeding and targeted genetic improvement of this crop a difficult and long process. The Genome-Wide Association Study (GWAS) approach is becoming widely used in crop plants and has proven efficient at detecting candidate genes for traits of interest, especially in cereals. GWAS has not yet been applied to a vegetatively propagated crop. However, successful GWAS in banana would considerably help unravel the genomic basis of traits of interest and therefore speed up improvement of this crop. We present here a dedicated panel of 105 accessions of banana, freely available upon request, and their corresponding GBS data. A set of 5,544 highly reliable markers revealed high levels of admixture in most accessions, except for a subset of 33 individuals from Papua. A GWAS on the seedless phenotype was then successfully applied to the panel. By applying the Mixed Linear Model corrected for both kinship and structure as implemented in TASSEL, we detected 13 candidate genomic regions in which we found a number of genes potentially linked with the seedless phenotype (i.e. parthenocarpy combined with female sterility). An additional GWAS performed on the unstructured Papuan subset composed of 33 accessions confirmed six of these regions as candidates. Out of both sets of analyses, one strong candidate gene for female sterility, a putative orthologous gene to Histidine Kinase CKI1, was identified. The results presented here confirmed the feasibility and potential of GWAS when applied to small sets of banana accessions, at least for traits underpinned by a few loci. As phenotyping in banana is extremely space- and time-consuming, this latest finding is of particular importance in the context of banana improvement. PMID:27144345
Gong, Xian; Zhang, Chao; Yiliyasi·Aisa, Yiliyasi·Aisa; Shi, Ying; Yang, Xue-wei; NuersimanguliAosiman, NuersimanguliAosiman; Guan, Ya-qun; Xu, Shu-hua
2016-06-20
Over the last decade, a large number of candidate susceptibility genes for type 2 diabetes mellitus (T2DM) have been reported by numerous genome-wide association studies (GWAS). Understanding the genetic diversity of these candidate genes among worldwide populations not only helps elucidate the genetic mechanism of T2DM, but also provides guidance for further studies of the pathogenesis of T2DM in any given population. In this study, we identified 170 genes or genomic regions associated with T2DM by searching the GWAS databases and related literature. We next analyzed the genetic diversity of these genes (or genomic regions) among present-day human populations by curating the 1000 Genomes Project phase 1 dataset covering 14 worldwide populations. We further compared the characteristics of T2DM genes in different populations. No significant differences in genetic diversity were observed among the 14 worldwide populations between the T2DM candidate genes and the non-T2DM genes in terms of overall pattern. However, we observed that some genes, such as IL20RA, RNMTL1-NXN, NOTCH2, ADRA2A-BTBD7P2, TBC1D4, RBM38-HMGB1P1, UBE2E2, and PPARD, show considerable differentiation between populations. In particular, IL20RA (FST=0.1521) displays the greatest population difference, which is mainly attributable to the difference between Africans and non-Africans. Moreover, we revealed genetic differences between East Asians and Europeans in some candidate genes, such as DGKB-AGMO (FST=0.173) and JAZF1 (FST=0.182). Our results indicate that some T2DM candidate susceptibility genes harbor highly differentiated variants between populations. These analyses, although preliminary, should advance our understanding of population differences in susceptibility to T2DM and provide a useful reference that future studies can rely on.
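For readers unfamiliar with the FST values quoted above, the sketch below computes a simple textbook form of the statistic for one biallelic site in two equally weighted populations, as the proportion of total heterozygosity that lies between populations. The abstract does not say which FST estimator was used, so this is an illustration of the quantity rather than a reproduction of the analysis; the function name and input frequencies are made up.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Wright's F_ST for one biallelic site from allele frequencies in two
    equally weighted populations: (H_T - H_S) / H_T."""
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)                                 # pooled heterozygosity
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0     # mean within-population
    if h_total == 0.0:
        return 0.0  # site is fixed for the same allele in both populations
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical allele frequencies in two populations.
    print(fst_two_populations(0.2, 0.8))
```

Values near 0 mean the two populations have similar allele frequencies at the site, while values near 1 mean they are close to fixed for different alleles.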
2010-01-01
Background Schizophrenia is the collective term for an exclusively clinically diagnosed, heterogeneous group of mental disorders with still obscure biological roots. Based on the assumption that valuable information about relevant genetic and environmental disease mechanisms can be obtained by association studies on patient cohorts of ≥ 1000 patients, if performed on detailed clinical datasets and quantifiable biological readouts, we generated a new schizophrenia data base, the GRAS (Göttingen Research Association for Schizophrenia) data collection. GRAS is the necessary ground to study genetic causes of the schizophrenic phenotype in a 'phenotype-based genetic association study' (PGAS). This approach is different from and complementary to the genome-wide association studies (GWAS) on schizophrenia. Methods For this purpose, 1085 patients were recruited between 2005 and 2010 by an invariable team of traveling investigators in a cross-sectional field study that comprised 23 German psychiatric hospitals. Additionally, chart records and discharge letters of all patients were collected. Results The corresponding dataset extracted and presented in form of an overview here, comprises biographic information, disease history, medication including side effects, and results of comprehensive cross-sectional psychopathological, neuropsychological, and neurological examinations. With >3000 data points per schizophrenic subject, this data base of living patients, who are also accessible for follow-up studies, provides a wide-ranging and standardized phenotype characterization of as yet unprecedented detail. Conclusions The GRAS data base will serve as prerequisite for PGAS, a novel approach to better understanding 'the schizophrenias' through exploring the contribution of genetic variation to the schizophrenic phenotypes. PMID:21067598
Piette, Elizabeth R; Moore, Jason H
2018-01-01
Machine learning methods and conventions are increasingly employed for the analysis of large, complex biomedical data sets, including genome-wide association studies (GWAS). Reproducibility of machine learning analyses of GWAS can be hampered by biological and statistical factors, particularly so for the investigation of non-additive genetic interactions. Application of traditional cross validation to a GWAS data set may result in poor consistency between the training and testing data set splits due to an imbalance of the interaction genotypes relative to the data as a whole. We propose a new cross validation method, proportional instance cross validation (PICV), that preserves the original distribution of an independent variable when splitting the data set into training and testing partitions. We apply PICV to simulated GWAS data with epistatic interactions of varying minor allele frequencies and prevalences and compare performance to that of a traditional cross validation procedure in which individuals are randomly allocated to training and testing partitions. Sensitivity and positive predictive value are significantly improved across all tested scenarios for PICV compared to traditional cross validation. We also apply PICV to GWAS data from a study of primary open-angle glaucoma to investigate a previously-reported interaction, which fails to significantly replicate; PICV however improves the consistency of testing and training results. Application of traditional machine learning procedures to biomedical data may require modifications to better suit intrinsic characteristics of the data, such as the potential for highly imbalanced genotype distributions in the case of epistasis detection. The reproducibility of genetic interaction findings can be improved by considering this variable imbalance in cross validation implementation, such as with PICV. This approach may be extended to problems in other domains in which imbalanced variable distributions are a concern.
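The idea of keeping the distribution of an independent variable constant across partitions can be sketched as a stratified fold assignment: group samples by the value of that variable (for example, a combined two-locus genotype), shuffle within each group, and deal the members out to the folds in turn. This is a generic approximation of the principle described above; the published PICV procedure may differ in detail, and the function name and example labels are hypothetical.

```python
import random
from collections import defaultdict
from typing import Hashable, Sequence


def proportional_folds(values: Sequence[Hashable], k: int, seed: int = 0) -> list[list[int]]:
    """Assign sample indices to k folds so each fold mirrors the overall
    distribution of `values` (e.g. a combined interaction genotype)."""
    rng = random.Random(seed)
    by_value: dict[Hashable, list[int]] = defaultdict(list)
    for index, value in enumerate(values):
        by_value[value].append(index)

    folds: list[list[int]] = [[] for _ in range(k)]
    offset = 0
    for indices in by_value.values():
        rng.shuffle(indices)
        # Deal each stratum out round-robin, continuing where the previous
        # stratum stopped so that overall fold sizes stay balanced.
        for position, index in enumerate(indices):
            folds[(offset + position) % k].append(index)
        offset += len(indices)
    return folds


if __name__ == "__main__":
    genotypes = ["AA"] * 10 + ["Aa"] * 5 + ["aa"] * 5    # made-up genotype labels
    for fold in proportional_folds(genotypes, k=5):
        print(sorted(genotypes[i] for i in fold))
```

Each fold serves once as the testing partition with the remaining folds as training data, so a rare genotype class is represented on both sides of every split whenever it has at least as many members as there are folds.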
Evaluation of different sources of DNA for use in genome wide studies and forensic application.
Al Safar, Habiba S; Abidi, Fatima H; Khazanehdari, Kamal A; Dadour, Ian R; Tay, Guan K
2011-02-01
In the field of epidemiology, Genome-Wide Association Studies (GWAS) are commonly used to identify genetic predispositions of many human diseases. Large repositories housing biological specimens for clinical and genetic investigations have been established to store material and data for these studies. The logistics of specimen collection and sample storage can be onerous, and new strategies have to be explored. This study examines three different DNA sources (namely, degraded genomic DNA, amplified degraded genomic DNA and amplified extracted DNA from FTA card) for GWAS using the Illumina platform. No significant difference in call rate was detected between amplified degraded genomic DNA extracted from whole blood and amplified DNA retrieved from FTA™ cards. However, using unamplified-degraded genomic DNA reduced the call rate to a mean of 42.6% compared to amplified DNA extracted from FTA card (mean of 96.6%). This study establishes the utility of FTA™ cards as a viable storage matrix for cells from which DNA can be extracted to perform GWAS analysis.
A Genome-Wide Association Study of the Human Metabolome in a Community-Based Cohort
Rhee, Eugene P.; Ho, Jennifer E.; Chen, Ming-Huei; Shen, Dongxiao; Cheng, Susan; Larson, Martin G.; Ghorbani, Anahita; Shi, Xu; Helenius, Iiro T.; O’Donnell, Christopher J.; Souza, Amanda L.; Deik, Amy; Pierce, Kerry A.; Bullock, Kevin; Walford, Geoffrey A.; Vasan, Ramachandran S.; Florez, Jose C.; Clish, Clary; Yeh, J.-R. Joanna; Wang, Thomas J.; Gerszten, Robert E.
2014-01-01
SUMMARY Because metabolites are hypothesized to play key roles as markers and effectors of cardio-metabolic diseases, recent studies have sought to annotate the genetic determinants of circulating metabolite levels. We report a genome-wide association study (GWAS) of 217 plasma metabolites, including >100 not measured in prior GWAS, in 2,076 participants of the Framingham Heart Study. For the majority of analytes, we find that estimated heritability explains >20% of inter-individual variation, and that variation attributable to heritable factors is greater than that attributable to clinical factors. Further, we identify 31 genetic loci associated with plasma metabolites, including 23 that have not previously been reported. Importantly, we include GWAS results for all surveyed metabolites, and demonstrate how this information highlights a role for AGXT2 in cholesterol ester and triacylglycerol metabolism. Thus, our study outlines the relative contributions of inherited and clinical factors on the plasma metabolome and provides a resource for metabolism research. PMID:23823483
Multi-trait analysis of genome-wide association summary statistics using MTAG.
Turley, Patrick; Walters, Raymond K; Maghzian, Omeed; Okbay, Aysu; Lee, James J; Fontana, Mark Alan; Nguyen-Viet, Tuan Anh; Wedow, Robbee; Zacher, Meghan; Furlotte, Nicholas A; Magnusson, Patrik; Oskarsson, Sven; Johannesson, Magnus; Visscher, Peter M; Laibson, David; Cesarini, David; Neale, Benjamin M; Benjamin, Daniel J
2018-02-01
We introduce multi-trait analysis of GWAS (MTAG), a method for joint analysis of summary statistics from genome-wide association studies (GWAS) of different traits, possibly from overlapping samples. We apply MTAG to summary statistics for depressive symptoms (N eff = 354,862), neuroticism (N = 168,105), and subjective well-being (N = 388,538). As compared to the 32, 9, and 13 genome-wide significant loci identified in the single-trait GWAS (most of which are themselves novel), MTAG increases the number of associated loci to 64, 37, and 49, respectively. Moreover, association statistics from MTAG yield more informative bioinformatics analyses and increase the variance explained by polygenic scores by approximately 25%, matching theoretical expectations.
Smith, Andrew J P; Deloukas, Panos; Munroe, Patricia B
2018-04-13
Over the last decade, genome-wide association studies (GWAS) have propelled the discovery of thousands of loci associated with complex diseases. The focus is now turning towards the function of these association signals, determining the causal variant(s) amongst those in strong linkage disequilibrium, and identifying their underlying mechanisms, such as long-range gene regulation. Genome-editing techniques utilising zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and clustered regularly-interspaced short palindromic repeats with Cas9 nuclease (CRISPR-Cas9) are becoming the tools of choice to establish functionality for these variants, due to the ability to assess effects of single variants in vivo. This review will discuss examples of how these technologies have begun to aid functional analysis of GWAS loci for complex traits such as cardiovascular disease, type 2 diabetes, cancer, obesity and autoimmune disease. We focus on analysis of variants occurring within non-coding genomic regions, as these comprise the majority of GWAS variants and provide the greatest challenges to determining functionality, and we compare editing strategies that provide different levels of evidence for variant functionality. The review describes molecular insights into some of these potentially causal variants and how these may relate to the pathology of the trait, and looks towards future directions for these technologies in post-GWAS analysis, such as base-editing.
GWAS-based machine learning approach to predict duloxetine response in major depressive disorder.
Maciukiewicz, Malgorzata; Marshe, Victoria S; Hauschild, Anne-Christin; Foster, Jane A; Rotzinger, Susan; Kennedy, James L; Kennedy, Sidney H; Müller, Daniel J; Geraci, Joseph
2018-04-01
Major depressive disorder (MDD) is one of the most prevalent psychiatric disorders and is commonly treated with antidepressant drugs. However, large variability is observed in terms of response to antidepressants. Machine learning (ML) models may be useful to predict treatment outcomes. A sample of 186 MDD patients who received treatment with duloxetine for up to 8 weeks were categorized as "responders" based on a MADRS change >50% from baseline; or "remitters" based on a MADRS score ≤10 at end point. The initial dataset (N = 186) was randomly divided into training and test sets in a nested 5-fold cross-validation, where 80% was used as a training set and 20% made up five independent test sets. We performed genome-wide logistic regression to identify potentially significant variants related to duloxetine response/remission and extracted the most promising predictors using LASSO regression. Subsequently, classification-regression trees (CRT) and support vector machines (SVM) were applied to construct models, using ten-fold cross-validation. With regard to response, none of the pairs performed significantly better than chance (accuracy p > .1). For remission, SVM achieved moderate performance with an accuracy = 0.52, a sensitivity = 0.58, and a specificity = 0.46, whereas CRT yielded 0.51 for all three measures. The best performing SVM fold was characterized by an accuracy = 0.66 (p = .071), a sensitivity = 0.70 and a specificity = 0.61. In this study, the potential of using GWAS data to predict duloxetine outcomes was examined using ML models. The models were characterized by a promising sensitivity, but specificity remained moderate at best. The inclusion of additional non-genetic variables to create integrated models may improve prediction. Copyright © 2017. Published by Elsevier Ltd.
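For readers unfamiliar with the three performance measures reported above, the following sketch shows how accuracy, sensitivity and specificity are derived from binary predictions. The labels are made up for illustration and are not data from the study; the function name is hypothetical, and the code assumes both classes are present in `y_true`.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = remitter).
    Assumes y_true contains at least one positive and one negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # fraction of true remitters detected
        "specificity": tn / (tn + fp),  # fraction of non-remitters detected
    }

# Illustrative labels only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0]
print(classification_metrics(y_true, y_pred))
```

A model can reach a reasonable sensitivity while its specificity stays near chance, which is the pattern the abstract describes.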
Replication of 13 obesity loci among Singaporean Chinese, Malay and Asian-Indian populations.
Dorajoo, R; Blakemore, A I F; Sim, X; Ong, R T-H; Ng, D P K; Seielstad, M; Wong, T-Y; Saw, S-M; Froguel, P; Liu, J; Tai, E-S
2012-01-01
Recent genome-wide association studies (GWAS) have identified 38 obesity-associated loci among European populations. However, their contribution to obesity in other ethnicities is largely unknown. We utilised five GWAS (N=10 482) from Chinese (three cohorts, including one with type 2 diabetes and another one of children), Malay and Indian ethnic groups from Singapore. Data sets were analysed individually and subsequently in combined meta-analysis for Z-score body-mass index (BMI) associations. Variants at the FTO locus showed the strongest associations with BMI Z-score after meta-analysis (P-values 1.16 × 10(-7)-7.95 × 10(-7)). We further detected associations with nine other index obesity variants close to the MC4R, GNPDA2, TMEM18, QPCTL/GIPR, BDNF, ETV5, MAP2K5/SKOR1, SEC16B and TNKS/MSRA loci (meta-analysis P-values ranging from 3.58 × 10(-4)-1.44 × 10(-2)). Three other single-nucleotide polymorphisms (SNPs) from CADM2, PTBP2 and FAIM2 were associated with BMI (P-value ≤ 0.0418) in at least one dataset. The neurotrophin/TRK pathway (P-value=0.029) was highlighted by pathway-based analysis of loci that had statistically significant associations among Singaporean populations. Our data confirm the role of FTO in obesity predisposition among Chinese, Malays and Indians, the three major Asian ethnic groups. We additionally detected associations for 12 obesity-associated SNPs among Singaporeans. Thus, it is likely that Europeans and Asians share some of the genetic predisposition to obesity. Furthermore, the neurotrophin/TRK signalling may have a central role for common obesity among Asians.
Roshandel, Delnaz; Gubitosi-Klug, Rose; Bull, Shelley B; Canty, Angelo J; Pezzolesi, Marcus G; King, George L; Keenan, Hillary A; Snell-Bergeon, Janet K; Maahs, David M; Klein, Ronald; Klein, Barbara E K; Orchard, Trevor J; Costacou, Tina; Weedon, Michael N; Oram, Richard A; Paterson, Andrew D
2018-05-01
The aim of this study was to identify genetic variants associated with beta cell function in type 1 diabetes, as measured by serum C-peptide levels, through meta-genome-wide association studies (meta-GWAS). We performed a meta-GWAS to combine the results from five studies in type 1 diabetes with cross-sectionally measured stimulated, fasting or random C-peptide levels, including 3479 European participants. The p values across studies were combined, taking into account sample size and direction of effect. We also performed separate meta-GWAS for stimulated (n = 1303), fasting (n = 2019) and random (n = 1497) C-peptide levels. In the meta-GWAS for stimulated/fasting/random C-peptide levels, a SNP on chromosome 1, rs559047 (Chr1:238753916, T>A, minor allele frequency [MAF] 0.24-0.26), was associated with C-peptide (p = 4.13 × 10−8), meeting the genome-wide significance threshold (p < 5 × 10−8). In the same meta-GWAS, a locus in the MHC region (rs9260151) was close to the genome-wide significance threshold (Chr6:29911030, C>T, MAF 0.07-0.10, p = 8.43 × 10−8). In the stimulated C-peptide meta-GWAS, rs61211515 (Chr6:30100975, T/-, MAF 0.17-0.19) in the MHC region was associated with stimulated C-peptide (β [SE] = −0.39 [0.07], p = 9.72 × 10−8). rs61211515 was also associated with the rate of stimulated C-peptide decline over time in a subset of individuals (n = 258) with annual repeated measures for up to 6 years (p = 0.02). In the meta-GWAS of random C-peptide, another MHC region SNP, rs3135002 (Chr6:32668439, C>A, MAF 0.02-0.06), was associated with C-peptide (p = 3.49 × 10−8). Conditional analyses suggested that the three identified variants in the MHC region were independent of each other. rs9260151 and rs3135002 have been associated with type 1 diabetes, whereas rs559047 and rs61211515 have not been associated with a risk of developing type 1 diabetes.
We identified a locus on chromosome 1 and multiple variants in the MHC region, at least some of which were distinct from type 1 diabetes risk loci, that were associated with C-peptide, suggesting partly non-overlapping mechanisms for the development and progression of type 1 diabetes. These associations need to be validated in independent populations. Further investigations could provide insights into mechanisms of beta cell loss and opportunities to preserve beta cell function.
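Combining p values across studies while accounting for sample size and direction of effect is commonly done with a sample-size-weighted Z-score scheme. The sketch below shows that general approach under stated assumptions (two-sided p values, weights equal to the square root of each study's sample size); the abstract does not specify the exact software or weighting used, and the input numbers are invented.

```python
from math import sqrt
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution

def combine_p_values(studies):
    """Sample-size-weighted Z-score meta-analysis.

    studies: iterable of (two_sided_p, effect_sign, sample_size), where
    effect_sign is +1 or -1 relative to a common reference allele.
    Returns (z_meta, two_sided_p_meta). Illustrative sketch only.
    """
    num = 0.0
    den = 0.0
    for p, sign, n in studies:
        # Convert the two-sided p value to a signed Z score.
        z = -_norm.inv_cdf(p / 2.0) * (1 if sign >= 0 else -1)
        w = sqrt(n)  # assumed weight: square root of sample size
        num += w * z
        den += w * w
    z_meta = num / sqrt(den)
    p_meta = 2.0 * _norm.cdf(-abs(z_meta))
    return z_meta, p_meta

# Invented example: three studies, same direction of effect.
z, p = combine_p_values([(1e-3, +1, 1300), (5e-3, +1, 2000), (2e-2, +1, 1500)])
print(z, p)
```

Studies with opposite effect directions cancel rather than reinforce each other, which is why the sign has to be carried alongside each p value.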
Spindel, J E; Begum, H; Akdemir, D; Collard, B; Redoña, E; Jannink, J-L; McCouch, S
2016-01-01
To address the multiple challenges to food security posed by global climate change, population growth and rising incomes, plant breeders are developing new crop varieties that can enhance both agricultural productivity and environmental sustainability. Current breeding practices, however, are unable to keep pace with demand. Genomic selection (GS) is a new technique that helps accelerate the rate of genetic gain in breeding by using whole-genome data to predict the breeding value of offspring. Here, we describe a new GS model that combines RR-BLUP with markers fit as fixed effects selected from the results of a genome-wide-association study (GWAS) on the RR-BLUP training data. We term this model GS + de novo GWAS. In a breeding population of tropical rice, GS + de novo GWAS outperformed six other models for a variety of traits and in multiple environments. On the basis of these results, we propose an extended, two-part breeding design that can be used to efficiently integrate novel variation into elite breeding populations, thus expanding genetic diversity and enhancing the potential for sustainable productivity gains. PMID:26860200
Novel genome-wide association study-based candidate loci for differentiated thyroid cancer risk.
Figlioli, Gisella; Köhler, Aleksandra; Chen, Bowang; Elisei, Rossella; Romei, Cristina; Cipollini, Monica; Cristaudo, Alfonso; Bambi, Franco; Paolicchi, Elisa; Hoffmann, Per; Herms, Stefan; Kalemba, Michał; Kula, Dorota; Pastor, Susana; Marcos, Ricard; Velázquez, Antonia; Jarząb, Barbara; Landi, Stefano; Hemminki, Kari; Försti, Asta; Gemignani, Federica
2014-10-01
Genome-wide association studies (GWASs) on differentiated thyroid cancer (DTC) have identified robust associations with single nucleotide polymorphisms (SNPs) at 9q22.33 (FOXE1), 14q13.3 (NKX2-1), and 2q35 (DIRC3). Our recently published GWAS suggested additional susceptibility loci specific for the high-incidence Italian population. The purpose of this study was to identify novel Italian-specific DTC risk variants based on our GWAS and to test them further in low-incidence populations. We investigated 45 SNPs selected from our GWAS first in an Italian population. SNPs that showed suggestive evidence of association were investigated in the Polish and Spanish cohorts. The combined analysis of the GWAS and the Italian replication study (2260 case patients and 2218 control subjects) provided strong evidence of association with rs10136427 near BATF (odds ratio [OR] =1.40, P = 4.35 × 10(-7)) and rs7267944 near DHX35 (OR = 1.39, P = 2.13 × 10(-8)). A possible role in DTC susceptibility in the Italian populations was also found for rs13184587 (ARSB) (P = 8.54 × 10(-6)) and rs1220597 (SPATA13) (P = 3.25 × 10(-6)). Only the associations between rs10136427 and rs7267944 and DTC risk were replicated in the Polish and the Spanish populations with little evidence of population heterogeneity (GWAS and all replications combined, OR = 1.30, P = 9.30 × 10(-7) and OR = 1.32, P = 1.34 × 10(-8), respectively). In silico analyses provided new insights into the possible functional consequences of the SNPs that showed the strongest association with DTC. Our findings provide evidence for novel DTC susceptibility variants. Further studies are warranted to identify the specific genetic variants responsible for the observed associations and to functionally validate our in silico predictions.
Espin‐Garcia, Osvaldo; Craiu, Radu V.
2017-01-01
ABSTRACT We evaluate two‐phase designs to follow‐up findings from genome‐wide association study (GWAS) when the cost of regional sequencing in the entire cohort is prohibitive. We develop novel expectation‐maximization‐based inference under a semiparametric maximum likelihood formulation tailored for post‐GWAS inference. A GWAS‐SNP (where SNP is single nucleotide polymorphism) serves as a surrogate covariate in inferring association between a sequence variant and a normally distributed quantitative trait (QT). We assess test validity and quantify efficiency and power of joint QT‐SNP‐dependent sampling and analysis under alternative sample allocations by simulations. Joint allocation balanced on SNP genotype and extreme‐QT strata yields significant power improvements compared to marginal QT‐ or SNP‐based allocations. We illustrate the proposed method and evaluate the sensitivity of sample allocation to sampling variation using data from a sequencing study of systolic blood pressure. PMID:29239496
Improved minimum cost and maximum power two stage genome-wide association study designs.
Stanhope, Stephen A; Skol, Andrew D
2012-01-01
In a two stage genome-wide association study (2S-GWAS), a sample of cases and controls is allocated into two groups, and genetic markers are analyzed sequentially with respect to these groups. For such studies, experimental design considerations have primarily focused on minimizing study cost as a function of the allocation of cases and controls to stages, subject to a constraint on the power to detect an associated marker. However, most treatments of this problem implicitly restrict the set of feasible designs to only those that allocate the same proportions of cases and controls to each stage. In this paper, we demonstrate that removing this restriction can improve the cost advantages demonstrated by previous 2S-GWAS designs by up to 40%. Additionally, we consider designs that maximize study power with respect to a cost constraint, and show that recalculated power maximizing designs can recover a substantial amount of the planned study power that might otherwise be lost if study funding is reduced. We provide open source software for calculating cost minimizing or power maximizing 2S-GWAS designs.
Brassinosteroid and gibberellin control of seedling traits in maize (Zea mays L.).
Hu, Songlin; Sanchez, Darlene L; Wang, Cuiling; Lipka, Alexander E; Yin, Yanhai; Gardner, Candice A C; Lübberstedt, Thomas
2017-10-01
In this study, we established two doubled haploid (DH) libraries with a total of 207 DH lines. We applied BR and GA inhibitors to all DH lines at seedling stage and measured seedling BR and GA inhibitor responses. Moreover, we evaluated field traits for each DH line (untreated). We conducted genome-wide association studies (GWAS) with 62,049 genome-wide SNPs to explore the genetic control of seedling traits by BR and GA. In addition, we correlated seedling stage hormone inhibitor response with field traits. Large variation for BR and GA inhibitor response and field traits was observed across these DH lines. Seedling stage BR and GA inhibitor response was significantly correlated with yield and flowering time. Using three different GWAS approaches to balance false positives and false negatives, multiple SNPs were discovered to be significantly associated with BR/GA inhibitor responses, with some localized within gene models. SNPs from gene model GRMZM2G013391 were associated with GA inhibitor response across all three GWAS models. This gene is expressed in roots and shoots and was shown to regulate GA signaling. These results show that BRs and GAs have a great impact on controlling seedling growth. Gene models from GWAS results could be targets for seedling trait improvement. Copyright © 2017 Elsevier B.V. All rights reserved.
Huo, Dezheng
2013-01-01
Numerous single nucleotide polymorphisms (SNPs) associated with breast cancer susceptibility have been identified by genome-wide association studies (GWAS). However, these SNPs were primarily discovered and validated in women of European and Asian ancestry. Because linkage disequilibrium is ancestry-dependent and heterogeneous among racial/ethnic populations, we evaluated common genetic variants at 22 GWAS-identified breast cancer susceptibility loci in a pooled sample of 1502 breast cancer cases and 1378 controls of African ancestry. None of the 22 GWAS index SNPs could be validated, challenging the direct generalizability of breast cancer risk variants identified in Caucasians or Asians to other populations. Novel breast cancer risk variants for women of African ancestry were identified in regions including 5p12 (odds ratio [OR] = 1.40, 95% confidence interval [CI] = 1.11–1.76; P = 0.004), 5q11.2 (OR = 1.22, 95% CI = 1.09–1.36; P = 0.00053) and 10p15.1 (OR = 1.22, 95% CI = 1.08–1.38; P = 0.0015). We also found positive association signals in three regions (6q25.1, 10q26.13 and 16q12.1–q12.2) previously confirmed by fine mapping in women of African ancestry. In addition, polygenic model indicated that eight best markers in this study, compared with 22 GWAS-identified SNPs, could better predict breast cancer risk in women of African ancestry (per-allele OR = 1.21, 95% CI = 1.16–1.27; P = 9.7 × 10–16). Our results demonstrate that fine mapping is a powerful approach to better characterize the breast cancer risk alleles in diverse populations. Future studies and new GWAS in women of African ancestry hold promise to discover additional variants for breast cancer susceptibility with clinical implications throughout the African diaspora. PMID:23475944
Duell, Eric J.; Yu, Kai; Risch, Harvey A.; Olson, Sara H.; Kooperberg, Charles; Wolpin, Brian M.; Jiao, Li; Dong, Xiaoqun; Wheeler, Bill; Arslan, Alan A.; Bueno-de-Mesquita, H. Bas; Fuchs, Charles S.; Gallinger, Steven; Gross, Myron; Hartge, Patricia; Hoover, Robert N.; Holly, Elizabeth A.; Jacobs, Eric J.; Klein, Alison P.; LaCroix, Andrea; Mandelson, Margaret T.; Petersen, Gloria; Zheng, Wei; Agalliu, Ilir; Albanes, Demetrius; Boutron-Ruault, Marie-Christine; Bracci, Paige M.; Buring, Julie E.; Canzian, Federico; Chang, Kenneth; Chanock, Stephen J.; Cotterchio, Michelle; Gaziano, J.Michael; Giovannucci, Edward L.; Goggins, Michael; Hallmans, Göran; Hankinson, Susan E.; Hoffman Bolton, Judith A.; Hunter, David J.; Hutchinson, Amy; Jacobs, Kevin B.; Jenab, Mazda; Khaw, Kay-Tee; Kraft, Peter; Krogh, Vittorio; Kurtz, Robert C.; McWilliams, Robert R.; Mendelsohn, Julie B.; Patel, Alpa V.; Rabe, Kari G.; Riboli, Elio; Shu, Xiao-Ou; Tjønneland, Anne; Tobias, Geoffrey S.; Trichopoulos, Dimitrios; Virtamo, Jarmo; Visvanathan, Kala; Watters, Joanne; Yu, Herbert; Zeleniuch-Jacquotte, Anne; Stolzenberg-Solomon, Rachael Z.
2012-01-01
Four loci have been associated with pancreatic cancer through genome-wide association studies (GWAS). Pathway-based analysis of GWAS data is a complementary approach to identify groups of genes or biological pathways enriched with disease-associated single-nucleotide polymorphisms (SNPs) whose individual effect sizes may be too small to be detected by standard single-locus methods. We used the adaptive rank truncated product method in a pathway-based analysis of GWAS data from 3851 pancreatic cancer cases and 3934 control participants pooled from 12 cohort studies and 8 case–control studies (PanScan). We compiled 23 biological pathways hypothesized to be relevant to pancreatic cancer and observed a nominal association between pancreatic cancer and five pathways (P < 0.05), i.e. pancreatic development, Helicobacter pylori lacto/neolacto, hedgehog, Th1/Th2 immune response and apoptosis (P = 2.0 × 10−6, 1.6 × 10−5, 0.0019, 0.019 and 0.023, respectively). After excluding previously identified genes from the original GWAS in three pathways (NR5A2, ABO and SHH), the pancreatic development pathway remained significant (P = 8.3 × 10−5), whereas the others did not. The most significant genes (P < 0.01) in the five pathways were NR5A2, HNF1A, HNF4G and PDX1 for pancreatic development; ABO for H. pylori lacto/neolacto; SHH for hedgehog; TGFBR2 and CCL18 for Th1/Th2 immune response and MAPK8 and BCL2L11 for apoptosis. Our results provide a link between inherited variation in genes important for pancreatic development and cancer and show that pathway-based approaches to analysis of GWAS data can yield important insights into the collective role of genetic risk variants in cancer. PMID:22523087
Graham, Deborah S Cunninghame; Pinder, Christopher L; Tombleson, Philip; Behrens, Timothy W; Martín, Javier; Fairfax, Benjamin P; Knight, Julian C; Chen, Lingyan; Replogle, Joseph; Syvänen, Ann-Christine; Rönnblom, Lars; Graham, Robert R; Wither, Joan E; Rioux, John D; Alarcón-Riquelme, Marta E; Vyse, Timothy J
2015-01-01
Systemic lupus erythematosus (SLE; OMIM 152700) is a genetically complex autoimmune disease characterized by loss of immune tolerance to nuclear and cell surface antigens. Previous genome-wide association studies (GWAS) had modest sample sizes, reducing their scope and reliability. Our study comprised 7,219 cases and 15,991 controls of European ancestry: a new GWAS, meta-analysis with a published GWAS and a replication study. We have mapped 43 susceptibility loci, including 10 novel associations. Assisted by dense genome coverage, imputation provided evidence for missense variants underpinning associations in eight genes. Other likely causal genes were established by examining associated alleles for cis-acting eQTL effects in a range of ex vivo immune cells. We found an over-representation (n=16) of transcription factors among SLE susceptibility genes. This supports the view that aberrantly regulated gene expression networks in multiple cell types in both the innate and adaptive immune response contribute to the risk of developing SLE. PMID:26502338
No association between telomere length-related loci and number of cutaneous nevi.
Li, Xin; Liang, Geyu; Du, Mengmeng; De Vivo, Immaculata; Nan, Hongmei
2016-12-13
Longer telomeres have been associated both with increased melanoma risk and increased nevus counts. Nevus count is one of the strongest risk factors for melanoma. Recent data showed that a genetic score derived by telomere length-related single nucleotide polymorphisms (SNPs) was strongly associated with melanoma risk; however, the relationships between these SNPs and number of cutaneous nevi have not been investigated. We evaluated the associations between telomere length-related SNPs reported by previous genome-wide association study (GWAS) and nevus counts among 15,955 participants of European Ancestry in the Nurses' Health Study and Health Professionals Follow-up Study. None of the SNPs was associated with nevus counts, nor was the genetic score combining the dosage of alleles related to increased telomere length. The telomere length-related SNPs identified by published GWAS do not appear to play an important role in nevus formation. Genetic determinants of telomere length reported by GWAS do not explain the observed epidemiologic association between telomere length and nevus counts.
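A genetic score that combines allele dosages, as mentioned above, is typically a (possibly weighted) sum over SNPs of the number of trait-increasing alleles each person carries. The sketch below illustrates that arithmetic only; the SNP identifiers and weights are hypothetical placeholders, and the abstract does not say whether the study's score was weighted.

```python
def genetic_score(dosages, weights=None):
    """Sum of allele dosages (0 to 2 per SNP), optionally weighted per SNP.

    dosages: dict mapping SNP id -> count of telomere-lengthening alleles.
    weights: optional dict mapping SNP id -> per-allele weight
             (for example, a published effect size). Defaults to 1.0 each.
    """
    if weights is None:
        weights = {snp: 1.0 for snp in dosages}
    return sum(weights[snp] * dose for snp, dose in dosages.items())

# Hypothetical SNP ids and weights, for illustration only.
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
print(genetic_score(person))
print(genetic_score(person, {"snp_a": 0.1, "snp_b": 0.2, "snp_c": 0.3}))
```

The resulting per-person score can then be tested for association with a phenotype such as nevus count in the same way as a single SNP.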
An empirical comparison of several recent epistatic interaction detection methods.
Wang, Yue; Liu, Guimei; Feng, Mengling; Wong, Limsoon
2011-11-01
Many new methods have recently been proposed for detecting epistatic interactions in GWAS data. There is, however, no in-depth independent comparison of these methods yet. Five recent methods, namely TEAM, BOOST, SNPHarvester, SNPRuler and Screen and Clean (SC), are evaluated here in terms of power, type-1 error rate, scalability and completeness. In terms of power, TEAM performs best on data with main effect and BOOST performs best on data without main effect. In terms of type-1 error rate, TEAM and BOOST have higher type-1 error rates than SNPRuler and SNPHarvester. SC does not control type-1 error rate well. In terms of scalability, we tested the five methods using a dataset with 100 000 SNPs on a 64-bit Ubuntu system with an Intel(R) Xeon(R) 2.66 GHz CPU and 16 GB memory. TEAM takes ~36 days to finish and SNPRuler reports heap allocation problems. BOOST scales up to 100 000 SNPs and the cost is much lower than that of TEAM. SC and SNPHarvester are the most scalable. In terms of completeness, we study how frequently the pruning techniques employed by these methods incorrectly prune away the most significant epistatic interactions. We find that, on average, 20% of datasets without main effect and 60% of datasets with main effect are pruned incorrectly by BOOST, SNPRuler and SNPHarvester. The software for the five methods tested is available from the URLs below. TEAM: http://csbio.unc.edu/epistasis/download.php BOOST: http://ihome.ust.hk/~eeyang/papers.html. SNPHarvester: http://bioinformatics.ust.hk/SNPHarvester.html. SNPRuler: http://bioinformatics.ust.hk/SNPRuler.zip. Screen and Clean: http://wpicr.wpic.pitt.edu/WPICCompGen/. wangyue@nus.edu.sg.
Amin Al Olama, Ali; Dadaev, Tokhir; Hazelett, Dennis J; Li, Qiuyan; Leongamornlert, Daniel; Saunders, Edward J; Stephens, Sarah; Cieza-Borrella, Clara; Whitmore, Ian; Benlloch Garcia, Sara; Giles, Graham G; Southey, Melissa C; Fitzgerald, Liesel; Gronberg, Henrik; Wiklund, Fredrik; Aly, Markus; Henderson, Brian E; Schumacher, Fredrick; Haiman, Christopher A; Schleutker, Johanna; Wahlfors, Tiina; Tammela, Teuvo L; Nordestgaard, Børge G; Key, Tim J; Travis, Ruth C; Neal, David E; Donovan, Jenny L; Hamdy, Freddie C; Pharoah, Paul; Pashayan, Nora; Khaw, Kay-Tee; Stanford, Janet L; Thibodeau, Stephen N; Mcdonnell, Shannon K; Schaid, Daniel J; Maier, Christiane; Vogel, Walther; Luedeke, Manuel; Herkommer, Kathleen; Kibel, Adam S; Cybulski, Cezary; Wokołorczyk, Dominika; Kluzniak, Wojciech; Cannon-Albright, Lisa; Brenner, Hermann; Butterbach, Katja; Arndt, Volker; Park, Jong Y; Sellers, Thomas; Lin, Hui-Yi; Slavov, Chavdar; Kaneva, Radka; Mitev, Vanio; Batra, Jyotsna; Clements, Judith A; Spurdle, Amanda; Teixeira, Manuel R; Paulo, Paula; Maia, Sofia; Pandha, Hardev; Michael, Agnieszka; Kierzek, Andrzej; Govindasami, Koveela; Guy, Michelle; Lophatonanon, Artitaya; Muir, Kenneth; Viñuela, Ana; Brown, Andrew A; Freedman, Mathew; Conti, David V; Easton, Douglas; Coetzee, Gerhard A; Eeles, Rosalind A; Kote-Jarai, Zsofia
2015-10-01
Genome-wide association studies (GWAS) have identified numerous common prostate cancer (PrCa) susceptibility loci. We have fine-mapped 64 GWAS regions known at the conclusion of the iCOGS study using large-scale genotyping and imputation in 25 723 PrCa cases and 26 274 controls of European ancestry. We detected evidence for multiple independent signals at 16 regions, 12 of which contained additional newly identified significant associations. A single signal comprising a spectrum of correlated variation was observed at 39 regions; 35 of which are now described by a novel more significantly associated lead SNP, while the originally reported variant remained as the lead SNP only in 4 regions. We also confirmed two association signals in Europeans that had been previously reported only in East-Asian GWAS. Based on statistical evidence and linkage disequilibrium (LD) structure, we have curated and narrowed down the list of the most likely candidate causal variants for each region. Functional annotation using data from ENCODE filtered for PrCa cell lines and eQTL analysis demonstrated significant enrichment for overlap with bio-features within this set. By incorporating the novel risk variants identified here alongside the refined data for existing association signals, we estimate that these loci now explain ∼38.9% of the familial relative risk of PrCa, an 8.9% improvement over the previously reported GWAS tag SNPs. This suggests that a significant fraction of the heritability of PrCa may have been hidden during the discovery phase of GWAS, in particular due to the presence of multiple independent signals within the same region. © The Author 2015. Published by Oxford University Press.
Dolejsi, Erich; Bodenstorfer, Bernhard; Frommlet, Florian
2014-01-01
The prevailing method of analyzing GWAS data is still to test each marker individually, although from a statistical point of view it is quite obvious that in case of complex traits such single marker tests are not ideal. Recently several model selection approaches for GWAS have been suggested, most of them based on LASSO-type procedures. Here we will discuss an alternative model selection approach which is based on a modification of the Bayesian Information Criterion (mBIC2) which was previously shown to have certain asymptotic optimality properties in terms of minimizing the misclassification error. Heuristic search strategies are introduced which attempt to find the model which minimizes mBIC2, and which are efficient enough to allow the analysis of GWAS data. Our approach is implemented in a software package called MOSGWA. Its performance in case control GWAS is compared with the two algorithms HLASSO and d-GWASelect, as well as with single marker tests, where we performed a simulation study based on real SNP data from the POPRES sample. Our results show that MOSGWA performs slightly better than HLASSO, where specifically for more complex models MOSGWA is more powerful with only a slight increase in Type I error. On the other hand according to our simulations GWASelect does not at all control the type I error when used to automatically determine the number of important SNPs. We also reanalyze the GWAS data from the Wellcome Trust Case-Control Consortium and compare the findings of the different procedures, where MOSGWA detects for complex diseases a number of interesting SNPs which are not found by other methods. PMID:25061809
2014-01-01
Background Genome-wide association studies (GWAS) have identified several loci associated with schizophrenia and/or bipolar disorder. We performed a GWAS of psychosis as a broad syndrome rather than within specific diagnostic categories. Methods 1239 cases with schizophrenia, schizoaffective disorder, or psychotic bipolar disorder; 857 of their unaffected relatives, and 2739 healthy controls were genotyped with the Affymetrix 6.0 single nucleotide polymorphism (SNP) array. Analyses of 695,193 SNPs were conducted using UNPHASED, which combines information across families and unrelated individuals. We attempted to replicate signals found in 23 genomic regions using existing data on nonoverlapping samples from the Psychiatric GWAS Consortium and Schizophrenia-GENE-plus cohorts (10,352 schizophrenia patients and 24,474 controls). Results No individual SNP showed compelling evidence for association with psychosis in our data. However, we observed a trend for association with same risk alleles at loci previously associated with schizophrenia (one-sided p = .003). A polygenic score analysis found that the Psychiatric GWAS Consortium’s panel of SNPs associated with schizophrenia significantly predicted disease status in our sample (p = 5 × 10−14) and explained approximately 2% of the phenotypic variance. Conclusions Although narrowly defined phenotypes have their advantages, we believe new loci may also be discovered through meta-analysis across broad phenotypes. The novel statistical methodology we introduced to model effect size heterogeneity between studies should help future GWAS that combine association evidence from related phenotypes. Applying these approaches, we highlight three loci that warrant further investigation. We found that SNPs conveying risk for schizophrenia are also predictive of disease status in our data. PMID:23871474
Smeland, Olav B; Wang, Yunpeng; Frei, Oleksandr; Li, Wen; Hibar, Derrek P; Franke, Barbara; Bettella, Francesco; Witoelar, Aree; Djurovic, Srdjan; Chen, Chi-Hua; Thompson, Paul M; Dale, Anders M; Andreassen, Ole A
2018-06-06
Schizophrenia (SCZ) is associated with differences in subcortical brain volumes and intracranial volume (ICV). However, little is known about the underlying etiology of these brain alterations. Here, we explored whether brain structure volumes and SCZ share genetic risk factors. Using conditional false discovery rate (FDR) analysis, we integrated genome-wide association study (GWAS) data on SCZ (n = 82315) and GWAS data on 7 subcortical brain volumes and ICV (n = 11840). By conditioning the FDR on overlapping associations, this statistical approach increases power to discover genetic loci. To assess the credibility of our approach, we studied the identified loci in larger GWAS samples on ICV (n = 26577) and hippocampal volume (n = 26814). We observed polygenic overlap between SCZ and volumes of hippocampus, putamen, and ICV. Based on conjunctional FDR < 0.05, we identified 2 loci shared between SCZ and ICV implicating genes FOXO3 (rs10457180) and ITIH4 (rs4687658), 2 loci shared between SCZ and hippocampal volume implicating SLC4A10 (rs4664442) and SPATS2L (rs1653290), and 2 loci shared between SCZ and volume of putamen implicating DCC (rs4632195) and DLG2 (rs11233632). The loci shared between SCZ and hippocampal volume or ICV had not reached significance in the primary GWAS on brain phenotypes. Proving our point of increased power, 2 loci did reach genome-wide significance with ICV (rs10457180) and hippocampal volume (rs4664442) in the larger GWAS. Three of the 6 identified loci are novel for SCZ. Altogether, the findings provide new insights into the relationship between SCZ and brain structure volumes, suggesting that their genetic architectures are not independent.
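To make the conditioning step concrete, here is a rough sketch of one commonly used empirical estimator of the conditional FDR: the p-value in the primary trait is divided by the empirical conditional distribution of primary p-values among SNPs that are at least as significant in the secondary trait. This is an assumption about the general technique, not the authors' implementation, and the p-values below are invented.

```python
def conditional_fdr(p, q):
    """Empirical conditional FDR of trait-1 p-values `p` given trait-2 p-values `q`.

    For SNP i: cFDR_i = p_i * #{j: q_j <= q_i} / #{j: p_j <= p_i and q_j <= q_i},
    capped at 1. Quadratic in the number of SNPs, so for illustration only.
    """
    out = []
    for pi, qi in zip(p, q):
        n_q = sum(1 for qj in q if qj <= qi)
        n_pq = sum(1 for pj, qj in zip(p, q) if pj <= pi and qj <= qi)  # >= 1, counts SNP i itself
        out.append(min(1.0, pi * n_q / n_pq))
    return out

def conjunctional_fdr(p, q):
    """Conjunctional FDR: the larger of the two conditional FDRs for each SNP."""
    return [max(a, b) for a, b in zip(conditional_fdr(p, q), conditional_fdr(q, p))]

# Invented p-values for four SNPs in two traits.
p_trait1 = [0.01, 0.2, 0.6, 0.9]
p_trait2 = [0.01, 0.02, 0.7, 0.8]
conj = conjunctional_fdr(p_trait1, p_trait2)
```

Taking the maximum of the two directions is what makes a small conjunctional value evidence for association with both traits rather than with either one alone.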
Trampush, J W; Yang, M L Z; Yu, J; Knowles, E; Davies, G; Liewald, D C; Starr, J M; Djurovic, S; Melle, I; Sundet, K; Christoforou, A; Reinvang, I; DeRosse, P; Lundervold, A J; Steen, V M; Espeseth, T; Räikkönen, K; Widen, E; Palotie, A; Eriksson, J G; Giegling, I; Konte, B; Roussos, P; Giakoumaki, S; Burdick, K E; Payton, A; Ollier, W; Horan, M; Chiba-Falek, O; Attix, D K; Need, A C; Cirulli, E T; Voineskos, A N; Stefanis, N C; Avramopoulos, D; Hatzimanolis, A; Arking, D E; Smyrnis, N; Bilder, R M; Freimer, N A; Cannon, T D; London, E; Poldrack, R A; Sabb, F W; Congdon, E; Conley, E D; Scult, M A; Dickinson, D; Straub, R E; Donohoe, G; Morris, D; Corvin, A; Gill, M; Hariri, A R; Weinberger, D R; Pendleton, N; Bitsios, P; Rujescu, D; Lahti, J; Le Hellard, S; Keller, M C; Andreassen, O A; Deary, I J; Glahn, D C; Malhotra, A K; Lencz, T
2017-03-01
The complex nature of human cognition has resulted in cognitive genomics lagging behind many other fields in terms of gene discovery using genome-wide association study (GWAS) methods. In an attempt to overcome these barriers, the current study utilized GWAS meta-analysis to examine the association of common genetic variation (~8M single-nucleotide polymorphisms (SNP) with minor allele frequency ⩾1%) to general cognitive function in a sample of 35 298 healthy individuals of European ancestry across 24 cohorts in the Cognitive Genomics Consortium (COGENT). In addition, we utilized individual SNP lookups and polygenic score analyses to identify genetic overlap with other relevant neurobehavioral phenotypes. Our primary GWAS meta-analysis identified two novel SNP loci (top SNPs: rs76114856 in the CENPO gene on chromosome 2 and rs6669072 near LOC105378853 on chromosome 1) associated with cognitive performance at the genome-wide significance level (P<5 × 10−8). Gene-based analysis identified an additional three Bonferroni-corrected significant loci at chromosomes 17q21.31, 17p13.1 and 1p13.3. Altogether, common variation across the genome resulted in a conservatively estimated SNP heritability of 21.5% (s.e.=0.01%) for general cognitive function. Integration with prior GWAS of cognitive performance and educational attainment yielded several additional significant loci. Finally, we found robust polygenic correlations between cognitive performance and educational attainment, several psychiatric disorders, birth length/weight and smoking behavior, as well as a novel genetic association to the personality trait of openness. These data provide new insight into the genetics of neurocognitive function with relevance to understanding the pathophysiology of neuropsychiatric illness.
Trampush, J W; Yang, M L Z; Yu, J; Knowles, E; Davies, G; Liewald, D C; Starr, J M; Djurovic, S; Melle, I; Sundet, K; Christoforou, A; Reinvang, I; DeRosse, P; Lundervold, A J; Steen, V M; Espeseth, T; Räikkönen, K; Widen, E; Palotie, A; Eriksson, J G; Giegling, I; Konte, B; Roussos, P; Giakoumaki, S; Burdick, K E; Payton, A; Ollier, W; Horan, M; Chiba-Falek, O; Attix, D K; Need, A C; Cirulli, E T; Voineskos, A N; Stefanis, N C; Avramopoulos, D; Hatzimanolis, A; Arking, D E; Smyrnis, N; Bilder, R M; Freimer, N A; Cannon, T D; London, E; Poldrack, R A; Sabb, F W; Congdon, E; Conley, E D; Scult, M A; Dickinson, D; Straub, R E; Donohoe, G; Morris, D; Corvin, A; Gill, M; Hariri, A R; Weinberger, D R; Pendleton, N; Bitsios, P; Rujescu, D; Lahti, J; Le Hellard, S; Keller, M C; Andreassen, O A; Deary, I J; Glahn, D C; Malhotra, A K; Lencz, T
2017-01-01
The complex nature of human cognition has resulted in cognitive genomics lagging behind many other fields in terms of gene discovery using genome-wide association study (GWAS) methods. In an attempt to overcome these barriers, the current study utilized GWAS meta-analysis to examine the association of common genetic variation (~8M single-nucleotide polymorphisms (SNP) with minor allele frequency ⩾1%) to general cognitive function in a sample of 35 298 healthy individuals of European ancestry across 24 cohorts in the Cognitive Genomics Consortium (COGENT). In addition, we utilized individual SNP lookups and polygenic score analyses to identify genetic overlap with other relevant neurobehavioral phenotypes. Our primary GWAS meta-analysis identified two novel SNP loci (top SNPs: rs76114856 in the CENPO gene on chromosome 2 and rs6669072 near LOC105378853 on chromosome 1) associated with cognitive performance at the genome-wide significance level (P<5 × 10−8). Gene-based analysis identified an additional three Bonferroni-corrected significant loci at chromosomes 17q21.31, 17p13.1 and 1p13.3. Altogether, common variation across the genome resulted in a conservatively estimated SNP heritability of 21.5% (s.e.=0.01%) for general cognitive function. Integration with prior GWAS of cognitive performance and educational attainment yielded several additional significant loci. Finally, we found robust polygenic correlations between cognitive performance and educational attainment, several psychiatric disorders, birth length/weight and smoking behavior, as well as a novel genetic association to the personality trait of openness. These data provide new insight into the genetics of neurocognitive function with relevance to understanding the pathophysiology of neuropsychiatric illness. PMID:28093568
Bramon, Elvira; Pirinen, Matti; Strange, Amy; Lin, Kuang; Freeman, Colin; Bellenguez, Céline; Su, Zhan; Band, Gavin; Pearson, Richard; Vukcevic, Damjan; Langford, Cordelia; Deloukas, Panos; Hunt, Sarah; Gray, Emma; Dronov, Serge; Potter, Simon C; Tashakkori-Ghanbaria, Avazeh; Edkins, Sarah; Bumpstead, Suzannah J; Arranz, Maria J; Bakker, Steven; Bender, Stephan; Bruggeman, Richard; Cahn, Wiepke; Chandler, David; Collier, David A; Crespo-Facorro, Benedicto; Dazzan, Paola; de Haan, Lieuwe; Di Forti, Marta; Dragović, Milan; Giegling, Ina; Hall, Jeremy; Iyegbe, Conrad; Jablensky, Assen; Kahn, René S; Kalaydjieva, Luba; Kravariti, Eugenia; Lawrie, Stephen; Linszen, Don H; Mata, Ignacio; McDonald, Colm; McIntosh, Andrew; Myin-Germeys, Inez; Ophoff, Roel A; Pariante, Carmine M; Paunio, Tiina; Picchioni, Marco; Ripke, Stephan; Rujescu, Dan; Sauer, Heinrich; Shaikh, Madiha; Sussmann, Jessika; Suvisaari, Jaana; Tosato, Sarah; Toulopoulou, Timothea; Van Os, Jim; Walshe, Muriel; Weisbrod, Matthias; Whalley, Heather; Wiersma, Durk; Blackwell, Jenefer M; Brown, Matthew A; Casas, Juan P; Corvin, Aiden; Duncanson, Audrey; Jankowski, Janusz A Z; Markus, Hugh S; Mathew, Christopher G; Palmer, Colin N A; Plomin, Robert; Rautanen, Anna; Sawcer, Stephen J; Trembath, Richard C; Wood, Nicholas W; Barroso, Ines; Peltonen, Leena; Lewis, Cathryn M; Murray, Robin M; Donnelly, Peter; Powell, John; Spencer, Chris C A
2014-03-01
Genome-wide association studies (GWAS) have identified several loci associated with schizophrenia and/or bipolar disorder. We performed a GWAS of psychosis as a broad syndrome rather than within specific diagnostic categories. 1239 cases with schizophrenia, schizoaffective disorder, or psychotic bipolar disorder; 857 of their unaffected relatives, and 2739 healthy controls were genotyped with the Affymetrix 6.0 single nucleotide polymorphism (SNP) array. Analyses of 695,193 SNPs were conducted using UNPHASED, which combines information across families and unrelated individuals. We attempted to replicate signals found in 23 genomic regions using existing data on nonoverlapping samples from the Psychiatric GWAS Consortium and Schizophrenia-GENE-plus cohorts (10,352 schizophrenia patients and 24,474 controls). No individual SNP showed compelling evidence for association with psychosis in our data. However, we observed a trend for association with same risk alleles at loci previously associated with schizophrenia (one-sided p = .003). A polygenic score analysis found that the Psychiatric GWAS Consortium's panel of SNPs associated with schizophrenia significantly predicted disease status in our sample (p = 5 × 10−14) and explained approximately 2% of the phenotypic variance. Although narrowly defined phenotypes have their advantages, we believe new loci may also be discovered through meta-analysis across broad phenotypes. The novel statistical methodology we introduced to model effect size heterogeneity between studies should help future GWAS that combine association evidence from related phenotypes. Applying these approaches, we highlight three loci that warrant further investigation. We found that SNPs conveying risk for schizophrenia are also predictive of disease status in our data. Copyright © 2014 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
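The polygenic score analysis mentioned above can be illustrated with a minimal sketch. It assumes the usual additive form, in which an individual's score is the sum over SNPs of the risk-allele count multiplied by a weight (for example a log odds ratio) taken from an external discovery GWAS. The SNP identifiers and weights below are hypothetical and the function is not taken from the study.

```python
def polygenic_score(dosages: dict, weights: dict) -> float:
    """Additive polygenic score for one individual.

    dosages: SNP id -> risk-allele count (0, 1 or 2) in the target sample
    weights: SNP id -> effect size (e.g. log odds ratio) from a discovery GWAS
    Only SNPs present in both inputs contribute.
    """
    shared = dosages.keys() & weights.keys()
    return sum(weights[snp] * dosages[snp] for snp in shared)

# Hypothetical discovery weights and one individual's genotypes.
discovery_weights = {"snpA": 0.10, "snpB": 0.05, "snpC": -0.02}
individual = {"snpA": 2, "snpB": 1, "snpD": 0}
score = polygenic_score(individual, discovery_weights)  # 0.10 * 2 + 0.05 * 1
```

In a typical analysis the scores of cases and controls would then be compared, for instance by regressing disease status on the score.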
Fanous, Ayman H; Zhou, Baiyu; Aggen, Steven H; Bergen, Sarah E; Amdur, Richard L; Duan, Jubao; Sanders, Alan R; Shi, Jianxin; Mowry, Bryan J; Olincy, Ann; Amin, Farooq; Cloninger, C Robert; Silverman, Jeremy M; Buccola, Nancy G; Byerley, William F; Black, Donald W; Freedman, Robert; Dudbridge, Frank; Holmans, Peter A; Ripke, Stephan; Gejman, Pablo V; Kendler, Kenneth S; Levinson, Douglas F
2012-12-01
Multiple sources of evidence suggest that genetic factors influence variation in clinical features of schizophrenia. The authors present the first genome-wide association study (GWAS) of dimensional symptom scores among individuals with schizophrenia. Based on the Lifetime Dimensions of Psychosis Scale ratings of 2,454 case subjects of European ancestry from the Molecular Genetics of Schizophrenia (MGS) sample, three symptom factors (positive, negative/disorganized, and mood) were identified with exploratory factor analysis. Quantitative scores for each factor from a confirmatory factor analysis were analyzed for association with 696,491 single-nucleotide polymorphisms (SNPs) using linear regression, with correction for age, sex, clinical site, and ancestry. Polygenic score analysis was carried out to determine whether case and comparison subjects in 16 Psychiatric GWAS Consortium (PGC) schizophrenia samples (excluding MGS samples) differed in scores computed by weighting their genotypes by MGS association test results for each symptom factor. No genome-wide significant associations were observed between SNPs and factor scores. Most of the SNPs producing the strongest evidence for association were in or near genes involved in neurodevelopment, neuroprotection, or neurotransmission, including genes playing a role in Mendelian CNS diseases, but no statistically significant effect was observed for any defined gene pathway. Finally, polygenic scores based on MGS GWAS results for the negative/disorganized factor were significantly different between case and comparison subjects in the PGC data set; for MGS subjects, negative/disorganized factor scores were correlated with polygenic scores generated using case-control GWAS results from the other PGC samples. 
The polygenic signal that has been observed in cross-sample analyses of schizophrenia GWAS data sets could be in part related to genetic effects on negative and disorganized symptoms (i.e., core features of chronic schizophrenia).
Accounting for linkage disequilibrium in association analysis of diverse populations.
Charles, Bashira A; Shriner, Daniel; Rotimi, Charles N
2014-04-01
The National Human Genome Research Institute's catalog of published genome-wide association studies (GWAS) lists over 10,000 genetic variants collectively associated with over 800 human diseases or traits. Most of these GWAS have been conducted in European-ancestry populations. Findings gleaned from these studies have led to identification of disease-associated loci and biologic pathways involved in disease etiology. In multiple instances, these genomic findings have led to the development of novel medical therapies or evidence for prescribing a given drug as the appropriate treatment for a given individual beyond phenotypic appearances or socially defined constructs of race or ethnicity. Such findings have implications for populations throughout the globe and GWAS are increasingly being conducted in more diverse populations. A major challenge for investigators seeking to follow up genomic findings between diverse populations is discordant patterns of linkage disequilibrium (LD). We provide an overview of common measures of LD and opportunities for their use in novel methods designed to address challenges associated with following up GWAS conducted in European-ancestry populations in African-ancestry populations or, more generally, between populations with discordant LD patterns. We detail the strengths and weaknesses associated with different approaches. We also describe application of these strategies in follow-up studies of populations with concordant LD patterns (replication) or discordant LD patterns (transferability) as well as fine-mapping studies. We review application of these methods to a variety of traits and diseases. © 2014 WILEY PERIODICALS, INC.
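As a small illustration of the common LD measures referred to above, the sketch below computes D, D' and r² for two biallelic loci from a haplotype frequency and the two allele frequencies, using the standard textbook definitions. The function name and the example frequencies are illustrative assumptions.

```python
def ld_measures(p_ab: float, p_a: float, p_b: float) -> dict:
    """D, D' and r^2 for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    # Largest |D| attainable given the allele frequencies, which depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return {"D": d, "D_prime": d_prime, "r2": r2}

# A rare allele that always occurs on the background of a common one:
# D' is 1 while r^2 is low, because the allele frequencies differ.
example = ld_measures(p_ab=0.1, p_a=0.1, p_b=0.5)
```

The example shows why the two measures can disagree: D' reflects whether recombination has broken up the haplotypes, whereas r² reflects how well one variant tags the other, which matters when a tag SNP is carried over to a population with different allele frequencies.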
Galesloot, Tessel E.; van Dijk, Freerk; Geurts-Moespot, Anneke J.; Girelli, Domenico; Kiemeney, Lambertus A. L. M.; Sweep, Fred C. G. J.; Swertz, Morris A.; van der Meer, Peter; Camaschella, Clara; Toniolo, Daniela; Vermeulen, Sita H.; van der Harst, Pim; Swinkels, Dorine W.
2016-01-01
Serum hepcidin concentration is regulated by iron status, inflammation, erythropoiesis and numerous other factors, but underlying processes are incompletely understood. We studied the association of common and rare single nucleotide variants (SNVs) with serum hepcidin in one Italian study and two large Dutch population-based studies. We genotyped common SNVs with genome-wide association study (GWAS) arrays and subsequently performed imputation using the 1000 Genomes reference panel. Cohort-specific GWAS were performed for log-transformed serum hepcidin, adjusted for age and gender, and results were combined in a fixed-effects meta-analysis (total N 6,096). Six top SNVs (p<5×10−6) were genotyped in 3,821 additional samples, but associations were not replicated. Furthermore, we meta-analyzed cohort-specific exome array association results of rare SNVs with serum hepcidin that were available for two of the three cohorts (total N 3,226), but no exome-wide significant signal (p<1.4×10−6) was identified. Gene-based meta-analyses revealed 19 genes that showed significant association with hepcidin. Our results suggest the absence of common SNVs and rare exonic SNVs explaining a large proportion of phenotypic variation in serum hepcidin. We recommend extension of our study once additional substantial cohorts with hepcidin measurements, GWAS and/or exome array data become available in order to increase power to identify variants that explain a smaller proportion of hepcidin variation. In addition, we encourage follow-up of the potentially interesting genes that resulted from the gene-based analysis of low-frequency and rare variants. PMID:27846281
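The fixed-effects meta-analysis step can be sketched as follows, assuming the standard inverse-variance weighted estimator applied to per-cohort effect estimates and standard errors for a single variant. The abstract does not say which software or weighting scheme was used, and the numbers in the example are invented.

```python
import math

def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects meta-analysis for one variant.

    betas: per-cohort effect estimates; ses: their standard errors.
    Returns (pooled beta, pooled standard error, z statistic, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value under a standard normal
    return beta, se, z, p

# Invented estimates from two cohorts with equal precision.
beta, se, z, p = fixed_effects_meta([0.1, 0.3], [0.1, 0.1])
```

With equal standard errors the pooled estimate is the plain average of the cohort estimates; cohorts with smaller standard errors would otherwise pull the estimate toward their own value.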
Integrated Post-GWAS Analysis Sheds New Light on the Disease Mechanisms of Schizophrenia
Lin, Jhih-Rong; Cai, Ying; Zhang, Quanwei; Zhang, Wen; Nogales-Cadenas, Rubén; Zhang, Zhengdong D.
2016-01-01
Schizophrenia is a severe mental disorder with a large genetic component. Recent genome-wide association studies (GWAS) have identified many schizophrenia-associated common variants. For most of the reported associations, however, the underlying biological mechanisms are not clear. The critical first step for their elucidation is to identify the most likely disease genes as the source of the association signals. Here, we describe a general computational framework of post-GWAS analysis for complex disease gene prioritization. We identify 132 putative schizophrenia risk genes in 76 risk regions spanning 120 schizophrenia-associated common variants, 78 of which have not been recognized as schizophrenia disease genes by previous GWAS. Even more significantly, 29 of them are outside the risk regions, likely under regulation of transcriptional regulatory elements contained therein. These putative schizophrenia risk genes are transcriptionally active in both brain and the immune system, and highly enriched among cellular pathways, consistent with leading pathophysiological hypotheses about the pathogenesis of schizophrenia. With their involvement in distinct biological processes, these putative schizophrenia risk genes, with different association strengths, show distinctive temporal expression patterns, and play specific biological roles during brain development. PMID:27754856
Deng, Yangqing; Pan, Wei
2018-06-01
Due to issues of practicality and confidentiality of genomic data sharing on a large scale, typically only meta- or mega-analyzed genome-wide association study (GWAS) summary data, not individual-level data, are publicly available. Reanalyses of such GWAS summary data for a wide range of applications have become more and more common and useful, which often require the use of an external reference panel with individual-level genotypic data to infer linkage disequilibrium (LD) among genetic variants. However, with a small sample size in only hundreds, as for the most popular 1000 Genomes Project European sample, estimation errors for LD are not negligible, leading to often dramatically increased numbers of false positives in subsequent analyses of GWAS summary data. To alleviate the problem in the context of association testing for a group of SNPs, we propose an alternative estimator of the covariance matrix with an idea similar to multiple imputation. We use numerical examples based on both simulated and real data to demonstrate the severe problem with the use of the 1000 Genomes Project reference panels, and the improved performance of our new approach. Copyright © 2018 by the Genetics Society of America.
Missing data imputation and haplotype phase inference for genome-wide association studies
Browning, Sharon R.
2009-01-01
Imputation of missing data and the use of haplotype-based association tests can improve the power of genome-wide association studies (GWAS). In this article, I review methods for haplotype inference and missing data imputation, and discuss their application to GWAS. I discuss common features of the best algorithms for haplotype phase inference and missing data imputation in large-scale data sets, as well as some important differences between classes of methods, and highlight the methods that provide the highest accuracy and fastest computational performance. PMID:18850115
Genetic Risk Variants for Social Anxiety
Stein, Murray B.; Chen, Chia-Yen; Jain, Sonia; Jensen, Kevin P.; He, Feng; Heeringa, Steven G.; Kessler, Ronald C.; Maihofer, Adam; Nock, Matthew K.; Ripke, Stephan; Sun, Xiaoying; Thomas, Michael L.; Ursano, Robert J.; Smoller, Jordan W.; Gelernter, Joel
2017-01-01
Social anxiety is a neurobehavioral trait characterized by fear and reticence in social situations. Twin studies have shown that social anxiety has a heritable basis, shared with neuroticism and extraversion, but genetic studies have yet to demonstrate robust risk variants. We conducted genomewide association analysis (GWAS) of subjects within the Army Study To Assess Risk and Resilience in Service members (Army STARRS) to (1) determine SNP-based heritability of social anxiety; (2) discern genetic risk loci for social anxiety; and (3) determine shared genetic risk with neuroticism and extraversion. GWAS were conducted within ancestral groups (EUR, AFR, LAT) using linear regression models for each of the 3 component studies in Army STARRS, and then meta-analyzed across studies. SNP-based heritability for social anxiety was significant (h²g = 0.12, p = 2.17×10−4 in EUR). One meta-analytically genomewide significant locus was seen in each of EUR (rs708012, Chr 6: BP 36965970, p = 1.55×10−8; beta = 0.073) and AFR (rs78924501, Chr 1: BP 88406905, p = 3.58×10−8; beta = 0.265) samples. Social anxiety in Army STARRS was significantly genetically correlated (negatively) with extraversion (rg = -0.52, se = 0.22, p = 0.02) but not with neuroticism (rg = 0.05, se = 0.22, p = 0.81) or with an anxiety disorder factor score (rg = 0.02, se = 0.32, p = 0.94) from external GWAS meta-analyses. This first GWAS of social anxiety confirms a genetic basis for social anxiety, shared with extraversion but possibly less so with neuroticism. PMID:28224735
Gjerdevik, Miriam; Haaland, Øystein A.; Romanowska, Julia; Lie, Rolv T.
2017-01-01
Background GWAS discoveries on the X-chromosome are underrepresented in the literature primarily because the analytical tools that have been applied were originally designed for autosomal markers. Our objective here is to employ a new robust and flexible tool for chromosome-wide analysis of X-linked markers in complex traits. Orofacial clefts are good candidates for such analysis because of the consistently observed excess of females with cleft palate only (CPO) and excess of males with cleft lip with or without cleft palate (CL/P). Methods Genotypes for 14,486 X-chromosome SNPs in 1,291 Asian and 1,118 European isolated cleft triads were available from a previously published GWAS. The R-package HAPLIN enables genome-wide–level analyses as well as statistical power simulations for a range of biologic scenarios. We analyzed isolated CL/P and isolated CPO for each ethnicity in HAPLIN, using a sliding-window approach to haplotype analysis and two different statistical models, with and without X-inactivation in females. Results There was a larger number of associations in the Asian versus the European sample, and similar to previous reports that have analyzed the same GWAS dataset using different methods, we identified associations with EFNB1/PJA1 and DMD. In addition, new associations were detected with several other genes, among which KLHL4, TBX22, CPXCR1 and BCOR were noteworthy because of their roles in clefting syndromes. A few of the associations were only detected by one particular X-inactivation model, whereas a few others were only detected in one sex. Discussion/Conclusion We found new support for the involvement of X-linked variants in isolated clefts. The associations were specific for ethnicity, sex and model parameterization, highlighting the need for flexible tools that are capable of detecting and estimating such effects. 
Further efforts are needed to verify and elucidate the potential roles of EFNB1/PJA1, KLHL4, TBX22, CPXCR1 and BCOR in isolated clefts. PMID:28877219
Age at menarche and age at natural menopause in East Asian women: a genome-wide association study.
Shi, Jiajun; Zhang, Ben; Choi, Ji-Yeob; Gao, Yu-Tang; Li, Huaixing; Lu, Wei; Long, Jirong; Kang, Daehee; Xiang, Yong-Bing; Wen, Wanqing; Park, Sue K; Ye, Xingwang; Noh, Dong-Young; Zheng, Ying; Wang, Yiqin; Chung, Seokang; Lin, Xu; Cai, Qiuyin; Shu, Xiao-Ou
2016-12-01
Age at menarche (AM) and age at natural menopause (ANM) are complex traits with a high heritability. Abnormal timing of menarche or menopause is associated with a reduced span of fertility and risk for several age-related diseases including breast, endometrial and ovarian cancer, cardiovascular disease, and osteoporosis. To identify novel genetic loci for AM or ANM in East Asian women and to replicate previously identified loci primarily in women of European ancestry by genome-wide association studies (GWASs), we conducted a two-stage GWAS. Stage I aimed to discover promising novel AM and ANM loci using GWAS data of 8073 women from Shanghai, China. The Stage II replication study used the data from another Chinese GWAS (n = 1230 for AM and n = 1458 for ANM), a Korean GWAS (n = 4215 for AM and n = 1739 for ANM), and de novo genotyping of 2877 additional Chinese women. Previous GWAS-identified loci for AM and ANM were also evaluated. We identified two suggestive menarcheal age loci tagged by rs79195475 at 10q21.3 (beta = -0.118 years, P = 3.4 × 10−6) and rs1023935 at 4p15.1 (beta = -0.145 years, P = 4.9 × 10−6) and one menopausal age locus tagged by rs3818134 at 22q12.2 (beta = -0.276 years, P = 8.8 × 10−6). These suggestive loci warrant a further validation in independent populations. Although limited by low statistical power, we replicated 19 of the 98 menarche loci and 5 of the 20 menopause loci previously identified in women of European ancestry in East Asian women, suggesting a shared genetic architecture for these two traits across populations.
van Leeuwen, Elisabeth M; Sabo, Aniko; Bis, Joshua C; Huffman, Jennifer E; Manichaikul, Ani; Smith, Albert V; Feitosa, Mary F; Demissie, Serkalem; Joshi, Peter K; Duan, Qing; Marten, Jonathan; van Klinken, Jan B; Surakka, Ida; Nolte, Ilja M; Zhang, Weihua; Mbarek, Hamdi; Li-Gao, Ruifang; Trompet, Stella; Verweij, Niek; Evangelou, Evangelos; Lyytikäinen, Leo-Pekka; Tayo, Bamidele O; Deelen, Joris; van der Most, Peter J; van der Laan, Sander W; Arking, Dan E; Morrison, Alanna; Dehghan, Abbas; Franco, Oscar H; Hofman, Albert; Rivadeneira, Fernando; Sijbrands, Eric J; Uitterlinden, Andre G; Mychaleckyj, Josyf C; Campbell, Archie; Hocking, Lynne J; Padmanabhan, Sandosh; Brody, Jennifer A; Rice, Kenneth M; White, Charles C; Harris, Tamara; Isaacs, Aaron; Campbell, Harry; Lange, Leslie A; Rudan, Igor; Kolcic, Ivana; Navarro, Pau; Zemunik, Tatijana; Salomaa, Veikko; Kooner, Angad S; Kooner, Jaspal S; Lehne, Benjamin; Scott, William R; Tan, Sian-Tsung; de Geus, Eco J; Milaneschi, Yuri; Penninx, Brenda W J H; Willemsen, Gonneke; de Mutsert, Renée; Ford, Ian; Gansevoort, Ron T; Segura-Lepe, Marcelo P; Raitakari, Olli T; Viikari, Jorma S; Nikus, Kjell; Forrester, Terrence; McKenzie, Colin A; de Craen, Anton J M; de Ruijter, Hester M; Pasterkamp, Gerard; Snieder, Harold; Oldehinkel, Albertine J; Slagboom, P Eline; Cooper, Richard S; Kähönen, Mika; Lehtimäki, Terho; Elliott, Paul; van der Harst, Pim; Jukema, J Wouter; Mook-Kanamori, Dennis O; Boomsma, Dorret I; Chambers, John C; Swertz, Morris; Ripatti, Samuli; Willems van Dijk, Ko; Vitart, Veronique; Polasek, Ozren; Hayward, Caroline; Wilson, James G; Wilson, James F; Gudnason, Vilmundur; Rich, Stephen S; Psaty, Bruce M; Borecki, Ingrid B; Boerwinkle, Eric; Rotter, Jerome I; Cupples, L Adrienne; van Duijn, Cornelia M
2016-01-01
Background So far, more than 170 loci have been associated with circulating lipid levels through genome-wide association studies (GWAS). These associations are largely driven by common variants, their function is often not known, and many are likely to be markers for the causal variants. In this study we aimed to identify more new rare and low-frequency functional variants associated with circulating lipid levels. Methods We used the 1000 Genomes Project as a reference panel for the imputations of GWAS data from ∼60 000 individuals in the discovery stage and ∼90 000 samples in the replication stage. Results Our study resulted in the identification of five new associations with circulating lipid levels at four loci. All four loci are within genes that can be linked biologically to lipid metabolism. One of the variants, rs116843064, is a damaging missense variant within the ANGPTL4 gene. Conclusions This study illustrates that GWAS with high-scale imputation may still help us unravel the biological mechanism behind circulating lipid levels. PMID:27036123
Transferability of genome-wide associated loci for asthma in African Americans
Faruque, Mezbah U.; Chen, Guanjie; Doumatey, Ayo P.; Zhou, Jie; Huang, Hanxia; Shriner, Daniel; Adeyemo, Adebowale A.; Rotimi, Charles N.; Dunston, Georgia M.
2017-01-01
Objective Transferability of significantly associated loci or GWAS “hits” adds credibility to genotype-disease associations and provides evidence for generalizability across different ancestral populations. We sought evidence of association of known asthma-associated single nucleotide polymorphisms (SNPs) in an African American population. Methods Subjects comprised 661 participants (261 asthma cases and 400 controls) from the Howard University Family Study. Forty-eight SNPs previously reported to be associated with asthma by GWAS were selected for testing. We adopted a combined strategy, first using an “exact” approach in which we looked up only the reported index SNP. For those index SNPs missing from our dataset, we used a “local” approach that examined all the regional SNPs in LD with the index SNP. Results Out of the 48 SNPs, our cohort had genotype data available for 27, which were examined for exact replication. Of these, two SNPs were positively associated with asthma: rs10508372 (OR = 1.567 [95%CI, 1.133–2.167], P = 0.0066) and rs2378383 (OR = 2.147 [95%CI, 1.149–4.013], P = 0.0166), located on chromosomal bands 10p14 and 9q21.31, respectively. Local replication of the remaining 21 loci showed association at two chromosomal loci (9p24.1-rs2381413 and 6p21.32-rs3132947; Bonferroni-corrected P values: 0.0033 and 0.0197, respectively). Of note, multiple SNPs in LD with rs2381413 located upstream of IL33 were significantly associated with asthma. Conclusions This study has successfully transferred four reported asthma-associated loci to an independent African American population. Identification of several asthma-associated SNPs upstream of IL33, a gene previously implicated in allergic inflammation of the asthmatic airway, supports the generalizability of this finding. PMID:27177148
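The quantities reported above (an odds ratio with a 95% confidence interval, and Bonferroni-corrected P values) can be illustrated with a short sketch. It assumes a simple 2×2 table of carrier counts with a Wald interval on the log odds ratio. The study's own estimates may well come from a regression model instead, and the counts below are invented.

```python
import math

def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: cases with the risk allele,    b: cases without it
    c: controls with the risk allele, d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-corrected p-value: multiply by the number of tests, cap at 1."""
    return min(1.0, p_value * n_tests)

# Invented counts: 20/10 carriers/non-carriers among cases, 10/20 among controls.
or_, lo, hi = odds_ratio_ci(20, 10, 10, 20)
adjusted = bonferroni(0.01, 21)
```

An interval that excludes 1 corresponds to a nominally significant association at the chosen level; the Bonferroni step then accounts for the number of SNPs examined in a region.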
Zhang, Qingrun; Long, Quan; Ott, Jurg
2014-06-01
Identifying gene-gene interaction is a hot topic in genome-wide association studies. Two fundamental challenges are: (1) how to smartly identify combinations of variants that may be associated with the trait from the astronomical number of all possible combinations; and (2) how to test epistatic interaction when all potential combinations are available. We developed AprioriGWAS, which brings two innovations. (1) Based on Apriori, a successful method in the field of Frequent Itemset Mining (FIM) in which a pattern growth strategy is leveraged to effectively and accurately reduce the search space, AprioriGWAS can efficiently identify genetically associated genotype patterns. (2) To test the hypotheses of epistasis, we adopt a new conditional permutation procedure to obtain reliable statistical inference of Pearson's chi-square test for the [Formula: see text] contingency table generated by associated variants. By applying AprioriGWAS to age-related macular degeneration (AMD) data, we found that: (1) angiopoietin 1 (ANGPT1) and four retinal genes interact with Complement Factor H (CFH). (2) GO term "glycosaminoglycan biosynthetic process" was enriched in AMD interacting genes. The epistatic interactions newly found by AprioriGWAS on AMD data are likely true interactions, since genes interacting with CFH are retinal genes, and GO term enrichment also verified that interaction between glycosaminoglycans (GAGs) and CFH plays an important role in disease pathology of AMD. By applying AprioriGWAS to bipolar disorder in WTCCC data, we found that variants without marginal effect show significant interactions, for example multiple-SNP genotype patterns inside the genes GABRB2 and GRIA1 (AMPA subunit 1 receptor gene). AMPARs are found in many parts of the brain and are the most commonly found receptor in the nervous system. GABRB2 mediates the fastest inhibitory synaptic transmission in the central nervous system. GRIA1 and GABRB2 are relevant to mental disorders, as supported by multiple lines of evidence.
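The search-space reduction that Apriori provides can be illustrated with a generic frequent-itemset sketch. This is the textbook Apriori idea (a k-item pattern can be frequent only if all of its (k-1)-item sub-patterns are frequent), not the AprioriGWAS implementation; the function name, the item labels such as "snp1=AA", and the support threshold are assumptions made for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(pattern): count} for every pattern whose count
    across the transactions is at least min_support."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = list(frequent)
        candidates = set()
        # Join step: merge frequent (k-1)-patterns into k-item candidates
        for i in range(len(prev)):
            for j in range(i + 1, len(prev)):
                union = prev[i] | prev[j]
                # Prune step: keep a candidate only if all its
                # (k-1)-item subsets are themselves frequent
                if len(union) == k and all(
                    frozenset(sub) in frequent
                    for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result

# Hypothetical genotype "transactions", one per individual
individuals = [
    {"snp1=AA", "snp2=CT", "snp3=GG"},
    {"snp1=AA", "snp2=CT"},
    {"snp1=AA", "snp3=GG"},
    {"snp2=CT", "snp3=GG"},
    {"snp1=AA", "snp2=CT", "snp3=GG"},
]
for pattern, count in sorted(apriori(individuals, 3).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(pattern), count)
```

In this toy input every pair of genotypes reaches the threshold, while the three-way pattern does not, so growth stops at level 3 without enumerating all possible combinations.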
The 19q12 bladder cancer GWAS signal: association with cyclin E function and aggressive disease
Fu, Yi-Ping; Kohaar, Indu; Moore, Lee E.; Lenz, Petra; Figueroa, Jonine D.; Tang, Wei; Porter-Gill, Patricia; Chatterjee, Nilanjan; Scott-Johnson, Alexandra; Garcia-Closas, Montserrat; Muchmore, Brian; Baris, Dalsu; Paquin, Ashley; Ylaya, Kris; Schwenn, Molly; Apolo, Andrea B.; Karagas, Margaret R.; Tarway, McAnthony; Johnson, Alison; Mumy, Adam; Schned, Alan; Guedez, Liliana; Jones, Michael A.; Kida, Masatoshi; Monawar Hosain, GM; Malats, Nuria; Kogevinas, Manolis; Tardon, Adonina; Serra, Consol; Carrato, Alfredo; Garcia-Closas, Reina; Lloreta, Josep; Wu, Xifeng; Purdue, Mark; Andriole, Gerald L.; Grubb, Robert L.; Black, Amanda; Landi, Maria T.; Caporaso, Neil E.; Vineis, Paolo; Siddiq, Afshan; Bueno-de-Mesquita, H. Bas; Trichopoulos, Dimitrios; Ljungberg, Börje; Severi, Gianluca; Weiderpass, Elisabete; Krogh, Vittorio; Dorronsoro, Miren; Travis, Ruth C.; Tjønneland, Anne; Brennan, Paul; Chang-Claude, Jenny; Riboli, Elio; Prescott, Jennifer; Chen, Constance; De Vivo, Immaculata; Govannucci, Edward; Hunter, David; Kraft, Peter; Lindstrom, Sara; Gapstur, Susan M.; Jacobs, Eric J.; Diver, W. Ryan; Albanes, Demetrius; Weinstein, Stephanie J.; Virtamo, Jarmo; Kooperberg, Charles; Hohensee, Chancellor; Rodabough, Rebecca J.; Cortessis, Victoria K.; Conti, David V.; Gago-Dominguez, Manuela; Stern, Mariana C.; Pike, Malcolm C.; Van Den Berg, David; Yuan, Jian-Min; Haiman, Christopher A.; Cussenot, Olivier; Cancel-Tassin, Geraldine; Roupret, Morgan; Comperat, Eva; Porru, Stefano; Carta, Angela; Pavanello, Sofia; Arici, Cecilia; Mastrangelo, Giuseppe; Grossman, H. Barton; Wang, Zhaoming; Deng, Xiang; Chung, Charles C.; Hutchinson, Amy; Burdette, Laurie; Wheeler, William; Fraumeni, Joseph; Chanock, Stephen J.; Hewitt, Stephen M.; Silverman, Debra T.; Rothman, Nathaniel; Prokunina-Olsson, Ludmila
2014-01-01
A genome-wide association study (GWAS) of bladder cancer identified a genetic marker rs8102137 within the 19q12 region as a novel susceptibility variant. This marker is located upstream of the CCNE1 gene, which encodes cyclin E, a cell cycle protein. We performed genetic fine mapping analysis of the CCNE1 region using data from two bladder cancer GWAS (5,942 cases and 10,857 controls). We found that the original GWAS marker rs8102137 represents a group of 47 linked SNPs (with r² ≥ 0.7) associated with increased bladder cancer risk. From this group we selected a functional promoter variant rs7257330, which showed strong allele-specific binding of nuclear proteins in several cell lines. In both GWAS, rs7257330 was associated only with aggressive bladder cancer, with a combined per-allele odds ratio (OR) = 1.18 (95% CI = 1.09-1.27, p = 4.67×10−5) vs. OR = 1.01 (95% CI = 0.93-1.10, p = 0.79) for non-aggressive disease, with p = 0.0015 for case-only analysis. Cyclin E protein expression analyzed in 265 bladder tumors was increased in aggressive tumors (p = 0.013) and, independently, with each rs7257330-A risk allele (ptrend = 0.024). Over-expression of recombinant cyclin E in cell lines caused significant acceleration of the cell cycle. In conclusion, we defined the 19q12 signal as the first GWAS signal specific for aggressive bladder cancer. Molecular mechanisms of this genetic association may be related to cyclin E over-expression and alteration of the cell cycle in carriers of CCNE1 risk variants. In combination with established bladder cancer risk factors and other somatic and germline genetic markers, the CCNE1 variants could be useful for inclusion into bladder cancer risk prediction models. PMID:25320178
[Genome-wide association study for adolescent idiopathic scoliosis].
Ogura, Yoji; Kou, Ikuyo; Scoliosis, Japan; Matsumoto, Morio; Watanabe, Kota; Ikegawa, Shiro
2016-04-01
Adolescent idiopathic scoliosis (AIS) is a polygenic disease. Genome-wide association studies (GWASs) have been performed for many polygenic diseases. For AIS, we conducted a GWAS and identified the first AIS locus near LBX1. After the discovery, we extended our study by increasing the numbers of subjects and SNPs. In total, our Japanese GWAS has identified four susceptibility genes. GWASs for AIS have also been performed in the USA and China, which identified one and three susceptibility genes, respectively. Here we review GWASs in Japan and abroad and functional analysis to clarify the pathomechanism of AIS.
Bonfiglio, F; Henström, M; Nag, A; Hadizadeh, F; Zheng, T; Cenit, M C; Tigchelaar, E; Williams, F; Reznichenko, A; Ek, W E; Rivera, N V; Homuth, G; Aghdassi, A A; Kacprowski, T; Männikkö, M; Karhunen, V; Bujanda, L; Rafter, J; Wijmenga, C; Ronkainen, J; Hysi, P; Zhernakova, A; D'Amato, M
2018-04-19
Irritable bowel syndrome (IBS) shows genetic predisposition; however, large-scale, powered gene mapping studies are lacking. We sought to exploit existing genetic (genotype) and epidemiological (questionnaire) data from a series of population-based cohorts for IBS genome-wide association studies (GWAS) and their meta-analysis. Based on questionnaire data compatible with Rome III Criteria, we identified a total of 1335 IBS cases and 9768 asymptomatic individuals from 5 independent European genotyped cohorts. Individual GWAS were carried out with sex-adjusted logistic regression under an additive model, followed by meta-analysis using the inverse variance method. Functional annotation of significant results was obtained via a computational pipeline exploiting ontology and interaction networks, and tissue-specific and gene set enrichment analyses. Suggestive GWAS signals (P ≤ 5.0 × 10−6) were detected for 7 genomic regions, harboring 64 gene candidates to affect IBS risk via functional or expression changes. Functional annotation of this gene set convincingly (best FDR-corrected P = 3.1 × 10−10) highlighted regulation of ion channel activity as the most plausible pathway affecting IBS risk. Our results confirm the feasibility of population-based studies for gene-discovery efforts in IBS, identify risk genes and loci to be prioritized in independent follow-ups, and pinpoint ion channels as important players and potential therapeutic targets warranting further investigation. © 2018 John Wiley & Sons Ltd.
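The inverse variance method mentioned above combines per-cohort effect estimates by weighting each with the reciprocal of its squared standard error. The following is a minimal fixed-effect sketch; the function name and the example numbers are hypothetical, and the study's actual pipeline (software, heterogeneity handling) is not described in the abstract.

```python
import math

def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance meta-analysis of one SNP.

    betas: per-cohort effect estimates (e.g. log odds ratios)
    ses:   their standard errors
    Returns (pooled beta, pooled SE, z-score, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p

# Hypothetical per-cohort log odds ratios and standard errors
beta, se, z, p = inverse_variance_meta([0.20, 0.20], [0.10, 0.10])
print(f"beta = {beta:.3f}, SE = {se:.4f}, z = {z:.2f}, P = {p:.4f}")
```

Cohorts with smaller standard errors (typically the larger ones) dominate the pooled estimate, which is the intended behaviour of this weighting.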
CONAN: copy number variation analysis software for genome-wide association studies
2010-01-01
Background Genome-wide association studies (GWAS) based on single nucleotide polymorphisms (SNPs) revolutionized our perception of the genetic regulation of complex traits and diseases. Copy number variations (CNVs) promise to shed additional light on the genetic basis of monogenic as well as complex diseases and phenotypes. Indeed, the number of detected associations between CNVs and certain phenotypes are constantly increasing. However, while several software packages support the determination of CNVs from SNP chip data, the downstream statistical inference of CNV-phenotype associations is still subject to complicated and inefficient in-house solutions, thus strongly limiting the performance of GWAS based on CNVs. Results CONAN is a freely available client-server software solution which provides an intuitive graphical user interface for categorizing, analyzing and associating CNVs with phenotypes. Moreover, CONAN assists the evaluation process by visualizing detected associations via Manhattan plots in order to enable a rapid identification of genome-wide significant CNV regions. Various file formats including the information on CNVs in population samples are supported as input data. Conclusions CONAN facilitates the performance of GWAS based on CNVs and the visual analysis of calculated results. CONAN provides a rapid, valid and straightforward software solution to identify genetic variation underlying the 'missing' heritability for complex traits that remains unexplained by recent GWAS. The freely available software can be downloaded at http://genepi-conan.i-med.ac.at. PMID:20546565
Langlois, Christine; Abadi, Arkan; Peralta-Romero, Jesus; Alyass, Akram; Suarez, Fernando; Gomez-Zamudio, Jaime; Burguete-Garcia, Ana I.; Yazdi, Fereshteh T.; Cruz, Miguel; Meyre, David
2016-01-01
Genome wide association studies (GWAS) have identified single-nucleotide polymorphisms (SNPs) that are associated with fasting plasma glucose (FPG) in adult European populations. The contribution of these SNPs to FPG in non-Europeans and children is unclear. We studied the association of 15 GWAS SNPs and a genotype score (GS) with FPG and 7 metabolic traits in 1,421 Mexican children and adolescents from Mexico City. Genotyping of the 15 SNPs was performed using TaqMan Open Array. We used multivariate linear regression models adjusted for age, sex, body mass index standard deviation score, and recruitment center. We identified significant associations between 3 SNPs (G6PC2 (rs560887), GCKR (rs1260326), MTNR1B (rs10830963)), the GS and FPG level. The FPG risk alleles of 11 out of the 15 SNPs (73.3%) displayed significant or non-significant beta values for FPG directionally consistent with those reported in adult European GWAS. The risk allele frequencies for 11 of 15 (73.3%) SNPs differed significantly in Mexican children and adolescents compared to European adults from the 1000G Project, but no significant enrichment in FPG risk alleles was observed in the Mexican population. Our data support a partial transferability of European GWAS FPG association signals in children and adolescents from the admixed Mexican population. PMID:27782183
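A genotype score of the kind described above is commonly built by counting risk alleles across the selected SNPs. The sketch below assumes an unweighted count (0, 1 or 2 risk alleles per SNP); the abstract does not state whether the score was weighted, and the SNP labels and alleles shown are placeholders rather than the study's actual coding.

```python
def genotype_score(genotypes, risk_alleles):
    """Unweighted genotype score: total number of risk alleles carried.

    genotypes:    dict mapping SNP label -> two-letter genotype, e.g. "CT"
    risk_alleles: dict mapping SNP label -> risk allele, e.g. "T"
    """
    score = 0
    for snp, risk in risk_alleles.items():
        # str.count gives 0, 1 or 2 copies of the risk allele
        score += genotypes[snp].count(risk)
    return score

# Placeholder SNP labels and alleles, for illustration only
risk = {"snp1": "C", "snp2": "T", "snp3": "G"}
child = {"snp1": "CC", "snp2": "CT", "snp3": "TT"}
print(genotype_score(child, risk))
```

The resulting integer can then be entered as a single predictor in a regression model alongside covariates such as age and sex.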
Pooled Genome-Wide Analysis to Identify Novel Risk Loci for Pediatric Allergic Asthma
Ricci, Giampaolo; Astolfi, Annalisa; Remondini, Daniel; Cipriani, Francesca; Formica, Serena; Dondi, Arianna; Pession, Andrea
2011-01-01
Background Genome-wide association studies of pooled DNA samples have been shown to be a valuable tool to identify candidate SNPs associated with a phenotype. No such study has yet been applied to childhood allergic asthma, even though the very high complexity of asthma genetics makes it an appropriate field in which to explore the potential of the pooled GWAS approach. Methodology/Principal Findings We performed a pooled GWAS and individual genotyping in 269 children with allergic respiratory diseases, comparing allergic children with and without asthma. We used a modular approach to identify the most significant loci associated with asthma by combining silhouette statistics and a physical distance method with cluster-adapted thresholding. We found 97% concordance between pooled GWAS and individual genotyping, with 36 out of 37 top-scoring SNPs significant at the individual genotyping level. The most significant SNP is located inside the coding sequence of C5, an already identified asthma susceptibility gene, while the other loci regulate functions that are relevant to bronchial physiopathology, such as immune- or inflammation-mediated mechanisms and airway smooth muscle contraction. Integration with gene expression data showed that almost half of the putative susceptibility genes are differentially expressed in experimental asthma mouse models. Conclusion/Significance Combined silhouette statistics and cluster-adapted physical distance threshold analysis of pooled GWAS data is an efficient method to identify candidate SNPs associated with asthma development in an allergic pediatric population. PMID:21359210
Wei, Lijuan; Qu, Cunmin; Xu, Xinfu; Lu, Kun; Qian, Wei; Li, Jiana; Li, Maoteng; Liu, Liezhao
2015-01-01
A stable yellow-seeded variety is the breeding goal for obtaining the ideal rapeseed (Brassica napus L.) plant, and the amount of acid detergent lignin (ADL) in the seeds and the hull content (HC) are often used as yellow-seeded rapeseed screening indices. In this study, a genome-wide association analysis of 520 accessions was performed using the Q + K model with a total of 31,839 single-nucleotide polymorphism (SNP) sites. As a result, three significant associations on the B. napus chromosomes A05, A09, and C05 were detected for seed ADL content. The peak SNPs were within 9.27, 14.22, and 20.86 kb of the key genes BnaA.PAL4, BnaA.CAD2/BnaA.CAD3, and BnaC.CCR1, respectively. Further analyses were performed on the major locus of A05, which was also detected in the seed HC examination. A comparison of our genome-wide association study (GWAS) results and previous linkage mappings revealed a common chromosomal region on A09, which indicates that GWAS can be used as a powerful complementary strategy for dissecting complex traits in B. napus. Genomic selection (GS) utilizing the significant SNP markers based on the GWAS results exhibited increased predictive ability, indicating that the predictive ability of a given model can be substantially improved by using GWAS and GS. PMID:26673885
Genome-wide association studies of obesity and metabolic syndrome.
Fall, Tove; Ingelsson, Erik
2014-01-25
Until just a few years ago, the genetic determinants of obesity and metabolic syndrome were largely unknown, with the exception of a few forms of monogenic extreme obesity. Since genome-wide association studies (GWAS) became available, large advances have been made. The first single nucleotide polymorphism robustly associated with increased body mass index (BMI) was mapped in 2007 to a gene whose function was unknown at the time. This gene, now known as fat mass and obesity associated (FTO), has been repeatedly replicated in several ethnicities and affects obesity by regulating appetite. Since the first report from a GWAS of obesity, an increasing number of markers have been shown to be associated with BMI, other measures of obesity or fat distribution and metabolic syndrome. This systematic review of obesity GWAS will summarize genome-wide significant findings for obesity and metabolic syndrome and briefly give a few suggestions of what is to be expected in the next few years. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Further support for association between GWAS variant for positive emotion and reward systems.
Lancaster, T M; Ihssen, N; Brindley, L M; Linden, D E J
2017-01-31
A recent genome-wide association study (GWAS) identified a significant single-nucleotide polymorphism (SNP) for trait-positive emotion at rs322931 on chromosome 1, which was also associated with brain activation in the reward system of healthy individuals when observing positive stimuli in a functional magnetic resonance imaging (fMRI) study. In the current study, we aimed to further validate the role of variation at rs322931 in reward processing. Using a similar fMRI approach, we use two paradigms that elicit a strong ventral striatum (VS) blood oxygen level-dependent (BOLD) response in a sample of young, healthy individuals (N=82). In the first study we use a similar picture-viewing task to the discovery sample (positive>neutral stimuli) to replicate an effect of the variant on emotion processing. In the second study we use a probabilistic reversal learning procedure to identify reward processing during decision-making under uncertainty (reward>punishment). In a region of interest (ROI) analysis of the bilateral VS, we show that the rs322931 genotype was associated with BOLD in the left VS during the positive>neutral contrast (P ROI-CORRECTED =0.045) and during the reward>punishment contrast (P ROI-CORRECTED =0.018), although the effect of passive picture viewing was in the opposite direction from that reported in the discovery sample. These findings suggest that the recently identified GWAS hit may influence positive emotion via individual differences in activity in the key hubs of the brain's reward system. Furthermore, these effects may not be limited to the passive viewing of positive emotional scenes, but may also be observed during dynamic decision-making. This study suggests that future studies of this GWAS locus may yield further insight into the biological mechanisms of psychopathologies characterised by deficits in reward processing and positive emotion.
Almli, Lynn M; Duncan, Richard; Feng, Hao; Ghosh, Debashis; Binder, Elisabeth B; Bradley, Bekh; Ressler, Kerry J; Conneely, Karen N; Epstein, Michael P
2014-12-01
Genetic association studies of psychiatric outcomes often consider interactions with environmental exposures and, in particular, apply tests that jointly consider gene and gene-environment interaction effects for analysis. Using a genome-wide association study (GWAS) of posttraumatic stress disorder (PTSD), we report that heteroscedasticity (defined as variability in outcome that differs by the value of the environmental exposure) can invalidate traditional joint tests of gene and gene-environment interaction. To identify the cause of bias in traditional joint tests of gene and gene-environment interaction in a PTSD GWAS and determine whether proposed robust joint tests are insensitive to this problem. The PTSD GWAS data set consisted of 3359 individuals (978 men and 2381 women) from the Grady Trauma Project (GTP), a cohort study from Atlanta, Georgia. The GTP performed genome-wide genotyping of participants and collected environmental exposures using the Childhood Trauma Questionnaire and Trauma Experiences Inventory. We performed joint interaction testing of the Beck Depression Inventory and modified PTSD Symptom Scale in the GTP GWAS. We assessed systematic bias in our interaction analyses using quantile-quantile plots and genome-wide inflation factors. Application of the traditional joint interaction test to the GTP GWAS yielded systematic inflation across different outcomes and environmental exposures (inflation-factor estimates ranging from 1.07 to 1.21), whereas application of the robust joint test to the same data set yielded no such inflation (inflation-factor estimates ranging from 1.01 to 1.02). Simulated data further revealed that the robust joint test is valid in different heteroscedasticity models, whereas the traditional joint test is invalid. The robust joint test also has power similar to the traditional joint test when heteroscedasticity is not an issue. 
We believe the robust joint test should be used in candidate-gene studies and GWASs of psychiatric outcomes that consider environmental interactions. To make the procedure useful for applied investigators, we created a software tool that can be called from the popular PLINK package for analysis.
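The genome-wide inflation factors quoted above can be estimated from a set of test p-values. The sketch below uses one common convention, assumed here rather than taken from the paper: each p-value is converted to an equivalent 1-df chi-square statistic and the median is divided by the expected median under the null (about 0.455). The function name and the simulated inputs are hypothetical.

```python
from statistics import NormalDist, median

_NORMAL = NormalDist()
# Median of a 1-df chi-square distribution: (Phi^-1(0.75))^2, about 0.4549
CHI2_1DF_MEDIAN = _NORMAL.inv_cdf(0.75) ** 2

def genomic_inflation(p_values):
    """Genomic inflation factor (lambda) from p-values in the range (0, 1]."""
    # A two-sided p-value p corresponds to a z-score of Phi^-1(p / 2);
    # squaring it gives the equivalent 1-df chi-square statistic.
    chi2 = [_NORMAL.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN

# Evenly spread p-values mimic a well-calibrated test (lambda close to 1)
uniform_p = [(i + 0.5) / 1000 for i in range(1000)]
# Squaring them shifts mass toward small values, mimicking inflation
inflated_p = [p ** 2 for p in uniform_p]
print(round(genomic_inflation(uniform_p), 3), round(genomic_inflation(inflated_p), 3))
```

Values close to 1, such as the 1.01 to 1.02 reported for the robust test, indicate little systematic inflation, whereas values like 1.07 to 1.21 indicate test statistics that are larger than expected under the null.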
Logue, Mark W; Amstadter, Ananda B; Baker, Dewleen G; Duncan, Laramie; Koenen, Karestan C; Liberzon, Israel; Miller, Mark W; Morey, Rajendra A; Nievergelt, Caroline M; Ressler, Kerry J; Smith, Alicia K; Smoller, Jordan W; Stein, Murray B; Sumner, Jennifer A; Uddin, Monica
2015-01-01
The development of posttraumatic stress disorder (PTSD) is influenced by genetic factors. Although there have been some replicated candidates, the identification of risk variants for PTSD has lagged behind genetic research of other psychiatric disorders such as schizophrenia, autism, and bipolar disorder. Psychiatric genetics has moved beyond examination of specific candidate genes in favor of the genome-wide association study (GWAS) strategy of very large numbers of samples, which allows for the discovery of previously unsuspected genes and molecular pathways. The successes of genetic studies of schizophrenia and bipolar disorder have been aided by the formation of a large-scale GWAS consortium: the Psychiatric Genomics Consortium (PGC). In contrast, only a handful of GWAS of PTSD have appeared in the literature to date. Here we describe the formation of a group dedicated to large-scale study of PTSD genetics: the PGC-PTSD. The PGC-PTSD faces challenges related to the contingency on trauma exposure and the large degree of ancestral genetic diversity within and across participating studies. Using the PGC analysis pipeline supplemented by analyses tailored to address these challenges, we anticipate that our first large-scale GWAS of PTSD will comprise over 10 000 cases and 30 000 trauma-exposed controls. Following in the footsteps of our PGC forerunners, this collaboration—of a scope that is unprecedented in the field of traumatic stress—will lead the search for replicable genetic associations and new insights into the biological underpinnings of PTSD. PMID:25904361
Hamidi Hay, E; Roberts, A
2017-04-01
Longevity is a highly important trait for the efficiency of beef cattle production. The objective of this study was to evaluate the genomic prediction of longevity and identify genomic regions associated with this trait. The data used in this study consisted of 547 Composite Gene Combination cows (1/2 Red Angus, 1/4 Charolais, 1/4 Tarentaise) born from 2002 to 2011 and genotyped with the Illumina BovineSNP50 BeadChip. Three models were used to assess genomic prediction: Bayes A, Bayes B and GBLUP using a genomic relationship matrix. To identify genomic regions associated with longevity, 2 approaches were adopted: single-marker genome-wide association and a Bayesian approach using GenSel software. The genomic prediction accuracy was low: 0.28, 0.25, and 0.22 for Bayes A, Bayes B and GBLUP, respectively. The single-marker genome-wide association study (GWAS) identified 5 loci with a P-value less than 0.05 after false discovery correction: UA-IFASA-7571 on chromosome 19 (58.03 Mb), ARS-BFGL-BAC-15059 on BTA1 (28.8 Mb), ARS-BFGL-NGS-104159 on BTA3 (29.4 Mb), ARS-BFGL-NGS-32882 on BTA9 (104.07 Mb) and ARS-BFGL-NGS-32883 on BTA25 (33.77 Mb). The Bayesian GWAS yielded 4 genomic regions overlapping with the single-marker GWAS results. The region with the highest percentage of genomic variance (3.73%) was detected on chromosome 19. Both GWAS approaches adopted in this study showed evidence for association with various chromosomal locations.
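The "false discovery correction" applied to the single-marker results is not specified further in the abstract. A widely used choice is the Benjamini-Hochberg procedure, sketched below purely as an assumption; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical marker p-values
raw = [0.01, 0.04, 0.03, 0.005]
print([round(q, 4) for q in benjamini_hochberg(raw)])
```

Markers whose adjusted value falls below the chosen threshold (0.05 in the abstract) would be declared significant under this procedure.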
Genome-wide Association Study for Ovarian Cancer Susceptibility using Pooled DNA
Lu, Yi; Chen, Xiaoqing; Beesley, Jonathan; Johnatty, Sharon E.; deFazio, Anna; Lambrechts, Sandrina; Lambrechts, Diether; Despierre, Evelyn; Vergotes, Ignace; Chang-Claude, Jenny; Hein, Rebecca; Nickels, Stefan; Wang-Gohrke, Shan; Dörk, Thilo; Dürst, Matthias; Antonenkova, Natalia; Bogdanova, Natalia; Goodman, Marc T.; Lurie, Galina; Wilkens, Lynne R.; Carney, Michael E.; Butzow, Ralf; Nevanlinna, Heli; Heikkinen, Tuomas; Leminen, Arto; Kiemeney, Lambertus A.; Massuger, Leon F.A.G.; van Altena, Anne M.; Aben, Katja K.; Kjaer, Susanne Krüger; Høgdall, Estrid; Jensen, Allan; Brooks-Wilson, Angela; Le, Nhu; Cook, Linda; Earp, Madalene; Kelemen, Linda; Easton, Douglas; Pharoah, Paul; Song, Honglin; Tyrer, Jonathan; Ramus, Susan; Menon, Usha; Gentry-Maharaj, Alexandra; Gayther, Simon A.; Bandera, Elisa V.; Olson, Sara H.; Orlow, Irene; Rodriguez-Rodriguez, Lorna
2013-01-01
Recent genome-wide association studies (GWAS) have identified four low-penetrance ovarian cancer susceptibility loci. We hypothesized that further moderate or low penetrance variants exist among the subset of SNPs not well tagged by the genotyping arrays used in the previous studies, which would account for some of the remaining risk. We therefore conducted a time- and cost-effective stage 1 GWAS on 342 invasive serous cases and 643 controls genotyped on pooled DNA using the high density Illumina 1M-Duo array. We followed up 20 of the most significantly associated SNPs, which are not well tagged by the lower density arrays used by the published GWAS, and genotyped them on individual DNA. Most of the top 20 SNPs were clearly validated by individually genotyping the samples used in the pools. However, none of the 20 SNPs replicated when tested for association in a much larger stage 2 set of 4,651 cases and 6,966 controls from the Ovarian Cancer Association Consortium. Given that most of the top 20 SNPs from pooling were validated in the same samples by individual genotyping, the lack of replication is likely to be due to the relatively small sample size in our stage 1 GWAS rather than due to problems with the pooling approach. We conclude that there are unlikely to be any moderate or large effects on ovarian cancer risk untagged by the less dense arrays. However, our study lacked power to make clear statements on the existence of hitherto untagged small effect variants. PMID:22794196
GWASinlps: Nonlocal prior based iterative SNP selection tool for genome-wide association studies.
Sanyal, Nilotpal; Lo, Min-Tzu; Kauppi, Karolina; Djurovic, Srdjan; Andreassen, Ole A; Johnson, Valen E; Chen, Chi-Hua
2018-06-19
Multiple marker analysis of the genome-wide association study (GWAS) data has gained ample attention in recent years. However, because of the ultra high-dimensionality of GWAS data, such analysis is challenging. Frequently used penalized regression methods often lead to large number of false positives, whereas Bayesian methods are computationally very expensive. Motivated to ameliorate these issues simultaneously, we consider the novel approach of using nonlocal priors in an iterative variable selection framework. We develop a variable selection method, named, iterative nonlocal prior based selection for GWAS, or GWASinlps, that combines, in an iterative variable selection framework, the computational efficiency of the screen-and-select approach based on some association learning and the parsimonious uncertainty quantification provided by the use of nonlocal priors. The hallmark of our method is the introduction of 'structured screen-and-select' strategy, that considers hierarchical screening, which is not only based on response-predictor associations, but also based on response-response associations, and concatenates variable selection within that hierarchy. Extensive simulation studies with SNPs having realistic linkage disequilibrium structures demonstrate the advantages of our computationally efficient method compared to several frequentist and Bayesian variable selection methods, in terms of true positive rate, false discovery rate, mean squared error, and effect size estimation error. Further, we provide empirical power analysis useful for study design. Finally, a real GWAS data application was considered with human height as phenotype. An R-package for implementing the GWASinlps method is available at https://cran.r-project.org/web/packages/GWASinlps/index.html. Supplementary data are available at Bioinformatics online.
Genome-wide association study (GWAS) for molar-incisor hypomineralization (MIH).
Kühnisch, Jan; Thiering, Elisabeth; Heitmüller, Daniela; Tiesler, Carla M T; Grallert, Harald; Heinrich-Weltzien, Roswitha; Hickel, Reinhard; Heinrich, Joachim
2014-01-01
This genome-wide association study (GWAS) investigated the relationship between molar-incisor hypomineralization (MIH) and possible genetic loci. Clinical and genetic data from the 10-year follow-up of 668 children from the Munich GINI-plus and LISA-plus birth cohort studies were analyzed. The dental examinations included the diagnosis of MIH according to the criteria of the European Academy of Paediatric Dentistry (EAPD). Children with MIH were categorized as those with a minimum of one hypomineralized first permanent molar. A GWAS was implemented following a quality-control step and an additive genetic effect was assumed. A total of 2,013,491 single-nucleotide polymorphisms (SNPs) were available for analysis. Rs13058467, which is located near the SCUBE1 gene on chromosome 22 (p < 3.72E-7), was identified as a possible locus linked to MIH when using a threshold of p value <1E-6. After considering the limitations of the present study (e.g., limited sample size and lack of an independent replication sample), it can be concluded that (1) replication analyses in an independent cohort study are strongly recommended and (2) large-scale and well-powered studies are needed to investigate a possible genetic link to MIH.
Evaluation of 19 susceptibility loci of breast cancer in women of African ancestry
Huo, Dezheng; Zheng, Yonglan; Ogundiran, Temidayo O.; Adebamowo, Clement; Nathanson, Katherine L.; Domchek, Susan M.; Rebbeck, Timothy R.; Simon, Michael S.; John, Esther M.; Hennis, Anselm; Nemesure, Barbara; Wu, Suh-Yuh; Leske, M.Cristina; Ambs, Stefan; Niu, Qun; Zhang, Jing; Cox, Nancy J.; Olopade, Olufunmilayo I.
2012-01-01
Multiple breast cancer susceptibility loci have been identified in genome-wide association studies (GWAS) in populations of European and Asian ancestry using array chips optimized for populations of European ancestry. It is important to examine whether these loci are associated with breast cancer risk in women of African ancestry. We evaluated 25 single nucleotide polymorphisms (SNPs) at 19 loci in a pooled case–control study of breast cancer, which included 1509 cases and 1383 controls. Cases and controls were enrolled in Nigeria, Barbados and the USA; all women were of African ancestry. We found significant associations for three SNPs, which were in the same direction and of similar magnitude as those reported in previous fine-mapping studies in women of African ancestry. The allelic odds ratios were 1.24 [95% confidence interval (CI): 1.04–1.47; P = 0.018] for the rs2981578-G allele (10q26/FGFR2), 1.34 (95% CI: 1.10–1.63; P = 0.0035) for the rs9397435-G allele (6q25) and 1.12 (95% CI: 1.00–1.25; P = 0.04) for the rs3104793-C allele (16q12). Although a significant association was observed for an additional index SNP (rs3817198), it was in the opposite direction to prior GWAS studies. In conclusion, this study highlights the complexity of applying current GWAS findings across racial/ethnic groups, as none of GWAS-identified index SNPs could be replicated in women of African ancestry. Further fine-mapping studies in women of African ancestry will be needed to reveal additional and causal variants for breast cancer. PMID:22357627
Saykin, Andrew J.; Shen, Li; Foroud, Tatiana M.; Potkin, Steven G.; Swaminathan, Shanker; Kim, Sungeun; Risacher, Shannon L.; Nho, Kwangsik; Huentelman, Matthew J.; Craig, David W.; Thompson, Paul M.; Stein, Jason L.; Moore, Jason H.; Farrer, Lindsay A.; Green, Robert C.; Bertram, Lars; Jack, Clifford R.; Weiner, Michael W.
2010-01-01
The role of the Alzheimer’s Disease Neuroimaging Initiative Genetics Core is to facilitate the investigation of genetic influences on disease onset and trajectory as reflected in structural, functional, and molecular imaging changes; fluid biomarkers; and cognitive status. Major goals include (1) blood sample processing, genotyping, and dissemination, (2) genome-wide association studies (GWAS) of longitudinal phenotypic data, and (3) providing a central resource, point of contact and planning group for genetics within Alzheimer’s Disease Neuroimaging Initiative. Genome-wide array data have been publicly released and updated, and several neuroimaging GWAS have recently been reported examining baseline magnetic resonance imaging measures as quantitative phenotypes. Other preliminary investigations include copy number variation in mild cognitive impairment and Alzheimer’s disease and GWAS of baseline cerebrospinal fluid biomarkers and longitudinal changes on magnetic resonance imaging. Blood collection for RNA studies is a new direction. Genetic studies of longitudinal phenotypes hold promise for elucidating disease mechanisms and risk, development of therapeutic strategies, and refining selection criteria for clinical trials. PMID:20451875
Analyzing Association Mapping in Pedigree-Based GWAS Using a Penalized Multitrait Mixed Model
Liu, Jin; Yang, Can; Shi, Xingjie; Li, Cong; Huang, Jian; Zhao, Hongyu; Ma, Shuangge
2017-01-01
Genome-wide association studies (GWAS) have led to the identification of many genetic variants associated with complex diseases in the past 10 years. Penalization methods, with significant numerical and statistical advantages, have been extensively adopted in analyzing GWAS. This study has been partly motivated by the analysis of Genetic Analysis Workshop (GAW) 18 data, which have two notable characteristics. First, the subjects are from a small number of pedigrees and hence related. Second, for each subject, multiple correlated traits have been measured. Most of the existing penalization methods assume independence between subjects and traits and can be suboptimal. There are a few methods in the literature based on mixed modeling that can accommodate correlations. However, they cannot fully accommodate the two types of correlations while conducting effective marker selection. In this study, we develop a penalized multitrait mixed modeling approach. It accommodates the two different types of correlations and includes several existing methods as special cases. Effective penalization is adopted for marker selection. Simulation demonstrates its satisfactory performance. The GAW 18 data are analyzed using the proposed method. PMID:27247027
Hong, Kyung-Won; Min, Haesook; Heo, Byeong-Mun; Joo, Seong Eun; Kim, Sung Soo; Kim, Yeonjung
2012-06-01
Increased pulse pressure (PP) and decreased mean arterial pressure (MAP) are strong prognostic predictors of adverse cardiovascular events. Recently, the International Consortium for Blood Pressure Genome-Wide Association Studies (ICBP-GWAS) reported eight loci that influenced PP and MAP. The ICBP-GWAS examined 51 cohorts, comprising 122 671 individuals of European ancestry, and identified eight SNPs: five that governed PP and three that controlled MAP. Six of these loci were novel. To replicate these newly identified loci and examine the genetic architecture of PP and MAP in European and Asian populations, we conducted a meta-analysis of the eight SNPs combining data from ICBP and general population-based Korean cohorts. Two SNPs (rs13002573 (FIGN) and rs871606 (CHIC2)) for PP and two SNPs (rs1446468 (FIGN) and rs319690 (MAP4)) for MAP were replicated in Koreans. Although our GWAS found only moderate associations, the findings lead us to propose that a similar genetic architecture governs PP and MAP in Asians and Europeans. However, further studies in other Asian populations will be needed to confirm this possibility.
seXY: a tool for sex inference from genotype arrays.
Qian, David C; Busam, Jonathan A; Xiao, Xiangjun; O'Mara, Tracy A; Eeles, Rosalind A; Schumacher, Frederick R; Phelan, Catherine M; Amos, Christopher I
2017-02-15
Checking concordance between reported sex and genotype-inferred sex is a crucial quality control measure in genome-wide association studies (GWAS). However, limited insights exist regarding the true accuracy of software that infers sex from genotype array data. We present seXY, a logistic regression model trained on both X chromosome heterozygosity and Y chromosome missingness, that consistently demonstrated >99.5% sex inference accuracy in cross-validation for 889 males and 5,361 females enrolled in prostate cancer and ovarian cancer GWAS. Compared to PLINK, one of the most popular tools for sex inference in GWAS that assesses only X chromosome heterozygosity, seXY achieved marginally better male classification and 3% more accurate female classification. Availability: https://github.com/Christopher-Amos-Lab/seXY. Contact: Christopher.I.Amos@dartmouth.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
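The abstract above describes a logistic regression on two features, X chromosome heterozygosity and Y chromosome missingness. The following is a minimal sketch of that general idea, not the seXY implementation: the data are synthetic, the feature distributions are invented for illustration, and the function names (`fit_logistic`, `predict_female`) are hypothetical.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(features, labels, lr=0.5, epochs=1000):
    """Fit P(female) = sigmoid(w1*x_het + w2*y_miss + b) by batch gradient descent."""
    w1 = w2 = b = 0.0
    n = len(features)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for (x_het, y_miss), y in zip(features, labels):
            err = sigmoid(w1 * x_het + w2 * y_miss + b) - y
            g1 += err * x_het
            g2 += err * y_miss
            gb += err
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b


def predict_female(model, x_het: float, y_miss: float) -> bool:
    """Classify a sample as female when the fitted probability is at least 0.5."""
    w1, w2, b = model
    return sigmoid(w1 * x_het + w2 * y_miss + b) >= 0.5


def clamp01(value: float) -> float:
    """Keep a simulated proportion inside [0, 1]."""
    return min(1.0, max(0.0, value))


# Synthetic, assumed distributions: males have near-zero X heterozygosity and
# few missing Y calls; females have higher X heterozygosity and mostly missing Y calls.
rng = random.Random(0)
males = [(clamp01(rng.gauss(0.01, 0.01)), clamp01(rng.gauss(0.02, 0.02))) for _ in range(100)]
females = [(clamp01(rng.gauss(0.28, 0.04)), clamp01(rng.gauss(0.95, 0.04))) for _ in range(100)]
features = males + females
labels = [0] * len(males) + [1] * len(females)

model = fit_logistic(features, labels)
accuracy = sum(
    predict_female(model, *f) == bool(y) for f, y in zip(features, labels)
) / len(labels)
print(f"training accuracy on synthetic data: {accuracy:.3f}")
```

Using both features means a sample with ambiguous X heterozygosity can still be resolved by its Y chromosome call rate, which is the motivation the abstract gives for going beyond a heterozygosity-only check.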
Chen, Zhijian; Craiu, Radu V; Bull, Shelley B
2014-11-01
In focused studies designed to follow up associations detected in a genome-wide association study (GWAS), investigators can proceed to fine-map a genomic region by targeted sequencing or dense genotyping of all variants in the region, aiming to identify a functional sequence variant. For the analysis of a quantitative trait, we consider a Bayesian approach to fine-mapping study design that incorporates stratification according to a promising GWAS tag SNP in the same region. Improved cost-efficiency can be achieved when the fine-mapping phase incorporates a two-stage design, with identification of a smaller set of more promising variants in a subsample taken in stage 1, followed by their evaluation in an independent stage 2 subsample. To avoid the potential negative impact of genetic model misspecification on inference we incorporate genetic model selection based on posterior probabilities for each competing model. Our simulation study shows that, compared to simple random sampling that ignores genetic information from GWAS, tag-SNP-based stratified sample allocation methods reduce the number of variants continuing to stage 2 and are more likely to promote the functional sequence variant into confirmation studies. © 2014 WILEY PERIODICALS, INC.
Genome-wide association mapping of quantitative traits in a breeding population of sugarcane.
Racedo, Josefina; Gutiérrez, Lucía; Perera, María Francisca; Ostengo, Santiago; Pardo, Esteban Mariano; Cuenya, María Inés; Welin, Bjorn; Castagnaro, Atilio Pedro
2016-06-24
Molecular markers associated with relevant agronomic traits could significantly reduce the time and cost involved in developing new sugarcane varieties. Previous sugarcane genome-wide association analyses (GWAS) have found few molecular markers associated with relevant traits at plant-cane stage. The aim of this study was to establish an appropriate GWAS to find molecular markers associated with yield related traits consistent across harvesting seasons in a breeding population. Sugarcane clones were genotyped with DArT (Diversity Array Technology) and TRAP (Target Region Amplified Polymorphism) markers, and evaluated for cane yield (CY) and sugar content (SC) at two locations during three successive crop cycles. GWAS mapping was applied within a novel mixed-model framework accounting for population structure with Principal Component Analysis scores as random component. A total of 43 markers significantly associated with CY in plant-cane, 42 in first ratoon, and 41 in second ratoon were detected. Out of these markers, 20 were associated with CY in 2 years. Additionally, 38 significant associations for SC were detected in plant-cane, 34 in first ratoon, and 47 in second ratoon. For SC, one marker-trait association was found significant for the 3 years of the study, while twelve markers presented association for 2 years. In the multi-QTL model several markers with large allelic substitution effect were found. Sequences of four DArT markers showed high similitude and e-value with coding sequences of Sorghum bicolor, confirming the high gene microlinearity between sorghum and sugarcane. In contrast with other sugarcane GWAS studies reported earlier, the novel methodology to analyze multi-QTLs through successive crop cycles used in the present study allowed us to find several markers associated with relevant traits. 
Combining existing phenotypic trial data and genotypic DArT and TRAP marker characterizations within a GWAS approach including population structure as random covariates may prove to be highly successful. Moreover, sequences of DArT markers associated with the traits of interest were aligned to chromosomal regions where sorghum QTLs have previously been reported. This approach could be a valuable tool to assist the improvement of sugarcane and to better meet the sugarcane demand projected for the upcoming decades.
Correcting Systematic Inflation in Genetic Association Tests That Consider Interaction Effects
Almli, Lynn M.; Duncan, Richard; Feng, Hao; Ghosh, Debashis; Binder, Elisabeth B.; Bradley, Bekh; Ressler, Kerry J.; Conneely, Karen N.; Epstein, Michael P.
2015-01-01
IMPORTANCE Genetic association studies of psychiatric outcomes often consider interactions with environmental exposures and, in particular, apply tests that jointly consider gene and gene-environment interaction effects for analysis. Using a genome-wide association study (GWAS) of posttraumatic stress disorder (PTSD), we report that heteroscedasticity (defined as variability in outcome that differs by the value of the environmental exposure) can invalidate traditional joint tests of gene and gene-environment interaction. OBJECTIVES To identify the cause of bias in traditional joint tests of gene and gene-environment interaction in a PTSD GWAS and determine whether proposed robust joint tests are insensitive to this problem. DESIGN, SETTING, AND PARTICIPANTS The PTSD GWAS data set consisted of 3359 individuals (978 men and 2381 women) from the Grady Trauma Project (GTP), a cohort study from Atlanta, Georgia. The GTP performed genome-wide genotyping of participants and collected environmental exposures using the Childhood Trauma Questionnaire and Trauma Experiences Inventory. MAIN OUTCOMES AND MEASURES We performed joint interaction testing of the Beck Depression Inventory and modified PTSD Symptom Scale in the GTP GWAS. We assessed systematic bias in our interaction analyses using quantile-quantile plots and genome-wide inflation factors. RESULTS Application of the traditional joint interaction test to the GTP GWAS yielded systematic inflation across different outcomes and environmental exposures (inflation-factor estimates ranging from 1.07 to 1.21), whereas application of the robust joint test to the same data set yielded no such inflation (inflation-factor estimates ranging from 1.01 to 1.02). Simulated data further revealed that the robust joint test is valid in different heteroscedasticity models, whereas the traditional joint test is invalid. The robust joint test also has power similar to the traditional joint test when heteroscedasticity is not an issue. 
CONCLUSIONS AND RELEVANCE We believe the robust joint test should be used in candidate-gene studies and GWASs of psychiatric outcomes that consider environmental interactions. To make the procedure useful for applied investigators, we created a software tool that can be called from the popular PLINK package for analysis. PMID:25354142
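The inflation-factor estimates quoted above (1.07 to 1.21 versus 1.01 to 1.02) summarize how far the observed test statistics drift from their null distribution. A minimal sketch of one common convention follows: convert each p-value to a 1-df chi-square statistic and divide the observed median by the null median (about 0.455). The abstract does not state which variant the authors used, so this is an assumption, and the function name is hypothetical.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.4549).
EXPECTED_MEDIAN_CHI2 = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation_factor(p_values):
    """Return lambda = median(observed chi-square) / median(null chi-square, 1 df).

    Each p-value must lie in (0, 1]; it is mapped to the squared normal
    quantile that would produce it in a two-sided test.
    """
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / EXPECTED_MEDIAN_CHI2


# Well-calibrated p-values: an evenly spaced grid on (0, 1).
n = 1001
calibrated = [(i - 0.5) / n for i in range(1, n + 1)]
lam_null = genomic_inflation_factor(calibrated)
print(f"{lam_null:.2f}")  # 1.00

# Systematically too-small p-values (an artificial distortion) give lambda > 1.
inflated = [p ** 1.2 for p in calibrated]
lam_inflated = genomic_inflation_factor(inflated)
print(lam_inflated > 1.0)  # True
```

A value near 1 indicates that the bulk of the tests behave as expected under the null; values noticeably above 1 point to systematic inflation of the kind the abstract attributes to heteroscedasticity.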
Bensen, Jeannette T; Xu, Zongli; Smith, Gary J; Mohler, James L; Fontham, Elizabeth T H; Taylor, Jack A
2013-01-01
Genome-wide association studies have established a number of replicated single nucleotide polymorphisms (SNPs) for susceptibility to prostate cancer (CaP), but it is unclear whether these susceptibility SNPs are also associated with disease aggressiveness. This study evaluates whether such replication SNPs or other candidate SNPs are associated with CaP aggressiveness in African-American (AA) and European-American (EA) men. A 1,536 SNP panel which included 34 genome-wide association study (GWAS) replication SNPs, 38 flanking SNPs, a set of ancestry informative markers, and SNPs in candidate genes and other areas was genotyped in 1,060 AA and 1,087 EA men with incident CaP from the North Carolina-Louisiana Prostate Cancer Project (PCaP). Tests for association were conducted using ordinal logistic regression with a log-additive genotype model and a 3-category CaP aggressiveness variable. Four GWAS replication SNPs (rs2660753, rs13254738, rs10090154, rs2735839) and seven flanking SNPs were associated with CaP aggressiveness (P < 0.05) in three genomic regions: one at 3p12 (EA), seven at 8q24 (5 AA, 2 EA), and three at 19q13 at the kallikrein-related peptidase 3 (KLK3) locus (two AA, one AA and EA). The KLK3 SNPs also were associated with serum prostate-specific antigen (PSA) levels in AA (P < 0.001) but not in EA. A number of the other SNPs showed some evidence of association but none met study-wide significance levels after adjusting for multiple comparisons. Some replicated GWAS susceptibility SNPs may play a role in CaP aggressiveness. However, like susceptibility, these associations are not consistent between racial groups. Copyright © 2012 Wiley Periodicals, Inc.
Bensen, Jeannette T.; Xu, Zongli; Smith, Gary J.; Mohler, James L.; Fontham, Elizabeth T.H.; Taylor, Jack A.
2012-01-01
BACKGROUND Genome-wide association studies have established a number of replicated single nucleotide polymorphisms (SNPs) for susceptibility to prostate cancer (CaP), but it is unclear whether these susceptibility SNPs are also associated with disease aggressiveness. This study evaluates whether such replication SNPs or other candidate SNPs are associated with CaP aggressiveness in African-American (AA) and European-American (EA) men. METHODS A 1,536 SNP panel which included 34 genome-wide association study (GWAS) replication SNPs, 38 flanking SNPs, a set of ancestry informative markers, and SNPs in candidate genes and other areas was genotyped in 1,060 AA and 1,087 EA men with incident CaP from the North Carolina-Louisiana Prostate Cancer Project (PCaP). Tests for association were conducted using ordinal logistic regression with a log-additive genotype model and a 3-category CaP aggressiveness variable. RESULTS 4 GWAS replication SNPs (rs2660753, rs13254738, rs10090154, rs2735839) and 7 flanking SNPs were associated with CaP aggressiveness (P<0.05) in 3 genomic regions: one at 3p12 (EA), 7 at 8q24 (5 AA, 2 EA), and 3 at 19q13 at the kallikrein-related peptidase 3 (KLK3) locus (2 AA, 1 AA and EA). The KLK3 SNPs also were associated with serum prostate-specific antigen (PSA) levels in AA (p < 0.001) but not in EA. A number of the other SNPs showed some evidence of association but none met study-wide significance levels after adjusting for multiple comparisons. CONCLUSIONS Some replicated GWAS susceptibility SNPs may play a role in CaP aggressiveness. However, like susceptibility, these associations are not consistent between racial groups. PMID:22549899
Zhang, Chenan; Chen, Lin S; Gao, Jianjun; Roy, Shantanu; Shinkle, Justin; Sabarinathan, Mekala; Tong, Lin; Ahmed, Alauddin; Islam, Tariqul; Rakibuz-Zaman, Muhammad; Sarwar, Golam; Shahriar, Hasan; Rahman, Mahfuzar; Yunus, Mohammad; Jasmine, Farzana; Kibriya, Muhammad G; Ahsan, Habibul; Pierce, Brandon L
2018-01-01
Background Leucocyte telomere length (TL) is a potential biomarker of ageing and risk for age-related disease. Leucocyte TL is heritable and shows substantial differences by race/ethnicity. Recent genome-wide association studies (GWAS) report ~10 loci harbouring SNPs associated with leucocyte TL, but these studies focus primarily on populations of European ancestry. Objective This study aims to enhance our understanding of genetic determinants of TL across populations. Methods We performed a GWAS of TL using data on 5075 Bangladeshi adults. We measured TL using one of two technologies (qPCR or a Luminex-based method) and used standardised variables as TL phenotypes. Results Our results replicate previously reported associations in the TERC and TERT regions (P=2.2×10−8 and P=6.4×10−6, respectively). We observed a novel association signal in the RTEL1 gene (intronic SNP rs2297439; P=2.82×10−7) that is independent of previously reported TL-associated SNPs in this region. The minor allele for rs2297439 is common in South Asian populations (≥0.25) but at lower frequencies in other populations (eg, 0.07 in Northern Europeans). Among the eight other previously reported association signals, all were directionally consistent with our study, but only rs8105767 (ZNF208) was nominally significant (P=0.003). SNP-based heritability estimates were as high as 44% when analysing close relatives but much lower when analysing distant relatives only. Conclusions In this first GWAS of TL in a South Asian population, we replicate some, but not all, of the loci reported in prior GWAS of individuals of European ancestry, and we identify a novel second association signal at the RTEL1 locus. PMID:29151059
Intergenic disease-associated regions are abundant in novel transcripts.
Bartonicek, N; Clark, M B; Quek, X C; Torpy, J R; Pritchard, A L; Maag, J L V; Gloss, B S; Crawford, J; Taft, R J; Hayward, N K; Montgomery, G W; Mattick, J S; Mercer, T R; Dinger, M E
2017-12-28
Genotyping of large populations through genome-wide association studies (GWAS) has successfully identified many genomic variants associated with traits or disease risk. Unexpectedly, a large proportion of GWAS single nucleotide polymorphisms (SNPs) and associated haplotype blocks are in intronic and intergenic regions, hindering their functional evaluation. While some of these risk-susceptibility regions encompass cis-regulatory sites, their transcriptional potential has never been systematically explored. To detect rare tissue-specific expression, we employed the transcript-enrichment method CaptureSeq on 21 human tissues to identify 1775 multi-exonic transcripts from 561 intronic and intergenic haploblocks associated with 392 traits and diseases, covering 73.9 Mb (2.2%) of the human genome. We show that a large proportion (85%) of disease-associated haploblocks express novel multi-exonic non-coding transcripts that are tissue-specific and enriched for GWAS SNPs as well as epigenetic markers of active transcription and enhancer activity. Similarly, we captured transcriptomes from 13 melanomas, targeting nine melanoma-associated haploblocks, and characterized 31 novel melanoma-specific transcripts that include fusion proteins, novel exons and non-coding RNAs, one-third of which showed allelically imbalanced expression. This resource of previously unreported transcripts in disease-associated regions (http://gwas-captureseq.dingerlab.org) should provide an important starting point for the translational community in search of novel biomarkers, disease mechanisms, and drug targets.
Thorwarth, Patrick; Yousef, Eltohamy A A; Schmid, Karl J
2018-02-02
Genetic resources are an important source of genetic variation for plant breeding. Genome-wide association studies (GWAS) and genomic prediction greatly facilitate the analysis and utilization of useful genetic diversity for improving complex phenotypic traits in crop plants. We explored the potential of GWAS and genomic prediction for improving curd-related traits in cauliflower (Brassica oleracea var. botrytis) by combining 174 randomly selected cauliflower gene bank accessions from two different gene banks. The collection was genotyped with genotyping-by-sequencing (GBS) and phenotyped for six curd-related traits at two locations and three growing seasons. A GWAS analysis based on 120,693 single-nucleotide polymorphisms identified a total of 24 significant associations for curd-related traits. The potential for genomic prediction was assessed with a genomic best linear unbiased prediction model and BayesB. Prediction abilities ranged from 0.10 to 0.66 for different traits and did not differ between prediction methods. Imputation of missing genotypes only slightly improved prediction ability. Our results demonstrate that GWAS and genomic prediction in combination with GBS and phenotyping of highly heritable traits can be used to identify useful quantitative trait loci and genotypes among genetically diverse gene bank material for subsequent utilization as genetic resources in cauliflower breeding. Copyright © 2018 Thorwarth et al.
Human brain arousal in the resting state: a genome-wide association study.
Jawinski, Philippe; Kirsten, Holger; Sander, Christian; Spada, Janek; Ulke, Christine; Huang, Jue; Burkhardt, Ralph; Scholz, Markus; Hensch, Tilman; Hegerl, Ulrich
2018-04-27
Arousal affects cognition, emotion, and behavior and has been implicated in the etiology of psychiatric disorders. Although environmental conditions substantially contribute to the level of arousal, stable interindividual characteristics are well-established and a genetic basis has been suggested. Here we investigated the molecular genetics of brain arousal in the resting state by conducting a genome-wide association study (GWAS). We selected N = 1877 participants from the population-based LIFE-Adult cohort. Participants underwent a 20-min eyes-closed resting state EEG, which was analyzed using the computerized VIGALL 2.1 (Vigilance Algorithm Leipzig). At the SNP-level, GWAS analyses revealed no genome-wide significant locus (p < 5E-8), although seven loci were suggestive (p < 1E-6). The strongest hit was an expression quantitative trait locus (eQTL) of TMEM159 (lead-SNP: rs79472635, p = 5.49E-8). Importantly, at the gene-level, GWAS analyses revealed significant evidence for TMEM159 (p = 0.013, Bonferroni-corrected). By mapping our SNPs to the GWAS results from the Psychiatric Genomics Consortium, we found that all corresponding markers of TMEM159 showed nominally significant associations with Major Depressive Disorder (MDD; 0.006 ≤ p ≤ 0.011). More specifically, variants associated with high arousal levels have previously been linked to an increased risk for MDD. In line with this, the MetaXcan database suggests increased expression levels of TMEM159 in MDD, as well as Autism Spectrum Disorder, and Alzheimer's Disease. Furthermore, our pathway analyses provided evidence for a role of sodium/calcium exchangers in resting state arousal. In conclusion, the present GWAS identifies TMEM159 as a novel candidate gene which may modulate the risk for psychiatric disorders through arousal mechanisms. Our results also encourage the elaboration of the previously reported interrelations between ion-channel modulators, sleep-wake behavior, and psychiatric disorders.
Bostrom, Meredith A.; Kao, W.H. Linda; Li, Man; Abboud, Hanna E.; Adler, Sharon G.; Iyengar, Sudha K.; Kimmel, Paul L.; Hanson, Robert L.; Nicholas, Susanne B.; Rasooly, Rebekah S.; Sedor, John R.; Coresh, Josef; Kohn, Orly F.; Leehey, David J.; Thornley-Brown, Denyse; Bottinger, Erwin P.; Lipkowitz, Michael S.; Meoni, Lucy A.; Klag, Michael J.; Lu, Lingyi; Hicks, Pamela J.; Langefeld, Carl D.; Parekh, Rulan S.; Bowden, Donald W.; Freedman, Barry I.
2011-01-01
Background African Americans (AAs) have increased susceptibility to non-diabetic nephropathy relative to European Americans. Study Design Follow-up of a pooled genome-wide association study (GWAS) in AA dialysis patients with nondiabetic nephropathy; novel gene-gene interaction analyses. Setting & Participants Wake Forest sample: 962 AA nondiabetic nephropathy cases; 931 non-nephropathy controls. Replication sample: 668 Family Investigation of Nephropathy and Diabetes (FIND) AA nondiabetic nephropathy cases; 804 non-nephropathy controls. Predictors Individual genotyping of top 1420 pooled GWAS-associated single nucleotide polymorphisms (SNPs) and 54 SNPs in six nephropathy susceptibility genes. Outcomes APOL1 genetic association and additional candidate susceptibility loci interacting with, or independently from, APOL1. Results The strongest GWAS associations included two non-coding APOL1 SNPs, rs2239785 (odds ratio [OR], 0.33; dominant; p = 5.9 × 10−24) and rs136148 (OR, 0.54; additive; p = 1.1 × 10−7) with replication in FIND (p = 5.0 × 10−21 and 1.9 × 10−05, respectively). Rs2239785 remained significantly associated after controlling for the APOL1 G1 and G2 coding variants. Additional top hits included a CFH SNP (OR from meta-analysis in above 3367 AA cases and controls, 0.81; additive; p = 6.8 × 10−4). The 1420 SNPs were tested for interaction with APOL1 G1 and G2 variants. Several interactive SNPs were detected; the most significant was rs16854341 in the podocin gene (NPHS2) (p = 0.0001). Limitations Non-pooled GWAS have not been performed in AA nondiabetic nephropathy. Conclusions This follow-up of a pooled GWAS provides additional and independent evidence that APOL1 variants contribute to nondiabetic nephropathy in AAs and identified additional associated and interactive non-diabetic nephropathy susceptibility genes. PMID:22119407
SUSCEPTIBILITY LOCI FOR UMBILICAL HERNIA IN SWINE DETECTED BY GENOME-WIDE ASSOCIATION.
Liao, X J; Lia, L; Zhang, Z Y; Long, Y; Yang, B; Ruan, G R; Su, Y; Ai, H S; Zhang, W C; Deng, W Y; Xiao, S J; Ren, J; Ding, N S; Huang, L S
2015-10-01
Umbilical hernia (UH) is a complex disorder caused by both genetic and environmental factors. UH causes animal welfare problems and severe economic loss to the pig industry. The genetic basis of UH remains poorly understood. The high-density 60K porcine SNP array enables the rapid application of genome-wide association study (GWAS) to identify genetic loci for phenotypic traits at genome-wide scale in pigs. The objective of this research was to identify susceptibility loci for swine umbilical hernia using the GWAS approach. We genotyped 478 piglets from 142 families representing three Western commercial breeds with the Illumina PorcineSNP60 BeadChip. Significant SNPs were then detected by GWAS using ROADTRIPS (Robust Association-Detection Test for Related Individuals with Population Substructure) software based on a Bonferroni corrected threshold (P = 1.67E-06) or suggestive threshold (P = 3.34E-05) and false discovery rate (FDR = 0.05). After quality control, 29,924 qualified SNPs and 472 piglets were used for GWAS. Two suggestive loci predisposing to pig UH were identified in the Duroc population, at 44.25 Mb on SSC2 (rs81358018, P = 3.34E-06, FDR = 0.049933) and at 45.90 Mb on SSC17 (rs81479278, P = 3.30E-06, FDR = 0.049933). No SNP was significantly associated with pig UH in either the Landrace or the Large White population. Furthermore, we carried out a meta-analysis in the combined pure-breed population containing all 472 piglets. rs81479278 (P = 1.16E-06, FDR = 0.022475) was associated with pig UH at the genome-wide significance level. SRC was characterized as a plausible candidate gene for susceptibility to pig UH according to its genomic position and biological functions. To our knowledge, this study gives the first description of a GWAS identifying susceptibility loci for umbilical hernia in pigs. Our findings provide deeper insights into the genetic architecture of umbilical hernia in pigs.
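The two thresholds quoted above appear consistent with dividing 0.05 and 1 by the 29,924 SNPs that passed quality control. The sketch below reproduces that arithmetic and adds a generic Benjamini-Hochberg step-up procedure as one standard way to control the false discovery rate; the abstract does not say which FDR method was used, so the second function is an assumption, and both function names are hypothetical.

```python
def per_test_threshold(alpha: float, n_tests: int) -> float:
    """Bonferroni-style threshold: family-wise level divided by the number of tests."""
    return alpha / n_tests


def benjamini_hochberg(p_values, fdr=0.05):
    """Return the indices of tests declared significant by the BH step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its rank-scaled bound.
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


n_snps = 29_924  # SNPs remaining after quality control, as stated in the abstract
genome_wide = per_test_threshold(0.05, n_snps)
suggestive = per_test_threshold(1.0, n_snps)
print(f"{genome_wide:.2e}")  # 1.67e-06
print(f"{suggestive:.2e}")   # 3.34e-05

# Toy p-values (invented) to show the step-up behaviour.
print(benjamini_hochberg([0.001, 0.02, 0.03, 0.5]))  # [0, 1, 2]
```

The suggestive threshold of 1 divided by the number of tests corresponds to expecting about one false positive per genome scan under the null.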
Sekula, Peggy; Li, Yong; Stanescu, Horia C; Wuttke, Matthias; Ekici, Arif B; Bockenhauer, Detlef; Walz, Gerd; Powis, Stephen H; Kielstein, Jan T; Brenchley, Paul; Eckardt, Kai-Uwe; Kronenberg, Florian; Kleta, Robert; Köttgen, Anna
2017-02-01
Membranous nephropathy (MN) is a common cause of nephrotic syndrome in adults. Previous genome-wide association studies (GWAS) of 300 000 genotyped variants identified MN-associated loci at HLA-DQA1 and PLA2R1. We used a combined approach of genotype imputation, GWAS, human leucocyte antigen (HLA) imputation and extension to other aetiologies of chronic kidney disease (CKD) to investigate genetic MN risk variants more comprehensively. GWAS using 9 million high-quality imputed genotypes and classical HLA alleles were conducted for 323 MN European-ancestry cases and 345 controls. Additionally, 4960 patients with different CKD aetiologies in the German Chronic Kidney Disease (GCKD) study were genotyped for risk variants at HLA-DQA1 and PLA2R1. In GWAS, lead variants in known loci [rs9272729, HLA-DQA1, odds ratio (OR) = 7.3 per risk allele, P = 5.9 × 10−27 and rs17830558, PLA2R1, OR = 2.2, P = 1.9 × 10−8] were significantly associated with MN. No novel signals emerged in GWAS of X-chromosomal variants or in sex-specific analyses. Classical HLA alleles (DRB1*0301-DQA1*0501-DQB1*0201 haplotype) were associated with MN but provided little additional information beyond rs9272729. Associations were replicated in 137 GCKD patients with MN (HLA-DQA1: P = 6.4 × 10−24; PLA2R1: P = 5.0 × 10−4). MN risk increased steeply for patients with high-risk genotype combinations (OR > 79). While genetic variation in PLA2R1 exclusively associated with MN across 19 CKD aetiologies, the HLA-DQA1 risk allele was also associated with lupus nephritis (P = 2.8 × 10−6), type 1 diabetic nephropathy (P = 6.9 × 10−5) and focal segmental glomerulosclerosis (P = 5.1 × 10−5), but not with immunoglobulin A nephropathy. PLA2R1 and HLA-DQA1 are the predominant risk loci for MN detected by GWAS. While HLA-DQA1 risk variants show an association with other CKD aetiologies, PLA2R1 variants are specific to MN. © The Author 2016. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.
Shu, Xiang; Purdue, Mark P; Ye, Yuanqing; Tu, Huakang; Wood, Christopher G; Tannir, Nizar M; Wang, Zhaoming; Albanes, Demetrius; Gapstur, Susan M; Stevens, Victoria L; Rothman, Nathaniel; Chanock, Stephen J; Wu, Xifeng
2017-09-01
Background: Obesity is an established risk factor for renal cell carcinoma (RCC). Although genome-wide association studies (GWAS) of RCC have identified several susceptibility loci, additional variants might be missed due to the highly conservative selection. Methods: We conducted a multiphase study utilizing three independent genome-wide scans at MD Anderson Cancer Center (MDA RCC GWAS and MDA RCC OncoArray) and National Cancer Institute (NCI RCC GWAS), which consisted of a total of 3,530 cases and 5,714 controls, to investigate genetic variations in obesity-related genes and RCC risk. Results: In the discovery phase, 32,946 SNPs located at ±10 kb of 2,001 obesity-related genes were extracted from MDA RCC GWAS and analyzed using multivariable logistic regression. Proxies (R^2 > 0.8) were searched or imputation was performed if SNPs were not directly genotyped in the validation sets. Twenty-one SNPs with P < 0.05 in both MDA RCC GWAS and NCI RCC GWAS were subsequently evaluated in MDA RCC OncoArray. In the overall meta-analysis, significant (P < 0.05) associations with RCC risk were observed for SNPs mapping to IL1RAPL2 [rs10521506-G: OR_meta = 0.87 (0.81-0.93), P_meta = 2.33 × 10^-5], PLIN2 [rs2229536-A: OR_meta = 0.87 (0.81-0.93), P_meta = 2.33 × 10^-5], SMAD3 [rs4601989-A: OR_meta = 0.86 (0.80-0.93), P_meta = 2.71 × 10^-4], MED13L [rs10850596-A: OR_meta = 1.14 (1.07-1.23), P_meta = 1.50 × 10^-4], and TSC1 [rs3761840-G: OR_meta = 0.90 (0.85-0.97), P_meta = 2.47 × 10^-3]. We did not observe any significant cis-expression quantitative trait loci effect for these SNPs in the TCGA KIRC data. Conclusions: Taken together, we found that genetic variation of obesity-related genes could influence RCC susceptibility. Impact: The five identified loci may provide new insights into disease etiology that reveal importance of obesity-related genes in RCC development. Cancer Epidemiol Biomarkers Prev; 26(9); 1436-42. ©2017 AACR.
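The abstract above reports pooled odds ratios (OR_meta) with 95% confidence intervals but does not say which pooling model was used. As a minimal sketch, assuming a fixed-effect inverse-variance model on the log-odds scale, the calculation could look like the following. The function name `fixed_effect_meta` and the per-phase numbers are hypothetical and are not taken from the study.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def fixed_effect_meta(studies):
    """Pool (odds_ratio, ci_lower, ci_upper) tuples by inverse-variance weighting.

    Assumption: each interval is a symmetric 95% CI on the log-odds scale,
    so the standard error can be recovered from the interval width.
    """
    sum_w = 0.0
    sum_wb = 0.0
    for odds_ratio, lower, upper in studies:
        beta = math.log(odds_ratio)
        se = (math.log(upper) - math.log(lower)) / (2 * Z95)
        weight = 1.0 / se ** 2
        sum_w += weight
        sum_wb += weight * beta
    beta_meta = sum_wb / sum_w
    se_meta = math.sqrt(1.0 / sum_w)
    z = beta_meta / se_meta
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {
        "or": math.exp(beta_meta),
        "ci_lower": math.exp(beta_meta - Z95 * se_meta),
        "ci_upper": math.exp(beta_meta + Z95 * se_meta),
        "p": p_value,
    }


# Hypothetical per-phase estimates (illustrative values only).
phases = [(0.85, 0.74, 0.98), (0.88, 0.79, 0.98), (0.87, 0.76, 1.00)]
result = fixed_effect_meta(phases)
print(result)
```

Each phase on its own has an interval that touches or nearly touches 1, while the pooled interval is narrower. This is the usual reason for combining discovery and validation sets when individual effects are modest.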
Genetic Structure of the Han Chinese Population Revealed by Genome-wide SNP Variation
Chen, Jieming; Zheng, Houfeng; Bei, Jin-Xin; Sun, Liangdan; Jia, Wei-hua; Li, Tao; Zhang, Furen; Seielstad, Mark; Zeng, Yi-Xin; Zhang, Xuejun; Liu, Jianjun
2009-01-01
Population stratification is a potential problem for genome-wide association studies (GWAS), confounding results and causing spurious associations. Hence, understanding how allele frequencies vary across geographic regions or among subpopulations is an important prelude to analyzing GWAS data. Using over 350,000 genome-wide autosomal SNPs in over 6000 Han Chinese samples from ten provinces of China, our study revealed a one-dimensional “north-south” population structure and a close correlation between geography and the genetic structure of the Han Chinese. The north-south population structure is consistent with the historical migration pattern of the Han Chinese population. Metropolitan cities in China were, however, more diffused “outliers,” probably because of the impact of modern migration of peoples. At a very local scale within the Guangdong province, we observed evidence of population structure among dialect groups, probably on account of endogamy within these dialects. Via simulation, we show that empirical levels of population structure observed across modern China can cause spurious associations in GWAS if not properly handled. In the Han Chinese, geographic matching is a good proxy for genetic matching, particularly in validation and candidate-gene studies in which population stratification cannot be directly accessed and accounted for because of the lack of genome-wide data, with the exception of the metropolitan cities, where geographical location is no longer a good indicator of ancestral origin. Our findings are important for designing GWAS in the Chinese population, an activity that is expected to intensify greatly in the near future. PMID:19944401
Graham, Hillary T; Rotroff, Daniel M; Marvel, Skylar W; Buse, John B; Havener, Tammy M; Wilson, Alyson G; Wagner, Michael J; Motsinger-Reif, Alison A
2016-01-01
Given the high costs of conducting a drug-response trial, researchers are now aiming to use retrospective analyses to conduct genome-wide association studies (GWAS) to identify underlying genetic contributions to drug-response variation. To prevent confounding results from a GWAS to investigate drug response, it is necessary to account for concomitant medications, defined as any medication taken concurrently with the primary medication being investigated. We use data from the Action to Control Cardiovascular Risk in Diabetes (ACCORD) trial in order to implement a novel scoring procedure for incorporating concomitant medication information into a linear regression model in preparation for GWAS. In order to accomplish this, two primary medications were selected, thiazolidinediones and metformin, because of the widespread use of these medications and large sample sizes available within the ACCORD trial. A third medication, fenofibrate, along with a known confounding medication, statin, was chosen as a proof-of-principle for the scoring procedure. Previous studies have identified SNP rs7412 as being associated with statin response. Here we hypothesize that including the score for statin as a covariate in the GWAS model will correct for confounding of statin and yield a change in association at rs7412. The response of the confounded signal was successfully diminished from p = 3.19 × 10^-7 to p = 1.76 × 10^-5 by accounting for statin using the scoring procedure presented here. This approach provides the ability for researchers to account for concomitant medications in complex trial designs where monotherapy treatment regimens are not available.
Controlling the Rate of GWAS False Discoveries
Brzyski, Damian; Peterson, Christine B.; Sobczyk, Piotr; Candès, Emmanuel J.; Bogdan, Malgorzata; Sabatti, Chiara
2017-01-01
With the rise of both the number and the complexity of traits of interest, control of the false discovery rate (FDR) in genetic association studies has become an increasingly appealing and accepted target for multiple comparison adjustment. While a number of robust FDR-controlling strategies exist, the nature of this error rate is intimately tied to the precise way in which discoveries are counted, and the performance of FDR-controlling procedures is satisfactory only if there is a one-to-one correspondence between what scientists describe as unique discoveries and the number of rejected hypotheses. The presence of linkage disequilibrium between markers in genome-wide association studies (GWAS) often leads researchers to consider the signal associated to multiple neighboring SNPs as indicating the existence of a single genomic locus with possible influence on the phenotype. This a posteriori aggregation of rejected hypotheses results in inflation of the relevant FDR. We propose a novel approach to FDR control that is based on prescreening to identify the level of resolution of distinct hypotheses. We show how FDR-controlling strategies can be adapted to account for this initial selection both with theoretical results and simulations that mimic the dependence structure to be expected in GWAS. We demonstrate that our approach is versatile and useful when the data are analyzed using both tests based on single markers and multiple regression. We provide an R package that allows practitioners to apply our procedure on standard GWAS format data, and illustrate its performance on lipid traits in the North Finland Birth Cohort 66 cohort study. PMID:27784720
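For orientation, the baseline that FDR-controlling strategies of this kind build on is the Benjamini-Hochberg step-up rule. The sketch below shows only that standard rule applied to a flat list of P values. It is not the prescreening procedure proposed in the abstract, and the function name `benjamini_hochberg` is a label chosen here, not the interface of the authors' R package.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Standard step-up rule: find the largest rank k such that the k-th
    smallest P value is <= (k / m) * alpha, then reject the k smallest.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical per-SNP P values, in genomic order.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(p_vals, alpha=0.05)
print(sum(flags), "rejections")
```

The counting issue raised in the abstract is visible here: if the rejected SNPs are neighbors in linkage disequilibrium and are later reported as one locus, the number of reported discoveries is smaller than the number of rejections the rule was calibrated for.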
Genotype Imputation for Latinos Using the HapMap and 1000 Genomes Project Reference Panels.
Gao, Xiaoyi; Haritunians, Talin; Marjoram, Paul; McKean-Cowdin, Roberta; Torres, Mina; Taylor, Kent D; Rotter, Jerome I; Gauderman, William J; Varma, Rohit
2012-01-01
Genotype imputation is a vital tool in genome-wide association studies (GWAS) and meta-analyses of multiple GWAS results. Imputation enables researchers to increase genomic coverage and to pool data generated using different genotyping platforms. HapMap samples are often employed as the reference panel. More recently, the 1000 Genomes Project resource is becoming the primary source for reference panels. Multiple GWAS and meta-analyses are targeting Latinos, the most populous and fastest growing minority group in the US. However, genotype imputation resources for Latinos are rather limited compared to individuals of European ancestry at present, largely because of the lack of good reference data. One choice of reference panel for Latinos is one derived from the population of Mexican individuals in Los Angeles contained in the HapMap Phase 3 project and the 1000 Genomes Project. However, a detailed evaluation of the quality of the imputed genotypes derived from the public reference panels has not yet been reported. Using simulation studies, the Illumina OmniExpress GWAS data from the Los Angeles Latino Eye Study and the MACH software package, we evaluated the accuracy of genotype imputation in Latinos. Our results show that the 1000 Genomes Project AMR + CEU + YRI reference panel provides the highest imputation accuracy for Latinos, and that also including Asian samples in the panel can reduce imputation accuracy. We also provide the imputation accuracy for each autosomal chromosome using the 1000 Genomes Project panel for Latinos. Our results serve as a guide to future imputation based analysis in Latinos.
Controlling the Rate of GWAS False Discoveries.
Brzyski, Damian; Peterson, Christine B; Sobczyk, Piotr; Candès, Emmanuel J; Bogdan, Malgorzata; Sabatti, Chiara
2017-01-01
With the rise of both the number and the complexity of traits of interest, control of the false discovery rate (FDR) in genetic association studies has become an increasingly appealing and accepted target for multiple comparison adjustment. While a number of robust FDR-controlling strategies exist, the nature of this error rate is intimately tied to the precise way in which discoveries are counted, and the performance of FDR-controlling procedures is satisfactory only if there is a one-to-one correspondence between what scientists describe as unique discoveries and the number of rejected hypotheses. The presence of linkage disequilibrium between markers in genome-wide association studies (GWAS) often leads researchers to consider the signal associated to multiple neighboring SNPs as indicating the existence of a single genomic locus with possible influence on the phenotype. This a posteriori aggregation of rejected hypotheses results in inflation of the relevant FDR. We propose a novel approach to FDR control that is based on prescreening to identify the level of resolution of distinct hypotheses. We show how FDR-controlling strategies can be adapted to account for this initial selection both with theoretical results and simulations that mimic the dependence structure to be expected in GWAS. We demonstrate that our approach is versatile and useful when the data are analyzed using both tests based on single markers and multiple regression. We provide an R package that allows practitioners to apply our procedure on standard GWAS format data, and illustrate its performance on lipid traits in the North Finland Birth Cohort 66 cohort study. Copyright © 2017 by the Genetics Society of America.
Integrative Genomics Reveals Novel Molecular Pathways and Gene Networks for Coronary Artery Disease
Mäkinen, Ville-Petteri; Civelek, Mete; Meng, Qingying; Zhang, Bin; Zhu, Jun; Levian, Candace; Huan, Tianxiao; Segrè, Ayellet V.; Ghosh, Sujoy; Vivar, Juan; Nikpay, Majid; Stewart, Alexandre F. R.; Nelson, Christopher P.; Willenborg, Christina; Erdmann, Jeanette; Blakenberg, Stefan; O'Donnell, Christopher J.; März, Winfried; Laaksonen, Reijo; Epstein, Stephen E.; Kathiresan, Sekar; Shah, Svati H.; Hazen, Stanley L.; Reilly, Muredach P.; Lusis, Aldons J.; Samani, Nilesh J.; Schunkert, Heribert; Quertermous, Thomas; McPherson, Ruth; Yang, Xia; Assimes, Themistocles L.
2014-01-01
The majority of the heritability of coronary artery disease (CAD) remains unexplained, despite recent successes of genome-wide association studies (GWAS) in identifying novel susceptibility loci. Integrating functional genomic data from a variety of sources with a large-scale meta-analysis of CAD GWAS may facilitate the identification of novel biological processes and genes involved in CAD, as well as clarify the causal relationships of established processes. Towards this end, we integrated 14 GWAS from the CARDIoGRAM Consortium and two additional GWAS from the Ottawa Heart Institute (25,491 cases and 66,819 controls) with 1) genetics of gene expression studies of CAD-relevant tissues in humans, 2) metabolic and signaling pathways from public databases, and 3) data-driven, tissue-specific gene networks from a multitude of human and mouse experiments. We not only detected CAD-associated gene networks of lipid metabolism, coagulation, immunity, and additional networks with no clear functional annotation, but also revealed key driver genes for each CAD network based on the topology of the gene regulatory networks. In particular, we found a gene network involved in antigen processing to be strongly associated with CAD. The key driver genes of this network included glyoxalase I (GLO1) and peptidylprolyl isomerase I (PPIL1), which we verified as regulatory by siRNA experiments in human aortic endothelial cells. Our results suggest genetic influences on a diverse set of both known and novel biological processes that contribute to CAD risk. The key driver genes for these networks highlight potential novel targets for further mechanistic studies and therapeutic interventions. PMID:25033284
Genome-Wide Association of the Laboratory-Based Nicotine Metabolite Ratio in Three Ancestries.
Baurley, James W; Edlund, Christopher K; Pardamean, Carissa I; Conti, David V; Krasnow, Ruth; Javitz, Harold S; Hops, Hyman; Swan, Gary E; Benowitz, Neal L; Bergen, Andrew W
2016-09-01
Metabolic enzyme variation and other patient and environmental characteristics influence smoking behaviors, treatment success, and risk of related disease. Population-specific variation in metabolic genes contributes to challenges in developing and optimizing pharmacogenetic interventions. We applied a custom genome-wide genotyping array for addiction research (Smokescreen), to three laboratory-based studies of nicotine metabolism with oral or venous administration of labeled nicotine and cotinine, to model nicotine metabolism in multiple populations. The trans-3'-hydroxycotinine/cotinine ratio, the nicotine metabolite ratio (NMR), was the nicotine metabolism measure analyzed. Three hundred twelve individuals of self-identified European, African, and Asian American ancestry were genotyped and included in ancestry-specific genome-wide association scans (GWAS) and a meta-GWAS analysis of the NMR. We modeled natural-log transformed NMR with covariates: principal components of genetic ancestry, age, sex, body mass index, and smoking status. African and Asian American NMRs were statistically significantly (P values ≤ 5E-5) lower than European American NMRs. Meta-GWAS analysis identified 36 genome-wide significant variants over a 43 kilobase pair region at CYP2A6 with minimum P = 2.46E-18 at rs12459249, proximal to CYP2A6. Additional minima were located in intron 4 (rs56113850, P = 6.61E-18) and in the CYP2A6-CYP2A7 intergenic region (rs34226463, P = 1.45E-12). Most (34/36) genome-wide significant variants suggested reduced CYP2A6 activity; functional mechanisms were identified and tested in knowledge-bases. Conditional analysis resulted in intergenic variants of possible interest (P values < 5E-5). This meta-GWAS of the NMR identifies CYP2A6 variants, replicates the top-ranked single nucleotide polymorphism from a recent Finnish meta-GWAS of the NMR, identifies functional mechanisms, and provides pan-continental population biomarkers for nicotine metabolism. 
This multiple ancestry meta-GWAS of the laboratory study-based NMR provides novel evidence and replication for genome-wide association of CYP2A6 single nucleotide and insertion-deletion polymorphisms. We identify three regions of genome-wide significance: proximal, intronic, and distal to CYP2A6. We replicate the top-ranking single nucleotide polymorphism from a recent GWAS of the NMR in Finnish smokers, identify a functional mechanism for this intronic variant from in silico analyses of RNA-seq data that is consistent with CYP2A6 expression measured in postmortem lung and liver, and provide additional support for the intergenic region between CYP2A6 and CYP2A7. © The Author 2016. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco.
Genome-Wide Association of the Laboratory-Based Nicotine Metabolite Ratio in Three Ancestries
Baurley, James W.; Edlund, Christopher K.; Pardamean, Carissa I.; Conti, David V.; Krasnow, Ruth; Javitz, Harold S.; Hops, Hyman; Swan, Gary E.; Benowitz, Neal L.
2016-01-01
Introduction: Metabolic enzyme variation and other patient and environmental characteristics influence smoking behaviors, treatment success, and risk of related disease. Population-specific variation in metabolic genes contributes to challenges in developing and optimizing pharmacogenetic interventions. We applied a custom genome-wide genotyping array for addiction research (Smokescreen), to three laboratory-based studies of nicotine metabolism with oral or venous administration of labeled nicotine and cotinine, to model nicotine metabolism in multiple populations. The trans-3′-hydroxycotinine/cotinine ratio, the nicotine metabolite ratio (NMR), was the nicotine metabolism measure analyzed. Methods: Three hundred twelve individuals of self-identified European, African, and Asian American ancestry were genotyped and included in ancestry-specific genome-wide association scans (GWAS) and a meta-GWAS analysis of the NMR. We modeled natural-log transformed NMR with covariates: principal components of genetic ancestry, age, sex, body mass index, and smoking status. Results: African and Asian American NMRs were statistically significantly (P values ≤ 5E-5) lower than European American NMRs. Meta-GWAS analysis identified 36 genome-wide significant variants over a 43 kilobase pair region at CYP2A6 with minimum P = 2.46E-18 at rs12459249, proximal to CYP2A6. Additional minima were located in intron 4 (rs56113850, P = 6.61E-18) and in the CYP2A6-CYP2A7 intergenic region (rs34226463, P = 1.45E-12). Most (34/36) genome-wide significant variants suggested reduced CYP2A6 activity; functional mechanisms were identified and tested in knowledge-bases. Conditional analysis resulted in intergenic variants of possible interest (P values < 5E-5). 
Conclusions: This meta-GWAS of the NMR identifies CYP2A6 variants, replicates the top-ranked single nucleotide polymorphism from a recent Finnish meta-GWAS of the NMR, identifies functional mechanisms, and provides pan-continental population biomarkers for nicotine metabolism. Implications: This multiple ancestry meta-GWAS of the laboratory study-based NMR provides novel evidence and replication for genome-wide association of CYP2A6 single nucleotide and insertion–deletion polymorphisms. We identify three regions of genome-wide significance: proximal, intronic, and distal to CYP2A6. We replicate the top-ranking single nucleotide polymorphism from a recent GWAS of the NMR in Finnish smokers, identify a functional mechanism for this intronic variant from in silico analyses of RNA-seq data that is consistent with CYP2A6 expression measured in postmortem lung and liver, and provide additional support for the intergenic region between CYP2A6 and CYP2A7. PMID:27113016
Zhang, Dong; Kong, Wenqian; Robertson, Jon; Goff, Valorie H; Epps, Ethan; Kerr, Alexandra; Mills, Gabriel; Cromwell, Jay; Lugin, Yelena; Phillips, Christine; Paterson, Andrew H
2015-04-19
Domestication has played an important role in shaping characteristics of the inflorescence and plant height in cultivated cereals. Taking advantage of meta-analysis of QTLs, phylogenetic analyses in 502 diverse sorghum accessions, GWAS in a sorghum association panel (n = 354) and comparative data, we provide insight into the genetic basis of the domestication traits in sorghum and rice. We performed genome-wide association studies (GWAS) on 6 traits related to inflorescence morphology and 6 traits related to plant height in sorghum, comparing the genomic regions implicated in these traits by GWAS and QTL mapping, respectively. In a search for signatures of selection, we identify genomic regions that may contribute to sorghum domestication regarding plant height, flowering time and pericarp color. Comparative studies across taxa show functionally conserved 'hotspots' in sorghum and rice for awn presence and pericarp color that do not appear to reflect corresponding single genes but may indicate co-regulated clusters of genes. We also reveal homoeologous regions retaining similar functions for plant height and flowering time since genome duplication an estimated 70 million years ago or more in a common ancestor of cereals. In most such homoeologous QTL pairs, only one QTL interval exhibits strong selection signals in modern sorghum. Intersections among QTL, GWAS and comparative data advance knowledge of genetic determinants of inflorescence and plant height components in sorghum, and add new dimensions to comparisons between sorghum and rice.
A simulation study of gene-by-environment interactions in GWAS implies ample hidden effects
Marigorta, Urko M.; Gibson, Greg
2014-01-01
The switch to a modern lifestyle in recent decades has coincided with a rapid increase in prevalence of obesity and other diseases. These shifts in prevalence could be explained by the release of genetic susceptibility for disease in the form of gene-by-environment (GxE) interactions. Yet, the detection of interaction effects requires large sample sizes, little replication has been reported, and a few studies have demonstrated environmental effects only after summing the risk of GWAS alleles into genetic risk scores (GRSxE). We performed extensive simulations of a quantitative trait controlled by 2500 causal variants to inspect the feasibility to detect gene-by-environment interactions in the context of GWAS. The simulated individuals were assigned either to an ancestral or a modern setting that alters the phenotype by increasing the effect size by 1.05–2-fold at a varying fraction of perturbed SNPs (from 1 to 20%). We report two main results. First, for a wide range of realistic scenarios, highly significant GRSxE is detected despite the absence of individual genotype GxE evidence at the contributing loci. Second, an increase in phenotypic variance after environmental perturbation reduces the power to discover susceptibility variants by GWAS in mixed cohorts with individuals from both ancestral and modern environments. We conclude that a pervasive presence of gene-by-environment effects can remain hidden even though it contributes to the genetic architecture of complex traits. PMID:25101110
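The first result above, a detectable GRSxE signal when a fraction of SNP effects is amplified in one environment, can be illustrated with a toy simulation. This is a minimal sketch under assumptions chosen here for speed and clarity (200 SNPs rather than 2500, effect sizes of ±1, a single allele frequency, Gaussian noise). It is not the authors' simulation code, and all names and parameter values are hypothetical.

```python
import random


def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_grs_slopes(n=2000, n_snps=200, frac_perturbed=0.2,
                        fold=2.0, maf=0.3, noise_sd=5.0, seed=1):
    """Return the phenotype-on-GRS slope in the ancestral and modern settings."""
    rng = random.Random(seed)
    betas = [rng.choice([-1.0, 1.0]) for _ in range(n_snps)]
    perturbed = set(rng.sample(range(n_snps), int(frac_perturbed * n_snps)))
    grs_by_env = {0: [], 1: []}
    pheno_by_env = {0: [], 1: []}
    for i in range(n):
        env = i % 2  # 0 = ancestral, 1 = modern
        # Genotype = number of minor alleles (0, 1, or 2) at each SNP.
        geno = [(rng.random() < maf) + (rng.random() < maf)
                for _ in range(n_snps)]
        grs = sum(b * g for b, g in zip(betas, geno))
        pheno = grs + rng.gauss(0.0, noise_sd)
        if env == 1:
            # In the modern setting, perturbed SNPs have `fold` times the effect.
            pheno += (fold - 1.0) * sum(betas[j] * geno[j] for j in perturbed)
        grs_by_env[env].append(grs)
        pheno_by_env[env].append(pheno)
    return (slope(grs_by_env[0], pheno_by_env[0]),
            slope(grs_by_env[1], pheno_by_env[1]))


ancestral_slope, modern_slope = simulate_grs_slopes()
print(ancestral_slope, modern_slope)
```

The genetic risk score is built from the ancestral effect sizes, so a steeper slope in the modern group is the score-level interaction. Each perturbed SNP contributes only a small part of that difference, which is consistent with the abstract's point that single-locus GxE tests can stay non-significant while the aggregated score shows a clear effect.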
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Dong; Kong, Wenqian; Robertson, Jon; ...
2015-12-01
Domestication has played an important role in shaping characteristics of the inflorescence and plant height in cultivated cereals. Taking advantage of meta-analysis of QTLs, phylogenetic analyses in 502 diverse sorghum accessions, GWAS in a sorghum association panel (n = 354) and comparative data, we provide insight into the genetic basis of the domestication traits in sorghum and rice. We performed genome-wide association studies (GWAS) on 6 traits related to inflorescence morphology and 6 traits related to plant height in sorghum, comparing the genomic regions implicated in these traits by GWAS and QTL mapping, respectively. In a search for signatures of selection, we identify genomic regions that may contribute to sorghum domestication regarding plant height, flowering time and pericarp color. Comparative studies across taxa show functionally conserved ‘hotspots’ in sorghum and rice for awn presence and pericarp color that do not appear to reflect corresponding single genes but may indicate co-regulated clusters of genes. We also reveal homoeologous regions retaining similar functions for plant height and flowering time since genome duplication an estimated 70 million years ago or more in a common ancestor of cereals. In most such homoeologous QTL pairs, only one QTL interval exhibits strong selection signals in modern sorghum. Intersections among QTL, GWAS and comparative data advance knowledge of genetic determinants of inflorescence and plant height components in sorghum, and add new dimensions to comparisons between sorghum and rice.
Perez-Andreu, Virginia; Roberts, Kathryn G; Xu, Heng; Smith, Colton; Zhang, Hui; Yang, Wenjian; Harvey, Richard C; Payne-Turner, Debbie; Devidas, Meenakshi; Cheng, I-Ming; Carroll, William L; Heerema, Nyla A; Carroll, Andrew J; Raetz, Elizabeth A; Gastier-Foster, Julie M; Marcucci, Guido; Bloomfield, Clara D; Mrózek, Krzysztof; Kohlschmidt, Jessica; Stock, Wendy; Kornblau, Steven M; Konopleva, Marina; Paietta, Elisabeth; Rowe, Jacob M; Luger, Selina M; Tallman, Martin S; Dean, Michael; Burchard, Esteban G; Torgerson, Dara G; Yue, Feng; Wang, Yanli; Pui, Ching-Hon; Jeha, Sima; Relling, Mary V; Evans, William E; Gerhard, Daniela S; Loh, Mignon L; Willman, Cheryl L; Hunger, Stephen P; Mullighan, Charles G; Yang, Jun J
2015-01-22
Acute lymphoblastic leukemia (ALL) in adolescents and young adults (AYA) is characterized by distinct presenting features and inferior prognosis compared with pediatric ALL. We performed a genome-wide association study (GWAS) to comprehensively identify inherited genetic variants associated with susceptibility to AYA ALL. In the discovery GWAS, we compared genotype frequency at 635,297 single nucleotide polymorphisms (SNPs) in 308 AYA ALL cases and 6,661 non-ALL controls by using a logistic regression model with genetic ancestry as a covariate. SNPs that reached P ≤ 5 × 10^-8 in GWAS were tested in an independent cohort of 162 AYA ALL cases and 5,755 non-ALL controls. We identified a single genome-wide significant susceptibility locus in GATA3: rs3824662, odds ratio (OR), 1.77 (P = 2.8 × 10^-10) and rs3781093, OR, 1.73 (P = 3.2 × 10^-9). These findings were validated in the replication cohort. The risk allele at rs3824662 was most frequent in Philadelphia chromosome (Ph)-like ALL but also conferred susceptibility to non-Ph-like ALL in AYAs. In 1,827 non-selected ALL cases, the risk allele frequency at this SNP was positively correlated with age at diagnosis (P = 6.29 × 10^-11). Our results from this first GWAS of AYA ALL susceptibility point to unique biology underlying leukemogenesis and potentially distinct disease etiology by age group.
Establishing the role of rare coding variants in known Parkinson's disease risk loci.
Jansen, Iris E; Gibbs, J Raphael; Nalls, Mike A; Price, T Ryan; Lubbe, Steven; van Rooij, Jeroen; Uitterlinden, André G; Kraaij, Robert; Williams, Nigel M; Brice, Alexis; Hardy, John; Wood, Nicholas W; Morris, Huw R; Gasser, Thomas; Singleton, Andrew B; Heutink, Peter; Sharma, Manu
2017-11-01
Many common genetic factors have been identified to contribute to Parkinson's disease (PD) susceptibility, improving our understanding of the related underlying biological mechanisms. The involvement of rarer variants in these loci has been poorly studied. Using International Parkinson's Disease Genomics Consortium data sets, we performed a comprehensive study to determine the impact of rare variants in 23 previously published genome-wide association studies (GWAS) loci in PD. We applied Prix fixe to select the putative causal genes underneath the GWAS peaks, which was based on underlying functional similarities. The Sequence Kernel Association Test was used to analyze the joint effect of rare, common, or both types of variants on PD susceptibility. All genes were tested simultaneously as a gene set and each gene individually. We observed a moderate association of common variants, confirming the involvement of the known PD risk loci within our genetic data sets. Focusing on rare variants, we identified additional association signals for LRRK2, STBD1, and SPATA19. Our study suggests an involvement of rare variants within several putatively causal genes underneath previously identified PD GWAS peaks. Copyright © 2017 Elsevier Inc. All rights reserved.
He, Liang; Zhbannikov, Ilya; Arbeev, Konstantin G; Yashin, Anatoliy I; Kulminski, Alexander M
2017-11-01
Unraveling the underlying biological mechanisms or pathways behind the effects of genetic variations on complex diseases remains one of the major challenges in the post-GWAS (where GWAS is genome-wide association study) era. To further explore the relationship between genetic variations, biomarkers, and diseases for elucidating underlying pathological mechanism, a huge effort has been placed on examining pleiotropic and gene-environmental interaction effects. We propose a novel genetic stochastic process model (GSPM) that can be applied to GWAS and jointly investigate the genetic effects on longitudinally measured biomarkers and risks of diseases. This model is characterized by more profound biological interpretation and takes into account the dynamics of biomarkers during follow-up when investigating the hazards of a disease. We illustrate the rationale and evaluate the performance of the proposed model through two GWAS. One is to detect single nucleotide polymorphisms (SNPs) having interaction effects on type 2 diabetes (T2D) with body mass index (BMI) and the other is to detect SNPs affecting the optimal BMI level for protecting from T2D. We identified multiple SNPs that showed interaction effects with BMI on T2D, including a novel SNP rs11757677 in the CDKAL1 gene (P = 5.77 × 10^-7). We also found a SNP rs1551133 located on 2q14.2 that reversed the effect of BMI on T2D (P = 6.70 × 10^-7). In conclusion, the proposed GSPM provides a promising and useful tool in GWAS of longitudinal data for interrogating pleiotropic and interaction effects to gain more insights into the relationship between genes, quantitative biomarkers, and risks of complex diseases. © 2017 WILEY PERIODICALS, INC.
Markunas, Christina A; Johnson, Eric O; Hancock, Dana B
2017-07-01
Genome-wide association study (GWAS)-identified variants are enriched for functional elements. However, we have limited knowledge of how functional enrichment may differ by disease/trait and tissue type. We tested a broad set of eight functional elements for enrichment among GWAS-identified SNPs (p < 5 × 10^-8) from the NHGRI-EBI Catalog across seven disease/trait categories: cancer, cardiovascular disease, diabetes, autoimmune disease, psychiatric disease, neurological disease, and anthropometric traits. SNPs were annotated using HaploReg for the eight functional elements across any tissue: DNase sites, expression quantitative trait loci (eQTL), sequence conservation, enhancers, promoters, missense variants, sequence motifs, and protein binding sites. In addition, tissue-specific annotations were considered for brain vs. blood. Disease/trait SNPs were compared to a control set of 4809 SNPs matched to the GWAS SNPs (N = 1639) on allele frequency, gene density, distance to nearest gene, and linkage disequilibrium at ~3:1 ratio. Enrichment analyses were conducted using logistic regression, with Bonferroni correction. Overall, a significant enrichment was observed for all functional elements, except sequence motifs. Missense SNPs showed the strongest magnitude of enrichment. eQTLs were the only functional element significantly enriched across all diseases/traits. Magnitudes of enrichment were generally similar across diseases/traits, where enrichment was statistically significant. Blood vs. brain tissue effects on enrichment were dependent on disease/trait and functional element (e.g., cardiovascular disease: eQTLs P_TissueDifference = 1.28 × 10^-6 vs. enhancers P_TissueDifference = 0.94). Identifying disease/trait-relevant functional elements and tissue types could provide new insight into the underlying biology, by guiding a priori GWAS analyses (e.g., brain enhancer elements for psychiatric disease) or facilitating post hoc interpretation.
Network-Guided GWAS Improves Identification of Genes Affecting Free Amino Acids.
Angelovici, Ruthie; Batushansky, Albert; Deason, Nicholas; Gonzalez-Jorge, Sabrina; Gore, Michael A; Fait, Aaron; DellaPenna, Dean
2017-01-01
Amino acids are essential for proper growth and development in plants. Amino acids serve as building blocks for proteins but also are important for responses to stress and the biosynthesis of numerous essential compounds. In seed, the pool of free amino acids (FAAs) also contributes to alternative energy, desiccation, and seed vigor; thus, manipulating FAA levels can significantly impact a seed's nutritional qualities. While genome-wide association studies (GWAS) on branched-chain amino acids have identified some regulatory genes controlling seed FAAs, the genetic regulation of FAA levels, composition, and homeostasis in seeds remains mostly unresolved. Hence, we performed GWAS on 18 FAAs from a 313-ecotype Arabidopsis (Arabidopsis thaliana) association panel. Specifically, GWAS was performed on 98 traits derived from known amino acid metabolic pathways (approach 1) and then on 92 traits generated from an unbiased correlation-based metabolic network analysis (approach 2), and the results were compared. The latter approach facilitated the discovery of additional novel metabolic interactions and single-nucleotide polymorphism-trait associations not identified by the former approach. The most prominent network-guided GWAS signal was for a histidine (His)-related trait in a region containing two genes: a cationic amino acid transporter (CAT4) and a polynucleotide phosphorylase resistant to inhibition with fosmidomycin. A reverse genetics approach confirmed CAT4 to be responsible for the natural variation of His-related traits across the association panel. Given that His is a semiessential amino acid and a potent metal chelator, CAT4 orthologs could be considered as candidate genes for seed quality biofortification in crop plants. © 2017 American Society of Plant Biologists. All Rights Reserved.
Gupta, Aditi; Juyal, Garima; Sood, Ajit; Midha, Vandana; Yamazaki, Keiko; Vich Vila, Arnau; Esaki, Motohiro; Matsui, Toshiyuki; Takahashi, Atsushi; Kubo, Michiaki; Weersma, Rinse K; Thelma, B K
2017-01-01
The first genome-wide association study (GWAS) of ulcerative colitis in the genetically distinct north Indian population identified two novel genes, CFB and SLC44A4. Considering their biological relevance, we investigated allelic/genetic heterogeneity in these genes among ulcerative colitis cohorts of north Indian, Japanese and Dutch origin using high-density ImmunoChip case–control genotype data. Comparative linkage disequilibrium profiling and tests of association were performed. Of the 28 CFB SNPs, similar strength of association was observed for rs4151657 (novel ulcerative colitis GWAS SNP) in north Indians (P=1.73 × 10^-10) and Japanese (P=2.02 × 10^-12) but not in the Dutch. Further, a three-marker haplotype was shared between north Indians and Japanese (P<10^-8), but a different five-marker haplotype was associated (P=2.07 × 10^-6) in the Dutch. Of the 22 SLC44A4 SNPs, rs2736428 (novel ulcerative colitis GWAS SNP) was found significantly associated in north Indians (P=4.94 × 10^-10) and Japanese (P=3.37 × 10^-9), but not among the Dutch. These results suggest (i) apparent allelic heterogeneity in CFB and genetic heterogeneity in SLC44A4 across different ethnic groups; (ii) shared ulcerative colitis genetic etiological factors among Asians; and finally (iii) that re-exploration of GWAS findings together with high-density genotyping/sequencing and trans-ethnic fine mapping approaches may help identify shared and population-specific risk variants and help explain missing disease heritability. PMID:27759029
Gala, Manish; Abecasis, Goncalo; Bezieau, Stephane; Brenner, Hermann; Butterbach, Katja; Caan, Bette J.; Carlson, Christopher S.; Casey, Graham; Chang-Claude, Jenny; Conti, David V.; Curtis, Keith R.; Duggan, David; Gallinger, Steven; Haile, Robert W.; Harrison, Tabitha A.; Hayes, Richard B.; Hoffmeister, Michael; Hopper, John L.; Hudson, Thomas J.; Jenkins, Mark A.; Küry, Sébastien; Le Marchand, Loic; Leal, Suzanne M.; Newcomb, Polly A.; Nickerson, Deborah A.; Potter, John D.; Schoen, Robert E.; Schumacher, Fredrick R.; Seminara, Daniela; Slattery, Martha L.; Hsu, Li; Chan, Andrew T.; White, Emily; Berndt, Sonja I.; Peters, Ulrike
2016-01-01
Genome-wide association studies (GWAS) have identified many common single nucleotide polymorphisms (SNPs) associated with colorectal cancer risk. These SNPs may tag correlated variants with biological importance. Fine-mapping around GWAS loci can facilitate detection of functional candidates and additional independent risk variants. We analyzed 11,900 cases and 14,311 controls in the Genetics and Epidemiology of Colorectal Cancer Consortium and the Colon Cancer Family Registry. To fine-map genomic regions containing all known common risk variants, we imputed high-density genetic data from the 1000 Genomes Project. We tested single-variant associations with colorectal tumor risk for all variants spanning genomic regions 250-kb upstream or downstream of 31 GWAS-identified SNPs (index SNPs). We queried the University of California, Santa Cruz Genome Browser to examine evidence for biological function. Index SNPs did not show the strongest association signals with colorectal tumor risk in their respective genomic regions. Bioinformatics analysis of SNPs showing smaller P-values in each region revealed 21 functional candidates in 12 loci (5q31.1, 8q24, 11q13.4, 11q23, 12p13.32, 12q24.21, 14q22.2, 15q13, 18q21, 19q13.1, 20p12.3, and 20q13.33). We did not observe evidence of additional independent association signals in GWAS-identified regions. Our results support the utility of integrating data from comprehensive fine-mapping with expanding publicly available genomic databases to help clarify GWAS associations and identify functional candidates that warrant more onerous laboratory follow-up. Such efforts may aid the eventual discovery of disease-causing variant(s). PMID:27379672
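The regional step in this abstract, collecting every variant within 250 kb of an index SNP and checking whether the index SNP is the strongest signal, can be sketched as follows. The `Variant` record, the function names, and all identifiers, positions, and p-values are hypothetical; only the 250-kb flank comes from the abstract.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    rsid: str
    chrom: str
    pos: int        # base-pair position
    p_value: float  # single-variant association p-value


def variants_in_window(variants, index, flank=250_000):
    """Variants on the index SNP's chromosome within `flank` bp on either side."""
    return [
        v for v in variants
        if v.chrom == index.chrom and abs(v.pos - index.pos) <= flank
    ]


def lead_variant(variants):
    """Variant with the smallest p-value in a region."""
    return min(variants, key=lambda v: v.p_value)


# HYPOTHETICAL data: identifiers, positions and p-values are made up.
index_snp = Variant("rs_index", "8", 1_000_000, 1e-9)
imputed = [
    index_snp,
    Variant("rs_a", "8", 1_120_000, 3e-11),   # inside the window, stronger signal
    Variant("rs_b", "8", 1_400_000, 1e-20),   # outside the window
    Variant("rs_c", "11", 1_010_000, 1e-15),  # different chromosome
]

region = variants_in_window(imputed, index_snp)
best = lead_variant(region)
print(best.rsid, "is the lead variant; index SNP is lead:", best == index_snp)
```

The sketch stops at ranking by p-value; judging whether a stronger neighbour is an independent signal would need conditional analysis, which is not shown here.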
Mirkovic, Bojan; Laurent, Claudine; Podlipski, Marc-Antoine; Frebourg, Thierry; Cohen, David; Gerardin, Priscille
2016-01-01
Suicidal behaviors (SBs), which range from suicidal ideation to suicide attempts and completed suicide, represent a fatal dimension of mental ill-health. The involvement of genetic risk factors in SB is supported by family, twin, and adoption studies. The aim of this paper is to review recent genetic association studies in SBs including (i) case–control studies, (ii) family-based association studies, and (iii) genome-wide association studies (GWAS). Various studies on genetic associations have tended to suggest that a number of genes [e.g., tryptophan hydroxylase, serotonin receptors and transporters, or brain-derived neurotrophic factors (BDNFs)] are linked to SBs, but these findings are not consistently supported by the results obtained. Although the candidate–gene approach is useful, it is hampered by the present state of knowledge concerning the pathophysiology of diseases. Interpretations of GWAS results are mostly hindered by a lack of annotation describing the functions of most variation throughout the genome. Association studies have addressed a wide range of single-nucleotide polymorphisms in numerous genes. We have included 104 such studies, of which 10 are family-based association studies and 11 are GWAS. Numerous meta-analyses of case–control studies have shown significant associations of SB with variants in the serotonin transporter gene (5-HTT or SLC6A4) and the tryptophan hydroxylase 1 gene (TPH1), but others report contradictory results. The gene encoding BDNF and its receptor (NTRK2) are also promising candidates. Only two of the GWAS showed any significant associations. Several pathways are mentioned in an attempt to understand the lack of reproducibility and the disappointing results. Consequently, we review and discuss here the following aspects: (i) sample characteristics and confounding factors; (ii) statistical limits; (iii) gene–gene interactions; (iv) gene, environment, and by time interactions; and (v) technological and theoretical limits. PMID:27721799
Postmus, Iris; Trompet, Stella; Deshmukh, Harshal A.; Barnes, Michael R.; Li, Xiaohui; Warren, Helen R.; Chasman, Daniel I.; Zhou, Kaixin; Arsenault, Benoit J.; Donnelly, Louise A.; Wiggins, Kerri L.; Avery, Christy L.; Griffin, Paula; Feng, QiPing; Taylor, Kent D.; Li, Guo; Evans, Daniel S.; Smith, Albert V.; de Keyser, Catherine E.; Johnson, Andrew D.; de Craen, Anton J. M.; Stott, David J.; Buckley, Brendan M.; Ford, Ian; Westendorp, Rudi G. J.; Eline Slagboom, P.; Sattar, Naveed; Munroe, Patricia B.; Sever, Peter; Poulter, Neil; Stanton, Alice; Shields, Denis C.; O’Brien, Eoin; Shaw-Hawkins, Sue; Ida Chen, Y.-D.; Nickerson, Deborah A.; Smith, Joshua D.; Pierre Dubé, Marie; Matthijs Boekholdt, S.; Kees Hovingh, G.; Kastelein, John J. P.; McKeigue, Paul M.; Betteridge, John; Neil, Andrew; Durrington, Paul N.; Doney, Alex; Carr, Fiona; Morris, Andrew; McCarthy, Mark I.; Groop, Leif; Ahlqvist, Emma; Bis, Joshua C.; Rice, Kenneth; Smith, Nicholas L.; Lumley, Thomas; Whitsel, Eric A.; Stürmer, Til; Boerwinkle, Eric; Ngwa, Julius S.; O’Donnell, Christopher J.; Vasan, Ramachandran S.; Wei, Wei-Qi; Wilke, Russell A.; Liu, Ching-Ti; Sun, Fangui; Guo, Xiuqing; Heckbert, Susan R; Post, Wendy; Sotoodehnia, Nona; Arnold, Alice M.; Stafford, Jeanette M.; Ding, Jingzhong; Herrington, David M.; Kritchevsky, Stephen B.; Eiriksdottir, Gudny; Launer, Leonore J.; Harris, Tamara B.; Chu, Audrey Y.; Giulianini, Franco; MacFadyen, Jean G.; Barratt, Bryan J.; Nyberg, Fredrik; Stricker, Bruno H.; Uitterlinden, André G.; Hofman, Albert; Rivadeneira, Fernando; Emilsson, Valur; Franco, Oscar H.; Ridker, Paul M.; Gudnason, Vilmundur; Liu, Yongmei; Denny, Joshua C.; Ballantyne, Christie M.; Rotter, Jerome I.; Adrienne Cupples, L.; Psaty, Bruce M.; Palmer, Colin N. 
A.; Tardif, Jean-Claude; Colhoun, Helen M.; Hitman, Graham; Krauss, Ronald M.; Wouter Jukema, J; Caulfield, Mark J.; Donnelly, Peter; Barroso, Ines; Blackwell, Jenefer M.; Bramon, Elvira; Brown, Matthew A.; Casas, Juan P.; Corvin, Aiden; Deloukas, Panos; Duncanson, Audrey; Jankowski, Janusz; Markus, Hugh S.; Mathew, Christopher G.; Palmer, Colin N. A.; Plomin, Robert; Rautanen, Anna; Sawcer, Stephen J.; Trembath, Richard C.; Viswanathan, Ananth C.; Wood, Nicholas W.; Spencer, Chris C. A.; Band, Gavin; Bellenguez, Céline; Freeman, Colin; Hellenthal, Garrett; Giannoulatou, Eleni; Pirinen, Matti; Pearson, Richard; Strange, Amy; Su, Zhan; Vukcevic, Damjan; Donnelly, Peter; Langford, Cordelia; Hunt, Sarah E.; Edkins, Sarah; Gwilliam, Rhian; Blackburn, Hannah; Bumpstead, Suzannah J.; Dronov, Serge; Gillman, Matthew; Gray, Emma; Hammond, Naomi; Jayakumar, Alagurevathi; McCann, Owen T.; Liddle, Jennifer; Potter, Simon C.; Ravindrarajah, Radhi; Ricketts, Michelle; Waller, Matthew; Weston, Paul; Widaa, Sara; Whittaker, Pamela; Barroso, Ines; Deloukas, Panos; Mathew, Christopher G.; Blackwell, Jenefer M.; Brown, Matthew A.; Corvin, Aiden; McCarthy, Mark I.; Spencer, Chris C. A.
2014-01-01
Statins effectively lower LDL cholesterol levels in large studies and the observed interindividual response variability may be partially explained by genetic variation. Here we perform a pharmacogenetic meta-analysis of genome-wide association studies (GWAS) in studies addressing the LDL cholesterol response to statins, including up to 18,596 statin-treated subjects. We validate the most promising signals in a further 22,318 statin recipients and identify two loci, SORT1/CELSR2/PSRC1 and SLCO1B1, not previously identified in GWAS. Moreover, we confirm the previously described associations with APOE and LPA. Our findings advance the understanding of the pharmacogenetic architecture of statin response. PMID:25350695
Pasaniuc, Bogdan; Zaitlen, Noah; Lettre, Guillaume; Chen, Gary K; Tandon, Arti; Kao, W H Linda; Ruczinski, Ingo; Fornage, Myriam; Siscovick, David S; Zhu, Xiaofeng; Larkin, Emma; Lange, Leslie A; Cupples, L Adrienne; Yang, Qiong; Akylbekova, Ermeg L; Musani, Solomon K; Divers, Jasmin; Mychaleckyj, Joe; Li, Mingyao; Papanicolaou, George J; Millikan, Robert C; Ambrosone, Christine B; John, Esther M; Bernstein, Leslie; Zheng, Wei; Hu, Jennifer J; Ziegler, Regina G; Nyante, Sarah J; Bandera, Elisa V; Ingles, Sue A; Press, Michael F; Chanock, Stephen J; Deming, Sandra L; Rodriguez-Gil, Jorge L; Palmer, Cameron D; Buxbaum, Sarah; Ekunwe, Lynette; Hirschhorn, Joel N; Henderson, Brian E; Myers, Simon; Haiman, Christopher A; Reich, David; Patterson, Nick; Wilson, James G; Price, Alkes L
2011-04-01
While genome-wide association studies (GWAS) have primarily examined populations of European ancestry, more recent studies often involve additional populations, including admixed populations such as African Americans and Latinos. In admixed populations, linkage disequilibrium (LD) exists both at a fine scale in ancestral populations and at a coarse scale (admixture-LD) due to chromosomal segments of distinct ancestry. Disease association statistics in admixed populations have previously considered SNP association (LD mapping) or admixture association (mapping by admixture-LD), but not both. Here, we introduce a new statistical framework for combining SNP and admixture association in case-control studies, as well as methods for local ancestry-aware imputation. We illustrate the gain in statistical power achieved by these methods by analyzing data of 6,209 unrelated African Americans from the CARe project genotyped on the Affymetrix 6.0 chip, in conjunction with both simulated and real phenotypes, as well as by analyzing the FGFR2 locus using breast cancer GWAS data from 5,761 African-American women. We show that, at typed SNPs, our method yields an 8% increase in statistical power for finding disease risk loci compared to the power achieved by standard methods in case-control studies. At imputed SNPs, we observe an 11% increase in statistical power for mapping disease loci when our local ancestry-aware imputation framework and the new scoring statistic are jointly employed. Finally, we show that our method increases statistical power in regions harboring the causal SNP in the case when the causal SNP is untyped and cannot be imputed. Our methods and our publicly available software are broadly applicable to GWAS in admixed populations.
Chen, Zhuo; Tao, Sha; Gao, Yong; Zhang, Ju; Hu, Yanling; Mo, Linjian; Kim, Seong-Tae; Yang, Xiaobo; Tan, Aihua; Zhang, Haiying; Qin, Xue; Li, Li; Wu, Yongming; Zhang, Shijun; Zheng, S Lilly; Xu, Jianfeng; Mo, Zengnan; Sun, Jielin
2013-12-01
Sex hormones and gonadotropins exert a wide variety of effects in physiological and pathological processes. Accumulated evidence shows a strong heritable component of circulating concentrations of these hormones. Recently, several genome-wide association studies (GWASs) conducted in Caucasians have identified multiple loci that influence serum levels of sex hormones. However, the genetic determinants remain unknown in Chinese populations. In this study, we aimed to identify genetic variants associated with major sex hormones and gonadotropins, including testosterone, oestradiol, follicle-stimulating hormone (FSH), luteinising hormone (LH) and sex hormone binding globulin (SHBG), in a Chinese population. A two-stage GWAS was conducted in a total of 3495 healthy Chinese men (1999 subjects in the GWAS discovery stage and 1496 in the confirmation stage). We identified a novel genetic region at 15q21.2 (rs2414095 in CYP19A1), which was significantly associated with oestradiol and FSH in the Chinese population at a genome-wide significant level (p=6.54×10^-31 and 1.59×10^-16, respectively). Another single nucleotide polymorphism in the CYP19A1 gene was significantly associated with oestradiol level (rs2445762, p=7.75×10^-28). In addition, we confirmed the previous GWAS-identified locus at 17p13.1 for testosterone (rs2075230, p=1.13×10^-8) and SHBG level (rs2075230, p=4.75×10^-19) in the Chinese population. This study is the first GWAS investigation of genetic determinants of FSH and LH. The identification of novel susceptibility loci may provide more biological implications for the synthesis and metabolism of these hormones. More importantly, the confirmation of the genetic loci for testosterone and SHBG suggests common genetic components shared among different ethnicities.
Alqudah, Ahmad M.; Sharma, Rajiv; Pasam, Raj K.; Graner, Andreas; Kilian, Benjamin; Schnurbusch, Thorsten
2014-01-01
Heading time is a complex trait, and natural variation in photoperiod responses is a major factor controlling time to heading, adaptation and grain yield. In barley, previous heading time studies have been mainly conducted under field conditions to measure total days to heading. We followed a novel approach and studied the natural variation of time to heading in a world-wide spring barley collection (218 accessions), comprising 95 photoperiod-sensitive accessions (Ppd-H1) and 123 accessions with reduced photoperiod sensitivity (ppd-H1) to long-day (LD) conditions, by dissecting pre-anthesis development into four major stages and sub-phases. The study was conducted under greenhouse (GH) conditions (LD; 16/8 h; ∼20/∼16°C day/night). Genotyping was performed using a genome-wide high-density 9K single nucleotide polymorphism (SNP) chip which assayed 7842 SNPs. We used the barley physical map to identify candidate genes underlying genome-wide association scans (GWAS). GWAS for pre-anthesis stages/sub-phases in each photoperiod group provided great power for partitioning genetic effects on floral initiation and heading time. In addition to major genes known to regulate heading time under field conditions, several novel QTL with medium to high effects, including new QTL having major effects on developmental stages/sub-phases, were found to be associated in this study. For example, highly associated SNPs tagged the physical regions around HvCO1 (barley CONSTANS1) and BFL (BARLEY FLORICAULA/LEAFY) genes. Based upon our GWAS analysis, we propose a new genetic network model for each photoperiod group, which includes several newly identified genes, such as several HvCO-like genes, belonging to different heading time pathways in barley. PMID:25420105
Multi-variant study of obesity risk genes in African Americans: The Jackson Heart Study.
Liu, Shijian; Wilson, James G; Jiang, Fan; Griswold, Michael; Correa, Adolfo; Mei, Hao
2016-11-30
Genome-wide association studies (GWAS) have been successful in identifying obesity risk genes by single-variant association analysis. For this study, we designed a stepwise analysis strategy and aimed to identify multi-variant effects on obesity risk among candidate genes. Our analyses focused on 2137 African American participants with body mass index measured in the Jackson Heart Study and 657 common single nucleotide polymorphisms (SNPs) genotyped at 8 GWAS-identified obesity risk genes. The single-variant association test showed that no SNPs reached significance after multiple testing adjustment. The subsequent gene-gene interaction analysis, which focused on SNPs with unadjusted p-value<0.10, identified 6 significant multi-variant associations. Logistic regression showed that SNPs in these associations did not have significant linear interactions; examination of the genetic risk score showed that 4 multi-variant associations had significant additive effects of risk SNPs; and the haplotype association test showed that all multi-variant associations contained one or several combinations of particular alleles or haplotypes associated with increased obesity risk. Our study showed that obesity risk genes generated multi-variant effects, which can be additive or non-linear interactions, and that multi-variant study is an important supplement to existing GWAS for understanding genetic effects of obesity risk genes. Copyright © 2016 Elsevier B.V. All rights reserved.
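The additive effect examined through the genetic risk score can be pictured with a minimal sketch. The abstract does not say how the score was built, so the version below assumes the common definition: the count of risk alleles a participant carries across the SNPs of one multi-variant association, optionally weighted per SNP. The function name, SNP labels, and weights are hypothetical.

```python
def genetic_risk_score(risk_allele_counts, weights=None):
    """Sum of risk-allele counts (0, 1 or 2 per SNP), optionally weighted.

    risk_allele_counts: dict mapping SNP label -> number of risk alleles carried.
    weights: optional dict mapping SNP label -> per-allele weight
             (for example a log odds ratio); defaults to 1 for every SNP.
    """
    for snp, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: risk-allele count must be 0, 1 or 2")
    if weights is None:
        return sum(risk_allele_counts.values())
    return sum(weights[snp] * count for snp, count in risk_allele_counts.items())


# HYPOTHETICAL participant: SNP labels and weights are invented for illustration.
participant = {"snp_1": 2, "snp_2": 1, "snp_3": 0}
print(genetic_risk_score(participant))  # unweighted count of risk alleles
print(genetic_risk_score(participant, {"snp_1": 0.5, "snp_2": 1.0, "snp_3": 2.0}))
```

An additive effect then corresponds to obesity risk rising steadily with the score, which can be tested by regressing the outcome on the score.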
Multi-Trait GWAS and New Candidate Genes Annotation for Growth Curve Parameters in Brahman Cattle
Crispim, Aline Camporez; Kelly, Matthew John; Guimarães, Simone Eliza Facioni; e Silva, Fabyano Fonseca; Fortes, Marina Rufino Salinas; Wenceslau, Raphael Rocha; Moore, Stephen
2015-01-01
Understanding the genetic architecture of beef cattle growth cannot be limited simply to the genome-wide association study (GWAS) for body weight at any specific ages, but should be extended to a more general purpose by considering the whole growth trajectory over time using a growth curve approach. For such an approach, the parameters that are used to describe growth curves were treated as phenotypes under a GWAS model. Data from 1,255 Brahman cattle that were weighed at birth, 6, 12, 15, 18, and 24 months of age were analyzed. Parameter estimates, such as mature weight (A) and maturity rate (K) from nonlinear models are utilized as substitutes for the original body weights for the GWAS analysis. We chose the best nonlinear model to describe the weight-age data, and the estimated parameters were used as phenotypes in a multi-trait GWAS. Our aims were to identify and characterize associated SNP markers to indicate SNP-derived candidate genes and annotate their function as related to growth processes in beef cattle. The Brody model presented the best goodness of fit, and the heritability values for the parameter estimates for mature weight (A) and maturity rate (K) were 0.23 and 0.32, respectively, proving that these traits can be a feasible alternative when the objective is to change the shape of growth curves within genetic improvement programs. The genetic correlation between A and K was -0.84, indicating that animals with lower mature body weights reached that weight at younger ages. One hundred and sixty seven (167) and two hundred and sixty two (262) significant SNPs were associated with A and K, respectively. The annotated genes closest to the most significant SNPs for A had direct biological functions related to muscle development (RAB28), myogenic induction (BTG1), fetal growth (IL2), and body weights (APEX2); K genes were functionally associated with body weight, body height, average daily gain (TMEM18), and skeletal muscle development (SMN1). 
Candidate genes emerging from this GWAS may inform the search for causative mutations that could underpin genomic breeding for improved growth rates. PMID:26445451
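The growth-curve parameters used as phenotypes here can be made concrete with a short sketch of the Brody model in its usual form, W(t) = A(1 - B·exp(-K·t)), where A is mature weight, K is maturity rate, and B is a constant tied to birth weight through W(0) = A(1 - B). The weighing ages are those listed in the abstract; the parameter values and the function name are hypothetical and are not estimates from the study.

```python
import math


def brody_weight(age_months, mature_weight, b, maturity_rate):
    """Predicted body weight under the Brody growth model.

    W(t) = A * (1 - B * exp(-K * t)), with A = mature_weight,
    B = b (dimensionless constant) and K = maturity_rate (per month).
    """
    return mature_weight * (1.0 - b * math.exp(-maturity_rate * age_months))


# Weighing ages from the abstract (months); parameter values are HYPOTHETICAL.
AGES_MONTHS = (0, 6, 12, 15, 18, 24)
A, B, K = 500.0, 0.93, 0.05

for age in AGES_MONTHS:
    print(f"{age:>2} months: {brody_weight(age, A, B, K):6.1f} kg")
```

In the study design, A and K would be estimated separately for each animal from its weight records and then analysed as traits; the fitting step is not shown in this sketch.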
Recent developments in the genetics of ADHD.
Grimm, Oliver; Kittel-Schneider, Sarah; Reif, Andreas
2018-05-02
Attention deficit hyperactivity disorder (ADHD) is a developmental psychiatric disorder which affects children and adults. ADHD is one of the psychiatric disorders with the strongest genetic basis according to familial, twin and SNP-based epidemiological studies. In this review, we provide an update of recent insights into the genetic basis of ADHD. We discuss recent progress from genome-wide association studies (GWAS) looking at common variants as well as rare copy number variations (CNVs). New analyses of gene groups, so-called functional ontologies, provide some insight into the gene networks affected, pointing to the role of neurodevelopmentally expressed gene networks. Bioinformatic methods such as functional enrichment analysis and protein-protein network analysis are used to highlight biological processes of likely relevance to the aetiology of ADHD. Additionally, CNVs seem to map onto important pathways implicated in synaptic signalling and neurodevelopment. While some candidate gene associations, e.g. of neurotransmitter receptors and signalling, have been replicated, they do not seem to explain significant variance in recent GWAS. We discuss insights from recent case-control SNP-GWAS which yielded genome-wide significant SNPs in ADHD.
Kulbrock, Maike; Lehner, Stefanie; Metzger, Julia; Ohnesorge, Bernhard; Distl, Ottmar
2013-01-01
Equine recurrent uveitis (ERU) is a common eye disease affecting 3-15% of the horse population. A genome-wide association study (GWAS) using the Illumina equine SNP50 bead chip was performed to identify loci conferring risk to ERU. The sample included a total of 144 German warmblood horses. The GWAS showed a significant single nucleotide polymorphism (SNP) on horse chromosome (ECA) 20 at 49.3 Mb, with IL-17A and IL-17F being the closest genes. This locus explained 23% of the phenotypic variance for ERU. A GWAS taking into account the severity of ERU revealed a SNP on ECA18 near the crystallin gene cluster CRYGA-CRYGF. For both genomic regions on ECA18 and ECA20, significantly associated haplotypes containing the genome-wide significant SNPs could be demonstrated. In conclusion, our results indicate a genetic component regulating the possible critical role of IL-17A and IL-17F in the pathogenesis of ERU. The associated SNP on ECA18 may be indicative of cataract formation in the course of ERU.
Identification of ten variants associated with risk of estrogen-receptor-negative breast cancer
Milne, Roger L; Kuchenbaecker, Karoline B; Michailidou, Kyriaki; Beesley, Jonathan; Kar, Siddhartha; Lindström, Sara; Hui, Shirley; Lemaçon, Audrey; Soucy, Penny; Dennis, Joe; Jiang, Xia; Rostamianfar, Asha; Finucane, Hilary; Bolla, Manjeet K; McGuffog, Lesley; Wang, Qin; Aalfs, Cora M; Adams, Marcia; Adlard, Julian; Agata, Simona; Ahmed, Shahana; Ahsan, Habibul; Aittomäki, Kristiina; Al-Ejeh, Fares; Allen, Jamie; Ambrosone, Christine B; Amos, Christopher I; Andrulis, Irene L; Anton-Culver, Hoda; Antonenkova, Natalia N; Arndt, Volker; Arnold, Norbert; Aronson, Kristan J; Auber, Bernd; Auer, Paul L; Ausems, Margreet G E M; Azzollini, Jacopo; Bacot, François; Balmaña, Judith; Barile, Monica; Barjhoux, Laure; Barkardottir, Rosa B; Barrdahl, Myrto; Barnes, Daniel; Barrowdale, Daniel; Baynes, Caroline; Beckmann, Matthias W; Benitez, Javier; Bermisheva, Marina; Bernstein, Leslie; Bignon, Yves-Jean; Blazer, Kathleen R; Blok, Marinus J; Blomqvist, Carl; Blot, William; Bobolis, Kristie; Boeckx, Bram; Bogdanova, Natalia V; Bojesen, Anders; Bojesen, Stig E; Bonanni, Bernardo; Børresen-Dale, Anne-Lise; Bozsik, Aniko; Bradbury, Angela R; Brand, Judith S; Brauch, Hiltrud; Brenner, Hermann; Bressac-de Paillerets, Brigitte; Brewer, Carole; Brinton, Louise; Broberg, Per; Brooks-Wilson, Angela; Brunet, Joan; Brüning, Thomas; Burwinkel, Barbara; Buys, Saundra S; Byun, Jinyoung; Cai, Qiuyin; Caldés, Trinidad; Caligo, Maria A; Campbell, Ian; Canzian, Federico; Caron, Olivier; Carracedo, Angel; Carter, Brian D; Castelao, J Esteban; Castera, Laurent; Caux-Moncoutier, Virginie; Chan, Salina B; Chang-Claude, Jenny; Chanock, Stephen J; Chen, Xiaoqing; Cheng, Ting-Yuan David; Chiquette, Jocelyne; Christiansen, Hans; Claes, Kathleen B M; Clarke, Christine L; Conner, Thomas; Conroy, Don M; Cook, Jackie; Cordina-Duverger, Emilie; Cornelissen, Sten; Coupier, Isabelle; Cox, Angela; Cox, David G; Cross, Simon S; Cuk, Katarina; Cunningham, Julie M; Czene, Kamila; Daly, Mary B; Damiola, Francesca; 
Darabi, Hatef; Davidson, Rosemarie; De Leeneer, Kim; Devilee, Peter; Dicks, Ed; Diez, Orland; Ding, Yuan Chun; Ditsch, Nina; Doheny, Kimberly F; Domchek, Susan M; Dorfling, Cecilia M; Dörk, Thilo; dos-Santos-Silva, Isabel; Dubois, Stéphane; Dugué, Pierre-Antoine; Dumont, Martine; Dunning, Alison M; Durcan, Lorraine; Dwek, Miriam; Dworniczak, Bernd; Eccles, Diana; Eeles, Ros; Ehrencrona, Hans; Eilber, Ursula; Ejlertsen, Bent; Ekici, Arif B; Engel, Christoph; Eriksson, Mikael; Fachal, Laura; Faivre, Laurence; Fasching, Peter A; Faust, Ulrike; Figueroa, Jonine; Flesch-Janys, Dieter; Fletcher, Olivia; Flyger, Henrik; Foulkes, William D; Friedman, Eitan; Fritschi, Lin; Frost, Debra; Gabrielson, Marike; Gaddam, Pragna; Gammon, Marilie D; Ganz, Patricia A; Gapstur, Susan M; Garber, Judy; Garcia-Barberan, Vanesa; García-Sáenz, José A; Gaudet, Mia M; Gauthier-Villars, Marion; Gehrig, Andrea; Georgoulias, Vassilios; Gerdes, Anne-Marie; Giles, Graham G; Glendon, Gord; Godwin, Andrew K; Goldberg, Mark S; Goldgar, David E; González-Neira, Anna; Goodfellow, Paul; Greene, Mark H; Grip, Mervi; Gronwald, Jacek; Grundy, Anne; Gschwantler-Kaulich, Daphne; Guénel, Pascal; Guo, Qi; Haeberle, Lothar; Hahnen, Eric; Haiman, Christopher A; Håkansson, Niclas; Hallberg, Emily; Hamann, Ute; Hamel, Nathalie; Hankinson, Susan; Hansen, Thomas V O; Harrington, Patricia; Hart, Steven N; Hartikainen, Jaana M; Healey, Catherine S; Hein, Alexander; Helbig, Sonja; Henderson, Alex; Heyworth, Jane; Hicks, Belynda; Hillemanns, Peter; Hodgson, Shirley; Hogervorst, Frans B; Hollestelle, Antoinette; Hooning, Maartje J; Hoover, Bob; Hopper, John L; Hu, Chunling; Huang, Guanmengqian; Hulick, Peter J; Humphreys, Keith; Hunter, David J; Imyanitov, Evgeny N; Isaacs, Claudine; Iwasaki, Motoki; Izatt, Louise; Jakubowska, Anna; James, Paul; Janavicius, Ramunas; Janni, Wolfgang; Jensen, Uffe Birk; John, Esther M; Johnson, Nichola; Jones, Kristine; Jones, Michael; Jukkola-Vuorinen, Arja; Kaaks, Rudolf; Kabisch, 
Maria; Kaczmarek, Katarzyna; Kang, Daehee; Kast, Karin; Keeman, Renske; Kerin, Michael J; Kets, Carolien M; Keupers, Machteld; Khan, Sofia; Khusnutdinova, Elza; Kiiski, Johanna I; Kim, Sung-Won; Knight, Julia A; Konstantopoulou, Irene; Kosma, Veli-Matti; Kristensen, Vessela N; Kruse, Torben A; Kwong, Ava; Lænkholm, Anne-Vibeke; Laitman, Yael; Lalloo, Fiona; Lambrechts, Diether; Landsman, Keren; Lasset, Christine; Lazaro, Conxi; Le Marchand, Loic; Lecarpentier, Julie; Lee, Andrew; Lee, Eunjung; Lee, Jong Won; Lee, Min Hyuk; Lejbkowicz, Flavio; Lesueur, Fabienne; Li, Jingmei; Lilyquist, Jenna; Lincoln, Anne; Lindblom, Annika; Lissowska, Jolanta; Lo, Wing-Yee; Loibl, Sibylle; Long, Jirong; Loud, Jennifer T; Lubinski, Jan; Luccarini, Craig; Lush, Michael; MacInnis, Robert J; Maishman, Tom; Makalic, Enes; Kostovska, Ivana Maleva; Malone, Kathleen E; Manoukian, Siranoush; Manson, JoAnn E; Margolin, Sara; Martens, John W M; Martinez, Maria Elena; Matsuo, Keitaro; Mavroudis, Dimitrios; Mazoyer, Sylvie; McLean, Catriona; Meijers-Heijboer, Hanne; Menéndez, Primitiva; Meyer, Jeffery; Miao, Hui; Miller, Austin; Miller, Nicola; Mitchell, Gillian; Montagna, Marco; Muir, Kenneth; Mulligan, Anna Marie; Mulot, Claire; Nadesan, Sue; Nathanson, Katherine L; Neuhausen, Susan L; Nevanlinna, Heli; Nevelsteen, Ines; Niederacher, Dieter; Nielsen, Sune F; Nordestgaard, Børge G; Norman, Aaron; Nussbaum, Robert L; Olah, Edith; Olopade, Olufunmilayo I; Olson, Janet E; Olswold, Curtis; Ong, Kai-ren; Oosterwijk, Jan C; Orr, Nick; Osorio, Ana; Pankratz, V Shane; Papi, Laura; Park-Simon, Tjoung-Won; Paulsson-Karlsson, Ylva; Lloyd, Rachel; Pedersen, Inge Søkilde; Peissel, Bernard; Peixoto, Ana; Perez, Jose I A; Peterlongo, Paolo; Peto, Julian; Pfeiler, Georg; Phelan, Catherine M; Pinchev, Mila; Plaseska-Karanfilska, Dijana; Poppe, Bruce; Porteous, Mary E; Prentice, Ross; Presneau, Nadege; Prokofieva, Darya; Pugh, Elizabeth; Pujana, Miquel Angel; Pylkäs, Katri; Rack, Brigitte; Radice, Paolo; 
Rahman, Nazneen; Rantala, Johanna; Rappaport-Fuerhauser, Christine; Rennert, Gad; Rennert, Hedy S; Rhenius, Valerie; Rhiem, Kerstin; Richardson, Andrea; Rodriguez, Gustavo C; Romero, Atocha; Romm, Jane; Rookus, Matti A; Rudolph, Anja; Ruediger, Thomas; Saloustros, Emmanouil; Sanders, Joyce; Sandler, Dale P; Sangrajrang, Suleeporn; Sawyer, Elinor J; Schmidt, Daniel F; Schoemaker, Minouk J; Schumacher, Fredrick; Schürmann, Peter; Schwentner, Lukas; Scott, Christopher; Scott, Rodney J; Seal, Sheila; Senter, Leigha; Seynaeve, Caroline; Shah, Mitul; Sharma, Priyanka; Shen, Chen-Yang; Sheng, Xin; Shimelis, Hermela; Shrubsole, Martha J; Shu, Xiao-Ou; Side, Lucy E; Singer, Christian F; Sohn, Christof; Southey, Melissa C; Spinelli, John J; Spurdle, Amanda B; Stegmaier, Christa; Stoppa-Lyonnet, Dominique; Sukiennicki, Grzegorz; Surowy, Harald; Sutter, Christian; Swerdlow, Anthony; Szabo, Csilla I; Tamimi, Rulla M; Tan, Yen Y; Taylor, Jack A; Tejada, Maria-Isabel; Tengström, Maria; Teo, Soo H; Terry, Mary B; Tessier, Daniel C; Teulé, Alex; Thöne, Kathrin; Thull, Darcy L; Tibiletti, Maria Grazia; Tihomirova, Laima; Tischkowitz, Marc; Toland, Amanda E; Tollenaar, Rob A E M; Tomlinson, Ian; Tong, Ling; Torres, Diana; Tranchant, Martine; Truong, Thérèse; Tucker, Kathy; Tung, Nadine; Tyrer, Jonathan; Ulmer, Hans-Ulrich; Vachon, Celine; van Asperen, Christi J; Van Den Berg, David; van den Ouweland, Ans M W; van Rensburg, Elizabeth J; Varesco, Liliana; Varon-Mateeva, Raymonda; Vega, Ana; Viel, Alessandra; Vijai, Joseph; Vincent, Daniel; Vollenweider, Jason; Walker, Lisa; Wang, Zhaoming; Wang-Gohrke, Shan; Wappenschmidt, Barbara; Weinberg, Clarice R; Weitzel, Jeffrey N; Wendt, Camilla; Wesseling, Jelle; Whittemore, Alice S; Wijnen, Juul T; Willett, Walter; Winqvist, Robert; Wolk, Alicja; Wu, Anna H; Xia, Lucy; Yang, Xiaohong R; Yannoukakos, Drakoulis; Zaffaroni, Daniela; Zheng, Wei; Zhu, Bin; Ziogas, Argyrios; Ziv, Elad; Zorn, Kristin K; Gago-Dominguez, Manuela; Mannermaa, Arto; 
Olsson, Håkan; Teixeira, Manuel R; Stone, Jennifer; Offit, Kenneth; Ottini, Laura; Park, Sue K; Thomassen, Mads; Hall, Per; Meindl, Alfons; Schmutzler, Rita K; Droit, Arnaud; Bader, Gary D; Pharoah, Paul D P; Couch, Fergus J; Easton, Douglas F; Kraft, Peter; Chenevix-Trench, Georgia; García-Closas, Montserrat; Schmidt, Marjanka K; Antoniou, Antonis C; Simard, Jacques
2018-01-01
Most common breast cancer susceptibility variants have been identified through genome-wide association studies (GWAS) of predominantly estrogen receptor (ER)-positive disease. We conducted a GWAS using 21,468 ER-negative cases and 100,594 controls combined with 18,908 BRCA1 mutation carriers (9,414 with breast cancer), all of European origin. We identified independent associations at P < 5 × 10⁻⁸ with ten variants at nine new loci. At P < 0.05, we replicated associations with 10 of 11 variants previously reported in ER-negative disease or BRCA1 mutation carrier GWAS and observed consistent associations with ER-negative disease for 105 susceptibility variants identified by other studies. These 125 variants explain approximately 14% of the familial risk of this breast cancer subtype. There was high genetic correlation (0.72) between risk of ER-negative breast cancer and breast cancer risk for BRCA1 mutation carriers. These findings may lead to improved risk prediction and inform further fine-mapping and functional work to better understand the biological basis of ER-negative breast cancer. PMID:29058716
Identification of ten variants associated with risk of estrogen-receptor-negative breast cancer.
Milne, Roger L; Kuchenbaecker, Karoline B; Michailidou, Kyriaki; Beesley, Jonathan; Kar, Siddhartha; Lindström, Sara; Hui, Shirley; Lemaçon, Audrey; Soucy, Penny; Dennis, Joe; Jiang, Xia; Rostamianfar, Asha; Finucane, Hilary; Bolla, Manjeet K; McGuffog, Lesley; Wang, Qin; Aalfs, Cora M; Adams, Marcia; Adlard, Julian; Agata, Simona; Ahmed, Shahana; Ahsan, Habibul; Aittomäki, Kristiina; Al-Ejeh, Fares; Allen, Jamie; Ambrosone, Christine B; Amos, Christopher I; Andrulis, Irene L; Anton-Culver, Hoda; Antonenkova, Natalia N; Arndt, Volker; Arnold, Norbert; Aronson, Kristan J; Auber, Bernd; Auer, Paul L; Ausems, Margreet G E M; Azzollini, Jacopo; Bacot, François; Balmaña, Judith; Barile, Monica; Barjhoux, Laure; Barkardottir, Rosa B; Barrdahl, Myrto; Barnes, Daniel; Barrowdale, Daniel; Baynes, Caroline; Beckmann, Matthias W; Benitez, Javier; Bermisheva, Marina; Bernstein, Leslie; Bignon, Yves-Jean; Blazer, Kathleen R; Blok, Marinus J; Blomqvist, Carl; Blot, William; Bobolis, Kristie; Boeckx, Bram; Bogdanova, Natalia V; Bojesen, Anders; Bojesen, Stig E; Bonanni, Bernardo; Børresen-Dale, Anne-Lise; Bozsik, Aniko; Bradbury, Angela R; Brand, Judith S; Brauch, Hiltrud; Brenner, Hermann; Bressac-de Paillerets, Brigitte; Brewer, Carole; Brinton, Louise; Broberg, Per; Brooks-Wilson, Angela; Brunet, Joan; Brüning, Thomas; Burwinkel, Barbara; Buys, Saundra S; Byun, Jinyoung; Cai, Qiuyin; Caldés, Trinidad; Caligo, Maria A; Campbell, Ian; Canzian, Federico; Caron, Olivier; Carracedo, Angel; Carter, Brian D; Castelao, J Esteban; Castera, Laurent; Caux-Moncoutier, Virginie; Chan, Salina B; Chang-Claude, Jenny; Chanock, Stephen J; Chen, Xiaoqing; Cheng, Ting-Yuan David; Chiquette, Jocelyne; Christiansen, Hans; Claes, Kathleen B M; Clarke, Christine L; Conner, Thomas; Conroy, Don M; Cook, Jackie; Cordina-Duverger, Emilie; Cornelissen, Sten; Coupier, Isabelle; Cox, Angela; Cox, David G; Cross, Simon S; Cuk, Katarina; Cunningham, Julie M; Czene, Kamila; Daly, Mary B; Damiola, Francesca; 
Darabi, Hatef; Davidson, Rosemarie; De Leeneer, Kim; Devilee, Peter; Dicks, Ed; Diez, Orland; Ding, Yuan Chun; Ditsch, Nina; Doheny, Kimberly F; Domchek, Susan M; Dorfling, Cecilia M; Dörk, Thilo; Dos-Santos-Silva, Isabel; Dubois, Stéphane; Dugué, Pierre-Antoine; Dumont, Martine; Dunning, Alison M; Durcan, Lorraine; Dwek, Miriam; Dworniczak, Bernd; Eccles, Diana; Eeles, Ros; Ehrencrona, Hans; Eilber, Ursula; Ejlertsen, Bent; Ekici, Arif B; Eliassen, A Heather; Engel, Christoph; Eriksson, Mikael; Fachal, Laura; Faivre, Laurence; Fasching, Peter A; Faust, Ulrike; Figueroa, Jonine; Flesch-Janys, Dieter; Fletcher, Olivia; Flyger, Henrik; Foulkes, William D; Friedman, Eitan; Fritschi, Lin; Frost, Debra; Gabrielson, Marike; Gaddam, Pragna; Gammon, Marilie D; Ganz, Patricia A; Gapstur, Susan M; Garber, Judy; Garcia-Barberan, Vanesa; García-Sáenz, José A; Gaudet, Mia M; Gauthier-Villars, Marion; Gehrig, Andrea; Georgoulias, Vassilios; Gerdes, Anne-Marie; Giles, Graham G; Glendon, Gord; Godwin, Andrew K; Goldberg, Mark S; Goldgar, David E; González-Neira, Anna; Goodfellow, Paul; Greene, Mark H; Alnæs, Grethe I Grenaker; Grip, Mervi; Gronwald, Jacek; Grundy, Anne; Gschwantler-Kaulich, Daphne; Guénel, Pascal; Guo, Qi; Haeberle, Lothar; Hahnen, Eric; Haiman, Christopher A; Håkansson, Niclas; Hallberg, Emily; Hamann, Ute; Hamel, Nathalie; Hankinson, Susan; Hansen, Thomas V O; Harrington, Patricia; Hart, Steven N; Hartikainen, Jaana M; Healey, Catherine S; Hein, Alexander; Helbig, Sonja; Henderson, Alex; Heyworth, Jane; Hicks, Belynda; Hillemanns, Peter; Hodgson, Shirley; Hogervorst, Frans B; Hollestelle, Antoinette; Hooning, Maartje J; Hoover, Bob; Hopper, John L; Hu, Chunling; Huang, Guanmengqian; Hulick, Peter J; Humphreys, Keith; Hunter, David J; Imyanitov, Evgeny N; Isaacs, Claudine; Iwasaki, Motoki; Izatt, Louise; Jakubowska, Anna; James, Paul; Janavicius, Ramunas; Janni, Wolfgang; Jensen, Uffe Birk; John, Esther M; Johnson, Nichola; Jones, Kristine; Jones, Michael; 
Jukkola-Vuorinen, Arja; Kaaks, Rudolf; Kabisch, Maria; Kaczmarek, Katarzyna; Kang, Daehee; Kast, Karin; Keeman, Renske; Kerin, Michael J; Kets, Carolien M; Keupers, Machteld; Khan, Sofia; Khusnutdinova, Elza; Kiiski, Johanna I; Kim, Sung-Won; Knight, Julia A; Konstantopoulou, Irene; Kosma, Veli-Matti; Kristensen, Vessela N; Kruse, Torben A; Kwong, Ava; Lænkholm, Anne-Vibeke; Laitman, Yael; Lalloo, Fiona; Lambrechts, Diether; Landsman, Keren; Lasset, Christine; Lazaro, Conxi; Le Marchand, Loic; Lecarpentier, Julie; Lee, Andrew; Lee, Eunjung; Lee, Jong Won; Lee, Min Hyuk; Lejbkowicz, Flavio; Lesueur, Fabienne; Li, Jingmei; Lilyquist, Jenna; Lincoln, Anne; Lindblom, Annika; Lissowska, Jolanta; Lo, Wing-Yee; Loibl, Sibylle; Long, Jirong; Loud, Jennifer T; Lubinski, Jan; Luccarini, Craig; Lush, Michael; MacInnis, Robert J; Maishman, Tom; Makalic, Enes; Kostovska, Ivana Maleva; Malone, Kathleen E; Manoukian, Siranoush; Manson, JoAnn E; Margolin, Sara; Martens, John W M; Martinez, Maria Elena; Matsuo, Keitaro; Mavroudis, Dimitrios; Mazoyer, Sylvie; McLean, Catriona; Meijers-Heijboer, Hanne; Menéndez, Primitiva; Meyer, Jeffery; Miao, Hui; Miller, Austin; Miller, Nicola; Mitchell, Gillian; Montagna, Marco; Muir, Kenneth; Mulligan, Anna Marie; Mulot, Claire; Nadesan, Sue; Nathanson, Katherine L; Neuhausen, Susan L; Nevanlinna, Heli; Nevelsteen, Ines; Niederacher, Dieter; Nielsen, Sune F; Nordestgaard, Børge G; Norman, Aaron; Nussbaum, Robert L; Olah, Edith; Olopade, Olufunmilayo I; Olson, Janet E; Olswold, Curtis; Ong, Kai-Ren; Oosterwijk, Jan C; Orr, Nick; Osorio, Ana; Pankratz, V Shane; Papi, Laura; Park-Simon, Tjoung-Won; Paulsson-Karlsson, Ylva; Lloyd, Rachel; Pedersen, Inge Søkilde; Peissel, Bernard; Peixoto, Ana; Perez, Jose I A; Peterlongo, Paolo; Peto, Julian; Pfeiler, Georg; Phelan, Catherine M; Pinchev, Mila; Plaseska-Karanfilska, Dijana; Poppe, Bruce; Porteous, Mary E; Prentice, Ross; Presneau, Nadege; Prokofieva, Darya; Pugh, Elizabeth; Pujana, Miquel Angel; 
Pylkäs, Katri; Rack, Brigitte; Radice, Paolo; Rahman, Nazneen; Rantala, Johanna; Rappaport-Fuerhauser, Christine; Rennert, Gad; Rennert, Hedy S; Rhenius, Valerie; Rhiem, Kerstin; Richardson, Andrea; Rodriguez, Gustavo C; Romero, Atocha; Romm, Jane; Rookus, Matti A; Rudolph, Anja; Ruediger, Thomas; Saloustros, Emmanouil; Sanders, Joyce; Sandler, Dale P; Sangrajrang, Suleeporn; Sawyer, Elinor J; Schmidt, Daniel F; Schoemaker, Minouk J; Schumacher, Fredrick; Schürmann, Peter; Schwentner, Lukas; Scott, Christopher; Scott, Rodney J; Seal, Sheila; Senter, Leigha; Seynaeve, Caroline; Shah, Mitul; Sharma, Priyanka; Shen, Chen-Yang; Sheng, Xin; Shimelis, Hermela; Shrubsole, Martha J; Shu, Xiao-Ou; Side, Lucy E; Singer, Christian F; Sohn, Christof; Southey, Melissa C; Spinelli, John J; Spurdle, Amanda B; Stegmaier, Christa; Stoppa-Lyonnet, Dominique; Sukiennicki, Grzegorz; Surowy, Harald; Sutter, Christian; Swerdlow, Anthony; Szabo, Csilla I; Tamimi, Rulla M; Tan, Yen Y; Taylor, Jack A; Tejada, Maria-Isabel; Tengström, Maria; Teo, Soo H; Terry, Mary B; Tessier, Daniel C; Teulé, Alex; Thöne, Kathrin; Thull, Darcy L; Tibiletti, Maria Grazia; Tihomirova, Laima; Tischkowitz, Marc; Toland, Amanda E; Tollenaar, Rob A E M; Tomlinson, Ian; Tong, Ling; Torres, Diana; Tranchant, Martine; Truong, Thérèse; Tucker, Kathy; Tung, Nadine; Tyrer, Jonathan; Ulmer, Hans-Ulrich; Vachon, Celine; van Asperen, Christi J; Van Den Berg, David; van den Ouweland, Ans M W; van Rensburg, Elizabeth J; Varesco, Liliana; Varon-Mateeva, Raymonda; Vega, Ana; Viel, Alessandra; Vijai, Joseph; Vincent, Daniel; Vollenweider, Jason; Walker, Lisa; Wang, Zhaoming; Wang-Gohrke, Shan; Wappenschmidt, Barbara; Weinberg, Clarice R; Weitzel, Jeffrey N; Wendt, Camilla; Wesseling, Jelle; Whittemore, Alice S; Wijnen, Juul T; Willett, Walter; Winqvist, Robert; Wolk, Alicja; Wu, Anna H; Xia, Lucy; Yang, Xiaohong R; Yannoukakos, Drakoulis; Zaffaroni, Daniela; Zheng, Wei; Zhu, Bin; Ziogas, Argyrios; Ziv, Elad; Zorn, Kristin K; 
Gago-Dominguez, Manuela; Mannermaa, Arto; Olsson, Håkan; Teixeira, Manuel R; Stone, Jennifer; Offit, Kenneth; Ottini, Laura; Park, Sue K; Thomassen, Mads; Hall, Per; Meindl, Alfons; Schmutzler, Rita K; Droit, Arnaud; Bader, Gary D; Pharoah, Paul D P; Couch, Fergus J; Easton, Douglas F; Kraft, Peter; Chenevix-Trench, Georgia; García-Closas, Montserrat; Schmidt, Marjanka K; Antoniou, Antonis C; Simard, Jacques
2017-12-01
Most common breast cancer susceptibility variants have been identified through genome-wide association studies (GWAS) of predominantly estrogen receptor (ER)-positive disease. We conducted a GWAS using 21,468 ER-negative cases and 100,594 controls combined with 18,908 BRCA1 mutation carriers (9,414 with breast cancer), all of European origin. We identified independent associations at P < 5 × 10⁻⁸ with ten variants at nine new loci. At P < 0.05, we replicated associations with 10 of 11 variants previously reported in ER-negative disease or BRCA1 mutation carrier GWAS and observed consistent associations with ER-negative disease for 105 susceptibility variants identified by other studies. These 125 variants explain approximately 16% of the familial risk of this breast cancer subtype. There was high genetic correlation (0.72) between risk of ER-negative breast cancer and breast cancer risk for BRCA1 mutation carriers. These findings may lead to improved risk prediction and inform further fine-mapping and functional work to better understand the biological basis of ER-negative breast cancer.
Convergent evidence from systematic analysis of GWAS revealed genetic basis of esophageal cancer.
Gao, Xue-Xin; Gao, Lei; Wang, Jiu-Qiang; Qu, Su-Su; Qu, Yue; Sun, Hong-Lei; Liu, Si-Dang; Shang, Ying-Li
2016-07-12
Recent genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) associated with risk of esophageal cancer (EC). However, investigation of genetic basis from the perspective of systematic biology and integrative genomics remains scarce. In this study, we explored genetic basis of EC based on GWAS data and implemented a series of bioinformatics methods including functional annotation, expression quantitative trait loci (eQTL) analysis, pathway enrichment analysis and pathway grouped network analysis. Two hundred and thirteen risk SNPs were identified, in which 44 SNPs were found to have significantly differential gene expression in esophageal tissues by eQTL analysis. By pathway enrichment analysis, 170 risk genes mapped by risk SNPs were enriched into 38 significant GO terms and 17 significant KEGG pathways, which were significantly grouped into 9 sub-networks by pathway grouped network analysis. The 9 groups of interconnected pathways were mainly involved with muscle cell proliferation, cellular response to interleukin-6, cell adhesion molecules, and ethanol oxidation, which might participate in the development of EC. Our findings provide genetic evidence and new insight for exploring the molecular mechanisms of EC.
Raffler, Johannes; Friedrich, Nele; Arnold, Matthias; Kacprowski, Tim; Rueedi, Rico; Altmaier, Elisabeth; Bergmann, Sven; Budde, Kathrin; Gieger, Christian; Homuth, Georg; Pietzner, Maik; Römisch-Margl, Werner; Strauch, Konstantin; Völzke, Henry; Waldenberger, Melanie; Wallaschofski, Henri; Nauck, Matthias; Völker, Uwe; Kastenmüller, Gabi; Suhre, Karsten
2015-01-01
Genome-wide association studies with metabolic traits (mGWAS) uncovered many genetic variants that influence human metabolism. These genetically influenced metabotypes (GIMs) contribute to our metabolic individuality, our capacity to respond to environmental challenges, and our susceptibility to specific diseases. While metabolic homeostasis in blood is a well investigated topic in large mGWAS with over 150 known loci, metabolic detoxification through urinary excretion has only been addressed by few small mGWAS with only 11 associated loci so far. Here we report the largest mGWAS to date, combining targeted and non-targeted 1H NMR analysis of urine samples from 3,861 participants of the SHIP-0 cohort and 1,691 subjects of the KORA F4 cohort. We identified and replicated 22 loci with significant associations with urinary traits, 15 of which are new (HIBCH, CPS1, AGXT, XYLB, TKT, ETNPPL, SLC6A19, DMGDH, SLC36A2, GLDC, SLC6A13, ACSM3, SLC5A11, PNMT, SLC13A3). Two-thirds of the urinary loci also have a metabolite association in blood. For all but one of the 6 loci where significant associations target the same metabolite in blood and urine, the genetic effects have the same direction in both fluids. In contrast, for the SLC5A11 locus, we found increased levels of myo-inositol in urine whereas mGWAS in blood reported decreased levels for the same genetic variant. This might indicate less effective re-absorption of myo-inositol in the kidneys of carriers. In summary, our study more than doubles the number of known loci that influence urinary phenotypes. It thus allows novel insights into the relationship between blood homeostasis and its regulation through excretion. The newly discovered loci also include variants previously linked to chronic kidney disease (CPS1, SLC6A13), pulmonary hypertension (CPS1), and ischemic stroke (XYLB). 
By establishing connections from gene to disease via metabolic traits our results provide novel hypotheses about molecular mechanisms involved in the etiology of diseases. PMID:26352407
A genome-wide association study of seed protein and oil content in soybean
2014-01-01
Background: Association analysis is an alternative to conventional family-based methods to detect the location of gene(s) or quantitative trait loci (QTL) and provides relatively high resolution in terms of defining the genome position of a gene or QTL. Seed protein and oil concentration are quantitative traits which are determined by the interaction among many genes with small to moderate genetic effects and their interaction with the environment. In this study, a genome-wide association study (GWAS) was performed to identify quantitative trait loci (QTL) controlling seed protein and oil concentration in 298 soybean germplasm accessions exhibiting a wide range of seed protein and oil content. Results: A total of 55,159 single nucleotide polymorphisms (SNPs) were genotyped using various methods including Illumina Infinium and GoldenGate assays and 31,954 markers with minor allele frequency >0.10 were used to estimate linkage disequilibrium (LD) in heterochromatic and euchromatic regions. In euchromatic regions, the mean LD (r²) rapidly declined to 0.2 within 360 Kbp, whereas the mean LD declined to 0.2 at 9,600 Kbp in heterochromatic regions. The GWAS results identified 40 SNPs in 17 different genomic regions significantly associated with seed protein. Of these, the five SNPs with the highest associations and seven adjacent SNPs were located in the 27.6-30.0 Mbp region of Gm20. A major seed protein QTL has been previously mapped to the same location and potential candidate genes have recently been identified in this region. The GWAS results also detected 25 SNPs in 13 different genomic regions associated with seed oil. Of these markers, seven SNPs had a significant association with both protein and oil. Conclusions: This research indicated that GWAS not only identified most of the previously reported QTL controlling seed protein and oil, but also resulted in narrower genomic regions than the regions reported as containing these QTL. 
The narrower GWAS-defined genome regions will allow more precise marker-assisted allele selection and will expedite positional cloning of the causal gene(s). PMID:24382143
A genome-wide association study of seed protein and oil content in soybean.
Hwang, Eun-Young; Song, Qijian; Jia, Gaofeng; Specht, James E; Hyten, David L; Costa, Jose; Cregan, Perry B
2014-01-02
Association analysis is an alternative to conventional family-based methods to detect the location of gene(s) or quantitative trait loci (QTL) and provides relatively high resolution in terms of defining the genome position of a gene or QTL. Seed protein and oil concentration are quantitative traits which are determined by the interaction among many genes with small to moderate genetic effects and their interaction with the environment. In this study, a genome-wide association study (GWAS) was performed to identify quantitative trait loci (QTL) controlling seed protein and oil concentration in 298 soybean germplasm accessions exhibiting a wide range of seed protein and oil content. A total of 55,159 single nucleotide polymorphisms (SNPs) were genotyped using various methods including Illumina Infinium and GoldenGate assays and 31,954 markers with minor allele frequency >0.10 were used to estimate linkage disequilibrium (LD) in heterochromatic and euchromatic regions. In euchromatic regions, the mean LD (r²) rapidly declined to 0.2 within 360 Kbp, whereas the mean LD declined to 0.2 at 9,600 Kbp in heterochromatic regions. The GWAS results identified 40 SNPs in 17 different genomic regions significantly associated with seed protein. Of these, the five SNPs with the highest associations and seven adjacent SNPs were located in the 27.6-30.0 Mbp region of Gm20. A major seed protein QTL has been previously mapped to the same location and potential candidate genes have recently been identified in this region. The GWAS results also detected 25 SNPs in 13 different genomic regions associated with seed oil. Of these markers, seven SNPs had a significant association with both protein and oil. This research indicated that GWAS not only identified most of the previously reported QTL controlling seed protein and oil, but also resulted in narrower genomic regions than the regions reported as containing these QTL. 
The narrower GWAS-defined genome regions will allow more precise marker-assisted allele selection and will expedite positional cloning of the causal gene(s).
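The abstract above reports LD as r² decaying with physical distance. As a rough illustration of the quantity being averaged, the following is a minimal sketch of pairwise r² between two biallelic SNPs, assuming phased haplotypes coded 0/1. The function name `ld_r2` and the toy haplotypes are hypothetical and are not taken from the study.

```python
def ld_r2(haplotypes):
    """Pairwise LD (r^2) between two biallelic SNPs.

    haplotypes: list of (a, b) tuples, one per phased haplotype,
    with alleles at SNP A and SNP B coded 0 or 1 (assumed input format).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Toy data (hypothetical): complete LD versus no LD.
perfect = [(1, 1)] * 4 + [(0, 0)] * 6
independent = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r2(perfect), ld_r2(independent))
```

An LD decay curve of the kind summarized in the abstract would then bin SNP pairs by distance and average r² within each bin.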
Pe’er, Itsik
2017-01-01
Genome-wide association studies (GWAS) have identified hundreds of SNPs responsible for variation in human quantitative traits. However, genome-wide-significant associations often fail to replicate across independent cohorts, in apparent inconsistency with their strong effects in discovery cohorts. This limited success of replication raises pervasive questions about the utility of the GWAS field. We identify all 332 studies of quantitative traits from the NHGRI-EBI GWAS Database with attempted replication. We find that the majority of studies provide insufficient data to evaluate replication rates. The remaining papers replicate significantly worse than expected (p < 10⁻¹⁴), even when adjusting for regression-to-the-mean of effect size between discovery- and replication-cohorts termed the Winner’s Curse (p < 10⁻¹⁶). We show this is due in part to misreporting replication cohort-size as a maximum number, rather than a per-locus one. In 39 studies accurately reporting per-locus cohort-size for attempted replication of 707 loci in samples with similar ancestry, replication rate matched expectation (predicted 458, observed 457, p = 0.94). In contrast, ancestry differences between replication and discovery (13 studies, 385 loci) cause the most highly-powered decile of loci to replicate worse than expected, due to difference in linkage disequilibrium. PMID:28715421
Meta-analysis and genome-wide interpretation of genetic susceptibility to drug addiction
2011-01-01
Background: Classical genetic studies provide strong evidence for heritable contributions to susceptibility to developing dependence on addictive substances. Candidate gene and genome-wide association studies (GWAS) have sought genes, chromosomal regions and allelic variants likely to contribute to susceptibility to drug addiction. Results: Here, we performed a meta-analysis of addiction candidate gene association studies and GWAS to investigate possible functional mechanisms associated with addiction susceptibility. From meta-data retrieved from 212 publications on candidate gene association studies and 5 GWAS reports, we linked a total of 843 haplotypes to addiction susceptibility. We mapped the SNPs in these haplotypes to functional and regulatory elements in the genome and estimated the magnitude of the contributions of different molecular mechanisms to their effects on addiction susceptibility. In addition to SNPs in coding regions, these data suggest that haplotypes in gene regulatory regions may also contribute to addiction susceptibility. When we compared the lists of genes identified by association studies and those identified by molecular biological studies of drug-regulated genes, we observed significantly higher participation in the same gene interaction networks than expected by chance, despite little overlap between the two gene lists. Conclusions: These results appear to offer new insights into the genetic factors underlying drug addiction. PMID:21999673
Shen, Changbing; Gao, Jing; Sheng, Yujun; Dou, Jinfa; Zhou, Fusheng; Zheng, Xiaodong; Ko, Randy; Tang, Xianfa; Zhu, Caihong; Yin, Xianyong; Sun, Liangdan; Cui, Yong; Zhang, Xuejun
2016-01-01
Vitiligo is an autoimmune disease with a strong genetic component, characterized by areas of depigmented skin resulting from loss of epidermal melanocytes. Genetic factors are known to play key roles in vitiligo through discoveries in association studies and family studies. Previously, vitiligo susceptibility genes were mainly revealed through linkage analysis and candidate gene studies. Recently, our understanding of the genetic basis of vitiligo has been rapidly advancing through genome-wide association study (GWAS). More than 40 robust susceptible loci have been identified and confirmed to be associated with vitiligo by using GWAS. Most of these associated genes participate in important pathways involved in the pathogenesis of vitiligo. Many susceptible loci with unknown functions in the pathogenesis of vitiligo have also been identified, indicating that additional molecular mechanisms may contribute to the risk of developing vitiligo. In this review, we summarize the key loci that are of genome-wide significance, which have been shown to influence vitiligo risk. These genetic loci may help build the foundation for genetic diagnosis and personalize treatment for patients with vitiligo in the future. However, substantial additional studies, including gene-targeted and functional studies, are required to confirm the causality of the genetic variants and their biological relevance in the development of vitiligo. PMID:26870082
Sul, Jae Hoon; Bilow, Michael; Yang, Wen-Yun; Kostem, Emrah; Furlotte, Nick; He, Dan; Eskin, Eleazar
2016-03-01
Although genome-wide association studies (GWASs) have discovered numerous novel genetic variants associated with many complex traits and diseases, those genetic variants typically explain only a small fraction of phenotypic variance. Factors that account for phenotypic variance include environmental factors and gene-by-environment interactions (GEIs). Recently, several studies have conducted genome-wide gene-by-environment association analyses and demonstrated important roles of GEIs in complex traits. One of the main challenges in these association studies is to control effects of population structure that may cause spurious associations. Many studies have analyzed how population structure influences statistics of genetic variants and developed several statistical approaches to correct for population structure. However, the impact of population structure on GEI statistics in GWASs has not been extensively studied, nor have methods been designed to correct for population structure on GEI statistics. In this paper, we show both analytically and empirically that population structure may cause spurious GEIs and use both simulation and two GWAS datasets to support our finding. We propose a statistical approach based on mixed models to account for population structure on GEI statistics. We find that our approach effectively controls population structure on statistics for GEIs as well as for genetic variants.
Lin, Ying-Ju; Liao, Wen-Ling; Wang, Chung-Hsing; Tsai, Li-Ping; Tang, Chih-Hsin; Chen, Chien-Hsiun; Wu, Jer-Yuarn; Liang, Wen-Miin; Hsieh, Ai-Ru; Cheng, Chi-Fung; Chen, Jin-Hua; Chien, Wen-Kuei; Lin, Ting-Hsu; Wu, Chia-Ming; Liao, Chiu-Chu; Huang, Shao-Mei; Tsai, Fuu-Jen
2017-07-25
Human height can be described as a classical and inherited trait model. Genome-wide association studies (GWAS) have revealed susceptible loci and provided insights into the polygenic nature of human height. Familial short stature (FSS) represents a suitable trait for investigating short stature genetics because disease associations with short stature have been ruled out in this case. In addition, FSS is caused only by genetically inherited factors. In this study, we explored the correlations of FSS risk with the genetic loci associated with human height in previous GWAS, alone and cumulatively. We systematically evaluated 34 known human height single nucleotide polymorphisms (SNPs) in relation to FSS in the additive model (p < 0.00005). A cumulative effect was observed: the odds ratios gradually increased with increasing genetic risk score quartiles (p < 0.001; Cochran-Armitage trend test). Six affected genes (ZBTB38, ZNF638, LCORL, CABLES1, CDK10, and TSEN15) are located in the nucleus and have been implicated in embryonic, organismal, and tissue development. In conclusion, our study suggests that 13 human height GWAS-identified SNPs are associated with FSS risk both alone and cumulatively.
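The abstract above tests for a trend in case proportion across genetic risk score quartiles with the Cochran-Armitage test. The following is a minimal sketch of the textbook form of that test with equally spaced scores. The function name `cochran_armitage` and all counts are hypothetical illustrations, not data from the study.

```python
import math


def cochran_armitage(cases, totals, scores=None):
    """Cochran-Armitage test for trend in proportions across ordered groups.

    cases[i]  : number of cases in group i
    totals[i] : number of subjects (cases + controls) in group i
    scores[i] : ordinal score for group i (defaults to 0, 1, 2, ...)
    Returns (z, two-sided p) using the normal approximation.
    """
    if scores is None:
        scores = list(range(len(cases)))
    n = sum(totals)
    p_bar = sum(cases) / n                      # overall case proportion
    # Score-weighted deviation of observed cases from expectation under no trend.
    t_stat = sum(s * (c - m * p_bar) for s, c, m in zip(scores, cases, totals))
    s1 = sum(m * s for s, m in zip(scores, totals))
    s2 = sum(m * s * s for s, m in zip(scores, totals))
    var = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n)
    z = t_stat / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))        # two-sided normal tail probability
    return z, p


# Hypothetical quartile counts: case fraction rising from 10% to 40%.
z_trend, p_trend = cochran_armitage([10, 20, 30, 40], [100, 100, 100, 100])
# Hypothetical flat counts: no trend at all.
z_flat, p_flat = cochran_armitage([25, 25, 25, 25], [100, 100, 100, 100])
print(z_trend, p_trend, z_flat, p_flat)
```

Quartile indices are used as scores here, which is one common choice; other score assignments would change the statistic.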
Mahmoudpour, Seyed Hamidreza; Veluchamy, Abirami; Siddiqui, Moneeza Kalhan; Asselbergs, Folkert W.; Souverein, Patrick C.; de Keyser, Catherine E.; Hofman, Albert; Lang, Chim C.; Doney, Alexander SF.; Stricker, Bruno H.; de Boer, Anthonius; Maitland-van der Zee, Anke-Hilse; Palmer, Colin NA.
2016-01-01
Objectives: To identify SNPs associated with switching from an ACE-inhibitor to an angiotensin receptor blocker (ARB). Methods: Two cohorts of patients starting ACE-inhibitors were identified within the Rotterdam Study in the Netherlands and the GoDARTS study in Scotland. Cases were intolerant subjects who switched from an ACE-inhibitor to an ARB; controls were subjects who used ACE-inhibitors continuously for at least 2 years and did not switch. A GWAS using an additive model was run in these sets and results were meta-analysed using GWAMA. Results: 972 cases out of 5,161 ACE-inhibitor starters were identified. 8 SNPs within 4 genes reached the GWAS significance level (P<5×10⁻⁸) in the meta-analysis (RBFOX3, GABRG2, SH2B1 and MBOAT1). The strongest associated SNP was located in an intron of RBFOX3, which encodes an RNA binding protein (rs2061538: MAF=0.16, OR=1.52 [95% CI: 1.32-1.76], p=6.2×10⁻⁹). Conclusions: These results indicate that genetic variation in the abovementioned genes may increase the risk of ACE-inhibitor-induced adverse reactions. PMID:28030426
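The abstract above combines per-cohort GWAS results by meta-analysis. As an illustration of the general idea only, here is a minimal sketch of an inverse-variance fixed-effect combination of per-cohort log odds ratios. It is a generic textbook calculation, not a reproduction of GWAMA's internals, and the function name `fixed_effect_meta` and the input numbers are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect meta-analysis of one SNP.

    betas : per-cohort effect estimates (log odds ratios)
    ses   : per-cohort standard errors of those estimates
    Returns (pooled beta, pooled SE, z, two-sided p).
    """
    weights = [1.0 / (se * se) for se in ses]   # precision weights
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))        # two-sided normal tail probability
    return beta, se, z, p


# Hypothetical per-cohort estimates for a single SNP in two cohorts.
beta, se, z, p = fixed_effect_meta([0.40, 0.44], [0.10, 0.10])
odds_ratio = math.exp(beta)                     # back-transform to an odds ratio
print(beta, se, z, p, odds_ratio)
```

With equal standard errors the pooled estimate is the simple mean of the two cohort estimates, and the pooled standard error shrinks by a factor of √2.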
Imamura, Minako; Takahashi, Atsushi; Yamauchi, Toshimasa; Hara, Kazuo; Yasuda, Kazuki; Grarup, Niels; Zhao, Wei; Wang, Xu; Huerta-Chagoya, Alicia; Hu, Cheng; Moon, Sanghoon; Long, Jirong; Kwak, Soo Heon; Rasheed, Asif; Saxena, Richa; Ma, Ronald C. W.; Okada, Yukinori; Iwata, Minoru; Hosoe, Jun; Shojima, Nobuhiro; Iwasaki, Minaka; Fujita, Hayato; Suzuki, Ken; Danesh, John; Jørgensen, Torben; Jørgensen, Marit E.; Witte, Daniel R.; Brandslund, Ivan; Christensen, Cramer; Hansen, Torben; Mercader, Josep M.; Flannick, Jason; Moreno-Macías, Hortensia; Burtt, Noël P.; Zhang, Rong; Kim, Young Jin; Zheng, Wei; Singh, Jai Rup; Tam, Claudia H. T.; Hirose, Hiroshi; Maegawa, Hiroshi; Ito, Chikako; Kaku, Kohei; Watada, Hirotaka; Tanaka, Yasushi; Tobe, Kazuyuki; Kawamori, Ryuzo; Kubo, Michiaki; Cho, Yoon Shin; Chan, Juliana C. N.; Sanghera, Dharambir; Frossard, Philippe; Park, Kyong Soo; Shu, Xiao-Ou; Kim, Bong-Jo; Florez, Jose C.; Tusié-Luna, Teresa; Jia, Weiping; Tai, E Shyong; Pedersen, Oluf; Saleheen, Danish; Maeda, Shiro; Kadowaki, Takashi
2016-01-01
Genome-wide association studies (GWAS) have identified more than 80 susceptibility loci for type 2 diabetes (T2D), but most of its heritability still remains to be elucidated. In this study, we conducted a meta-analysis of GWAS for T2D in the Japanese population. Combined data from discovery and subsequent validation analyses (23,399 T2D cases and 31,722 controls) identify 7 new loci with genome-wide significance (P<5 × 10⁻⁸), rs1116357 near CCDC85A, rs147538848 in FAM60A, rs1575972 near DMRTA1, rs9309245 near ASB3, rs67156297 near ATP8B2, rs7107784 near MIR4686 and rs67839313 near INAFM2. Of these, the association of 4 loci with T2D is replicated in multi-ethnic populations other than Japanese (up to 65,936 T2Ds and 158,030 controls, P<0.007). These results indicate that expansion of single ethnic GWAS is still useful to identify novel susceptibility loci to complex traits not only for ethnicity-specific loci but also for common loci across different ethnicities. PMID:26818947
An efficient empirical Bayes method for genomewide association studies.
Wang, Q; Wei, J; Pan, Y; Xu, S
2016-08-01
The linear mixed model (LMM) is one of the most popular methods for genomewide association studies (GWAS). Numerous forms of LMM have been developed; however, there are two major issues in GWAS that have not been fully addressed before. The two issues are (i) the genomic background noise and (ii) low statistical power after Bonferroni correction. We proposed an empirical Bayes (EB) method by assigning each marker effect a normal prior distribution, resulting in shrinkage estimates of marker effects. We found that such a shrinkage approach can selectively shrink marker effects and reduce the noise level to zero for the majority of non-associated markers. In the meantime, the EB method allows us to use an 'effective number of tests' to perform Bonferroni correction for multiple tests. Simulation studies for both human and pig data showed that the EB method can significantly increase statistical power compared with the widely used exact GWAS methods, such as GEMMA and FaST-LMM-Select. Real data analyses in human breast cancer identified improved detection signals for markers previously known to be associated with breast cancer. We therefore believe that the EB method is a valuable tool for identifying the genetic basis of complex traits. © 2015 Blackwell Verlag GmbH.
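The shrinkage idea in the abstract above can be illustrated with a minimal sketch. It assumes the textbook normal-normal model, not the authors' published algorithm: an estimated marker effect `b_hat` with standard error `se` is combined with a zero-mean normal prior of variance `tau2`, and the posterior mean pulls the estimate toward zero. The function name, the prior variance, and the example markers are all hypothetical.

```python
def shrink_effect(b_hat: float, se: float, tau2: float) -> float:
    """Posterior mean of a marker effect under a N(0, tau2) prior.

    Assumes b_hat ~ N(b, se**2). This is generic normal-normal shrinkage,
    not the published EB implementation.
    """
    if se <= 0 or tau2 < 0:
        raise ValueError("se must be positive and tau2 non-negative")
    weight = tau2 / (tau2 + se ** 2)  # 0 = full shrinkage, 1 = no shrinkage
    return weight * b_hat


# Hypothetical markers: (estimated effect, standard error)
markers = [(0.50, 0.10), (0.05, 0.10)]
shrunk = [shrink_effect(b, s, tau2=0.01) for b, s in markers]
```

The smaller the prior variance relative to the sampling variance, the closer the weight is to zero, which is how small noisy effects are suppressed while large effects remain visible.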
Nicoletti, Paola; Bansal, Mukesh; Lefebvre, Celine; Guarnieri, Paolo; Shen, Yufeng; Pe'er, Itsik; Califano, Andrea; Floratos, Aris
2015-01-01
Stevens-Johnson syndrome (SJS) and Toxic Epidermal Necrolysis (TEN) represent rare but serious adverse drug reactions (ADRs). Both are characterized by distinctive blistering lesions and significant mortality rates. While there is evidence for strong drug-specific genetic predisposition related to HLA alleles, recent genome wide association studies (GWAS) on European and Asian populations have failed to identify genetic susceptibility alleles that are common across multiple drugs. We hypothesize that this is a consequence of the low to moderate effect size of individual genetic risk factors. To test this hypothesis we developed Pointer, a new algorithm that assesses the aggregate effect of multiple low risk variants on a pathway using a gene set enrichment approach. A key advantage of our method is the capability to associate SNPs with genes by exploiting physical proximity as well as by using expression quantitative trait loci (eQTLs) that capture information about both cis- and trans-acting regulatory effects. We control for known bias-inducing aspects of enrichment based analyses, such as: 1) gene length, 2) gene set size, 3) presence of biologically related genes within the same linkage disequilibrium (LD) region, and, 4) genes shared among multiple gene sets. We applied this approach to publicly available SJS/TEN genome-wide genotype data and identified the ABC transporter and Proteasome pathways as potentially implicated in the genetic susceptibility of non-drug-specific SJS/TEN. We demonstrated that the innovative SNP-to-gene mapping phase of the method was essential in detecting the significant enrichment for those pathways. Analysis of an independent gene expression dataset provides supportive functional evidence for the involvement of Proteasome pathways in SJS/TEN cutaneous lesions. 
These results suggest that Pointer provides a useful framework for the integrative analysis of pharmacogenetic GWAS data, by increasing the power to detect aggregate effects of multiple low risk variants. The software is available for download at https://sourceforge.net/projects/pointergsa/.
Karyadi, Danielle M.; Karlins, Eric; Decker, Brennan; vonHoldt, Bridgett M.; Carpintero-Ramirez, Gretchen; Parker, Heidi G.; Wayne, Robert K.; Ostrander, Elaine A.
2013-01-01
The domestic dog is a robust model for studying the genetics of complex disease susceptibility. The strategies used to develop and propagate modern breeds have resulted in an elevated risk for specific diseases in particular breeds. One example is that of Standard Poodles (STPOs), who have increased risk for squamous cell carcinoma of the digit (SCCD), a locally aggressive cancer that causes lytic bone lesions, sometimes with multiple toe recurrence. However, only STPOs of dark coat color are at high risk; light colored STPOs are almost entirely unaffected, suggesting that interactions between multiple pathways are necessary for oncogenesis. We performed a genome-wide association study (GWAS) on STPOs, comparing 31 SCCD cases to 34 unrelated black STPO controls. The peak SNP on canine chromosome 15 was statistically significant at the genome-wide level (Praw = 1.60×10−7; Pgenome = 0.0066). Additional mapping resolved the region to the KIT Ligand (KITLG) locus. Comparison of STPO cases to other at-risk breeds narrowed the locus to a 144.9-Kb region. Haplotype mapping among 84 STPO cases identified a minimal region of 28.3 Kb. A copy number variant (CNV) containing predicted enhancer elements was found to be strongly associated with SCCD in STPOs (P = 1.72×10−8). Light colored STPOs carry the CNV risk alleles at the same frequency as black STPOs, but are not susceptible to SCCD. A GWAS comparing 24 black and 24 light colored STPOs highlighted only the MC1R locus as significantly different between the two datasets, suggesting that a compensatory mutation within the MC1R locus likely protects light colored STPOs from disease. Our findings highlight a role for KITLG in SCCD susceptibility, as well as demonstrate that interactions between the KITLG and MC1R loci are potentially required for SCCD oncogenesis. These findings highlight how studies of breed-limited diseases are useful for disentangling multigene disorders. PMID:23555311
Li, Zhihua; Fan, Jingyi; Li, Ni; Zhu, Meng; Zhang, Jiahui; Wang, Yuzhuo; Geng, Liguo; Cheng, Yang; Ma, Hongxia; Jin, Guangfu; Dai, Juncheng; Hu, Zhibin; Shen, Hongbing
2018-05-29
Genome-wide association studies (GWAS) and fine mapping studies have identified multiple lung cancer susceptibility variants in the TERT-CLPTM1L region. However, the relationship between these risk variants and the independent lung cancer risk signals in this region remains unclear. Therefore, we evaluated the independent susceptibility signals for lung cancer and explored the potential functional variants in this region. Sequential conditional analysis was used to detect the independent susceptibility loci based on four lung cancer GWAS datasets with 12,843 lung cancer cases and 12,639 controls. Comprehensive functional annotations were performed for each independent signal. Three independent susceptibility signals were identified in the multi-ethnic population. For the first signal, rs2736100 showed the most significant association with lung cancer risk (C > A, OR = 0.82, 95%CI: 0.79-0.85, P = 1.98 × 10−25). Rs36019446 was the top-ranked site (A > G, OR = 0.88, 95%CI: 0.84-0.92, P = 1.74 × 10−9) in the second signal. For the third signal, rs326048 was the leading SNP (A > G, OR = 0.91, 95%CI: 0.87-0.95, P = 1.38 × 10−5). The following subgroup analysis found the same three loci among the Asian population. Further, we compared the difference between various subgroup populations. Functional annotations revealed that rs2736100, rs27996 (r2 = 0.85 with rs36019446) and rs326049 (r2 = 0.73 with rs326048) could be potential functional variants in these three risk signals, respectively. In conclusion, although multiple variants have been found associated with lung cancer risk in the TERT-CLPTM1L region, our findings indicated that there are three independent lung cancer susceptibility signals in this region. This article is protected by copyright. All rights reserved.
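As a rough consistency check on the figures quoted above, the following sketch recovers an approximate two-sided p-value from an odds ratio and its 95% confidence interval. It assumes a Wald test on the log-odds scale with a symmetric interval, which may not match the authors' exact procedure; the function name is hypothetical. Because the published interval is rounded to two decimals, the result only approximates the reported P value.

```python
import math


def p_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float) -> float:
    """Two-sided p-value implied by an odds ratio and its 95% CI.

    Assumes a Wald test on the log-odds scale with a symmetric interval.
    """
    # The 95% CI spans 2 * 1.96 standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    z = math.log(odds_ratio) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


# Values quoted in the abstract for rs2736100
p_lead = p_from_or_ci(0.82, 0.79, 0.85)
```

Using `math.erfc` instead of `1 - cdf` avoids loss of precision in the far tail, where such p-values would otherwise underflow to zero.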
Zhao, Zhongming; Guo, An-Yuan; van den Oord, Edwin J C G; Aliev, Fazil; Jia, Peilin; Edenberg, Howard J; Riley, Brien P; Dick, Danielle M; Bettinger, Jill C; Davies, Andrew G; Grotewiel, Michael S; Schuckit, Marc A; Agrawal, Arpana; Kramer, John; Nurnberger, John I; Kendler, Kenneth S; Webb, Bradley T; Miles, Michael F
2012-01-01
A variety of species and experimental designs have been used to study genetic influences on alcohol dependence, ethanol response, and related traits. Integration of these heterogeneous data can be used to produce a ranked target gene list for additional investigation. In this study, we performed a unique multi-species evidence-based data integration using three microarray experiments in mice or humans that generated an initial alcohol dependence (AD) related genes list, human linkage and association results, and gene sets implicated in C. elegans and Drosophila. We then used permutation and false discovery rate (FDR) analyses on the genome-wide association studies (GWAS) dataset from the Collaborative Study on the Genetics of Alcoholism (COGA) to evaluate the ranking results and weighting matrices. We found one weighting score matrix could increase FDR based q-values for a list of 47 genes with a score greater than 2. Our follow up functional enrichment tests revealed these genes were primarily involved in brain responses to ethanol and neural adaptations occurring with alcoholism. These results, along with our experimental validation of specific genes in mice, C. elegans and Drosophila, suggest that a cross-species evidence-based approach is useful to identify candidate genes contributing to alcoholism.
Fang, Lingzhao; Sahana, Goutam; Su, Guosheng; Yu, Ying; Zhang, Shengli; Lund, Mogens Sandø; Sørensen, Peter
2017-01-01
Connecting genome-wide association study (GWAS) to biological mechanisms underlying complex traits is a major challenge. Mastitis resistance and milk production are complex traits of economic importance in the dairy sector and are associated with intra-mammary infection (IMI). Here, we integrated IMI-relevant RNA-Seq data from Holstein cattle and sequence-based GWAS data from three dairy cattle breeds (i.e., Holstein, Nordic red cattle, and Jersey) to explore the genetic basis of mastitis resistance and milk production using post-GWAS analyses and a genomic feature linear mixed model. At 24 h post-IMI, genes responsive to IMI in the mammary gland were preferentially enriched for genetic variants associated with mastitis resistance rather than milk production. Response genes in the liver were mainly enriched for variants associated with mastitis resistance at an early time point (3 h) post-IMI, whereas responsive genes at later stages were enriched for associated variants with milk production. The up- and down-regulated genes were enriched for associated variants with mastitis resistance and milk production, respectively. The patterns were consistent across breeds, indicating that different breeds shared similarities in the genetic basis of these traits. Our approaches provide a framework for integrating multiple layers of data to understand the genetic architecture underlying complex traits. PMID:28358110
1000 Genomes-based meta-analysis identifies 10 novel loci for kidney function
Gorski, Mathias; van der Most, Peter J.; Teumer, Alexander; Chu, Audrey Y.; Li, Man; Mijatovic, Vladan; Nolte, Ilja M.; Cocca, Massimiliano; Taliun, Daniel; Gomez, Felicia; Li, Yong; Tayo, Bamidele; Tin, Adrienne; Feitosa, Mary F.; Aspelund, Thor; Attia, John; Biffar, Reiner; Bochud, Murielle; Boerwinkle, Eric; Borecki, Ingrid; Bottinger, Erwin P.; Chen, Ming-Huei; Chouraki, Vincent; Ciullo, Marina; Coresh, Josef; Cornelis, Marilyn C.; Curhan, Gary C.; d’Adamo, Adamo Pio; Dehghan, Abbas; Dengler, Laura; Ding, Jingzhong; Eiriksdottir, Gudny; Endlich, Karlhans; Enroth, Stefan; Esko, Tõnu; Franco, Oscar H.; Gasparini, Paolo; Gieger, Christian; Girotto, Giorgia; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Hancock, Stephen J.; Harris, Tamara B.; Helmer, Catherine; Höllerer, Simon; Hofer, Edith; Hofman, Albert; Holliday, Elizabeth G.; Homuth, Georg; Hu, Frank B.; Huth, Cornelia; Hutri-Kähönen, Nina; Hwang, Shih-Jen; Imboden, Medea; Johansson, Åsa; Kähönen, Mika; König, Wolfgang; Kramer, Holly; Krämer, Bernhard K.; Kumar, Ashish; Kutalik, Zoltan; Lambert, Jean-Charles; Launer, Lenore J.; Lehtimäki, Terho; de Borst, Martin; Navis, Gerjan; Swertz, Morris; Liu, Yongmei; Lohman, Kurt; Loos, Ruth J. 
F.; Lu, Yingchang; Lyytikäinen, Leo-Pekka; McEvoy, Mark A.; Meisinger, Christa; Meitinger, Thomas; Metspalu, Andres; Metzger, Marie; Mihailov, Evelin; Mitchell, Paul; Nauck, Matthias; Oldehinkel, Albertine J.; Olden, Matthias; WJH Penninx, Brenda; Pistis, Giorgio; Pramstaller, Peter P.; Probst-Hensch, Nicole; Raitakari, Olli T.; Rettig, Rainer; Ridker, Paul M.; Rivadeneira, Fernando; Robino, Antonietta; Rosas, Sylvia E.; Ruderfer, Douglas; Ruggiero, Daniela; Saba, Yasaman; Sala, Cinzia; Schmidt, Helena; Schmidt, Reinhold; Scott, Rodney J.; Sedaghat, Sanaz; Smith, Albert V.; Sorice, Rossella; Stengel, Benedicte; Stracke, Sylvia; Strauch, Konstantin; Toniolo, Daniela; Uitterlinden, Andre G.; Ulivi, Sheila; Viikari, Jorma S.; Völker, Uwe; Vollenweider, Peter; Völzke, Henry; Vuckovic, Dragana; Waldenberger, Melanie; Jin Wang, Jie; Yang, Qiong; Chasman, Daniel I.; Tromp, Gerard; Snieder, Harold; Heid, Iris M.; Fox, Caroline S.; Köttgen, Anna; Pattaro, Cristian; Böger, Carsten A.; Fuchsberger, Christian
2017-01-01
HapMap imputed genome-wide association studies (GWAS) have revealed >50 loci at which common variants with minor allele frequency >5% are associated with kidney function. GWAS using more complete reference sets for imputation, such as those from The 1000 Genomes project, promise to identify novel loci that have been missed by previous efforts. To investigate the value of such a more complete variant catalog, we conducted a GWAS meta-analysis of kidney function based on the estimated glomerular filtration rate (eGFR) in 110,517 European ancestry participants using 1000 Genomes imputed data. We identified 10 novel loci with p-value < 5 × 10−8 previously missed by HapMap-based GWAS. Six of these loci (HOXD8, ARL15, PIK3R1, EYA4, ASTN2, and EPB41L3) are tagged by common SNPs unique to the 1000 Genomes reference panel. Using pathway analysis, we identified 39 significant (FDR < 0.05) genes and 127 significantly (FDR < 0.05) enriched gene sets, which were missed by our previous analyses. Among those, the 10 identified novel genes are part of pathways of kidney development, carbohydrate metabolism, cardiac septum development and glucose metabolism. These results highlight the utility of re-imputing from denser reference panels, until whole-genome sequencing becomes feasible in large samples. PMID:28452372
1000 Genomes-based meta-analysis identifies 10 novel loci for kidney function.
Gorski, Mathias; van der Most, Peter J; Teumer, Alexander; Chu, Audrey Y; Li, Man; Mijatovic, Vladan; Nolte, Ilja M; Cocca, Massimiliano; Taliun, Daniel; Gomez, Felicia; Li, Yong; Tayo, Bamidele; Tin, Adrienne; Feitosa, Mary F; Aspelund, Thor; Attia, John; Biffar, Reiner; Bochud, Murielle; Boerwinkle, Eric; Borecki, Ingrid; Bottinger, Erwin P; Chen, Ming-Huei; Chouraki, Vincent; Ciullo, Marina; Coresh, Josef; Cornelis, Marilyn C; Curhan, Gary C; d'Adamo, Adamo Pio; Dehghan, Abbas; Dengler, Laura; Ding, Jingzhong; Eiriksdottir, Gudny; Endlich, Karlhans; Enroth, Stefan; Esko, Tõnu; Franco, Oscar H; Gasparini, Paolo; Gieger, Christian; Girotto, Giorgia; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Hancock, Stephen J; Harris, Tamara B; Helmer, Catherine; Höllerer, Simon; Hofer, Edith; Hofman, Albert; Holliday, Elizabeth G; Homuth, Georg; Hu, Frank B; Huth, Cornelia; Hutri-Kähönen, Nina; Hwang, Shih-Jen; Imboden, Medea; Johansson, Åsa; Kähönen, Mika; König, Wolfgang; Kramer, Holly; Krämer, Bernhard K; Kumar, Ashish; Kutalik, Zoltan; Lambert, Jean-Charles; Launer, Lenore J; Lehtimäki, Terho; de Borst, Martin; Navis, Gerjan; Swertz, Morris; Liu, Yongmei; Lohman, Kurt; Loos, Ruth J F; Lu, Yingchang; Lyytikäinen, Leo-Pekka; McEvoy, Mark A; Meisinger, Christa; Meitinger, Thomas; Metspalu, Andres; Metzger, Marie; Mihailov, Evelin; Mitchell, Paul; Nauck, Matthias; Oldehinkel, Albertine J; Olden, Matthias; Wjh Penninx, Brenda; Pistis, Giorgio; Pramstaller, Peter P; Probst-Hensch, Nicole; Raitakari, Olli T; Rettig, Rainer; Ridker, Paul M; Rivadeneira, Fernando; Robino, Antonietta; Rosas, Sylvia E; Ruderfer, Douglas; Ruggiero, Daniela; Saba, Yasaman; Sala, Cinzia; Schmidt, Helena; Schmidt, Reinhold; Scott, Rodney J; Sedaghat, Sanaz; Smith, Albert V; Sorice, Rossella; Stengel, Benedicte; Stracke, Sylvia; Strauch, Konstantin; Toniolo, Daniela; Uitterlinden, Andre G; Ulivi, Sheila; Viikari, Jorma S; Völker, Uwe; Vollenweider, Peter; Völzke, Henry; Vuckovic, Dragana; 
Waldenberger, Melanie; Jin Wang, Jie; Yang, Qiong; Chasman, Daniel I; Tromp, Gerard; Snieder, Harold; Heid, Iris M; Fox, Caroline S; Köttgen, Anna; Pattaro, Cristian; Böger, Carsten A; Fuchsberger, Christian
2017-04-28
HapMap imputed genome-wide association studies (GWAS) have revealed >50 loci at which common variants with minor allele frequency >5% are associated with kidney function. GWAS using more complete reference sets for imputation, such as those from The 1000 Genomes project, promise to identify novel loci that have been missed by previous efforts. To investigate the value of such a more complete variant catalog, we conducted a GWAS meta-analysis of kidney function based on the estimated glomerular filtration rate (eGFR) in 110,517 European ancestry participants using 1000 Genomes imputed data. We identified 10 novel loci with p-value < 5 × 10−8 previously missed by HapMap-based GWAS. Six of these loci (HOXD8, ARL15, PIK3R1, EYA4, ASTN2, and EPB41L3) are tagged by common SNPs unique to the 1000 Genomes reference panel. Using pathway analysis, we identified 39 significant (FDR < 0.05) genes and 127 significantly (FDR < 0.05) enriched gene sets, which were missed by our previous analyses. Among those, the 10 identified novel genes are part of pathways of kidney development, carbohydrate metabolism, cardiac septum development and glucose metabolism. These results highlight the utility of re-imputing from denser reference panels, until whole-genome sequencing becomes feasible in large samples.
Cannon, Maren E.; Duan, Qing; Wu, Ying; Zeynalzadeh, Monica; Xu, Zheng; Kangas, Antti J.; Soininen, Pasi; Ala-Korpela, Mika; Civelek, Mete; Lusis, Aldons J.; Kuusisto, Johanna; Collins, Francis S.; Boehnke, Michael; Tang, Hua; Laakso, Markku; Li, Yun; Mohlke, Karen L.
2017-01-01
Recent genome-wide association studies (GWAS) have identified variants associated with high-density lipoprotein cholesterol (HDL-C) located in or near the ANGPTL8 gene. Given the extensive sharing of GWAS loci across populations, we hypothesized that at least one shared variant at this locus affects HDL-C. The HDL-C–associated variants are coincident with expression quantitative trait loci for ANGPTL8 and DOCK6 in subcutaneous adipose tissue; however, only ANGPTL8 expression levels are associated with HDL-C levels. We identified a 400-bp promoter region of ANGPTL8 and enhancer regions within 5 kb that contribute to regulating expression in liver and adipose. To identify variants functionally responsible for the HDL-C association, we performed fine-mapping analyses and selected 13 candidate variants that overlap putative regulatory regions to test for allelic differences in regulatory function. Of these variants, rs12463177-G increased transcriptional activity (1.5-fold, P = 0.004) and showed differential protein binding. Six additional variants (rs17699089, rs200788077, rs56322906, rs3760782, rs737337, and rs3745683) showed evidence of allelic differences in transcriptional activity and/or protein binding. Taken together, these data suggest a regulatory mechanism at the ANGPTL8 HDL-C GWAS locus involving tissue-selective expression and at least one functional variant. PMID:28754724
Bauchet, Guillaume; Grenier, Stéphane; Samson, Nicolas; Bonnet, Julien; Grivet, Laurent; Causse, Mathilde
2017-05-01
A panel of 300 tomato accessions including breeding materials was built and characterized with >11,000 SNPs. A population structure in six subgroups was identified. Strong heterogeneity in linkage disequilibrium and recombination landscape among groups and chromosomes was shown. GWAS identified several associations for fruit weight, earliness and plant growth. Genome-wide association studies (GWAS) have become a method of choice in quantitative trait dissection. First limited to highly polymorphic and outcrossing species, they are now applied in horticultural crops, notably in tomato. Until now, GWAS in tomato has been performed on panels of heirloom and wild accessions. Using modern breeding materials would be of direct interest for breeding purposes. To implement GWAS on a large panel of 300 tomato accessions including 168 breeding lines, this study assessed genetic diversity and linkage disequilibrium decay, characterized the population structure, and performed a GWA experiment. Genetic diversity and population structure analyses were based on molecular markers (>11,000 SNPs) covering the whole genome. Six genetic subgroups were revealed and associated with traits of agronomical interest, such as fruit weight and disease resistance. Estimates of linkage disequilibrium highlighted the heterogeneity of its decay among genetic subgroups. Haplotype definition allowed a fine characterization of the groups and their recombination landscape, revealing the patterns of admixture along the genome. Selection footprints showed results in congruence with introgressions. Taken together, all these elements refined our knowledge of the genetic material included in this panel and allowed the identification of several associations for fruit weight, plant growth and earliness, deciphering the genetic architecture of these complex traits and identifying several new loci useful for tomato breeding.
Adib-Samii, Poneh; Devan, William; Traylor, Matthew; Lanfranconi, Silvia; Zhang, Cathy R; Cloonan, Lisa; Falcone, Guido J; Radmanesh, Farid; Fitzpatrick, Kaitlin; Kanakis, Allison; Rothwell, Peter M; Sudlow, Cathie; Boncoraglio, Giorgio B; Meschia, James F; Levi, Chris; Dichgans, Martin; Bevan, Steve; Rosand, Jonathan; Rost, Natalia S; Markus, Hugh S
2015-02-01
Epidemiological studies suggest that white matter hyperintensities (WMH) are extremely heritable, but the underlying genetic variants are largely unknown. Pathophysiological heterogeneity is known to reduce the power of genome-wide association studies (GWAS). Hypertensive and nonhypertensive individuals with WMH might have different underlying pathologies. We used GWAS data to calculate the variance in WMH volume (WMHV) explained by common single nucleotide polymorphisms (SNPs) as a measure of heritability (SNP heritability [HSNP]) and tested the hypothesis that WMH heritability differs between hypertensive and nonhypertensive individuals. WMHV was measured on MRI in the stroke-free cerebral hemisphere of 2336 ischemic stroke cases with GWAS data. After adjustment for age and intracranial volume, we determined which cardiovascular risk factors were independent predictors of WMHV. We used the genome-wide complex trait analysis tool to estimate HSNP for WMHV overall and within subgroups stratified by risk factors found to be significant in multivariate analyses. A significant proportion of the variance of WMHV was attributable to common SNPs after adjustment for significant risk factors (HSNP=0.23; P=0.0026). HSNP estimates were higher among hypertensive individuals (HSNP=0.45; P=7.99×10−5); this increase was greater than expected by chance (P=0.012). In contrast, estimates were lower, and nonsignificant, in nonhypertensive individuals (HSNP=0.13; P=0.13). A quarter of the variance in WMHV is attributable to common SNPs, but this estimate was greater in hypertensive individuals. These findings suggest that the genetic architecture of WMH in ischemic stroke differs between hypertensives and nonhypertensives. Future WMHV GWAS studies may gain power by accounting for this interaction. © 2014 The Authors. Published on behalf of the American Heart Association, Inc., by Wolters Kluwer.
Contrasting results from GWAS and QTL mapping on wing length in great reed warblers.
Hansson, Bengt; Sigeman, Hanna; Stervander, Martin; Tarka, Maja; Ponnikas, Suvi; Strandh, Maria; Westerdahl, Helena; Hasselquist, Dennis
2018-04-15
A major goal in evolutionary biology is to understand the genetic basis of adaptive traits. In migratory birds, wing morphology is such a trait. Our previous work on the great reed warbler (Acrocephalus arundinaceus) shows that wing length is highly heritable and under sexually antagonistic selection. Moreover, a quantitative trait locus (QTL) mapping analysis detected a pronounced QTL for wing length on chromosome 2, suggesting that wing morphology is partly controlled by genes with large effects. Here, we re-evaluate the genetic basis of wing length in great reed warblers using a genomewide association study (GWAS) approach based on restriction site-associated DNA sequencing (RADseq) data. We use GWAS models that account for relatedness between individuals and include covariates (sex, age and tarsus length). The resulting association landscape was flat with no peaks on chromosome 2 or elsewhere, which is in line with expectations for polygenic traits. Analysis of the distribution of p-values did not reveal biases, and the inflation factor was low. Effect sizes were, however, not uniformly distributed on some chromosomes, and the Z chromosome had weaker associations than autosomes. The level of linkage disequilibrium (LD) in the population decayed to background levels within c. 1 kbp. There could be several reasons why our QTL study and GWAS gave contrasting results, including differences in how associations are modelled (cosegregation in pedigree vs. LD associations), how covariates are accounted for in the models, type of marker used (multi- vs. biallelic), difference in power or a combination of these. Our study highlights that the genetic architecture even of highly heritable traits is difficult to characterize in wild populations. © 2018 John Wiley & Sons Ltd.
Qian, David C.; Byun, Jinyoung; Han, Younghun; Greene, Casey S.; Field, John K.; Hung, Rayjean J.; Brhane, Yonathan; Mclaughlin, John R.; Fehringer, Gordon; Landi, Maria Teresa; Rosenberger, Albert; Bickeböller, Heike; Malhotra, Jyoti; Risch, Angela; Heinrich, Joachim; Hunter, David J.; Henderson, Brian E.; Haiman, Christopher A.; Schumacher, Fredrick R.; Eeles, Rosalind A.; Easton, Douglas F.; Seminara, Daniela; Amos, Christopher I.
2015-01-01
Results from genome-wide association studies (GWAS) have indicated that strong single-gene effects are the exception, not the rule, for most diseases. We assessed the joint effects of germline genetic variations through a pathway-based approach that considers the tissue-specific contexts of GWAS findings. From GWAS meta-analyses of lung cancer (12 160 cases/16 838 controls), breast cancer (15 748 cases/18 084 controls) and prostate cancer (14 160 cases/12 724 controls) in individuals of European ancestry, we determined the tissue-specific interaction networks of proteins expressed from genes that are likely to be affected by disease-associated variants. Reactome pathways exhibiting enrichment of proteins from each network were compared across the cancers. Our results show that pathways associated with all three cancers tend to be broad cellular processes required for growth and survival. Significant examples include the nerve growth factor (P = 7.86 × 10−33), epidermal growth factor (P = 1.18 × 10−31) and fibroblast growth factor (P = 2.47 × 10−31) signaling pathways. However, within these shared pathways, the genes that influence risk largely differ by cancer. Pathways found to be unique for a single cancer focus on more specific cellular functions, such as interleukin signaling in lung cancer (P = 1.69 × 10−15), apoptosis initiation by Bad in breast cancer (P = 3.14 × 10−9) and cellular responses to hypoxia in prostate cancer (P = 2.14 × 10−9). We present the largest comparative cross-cancer pathway analysis of GWAS to date. Our approach can also be applied to the study of inherited mechanisms underlying risk across multiple diseases in general. PMID:26483192
Eleftherohorinou, Hariklia; Hoggart, Clive J; Wright, Victoria J; Levin, Michael; Coin, Lachlan J M
2011-09-01
Rheumatoid arthritis (RA) is the commonest chronic, systemic, inflammatory disorder affecting ∼1% of the world population. It has a strong genetic component and a growing number of associated genes have been discovered in genome-wide association studies (GWAS), which nevertheless only account for 23% of the total genetic risk. We aimed to identify additional susceptibility loci through the analysis of GWAS in the context of biological function. We bridge the gap between pathway and gene-oriented analyses of GWAS, by introducing a pathway-driven gene stability-selection methodology that identifies potential causal genes in the top-associated disease pathways that may be driving the pathway association signals. We analysed the WTCCC and the NARAC studies of ∼5000 and ∼2000 subjects, respectively. We examined 700 pathways comprising ∼8000 genes. Ranking pathways by significance revealed that the NARAC top-ranked ∼6% lay within the top 10% of WTCCC. Gene selection on those pathways identified 58 genes in WTCCC and 61 in NARAC; 21 of those were common (P(overlap) < 10−21), of which 16 were novel discoveries. Among the identified genes, we validated 10 known RA associations in WTCCC and 13 in NARAC, not discovered using single-SNP approaches on the same data. Gene ontology functional enrichment analysis on the identified genes showed significant over-representation of signalling activity (P < 10−29) in both studies. Our findings suggest a novel model of RA genetic predisposition, which involves cell-membrane receptors and genes in second messenger signalling systems, in addition to genes that regulate immune responses, which have been the focus of interest previously.
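The overlap significance quoted above (21 genes shared between lists of 58 and 61) can be put in context with a hypergeometric tail probability. This is a sketch under stated assumptions, not the authors' calculation: it takes the ∼8000 pathway genes as the background population and treats the two gene lists as independent random draws. The function and parameter names are hypothetical.

```python
from math import comb


def overlap_p_value(population: int, set_a: int, set_b: int, overlap: int) -> float:
    """P(X >= overlap) when set_b genes are drawn at random from `population`,
    of which set_a are already selected (hypergeometric upper tail)."""
    upper = min(set_a, set_b)
    favourable = sum(
        comb(set_a, k) * comb(population - set_a, set_b - k)
        for k in range(overlap, upper + 1)
    )
    # Exact integer arithmetic until the final division keeps the tail accurate.
    return favourable / comb(population, set_b)


# Counts taken from the abstract; the background of 8000 genes is an assumption.
p_overlap = overlap_p_value(population=8000, set_a=58, set_b=61, overlap=21)
```

Under these assumptions the expected overlap is below one gene, so an overlap of 21 falls far in the tail, consistent in direction with the reported bound.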
Chen, D T; Jiang, X; Akula, N; Shugart, Y Y; Wendland, J R; Steele, C J M; Kassem, L; Park, J-H; Chatterjee, N; Jamain, S; Cheng, A; Leboyer, M; Muglia, P; Schulze, T G; Cichon, S; Nöthen, M M; Rietschel, M; McMahon, F J; Farmer, A; McGuffin, P; Craig, I; Lewis, C; Hosang, G; Cohen-Woods, S; Vincent, J B; Kennedy, J L; Strauss, J
2013-02-01
Meta-analyses of bipolar disorder (BD) genome-wide association studies (GWAS) have identified several genome-wide significant signals in European-ancestry samples, but so far account for little of the inherited risk. We performed a meta-analysis of ∼750,000 high-quality genetic markers on a combined sample of ∼14,000 subjects of European and Asian-ancestry (phase I). The most significant findings were further tested in an extended sample of ∼17,700 cases and controls (phase II). The results suggest novel association findings near the genes TRANK1 (LBA1), LMAN2L and PTGFR. In phase I, the most significant single nucleotide polymorphism (SNP), rs9834970 near TRANK1, was significant at the P=2.4 × 10−11 level, with no heterogeneity. Supportive evidence for prior association findings near ANK3 and a locus on chromosome 3p21.1 was also observed. The phase II results were similar, although the heterogeneity test became significant for several SNPs. On the basis of these results and other established risk loci, we used the method developed by Park et al. to estimate the number, and the effect size distribution, of BD risk loci that could still be found by GWAS methods. We estimate that >63,000 case-control samples would be needed to identify the ∼105 BD risk loci discoverable by GWAS, and that these will together explain <6% of the inherited risk. These results support previous GWAS findings and identify three new candidate genes for BD. Further studies are needed to replicate these findings and may potentially lead to identification of functional variants. Sample size will remain a limiting factor in the discovery of common alleles associated with BD.
Kulminski, Alexander M.; Culminskaya, Irina; Arbeev, Konstantin G.; Arbeeva, Liubov; Ukraintseva, Svetlana V.; Stallard, Eric; Wu, Deqing; Yashin, Anatoliy I.
2015-01-01
Insights into genetic origin of diseases and related traits could substantially impact strategies for improving human health. The results of genome-wide association studies (GWAS) are often positioned as discoveries of unconditional risk alleles of complex health traits. We re-analyzed the associations of single nucleotide polymorphisms (SNPs) associated with total cholesterol (TC) in a large-scale GWAS meta-analysis. We focused on three generations of genotyped participants of the Framingham Heart Study (FHS). We show that the effects of all ten directly-genotyped SNPs were clustered in different FHS generations and/or birth cohorts in a sex-specific or sex-unspecific manner. The sample size and procedure-therapeutic issues play, at most, a minor role in this clustering. An important result was clustering of significant associations with the strongest effects in the youngest, or 3rd Generation, cohort. These results imply that an assumption of unconditional connections of these SNPs with TC is generally implausible and that a demographic perspective can substantially improve GWAS efficiency. The analyses of genetic effects in age-matched samples suggest a role of environmental and age-related mechanisms in the associations of different SNPs with TC. Analysis of the literature supports systemic roles for genes for these SNPs beyond those related to lipid metabolism. Our analyses reveal strong antagonistic effects of rs2479409 (the PCSK9 gene) that cautions strategies aimed at targeting this gene in the next generation of lipid drugs. Our results suggest that standard GWAS strategies need to be advanced in order to appropriately address the problem of genetic susceptibility to complex traits that is imperative for translation to health care. PMID:26295473
Complex Disease Endotypes and Implications for GWAS and Exposomics
Presentation Type: Symposia Symposium Title: Human Exposome Discovery and Disease Investigation Abstract Title: Complex Disease Endotypes and Implications for GWAS and Exposomics Authors: Stephen W. Edwards1, David M. Reif, Elaine Cohen Hubaf, ClarLynda Williams-DeVa...
Hicks, Chindo; Kumar, Ranjit; Pannuti, Antonio; Miele, Lucio
2012-01-01
Variable response and resistance to tamoxifen treatment in breast cancer patients remains a major clinical problem. To determine whether genes and biological pathways containing SNPs associated with risk for breast cancer are dysregulated in response to tamoxifen treatment, we performed analysis combining information from 43 genome-wide association studies with gene expression data from 298 ER(+) breast cancer patients treated with tamoxifen and 125 ER(+) controls. We identified 95 genes which distinguished tamoxifen treated patients from controls. Additionally, we identified 54 genes which stratified tamoxifen treated patients into two distinct groups. We identified biological pathways containing SNPs associated with risk for breast cancer, which were dysregulated in response to tamoxifen treatment. Key pathways identified included the apoptosis, P53, NFkB, DNA repair and cell cycle pathways. Combining GWAS with transcription profiling provides a unified approach for associating GWAS findings with response to drug treatment and identification of potential drug targets.
Assessment of Parkinson’s disease risk loci in Greece
Kara, Eleanna; Xiromerisiou, Georgia; Spanaki, Cleanthe; Bozi, Maria; Koutsis, Georgios; Panas, Marios; Dardiotis, Efthimios; Ralli, Styliani; Bras, Jose; Letson, Christopher; Edsall, Connor; Pliner, Hannah; Arepali, Sampath; Kalinderi, Kallirhoe; Fidani, Liana; Bostanjopoulou, Sevasti; Keller, Margaux F; Wood, Nicholas W; Hardy, John; Houlden, Henry; Stefanis, Leonidas; Plaitakis, Andreas; Hernandez, Dena; Hadjigeorgiou, Georgios M; Nalls, Mike A; Singleton, Andrew B
2013-01-01
Genome wide association studies (GWAS) have been shown to be a powerful approach to identify risk loci for neurodegenerative diseases. Recent GWAS in Parkinson’s disease (PD) have been successful in identifying numerous risk variants pointing to novel pathways potentially implicated in the pathogenesis of PD. Contributing to these GWAS efforts, we performed genotyping of previously identified risk alleles in PD patients and controls from Greece. We showed that previously published risk profiles for Northern European and American populations are also applicable to the Greek population. In addition, while we were largely underpowered to detect individual associations we replicated 5 of 32 previously published risk variants with nominal p-values <0.05. Genome-wide complex trait analysis (GCTA) revealed that known risk loci explain disease risk in 1.27% of Greek PD patients. Collectively, these results indicate that there is likely a substantial genetic component to PD in Greece similarly to other worldwide populations that remains to be discovered. PMID:24080174
Effects of GWAS-Associated Genetic Variants on lncRNAs within IBD and T1D Candidate Loci
Brorsson, Caroline A.; Pociot, Flemming
2014-01-01
Long non-coding RNAs are a new class of non-coding RNAs that are at the crosshairs in many human diseases such as cancers, cardiovascular disorders, and inflammatory and autoimmune diseases like Inflammatory Bowel Disease (IBD) and Type 1 Diabetes (T1D). Nearly 90% of the phenotype-associated single-nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) lie outside of the protein coding regions, and map to the non-coding intervals. However, the relationship between phenotype-associated loci and the non-coding regions including the long non-coding RNAs (lncRNAs) is poorly understood. Here, we systematically identified all annotated IBD and T1D loci-associated lncRNAs, and mapped nominally significant GWAS/ImmunoChip SNPs for IBD and T1D within these lncRNAs. Additionally, we identified tissue-specific cis-eQTLs, and strong linkage disequilibrium (LD) signals associated with these SNPs. We explored sequence and structure based attributes of these lncRNAs, and also predicted the structural effects of mapped SNPs within them. We also identified lncRNAs in IBD and T1D that are under recent positive selection. Our analysis identified putative lncRNA secondary structure-disruptive SNPs within and in close proximity (+/−5 kb flanking regions) of IBD and T1D loci-associated candidate genes, suggesting that these RNA conformation-altering polymorphisms might be associated with the disease phenotype. Disruption of lncRNA secondary structure due to the presence of GWAS SNPs provides valuable information that could be potentially useful for future structure-function studies on lncRNAs. PMID:25144376
Sonah, Humira; O'Donoughue, Louise; Cober, Elroy; Rajcan, Istvan; Belzile, François
2015-02-01
Soya bean is a major source of edible oil and protein for human consumption as well as animal feed. Understanding the genetic basis of different traits in soya bean will provide important insights for improving breeding strategies for this crop. A genome-wide association study (GWAS) was conducted to accelerate molecular breeding for the improvement of agronomic traits in soya bean. A genotyping-by-sequencing (GBS) approach was used to provide dense genome-wide marker coverage (>47,000 SNPs) for a panel of 304 short-season soya bean lines. A subset of 139 lines, representative of the diversity among these, was characterized phenotypically for eight traits under six environments (3 sites × 2 years). Marker coverage proved sufficient to ensure highly significant associations between the genes known to control simple traits (flower, hilum and pubescence colour) and flanking SNPs. Between one and eight genomic loci associated with more complex traits (maturity, plant height, seed weight, seed oil and protein) were also identified. Importantly, most of these GWAS loci were located within genomic regions identified by previously reported quantitative trait locus (QTL) for these traits. In some cases, the reported QTLs were also successfully validated by additional QTL mapping in a biparental population. This study demonstrates that integrating GBS and GWAS can be used as a powerful complementary approach to classical biparental mapping for dissecting complex traits in soya bean. © 2014 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.
Abdulkadir, Mohamed; Londono, Douglas; Gordon, Derek; Fernandez, Thomas V; Brown, Lawrence W; Cheon, Keun-Ah; Coffey, Barbara J; Elzerman, Lonneke; Fremer, Carolin; Fründt, Odette; Garcia-Delgar, Blanca; Gilbert, Donald L; Grice, Dorothy E; Hedderly, Tammy; Heyman, Isobel; Hong, Hyun Ju; Huyser, Chaim; Ibanez-Gomez, Laura; Jakubovski, Ewgeni; Kim, Young Key; Kim, Young Shin; Koh, Yun-Joo; Kook, Sodahm; Kuperman, Samuel; Leventhal, Bennett; Ludolph, Andrea G; Madruga-Garrido, Marcos; Maras, Athanasios; Mir, Pablo; Morer, Astrid; Müller-Vahl, Kirsten; Münchau, Alexander; Murphy, Tara L; Plessen, Kerstin J; Roessner, Veit; Shin, Eun-Young; Song, Dong-Ho; Song, Jungeun; Tübing, Jennifer; van den Ban, Els; Visscher, Frank; Wanderer, Sina; Woods, Martin; Zinner, Samuel H; King, Robert A; Tischfield, Jay A; Heiman, Gary A; Hoekstra, Pieter J; Dietrich, Andrea
2018-04-01
Genetic studies in Tourette syndrome (TS) are characterized by scattered and poorly replicated findings. We aimed to replicate findings from candidate gene and genome-wide association studies (GWAS). Our cohort included 465 probands with chronic tic disorder (93% TS) and both parents from 412 families (some probands were siblings). We assessed 75 single nucleotide polymorphisms (SNPs) in 465 parent-child trios; 117 additional SNPs in 211 trios; and 4 additional SNPs in 254 trios. We performed SNP and gene-based transmission disequilibrium tests and compared nominally significant SNP results with those from a large independent case-control cohort. After quality control 71 SNPs were available in 371 trios; 112 SNPs in 179 trios; and 3 SNPs in 192 trios. 17 were candidate SNPs implicated in TS and 2 were implicated in obsessive-compulsive disorder (OCD) or autism spectrum disorder (ASD); 142 were tagging SNPs from eight monoamine neurotransmitter-related genes (including dopamine and serotonin); 10 were top SNPs from TS GWAS; and 13 top SNPs from attention-deficit/hyperactivity disorder, OCD, or ASD GWAS. None of the SNPs or genes reached significance after adjustment for multiple testing. We observed nominal significance for the candidate SNPs rs3744161 (TBCD) and rs4565946 (TPH2) and for five tagging SNPs; none of these showed significance in the independent cohort. Also, SLC1A1 in our gene-based analysis and two TS GWAS SNPs showed nominal significance, rs11603305 (intergenic) and rs621942 (PICALM). We found no convincing support for previously implicated genetic polymorphisms. Targeted re-sequencing should fully appreciate the relevance of candidate genes.
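The SNP-based transmission disequilibrium test (TDT) used in the trio analysis above can be illustrated with the classic McNemar-type statistic. This is a minimal sketch, not the authors' pipeline: the function name `tdt_statistic` and the counts are hypothetical, and only heterozygous parents are assumed to be counted.

```python
import math


def tdt_statistic(transmitted, untransmitted):
    """Classic TDT for one biallelic SNP.

    transmitted / untransmitted: how often heterozygous parents passed /
    did not pass the allele of interest to the affected child.
    Returns (chi-square with 1 df, p-value).
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, for illustration only.
chi2, p = tdt_statistic(transmitted=60, untransmitted=40)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

A nominal p-value from such a test would still need adjustment for the number of SNPs tested, which is the step at which none of the SNPs in the abstract remained significant.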
Relevance of genetic relationship in GWAS and genomic prediction.
Pereira, Helcio Duarte; Soriano Viana, José Marcelo; Andrade, Andréa Carla Bastos; Fonseca E Silva, Fabyano; Paes, Geísa Pinheiro
2018-02-01
The objective of this study was to analyze the relevance of relationship information on the identification of low heritability quantitative trait loci (QTLs) from a genome-wide association study (GWAS) and on the genomic prediction of complex traits in human, animal and cross-pollinating populations. The simulation-based data sets included 50 samples of 1000 individuals of seven populations derived from a common population with linkage disequilibrium. The populations had non-inbred and inbred progeny structure (50 to 200) with varying number of members (5 to 20). The individuals were genotyped for 10,000 single nucleotide polymorphisms (SNPs) and phenotyped for a quantitative trait controlled by 10 QTLs and 90 minor genes showing dominance. The SNP density was 0.1 cM and the narrow sense heritability was 25%. The QTL heritabilities ranged from 1.1 to 2.9%. We applied mixed model approaches for both GWAS and genomic prediction using pedigree-based and genomic relationship matrices. For GWAS, the observed false discovery rate was kept below the significance level of 5%, the power of detection for the low heritability QTLs ranged from 14 to 50%, and the average bias between significant SNPs and a QTL ranged from less than 0.01 to 0.23 cM. The QTL detection power was consistently higher using genomic relationship matrix. Regardless of population and training set size, genomic prediction provided higher prediction accuracy of complex trait when compared to pedigree-based prediction. The accuracy of genomic prediction when there is relatedness between individuals in the training set and the reference population is much higher than the value for unrelated individuals.
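A genomic relationship matrix of the kind mentioned above can be built directly from SNP allele counts. The sketch below assumes the widely used VanRaden-style construction (centre by twice the allele frequency, scale by 2Σp(1−p)); the abstract does not say which construction the authors used, and the function name and toy genotypes are hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """genotypes: list of individuals, each a list of allele counts (0, 1, 2).

    Returns G = Z Z' / (2 * sum_j p_j (1 - p_j)), where Z holds the
    genotypes centred by twice the allele frequency of each SNP.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    freqs = [sum(ind[j] for ind in genotypes) / (2.0 * n) for j in range(m)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all SNPs are monomorphic")
    centred = [[ind[j] - 2.0 * freqs[j] for j in range(m)] for ind in genotypes]
    return [
        [sum(a * b for a, b in zip(centred[i], centred[k])) / scale for k in range(n)]
        for i in range(n)
    ]


# Toy data: 4 individuals x 3 SNPs (illustrative only).
toy = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
G = genomic_relationship_matrix(toy)
for row in G:
    print([round(value, 3) for value in row])
```

In a mixed model, this matrix would replace the pedigree-based relationship matrix as the covariance structure of the random animal effect.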
Lack of replication of previous autism spectrum disorder GWAS hits in European populations.
Torrico, Bàrbara; Chiocchetti, Andreas G; Bacchelli, Elena; Trabetti, Elisabetta; Hervás, Amaia; Franke, Barbara; Buitelaar, Jan K; Rommelse, Nanda; Yousaf, Afsheen; Duketis, Eftichia; Freitag, Christine M; Caballero-Andaluz, Rafaela; Martinez-Mir, Amalia; Scholl, Francisco G; Ribasés, Marta; Battaglia, Agatino; Malerba, Giovanni; Delorme, Richard; Benabou, Marion; Maestrini, Elena; Bourgeron, Thomas; Cormand, Bru; Toma, Claudio
2017-02-01
Common variants contribute significantly to the genetics of autism spectrum disorder (ASD), although the identification of individual risk polymorphisms remains elusive due to their small effect sizes and the limited sample sizes available for association studies. During the last decade several genome-wide association studies (GWAS) have enabled the detection of a few plausible risk variants. The three main studies are family-based and pointed at SEMA5A (rs10513025), MACROD2 (rs4141463) and MSNP1 (rs4307059). In our study we attempted to replicate these GWAS hits using a case-control association study in five European populations of ASD patients and gender-matched controls, all Caucasians. Results showed no association of individual variants with ASD in any of the population groups considered or in the combined European sample. We performed a meta-analysis study across five European populations for rs10513025 (1,904 ASD cases and 2,674 controls), seven European populations for rs4141463 (2,855 ASD cases and 36,177 controls) and five European populations for rs4307059 (2,347 ASD cases and 2,764 controls). The results showed an odds ratio (OR) of 1.05 (95% CI = 0.84-1.32) for rs10513025, 1.0002 (95% CI = 0.93-1.08) for rs4141463 and 1.01 (95% CI = 0.92-1.1) for rs4307059, with no significant P-values (rs10513025, P = 0.73; rs4141463, P = 0.95; rs4307059, P = 0.9). No association was found when we considered either only high functioning autism (HFA), genders separately or only multiplex families. Ongoing GWAS projects with larger ASD cohorts will help clarify the role of common variation in the disorder and will likely identify risk variants of modest effect not detected previously. Autism Res 2017, 10: 202-211. © 2016 International Society for Autism Research, Wiley Periodicals, Inc.
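Pooling per-population odds ratios, as in the meta-analysis above, is often done by inverse-variance weighting on the log scale. The sketch below assumes a fixed-effect model, which the abstract does not confirm; the function names and the input values are hypothetical and are not the study's data.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(OR) recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def fixed_effect_meta(odds_ratios, standard_errors):
    """Inverse-variance fixed-effect pooling of odds ratios.

    Returns (pooled OR, (CI lower, CI upper), two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    log_or = sum(w * math.log(o) for w, o in zip(weights, odds_ratios)) / total
    se = math.sqrt(1.0 / total)
    p_value = math.erfc(abs(log_or / se) / math.sqrt(2.0))
    ci = (math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se))
    return math.exp(log_or), ci, p_value


# Hypothetical per-population estimates, for illustration only.
pooled, ci, p = fixed_effect_meta([1.10, 0.95, 1.02], [0.20, 0.15, 0.25])
print(f"OR = {pooled:.3f}, 95% CI = {ci[0]:.2f}-{ci[1]:.2f}, p = {p:.2f}")
```

A confidence interval that spans 1, as for all three SNPs in the abstract, corresponds to a non-significant pooled effect under this kind of model.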
Howard, Jeremy T; Jiao, Shihui; Tiezzi, Francesco; Huang, Yijian; Gray, Kent A; Maltecca, Christian
2015-05-30
Feed intake and growth are economically important traits in swine production. Previous genome wide association studies (GWAS) have utilized average daily gain or daily feed intake to identify regions that impact growth and feed intake across time. The use of longitudinal models in GWAS, such as random regression, allows SNPs having a heterogeneous effect across the trajectory to be characterized. The objective of this study is therefore to conduct a single step GWAS (ssGWAS) on the animal polynomial coefficients for feed intake and growth. Corrected daily feed intake (DFIAdj) and average daily weight measurements (DBWAvg) on 8981 (n=525,240 observations) and 5643 (n=283,607 observations) animals were utilized in a random regression model using Legendre polynomials (order=2) and a relationship matrix that included genotyped and un-genotyped animals. A ssGWAS was conducted on the animal polynomial coefficients (intercept, linear and quadratic) for animals with genotypes (DFIAdj: n=855; DBWAvg: n=590). Regions were characterized based on the variance of 10-SNP sliding windows GEBV (WGEBV). A bootstrap analysis (n=1000) was conducted to declare significance. Heritability estimates for the trait trajectories ranged from 0.34 to 0.52 for DBWAvg and from 0.07 to 0.23 for DFIAdj. Genetic correlations across age classes were large and positive for both DBWAvg and DFIAdj, although age classes at the beginning had a small to moderate genetic correlation with age classes towards the end of the trajectory for both traits. The WGEBV variance explained by significant regions (P<0.001) for each polynomial coefficient ranged from 0.2 to 0.9% for DBWAvg and from 0.3 to 1.01% for DFIAdj. The WGEBV variance explained by significant regions for the trajectory was 1.54 and 1.95% for DBWAvg and DFIAdj, respectively. For both traits, candidate genes with functions related to metabolite and energy homeostasis, glucose and insulin signaling and behavior were identified.
We have identified regions of the genome that have an impact on the intercept, linear and quadratic terms for DBWAvg and DFIAdj. These results provide preliminary evidence that individual growth and feed intake trajectories are impacted by different regions of the genome at different times.
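The intercept, linear and quadratic coefficients discussed above multiply Legendre polynomial covariates of age. The sketch below shows how such covariates can be built for a second-order random regression. It uses unnormalised Legendre polynomials (animal breeding software often uses a normalised variant), and the age range and coefficient values are hypothetical.

```python
def legendre_covariates(age, age_min, age_max):
    """Second-order Legendre covariates for one record.

    Age is rescaled to [-1, 1]; the covariates are P0 = 1, P1 = x and
    P2 = (3x^2 - 1) / 2.
    """
    x = 2.0 * (age - age_min) / (age_max - age_min) - 1.0
    return [1.0, x, 0.5 * (3.0 * x * x - 1.0)]


def trajectory_value(coefficients, age, age_min, age_max):
    """Value of an animal's trajectory at a given age, given its three
    polynomial coefficients (intercept, linear, quadratic)."""
    covariates = legendre_covariates(age, age_min, age_max)
    return sum(c * phi for c, phi in zip(coefficients, covariates))


# Hypothetical test period of 70 to 170 days and made-up coefficients.
for age in (70, 120, 170):
    print(age, legendre_covariates(age, 70, 170),
          trajectory_value([2.0, 0.5, -0.1], age, 70, 170))
```

Because each coefficient weights a different shape over time, a genomic region affecting only the linear or quadratic coefficient changes the slope or curvature of the trajectory without necessarily changing its overall level.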
Polygenic risk and the development and course of asthma: Evidence from a 4-decade longitudinal study
Belsky, DW; Sears, MR; Hancox, RJ; Harrington, HL; Houts, R; Moffitt, TE; Sugden, K; Williams, B; Poulton, R; Caspi, A
2013-01-01
BACKGROUND Genome-wide association studies (GWAS) have discovered loci that predispose to asthma. To integrate these new discoveries with emerging models of asthma pathobiology, research is needed to test how genetic discoveries relate to developmental and biological characteristics of asthma. METHODS We derived a multi-locus profile of genetic risk from published GWAS of asthma case status. We then tested associations between this “genetic risk score” and developmental and biological characteristics of asthma in a population-based long-running birth cohort, the Dunedin Longitudinal Study (n=1,037). We evaluated asthma onset, persistence, atopy, airway hyperresponsiveness, incompletely reversible airflow obstruction, and asthma-related school and work absenteeism and hospitalization during 9 prospective assessments spanning ages 9–38 years, when 95% of surviving cohort members were seen. INTERPRETATION Cohort members at higher genetic risk experienced asthma onset earlier in life (HR=1.12 [1.01–1.26]). Childhood-onset asthma cases at higher genetic risk were more likely to become life-course-persistent asthma cases (RR=1.36 [1.14–1.63]). Asthma cases at higher genetic risk more often manifested atopy (RR=1.07 [1.01–1.14]), airway hyperresponsiveness (RR=1.16 [1.03–1.32]), and incompletely reversible airflow obstruction (RR=1.28 [1.04–1.57]). They were also more likely to miss school or work due to asthma (IRR=1.38 [1.02–1.86]) and to be hospitalized with breathing problems (HR=1.38 [1.07–1.79]). Genotypic information about asthma risk was independent of and additive to information derived from cohort members’ family histories of asthma. 
CONCLUSIONS Findings from this population study confirm that GWAS-discoveries for asthma associate with a childhood-onset phenotype and advance asthma genetics beyond the original GWAS-discoveries in three ways: (1) We show that genetic risks predict which childhood-onset asthma cases remit and which become life-course-persistent cases, although these predictions are not sufficiently sensitive or specific to support immediate clinical translation; (2) We elucidate a biological profile of the asthma that arises from these genetic risks: asthma characterized by atopy and airway hyperresponsiveness and leading to incompletely reversible airflow obstruction; and (3) We describe the real-life impact of GWAS-discoveries by quantifying genetic associations with missed school and work and hospitalization. PMID:24429243
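A multi-locus genetic risk score of the kind described above is typically a sum of risk-allele counts across the selected SNPs, optionally weighted by effect size, and then standardised within the cohort. The sketch below illustrates that general idea only; whether the study weighted its score is not stated in the abstract, and all names and genotypes here are hypothetical.

```python
import math


def genetic_risk_score(risk_allele_counts, weights=None):
    """Sum of risk-allele counts (0, 1 or 2 per SNP), optionally weighted."""
    if weights is None:
        weights = [1.0] * len(risk_allele_counts)
    if len(weights) != len(risk_allele_counts):
        raise ValueError("one weight per SNP is required")
    return sum(w * c for w, c in zip(weights, risk_allele_counts))


def standardize(scores):
    """Convert raw scores to z-scores using the sample standard deviation."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / (n - 1))
    return [(s - mean) / sd for s in scores]


# Three hypothetical cohort members genotyped at four SNPs.
cohort = [[0, 1, 2, 1], [1, 1, 0, 0], [2, 2, 1, 1]]
raw = [genetic_risk_score(genotype) for genotype in cohort]
print(raw, standardize(raw))
```

With a standardised score, an effect estimate such as a hazard ratio can be read as the change in risk per standard deviation of genetic risk.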
2013-01-01
Background The apparent effect of a single nucleotide polymorphism (SNP) on phenotype depends on the linkage disequilibrium (LD) between the SNP and a quantitative trait locus (QTL). However, the phase of LD between a SNP and a QTL may differ between Bos indicus and Bos taurus because they diverged at least one hundred thousand years ago. Here, we test the hypothesis that the apparent effect of a SNP on a quantitative trait depends on whether the SNP allele is inherited from a Bos taurus or Bos indicus ancestor. Methods Phenotype data on one or more traits and SNP genotype data for 10 181 cattle from Bos taurus, Bos indicus and composite breeds were used. All animals had genotypes for 729 068 SNPs (real or imputed). Chromosome segments were classified as originating from B. indicus or B. taurus on the basis of the haplotype of SNP alleles they contained. Consequently, SNP alleles were classified according to their sub-species origin. Three models were used for the association study: (1) conventional GWAS (genome-wide association study), fitting a single SNP effect regardless of subspecies origin, (2) interaction GWAS, fitting an interaction between SNP and subspecies-origin, and (3) best variable GWAS, fitting the most significant combination of SNP and sub-species origin. Results Fitting an interaction between SNP and subspecies origin resulted in more significant SNPs (i.e. more power) than a conventional GWAS. Thus, the effect of a SNP depends on the subspecies that the allele originates from. Also, most QTL segregated in only one subspecies, suggesting that many mutations that affect the traits studied occurred after divergence of the subspecies or the mutation became fixed or was lost in one of the subspecies. Conclusions The results imply that GWAS and genomic selection could gain power by distinguishing SNP alleles based on their subspecies origin, and that only few QTL segregate in both B. indicus and B. taurus cattle. 
Thus, the QTL that segregate in current populations likely resulted from mutations that occurred in one of the subspecies and can have both positive and negative effects on the traits. There was no evidence that selection has increased the frequency of alleles that increase body weight. PMID:24168700
Xu, Jinfeng; Yuan, Ao; Zheng, Gang
2012-01-01
Summary In the analysis of case-control genetic association, the trend test and Pearson’s test are the two most commonly used tests. In genome-wide association studies (GWAS), Bayes factor is a useful tool to support significant p-values, and a better measure than p-value when results are compared across studies with different sample sizes. When reporting the p-value of the trend test, we propose a Bayes factor directly based on the trend test. To improve the power to detect association under recessive or dominant genetic models, we propose a Bayes factor based on the trend test and incorporating Hardy-Weinberg disequilibrium in cases. When the true model is unknown, or both the trend test and Pearson’s test or other robust tests are applied in genome-wide scans, we propose a joint Bayes factor, combining the previous two Bayes factors. All three Bayes factors studied in this paper have closed forms and are easy to compute without integrations, so they can be reported along with p-values, especially in GWAS. We discuss how to use each of them and how to specify priors. Simulation studies and applications to three GWAS are provided to illustrate their usefulness to detect non-additive gene susceptibility in practice. PMID:22607017
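The trend test on which the proposed Bayes factors are built is the Cochran-Armitage test for a 2 × 3 genotype table. The sketch below computes only that test statistic and its p-value; it does not reproduce the paper's Bayes factors, whose closed forms are not given in the abstract. The function name, the additive scores (0, 1, 2) and the counts are illustrative assumptions.

```python
import math


def trend_test(cases, controls, scores=(0.0, 1.0, 2.0)):
    """Cochran-Armitage trend test.

    cases / controls: genotype counts for 0, 1 and 2 copies of the allele.
    Returns (chi-square with 1 df, p-value).
    """
    r_total, s_total = sum(cases), sum(controls)
    n_total = r_total + s_total
    totals = [r + s for r, s in zip(cases, controls)]
    sum_xr = sum(x * r for x, r in zip(scores, cases))
    sum_xn = sum(x * n for x, n in zip(scores, totals))
    sum_x2n = sum(x * x * n for x, n in zip(scores, totals))
    u = n_total * sum_xr - r_total * sum_xn
    spread = n_total * sum_x2n - sum_xn ** 2
    if r_total == 0 or s_total == 0 or spread == 0:
        raise ValueError("table is degenerate")
    chi2 = n_total * u * u / (r_total * s_total * spread)
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical genotype counts, for illustration only.
chi2, p = trend_test(cases=[20, 50, 30], controls=[40, 45, 15])
print(f"chi2 = {chi2:.3f}, p = {p:.5f}")
```

Changing `scores` to (0, 0, 1) or (0, 1, 1) gives the recessive and dominant versions of the same test, which is the setting where the abstract reports that the additive trend test loses power.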
Genome-wide association study identifies multiple loci associated with bladder cancer risk
Figueroa, Jonine D.; Ye, Yuanqing; Siddiq, Afshan; Garcia-Closas, Montserrat; Chatterjee, Nilanjan; Prokunina-Olsson, Ludmila; Cortessis, Victoria K.; Kooperberg, Charles; Cussenot, Olivier; Benhamou, Simone; Prescott, Jennifer; Porru, Stefano; Dinney, Colin P.; Malats, Núria; Baris, Dalsu; Purdue, Mark; Jacobs, Eric J.; Albanes, Demetrius; Wang, Zhaoming; Deng, Xiang; Chung, Charles C.; Tang, Wei; Bas Bueno-de-Mesquita, H.; Trichopoulos, Dimitrios; Ljungberg, Börje; Clavel-Chapelon, Françoise; Weiderpass, Elisabete; Krogh, Vittorio; Dorronsoro, Miren; Travis, Ruth; Tjønneland, Anne; Brenan, Paul; Chang-Claude, Jenny; Riboli, Elio; Conti, David; Gago-Dominguez, Manuela; Stern, Mariana C.; Pike, Malcolm C.; Van Den Berg, David; Yuan, Jian-Min; Hohensee, Chancellor; Rodabough, Rebecca; Cancel-Tassin, Geraldine; Roupret, Morgan; Comperat, Eva; Chen, Constance; De Vivo, Immaculata; Giovannucci, Edward; Hunter, David J.; Kraft, Peter; Lindstrom, Sara; Carta, Angela; Pavanello, Sofia; Arici, Cecilia; Mastrangelo, Giuseppe; Kamat, Ashish M.; Lerner, Seth P.; Barton Grossman, H.; Lin, Jie; Gu, Jian; Pu, Xia; Hutchinson, Amy; Burdette, Laurie; Wheeler, William; Kogevinas, Manolis; Tardón, Adonina; Serra, Consol; Carrato, Alfredo; García-Closas, Reina; Lloreta, Josep; Schwenn, Molly; Karagas, Margaret R.; Johnson, Alison; Schned, Alan; Armenti, Karla R.; Hosain, G.M.; Andriole, Gerald; Grubb, Robert; Black, Amanda; Ryan Diver, W.; Gapstur, Susan M.; Weinstein, Stephanie J.; Virtamo, Jarmo; Haiman, Chris A.; Landi, Maria T.; Caporaso, Neil; Fraumeni, Joseph F.; Vineis, Paolo; Wu, Xifeng; Silverman, Debra T.; Chanock, Stephen; Rothman, Nathaniel
2014-01-01
Candidate gene and genome-wide association studies (GWAS) have identified 11 independent susceptibility loci associated with bladder cancer risk. To discover additional risk variants, we conducted a new GWAS of 2422 bladder cancer cases and 5751 controls, followed by a meta-analysis with two independently published bladder cancer GWAS, resulting in a combined analysis of 6911 cases and 11 814 controls of European descent. TaqMan genotyping of 13 promising single nucleotide polymorphisms with P < 1 × 10−5 was pursued in a follow-up set of 801 cases and 1307 controls. Two new loci achieved genome-wide statistical significance: rs10936599 on 3q26.2 (P = 4.53 × 10−9) and rs907611 on 11p15.5 (P = 4.11 × 10−8). Two notable loci were also identified that approached genome-wide statistical significance: rs6104690 on 20p12.2 (P = 7.13 × 10−7) and rs4510656 on 6p22.3 (P = 6.98 × 10−7); these require further studies for confirmation. In conclusion, our study has identified new susceptibility alleles for bladder cancer risk that require fine-mapping and laboratory investigation, which could further understanding into the biological underpinnings of bladder carcinogenesis. PMID:24163127
Understanding the pharmacogenetics of selective serotonin reuptake inhibitors.
Fabbri, Chiara; Minarini, Alessandro; Niitsu, Tomihisa; Serretti, Alessandro
2014-08-01
The genetic background of antidepressant response represents a unique opportunity to identify biological markers of treatment outcome. Encouraging results alternating with inconsistent findings made antidepressant pharmacogenetics a stimulating but often discouraging field that requires careful discussion about cumulative evidence and methodological issues. The present review discusses both known and less replicated genes that have been implicated in selective serotonin reuptake inhibitors (SSRIs) efficacy and side effects. Candidate genes studies and genome-wide association studies (GWAS) were collected through MEDLINE database search (articles published till January 2014). Further, GWAS signals localized in promising genetic regions according to candidate gene studies are reported in order to assess the general comparability of results obtained through these two types of pharmacogenetic studies. Finally, a pathway enrichment approach is applied to the top genes (those harboring SNPs with p < 0.0001) outlined by previous GWAS in order to identify possible molecular mechanisms involved in SSRI effect. In order to improve the understanding of SSRI pharmacogenetics, the present review discusses the proposal of moving from the analysis of individual polymorphisms to genes and molecular pathways, and from the separation across different methodological approaches to their combination. Efforts in this direction are justified by the recent evidence of a favorable cost-utility of gene-guided antidepressant treatment.
Allele-Skewed DNA Modification in the Brain: Relevance to a Schizophrenia GWAS
Gagliano, Sarah A.; Ptak, Carolyn; Mak, Denise Y.F.; Shamsi, Mehrdad; Oh, Gabriel; Knight, Joanne; Boutros, Paul C.; Petronis, Arturas
2016-01-01
Numerous recent studies have suggested that phenotypic effects of DNA sequence variants can be mediated or modulated by their epigenetic marks, such as allele-skewed DNA modification (ASM). Using Affymetrix SNP microarrays, we performed a comprehensive search of ASM effects in human post-mortem brain and sperm samples (total n = 256) from individuals with major psychosis and control individuals. Depending on the phenotypic category of the brain samples, 1.4%–7.5% of interrogated SNPs exhibited ASM effects. Next, we investigated ASM in the context of genetic studies of schizophrenia and detected that brain ASM SNPs were significantly overrepresented among sub-threshold SNPs from a schizophrenia genome-wide association study (GWAS). Brain ASM SNPs showed a much stronger enrichment in a schizophrenia GWAS than in 17 large GWASs of non-psychiatric diseases and traits, arguing that ASM effects are at least partially tissue specific. Studies of germline and control brain ASM SNPs supported a causal association between ASM and schizophrenia. Finally, significantly higher proportions of ASM SNPs than of non-ASM SNPs were detected at loci exhibiting epigenetic signatures of enhancers and promoters, and they were overrepresented within transcription factor binding regions and DNase I hypersensitive sites. All of these findings collectively indicate that ASM SNPs should be prioritized in follow-up GWASs. PMID:27087318
Li, C; Sun, D; Zhang, S; Liu, L; Alim, M A; Zhang, Q
2016-08-01
The stearoyl-CoA desaturase (delta-9-desaturase) gene encodes a key enzyme in the cellular biosynthesis of monounsaturated fatty acids. In our initial genome-wide association study (GWAS) of Chinese Holstein cows, 19 SNPs fell in a 1.8-Mb region (20.3-22.1 Mb) on chromosome 26 underlying the SCD gene and were highly significantly associated with C14:1 or C14 index. The aim of this study was to verify whether the SCD gene has significant genetic effects on milk fatty acid composition in dairy cattle. By resequencing the entire coding region of the bovine SCD gene, a total of six variations were identified, including three coding variations (g.10153G>A, g.10213T>C and g.10329C>T) and three intronic variations (g.6926A>G, g.8646G>A and g.16158G>C). The SNP in exon 3, g.10329C>T, was predicted to result in an amino acid replacement from alanine (GCG) to valine (GTG) in the SCD protein. An association study for 16 milk fatty acids using 346 Chinese Holstein cows with accurate phenotypes and genotypes was performed using the mixed animal model with the PROC MIXED procedure in SAS 9.2. All six detected SNPs were revealed to be associated with six medium- and long-chain unsaturated fatty acids (P = 0.0457 to P < 0.0001), specifically for C14:1 and C14 index (P = 0.0005 to P < 0.0001). Subsequently, strong linkage disequilibrium (D' = 0.88-1.00) was observed among all six SNPs in SCD and the five SNPs (rs41623887, rs109923480, rs42090224, rs42092174 and rs42091426) within the 1.8-Mb region identified in our previous GWAS, indicating that the significant association of the SCD gene with milk fatty acid content traits reduced the observed significant 1.8-Mb chromosome region in GWAS. Haplotype-based analysis revealed significant associations of the haplotypes encompassing the six SCD SNPs and one SNP (rs109923480) in a GWAS with C14:1, C14 index, C16:1 and C16 index (P = 0.0011 to P < 0.0001).
In summary, our findings provide replicate evidence for our previous GWAS and demonstrate that variants in the SCD gene are significantly associated with milk fatty acid composition in dairy cattle, which provides clear evidence for an increased understanding of milk fatty acid synthesis and enhances opportunities to improve milk-fat composition in dairy cattle. © 2016 Stichting International Foundation for Animal Genetics.
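The linkage disequilibrium statistic D' quoted in this abstract can be computed from haplotype and allele frequencies. The sketch below uses the standard definition of Lewontin's D' and reports its absolute value; the function name `d_prime` and the example frequencies are illustrative assumptions, not values from the study.

```python
def d_prime(p_ab, p_a, p_b):
    """Absolute Lewontin's D' for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    # If either locus is monomorphic, D' is undefined; return 0.0 here.
    return 0.0 if d_max == 0 else abs(d) / d_max


# Illustrative frequencies only (not from the SCD study)
print(d_prime(p_ab=0.5, p_a=0.5, p_b=0.5))   # alleles always co-occur
print(d_prime(p_ab=0.25, p_a=0.5, p_b=0.5))  # independent loci
```

In practice haplotype frequencies are not observed directly from unphased genotypes and would first have to be estimated, for example with an EM algorithm.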
Mapping of Gene Expression Reveals CYP27A1 as a Susceptibility Gene for Sporadic ALS
van Rheenen, Wouter; Franke, Lude; Jansen, Ritsert C.; van Es, Michael A.; van Vught, Paul W. J.; Blauw, Hylke M.; Groen, Ewout J. N.; Horvath, Steve; Estrada, Karol; Rivadeneira, Fernando; Hofman, Albert; Uitterlinden, Andre G.; Robberecht, Wim; Andersen, Peter M.; Melki, Judith; Meininger, Vincent; Hardiman, Orla; Landers, John E.; Brown, Robert H.; Shatunov, Aleksey; Shaw, Christopher E.; Leigh, P. Nigel; Al-Chalabi, Ammar; Ophoff, Roel A.
2012-01-01
Amyotrophic lateral sclerosis (ALS) is a progressive, neurodegenerative disease characterized by loss of upper and lower motor neurons. ALS is considered to be a complex trait and genome-wide association studies (GWAS) have implicated a few susceptibility loci. However, many more causal loci remain to be discovered. Since it has been shown that genetic variants associated with complex traits are more likely to be eQTLs than frequency-matched variants from GWAS platforms, we conducted a two-stage genome-wide screening for eQTLs associated with ALS. In addition, we applied an eQTL analysis to fine-map association loci. Expression profiles using peripheral blood of 323 sporadic ALS patients and 413 controls were mapped to genome-wide genotyping data. Subsequently, data from a two-stage GWAS (3,568 patients and 10,163 controls) were used to prioritize eQTLs identified in the first stage (162 ALS, 207 controls). These prioritized eQTLs were carried forward to the second sample with both gene-expression and genotyping data (161 ALS, 206 controls). Replicated eQTL SNPs were then tested for association in the second-stage GWAS data to find disease-associated SNPs that survived correction for multiple testing. We thus identified twelve cis eQTLs with nominally significant associations in the second-stage GWAS data. Eight SNP-transcript pairs of highest significance (lowest p = 1.27×10−51) withstood multiple-testing correction in the second stage and modulated CYP27A1 gene expression. Additionally, we show that C9orf72 appears to be the only gene in the 9p21.2 locus that is regulated in cis, showing the potential of this approach in identifying causative genes in association loci in ALS. This study has identified candidate genes for sporadic ALS, most notably CYP27A1. Mutations in CYP27A1 are causal to cerebrotendinous xanthomatosis, which can present as a clinical mimic of ALS with progressive upper motor neuron loss, making it a plausible susceptibility gene for ALS. PMID:22509407
Zhang, Chunyan; Wang, Zhiquan; Bruce, Heather; Kemp, Robert Alan; Charagu, Patrick; Miar, Younes; Yang, Tianfu; Plastow, Graham
2015-04-07
Improving meat quality is a high priority for the pork industry to satisfy consumers' preferences. GWAS have become a state-of-the-art approach to genetically improve economically important traits. However, GWAS focused on pork quality are still relatively rare. Six genomic regions were shown to affect loin pH and Minolta colour a* and b* on both loin and ham through GWAS in 1943 crossbred commercial pigs. Five of them, located on Sus scrofa chromosome (SSC) 1, SSC5, SSC9, SSC16 and SSCX, were associated with meat colour. However, the most promising region was detected on SSC15 spanning 133-134 Mb which explained 3.51% - 17.06% of genetic variance for five measurements of pH and colour. Three SNPs (ASGA0070625, MARC0083357 and MARC0039273) in very strong LD were considered most likely to account for the effects in this region. ASGA0070625 is located in intron 2 of ZNF142, and the other two markers are close to PRKAG3, STK36, TTLL7 and CDK5R2. After fitting MARC0083357 (the closest SNP to PRKAG3) as a fixed factor, six SNPs still remained significant for at least one trait. Four of them are intragenic with ARPC2, TMBIM1, NRAMP1 and VIL1, while the remaining two are close to RUFY4 and CDK5R2. The gene network constructed demonstrated strong connections of these genes with two major hubs of PRKAG3 and UBC in the super-pathways of cell-to-cell signaling and interaction, cellular function and maintenance. All these pathways play important roles in maintaining the integral architecture and functionality of muscle cells facing the dramatic changes that occur after exsanguination, which is in agreement with the GWAS results found in this study. There may be other markers and/or genes in this region besides PRKAG3 that have an important effect on pH and colour. The potential markers and their interactions with PRKAG3 require further investigation.
Delgado, Dayana A; Zhang, Chenan; Chen, Lin S; Gao, Jianjun; Roy, Shantanu; Shinkle, Justin; Sabarinathan, Mekala; Argos, Maria; Tong, Lin; Ahmed, Alauddin; Islam, Tariqul; Rakibuz-Zaman, Muhammad; Sarwar, Golam; Shahriar, Hasan; Rahman, Mahfuzar; Yunus, Mohammad; Jasmine, Farzana; Kibriya, Muhammad G; Ahsan, Habibul; Pierce, Brandon L
2018-01-01
Leucocyte telomere length (TL) is a potential biomarker of ageing and risk for age-related disease. Leucocyte TL is heritable and shows substantial differences by race/ethnicity. Recent genome-wide association studies (GWAS) report ~10 loci harbouring SNPs associated with leucocyte TL, but these studies focus primarily on populations of European ancestry. This study aims to enhance our understanding of genetic determinants of TL across populations. We performed a GWAS of TL using data on 5075 Bangladeshi adults. We measured TL using one of two technologies (qPCR or a Luminex-based method) and used standardised variables as TL phenotypes. Our results replicate previously reported associations in the TERC and TERT regions (P=2.2×10−8 and P=6.4×10−6, respectively). We observed a novel association signal in the RTEL1 gene (intronic SNP rs2297439; P=2.82×10−7) that is independent of previously reported TL-associated SNPs in this region. The minor allele for rs2297439 is common in South Asian populations (≥0.25) but at lower frequencies in other populations (eg, 0.07 in Northern Europeans). Among the eight other previously reported association signals, all were directionally consistent with our study, but only rs8105767 (ZNF208) was nominally significant (P=0.003). SNP-based heritability estimates were as high as 44% when analysing close relatives but much lower when analysing distant relatives only. In this first GWAS of TL in a South Asian population, we replicate some, but not all, of the loci reported in prior GWAS of individuals of European ancestry, and we identify a novel second association signal at the RTEL1 locus. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
LD Score Regression Distinguishes Confounding from Polygenicity in Genome-Wide Association Studies
Bulik-Sullivan, Brendan K.; Loh, Po-Ru; Finucane, Hilary; Ripke, Stephan; Yang, Jian; Patterson, Nick; Daly, Mark J.; Price, Alkes L.; Neale, Benjamin M.
2015-01-01
Both polygenicity (i.e., many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield an inflated distribution of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from true polygenic signal and bias. We have developed an approach, LD Score regression, that quantifies the contribution of each by examining the relationship between test statistics and linkage disequilibrium (LD). The LD Score regression intercept can be used to estimate a more powerful and accurate correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of test statistic inflation in many GWAS of large sample size. PMID:25642630
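The idea behind LD Score regression can be sketched with an ordinary least-squares fit. Under the model E[chi2_j] = (N * h2 / M) * l_j + N * a + 1, where l_j is the LD Score of SNP j, N the sample size, M the number of SNPs, h2 the SNP heritability and a the contribution of confounding, the slope reflects polygenic signal and the intercept reflects confounding. The code below is a simplified illustration under stated assumptions: it uses unweighted regression on noise-free toy data, whereas the published method uses weighted regression and resampling-based standard errors; all function names are hypothetical.

```python
def ols_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ld_score_regression(chi2, ld_scores, n_samples, n_snps):
    """Return (intercept, h2) from regressing chi-square statistics on LD Scores.

    An intercept above 1 suggests confounding; the slope rescaled by
    n_snps / n_samples estimates SNP heritability.
    """
    slope, intercept = ols_line(ld_scores, chi2)
    return intercept, slope * n_snps / n_samples


# Noise-free toy data (invented): h2 = 0.4, intercept = 1.05
N, M = 50_000, 1_000_000
toy_ld = [10.0, 50.0, 100.0, 200.0, 400.0]
toy_chi2 = [1.05 + (N * 0.4 / M) * l for l in toy_ld]

est_intercept, est_h2 = ld_score_regression(toy_chi2, toy_ld, N, M)
print(est_intercept, est_h2)
```

With real summary statistics the chi-square values are noisy and heteroskedastic, so the unweighted fit shown here would be inefficient; it is meant only to make the slope/intercept interpretation concrete.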
Genome-wide association studies on HIV susceptibility, pathogenesis and pharmacogenomics
2012-01-01
Susceptibility to HIV-1 and the clinical course after infection show a substantial heterogeneity between individuals. Part of this variability can be attributed to host genetic variation. Initial candidate gene studies have revealed interesting host factors that influence HIV infection, replication and pathogenesis. Recently, genome-wide association studies (GWAS) were utilized for unbiased searches at a genome-wide level to discover novel genetic factors and pathways involved in HIV-1 infection. This review gives an overview of findings from the GWAS performed on HIV infection, within different cohorts, with variable patient and phenotype selection. Furthermore, novel techniques and strategies in research that might contribute to the complete understanding of virus-host interactions and its role on the pathogenesis of HIV infection are discussed. PMID:22920050
Cabrera, Claudia P; Ng, Fu Liang; Warren, Helen R; Barnes, Michael R; Munroe, Patricia B; Caulfield, Mark J
2015-01-01
Hypertension is a major risk factor for global mortality. Recent genome-wide association studies (GWAS) have led to successful identification of many genetic loci influencing blood pressure, although these studies account for less than 5% of heritability. While genetic discovery efforts continue, it is timely to pause and reflect on what information has been gained to date from reported loci. Knowledge from GWAS findings informs our understanding of the pathways and pleiotropy underpinning hypertension and aids in the identification of potential druggable targets. By reviewing blood pressure loci we aim to determine how much potential the current observations have for future clinical utility. The authors have declared no conflicts of interest for this article. © 2015 Wiley Periodicals, Inc.
Austin, Melissa A.; Hair, Marilyn S.; Fullerton, Stephanie M.
2012-01-01
Scientific research has shifted from studies conducted by single investigators to the creation of large consortia. Genetic epidemiologists, for example, now collaborate extensively for genome-wide association studies (GWAS). The effect has been a stream of confirmed disease-gene associations. However, effects on human subjects oversight, data-sharing, publication and authorship practices, research organization and productivity, and intellectual property remain to be examined. The aim of this analysis was to identify all research consortia that had published the results of a GWAS analysis since 2005, characterize them, determine which have publicly accessible guidelines for research practices, and summarize the policies in these guidelines. A review of the National Human Genome Research Institute’s Catalog of Published Genome-Wide Association Studies identified 55 GWAS consortia as of April 1, 2011. These consortia were comprised of individual investigators, research centers, studies, or other consortia and studied 48 different diseases or traits. Only 14 (25%) were found to have publicly accessible research guidelines on consortia websites. The available guidelines provide information on organization, governance, and research protocols; half address institutional review board approval. Details of publication, authorship, data-sharing, and intellectual property vary considerably. Wider access to consortia guidelines is needed to establish appropriate research standards with broad applicability to emerging forms of large-scale collaboration. PMID:22491085
Multi-ethnic genome-wide association study identifies novel locus for type 2 diabetes susceptibility
Cook, James P; Morris, Andrew P
2016-01-01
Genome-wide association studies (GWAS) have traditionally been undertaken in homogeneous populations from the same ancestry group. However, with the increasing availability of GWAS in large-scale multi-ethnic cohorts, we have evaluated a framework for detecting association of genetic variants with complex traits, allowing for population structure, and developed a powerful test of heterogeneity in allelic effects between ancestry groups. We have applied the methodology to identify and characterise loci associated with susceptibility to type 2 diabetes (T2D) using GWAS data from the Resource for Genetic Epidemiology on Adult Health and Aging, a large multi-ethnic population-based cohort, created for investigating the genetic and environmental basis of age-related diseases. We identified a novel locus for T2D susceptibility at genome-wide significance (P<5 × 10−8) that maps to TOMM40-APOE, a region previously implicated in lipid metabolism and Alzheimer's disease. We have also confirmed previous reports that single-nucleotide polymorphisms at the TCF7L2 locus demonstrate the greatest extent of heterogeneity in allelic effects between ethnic groups, with the lowest risk observed in populations of East Asian ancestry. PMID:27189021
Pleiotropic analysis of cancer risk loci on esophageal adenocarcinoma risk
Lee, Eunjung; Stram, Daniel O.; Ek, Weronica E.; Onstad, Lynn E; MacGregor, Stuart; Gharahkhani, Puya; Ye, Weimin; Lagergren, Jesper; Shaheen, Nicholas J.; Murray, Liam J.; Hardie, Laura J; Gammon, Marilie D.; Chow, Wong-Ho; Risch, Harvey A.; Corley, Douglas A.; Levine, David M; Whiteman, David C.; Bernstein, Leslie; Bird, Nigel C.; Vaughan, Thomas L.; Wu, Anna H.
2015-01-01
Background Several cancer-associated loci identified from genome-wide association studies (GWAS) have been associated with risks of multiple cancer sites, suggesting pleiotropic effects. We investigated whether GWAS-identified risk variants for other common cancers are associated with risk of esophageal adenocarcinoma (EA) or its precursor, Barrett's esophagus (BE). Methods We examined the associations between risks of EA and BE and 387 single nucleotide polymorphisms (SNPs) that have been associated with risks of other cancers, by using genotype imputation data on 2,163 control participants and 3,885 (1,501 EA and 2,384 BE) case patients from the Barrett's and Esophageal Adenocarcinoma Genetic Susceptibility Study, and investigated effect modification by smoking history, body mass index (BMI), and reflux/heartburn. Results After correcting for multiple testing, none of the tested 387 SNPs were statistically significantly associated with risk of EA or BE. No evidence of effect modification by smoking, BMI, or reflux/heartburn was observed. Conclusions Genetic risk variants for common cancers identified from GWAS appear not to be associated with risks of EA or BE. Impact To our knowledge, this is the first investigation of pleiotropic genetic associations with risks of EA and BE. PMID:26364162
Kulbrock, Maike; Lehner, Stefanie; Metzger, Julia; Ohnesorge, Bernhard; Distl, Ottmar
2013-01-01
Equine recurrent uveitis (ERU) is a common eye disease affecting 3–15% of the horse population. A genome-wide association study (GWAS) using the Illumina equine SNP50 bead chip was performed to identify loci conferring risk to ERU. The sample included a total of 144 German warmblood horses. A GWAS showed a significant single nucleotide polymorphism (SNP) on horse chromosome (ECA) 20 at 49.3 Mb, with IL-17A and IL-17F being the closest genes. This locus explained 23% of the phenotypic variance for ERU. A GWAS taking into account the severity of ERU revealed a SNP on ECA18 near the crystallin gene cluster CRYGA-CRYGF. For both genomic regions on ECA18 and 20, significantly associated haplotypes containing the genome-wide significant SNPs could be demonstrated. In conclusion, our results are indicative of a genetic component regulating the possible critical role of IL-17A and IL-17F in the pathogenesis of ERU. The associated SNP on ECA18 may be indicative of cataract formation in the course of ERU. PMID:23977091
Age-related macular degeneration: genome-wide association studies to translation.
Black, James R M; Clark, Simon J
2016-04-01
In recent years, genome-wide association studies (GWAS), which are able to analyze the contribution to disease of genetic variations that are common within a population, have attracted considerable investment. Despite identifying genetic variants for many conditions, they have been criticized for yielding data with minimal clinical utility. However, in this regard, age-related macular degeneration (AMD), the most common form of blindness in the Western world, is a striking exception. Through GWAS, common genetic variants at a number of loci have been discovered. Two loci in particular, including genes of the complement cascade on chromosome 1 and the ARMS2/HTRA1 genes on chromosome 10, have been shown to convey significantly increased susceptibility to developing AMD. Today, although it is possible to screen individuals for a genetic predisposition to the disease, effective interventional strategies for those at risk of developing AMD are scarce. Ongoing research in this area is nonetheless promising. After providing brief overviews of AMD and common disease genetics, we outline the main recent advances in the understanding of AMD, particularly those made through GWAS. Finally, the true merit of these findings and their current and potential translational value is examined. Genet Med 18(4), 283-289.
A genome-wide association study of breast cancer in women of African ancestry
Chen, Fang; Chen, Gary K.; Stram, Daniel O.; Millikan, Robert C.; Ambrosone, Christine B.; John, Esther M.; Bernstein, Leslie; Zheng, Wei; Palmer, Julie R.; Hu, Jennifer J.; Rebbeck, Tim R.; Ziegler, Regina G.; Nyante, Sarah; Bandera, Elisa V.; Ingles, Sue A.; Press, Michael F.; Ruiz-Narvaez, Edward A.; Deming, Sandra L.; Rodriguez-Gil, Jorge L.; DeMichele, Angela; Chanock, Stephen J.; Blot, William; Signorello, Lisa; Cai, Qiuyin; Li, Guoliang; Long, Jirong; Huo, Dezheng; Zheng, Yonglan; Cox, Nancy J.; Olopade, Olufunmilayo I.; Ogundiran, Temidayo O.; Adebamowo, Clement; Nathanson, Katherine L.; Domchek, Susan M.; Simon, Michael S.; Hennis, Anselm; Nemesure, Barbara; Wu, Suh-Yuh; Leske, M. Cristina; Ambs, Stefan; Hutter, Carolyn M.; Young, Alicia; Kooperberg, Charles; Peters, Ulrike; Rhie, Suhn K.; Wan, Peggy; Sheng, Xin; Pooler, Loreall C.; Van Den Berg, David J.; Le Marchand, Loic; Kolonel, Laurence N.; Henderson, Brian E.; Haiman, Christopher A.
2013-01-01
Genome-wide association studies (GWAS) in diverse populations are needed to reveal variants that are more common and/or limited to defined populations. We conducted a GWAS of breast cancer in women of African ancestry, with genotyping of > 1,000,000 SNPs in 3,153 African American cases and 2,831 controls, and replication testing of the top 66 associations in an additional 3,607 breast cancer cases and 11,330 controls of African ancestry. Two of the 66 SNPs replicated (p < 0.05) in stage 2, which reached statistical significance levels of 10−6 and 10−5 in the stage 1 and 2 combined analysis (rs4322600 at chromosome 14q31: OR = 1.18, p = 4.3×10−6; rs10510333 at chromosome 3p26: OR = 1.15, p = 1.5×10−5). These suggestive risk loci have not been identified in previous GWAS in other populations and will need to be examined in additional samples. Identification of novel risk variants for breast cancer in women of African ancestry will demand testing of a substantially larger set of markers from stage 1 in a larger replication sample. PMID:22923054
Cook, James P; Mahajan, Anubha; Morris, Andrew P
2017-02-01
Linear mixed models are increasingly used for the analysis of genome-wide association studies (GWAS) of binary phenotypes because they can efficiently and robustly account for population stratification and relatedness through inclusion of random effects for a genetic relationship matrix. However, the utility of linear (mixed) models in the context of meta-analysis of GWAS of binary phenotypes has not been previously explored. In this investigation, we present simulations to compare the performance of linear and logistic regression models under alternative weighting schemes in a fixed-effects meta-analysis framework, considering designs that incorporate variable case-control imbalance, confounding factors and population stratification. Our results demonstrate that linear models can be used for meta-analysis of GWAS of binary phenotypes, without loss of power, even in the presence of extreme case-control imbalance, provided that one of the following schemes is used: (i) effective sample size weighting of Z-scores or (ii) inverse-variance weighting of allelic effect sizes after conversion onto the log-odds scale. Our conclusions thus provide essential recommendations for the development of robust protocols for meta-analysis of binary phenotypes with linear models.
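The two weighting schemes named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the effective sample size formula 4 / (1/N_cases + 1/N_controls) and the fixed-effects combination rules are the conventional ones, and the conversion of a linear-model effect onto the log-odds scale by dividing by phi * (1 - phi), with phi the case fraction, is a commonly used approximation that the abstract itself does not spell out. All function names are hypothetical.

```python
from math import sqrt


def effective_n(n_cases, n_controls):
    """Effective sample size of a case-control study."""
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


def z_meta(z_scores, n_eff):
    """Scheme (i): combine Z-scores with weights sqrt(effective sample size)."""
    weights = [sqrt(n) for n in n_eff]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    return numerator / sqrt(sum(w * w for w in weights))


def to_log_odds(beta, se, case_fraction):
    """Approximate conversion of a linear-model effect to the log-odds scale
    (assumed approximation: divide by phi * (1 - phi))."""
    scale = case_fraction * (1.0 - case_fraction)
    return beta / scale, se / scale


def ivw_meta(betas, ses):
    """Scheme (ii): fixed-effects inverse-variance weighted estimate and SE."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, sqrt(1.0 / total)


# Toy inputs (invented): two studies with strong case-control imbalance
n_eff = [effective_n(500, 9500), effective_n(1000, 19000)]
print(z_meta([2.1, 2.8], n_eff))

converted = [to_log_odds(0.004, 0.002, 0.05), to_log_odds(0.005, 0.0015, 0.05)]
print(ivw_meta([b for b, _ in converted], [s for _, s in converted]))
```

Note how imbalance is reflected in the weights: a study with 500 cases and 9,500 controls has an effective sample size of 1,900, far below its nominal 10,000.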
Genomics-assisted breeding in fruit trees
Iwata, Hiroyoshi; Minamikawa, Mai F.; Kajiya-Kanegae, Hiromi; Ishimori, Motoyuki; Hayashi, Takeshi
2016-01-01
Recent advancements in genomic analysis technologies have opened up new avenues to promote the efficiency of plant breeding. Novel genomics-based approaches for plant breeding and genetics research, such as genome-wide association studies (GWAS) and genomic selection (GS), are useful, especially in fruit tree breeding. The breeding of fruit trees is hindered by their long generation time, large plant size, long juvenile phase, and the necessity to wait for the physiological maturity of the plant to assess the marketable product (fruit). In this article, we describe the potential of genomics-assisted breeding, which uses these novel genomics-based approaches, to break through these barriers in conventional fruit tree breeding. We first introduce the molecular marker systems and whole-genome sequence data that are available for fruit tree breeding. Next we introduce the statistical methods for biparental linkage and quantitative trait locus (QTL) mapping as well as GWAS and GS. We then review QTL mapping, GWAS, and GS studies conducted on fruit trees. We also review novel technologies for rapid generation advancement. Finally, we note the future prospects of genomics-assisted fruit tree breeding and problems that need to be overcome in the breeding. PMID:27069395
Evaluating genetic risk for prostate cancer among Japanese and Latinos
Cheng, Iona; Chen, Gary K.; Nakagawa, Hidewaki; He, Jing; Wan, Peggy; Laurie, Cathy; Shen, Jess; Sheng, Xin; Pooler, Loreall C.; Crenshaw, Andrew T.; Mirel, Daniel B.; Takahashi, Atsushi; Kubo, Michiaki; Nakamura, Yusuke; Al Olama, Ali Amin; Benlloch, Sara; Donovan, Jenny L.; Guy, Michelle; Hamdy, Freddie C.; Kote-Jarai, Zsofia; Neal, David E.; Wilkens, Lynne R.; Monroe, Kristine R.; Stram, Daniel O.; Muir, Kenneth; Eeles, Rosalind A.; Easton, Douglas F.; Kolonel, Laurence N.; Henderson, Brian E.; Le Marchand, Loïc; Haiman, Christopher A.
2012-01-01
Background There have been few genome-wide association studies (GWAS) of prostate cancer among diverse populations. To search for novel prostate cancer risk variants, we conducted GWAS of prostate cancer in Japanese and Latinos. In addition, we tested prostate cancer risk variants and developed genetic risk models of prostate cancer for Japanese and Latinos. Methods Our first stage GWAS of prostate cancer included Japanese (cases/controls=1,033/1,042) and Latino (cases/controls=1,043/1,057) from the Multiethnic Cohort. Significant associations from stage 1 (P < 1.0×10−4) were examined in silico in GWAS of prostate cancer (stage 2) in Japanese (cases/controls=1,583/3,386) and Europeans (cases/controls=1,854/1,894). Results No novel stage 1 SNPs outside of known risk regions reached genome-wide significance. For Japanese, in stage 1, the most notable putative novel association was seen with 10 SNPs (P ≤ 8.0×10−6) at chromosome 2q33; however, this was not replicated in stage 2. For Latinos, the most significant association was observed with rs17023900 at the known 3p12 risk locus (stage 1: OR=1.45; P = 7.01×10−5 and stage 2: OR=1.58; P = 3.05×10−7). The majority of the established risk variants for prostate cancer, 79% and 88%, were positively associated with prostate cancer in Japanese and Latinos (stage 1), respectively. The cumulative effects of these variants significantly influence prostate cancer risk (OR per allele=1.10; P = 2.71×10−25 and OR=1.07; P = 1.02×10−16 for Japanese and Latinos, respectively). Conclusion and Impact Our GWAS of prostate cancer did not identify novel genome-wide significant variants. However, our findings demonstrate that established risk variants for prostate cancer significantly contribute to risk among Japanese and Latinos. PMID:22923026
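The "OR per allele" figures in this abstract can be read through a log-additive model, in which each additional risk allele multiplies the odds of disease by the same factor. The short sketch below assumes that model (the abstract does not describe how the risk score was built) and uses a hypothetical function name and invented allele counts.

```python
def relative_odds(or_per_allele, n_alleles, n_reference):
    """Odds of disease for a carrier of `n_alleles` risk alleles relative to a
    carrier of `n_reference` risk alleles, under a log-additive model."""
    return or_per_allele ** (n_alleles - n_reference)


# Per-allele ORs as reported in the abstract; allele counts are invented.
print(relative_odds(1.10, n_alleles=10, n_reference=0))  # Japanese estimate
print(relative_odds(1.07, n_alleles=10, n_reference=0))  # Latino estimate
```

Under this assumption, a modest per-allele effect compounds: ten extra risk alleles at OR 1.10 per allele correspond to roughly 2.6 times the odds.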
Mahajan, Anubha; Sim, Xueling; Ng, Hui Jin; Manning, Alisa; Rivas, Manuel A.; Highland, Heather M.; Locke, Adam E.; Grarup, Niels; Im, Hae Kyung; Cingolani, Pablo; Flannick, Jason; Fontanillas, Pierre; Fuchsberger, Christian; Gaulton, Kyle J.; Teslovich, Tanya M.; Rayner, N. William; Robertson, Neil R.; Beer, Nicola L.; Rundle, Jana K.; Bork-Jensen, Jette; Ladenvall, Claes; Blancher, Christine; Buck, David; Buck, Gemma; Burtt, Noël P.; Gabriel, Stacey; Gjesing, Anette P.; Groves, Christopher J.; Hollensted, Mette; Huyghe, Jeroen R.; Jackson, Anne U.; Jun, Goo; Justesen, Johanne Marie; Mangino, Massimo; Murphy, Jacquelyn; Neville, Matt; Onofrio, Robert; Small, Kerrin S.; Stringham, Heather M.; Syvänen, Ann-Christine; Trakalo, Joseph; Abecasis, Goncalo; Bell, Graeme I.; Blangero, John; Cox, Nancy J.; Duggirala, Ravindranath; Hanis, Craig L.; Seielstad, Mark; Wilson, James G.; Christensen, Cramer; Brandslund, Ivan; Rauramaa, Rainer; Surdulescu, Gabriela L.; Doney, Alex S. F.; Lannfelt, Lars; Linneberg, Allan; Isomaa, Bo; Tuomi, Tiinamaija; Jørgensen, Marit E.; Jørgensen, Torben; Kuusisto, Johanna; Uusitupa, Matti; Salomaa, Veikko; Spector, Timothy D.; Morris, Andrew D.; Palmer, Colin N. A.; Collins, Francis S.; Mohlke, Karen L.; Bergman, Richard N.; Ingelsson, Erik; Lind, Lars; Tuomilehto, Jaakko; Hansen, Torben; Watanabe, Richard M.; Prokopenko, Inga; Dupuis, Josee; Karpe, Fredrik; Groop, Leif; Laakso, Markku; Pedersen, Oluf; Florez, Jose C.; Morris, Andrew P.; Altshuler, David; Meigs, James B.; Boehnke, Michael; McCarthy, Mark I.; Lindgren, Cecilia M.; Gloyn, Anna L.
2015-01-01
Genome-wide association studies (GWAS) for fasting glucose (FG) and insulin (FI) have identified common variant signals which explain 4.8% and 1.2% of trait variance, respectively. It is hypothesized that low-frequency and rare variants could contribute substantially to unexplained genetic variance. To test this, we analyzed exome-array data from up to 33,231 non-diabetic individuals of European ancestry. We found exome-wide significant (P<5×10−7) evidence for two loci not previously highlighted by common variant GWAS: GLP1R (p.Ala316Thr, minor allele frequency (MAF)=1.5%) influencing FG levels, and URB2 (p.Glu594Val, MAF=0.1%) influencing FI levels. Coding variant associations can highlight potential effector genes at (non-coding) GWAS signals. At the G6PC2/ABCB11 locus, we identified multiple coding variants in G6PC2 (p.Val219Leu, p.His177Tyr, and p.Tyr207Ser) influencing FG levels, conditionally independent of each other and the non-coding GWAS signal. In vitro assays demonstrate that these associated coding alleles result in reduced protein abundance via proteasomal degradation, establishing G6PC2 as an effector gene at this locus. Reconciliation of single-variant associations and functional effects was only possible when haplotype phase was considered. In contrast to earlier reports suggesting that, paradoxically, glucose-raising alleles at this locus are protective against type 2 diabetes (T2D), the p.Val219Leu G6PC2 variant displayed a modest but directionally consistent association with T2D risk. Coding variant associations for glycemic traits in GWAS signals highlight PCSK1, RREB1, and ZHX3 as likely effector transcripts. These coding variant association signals do not have a major impact on the trait variance explained, but they do provide valuable biological insights. PMID:25625282
Mahajan, Anubha; Sim, Xueling; Ng, Hui Jin; Manning, Alisa; Rivas, Manuel A; Highland, Heather M; Locke, Adam E; Grarup, Niels; Im, Hae Kyung; Cingolani, Pablo; Flannick, Jason; Fontanillas, Pierre; Fuchsberger, Christian; Gaulton, Kyle J; Teslovich, Tanya M; Rayner, N William; Robertson, Neil R; Beer, Nicola L; Rundle, Jana K; Bork-Jensen, Jette; Ladenvall, Claes; Blancher, Christine; Buck, David; Buck, Gemma; Burtt, Noël P; Gabriel, Stacey; Gjesing, Anette P; Groves, Christopher J; Hollensted, Mette; Huyghe, Jeroen R; Jackson, Anne U; Jun, Goo; Justesen, Johanne Marie; Mangino, Massimo; Murphy, Jacquelyn; Neville, Matt; Onofrio, Robert; Small, Kerrin S; Stringham, Heather M; Syvänen, Ann-Christine; Trakalo, Joseph; Abecasis, Goncalo; Bell, Graeme I; Blangero, John; Cox, Nancy J; Duggirala, Ravindranath; Hanis, Craig L; Seielstad, Mark; Wilson, James G; Christensen, Cramer; Brandslund, Ivan; Rauramaa, Rainer; Surdulescu, Gabriela L; Doney, Alex S F; Lannfelt, Lars; Linneberg, Allan; Isomaa, Bo; Tuomi, Tiinamaija; Jørgensen, Marit E; Jørgensen, Torben; Kuusisto, Johanna; Uusitupa, Matti; Salomaa, Veikko; Spector, Timothy D; Morris, Andrew D; Palmer, Colin N A; Collins, Francis S; Mohlke, Karen L; Bergman, Richard N; Ingelsson, Erik; Lind, Lars; Tuomilehto, Jaakko; Hansen, Torben; Watanabe, Richard M; Prokopenko, Inga; Dupuis, Josee; Karpe, Fredrik; Groop, Leif; Laakso, Markku; Pedersen, Oluf; Florez, Jose C; Morris, Andrew P; Altshuler, David; Meigs, James B; Boehnke, Michael; McCarthy, Mark I; Lindgren, Cecilia M; Gloyn, Anna L
2015-01-01
Genome wide association studies (GWAS) for fasting glucose (FG) and insulin (FI) have identified common variant signals which explain 4.8% and 1.2% of trait variance, respectively. It is hypothesized that low-frequency and rare variants could contribute substantially to unexplained genetic variance. To test this, we analyzed exome-array data from up to 33,231 non-diabetic individuals of European ancestry. We found exome-wide significant (P<5×10⁻⁷) evidence for two loci not previously highlighted by common variant GWAS: GLP1R (p.Ala316Thr, minor allele frequency (MAF)=1.5%) influencing FG levels, and URB2 (p.Glu594Val, MAF = 0.1%) influencing FI levels. Coding variant associations can highlight potential effector genes at (non-coding) GWAS signals. At the G6PC2/ABCB11 locus, we identified multiple coding variants in G6PC2 (p.Val219Leu, p.His177Tyr, and p.Tyr207Ser) influencing FG levels, conditionally independent of each other and the non-coding GWAS signal. In vitro assays demonstrate that these associated coding alleles result in reduced protein abundance via proteasomal degradation, establishing G6PC2 as an effector gene at this locus. Reconciliation of single-variant associations and functional effects was only possible when haplotype phase was considered. In contrast to earlier reports suggesting that, paradoxically, glucose-raising alleles at this locus are protective against type 2 diabetes (T2D), the p.Val219Leu G6PC2 variant displayed a modest but directionally consistent association with T2D risk. Coding variant associations for glycemic traits in GWAS signals highlight PCSK1, RREB1, and ZHX3 as likely effector transcripts. These coding variant association signals do not have a major impact on the trait variance explained, but they do provide valuable biological insights.
Chung, Sharon A.; Taylor, Kimberly E.; Graham, Robert R.; Nititham, Joanne; Lee, Annette T.; Ortmann, Ward A.; Jacob, Chaim O.; Alarcón-Riquelme, Marta E.; Tsao, Betty P.; Harley, John B.; Gaffney, Patrick M.; Moser, Kathy L.; Petri, Michelle; Demirci, F. Yesim; Kamboh, M. Ilyas; Manzi, Susan; Gregersen, Peter K.; Langefeld, Carl D.; Behrens, Timothy W.; Criswell, Lindsey A.
2011-01-01
Systemic lupus erythematosus (SLE) is a clinically heterogeneous, systemic autoimmune disease characterized by autoantibody formation. Previously published genome-wide association studies (GWAS) have investigated SLE as a single phenotype. Therefore, we conducted a GWAS to identify genetic factors associated with anti–dsDNA autoantibody production, an SLE–related autoantibody with diagnostic and clinical importance. Using two independent datasets, over 400,000 single nucleotide polymorphisms (SNPs) were studied in a total of 1,717 SLE cases and 4,813 healthy controls. Anti–dsDNA autoantibody positive (anti–dsDNA +, n = 811) and anti–dsDNA autoantibody negative (anti–dsDNA –, n = 906) SLE cases were compared to healthy controls and to each other to identify SNPs associated specifically with these SLE subtypes. SNPs in the previously identified SLE susceptibility loci STAT4, IRF5, ITGAM, and the major histocompatibility complex were strongly associated with anti–dsDNA + SLE. Far fewer and weaker associations were observed for anti–dsDNA – SLE. For example, rs7574865 in STAT4 had an OR for anti–dsDNA + SLE of 1.77 (95% CI 1.57–1.99, p = 2.0E-20) compared to an OR for anti–dsDNA – SLE of 1.26 (95% CI 1.12–1.41, p = 2.4E-04), with p(heterogeneity) < 0.0005. SNPs in the SLE susceptibility loci BANK1, KIAA1542, and UBE2L3 showed evidence of association with anti–dsDNA + SLE and were not associated with anti–dsDNA – SLE. In conclusion, we identified differential genetic associations with SLE based on anti–dsDNA autoantibody production. Many previously identified SLE susceptibility loci may confer disease risk through their role in autoantibody production and be more accurately described as autoantibody propensity loci. Lack of strong SNP associations may suggest that other types of genetic variation or non-genetic factors such as environmental exposures have a greater impact on susceptibility to anti–dsDNA – SLE. PMID:21408207
Genome-wide significant association between a sequence variant at 15q15.2 and lung cancer risk
Rafnar, Thorunn; Sulem, Patrick; Besenbacher, Soren; Gudbjartsson, Daniel F.; Zanon, Carlo; Gudmundsson, Julius; Stacey, Simon N.; Kostic, Jelena P.; Thorgeirsson, Thorgeir E.; Thorleifsson, Gudmar; Bjarnason, Hjordis; Skuladottir, Halla; Gudbjartsson, Tomas; Isaksson, Helgi J.; Isla, Dolores; Murillo, Laura; García-Prats, Maria D.; Panadero, Angeles; Aben, Katja K.H.; Vermeulen, Sita H.; van der Heijden, Henricus F.M.; Feser, William; Miller, York E.; Bunn, Paul A.; Kong, Augustine; Wolf, Holly J.; Franklin, Wilbur A.; Mayordomo, Jose I; Kiemeney, Lambertus A.; Jonsson, Steinn; Thorsteinsdottir, Unnur; Stefansson, Kari
2010-01-01
Genome-wide association studies (GWAS) have identified three genomic regions, at 15q24-25.1, 5p15.33 and 6p21.33, which associate with risk of lung cancer. Large meta-analyses of GWA data have failed to find additional associations of genome-wide significance. In this study, we sought to confirm 7 variants with suggestive association to lung cancer (P<10⁻⁵) in a recently published meta-analysis. In a GWA dataset of 1,447 lung cancer cases and 36,256 controls in Iceland, three correlated variants on 15q15.2 (rs504417, rs11853991 and rs748404) showed a significant association with lung cancer whereas rs4254535 on 2p14, rs1530057 on 3p24.1, rs6438347 on 3q13.31 and rs1926203 on 10q23.31 did not. The most significant variant, rs748404, was genotyped in an additional 1,299 lung cancer cases and 4,102 controls from the Netherlands, Spain and the USA and the results combined with published GWAS data. In this analysis, the T allele of rs748404 reached genome-wide significance (OR=1.15, P=1.1×10⁻⁹). Another variant at the same locus, rs12050604, showed association with lung cancer (OR=1.09, P=3.6×10⁻⁶) and remained significant after adjustment for rs748404 and vice versa. rs748404 is located 140 kb centromeric of the TP53BP1 gene that has been implicated in lung cancer risk. Two fully correlated, non-synonymous coding variants in TP53BP1, rs2602141 (Q1136K) and rs560191 (E353D), showed association with lung cancer in our sample set; however, this association did not remain significant after adjustment for rs748404. Our data show that one or more lung cancer risk variants of genome-wide significance and distinct from the coding variants in TP53BP1 are located at 15q15.2. PMID:21303977
Rare coding variants in Phospholipase D3 (PLD3) confer risk for Alzheimer's disease
Cruchaga, Carlos; Benitez, Bruno A.; Cai, Yefei; Guerreiro, Rita; Harari, Oscar; Norton, Joanne; Budde, John; Bertelsen, Sarah; Jeng, Amanda T.; Cooper, Breanna; Skorupa, Tara; Carrell, David; Levitch, Denise; Hsu, Simon; Choi, Jiyoon; Ryten, Mina; Sassi, Celeste; Bras, Jose; Gibbs, Raphael J.; Hernandez, Dena G.; Lupton, Michelle K.; Powell, John; Forabosco, Paola; Ridge, Perry G.; Corcoran, Christopher D.; Tschanz, JoAnn T.; Norton, Maria C.; Munger, Ronald G.; Schmutz, Cameron; Leary, Maegan; Demirci, F. Yesim; Bamne, Mikhil N.; Wang, Xingbin; Lopez, Oscar L.; Ganguli, Mary; Medway, Christopher; Turton, James; Lord, Jenny; Braae, Anne; Barber, Imelda; Brown, Kristelle; Pastor, Pau; Lorenzo-Betancor, Oswaldo; Brkanac, Zoran; Scott, Erick; Topol, Eric; Morgan, Kevin; Rogaeva, Ekaterina; Singleton, Andy; Hardy, John; Kamboh, M. Ilyas; George-Hyslop, Peter St; Cairns, Nigel; Morris, John C.; Kauwe, John S.K.; Goate, Alison M.
2014-01-01
Genome-wide association studies (GWAS) have identified several risk variants for late-onset Alzheimer's disease (LOAD)1,2. These common variants have replicable but small effects on LOAD risk and generally do not have obvious functional effects. Low-frequency coding variants, not detected by GWAS, are predicted to include functional variants with larger effects on risk. To identify low frequency coding variants with large effects on LOAD risk, we performed whole exome-sequencing (WES) in 14 large LOAD families and follow-up analyses of the candidate variants in several large case-control datasets. A rare variant in PLD3 (phospholipase-D family, member 3, rs145999145; V232M) segregated with disease status in two independent families and doubled risk for AD in seven independent case-control series (V232M meta-analysis; OR = 2.10, CI = 1.47-2.99; p = 2.93×10⁻⁵, 11,354 cases and controls of European-descent). Gene-based burden analyses in 4,387 cases and controls of European-descent and 302 African American cases and controls, with complete sequence data for PLD3, indicate that several variants in this gene increase risk for AD in both populations (EA: OR = 2.75, CI = 2.05-3.68; p = 1.44×10⁻¹¹, AA: OR = 5.48, CI = 1.77-16.92; p = 1.40×10⁻³). PLD3 is highly expressed in brain regions vulnerable to AD pathology, including hippocampus and cortex, and is expressed at lower levels in neurons from AD brains compared to control brains (p = 8.10×10⁻¹⁰). Over-expression of PLD3 leads to a significant decrease in intracellular APP and extracellular Aβ42 and Aβ40, while knock-down of PLD3 leads to a significant increase in extracellular Aβ42 and Aβ40. Together, our genetic and functional data indicate that carriers of PLD3 coding variants have a two-fold increased risk for LOAD and that PLD3 influences APP processing. This study provides an example of how densely affected families may be used to identify rare variants with large effects on risk for disease or other complex traits. PMID:24336208
Sun, Chengming; Wang, Benqi; Yan, Lei; Hu, Kaining; Liu, Sheng; Zhou, Yongming; Guan, Chunyun; Zhang, Zhenqian; Li, Jiana; Zhang, Jiefu; Chen, Song; Wen, Jing; Ma, Chaozhi; Tu, Jinxing; Shen, Jinxiong; Fu, Tingdong; Yi, Bin
2016-01-01
Plant height is a key morphological trait of rapeseed. In this study, we measured plant height of a rapeseed population across six environments. This population contains 476 inbred lines representing the major Chinese rapeseed genepool and 44 lines from other countries. The 60K Brassica Infinium® SNP array was utilized to genotype the association panel. A genome-wide association study (GWAS) was performed via three methods, including a robust, novel, nonparametric Anderson-Darling (A-D) test. Consequently, 68 loci were identified as significantly associated with plant height (P < 5.22 × 10⁻⁵), and more than 70% of the loci (48) overlapped the confidence intervals of reported QTLs from nine mapping populations. Moreover, 24 GWAS loci were detected with selective sweep signals, which reflected the signatures of historical semi-dwarf breeding. In the linkage disequilibrium (LD) decay range up- and downstream of 65 loci (r² > 0.1), we found plausible candidates orthologous to the documented Arabidopsis genes involved in height regulation. One significant association found by GWAS colocalized with the established height locus BnRGA in rapeseed. Our results provide insights into the genetic basis of plant height in rapeseed and may facilitate marker-based breeding.
Renin-Angiotensin System Gene Variants and Type 2 Diabetes Mellitus: Influence of Angiotensinogen
Joyce-Tan, Siew Mei; Zain, Shamsul Mohd; Abdul Sattar, Munavvar Zubaid; Abdullah, Nor Azizan
2016-01-01
Genome-wide association studies (GWAS) have been successfully used to identify variants associated with diseases including type 2 diabetes mellitus (T2DM). However, some variants are not included in GWAS to avoid the penalty of multiple hypothesis testing. Thus, the candidate gene approach is still useful even in the GWAS era. This study attempted to assess whether genetic variations in the renin-angiotensin system (RAS) and their gene interactions are associated with T2DM risk. We genotyped 290 T2DM patients and 267 controls for variants in three genes of the RAS, namely, angiotensin converting enzyme (ACE), angiotensinogen (AGT), and angiotensin II type 1 receptor (AGTR1). There were significant differences in allele frequencies between cases and controls for AGT variants (P = 0.05) but not for ACE and AGTR1. Haplotype TCG of the AGT was associated with increased risk of T2DM (OR 1.92, 95% CI 1.15–3.20, permuted P = 0.012); however, no evidence of significant gene-gene interactions was seen. Nonetheless, our analysis revealed that the AGT variants were independently associated with T2DM. Thus, this study suggests that genetic variants of the RAS can modestly influence the T2DM risk. PMID:26682227
Schaid, Daniel J; Sinnwell, Jason P; Jenkins, Gregory D; McDonnell, Shannon K; Ingle, James N; Kubo, Michiaki; Goss, Paul E; Costantino, Joseph P; Wickerham, D Lawrence; Weinshilboum, Richard M
2012-01-01
Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome wide association studies (GWAS). Yet, complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models, and taking advantage of the directed acyclic graph structure of the gene ontology when creating gene-sets. However, other types of gene-set structures can be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but ensures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provides guidance on the potential strengths and weaknesses of our proposed gene-set analyses. © 2011 Wiley Periodicals, Inc.
Nature vs. nurture in human sociality: multi-level genomic analyses of social conformity.
Chen, Biqing; Zhu, Zijian; Wang, Yingying; Ding, Xiaohu; Guo, Xiaobo; He, Mingguang; Fang, Wan; Zhou, Qin; Zhou, Shanbi; Lei, Han; Huang, Ailong; Chen, Tingmei; Ni, Dongsheng; Gu, Yuping; Liu, Jianing; Rao, Yi
2018-05-01
Social conformity is fundamental to human societies and has been studied for more than six decades, but our understanding of its mechanisms remains limited. Individual differences in conformity have been attributed to social and cultural environmental influences, but not to genes. Here we demonstrate a genetic contribution to conformity after analyzing 1,140 twins and single-nucleotide polymorphism (SNP)-based studies of 2,130 young adults. A two-step genome-wide association study (GWAS) revealed replicable associations in 9 genomic loci, and a meta-analysis of three GWAS with a sample size of ~2,600 further confirmed one locus, corresponding to the NAV3 (Neuron Navigator 3) gene which encodes a protein important for axon outgrowth and guidance. Further multi-level (haplotype, gene, pathway) GWAS strongly associated genes including NAV3, PTPRD (protein tyrosine phosphatase receptor type D), ARL10 (ADP ribosylation factor-like GTPase 10), and CTNND2 (catenin delta 2), with conformity. Magnetic resonance imaging of 64 subjects shows correlation of activation or structural features of brain regions with the SNPs of these genes, supporting their functional significance. Our results suggest potential moderate genetic influence on conformity, implicate several specific genetic elements in conformity and will facilitate further research on cellular and molecular mechanisms underlying human conformity.
Pathak, Jyotishman; Kiefer, Richard C.; Chute, Christopher G.
2012-01-01
The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. One of the key requirements to perform GWAS is the identification of subject cohorts with accurate classification of disease phenotypes. In this work, we study how emerging Semantic Web technologies can be applied in conjunction with clinical data stored in electronic health records (EHRs) to accurately identify subjects with specific diseases for inclusion in cohort studies. In particular, we demonstrate the role of using Resource Description Framework (RDF) for representing EHR data and enabling federated querying and inferencing via standardized Web protocols for identifying subjects with Diabetes Mellitus. Our study highlights the potential of using Web-scale data federation approaches to execute complex queries. PMID:22779040
Gonzalez-Pena, Dianelys; Gao, Guangtu; Baranski, Matthew; Moen, Thomas; Cleveland, Beth M.; Kenney, P. Brett; Vallejo, Roger L.; Palti, Yniv; Leeds, Timothy D.
2016-01-01
Fillet yield (FY, %) is an economically-important trait in rainbow trout aquaculture that affects production efficiency. Despite that, FY has received little attention in breeding programs because it is difficult to measure on a large number of fish and cannot be directly measured on breeding candidates. The recent development of a high-density SNP array for rainbow trout has provided the needed tool for studying the underlying genetic architecture of this trait. A genome-wide association study (GWAS) was conducted for FY, body weight at 10 (BW10) and 13 (BW13) months post-hatching, head-off carcass weight (CAR), and fillet weight (FW) in a pedigreed rainbow trout population selectively bred for improved growth performance. The GWAS analysis was performed using the weighted single-step GBLUP method (wssGWAS). Phenotypic records of 1447 fish (1.5 kg at harvest) from 299 full-sib families in three successive generations, of which 875 fish from 196 full-sib families were genotyped, were used in the GWAS analysis. A total of 38,107 polymorphic SNPs were analyzed in a univariate model with hatch year and harvest group as fixed effects, harvest weight as a continuous covariate, and animal and common environment as random effects. A new linkage map was developed to create windows of 20 adjacent SNPs for use in the GWAS. The two windows with largest effect for FY and FW were located on chromosome Omy9 and explained only 1.0–1.5% of genetic variance, thus suggesting a polygenic architecture affected by multiple loci with small effects in this population. One window on Omy5 explained 1.4 and 1.0% of the genetic variance for BW10 and BW13, respectively. Three windows located on Omy27, Omy17, and Omy9 (same window detected for FY) explained 1.7, 1.7, and 1.0%, respectively, of genetic variance for CAR. Among the detected 100 SNPs, 55% were located directly in genes (introns and exons). Nucleotide sequences of intragenic SNPs were blasted against the Mus musculus genome to create a putative gene network. The network suggests that differences in the ability to maintain a proliferative and renewable population of myogenic precursor cells may affect variation in growth and fillet yield in rainbow trout. PMID:27920797
SNP association study in PMS2-associated Lynch syndrome.
Ten Broeke, Sanne W; Elsayed, Fadwa A; Pagan, Lisa; Olderode-Berends, Maran J W; Garcia, Encarna Gomez; Gille, Hans J P; van Hest, Liselot P; Letteboer, Tom G W; van der Kolk, Lizet E; Mensenkamp, Arjen R; van Os, Theo A; Spruijt, Liesbeth; Redeker, Bert J W; Suerink, Manon; Vos, Yvonne J; Wagner, Anja; Wijnen, Juul T; Steyerberg, E W; Tops, Carli M J; van Wezel, Tom; Nielsen, Maartje
2017-11-17
Lynch syndrome (LS) patients are at high risk of developing colorectal cancer (CRC). Phenotypic variability might in part be explained by common susceptibility loci identified in Genome Wide Association Studies (GWAS). Previous studies focused mostly on MLH1, MSH2 and MSH6 carriers, with conflicting results. We aimed to determine the role of GWAS SNPs in PMS2 mutation carriers. A cohort study was performed in 507 PMS2 carriers (124 CRC cases), genotyped for 24 GWAS SNPs, including SNPs at 11q23.1 and 8q23.3. Hazard ratios (HRs) were calculated using a weighted Cox regression analysis to correct for ascertainment bias. Discrimination was assessed with a concordance statistic in a bootstrap cross-validation procedure. Individual SNPs only had non-significant associations with CRC occurrence with HRs lower than 2, although male carriers of allele A at rs1321311 (6p21.31) may have increased risk of CRC (HR = 2.1, 95% CI 1.2-3.0). A polygenic risk score (PRS) based on 24 HRs had an HR of 2.6 (95% CI 1.5-4.6) for the highest compared to the lowest quartile, but had no discriminative ability (c statistic 0.52). Previously suggested SNPs do not modify CRC risk in PMS2 carriers. Future large studies are needed for improved risk stratification among Lynch syndrome patients.
Hemoglobin genetics: recent contributions of GWAS and gene editing
Smith, Elenoe C.; Orkin, Stuart H.
2016-01-01
The β-hemoglobinopathies are inherited disorders resulting from altered coding potential or expression of the adult β-globin gene. Impaired expression of β-globin reduces adult hemoglobin (α2β2) production, the hallmark of β-thalassemia. A single-base mutation at codon 6 leads to formation of HbS (α2βS2) and sickle cell disease. While the basis of these diseases is known, therapy remains largely supportive. Bone marrow transplantation is the only curative therapy. Patients with elevated levels of fetal hemoglobin (HbF, α2γ2) as adults exhibit reduced symptoms and enhanced survival. The β-globin gene locus is a paradigm of cell- and developmental stage-specific regulation. Although the principal erythroid cell transcription factors are known, mechanisms responsible for silencing of the γ-globin gene were obscure until application of genome-wide association studies (GWAS). Here, we review findings in the field. GWAS identified BCL11A as a candidate negative regulator of γ-globin expression. Subsequent studies have established BCL11A as a quantitative repressor. GWAS-related single-nucleotide polymorphisms lie within an essential erythroid enhancer of the BCL11A gene. Disruption of a discrete region within the enhancer reduces BCL11A expression and induces HbF expression, providing the basis for gene therapy using gene editing tools. A recently identified, second silencing factor, leukemia/lymphoma-related factor/Pokemon, shares features with BCL11A, including interaction with the nucleosome remodeling deacetylase repressive complex. These findings suggest involvement of a common pathway for HbF silencing. In addition, we discuss other factors that may be involved in γ-globin gene silencing and their potential manipulation for therapeutic benefit in treating the β-hemoglobinopathies. PMID:27340226
Mägi, Reedik; Horikoshi, Momoko; Sofer, Tamar; Mahajan, Anubha; Kitajima, Hidetoshi; Franceschini, Nora; McCarthy, Mark I.; Morris, Andrew P.
2017-01-01
Trans-ethnic meta-analysis of genome-wide association studies (GWAS) across diverse populations can increase power to detect complex trait loci when the underlying causal variants are shared between ancestry groups. However, heterogeneity in allelic effects between GWAS at these loci can occur that is correlated with ancestry. Here, a novel approach is presented to detect SNP association and quantify the extent of heterogeneity in allelic effects that is correlated with ancestry. We employ trans-ethnic meta-regression to model allelic effects as a function of axes of genetic variation, derived from a matrix of mean pairwise allele frequency differences between GWAS, and implemented in the MR-MEGA software. Through detailed simulations, we demonstrate increased power to detect association for MR-MEGA over fixed- and random-effects meta-analysis across a range of scenarios of heterogeneity in allelic effects between ethnic groups. We also demonstrate improved fine-mapping resolution, in loci containing a single causal variant, compared to these meta-analysis approaches and PAINTOR, and equivalent performance to MANTRA at reduced computational cost. Application of MR-MEGA to trans-ethnic GWAS of kidney function in 71,461 individuals indicates stronger signals of association than fixed-effects meta-analysis when heterogeneity in allelic effects is correlated with ancestry. Application of MR-MEGA to fine-mapping four type 2 diabetes susceptibility loci in 22,086 cases and 42,539 controls highlights: (i) strong evidence for heterogeneity in allelic effects that is correlated with ancestry only at the index SNP for the association signal at the CDKAL1 locus; and (ii) 99% credible sets with six or fewer variants for five distinct association signals. PMID:28911207
Combined linkage and association analyses identify a novel locus for obesity near PROX1 in Asians.
Kim, Hyun-Jin; Yoo, Yun Joo; Ju, Young Seok; Lee, Seungbok; Cho, Sung-Il; Sung, Joohon; Kim, Jong-Il; Seo, Jeong-Sun
2013-11-01
Although genome-wide association studies (GWAS) have substantially contributed to understanding the genetic architecture of complex traits, unidentified variants for these traits remain an issue. One efficient approach is to improve the power of a GWAS scan by weighting P values with prior linkage signals. Our objective was to identify novel candidates for obesity in Asian populations by using gene-mapping strategies that combine linkage and association analyses. To obtain linkage information for body mass index (BMI) and waist circumference (WC), we performed a multipoint genome-wide linkage study in an isolated Mongolian sample of 1,049 individuals from 74 families. Next, a family-based GWAS, which integrates within- and between-family components, was performed using the genotype data of 756 individuals of the Mongolian sample, and P values for association were weighted using linkage information obtained previously. For both BMI (LOD = 3.3) and WC (LOD = 2.6), the highest linkage peak was discovered at chromosome 10q11.22. In family-based GWAS combined with linkage information, six single-nucleotide polymorphisms (SNPs) for BMI and five SNPs for WC reached a significant level of association (linkage-weighted P < 1 × 10⁻⁵). Of these, only one of the SNPs associated with WC (rs1704198) was replicated in 327 Korean families comprising 1,301 individuals. This SNP was located in the proximity of the prospero-related homeobox 1 (PROX1) gene, the function of which was validated previously in a mouse model. Our powerful strategic analysis enabled the discovery of a novel candidate gene, PROX1, associated with WC in an Asian population. Copyright © 2012 The Obesity Society.
Genetic characteristics of inflammatory bowel disease in a Japanese population.
Fuyuno, Yuta; Yamazaki, Keiko; Takahashi, Atsushi; Esaki, Motohiro; Kawaguchi, Takaaki; Takazoe, Masakazu; Matsumoto, Takayuki; Matsui, Toshiyuki; Tanaka, Hiroki; Motoya, Satoshi; Suzuki, Yasuo; Kiyohara, Yutaka; Kitazono, Takanari; Kubo, Michiaki
2016-07-01
Crohn's disease (CD) and ulcerative colitis (UC) are two major forms of inflammatory bowel disease (IBD). Meta-analyses of genome-wide association studies (GWAS) have identified 163 susceptibility loci for IBD among European populations; however, there is limited information for IBD susceptibility in a Japanese population. We performed a GWAS using imputed genotypes of 743 IBD patients (372 with CD and 371 with UC) and 3321 controls. Using 100 tag single-nucleotide polymorphisms (SNPs) (P < 5 × 10(-5)), a replication study was conducted with an independent set of 1310 IBD patients (949 with CD and 361 with UC) and 4163 controls. In addition, 163 SNPs identified by a European IBD GWAS were genotyped, and genetic backgrounds were compared between the Japanese and European populations. In the IBD GWAS, two East Asia-specific IBD susceptibility loci were identified in the Japanese population: ATG16L2-FCHSD2 and SLC25A15-ELF1-WBP4. Among 163 reported SNPs in European IBD patients, significant associations were confirmed in 18 (8 CD-specific, 4 UC-specific, and 6 IBD-shared). In Japanese CD patients, genes in the Th17-IL23 pathway showed stronger genetic effects, whereas the association of genes in the autophagy pathway was limited. The association of genes in the epithelial barrier and the Th17-IL23R pathways were similar in the Japanese and European UC populations. We confirmed two IBD susceptibility loci as common for CD and UC, and East Asian-specific. The genetic architecture in UC appeared to be similar between Europeans and East Asians, but may have some differences in CD.
Dennis, Jessica; Medina-Rivera, Alejandra; Truong, Vinh; Antounians, Lina; Zwingerman, Nora; Carrasco, Giovana; Strug, Lisa; Wells, Phil; Trégouët, David-Alexandre; Morange, Pierre-Emmanuel; Wilson, Michael D; Gagnon, France
2017-07-01
Tissue factor pathway inhibitor (TFPI) regulates the formation of intravascular blood clots, which manifest clinically as ischemic heart disease, ischemic stroke, and venous thromboembolism (VTE). TFPI plasma levels are heritable, but the genetics underlying TFPI plasma level variability are poorly understood. Herein we report the first genome-wide association scan (GWAS) of TFPI plasma levels, conducted in 251 individuals from five extended French-Canadian families ascertained on VTE. To improve discovery, we also applied a hypothesis-driven (HD) GWAS approach that prioritized single nucleotide polymorphisms (SNPs) in (1) hemostasis pathway genes, and (2) vascular endothelial cell (EC) regulatory regions, which are among the highest expressers of TFPI. Our GWAS identified 131 SNPs with suggestive evidence of association (P-value < 5 × 10(-8)), but no SNPs reached the genome-wide threshold for statistical significance. Hemostasis pathway genes were not enriched for TFPI plasma level associated SNPs (global hypothesis test P-value = 0.147), but EC regulatory regions contained more TFPI plasma level associated SNPs than expected by chance (global hypothesis test P-value = 0.046). We therefore stratified our genome-wide SNPs, prioritizing those in EC regulatory regions via stratified false discovery rate (sFDR) control, and reranked the SNPs by q-value. The minimum q-value was 0.27, and the top-ranked SNPs did not show association evidence in the MARTHA replication sample of 1,033 unrelated VTE cases. Although this study did not result in new loci for TFPI, our work lays out a strategy to utilize epigenomic data in prioritization schemes for future GWAS studies. © 2017 WILEY PERIODICALS, INC.
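The stratified ranking step described above can be sketched roughly as follows, assuming that q-values are Benjamini-Hochberg adjusted P values computed separately within each stratum and then pooled for ranking. The abstract does not state which q-value estimator was used, so the function names, the stratum labels, and the P values are hypothetical.

```python
import numpy as np

def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values (used here as q-values)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.minimum(ranked, 1.0)
    return q

def stratified_qvalues(pvals, strata):
    """Compute q-values within each stratum, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    s = np.asarray(strata)
    q = np.empty(p.size)
    for label in np.unique(s):
        mask = s == label
        q[mask] = bh_qvalues(p[mask])
    return q

# Hypothetical SNPs: two in a small prioritised stratum, four in the remainder.
pvals = [0.001, 0.5, 0.001, 0.5, 0.5, 0.5]
strata = ["ec", "ec", "other", "other", "other", "other"]
q = stratified_qvalues(pvals, strata)
```

Two SNPs with the same P value receive different q-values here because the smaller stratum carries a lighter multiple-testing burden, which is what lets the prioritised SNPs move up the ranking.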
Park, Sung Hee; Lee, Ji Young; Kim, Sangsoo
2011-01-01
Current Genome-Wide Association Studies (GWAS) are performed in a single trait framework without considering genetic correlations between important disease traits. Hence, the GWAS have limitations in discovering genetic risk factors affecting pleiotropic effects. This work reports a novel data mining approach to discover patterns of multiple phenotypic associations over 52 anthropometric and biochemical traits in KARE and a new analytical scheme for GWAS of multivariate phenotypes defined by the discovered patterns. This methodology was applied to the GWAS for the multivariate phenotype highLDLhighTG derived from the predicted patterns of the phenotypic associations. The patterns of the phenotypic associations were informative to draw relations between plasma lipid levels with bone mineral density and a cluster of common traits (obesity, hypertension, insulin resistance) related to Metabolic Syndrome (MS). A total of 15 SNPs in six genes (PAK7, C20orf103, NRIP1, BCL2, TRPM3, and NAV1) were identified for significant associations with highLDLhighTG. Noteworthy findings were that the significant associations included a missense mutation (PAK7:R335P), a frameshift mutation (C20orf103) and SNPs in splicing sites (TRPM3). The six genes corresponded to rat and mouse quantitative trait loci (QTLs) that had shown associations with the common traits such as the well characterized MS and even tumor susceptibility. Our findings suggest that the six genes may play important roles in the pleiotropic effects on lipid metabolism and the MS, which increase the risk of Type 2 Diabetes and cardiovascular disease. The use of the multivariate phenotypes can be advantageous in identifying genetic risk factors, accounting for the pleiotropic effects when the multivariate phenotypes have a common etiological pathway.
Nagao, Yumiko; Nishida, Nao; Toyo-Oka, Licht; Kawaguchi, Atsushi; Amoroso, Antonio; Carrozzo, Marco; Sata, Michio; Mizokami, Masashi; Tokunaga, Katsushi; Tanaka, Yasuhito
2017-06-01
There is a close relationship between hepatitis C virus (HCV) infection and lichen planus, a chronic inflammatory mucocutaneous disease. We performed a genome-wide association study (GWAS) to identify genetic variants associated with HCV-related lichen planus. We conducted a GWAS of 261 patients with HCV infection treated at a tertiary medical center in Japan from October 2007 through January 2013; a total of 71 had lichen planus and 190 had normal oral mucosa. We validated our findings in a GWAS of 38 patients with HCV-associated lichen planus and 7 HCV-infected patients with normal oral mucosa treated at a medical center in Italy. Single-nucleotide polymorphisms in NRP2 (rs884000) and IGFBP4 (rs538399) were associated with risk of HCV-associated lichen planus (P < 1 × 10(-4)). We also found an association between a single-nucleotide polymorphism in the HLA-DR/DQ genes (rs9461799) and susceptibility to HCV-associated lichen planus. The odds ratios for the minor alleles of rs884000, rs538399, and rs9461799 were 3.25 (95% confidence interval, 1.95-5.41), 0.40 (95% confidence interval, 0.25-0.63), and 2.15 (95% confidence interval, 1.41-3.28), respectively. In a GWAS of Japanese patients with HCV infection, we replicated associations between previously reported polymorphisms in HLA class II genes and risk for lichen planus. We also identified single-nucleotide polymorphisms in NRP2 and IGFBP4 loci that increase and reduce risk of lichen planus, respectively. These genetic variants might be used to identify patients with HCV infection who are at risk for lichen planus. Copyright © 2017 AGA Institute. Published by Elsevier Inc. All rights reserved.
Melo, Thaise P; Takada, Luciana; Baldi, Fernando; Oliveira, Henrique N; Dias, Marina M; Neves, Haroldo H R; Schenkel, Flavio S; Albuquerque, Lucia G; Carvalheiro, Roberto
2016-06-21
QTL mapping through genome-wide association studies (GWAS) is challenging, especially in the case of low heritability complex traits and when few animals possess genotypic and phenotypic information. When most of the phenotypic information is from non-genotyped animals, GWAS can be performed using the weighted single-step GBLUP (WssGBLUP) method, which permits to combine all available information, even that of non-genotyped animals. However, it is not clear to what extent phenotypic information from non-genotyped animals increases the power of QTL detection, and whether factors such as the extent of linkage disequilibrium (LD) in the population and weighting SNPs in WssGBLUP affect the importance of using information from non-genotyped animals in GWAS. These questions were investigated in this study using real and simulated data. Analysis of real data showed that the use of phenotypes of non-genotyped animals affected SNP effect estimates and, consequently, QTL mapping. Despite some coincidence, the most important genomic regions identified by the analyses, either using or ignoring phenotypes of non-genotyped animals, were not the same. The simulation results indicated that the inclusion of all available phenotypic information, even that of non-genotyped animals, tends to improve QTL detection for low heritability complex traits. For populations with low levels of LD, this trend of improvement was less pronounced. Stronger shrinkage on SNPs explaining lower variance was not necessarily associated with better QTL mapping. The use of phenotypic information from non-genotyped animals in GWAS may improve the ability to detect QTL for low heritability complex traits, especially in populations in which the level of LD is high.
Paziewska, Agnieszka; Cukrowska, Bozena; Dabrowska, Michalina; Goryca, Krzysztof; Piatkowska, Magdalena; Kluska, Anna; Mikula, Michal; Karczmarski, Jakub; Oralewska, Beata; Rybak, Anna; Socha, Jerzy; Balabas, Aneta; Zeber-Lubecka, Natalia; Ambrozkiewicz, Filip; Konopka, Ewa; Trojanowska, Ilona; Zagroba, Malgorzata; Szperl, Malgorzata; Ostrowski, Jerzy
2015-01-01
Assessment of non-HLA variants alongside standard HLA testing was previously shown to improve the identification of potential coeliac disease (CD) patients. We intended to identify new genetic variants associated with CD in the Polish population that would improve CD risk prediction when used alongside HLA haplotype analysis. DNA samples of 336 CD and 264 unrelated healthy controls were used to create DNA pools for a genome wide association study (GWAS). GWAS findings were validated with individual HLA tag single nucleotide polymorphism (SNP) typing of 473 patients and 714 healthy controls. Association analysis using four HLA-tagging SNPs showed that, as was found in other populations, positive predicting genotypes (HLA-DQ2.5/DQ2.5, HLA-DQ2.5/DQ2.2, and HLA-DQ2.5/DQ8) were found at higher frequencies in CD patients than in healthy control individuals in the Polish population. Both CD-associated SNPs discovered by GWAS were found in the CD susceptibility region, confirming the previously-determined association of the major histocompatibility (MHC) region with CD pathogenesis. The two most significant SNPs from the GWAS were rs9272346 (HLA-dependent; localized within 1 Kb of DQA1) and rs3130484 (HLA-independent; mapped to MSH5). Specificity of CD prediction using the four HLA-tagging SNPs achieved 92.9%, but sensitivity was only 45.5%. However, when a testing combination of the HLA-tagging SNPs and the MSH5 SNP was used, specificity decreased to 80%, and sensitivity increased to 74%. This study confirmed that improvement of CD risk prediction sensitivity could be achieved by including non-HLA SNPs alongside HLA SNPs in genetic testing.
Discovery and characterization of two new stem rust resistance genes in Aegilops sharonensis.
Yu, Guotai; Champouret, Nicolas; Steuernagel, Burkhard; Olivera, Pablo D; Simmons, Jamie; Williams, Cole; Johnson, Ryan; Moscou, Matthew J; Hernández-Pinzón, Inmaculada; Green, Phon; Sela, Hanan; Millet, Eitan; Jones, Jonathan D G; Ward, Eric R; Steffenson, Brian J; Wulff, Brande B H
2017-06-01
We identified two novel wheat stem rust resistance genes, Sr-1644-1Sh and Sr-1644-5Sh, in Aegilops sharonensis that are effective against widely virulent African races of the wheat stem rust pathogen. Stem rust is one of the most important diseases of wheat in the world. When single stem rust resistance (Sr) genes are deployed in wheat, they are often rapidly overcome by the pathogen. To this end, we initiated a search for novel sources of resistance in diverse wheat relatives and identified the wild goatgrass species Aegilops sharonensis (Sharon goatgrass) as a rich reservoir of resistance to wheat stem rust. The objectives of this study were to discover and map novel Sr genes in Ae. sharonensis and to explore the possibility of identifying new Sr genes by genome-wide association study (GWAS). We developed two biparental populations between resistant and susceptible accessions of Ae. sharonensis and performed QTL and linkage analysis. In an F6 recombinant inbred line and an F2 population, two genes were identified that mapped to the short arm of chromosome 1S(sh), designated as Sr-1644-1Sh, and the long arm of chromosome 5S(sh), designated as Sr-1644-5Sh. The gene Sr-1644-1Sh confers a high level of resistance to race TTKSK (a member of the Ug99 race group), while the gene Sr-1644-5Sh conditions strong resistance to TRTTF, another widely virulent race found in Yemen. Additionally, GWAS was conducted on 125 diverse Ae. sharonensis accessions for stem rust resistance. The gene Sr-1644-1Sh was detected by GWAS, while Sr-1644-5Sh was not detected, indicating that the effectiveness of GWAS might be affected by marker density, population structure, low allele frequency and other factors.
Bigdeli, Tim B.; Ripke, Stephan; Bacanu, Silviu-Alin; Lee, Sang Hong; Wray, Naomi R.; Gejman, Pablo V.; Rietschel, Marcella; Cichon, Sven; St Clair, David; Corvin, Aiden; Kirov, George; McQuillin, Andrew; Gurling, Hugh; Rujescu, Dan; Andreassen, Ole A.; Werge, Thomas; Blackwood, Douglas H.R.; Pato, Carlos N.; Pato, Michele T.; Malhotra, Anil K.; O’Donovan, Michael C.; Kendler, Kenneth S.; Fanous, Ayman H.
2018-01-01
Genome-wide association studies (GWAS) of schizophrenia have yielded more than 100 common susceptibility variants, and strongly support a substantial polygenic contribution of a large number of small allelic effects. It has been hypothesized that familial schizophrenia is largely a consequence of inherited rather than environmental factors. We investigated the extent to which familiality of schizophrenia is associated with enrichment for common risk variants detectable in a large GWAS. We analyzed single nucleotide polymorphism (SNP) data for cases reporting a family history of psychotic illness (N = 978), cases reporting no such family history (N = 4,503), and unscreened controls (N = 8,285) from the Psychiatric Genomics Consortium (PGC1) study of schizophrenia. We used a multinomial logistic regression approach with model-fitting to detect allelic effects specific to either family history subgroup. We also considered a polygenic model, in which we tested whether family history positive subjects carried more schizophrenia risk alleles than family history negative subjects, on average. Several individual SNPs attained suggestive but not genome-wide significant association with either family history subgroup. Comparison of genome-wide polygenic risk scores based on GWAS summary statistics indicated a significant enrichment for SNP effects among family history positive compared to family history negative cases (Nagelkerke’s R2 = 0.0021; P = 0.00331; P-value threshold <0.4). Estimates of variability in disease liability attributable to the aggregate effect of genome-wide SNPs were significantly greater for family history positive compared to family history negative cases (0.32 and 0.22, respectively; P = 0.031). We found suggestive evidence of allelic effects detectable in large GWAS of schizophrenia that might be specific to particular family history subgroups.
However, consideration of a polygenic risk score indicated a significant enrichment among family history positive cases for common allelic effects. Familial illness might, therefore, represent a more heritable form of schizophrenia, as suggested by previous epidemiological studies. PMID:26663532
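A polygenic risk score of the kind compared above is, in its simplest form, a weighted sum of risk-allele counts over the SNPs that pass a P-value threshold. The sketch below assumes that simple form and uses the 0.4 threshold reported in the abstract. The genotypes, effect sizes, and function name are hypothetical, and real pipelines also handle LD pruning and allele alignment, which are omitted here.

```python
import numpy as np

def polygenic_score(dosages, betas, pvals, p_threshold=0.4):
    """Sum of effect-size-weighted allele counts over SNPs with p < p_threshold.

    dosages: array of shape (n_individuals, n_snps), allele counts in 0..2.
    betas:   per-SNP effect sizes (e.g. log odds ratios) from summary statistics.
    pvals:   per-SNP discovery p-values used only for SNP selection.
    """
    dosages = np.asarray(dosages, dtype=float)
    betas = np.asarray(betas, dtype=float)
    keep = np.asarray(pvals, dtype=float) < p_threshold
    return dosages[:, keep] @ betas[keep]

# Hypothetical data: two individuals, three SNPs; the third SNP fails the threshold.
scores = polygenic_score(
    dosages=[[2, 0, 1], [0, 1, 2]],
    betas=[0.1, -0.2, 0.05],
    pvals=[0.01, 0.3, 0.9],
)
```

Scores computed this way for family history positive and negative cases could then be compared in a regression model, which is where a statistic such as Nagelkerke's R2 would come from.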
Meng, Shan; He, Jianbo; Zhao, Tuanjie; Xing, Guangnan; Li, Yan; Yang, Shouping; Lu, Jiangjie; Wang, Yufeng; Gai, Junyi
2016-08-01
Utilizing an innovative GWAS in CSLRP, 44 QTL 199 alleles with 72.2 % contribution to SIFC variation were detected and organized into a QTL-allele matrix for cross design and gene annotation. The seed isoflavone content (SIFC) of soybeans is of great importance to health care. The Chinese soybean landrace population (CSLRP) as a genetic reservoir was studied for its whole-genome quantitative trait loci (QTL) system of the SIFC using an innovative restricted two-stage multi-locus genome-wide association study procedure (RTM-GWAS). A sample of 366 landraces was tested under four environments and sequenced using RAD-seq (restriction-site-associated DNA sequencing) technique to obtain 116,769 single nucleotide polymorphisms (SNPs) then organized into 29,119 SNP linkage disequilibrium blocks (SNPLDBs) for GWAS. The detected 44 QTL 199 alleles on 16 chromosomes (explaining 72.2 % of the total phenotypic variation) with the allele effects (92 positive and 107 negative) of the CSLRP were organized into a QTL-allele matrix showing the SIFC population genetic structure. Additional differentiation among eco-regions due to the SIFC in addition to that of genome-wide markers was found. All accessions comprised both positive and negative alleles, implying a great potential for recombination within the population. The optimal crosses were predicted from the matrices, showing transgressive potentials in the CSLRP. From the detected QTL system, 55 candidate genes related to 11 biological processes were χ(2)-tested as an SIFC candidate gene system. The present study explored the genome-wide SIFC QTL/gene system with the innovative RTM-GWAS and found the potentials of the QTL-allele matrix in optimal cross design and population genetic and genomic studies, which may have provided a solution to match the breeding by design strategy at both QTL and gene levels in breeding programs.
Gan, Wei; Walters, Robin G; Holmes, Michael V; Bragg, Fiona; Millwood, Iona Y; Banasik, Karina; Chen, Yiping; Du, Huaidong; Iona, Andri; Mahajan, Anubha; Yang, Ling; Bian, Zheng; Guo, Yu; Clarke, Robert J; Li, Liming; McCarthy, Mark I; Chen, Zhengming
2016-07-01
Genome-wide association studies (GWAS) have discovered many risk variants for type 2 diabetes. However, estimates of the contributions of risk variants to type 2 diabetes predisposition are often based on highly selected case-control samples, and reliable estimates of population-level effect sizes are missing, especially in non-European populations. The individual and cumulative effects of 59 established type 2 diabetes risk loci were measured in a population-based China Kadoorie Biobank (CKB) study of 93,000 Chinese adults, including >7,100 diabetes cases. Association signals were directionally consistent between CKB and the original discovery GWAS: of 56 variants passing quality control, 48 showed the same direction of effect (binomial test, p = 2.3 × 10(-8)). We observed a consistent overall trend towards lower risk variant effect sizes in CKB than in case-control samples of GWAS meta-analyses (mean 19-22% decrease in log odds, p ≤ 0.0048), likely to reflect correction of both 'winner's curse' and spectrum bias effects. The association with risk of diabetes of a genetic risk score, based on lead variants at 25 loci considered to act through beta cell function, demonstrated significant interactions with several measures of adiposity (BMI, waist circumference [WC], WHR and percentage body fat [PBF]; all p interaction < 1 × 10(-4)), with a greater effect being observed in leaner adults. Our study provides further evidence of shared genetic architecture for type 2 diabetes between Europeans and East Asians. It also indicates that even very large GWAS meta-analyses may be vulnerable to substantial inflation of effect size estimates, compared with those observed in large-scale population-based cohort studies. Details of how to access China Kadoorie Biobank data and details of the data release schedule are available from www.ckbiobank.org/site/Data+Access .
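The directional-consistency result quoted above (48 of 56 variants with the same direction of effect) can be checked with a short sign-test calculation. The sketch assumes a one-sided binomial test against a null probability of 0.5. The abstract does not say whether the test was one- or two-sided, but the one-sided tail comes out at about 2.3 × 10(-8), matching the reported value.

```python
from math import comb

def binomial_upper_tail(successes, trials):
    """P(X >= successes) for X ~ Binomial(trials, 0.5), computed exactly."""
    favourable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return favourable / 2 ** trials

# 48 of 56 variants share the direction of effect seen in the discovery GWAS.
p_one_sided = binomial_upper_tail(48, 56)
print(f"{p_one_sided:.2e}")  # prints 2.34e-08
```

Exact integer arithmetic through `math.comb` avoids any floating-point underflow in the tail, which matters little at this size but keeps the calculation easy to audit.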
The case of GWAS of obesity: does body weight control play by the rules?
Müller, Manfred J; Geisler, Corinna; Blundell, John; Dulloo, Abdul; Schutz, Yves; Krawczak, Michael; Bosy-Westphal, Anja; Enderle, Janna; Heymsfield, Steven B
2018-05-24
As yet, genome-wide association studies (GWAS) have not added much to our understanding of the mechanisms of body weight control and of the etiology of obesity. This shortcoming is widely attributed to the complexity of the issues. The appeal of this explanation notwithstanding, we surmise that (i) an oversimplification of the phenotype (namely by the use of crude anthropometric traits) and (ii) a lack of sound concepts of body weight control and, thus, a lack of a clear research focus have impeded better insights most. The idea of searching for polygenetic mechanisms underlying common forms of obesity was born out of the impressive findings made for monogenetic forms of extreme obesity. In the case of common obesity, however, observational studies on normal weight and overweight subjects never provided any strong evidence for a tight internal control of body weight. In addition, empirical studies of weight changes in normal weight and overweight subjects revealed an intra-individual variance that was similar to inter-individual variance, suggesting the absence of tight control of body weight. Not least, this lack of coerciveness is reflected by the present obesity epidemic. Finally, data on detailed body composition highlight that body weight is too heterogeneous a phenotype to be controlled as a single entity. In summary, GWAS of obesity using crude anthropometric traits have likely been misled by popular heritability estimates that may have been inflated in the first place. To facilitate more robust and useful insights into the mechanisms of internal control of human body weight and, consequently, the genetic basis of obesity, we argue in favor of a broad discussion between scientists from the areas of integrative physiology and genomics. This discussion should aim at better conceived studies employing biologically more meaningful phenotypes based on in-depth body composition analysis. To advance, the scientific community (including the editors of our top journals) needs a re-launch of future GWAS of obesity.
Yu, Kai; Chin, Yoon-Ming; Lou, Pei-Jen; Hsu, Wan-Lun; McKay, James D.; Chen, Chien-Jen; Chang, Yu-Sun; Chen, Li-Zhen; Chen, Ming-Yuan; Cui, Qian; Feng, Fu-Tuo; Feng, Qi-Shen; Guo, Yun-Miao; Jia, Wei-Hua; Khoo, Alan Soo-Beng; Liu, Wen-Sheng; Mo, Hao-Yuan; Pua, Kin-Choo; Teo, Soo-Hwang; Tse, Ka-Po; Xia, Yun-Fei; Zhang, Hongxin; Zhou, Gang-Qiao; Liu, Jian-Jun; Zeng, Yi-Xin; Hildesheim, Allan
2015-01-01
Background Genetic loci within the major histocompatibility complex (MHC) have been associated with nasopharyngeal carcinoma (NPC), an Epstein-Barr virus (EBV)-associated cancer, in several GWAS. Results outside this region have varied. Methods We conducted a meta-analysis of four NPC GWAS among Chinese individuals (2,152 cases;3,740 controls). 43 noteworthy findings outside the MHC region were identified and targeted for replication in a pooled analysis of 4 independent case-control studies across 3 regions in Asia (4,716 cases;5,379 controls). A meta-analysis that combined results from the initial GWA and replication studies was performed. Results In the combined meta-analysis, rs31489, located within the CLPTM1L/TERT region on chromosome 5p15.33, was strongly associated with NPC (OR=0.81;p-value 6.3*10−13). Our results also provide support for associations reported from published NPC GWAS - rs6774494 (p = 1.5*10−12;located in the MECOM gene region), rs9510787 (p = 5.0*10−10;located in the TNFRSF19 gene region), and rs1412829/rs4977756/rs1063192 (p = 2.8*10−8,p = 7.0*10−7,and p = 8.4*10−7 respectively;located in the CDKN2A/B gene region). Conclusion We have identified a novel association between genetic variation in the CLPTM1L/TERT region and NPC. Supporting our finding, rs31489 and other SNPs in this region have been reported to be associated with multiple cancer sites, candidate-based studies have reported associations between polymorphisms in this region and NPC, the TERT gene is important for telomere maintenance and has been reported to be over-expressed in NPC, and an EBV protein expressed in NPC (LMP1) modulates TERT expression/telomerase activity. Impact Our finding suggests that factors involved in telomere length maintenance are involved in NPC pathogenesis. PMID:26545403
Staley, James R; Jones, Edmund; Kaptoge, Stephen; Butterworth, Adam S; Sweeting, Michael J; Wood, Angela M; Howson, Joanna M M
2017-06-01
Logistic regression is often used instead of Cox regression to analyse genome-wide association studies (GWAS) of single-nucleotide polymorphisms (SNPs) and disease outcomes with cohort and case-cohort designs, as it is less computationally expensive. Although Cox and logistic regression models have been compared previously in cohort studies, this work does not completely cover the GWAS setting nor extend to the case-cohort study design. Here, we evaluated Cox and logistic regression applied to cohort and case-cohort genetic association studies using simulated data and genetic data from the EPIC-CVD study. In the cohort setting, there was a modest improvement in power to detect SNP-disease associations using Cox regression compared with logistic regression, which increased as the disease incidence increased. In contrast, logistic regression had more power than (Prentice weighted) Cox regression in the case-cohort setting. Logistic regression yielded inflated effect estimates (assuming the hazard ratio is the underlying measure of association) for both study designs, especially for SNPs with greater effect on disease. Given logistic regression is substantially more computationally efficient than Cox regression in both settings, we propose a two-step approach to GWAS in cohort and case-cohort studies: first, analyse all SNPs with logistic regression to identify associated variants below a pre-defined P-value threshold; second, fit Cox regression (appropriately weighted in case-cohort studies) to those identified SNPs to ensure accurate estimation of association with disease.
Lam, Max; Trampush, Joey W; Yu, Jin; Knowles, Emma; Davies, Gail; Liewald, David C; Starr, John M; Djurovic, Srdjan; Melle, Ingrid; Sundet, Kjetil; Christoforou, Andrea; Reinvang, Ivar; DeRosse, Pamela; Lundervold, Astri J; Steen, Vidar M; Espeseth, Thomas; Räikkönen, Katri; Widen, Elisabeth; Palotie, Aarno; Eriksson, Johan G; Giegling, Ina; Konte, Bettina; Roussos, Panos; Giakoumaki, Stella; Burdick, Katherine E; Payton, Antony; Ollier, William; Chiba-Falek, Ornit; Attix, Deborah K; Need, Anna C; Cirulli, Elizabeth T; Voineskos, Aristotle N; Stefanis, Nikos C; Avramopoulos, Dimitrios; Hatzimanolis, Alex; Arking, Dan E; Smyrnis, Nikolaos; Bilder, Robert M; Freimer, Nelson A; Cannon, Tyrone D; London, Edythe; Poldrack, Russell A; Sabb, Fred W; Congdon, Eliza; Conley, Emily Drabant; Scult, Matthew A; Dickinson, Dwight; Straub, Richard E; Donohoe, Gary; Morris, Derek; Corvin, Aiden; Gill, Michael; Hariri, Ahmad R; Weinberger, Daniel R; Pendleton, Neil; Bitsios, Panos; Rujescu, Dan; Lahti, Jari; Le Hellard, Stephanie; Keller, Matthew C; Andreassen, Ole A; Deary, Ian J; Glahn, David C; Malhotra, Anil K; Lencz, Todd
2017-11-28
Here, we present a large (n = 107,207) genome-wide association study (GWAS) of general cognitive ability ("g"), further enhanced by combining results with a large-scale GWAS of educational attainment. We identified 70 independent genomic loci associated with general cognitive ability. Results showed significant enrichment for genes causing Mendelian disorders with an intellectual disability phenotype. Competitive pathway analysis implicated the biological processes of neurogenesis and synaptic regulation, as well as the gene targets of two pharmacologic agents: cinnarizine, a T-type calcium channel blocker, and LY97241, a potassium channel inhibitor. Transcriptome-wide and epigenome-wide analysis revealed that the implicated loci were enriched for genes expressed across all brain regions (most strongly in the cerebellum). Enrichment was exclusive to genes expressed in neurons but not oligodendrocytes or astrocytes. Finally, we report genetic correlations between cognitive ability and disparate phenotypes including psychiatric disorders, several autoimmune disorders, longevity, and maternal age at first birth. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.
Local Genetic Correlation Gives Insights into the Shared Genetic Architecture of Complex Traits.
Shi, Huwenbo; Mancuso, Nicholas; Spendlove, Sarah; Pasaniuc, Bogdan
2017-11-02
Although genetic correlations between complex traits provide valuable insights into epidemiological and etiological studies, a precise quantification of which genomic regions disproportionately contribute to the genome-wide correlation is currently lacking. Here, we introduce ρ-HESS, a technique to quantify the correlation between pairs of traits due to genetic variation at a small region in the genome. Our approach requires GWAS summary data only and makes no distributional assumption on the causal variant effect sizes while accounting for linkage disequilibrium (LD) and overlapping GWAS samples. We analyzed large-scale GWAS summary data across 36 quantitative traits, and identified 25 genomic regions that contribute significantly to the genetic correlation among these traits. Notably, we find 6 genomic regions that contribute to the genetic correlation of 10 pairs of traits that show negligible genome-wide correlation, further showcasing the power of local genetic correlation analyses. Finally, we report the distribution of local genetic correlations across the genome for 55 pairs of traits that show putative causal relationships. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
GlobAl Distribution of GEnetic Traits (GADGET) web server: polygenic trait scores worldwide.
Chande, Aroon T; Wang, Lu; Rishishwar, Lavanya; Conley, Andrew B; Norris, Emily T; Valderrama-Aguirre, Augusto; Jordan, I King
2018-05-18
Human populations from around the world show striking phenotypic variation across a wide variety of traits. Genome-wide association studies (GWAS) are used to uncover genetic variants that influence the expression of heritable human traits; accordingly, population-specific distributions of GWAS-implicated variants may shed light on the genetic basis of human phenotypic diversity. With this in mind, we developed the GlobAl Distribution of GEnetic Traits web server (GADGET http://gadget.biosci.gatech.edu). The GADGET web server provides users with a dynamic visual platform for exploring the relationship between worldwide genetic diversity and the genetic architecture underlying numerous human phenotypes. GADGET integrates trait-implicated single nucleotide polymorphisms (SNPs) from GWAS, with population genetic data from the 1000 Genomes Project, to calculate genome-wide polygenic trait scores (PTS) for 818 phenotypes in 2504 individual genomes. Population-specific distributions of PTS are shown for 26 human populations across 5 continental population groups, with traits ordered based on the extent of variation observed among populations. Users of GADGET can also upload custom trait SNP sets to visualize global PTS distributions for their own traits of interest.
Poisson Approximation-Based Score Test for Detecting Association of Rare Variants.
Fang, Hongyan; Zhang, Hong; Yang, Yaning
2016-07-01
Genome-wide association study (GWAS) has achieved great success in identifying genetic variants, but the nature of GWAS has determined its inherent limitations. Under the common disease rare variants (CDRV) hypothesis, the traditional association analysis methods commonly used in GWAS for common variants do not have enough power for detecting rare variants with a limited sample size. As a solution to this problem, pooling rare variants by their functions provides an efficient way for identifying susceptible genes. Rare variants typically have low frequencies of minor alleles, and the distribution of the total number of minor alleles of the rare variants can be approximated by a Poisson distribution. Based on this fact, we propose a new test method, the Poisson Approximation-based Score Test (PAST), for association analysis of rare variants. Two testing methods, namely, ePAST and mPAST, are proposed based on different strategies of pooling rare variants. Simulation results and application to the CRESCENDO cohort data show that our methods are more powerful than the existing methods. © 2016 John Wiley & Sons Ltd/University College London.
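The Poisson approximation that the abstract relies on can be illustrated numerically. The sketch below is not the PAST statistic itself. It only compares the exact distribution of a sum of independent rare-allele indicators with a Poisson distribution of the same mean. The allele frequency (0.5%) and the number of allele draws (40) are hypothetical values chosen for illustration.

```python
import math

def exact_sum_pmf(probs):
    """PMF of a sum of independent Bernoulli variables, built by convolution."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - p)   # this allele draw is not the minor allele
            nxt[k + 1] += mass * p     # this allele draw is the minor allele
        pmf = nxt
    return pmf

def poisson_pmf(k, lam):
    """Poisson probability mass at k for rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Hypothetical pooled set: 40 allele draws, each a minor allele with probability 0.005.
probs = [0.005] * 40
lam = sum(probs)
exact = exact_sum_pmf(probs)
max_gap = max(abs(exact[k] - poisson_pmf(k, lam)) for k in range(len(exact)))
```

With frequencies this low, the largest pointwise gap between the two distributions is far below 0.002, consistent with Le Cam's bound, which limits the total variation distance by the sum of squared probabilities (0.001 here).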
Genetic architecture for susceptibility to gout in the KARE cohort study.
Shin, Jimin; Kim, Younyoung; Kong, Minyoung; Lee, Chaeyoung
2012-06-01
This study aimed to identify functional associations of cis-regulatory regions with gout susceptibility using data resulting from a genome-wide association study (GWAS), and to show a genetic architecture for gout with interaction effects among genes within each of the identified functions. The GWAS was conducted with 8314 control subjects and 520 patients with gout in the Korea Association REsource cohort. However, genetic associations with any individual nucleotide variants were not discovered by Bonferroni multiple testing in the GWAS (P>1.42 × 10(-7)). Genomic regions enrichment analysis was employed to identify functional associations of cis-regulatory regions. This analysis revealed several biological processes associated with gout susceptibility, and they were quite different from those with serum uric acid level. Epistasis for susceptibility to gout was estimated using entropy decomposition with selected genes within each biological process identified by the genomic regions enrichment analysis. Some epistases among nucleotide sequence variants for gout susceptibility were found to be larger than their individual effects. This study provided the first evidence that genetic factors for gout susceptibility greatly differed from those for serum uric acid level, which may suggest that research endeavors for identifying genetic factors for gout susceptibility should not be heavily dependent on pathogenesis of uric acid. Interaction effects between genes should be examined to explain a large portion of phenotypic variability for gout susceptibility.
2012-01-01
Background We performed a genome-wide association study (GWAS) to identify common risk variants for schizophrenia. Methods The discovery scan included 1606 patients and 1794 controls from Ireland, using 6,212,339 directly genotyped or imputed single nucleotide polymorphisms (SNPs). A subset of this sample (270 cases and 860 controls) was subsequently included in the Psychiatric GWAS Consortium-schizophrenia GWAS meta-analysis. Results One hundred eight SNPs were taken forward for replication in an independent sample of 13,195 cases and 31,021 control subjects. The most significant associations in discovery, corrected for genomic inflation (rs204999, p combined = 1.34 × 10−9), and in combined samples (rs2523722, p combined = 2.88 × 10−16) mapped to the major histocompatibility complex (MHC) region. We imputed classical human leukocyte antigen (HLA) alleles at the locus; the most significant finding was with HLA-C*01:02. This association was distinct from the top SNP signal. The HLA alleles DRB1*03:01 and B*08:01 were protective, replicating a previous study. Conclusions This study provides further support for involvement of MHC class I molecules in schizophrenia. We found evidence of association with previously reported risk alleles at the TCF4, VRK2, and ZNF804A loci. PMID:22883433
Goudey, Benjamin; Abedini, Mani; Hopper, John L; Inouye, Michael; Makalic, Enes; Schmidt, Daniel F; Wagner, John; Zhou, Zeyu; Zobel, Justin; Reumann, Matthias
2015-01-01
Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide polymorphisms (SNPs) which are associated with a given disease. Univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate a three-way interaction analysis on 1.1 million SNP GWAS data requiring over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. This is hundreds of times faster than estimates for other CPU based methods and four times faster than runtimes estimated for GPU methods, indicating how the improvement in the level of hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer "Sequoia" at the Lawrence Livermore National Laboratory assuming linear scaling is maintained as our results suggest. Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS.
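The scale of the exhaustive three-way analysis described in this abstract follows from simple combinatorics. The sketch below counts the SNP triples and derives the sustained throughput implied by the quoted 5.8-year estimate. The core counts used for the scaling step (65,536 for "Avoca" and 1,572,864 for "Sequoia") are assumptions not given in the abstract, and linear scaling is taken from the abstract's own caveat.

```python
import math

n_snps = 1_100_000
n_triples = math.comb(n_snps, 3)  # number of distinct three-SNP combinations

avoca_years = 5.8  # runtime estimate quoted in the abstract
seconds = avoca_years * 365.25 * 24 * 3600
triples_per_second = n_triples / seconds  # implied sustained throughput

# Assumed core counts (not stated in the abstract), with linear scaling.
avoca_cores = 65_536
sequoia_cores = 1_572_864
sequoia_months = avoca_years * 12 * avoca_cores / sequoia_cores

print(f"triples to test: {n_triples:.3e}")
print(f"implied throughput: {triples_per_second:.2e} triples per second")
print(f"scaled runtime: {sequoia_months:.1f} months")
```

With these assumed core counts the scaled estimate comes out just under three months, in line with the figure given in the abstract.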
Pardo, Luba M; Piras, Giovanna; Asproni, Rosanna; van der Gaag, Kristiaan J; Gabbas, Attilio; Ruiz-Linares, Andres; de Knijff, Peter; Monne, Maria; Rizzu, Patrizia; Heutink, Peter
2012-09-01
Sardinia has been used for genetic studies because of its historical isolation, genetic homogeneity and increased prevalence of certain rare diseases. Controversy remains concerning the genetic substructure and the extent of genetic homogeneity, which has implications for the design of genome-wide association studies (GWAS). We revisited this issue by examining the genetic make-up of a sample from North-East Sardinia using a dense set of autosomal, Y chromosome and mitochondrial markers to assess the potential of the sample for GWAS and fine mapping studies. We genotyped individuals for 500K single-nucleotide polymorphisms, Y chromosome markers and sequenced the mitochondrial hypervariable (HVI-HVII) regions. We identified major haplogroups and compared these with other populations. We estimated linkage disequilibrium (LD) and haplotype diversity across autosomal markers, and compared these with other populations. Our results show that within Sardinia there is no major population substructure and thus it can be considered a genetically homogenous population. We did not find substantial differences in the extent of LD in Sardinians compared with other populations. However, we showed that at least 9% of genomic regions in Sardinians differed in LD structure, which is helpful for identifying functional variants using fine mapping. We concluded that Sardinia is a powerful setting for genetic studies including GWAS and other mapping approaches.
Merriman, Tony R; Choi, Hyon K; Dalbeth, Nicola
2014-05-01
Gout results from deposition of monosodium urate (MSU) crystals. Elevated serum urate concentrations (hyperuricemia) are not sufficient for the development of disease. Genome-wide association studies (GWAS) have identified 28 loci controlling serum urate levels. The largest genetic effects are seen in genes involved in the renal excretion of uric acid, with others being involved in glycolysis. Whereas much is understood about the genetic control of serum urate levels, little is known about the genetic control of inflammatory responses to MSU crystals. Extending knowledge in this area depends on recruitment of large, clinically ascertained gout sample sets suitable for GWAS. Copyright © 2014 Elsevier Inc. All rights reserved.
Insights into the genetics of gastroesophageal reflux disease (GERD) and GERD-related disorders.
Böhmer, A C; Schumacher, J
2017-02-01
Gastroesophageal reflux disease (GERD) is associated with obesity and hiatal hernia, and often precedes the development of Barrett's esophagus (BE) and esophageal adenocarcinoma (EA). Epidemiological studies show that the global prevalence of GERD is increasing. GERD is a multifactorial disease with a complex genetic architecture. Genome-wide association studies (GWAS) have provided initial insights into the genetic background of GERD. The present review summarizes current knowledge of the genetics of GERD and a possible genetic overlap between GERD and BE and EA. The review discusses genes and cellular pathways that have been implicated through GWAS, and provides an outlook on how future molecular research will enhance understanding of GERD pathophysiology. © 2017 John Wiley & Sons Ltd.
An, Ping; Miljkovic, Iva; Thyagarajan, Bharat; Kraja, Aldi T; Daw, E Warwick; Pankow, James S; Selvin, Elizabeth; Kao, W H Linda; Maruthur, Nisa M; Nalls, Micahel A; Liu, Yongmei; Harris, Tamara B; Lee, Joseph H; Borecki, Ingrid B; Christensen, Kaare; Eckfeldt, John H; Mayeux, Richard; Perls, Thomas T; Newman, Anne B; Province, Michael A
2014-04-01
Glycated hemoglobin (HbA1c) is a stable index of chronic glycemic status and hyperglycemia associated with progressive development of insulin resistance and frank diabetes. It is also associated with premature aging and increased mortality. To uncover novel loci for HbA1c that are associated with healthy aging, we conducted a genome-wide association study (GWAS) using non-diabetic participants in the Long Life Family Study (LLFS), a study with familial clustering of exceptional longevity in the US and Denmark. A total of 4088 non-diabetic subjects from the LLFS were used for GWAS discoveries, and a total of 8231 non-diabetic subjects from the Atherosclerosis Risk in Communities Study (ARIC, in the MAGIC Consortium) and the Health, Aging, and Body Composition Study (HABC) were used for GWAS replications. HbA1c was adjusted for age, sex, centers, 20 principal components, without and with BMI. A linear mixed effects model was used for association testing. Two known loci at GCK rs730497 (or rs2908282) and HK1 rs17476364 were confirmed (p<5e-8). Of 25 suggestive (5e-8
USDA-ARS's Scientific Manuscript database
Fine-mapping of causal variants is becoming feasible for complex traits in livestock GWAS, as an increasing number of animals are sequenced. Imputation has been routinely applied to ascertain sequence variants in large genotyped populations based on small reference populations of sequenced animals. ...
"Good Work Awards:" Effects on Children's Families. Technical Report #12.
ERIC Educational Resources Information Center
Chun, Sherlyn; Mays, Violet
This brief report describes parental reaction to a reinforcement strategy used with children in the Kamehameha Early Education Program (KEEP). Staff members report that "Good Work Awards" (GWAs) are viewed favorably by mothers of students. GWAs are dittoed notes sent home with children when they have met a minimum criterion for daily…
Turuspekov, Yerlan; Baibulatova, Aida; Yermekbayev, Kanat; Tokhetova, Laura; Chudinov, Vladimir; Sereda, Grigoriy; Ganal, Martin; Griffiths, Simon; Abugalieva, Saule
2017-11-14
Spring wheat is the largest agricultural crop grown in Kazakhstan with an annual sowing area of 12 million hectares in 2016. Annually, the country harvests around 15 million tons of high quality grain. Despite environmental stress factors it is predicted that the use of new technologies may lead to increases in productivity from current levels of 1.5 to up to 3 tons per hectare. One way of improving wheat productivity is by the application of new genomic oriented approaches in plant breeding projects. Genome wide association studies (GWAS) are emerging as powerful tools for the understanding of the inheritance of complex traits via utilization of high throughput genotyping technologies and phenotypic assessments of plant collections. In this study, phenotyping and genotyping data on 194 spring wheat accessions from Kazakhstan, Russia, Europe, and CIMMYT were assessed for the identification of marker-trait associations (MTA) of agronomic traits by using GWAS. Field trials in Northern, Central and Southern regions of Kazakhstan using 194 spring wheat accessions revealed strong correlations of yield with booting date, plant height, biomass, number of spikes per plant, and number of kernels per spike. The accessions from Europe and CIMMYT showed high breeding potential for Southern and Central regions of the country in comparison with the performance of the local varieties. The GGE biplot method, using average yield per plant, suggested a clear separation of accessions into their three breeding origins in relationship to the three environments in which they were evaluated. The genetic variation in the three groups of accessions was further studied using 3245 polymorphic SNP (single nucleotide polymorphism) markers. The application of Principal Coordinate analysis clearly grouped the 194 accessions into three clades according to their breeding origins. GWAS on data from nine field trials allowed the identification of 114 MTAs for 12 different agronomic traits. 
Field evaluation of foreign germplasm revealed its poor yield performance in Northern Kazakhstan, which is the main wheat growing region in the country. However, it was found that EU and CIMMYT germplasm has high breeding potential to improve yield performance in Central and Southern regions. The use of Principal Coordinate analysis clearly separated the panel into three distinct groups according to their breeding origin. GWAS based on use of the TASSEL 5.0 package allowed the identification of 114 MTAs for twelve agronomic traits. The study identifies a network of key genes for improvement of yield productivity in wheat growing regions of Kazakhstan.
Identification of IL6R and chromosome 11q13.5 as risk loci for asthma.
Ferreira, Manuel A R; Matheson, Melanie C; Duffy, David L; Marks, Guy B; Hui, Jennie; Le Souëf, Peter; Danoy, Patrick; Baltic, Svetlana; Nyholt, Dale R; Jenkins, Mark; Hayden, Catherine; Willemsen, Gonneke; Ang, Wei; Kuokkanen, Mikko; Beilby, John; Cheah, Faang; de Geus, Eco J C; Ramasamy, Adaikalavan; Vedantam, Sailaja; Salomaa, Veikko; Madden, Pamela A; Heath, Andrew C; Hopper, John L; Visscher, Peter M; Musk, Bill; Leeder, Stephen R; Jarvelin, Marjo-Riitta; Pennell, Craig; Boomsma, Dorret I; Hirschhorn, Joel N; Walters, Haydn; Martin, Nicholas G; James, Alan; Jones, Graham; Abramson, Michael J; Robertson, Colin F; Dharmage, Shyamali C; Brown, Matthew A; Montgomery, Grant W; Thompson, Philip J
2011-09-10
We aimed to identify novel genetic variants affecting asthma risk, since these might provide novel insights into molecular mechanisms underlying the disease. We did a genome-wide association study (GWAS) in 2669 physician-diagnosed asthmatics and 4528 controls from Australia. Seven loci were prioritised for replication after combining our results with those from the GABRIEL consortium (n=26,475), and these were tested in an additional 25,358 independent samples from four in-silico cohorts. Quantitative multi-marker scores of genetic load were constructed on the basis of results from the GABRIEL study and tested for association with asthma in our Australian GWAS dataset. Two loci were confirmed to associate with asthma risk in the replication cohorts and reached genome-wide significance in the combined analysis of all available studies (n=57,800): rs4129267 (OR 1·09, combined p=2·4×10(-8)) in the interleukin-6 receptor (IL6R) gene and rs7130588 (OR 1·09, p=1·8×10(-8)) on chromosome 11q13.5 near the leucine-rich repeat containing 32 gene (LRRC32, also known as GARP). The 11q13.5 locus was significantly associated with atopic status among asthmatics (OR 1·33, p=7×10(-4)), suggesting that it is a risk factor for allergic but not non-allergic asthma. Multi-marker association results are consistent with a highly polygenic contribution to asthma risk, including loci with weak effects that might be shared with other immune-related diseases, such as NDFIP1, HLA-B, LPP, and BACH2. The IL6R association further supports the hypothesis that cytokine signalling dysregulation affects asthma risk, and raises the possibility that an IL6R antagonist (tocilizumab) may be effective to treat the disease, perhaps in a genotype-dependent manner. Results for the 11q13.5 locus suggest that it directly increases the risk of allergic sensitisation which, in turn, increases the risk of subsequent development of asthma. 
Larger or more functionally focused studies are needed to characterise the many loci with modest effects that remain to be identified for asthma. Funding: National Health and Medical Research Council of Australia. A full list of funding sources is provided in the webappendix. Copyright © 2011 Elsevier Ltd. All rights reserved.
Identification of IL6R and chromosome 11q13.5 as risk loci for asthma
Ferreira, Manuel A.R.; Matheson, Melanie C.; Duffy, David L.; Marks, Guy B.; Hui, Jennie; Le Souëf, Peter; Danoy, Patrick; Baltic, Svetlana; Nyholt, Dale R.; Jenkins, Mark; Hayden, Catherine; Willemsen, Gonneke; Ang, Wei; Kuokkanen, Mikko; Beilby, John; Cheah, Faang; de Geus, Eco J. C.; Ramasamy, Adaikalavan; Vedantam, Sailaja; Salomaa, Veikko; Madden, Pamela A.; Heath, Andrew C.; Hopper, John L.; Visscher, Peter M.; Musk, Bill; Leeder, Stephen R.; Jarvelin, Marjo-Riitta; Pennell, Craig; Boomsma, Dorret I.; Hirschhorn, Joel; Walters, Haydn; Martin, Nicholas G.; James, Alan; Jones, Graham; Abramson, Michael J.; Robertson, Colin F.; Dharmage, Shyamali C.; Brown, Matthew A.; Montgomery, Grant W.; Thompson, Philip J.
2012-01-01
Background We aimed to identify novel genetic variants affecting asthma risk, since these might provide novel insights into molecular mechanisms underlying asthma. Methods We performed a genome-wide association study (GWAS) in 2,669 physician-diagnosed asthmatics and 4,528 controls from Australia. Seven loci were prioritised for replication after combining our results with those from the GABRIEL consortium (n=26,475), and these were tested in an additional 25,358 independent samples from four in-silico cohorts. Quantitative multi-SNP scores of genetic load were constructed on the basis of results from the GABRIEL study and tested for association with asthma in our Australian GWAS dataset. Findings Two loci were confirmed to associate with asthma risk in the replication cohorts and reached genome-wide significance in the combined analysis of all available studies (n=57,800): rs4129267 (OR=1.09, combined P=2.4×10−8) in the interleukin-6 receptor gene (IL6R) and rs7130588 (OR=1.09, P=1.8×10−8) on chromosome 11q13.5 near the leucine-rich repeat containing 32 gene (LRRC32, also known as GARP). The 11q13.5 locus was significantly associated with atopic status among asthmatics (OR = 1.33, P = 7×10−4), suggesting that it is a risk factor for allergic but not non-allergic asthma. Multi-SNP association results are consistent with a highly polygenic contribution to asthma risk, including loci with weak effects that may be shared with other immune-related diseases, such as NDFIP1, HLA-B, LPP and BACH2. Interpretation The IL6R association further supports the hypothesis that cytokine signalling dysregulation affects asthma risk, and raises the possibility that an IL6R antagonist (tocilizumab) may be effective to treat the disease, perhaps in a genotype-dependent manner. Results for the 11q13.5 locus suggest that it directly increases the risk of allergic sensitisation which, in turn, increases the risk of subsequent development of asthma. 
Larger or more functionally focused studies are needed to characterise the many loci with modest effects that remain to be identified for asthma. Funding A full list of funding sources appears at the end of the paper. PMID:21907864
de Tayrac, Marie; Roth, Marie-Paule; Jouanolle, Anne-Marie; Coppin, Hélène; le Gac, Gérald; Piperno, Alberto; Férec, Claude; Pelucchi, Sara; Scotet, Virginie; Bardou-Jacquet, Edouard; Ropert, Martine; Bouvet, Régis; Génin, Emmanuelle; Mosser, Jean; Deugnier, Yves
2015-03-01
Hereditary hemochromatosis (HH) is the most common form of genetic iron loading disease. It is mainly related to the homozygous C282Y/C282Y mutation in the HFE gene that is, however, a necessary but not a sufficient condition to develop clinical and even biochemical HH. This suggests that modifier genes are likely involved in the expressivity of the disease. Our aim was to identify such modifier genes. We performed a genome-wide association study (GWAS) using DNA collected from 474 unrelated C282Y homozygotes. Associations were examined for both quantitative iron burden indices and clinical outcomes with 534,213 single nucleotide polymorphism (SNP) genotypes, with replication analyses in an independent sample of 748 C282Y homozygotes from four different European centres. One SNP met genome-wide statistical significance for association with transferrin concentration (rs3811647, GWAS p value of 7×10(-9) and replication p value of 5×10(-13)). This SNP, located within intron 11 of the TF gene, had a pleiotropic effect on serum iron (GWAS p value of 4.9×10(-6) and replication p value of 3.2×10(-6)). Both serum transferrin and iron levels were associated with serum ferritin levels, amount of iron removed and global clinical stage (p<0.01). Serum iron levels were also associated with fibrosis stage (p<0.0001). This GWAS, the largest one performed so far in unselected HFE-associated HH (HFE-HH) patients, identified the rs3811647 polymorphism in the TF gene as the only SNP significantly associated with iron metabolism through serum transferrin and iron levels. Because these two outcomes were clearly associated with the biochemical and clinical expression of the disease, an indirect link between the rs3811647 polymorphism and the phenotypic presentation of HFE-HH is likely. Copyright © 2014 European Association for the Study of the Liver. Published by Elsevier B.V. All rights reserved.
The GTPase Activating Rap/RanGAP Domain-Like 1 Gene Is Associated with Chicken Reproductive Traits
Shen, Xu; Zeng, Hua; Xie, Liang; He, Jun; Li, Jian; Xie, Xiujuan; Luo, Chenglong; Xu, Haiping; Zhou, Min; Nie, Qinghua; Zhang, Xiquan
2012-01-01
Background Abundant evidence indicates that chicken reproduction is strictly regulated by the hypothalamic-pituitary-gonad (HPG) axis, and the genes included in the HPG axis have been studied extensively. However, the question remains as to whether any other genes outside of the HPG system are involved in regulating chicken reproduction. The present study aimed to identify, on a genome-wide level, novel genes associated with chicken reproductive traits. Methodology/Principal Finding Suppressive subtractive hybridization (SSH), genome-wide association study (GWAS), and gene-centric GWAS were used to identify novel genes underlying chicken reproduction. Single marker-trait association analysis with a large population and allelic frequency spectrum analysis were used to confirm the effects of candidate genes. Using two full-sib Ningdu Sanhuang (NDH) chickens, GARNL1 was identified as a candidate gene involved in chicken broodiness by SSH analysis. Its expression levels in the hypothalamus and pituitary were significantly higher in brooding chickens than in non-brooding chickens. GWAS analysis with an NDH two-tail sample showed that 2802 SNPs were significantly associated with egg number at 300 d of age (EN300). Among the 2802 SNPs, 2 SNPs composed a block overlapping the GARNL1 gene. The gene-centric GWAS analysis with another two-tail sample of NDH showed that GARNL1 was strongly associated with EN300 and age at first egg (AFE). Single marker-trait association analysis in 1301 female NDH chickens confirmed that variation in this gene was related to EN300 and AFE. The allelic frequency spectrum of the SNP rs15700989 among 5 different populations supported the above associations. Western blotting, RT-PCR, and qPCR were used to analyze alternative splicing of the GARNL1 gene. RT-PCR detected 5 transcripts and revealed that the transcript, which has a 141 bp insertion, was expressed in a tissue-specific manner.
Conclusions/Significance Our findings demonstrate that the GARNL1 gene contributes to chicken reproductive traits. PMID:22496769
Cericola, Fabio; Jahoor, Ahmed; Orabi, Jihad; Andersen, Jeppe R; Janss, Luc L; Jensen, Just
2017-01-01
Wheat breeding programs generate a large amount of variation which cannot be completely explored because of limited phenotyping throughput. Genomic prediction (GP) has been proposed as a new tool which provides breeding value estimates without the need to phenotype all the material produced, but only a subset of it, named the training population (TP). However, genotyping of all the accessions under analysis is needed and, therefore, optimizing TP size and genotyping strategy is pivotal to implementing GP in commercial breeding schemes. Here, we explored the optimum TP size and we integrated pedigree records and genome wide association studies (GWAS) results to optimize the genotyping strategy. A total of 988 advanced wheat breeding lines were genotyped with the Illumina 15K SNPs wheat chip and phenotyped across several years and locations for yield, lodging, and starch content. Cross-validation using the largest possible TP size and all the SNPs available after editing (~11K) yielded predictive abilities (rGP) ranging from 0.5 to 0.6. In order to explore the training population size, rGP were computed using progressively smaller TP. These exercises showed that TP of around 700 lines were enough to yield the highest observed rGP. Moreover, rGP were calculated by randomly reducing the number of SNPs. This showed that around 1K markers were enough to reach the highest observed rGP. GWAS was used to identify markers associated with the traits analyzed. A GWAS-based selection of SNPs resulted in increased rGP when compared with random selection, and a few hundred SNPs were sufficient to obtain the highest observed rGP. For each of these scenarios, advantages of adding the pedigree information were shown. Our results indicate that moderate TP sizes were enough to yield high rGP and that pedigree information and GWAS results can be used to greatly optimize the genotyping strategy.
A hidden two-locus disease association pattern in genome-wide association studies
2011-01-01
Background Recent association analyses in genome-wide association studies (GWAS) mainly focus on single-locus association tests (marginal tests) and two-locus interaction detections. These analysis methods have provided strong evidence of associations between genetic variants and complex diseases. However, there exists a type of association pattern, which often occurs within local regions in the genome and is unlikely to be detected by either marginal tests or interaction tests. This association pattern involves a group of correlated single-nucleotide polymorphisms (SNPs). The correlation among SNPs can lead to weak marginal effects and the interaction does not play a role in this association pattern. This phenomenon is due to the existence of unfaithfulness: the marginal effects of correlated SNPs do not express their significant joint effects faithfully due to the correlation cancelation. Results In this paper, we develop a computational method to detect this association pattern masked by unfaithfulness. We have applied our method to analyze seven data sets from the Wellcome Trust Case Control Consortium (WTCCC). The analysis for each data set takes about one week to finish the examination of all pairs of SNPs. Based on the empirical result of these real data, we show that this type of association masked by unfaithfulness widely exists in GWAS. Conclusions These newly identified associations enrich the discoveries of GWAS, which may provide new insights both in the analysis of tagSNPs and in the experimental design of GWAS. Since these associations may be easily missed by existing analysis tools, we can only connect some of them to publicly available findings from other association studies. Because independent data sets are limited at present, we also had difficulty replicating these findings. More biological implications need further investigation. Availability The software is freely available at http://bioinformatics.ust.hk/hidden_pattern_finder.zip.
PMID:21569557
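The masking effect described in the abstract above can be reproduced in a small simulation. This is not the authors' detection method. It is a minimal sketch under simplifying assumptions: a quantitative trait instead of case-control status, ordinary least squares, and hypothetical haplotype frequencies and effect sizes. Two SNPs in strong linkage disequilibrium carry opposite effects, so each marginal signal is weak while the joint model explains far more variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000  # simulated individuals

# Hypothetical haplotypes for two SNPs in strong LD (allele correlation 0.9).
haplotypes = np.array([[0, 0], [1, 1], [0, 1], [1, 0]])
freqs = np.array([0.475, 0.475, 0.025, 0.025])

# Each individual carries two independently drawn haplotypes.
idx = rng.choice(4, size=(n, 2), p=freqs)
geno = haplotypes[idx].sum(axis=1)  # shape (n, 2), genotype counts 0/1/2
g1, g2 = geno[:, 0], geno[:, 1]

# Opposite effects on a quantitative trait (assumed effect sizes of +1 and -1).
y = 1.0 * g1 - 1.0 * g2 + rng.normal(size=n)

def r_squared(X, y):
    """Fraction of trait variance explained by a least-squares fit with intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

marginal = [r_squared(g1, y), r_squared(g2, y)]  # single-SNP models
joint = r_squared(geno, y)                       # two-SNP additive model

print(f"marginal R^2: {marginal[0]:.4f}, {marginal[1]:.4f}")
print(f"joint R^2:    {joint:.4f}")
```

No interaction term is involved: the joint model is purely additive, which matches the abstract's point that this pattern is missed by both marginal tests and interaction tests.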
Ahsan, Muhammad; Ek, Weronica E.; Karlsson, Torgny; Gyllensten, Ulf
2017-01-01
Associations between epigenetic alterations and disease status have been identified for many diseases. However, there is no strong evidence that epigenetic alterations are directly causal for disease pathogenesis. In this study, we combined SNP and DNA methylation data with measurements of protein biomarkers for cancer, inflammation or cardiovascular disease, to investigate the relative contribution of genetic and epigenetic variation on biomarker levels. A total of 121 protein biomarkers were measured and analyzed in relation to DNA methylation at 470,000 genomic positions and to over 10 million SNPs. We performed epigenome-wide association study (EWAS) and genome-wide association study (GWAS) analyses, and integrated biomarker, DNA methylation and SNP data using between 698 and 1033 samples depending on data availability for the different analyses. We identified 124 and 45 loci (Bonferroni adjusted P < 0.05) with effect sizes up to 0.22 standard units’ change per 1% change in DNA methylation levels and up to four standard units’ change per copy of the effective allele in the EWAS and GWAS respectively. Most GWAS loci were cis-regulatory whereas most EWAS loci were located in trans. Eleven EWAS loci were associated with multiple biomarkers, including one in NLRC5 associated with CXCL11, CXCL9, IL-12, and IL-18 levels. All EWAS signals that overlapped with a GWAS locus were driven by underlying genetic variants and three EWAS signals were confounded by smoking. While some cis-regulatory SNPs for biomarkers appeared to have an effect also on DNA methylation levels, cis-regulatory SNPs for DNA methylation were not observed to affect biomarker levels. We present associations between protein biomarker and DNA methylation levels at numerous loci in the genome. The associations are likely to reflect the underlying pattern of genetic variants, specific environmental exposures, or represent secondary effects to the pathogenesis of disease. PMID:28915241
Ellinghaus, David; Folseraas, Trine; Holm, Kristian; Ellinghaus, Eva; Melum, Espen; Balschun, Tobias; Laerdahl, Jon K; Shiryaev, Alexey; Gotthardt, Daniel N; Weismüller, Tobias J; Schramm, Christoph; Wittig, Michael; Bergquist, Annika; Björnsson, Einar; Marschall, Hanns-Ulrich; Vatn, Morten; Teufel, Andreas; Rust, Christian; Gieger, Christian; Wichmann, H-Erich; Runz, Heiko; Sterneck, Martina; Rupp, Christian; Braun, Felix; Weersma, Rinse K; Wijmenga, Cisca; Ponsioen, Cyriel Y; Mathew, Christopher G; Rutgeerts, Paul; Vermeire, Séverine; Schrumpf, Erik; Hov, Johannes R; Manns, Michael P; Boberg, Kirsten M; Schreiber, Stefan; Franke, Andre; Karlsen, Tom H
2013-09-01
Approximately 60%-80% of patients with primary sclerosing cholangitis (PSC) have concurrent ulcerative colitis (UC). Previous genome-wide association studies (GWAS) in PSC have detected a number of susceptibility loci that also show associations in UC and other immune-mediated diseases. We aimed to systematically compare genetic associations in PSC with genotype data in UC patients to detect new susceptibility loci for PSC. We performed combined analyses of GWAS for PSC and UC comprising 392 PSC cases, 987 UC cases, and 2,977 controls and followed up top association signals in an additional 1,012 PSC cases, 4,444 UC cases, and 11,659 controls. We discovered novel genome-wide significant associations with PSC at 2q37 [rs3749171 at G-protein-coupled receptor 35 (GPR35); P = 3.0 × 10(-9) in the overall study population, combined odds ratio [OR] and 95% confidence interval [CI] of 1.39 (1.24-1.55)] and at 18q21 [rs1452787 at transcription factor 4 (TCF4); P = 2.61 × 10(-8), OR (95% CI) = 0.75 (0.68-0.83)]. In addition, several suggestive PSC associations were detected. The GPR35 rs3749171 is a missense single nucleotide polymorphism resulting in a shift from threonine to methionine. Structural modeling showed that rs3749171 is located in the third transmembrane helix of GPR35 and could possibly alter efficiency of signaling through the GPR35 receptor. By refining the analysis of a PSC GWAS by parallel assessments in a UC GWAS, we were able to detect two novel risk loci at genome-wide significance levels. GPR35 shows associations in both UC and PSC, whereas TCF4 represents a PSC risk locus not associated with UC. Both loci may represent previously unexplored aspects of PSC pathogenesis. Copyright © 2012 American Association for the Study of Liver Diseases.
Re-Ranking Sequencing Variants in the Post-GWAS Era for Accurate Causal Variant Identification
Faye, Laura L.; Machiela, Mitchell J.; Kraft, Peter; Bull, Shelley B.; Sun, Lei
2013-01-01
Next generation sequencing has dramatically increased our ability to localize disease-causing variants by providing base-pair level information at costs increasingly feasible for the large sample sizes required to detect complex-trait associations. Yet, identification of causal variants within an established region of association remains a challenge. Counter-intuitively, certain factors that increase power to detect an associated region can decrease power to localize the causal variant. First, combining GWAS with imputation or low coverage sequencing to achieve the large sample sizes required for high power can have the unintended effect of producing differential genotyping error among SNPs. This tends to bias the relative evidence for association toward better genotyped SNPs. Second, re-use of GWAS data for fine-mapping exploits previous findings to ensure genome-wide significance in GWAS-associated regions. However, using GWAS findings to inform fine-mapping analysis can bias evidence away from the causal SNP toward the tag SNP and SNPs in high LD with the tag. Together these factors can reduce power to localize the causal SNP by more than half. Other strategies commonly employed to increase power to detect association, namely increasing sample size and using higher density genotyping arrays, can, in certain common scenarios, actually exacerbate these effects and further decrease power to localize causal variants. We develop a re-ranking procedure that accounts for these adverse effects and substantially improves the accuracy of causal SNP identification, often doubling the probability that the causal SNP is top-ranked. Application to the NCI BPC3 aggressive prostate cancer GWAS with imputation meta-analysis identified a new top SNP at 2 of 3 associated loci and several additional possible causal SNPs at these loci that may have otherwise been overlooked. This method is simple to implement using R scripts provided on the author's website. PMID:23950724
Characterizing Genetic Risk at Known Prostate Cancer Susceptibility Loci in African Americans
Haiman, Christopher A.; Chen, Gary K.; Blot, William J.; Strom, Sara S.; Berndt, Sonja I.; Kittles, Rick A.; Rybicki, Benjamin A.; Isaacs, William B.; Ingles, Sue A.; Stanford, Janet L.; Diver, W. Ryan; Witte, John S.; Chanock, Stephen J.; Kolb, Suzanne; Signorello, Lisa B.; Yamamura, Yuko; Neslund-Dudas, Christine; Thun, Michael J.; Murphy, Adam; Casey, Graham; Sheng, Xin; Wan, Peggy; Pooler, Loreall C.; Monroe, Kristine R.; Waters, Kevin M.; Le Marchand, Loic; Kolonel, Laurence N.; Stram, Daniel O.; Henderson, Brian E.
2011-01-01
GWAS of prostate cancer have been remarkably successful in revealing common genetic variants and novel biological pathways that are linked with its etiology. A more complete understanding of inherited susceptibility to prostate cancer in the general population will come from continuing such discovery efforts and from testing known risk alleles in diverse racial and ethnic groups. In this large study of prostate cancer in African American men (3,425 prostate cancer cases and 3,290 controls), we tested 49 risk variants located in 28 genomic regions identified through GWAS in men of European and Asian descent, and we replicated associations (at p≤0.05) with roughly half of these markers. Through fine-mapping, we identified nearby markers in many regions that better define associations in African Americans. At 8q24, we found 9 variants (p≤6×10−4) that best capture risk of prostate cancer in African Americans, many of which are more common in men of African than European descent. The markers found to be associated with risk at each locus improved risk modeling in African Americans (per allele OR = 1.17) over the alleles reported in the original GWAS (OR = 1.08). In summary, in this detailed analysis of the prostate cancer risk loci reported from GWAS, we have validated and improved upon markers of risk in some regions that better define the association with prostate cancer in African Americans. Our findings with variants at 8q24 also reinforce the importance of this region as a major risk locus for prostate cancer in men of African ancestry. PMID:21637779
Litchfield, K; Shipley, J; Turnbull, C
2015-01-01
Testicular germ cell tumour (TGCT) is the most common cancer in young men (aged 15-45 years) in many populations. Multiple genome-wide association studies (GWAS) of TGCT have now been conducted, yielding over 25 disease-associated single-nucleotide polymorphisms (SNPs) at 19 independent loci. The genes at these loci have provided rich biological and genetic insight into possible mechanisms underlying testicular germ cell oncogenesis. In this review, we summarize these mechanisms, which can be grouped into five distinct categories: KIT/KITLG signalling, other pathways of male germ cell development/differentiation, telomerase function, microtubule assembly and DNA damage repair. The TGCT risk markers identified through GWAS include individual SNPs carrying per allele odds ratios (OR) in excess of 2.5. These ORs are among the highest reported in GWAS of any cancer type, suggesting potential clinical utility in risk determination. Here, we present analysis of such an approach, using polygenic risk scores to calculate the combined effect of all risk loci on overall TGCT risk, and discuss how a potential screening strategy may fit within a broader clinical context. © 2015 American Society of Andrology and European Academy of Andrology.
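As an illustration of how a polygenic risk score can combine per-allele odds ratios, the following is a minimal Python sketch. It assumes a log-additive (multiplicative) model in which each risk allele contributes log(OR) to the score. That model, the function name, and the example odds ratios and genotypes are all assumptions made for illustration and are not taken from the review.

```python
import math

def polygenic_risk_score(risk_allele_counts, odds_ratios):
    """Sum of risk-allele counts (0, 1 or 2) weighted by log(OR).

    Assumes a log-additive model; this is an illustrative choice,
    not necessarily the scoring scheme used in the review.
    """
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is needed per SNP")
    return sum(count * math.log(odds)
               for count, odds in zip(risk_allele_counts, odds_ratios))

# Hypothetical per-allele odds ratios for three SNPs (not real TGCT estimates).
example_ors = [2.5, 1.3, 1.2]
# Hypothetical genotype: two risk alleles at SNP 1, one at SNP 2, none at SNP 3.
example_genotype = [2, 1, 0]

score = polygenic_risk_score(example_genotype, example_ors)
# Odds relative to a person carrying no risk alleles at these SNPs.
relative_odds = math.exp(score)
print(f"score = {score:.3f}, relative odds = {relative_odds:.3f}")
```

Exponentiating the score gives the odds relative to a carrier of no risk alleles, which is why a few SNPs with large per-allele ORs can dominate the combined estimate.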
Influence of the LILRA3 Deletion on Multiple Sclerosis Risk: Original Data and Meta-Analysis
Ortiz, Miguel A.; Núñez, Concepción; Ordóñez, David; Alvarez-Cermeño, José C.; Martínez-Rodriguez, José E.; Sánchez, Antonio J.; Arroyo, Rafael; Izquierdo, Guillermo; Malhotra, Sunny; Montalban, Xavier; García-Merino, Antonio; Munteis, Elvira; Alcina, Antonio; Comabella, Manuel; Matesanz, Fuencisla
2015-01-01
Background Multiple sclerosis (MS) is a neurodegenerative, autoimmune disease of the central nervous system. Genome-wide association studies (GWAS) have identified over a hundred polymorphisms with modest individual effects on MS susceptibility and have confirmed the main individual effect of the Major Histocompatibility Complex. Additional risk loci with immunologically relevant genes were found to be significantly overrepresented. Nonetheless, it is accepted that most of the genetic architecture underlying susceptibility to the disease remains to be defined. Candidate association studies of the leukocyte immunoglobulin-like receptor LILRA3 gene in MS have been repeatedly reported with inconsistent results. Objectives In an attempt to shed some light on these controversial findings, a combined analysis was performed including the previously published datasets and three newly genotyped cohorts. Both wild-type and deleted LILRA3 alleles were discriminated in a single-tube PCR amplification and the resulting products were visualized by their different electrophoretic mobilities. Results and Conclusion Overall, this meta-analysis involved 3200 MS patients and 3069 matched healthy controls and showed no significant association of the LILRA3 deletion [carriers of LILRA3 deletion: p = 0.25, OR (95% CI) = 1.07 (0.95–1.19)], even after stratification by gender and the HLA-DRB1*15:01 risk allele. PMID:26274821
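To make the reported "OR (95% CI)" notation concrete, here is a minimal Python sketch that computes an odds ratio and a Woolf (log-based) confidence interval from a single 2x2 table of carriers and non-carriers. The counts are hypothetical, and the abstract describes a pooled analysis across cohorts, so this single-table calculation is a simplification rather than the authors' procedure.

```python
import math

def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio with a Woolf confidence interval from a 2x2 table.

    z = 1.96 gives an approximate 95% interval. All four cells must be > 0.
    """
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers)
    # Standard error of log(OR) under Woolf's approximation.
    se_log_or = math.sqrt(1 / case_carriers + 1 / case_noncarriers
                          + 1 / control_carriers + 1 / control_noncarriers)
    log_or = math.log(odds_ratio)
    return (odds_ratio,
            math.exp(log_or - z * se_log_or),
            math.exp(log_or + z * se_log_or))

# Hypothetical counts, chosen only to show the calculation.
estimate, lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that contains 1, as in the reported 0.95–1.19, indicates that the data are compatible with no association at the chosen confidence level.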
Common variants on chromosome 6p22.1 are associated with schizophrenia
Shi, Jianxin; Levinson, Douglas F.; Duan, Jubao; Sanders, Alan R.; Zheng, Yonglan; Pe'er, Itsik; Dudbridge, Frank; Holmans, Peter A.; Whittemore, Alice S.; Mowry, Bryan J.; Olincy, Ann; Amin, Farooq; Cloninger, C. Robert; Silverman, Jeremy M.; Buccola, Nancy G.; Byerley, William F.; Black, Donald W.; Crowe, Raymond R.; Oksenberg, Jorge R.; Mirel, Daniel B.; Kendler, Kenneth S.; Freedman, Robert; Gejman, Pablo V.
2009-01-01
Schizophrenia, a devastating psychiatric disorder, has a prevalence of 0.5–1%, with high heritability (80–85%) and complex transmission.1 Recent studies implicate rare, large, high-penetrance copy number variants (CNVs) in some cases2, but it is not known what genes or biological mechanisms underlie susceptibility. Here we show that schizophrenia is significantly associated with single nucleotide polymorphisms (SNPs) in the extended Major Histocompatibility Complex (MHC) region on chromosome 6. We carried out a genome-wide association study (GWAS) of common SNPs in the Molecular Genetics of Schizophrenia (MGS) case-control sample, and then a meta-analysis of data from the MGS, International Schizophrenia Consortium (ISC) and SGENE datasets. No MGS finding achieved genome-wide statistical significance. In the meta-analysis of European-ancestry subjects (8,008 cases, 19,077 controls), significant association with schizophrenia was observed in a region of linkage disequilibrium on chromosome 6p22.1 (P = 9.54 × 10−9). This region includes a histone gene cluster and several immunity-related genes, possibly implicating etiologic mechanisms involving chromatin modification, transcriptional regulation, auto-immunity and/or infection. These results demonstrate that common schizophrenia susceptibility alleles can be detected. The characterization of these signals will suggest important directions for research on susceptibility mechanisms. PMID:19571809
A genome-wide investigation of food addiction.
Cornelis, Marilyn C; Flint, Alan; Field, Alison E; Kraft, Peter; Han, Jiali; Rimm, Eric B; van Dam, Rob M
2016-06-01
Evidence of parallels between drug addiction and eating behavior continues to accumulate. Genetic studies of addictive substances have yielded a number of susceptibility loci that point to common higher order genetic pathways underlying addiction. It was hypothesized that a genome-wide association study (GWAS) of food addiction would yield significant enrichment in genes and pathways linked to addiction. A GWAS of food addiction, determined by the modified Yale Food Addiction Scale (mYFAS), was conducted among 9,314 women of European ancestry, and results for enrichment of single-nucleotide polymorphisms (SNPs) (n = 44), genes (n = 238), and pathways (n = 11) implicated in drug addiction were examined. Two loci met GW-significance (P < 2.5 × 10(-8)) mapping to 17q21.31 and 11q13.4 that harbor genes with no obvious roles in eating behavior. GW results were significantly enriched for gene members of the MAPK signaling pathway (P = 0.02). No candidate SNP or gene for drug addiction was significantly associated with food addiction after correction for multiple testing. In the first GWAS of mYFAS, suggestive loci worthy of further follow-up were identified, but limited support was provided for shared genetic underpinnings of food addiction and drug addiction. The latter might be due to limited study power and knowledge of the genetics of drug addiction. © 2016 The Obesity Society.
Hou, Liping; Bergen, Sarah E.; Akula, Nirmala; Song, Jie; Hultman, Christina M.; Landén, Mikael; Adli, Mazda; Alda, Martin; Ardau, Raffaella; Arias, Bárbara; Aubry, Jean-Michel; Backlund, Lena; Badner, Judith A.; Barrett, Thomas B.; Bauer, Michael; Baune, Bernhard T.; Bellivier, Frank; Benabarre, Antonio; Bengesser, Susanne; Berrettini, Wade H.; Bhattacharjee, Abesh Kumar; Biernacka, Joanna M.; Birner, Armin; Bloss, Cinnamon S.; Brichant-Petitjean, Clara; Bui, Elise T.; Byerley, William; Cervantes, Pablo; Chillotti, Caterina; Cichon, Sven; Colom, Francesc; Coryell, William; Craig, David W.; Cruceanu, Cristiana; Czerski, Piotr M.; Davis, Tony; Dayer, Alexandre; Degenhardt, Franziska; Del Zompo, Maria; DePaulo, J. Raymond; Edenberg, Howard J.; Étain, Bruno; Falkai, Peter; Foroud, Tatiana; Forstner, Andreas J.; Frisén, Louise; Frye, Mark A.; Fullerton, Janice M.; Gard, Sébastien; Garnham, Julie S.; Gershon, Elliot S.; Goes, Fernando S.; Greenwood, Tiffany A.; Grigoroiu-Serbanescu, Maria; Hauser, Joanna; Heilbronner, Urs; Heilmann-Heimbach, Stefanie; Herms, Stefan; Hipolito, Maria; Hitturlingappa, Shashi; Hoffmann, Per; Hofmann, Andrea; Jamain, Stephane; Jiménez, Esther; Kahn, Jean-Pierre; Kassem, Layla; Kelsoe, John R.; Kittel-Schneider, Sarah; Kliwicki, Sebastian; Koller, Daniel L.; König, Barbara; Lackner, Nina; Laje, Gonzalo; Lang, Maren; Lavebratt, Catharina; Lawson, William B.; Leboyer, Marion; Leckband, Susan G.; Liu, Chunyu; Maaser, Anna; Mahon, Pamela B.; Maier, Wolfgang; Maj, Mario; Manchia, Mirko; Martinsson, Lina; McCarthy, Michael J.; McElroy, Susan L.; McInnis, Melvin G.; McKinney, Rebecca; Mitchell, Philip B.; Mitjans, Marina; Mondimore, Francis M.; Monteleone, Palmiero; Mühleisen, Thomas W.; Nievergelt, Caroline M.; Nöthen, Markus M.; Novák, Tomas; Nurnberger, John I.; Nwulia, Evaristus A.; Ösby, Urban; Pfennig, Andrea; Potash, James B.; Propping, Peter; Reif, Andreas; Reininghaus, Eva; Rice, John; Rietschel, Marcella; Rouleau, Guy A.; Rybakowski, 
Janusz K.; Schalling, Martin; Scheftner, William A.; Schofield, Peter R.; Schork, Nicholas J.; Schulze, Thomas G.; Schumacher, Johannes; Schweizer, Barbara W.; Severino, Giovanni; Shekhtman, Tatyana; Shilling, Paul D.; Simhandl, Christian; Slaney, Claire M.; Smith, Erin N.; Squassina, Alessio; Stamm, Thomas; Stopkova, Pavla; Streit, Fabian; Strohmaier, Jana; Szelinger, Szabolcs; Tighe, Sarah K.; Tortorella, Alfonso; Turecki, Gustavo; Vieta, Eduard; Volkert, Julia; Witt, Stephanie H.; Wright, Adam; Zandi, Peter P.; Zhang, Peng; Zollner, Sebastian; McMahon, Francis J.
2016-01-01
Bipolar disorder (BD) is a genetically complex mental illness characterized by severe oscillations of mood and behaviour. Genome-wide association studies (GWAS) have identified several risk loci that together account for a small portion of the heritability. To identify additional risk loci, we performed a two-stage meta-analysis of >9 million genetic variants in 9,784 bipolar disorder patients and 30,471 controls, the largest GWAS of BD to date. In this study, to increase power we used ∼2,000 lithium-treated cases with a long-term diagnosis of BD from the Consortium on Lithium Genetics, excess controls, and analytic methods optimized for markers on the X-chromosome. In addition to four known loci, results revealed genome-wide significant associations at two novel loci: an intergenic region on 9p21.3 (rs12553324, P = 5.87 × 10−9; odds ratio (OR) = 1.12) and markers within ERBB2 (rs2517959, P = 4.53 × 10−9; OR = 1.13). No significant X-chromosome associations were detected and X-linked markers explained very little BD heritability. The results add to a growing list of common autosomal variants involved in BD and illustrate the power of comparing well-characterized cases to an excess of controls in GWAS. PMID:27329760
Genome-wide and gene-based association implicates FRMD6 in Alzheimer disease.
Hong, Mun-Gwan; Reynolds, Chandra A; Feldman, Adina L; Kallin, Mikael; Lambert, Jean-Charles; Amouyel, Philippe; Ingelsson, Erik; Pedersen, Nancy L; Prince, Jonathan A
2012-03-01
Genome-wide association studies (GWAS) that allow for allelic heterogeneity may facilitate the discovery of novel genes not detectable by models that require replication of a single variant site. One strategy to accomplish this is to focus on genes rather than markers as units of association, and so potentially capture a spectrum of causal alleles that differ across populations. Here, we conducted a GWAS of Alzheimer disease (AD) in 2,586 Swedes and performed gene-based meta-analysis with three additional studies from France, Canada, and the United States, in total encompassing 4,259 cases and 8,284 controls. Implementing a newly designed gene-based algorithm, we identified two loci apart from the region around APOE that achieved study-wide significance in combined samples, the strongest finding being for FRMD6 on chromosome 14q (P = 2.6 × 10(-14)) and a weaker signal for NARS2 that is immediately adjacent to GAB2 on chromosome 11q (P = 7.8 × 10(-9)). Ontology-based pathway analyses revealed significant enrichment of genes involved in glycosylation. Results suggest that gene-based approaches that accommodate allelic heterogeneity in GWAS can provide a complementary avenue for gene discovery and may help to explain a portion of the missing heritability not detectable with single nucleotide polymorphisms (SNPs) derived from marker-specific meta-analysis. © 2011 Wiley Periodicals, Inc.
Genome-wide association study of rust traits in orchardgrass using SLAF-seq technology.
Zeng, Bing; Yan, Haidong; Liu, Xinchun; Zang, Wenjing; Zhang, Ailing; Zhou, Sifan; Huang, Linkai; Liu, Jinping
2017-01-01
While orchardgrass (Dactylis glomerata L.) is a well-known perennial forage species, rust diseases cause serious reductions in its yield and quality; however, the genetic mechanisms of rust resistance in orchardgrass are not well understood. In this study, a genome-wide association study (GWAS) was performed using specific-locus amplified fragment sequencing (SLAF-seq) technology in orchardgrass. A total of 2,334,889 SLAF tags were generated to produce 2,309,777 SNPs. ADMIXTURE analysis revealed unstructured subpopulations for 33 accessions, indicating that this orchardgrass population could be used for association analysis. Linkage disequilibrium (LD) analysis revealed an average r² of 0.4 across all SNP pairs, indicating a high extent of LD in these samples. Through GWAS, a total of 4,604 SNPs were found to be significantly (P < 0.01) associated with the rust trait. The bulk analysis identified 5,211 SNPs related to the rust trait. Two candidate genes, cytochrome P450 and prolamin, were implicated in disease resistance through prediction of functional genes surrounding each high-quality SNP (P < 0.01) associated with rust traits based on GWAS analysis and bulk analysis. The large number of SNPs associated with rust traits and these two candidate genes may provide the basis for further research on rust resistance mechanisms and marker-assisted selection (MAS) for rust-resistant lineages.
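The pairwise LD statistic r² mentioned above can be sketched as follows. This minimal Python example assumes phased, biallelic haplotypes coded 0/1, which is a simplifying assumption: the abstract does not say how LD was estimated, and pipelines working from unphased genotypes use different estimators. The function name and example data are hypothetical.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic SNPs from phased haplotypes.

    `haplotypes` is a list of (allele_at_snp_a, allele_at_snp_b) pairs,
    each allele coded 0 or 1. Uses r^2 = D^2 / (pA(1-pA) pB(1-pB)),
    where D = pAB - pA * pB.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    d = p_ab - p_a * p_b
    return d * d / denominator

# Hypothetical haplotypes: alleles always co-occur (complete LD).
print(ld_r_squared([(1, 1), (1, 1), (0, 0), (0, 0)]))
# Hypothetical haplotypes: alleles are independent (no LD).
print(ld_r_squared([(1, 1), (1, 0), (0, 1), (0, 0)]))
```

Averaging this quantity over SNP pairs gives a summary comparable in spirit to the average r² quoted in the abstract.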
Gene-Environment Interactions in Asthma: Genetic and Epigenetic Effects.
Lee, Jong-Uk; Kim, Jeong Dong; Park, Choon-Sik
2015-07-01
Over the past three decades, a large number of genetic studies have been aimed at finding genetic variants associated with the risk of asthma, applying various genetic and genomic approaches including linkage analysis, candidate gene polymorphism studies, and genome-wide association studies (GWAS). However, contrary to general expectation, even single nucleotide polymorphisms (SNPs) discovered by GWAS failed to fully explain the heritability of asthma. Thus, application of rare allele polymorphisms in well-defined phenotypes and clarification of environmental factors have been suggested to overcome the problem of 'missing' heritability. Such factors include allergens, cigarette smoke, air pollutants, and infectious agents during pre- and post-natal periods. The first and simplest interaction between a gene and the environment is a candidate interaction of both a well-known gene and environmental factor in a direct physical or chemical interaction such as between CD14 and endotoxin or between HLA and allergens. Several GWAS have found environmental interactions with occupational asthma, aspirin exacerbated respiratory disease, tobacco smoke-related airway dysfunction, and farm-related atopic diseases. As one of the mechanisms behind gene-environment interaction is epigenetics, a few studies on DNA CpG methylation have been reported on subphenotypes of asthma, pitching the exciting idea that it may be possible to intervene at the junction between the genome and the environment. Epigenetic studies are starting to include data from clinical samples, which will make them another powerful tool for research on gene-environment interactions in asthma.
Mansour, Hader A; Talkowski, Michael E; Wood, Joel; Chowdari, Kodavali V; McClain, Lora; Prasad, Konasale; Montrose, Debra; Fagiolini, Andrea; Friedman, Edward S; Allen, Michael H; Bowden, Charles L; Calabrese, Joseph; El-Mallakh, Rif S; Escamilla, Michael; Faraone, Stephen V; Fossey, Mark D; Gyulai, Laszlo; Loftis, Jennifer M; Hauser, Peter; Ketter, Terence A; Marangell, Lauren B; Miklowitz, David J; Nierenberg, Andrew A; Patel, Jayendra; Sachs, Gary S; Sklar, Pamela; Smoller, Jordan W; Laird, Nan; Keshavan, Matcheri; Thase, Michael E; Axelson, David; Birmaher, Boris; Lewis, David; Monk, Tim; Frank, Ellen; Kupfer, David J; Devlin, Bernie; Nimgaonkar, Vishwajit L
2012-01-01
Objective Published studies suggest associations between circadian gene polymorphisms and bipolar I disorder (BPI), as well as schizoaffective disorder (SZA) and schizophrenia (SZ). The results are plausible, based on prior studies of circadian abnormalities. As replications have not been attempted uniformly, we evaluated representative, common polymorphisms in all three disorders. Methods We assayed 276 publicly available ‘tag’ single nucleotide polymorphisms (SNPs) at 21 circadian genes among 523 patients with BPI, 527 patients with SZ/SZA, and 477 screened adult controls. Detected associations were evaluated in relation to two published genome-wide association studies (GWAS). Results Using gene-based tests, suggestive associations were noted between EGR3 and BPI (p = 0.017), and between NPAS2 and SZ/SZA (p = 0.034). Three SNPs were associated with both sets of disorders (NPAS2: rs13025524 and rs11123857; RORB: rs10491929; p < 0.05). None of the associations remained significant following corrections for multiple comparisons. Approximately 15% of the analyzed SNPs overlapped with an independent study that conducted GWAS for BPI; suggestive overlap between the GWAS analyses and ours was noted at ARNTL. Conclusions Several suggestive, novel associations were detected with circadian genes and BPI and SZ/SZA, but the present analyses do not support associations with common polymorphisms that confer risk with odds ratios greater than 1.5. Additional analyses using adequately powered samples are warranted to further evaluate these results. PMID:19839995
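The statement that no association survived correction for multiple comparisons can be illustrated with a short Python sketch. It assumes a Bonferroni correction over the 21 genes tested; the abstract does not name the correction method, so both the method and the choice of 21 as the number of tests are assumptions for illustration.

```python
def bonferroni_significant(p_values, n_tests, alpha=0.05):
    """Return the Bonferroni threshold and a significance flag per p-value.

    n_tests is the total number of hypotheses tested, which may exceed
    the number of p-values passed in.
    """
    threshold = alpha / n_tests
    return threshold, [p <= threshold for p in p_values]

# Gene-based p-values quoted in the abstract (EGR3 with BPI, NPAS2 with SZ/SZA),
# corrected for an assumed 21 gene-based tests.
threshold, flags = bonferroni_significant([0.017, 0.034], n_tests=21)
print(f"threshold = {threshold:.5f}, significant = {flags}")
```

Under these assumptions the threshold is about 0.0024, so neither gene-based result would remain significant, which is consistent with the abstract's conclusion.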
SQC: secure quality control for meta-analysis of genome-wide association studies.
Huang, Zhicong; Lin, Huang; Fellay, Jacques; Kutalik, Zoltán; Hubaux, Jean-Pierre
2017-08-01
Due to the limited power of small-scale genome-wide association studies (GWAS), researchers tend to collaborate and establish a larger consortium in order to perform large-scale GWAS. Genome-wide association meta-analysis (GWAMA) is a statistical tool that aims to synthesize results from multiple independent studies to increase the statistical power and reduce false-positive findings of GWAS. However, it has been demonstrated that the aggregate data of individual studies are subject to inference attacks, hence privacy concerns arise when researchers share study data in GWAMA. In this article, we propose a secure quality control (SQC) protocol, which enables checking the quality of data in a privacy-preserving way without revealing sensitive information to a potential adversary. SQC employs state-of-the-art cryptographic and statistical techniques for privacy protection. We implement the solution in a meta-analysis pipeline with real data to demonstrate the efficiency and scalability on commodity machines. The distributed execution of SQC on a cluster of 128 cores for one million genetic variants takes less than one hour, which is a modest cost considering the 10-month time span usually observed for the completion of the QC procedure that includes timing of logistics. SQC is implemented in Java and is publicly available at https://github.com/acs6610987/secureqc. Contact: jean-pierre.hubaux@epfl.ch. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved.
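The idea of synthesizing results from multiple independent studies can be illustrated with a fixed-effect, inverse-variance weighted meta-analysis of one variant. This Python sketch is a generic textbook calculation, not the SQC protocol and not code from the linked repository; the function name and the per-study effect sizes are hypothetical.

```python
import math

def fixed_effect_meta(betas, std_errs):
    """Inverse-variance weighted pooled effect for a single variant.

    Each study contributes weight 1 / SE^2. Returns the pooled effect,
    its standard error, and the corresponding z-score.
    """
    if len(betas) != len(std_errs):
        raise ValueError("one standard error is needed per effect estimate")
    weights = [1.0 / (se * se) for se in std_errs]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se

# Hypothetical per-study estimates for one SNP (not real data).
beta, se, z = fixed_effect_meta([0.20, 0.15, 0.25], [0.10, 0.08, 0.12])
print(f"pooled beta = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}")
```

The pooled standard error is smaller than that of any single study, which is the sense in which meta-analysis increases statistical power.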
Ng, Maggie C Y; Graff, Mariaelisa; Lu, Yingchang; Justice, Anne E; Mudgal, Poorva; Liu, Ching-Ti; Young, Kristin; Yanek, Lisa R; Feitosa, Mary F; Wojczynski, Mary K; Rand, Kristin; Brody, Jennifer A; Cade, Brian E; Dimitrov, Latchezar; Duan, Qing; Guo, Xiuqing; Lange, Leslie A; Nalls, Michael A; Okut, Hayrettin; Tajuddin, Salman M; Tayo, Bamidele O; Vedantam, Sailaja; Bradfield, Jonathan P; Chen, Guanjie; Chen, Wei-Min; Chesi, Alessandra; Irvin, Marguerite R; Padhukasahasram, Badri; Smith, Jennifer A; Zheng, Wei; Allison, Matthew A; Ambrosone, Christine B; Bandera, Elisa V; Bartz, Traci M; Berndt, Sonja I; Bernstein, Leslie; Blot, William J; Bottinger, Erwin P; Carpten, John; Chanock, Stephen J; Chen, Yii-Der Ida; Conti, David V; Cooper, Richard S; Fornage, Myriam; Freedman, Barry I; Garcia, Melissa; Goodman, Phyllis J; Hsu, Yu-Han H; Hu, Jennifer; Huff, Chad D; Ingles, Sue A; John, Esther M; Kittles, Rick; Klein, Eric; Li, Jin; McKnight, Barbara; Nayak, Uma; Nemesure, Barbara; Ogunniyi, Adesola; Olshan, Andrew; Press, Michael F; Rohde, Rebecca; Rybicki, Benjamin A; Salako, Babatunde; Sanderson, Maureen; Shao, Yaming; Siscovick, David S; Stanford, Janet L; Stevens, Victoria L; Stram, Alex; Strom, Sara S; Vaidya, Dhananjay; Witte, John S; Yao, Jie; Zhu, Xiaofeng; Ziegler, Regina G; Zonderman, Alan B; Adeyemo, Adebowale; Ambs, Stefan; Cushman, Mary; Faul, Jessica D; Hakonarson, Hakon; Levin, Albert M; Nathanson, Katherine L; Ware, Erin B; Weir, David R; Zhao, Wei; Zhi, Degui; Arnett, Donna K; Grant, Struan F A; Kardia, Sharon L R; Oloapde, Olufunmilayo I; Rao, D C; Rotimi, Charles N; Sale, Michele M; Williams, L Keoki; Zemel, Babette S; Becker, Diane M; Borecki, Ingrid B; Evans, Michele K; Harris, Tamara B; Hirschhorn, Joel N; Li, Yun; Patel, Sanjay R; Psaty, Bruce M; Rotter, Jerome I; Wilson, James G; Bowden, Donald W; Cupples, L Adrienne; Haiman, Christopher A; Loos, Ruth J F; North, Kari E
2017-04-01
Genome-wide association studies (GWAS) have identified >300 loci associated with measures of adiposity including body mass index (BMI) and waist-to-hip ratio (adjusted for BMI, WHRadjBMI), but few have been identified through screening of the African ancestry genomes. We performed large scale meta-analyses and replications in up to 52,895 individuals for BMI and up to 23,095 individuals for WHRadjBMI from the African Ancestry Anthropometry Genetics Consortium (AAAGC) using 1000 Genomes phase 1 imputed GWAS to improve coverage of both common and low frequency variants in the low linkage disequilibrium African ancestry genomes. In the sex-combined analyses, we identified one novel locus (TCF7L2/HABP2) for WHRadjBMI and eight previously established loci at P < 5×10−8: seven for BMI, and one for WHRadjBMI in African ancestry individuals. An additional novel locus (SPRYD7/DLEU2) was identified for WHRadjBMI when combined with European GWAS. In the sex-stratified analyses, we identified three novel loci for BMI (INTS10/LPL and MLC1 in men, IRX4/IRX2 in women) and four for WHRadjBMI (SSX2IP, CASC8, PDE3B and ZDHHC1/HSD11B2 in women) in individuals of African ancestry or both African and European ancestry. For four of the novel variants, the minor allele frequency was low (<5%). In the trans-ethnic fine mapping of 47 BMI loci and 27 WHRadjBMI loci that were locus-wide significant (P < 0.05 adjusted for effective number of variants per locus) from the African ancestry sex-combined and sex-stratified analyses, 26 BMI loci and 17 WHRadjBMI loci contained ≤ 20 variants in the credible sets that jointly account for 99% posterior probability of driving the associations. The lead variants in 13 of these loci had a high probability of being causal. 
As compared to our previous HapMap imputed GWAS for BMI and WHRadjBMI including up to 71,412 and 27,350 African ancestry individuals, respectively, our results suggest that 1000 Genomes imputation showed modest improvement in identifying GWAS loci including low frequency variants. Trans-ethnic meta-analyses further improved fine mapping of putative causal variants in loci shared between the African and European ancestry populations.
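The 99% credible sets described above can be sketched in a few lines of Python: rank the variants at a locus by posterior probability and keep adding them until the cumulative probability reaches the target coverage. The variant identifiers and posterior values below are hypothetical, and how the posteriors themselves are derived in the study is outside the scope of this sketch.

```python
def credible_set(posteriors, coverage=0.99):
    """Smallest set of variants whose posteriors sum to at least `coverage`.

    `posteriors` maps a variant identifier to its posterior probability
    of driving the association at one locus.
    """
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for variant, probability in ranked:
        selected.append(variant)
        cumulative += probability
        if cumulative >= coverage:
            break
    return selected

# Hypothetical posterior probabilities at a single locus.
locus = {"variant_a": 0.90, "variant_b": 0.095, "variant_c": 0.005}
print(credible_set(locus))
```

A small credible set, such as the ≤ 20 variants reported for many loci, means that most of the posterior probability is concentrated on a few candidates for follow-up.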
Roederer, Mario; Quaye, Lydia; Mangino, Massimo; Beddall, Margaret H.; Mahnke, Yolanda; Chattopadhyay, Pratip; Tosi, Isabella; Napolitano, Luca; Barberio, Manuela Terranova; Menni, Cristina; Villanova, Federica; Di Meglio, Paola; Spector, Tim D.; Nestle, Frank O.
2015-01-01
Summary Despite recent discoveries of genetic variants associated with autoimmunity and infection, genetic control of the human immune system during homeostasis is poorly understood. We undertook a comprehensive immunophenotyping approach, analysing 78,000 immune traits in 669 female twins. From the top 151 heritable traits (up to 96% heritable), we used replicated GWAS to obtain 297 SNP associations at 11 genetic loci explaining up to 36% of the variation of 19 traits. We found multiple associations with canonical traits of all major immune cell subsets, and uncovered insights into genetic control for regulatory T cells. This dataset also revealed traits associated with loci known to confer autoimmune susceptibility, providing mechanistic hypotheses linking immune traits with the etiology of disease. Our data establish a bioresource that links genetic control elements associated with normal immune traits to common autoimmune and infectious diseases, providing a shortcut to identifying potential mechanisms of immune-related diseases. PMID:25772697
Wang, Xianshu; Pankratz, V Shane; Fredericksen, Zachary; Tarrell, Robert; Karaus, Mary; McGuffog, Lesley; Pharaoh, Paul D P; Ponder, Bruce A J; Dunning, Alison M; Peock, Susan; Cook, Margaret; Oliver, Clare; Frost, Debra; Sinilnikova, Olga M; Stoppa-Lyonnet, Dominique; Mazoyer, Sylvie; Houdayer, Claude; Hogervorst, Frans B L; Hooning, Maartje J; Ligtenberg, Marjolijn J; Spurdle, Amanda; Chenevix-Trench, Georgia; Schmutzler, Rita K; Wappenschmidt, Barbara; Engel, Christoph; Meindl, Alfons; Domchek, Susan M; Nathanson, Katherine L; Rebbeck, Timothy R; Singer, Christian F; Gschwantler-Kaulich, Daphne; Dressler, Catherina; Fink, Anneliese; Szabo, Csilla I; Zikan, Michal; Foretova, Lenka; Claes, Kathleen; Thomas, Gilles; Hoover, Robert N; Hunter, David J; Chanock, Stephen J; Easton, Douglas F; Antoniou, Antonis C; Couch, Fergus J
2010-07-15
Recent studies have identified single nucleotide polymorphisms (SNPs) that significantly modify breast cancer risk in BRCA1 and BRCA2 mutation carriers. Since these risk modifiers were originally identified as genetic risk factors for breast cancer in genome-wide association studies (GWASs), additional risk modifiers for BRCA1 and BRCA2 may be identified from promising signals discovered in breast cancer GWAS. A total of 350 SNPs identified as candidate breast cancer risk factors (P < 1 x 10(-3)) in two breast cancer GWAS studies were genotyped in 3451 BRCA1 and 2006 BRCA2 mutation carriers from nine centers. Associations with breast cancer risk were assessed using Cox models weighted for penetrance. Eight SNPs in BRCA1 carriers and 12 SNPs in BRCA2 carriers, representing an enrichment over the number expected, were significantly associated with breast cancer risk (P(trend) < 0.01). The minor alleles of rs6138178 in SNRPB and rs6602595 in CAMK1D displayed the strongest associations in BRCA1 carriers (HR = 0.78, 95% CI: 0.69-0.90, P(trend) = 3.6 x 10(-4) and HR = 1.25, 95% CI: 1.10-1.41, P(trend) = 4.2 x 10(-4)), whereas rs9393597 in LOC134997 and rs12652447 in FBXL7 showed the strongest associations in BRCA2 carriers (HR = 1.55, 95% CI: 1.25-1.92, P(trend) = 6 x 10(-5) and HR = 1.37, 95% CI: 1.16-1.62, P(trend) = 1.7 x 10(-4)). The magnitude and direction of the associations were consistent with the original GWAS. In subsequent risk assessment studies, the loci appeared to interact multiplicatively for breast cancer risk in BRCA1 and BRCA2 carriers. Promising candidate SNPs from GWAS were identified as modifiers of breast cancer risk in BRCA1 and BRCA2 carriers. Upon further validation, these SNPs together with other genetic and environmental factors may improve breast cancer risk assessment in these populations.
Imaging genetics of schizophrenia in the post-GWAS era.
Arslan, Ayla
2018-01-03
Imaging genetics is a research methodology studying the effect of genetic variation on brain structure, function, behavior, and risk for psychopathology. Since the early 2000s, imaging genetics has been increasingly used in the research of schizophrenia (SZ). SZ is a severe mental disorder with no precise knowledge of its underlying neurobiology; however, new genetic and neurobiological data generate a climate for new avenues. The accumulating data of genome-wide association studies (GWAS) continuously decode SZ risk genes. Global neuroimaging consortia produce collections of brain phenotypes from tens of thousands of people. In this context, imaging genetics will be strategically important both for the validation and discovery of SZ related findings. Thus, the study of GWAS supported risk variants as candidate genes to validate by neuroimaging is one trend. The study of epigenetic differences in relation to variations of brain phenotypes and the study of large scale multivariate analysis of genome-wide and brain-wide associations are other trends. While these studies hold great potential for understanding the neurobiology of SZ, the problem of reproducibility appears as a major challenge, which requires standardization of study designs and compensation for methodological limitations such as sensitivity and specificity. On the other hand, advancements of neuroimaging, optical and electron microscopy along with the use of genetically encoded fluorescent probes and robust statistical approaches will not only catalyze integrative methodologies but will also help to better design imaging genetics studies. In this invited paper, I will discuss the current perspective of imaging genetics and emerging opportunities of SZ research. Copyright © 2017 Elsevier Inc. All rights reserved.
Ryu, Dongchan; Ryu, Jihye; Lee, Chaeyoung
2016-05-01
A genome-wide association study (GWAS) was conducted to examine genetic associations of common autosomal nucleotide variants with sex in a Korean population with 4183 males and 4659 females. Nine genetic association signals were identified in four intragenic and five intergenic regions (P<5 × 10(-8)). Further analysis with an independent data set confirmed two intragenic association signals in the genes encoding protein phosphatase 1, regulatory subunit 12B (PPP1R12B, intron 12, rs1819043) and dynein, axonemal, heavy chain 11 (DNAH11, intron 61, rs10255013), which are directly involved in the reproductive system. This study revealed autosomal genetic variants associated with sex ratio by GWAS for the first time. This implies that genetic variants in proximity to the association signals may influence sex-specific selection and contribute to sex ratio variation. Further studies are required to reveal the mechanisms underlying sex-specific selection.
Host genetics of HIV acquisition and viral control.
Shea, Patrick R; Shianna, Kevin V; Carrington, Mary; Goldstein, David B
2013-01-01
Since the discovery of HIV as the cause of AIDS, numerous insights have been gained from studies of its natural history and epidemiology. It has become clear that there are substantial interindividual differences in the risk of HIV acquisition and course of disease. Meanwhile, the field of human genetics has undergone a series of rapid transitions that have fundamentally altered the approach to studying HIV host genetics. We aim to describe the field as it has transitioned from the era of candidate-gene studies and the era of genome-wide association studies (GWAS) to its current state in the infancy of comprehensive sequencing. In some ways the field has come full circle, having evolved from being driven almost exclusively by our knowledge of immunology, to a bias-free GWAS approach, to a point where our ability to catalogue human variation far outstrips our ability to biologically interpret it.
Vaithilingam, R D; Safii, S H; Baharuddin, N A; Ng, C C; Cheong, S C; Bartold, P M; Schaefer, A S; Loos, B G
2014-12-01
Studies to elucidate the role of genetics as a risk factor for periodontal disease have gone through various phases. In the majority of cases, the initial 'hypothesis-dependent' candidate-gene polymorphism studies did not report valid genetic risk loci. Following a large-scale replication study, these initially positive results are believed to be caused by type 1 errors. However, susceptibility genes, such as CDKN2BAS (Cyclin Dependent KiNase 2B AntiSense RNA; alias ANRIL [ANtisense Rna In the Ink locus]), glycosyltransferase 6 domain containing 1 (GLT6D1) and cyclooxygenase 2 (COX2), have been reported as conclusive risk loci of periodontitis. The search for genetic risk factors accelerated with the advent of 'hypothesis-free' genome-wide association studies (GWAS). However, despite many different GWAS being performed for almost all human diseases, only three GWAS on periodontitis have been published: one reported genome-wide association of GLT6D1 with aggressive periodontitis (a severe phenotype of periodontitis), whereas the remaining two, which were performed on patients with chronic periodontitis, were not able to find significant associations. This review discusses the problems faced and the lessons learned from the search for genetic risk variants of periodontitis. Current and future strategies for identifying genetic variance in periodontitis, and the importance of planning a well-designed genetic study with large and sufficiently powered case-control samples of severe phenotypes, are also discussed. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
A meta-analysis of genome-wide association studies of asthma in Puerto Ricans
Yan, Qi; Brehm, John; Pino-Yanes, Maria; Forno, Erick; Lin, Jerome; Oh, Sam S.; Acosta-Perez, Edna; Laurie, Cathy C.; Cloutier, Michelle M.; Raby, Benjamin A.; Stilp, Adrienne M.; Sofer, Tamar; Hu, Donglei; Huntsman, Scott; Eng, Celeste S.; Conomos, Matthew P.; Rastogi, Deepa; Rice, Kenneth; Canino, Glorisa; Chen, Wei; Barr, R. Graham; Burchard, Esteban G.; Celedón, Juan C.
2017-01-01
Rationale: No genome-wide association study (GWAS) of asthma has been conducted in Puerto Ricans. Objective: To identify susceptibility genetic variants for asthma in Puerto Ricans. Methods: We conducted a meta-analysis of GWAS of asthma, including Puerto Rican participants from GALA I-II, the Hartford-Puerto Rico Study, and the Hispanic Community Health Study. Moreover, we examined whether susceptibility loci identified in previous meta-analyses of GWAS are associated with asthma in Puerto Ricans. Results: The only locus to achieve a genome-wide significant association with asthma in an analysis of 2,144 cases and 2,893 controls was chromosome 17q21, as evidenced by our top SNP, rs907092 (OR = 0.71, P = 1.2 × 10^-12) on IKZF3. Similar to findings in non-Puerto Ricans, SNPs in genes in the same LD block as IKZF3 (e.g. ZPBP2, ORMDL3 and GSDMB) were also significantly associated with asthma in Puerto Ricans. With regard to results from a meta-analysis in Europeans, we replicated findings for the SNP at GSDMB, but not for SNPs in any other genes. On the other hand, we replicated results from a meta-analysis of North American populations for SNPs in IL1RL1, TSLP and GSDMB but not for IL33. Conclusions: Common variants on chromosome 17q21 have the greatest effects on asthma in Puerto Ricans, a high-risk ethnic group. PMID:28461288
Morgan, Thomas M; House, John A; Cresci, Sharon; Jones, Philip; Allayee, Hooman; Hazen, Stanley L; Patel, Yesha; Patel, Riyaz S; Eapen, Danny J; Waddy, Salina P; Quyyumi, Arshed A; Kleber, Marcus E; März, Winfried; Winkelmann, Bernhard R; Boehm, Bernhard O; Krumholz, Harlan M; Spertus, John A
2011-09-29
Genome-wide association studies (GWAS) have identified new candidate genes for the occurrence of acute coronary syndrome (ACS), but possible effects of such genes on survival following ACS have yet to be investigated. We examined 95 polymorphisms in 69 distinct gene regions identified in a GWAS for premature myocardial infarction for their association with post-ACS mortality among 811 whites recruited from university-affiliated hospitals in Kansas City, Missouri. We then sought replication of a positive genetic association in a large, racially diverse cohort of myocardial infarction patients (N = 2284) using Kaplan-Meier survival analyses and Cox regression to adjust for relevant covariates. Finally, we investigated the apparent association further in 6086 additional coronary artery disease patients. After Cox adjustment for other ACS risk factors, of 95 SNPs tested in 811 whites only the association with the rs6922269 in MTHFD1L was statistically significant, with a 2.6-fold mortality hazard (P = 0.007). The recessive A/A genotype was of borderline significance in an age- and race-adjusted analysis of the entire combined cohort (N = 3095; P = 0.052), but this finding was not confirmed in independent cohorts (N = 6086). We found no support for the hypothesis that the GWAS-identified variants in this study substantially alter the probability of post-ACS survival. Large-scale, collaborative, genome-wide studies may be required in order to detect genetic variants that are robustly associated with survival in patients with coronary artery disease.
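As a rough illustration of the Kaplan-Meier survival analysis mentioned in the abstract above, the following Python sketch computes a product-limit survival curve. The function name `kaplan_meier` and the toy follow-up data are hypothetical and are not taken from the study; the Cox adjustment for covariates is not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical toy data: months of follow-up, 1 = died, 0 = censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 0, 1, 1, 0])
```

In practice, curves computed this way for each genotype group would be compared (for example with a log-rank test) before fitting an adjusted model.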
Genome-Wide Association Study of Erosive Tooth Wear in a Finnish Cohort.
Alaraudanjoki, Viivi Karoliina; Koivisto, Salla; Pesonen, Paula; Männikkö, Minna; Leinonen, Jukka; Tjäderhane, Leo; Laitala, Marja-Liisa; Lussi, Adrian; Anttonen, Vuokko Anna-Marketta
2018-06-13
Erosive tooth wear is defined as irreversible loss of dental tissues due to intrinsic or extrinsic acids, exacerbated by mechanical forces. Recent studies have suggested a higher prevalence of erosive tooth wear in males, as well as a genetic contribution to susceptibility to erosive tooth wear. Our aim was to examine erosive tooth wear by performing a genome-wide association study (GWAS) in a sample of the Northern Finland Birth Cohort 1966 (n = 1,962). Erosive tooth wear was assessed clinically using the basic erosive wear examination. A GWAS was performed for the whole sample as well as separately for males and females. We identified one genome-wide significant signal (rs11681214) in the GWAS of the whole sample near the genes PXDN and MYT1L. When the sample was stratified by sex, the strongest genome-wide significant signals were observed in or near the genes FGFR1, C8orf86, CDH4, SCD5, F2R, and ING1. Additionally, multiple suggestive association signals were detected in all GWASs performed. Many of the signals were in or near the genes putatively related to oral environment or tooth development, and some were near the regions considered to be associated with dental caries, such as 2p24, 4q21, and 13q33. Replications of these associations in other samples, as well as experimental studies to determine the biological functions of associated genetic variants, are needed. © 2018 S. Karger AG, Basel.
Genetic variants near MLST8 and DHX57 affect the epigenetic age of the cerebellum
NASA Astrophysics Data System (ADS)
Lu, Ake T.; Hannon, Eilis; Levine, Morgan E.; Hao, Ke; Crimmins, Eileen M.; Lunnon, Katie; Kozlenkov, Alexey; Mill, Jonathan; Dracheva, Stella; Horvath, Steve
2016-02-01
DNA methylation (DNAm) levels lend themselves to defining an epigenetic biomarker of aging known as the 'epigenetic clock'. Our genome-wide association study (GWAS) of cerebellar epigenetic age acceleration identifies five significant (P < 5.0 × 10^-8) SNPs in two loci: 2p22.1 (inside gene DHX57) and 16p13.3 near gene MLST8 (a subunit of mTOR complex 1 and 2). We find that the SNP in 16p13.3 has a cis-acting effect on the expression levels of MLST8 (P = 6.9 × 10^-18) in most brain regions. In cerebellar samples, the SNP in 2p22.1 has a cis-effect on DHX57 (P = 4.4 × 10^-5). Gene sets found by our GWAS analysis of cerebellar age acceleration exhibit significant overlap with those of Alzheimer's disease (P = 4.4 × 10^-15), age-related macular degeneration (P = 6.4 × 10^-6), and Parkinson's disease (P = 2.6 × 10^-4). Overall, our results demonstrate the utility of a new paradigm for understanding aging and age-related diseases: it will be fruitful to use epigenetic tissue age as an endophenotype in GWAS.
Investigation of Maternal Genotype Effects in Autism by Genome-Wide Association
Yuan, Han; Dougherty, Joseph D.
2014-01-01
Lay Abstract: Autism spectrum disorders (ASDs) are pervasive developmental disorders which have both a genetic and environmental component. One source of the environmental component is the in utero (prenatal) environment. The maternal genome can potentially contribute to the risk of autism in children by altering this prenatal environment. In this study, the possibility of maternal genotype effects was explored by looking for common variants (single nucleotide polymorphisms, or SNPs) in the maternal genome associated with increased risk of autism in children. We performed a case/control genome-wide association study (GWAS) using mothers of probands as cases and either fathers of probands or normal females as controls, using two collections of families with autism. We did not identify any SNP that reached significance and thus a common variant of large effect is unlikely. However, there was evidence for the possibility of a large number of alleles each carrying a small effect. This suggested that if there is a contribution to autism risk through common-variant maternal genetic effects, it may be the result of multiple loci of small effects. We did not investigate rare variants in this study.
Scientific Abstract: Like most psychiatric disorders, autism spectrum disorders have both a genetic and an environmental component. While previous studies have clearly demonstrated the contribution of in utero (prenatal) environment on autism risk, most of them focused on transient environmental factors. Based on a recent sibling study, we hypothesized that environmental factors could also come from the maternal genome, which would result in persistent effects across siblings. In this study, the possibility of maternal genotype effects was examined by looking for common variants (single nucleotide polymorphisms, or SNPs) in the maternal genome associated with increased risk of autism in children. A case/control genome-wide association study (GWAS) was performed using mothers of probands as cases and either fathers of probands or normal females as controls. Autism Genetic Resource Exchange (AGRE) and Illumina Genotype Control Database (iCon) were used as our discovery cohort (n=1616). The same analysis was then replicated on Simon Simplex Collection (SSC) and Study of Addiction: Genetics and Environment (SAGE) datasets (n=2732). We did not identify any SNP that reached genome-wide significance (p < 10^-8) and thus a common variant of large effect is unlikely. However, there was evidence for the possibility of a large number of alleles of effect size marginally below our power to detect. PMID:24574247
Alharbi, Khalid Khalaf; Ali Khan, Imran; Alotaibi, Mohammad Abdullah; Saud Aloyaid, Abdullah; Al-Basheer, Haifa Abdulaziz; Alghamdi, Naelah Abdullah; Al-Baradie, Raid Saleem; Al-Sulaiman, A M
2018-01-01
Stroke is a multifactorial, heterogeneous, and heritable disorder and is considered one of the major diseases. Prior reports have used a variety of study designs, including genome-wide association studies (GWAS), replication, case-control, cross-sectional, and meta-analysis studies, yet a diagnostic marker is still lacking worldwide. Few studies have been carried out in the Saudi population, so we aimed to investigate the association of single nucleotide polymorphisms (SNPs) identified through GWAS and meta-analysis studies in stroke patients in the Saudi population. For this case-control study, we selected 207 cases and 207 controls, balanced by gender, at King Saud University Hospital in the capital city of Saudi Arabia. A peripheral blood sample (5 mL) was collected in two different vacutainers: 3 mL of coagulated blood was used for lipid analysis (biochemical tests) and 2 mL for DNA analysis (molecular tests). Genomic DNA was extracted from the collected blood samples, specific primers were designed for the selected SNPs (SORT1 rs646218 and OLR1 rs11053646), PCR-RFLP was performed, and randomly selected samples were sequenced to cross-check the results. The rs646218 and rs11053646 polymorphisms were significantly associated in the allele, genotype, and dominant models, by crude odds ratios (ORs) and by multiple logistic regression analysis (p < 0.05). Correlation analysis between lipid profile and genotypes confirmed a significant relation between triglycerides and the rs646218 and rs11053646 polymorphisms. However, the rs11053646 polymorphism was correlated with HDLC (p = 0.04). Genotypes were also compared between cases and controls separately in males and in females; among male subjects, rs11053646 was associated with heterozygous genotypes under the dominant model (p < 0.05). 
The results of the current study confirmed that the SORT1 and OLR1 SNPs were associated with stroke in the Saudi population. These results agree with prior findings documented through GWAS and meta-analysis. However, studies in other ethnic populations are still needed.
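The crude odds ratio used in case-control analyses such as the one above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `odds_ratio_ci`, the Wald-type 95% confidence interval, and the genotype counts are all hypothetical and do not come from the study.

```python
import math


def odds_ratio_ci(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed, z=1.96):
    """Crude odds ratio with a Wald confidence interval from a 2x2 table.

    "Exposed" here means carrying the risk genotype under a dominant model.
    """
    odds_ratio = (case_exposed * ctrl_unexposed) / (case_unexposed * ctrl_exposed)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / case_exposed + 1 / case_unexposed + 1 / ctrl_exposed + 1 / ctrl_unexposed
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for 207 cases and 207 controls (not the study's data).
or_value, ci_low, ci_high = odds_ratio_ci(90, 117, 60, 147)
```

An interval that excludes 1 corresponds to a nominally significant association at roughly the 5% level; adjusted estimates would instead come from a logistic regression with covariates.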
Bei, Jin-Xin; Su, Wen-Hui; Ng, Ching-Ching; Yu, Kai; Chin, Yoon-Ming; Lou, Pei-Jen; Hsu, Wan-Lun; McKay, James D; Chen, Chien-Jen; Chang, Yu-Sun; Chen, Li-Zhen; Chen, Ming-Yuan; Cui, Qian; Feng, Fu-Tuo; Feng, Qi-Shen; Guo, Yun-Miao; Jia, Wei-Hua; Khoo, Alan Soo-Beng; Liu, Wen-Sheng; Mo, Hao-Yuan; Pua, Kin-Choo; Teo, Soo-Hwang; Tse, Ka-Po; Xia, Yun-Fei; Zhang, Hongxin; Zhou, Gang-Qiao; Liu, Jian-Jun; Zeng, Yi-Xin; Hildesheim, Allan
2016-01-01
Genetic loci within the major histocompatibility complex (MHC) have been associated with nasopharyngeal carcinoma (NPC), an Epstein-Barr virus (EBV)-associated cancer, in several GWAS. Results outside this region have varied. We conducted a meta-analysis of four NPC GWAS among Chinese individuals (2,152 cases; 3,740 controls). Forty-three noteworthy findings outside the MHC region were identified and targeted for replication in a pooled analysis of four independent case-control studies across three regions in Asia (4,716 cases; 5,379 controls). A meta-analysis that combined results from the initial GWA and replication studies was performed. In the combined meta-analysis, rs31489, located within the CLPTM1L/TERT region on chromosome 5p15.33, was strongly associated with NPC (OR = 0.81; P = 6.3 × 10^-13). Our results also provide support for associations reported from published NPC GWAS: rs6774494 (P = 1.5 × 10^-12; located in the MECOM gene region), rs9510787 (P = 5.0 × 10^-10; located in the TNFRSF19 gene region), and rs1412829/rs4977756/rs1063192 (P = 2.8 × 10^-8, P = 7.0 × 10^-7, and P = 8.4 × 10^-7, respectively; located in the CDKN2A/B gene region). We have identified a novel association between genetic variation in the CLPTM1L/TERT region and NPC. Supporting our finding, rs31489 and other SNPs in this region have been reported to be associated with multiple cancer sites, candidate-based studies have reported associations between polymorphisms in this region and NPC, the TERT gene has been shown to be important for telomere maintenance and has been reported to be overexpressed in NPC, and an EBV protein expressed in NPC (LMP1) has been reported to modulate TERT expression/telomerase activity. Our finding suggests that factors involved in telomere length maintenance are involved in NPC pathogenesis. ©2015 American Association for Cancer Research.
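To make the idea of combining per-study association results concrete, here is a minimal Python sketch of inverse-variance fixed-effect pooling of log odds ratios. The abstract does not state which meta-analysis model was used, so the fixed-effect choice is an assumption, and the function name `fixed_effect_meta` and the input values are hypothetical.

```python
import math


def fixed_effect_meta(log_ors, std_errs):
    """Inverse-variance fixed-effect pooling of per-study log odds ratios."""
    weights = [1.0 / se ** 2 for se in std_errs]
    total_weight = sum(weights)
    pooled_log_or = sum(w * b for w, b in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_log_or / pooled_se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return math.exp(pooled_log_or), pooled_se, z, p_value


# Hypothetical inputs: two studies reporting OR = 0.80 and OR = 0.82
# with standard errors of the log odds ratio of 0.05 and 0.04.
pooled_or, pooled_se, z, p_value = fixed_effect_meta(
    [math.log(0.80), math.log(0.82)], [0.05, 0.04]
)
```

Pooling shrinks the standard error below that of either study alone, which is why a combined analysis can reach genome-wide significance when the individual studies do not.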
Methods for meta-analysis of multiple traits using GWAS summary statistics.
Ray, Debashree; Boehnke, Michael
2018-03-01
Genome-wide association studies (GWAS) for complex diseases have focused primarily on single-trait analyses for disease status and disease-related quantitative traits. For example, GWAS on risk factors for coronary artery disease analyze genetic associations of plasma lipids such as total cholesterol, LDL-cholesterol, HDL-cholesterol, and triglycerides (TGs) separately. However, traits are often correlated and a joint analysis may yield increased statistical power for association over multiple univariate analyses. Recently several multivariate methods have been proposed that require individual-level data. Here, we develop metaUSAT (where USAT is unified score-based association test), a novel unified association test of a single genetic variant with multiple traits that uses only summary statistics from existing GWAS. Although the existing methods either perform well when most correlated traits are affected by the genetic variant in the same direction or are powerful when only a few of the correlated traits are associated, metaUSAT is designed to be robust to the association structure of correlated traits. metaUSAT does not require individual-level data and can test genetic associations of categorical and/or continuous traits. One can also use metaUSAT to analyze a single trait over multiple studies, appropriately accounting for overlapping samples, if any. metaUSAT provides an approximate asymptotic P-value for association and is computationally efficient for implementation at a genome-wide level. Simulation experiments show that metaUSAT maintains proper type-I error at low error levels. It has similar and sometimes greater power to detect association across a wide array of scenarios compared to existing methods, which are usually powerful for some specific association scenarios only. 
When applied to plasma lipids summary data from the METSIM and the T2D-GENES studies, metaUSAT detected genome-wide significant loci beyond the ones identified by univariate analyses. Evidence from larger studies suggests that the variants additionally detected by our test are, indeed, associated with lipid levels in humans. In summary, metaUSAT can provide novel insights into the genetic architecture of common diseases or traits. © 2017 WILEY PERIODICALS, INC.
Pyun, Jung-A; Kim, Sunshin; Cho, Nam H; Koh, InSong; Lee, Jong-Young; Shin, Chol; Kwack, KyuBum
2014-05-01
The aim of this study was to identify polymorphisms and gene-gene interactions that are significantly associated with age at menarche and age at menopause in a Korean population. A total of 3,452 and 1,827 women participated in studies of age at menarche and age at natural menopause, respectively. Linear regression analyses adjusted for residence area were used to perform genome-wide association studies (GWAS), candidate gene association studies, and interactions between the candidate genes for age at menarche and age at natural menopause. In GWAS, four single nucleotide polymorphisms (SNPs; rs7528241, rs1324329, rs11597068, and rs6495785) were strongly associated with age at natural menopause (lowest P = 9.66 × 10). However, GWAS of age at menarche did not reveal any strong associations. In candidate gene association studies, SNPs with P < 0.01 were selected to test their synergistic interactions. For age at natural menopause, there was a significant interaction between intronic SNPs on ADAM metallopeptidase with thrombospondin type I motif 9 (ADAMTS9) and SMAD family member 3 (SMAD3) genes (P = 9.52 × 10). For age at menarche, there were three significant interactions between three intronic SNPs on follicle-stimulating hormone receptor (FSHR) gene and one SNP located at the 3' flanking region of insulin-like growth factor 2 receptor (IGF2R) gene (lowest P = 1.95 × 10). Novel SNPs and synergistic interactions between candidate genes are significantly associated with age at menarche and age at natural menopause in a Korean population.